A network of over 100 herbaria spread out across the southeastern United States recently completed the herculean task of fully digitizing more than three million specimens collected by botanists and naturalists over a span of 200 years. The project, which was funded by the National Science Foundation, is part of a larger, ongoing effort by natural history institutions worldwide to make their biological collections easily accessible to researchers studying broad patterns of evolution, extinction, range shifts, and climate change.
In a new study published in the journal Applications in Plant Sciences, researchers involved in the project analyzed the rate at which specimens could be reliably photographed, digitized, and databased to assess how much similar efforts might cost in the future.
"Everybody who was interested in this recognized very early on just how much labor and money we were talking about," said senior author Joey Shaw, associate professor of biology and herbarium curator at the University of Tennessee at Chattanooga.
Although digitization efforts had been underway at large institutions since the turn of the century, no one had yet developed a robust framework for determining how much it costs to bring a set of collections online, noted Shaw.
"No one had really been doing it at any sort of scale to understand how many specimens you could catalog per minute."
Such information can be vitally important for smaller institutions that rely on a dwindling fraction of administrative funding to maintain their collections. At universities and liberal arts colleges, these efforts are also heavily reliant on students, who perform the work for college research credit or as student employees, which comes with its own associated set of benefits and pitfalls.
Digitizing specimens is often the first direct experience students have with natural history, and, for many, it blossoms into a lifelong appreciation for biology. For some, as was the case for four authors on the study, this appreciation later culminates in graduate studies and careers in the natural sciences.
Student employment, however, is by nature transient, and the resources invested in training students are often duplicated when they graduate or move on to other projects. So, from the outset of the project, Shaw and his colleagues wanted their analyses to account for the lag associated with student training, along with the subsequent increase in productivity as students gained experience.
The resulting estimates were incredibly precise, to the extent that Shaw and his colleagues could pinpoint exactly when classes let out by analyzing the rates at which specimen photographs were uploaded.
"The students ended up being so efficient that the computer and internet speed actually started slowing down the process when classes let out, when there's a surge of people picking up their phones," Shaw said.
With hundreds of thousands of specimens now freely accessible online, Shaw hopes that their data will help inform workers at other herbaria hoping to replicate their results. All of the partnering institutions in this study are members of the SouthEast Regional Network of Expertise and Collections (SERNEC), which supports more than 200 herbaria in the region that house a combined 15 million specimens, the majority of which have yet to be digitized.
With habitat loss from urbanization and deforestation increasing, herbaria offer researchers a valuable window into ecosystems long since developed or demolished. Many collections include specimens of species that are now extinct in the wild, while others have yielded the discovery of entirely new species. And as average global temperatures continue to rise, scientists are increasingly turning to herbarium collection data to analyze the effects climate change has already had on plant communities.
"Biologists have accumulated species data from all over the world, in the form of biological specimens, since at least the Renaissance, and we continue this practice today," Shaw said. "Our recent work has been to convert the data of these biological specimens into a freely accessible online database. It will be the largest dataset assembled on Earth's biodiversity, and we are only at the beginning of dreaming up research, conservation, and land management questions that will be answered with this database."