iDigBio, the National Science Foundation-sponsored project to help digitize the nation’s natural history collections, now houses more than 100 million specimen records in its online database, offering access to one of the largest virtual collections of life on Earth.
Natural history collections worldwide contain more than a billion specimens of plants, animals and fungi. But before digitization, the wealth of scientific information they offered was largely confined to the drawers and shelves of museums and universities.
NSF established iDigBio, or Integrated Digitized Biocollections, to coordinate a nationwide digitization effort by developing the infrastructure to standardize and preserve specimen data long term and by helping institutions launch their digitization efforts. The program is part of NSF’s Advancing Digitization of Biodiversity Collections initiative.
Now in its seventh year, iDigBio, based at the University of Florida with the Florida Museum of Natural History and Florida State University as core partners, has amassed data from more than 1,900 collections from about 820 institutions in its online portal. The volume of data has reached a “critical mass” at which researchers can begin using it to investigate broad-scale evolutionary and ecological questions, said Larry Page, director of iDigBio and curator of ichthyology at the Florida Museum.
“What’s exciting about being at more than 100 million specimen records is you can ask larger questions over space, time and biodiversity. Big data sheds light not just on one species but whole blocks of species — aquatic and terrestrial,” Page said. “The more data we have, the better we’ll be able to predict the impacts of climate change, human disease, landscape modifications and changes that will impact crops.”
The portal can be searched by fields such as scientific name, location, collector, time period, region and date collected, unlocking natural history data for researchers, educators and the public.
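As a rough illustration of what field-based searching means, the sketch below filters a small in-memory list of records by any combination of fields. It does not use iDigBio's actual interface: the `search` function, the field names and the sample records are all hypothetical and were made up for this example.

```python
# Hypothetical sample records; names and values are illustrative only,
# not real iDigBio data.
RECORDS = [
    {"scientific_name": "Pterois volitans", "collector": "A. Example", "region": "Caribbean"},
    {"scientific_name": "Triakis semifasciata", "collector": "B. Example", "region": "Pacific"},
]


def search(records, **criteria):
    """Return records whose fields contain every requested value.

    Matching is a case-insensitive substring test, and a record must
    satisfy all criteria to be included.
    """
    matches = []
    for record in records:
        if all(
            str(value).lower() in str(record.get(name, "")).lower()
            for name, value in criteria.items()
        ):
            matches.append(record)
    return matches


# Example: find lionfish records by genus name.
lionfish = search(RECORDS, scientific_name="pterois")
```

A real aggregator would index far more fields and support ranges for dates and coordinates, but the idea of combining several field filters in one query is the same.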
From tracing the spread of invasive lionfish worldwide to examining an outlier leopard shark collected 155 miles north of its normal range, researchers can tap into information previously accessible only by visiting individual collections in person or online or by borrowing specimens.
“It’s a matter of mobilizing data, getting them out there and making them discoverable and searchable,” said David Jennings, iDigBio project manager. “Then you might be able to find linkages or discover things about ecosystems that weren’t known before because you had to search in 10 or 15 different places.”
Researchers are using the data to explore a variety of questions, including whether pollinator communities return if their native habitat is restored, how World War II put a damper on insect collecting and how spiders and plants colonize lava flows in Idaho.
iDigBio also offers lesson plans for educators to use in their classrooms, resources for undergraduate students and guides on how citizen scientists can get involved.
Reflecting on how far the program has come since its inception, Page said the challenges of getting it off the ground were daunting.
“Each institution has its own way of doing things, and we are all used to acting alone,” he said. “The problem with tying everything together initially was making an entity that was searchable so the data would be useful to the science community.”
Changing the culture of curation to include digitization as a natural step in the preservation process has proven to be a bigger obstacle than the technical tangles of standardizing data submitted in various formats and building a database that can hold millions of specimen records, Jennings said.
Many smaller institutions lack the technology, support, funding or know-how to digitize and mobilize their collection data. iDigBio helps fill the gap with training in digitization practices and informatics skills, but the cost of equipment and salaries associated with digitization can still present formidable challenges, he said.
Digitizing a specimen involves several steps: adding information — such as its scientific name, collector, date collected and location — to a database; imaging the specimen via photography, CT scan or audio recording; and georeferencing it by pinpointing where it was collected on a virtual map. Mobilizing these data moves the digitized specimen from a local hard drive to an aggregator so that it can be shared and publicly visible, said Molly Phillips, iDigBio education and outreach coordinator.
“Mobilization describes the process of sharing data online,” she said. “We use that verb as a way to encompass all the steps that go into that, and it can be a pretty complicated process.”
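The digitization steps Phillips describes can be pictured as fields on a single record that are filled in over time. The following is a minimal sketch under that assumption. The `SpecimenRecord` class, its field names and the `digitization_gaps` helper are hypothetical, not part of any iDigBio software.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class SpecimenRecord:
    """Hypothetical record covering the three digitization steps."""

    # Step 1: catalog information entered into a database.
    scientific_name: str
    collector: str
    date_collected: str  # e.g. an ISO 8601 date string
    locality: str
    # Step 2: imaging (photographs, CT scans or audio recordings).
    media_files: List[str] = field(default_factory=list)
    # Step 3: georeferencing (coordinates of the collection site).
    latitude: Optional[float] = None
    longitude: Optional[float] = None


def digitization_gaps(record: SpecimenRecord) -> List[str]:
    """List the digitization steps that are still missing for a record."""
    gaps = []
    if not record.media_files:
        gaps.append("imaging")
    if record.latitude is None or record.longitude is None:
        gaps.append("georeferencing")
    return gaps
```

In this picture, a record with no remaining gaps is ready to be mobilized, that is, exported from the local database to an aggregator.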
Natural history collections provide irreplaceable archives of life on Earth, Page said, and digitization helps ensure these resources are preserved for the future.
For example, holotypes, the reference specimens that stand as the defining example of a new species, are stored in collections. Field guides and species distribution maps are based on museum collections. Temporal data allow researchers to study gradual changes in ecosystems and the timing of key biological events such as when plants flower. Collections reveal which species are native and which are invasive. Specimens provide tissue for molecular work, which underpins studies of evolutionary history. And each specimen contains other specimens in the form of gut contents and parasites.
As for iDigBio, passing the 100 million milestone does not mean the program’s work is done, Jennings said.
“We’ve made an effort to identify all the collections in the U.S.,” he said. “That number is continually morphing, but it’s about 1,300. We contact them and see what we can do, find out who has data we can mobilize and get those into our portal. Our scope is to get it all.”