Data sharing has become a routine part of doing science. Most funding agencies now require data management plans that provide for the preservation and maintenance of completed datasets over the long term, and many peer-reviewed journals require prospective authors to publish their datasets in conjunction with their papers. Large data repositories such as NOAA NCEI, where scientists can submit datasets for long-term preservation, maintenance, and accessibility, also require standardized metadata (detailed information about the dataset), which is related to, but separate from, the dataset itself.

As the number of published datasets and metadata records grows, it often becomes necessary to cite them in peer-reviewed papers, grant proposals, progress reports, annual work evaluations, curricula vitae, and elsewhere. Citing a metadata record, dataset, or online database is similar to citing a published paper or report. The ultimate purpose of any citation is to allow others to find and access the item. In general, the following information should be included:

  • Authors, originators, proprietors, or principal contacts
  • Year of publication (of the metadata, dataset, or database, not the associated paper or report). If it's a continuously updated entity, use the year of the most recent update.
  • Title or name 
  • Publisher name or name of repository if there is one
  • Name of individual or organization responsible for hosting the repository
  • Location information if appropriate
  • DOI or URL (as direct as possible to the specific item)
  • Last accessed date (the date that the citer last saw the referenced material). Use a date format that is as unambiguous as possible.
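
As an illustration of the last point, a numeric date such as 03/04/2017 can be read as either 3 April or 4 March, while a spelled-out month cannot. The Python sketch below shows one way to produce such a date. The helper name `format_access_date` is hypothetical, and the day-month-year style is only one reasonable choice; ISO 8601 (2017-11-21) is another.

```python
from datetime import date

# English month names are listed explicitly so the result does not depend on
# the system locale (strftime("%B") would follow the locale).
MONTHS = (
    "January", "February", "March", "April", "May", "June",
    "July", "August", "September", "October", "November", "December",
)


def format_access_date(d: date) -> str:
    """Return a date with the month spelled out, e.g. '21 November 2017'.

    Hypothetical helper for illustration only.
    """
    return f"{d.day} {MONTHS[d.month - 1]} {d.year}"


accessed = date(2017, 11, 21)
print(format_access_date(accessed))  # 21 November 2017
print(accessed.isoformat())          # 2017-11-21 (ISO 8601 alternative)
```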

DISL Data Management suggests citing our metadata records in the following format:

Originator names (metadata date). Metadata record title. Data Management Center, Dauphin Island Sea Lab, Alabama, USA. Available at: metadata link. Last accessed: date.

Here are two examples of metadata record citations:

Carmichael, R.H. and E. Hieb (2017). Water Quality at Six Sites in Alabama Waters Representing West Indian Manatee Habitat (2008-2016). Data Management Center, Dauphin Island Sea Lab, Alabama, USA. Available at: http://cf.disl.org/datamanagement/metadata_folder/DISL-Carmichael-MSN-011-2017.xml. Last accessed: 21 November 2017.

Carmichael, R.H. and M. Estes (2017). American Horseshoe Crab Abundance in the Northern Central Gulf of Mexico in 2012-2013. Data Management Center, Dauphin Island Sea Lab, Alabama, USA. Available at: http://cf.disl.org/datamanagement/metadata_folder/DISL-Carmichael-Estes-001-2016.xml. Last accessed: 21 November 2017.
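
For anyone assembling many such citations, the suggested format can be treated as a simple template. The following Python sketch is one possible way to fill it in, not an official DISL tool. The function name `format_metadata_citation` and its parameter names are hypothetical, and the last accessed date is assumed to be supplied as already-formatted text.

```python
# Publisher text taken from the suggested metadata citation format.
PUBLISHER = "Data Management Center, Dauphin Island Sea Lab, Alabama, USA"


def format_metadata_citation(originators: str, year: int, title: str,
                             link: str, last_accessed: str) -> str:
    """Assemble a metadata record citation in the suggested DISL format.

    Hypothetical helper: every argument is inserted as given, so names and
    dates must already be written the way they should appear.
    """
    return (
        f"{originators} ({year}). {title}. {PUBLISHER}. "
        f"Available at: {link}. Last accessed: {last_accessed}."
    )


# Rebuild the first example citation from its parts.
citation = format_metadata_citation(
    originators="Carmichael, R.H. and E. Hieb",
    year=2017,
    title=("Water Quality at Six Sites in Alabama Waters Representing "
           "West Indian Manatee Habitat (2008-2016)"),
    link=("http://cf.disl.org/datamanagement/metadata_folder/"
          "DISL-Carmichael-MSN-011-2017.xml"),
    last_accessed="21 November 2017",
)
print(citation)
```

Keeping the publisher text in a single constant means every generated citation stays consistent with the suggested format.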

Here is an example of a dataset citation at NOAA NCEI:

Carmichael, Ruth; Estes Jr, Maurice (2016). American Horseshoe Crab Abundance in the Northern Central Gulf of Mexico from 2012-05-21 to 2013-08-20 (NCEI Accession 0149391). Version 1.1. NOAA National Centers for Environmental Information. Dataset. doi:10.7289/V5ZS2TH4 [21 November 2017]

Here is an example of a database citation:

Carmichael, R.H., Hieb, E., Aven, A., Taylor, N., Seely, C., Delo, J., and Pabody, C. (2017). Dauphin Island Sea Lab's Manatee Sighting Network Database (1912-2017). Dauphin Island Sea Lab, Alabama, USA. manatee.disl.org/sightings/. Last accessed: 21 November 2017.

For datasets and databases that are not yet fully accessible to the public but must be cited, and for which citing a metadata record about the resource would not be sufficient, we suggest creating a website about the resource that can then be linked in the citation. The website should explain how to access the resource, including contact information and use restrictions, and should describe the contents of the resource in detail and/or link to any relevant metadata records.

Purdue OWL provides more information about citing many types of electronic sources:

APA Reference List: Electronic Sources 
MLA Works Cited: Electronic Sources 
Chicago Style Web Sources