Metadata

 

Definition

Metadata is information about the context, content, quality, provenance, and/or accessibility of a set of data.

Relevance

Metadata may be:

  • required for depositing a data set in disciplinary repositories or for publishing it in research journals
  • critical documentation for the longevity and reproducibility of research data
  • useful for visualizing or analyzing the data in data files

What are some examples of metadata?

Metadata can exist in a variety of formats. Some of the most common are summarized in the table below.

| Type of metadata | Example of this type |
| --- | --- |
| A text or HTML document. | Metadata includes authors, dates, location, etc. This metadata accompanies data on Seasonal Frost Depths, Midwestern USA (1971-1981), archived at the National Snow and Ice Data Center. |
| An XML document linked to data files. | Metadata includes authors, locations, dates, etc. This metadata is linked to TIGER/Line Shapefile data on Wisconsin Congressional Districts, 2009, provided on Data.gov. (Note: you may need to select “View page source” in your browser to see the XML format.) Follows the FGDC (Federal Geographic Data Committee) digital geospatial metadata standard. |
| Information embedded in an XML data file. | Metadata includes authors, dates, organism, publication, instrument, etc. It is kept within the X-ray diffraction data file for UDP-galactopyranose mutase in the Protein Data Bank repository. (Note: you may need to select “View page source” in your browser to see the XML format.) Follows the PDBML (Protein Data Bank Markup Language) specification. |
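
As a rough illustration of how metadata held in an XML document can be read by a program, the sketch below parses a small made-up record with Python's standard library. The element names (`dataset`, `metadata`, `title`, `author`, and so on) and all values are hypothetical. They do not follow FGDC, PDBML, or any other published standard, each of which defines its own element names and structure.

```python
import xml.etree.ElementTree as ET

# Hypothetical record: element names and values are illustrative only and
# do not follow FGDC, PDBML, or any other published metadata standard.
SAMPLE_XML = """\
<dataset>
  <metadata>
    <title>Example data set</title>
    <author>A. Researcher</author>
    <author>B. Collaborator</author>
    <date>2009-01-01</date>
    <location>Example location</location>
  </metadata>
  <data>
    <row site="1" value="42"/>
  </data>
</dataset>
"""


def read_metadata(xml_text):
    """Return a dict mapping each metadata element name to a list of its values."""
    root = ET.fromstring(xml_text)
    metadata = {}
    block = root.find("metadata")
    if block is None:
        # No embedded metadata block in this document.
        return metadata
    for child in block:
        # Repeated elements (e.g. several authors) are collected in one list.
        metadata.setdefault(child.tag, []).append((child.text or "").strip())
    return metadata


if __name__ == "__main__":
    for name, values in read_metadata(SAMPLE_XML).items():
        print(f"{name}: {', '.join(values)}")
```

Storing each value in a list is a design choice made here so that repeated fields, such as multiple authors, are not overwritten.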

What metadata help is available?

A data specialist from one of the following groups may be able to help you find, adapt, and use an appropriate metadata standard.

A sample of the Ecological Metadata Language (EML) standard


What metadata should I use?

Metadata standards specify what pieces of information are included and how they are expressed in digital files. Some are generic enough to be useful across a wide array of disciplines, while others are highly specific to disciplinary areas.

We cannot provide a comprehensive list here. Instead, we include examples in broad disciplinary areas, plus a “general” category. Where possible, we selected examples that appear to have broad adoption within or across disciplinary areas.

Metadata Standards

Links to a few representative metadata standards in disciplinary areas
| Disciplinary area | Metadata standard | Description |
| --- | --- | --- |
| General | Dublin Core | Widely used in disciplinary and institutional repositories. |
| | Disciplinary Metadata from the DCC | Searchable list of disciplinary metadata standards and related information. Includes biology, Earth science, physical science, social science and humanities, and general research data. |
| Life Sciences | Darwin Core | Designed to facilitate the sharing of information about biological diversity. It is primarily based on taxa and their occurrence in nature as documented by observations, specimens, samples, and related information. |
| | Ecological Metadata Language (EML) | Maintained by the Ecological Society of America. Consists of XML modules that can be used to document ecological datasets. |
| Humanities | Seeing Standards: A Visualization of the Metadata Universe | Information on 105 cultural heritage metadata standards. |
| | Text Encoding Initiative | A widely used standard for representing textual materials in XML. |
| Social Sciences | DDI | A metadata specification for the social and behavioral sciences created by the Data Documentation Initiative. Used to document data through its lifecycle and to enhance dataset interoperability. |
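
To make concrete what it means for a standard to specify which pieces of information are recorded and how they are expressed, here is a minimal sketch that writes a few Dublin Core elements as XML using Python's standard library. The `record` wrapper element, the `build_dc_record` helper, and all field values are assumptions made for illustration. Only the element names (`title`, `creator`, `date`, `format`) and the namespace URI come from the Dublin Core element set. A repository may expect a different wrapper or additional fields.

```python
import xml.etree.ElementTree as ET

# Namespace of the Dublin Core Metadata Element Set, version 1.1.
DC_NS = "http://purl.org/dc/elements/1.1/"


def build_dc_record(fields):
    """Build a <record> element with one dc:* child per (name, value) pair.

    The <record> wrapper is a hypothetical container chosen for this sketch.
    """
    ET.register_namespace("dc", DC_NS)
    record = ET.Element("record")
    for name, value in fields:
        child = ET.SubElement(record, f"{{{DC_NS}}}{name}")
        child.text = value
    return record


# Illustrative values only; they do not describe a real data set.
fields = [
    ("title", "Example data set"),
    ("creator", "A. Researcher"),
    ("date", "2009-01-01"),
    ("format", "text/csv"),
]

record = build_dc_record(fields)
print(ET.tostring(record, encoding="unicode"))
```

Passing the fields as a list of pairs rather than a dictionary allows an element such as `creator` to appear more than once.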