Metadata
Definition
Metadata is information about the context, content, quality, provenance, and/or accessibility of a set of data.
Relevance
Metadata may be:
- required for depositing a data set in disciplinary repositories or for publishing it in research journals
- critical documentation for the longevity and reproducibility of research data
- useful for visualizing or analyzing the data in data files
What are some examples of metadata?
Metadata can exist in a variety of different formats. Some of the most common ones are summarized in the table below.
| Type of metadata | Example of this type |
|---|---|
| A text or HTML document. | Metadata includes authors, dates, location, etc. This metadata accompanies data on Seasonal Frost Depths, Midwestern USA (1971-1981) that is archived in the National Snow and Ice Data Center. |
| An XML document linked to data files. | Metadata includes authors, locations, dates, etc. This metadata is linked to TIGER/Line Shapefile data on Wisconsin Congressional Districts, 2009, provided on Data.gov. (Note: you may need to select “View page source” in your browser to see the XML format.) Follows the FGDC (Federal Geographic Data Committee) digital geospatial metadata standard. |
| Information embedded in an XML data file. | Metadata includes authors, dates, organism, publication, instrument, etc. It is kept within the X-ray diffraction data file for UDP-galactopyranose mutase in the Protein Data Bank repository. (Note: you may need to select “View page source” in your browser to see the XML format.) Follows the PDBML (Protein Data Bank Markup Language) specification. |
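
To make the XML examples above more concrete, the following is a minimal sketch of reading descriptive fields out of an XML metadata record with Python's standard library. The sample record is an assumption made for illustration: its element names are loosely modeled on FGDC-style metadata, and every value is a made-up placeholder rather than content from the datasets named in the table. The helper name `summarize_metadata` is also hypothetical.

```python
import xml.etree.ElementTree as ET

# Illustrative record only: element names are loosely modeled on FGDC-style
# metadata, and every value is a made-up placeholder.
SAMPLE_XML = """<metadata>
  <idinfo>
    <citation>
      <citeinfo>
        <origin>Example Agency</origin>
        <pubdate>2009</pubdate>
        <title>Example Congressional Districts Shapefile</title>
      </citeinfo>
    </citation>
  </idinfo>
</metadata>"""


def summarize_metadata(xml_text):
    """Return a dict of selected descriptive fields from an XML metadata record."""
    root = ET.fromstring(xml_text)
    cite = root.find("./idinfo/citation/citeinfo")
    fields = {}
    if cite is not None:
        for tag in ("origin", "pubdate", "title"):
            fields[tag] = cite.findtext(tag)  # None when the element is absent
    return fields


summary = summarize_metadata(SAMPLE_XML)
for key, value in summary.items():
    print(f"{key}: {value}")
```

A real record will usually contain many more elements, so in practice the list of tags would be adapted to whichever standard the file follows.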
What metadata help is available?
A data specialist from one of the following groups may be able to help you find, adapt, and use an appropriate metadata standard.
- An informatics specialist or IT consultant in your department.
- A digital curation consultant.
- The subject librarian for your department.
- A disciplinary society in your research area.
What metadata should I use?
Metadata standards specify what pieces of information are included and how they are expressed in digital files. Some are generic enough to be useful across a wide array of disciplines, while others are highly specific to disciplinary areas.
We cannot provide a comprehensive list here. Instead, we include examples in broad disciplinary areas, plus a “general” category. Where possible, we selected examples that appear to have broad adoption within or across disciplinary areas.
Metadata Standards
Links to a few representative metadata standards in disciplinary areas

| Disciplinary area | Metadata standard | Description |
|---|---|---|
| General | Dublin Core | Widely used in disciplinary and institutional repositories. |
| General | Disciplinary Metadata from the DCC | Searchable list of disciplinary metadata standards and related information. Includes biology, Earth science, physical science, social science & humanities and general research data. |
| General | Altova Schema library | A reference library to common (and uncommon) industry and cross-industry schemas. |
| Life Sciences | Darwin Core | Designed to facilitate the sharing of information about biological diversity. It is primarily based on taxa, their occurrence in nature as documented by observations, specimens, and samples and related information. |
| Life Sciences | Ecological Metadata Language (EML) | Maintained by the Ecological Society of America. Consists of XML modules that can be used to document ecological datasets. |
| Humanities | Seeing Standards: A Visualization of the Metadata Universe | Information on 105 cultural heritage metadata standards. |
| Humanities | Text Encoding Initiative | A widely used standard for representing textual materials in XML. |
| Social Sciences | DDI | A metadata specification for the social and behavioral sciences created by the Data Documentation Initiative. Used to document data through its lifecycle and to enhance dataset interoperability. |
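
As an illustration of how a standard fixes both which pieces of information are recorded and how they are expressed, here is a minimal sketch that writes a few Dublin Core elements as XML using Python's standard library. The namespace URI is the one used for the Dublin Core element set; the wrapper element name `metadata`, the helper name `build_dublin_core`, and all field values are assumptions chosen for the example, not a prescribed layout.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"  # Dublin Core element set namespace
ET.register_namespace("dc", DC_NS)


def build_dublin_core(fields):
    """Build a simple XML record from a dict of Dublin Core element names to values."""
    record = ET.Element("metadata")  # wrapper element name is an assumption
    for name, value in fields.items():
        child = ET.SubElement(record, f"{{{DC_NS}}}{name}")
        child.text = value
    return record


# Placeholder values, not a real dataset.
record = build_dublin_core({
    "title": "Example soil temperature measurements",
    "creator": "Doe, Jane",
    "date": "2020-01-15",
    "format": "text/csv",
})
xml_text = ET.tostring(record, encoding="unicode")
print(xml_text)
```

Repositories often expect a particular wrapper or profile around these elements, so it is worth checking the target repository's requirements before generating records this way.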


