Sustainable Data Formats

March 28th, 2011

Definition

A sustainable digital format is one that is compatible, for the foreseeable future, with software needed to open and read it.

Relevance

To read most types of digital data, you need to open the file in compatible software. Unfortunately, as software applications change or disappear over time, data file formats can become obsolete. If there is a risk of your data format becoming obsolete during its useful lifetime, you may need to migrate the data to a new format. The resources needed to do this could be included as a budget item in your data plan.
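As a rough illustration of what a migration step can look like, the sketch below converts text from a hypothetical legacy layout (semicolon-separated values) into standard CSV using Python's built-in csv module. The function name migrate_to_csv, the semicolon delimiter, and the sample data are assumptions made for this example; a real migration depends on the formats involved.

```python
import csv
import io


def migrate_to_csv(legacy_text: str, legacy_delimiter: str = ";") -> str:
    """Convert delimiter-separated text into comma-separated CSV text.

    The legacy delimiter is an assumption for illustration only.
    """
    reader = csv.reader(io.StringIO(legacy_text), delimiter=legacy_delimiter)
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for row in reader:
        # The writer quotes any field that itself contains a comma,
        # so no information is lost in the conversion.
        writer.writerow(row)
    return out.getvalue()


# Hypothetical sample: the second reading uses a decimal comma.
legacy = "site;temp_c\nA;21.5\nB;19,0\n"
migrated = migrate_to_csv(legacy)
print(migrated)
```

Comparing the migrated file against the original, for example by counting rows and fields, is a simple way to confirm that nothing was dropped along the way.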

Recommendations

Wherever possible, select data formats that have the following sustainability attributes:

Sustainability attribute | Example
Adheres to specifications that are publicly documented (versus formats based on proprietary specifications) | TIFF (Tagged Image File Format) for images
Is in widespread use and readable with available software | HTML for hypertext; CSV (Comma-Separated Values) for tabular data
Is self-describing, i.e., contains embedded metadata that help interpret the context and structure of the data file | XML (Extensible Markup Language) files contain headers and tags describing the file's content
Contains as much of the original information as possible | Motion JPEG 2000, a “lossless” format for digital video
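To make the "self-describing" attribute concrete, here is a minimal sketch that writes a small dataset as XML with its descriptive metadata embedded alongside the values, then reads the metadata back. It uses Python's standard xml.etree.ElementTree module. The element names (dataset, metadata, title, units, readings, reading) and the sample values are assumptions chosen for illustration, not part of any published schema.

```python
import xml.etree.ElementTree as ET


def build_dataset_xml(title, units, readings):
    """Return XML text that carries both the data and its description."""
    root = ET.Element("dataset")

    # Embedded metadata: travels with the file, so a future reader
    # can interpret the values without outside documentation.
    meta = ET.SubElement(root, "metadata")
    ET.SubElement(meta, "title").text = title
    ET.SubElement(meta, "units").text = units

    data = ET.SubElement(root, "readings")
    for site, value in readings:
        reading = ET.SubElement(data, "reading", site=site)
        reading.text = str(value)

    return ET.tostring(root, encoding="unicode")


# Hypothetical sample data.
xml_text = build_dataset_xml(
    "Air temperature", "degrees Celsius", [("A", 21.5), ("B", 19.0)]
)

# Any XML parser can recover the metadata from the file itself.
parsed = ET.fromstring(xml_text)
units = parsed.findtext("metadata/units")
print(units)
```

Because the units and title are stored inside the file, they cannot be separated from the data the way a standalone readme or spreadsheet note can.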

Resources

Resource | Source
FAQs about selecting sustainable formats | National Archives
FAQs about digital audio and video formats | National Archives
Sustainability of Digital Formats | Library of Congress