Written by Heather Wacha

The stains found on medieval manuscripts immediately draw us to moments from a book’s past life, signalling the remains of human interactions over time. The Library of Stains project has set out to privilege the very manuscripts that are often overlooked due to heavy soiling and damage, and to use their stains, which have typically been undervalued, to learn more about their history and use. The project ran from August 2017 to September 2018, and set as its goals 1) to provide an online database that will allow scholars, librarians, and conservators to better analyze the materiality, provenance, use and preservation of manuscripts and early printed books, 2) to document and disseminate a methodological approach for analyzing stains, and 3) to provide a model for public-facing interdisciplinary collaboration. With the generous support of a microgrant from the Council on Library and Information Resources (CLIR), we were able to image and analyze stains from about 40 Western European manuscripts, ranging from the 12th to the 18th centuries, held in the University of Pennsylvania Libraries, the Science History Institute, the Library of Congress, the University of Wisconsin Special Collections, and the University of Iowa Special Collections. The project was led by a team of interdisciplinary postdoctoral scholars (Erin Connelly, Alberto Campagnolo, and Heather Wacha) and collaborators Michael Toth of R.B. Toth Associates and William Christens-Barry of Equipoise Imaging.

In recent years, a variety of imaging techniques have been used on cultural heritage materials since they provide, for the most part, non-invasive and non-destructive means for gathering data. Multispectral imaging in particular, a versatile photography-based imaging technique that is typically applied to documents for the recovery of difficult-to-read information, is often utilized to map different materials over the entire area covered in the photographs. While scholars have successfully used multispectral imaging to bring invisible text into the purview of the visible, no one has yet used this imaging technique to begin to characterize the stains found in medieval manuscripts.

The Data

Multispectral imaging works by photographing an object with an achromatic camera while lights illuminate it at one specific wavelength at a time, from near-infrared (IR), through visible light, to benign ultraviolet (UV) radiation. Each shot captures one image per illumination wavelength. The result is a stack of registered photographs, all aligned pixel for pixel, that are available for further analysis. Looking through the stack, one can see how different materials react differently to each wavelength, and notice details that are not visible in natural light but stand out clearly under UV or IR illumination.

For a deeper understanding of the data recorded and the variety of material responses to the different wavelengths, we processed the stack of images and analyzed the data with statistical algorithms capable of simplifying it and finding patterns in it. One type of output that proves particularly useful as an investigative tool for distinguishing different components (i.e., materials reacting in different ways under the different lights) is the result of Principal Component Analysis (PCA), which works by analyzing the light response of each pixel throughout the full stack of images. PCA is a statistical technique that decomposes a set of data along its main directions of variability, preserving as much of that variability as possible in fewer dimensions. From this, false-color images can be generated, in which each component is assigned an arbitrary color to help in discerning similar and dissimilar light responses. (See Figure 1 below.)

Figure 1. False-color PCA processing for Philadelphia, University of Pennsylvania, MS Codex 1058, f. 36v.
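
The PCA step described above can be sketched in a few lines of Python with NumPy. This is a minimal illustration on synthetic data, not the project's actual processing pipeline: the function name `pca_false_color`, the two made-up spectral responses, and the tiny five-band, 8-by-8-pixel stack are all assumptions chosen to keep the example small.

```python
import numpy as np


def pca_false_color(stack, n_components=3):
    """Project a (bands, height, width) image stack onto its top principal
    components and rescale each component to [0, 1] as a colour channel."""
    bands, height, width = stack.shape
    # Treat every pixel as one observation with `bands` measurements.
    pixels = stack.reshape(bands, -1).T.astype(np.float64)
    centered = pixels - pixels.mean(axis=0)
    # Eigen-decomposition of the band-by-band covariance matrix.
    cov = np.cov(centered, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)  # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:n_components]
    scores = centered @ eigvecs[:, order]
    # Stretch each component independently so it fills the display range.
    lo = scores.min(axis=0)
    span = scores.max(axis=0) - lo
    span[span == 0] = 1.0
    rgb = ((scores - lo) / span).reshape(height, width, n_components)
    explained = eigvals[order] / eigvals.sum()
    return rgb, explained


# Synthetic stack (assumption): left half "parchment", right half "stain",
# each with an invented response across five illumination bands.
rng = np.random.default_rng(0)
parchment = np.array([0.20, 0.40, 0.60, 0.70, 0.80])
stain = np.array([0.10, 0.15, 0.30, 0.60, 0.75])
stack = np.empty((5, 8, 8))
stack[:, :, :4] = parchment[:, None, None]
stack[:, :, 4:] = stain[:, None, None]
stack += rng.normal(0.0, 0.01, stack.shape)  # small sensor noise

rgb, explained = pca_false_color(stack)
print(rgb.shape, explained.round(3))
```

In this toy case the first component carries almost all of the variability and separates the two materials, so the first colour channel is close to 0 on one half of the image and close to 1 on the other. Which half ends up bright is arbitrary, since the sign of a principal component is not fixed.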

Working in a similar way to PCA, by looking at the spectral response of single pixels (or groups of pixels) across the full stack, it is also possible to plot spectral curves that are characteristic of the material (or groups of materials) present in the selected area. (See Figure 2 below.) These curves are particularly useful because their shape can be used to compare and discern various materials present in a document or collection, and they therefore allow the data to be analyzed by scientists and humanists alike, fostering communication and collaboration between different fields.

Figure 2. Spectral curves for stains on University of Wisconsin manuscript MS 255. f. Ats23v. We would like to acknowledge the work done on this manuscript by Leah Pope Parker.
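
A spectral curve of this kind can be sketched as the mean response of a selected region in each band of the stack. The example below again uses synthetic data, and every name in it is an assumption made for illustration (`spectral_curve`, `spectral_angle`, the invented responses and the region masks). The angle between two curves is one common way to compare their shapes; it is offered here as an illustration, not as the measure the project used.

```python
import numpy as np


def spectral_curve(stack, mask):
    """Mean response per band over the pixels selected by a boolean mask.
    stack has shape (bands, height, width); mask has shape (height, width)."""
    return stack[:, mask].mean(axis=1)


def spectral_angle(a, b):
    """Angle in radians between two curves; small values mean similar shapes."""
    cos = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))
    return float(np.arccos(np.clip(cos, -1.0, 1.0)))


# Synthetic stack (assumption): left half "parchment", right half "stain".
rng = np.random.default_rng(1)
parchment = np.array([0.20, 0.40, 0.60, 0.70, 0.80])
stain = np.array([0.10, 0.15, 0.30, 0.60, 0.75])
stack = np.empty((5, 8, 8))
stack[:, :, :4] = parchment[:, None, None]
stack[:, :, 4:] = stain[:, None, None]
stack += rng.normal(0.0, 0.01, stack.shape)

# Three hypothetical regions of interest: two on parchment, one on the stain.
region_a = np.zeros((8, 8), dtype=bool)
region_a[:4, :4] = True
region_b = np.zeros((8, 8), dtype=bool)
region_b[4:, :4] = True
region_stain = np.zeros((8, 8), dtype=bool)
region_stain[:, 4:] = True

curve_a = spectral_curve(stack, region_a)
curve_b = spectral_curve(stack, region_b)
curve_stain = spectral_curve(stack, region_stain)

print("parchment vs parchment:", spectral_angle(curve_a, curve_b))
print("parchment vs stain:    ", spectral_angle(curve_a, curve_stain))
```

Plotting each curve against the illumination wavelengths would give a chart in the spirit of Figure 2: regions of the same material trace nearly the same line, while a different material traces a visibly different one.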

Over the course of the project, we collected about 220 GB of data, which is now hosted by the University of Pennsylvania. The repository is archived under a single directory and organized so that every file is either core data or supporting material that helps both humans and machines understand and use the data. The data gathered from the University of Wisconsin and the University of Iowa manuscripts are visually displayed in Digital Mappa, and more information about the Library of Stains project is available on Zenodo.

If you want to learn more, visit the Library of Stains exhibit in the UW Special Collections, on the 9th floor of Memorial Library. The exhibit starts on February 20.