Documenting DH: Bronwen Masemann and LIS 640, Digital Humanities Analytics

Written by Laura Schmidt

Documenting DH is a project of the Digital Humanities Research Network (DHRN) that consists of a series of audio interviews with humanities scholars and students around the University of Wisconsin-Madison campus. Each interviewee is given a chance to talk about how they view data, how they manage it, and how they teach it to others. In February, we interviewed Bronwen Masemann, an instructor in the School of Library and Information Studies; her interview will be accessible to the public on the DHRN website starting February 24, 2017. She is currently teaching LIS 640: Digital Humanities Analytics, which provides upper-level and graduate students the “technology skills to analyze and plan data-driven projects in the humanities, social sciences and other fields.”

How do you define data to the students in your class?

Bronwen starts her classes by discussing the nature of text as data. She asks whether or not words are data, how we can count or group words, and how we can visualize patterns in words. Even discussing whether the spaces in a text can be considered data helps students understand the multiple interpretations of texts, from the printed word to the digitized word to visual representations of words. Many of her students are unfamiliar with this type of research, so Bronwen doesn’t shy away from even the most basic questions.
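
To make the idea of counting and grouping words concrete, here is a minimal Python sketch. The sample sentence and the variable names are invented for illustration and do not come from Bronwen's class materials; the sketch only shows one of many ways to treat words, and even spaces, as countable data.

```python
import re
from collections import Counter, defaultdict

# Invented sample text, used only for illustration.
text = "The quick brown fox jumps over the lazy dog. The dog sleeps."

# Treat every lowercased word as one data point.
words = re.findall(r"[a-z']+", text.lower())

# Count how often each word occurs.
word_counts = Counter(words)

# Spaces can be counted as data too.
space_count = text.count(" ")

# Group the distinct words by their length.
words_by_length = defaultdict(set)
for word in words:
    words_by_length[len(word)].add(word)

print(word_counts.most_common(2))
print(space_count)
print(sorted(words_by_length[3]))
```

Counts like these are the raw material for the visualizations of word patterns that the class goes on to discuss.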

Bronwen will even ask students to consider metadata as data. For example, “if you have a data set that contains titles and publishers of all of the biographies published in England between 1910 and 1930, that is metadata for a set of textual data, the textual data being all of the words in those books. We can mine those words for trends and to examine research questions, but we can also visualize metadata to start to ask questions about trends within publishing between 1910 and 1930.”

Are there particular assignments where you teach or emphasize data management?

The digital humanities class has a large number of hands-on activities, one of which is a data cleaning tutorial using OpenRefine, an open-source tool that helps fix mistakes in data sets and identify patterns within them. This semester, Bronwen is using a tutorial called Grateful Data, which is built around a data set related to Grateful Dead songs! For the final projects, she is asking students to “use the methods and concepts that they have learned in the class to attack a research question that they already feel knowledgeable about.” Throughout the class, Bronwen teaches methods for proper documentation and preservation of data sets, and she is excited to see students use the skills that they have learned.
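
For readers curious about what this kind of cleaning looks like, the Python sketch below loosely imitates the idea behind OpenRefine's "fingerprint" key-collision clustering: values that reduce to the same normalized key are grouped as likely variants of one another. It is a simplified approximation, not OpenRefine's actual implementation, and the song-title variants are invented examples rather than rows from the Grateful Data tutorial.

```python
import string
from collections import defaultdict

def fingerprint(value: str) -> str:
    """Reduce a value to a normalized key (simplified approximation)."""
    cleaned = value.strip().lower()
    # Drop ASCII punctuation so "Truckin'" and "Truckin" share a key.
    cleaned = cleaned.translate(str.maketrans("", "", string.punctuation))
    # Sort and de-duplicate tokens so word order does not matter.
    tokens = sorted(set(cleaned.split()))
    return " ".join(tokens)

def cluster(values):
    """Group values by key and keep only groups with more than one spelling."""
    groups = defaultdict(list)
    for value in values:
        groups[fingerprint(value)].append(value)
    return {key: vals for key, vals in groups.items() if len(set(vals)) > 1}

# Invented, messy variants of song titles (assumption, for illustration only).
titles = ["Dark Star", "dark star ", "Dark Star.", "Truckin'", "Truckin", "Ripple"]
clusters = cluster(titles)

for key, variants in clusters.items():
    print(key, "<-", variants)
```

Each resulting cluster is a candidate for merging into a single corrected value, which is a decision OpenRefine leaves to the person doing the cleaning.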

Do your students find that there is a particular kind of data that is more challenging than others?

Thinking about words as data is pretty hard, but XML-encoded data definitely wins as most challenging. Moving from raw textual data to encoded textual data is difficult, even for students who are used to creating metadata records for objects or items in a collection. They have difficulty thinking about how they would encode specific words or paragraphs and are unsure how to break things into chunks. It is out of their comfort zone, but that’s why they consider it fun! Students work on the whole process, from scanning a page to performing optical character recognition on the page to encoding the page using TEI.
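
As a rough illustration of the last step, the Python sketch below takes plain text of the kind OCR might produce, breaks it into paragraph-sized chunks, and wraps each chunk in a TEI-style p element. The function name and the sample text are hypothetical, the paragraph-splitting rule (blank lines) is an assumption, and the output is only a fragment: a valid TEI document would also need a teiHeader and other required structure.

```python
import xml.etree.ElementTree as ET

# The TEI namespace.
TEI_NS = "http://www.tei-c.org/ns/1.0"

def encode_paragraphs(raw_text: str) -> ET.Element:
    """Wrap each blank-line-separated chunk of text in a TEI-style <p>."""
    ET.register_namespace("", TEI_NS)
    text_el = ET.Element(f"{{{TEI_NS}}}text")
    body = ET.SubElement(text_el, f"{{{TEI_NS}}}body")
    for chunk in raw_text.split("\n\n"):
        # Collapse line breaks and repeated spaces left over from the page layout.
        chunk = " ".join(chunk.split())
        if chunk:
            paragraph = ET.SubElement(body, f"{{{TEI_NS}}}p")
            paragraph.text = chunk
    return text_el

# Invented stand-in for OCR output (assumption, for illustration only).
raw_page = "First paragraph of the scanned\npage.\n\nSecond paragraph."
tei_text = encode_paragraphs(raw_page)
print(ET.tostring(tei_text, encoding="unicode"))
```

Deciding where one chunk ends and the next begins, which this sketch reduces to blank lines, is exactly the judgment call that students find hard.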

What do you find most interesting or exciting about DH-centric teaching?

Students who don’t feel confident about their technology skills quickly acquire a high level of skill in Bronwen’s class, particularly because they are motivated by the subject matter. “A lot of my students have a deep interest in literature or history or other aspects of the humanities and once they realize they can use these tools to explore questions of interest to them, they just sort of take off and that’s really exciting,” according to Bronwen. She is most excited, however, when her students start teaching themselves the tools she doesn’t have time to cover in depth. A semester is a short amount of time to discuss everything the field of digital humanities has to offer, so sparking an interest in her students is the highlight of her teaching.