Documenting DH: Eric Hoyt - Research Data Services

Written by Heather Wacha

Documenting DH is a project from the Digital Humanities Research Network (DHRN). It consists of a series of audio interviews with humanities scholars and students around the University of Wisconsin-Madison campus. Each interviewee is given a chance to talk about how they view data, work with data, manage data, or teach data to others. Most recently, we interviewed Eric Hoyt, Associate Professor in Media and Cultural Studies, who talked about his extensive participation in digital humanities. That work started while he was a doctoral student and continues at the University of Wisconsin-Madison, from the Digital Media Library to his most recent project, PodcastRE, supported by NEH and UW2020 grants.

How do you manage the data you work with?

One of Hoyt’s recent projects is called PodcastRE.org, which aims to capture, organize and archive millions of podcasts. The project developed out of an idea from his colleague Jeremy Morris, Associate Professor in the Media and Cultural Studies Department. In talking about podcasts in general, Hoyt and Morris realized that they “needed to capture and systematically save the large volume of podcasts that are being produced and distributed today.” Indeed, if we think about the early history of film, 90% of silent films are now gone and most recordings of early radio and TV have been lost. The early days, months and years can be a vulnerable time for new and innovative media. Hoyt and his colleagues, Morris, Peter Sengstock, and Samuel Hansen, are trying to stay ahead of the game by gathering and archiving many, many, many podcasts.

The goal of the project is to build a database for archiving podcasts, and the team has already collected close to a quarter million different podcasts. As you can imagine, that involves a lot of data: RSS feeds from the podcasts, downloaded copies of the audio files, and several kinds of metadata. Hoyt notes that it gets complicated when the metadata that describes the audio files differs from the metadata encoded within the mp3s, which sometimes differs from the XML metadata, which in turn sometimes differs from the metadata provided on the podcast’s website. Hoyt finds it interesting that people are sometimes in a hurry and don’t always enter their metadata in a thoughtful, consistent way. This presents a challenge for the PodcastRE project team, since it is harder to build a streamlined search function on inconsistent records, but the inconsistencies can also reveal the different ways that people choose to describe their work, and why.
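The kind of mismatch Hoyt describes can be illustrated with a small sketch. This is not PodcastRE's code: the feed snippet, the embedded tag values, and the function names below are all hypothetical, made up only to show how a title in an RSS feed might be compared against the title stored inside the audio file.

```python
import xml.etree.ElementTree as ET

# Hypothetical RSS snippet; real feeds carry many more fields.
FEED_XML = """<rss version="2.0">
  <channel>
    <title>Example Show</title>
    <item>
      <title>Episode 1: Pilot</title>
      <enclosure url="https://example.org/ep1.mp3" type="audio/mpeg" length="123"/>
    </item>
  </channel>
</rss>"""

# Hypothetical tags, standing in for what might be embedded in the mp3 itself.
EMBEDDED_TAGS = {"title": "ep1 pilot", "show": "Example Show"}


def normalize(value):
    """Lowercase and collapse whitespace so trivial differences are ignored."""
    return " ".join(value.lower().split())


def feed_metadata(feed_xml):
    """Pull the show title and the first episode title out of the feed XML."""
    channel = ET.fromstring(feed_xml).find("channel")
    item = channel.find("item")
    return {
        "title": item.findtext("title", default=""),
        "show": channel.findtext("title", default=""),
    }


def find_mismatches(feed, embedded):
    """Return the fields whose values disagree between the two sources."""
    mismatches = {}
    for field in sorted(set(feed) | set(embedded)):
        from_feed = feed.get(field, "")
        from_file = embedded.get(field, "")
        if normalize(from_feed) != normalize(from_file):
            mismatches[field] = (from_feed, from_file)
    return mismatches


mismatches = find_mismatches(feed_metadata(FEED_XML), EMBEDDED_TAGS)
print(mismatches)
```

Here the show name agrees once case and spacing are ignored, while the episode title does not. A list of such disagreements is one simple way to see where creators describe the same episode differently.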

What excites you about the data you’ve been working with recently?

What Hoyt finds most exciting about his data is the underlying material it comes from. He is an avid Hollywood fan and loves the objects he explores: Hollywood trade papers, each with its own perspectives and stories about the films being produced. Many scholars focus on only a handful of these magazines, but there are many more that Hoyt collects in his Digital Media Library, and each has its own character. He loves seeing the images, the metadata, and the derivatives with their OCR text all stored as data.

He feels the same about the podcast project. He’s a big fan of listening to podcasts and loves that they have a low barrier to entry. In fact, you can even have someone at the University of Wisconsin doing podcasts about Digital Humanities on campus! The genre allows for a diversity of voices and a number of viewpoints to be heard.

What advice can you give humanists wanting to manage their data effectively?

Get your hands dirty! Hoyt warns that there’s rarely anyone waiting in the wings willing to figure out what you need to do, so turn yourself into that person. Just dig in, mess around with your data, and figure out what you need to learn and what you need to be able to do. Sometimes this may mean that you are not always following best practices. If that is the case, then that’s OK.

Hoyt also recommends thinking about your audience. Who will be YOUR audience and how can you make your project the most useful and engaging for them?

For Hoyt, DH is an umbrella that can mean any number of things. His interview attests to the variety of DH projects he’s been involved in, as well as the diverse data he encounters, along with all their challenges and intrigue. If you are interested in hearing more, you can go to Eric Hoyt’s interview on the DHRN webpage, where Hoyt talks about all his DH projects.

Research Data Services (RDS) is an interdisciplinary organization committed to advancing research data management practice on the UW-Madison campus. We focus on providing researchers with the tools and resources that support their efforts to store, analyze and share data.