Written by Laura Schmidt
Documenting DH is a project from the Digital Humanities Research Network (DHRN). It consists of a series of audio interviews with various humanities scholars and students around the University of Wisconsin-Madison campus. Each interviewee is given a chance to talk about how they view data, work with data, manage data, or teach data to others. Most recently, we interviewed Reginold Royston, an assistant professor at the University of Wisconsin’s iSchool. His research focuses on civic innovation, online education, and media in Africa and in underserved communities in the United States. His interview is now accessible on the DHRN website.
How do you teach data management to your students?
Reginold’s focus has mainly been on digital literacy, because he feels that content production with new media tools is the first place to start when it comes to producing and interpreting academically relevant media content. He doesn’t believe that his pedagogy around data management has been particularly strong, but he strongly believes in metadata and a basic archival directory system. He believes that students should first try to learn digital tools and data management on their own. He gradually works them through best practices as their projects progress, but trying and failing is a key part of his pedagogy.
Now that Reginold is at the iSchool, where archives and library science are a major focus, he understands that topic modeling, data management, and metadata are extremely important, as is learning how to use data that students have collected or produced. Students in the social sciences and humanities are often stewards of their own information, so they need to have a sense of how to organize their data in the best way.
How do you manage the diverse data you work with?
Reginold’s data ranges from in-person audio and video recordings at live events to basic academic research materials, like writings, historical archives, and bibliographies. His biggest focus is on basic naming conventions. For example, when he uses screen-capturing software, he names each file by the site he captured (TW for Twitter, FB for Facebook, etc.) and the date. His media is organized in a basic directory structure, paying particular attention to format. Reginold believes that this is essential, especially if you want to use your computer’s search feature, which saves a huge amount of time.
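A naming convention like the one described above can be sketched in a few lines of Python. This is only an illustration, not Reginold’s actual setup: the interview mentions the TW and FB prefixes and the date, while the function name `capture_filename`, the underscore separator, the ISO date format, the sequence number, and the `.png` extension are all assumptions made for the example.

```python
from datetime import date

# Site prefixes. TW and FB come from the interview; any further
# entries would be the researcher's own choice.
SITE_PREFIXES = {"twitter": "TW", "facebook": "FB"}


def capture_filename(site: str, captured_on: date,
                     sequence: int = 1, extension: str = "png") -> str:
    """Build a file name from a site prefix and a capture date.

    Hypothetical helper: the separator, date format, sequence number,
    and default extension are assumptions, not details from the interview.
    """
    prefix = SITE_PREFIXES[site.lower()]  # raises KeyError for unknown sites
    return f"{prefix}_{captured_on.isoformat()}_{sequence:02d}.{extension}"


# Example: a Twitter screen capture taken on 12 June 2014
print(capture_filename("Twitter", date(2014, 6, 12)))
```

Writing the date as year-month-day means that files sharing a prefix sort chronologically in an ordinary directory listing, which fits the emphasis on making a computer’s search feature useful.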
When it comes to rich media, particularly metadata, tagging, and keywords, Evernote is an essential tool in his research arsenal. He captures dozens of screen grabs every day, and he tags and groups them at least once a month. Reginold understands that every researcher, academic, and social scientist has their own particular way of doing research and cataloging their data, and he thinks that people should use what works best for them.
What do you find most interesting and exciting about working with data?
Reginold’s enthusiasm for this question was not lost on us. He thinks the search tools we have allow us to organize our information in ways that are useful and quickly available. For example, a project like Slave Voyages, a huge archive of nautical maps, inventories of names, and some oral histories of the Transatlantic Slave Trade, lets you search across archives, databases, and locations. By contrast, a researcher in a physical library would have to rely on their own notetaking sensibility, their personal conventions, and their memory. Better questions can be asked with more distributed databases and better tagged data.
Reginold says it best in his own words: “One of the interesting things I find about the use of data and sometimes the reliance on database or computational methods of investigation, are the limits of database approaches to the understanding knowledge.” He uses the example of examining tweets around the 2014 World Cup. He collected a week’s worth of tweets that focused on about thirty keywords and examined the tweets of sixty individuals. This created a dataset of six million tweets, which he said “was relatively small for Twitter grabs.” His goal was to examine the relationship between politics and football, but he couldn’t derive anything from the data using computational or observational methods. This made him understand that “there are limits to the ways we think about traditional database questions and how to ask those computationally. I didn’t find anything statistically relevant about the four or five questions that I asked, but it’s incumbent upon me to go back and to look at that data and to figure out what was statistically relevant.”