DH Tools Part 1: Off-the-Shelf

You don’t have to learn an entirely new programming language to do cutting-edge digital humanities work. There are many sophisticated, useful off-the-shelf tools that you can use for your research. Many require nothing more than a web browser and can produce thoughtful, well-designed, and interactive research outputs. If your research requires some coding know-how, read our post on tutorials and resources for acquiring programming skills.

As always, be sure to read the Terms of Service and Privacy Policy for any tool you use. Seek to understand how your data will be stored and shared, and whether your final output will be made public. Part of good data management is understanding how your tools handle your data and making responsible choices about which tools you select.


DH Tools Part 2: Moving Computationally

For scholars in the humanities, digging into computational approaches, tools, and methods can open new possibilities for exploration and building more tailored outputs. Below, we’ve collected a few trusted resources that can help get you started. This post is dedicated to programming tools that can help automate tasks, analyze data, or create projects. If you don’t have the time to invest in learning a programming language, follow this link to read our post on “off the shelf” DH tools.
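To give a sense of what a few lines of code can do, here is a minimal sketch of a common first task in computational text analysis: counting word frequencies. It uses only the Python standard library, and the sample text is made up purely for illustration.

```python
import re
from collections import Counter

# A toy example of the kind of task a short script can automate:
# counting how often each word appears in a text. The sample text
# below is invented for demonstration purposes.
text = """Digital humanities work often begins with counting.
Counting words, counting pages, counting patterns."""

# Lowercase the text and pull out word-like tokens.
words = re.findall(r"[a-z']+", text.lower())

# Counter tallies the frequency of each token.
freq = Counter(words)

print(freq.most_common(1))  # [('counting', 4)]
```

The same pattern scales from a toy snippet to a corpus of thousands of documents, which is where learning a little programming starts to pay off.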


Research Data Tips for Leaving UW-Madison


If your time as a researcher or student at UW-Madison is coming to an end, good luck with your new opportunities! As you make the shift, it’s important to begin the process of off-boarding – taking all the necessary steps to ensure a seamless transition when formally separating from the university.

This is especially important when it comes to your research data. Off-boarding requires a careful assessment of all the data, accounts, and tools you have used while at UW-Madison and an understanding of policies on transitioning your research data to your collaborators, departments, or new institutions. 

To help, we have put together this brief guide. But remember, many labs, departments, and colleges have their own off-boarding procedures, so it’s best to inquire there for more specific guidance. UW-Madison has also gathered some role-specific resources to get started.


ResearchDrive Now Available

The UW–Madison Office of the Vice Chancellor for Research and Graduate Education (VCRGE) and the Division of Information Technology (DoIT) are excited to announce ResearchDrive, a secure, shareable data storage solution for faculty principal investigators (PIs), permanent PIs, and their research group members. The new service is the first phase of a Research Cyberinfrastructure strategic initiative, a collaborative effort among the VCRGE, DoIT, the Research Technology Advisory Group (RTAG), the Libraries, and campus research computing centers to support the growing data and computing needs of researchers.

The university provides each PI with 5 terabytes (TB) of storage at no cost, with additional storage available at $200/TB/year, including support, training, and onboarding for researchers. The quota per PI ensures that ResearchDrive is a predictable resource that can be leveraged for faculty recruitment and included in data management plans and grant proposals. ResearchDrive is suited to a variety of research purposes, including storing research data and files, holding the inputs and outputs of research computing, and archiving data. It is a secure and permanent place to store data and includes security and data protection features based on the NIST Cybersecurity Framework, such as encryption, snapshots, off-site replication, ransomware protection, and monitoring by the Cybersecurity Operations Center (CSOC).


An Introduction to Web Scraping for Research

Like web archiving, web scraping is a process by which you can collect data from websites and save it for further research or preserve it over time. Also like web archiving, web scraping can be done through manual selection or it can involve the automated crawling of web pages using pre-programmed scraping applications.

Unlike web archiving, which is designed to preserve the look and feel of websites, web scraping is mostly used for gathering textual data. Most web scraping tools also allow you to structure the data as you collect it. So, instead of massive unstructured text files, you can transform your scraped data into spreadsheet, CSV, or database formats that allow you to analyze and use it in your research.
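The collect-then-structure pattern described above can be sketched in a few lines of Python. This example uses only the standard library and parses a made-up HTML snippet; in real projects you would typically fetch live pages (for example, with a library such as requests) and use a dedicated scraping tool, but the shape of the work is the same: extract the text you care about, then write it out as structured rows.

```python
import csv
import io
from html.parser import HTMLParser

class TitleScraper(HTMLParser):
    """Collects the text of every <h2> heading on a page."""
    def __init__(self):
        super().__init__()
        self.in_heading = False
        self.titles = []

    def handle_starttag(self, tag, attrs):
        if tag == "h2":
            self.in_heading = True

    def handle_endtag(self, tag):
        if tag == "h2":
            self.in_heading = False

    def handle_data(self, data):
        if self.in_heading:
            self.titles.append(data.strip())

# Sample HTML standing in for a fetched web page.
sample_page = """
<html><body>
  <h2>First Post</h2><p>Body text.</p>
  <h2>Second Post</h2><p>More text.</p>
</body></html>
"""

scraper = TitleScraper()
scraper.feed(sample_page)

# Structure the scraped text as CSV rows rather than raw text.
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(["title"])
for title in scraper.titles:
    writer.writerow([title])

print(scraper.titles)  # ['First Post', 'Second Post']
```

From here, the CSV output can be opened in a spreadsheet or loaded into a database for analysis.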

There are many applications for web scraping. Companies use it for market and pricing research, weather services use it to track weather information, and real estate companies harvest data on properties. Researchers also use web scraping: to study web forums and social media platforms such as Twitter and Facebook, to gather large collections of data or documents published on the web, and to monitor changes to web pages over time. If you are interested in identifying, collecting, and preserving textual data that exists online, there is almost certainly a scraping tool that can fit your research needs.

Please be advised that if you are collecting data from web pages, forums, social media, or other web materials for research purposes and it may constitute human subjects research, you must consult with and follow the appropriate UW-Madison Institutional Review Board process, as well as their guidelines on “Technology & New Media Research”.

Link Roundup November 2019


Jennifer Patiño

World Digital Preservation Day is November 7th and this year’s theme is “At-Risk Digital Materials.”

Researchers at the University of Hawaiʻi at Mānoa uncovered a glitch in a computer program that produced different results depending on the operating system, possibly affecting more than 100 published studies. A good reminder to make sure you have a detailed README file for any code you create!

Wired reports on a study in Science that revealed racial bias in a widely used algorithm that assigned lower levels of care to Black patients in U.S. hospitals. The study shows how, by focusing on healthcare costs, the algorithm replicated disparities in access, and it offers suggestions for reformulating the algorithm.

Kent Emerson

Researchers at UW-Madison’s Wisconsin Institute for Discovery, working on a project called Wisconsin Expansion of Renewable Electricity with Optimization under Long-term Forecasts (WEREWOLF), are producing mathematical models that will help policy makers make decisions about the future of Wisconsin’s renewable energy resources.

The Roy Rosenzweig Center for History and New Media at George Mason University is celebrating its 25th anniversary. During this time, the RRCHNM has produced some of the most widely used open source digital resources including Omeka, Zotero, and Tropy as well as discrete art and art history projects.