New RDS Marketing Materials

New RDS Flyer.

In February, we teased new materials for RDS and discussed the impetus to change our images. Now we’re rolling them out!

Above you can find our new main flyer, which you may have already seen at last week’s Showcase, as one of the screen savers on a campus library computer, or on our Twitter page @UWMadRschSvcs.

We hope you enjoy our new materials and if you have any questions or concerns, please feel free to reach out to us!

Link Roundup March 2017

In this series, members of the RDS team share links to research data related stories, resources, and news that caught their eye each month. Feel free to share your favorite stories with us on Twitter @UWMadRschSvcs!

Cameron Cook

If you aren’t sure how to best document your data for reuse, Mozilla Science has a checklist just for you!

The Economist shared a beautiful visualization of the flowering dates of the sakura in Kyoto, Japan from 800 AD to today. This visualization was made by Yasuyuki Aono of Osaka Prefecture University, who shares their awesome data here.

This preprint analyzes and discusses the data sharing policies of biomedical journals.

Documenting DH: Brianna Marshall

Written by Heather Wacha

Documenting DH is a project from the Digital Humanities Research Network (DHRN). It consists of a series of audio interviews with various humanities scholars and students around the University of Wisconsin-Madison campus. Each interviewee is given a chance to talk about how they view data, work with data, manage data, or teach data to others. In March, we interviewed Brianna Marshall, Digital Curation Coordinator and lead of Research Data Services (RDS) at the University of Wisconsin-Madison. As part of this position, she promotes and supports digital humanities scholarship across the university. DHRN feels very fortunate to be able to record Brianna’s experiences and insights before she leaves the University of Wisconsin at the end of March to enter into a new position as Director of Research Services at the University of California, Riverside. Her interview is now accessible on the DHRN website.

How do you define data?

When talking about data, Brianna likes to keep it simple: Data is the “digital stuff” humanist researchers use to do their research. Data comes in all shapes and sizes, and Marshall encounters a wide diversity of data formats since she works with both STEM and humanist researchers. Almost anything that serves as a researcher’s methods, results, or evidence can be seen as data.

According to Marshall, there are two important fallacies that can create unnecessary barriers, especially for humanist scholars.  First, researchers may not always see their work as containing “data,” but as Marshall likes to point out, every researcher has “data,” whether it’s digital or not.  Second, with all the talk about “big data”, researchers who deal with small “data” may not see what they do as applicable to digital computation or analysis. As lead for RDS, Marshall sees her job as helping the university community understand that everyone works with data at some stage and that any data is open to using digital tools to help manage and analyze it. 

What do you recommend for humanists to manage their data?

“The first thing is to recognize that one has data,” states Marshall. “I love those lightbulb moments when I’m talking to someone about their digital stuff, and they realize that this digital stuff is really data that can be used for further digital analysis and inquiry.”

At that moment, Marshall uses her knowledge to talk about the variety of ways to work with data to make it more interesting and useful for asking new questions.

Marshall is an expert in connecting researchers with organizations and individuals who can continue the conversation, with a view to deepening scholarship in new digital directions.

What do you find most interesting about working with data?

“It’s vexing! It’s messy! It’s frustrating.” For Marshall, working with data is a two-edged activity. At the initial stages it’s important to realize that it’s going to be a challenge and there will be a lot of trial and error. Data can be overwhelming and isolating, but it’s everywhere, both in our personal lives and in our professional lives.

But when we can share best practices, that makes the job more interesting and more rewarding. Marshall enjoys using her toolkit of best practices and adapting them to individual projects and individual challenges.  She mixes and remixes her tools to make them most effective for the context with which she is presented.

When the product is finished, the goals achieved, and the challenges overcome, the satisfaction feels that much greater.

How do you see RDS fitting into the wider Digital Humanities community?

Marshall freely admits that data can be a hard sell and that outreach is an important component of working in any organization that offers services for dealing with data. She is an avid advocate for digital humanities and for helping the university community create and use its data in efficient and interesting ways.

Marshall sees RDS as a trusted partner on campus, and she sees her position as the head of RDS as one that connects people and organizations. One of the main functions of RDS, and of any organization on campus that wants to promote and encourage best practices in data management and analysis, is to be a connector. RDS offers a variety of services for anyone who wants to rethink the data they have – how to manage it and what to do with it to uncover new approaches and analyses.


Tool: Identity Finder

Information from DoIT’s recent news and DoIT’s Identity Finder information.

What is Identity Finder?

This tool was featured recently by DoIT, but we wanted to cover the information again as it’s a great tool for those interested in ensuring security on their local machines. Identity Finder is a software tool that can help find personally identifiable or sensitive information on your local machine. Finding restricted information allows you to take steps to ensure protection or encryption of that information.

Open Science Grid (OSG) User School 2017

Information from the OSG User School page.

The Open Science Grid (OSG) User School is now accepting applications for the 2017 session!

The school runs July 17-21 at the University of Wisconsin-Madison campus. Participants “… will learn to use high-throughput computing (HTC) systems — at your own campus or using the national Open Science Grid (OSG) — to run large-scale computing applications that are at the heart of today’s cutting-edge science. Through lectures, discussions, and lots of hands-on activities with experienced OSG staff, you will learn how HTC systems work, how to run and manage lots of jobs and huge datasets to implement a scientific computing workflow, and where to turn for more information and help. Take a look at the high-level curriculum and syllabus for more details.”

The application deadline is the end of the day on Friday, April 7, 2017. The school is geared toward graduate students, but will consider applications from advanced undergraduates, post-doctoral students, faculty, and staff. Accepted students will receive financial support for the cost of basic travel and local costs.

You can learn more about the school, see application details, and find contact information on the OSG User School page.

February 2017 Brown Bag: Pierce Edmiston

The Rebecca J. Holz series in Research Data Management is a monthly lecture series hosted during the spring and fall academic semesters. Research Data Services invites speakers from a variety of disciplines to talk about their research or involvement with data.

On February 15th Pierce Edmiston, a PhD candidate in the Department of Psychology at UW-Madison, gave a talk entitled “Adopting open source practices for better science”. Details on viewing the slides will be listed below following the description of the brown bag talk.

Edmiston’s talk was framed around a discussion of reproducibility and the ways that bias creeps into the research process, and he then suggested practices to help prevent these issues. The introduction to his talk covered his personal motivations for adopting reproducible practices, gave an overview of the literature surrounding reproducibility and the reproducibility crisis in psychological science, and then covered why he thinks open source could be the answer. Edmiston then offered three suggestions of open source practices to adopt – including version control, dynamic documents, and what he calls “building from source”. The talk finished with examples of these practices from his own work and further literature on openness in practice and culture. Edmiston has also provided a list of linked references for all of the literature mentioned in his presentation.


Rescuing Unloved Data – Love Your Data Week 2017

Information from Love Your Data Week.

Message of the day

“Data that is mobile, visible and well-loved stands a better chance of surviving” ~ Kurt Bollacker

Things to consider

Legacy, heritage and at-risk data share one common theme: barrier to access. Data that has been recorded by hand (field notes, lab notebooks, handwritten transcripts, measurements or ledgers), on outdated technology, or in proprietary formats is at risk.

Securing legacy data takes time, resources and expertise but is well worth the effort as old data can enable new research and the loss of data could impede future research. So how to approach reviving legacy or at-risk data?

How do you eat an elephant? One bite at a time.

  1. Recover and inventory the data
    • Format, type
    • Accompanying material–codebooks, notes, marginalia
  2. Organize the data
    • Depending on discipline/subject: date, variable, content/subject
  3. Assess the data
    • Are there any gaps or missing information?
    • Triage–consider nature of data along with ease of recovery
  4. Describe the data
    • Assign metadata at the collection/file level
  5. Digitize/normalize the data
    • Digitization is not preservation. Choose a file format that will retain its functionality (and accessibility!) over time: “Which file formats should I use?”
  6. Review
    • Confirm there are no gaps or indicate where gaps exist
  7. Deposit and disseminate
    • Make the data open and available for re-use
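
For data that is already in digital form, the first step above (recover and inventory) could be started with a short script. What follows is only a minimal sketch, not a tool recommended by Love Your Data Week: the function names `inventory` and `count_formats` are made up for this illustration, and it assumes that a file's extension is a good enough stand-in for its format.

```python
from collections import Counter
from pathlib import Path


def inventory(root):
    """Return one record per file under root: relative path, format, size.

    Assumption: the file extension stands in for the format. Files with
    no extension are recorded as "(none)".
    """
    root = Path(root)
    records = []
    for path in sorted(root.rglob("*")):
        if path.is_file():
            records.append({
                "path": path.relative_to(root).as_posix(),
                "format": path.suffix.lower() or "(none)",
                "bytes": path.stat().st_size,
            })
    return records


def count_formats(records):
    """Tally how many files exist in each format."""
    return Counter(record["format"] for record in records)
```

Calling `count_formats(inventory("old_project_files"))` on a folder of your own (the folder name here is hypothetical) gives a quick count of files per format, which may help with the later triage and format-choice steps.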