Tool: Identity Finder

Information from DoIT’s recent news and DoIT’s Identity Finder information.

What is Identity Finder?

This tool was featured recently by DoIT, but we wanted to cover the information again as it’s a great tool for those interested in ensuring security on their local machines. Identity Finder is a software tool that can help find personally identifiable or sensitive information on your local machine. Finding restricted information allows you to take steps to ensure protection or encryption of that information. (more…)

Open Science Grid (OSG) User School 2017

Information from the OSG User School page.

The Open Science Grid (OSG) User School is now accepting applications for the 2017 session!

The school runs July 17-21 at the University of Wisconsin-Madison campus. Participants “… will learn to use high-throughput computing (HTC) systems — at your own campus or using the national Open Science Grid (OSG) — to run large-scale computing applications that are at the heart of today’s cutting-edge science. Through lectures, discussions, and lots of hands-on activities with experienced OSG staff, you will learn how HTC systems work, how to run and manage lots of jobs and huge datasets to implement a scientific computing workflow, and where to turn for more information and help. Take a look at the high-level curriculum and syllabus for more details.”

The application deadline is the end of the day on Friday, April 7, 2017. The school is geared toward graduate students, but applications from advanced undergraduates, post-doctoral researchers, faculty, and staff will also be considered. Accepted students will receive financial support for basic travel and local costs.

You can learn more about the school and find application details and contact information on the OSG User School page.

February 2017 Brown Bag: Pierce Edmiston

The Rebecca J. Holz series in Research Data Management is a monthly lecture series hosted during the spring and fall academic semesters. Research Data Services invites speakers from a variety of disciplines to talk about their research or involvement with data.

On February 15th Pierce Edmiston, a PhD candidate in the Department of Psychology at UW-Madison, gave a talk entitled “Adopting open source practices for better science”. Details on viewing the slides will be listed below following the description of the brown bag talk.

Edmiston’s talk was framed around a discussion of reproducibility and the ways that bias creeps into the research process, along with suggestions on practices to help prevent these issues. The introduction to his talk covered his personal motivations for adopting reproducible practices, gave an overview of the literature surrounding reproducibility and the reproducibility crisis in psychological science, and explained why he thinks open source could be the answer. Edmiston then offered three open source practices to adopt: version control, dynamic documents, and what he calls “building from source”. The talk finished with examples of these practices from his own work and further literature on openness in practice and culture. Edmiston has also provided a list of linked references for all of the literature mentioned in his presentation.


Rescuing Unloved Data – Love Your Data Week 2017

Information from Love Your Data Week.

Message of the day

“Data that is mobile, visible and well-loved stands a better chance of surviving” ~ Kurt Bollacker

Things to consider

Legacy, heritage, and at-risk data share one common theme: barriers to access. Data that have been recorded by hand (field notes, lab notebooks, handwritten transcripts, measurements, or ledgers), on outdated technology, or in proprietary formats are at risk.

Securing legacy data takes time, resources, and expertise, but it is well worth the effort: old data can enable new research, and its loss could impede future research. So how should you approach reviving legacy or at-risk data?

How do you eat an elephant? One bite at a time.

  1. Recover and inventory the data
    • Format, type
    • Accompanying material: codebooks, notes, marginalia
  2. Organize the data
    • Depending on discipline/subject: date, variable, content/subject
  3. Assess the data
    • Are there any gaps or missing information?
    • Triage: consider the nature of the data along with ease of recovery
  4. Describe the data
    • Assign metadata at the collection/file level
  5. Digitize/normalize the data:
    • Digitization is not preservation. Choose a file format that will retain its functionality (and accessibility!) over time: “Which file formats should I use?”
  6. Review
    • Confirm there are no gaps or indicate where gaps exist
  7. Deposit and disseminate
    • Make the data open and available for re-use
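Parts of step 1 can be automated. The sketch below is a minimal, hypothetical Python example (not a tool mentioned in the post) that walks a directory of recovered files and builds a simple inventory of each file's path, extension, and guessed media type, flagging unrecognized formats for manual review:

```python
import os
import mimetypes

def inventory(root):
    """Walk `root` and record basic format information for each file.

    Returns a list of dicts with path, extension, and guessed MIME
    type; files whose type cannot be guessed are flagged for review.
    """
    records = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            mime, _encoding = mimetypes.guess_type(name)
            records.append({
                "path": os.path.join(dirpath, name),
                "extension": os.path.splitext(name)[1].lower() or "(none)",
                "type": mime or "unknown -- review by hand",
            })
    return records
```

An inventory like this gives you the raw material for the triage decision in step 3: files in obsolete or proprietary formats surface immediately as candidates for conversion.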


Finding the Right Data – Love Your Data Week 2017

Information from Love Your Data Week.

Message of the day

Need to find the right data? Have a clear question and locate quality data sources.

Things to consider

In a 2004 Science Daily News article, the National Science Foundation used the phrase “here there be data” to highlight the exploratory nature of traversing the “untamed” scientific data landscape. The phrase harkens back to older maps of the world, where unexplored territories bore the warning “here, there be [insert mythical/fantastical creatures]” to alert explorers to the dangers of the unknown. While the research data landscape is (slightly) less foreboding, there is still an adventurous quality to looking for research data.


Documenting DH: Bronwen Masemann and LIS 640, Digital Humanities Analytics

Written by Laura Schmidt

Documenting DH is a project from the Digital Humanities Research Network (DHRN) consisting of a series of audio interviews with humanities scholars and students around the University of Wisconsin-Madison campus. Each interviewee is given a chance to talk about how they view data and how they manage data or teach data to others. In February, we interviewed Bronwen Masemann, an instructor in the School of Library and Information Studies; her interview will be accessible to the public on the DHRN website starting February 24, 2017. She is currently teaching LIS 640: Digital Humanities Analytics, which provides upper-level and graduate students the “technology skills to analyze and plan data-driven projects in the humanities, social sciences and other fields.” (more…)

Good Data Examples – Love Your Data Week 2017

Information from Love Your Data Week.

Message of the day

Good data are FAIR – Findable, Accessible, Interoperable, Re-usable

Things to consider

What makes data good?

  1. Data have to be readable and well enough documented for others (and a future you) to understand.
  2. Data have to be findable to keep them from being lost. Information scientists have started to call such data FAIR — Findable, Accessible, Interoperable, Re-usable. One of the most important things you can do to keep your data FAIR is to deposit it in a trusted digital repository. Do not use your personal website as your data archive.
  3. Tidy data are good data. Messy data are hard to work with.
  4. Data quality is a process, starting with planning and continuing through curation of the data for deposit.
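To make point 3 concrete, here is a small, hypothetical Python sketch (the variable names and table are invented for illustration) that reshapes a "messy" wide table, with one column per year, into tidy long form, where each row is a single observation:

```python
def tidy(wide_rows, id_column):
    """Reshape wide records (one column per measurement) into tidy
    long form: one row per (id, variable, value) observation."""
    long_rows = []
    for row in wide_rows:
        for key, value in row.items():
            if key == id_column:
                continue  # keep the identifier, melt everything else
            long_rows.append({id_column: row[id_column],
                              "variable": key,
                              "value": value})
    return long_rows

# A "messy" table: yearly counts spread across columns.
wide = [{"site": "A", "2015": 10, "2016": 12},
        {"site": "B", "2015": 7, "2016": 9}]
long = tidy(wide, "site")
```

In the tidy result, every row answers the same question (which site, which year, what value), which is what makes the data straightforward to filter, plot, and merge.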

Remember! “Documentation is a love letter to your data”