If the data you need still exists;
If you found the data you need;
If you understand the data you found;
If you trust the data you understand;
If you can use the data you trust;
Someone did a good job of data management.

Rex Sanders ‐ USGS‐Santa Cruz*

Data management practices have been described in detail in a variety of documentation and tutorials, which may focus on specific needs and resources applicable to the organization that produced them. The following is a selected list of resources that are general enough to apply to different disciplines, and more broadly than the university or agency that developed them.

Guides and Tutorials

Data Science MOOCs

Several Massive Open Online Courses (MOOCs) cover topics related to data analysis and research methods. Even if you choose not to do the coursework and earn a statement of completion, it’s easy to sign up for the courses, which gives you access to lectures and examples.

The Class Central website has curated a list of several data science and analysis methods MOOCs, developed by reputable sources.

The MOOCs listed here were developed by Johns Hopkins University and are offered through the Coursera platform. They are part of a Data Science Specialization series of courses and are applicable to data management practices beyond specific analytical techniques. Each course lasts 4 weeks and is offered frequently. Currently, a new offering of each course starts each month from March through June 2015.

The Data Scientist’s Toolbox, Jeff Leek, Roger Peng, Brian Caffo

“The course gives an overview of the data, questions, and tools that data analysts and data scientists work with.” It focuses on a practical introduction to tools, using version control, markdown, git, GitHub, R, and RStudio.

Getting and Cleaning Data, Jeff Leek, Roger Peng, Brian Caffo

“This course will cover the basic ways that data can be obtained… It will also cover the basics of data cleaning and how to make data “tidy”… The course will also cover the components of a complete data set including raw data, processing instructions, codebooks, and processed data. The course will cover the basics needed for collecting, cleaning, and sharing data.” Tools used in this course: GitHub, R, RStudio

Reproducible Research, Jeff Leek, Roger Peng, Brian Caffo

“Reproducible research is the idea that data analyses, and more generally, scientific claims, are published with their data and software code so that others may verify the findings and build upon them…This course will focus on literate statistical analysis tools which allow one to publish data analyses in a single document that allows others to easily execute the same analysis to obtain the same results.” Tools: R markdown, knitr

*Rex Sanders quote from: Environmental Data Management: CHALLENGES AND OPPORTUNITIES, Jamie Gerrard | March 2014


Looking for additional information about research data management? Contact us.

Second Spring 2015 Holz Brown Bag Talk
http://researchdata.wisc.edu/news/second-spring-2015-holz-brown-bag-talk/
Wed, 11 Feb 2015 15:00:02 +0000

Photo courtesy of Kristin Briney

Our second brown bag talk, “Zero to Sixty: Establishing Research Data Services from Scratch,” will be presented by Kristin Briney, Data Services Librarian at the University of Wisconsin-Milwaukee.

TIME: Wednesday, February 25, 12pm-1pm.

PLACE: Cat Lab (4191F), School of Library and Information Studies, 4th floor of Helen C. White Hall.

ABSTRACT: What does it take to create research data services where none existed before? Kristin Briney will discuss establishing data services at the University of Wisconsin-Milwaukee. Her talk will include strategy and lessons learned 18 months into the process.

ABOUT KRISTIN: Kristin is a PhD chemist who works at the interface of science, technology, and information management. Her particular interests are: helping researchers manage their data, improving informatics systems through robust metadata and workflows, teaching information retrieval and management skills, and using technology to make science accessible to everyone.

Please RSVP for this talk if you plan to attend. View other talks in this series in our archive.

NADDI 2015 at UW-Madison
http://researchdata.wisc.edu/news/naddi-2015-at-uw-madison/
Tue, 10 Feb 2015 22:07:33 +0000

Research Data Services is proud to co-sponsor the third annual North American Data Documentation Initiative conference, occurring April 8-10 at the University of Wisconsin-Madison.

The theme for NADDI 2015, Research Data Management: Enhancing Discoverability with Open Metadata Standards, emphasizes an applied use of DDI to research data. Meant to appeal to individuals involved in creating, managing and using research data, the conference encourages the submission of presentations that showcase the importance of DDI metadata not only for discovering and using research data, but also as a practical and utilitarian principle supporting research data production and management.

The conference also encourages presentations on current data service models at other institutions that want to brainstorm how to integrate DDI into their workflows. Finally, because UW-Madison is home to two longitudinal studies (MIDUS and Wisconsin Longitudinal Study) that collect biological and other non-survey data types, NADDI 2015 will be a convenient forum to discuss documenting complex use cases with DDI.

The call for presentations is open through February 13. For more information, visit the NADDI 2015 website or download the NADDI 2015 informational flyer.

About DDI

The Data Documentation Initiative (DDI) is an open metadata standard for describing data and data collection activities. DDI’s principal goal is making research metadata machine-actionable. The specification can document and manage different stages of data lifecycles, such as conceptualization, collection, processing, analysis, distribution, discovery, repurposing, and archiving.

Workshop on Data Management for Ecologists a Success
http://researchdata.wisc.edu/news/workshop-on-data-management-for-ecologists-a-success/
Tue, 10 Feb 2015 18:01:08 +0000

Photo courtesy of Brianna Marshall

By Erin Carrillo, Information Services Librarian, Steenbock Library

In November, RDS held a two-day data management workshop for graduate student researchers. Participants came from several departments across campus, including Limnology, Entomology, Forest and Wildlife Ecology, Geography, and the Nelson Institute for Environmental Studies. They were part of a cohort of graduate students doing research in the area of biodiversity conservation, funded by an NSF Integrative Graduate Education and Research Traineeship grant.

We planned the workshop with two graduate students, Kara Cromwell (Zoology) and Alex Latzka (Center for Limnology), who saw a need to provide new researchers with the knowledge and skills to navigate the changing research data landscape. From funder and publisher requirements for data management plans and data sharing, to the ongoing development of metadata standards and discipline-specific data repositories, researchers need to be aware of trends within their discipline and practice good data management from the outset. Kara and Alex also wanted to encourage and facilitate the sharing of research data within the group.

The workshop addressed several broad topics within data management, but content was tailored to the specific needs of the group. We administered a survey to the group at the beginning of the planning process to gauge students’ current knowledge of data management practices, as well as their specific needs. We identified several areas of focus, and modules were developed for each area. Stephanie Hampton, a visiting scientist from Washington State and former deputy director of NCEAS (National Center for Ecological Analysis and Synthesis), was invited by grad students in the Center for Limnology. She had recently published a few high-impact papers on the future of ecology, especially with respect to Big Data, and gave a short talk that offered participants perspective on why sound data management will matter as they advance in their careers.

The final program was:

  • Spreadsheets, Jan Cheetham, DoIT Academic Technology and Barry Radler, Institute on Aging
  • File Organization, Elliott Shuppy, School of Library and Information Studies (SLIS)
  • Storage & Preservation, Brianna Marshall, Digital Curation Coordinator; Luke Bluma, DoIT Storage & Backup; Elliott Shuppy
  • Metadata, Corinna Gries, Center for Limnology, North Temperate Lakes Long Term Ecological Research (LTER)
  • Data Management Plans, Corinna Gries
  • Keynote talk by Stephanie E. Hampton, Kaeser Scholar, Washington State University, Director of the Center for Environmental Research, Education, and Outreach

We built in designated work time at the end of the first day to give participants an opportunity to apply what they had learned and collaborate with their colleagues. Module presenters were available to answer questions.  Presenters deposited slide decks and other workshop materials in a Box folder that we shared with participants after the workshop.

We had participants complete a pre- and post-workshop survey to assess the effectiveness of the workshop. The results revealed that participants generally rated their ability to practice good data management higher after the workshop. We also got this positive feedback from Kara:

“Alex and I heard a lot of positive feedback throughout the workshop… The schedule flowed smoothly, the content was very well suited to the needs of the group, and all the modules were engaging. We really appreciate the time you invested, and I know everyone (including many who weren’t able to attend) will continue to take advantage of the resources posted in the Box folder. It was a definite success!”

It was a pleasure to work with Kara and Alex and their group, and we look forward to using what we learned from planning this workshop to organize similar workshops tailored to the needs of researchers in different disciplines across campus.

Is your lab or department interested in working with RDS to develop a discipline-specific data management workshop? Contact us.

Manage Your Data with LabArchives
http://researchdata.wisc.edu/storing-data/manage-your-data-with-labarchives/
Tue, 10 Feb 2015 14:18:55 +0000

By Jan Cheetham, Research and Instructional Technologies Consultant, DoIT

LabArchives is an ELN (Electronic Lab Notebook) that provides data storage, data documentation, collaboration, and export features. Like traditional paper lab notebooks, an ELN can serve as a continuous and complete record of the research process.


Collaboration and Sharing

LabArchives provides flexible permissions and roles for lab members and their collaborators. It is recommended that PIs assume the Owner role in all their lab’s notebooks, in alignment with UW-Madison’s Policy on Data Stewardship, Access, and Retention and to ensure that no data is lost when lab members graduate or leave the university.

There are several approaches for organizing notebooks and managing edit/read rights of individuals. Permissions can be set at the level of the notebook, page, or entry. It is also possible for individuals in the Owner or Admin role to share notebooks, pages, and entries with collaborators outside the university. Although LabArchives has a method for creating Digital Object Identifiers (DOIs) for notebooks, this requires making the notebook publicly available. The UW-Madison LabArchives site currently has the public sharing feature turned off as a security measure to prevent inadvertent sharing of notebooks.

The ELN provides a timestamp and record of every user action, creating an electronic record of who added or edited an entry and when. In addition, nothing can be permanently deleted from the ELN. (LabArchives allows you to move a notebook, page, or entry to a Delete Bin; however, these items are not actually deleted and can be recovered at any time.)

Organizing and Documenting

The ability to blend digital data with the human-readable narrative of the research process is one of the main advantages of an ELN over other file sharing/storage services or hybrid paper/electronic systems. LabArchives has a number of different entry types for entering data and recording the narratives. Below are a few suggestions that will help ensure that the information you enter in LabArchives can be readily retrieved.

Naming conventions
LabArchives currently does not offer a way to browse through folders or pages chronologically. Therefore, you may want to use file-naming conventions for pages (and possibly, folders). Names should contain a project name, date, experiment identifier, etc. For more specific suggestions, see naming conventions in an ELN. It is also a good idea to use similar naming conventions for files you attach or link to in the ELN, to make it easier to trace through versions and locate those that contain transformed data.
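
As a rough illustration of such a convention, the Python sketch below assembles a page name from a project name, an ISO-formatted date, and an experiment identifier. The function name `make_page_name`, the underscore separator, and the order of the parts are assumptions chosen for this example; they are not a LabArchives feature or an RDS requirement.

```python
import re
from datetime import date


def make_page_name(project: str, day: date, experiment_id: str, label: str = "") -> str:
    """Build a page or file name from a project, a date, and an experiment ID.

    Hypothetical convention for illustration: parts are joined with
    underscores, and the date is written as YYYY-MM-DD.
    """

    def slug(text: str) -> str:
        # Lowercase the text and collapse every run of characters that is
        # not a letter or digit into a single hyphen.
        return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")

    parts = [slug(project), day.isoformat(), slug(experiment_id)]
    if label:
        parts.append(slug(label))
    return "_".join(parts)


print(make_page_name("Lake Survey", date(2015, 2, 10), "Exp 03", "raw"))
# prints: lake-survey_2015-02-10_exp-03_raw
```

Because ISO dates sort correctly as plain text, pages that share a project name will list in chronological order wherever names are sorted alphabetically.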

Documenting attached files
In LabArchives, you upload and attach a single data file to an attachment entry on a page. The file can be of any type and up to 250 MB in size. The entry will display the name of the attached file and you can also enter a description with detailed information (metadata) about the file. When you upload a new version of the file to the same entry, LabArchives retains all prior versions and lets you revert to older versions through the entry’s revision history. However, as noted below, only the most recent version is included in HTML export. Therefore, to ensure that all data files that you or someone else would need to reproduce your findings are archived both inside the ELN and in HTML exports, be sure to create a separate attachment entry for each essential file that needs to be retained in its original, unaltered form. Then, new versions of the data file (in which the original data are cleaned, transformed, analyzed, visualized, etc.) should be added to the ELN as one or more new entries.

Documenting linked files
When data files are too big (>250 MB) or too numerous to attach to the ELN, you can create links to them from within a rich text entry. However, LabArchives does not check links or verify locations, so you will need to ensure the files are in a secure and permanent location. It is also a good practice to record the name of the file and its location directly in the rich text entry since the URL you add when you create a link is not directly visible in the entry.
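
Since the ELN does not check links, one option is to test them yourself from time to time. The sketch below is a minimal example that assumes the linked files are reachable over HTTP(S); the function names `url_is_reachable` and `check_links` are hypothetical, and some servers reject HEAD requests, so a failed check is a prompt to look manually rather than proof that a file is gone.

```python
import urllib.error
import urllib.request
from typing import Callable, Dict


def url_is_reachable(url: str, timeout: float = 10.0) -> bool:
    """Return True if a HEAD request to the URL gets a 2xx or 3xx response."""
    try:
        # Building the request can itself raise ValueError for malformed URLs.
        request = urllib.request.Request(url, method="HEAD")
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return 200 <= response.status < 400
    except (urllib.error.URLError, ValueError, OSError):
        # HTTP errors, unreachable hosts, timeouts, and malformed URLs
        # are all reported as "not reachable".
        return False


def check_links(
    links: Dict[str, str],
    is_reachable: Callable[[str], bool] = url_is_reachable,
) -> Dict[str, bool]:
    """Map each recorded file name to whether its URL could be reached.

    `links` maps the file name noted in the rich text entry to its URL.
    """
    return {name: is_reachable(url) for name, url in links.items()}
```

Keeping the file name and URL together in a small table like `links` mirrors the advice above to record both directly in the entry.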

Exporting and Archiving

LabArchives has two export formats, PDF and HTML. The PDF version is similar to a scanned paper notebook page. The HTML version lacks some of the appearance of the notebook but contains more complete information, including attached files. As with any digital platform you use for your research data, you will want to have a backup and archival plan. This should take into account how often you make changes to the notebook and include methods for retaining duplicate copies of important data files in alternate locations.
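
One way to keep a duplicate copy in an alternate location is to copy the file and then confirm that the copy is identical. The sketch below is only an illustration under that assumption: the function names `sha256_of` and `copy_and_verify` and the idea of a local backup folder are hypothetical, and your own backup plan may rely on campus storage services instead.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 checksum of a file, read in 1 MB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copy_and_verify(source: Path, backup_dir: Path) -> Path:
    """Copy a file into backup_dir and check that the copy matches the original."""
    backup_dir.mkdir(parents=True, exist_ok=True)
    target = backup_dir / source.name
    # copy2 also preserves timestamps where the file system allows it.
    shutil.copy2(source, target)
    if sha256_of(source) != sha256_of(target):
        raise OSError(f"Backup copy of {source} does not match the original")
    return target
```

Recording the checksum alongside the file (for example, in the attachment description) would also let you confirm later that an archived copy is unchanged.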

PDFs can be created for a single entry, a page, or an entire notebook. PDFs include: text entries, thumbnails of images and widgets, annotations and descriptions of attachments, user name and time/date stamps. They do not include: attached files, version history of attachments, or comments. URLs of links in rich text entries may be retrievable, depending on the application you use to read the PDF.

The HTML option exports an entire notebook. Each page in the notebook is a separate HTML file and the most recent version of each attached file is also included. This export option also does not include version history of attachments or comments. Again, URLs that you add to create links in rich text entries may be retrievable, depending on the browser you use to read the HTML pages.

Do you have additional questions or concerns about electronic lab notebooks? Contact us.

Data Archiving Platforms: MINDS@UW
http://researchdata.wisc.edu/storing-data/data-repos/data-archiving-platforms-mindsuw/
Mon, 02 Feb 2015 20:40:08 +0000

By Brianna Marshall, Digital Curation Coordinator, General Library System

This is part one of a three-part series where I explore platforms for archiving and sharing your data. To help you better understand your options, here are the areas I will address for each platform:

  • Background information on who can use it and what type of content is appropriate
  • Options for sharing and access
  • Archiving and preservation benefits the platform offers
  • Compliance with the forthcoming OSTP mandate


About: MINDS@UW is the University of Wisconsin’s institutional repository, intended to capture, archive, and provide access to scholarship originating from campus researchers of any discipline. It is supported by the UW Libraries and free for all UW-affiliated researchers to use. While a wide variety of file formats are supported, this platform is best suited to handling text-based formats.

Sharing and access: Items in the repository are given a permanent URL that can be used to share the item; however, DOIs are not minted at this time. Items can be made open access (accessed free of charge by anyone, anywhere, at any time) or they can be embargoed (no access is provided until a certain time, up to a few years, has passed). Embargoed items are still discoverable since the metadata is indexed in the repository but the content will not be visible.

Archiving and preservation: The Libraries are committed to long-term preservation of all MINDS@UW items. In addition to the current backup practices in place, the Libraries are collaborating with the UW-Madison Office of the CIO to design and pilot a campus-scaled digital preservation infrastructure. This service, and the libraries’ own preservation repositories, will eventually be aligned with the Digital Preservation Network (DPN).

OSTP mandate: The OSTP mandate requires all federal funding agencies with over $100 million in R&D funds to make greater efforts to make grant-funded research outputs more accessible. This will likely mean that data must be publicly accessible and have an assigned DOI (though you’ll need to check with your funding agency for the exact requirements). Because MINDS@UW cannot provide a DOI at this time, it is not a suitable place for data that must meet such funder requirements.

The UW Libraries are always looking to improve this platform to better fit the needs of researchers. If you have a question, comment, or suggestion related to MINDS@UW, please contact repository manager Brianna Marshall.


Do you have additional questions or concerns about where you should archive your data? Contact us.

Join RDS for Spring 2015’s First Holz Brown Bag Talk
http://researchdata.wisc.edu/news/join-rds-for-spring-2015s-first-holz-brown-bag-talk/
Mon, 12 Jan 2015 21:19:49 +0000

We’re kicking off the 2015 Holz brown bag series with the talk “Data Management in Biological Microscopy: A Librarian’s Approach,” presented by Elliott Shuppy.

TIME: Wednesday, January 28, 12pm-1pm.

PLACE: Bunge Room, School of Library and Information Studies, 4th floor of Helen C. White Hall.

ABSTRACT: A growing demand for convenient sharing of research image data between the Laboratory for Optical and Computational Imaging (LOCI) and partner laboratories stimulated the need for enhanced data management processes and accompanying documentation. Elliott began working as a graduate researcher at LOCI in Fall 2014, and has collaborated with research scientists to meet their present data management needs. During his talk he will discuss his involvement augmenting one scientist’s data management workflow, including reflections on current practices, positioned and proposed tactics, and next steps in the process.

ABOUT ELLIOTT: Elliott is a graduate student at the School of Library and Information Studies at UW-Madison. His professional emphasis is in data management and curation practices and education across campus units.

Please RSVP for this talk if you plan to attend. View other talks in this series in our archive.

DMPTool unavailable October 18, 10 p.m. – 2 a.m.
http://researchdata.wisc.edu/news/dmptool-unavailable-october-18-10-p-m-2-a-m/
Tue, 14 Oct 2014 17:11:21 +0000

Due to server maintenance, the DMPTool will be unavailable on Saturday, October 18, from 10:00 p.m. (CST) until 2:00 a.m. (CST), October 19.

DMPTool is an online tool that helps researchers develop data management plans. For more information or to use the tool, see http://researchdata.wisc.edu/make-a-plan/dmptool

Data Management Resources for Librarians
http://researchdata.wisc.edu/news/data-management-resources-for-librarians/
Mon, 22 Sep 2014 12:54:20 +0000

By Elliott Shuppy

Research data management has quickly grown into a necessity for librarians on the UW-Madison campus. We understand that this topic can be complex and intimidating, so we wanted to provide resources on some of the most important topics that librarians may be curious about. Compiled below are links for liaisons to explore, reference, and further equip themselves for reference inquiries and conversations around data.

What is data?

This might be a scary question to some, but one with very important implications. See how Minnesota and Oregon have responded.

Why manage data?

MIT and Minnesota lay out plainly the benefits of data management for researchers.

What is a data management plan?

These links provide fairly comprehensive lists of required components and descriptions of data management plans.

Questions to ask

Helpful sets of questions for librarians to consider when conducting data-related interviews with patrons can be found in the links below.

Terms & definitions

Both Minnesota and DataONE offer extensive glossaries of useful terminology for anyone dealing with data matters.

Federal requirements for data

In early 2013, the White House Office of Science and Technology Policy (OSTP) released a mandate requiring public access for federally funded research data. The Department of Energy was the first of many departments to release its requirements for researchers, which take effect October 1, 2014.

DOE Public Access Plan: Scientific Publications & Data Management Plan
http://researchdata.wisc.edu/news/doe-public-access-plan-scientific-publications-data-management-plan/
Wed, 03 Sep 2014 14:26:36 +0000

September 11, 2014, from 11:00am-12:15pm
Engineering Hall, Room 3609

L&S Pre-Award Services, together with CALS, Engineering and RSP, is hosting an informational presentation on this new DOE requirement.  Presenters include Julie Schneider from the Ebling Library, and Ryan Schryer and Brianna Marshall from UW Research Data Services.  Those who submit proposals to and have award funding from the DOE should attend.

Please register at the OHRD link

