Hey UW! Tell Us About Your Digital Preservation Needs


Do you have data worth preserving for future generations? Do you know how much you have collected? Or how long you should keep it for? What are the consequences if it’s lost, and how important is it to you to prevent that loss?

The Vice Provost for Information Technology and Chief Information Officer and the Vice Provost for Libraries and University Librarian have chartered a working group to make recommendations related to the long-term management of digital assets. The Long-term Digital Asset Management Working Group (LDAM), which includes several Research Data Services consultants, is engaging campus stakeholders in discussions aimed at understanding their particular needs and use cases with respect to the curation and preservation of digital assets over time.

To that end, the team has put together a brief survey. Our hope is to gather input via the survey and then meet face-to-face with respondents to dive deeper into their needs, concerns, and constraints. The information we glean from this process will give the team an overview of campus needs in this space and, perhaps, inform the shape of a digital preservation service for campus if the needs are sufficient to address at a campus scale.

Share your digital preservation needs with us here: http://bit.ly/2bDeUXj


Data Archiving Platforms: Figshare

by Brianna Marshall, Digital Curation Coordinator

This is part three of a three-part series where I explore platforms for archiving and sharing your data. Read the first post in the series, focused on UW’s institutional repository, MINDS@UW or read the second post, focused on data repository Dryad.

To help you better understand your options, here are the areas I will address for each platform:

  • Background information on who can use it and what type of content is appropriate.
  • Options for sharing and access
  • Archiving and preservation benefits the platform offers
  • Whether the platform complies with the forthcoming OSTP mandate

figshare

About

figshare is a discipline-neutral platform for sharing research in many formats, including figures, datasets, media, papers, posters, presentations and filesets. All items uploaded to figshare are citable, shareable and discoverable.

Sharing and access

All publicly available research outputs are shared under open licenses. By default, figures, media, posters, papers, and filesets are available under a CC-BY license; datasets are available under CC0; and software/code is available under the MIT license. Learn more about sharing your research on figshare.

Archiving and preservation

figshare notes that items will be retained for the lifetime of the repository and that its sustainability model “includes the continued hosting and persistence of all public research outputs.” Research outputs are stored in Amazon Web Services’ S3 buckets. Data files and metadata are backed up nightly and replicated into multiple copies in the online system. Learn more about figshare’s preservation policies.

OSTP mandate

The OSTP mandate requires all federal funding agencies with over $100 million in R&D funds to make greater efforts to make grant-funded research outputs more accessible. This will likely mean that data must be publicly accessible and have an assigned DOI (though you’ll need to check with your funding agency for the exact requirements). All items uploaded to figshare are minted a DataCite DOI, so as long as your data is set to public, it is a good candidate for complying with the mandate.

Visit figshare.

Have additional questions or concerns about where you should archive your data? Contact us.

Welcome to Our New Website

By Brianna Marshall, RDS Chair

The RDS team has made several changes to the design and content of our website.

First, we’re aiming to highlight the three main services RDS provides to the UW-Madison community: assistance with data management plans, consultations, and education and training.

Second, we’re introducing new content. One of the main questions we receive is about forthcoming federal funding requirements from the 2013 White House OSTP memorandum. We’ve created a brief yet helpful chart to get you started thinking about the impact of this mandate on your particular funding agency.

And last but not least, we’ve cleaned up the design to help you find what you need on our site, quickly and easily. Please reach out to us with questions or comments.


NADDI Reflections [part 1]


Evan (L) and Morgaine (R)

This post on NADDI 2015 was written by Morgaine Gilchrist Scott, one of two recipients of an RDS student scholarship. Read Evan Meszaros’ reflection.

In my past life, I was a public health researcher. In my current one, I’m a first-year SLIS graduate student. I’m amazed and appalled at the data I once lost for the sake of convenience. I don’t think we knew about (or cared about) anything better than the proprietary format that met our immediate needs perfectly. I just looked up that software, and it’s already dead.

Have you ever heard of the Överkalix study? It’s often cited as the seminal study in epigenetics. Scientists were able to discover things like a greater BMI at age 9 in the sons (but not the daughters) of fathers who began smoking early, and an increased risk of cardiovascular mortality in granddaughters whose paternal grandmothers experienced a sharp change in food availability.

But HOW were researchers able to conclude these things? Data. Old data. Old, easily explainable data. Scientists looked at records from 1890, 1905, and 1920 on birthrates and various environmental factors and were able to follow up with children and grandchildren. These records were obviously kept on paper in a safe place and in the same language used today. But in today’s digital age, we may be depriving future generations of the chance to draw similarly groundbreaking conclusions from the data collected today.

We’re producing data at a greater rate than ever before, and who knows what could be useful in the future. But with poor metadata, and the use of proprietary formats, we’re also losing more than ever. Fortunately, the good people involved with the Data Documentation Initiative are working towards a world where that won’t happen. I learned about so many easy, free, and important tools at NADDI. I can’t wait to implement them in my own research.

Now, you’ve missed the conference. That’s a shame, but we won’t hold that against you. NADDI has opened the doors here at Madison to making sure you have sustainable data. I’d encourage you to talk to someone from the RDS team, who can show you some free or cheap tools that are so easy to use, you’ll barely notice them. These tools, and the future of DDI, will make sure that your data will contribute to science for as long as possible.

Morgaine Gilchrist-Scott is currently a Masters candidate in the School of Library and Information Science at UW-Madison. She hails from Ohio and has worked in Boston and New York before coming to Madison. She hopes to continue in data management and STEM librarianship with her degree.

NADDI Reflections [part 2]


Evan (L) and Morgaine (R)

This post on NADDI 2015 was written by Evan Meszaros, one of two recipients of an RDS student scholarship. Read Morgaine Gilchrist-Scott’s reflection.

The NADDI 2015 conference afforded its attendees a smorgasbord of content, from the basic to the advanced, and across a range of contexts, from the narrowly-focused to the bigger picture. As a newcomer to NADDI in addition to being a newcomer to most related topics, the broader and more basic views resonated with me the most.

Jane Fry, a Data Specialist at Carleton University’s MacOdrum Library in Ottawa, led one such basic and broad workshop session, entitled, “Discover the Power of DDI Metadata.” Fry introduced the Data Documentation Initiative (DDI) to those unfamiliar with the international, XML-based metadata specification, and discussed its applications, history, versioning, and the current challenges it faces as its developers improve its functionality and expand its adoption.
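For readers unfamiliar with what an XML-based metadata specification looks like in practice, here is a rough illustration built with Python’s standard library. The element names loosely echo DDI Codebook conventions but are simplified for the sketch; this is not a schema-valid DDI instance.

```python
# Sketch of a DDI-Codebook-style XML record describing a study and one
# variable. Element names (codeBook, stdyDscr, dataDscr, etc.) are
# illustrative, modeled loosely on DDI Codebook; the title and variable
# are invented for the example.
import xml.etree.ElementTree as ET

codebook = ET.Element("codeBook")           # root of the record
stdy = ET.SubElement(codebook, "stdyDscr")  # study-level description
citation = ET.SubElement(stdy, "citation")
titl = ET.SubElement(ET.SubElement(citation, "titlStmt"), "titl")
titl.text = "Example Survey of Campus Data Practices"

data = ET.SubElement(codebook, "dataDscr")  # variable-level description
var = ET.SubElement(data, "var", {"name": "age"})
labl = ET.SubElement(var, "labl")
labl.text = "Respondent age in years"

print(ET.tostring(codebook, encoding="unicode"))
```

The point of the exercise is that both the study and each variable get machine-readable descriptions, which is what makes DDI-documented datasets discoverable and reusable long after the original software is gone.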

A plenary session featuring the UW-Madison School of Library and Information Studies’ Faculty Associate, Dorothea Salo, explored DDI’s place as an emerging metadata standard (mainly for large, social sciences datasets) amidst a zoo of established information standards. Her take-no-prisoners critique of the DDI community’s progress, however, sparked plenty of discussion and revealed that there is lots of work yet to be done to get the word out effectively.

The diversity and scale of projects implementing DDI—as well as the internationality of the initiative’s stakeholders—were also on display throughout the conference. A number of sessions explored noteworthy projects (a growing list of which can be found here), while others focused on the programs and scripts (e.g., Colectica, MTNA’s OpenDataForge) used to support DDI in these projects.

Two sessions in particular, both led by academic data librarians, very helpfully painted a picture of the broader world of research data services (RDS) in which tools like DDI are playing an ever more prominent role. Kristin Briney, Data Services Librarian at UW-Milwaukee, summarized her findings-to-date for a study she and her collaborators are conducting on the current state of RDS as it exists in an official capacity at larger research universities across the US. While the findings she described were preliminary, their survey work suggests some interesting correlations amongst the size and research budgets of these institutions and the presence of established data services personnel/departments or data policies.

Perhaps even more applicable to my own position, the subsequent session provided a glimpse into another university’s data services “operation”. Brianna Marshall, Digital Curation Coordinator, and Trisha Adamus, Data, Network, and Translational Research Librarian, both from UW-Madison’s Research Data Services, delivered reports of successful strategies and ongoing challenges faced while carrying out RDS core functions on their campus. A couple of takeaways gleaned from this session (and the ensuing conversations it sparked) included suggestions to improve education and outreach (by hosting a ‘brown bag’ series or publishing a digest of RDS stories of interest to researchers) and to develop a toolkit for researchers keyed to the various stages of the research data lifecycle. It’s clear from the many impressive projects and potentialities discussed throughout the conference that DDI—and the community of developers, partners, and software applications it represents—should be an important part of any such RDS toolkit.

Evan Meszaros is a graduate student in the UW-Madison School of Library and Information Studies, having just completed his first year in its online degree program. He is also a newly-hired librarian at Case Western Reserve University, where he plays both research data services and traditional/reference librarian roles.

Data Archiving Platforms: Dryad

by Brianna Marshall, Digital Curation Coordinator

This is part two of a three-part series where I explore platforms for archiving and sharing your data. Read the first post in the series, focused on UW’s institutional repository, MINDS@UW.

To help you better understand your options, here are the areas I address for each platform:

  • Background information on who can use it and what type of content is appropriate.
  • Options for sharing and access
  • Archiving and preservation benefits the platform offers
  • Whether the platform complies with the forthcoming OSTP mandate

Dryad

About

Dryad is a repository appropriate for data that accompanies published articles in the sciences or medicine. Many journals partner with Dryad to provide submission integration, which makes linking the data between Dryad and the journal easy for you. Pricing varies depending on the journal you are publishing in; some journals cover the data publishing charge (DPC) while others do not. Read more about Dryad’s pricing model or browse the journals with sponsored DPCs.

Sharing and access

Data uploaded to Dryad are made available for reuse under the Creative Commons Zero (CC0) license. There are no format restrictions on what you upload, though you are encouraged to use community standards where possible. Your data will be given a DOI, enabling you to get credit for sharing.

Archiving and preservation

According to the Dryad website, “Data packages in Dryad are replicated across multiple systems to support failover, improve access times, allow recovery from disk failures, and preserve bit integrity. The data packages are discoverable and backed up for long-term preservation within the DataONE network.”

OSTP mandate

The OSTP mandate requires all federal funding agencies with over $100 million in R&D funds to make greater efforts to make grant-funded research outputs more accessible. This will likely mean that data must be publicly accessible and have an assigned DOI (though you’ll need to check with your funding agency for the exact requirements). As long as the data you need to share is associated with a published article, Dryad is a good candidate for OSTP-compliant data: it mints DOIs and makes data openly available under a CC0 license.

Visit Dryad.

Have additional questions or concerns about where you should archive your data? Contact us.