Building a Practical DM Foundation

By Elliott Shuppy, Master’s Candidate, School of Library and Information Studies

In addition to being an active research lab on the UW-Madison campus, the Laboratory for Optical and Computational Imaging (LOCI) develops many experimental instrumentation techniques and the software to support them. One such development is the database platform OMERO, which stands for Open Microscopy Environment Remote Objects. OMERO is an open, consortium-driven software package for viewing, organizing, sharing, and analyzing image data. One hiccup is that it’s not widely used at LOCI.

Having identified this problem, my mentor Kevin Eliceiri, LOCI director, and I decided that I would develop expertise in this software as a project for ZOO 699 and figure out how to incorporate it into a researcher’s workflow at LOCI. On-site researcher Jayne Squirrel was the ideal candidate: she is a highly organized researcher working in the lab, which gave us an excellent use case. Before we could insert OMERO into her workflow, we had to lay a foundation of formal data management practices, which will carry over to her use of OMERO.

We identified four immediate needs:

  • A simple and consistent folder structure
  • A way to identify all associated files
  • An ID system that can be used in the OMERO database
  • Documentation

We then developed solutions to meet each need. The first solution was a formalized folder structure, which we chose to organize by Jayne’s workload:

Lab\Year (YYYY)\Project\Sub-project\Experiment\Replicates\Files

This folder structure will help organize and regularize naming of files and data sets not only locally and on the backup server, but also within the OMERO platform.
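As a rough illustration of the folder structure above, here is a minimal Python sketch that assembles one path from its parts. The function name and the example lab, project, sub-project, and experiment names are placeholders made up for illustration; they are not the actual folder names used at LOCI.

```python
from pathlib import PureWindowsPath


def experiment_folder(lab, year, project, sub_project, experiment, replicate):
    """Build Lab\\Year (YYYY)\\Project\\Sub-project\\Experiment\\Replicate.

    All names passed in are hypothetical examples; only the ordering of the
    levels comes from the folder structure described in the post.
    """
    if not (isinstance(year, int) and 1000 <= year <= 9999):
        raise ValueError("year must be a four-digit integer (YYYY)")
    # PureWindowsPath joins the parts with backslashes on any operating system.
    return PureWindowsPath(lab, str(year), project, sub_project, experiment, replicate)


# Placeholder values, chosen only to show the shape of the result.
path = experiment_folder("Ogle", 2014, "ProjectA", "SubProjectA1", "Experiment02", "R1")
print(path)
```

Building paths from one function, rather than typing them by hand, is one way to keep the levels in the same order every time.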

To identify all files associated with a particular experiment, we developed a unique identifier that we termed the Experiment ID. This identifier begins each file name and consists of the following values: the initial of the collaborating lab (O or H), followed by a numerical sequence built from the two-digit year and month, the experiment’s series number within that month, and the replicate number.

Example: O_1411_02_R1

The example reads Ogle lab, 2014, November, second experiment (within the month of November), replicate one. Incorporating this ID into file names will help to identify and recall data sets of a particular experiment and any related files such as processed images and analyses.
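The Experiment ID format can also be checked and taken apart in code. The sketch below is one possible reading of the format, not a tool used in the lab: it assumes two-digit year and month fields, a two-digit series number (as in the example), years in the 2000s, and a replicate written as R followed by digits. The class and function names are invented for illustration.

```python
import re
from dataclasses import dataclass

# Assumed layout, inferred from the single example O_1411_02_R1:
# <lab initial>_<YY><MM>_<series, two digits>_R<replicate>
_ID_PATTERN = re.compile(r"([OH])_(\d{2})(\d{2})_(\d{2})_R(\d+)")


@dataclass(frozen=True)
class ExperimentID:
    lab: str        # "O" or "H", the collaborating lab's initial
    year: int       # four-digit year (assumes the 2000s)
    month: int      # 1 to 12
    series: int     # experiment number within the month
    replicate: int  # replicate number

    def __str__(self):
        return (f"{self.lab}_{self.year % 100:02d}{self.month:02d}"
                f"_{self.series:02d}_R{self.replicate}")


def parse_experiment_id(text):
    """Split an Experiment ID string into its parts, or raise ValueError."""
    match = _ID_PATTERN.fullmatch(text)
    if match is None:
        raise ValueError(f"not a valid Experiment ID: {text!r}")
    lab, yy, mm, series, replicate = match.groups()
    month = int(mm)
    if not 1 <= month <= 12:
        raise ValueError(f"month out of range in {text!r}")
    return ExperimentID(lab, 2000 + int(yy), month, int(series), int(replicate))


parsed = parse_experiment_id("O_1411_02_R1")
print(parsed)        # O_1411_02_R1
print(parsed.year)   # 2014
```

Validating IDs this way would catch a mistyped month or a missing replicate before a file is saved under the wrong name.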

Further, both the file organization and the Experiment ID can aid organization and identification within OMERO. The database platform has two levels of nesting. The folder is the top tier; within each folder a dataset can be nested; each dataset contains a number of images. So, we can adapt our folder structure naming to organize files and datasets, and apply the unique identifier to name uploaded image objects. These changes make searching in OMERO more robust and similar in process to searching a local drive.

Lastly, we developed documentation for reference. We realized that Experiment IDs need to be accessible at the prep bench and microscope, so we created a mobile-accessible spreadsheet containing information on each experiment. We termed this document the Experimental Worksheet, and it contains the following information:

  • Experiment ID
  • Experiment Description
  • Experiment Start Date
  • Project Name
  • Sub-project Name
  • Notes

This document will act as a quick reference of bare-bones experiment information for Jayne and her student workers. We also realized that Jayne’s student workers need to know the processes involved in each step of her workflow, so we developed step-by-step procedures and policy for each phase of the workflow. These procedural and policy documents set expectations for how Jayne’s data are managed and handled. Now, with such a data management foundation laid, the next step is to get to our root problem: discerning how Jayne can best benefit from using OMERO and where it makes sense in her workflow.
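For readers who want to see what the Experimental Worksheet described above might look like as data, here is a minimal Python sketch that writes and searches it as CSV text. Only the six column names come from the worksheet; the helper names and the example row are hypothetical, and the real worksheet is a spreadsheet rather than a script.

```python
import csv
import io

# Column names taken from the Experimental Worksheet list in the post.
FIELDS = [
    "Experiment ID",
    "Experiment Description",
    "Experiment Start Date",
    "Project Name",
    "Sub-project Name",
    "Notes",
]


def worksheet_to_csv(rows):
    """Serialize worksheet rows (a list of dicts) to CSV text with a header."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)  # missing columns are written as empty cells
    return buffer.getvalue()


def find_experiment(csv_text, experiment_id):
    """Return the row whose Experiment ID matches, or None if absent."""
    for row in csv.DictReader(io.StringIO(csv_text)):
        if row["Experiment ID"] == experiment_id:
            return row
    return None


# Placeholder entry; the description, date, and names are invented.
rows = [{
    "Experiment ID": "O_1411_02_R1",
    "Experiment Description": "Example description",
    "Experiment Start Date": "2014-11-10",
    "Project Name": "ProjectA",
    "Sub-project Name": "SubProjectA1",
    "Notes": "",
}]
sheet = worksheet_to_csv(rows)
print(find_experiment(sheet, "O_1411_02_R1")["Project Name"])  # ProjectA
```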

Data Management Resources for Librarians

by Elliott Shuppy

Research data management has quickly grown into a necessity for librarians on the UW-Madison campus. We understand that this topic can be complex and intimidating, so we wanted to provide resources on some of the most important topics that librarians may be curious about. Compiled below are links for liaisons to explore, reference, and further equip themselves for reference inquiries and conversations around data.

What is data?

This might be a scary question to some, but one with very important implications. See how Minnesota and Oregon have responded.

Why manage data?

MIT and Minnesota lay out plainly the benefits of data management for researchers.

What is a data management plan?

These links provide fairly comprehensive lists of required components and descriptions of data management plans.

Questions to ask

Helpful sets of questions for librarians to consider when conducting data-related interviews with patrons can be found in the links below.

Terms & definitions

Both Minnesota and DataONE offer extensive glossaries of useful terminology for anyone dealing with data matters.

Federal requirements for data

In early 2013, the White House Office of Science and Technology Policy (OSTP) released a mandate requiring public access for federally funded research data. The Department of Energy was the first of many departments to release its requirements for researchers, which take effect October 1, 2014.