Love Your Data Week – Day 1

Content is adapted from the Love Your Data website.


Happy Love Your Data Week!

We’re excited to start the weeklong celebration of data! The first day of Love Your Data Week focuses on keeping your data safe. Below you’ll find some tips and an activity to get you thinking about when and how you can protect your data. When you’re done with the activity, share your diagram with us on Twitter! We’d love to see how you mapped your project, and it’ll help spread the word about keeping your data safe!

Good Practice

Follow the 3-2-1 Rule

  • Keep 3 copies of any important file (1 primary, 2 backup copies)
  • Store files on at least 2 different media types (e.g., 1 copy on an internal hard drive and a second in a secure cloud account or an external hard drive)
  • Keep at least 1 copy offsite (i.e., not at your home or in the campus lab)
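If you like to double-check things in code, the three rules above can be written as a tiny checklist. This is only a rough sketch: the `Copy` record, its field names, and the media labels are made up for illustration, so describe your own copies however makes sense for your project.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Copy:
    """One copy of an important file (hypothetical record, for illustration)."""
    location: str   # free-text label, e.g. "office laptop"
    media: str      # e.g. "internal_drive", "external_drive", "cloud"
    offsite: bool   # True if stored away from your home and campus lab


def check_321(copies):
    """Return a list of 3-2-1 problems; an empty list means the rule is met."""
    problems = []
    if len(copies) < 3:
        problems.append(f"only {len(copies)} copies; need 3")
    if len({c.media for c in copies}) < 2:
        problems.append("all copies are on one media type; need 2")
    if not any(c.offsite for c in copies):
        problems.append("no offsite copy; need at least 1")
    return problems


if __name__ == "__main__":
    my_copies = [
        Copy("office laptop", "internal_drive", offsite=False),
        Copy("lab backup drive", "external_drive", offsite=False),
        Copy("secure cloud account", "cloud", offsite=True),
    ]
    print(check_321(my_copies) or "3-2-1 rule satisfied")
```

Listing your copies this way also makes it obvious when two "backups" actually live on the same device.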

Things to Avoid

  • Storing the only copy of your data on your laptop or flash drive
  • Storing critical data on an unencrypted laptop or flash drive
  • Saving copies of your files haphazardly across 3 or 4 places
  • Sharing the password to your laptop or cloud storage account

Today’s Activity

Data snapshots, or data locks, are great for tracking your data from collection through analysis and write-up. Librarians call this provenance, and it can be really important. Errors are inevitable, and a snapshot can save you lots of time when you make a mistake in cleaning or coding your data. Taking periodic snapshots, especially before the next phase (collection, processing, or analysis) begins, can keep you from losing crucial data and time if you need to make corrections.

Archive each snapshot somewhere safe (not where you store active files) just in case you need it. If something goes wrong, copy the files you need back to your active storage location, keeping the original snapshot in your archival location.

How often you take snapshots depends on how much data you can afford to lose. For a 5-year longitudinal study, you might take snapshots every quarter. If you will be collecting all the data for your study in a 2-week period, you will want to take snapshots more often, probably every day. Oh, and (almost) always keep the raw data! The only time you might not is when it’s easier and less expensive to recreate the data than to keep it around.
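For those who keep their project files in a folder, here is one way a snapshot could be taken with a short script. Treat it as a minimal sketch rather than a recommended tool: the function name `take_snapshot`, the folder-naming scheme, and the `MANIFEST.sha256` file are all assumptions made for this example. It copies the active folder into a new timestamped folder in your archive location and records a checksum for each file, so you can later confirm the snapshot has not changed.

```python
import hashlib
import shutil
from datetime import datetime
from pathlib import Path


def take_snapshot(active_dir, archive_dir, label):
    """Copy active_dir into a new timestamped folder under archive_dir.

    Returns the path of the snapshot folder. The label is a short note
    such as "before-cleaning" (hypothetical naming scheme).
    """
    stamp = datetime.now().strftime("%Y%m%d-%H%M%S")
    target = Path(archive_dir) / f"{stamp}_{label}"

    # copytree raises an error if the target already exists,
    # so an earlier snapshot is never silently overwritten.
    shutil.copytree(Path(active_dir), target)

    # Record a SHA-256 checksum for every copied file.
    lines = []
    for path in sorted(p for p in target.rglob("*") if p.is_file()):
        digest = hashlib.sha256(path.read_bytes()).hexdigest()
        lines.append(f"{digest}  {path.relative_to(target).as_posix()}")
    (target / "MANIFEST.sha256").write_text("\n".join(lines) + "\n")
    return target
```

You might call it as `take_snapshot("project/data", "/path/to/archive", "before-cleaning")` just before a major transformation, and again with a different label just after. To recover from a mistake, copy files out of the snapshot folder and leave the snapshot itself untouched.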

Instructions: Draw a quick workflow diagram of the data lifecycle for your project (check out our examples on Instagram and Pinterest). Think about when major data transformations happen in your workflow. Taking a snapshot of your data just before and after the transformation can save you from heartache and confusion if something goes wrong.