Let’s Talk About Storage

By Luke Bluma, IT Engagement Manager for the Campus Computing Infrastructure (CCI)

Data is a critical part of our lives here at UW-Madison. We collect, analyze, and share data every day to get our jobs done. Data comes in all shapes and sizes and it needs the right place to live. That’s where storage comes in.

However, storage can be a loaded term. It can mean a thumb drive, your computer’s hard drive, storage accessed via a server, cloud storage, or a large campus-wide storage service. It is all of these things, but not all of them will fit your needs. Your needs are what matter, and they will drive which solution(s) will work for you.

I am the Engagement Manager for the Campus Computing Infrastructure (CCI) initiative. I work with campus partners on their data center, server, storage and/or backup needs. Storage is currently a big focus for me, so I wanted to share some thoughts about evaluating potential storage solutions.

Figure: Storage array in a data center (storage for CCI)

The main areas to think about are:

  • What kinds of data are you working with?
  • What are your “must-haves”?
  • What storage options are available at UW-Madison?

What kinds of data are you working with?

This is the first big question you want to focus on because it drastically impacts what options are available to you. Are you working with FERPA data, sensitive data, restricted data, PCI data, etc.? Each of these will impact what service(s) you can or can’t utilize. For more information on Restricted Data see: https://www.cio.wisc.edu/security/about/campus-initiatives/restricted-data-security-standards/

What are your “must-haves”?

Once you have identified the types of data you are working with, it is crucial to determine your must-have requirements for a storage solution. Does it need to be secure? If so, how secure? Does it need to be accessed by people outside of UW-Madison? Does it need to be high-performance storage? Does it need to scale to 20+ TB? Does it need to be accessible via the web? These are just example questions; the key point is that there is no perfect storage solution. Some services do X, Y, and Z, while others do X, Y, and A but not Z. Determining your “must-haves” will help you figure out which services you can work with and which you can’t.
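To make the X, Y, Z example concrete, here is a minimal Python sketch of checking a list of must-haves against what each service offers. The service names and feature labels are hypothetical placeholders, not real UW-Madison offerings.

```python
# Hypothetical services and features, for illustration only.
SERVICES = {
    "Service 1": {"X", "Y", "Z"},
    "Service 2": {"X", "Y", "A"},
}


def matching_services(must_haves, services=SERVICES):
    """Return the names of services whose features cover every must-have."""
    required = set(must_haves)
    return sorted(
        name for name, features in services.items() if required <= features
    )


# Both services cover X and Y; only Service 1 also covers Z.
print(matching_services({"X", "Y"}))
print(matching_services({"X", "Z"}))
```

If no service covers every requirement, the function returns an empty list, which is a signal to revisit which requirements are truly must-haves.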

What storage options are available at UW-Madison?

Now that you have identified the kinds of data and the “must-haves” for your solution, the final step is to evaluate what storage options are available to you at UW-Madison. Storage is an evolving technology, so specific services will change over time, but here are good places to start learning about what is available to you:

  • Local IT – if you have a local IT group, then talk to them first about what local options may be available to you
  • Campus Computing Infrastructure (CCI) – if you need network storage or server storage that isn’t focused on high performance computing then CCI has several options that could work depending on your needs
  • Advanced Computing Initiative (ACI) – if you need to do high performance or high throughput computing then ACI has several options that could work depending on your needs
  • Division of Information Technology (DoIT) – if you need cloud storage, like Box.com, or local storage, like an external hard drive, then DoIT has solutions that could work for you as well

This can seem like a lot to think about, and to be honest it can be quite confusing at times. The good news is that you have help! Research Data Services (RDS) can be a great starting point for your storage needs. We can focus on the key question: what are you looking to do? Then we can help you evaluate some potential options for moving forward based on your needs.

To get started contact RDS at http://researchdata.wisc.edu/help/contact-us/ or contact me at cci@cio.wisc.edu

Tools: SpiderOak

What It Is: Cloud-based file storage, synchronization, and back-ups. SpiderOak is available on Windows, Linux, OS X, iOS, Android, and N900 Maemo.

Cost: Free, premium, and enterprise accounts available. SpiderOak’s storage pricing is better than Dropbox’s: $10/month gets you 100GB at SpiderOak vs. 50GB at Dropbox. SpiderOak also has no maximum storage limit. Additionally, it offers a 50% educational discount to anyone with a valid .edu email address.
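The price comparison above can be worked through as a cost per gigabyte. This is a small Python sketch using only the figures quoted here. It assumes, for illustration, that the 50% educational discount applies to the $10/month plan, which the pricing details above do not confirm.

```python
def cost_per_gb(monthly_price, gigabytes, discount=0.0):
    """Monthly cost per GB after applying a fractional discount (0.5 = 50%)."""
    return monthly_price * (1 - discount) / gigabytes


spideroak = cost_per_gb(10, 100)
dropbox = cost_per_gb(10, 50)
# Assumption: the educational discount applies to this plan.
spideroak_edu = cost_per_gb(10, 100, discount=0.5)

print(f"SpiderOak:       ${spideroak:.2f} per GB per month")
print(f"Dropbox:         ${dropbox:.2f} per GB per month")
print(f"SpiderOak (edu): ${spideroak_edu:.2f} per GB per month")
```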

Ease of Use: SpiderOak’s forte is security, not interface design. The web and mobile interfaces are fairly plain and not nearly as user-friendly as Dropbox’s. Additionally, while Dropbox has a very simple setup (everything goes in the Dropbox folder and syncs to all your devices unless you tell it not to), SpiderOak’s setup is a bit more involved. First, you need to set up a back-up. You can choose multiple folders and even specific types of files. After you’ve done this, you can sync the folders across your devices. Finally, access from the web and mobile interfaces is read-only. You can only upload files from the desktop client.

Sharing and Collaboration: SpiderOak provides ShareRooms, which allow you to selectively share folders with anyone (not just other SpiderOak users), but the files are read-only. It also allows sharing of a single file, but this is read-only as well. The sharing is more secure: the ShareRoom is accessed through a unique URL and a RoomKey (password) must be entered, but there is no mechanism for collaborative editing.

Organizing: Other than the traditional hierarchical file system structure, SpiderOak does not have any built-in organizational features.

Exporting: Files can easily be exported. Simply de-select the folders or files in question from the syncing and back-up.

Backups and Versioning: This is one area where SpiderOak does well. It saves all historical versions of a file and does extensive de-duplication, so only the parts that are different are saved, not the entire file.
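To illustrate the general idea of de-duplication, here is a minimal Python sketch that splits each file version into fixed-size chunks and stores each unique chunk only once, keyed by its SHA-256 hash. This is a simplified illustration of the technique, not SpiderOak’s actual implementation; the `ChunkStore` class and the tiny chunk size are assumptions chosen to keep the example readable.

```python
import hashlib

CHUNK_SIZE = 4  # Tiny on purpose; real systems use much larger chunks.


class ChunkStore:
    """Toy de-duplicating store: identical chunks are saved only once."""

    def __init__(self):
        self.chunks = {}  # SHA-256 hex digest -> chunk bytes

    def save_version(self, data):
        """Store one file version; return (recipe, number of new chunks)."""
        recipe = []
        new_chunks = 0
        for start in range(0, len(data), CHUNK_SIZE):
            chunk = data[start:start + CHUNK_SIZE]
            digest = hashlib.sha256(chunk).hexdigest()
            if digest not in self.chunks:
                self.chunks[digest] = chunk
                new_chunks += 1
            recipe.append(digest)
        return recipe, new_chunks

    def restore(self, recipe):
        """Rebuild a file version from its list of chunk digests."""
        return b"".join(self.chunks[digest] for digest in recipe)


store = ChunkStore()
v1, new1 = store.save_version(b"aaaabbbbcccc")
v2, new2 = store.save_version(b"aaaabbbbdddd")  # only the last chunk differs
print(new1, new2)
```

Saving the second version writes only the one chunk that changed, yet both versions can still be restored in full from their recipes.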

Security: SpiderOak is, as Ars Technica puts it, “Dropbox for the security obsessive.” Its main selling point is not that it’s cloud storage, but that it is secure cloud storage. Unlike the other major cloud storage services, SpiderOak employees cannot access your files. Both Dropbox and SpiderOak encrypt their data, but SpiderOak also encrypts the decryption key. The downside to SpiderOak’s superior security is that if you forget your password, your files are gone.

 

Tools: Dropbox

https://www.dropbox.com/

What It Is: Cloud-based file storage and synchronization

Cost: Free, premium, and enterprise accounts available.

Ease of Use: With its focus on appealing to a broad audience of users, the interface is designed to be simple and the software is engineered to be easy for the average person to install and use.

Sharing and Collaboration: Users can share folders or individual files; items can also be shared with people who do not have Dropbox accounts. The software is available on many platforms: in addition to its web interface, there are desktop clients for Windows, Mac, and Linux, as well as apps for the major mobile devices. Dropbox’s focus on mass appeal and cross-platform availability makes it a good fit both for groups whose users have varying levels of technical expertise and for groups whose members use different operating systems.

Organizing: Other than the traditional hierarchical filesystem structure, Dropbox does not have any built-in organizational features. There are add-ons for tagging and attaching notes, but that isn’t a particularly active area among the third-party development community, and the few available add-ons are still in alpha and beta testing. For group collaboration with Dropbox, keeping files findable depends on all group members following the same folder-organization and file-naming conventions.

Exporting: Dropbox also shines in its ability to easily export the files put into it. Items can simply be dragged and dropped to another folder on the computer.

Backups and Versioning: Two areas in which Dropbox does quite well are versioning and backups. It automatically creates conflicted copies of files that have been edited by multiple people at the same time, and all free accounts come with a 30-day file history, meaning that you can recover deleted files and revert to old versions of files from within the previous 30 days. Indefinite versioning and file history are available as a paid add-on, but that service is not retroactive, meaning that if you purchase it, the indefinite history will only apply to changes made after you’ve purchased the add-on.
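The 30-day file history can be thought of as a rolling window. The Python sketch below shows which saved versions would still be recoverable at a given moment. The function name and the treatment of a version exactly 30 days old are assumptions for illustration, not a description of how Dropbox implements its history.

```python
from datetime import datetime, timedelta

HISTORY_WINDOW = timedelta(days=30)


def recoverable_versions(version_times, now):
    """Return the version timestamps that fall within the history window."""
    cutoff = now - HISTORY_WINDOW
    # Assumption: a version exactly at the cutoff is still recoverable.
    return [t for t in version_times if t >= cutoff]


now = datetime(2024, 3, 31)
versions = [datetime(2024, 2, 1), datetime(2024, 3, 5), datetime(2024, 3, 30)]
# The February version is older than 30 days and drops out of the window.
print(recoverable_versions(versions, now))
```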

Security: Dropbox’s detractors have criticized it for a lack of adequate security. While there are both third-party apps and a number of DIY methods to make individual Dropbox accounts more secure, if you are working with sensitive data and/or data covered by legislation such as HIPAA or FERPA, you’ll want to look for a syncing solution that places a higher priority on security.