Refresh Your Data Management Savvy This New Year

Happy New Year! The start of a new year and a new semester is as good a time as any to evaluate your data management practices. Here are some reminders about data management best practices, groups on campus that can help you manage your data, and some upcoming opportunities to sharpen your skills.

Tool: Scalar

What Is Scalar?

Scalar is a free, open-source authoring and publishing platform that lets users integrate multiple media types into born-digital scholarly works. Built by the Alliance for Networking Visual Culture, Scalar supports publications the length of an essay, an article, or even a book. Its flexible content management structure lets users adapt its features to their own needs.

OpenCitations Enhances Citation Data with COCI

OpenCitations has been working to make citation data easier to discover and retrieve. In July, OpenCitations released COCI, the OpenCitations Index of Crossref open DOI-to-DOI references. The initial release of COCI created first-class data entities out of citation information in order to index Crossref and to make this information machine-readable. The July release also included the OpenCitations Corpus (OCC), a repository of downloadable bibliographic and citation data. OpenCitations has continued to build on that data model and released the newest version of COCI this week: the data model has been extended, and the index now contains almost 450 million citation links between DOIs from Crossref reference data.
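
To make the idea of a citation as a "first-class data entity" a little more concrete, here is a minimal Python sketch. It is not the OpenCitations data model or API: the Citation class, its field names, the cited_by helper, and the DOIs are all placeholders invented for illustration. It only shows how treating each DOI-to-DOI link as its own record makes it simple to ask which works cite a given DOI.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Citation:
    """One DOI-to-DOI link (hypothetical structure, not the COCI schema)."""
    citing: str  # DOI of the work that makes the citation
    cited: str   # DOI of the work being cited


# Placeholder DOIs, made up for this example.
links = [
    Citation("10.1000/a", "10.1000/b"),
    Citation("10.1000/c", "10.1000/b"),
    Citation("10.1000/a", "10.1000/c"),
]


def cited_by(citations):
    """Group citing DOIs under the DOI they cite."""
    index = defaultdict(list)
    for citation in citations:
        index[citation.cited].append(citation.citing)
    return dict(index)


incoming = cited_by(links)
print(incoming["10.1000/b"])
```

Because each link is a separate record, the same list could just as easily be grouped by citing DOI, counted, or filtered.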

Tool: Tabula

Information adapted from the Tabula website.

What is Tabula?

If you’ve ever needed data that exists only in a PDF, you’ve likely discovered that you can’t easily copy and paste it, which makes the data difficult to use. Tabula is a free, open-source tool for “liberating data tables locked inside PDF files.”

For an example of Tabula being used to extract data for a visualization project, check out this blog post by the Jane Speaks Initiative. Other examples can also be found on the Tabula website.

What can Tabula help you do?

Tabula runs in your web browser. You browse to the PDF containing the data you need, select the portion of the PDF that contains the data tables, and extract the data from the tables into a CSV file or a Microsoft Excel spreadsheet.
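
Once a table has been exported to CSV, it can be read with any tool that understands CSV. As a minimal sketch, the Python below uses only the standard library. The sample text stands in for an exported file; its column names and values are made up for illustration, and read_table and to_int are hypothetical helper names, not part of Tabula.

```python
import csv
import io

# Made-up sample standing in for a CSV file exported from a PDF table.
SAMPLE_EXPORT = """Year,Enrollment
2016,"1,204"
2017,"1,310"
"""


def read_table(handle):
    """Return the table rows as a list of dicts keyed by the header row."""
    return list(csv.DictReader(handle))


def to_int(text):
    """Convert a number written with thousands separators to an int."""
    return int(text.replace(",", ""))


rows = read_table(io.StringIO(SAMPLE_EXPORT))
total = sum(to_int(row["Enrollment"]) for row in rows)
print(total)
```

With a real export, you would pass an open file instead of the sample text, for example open("tabula-export.csv", newline="", encoding="utf-8"), where the filename is whatever you chose when saving. Numbers in PDF tables often arrive as text with commas or other formatting, so a small cleanup step like to_int is usually needed before doing any arithmetic.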

How do you get it?

You can download Tabula for free from its website. It is also available on GitHub.

What else should you know?

Tabula works only with text-based PDFs; the developers note that it will not work with scanned documents. Tabula is available for Windows, Mac OS X, and Linux operating systems.