An Introduction to Web Archiving for Research

Web archiving is the practice of collecting and preserving resources from the web. The best-known and most widely used web archive is the Internet Archive’s Wayback Machine. The Internet Archive was launched in 1996 by Brewster Kahle with the mission of providing “Universal Access to All Knowledge.” The Wayback Machine uses an automated process called crawling to collect pages from across the web and stores them on servers at the Internet Archive headquarters in San Francisco.

Institutions such as government agencies, universities, and libraries also actively archive the web, but often with a narrower collection scope. There are also many web archiving projects run by smaller teams and individual researchers, and these too usually have specific areas of focus. If there are web resources you are interested in collecting and preserving, a little research and some practice with the tools are enough to create your own web archive.
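
To give a rough sense of what crawling involves, here is a minimal Python sketch, not how the Wayback Machine or any particular archiving tool actually works. It starts from one page, saves the HTML, follows the links it finds, and stays on the starting site. The names `LinkExtractor`, `crawl`, `fetch`, and `max_pages` are hypothetical and chosen only for illustration. The `fetch` function is supplied by the caller, so the sketch can be tried on a small in-memory "site" without touching the network.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse


class LinkExtractor(HTMLParser):
    """Collects the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(start_url, fetch, max_pages=10):
    """Breadth-first crawl limited to the host of start_url.

    fetch is a hypothetical, caller-supplied function that takes a URL
    and returns that page's HTML as a string, or None if unavailable.
    Returns a dict mapping each saved URL to its HTML.
    """
    host = urlparse(start_url).netloc
    queue = deque([start_url])
    seen = {start_url}
    archive = {}

    while queue and len(archive) < max_pages:
        url = queue.popleft()
        html = fetch(url)
        if html is None:
            continue
        archive[url] = html  # "preserve" the page

        parser = LinkExtractor()
        parser.feed(html)
        for href in parser.links:
            # Resolve relative links and drop any #fragment part.
            absolute, _fragment = urldefrag(urljoin(url, href))
            if urlparse(absolute).netloc == host and absolute not in seen:
                seen.add(absolute)
                queue.append(absolute)

    return archive


# A made-up three-page site standing in for the live web.
demo_pages = {
    "http://example.org/": '<a href="/a">A</a> <a href="http://other.org/x">X</a>',
    "http://example.org/a": '<a href="/">home</a> <a href="/b#top">B</a>',
    "http://example.org/b": "<p>No links here.</p>",
}
demo_archive = crawl("http://example.org/", demo_pages.get)
```

A real crawl would also need to respect each site's robots.txt and terms of use, pause between requests, and save pages in a preservation format such as WARC. Established web archiving tools handle those details for you.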

Please note: if you are archiving web pages, forums, social media, or other web materials for research purposes, and the work may constitute human subjects research, you must consult and follow the appropriate UW-Madison Institutional Review Board process, as well as the IRB's guidelines on “Technology & New Media Research”.

Tools: Transana

Transana
http://www.transana.org/

Description: “Transana is software for professional researchers who want to analyze digital video or audio data. Transana lets you analyze and manage your data in very sophisticated ways. Transcribe it, identify analytically interesting clips, assign keywords to clips, arrange and rearrange clips, create complex collections of interrelated clips, explore relationships between applied keywords, and share your analysis with colleagues. The result is a new way to focus on your data, and a new way to manage large collections of video and audio files and clips.”

Cost/legal restrictions: Transana is licensed under the GNU GPL; purchase and licensing details are on the “Transana is Open Source” page. Source code is available from the SourceForge Transana project page.

Notes: Developed at the Wisconsin Center for Education Research, University of Wisconsin-Madison. Part of the Digital Insight project.