World Digital Preservation Day: Disaggregating COVID-19 Racial/Ethnic Data 

This year, the theme for World Digital Preservation Day (WDPD) is “Digits: for Good.” According to the Digital Preservation Coalition, this year’s theme “refers to the hard work, resilience, and responsiveness of our colleagues which will enable research and development data used in finding a vaccine for COVID-19 to be preserved, shared and studied….” Given this theme, its reference to the pandemic, and the current spike in COVID-19 cases across Wisconsin, it feels necessary for me as a Southeast Asian/HMoob librarian to talk about COVID-19 data disaggregated by race/ethnicity and to connect this issue to digital preservation.

One of the main values of digital preservation is that things are preserved for good, or at least for as long as possible, so that future generations can access and use them for research, memory-making and understanding. In this vein, the COVID-19 data currently being generated and collected will inform current and future research and public health responses, as well as how we remember the impact of the pandemic and the lessons we learned from it.

State and national health departments and organizations are collecting COVID-19 data along many data points, including age, gender, residence/location, and residence type. In Wisconsin, the Department of Health Services aggregates race/ethnicity into six racial/ethnic groups: Asian Pacific Islander (API), Black, Hispanic or Latinx, American Indian, White, and Multiple or other races. At first glance, one might applaud the six categories, especially the category of Multiple or other races. One might also say that these categories reflect the racial/ethnic demographics of Wisconsin. These categories, however, are not enough, because they lack nuance and overlook subpopulations. There is tremendous variation within these racial/ethnic groups: “API” can mean HMoob (who immigrated to the US as early as the 1970s) or an East Asian group, such as Chinese or Japanese, whose ancestors may have immigrated to the US more than a century ago.

In their paper “COVID-19: Revealing Unaddressed Systemic Barriers in the 45th Anniversary of the Southeast Asian American Experience,” Quyen T. Dinh, Katrina D. Mariategue and Anna H. Byon explore challenges facing Southeast Asian American (SEAA) communities, stating that “the seemingly lower rate of COVID-19 contraction by Asian American communities paints a misleading picture and conceals the real impacts facing SEAA communities who have higher rates of pre-existing chronic health conditions compared to other Asian American communities.” In short, shared racialized experiences among APIs do not mean shared material realities and outcomes. Disaggregated data is important because it gives a more complete picture of racial/ethnic disparities in exposure, testing, diagnosis, treatment and death.

Many folks in digital preservation might say that their role is to preserve the data as is, not to prescribe to researchers and collectors how data should be organized and presented. I push back against this idea because, when it comes to equity and diversity in digital preservation, equitable categories of analysis are as important as the preservation of and access to materials. Digital preservation involves managing digital objects to ensure their accuracy, functionality, and accessibility over time. This is an inherently political process, because it involves controlling memory and representation. To ensure the accuracy of digital objects over time, the push for disaggregated data is key. Disaggregated racial/ethnic data provides the nuance that allows digital preservation to accurately represent and help retell a collective experience in a way that honors the patterns and trends in each community.