The Impact of Data Invisibility and the Need for Disaggregation

Data on ethnicity and race in the United States is often lumped into five or six broad categories, in ways that can render communities invisible or hide the disparate impacts of inequality on subgroups. Broad categories such as Hispanic/Latino, Black or African American, White, American Indian or Alaska Native, and Native Hawaiian or Other Pacific Islander can make it easier to study groups, especially in relation to each other. But they also flatten the experiences of those communities, replicate marginalization, and often fail to account for the intersectionality of identities. To collect data that better reflects the realities of these communities and leads to more just policies and outcomes, government organizations and researchers alike need to disaggregate their racial and ethnic data.

Data disaggregation, the breaking down of larger categories into more specific subgroups, can help researchers, community members, and policy makers better understand the needs of those groups. The COVID-19 pandemic has highlighted the need for better disaggregation practices, both to understand the impact on the communities most affected and to target resources where they are needed. Because most states aggregate Asian Americans with Native Hawaiians and Pacific Islanders in their COVID-19 reporting, the higher coronavirus death rate among Native Hawaiians and Pacific Islanders can become hidden in the overall picture. Wisconsin aggregates its COVID-19 data for Asian Pacific Islander populations, which paints a misleading picture with regard to the Hmong community and other Southeast Asian Americans. It is also difficult to disaggregate data that is either never collected or miscategorized, and this has been a barrier for American Indian and Alaska Native communities trying to learn the true impact of COVID-19. Along with incomplete racial and ethnic COVID-19 data, very few states have collected COVID-19 data that includes gender identity or sexual orientation, making it difficult to know how LGBTQ+ members of BIPOC communities have fared in health outcomes or vaccine access.
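As a rough illustration of how an aggregated rate can hide a subgroup's experience, here is a minimal Python sketch. The counts, the subgroup names, and the helper functions `rate_per_100k` and `rates_by` are all invented for this example; they are not real COVID-19 figures and do not come from any state's reporting.

```python
# Hypothetical counts, invented purely for illustration (NOT real COVID-19 data).
# Each record: (broad category, subgroup, population, deaths)
records = [
    ("Broad category X", "Subgroup A", 90_000, 45),
    ("Broad category X", "Subgroup B", 10_000, 25),
]


def rate_per_100k(deaths, population):
    """Deaths per 100,000 people."""
    return deaths * 100_000 / population


def rates_by(rows, key_index):
    """Sum population and deaths by the chosen column, then compute rates."""
    totals = {}
    for row in rows:
        key = row[key_index]
        population, deaths = totals.get(key, (0, 0))
        totals[key] = (population + row[2], deaths + row[3])
    return {
        key: rate_per_100k(deaths, population)
        for key, (population, deaths) in totals.items()
    }


aggregated = rates_by(records, 0)     # one rate for the broad category
disaggregated = rates_by(records, 1)  # one rate per subgroup

for name, rate in {**aggregated, **disaggregated}.items():
    print(f"{name}: {rate:.1f} deaths per 100,000")
```

With these made-up numbers, the broad category shows 70 deaths per 100,000, while Subgroup B, which is only a tenth of the population, has a rate of 250 per 100,000. The aggregated figure alone would give no hint of that gap.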

Without adequate disaggregation, studies can perpetuate perceptions that groups behave as monoliths. A study from the Pew Research Center claims that only 3% of Hispanics use the term Latinx, a gender-neutral alternative to Latino/Latina. While the study disaggregates by factors such as whether participants are foreign-born, languages spoken, and education level, it doesn’t disaggregate by sexual orientation or by gender identity beyond the binary of male and female. This makes it impossible to know whether nonbinary, transgender, or other LGBTQ+ Latinxs were included in the study. Intersectionality must be taken into account when disaggregating data about communities, so that the differences between subgroups are respected and the voices of those most impacted are included when studying a given subject.

In “The Critical Role of Racial/Ethnic Data Disaggregation for Health Equity,” Tina J. Kauh, Jen’nan Ghazal Read, and A. J. Scheitler discuss how data aggregation perpetuates health inequity in several communities by leaving out important differences:

Aggregated data perpetuates the model minority myth and obscures the challenges people within Asian American communities face.
Ethnic, cultural, and linguistic differences among the 562 Native American nations, along with differences in access to health resources depending on whether people live on reservations.
Differences in nativity, immigration status, and language among those who identify as Black or African American, including those who have immigrated from Africa or the Caribbean.
Three quarters of the Hispanic/Latino population identify as Mexican, Puerto Rican, or Cuban, groups that each have vastly different cultures and histories.
Differences in origin, immigration status, and language for those who identify as White, which may include people from Western and Eastern Europe, the Middle East, and North Africa.

When considering how far to disaggregate data, the Race Matters Institute of Just Partners suggests disaggregating data “with sufficient detail to understand varying groups’ circumstances.” They also suggest going beyond individual-level indicators where possible to include structural indicators, because without structural data, individuals may be blamed for inequitable outcomes. Carlos Sánchez Huizar likewise highlights the importance of including marginalized communities in the data dialogue and of engaging community leaders, constituents, and data experts in better data collection practices. Ultimately, in our own data practices, it’s important to center communities and their needs, to recognize how much it matters for community members to see themselves reflected in the data, and to remember that data treating groups as monoliths renders them invisible and can mask harm.

Research Data Services (RDS) is an interdisciplinary organization committed to advancing research data management practice on the UW-Madison campus. We focus on providing researchers with the tools and resources that support their efforts to store, analyze and share data.