|
|
In both political behavior research and
voting rights litigation, turnout and vote choice for different
racial groups are often inferred using aggregate election results
and racial composition. Over the past several decades, many
statistical methods have been proposed to address this ecological
inference problem. We propose an alternative method to reduce
aggregation bias by predicting individual-level ethnicity from voter
registration records. Building on the existing methodological
literature, we use Bayes's rule to combine the Census Bureau's
Surname List with various information from geocoded voter
registration records. We evaluate the performance of the proposed
methodology using approximately nine million voter registration
records from Florida, where self-reported ethnicity is available. We
find that it is possible to reduce the false positive rate among
Black and Latino voters to 6% and 3%, respectively, while
maintaining the true positive rate above 80%. Moreover, we use our
predictions to estimate turnout by race and find that our estimates
yield substantially less bias and root mean squared
error than standard ecological inference estimates. We provide
open-source software to implement the proposed methodology.
|
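As a minimal sketch of the Bayes's rule step described above, the posterior probability of each racial category can be written as proportional to P(race | surname) times P(geolocation | race), under the assumption that surname and geolocation are conditionally independent given race. The function name, category labels, and all probabilities below are hypothetical illustrations. They are not taken from the Census Surname List, the Florida data, or the open-source software.

```python
def posterior_race(p_race_given_surname, p_geo_given_race):
    """Combine surname and geolocation evidence with Bayes's rule.

    Assumes surname and geolocation are conditionally independent
    given race, so that
        P(race | surname, geo) is proportional to
        P(race | surname) * P(geo | race).
    Both arguments are dicts keyed by the same race labels.
    """
    unnormalized = {
        race: p_race_given_surname[race] * p_geo_given_race[race]
        for race in p_race_given_surname
    }
    total = sum(unnormalized.values())
    if total == 0:
        raise ValueError("All categories have zero probability.")
    # Normalize so the posterior probabilities sum to one.
    return {race: weight / total for race, weight in unnormalized.items()}


# Hypothetical inputs for a single voter (illustrative values only).
p_race_given_surname = {"white": 0.60, "black": 0.30, "latino": 0.05, "other": 0.05}
p_geo_given_race = {"white": 0.001, "black": 0.004, "latino": 0.002, "other": 0.001}

posterior = posterior_race(p_race_given_surname, p_geo_given_race)
predicted = max(posterior, key=posterior.get)
```

With these illustrative inputs, the geolocation term raises the probability of the "black" category from 0.30 under the surname alone to roughly 0.62, which shows how residence information can overturn a surname-only prediction.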
Khanna, Kabir, Kosuke Imai, Santiago
Olivella, and Evan T. Rosenman. ``wru: Who Are You?
Bayesian Prediction of Racial Category Using Surname and
Geolocation.'' Available through the Comprehensive R
Archive Network (CRAN) and GitHub. |
Imai, Kosuke, Santiago Olivella, and Evan
T. Rosenman. (2022). ``Addressing Census
data problems in race imputation via fully Bayesian Improved
Surname Geocoding and name supplements.'' Science
Advances, Vol. 8, No. 49,
pp. 1-10. |
Rosenman, Evan T. R., Santiago Olivella,
and Kosuke Imai. (2023). ``Race and ethnicity
data for first, middle, and last names.''
Scientific Data, Vol. 10, No. 299,
pp. 1-11. |
McCartan, Cory, Robin Fisher, Jacob
Goldin, Daniel E. Ho, and Kosuke Imai. ``Estimating Racial Disparities When Race is Not
Observed.'' Journal of the American Statistical
Association, Forthcoming. |