In both political behavior research and
voting rights litigation, turnout and vote choice for different
racial groups are often inferred using aggregate election results
and racial composition. Over the past several decades, many
statistical methods have been proposed to address this ecological
inference problem. We propose an alternative method to reduce
aggregation bias by predicting individual-level ethnicity from voter
registration records. Building on the existing methodological
literature, we use Bayes's rule to combine the Census Bureau's
Surname List with additional information from geocoded voter
registration records. We evaluate the performance of the proposed
methodology using approximately nine million voter registration
records from Florida, where self-reported ethnicity is available. We
find that it is possible to reduce the false positive rate among
Black and Latino voters to 6% and 3%, respectively, while
maintaining the true positive rate above 80%. Moreover, we use our
predictions to estimate turnout by race and find that our estimates
yield substantially less bias and lower root mean squared
error than standard ecological inference estimates. We provide
open-source software to implement the proposed methodology.
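
The Bayes's rule combination described above can be illustrated with a minimal sketch. It assumes that surname and geolocation are conditionally independent given race, so that the posterior is proportional to P(race | surname) times P(location | race). The function name `posterior_race` and all probabilities below are hypothetical and made up for illustration; they are not drawn from the Census Surname List, the Florida data, or the released software.

```python
def posterior_race(p_race_given_surname, p_geo_given_race):
    """Combine surname and location evidence via Bayes's rule.

    Assumes surname and location are conditionally independent given race,
    so P(race | surname, geo) is proportional to
    P(race | surname) * P(geo | race).
    """
    unnormalized = {
        race: p_race_given_surname[race] * p_geo_given_race[race]
        for race in p_race_given_surname
    }
    total = sum(unnormalized.values())
    if total == 0:
        raise ValueError("no racial group has positive probability")
    # Normalize so the posterior probabilities sum to one.
    return {race: weight / total for race, weight in unnormalized.items()}


# Illustrative, made-up inputs (NOT Census or Florida figures).
p_race_given_surname = {"white": 0.60, "black": 0.30, "latino": 0.05, "other": 0.05}
p_geo_given_race = {"white": 0.001, "black": 0.004, "latino": 0.002, "other": 0.001}

posterior = posterior_race(p_race_given_surname, p_geo_given_race)
predicted = max(posterior, key=posterior.get)
```

In this toy case the surname alone favors "white", but the location evidence shifts the highest posterior probability to "black", which is the kind of correction that geocoded information is meant to supply.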