Although it is widely known that
self-reported turnout rates obtained from public opinion surveys
tend to substantially overestimate actual turnout rates,
scholars sharply disagree on what causes this bias. Some blame
overreporting due to social desirability, whereas others attribute
it to non-response bias and inaccurate turnout validation.
While we can validate self-reported turnout by directly linking
surveys with administrative records, most existing studies rely on
proprietary merging algorithms with little scientific transparency
and report conflicting results. To shed a light on this debate, we
apply a probabilistic record linkage model, implemented via the
open-source software package
fastLink, to merge two
major election studies -- the American National Election Studies and
the Cooperative Congressional Election Study -- with a national
voter file of over 180 million records. For both studies,
fastLink
successfully produces validated turnout rates close to the actual
turnout rates, leading to public-use validated turnout data for the
two studies. Using these merged data sets, we find that the bias of
self-reported turnout originates primarily from overreporting rather
than non-response. Our findings suggest that respondents who are more
educated and more interested in politics are more likely to overreport turnout.
Finally, we show that
fastLink performs as well
as a proprietary algorithm.
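
To make the record-linkage step concrete, the following is a minimal,
self-contained sketch of the Fellegi-Sunter style probabilistic model that
underlies fastLink, written in Python purely for illustration. The example
records, comparison fields, agreement probabilities (m and u), and the
match-weight threshold are all hypothetical; fastLink itself is an R package
that estimates these quantities with an EM algorithm and scales to files with
many millions of records, none of which this toy attempts to reproduce.

\begin{verbatim}
# A toy illustration of Fellegi-Sunter style probabilistic record linkage.
# Not the fastLink implementation: all records, fields, and probabilities
# below are hypothetical.
from math import log

survey = [
    {"id": "s1", "first": "maria", "last": "lopez", "byear": 1984, "zip": "48104"},
    {"id": "s2", "first": "john",  "last": "smith", "byear": 1969, "zip": "60614"},
]
voter_file = [
    {"id": "v1", "first": "maria", "last": "lopez", "byear": 1984, "zip": "48103"},
    {"id": "v2", "first": "jon",   "last": "smith", "byear": 1969, "zip": "60614"},
    {"id": "v3", "first": "ann",   "last": "chen",  "byear": 1990, "zip": "94110"},
]

FIELDS = ["first", "last", "byear", "zip"]
# Assumed agreement probabilities: m = P(agree | match), u = P(agree | non-match).
# fastLink estimates these from the data via an EM algorithm; here they are fixed.
m = {"first": 0.95, "last": 0.97, "byear": 0.98, "zip": 0.90}
u = {"first": 0.01, "last": 0.01, "byear": 0.05, "zip": 0.02}

def agreement(a, b):
    # Binary agreement vector over the comparison fields.
    return {f: int(a[f] == b[f]) for f in FIELDS}

def match_weight(gamma):
    # Log likelihood ratio of match vs. non-match for an agreement pattern.
    return sum(log(m[f] / u[f]) if gamma[f] else log((1 - m[f]) / (1 - u[f]))
               for f in FIELDS)

THRESHOLD = 2.0  # assumed cutoff; in practice chosen to control false matches
for s in survey:
    weight, best = max(((match_weight(agreement(s, v)), v) for v in voter_file),
                       key=lambda t: t[0])
    status = "matched" if weight > THRESHOLD else "unmatched"
    print(s["id"], "->", best["id"], round(weight, 2), status)
\end{verbatim}

In the merge described above, the analogous comparison would be between survey
respondents' identifying fields and the national voter file, with the
estimated match probabilities determining each respondent's validated turnout
status.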