|
|
This paper introduces the largest and most comprehensive
dataset of US presidential campaign television
advertisements, available in digital format. The dataset also
includes machine-searchable transcripts and high-quality summaries
designed to facilitate a variety of academic research. To date,
there has been great interest in collecting and analyzing US
presidential campaign advertisements, but the need for manual
procurement and annotation has led many to rely on smaller subsets. We
design a large-scale, parallelized, AI-based analysis pipeline that
automates the laborious process of preparing, transcribing, and
summarizing videos. We then apply this methodology to the 9,707
presidential ads from the Julian P. Kanter Political Commercial
Archive. We conduct extensive human evaluations to show that these
transcripts and summaries match the quality of manually generated
alternatives. We illustrate the value of this data by including an
application that tracks the genesis and evolution of current focal
issue areas over seven decades of presidential elections. Our
analysis pipeline and codebase also show how to use LLM-based tools
to obtain high-quality summaries for other video
datasets. |
Tarr, Alexander, June Hwang, and Kosuke
Imai. (2023). "Automated Coding
of Political Campaign Advertisement Videos: An Empirical
Validation Study." Political Analysis,
Vol. 31, No. 4 (October), pp. 554-574. |