|
|
Over the last two decades, the amount and variety of data available to social scientists have dramatically increased. While in the 1990s most researchers were analyzing a handful of national surveys and government data, today's quantitative social scientists conduct their own randomized experiments and surveys and analyze a diverse array of large-scale data sets, ranging from textual to spatial data. This emerging trend demands new statistical methodologies that enable social scientists to overcome these data-analytical and computational challenges. I have developed fast and reliable computational methods for popular Bayesian models such as the multinomial probit and ecological inference models. I have also worked on the development of computational methods for large-scale data sets in social science research. They include the fast and scalable estimation of various ideal point models for massive data, a dynamic clustering method for large-scale product-level trade data, a dynamic regression model for networks, analyses of textual and video data, simulation and enumeration methods for redistricting, and a method for record linkage with large-scale administrative data. |
Algorithm-assisted human
decision-making and policy learning: |
Imai, Kosuke, Zhichao Jiang, D. James
Greiner, Ryan Halen, and Sooahn Shin. (2023). ``Experimental Evaluation of
Algorithm-Assisted Human Decision-Making: Application to Pretrial
Public Safety Assessment.'' (with discussion)
Journal of the Royal Statistical Society, Series A (Statistics
in Society), Vol. 186, No. 2 (April), pp. 167-189. Read
before the Royal Statistical Society. |
Ben-Michael, Eli, D. James Greiner, Kosuke
Imai, and Zhichao Jiang. ``Safe Policy Learning through
Extrapolation: Application to Pre-trial Risk
Assessment.'' Journal of the American Statistical
Association, Forthcoming. |
Ben-Michael, Eli, D. James Greiner, Melody
Huang, Kosuke Imai, Zhichao Jiang, and Sooahn Shin. ``Does AI help humans make better decisions? A
methodological framework for experimental
evaluation.'' |
Imai, Kosuke and Zhichao
Jiang. (2023). ``Principal Fairness for Human and
Algorithmic Decision-Making.'' Statistical
Science, Vol. 38, No. 2 (July), pp. 317-328. |
Ben-Michael, Eli, Kosuke Imai, and Zhichao
Jiang. (2024). ``Policy Learning with Asymmetric
Counterfactual Utilities.'' Journal of the
American Statistical Association, Vol. 119, No. 548,
pp. 3045-3058. |
Zhang, Yi, Eli Ben-Michael, and Kosuke
Imai. ``Safe Policy
Learning under Regression Discontinuity Designs.''
|
Jia, Zeyang, Eli Ben-Michael, and Kosuke
Imai. ``Bayesian
Safe Policy Learning with Chance Constrained Optimization:
Application to Military Security Assessment during the Vietnam
War.'' Journal of the Royal Statistical Society,
Series A (Statistics in Society), Forthcoming. |
Koch, Benedikt and Kosuke Imai. ``Statistical Decision Theory with
Counterfactual Loss.'' |
Heterogeneous treatment effects: |
Imai, Kosuke, and Aaron
Strauss. (2011). ``Estimation of Heterogeneous
Treatment Effects from Randomized Experiments, with Application to
the Optimal Planning of the Get-out-the-vote
Campaign.'' Political Analysis, Vol. 19, No. 1
(Winter), pp. 1-19. (lead article) Winner of Political Analysis
Editors' Choice Award.
|
Imai, Kosuke and Marc Ratkovic. (2013).
``Estimating Treatment
Effect Heterogeneity in Randomized Program
Evaluation.'' Annals of Applied Statistics,
Vol. 7, No. 1 (March), pp. 443-470. Winner of the Tom Ten Have
Memorial Award.
|
Imai, Kosuke and Michael Lingzhi
Li. (2023). ``Experimental Evaluation
of Individualized Treatment Rules.'' Journal of
the American Statistical Association, Vol. 118, No. 541,
pp. 242-256. |
Li, Michael Lingzhi and Kosuke
Imai. (2024). ``Neyman
Meets Causal Machine Learning: Experimental Evaluation of
Individualized Treatment Rules.'' Journal of
Causal Inference, Vol. 12, No. 1, pp. 1-20. Special Issue on
Neyman (1923) and its influences on causal
inference. |
Imai, Kosuke and Michael Lingzhi
Li. (2025). ``Statistical
Inference for Heterogeneous Treatment Effects Discovered by
Generic Machine Learning in Randomized Experiments.''
Journal of Business & Economic Statistics,
Forthcoming. |
Li, Michael Lingzhi and Kosuke
Imai. ``Statistical
Performance Guarantee for Subgroup Identification with Generic
Machine Learning.'' |
Jia, Zeyang, Kosuke Imai, and Michael
Lingzhi Li. ``Cramming
Contextual Bandits for On-policy Statistical
Evaluation.'' |
Zhang, Yi and Kosuke Imai. ``Individualized Policy
Evaluation and Learning under Clustered Network
Interference.'' |
Zhang, Yi, Melody Huang, and Kosuke
Imai. ``Minimax
Regret Estimation for Generalizing Heterogeneous Treatment Effects
with Multisite Data.'' |
Zhou, Lingxiao, Kosuke Imai, Jason
Lyall, and Georgia Papadogeorgou. ``Estimating
Heterogeneous Treatment Effects for Spatio-Temporal Causal
Inference: How Economic Assistance Moderates the Effects of
Airstrikes on Insurgent Violence.'' |
High-dimensional treatments: |
Egami, Naoki, and Kosuke
Imai. (2019). ``Causal
Interaction in Factorial Experiments: Application to Conjoint
Analysis.'' Journal of the American Statistical
Association, Vol. 114, No. 526 (June),
pp. 529-540. |
de la Cuesta, Brandon, Naoki Egami, and
Kosuke Imai. (2022). ``Experimental Design and
Statistical Inference for Conjoint Analysis: The Essential Role of
Population Distribution.'' Political
Analysis, Vol. 30, No. 1 (January), pp. 19-45. |
Goplerud, Max, Kosuke Imai, and Nicole
E. Pashley. (2025). ``Estimating Heterogeneous
Causal Effects of High-Dimensional Treatments: Application to
Conjoint Analysis.'' Annals of Applied Statistics,
Vol. 19, No. 2 (June), pp. 866-888. |
Ham, Dae Woong, Kosuke Imai, and Lucas
Janson. (2024). ``Using
Machine Learning to Test Causal Hypotheses in Conjoint
Analysis.'' Political Analysis, Vol. 32,
No. 3 (July), pp. 329-344. |
High-dimensional propensity score: |
Ning, Yang, Sida Peng, and Kosuke
Imai. (2020). ``Robust
Estimation of Causal Effects via High-Dimensional Covariate
Balancing Propensity Score.'' Biometrika,
Vol. 107, No. 3 (September), pp. 533–554. |
Clustering and scaling methods for
large-scale data: |
Imai, Kosuke, James Lo, and Jonathan
Olmsted. (2016). ``Fast Estimation of Ideal
Points with Massive Data.'' American Political
Science Review, Vol. 110, No. 4 (December),
pp. 631-656.
|
Kim, In Song, Steven Liao, and Kosuke
Imai. (2020). ``Measuring Trade Profile with
Granular Product-level Trade Data.'' American
Journal of Political Science, Vol. 64, No. 1 (January),
pp. 102-117. |
Olivella, Santiago, Tyler Pratt, and
Kosuke Imai. (2022). ``Dynamic Stochastic Blockmodel
Regression for Network Data: Application to International
Conflicts.'' Journal of the American
Statistical Association, Vol. 117, No. 539, pp. 1068-1081.
|
Lo, Adeline, Santiago Olivella, and Kosuke
Imai. ``A
Statistical Model of Bipartite Networks: Application to
Cosponsorship in the United States Senate.''
|
Analysis of unstructured data:
texts, video, and maps: |
Imai, Kosuke and Kentaro Nakamura. ``Gen-AI Powered
Inference.'' |
Imai, Kosuke and Kentaro Nakamura. ``Causal Representation Learning
with Generative Artificial Intelligence: Application to Texts as
Treatments.'' |
McCartan, Cory, Jacob Brown, and Kosuke
Imai. (2024). ``Measuring and Modeling
Neighborhoods.'' American Political Science
Review, Vol. 118, No. 4 (November),
pp. 1966-1985. |
Breuer, Adam, Bryce J. Dietrich, Michael
H. Crespin, Matthew Butler, J.A. Pyrse, and Kosuke Imai. ``Using AI to Summarize US
Presidential Campaign TV Advertisement Videos,
1952-2012.'' Scientific Data,
Forthcoming. |
Tarr, Alexander, June Hwang, and Kosuke
Imai. (2023). ``Automated Coding of
Political Campaign Advertisement Videos: An Empirical Validation
Study.'' Political Analysis, Vol. 31, No. 4
(October), pp. 554-574. |
Eshima, Shusei, Kosuke Imai, and Tomoya
Sasaki. (2024). ``Keyword-Assisted Topic
Models.'' American Journal of Political
Science, Vol. 68, No. 2 (April),
pp. 730-750. |
Algorithms for legislative
redistricting and applications: |
Miyazaki, Sho, Kento Yamada, and Kosuke
Imai. ``Estimating
the Partisan Bias of Japanese Legislative Redistricting Plans Using
a Simulation Algorithm.'' |
McCartan, Cory, Christopher Kenny, Tyler
Simko, Emma Ebowe, Michael Zhao, and Kosuke Imai. ``Redistricting Reforms Reduce
Gerrymandering by Constraining Partisan Actors.''
|
Kenny, Christopher T., Cory McCartan,
Tyler Simko, Shiro Kuriwaki, and Kosuke Imai. (2023). ``Widespread Partisan
Gerrymandering Mostly Cancels Nationally, but Reduces Electoral
Competition.'' Proceedings of the National
Academy of Sciences, Vol. 120, No. 25,
e2217322120. |
McCartan, Cory, Christopher T. Kenny,
Tyler Simko, George Garcia III, Kevin Wang, Melissa Wu, Shiro
Kuriwaki, and Kosuke Imai. (2022). ``Simulated redistricting plans
for the analysis and evaluation of redistricting in the United
States: 50stateSimulations.'' Scientific
Data, Vol. 9, No. 689, pp. 1-10. |
Kenny, Christopher T., Shiro Kuriwaki,
Cory McCartan, Evan T.R. Rosenman, Tyler Simko, and Kosuke
Imai. (2023). ``Comment: The Essential Role
of Policy Evaluation for the 2020 Census Disclosure Avoidance
System.'' Harvard Data Science Review,
Special Issue 2: Differential Privacy for the 2020 U.S. Census
(January), pp. 1-16. |
McCartan, Cory and Kosuke
Imai. (2023). ``Sequential Monte Carlo for
Sampling Balanced and Compact Redistricting Plans.''
Annals of Applied Statistics, Vol. 17, No. 4 (December),
pp. 3300-3323. |
Fifield, Benjamin, Michael Higgins,
Kosuke Imai, and Alexander Tarr. (2020). ``Automated Redistricting Simulation
Using Markov Chain Monte Carlo.'' Journal of
Computational and Graphical Statistics, Vol. 29, No. 4,
pp. 715-728. |
Fifield, Benjamin, Kosuke Imai, Jun
Kawahara, and Christopher T. Kenny. (2020). ``The Essential Role of
Empirical Validation in Legislative Redistricting
Simulation.'' Statistics and Public Policy,
Vol. 7, No. 1, pp. 52-68. |
Census and differential privacy: |
Kenny, Christopher T., Shiro Kuriwaki, Cory
McCartan, Evan T.R. Rosenman, Tyler Simko, and Kosuke
Imai. (2021). ``The Use
of Differential Privacy for Census Data and its Impact on
Redistricting: The Case of the 2020 U.S. Census.''
Science Advances, Vol. 7, No. 7 (October),
pp. 1-17. |
McCartan, Cory, Tyler Simko, and Kosuke
Imai. (2023). ``Researchers need better
access to US Census data.'' Science,
Vol. 380, No. 6648, pp. 902-903. |
McCartan, Cory, Tyler Simko, and Kosuke
Imai. (2023). ``Making Differential Privacy
Work for Census Data Users.'' Harvard Data
Science Review, Vol. 5, No. 4 (Fall).
|
Kenny, Christopher, Shiro Kuriwaki, Cory
McCartan, Tyler Simko, and Kosuke Imai. (2024). ``Evaluating Bias and Noise
Induced by the U.S. Census Bureau's Privacy Protection
Methods.'' Science Advances, Vol. 10, No. 18
(May), pp. 1-13. |
Record linkage methods: |
Enamorado, Ted, Benjamin Fifield, and
Kosuke Imai. (2019). ``Using a Probabilistic Model to
Assist Merging of Large-scale Administrative
Records.'' American Political Science
Review, Vol. 113, No. 2 (May), pp. 353-371.
|
Enamorado, Ted, and Kosuke
Imai. (2019). ``Validating Self-reported
Turnout by Linking Public Opinion Surveys with Administrative
Records.'' Public Opinion Quarterly,
Vol. 83, No. 4 (Winter), pp. 723–748. |
Multinomial probit models: |
Imai, Kosuke, and David A. van
Dyk. (2005). ``A Bayesian
Analysis of the Multinomial Probit Model Using Marginal Data
Augmentation.'' Journal of Econometrics,
Vol. 124, No. 2 (February), pp. 311-334. |
Imai, Kosuke, and David A. van
Dyk. (2005). ``MNP: R Package
for Fitting the Multinomial Probit Model.''
Journal of Statistical Software, Vol. 14, No. 3 (May),
pp. 1-32. abstract
reprinted in Journal of Computational and Graphical
Statistics, (2005) Vol. 14, No. 3 (September), p. 747. |
Ecological inference and racial
prediction models: |
Imai, Kosuke, and Gary King. (2004). ``Did Illegal Overseas
Absentee Ballots Decide the 2000 U.S. Presidential
Election?'' Perspectives on Politics,
Vol. 2, No. 3 (September), pp. 537-549. Our analysis is a part of The New York Times article, ``How
Bush Took Florida: Mining the Overseas Absentee Vote'' By David
Barstow and Don van Natta Jr. July 15, 2001, Page 1, Column
1.
|
Imai, Kosuke, Ying Lu, and Aaron
Strauss. (2008). ``Bayesian and Likelihood
Inference for 2 x 2 Ecological Tables: An Incomplete Data
Approach.'' Political Analysis, Vol. 16,
No. 1 (Winter), pp. 41-69.
|
Imai, Kosuke, Ying Lu, and Aaron
Strauss. (2011). ``eco:
R Package for Ecological Inference in 2 x 2
Tables.'' Journal of Statistical Software,
Vol. 42, No. 5 (Special Volume on Political Methodology),
pp. 1-23.
|
Imai, Kosuke and Kabir
Khanna. (2016). ``Improving
Ecological Inference by Predicting Individual Ethnicity from Voter
Registration Record.'' Political Analysis,
Vol. 24, No. 2 (Spring), pp. 263-272.
|
Imai, Kosuke, Santiago Olivella, and Evan
T.R. Rosenman. (2022). ``Addressing Census data
problems in race imputation via fully Bayesian Improved Surname
Geocoding and name supplements.'' Science
Advances, Vol. 8, Issue 49, pp. 1-10. |
Rosenman, Evan T.R., Santiago Olivella, and
Kosuke Imai. (2023). ``Race and ethnicity data for
first, middle, and last names.'' Scientific
Data, Vol. 10, No. 299, pp. 1-11. |
McCartan, Cory, Robin Fisher, Jacob
Goldin, Daniel E. Ho, and Kosuke Imai. ``Estimating Racial Disparities When
Race is Not Observed.'' Journal of the American
Statistical Association, Forthcoming. |
Imai, Kosuke, Ying Lu, and Aaron Strauss. ``eco: R Package for
Ecological Inference in 2 x 2 Tables.''
available through The
Comprehensive R Archive Network. 2004-2009. |
Imai, Kosuke, and David A. van
Dyk. ``MNP: R Package for Fitting the
Multinomial Probit Model.''
available through The
Comprehensive R Archive Network. 2004-2008. |
Khanna, Kabir, and Kosuke Imai. ``wru: Who Are You? Bayesian
Predictions of Racial Category Using Surname and
Geolocation.'' available through GitHub. 2015. |
Fifield, Benjamin, Christopher T. Kenny,
Cory McCartan, Alexander Tarr, and Kosuke Imai. ``redist: Computational
Algorithms for Redistricting Simulation.'' available
through The Comprehensive R
Archive Network and GitHub.
|
Imai, Kosuke, James Lo, and Jonathan
Olmsted. ``emIRT: EM
Algorithms for Estimating Item Response Theory
Models.'' available through The Comprehensive R
Archive Network and GitHub. 2015.
|