A century ago, Neyman showed how to evaluate the efficacy of a treatment using a randomized experiment under a minimal set of assumptions. This classical repeated sampling framework serves as the basis for the routine experimental analyses that today's scientists conduct across disciplines. In this paper, we demonstrate that Neyman's methodology can also be used to experimentally evaluate the efficacy of individualized treatment rules (ITRs), which are derived by modern causal machine learning algorithms. In particular, we show how to account for the additional uncertainty that arises when an ITR is trained via cross-fitting. The primary advantage of Neyman's approach is that it applies to any ITR, regardless of the properties of the machine learning algorithm used to derive it. We also show, somewhat surprisingly, that for certain metrics it is more efficient to conduct this ex-post experimental evaluation of an ITR than to conduct an ex-ante experimental evaluation that randomly assigns some units to the ITR. Our analysis demonstrates that Neyman's repeated sampling framework is as relevant for causal inference today as it has been since its inception.
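
To make the evaluation concrete, below is a minimal sketch in R of the kind of Neyman-style estimator involved: a fixed binary rule f is scored on data from a completely randomized experiment by averaging outcomes from units whose realized assignment agrees with the rule, reweighted by the randomization probabilities. The function name and arguments are hypothetical, and the standard error shown is a crude conservative approximation rather than the paper's exact variance, which additionally accounts for the uncertainty induced by cross-fitting.

# Minimal sketch (assumptions: complete randomization; binary rule f
# fixed in advance of this evaluation sample; all names illustrative).
evaluate_fixed_itr <- function(Y, treat, f) {
  stopifnot(length(Y) == length(treat), length(Y) == length(f))
  n1 <- sum(treat == 1)
  n0 <- sum(treat == 0)
  # Treated units that the rule would treat, and control units that it
  # would leave untreated, are informative about outcomes under the rule.
  z1 <- (f * Y)[treat == 1]        # contributions from the treatment arm
  z0 <- ((1 - f) * Y)[treat == 0]  # contributions from the control arm
  est <- mean(z1) + mean(z0)       # unbiased for E[Y(f(X))] under randomization
  se  <- sqrt(var(z1) / n1 + var(z0) / n0)  # crude Neyman-type approximation
  c(estimate = est, se = se)
}

# Hypothetical usage: f is a 0/1 vector produced by a rule trained on an
# independent sample (e.g., a held-out fold under cross-fitting).
# evaluate_fixed_itr(Y = dat$outcome, treat = dat$treat, f = rule(dat$X))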
Imai, Kosuke and Michael Lingzhi Li. (2023). ``Experimental Evaluation of Individualized Treatment Rules.'' Journal of the American Statistical Association, Vol. 118, No. 541, pp. 242-256.
Imai, Kosuke and Michael Lingzhi Li. (2025). ``Statistical Inference for Heterogeneous Treatment Effects Discovered by Generic Machine Learning in Randomized Experiments.'' Journal of Business & Economic Statistics, Vol. 43, No. 1, pp. 256-268.
Li, Michael Lingzhi and Kosuke Imai. ``Statistical Performance Guarantee for Subgroup Identification with Generic Machine Learning.''
Jia, Zeyang, Kosuke Imai, and Michael Lingzhi Li. ``Cramming Contextual Bandits for On-policy Statistical Evaluation.''
Li, Michael Lingzhi and Kosuke Imai. ``evalITR: Evaluating Individualized Treatment Rules.'' Available through the Comprehensive R Archive Network (CRAN) and GitHub.
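
As a usage note, the package can be installed from CRAN with the standard command below; this is generic R tooling rather than a claim about the package's own API.

install.packages("evalITR")  # standard CRAN installation
library(evalITR)             # load the installed package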