|
|
Data-driven decision making plays an
important role even in high-stakes settings such as medicine and public
policy. Learning optimal policies from observed data requires a
careful formulation of the utility function whose expected value is
maximized across a population. Although researchers typically use
utilities that depend on observed outcomes alone, in many settings
the decision maker’s utility function is more properly characterized
by the joint set of potential outcomes under all actions. For
example, the Hippocratic principle to “do no harm” implies that the
cost of causing death to a patient who would otherwise survive
without treatment is greater than the cost of forgoing life-saving
treatment. We study optimal policy learning with asymmetric
counterfactual utility functions of this form, which depend on the
joint set of potential outcomes. We show that asymmetric
counterfactual utilities lead to an unidentifiable expected utility
function, and so we first partially identify it. Drawing on
statistical decision theory, we then derive minimax decision rules
by minimizing the maximum expected utility loss relative to
different alternative policies. We show that one can learn minimax
loss decision rules from observed data by solving intermediate
classification problems, and establish that the finite-sample excess
expected utility loss of this procedure is bounded by the regret of
these intermediate classifiers. We apply this conceptual framework
and methodology to the decision of whether to use right
heart catheterization for patients with possible pulmonary
hypertension. |
Ben-Michael, Eli, D. James Greiner, Kosuke
Imai, and Zhichao Jiang. “Safe
Policy Learning through Extrapolation: Application to Pre-trial
Risk Assessment.” |
Koch, Benedikt and Kosuke Imai. “Statistical Decision Theory with
Counterfactual Loss.” |