Propensity score methods are part of the standard toolkit for applied
researchers who wish to ascertain causal effects from observational
data. While they were originally developed for binary treatments,
several researchers have proposed generalizations of the propensity
score methodology for non-binary treatment regimes. Such extensions
have widened the applicability of propensity score methods and are
indeed becoming increasingly popular themselves. In this article,
we closely examine two methods that generalize propensity scores in
this direction, namely, the propensity function (PF), and the
generalized propensity score (GPS), along with two extensions of the
GPS that aim to improve its robustness. We compare the assumptions,
theoretical properties, and empirical performance of these methods.
On a theoretical level, the GPS and its extensions are advantageous
in that they are designed to estimate the full dose response
function rather than the average treatment effect that is estimated
with the PF. We compare the GPS with a new PF method, both of which
estimate the dose response function. We illustrate our findings and
proposals through simulation studies, including one based on an
empirical study about the effect of smoking on healthcare
costs. While our proposed PF-based estimator performs well, we
generally advise caution in that all available methods can be biased
by model misspecification and extrapolation.