1 Introduction
Heterogeneous treatment effects are often estimated with a decision problem in mind: should a particular
individual be treated? This question has fostered much research in econometrics, statistics, and machine
learning. However, relatively less attention has been given to another important margin of the decision:
should the treatment itself be adjusted? Whether the treatment is a medical treatment, subsidy, job training,
or audit probability, decision makers can usually entertain changing the treatment value that was observed in
the data. Even experiments with multivalued treatments may not implement an exhaustive list of treatment
values. This is especially true in the social sciences, where testing multiple interventions can be costly, and
in the medical sciences, where often only specific treatment doses are tested in clinical trials. In this paper I
propose a method for allocating treatment to a population when the treatment values themselves can be
adjusted to values never before seen in the data. I show how data on existing treatments can be combined
with economically motivated shape restrictions to design policies that outperform those available
when only previously implemented treatments are considered.
I first formulate a decision problem in which the decision maker observes experimental data on some
treatment values and seeks to construct a mapping, or policy, from the space of covariates to the space
of treatments in order to maximize some objective function. I assume all experimentation is done before
the policy is constructed. This setting, which is common in econometrics, is often referred to as treatment
choice or offline policy learning (examples include Athey and Wager (2021), Bhattacharya and Dupas (2012),
Kitagawa and Tetenov (2018) and the references in their literature review, Liu (2022),
Mbakop and Tabord-Meehan (2021), Qian and Murphy (2011), Sasaki and Ura (2020), Zhang et al. (2012), and
Zhao et al. (2012)). A distinctive feature of this paper, relative to most policy learning problems, is
that the set of treatments that the decision maker can consider may be a strict superset of the support of
the treatment random variable observed in the data. This extends policy learning to practically relevant
situations in which constraints in the design and implementation of experiments or simply differences in the
objectives of the experimenter versus decision maker result in only a few treatment values being piloted in
the experiment, while the decision maker may want to consider many more.
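One way to make this setup concrete is the following sketch. The notation is illustrative and not taken from the paper: X denotes covariates with values in a space written here as \mathcal{X}, D the treatment observed in the experiment, Y(t) the potential outcome under treatment value t, \mathcal{T} the set of treatments the decision maker may consider, and \Pi a class of candidate policies. The mean-outcome objective W is likewise an assumption, standing in for the unspecified objective function.

```latex
\begin{align*}
  \pi &: \mathcal{X} \to \mathcal{T}, \qquad \operatorname{supp}(D) \subseteq \mathcal{T}, \\
  W(\pi) &= \mathbb{E}\bigl[\, Y(\pi(X)) \,\bigr], \\
  \pi^{*} &\in \operatorname*{arg\,max}_{\pi \in \Pi} W(\pi).
\end{align*}
```

In this notation, the usual policy learning setting corresponds to \mathcal{T} = \operatorname{supp}(D). The case of interest here is the strict inclusion, in which the experimental data alone do not reveal W(\pi) for policies that assign treatment values outside \operatorname{supp}(D).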
Despite the lack of data on the impacts of these never-before-implemented treatments, I show how to
bound the response to new treatments using simple, economically interpretable restrictions on the shape of
treatment response. For example, a financial incentive may be assumed to have a positive effect, exhibit
diminishing returns, or satisfy smoothness conditions. Such shape restrictions are often exploited to partially
identify treatment effects (e.g., Manski 2009; Mogstad, Santos, and Torgovitsky 2018). The empirical analysis
of the present paper demonstrates that such bounds can be adequately informative for choosing whether and