variances. In order to provide valid statistical inferences when using these trimming methods, we
also develop new methods for inference after sample trimming that offer greater flexibility and are
valid in a wider range of settings than previous such methods.
Motivation and interpretation. To motivate this approach, consider a single unit (X, Y, Z)
drawn from a super-population distribution, where X is a covariate vector, Y is a response, Z is
a treatment indicator, and e(X) is the probability of treatment. An inverse-propensity weighted
estimator for E[Y] is YZ/e(X), which is unbiased but is well known to suffer from extremely high
variance when e(X) takes small values (Basu, 1971; Khan and Ugander, 2021). As such, the goal of
existing sample trimming methods is to preclude this possibility by removing units for which e(X)
takes small values, executing a change of estimand to make what is essentially a bias–variance
trade-off. On the other hand, the inverse-propensity weighted estimate YZ/e(X) will also have
high variance if var(Y|X) is large, an issue that is not addressed by existing methods but may
be a major obstacle when var(Y|X) is extremely large for some values of X. Put simply: if we
do not believe that we can accurately estimate treatment effects on units with propensities of, say,
0.01, then we must also acknowledge that we cannot accurately estimate treatment effects on units
with conditional variances of, say, 100, and so we propose to trim these latter units as well.
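To make this comparison concrete, the following minimal sketch computes the conditional variance of the single-unit term YZ/e(X) given X. It relies on a simplifying assumption made here only for illustration: Z is Bernoulli with probability e(X) and is independent of Y given X. Under that assumption, and because Z^2 = Z, the variance equals var(Y|X)/e(X) + E[Y|X]^2 (1 - e(X))/e(X). The function name and the example values are hypothetical and are not taken from the paper.

```python
def ipw_term_variance(propensity, cond_mean, cond_var):
    """Variance of Y*Z/e for one unit, conditional on its covariates X.

    Illustrative assumption: Z ~ Bernoulli(e) is independent of Y given X,
    with E[Y | X] = cond_mean and var(Y | X) = cond_var.
    """
    if not 0.0 < propensity <= 1.0:
        raise ValueError("propensity must lie in (0, 1]")
    if cond_var < 0.0:
        raise ValueError("cond_var must be non-negative")
    # E[(YZ/e)^2 | X] = (cond_var + cond_mean^2) / e, because Z^2 = Z.
    second_moment = (cond_var + cond_mean ** 2) / propensity
    # E[YZ/e | X] = cond_mean, so subtract its square to get the variance.
    return second_moment - cond_mean ** 2


# Hypothetical units, each with conditional mean zero for simplicity.
baseline = ipw_term_variance(propensity=0.5, cond_mean=0.0, cond_var=1.0)
small_propensity = ipw_term_variance(propensity=0.01, cond_mean=0.0, cond_var=1.0)
large_variance = ipw_term_variance(propensity=0.5, cond_mean=0.0, cond_var=100.0)

print(baseline, small_propensity, large_variance)
```

In this toy setting, a unit with propensity 0.01 and unit conditional variance, and a unit with propensity 0.5 and conditional variance 100, both contribute a variance that is far larger than the baseline unit's. This is the sense in which the two kinds of units pose comparable difficulties.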
An important difference between this proposal and existing propensity-based trimming methods
is that the sub-population found by propensity-based methods can be interpreted as a population
that is likely to receive either treatment or control (sometimes called an equipoise population), and
thus may be a natural population of interest. This interpretation does not extend to
variance-based trimming methods. Instead, variance-based methods can be interpreted as identifying a
small population of outliers in the data, whose behavior and response to treatment are very different
from that of other units, and trimming these units to focus on an “inlier” population on which
treatment effects can be estimated more accurately. In many cases, this inlier population is also of
natural interest, since treatment effects on the full population may be dominated by the outliers,
and the treatment effect on the inlier population may be more representative of how treatment
will affect the majority of units. We demonstrate this phenomenon, along with further interpretive
issues, as part of a data example in Section 6.3.
In general, whether a particular sub-population is of interest to an analyst
depends on both domain considerations and the level of precision with which treatment effects
for that sub-population can be estimated. The problem of selecting a sub-population of interest from
a set of candidates is fundamental to the trimming literature and is not unique to our work: even
when trimming on the propensity score alone, the choice of propensity cut-off induces a similar
family of sub-populations, and choosing among those sub-populations requires a similar balancing
of variance and relevance. Our methods can be understood as navigating this trade-off between
variance and relevance more effectively than existing methods, thus offering practitioners a better
set of sub-populations to choose from.
Inference after trimming. After applying any sample trimming procedure, another challenge
immediately arises: how to perform valid inference on the trimmed sub-population. Thus our second
contribution in the present work is to provide new theoretical results on inference after sample
trimming. Existing work typically makes strong rate assumptions or parametric assumptions on
the estimation of nuisance components (Crump et al., 2009; Yang and Ding, 2018), and we extend
this work by using doubly-robust estimators to show how valid inference can be performed under
weaker conditions on the estimation of nuisance components. The application of doubly-robust
estimators to this setting requires a subtle choice of estimand as well as careful handling of cross-