Semantic Corruptions. In this paper, we explore the use of a different type of knowledge: corruptions of semantic
features. Intuitively, imagine trying to predict the label from a corrupted input T(x) from which all semantic
information has been removed. Any better-than-chance prediction provides a window into the nuisances, as
it must rely on them. We then use the resulting biased models to guide methods that we identify here as
biased-model-based spurious-correlation avoiding methods (B-SCAMs).
B-SCAMs. There is a class of methods in the literature that use the predictions of a biased model to adjust for nuisances
and learn predictors that are free of spurious correlations. Among others, these include Just Train Twice (JTT)
[Liu et al., 2021], EIIL [Creager et al., 2021], Nuisance-Randomized Distillation (NURD) [Puli et al., 2022], and
debiased focal loss (DFL) and product of experts (POE) [Mahabadi et al., 2019]. The key question arising from these
works is: how can we obtain biased models? In empirical studies, prior works on B-SCAMs either use annotations of
the nuisance or an ERM-trained model on the training data as a placeholder for the biased model. The latter
approach succeeds only if the ERM-trained model completely ignores semantic information.
In practice, these heuristics are rather fragile. Annotations for nuisances are seldom available, and we lack a
principled method to ascertain whether a model trained with ERM relies only on semantic features. Therefore,
employing semantic corruptions could serve as a valuable alternative to these heuristics. We claim that semantic
corruptions offer a principled and useful approach to obtaining biased models.
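To make this intuition concrete, the sketch below fits a predictor on corrupted inputs T(x) and treats it as the biased model. The toy data, the corruption, and the helper names (`fit_logreg`, `biased_model`) are our own hypothetical illustrations, not the paper's implementation.

```python
import numpy as np

def fit_logreg(X, y, lr=0.1, steps=500):
    """Plain logistic regression trained by gradient descent."""
    w, b = np.zeros(X.shape[1]), 0.0
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))
        g = p - y                      # gradient of the log loss w.r.t. logits
        w -= lr * (X.T @ g) / len(y)
        b -= lr * g.mean()
    return w, b

def biased_model(T, X, y):
    """Fit a predictor on corrupted inputs T(x). If it beats chance,
    it must rely on nuisances, since T removed the semantics."""
    w, b = fit_logreg(T(X), y)
    return lambda X_new: 1.0 / (1.0 + np.exp(-(T(X_new) @ w + b)))

# Toy data: column 0 is a semantic feature (driven by y); column 1 is a
# nuisance correlated with y. The corruption T zeroes the semantic column.
rng = np.random.default_rng(0)
n = 2000
y = rng.integers(0, 2, n).astype(float)
z = np.where(rng.random(n) < 0.9, y, 1 - y)        # nuisance agrees with y 90% of the time
X = np.stack([y + 0.3 * rng.standard_normal(n),
              z + 0.3 * rng.standard_normal(n)], axis=1)
T = lambda A: A * np.array([0.0, 1.0])             # corruption: remove the semantic column
f = biased_model(T, X, y)
acc = np.mean((f(X) > 0.5) == (y == 1))            # well above chance: the model found the nuisance
```

The better-than-chance accuracy of `f` is exactly the "window into the nuisances" described above: with the semantic column removed, the nuisance column is the only signal left to exploit.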
Semantic corruptions T(x) must strike a delicate balance between removing semantic information and preserving
nuisances. For example, if T(x) replaces all pixels in an image with random noise, it corrupts semantics while
simultaneously erasing all information about the nuisances. An ideal T(x) would isolate nuisances by targeting
only the semantic information in the input, e.g., by in-painting the animal for the task of classifying cows and
penguins. Implementing such ideal corruptions is unrealistic, as they are task-specific and may require human
annotations of the semantic features, e.g., segmentations of the objects in every image. Producing such annotations
for every classification problem is extremely laborious. In tasks like natural language inference (NLI), it is unclear
even how to annotate semantics, as they do not correspond to simple features like subsets of words. In summary,
after outlining the desired characteristics of semantic corruptions, we define corruptions that are beneficial across
multiple tasks and require no human annotation. Our contributions are as follows:
1. We show that acquiring additional knowledge beyond a labeled dataset is necessary for learning robust
models (theorem 1). Then, in proposition 1, we formalize sufficient conditions under which additional
knowledge in the form of a semantic corruption enables B-SCAMs to learn robust models.
2. We develop multiple semantic corruptions for object recognition and natural language inference, including
patch randomization, n-gram randomization, frequency filtering, and intensity filtering. We then situate
existing procedures, such as region-of-interest masking and premise masking, under the umbrella of semantic
corruptions.
3. We demonstrate empirically that any semantic corruption can power any B-SCAM. The corruption-powered
versions of these methods outperform ERM on out-of-distribution (OOD) generalization tasks like Waterbirds,
cardiomegaly detection from chest X-rays, and NLI. Corruption-powered NURD, DFL, and POE achieve
performance similar to the same methods run with extra observed nuisance variables, and corruption-powered
JTT outperforms vanilla JTT.
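As an illustration of the corruptions in contribution 2, here is a minimal sketch of patch randomization for images and n-gram randomization for text. The function names, the default patch size, and the default n are our own illustrative choices, not the paper's exact settings.

```python
import numpy as np

def patch_randomize(image, patch_size=7, rng=None):
    """Shuffle non-overlapping square patches of an (H, W, ...) image.
    Destroys global object shape (semantics) while preserving local
    statistics such as texture and color, which can carry nuisances."""
    rng = rng or np.random.default_rng()
    # Crop so the image tiles exactly into patches.
    h = image.shape[0] - image.shape[0] % patch_size
    w = image.shape[1] - image.shape[1] % patch_size
    img = image[:h, :w]
    patches = [img[i:i + patch_size, j:j + patch_size]
               for i in range(0, h, patch_size)
               for j in range(0, w, patch_size)]
    rng.shuffle(patches)
    cols = w // patch_size
    rows = [np.concatenate(patches[r * cols:(r + 1) * cols], axis=1)
            for r in range(h // patch_size)]
    return np.concatenate(rows, axis=0)

def ngram_randomize(text, n=2, rng=None):
    """Shuffle the order of the non-overlapping n-grams of a sentence.
    Destroys sentence-level meaning while keeping word and short-phrase
    statistics that a biased model can exploit."""
    rng = rng or np.random.default_rng()
    words = text.split()
    grams = [words[i:i + n] for i in range(0, len(words), n)]
    rng.shuffle(grams)
    return " ".join(word for gram in grams for word in gram)
```

Both corruptions permute content rather than deleting it: the corrupted input contains exactly the same pixels or words as the original, only their global arrangement, and hence the semantics, is destroyed.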
2 Biased-model-based spurious-correlation avoiding methods
A spurious correlation is a relationship between the covariates x and the label y that changes across settings
like time and location [Geirhos et al., 2020]. The features whose relationship with the label changes are called
nuisances. With a vector of nuisances z, let p_tr(y, z, x) and p_te(y, z, x) be the training and test distributions.
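To make this setup concrete, the toy simulation below (our own illustration, not an example from the paper) constructs training and test distributions that differ only in the nuisance-label relationship z | y:

```python
import numpy as np

def sample(n, rho, rng):
    """Sample (y, z, x): y ~ Bernoulli(0.5); z agrees with y with probability
    rho, so p(z | y) is the only piece that depends on rho; x stacks a
    semantic feature driven by y and a nuisance feature driven by z."""
    y = rng.integers(0, 2, size=n)
    z = np.where(rng.random(n) < rho, y, 1 - y)
    x = np.stack([y + 0.5 * rng.standard_normal(n),   # semantic feature
                  z + 0.5 * rng.standard_normal(n)],  # nuisance feature
                 axis=1)
    return y, z, x

rng = np.random.default_rng(0)
y_tr, z_tr, x_tr = sample(10_000, rho=0.9, rng=rng)  # p_tr: z and y agree 90% of the time
y_te, z_te, x_te = sample(10_000, rho=0.1, rng=rng)  # p_te: the relationship is reversed
# A predictor leaning on x[:, 1] (the nuisance) looks good under p_tr but
# fails under p_te; one that uses x[:, 0] (the semantics) transfers.
```

Because the marginal p(y), the semantic feature, and the noise scales are shared, any gap between training and test performance in this simulation is attributable to the shifted conditional z | y alone.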
Achieving robustness to spurious correlations requires additional knowledge. In the presence of spurious
correlations, the training distribution p_tr may not equal the test distribution p_te. Without further assumptions,
no algorithm that sees data only from p_tr(y, x) can produce a predictor that works well on p_te. To achieve
generalization when p_te ≠ p_tr, work in the OOD generalization literature assumes a relationship between the
training and test distributions. We follow Makar et al. [2022] and Puli et al. [2022] and assume that only the
nuisance-label relationship, i.e., the conditional z | y, changes between training and test. Formally, we let