
[Figure 1: Predictions of untargeted adversarial attacks (PGD-20) by CIFAR-10 vanilla-trained and SPAT-trained classifiers. (a) For the vanilla-trained model, over 40% of the dog images are misclassified as cats; (b) this is reduced to 30.6% with the SPAT-trained model.]
are adaptively updated at their own pace. Such self-paced
reweighting offers SPAT more optimization flexibility. In ad-
dition, we further incorporate an HCP-ECP consistency term
in SPAT and show its effectiveness in boosting model adver-
sarial robustness. Our main contributions are:
• We investigate the cause of the unevenly distributed mis-
classification statistics in untargeted attacks. We find that
adversarial perturbations are actually biased by the targeted
sample's hard-class pairs.
• We introduce a SPAT strategy that takes inter-class se-
mantic similarity into account. Adaptively upweighting
hard-class pair loss encourages discriminative feature
learning.
• We propose incorporating an HCP-ECP consistency reg-
ularization term in adversarial training, which boosts
model adversarial robustness by a large margin.
2 Related Work
2.1 Adversarial Attack and Defense
The objective of adversarial attacks is to search for a human-
imperceptible perturbation $\delta$ so that the adversarial sample
$$x' = x + \delta \qquad (1)$$
can fool a model $f(x;\phi)$ well-trained on clean data $x$. Here
$\phi$ represents the trainable parameters of the model. For nota-
tional simplicity, we use $f(x)$ to denote $f(x;\phi)$ in the
rest of the paper. One main branch of adversarial noise gen-
eration is the gradient-based method, such as the Fast Gradi-
ent Sign Method (FGSM) (Goodfellow, Shlens, and Szegedy
2014), and its variants (Kurakin, Goodfellow, and Bengio
2016; Madry et al. 2017). Another popular strategy is opti-
mization based, such as the CW attack (Carlini and Wagner
2017b).
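As a concrete illustration of these gradient-based attacks, the following minimal PyTorch sketch implements an untargeted $\ell_\infty$ PGD attack (FGSM iterated with projection, in the spirit of Madry et al. 2017); the function name and default hyper-parameters (e.g., an 8/255 budget with 20 steps) are illustrative choices, not values prescribed by the cited papers.

```python
import torch
import torch.nn.functional as F

def pgd_attack(model, x, y, eps=8/255, alpha=2/255, steps=20):
    """Untargeted L-infinity PGD: iterate FGSM steps and project back
    into the eps-ball around the clean input x."""
    x_adv = x + torch.empty_like(x).uniform_(-eps, eps)  # random start
    x_adv = x_adv.clamp(0, 1).detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        # gradient ascent on the loss, following the sign of the gradient
        x_adv = x_adv.detach() + alpha * grad.sign()
        # project back into the eps-ball and the valid pixel range
        x_adv = torch.min(torch.max(x_adv, x - eps), x + eps).clamp(0, 1)
    return x_adv.detach()
```

Setting steps=1 and removing the random start recovers the single-step FGSM attack.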
Several pre/post-processing-based methods have shown
outstanding performance in adversarial detection and clas-
sification tasks (Grosse et al. 2017; Metzen et al. 2017; Xie
et al. 2017; Feinman et al. 2017; Li and Li 2017). These
methods use either a secondary neural network or random
augmentation operations, such as cropping, compression, and
blurring, to strengthen model robustness. However, Carlini and
Wagner showed that all of them can be defeated by a tailored
attack (Carlini and Wagner 2017a). Adversarial Training, on
the other hand, uses regularization methods to directly enhance
the robustness of classifiers. Such an optimization scheme is
often referred to as the "min-max game":
$$\arg\min_{\phi}\; \mathbb{E}_{(x,y)\sim\mathcal{D}}\Big[\max_{\delta\in\mathcal{S}} \mathcal{L}\big(f(x'),y\big)\Big], \qquad (2)$$
where the inner maximization aims to generate efficient and
strong adversarial perturbations based on a specific loss function
$\mathcal{L}$, and the outer minimization optimizes the network
parameters $\phi$ for model robustness. Another branch of AT
aims to achieve logit-level robustness, where the objective
function not only requires correct classification of the adver-
sarial samples, but also encourages the logits of clean and
adversarial sample pairs to be similar (Kannan, Kurakin,
and Goodfellow 2018; Zhang et al. 2019; Wang et al. 2019).
Their AT objective functions usually can be formulated as a
compound loss:
$$\mathcal{L}(\theta) = \mathcal{L}_{\mathrm{acc}} + \lambda\,\mathcal{L}_{\mathrm{rob}}, \qquad (3)$$
where $\mathcal{L}_{\mathrm{acc}}$ is usually the cross-entropy (CE) loss on clean
or adversarial data, $\mathcal{L}_{\mathrm{rob}}$ quantifies clean-adversarial logit
pairing, and $\lambda$ is a hyper-parameter to control the relative
weights for these two terms. The SPAT proposed in this paper
introduces self-paced reweighting mechanisms on top of the
compound loss and softly differentiates hard- and easy-class-pair
losses during model optimization to boost model robustness.
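To make the min-max objective of Eq. (2) and the compound loss of Eq. (3) concrete, a minimal training step in the spirit of logit-pairing AT (e.g., TRADES/ALP-style) might look as follows. This is a generic sketch, not the SPAT objective proposed in this paper; the KL-divergence choice for the robustness term, the reuse of the pgd_attack helper from the earlier sketch, and the value of the weight lam are assumptions.

```python
import torch.nn.functional as F

def at_training_step(model, optimizer, x, y, lam=6.0):
    """One optimization step of compound-loss AT, Eq. (3):
    L = L_acc + lambda * L_rob (generic logit-pairing sketch, not SPAT)."""
    # Inner maximization of Eq. (2): craft adversarial samples x' = x + delta.
    model.eval()
    x_adv = pgd_attack(model, x, y)  # helper from the sketch above (assumed)

    # Outer minimization: update the parameters phi on the compound loss.
    model.train()
    optimizer.zero_grad()
    logits_clean = model(x)
    logits_adv = model(x_adv)
    loss_acc = F.cross_entropy(logits_clean, y)             # L_acc on clean data
    loss_rob = F.kl_div(F.log_softmax(logits_adv, dim=1),   # L_rob: clean-adversarial
                        F.softmax(logits_clean, dim=1),     # logit pairing via KL
                        reduction="batchmean")
    loss = loss_acc + lam * loss_rob
    loss.backward()
    optimizer.step()
    return loss.item()
```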
2.2 Re-weighting in Adversarial Training
Re-weighting is a simple yet effective strategy for address-
ing biases in machine learning, for instance, class imbalance.
When class imbalance exists in a dataset, the training pro-
cedure is very likely to over-fit to categories with a larger
number of samples, leading to unsatisfactory performance
on minority groups. With the re-weighting technique,
one can down-weight the loss from majority classes and ob-
tain a more balanced learning solution for minority groups.
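As a simple example of this down-weighting, per-class weights inversely proportional to class frequency can be passed to a standard cross-entropy loss; this is a common recipe sketched here for illustration (the class counts are hypothetical).

```python
import torch
import torch.nn as nn

def balanced_ce(class_counts):
    """Cross-entropy with inverse-frequency class weights, so the loss
    from majority classes is down-weighted relative to minority classes."""
    counts = torch.as_tensor(class_counts, dtype=torch.float)
    weights = counts.sum() / (len(counts) * counts)
    return nn.CrossEntropyLoss(weight=weights)

# e.g., a 3-class training set with 1000 / 100 / 10 samples per class
criterion = balanced_ce([1000, 100, 10])
```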
Re-weighting is also a common technique for hard ex-
ample mining. Generally, hard examples are samples that
have similar representations but belong to different classes.
Hard sample mining is a crucial component in deep metric
learning (Hoffer and Ailon 2015; Hermans, Beyer, and Leibe
2017) and contrastive learning (Chen et al. 2020; Khosla
et al. 2020). With re-weighting, we can directly utilize the
loss information during training and characterize those sam-
ples that contribute large losses as hard examples. For ex-
ample, OHEM (Shrivastava, Gupta, and Girshick 2016) and
Focal Loss (Lin et al. 2017) put more weight on the loss of
misclassified samples to effectively minimize the impact of
easy examples.
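As a concrete instance of such loss-based re-weighting, the focal loss of Lin et al. (2017) scales each sample's cross-entropy by (1 − p_t)^γ, so confidently classified (easy) samples contribute little to the gradient. The sketch below assumes the standard formulation with γ = 2 and omits the optional class-balancing α term.

```python
import torch
import torch.nn.functional as F

def focal_loss(logits, targets, gamma=2.0):
    """Focal loss: down-weight easy examples by (1 - p_t)^gamma, where
    p_t is the predicted probability of the true class."""
    log_p = F.log_softmax(logits, dim=1)
    log_pt = log_p.gather(1, targets.unsqueeze(1)).squeeze(1)  # log p_t per sample
    pt = log_pt.exp()
    return (-((1.0 - pt) ** gamma) * log_pt).mean()
```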
Previous studies show that utilizing hard adversarial sam-
ples promotes stronger adversarial robustness (Madry et al.
2017; Wang et al. 2019; Mao et al. 2019; Pang et al. 2020).
For instance, MART (Wang et al. 2019) explicitly applies a
re-weighting factor to misclassified samples via a soft de-
cision scheme. Recently, several re-weighting-based algo-
rithms have also been proposed to address fairness-related