Efficient and Effective Augmentation
Strategy for Adversarial Training
Sravanti Addepalli†∗   Samyak Jain†⋄∗   R. Venkatesh Babu†
†Video Analytics Lab, Indian Institute of Science, Bangalore
⋄Indian Institute of Technology (BHU) Varanasi
Abstract
Adversarial training of Deep Neural Networks is known to be significantly more
data-hungry when compared to standard training. Furthermore, complex data aug-
mentations such as AutoAugment, which have led to substantial gains in standard
training of image classifiers, have not been successful with Adversarial Training.
We first explain this contrasting behavior by viewing augmentation during training
as a problem of domain generalization, and further propose Diverse Augmentation-
based Joint Adversarial Training (DAJAT) to use data augmentations effectively
in adversarial training. We aim to handle the conflicting goals of enhancing the
diversity of the training dataset and training with data that is close to the test distri-
bution by using a combination of simple and complex augmentations with separate
batch normalization layers during training. We further utilize the popular Jensen-
Shannon divergence loss to encourage the joint learning of the diverse augmen-
tations, thereby allowing simple augmentations to guide the learning of complex
ones. Lastly, to improve the computational efficiency of the proposed method, we
propose and utilize a two-step defense, Ascending Constraint Adversarial Training
(ACAT), that uses an increasing epsilon schedule and weight-space smoothing to
prevent gradient masking. The proposed method DAJAT achieves substantially
better robustness-accuracy trade-off when compared to existing methods on the
RobustBench Leaderboard on ResNet-18 and WideResNet-34-10. The code for im-
plementing DAJAT is available here: https://github.com/val-iisc/DAJAT.
1 Introduction
Deep Neural Network (DNN) based image classifiers are vulnerable to crafted imperceptible perturbations known as Adversarial Attacks [45] that can flip the predictions of the model to unrelated classes, leading to disastrous implications. Adversarial Training [17, 31, 57] has been the most successful defense strategy, where a model is explicitly trained to be robust in the presence of such attacks. While early defenses focused on designing suitable loss functions for training, subsequent works [33, 38] showed that with careful hyperparameter tuning, even the two most popular methods, PGD-AT [31] and TRADES [57], yield comparable performance, highlighting the saturation in performance with respect to changes in the training loss. Schmidt et al. [39] observed that adversarial training has a large sample complexity and further gains require the use of additional training data. Subsequent works [7, 19] indeed used additional data whose distribution is close to that of the original dataset in order to obtain performance gains. The availability of large amounts of relevant data is impractical to assume, leading to an exploration of augmentations based on Generative Adversarial Networks [16] and Diffusion based models [23, 19]. However, the use of such generative models incurs an
∗Equal Contribution. Correspondence to Sravanti Addepalli <sravantia@iisc.ac.in>, Samyak Jain <samyakjain.cse18@itbhu.ac.in>. ‡Work done during internship at Video Analytics Lab, Indian Institute of Science.
36th Conference on Neural Information Processing Systems (NeurIPS 2022).
arXiv:2210.15318v1 [cs.LG] 27 Oct 2022
[Figure 1: A schematic representation of the proposed approach DAJAT. The original image is transformed with simple (Pad + Crop) and complex (AutoAugment) augmentations; each branch is subjected to a 2-step adversarial attack and trained with the TRADES loss, using shared network weights but separate batch-norm layers, with a JS divergence term coupling the branches.]
additional training cost and suffers from limited diversity in low-data regimes and in datasets with
high-resolution images.
A simple and efficient solution to improve the diversity of training data in standard Empirical Risk Minimization (ERM) based training has been the use of random transformations such as rotation, color jitter, and variations in contrast, sharpness and brightness [28, 10, 11], which can change images significantly in input space while belonging to the same class as the original image. However, prior works have surprisingly found that such augmentations, which cause large changes in the input distribution, do not help adversarial training [38, 18, 44]. This limits the augmentations in adversarial training to simple ones - zero padding followed by random crop, and horizontal flip [38, 33, 18] - which may not be able to meet the large data requirement of Adversarial Training.
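For reference, the simple augmentation pipeline above can be sketched as follows (a minimal numpy sketch for HWC images; the 4-pixel zero padding follows common CIFAR practice and is an assumption, not a value stated here):

```python
import numpy as np

def pad_crop_hflip(image, pad=4, rng=None):
    """Base augmentation: zero-pad, take a random crop of the
    original size, then horizontally flip with probability 0.5."""
    rng = rng or np.random.default_rng()
    h, w = image.shape[:2]
    padded = np.pad(image, ((pad, pad), (pad, pad), (0, 0)))
    top, left = rng.integers(0, 2 * pad + 1, size=2)
    crop = padded[top:top + h, left:left + w]
    if rng.random() < 0.5:
        crop = crop[:, ::-1]  # horizontal flip
    return crop
```

Unlike color or contrast transforms, this pipeline leaves the low-level pixel statistics of the image largely unchanged.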
In this work, we first analyze the reasons for this contrasting trend between Standard and Adversarial Training, and further show that it is indeed possible to utilize complex augmentations effectively in Adversarial training as well, by jointly training on simple and complex data augmentations using separate batch-normalization layers for each type, as shown in Fig. 1. While complex augmentations increase the data diversity, resulting in better generalization, simple augmentations ensure that the model specializes on the training data distribution as well. We further minimize the Jensen-Shannon divergence between the softmax outputs of the various augmentations to enable the simple augmentations to guide the learning of complex ones. In order to improve the computational efficiency of the proposed method, we use two attack steps (instead of 10) during training. By progressively increasing the magnitude of perturbations and performing smoothing in weight space, we show that it is indeed possible to improve the stability of training. Our contributions are listed below:
- We analyze the reasons for the failure of strong data augmentations in adversarial training by viewing augmentation during training as a domain generalization problem, and further propose Diverse Augmentation based Joint Adversarial Training (DAJAT) to utilize data augmentations effectively in Adversarial training. The proposed approach can be integrated with many augmentations and adversarial training methods to obtain performance gains.
- We propose and integrate DAJAT with an efficient 2-step defense strategy, Ascending Constraint Adversarial Training (ACAT), that uses a linearly increasing ε schedule, a cosine learning rate and weight-space smoothing to prevent gradient masking and improve convergence.
- We obtain improved robustness and large gains in standard accuracy on multiple datasets (CIFAR-10, CIFAR-100, ImageNette) and model architectures (RN-18, WRN-34-10). We obtain remarkable gains in low data scenarios (CIFAR-100, ImageNette), where data augmentations are most effective. On CIFAR-100, we outperform all existing methods on the RobustBench leaderboard [9] with the same model architecture.
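The kind of ascending ε schedule with a cosine learning rate described for ACAT can be sketched as follows (a minimal sketch; the warmup fraction and peak values are illustrative assumptions, not the paper's settings):

```python
import math

def acat_schedule(epoch, total_epochs, eps_max=8 / 255, lr_max=0.1,
                  warmup_frac=0.5):
    """Linearly ramp the attack budget eps from 0 to eps_max over the
    first warmup_frac of training, with a cosine-annealed learning rate.
    warmup_frac and lr_max are illustrative, not the paper's values."""
    progress = epoch / max(1, total_epochs - 1)
    eps = eps_max * min(1.0, progress / warmup_frac)
    lr = 0.5 * lr_max * (1 + math.cos(math.pi * progress))
    return eps, lr
```

Starting from a small ε keeps the early attacks weak, which helps the 2-step defense avoid the gradient masking associated with large perturbations at the start of training.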
2 Related Works
We discuss various existing strategies for improving the Adversarial robustness of Deep Networks.
Adversarial Training (AT): Goodfellow et al. [17] proposed FGSM-AT, where single-step adversarial samples were used for training. However, these models were susceptible to gradient masking [34], where the local loss landscape becomes convoluted, leading to the generation of weaker single-step attacks during training. This leads to a false sense of security against single-step attacks, while the models are still susceptible to stronger multi-step attacks such as PGD [31]. PGD-AT [31, 38] used multi-step attacks in a similar adversarial training formulation to obtain robust models that stood the test of time against several attacks [3, 8, 42]. TRADES [57] explicitly optimizes the trade-off between the accuracy on natural and adversarial examples by minimizing the cross-entropy loss on natural images along with the Kullback-Leibler (KL) divergence between the predictions on adversarial and clean images. In the proposed defense, we use the base loss from TRADES-AT [57].
Several works have explored the use of auxiliary techniques in AT, such as weight-space smoothing [26, 50], architectural changes [18, 53] and increasing the diversity of training data by using additional natural and synthetic data [39, 7, 18, 37]. Increasing the diversity of training data achieves significant performance gains, since the sample complexity of adversarial training is known to be high [39].
Augmentations in Adversarial Training: While data augmentations such as contrast, sharpness and brightness adjustments are known to improve performance in the standard training regime, they have not led to substantial gains in adversarial training. AutoAugment [10] uses Proximal Policy Optimization to find the set of policies that yields optimal performance on a given dataset. Contrary to prior works [38, 18], we show that the policies optimized for standard training indeed yield a boost in the performance of adversarial training as well, when used in the proposed training framework. In a recent work, Rebuffi et al. [37] show that it is possible to obtain substantial gains in robust accuracy by using spatial composition based augmentations such as CutMix [54] and CutOut [13] that preserve low-level features of the image. CutMix replaces part of an image with another and also combines the output softmax vectors in the same ratio, while CutOut blanks out a random area of an image. The authors hypothesize that the augmentations used in Adversarial training need to preserve low-level features, which severely limits the possibilities for mitigating the large sample complexity of adversarial training. We show that by using the proposed approach DAJAT, it is indeed possible to use augmentations such as color jitter, contrast, sharpness and brightness adjustments that significantly change the low-level statistics of images (Ref: Appendix-B).
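To make the contrast with such spatial augmentations concrete, CutOut can be sketched in a few lines (a minimal numpy sketch; the patch size is an illustrative assumption):

```python
import numpy as np

def cutout(image, size=8, rng=None):
    """Blank out a random size x size square of an HWC image with
    zeros, as in CutOut; the square is clipped at the image border."""
    rng = rng or np.random.default_rng()
    h, w = image.shape[:2]
    cy, cx = rng.integers(0, h), rng.integers(0, w)
    y0, y1 = max(0, cy - size // 2), min(h, cy + size // 2)
    x0, x1 = max(0, cx - size // 2), min(w, cx + size // 2)
    out = image.copy()
    out[y0:y1, x0:x1] = 0.0
    return out
```

Outside the blanked square, every pixel is untouched, which is why such augmentations preserve the low-level statistics that color and contrast transforms alter.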
3 Preliminaries: Notation and Threat Model
We consider the Adversarial Robustness of DNN based image classifiers. An input image is denoted as x ∈ X and the corresponding ground truth label as y ∈ Y. We denote a simple transformation of x obtained using Pad, Crop and Horizontal flip (Pad+Crop+HFlip, referred to as Base augmentations) by x_base, and other transformations of x by the respective subscript. For example, x_auto refers to the image x being transformed by AutoAugment (AA) [10] followed by the base augmentations. The function mapping of the classifier C from the input space X to the softmax vectors is denoted by f_θ(·), where θ denotes the network parameters. Adversarial examples corresponding to the images x, x_base and x_auto are denoted by x̃, x̃_base and x̃_auto respectively. We consider the ℓ∞ norm based threat model, where x̃ is a valid perturbation within ε if it belongs to the set A_ε(x) = {x̃ : ||x̃ − x||_∞ ≤ ε}.
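The constraint set A_ε(x) corresponds to a simple projection step (a minimal numpy sketch; the [0, 1] pixel range is an assumption):

```python
import numpy as np

def project_linf(x_adv, x, eps):
    """Project a candidate adversarial example back into
    A_eps(x) = {x_adv : ||x_adv - x||_inf <= eps}, then into the
    valid pixel range [0, 1]."""
    x_adv = np.clip(x_adv, x - eps, x + eps)  # l_inf ball around x
    return np.clip(x_adv, 0.0, 1.0)          # valid image range
```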
4 Motivation: Role of Augmentations in Neural Network Training
In this section, we first explore the contrasting factors that influence the training of neural networks when data is augmented (Sec. 4.1), and further delve into the specifics of adversarial training which make it challenging to obtain gains using complex data augmentations (Sec. 4.2).
4.1 Impact of Augmentations in Neural Network Training
Conjecture-1: We hypothesize that the role of data augmentations in the training of Neural Networks is influenced by the following contrasting factors:
(i) Reduced overfitting due to an increase in diversity of the augmented dataset, leading to better generalization of the network to the test set.
(ii) Larger domain shift between the augmented data distribution and the test data distribution, leading to a drop in performance on the test set.
(iii) Capacity of the Neural Network in being able to generalize well to the augmented data distribution and the unaugmented data distribution for the given task.
Justification: The training of Neural Networks using augmented data can be considered as a problem of domain generalization, where the network is trained on a source domain (augmented data) and is
Table 1: Impact of augmentations: Performance (%) of ACAT models trained with Base augmentations or AutoAugment (Auto), evaluated on the unaugmented test set (No Aug) and on AutoAugmented test images. Clean and robust accuracy against the GAMA attack [42] are reported. The use of AutoAugment during training results in a 1.52% drop in robust accuracy.

                            Test: No Aug        Test: AutoAugment
Model              Train    Clean    Robust     Clean    Robust
ResNet-18          Base     82.41    50.00      63.79    37.07
ResNet-18          Auto     82.54    48.11      76.40    43.22
WideResNet-34-10   Base     86.71    55.58      68.24    40.83
WideResNet-34-10   Auto     86.80    53.99      82.64    48.98
Figure 2: Comparison of BN layer statistics for a WRN-34-10 model trained on CIFAR-10 using DAJAT. BN layers of the Base augmentations (Pad+Crop, H-Flip) are compared with those of AutoAugment. Initial layer (L3) parameters are diverse, while those of deeper layers (L25) are similar.
expected to generalize to a target domain (test data). We use the theoretical formulation by Ben-David et al. [4], shown below, to justify the respective claims in Conjecture-1:

    ϵ_t(f) ≤ ϵ_s(f) + (1/2)·d_F∆F(s, t) + λ    (1)
(i) The use of a more diverse or larger source dataset reduces overfitting, improving the performance of the network on the source distribution. From Eq. 1, the expected error on the target distribution ϵ_t (the test set in this case) is upper bounded by the expected error on the source distribution ϵ_s (the augmented dataset), along with other terms. Therefore, improved performance on the augmented distribution can improve the performance on the test set as well.
(ii) The expected error on the target distribution ϵ_t is also upper bounded by the distribution shift between the source and target distributions, (1/2)·d_F∆F(s, t), along with other terms. Therefore, a larger domain shift between the augmented and test data distributions can indeed limit the performance gains on the test set.
(iii) The constant λ in Eq. 1 measures the risk of the optimal joint classifier: λ = min_{f∈F} [ϵ_s(f) + ϵ_t(f)]. Neural Networks with a higher capacity can effectively minimize the expected risk on the source set ϵ_s and the risk of the optimal joint classifier. Therefore, the capacity of the Neural Network and the complexity of the task influence the gains that can be obtained using augmentations.
4.2 Analysing the role of Augmentations in Adversarial Training
We analyse the trade-off between the factors described in Conjecture-1 for adversarial training when compared to standard ERM training. In addition to the goal of improving accuracy on clean samples, adversarial training aims to achieve local smoothness of the loss landscape as well. Hence, the complexity of Adversarial Training is higher than that of standard ERM training, making it important to use larger model capacities to obtain gains using data augmentations (based on Conjecture-1 (iii)). This justifies the gains obtained by Rebuffi et al. [37] on the WRN-70-16 architecture by using CutMix based augmentations (2.9% higher robust accuracy and 1.23% higher clean accuracy). The same method does not obtain significant gains on smaller architectures such as ResNet-18, where a 1.76% boost in robust accuracy is accompanied by a 2.55% drop in clean accuracy.
Secondly, while the distribution shift between augmented data and test data ((1/2)·d_F∆F(s, t)) may be sufficiently low for natural images, leading to improved generalization to the test set (Conjecture-1 (i, ii)), the same may not be true for adversarial images. There is a large difference between the augmented data and test data in pixel space, although they may be similar in feature space. Since adversarial attacks perturb images in pixel space, the distribution shift between the corresponding perturbations widens further, as shown in Figs. 7, 8 and 9. Based on Conjecture-1 (ii), unless this difference is accounted for, complex augmentations cannot improve the performance of adversarial training (Ref: Appendix-B). This trend has also been observed empirically by Rebuffi et al. [37], based on which they conclude that the augmentations designed for robustness need to preserve low-level features.
We present the performance of Adversarial Training using either Base augmentations (Pad+Crop, Flip) or AutoAugment [10] during training and inference on the CIFAR-10 dataset, using ResNet-18 and WideResNet-34-10 architectures, in Table-1. Firstly, we note that by using AutoAugment during training alone, robust accuracy on the test set drops by 1.52%, as observed in prior work [18]. Secondly, the clean and robust accuracy drop by around 6.5% when augmented images are used for both training and testing, highlighting the complexity of the learning task. We present additional results by training without any augmentations, and by training using a combination of both augmentations in every minibatch, in Table-15. We note that the use of Base Augmentations alone (Pad+Crop+HFlip) still gives the best overall performance on the unaugmented test set.
5 Proposed Method
5.1 Background
We briefly discuss the TRADES-AWP defense [57, 50], which is the base algorithm used in DAJAT.
    L_AWP = max_{θ̂∈M(θ)} (1/N) Σ_{i=1}^{N} [ L_CE(f_{θ+θ̂}(x_i), y_i) + β · max_{x̃_i∈A_ε(x_i)} KL(f_{θ+θ̂}(x_i) || f_{θ+θ̂}(x̃_i)) ]    (2)
Firstly, an adversarial attack is generated by maximizing the KL divergence between the softmax predictions of the clean and adversarial examples iteratively for 10 attack steps. An Adversarial Weight Perturbation (AWP) step additionally perturbs the weights of the model to maximize the overall loss in weight-space. The weight perturbations are constrained to the feasible region M(θ) such that for a given layer l, ||θ̂_l|| ≤ γ · ||θ_l||. The overall training loss is a combination of the Cross-entropy loss on clean samples and the KL divergence term. The latter is weighted by a factor β that controls the robustness-accuracy trade-off. Training the model using Adversarial Weight Perturbations leads to smoothing of the loss surface in weight space, resulting in better generalization [50, 44].
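The layer-wise constraint ||θ̂_l|| ≤ γ · ||θ_l|| amounts to rescaling a candidate weight perturbation per layer (a minimal numpy sketch of only the projection step, not the full AWP inner maximization; γ here is illustrative):

```python
import numpy as np

def constrain_perturbation(delta_layers, theta_layers, gamma=0.005):
    """Rescale each layer's weight perturbation so that
    ||delta_l|| <= gamma * ||theta_l||, as in the AWP feasible
    region M(theta)."""
    constrained = []
    for delta, theta in zip(delta_layers, theta_layers):
        limit = gamma * np.linalg.norm(theta)
        norm = np.linalg.norm(delta)
        if norm > limit:
            delta = delta * (limit / norm)  # project onto the norm ball
        constrained.append(delta)
    return constrained
```

Tying the perturbation budget to each layer's own weight norm keeps the relative distortion uniform across layers of very different scales.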
5.2 Diverse Augmentation based Joint Adversarial Training (DAJAT)
As discussed in Section-4, the use of augmentations in training can be viewed as a problem of
domain generalization, where performance on the source distribution or augmented dataset is crucial
towards improving the performance on the target distribution or test set. Since adversarial training is
inherently challenging, for limited model capacity, it is difficult to obtain good performance on the
training data that is transformed using complex augmentations. Moreover, the large distribution shift
between augmented data and test data, specifically with respect to low-level statistics, results in poor
generalization of robust accuracy to the test set.
To mitigate these challenges, we propose the combined use of simple and complex augmentations during training, so that the model can benefit from the diversity introduced by complex augmentations, while also specializing on the original data distribution, which is similar to the simple augmentations. We propose to use separate batch normalization layers for simple and complex augmentations, so as to offset the shift in distribution between the two kinds of augmentations. In our main approach, we use Pad and Crop followed by Horizontal Flip (Pad+Crop+HFlip) as the simple augmentations, and AutoAugment followed by Pad+Crop+HFlip as the complex augmentations. We justify the choice of this augmentation pipeline in Appendix-B.3.
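The separate batch-norm idea can be sketched as a single normalization module holding two sets of statistics, selected by augmentation type (a minimal numpy sketch; the momentum value is an assumption, and the affine scale/shift parameters of real BN are omitted):

```python
import numpy as np

class DualBatchNorm:
    """Keeps independent running statistics for 'base' and 'auto'
    augmented batches; all weight parameters of the surrounding
    network remain shared across the two branches."""

    def __init__(self, num_features, momentum=0.1, eps=1e-5):
        self.eps = eps
        self.momentum = momentum
        self.stats = {k: {"mean": np.zeros(num_features),
                          "var": np.ones(num_features)}
                      for k in ("base", "auto")}

    def __call__(self, x, aug_type):
        s = self.stats[aug_type]
        batch_mean, batch_var = x.mean(axis=0), x.var(axis=0)
        # update running statistics for this augmentation branch only
        s["mean"] = (1 - self.momentum) * s["mean"] + self.momentum * batch_mean
        s["var"] = (1 - self.momentum) * s["var"] + self.momentum * batch_var
        return (x - batch_mean) / np.sqrt(batch_var + self.eps)
```

At test time, only the 'base' branch statistics would be used, since the test distribution is closest to the simple augmentations.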
Motivated by AugMix [22], we additionally minimize the Jensen-Shannon (JS) divergence between the softmax outputs of the different augmentations, so as to allow the simple augmentations to guide the learning of complex ones. We present the training loss L_DAJAT of the proposed method, Diverse Augmentation based Joint Adversarial Training (DAJAT), below in Eq. 5:
    L_TR(θ, x, y) = L_CE(f_θ(x), y) + β · max_{x̃∈A_ε(x)} KL(f_θ(x) || f_θ(x̃))    (3)

    θ̃ = argmax_{θ̂∈M(θ)} (1/N) Σ_{i=1}^{N} L_TR(θ + θ̂, x_{i,base}, y_i)    (4)

    L_DAJAT = (1/(T+1)) · (1/N) Σ_{i=1}^{N} [ L_TR(θ + θ̃, x_{i,base}, y_i) + Σ_{t=1}^{T} L_TR(θ + θ̃, x_{i,auto(t)}, y_i) ]
              + (1/N) Σ_{i=1}^{N} JSD(f_{θ+θ̃}(x_{i,base}), f_{θ+θ̃}(x_{i,auto(1)}), ..., f_{θ+θ̃}(x_{i,auto(T)}))    (5)
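The JSD term over the augmented views is the mean KL divergence of each softmax output from their mixture, which can be sketched as (a minimal numpy sketch for single softmax vectors; the small epsilon guards against log(0)):

```python
import numpy as np

def js_divergence(*softmax_outputs):
    """Jensen-Shannon divergence among K softmax vectors: the mean
    KL divergence of each distribution from their average mixture."""
    p_mix = np.mean(softmax_outputs, axis=0)
    kl = lambda p, q: np.sum(p * np.log((p + 1e-12) / (q + 1e-12)))
    return np.mean([kl(p, p_mix) for p in softmax_outputs])
```

Because the divergence is taken against the shared mixture, gradients pull every view's prediction toward a common consensus, letting the easier base view anchor the harder AutoAugment views.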
Adversarial attacks are generated individually for each augmentation by maximizing the respective KL divergence term of the TRADES loss shown in Eq. 3. To improve training efficiency, we compute x̃ using two attack steps with a step-size of ε. We use a combination of a linearly increasing schedule