Efficient and Effective Augmentation
Strategy for Adversarial Training
Sravanti Addepalli†∗   Samyak Jain†⋄∗   R. Venkatesh Babu†
†Video Analytics Lab, Indian Institute of Science, Bangalore
⋄Indian Institute of Technology (BHU) Varanasi
Abstract
Adversarial training of Deep Neural Networks is known to be significantly more
data-hungry when compared to standard training. Furthermore, complex data aug-
mentations such as AutoAugment, which have led to substantial gains in standard
training of image classifiers, have not been successful with Adversarial Training.
We first explain this contrasting behavior by viewing augmentation during training
as a problem of domain generalization, and further propose Diverse Augmentation-
based Joint Adversarial Training (DAJAT) to use data augmentations effectively
in adversarial training. We aim to handle the conflicting goals of enhancing the
diversity of the training dataset and training with data that is close to the test distri-
bution by using a combination of simple and complex augmentations with separate
batch normalization layers during training. We further utilize the popular Jensen-
Shannon divergence loss to encourage the joint learning of the diverse augmen-
tations, thereby allowing simple augmentations to guide the learning of complex
ones. Lastly, to improve the computational efficiency of the proposed method, we
propose and utilize a two-step defense, Ascending Constraint Adversarial Training
(ACAT), that uses an increasing epsilon schedule and weight-space smoothing to
prevent gradient masking. The proposed method DAJAT achieves substantially
better robustness-accuracy trade-off when compared to existing methods on the
RobustBench Leaderboard on ResNet-18 and WideResNet-34-10. The code for im-
plementing DAJAT is available here: https://github.com/val-iisc/DAJAT.
1 Introduction
Deep Neural Network (DNN) based image classifiers are vulnerable to crafted imperceptible perturbations known as Adversarial Attacks [45] that can flip the predictions of the model to unrelated classes, leading to disastrous implications. Adversarial Training [17, 31, 57] has been the most successful defense strategy, where a model is explicitly trained to be robust in the presence of such attacks. While early defenses focused on designing suitable loss functions for training, subsequent works [33, 38] showed that with careful hyperparameter tuning, even the two most popular methods, PGD-AT [31] and TRADES [57], yield comparable performance, highlighting the saturation in performance with respect to changes in the training loss. Schmidt et al. [39] observed that adversarial training has a large sample complexity and further gains require the use of additional training data. Subsequent works [7, 19] indeed used additional data whose distribution is close to that of the original dataset in order to obtain performance gains. The availability of large amounts of relevant data is impractical to assume, leading to an exploration of augmentations based on Generative Adversarial Networks [16] and Diffusion based models [23, 19]. However, the use of such generative models incurs an
∗Equal Contribution. Correspondence to Sravanti Addepalli <sravantia@iisc.ac.in>, Samyak Jain <samyakjain.cse18@itbhu.ac.in>. ‡Work done during internship at Video Analytics Lab, Indian Institute of Science.
36th Conference on Neural Information Processing Systems (NeurIPS 2022).
arXiv:2210.15318v1 [cs.LG] 27 Oct 2022
[Figure 1: A schematic representation of the proposed approach DAJAT. The original image is transformed with simple (Pad + Crop) and complex (AutoAugment) augmentations; each branch is subjected to a 2-step adversarial attack and trained with the TRADES loss, using shared network weights but separate batch-norm layers, with a JS divergence term coupling the branches.]
additional training cost and suffers from limited diversity in low-data regimes and in datasets with
high-resolution images.
A simple and efficient solution to improve the diversity of training data in standard Empirical Risk Minimization (ERM) based training has been the use of random transformations such as rotation, color jitter, and variations in contrast, sharpness and brightness [28, 10, 11], which can change images significantly in input space while belonging to the same class as the original image. However, prior works have surprisingly found that such augmentations, which cause large changes in the input distribution, do not help adversarial training [38, 18, 44]. This limits the augmentations in adversarial training to simple ones - zero padding followed by random crop, and horizontal flip [38, 33, 18] - which may not be able to meet the large data requirement of Adversarial Training.
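For reference, the simple augmentation pipeline above can be sketched as follows (a minimal numpy sketch for HWC images; the 4-pixel zero padding follows common CIFAR practice and is an assumption, not a value stated here):

```python
import numpy as np

def pad_crop_hflip(image, pad=4, rng=None):
    """Base augmentation: zero-pad, take a random crop of the
    original size, then horizontally flip with probability 0.5."""
    rng = rng or np.random.default_rng()
    h, w = image.shape[:2]
    padded = np.pad(image, ((pad, pad), (pad, pad), (0, 0)))
    top, left = rng.integers(0, 2 * pad + 1, size=2)
    crop = padded[top:top + h, left:left + w]
    if rng.random() < 0.5:
        crop = crop[:, ::-1]  # horizontal flip
    return crop
```

Unlike color or contrast transforms, this pipeline leaves the low-level pixel statistics of the image largely unchanged.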
In this work, we first analyze the reasons for this contrasting trend between Standard and Adversarial Training, and further show that it is indeed possible to utilize complex augmentations effectively in Adversarial training as well, by jointly training on simple and complex data augmentations using separate batch-normalization layers for each type, as shown in Fig. 1. While complex augmentations increase the data diversity, resulting in better generalization, simple augmentations ensure that the model specializes on the training data distribution as well. We further minimize the Jensen-Shannon divergence between the softmax outputs of the various augmentations to enable the simple augmentations to guide the learning of complex ones. In order to improve the computational efficiency of the proposed method, we use two attack steps (instead of 10) during training. By progressively increasing the magnitude of perturbations and performing smoothing in weight space, we show that it is indeed possible to improve the stability of training. Our contributions are listed below:
- We analyze the reasons for the failure of strong data augmentations in adversarial training by viewing augmentation during training as a domain generalization problem, and further propose Diverse Augmentation based Joint Adversarial Training (DAJAT) to utilize data augmentations effectively in Adversarial training. The proposed approach can be integrated with many augmentations and adversarial training methods to obtain performance gains.
- We propose and integrate DAJAT with an efficient 2-step defense strategy, Ascending Constraint Adversarial Training (ACAT), that uses a linearly increasing ε schedule, a cosine learning rate and weight-space smoothing to prevent gradient masking and improve convergence.
- We obtain improved robustness and large gains in standard accuracy on multiple datasets (CIFAR-10, CIFAR-100, ImageNette) and model architectures (RN-18, WRN-34-10). We obtain remarkable gains in low data scenarios (CIFAR-100, ImageNette), where data augmentations are most effective. On CIFAR-100, we outperform all existing methods on the RobustBench leaderboard [9] with the same model architecture.
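The kind of ascending ε schedule with a cosine learning rate described for ACAT can be sketched as follows (a minimal sketch; the warmup fraction and peak values are illustrative assumptions, not the paper's settings):

```python
import math

def acat_schedule(epoch, total_epochs, eps_max=8 / 255, lr_max=0.1,
                  warmup_frac=0.5):
    """Linearly ramp the attack budget eps from 0 to eps_max over the
    first warmup_frac of training, with a cosine-annealed learning rate.
    warmup_frac and lr_max are illustrative, not the paper's values."""
    progress = epoch / max(1, total_epochs - 1)
    eps = eps_max * min(1.0, progress / warmup_frac)
    lr = 0.5 * lr_max * (1 + math.cos(math.pi * progress))
    return eps, lr
```

Starting from a small ε keeps the early attacks weak, which helps the 2-step defense avoid the gradient masking associated with large perturbations at the start of training.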
2 Related Works
We discuss various existing strategies for improving the Adversarial robustness of Deep Networks.
Adversarial Training (AT): Goodfellow et al. [17] proposed FGSM-AT, where single-step adversarial samples were used for training. However, these models were susceptible to gradient masking [34], where the local loss landscape becomes convoluted, leading to the generation of weaker single-step attacks during training. This leads to a false sense of security against single-step attacks, while the models are still susceptible to stronger multi-step attacks such as PGD [31]. PGD-AT [31, 38] used multi-step attacks in a similar adversarial training formulation to obtain robust models that stood the test of time against several attacks [3, 8, 42]. TRADES [57] explicitly optimizes the trade-off between the accuracy on natural and adversarial examples by minimizing the cross-entropy loss on natural images along with the Kullback-Leibler (KL) divergence between the predictions on adversarial and clean images. In the proposed defense, we use the base loss from TRADES-AT [57].
Several works have explored the use of auxiliary techniques in AT, such as weight-space smoothing [26, 50], architectural changes [18, 53] and increasing the diversity of training data by using additional natural and synthetic data [39, 7, 18, 37]. Increasing the diversity of training data achieves significant performance gains, since the sample complexity of adversarial training is known to be high [39].
Augmentations in Adversarial Training: While data augmentations such as contrast, sharpness and brightness adjustments are known to improve performance in the standard training regime, they have not led to substantial gains in adversarial training. AutoAugment [10] uses Proximal Policy Optimization to find the set of policies that yields optimal performance on a given dataset. Contrary to prior works [38, 18], we show that the policies optimized for standard training indeed yield a boost in the performance of adversarial training as well, when used in the proposed training framework. In a recent work, Rebuffi et al. [37] show that it is possible to obtain substantial gains in robust accuracy by using spatial composition based augmentations such as CutMix [54] and CutOut [13] that preserve low-level features of the image. CutMix replaces part of an image with another and also combines the output softmax vectors in the same ratio, while CutOut blanks out a random area of an image. The authors hypothesize that the augmentations used in Adversarial training need to preserve low-level features, which severely limits the possibilities for mitigating the large sample complexity of adversarial training. We show that by using the proposed approach DAJAT, it is indeed possible to use augmentations such as color jitter, contrast, sharpness and brightness adjustments that significantly change the low-level statistics of images (Ref: Appendix-B).
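To make the contrast with such spatial augmentations concrete, CutOut can be sketched in a few lines (a minimal numpy sketch; the patch size is an illustrative assumption):

```python
import numpy as np

def cutout(image, size=8, rng=None):
    """Blank out a random size x size square of an HWC image with
    zeros, as in CutOut; the square is clipped at the image border."""
    rng = rng or np.random.default_rng()
    h, w = image.shape[:2]
    cy, cx = rng.integers(0, h), rng.integers(0, w)
    y0, y1 = max(0, cy - size // 2), min(h, cy + size // 2)
    x0, x1 = max(0, cx - size // 2), min(w, cx + size // 2)
    out = image.copy()
    out[y0:y1, x0:x1] = 0.0
    return out
```

Outside the blanked square, every pixel is untouched, which is why such augmentations preserve the low-level statistics that color and contrast transforms alter.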
3 Preliminaries: Notation and Threat Model
We consider the Adversarial Robustness of DNN based image classifiers. An input image is denoted as x ∈ X and the corresponding ground truth label as y ∈ Y. We denote a simple transformation of x obtained using Pad, Crop and Horizontal flip (Pad+Crop+HFlip, referred to as Base augmentations) by x_base, and other transformations of x by the respective subscript. For example, x_auto refers to the image x being transformed by AutoAugment (AA) [10] followed by the base augmentations. The function mapping of the classifier C from the input space X to the softmax vectors is denoted by f_θ(·), where θ denotes the network parameters. Adversarial examples corresponding to the images x, x_base and x_auto are denoted by x̃, x̃_base and x̃_auto respectively. We consider the ℓ∞ norm based threat model, where x̃ is a valid perturbation within ε if it belongs to the set A_ε(x) = {x̃ : ||x̃ − x||_∞ ≤ ε}.
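The constraint set A_ε(x) corresponds to a simple projection step (a minimal numpy sketch; the [0, 1] pixel range is an assumption):

```python
import numpy as np

def project_linf(x_adv, x, eps):
    """Project a candidate adversarial example back into
    A_eps(x) = {x_adv : ||x_adv - x||_inf <= eps}, then into the
    valid pixel range [0, 1]."""
    x_adv = np.clip(x_adv, x - eps, x + eps)  # l_inf ball around x
    return np.clip(x_adv, 0.0, 1.0)          # valid image range
```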
4 Motivation: Role of Augmentations in Neural Network Training
In this section, we first explore the contrasting factors that influence the training of neural networks when data is augmented (Sec. 4.1), and further delve into the specifics of adversarial training which make it challenging to obtain gains using complex data augmentations (Sec. 4.2).
4.1 Impact of Augmentations in Neural Network Training
Conjecture-1: We hypothesize that the role of data augmentations in the training of Neural Networks is influenced by the following contrasting factors:
(i) Reduced overfitting due to an increase in diversity of the augmented dataset, leading to better generalization of the network to the test set.
(ii) Larger domain shift between the augmented data distribution and the test data distribution, leading to a drop in performance on the test set.
(iii) Capacity of the Neural Network in being able to generalize well to the augmented data distribution and the unaugmented data distribution for the given task.
Justification: The training of Neural Networks using augmented data can be considered as a problem of domain generalization, where the network is trained on a source domain (augmented data) and is
Table 1: Impact of augmentations: Performance (%) of ACAT models trained with Base augmentations or AutoAugment (Auto), evaluated on the unaugmented test set (No Aug) and on AutoAugmented test images. Clean and robust accuracy against the GAMA attack [42] are reported. The use of AutoAugment during training results in a 1.52% drop in robust accuracy.

                            Test: No Aug        Test: AutoAugment
Model              Train    Clean    Robust     Clean    Robust
ResNet-18          Base     82.41    50.00      63.79    37.07
ResNet-18          Auto     82.54    48.11      76.40    43.22
WideResNet-34-10   Base     86.71    55.58      68.24    40.83
WideResNet-34-10   Auto     86.80    53.99      82.64    48.98
Figure 2: Comparison of BN layer statistics for a WRN-34-10 model trained on CIFAR-10 using DAJAT. BN layers of the Base augmentations (Pad+Crop, H-Flip) are compared with those of AutoAugment. Initial layer (L3) parameters are diverse, while those of deeper layers (L25) are similar.
expected to generalize to a target domain (test data). We use the theoretical formulation by Ben-David et al. [4], shown below, to justify the respective claims in Conjecture-1:

    ϵ_t(f) ≤ ϵ_s(f) + (1/2)·d_F∆F(s, t) + λ    (1)
(i) The use of a more diverse or larger source dataset reduces overfitting, improving the performance of the network on the source distribution. From Eq. 1, the expected error on the target distribution ϵ_t (the test set in this case) is upper bounded by the expected error on the source distribution ϵ_s (the augmented dataset), along with other terms. Therefore, improved performance on the augmented distribution can improve the performance on the test set as well.
(ii) The expected error on the target distribution ϵ_t is also upper bounded by the distribution shift between the source and target distributions, (1/2)·d_F∆F(s, t), along with other terms. Therefore, a larger domain shift between the augmented and test data distributions can indeed limit the performance gains on the test set.
(iii) The constant λ in Eq. 1 measures the risk of the optimal joint classifier: λ = min_{f∈F} [ϵ_s(f) + ϵ_t(f)]. Neural Networks with a higher capacity can effectively minimize the expected risk on the source set ϵ_s and the risk of the optimal joint classifier. Therefore, the capacity of the Neural Network and the complexity of the task influence the gains that can be obtained using augmentations.
4.2 Analysing the role of Augmentations in Adversarial Training
We analyse the trade-off between the factors described in Conjecture-1 for adversarial training when compared to standard ERM training. In addition to the goal of improving accuracy on clean samples, adversarial training aims to achieve local smoothness of the loss landscape as well. Hence, the complexity of Adversarial Training is higher than that of standard ERM training, making it important to use larger model capacities to obtain gains using data augmentations (based on Conjecture-1 (iii)). This justifies the gains obtained by Rebuffi et al. [37] on the WRN-70-16 architecture by using CutMix based augmentations (2.9% higher robust accuracy and 1.23% higher clean accuracy). The same method does not obtain significant gains on smaller architectures such as ResNet-18, where a 1.76% boost in robust accuracy is accompanied by a 2.55% drop in clean accuracy.
Secondly, while the distribution shift between augmented data and test data ((1/2)·d_F∆F(s, t)) may be sufficiently low for natural images, leading to improved generalization to the test set (Conjecture-1 (i, ii)), the same may not be true for adversarial images. There is a large difference between the augmented data and test data in pixel space, although they may be similar in feature space. Since adversarial attacks perturb images in pixel space, the distribution shift between the corresponding perturbations widens further, as shown in Figs. 7, 8 and 9. Based on Conjecture-1 (ii), unless this difference is accounted for, complex augmentations cannot improve the performance of adversarial training (Ref: Appendix-B). This trend has also been observed empirically by Rebuffi et al. [37], based on which they conclude that the augmentations designed for robustness need to preserve low-level features.
We present the performance of Adversarial Training using either Base augmentations (Pad+Crop, Flip) or AutoAugment [10] during training and inference on the CIFAR-10 dataset, using ResNet-18 and WideResNet-34-10 architectures, in Table-1. Firstly, we note that by using AutoAugment during training alone, robust accuracy on the test set drops by 1.52%, as observed in prior work [18]. Secondly, the clean and robust accuracy drop by around 6.5% when augmented images are used for both training and testing, highlighting the complexity of the learning task. We present additional results by training without any augmentations, and by training using a combination of both augmentations in every minibatch, in Table-15. We note that the use of Base Augmentations alone (Pad+Crop+HFlip) still gives the best overall performance on the unaugmented test set.
5 Proposed Method
5.1 Background
We briefly discuss the TRADES-AWP defense [57, 50], which is the base algorithm used in DAJAT.
    L_AWP = max_{θ̂∈M(θ)} (1/N) Σ_{i=1}^{N} [ L_CE(f_{θ+θ̂}(x_i), y_i) + β · max_{x̃_i∈A_ε(x_i)} KL(f_{θ+θ̂}(x_i) || f_{θ+θ̂}(x̃_i)) ]    (2)
Firstly, an adversarial attack is generated by maximizing the KL divergence between the softmax predictions of the clean and adversarial examples iteratively for 10 attack steps. An Adversarial Weight Perturbation (AWP) step additionally perturbs the weights of the model to maximize the overall loss in weight-space. The weight perturbations are constrained to the feasible region M(θ) such that for a given layer l, ||θ̂_l|| ≤ γ · ||θ_l||. The overall training loss is a combination of the Cross-entropy loss on clean samples and the KL divergence term. The latter is weighted by a factor β that controls the robustness-accuracy trade-off. Training the model using Adversarial Weight Perturbations leads to smoothing of the loss surface in weight space, resulting in better generalization [50, 44].
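The layer-wise constraint ||θ̂_l|| ≤ γ · ||θ_l|| amounts to rescaling a candidate weight perturbation per layer (a minimal numpy sketch of only the projection step, not the full AWP inner maximization; γ here is illustrative):

```python
import numpy as np

def constrain_perturbation(delta_layers, theta_layers, gamma=0.005):
    """Rescale each layer's weight perturbation so that
    ||delta_l|| <= gamma * ||theta_l||, as in the AWP feasible
    region M(theta)."""
    constrained = []
    for delta, theta in zip(delta_layers, theta_layers):
        limit = gamma * np.linalg.norm(theta)
        norm = np.linalg.norm(delta)
        if norm > limit:
            delta = delta * (limit / norm)  # project onto the norm ball
        constrained.append(delta)
    return constrained
```

Tying the perturbation budget to each layer's own weight norm keeps the relative distortion uniform across layers of very different scales.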
5.2 Diverse Augmentation based Joint Adversarial Training (DAJAT)
As discussed in Section-4, the use of augmentations in training can be viewed as a problem of
domain generalization, where performance on the source distribution or augmented dataset is crucial
towards improving the performance on the target distribution or test set. Since adversarial training is
inherently challenging, for limited model capacity, it is difficult to obtain good performance on the
training data that is transformed using complex augmentations. Moreover, the large distribution shift
between augmented data and test data, specifically with respect to low-level statistics, results in poor
generalization of robust accuracy to the test set.
To mitigate these challenges, we propose the combined use of simple and complex augmentations during training, so that the model can benefit from the diversity introduced by complex augmentations, while also specializing on the original data distribution, which is similar to the simple augmentations. We propose to use separate batch normalization layers for simple and complex augmentations, so as to offset the shift in distribution between the two kinds of augmentations. In our main approach, we use Pad and Crop followed by Horizontal Flip (Pad+Crop+HFlip) as the simple augmentations, and AutoAugment followed by Pad+Crop+HFlip as the complex augmentations. We justify the choice of this augmentation pipeline in Appendix-B.3.
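The separate batch-norm idea can be sketched as a single normalization module holding two sets of statistics, selected by augmentation type (a minimal numpy sketch; the momentum value is an assumption, and the affine scale/shift parameters of real BN are omitted):

```python
import numpy as np

class DualBatchNorm:
    """Keeps independent running statistics for 'base' and 'auto'
    augmented batches; all weight parameters of the surrounding
    network remain shared across the two branches."""

    def __init__(self, num_features, momentum=0.1, eps=1e-5):
        self.eps = eps
        self.momentum = momentum
        self.stats = {k: {"mean": np.zeros(num_features),
                          "var": np.ones(num_features)}
                      for k in ("base", "auto")}

    def __call__(self, x, aug_type):
        s = self.stats[aug_type]
        batch_mean, batch_var = x.mean(axis=0), x.var(axis=0)
        # update running statistics for this augmentation branch only
        s["mean"] = (1 - self.momentum) * s["mean"] + self.momentum * batch_mean
        s["var"] = (1 - self.momentum) * s["var"] + self.momentum * batch_var
        return (x - batch_mean) / np.sqrt(batch_var + self.eps)
```

At test time, only the 'base' branch statistics would be used, since the test distribution is closest to the simple augmentations.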
Motivated by AugMix [22], we additionally minimize the Jensen-Shannon (JS) divergence between the softmax outputs of the different augmentations, so as to allow the simple augmentations to guide the learning of complex ones. We present the training loss L_DAJAT of the proposed method, Diverse Augmentation based Joint Adversarial Training (DAJAT), below in Eq. 5:
    L_TR(θ, x, y) = L_CE(f_θ(x), y) + β · max_{x̃∈A_ε(x)} KL(f_θ(x) || f_θ(x̃))    (3)

    θ̃ = argmax_{θ̂∈M(θ)} (1/N) Σ_{i=1}^{N} L_TR(θ + θ̂, x_{i,base}, y_i)    (4)

    L_DAJAT = (1/(T+1)) · (1/N) Σ_{i=1}^{N} [ L_TR(θ + θ̃, x_{i,base}, y_i) + Σ_{t=1}^{T} L_TR(θ + θ̃, x_{i,auto(t)}, y_i) ]
              + (1/N) Σ_{i=1}^{N} JSD(f_{θ+θ̃}(x_{i,base}), f_{θ+θ̃}(x_{i,auto(1)}), ..., f_{θ+θ̃}(x_{i,auto(T)}))    (5)
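The JSD term over the augmented views is the mean KL divergence of each softmax output from their mixture, which can be sketched as (a minimal numpy sketch for single softmax vectors; the small epsilon guards against log(0)):

```python
import numpy as np

def js_divergence(*softmax_outputs):
    """Jensen-Shannon divergence among K softmax vectors: the mean
    KL divergence of each distribution from their average mixture."""
    p_mix = np.mean(softmax_outputs, axis=0)
    kl = lambda p, q: np.sum(p * np.log((p + 1e-12) / (q + 1e-12)))
    return np.mean([kl(p, p_mix) for p in softmax_outputs])
```

Because the divergence is taken against the shared mixture, gradients pull every view's prediction toward a common consensus, letting the easier base view anchor the harder AutoAugment views.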
Adversarial attacks are generated individually for each augmentation by maximizing the respective KL divergence term of the TRADES loss shown in Eq. 3. To improve training efficiency, we compute x̃ using two attack steps with a step-size of ε. We use a combination of a linearly increasing schedule