
2 Related Work
In this section, we first review standard adversarial training with a single type of perturbation, together with its theoretical analysis. We then review adversarial training against multiple perturbations.
Adversarial training Adversarial training (AT) has been demonstrated to be one of the most effective ways to increase adversarial robustness (Szegedy et al., 2013). The key idea of AT is to augment the training set with adversarial examples during training (a minimal sketch of a single training step is given after this paragraph). Currently, most AT-based methods are trained with a single type of adversarial example, and the $\ell_p$ norm ($p = 1, 2$, or $\infty$) is commonly used to constrain the adversarial examples generated during training (Madry et al., 2017). It has been shown that AT overfits the adversarial examples in the training set and generalizes poorly to the test set. Many approaches have been proposed to improve adversarial generalization (Raghunathan et al., 2019; Schmidt et al., 2018). Meanwhile, there have been several attempts at a theoretical understanding of adversarial training, focusing mainly on convergence properties and generalization bounds. For example, Gao et al. (2019) study the convergence of adversarial training in the neural tangent kernel (NTK) regime, and Liu et al. (2020b) study the smoothness of the adversarial training loss. In terms of generalization bounds, Yin et al. (2019) and Awasthi et al. (2020) derive bounds based on Rademacher complexity, Gao et al. (2019) consider a VC-dimension bound for adversarial training, and Xing et al. (2021) study the generalization of adversarial linear regression.
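To make this pipeline concrete, the following is a minimal sketch of one $\ell_\infty$ PGD adversarial-training step. It is only an illustration, not the implementation studied in this paper; the names (model, optimizer, x, y) and the hyperparameters eps, alpha, and num_steps are assumed placeholders, and PyTorch is used only for concreteness.
\begin{verbatim}
import torch
import torch.nn.functional as F

def pgd_linf(model, x, y, eps=8/255, alpha=2/255, num_steps=10):
    """Approximately solve the inner maximization over the l_inf ball."""
    x_adv = (x + torch.empty_like(x).uniform_(-eps, eps)).clamp(0, 1).detach()
    for _ in range(num_steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        # Ascent step on the gradient sign, then project back to the eps-ball.
        x_adv = x_adv.detach() + alpha * grad.sign()
        x_adv = torch.min(torch.max(x_adv, x - eps), x + eps).clamp(0, 1)
    return x_adv.detach()

def adversarial_training_step(model, optimizer, x, y):
    """Outer minimization: update the model on adversarial examples."""
    model.eval()                 # freeze batch-norm statistics during the attack
    x_adv = pgd_linf(model, x, y)
    model.train()
    optimizer.zero_grad()
    loss = F.cross_entropy(model(x_adv), y)
    loss.backward()
    optimizer.step()
    return loss.item()
\end{verbatim}
The inner loop approximates the inner maximization over the norm ball, and the outer step performs the minimization over the model parameters, matching the min-max structure formalized in (3.1) below.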
Adversarial robustness against multiple perturbation models Recently, several works have demonstrated that adversarial training with a single type of perturbation cannot provide a good defense against other types of adversarial attacks (Tramèr and Boneh, 2019), and several ATMP algorithms have been proposed accordingly (Maini et al., 2020; Madaan et al., 2020; Zhang et al., 2021; Stutz et al., 2020). Tramèr and Boneh (2019) proposed augmenting adversarial training with different types of adversarial examples and developed two aggregation strategies, i.e., MAX and AVG. MAX trains on the worst-case adversarial example among the different attacks, while AVG trains on all types of adversarial examples; both objectives are written out at the end of this section. Following this pipeline, later works developed different aggregation strategies (e.g., MSD (Maini et al., 2020) and SAT (Madaan et al., 2020)) for better robustness or training efficiency. While these works boost adversarial robustness against multiple perturbations to some extent, the training process of ATMP is highly unstable, and there is no theoretical analysis of this instability. A theoretical understanding of the training difficulty of ATMP is important for further progress on adversarial robustness against multiple perturbations. Besides, there have also been other approaches to adversarial robustness against multiple perturbations, such as ensemble models (Maini et al., 2021; Cheng et al., 2021), preprocessing (Nandy et al., 2020), and neural architecture search (NAS) (Liu et al., 2020a). The weakness of ensemble and preprocessing methods is that their performance depends heavily on how well the different types of adversarial examples are classified or detected. These methods either achieve lower performance or consider tasks different from ours. Therefore, we mainly compare against MAX, AVG, MSD, and SAT in this work.
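For concreteness, the two strategies of Tramèr and Boneh (2019) can be written out using the notation introduced in Section 3; the symbol $\mathcal{P} \subseteq \{1, 2, \infty\}$ for the set of perturbation types is our illustrative shorthand rather than notation from the original works:
$$\min_{\theta} \frac{1}{n}\sum_{i=1}^{n} \max_{p \in \mathcal{P}} \; \max_{\|z_i - z_i'\|_p \le \epsilon_p} g(\theta, z_i') \qquad \text{(MAX)},$$
$$\min_{\theta} \frac{1}{n}\sum_{i=1}^{n} \frac{1}{|\mathcal{P}|}\sum_{p \in \mathcal{P}} \; \max_{\|z_i - z_i'\|_p \le \epsilon_p} g(\theta, z_i') \qquad \text{(AVG)}.$$
In words, MAX optimizes against the single strongest attack for each example, while AVG averages the per-attack worst-case losses.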
3 Preliminaries of Adversarial Training for Multiple Perturbations
Adversarial training is an approach to train a classifier that minimizes the worst-case loss within a norm-bounded constraint. Let $g(\theta, z)$ be the loss function of the standard (non-adversarial) counterpart. Given a training dataset $S = \{z_i\}_{i=1,\dots,n}$, the optimization problem of adversarial training is
$$\min_{\theta} \; \frac{1}{n}\sum_{i=1}^{n} \; \max_{\|z_i - z_i'\|_p \le \epsilon_p} g(\theta, z_i'), \qquad (3.1)$$
where $\epsilon_p$ is the perturbation threshold and $p = 1, 2$, or $\infty$ for different types of attacks. Usually, $g$ can also be written in the form of $\ell(f_\theta(x), y)$, where $f_\theta$ is the neural network to be trained and $(x, y)$ is the input-label