Adaptive Smoothness-weighted Adversarial Training for Multiple
Perturbations with Its Stability Analysis
Jiancong Xiao1, Zeyu Qin1, Yanbo Fan2,∗,
Baoyuan Wu1,3, Jue Wang2, Zhi-Quan Luo1,3,∗
1The Chinese University of Hong Kong, Shenzhen;
2Tencent AI Lab; 3Shenzhen Research Institute of Big Data
jiancongxiao@link.cuhk.edu.cn, zeyuqin@cuhk.link.edu.cn,
fanyanbo0124@gmail.com, wubaoyuan@cuhk.edu.cn,
arphid@gmail.com, luozq@cuhk.edu.cn
∗Corresponding authors.
Abstract
Adversarial Training (AT) has been demonstrated to be one of the most effective methods against adversarial examples. While most existing works focus on AT with a single type of perturbation (e.g., $\ell_\infty$ attacks), DNNs face threats from different types of adversarial examples. Therefore, adversarial training for multiple perturbations (ATMP) has been proposed to generalize adversarial robustness over different perturbation types (i.e., $\ell_1$-, $\ell_2$-, and $\ell_\infty$-norm-bounded perturbations). However, the resulting model exhibits a trade-off between different attacks. Meanwhile, there is no theoretical analysis of ATMP, limiting its further development. In this paper, we first provide a smoothness analysis of ATMP and show that the $\ell_1$, $\ell_2$, and $\ell_\infty$ adversaries contribute differently to the smoothness of the loss function of ATMP. Based on this, we develop stability-based excess risk bounds and propose adaptive smoothness-weighted adversarial training for multiple perturbations. Theoretically, our algorithm yields better bounds. Empirically, our experiments on CIFAR10 and CIFAR100 achieve state-of-the-art performance against mixtures of multiple-perturbation attacks.
1 Introduction
Deep neural networks (DNNs) are known to be vulnerable to adversarial examples (Goodfellow et al., 2014; Szegedy et al., 2013), where a small and malicious perturbation can cause incorrect predictions. Adversarial training (AT) (Madry et al., 2017), which augments training data with $\ell_p$ norm-bounded adversarial examples, is one of the most effective methods to increase the robustness of DNNs against adversarial attacks. Currently, most existing works focus on adversarial training with a single type of attack, e.g., the $\ell_\infty$ attack (Raghunathan et al., 2019; Gowal et al., 2020). However, some recent works (Tramèr and Boneh, 2019) have experimentally demonstrated that DNNs trained with a single type of adversarial attack cannot provide a good defense against other types of adversarial examples. Fig. 1(a) provides an example on CIFAR-10. The plot shows that $\ell_1$ adversarial training cannot defend against the $\ell_2$ and $\ell_\infty$ attacks (its robust accuracy is 0%). The $\ell_\infty$ adversarial training provides some degree of defense against the $\ell_1$ and $\ell_2$ attacks (with accuracies of 17.19% and 53.91%), but it is not comparable to the performance of $\ell_1$ and $\ell_2$ adversarial training, which reach 89.84% and 61.72%, respectively.
For better robustness against different types of attacks, Tramèr and Boneh (2019) extend adversarial training to multiple perturbations (ATMP), typically the $\ell_1$, $\ell_2$, and $\ell_\infty$ attacks.
Specifically, they consider two types of objective functions. The first one is the average over all perturbations (AVG), where the inner maximization problem of adversarial training finds adversarial examples for all
types of attacks. The second one is the worst-case perturbation (WST), where the inner maximization problem
finds the adversarial example with the largest loss within the union of the $\ell_p$ norm balls. Following these
settings, researchers have proposed different algorithms to solve these two problems. Representative works
are multi-steepest descent (MSD) (Maini et al., 2020) and stochastic adversarial training (SAT) (Madaan
et al., 2020), which use different strategies to find adversarial examples in the $\ell_p$-norm balls and obtain some
improvements compared to MAX and AVG.
However, several crucial issues of ATMP remain unsolved. Firstly, the optimization
process of ATMP is highly unstable compared to that of AT or standard training. Fig. 1(c) and (d) give an
example: the robust test accuracy fluctuates between training epochs. Secondly, it is quite difficult
to achieve a satisfactory trade-off between different attacks. None of the existing algorithms achieves the best performance
against all three attacks, as shown in Fig. 1(b). Different algorithms tend to converge to different
sub-optimal local minima, resulting in models that perform well against one perturbation type but defend poorly against
the others. Last and most importantly, there is currently no theoretical study of ATMP. Existing
ATMP methods are usually designed empirically, without theoretical guidelines.
Figure 1: Crucial issues of adversarial training for multiple perturbations. (a) Performance of adversarial training
with a single perturbation type against other types of attacks. (b) Trade-off between different types of adversaries for
four ATMP algorithms. (c) Robust test accuracy fluctuates across epochs using MAX. (d) Robust test
accuracy fluctuates across epochs using MSD.
In this work, we first study the smoothness and the loss landscape of ATMP. We show that the $\ell_1$, $\ell_2$, and $\ell_\infty$ adversaries contribute differently to the smoothness of ATMP. This motivates us to
study the following question:

How can we use the smoothness properties of different $\ell_p$ adversaries to design algorithms for ATMP?

We study this question using the notion of uniform stability. In uniform stability analysis, the excess risk,
which is the sum of the optimization error and the generalization error, is closely related to the smoothness of the
loss function. The formal stability analysis and excess risk upper bound of ATMP are provided in Thm.
5.1. Inspired by the analysis, we propose adaptive smoothness-weighted adversarial training for multiple
perturbations to improve the excess risk bound. Theoretically, our algorithm yields a better bound (see our
main results in Thm. 5.2 and Thm. 5.3). Experimental results on CIFAR-10 and CIFAR-100 show that
this technique mitigates the above issues and improves the performance of ATMP. Our solution achieves
state-of-the-art performance against mixtures of multiple-perturbation attacks.
Our contributions are listed as follows:
1. We provide a comprehensive smoothness analysis of adversarial training for single and multiple perturbations.
2. We provide a uniform stability analysis of ATMP. Based on the analysis, we propose our stability-inspired algorithm: adaptive smoothness-weighted adversarial training for multiple perturbations.
3. Theoretically, our algorithm yields a better excess risk bound. Experimentally, we obtain an improvement in robust accuracy, achieving state-of-the-art performance on CIFAR-10 and CIFAR-100.
2 Related Work
In this section, we first introduce standard adversarial training with a single type of perturbation, as
well as its theoretical analysis. We then introduce adversarial training against multiple perturbations.
Adversarial training Adversarial training (AT) has been demonstrated to be one of the most effective
ways to increase adversarial robustness (Szegedy et al., 2013). The key idea of AT is to augment the
training set with adversarial examples during training. Currently, most AT-based methods are trained with
a single type of adversarial example, and the $\ell_p$ norm ($p = 1$, $2$, or $\infty$) is commonly used to generate adversarial
examples during training (Madry et al., 2017). It has been shown that AT overfits the adversarial examples on
the training set and generalizes poorly to the test set. Many approaches have been proposed to improve
adversarial generalization (Raghunathan et al., 2019; Schmidt et al., 2018). Meanwhile, there have been
some attempts at a theoretical understanding of adversarial training, mainly focusing on convergence
properties and generalization bounds. For example, the work of Gao et al. (2019) studies the convergence of
adversarial training in the neural tangent kernel (NTK) regime. Liu et al. (2020b) study the smoothness of the loss
function of adversarial training. In terms of generalization bounds, the works of Yin et al. (2019) and Awasthi et al.
(2020) study generalization bounds in terms of Rademacher complexity. The work of
Gao et al. (2019) considers the VC-dimension bound of adversarial training. Xing et al. (2021)
study the generalization of adversarial linear regression.
Adversarial robustness against multiple perturbations Recently, some works have demonstrated
that adversarial training with a single type of perturbation cannot provide a good defense against other
types of adversarial attacks (Tramèr and Boneh, 2019), and several ATMP algorithms have been proposed
accordingly (Maini et al., 2020; Madaan et al., 2020; Zhang et al., 2021; Stutz et al., 2020). The work
of Tramèr and Boneh (2019) proposed to augment adversarial training with different types of adversarial examples
and developed two augmentation strategies, i.e., MAX and AVG. MAX adopts the worst-case
adversarial example among different attacks, while AVG takes all types of adversarial examples into
training. Following this pipeline, later works developed different aggregation strategies (e.g.,
MSD (Maini et al., 2020) and SAT (Madaan et al., 2020)) for better robustness or training efficiency.
While these works can boost adversarial robustness against multiple perturbations to some extent, the
training process of ATMP is highly unstable, and there is no theoretical analysis of this behavior. A theoretical
understanding of the training difficulty of ATMP is important for the further development of adversarial
robustness for multiple perturbations. Besides, there have also been other approaches to adversarial robustness
against multiple perturbations, such as ensemble models (Maini et al., 2021; Cheng et al., 2021),
preprocessing (Nandy et al., 2020), and neural architecture search (NAS) (Liu et al., 2020a). The weakness
of ensemble and preprocessing methods is that their performance is highly dependent on the quality of
classifying or detecting the different types of adversarial examples. These methods either have lower performance
or consider tasks different from ours. Therefore, we mainly compare against the algorithms MAX,
AVG, MSD, and SAT in this work.
3 Preliminaries of Adversarial Training for Multiple Perturbations
Adversarial training is an approach to train a classifier that minimizes the worst-case loss within a norm-bounded constraint. Let $g(\theta, z)$ be the loss function of the standard (non-adversarial) counterpart. Given a training dataset
$S = \{z_i\}_{i=1,\dots,n}$, the optimization problem of adversarial training is
$$\min_{\theta}\ \frac{1}{n}\sum_{i=1}^{n}\ \max_{\|z_i - z_i'\|_p \le \epsilon_p} g(\theta, z_i'), \qquad (3.1)$$
where $\epsilon_p$ is the perturbation threshold and $p = 1, 2$, or $\infty$ for different types of attacks. Usually, $g$ can also be
written in the form $\ell(f_\theta(x), y)$, where $f_\theta$ is the neural network to be trained and $(x, y)$ is the input-label
pair. Adversarial training aims to train a model against a single type of $\ell_p$ attack. As AT with a single type
of attack may not be effective under other types of attacks, adversarial training for multiple perturbations
has been proposed (Tramèr and Boneh, 2019). Following the aforementioned literature, we consider the case
$p \in \{1, 2, \infty\}$. Two formulations can be used to tackle this problem.
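Before turning to these two formulations, the following is a minimal PGD sketch in PyTorch of the single-$\ell_p$ inner maximization in Eq. (3.1), covering only the $\ell_\infty$ and $\ell_2$ cases (the $\ell_1$ case requires a specialized sparse step and $\ell_1$-ball projection and is omitted). The function name, the use of cross-entropy, and the hyperparameters are illustrative assumptions, not the paper's exact setup.

```python
import torch
import torch.nn.functional as F

def pgd_attack(model, x, y, eps, alpha, steps, p="inf"):
    """Approximately solve max_{||delta||_p <= eps} g(theta, (x + delta, y)) by PGD."""
    delta = torch.zeros_like(x, requires_grad=True)
    for _ in range(steps):
        loss = F.cross_entropy(model(x + delta), y)
        grad, = torch.autograd.grad(loss, delta)
        with torch.no_grad():
            if p == "inf":
                # steepest ascent in l_inf geometry, then project onto the l_inf ball
                delta += alpha * grad.sign()
                delta.clamp_(-eps, eps)
            elif p == "2":
                # normalized gradient ascent step, then project onto the l_2 ball
                g_norm = grad.flatten(1).norm(dim=1).clamp_min(1e-12)
                delta += alpha * grad / g_norm.view(-1, *([1] * (x.dim() - 1)))
                d_norm = delta.flatten(1).norm(dim=1).clamp_min(1e-12)
                delta *= (eps / d_norm).clamp(max=1.0).view(-1, *([1] * (x.dim() - 1)))
            else:
                raise NotImplementedError("l_1 PGD needs a sparse step and l_1-ball projection")
    # input-range clipping (e.g., to [0, 1] for images) is omitted for brevity
    return (x + delta).detach()
```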
Worst-case perturbation (WST) The optimization problem of WST is formulated as follows:
$$\min_{\theta}\ \frac{1}{n}\sum_{i=1}^{n}\ \max_{p \in \{1,2,\infty\}}\ \max_{\|z_i - z_i'\|_p \le \epsilon_p} g(\theta, z_i'). \qquad (3.2)$$
WST aims to find the worst adversarial example within the union of the three norm constraints in the inner
maximization problem. The outer minimization problem updates the model parameters $\theta$ to fit these adversarial
examples. The MAX strategy (Tramèr and Boneh, 2019) was proposed for the optimization problem in Eq.
(3.2). In each inner iteration, MAX takes the maximum loss over the three adversarial examples. Another
algorithm for the optimization problem in Eq. (3.2) is multi-steepest descent (MSD) (Maini et al., 2020). In
each PGD step of the inner iteration, MSD selects the worst among the $\ell_1$, $\ell_2$, and $\ell_\infty$ updates.
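As a rough illustration of the MAX strategy for Eq. (3.2), the sketch below reuses the hypothetical pgd_attack helper above; the configurations are placeholders rather than the paper's settings. MSD differs in that the maximum over $p$ is taken inside every PGD step rather than after the full attacks.

```python
import torch
import torch.nn.functional as F

def max_strategy_loss(model, x, y, attack_cfgs):
    """Per-sample worst-case loss over several threat models (the MAX strategy)."""
    per_attack = []
    for cfg in attack_cfgs:  # e.g., [{"p": "inf", "eps": 8/255, "alpha": 2/255, "steps": 10}, ...]
        x_adv = pgd_attack(model, x, y, cfg["eps"], cfg["alpha"], cfg["steps"], p=cfg["p"])
        per_attack.append(F.cross_entropy(model(x_adv), y, reduction="none"))
    # take the worst loss per example, then average over the batch
    return torch.stack(per_attack, dim=0).max(dim=0).values.mean()

# one outer-minimization step:
# optimizer.zero_grad(); max_strategy_loss(model, x, y, attack_cfgs).backward(); optimizer.step()
```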
Average of all perturbations (AVG) The optimization problem of AVG is formulated as follows:
$$\min_{\theta}\ \frac{1}{n}\sum_{i=1}^{n}\ \mathbb{E}_{p \sim \{1,2,\infty\}}\ \max_{\|z_i - z_i'\|_p \le \epsilon_p} g(\theta, z_i'), \qquad (3.3)$$
where $p \sim \{1,2,\infty\}$ uniformly at random. The goal of the minimax problem in Eq. (3.3) is to train the neural
network on data augmented with all three types of adversarial examples. The AVG strategy (Tramèr
and Boneh, 2019) and stochastic adversarial training (SAT) (Madaan et al., 2020) are two algorithms for
solving the problem in Eq. (3.3). In each inner iteration, AVG takes the average loss over the three adversarial
examples, while SAT randomly chooses one type of adversarial example among the $\ell_1$, $\ell_2$, and $\ell_\infty$ attacks.
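A corresponding sketch of the AVG and SAT strategies for Eq. (3.3), again assuming the hypothetical pgd_attack helper above; as before, names and hyperparameters are illustrative only.

```python
import random
import torch.nn.functional as F

def avg_strategy_loss(model, x, y, attack_cfgs):
    # AVG: average the loss over all three types of adversarial examples
    losses = [
        F.cross_entropy(model(pgd_attack(model, x, y, c["eps"], c["alpha"], c["steps"], p=c["p"])), y)
        for c in attack_cfgs
    ]
    return sum(losses) / len(losses)

def sat_strategy_loss(model, x, y, attack_cfgs):
    # SAT: sample a single threat model uniformly at random for this batch
    c = random.choice(attack_cfgs)
    x_adv = pgd_attack(model, x, y, c["eps"], c["alpha"], c["steps"], p=c["p"])
    return F.cross_entropy(model(x_adv), y)
```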
The WST and AVG problems are similar but slightly different. WST aims to defend against union attacks,
i.e., the optimal attack within the union of multiple perturbations. AVG aims to defend against mixture attacks,
i.e., the attacker randomly picks one $\ell_p$ attack. In this paper, we mainly focus on the AVG problem, and we also
discuss some solutions for the WST problem.
4 Smoothness Analysis
We first study the smoothness of the minimax problems in Eq. (3.2) and (3.3). To simplify the notation, let
$$h_p(\theta, z) = \max_{\|z - z'\|_p \le \epsilon_p} g(\theta, z'),$$
$$h_{\mathrm{avg}}(\theta, z) = \mathbb{E}_{p \sim \{1,2,\infty\}}\ \max_{\|z - z'\|_p \le \epsilon_p} g(\theta, z'),$$
$$h_{\mathrm{wst}}(\theta, z) = \max_{p \in \{1,2,\infty\}}\ \max_{\|z - z'\|_p \le \epsilon_p} g(\theta, z')$$
be the loss functions of standard adversarial training, average-of-all-perturbations adversarial training,
and worst-case multiple-perturbation adversarial training, respectively. The population and empirical risks are
the expectation and the sample average of $h_{\mathrm{st}}(\cdot)$, respectively. We use $R^{\mathrm{st}}_{\mathcal{D}}(\theta)$ and $R^{\mathrm{st}}_{S}(\theta)$ to denote the population and
empirical risks for adversarial training with different strategies, i.e., $\mathrm{st} \in \{1, 2, \infty, \mathrm{wst}, \mathrm{avg}\}$.
Case study: Linear regression We use a simple case, adversarial linear regression, to illustrate the
smoothness of the optimization problems (3.2) and (3.3). Let $f_\theta(x) = \theta^\top x$ and $\ell(\theta^\top x, y) = |\theta^\top x - y|^2$; we
have the following proposition.
Proposition 4.1. Let $X = [x_1, \cdots, x_n]^\top$, $y = [y_1, \cdots, y_n]^\top$, and $\delta = [\delta_1, \cdots, \delta_n]^\top$. Then
$$R^{p}_{S}(\theta) = \big[\|X\theta - y\|_2 + \sqrt{n}\,\epsilon_p \|\theta\|_{p^*}\big]^2,$$
$$R^{\mathrm{wst}}_{S}(\theta) = \max_{p \in \{1,2,\infty\}} \big[\|X\theta - y\|_2 + \sqrt{n}\,\epsilon_p \|\theta\|_{p^*}\big]^2,$$
$$R^{\mathrm{avg}}_{S}(\theta) = \mathbb{E}_{p \sim \{1,2,\infty\}} \big[\|X\theta - y\|_2 + \sqrt{n}\,\epsilon_p \|\theta\|_{p^*}\big]^2,$$
where $\|\cdot\|_{p^*}$ denotes the dual norm of $\|\cdot\|_p$.
The proof is deferred to Appendix A. From Proposition 4.1, the loss landscape of adversarial training is non-smooth
because of the term $\|\theta\|_{p^*}$. Specifically, the loss function of $\ell_2$ adversarial training is non-smooth at $\theta = 0$.
For $\ell_1$ adversarial training, the loss function is non-smooth wherever $|\theta_i| = |\theta_j|$ for some $i \ne j$. For $\ell_\infty$ adversarial training,
the loss function is non-smooth wherever $\theta_i = 0$ for some $i$. For adversarial training for multiple perturbations, the loss
function is non-smooth both where $\theta_i = 0$ for some $i$ and where $|\theta_i| = |\theta_j|$ for some $i \ne j$. The non-smooth region of the loss function of
ATMP is the union of those of the single-perturbation cases. Different $\ell_p$ adversaries therefore contribute differently
to the smoothness of the loss function of ATMP.
In Fig. 2, we give a numerical simulation and show the loss landscape in a two-dimensional case.
In $\ell_2$ adversarial training, the loss landscape is smooth almost everywhere, except at the origin. In the $\ell_1$
and $\ell_\infty$ cases, the non-smooth region is a 'cross'. In the cases of WST and AVG, the non-smooth region is
the union of two 'crosses'.
"#$Worst&case All&perturbations
Figure 2: Loss landscape of adversarial linear regression for single and multiple perturbations.
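For readers who want to reproduce a picture like Fig. 2, here is a minimal NumPy sketch that evaluates two-dimensional adversarial risk surfaces analogous to those in Proposition 4.1, using the per-sample closed form $\max_{\|\delta\|_p \le \epsilon_p} (\theta^\top(x+\delta) - y)^2 = (|\theta^\top x - y| + \epsilon_p \|\theta\|_{p^*})^2$ with $p^*$ the dual norm; the data and $\epsilon$ values are synthetic placeholders.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 2))
y = X @ np.array([1.0, -0.5]) + 0.1 * rng.normal(size=50)
eps = 0.5
dual = {1: np.inf, 2: 2, np.inf: 1}  # dual exponent p* for p = 1, 2, inf

def adv_risk(theta, p):
    # average per-sample worst-case squared loss for an l_p-bounded perturbation
    margins = np.abs(X @ theta - y) + eps * np.linalg.norm(theta, ord=dual[p])
    return np.mean(margins ** 2)

def wst_risk(theta):
    return max(adv_risk(theta, p) for p in (1, 2, np.inf))

def avg_risk(theta):
    return np.mean([adv_risk(theta, p) for p in (1, 2, np.inf)])

# evaluate each risk on a grid around the origin; the non-smooth 'crosses' of
# Fig. 2 appear as ridges in a contour plot of these surfaces
grid = np.linspace(-2.0, 2.0, 201)
landscape = np.array([[avg_risk(np.array([a, b])) for a in grid] for b in grid])
```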
General nonlinear model Now let us consider general nonlinear models. Following the work of (Sinha
et al., 2017), without loss of generality, let us assume
Assumption 4.1. The function $g$ satisfies the following Lipschitzian smoothness conditions:
$$\|g(\theta_1, z) - g(\theta_2, z)\| \le L \|\theta_1 - \theta_2\|,$$
$$\|\nabla_\theta g(\theta_1, z) - \nabla_\theta g(\theta_2, z)\| \le L_\theta \|\theta_1 - \theta_2\|,$$
$$\|\nabla_\theta g(\theta, z_1) - \nabla_\theta g(\theta, z_2)\| \le L_{\theta z} \|z_1 - z_2\|,$$
$$\|\nabla_z g(\theta_1, z) - \nabla_z g(\theta_2, z)\| \le L_{z\theta} \|\theta_1 - \theta_2\|.$$
Assumption 4.1 assumes that the loss function is smooth (in the zeroth and first order). While the ReLU
activation function is non-smooth, recent works (Allen-Zhu et al., 2019; Du et al., 2019) showed that the loss
function of overparameterized DNNs is semi-smooth, which helps justify Assumption 4.1. Under Assumption 4.1,
the following lemma provides the smoothness of ATMP.
Lemma 4.1. Under Assumption 4.1, assume in addition that $g(\theta, z)$ is locally $\mu_p$-strongly concave in the $\ell_p$-norm for all
$z \in \mathcal{Z}$. Then, for all $\theta_1, \theta_2$ and $z \in \mathcal{Z}$, the following properties hold.
1. (Lipschitz function.) $\|h_{\mathrm{st}}(\theta_1, z) - h_{\mathrm{st}}(\theta_2, z)\| \le L \|\theta_1 - \theta_2\|$.
2. (Gradient Lipschitz.) $\|\nabla_\theta h_{\mathrm{st}}(\theta_1, z) - \nabla_\theta h_{\mathrm{st}}(\theta_2, z)\| \le \beta_{\mathrm{st}} \|\theta_1 - \theta_2\|$, where
$$\beta_{\mathrm{st}} = \begin{cases} L_{\theta z} L_{z\theta}/\mu_p + L_\theta, & \mathrm{st} = p \in \{1,2,\infty\},\\ L_{\theta z} L_{z\theta}/\min_p \mu_p + L_\theta, & \mathrm{st} = \mathrm{wst},\\ \mathbb{E}_{p \sim \{1,2,\infty\}}\, \beta_p, & \mathrm{st} = \mathrm{avg}. \end{cases}$$
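As a small numerical illustration of Lemma 4.1, the sketch below computes the smoothness constants $\beta_{\mathrm{st}}$ for hypothetical values of $L_\theta$, $L_{\theta z}$, $L_{z\theta}$ and the strong-concavity moduli $\mu_p$; none of these numbers come from the paper.

```python
# hypothetical Lipschitz constants and strong-concavity moduli (illustrative only)
L_theta, L_theta_z, L_z_theta = 1.0, 2.0, 2.0
mu = {"1": 0.5, "2": 1.0, "inf": 0.25}

beta = {p: L_theta_z * L_z_theta / mu_p + L_theta for p, mu_p in mu.items()}
beta["wst"] = L_theta_z * L_z_theta / min(mu.values()) + L_theta
beta["avg"] = sum(beta[p] for p in mu) / len(mu)

# a larger beta_st means a less smooth objective; the adaptive smoothness-weighted
# scheme proposed later in the paper (Sec. 5) builds on such per-perturbation constants
print(beta)
```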