Stability Analysis and Generalization Bounds of
Adversarial Training
Jiancong Xiao1∗, Yanbo Fan2†, Ruoyu Sun1,3†, Jue Wang2, Zhi-Quan Luo1,3
1The Chinese University of Hong Kong, Shenzhen;
2Tencent AI Lab; 3Shenzhen Research Institute of Big Data
jiancongxiao@link.cuhk.edu.cn, fanyanbo0124@gmail.com,
sunruoyu@cuhk.edu.cn, arphid@gmail.com, luozq@cuhk.edu.cn
Abstract
In adversarial machine learning, deep neural networks can fit the adversarial examples on the training dataset but have poor generalization ability on the test set. This phenomenon is called robust overfitting, and it can be observed when adversarially training neural nets on common datasets, including SVHN, CIFAR-10, CIFAR-100, and ImageNet. In this paper, we study the robust overfitting issue of adversarial training by using tools from uniform stability. One major challenge is that the outer function (as a maximization of the inner function) is nonsmooth, so the standard technique (e.g., (Hardt et al., 2016)) cannot be applied. Our approach is to consider η-approximate smoothness: we show that the outer function satisfies this modified smoothness assumption with η being a constant related to the adversarial perturbation ε. Based on this, we derive stability-based generalization bounds for stochastic gradient descent (SGD) on the general class of η-approximately smooth functions, which covers the adversarial loss. Our results suggest that robust test accuracy decreases in ε when T is large, with a speed between Ω(√Tε) and O(Tε). This phenomenon is also observed in practice. Additionally, we show that a few popular techniques for adversarial training (e.g., early stopping, cyclic learning rate, and stochastic weight averaging) are stability-promoting in theory.
1 Introduction
Figure 1: Experiments of adversarial training on CIFAR-10.
Deep neural networks (DNNs) (Krizhevsky et al., 2012; Hochreiter and Schmidhuber, 1997) have become successful in many machine learning tasks and rarely suffer from overfitting issues (Zhang et al., 2021). A neural network model can be trained to achieve zero training error and generalize well to unseen data. In the setting of adversarial training, however, robust overfitting is a dominant issue (Rice et al., 2020). Specifically, robust overfitting characterizes the training behavior shown in Fig. 1: after a particular epoch, the robust test accuracy (black line) starts to decrease, while the robust training accuracy (blue line) keeps increasing. This phenomenon can be observed in experiments on common datasets, e.g., SVHN, CIFAR-10, CIFAR-100, and ImageNet. Recent works (Gowal et al., 2020; Rebuffi et al., 2021) mitigated the overfitting issue using regularization techniques such as stochastic weight averaging (SWA)
∗This work was done while Jiancong Xiao was a research intern at Tencent AI Lab, China.
†Corresponding authors.
36th Conference on Neural Information Processing Systems (NeurIPS 2022).
arXiv:2210.00960v2 [cs.LG] 31 Oct 2022
and early stopping, but a large generalization gap remains. It is therefore essential to study this issue from a theoretical perspective. In this paper, we study the robust overfitting issue of adversarial training by using tools from uniform stability.
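As a rough quantitative picture of the curves in Fig. 1 (the accuracy values below are made up for illustration, not measurements), robust overfitting shows up as a growing gap between robust training and robust test accuracy after some epoch, which is also what early stopping monitors:

```python
# Toy illustration of robust overfitting: robust training accuracy keeps rising
# while robust test accuracy peaks and then decays. The numbers are made up;
# in practice they would come from evaluating PGD accuracy after each epoch.
robust_train_acc = [0.35, 0.45, 0.52, 0.58, 0.63, 0.68, 0.73, 0.78, 0.82, 0.86]
robust_test_acc  = [0.33, 0.40, 0.44, 0.47, 0.48, 0.47, 0.45, 0.43, 0.41, 0.39]

best_epoch = max(range(len(robust_test_acc)), key=lambda t: robust_test_acc[t])
for epoch, (tr, te) in enumerate(zip(robust_train_acc, robust_test_acc)):
    print(f"epoch {epoch}: robust train {tr:.2f}, robust test {te:.2f}, gap {tr - te:.2f}")
print(f"early stopping would keep the checkpoint from epoch {best_epoch} "
      f"(robust test accuracy {robust_test_acc[best_epoch]:.2f})")
```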
Uniform stability analysis (Bousquet and Elisseeff, 2002) has been introduced to measure the generalization gap of learning algorithms, as an alternative to uniform convergence analysis such as classical VC-dimension (Vapnik and Chervonenkis, 2015) and Rademacher complexity (Bartlett and Mendelson, 2002). The generalization gap can be bounded in terms of uniform argument stability (UAS). Formally, UAS is the gap between the output parameters θ of running an algorithm A on two datasets S and S′ that differ in at most one sample, denoted as δ(S, S′) = ‖θ(S) − θ(S′)‖. In a standard training problem with n training samples, assuming that the loss function is convex, L-Lipschitz and β-gradient Lipschitz, running stochastic gradient descent (SGD) with step size α ≤ 1 for T steps, the UAS is bounded by O(LαT/n) (Hardt et al., 2016). Since the generalization gap is controlled by the number of samples n, this bound (at least partially) explains the good generalization of a standard training problem. Beyond (Hardt et al., 2016), Bassily et al. (2020) considered the case where the loss function is non-smooth. Without the β-gradient Lipschitz assumption, they showed that the UAS is bounded by O(α√T + LαT/n) and provided a lower bound showing that the additional term O(α√T) is unavoidable. In adversarial settings, two works have discussed the stability of adversarial training, to our knowledge.
Firstly, Farnia and Ozdaglar (2021) considered the stability of minimax problems. Their work includes a discussion of an algorithm called GDmax (gradient descent on the maximization of the inner problem), which can be viewed as a general form of adversarial training. It provides the stability of GDmax when the inner problem is further assumed to be strongly concave. Under the strong-concavity assumption, the outer function is smooth. Then, the generalization bound O(T/n) can be applied in this case. However, the inner problem is not strongly concave in practice. The bound O(LαT/n) does not match the poor generalization of adversarial training observed in practice.
Another stability analysis of adversarial training is the work of (Xing et al., 2021a). Before proposing their algorithm, they use the stability bound O(α√T + LαT/n) (Bassily et al., 2020) to characterize the issue of adversarial generalization. However, the bound is ε-independent. Let us consider two cases: ε ≈ 0 and ε large (e.g., 16/255). In the first case, adversarial training is very close to standard training and has good generalization ability. In the second case, adversarial generalization is very poor. The bound is the same in these two different cases. Therefore, it cannot provide an interpretation of adversarial generalization.
In summary, existing stability-based generalization bounds (Hardt et al., 2016; Bassily et al., 2020) have limited interpretation on adversarial generalization. In this paper, we provide a new stability analysis of adversarial training using the notion of η-approximate smoothness. We first show that, under the same assumptions as (Hardt et al., 2016; Xing et al., 2021a), even though the outer function (the adversarial loss) is non-smooth, it is η-approximately smooth (see Definition 4.1), where η is a constant depending linearly on the gradient Lipschitz constant of the inner function and the perturbation intensity ε. Then, we derive stability-based generalization bounds (Thm. 5.1 to 5.4) for stochastic gradient descent (SGD) on this general class of η-approximately smooth functions, which covers the adversarial loss. Our main result can be summarized in the following equation. Running SGD on the adversarial loss for T steps with step size α ≤ 1, the excess risk, which is the sum of the generalization error and the optimization error, satisfies
Excess Risk ≤ Egen + Eopt ≲ εTα + 2L²Tα/n + D²/(Tα) + L²α.    (1.1)

Here εTα is the additional term introduced by adversarial training; the remaining three terms already constitute the corresponding excess-risk bound for standard training, and the whole right-hand side is the bound for adversarial training.
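To see how the two parts of (1.1) scale, the following sketch plugs purely illustrative constants (not values from the paper or its experiments) into the bound and prints the standard-training terms next to the additional εTα term as T grows:

```python
# Illustrative evaluation of the excess-risk bound in Eq. (1.1).
# All constants below are assumptions made for this example only.
L, D, n = 1.0, 1.0, 50_000        # Lipschitz constant, diameter, number of samples
eps = 8 / 255                     # adversarial perturbation budget
for T in [1_000, 10_000, 100_000]:
    alpha = 1.0 / T ** 0.5        # a common step-size choice, alpha = 1/sqrt(T)
    standard = 2 * L**2 * T * alpha / n + D**2 / (T * alpha) + L**2 * alpha
    additional = eps * T * alpha  # the extra term specific to adversarial training
    print(f"T={T:>6}: standard-training terms ~ {standard:.4f}, additional term ~ {additional:.4f}")
```

With this step-size choice the standard-training terms shrink or stay bounded while the additional term grows with T, which is the scaling behind robust overfitting.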
The excess risk of adversarial training has an additional term εTα. We also provide a lower bound on the UAS of running SGD on the adversarial loss. Our results suggest that robust test accuracy decreases in ε when T is large, with a speed between Ω(√Tε) and O(Tε). This phenomenon is also observed in practice. It provides an understanding of robust overfitting from the perspective of uniform stability. Additionally, we show that a few popular techniques for adversarial training (e.g., early stopping, cyclic learning rate, and stochastic weight averaging) are stability-promoting in theory and also empirically improve adversarial training. Experiments on SVHN, CIFAR-10, CIFAR-100, and ImageNet confirm our results. Our contributions are listed as follows:
• Main results: we derive stability-based generalization bounds for adversarial training using the notion of η-approximate smoothness. Based on this, we provide an analysis to understand robust overfitting.
• We provide the stability analysis of a few popular techniques for adversarial training and show that they are indeed stability-promoting.
• We provide experiments on SVHN, CIFAR-10, CIFAR-100, and ImageNet. The results verify the generalization bounds.
• Technical contribution: we develop a set of properties of η-approximately smooth functions, which might be useful in other tasks.
The paper is organized as follows. After discussing the related work in Sec. 2, the rest of the paper contains two parts. The first part is the technical part that derives the stability bounds; the second part analyzes robust overfitting. Specifically, in the first part, Sec. 3 introduces the preliminaries on UAS, Sec. 4 provides the lemmas and properties of approximately smooth functions, and Sec. 5 gives the stability bounds. In the second part, Sec. 6 analyzes robust overfitting in theoretical settings and Sec. 7 presents the experiments.
2 Related Work
Adversarial Attacks and Defense. Starting from the work of (Szegedy et al., 2013), it has been commonly realized that deep neural networks are highly susceptible to imperceptible corruptions of the input data (Goodfellow et al., 2014; Carlini and Wagner, 2017; Madry et al., 2017). One line of work aims at training neural networks that are robust to such small perturbations (Wu et al., 2020; Gowal et al., 2020; Zhang et al., 2020), and another line of work aims at designing more powerful adversarial attack algorithms (Athalye et al., 2018; Tramer et al., 2020; Fan et al., 2020; Xiao et al., 2022c; Qin et al., 2022). A series of works considered adversarial robustness in black-box settings (Chen et al., 2017; Qin et al., 2021). Semi-supervised learning has been used to improve adversarial robustness (Carmon et al., 2019; Li et al., 2022). Fast adversarial training (Wong et al., 2020; Huang et al., 2022) was introduced to save training time.
Adversarial Generalization. A series of works tried to explain adversarial generalization in the uniform convergence framework, including VC-dimension (Attias et al., 2021; Montasser et al., 2019) and Rademacher complexity (Khim and Loh, 2018; Yin et al., 2019; Awasthi et al., 2020; Xiao et al., 2022a). Uniform algorithmic stability is another framework to study adversarial generalization (Farnia and Ozdaglar, 2021; Xing et al., 2021a; Xiao et al., 2022b). The works of (Schmidt et al., 2018; Raghunathan et al., 2019; Zhai et al., 2019) have shown that in some scenarios achieving adversarial generalization requires more data. The works of (Xing et al., 2021b,c; Javanmard et al., 2020) studied generalization in the setting of adversarial linear regression. (Sinha et al., 2017) studied the generalization of distributional robustness. The works of (Taheri et al., 2020; Javanmard et al., 2020; Dan et al., 2020) analyzed adversarial generalization in Gaussian mixture models.
Uniform Stability. Stability is a classical approach to providing generalization bounds. It can be traced back to the work of (Rogers and Wagner, 1978). A few decades later, it was further developed for analyzing generalization bounds in statistical learning problems (Bousquet and Elisseeff, 2002). These bounds have been significantly improved in a recent sequence of works (Feldman and Vondrak, 2018, 2019). The work of (Chen et al., 2018) derived minimax lower bounds for the excess risk and discussed the optimal trade-off between stability and convergence. Ozdaglar et al. (2022) considered generalization metrics for minimax optimizers.
3 Preliminaries of Stability
Consider the following setting of statistical learning. There is an unknown distribution D over examples from some space Z. We receive a sample dataset S = {z1, . . . , zn} of n examples drawn i.i.d. from D. The population risk and the empirical risk are defined as

RD(θ) := E_{z∼D} h(θ, z)  and  RS(θ) := (1/n) Σ_{i=1}^{n} h(θ, zi),
respectively, where h(·, ·) is the loss function.
Risk Decomposition. Let θ* and θ̄ be the optimal solutions of RD(θ) and RS(θ), respectively. Then, for the algorithm output θ̂ = A(S), the excess risk can be decomposed as

RD(θ̂) − RD(θ*) = [RD(θ̂) − RS(θ̂)] + [RS(θ̂) − RS(θ̄)] + [RS(θ̄) − RS(θ*)] + [RS(θ*) − RD(θ*)],

where the first bracket is the generalization gap Egen, the second is the optimization gap Eopt, the third is at most 0 (since θ̄ minimizes RS), and the last has zero expectation over S.
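As a quick sanity check of this identity (a toy sketch, not part of the paper), consider one-dimensional mean estimation with the squared loss h(θ, z) = (θ − z)², for which the population risk has the closed form RD(θ) = (θ − µ)² + σ² and the four terms can be computed explicitly:

```python
import numpy as np

rng = np.random.default_rng(0)
mu, sigma, n = 1.0, 2.0, 100
S = rng.normal(mu, sigma, size=n)              # training sample z_1, ..., z_n

def R_S(theta):                                # empirical risk on S
    return np.mean((theta - S) ** 2)

def R_D(theta):                                # population risk, closed form for this toy model
    return (theta - mu) ** 2 + sigma ** 2

theta_star = mu                                # minimizer of R_D
theta_bar = S.mean()                           # minimizer of R_S
theta_hat = theta_bar + 0.3                    # stand-in for an algorithm output A(S)

E_gen  = R_D(theta_hat) - R_S(theta_hat)       # generalization gap
E_opt  = R_S(theta_hat) - R_S(theta_bar)       # optimization gap
third  = R_S(theta_bar) - R_S(theta_star)      # always <= 0
fourth = R_S(theta_star) - R_D(theta_star)     # zero in expectation over S

excess = R_D(theta_hat) - R_D(theta_star)
print(f"excess risk {excess:.4f} = sum of the four terms {E_gen + E_opt + third + fourth:.4f}")
```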
To control the excess risk, we need to control the generalization gap Egen and the optimization gap Eopt. In the rest of the paper, we use Egen and Eopt to denote the expectations of the generalization and optimization gaps. To bound the generalization gap of a model θ̂ = A(S) trained by a randomized algorithm A, we employ the following notion of uniform stability.
Definition 3.1. A randomized algorithm A is ε-uniformly stable if, for all datasets S, S′ ∈ Z^n such that S and S′ differ in at most one example, we have

sup_z EA[h(A(S); z) − h(A(S′); z)] ≤ ε.    (3.1)
Here, the expectation is taken only over the internal randomness of A. We recall the important theorem that uniform stability implies generalization in expectation (Hardt et al., 2016).

Theorem 3.1 (Generalization in expectation). Let A be ε-uniformly stable. Then, the expected generalization gap satisfies

|Egen| = |E_{S,A}[RD[A(S)] − RS[A(S)]]| ≤ ε.
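As a concrete (toy) illustration of these definitions, one can estimate stability empirically by running SGD with shared internal randomness on a dataset S and on a neighbouring dataset S′ that replaces one example, then comparing the outputs. The sketch below (least-squares regression on synthetic data; all details are illustrative assumptions, not the paper's setting) reports one draw of the UAS δ(S, S′) = ‖θ(S) − θ(S′)‖ and of the worst-case loss gap from Definition 3.1:

```python
import numpy as np

rng = np.random.default_rng(1)
n, d, T, alpha = 200, 5, 1000, 0.05
X = rng.normal(size=(n, d))
w_true = rng.normal(size=d)
y = X @ w_true + 0.1 * rng.normal(size=n)

def sgd(X, y, T=T, alpha=alpha, seed=0):
    """Plain SGD on the squared loss h(theta, (x, y)) = 0.5 * (x @ theta - y) ** 2."""
    r = np.random.default_rng(seed)           # shared seed couples the two runs
    theta = np.zeros(X.shape[1])
    for _ in range(T):
        i = r.integers(len(y))                # sampling with replacement
        theta -= alpha * (X[i] @ theta - y[i]) * X[i]
    return theta

# Neighbouring dataset S': replace a single example of S.
X2, y2 = X.copy(), y.copy()
X2[0], y2[0] = rng.normal(size=d), rng.normal()

theta_S, theta_S2 = sgd(X, y), sgd(X2, y2)
uas = np.linalg.norm(theta_S - theta_S2)      # one draw of ||theta(S) - theta(S')||

# Worst-case loss gap over held-out examples z = (x, y), cf. Definition 3.1.
Xz = rng.normal(size=(100, d))
yz = Xz @ w_true
loss = lambda th: 0.5 * (Xz @ th - yz) ** 2
gap = np.max(np.abs(loss(theta_S) - loss(theta_S2)))
print(f"UAS estimate: {uas:.4f}   worst-case loss gap estimate: {gap:.4f}")
```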
Therefore, we turn to the properties of iterative algorithms that control their uniform stability.
4 Stability of Adversarial Training
Adversarial Surrogate Loss. In adversarial training, we consider the following surrogate loss

h(θ; z) = max_{‖z−z′‖_p ≤ ε} g(θ; z′),    (4.1)

where g(θ; z) is the loss function of the standard counterpart, ‖·‖_p is the ℓp-norm with p ≥ 1, and ε is the perturbation intensity. Usually, g can also be written in the form ℓ(fθ(x); y), where fθ is the neural network to be trained and (x, y) is the input-label pair.
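In practice, the inner maximization in Eq. (4.1) is only solved approximately; projected gradient ascent (PGD) is the most common choice. The following sketch (PyTorch, with an illustrative ℓ∞ budget and step sizes; it is not claimed to be the exact attack used in the paper's experiments) computes an approximate h(θ; z) for a batch:

```python
import torch
import torch.nn.functional as F

def adversarial_loss(model, x, y, eps=8/255, steps=10, step_size=2/255):
    """Approximate h(theta; z) = max_{||x' - x||_inf <= eps} g(theta; (x', y)) with PGD."""
    x_adv = x.clone().detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        with torch.no_grad():
            x_adv = x_adv + step_size * grad.sign()     # gradient ascent on the inner problem
            x_adv = x + (x_adv - x).clamp(-eps, eps)    # project back onto the eps-ball around x
            x_adv = x_adv.clamp(0.0, 1.0)               # keep a valid input range
        x_adv = x_adv.detach()
    return F.cross_entropy(model(x_adv), y)             # outer loss evaluated at the maximizer

# Usage sketch with a toy model and random stand-in "images".
model = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(3 * 32 * 32, 10))
x, y = torch.rand(8, 3, 32, 32), torch.randint(0, 10, (8,))
adversarial_loss(model, x, y).backward()                # gradients w.r.t. theta for the outer SGD step
```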
We assume that the loss function g satisfies the following smoothness assumption.
Assumption 4.1. The function g satisfies the following Lipschitzian smoothness conditions:

‖g(θ1, z) − g(θ2, z)‖ ≤ L ‖θ1 − θ2‖,
‖∇θ g(θ1, z) − ∇θ g(θ2, z)‖ ≤ Lθ ‖θ1 − θ2‖,
‖∇θ g(θ, z1) − ∇θ g(θ, z2)‖ ≤ Lz ‖z1 − z2‖_p,

where ‖·‖ is the Euclidean norm.
Assumption 4.1 states that the loss function is smooth, which is also assumed in the stability literature (Farnia and Ozdaglar, 2021; Xing et al., 2021a), as well as in the convergence analysis literature (Wang et al., 2019; Liu et al., 2020). While the ReLU activation function is non-smooth, recent works (Allen-Zhu et al., 2019; Du et al., 2019) showed that the loss function of over-parameterized DNNs is semi-smooth, which helps justify Assumption 4.1. Under Assumption 4.1, the loss function of adversarial training satisfies the following lemma (Liu et al., 2020).
Lemma 4.1. Let h be the adversarial loss defined in Eq. (4.1) and let g satisfy Assumption 4.1. For all θ1, θ2 and z ∈ Z, the following properties hold.

1. (Lipschitz function.) ‖h(θ1, z) − h(θ2, z)‖ ≤ L ‖θ1 − θ2‖.
2. For every subgradient d(θ, z) ∈ ∂θ h(θ, z), we have ‖d(θ1, z) − d(θ2, z)‖ ≤ Lθ ‖θ1 − θ2‖ + 2Lzε.
If we further assume that g(θ, z) is µ-strongly concave in z, the adversarial surrogate loss h(θ, z) is also smooth (Sinha et al., 2017). Therefore, the uniform stability of adversarial training follows from (Hardt et al., 2016; Farnia and Ozdaglar, 2021); see Appendix B.3. However, g(θ, z) is not strongly concave in practice, so the above results provide limited explanations of the poor generalization of adversarial training. We therefore discuss the generalization properties of adversarial training under Lemma 4.1.2.
4.1 Basic Properties of Approximate Smoothness
To simplify the notation, we use h(θ) as shorthand for h(θ, z). To simplify the argument, we consider a differentiable function h; the results can be extended to non-differentiable cases. Lemma 4.1 motivates us to analyze functions satisfying the following modified smoothness assumption, which we call the approximate smoothness assumption.
Definition 4.1. Let β > 0 and η > 0. We say a differentiable function h(θ) is η-approximately β-gradient Lipschitz if, for all θ1 and θ2, we have

‖∇h(θ1) − ∇h(θ2)‖ ≤ β ‖θ1 − θ2‖ + η.
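For intuition about Definition 4.1, here is a small one-dimensional sketch (a toy example constructed for illustration, not taken from the paper). With g(θ, z) = 0.5(θ − z)², the surrogate h(θ, z) = max_{|δ|≤ε} g(θ, z + δ) = 0.5(|θ − z| + ε)² has a gradient that jumps by 2ε at θ = z, so Lemma 4.1.2 predicts it is 2Lzε-approximately β-gradient Lipschitz with β = Lθ = Lz = 1; the code estimates the smallest η numerically:

```python
import numpy as np

# Toy example: g(theta, z) = 0.5 * (theta - z) ** 2 has L_theta = 1 and L_z = 1.
# Its adversarial surrogate h(theta, z) = 0.5 * (|theta - z| + eps) ** 2 has a
# gradient that jumps by 2 * eps at theta = z.
eps, beta, z = 0.1, 1.0, 0.0

def grad_h(theta):
    return (np.abs(theta - z) + eps) * np.sign(theta - z)

rng = np.random.default_rng(0)
pairs = rng.uniform(-1, 1, size=(100_000, 2))
violation = (np.abs(grad_h(pairs[:, 0]) - grad_h(pairs[:, 1]))
             - beta * np.abs(pairs[:, 0] - pairs[:, 1]))
print(f"estimated eta: {violation.max():.4f}   (theory: 2 * eps = {2 * eps:.4f})")
```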
In Definition 4.1, η controls the smoothness of the loss function h(·). When η = 0, the function h is gradient Lipschitz. When η → +∞, h is a general non-smooth function. As discussed above, the adversarial surrogate loss is 2Lzε-approximately smooth. As far as we know, this assumption is rarely discussed in the optimization literature, since it cannot improve the convergence rate over the general non-smooth setting. But it affects uniform stability, as we will discuss later. We first need to develop the basic properties of approximately smooth functions.
Lemma 4.2. Assume that the function h is η-approximately β-gradient Lipschitz. For all θ1, θ2 and z ∈ Z, the following properties hold.

1. (η-approximate descent lemma.)
h(θ1) − h(θ2) ≤ ∇h(θ2)ᵀ(θ1 − θ2) + (β/2) ‖θ1 − θ2‖² + η ‖θ1 − θ2‖.

2. (η-approximately co-coercive.) Assume in addition that h(θ, z) is convex in θ for all z ∈ Z, and let [·]+ = max(0, ·). Then
⟨∇h(θ1) − ∇h(θ2), θ1 − θ2⟩ ≥ (1/β) [‖∇h(θ1) − ∇h(θ2)‖ − η]+².
We defer the proof to Appendix A. Since the loss function is L-Lipschitz for every example z, we have E|h(θ1, z) − h(θ2, z)| ≤ L E‖θ1 − θ2‖ for all z ∈ Z. To obtain the stability-based generalization bounds, we need to analyze the difference ‖θ1^T − θ2^T‖, where θ1^T and θ2^T are the outputs of running SGD on the adversarial surrogate loss for T iterations on two datasets that differ in only one sample. Next, we provide the recursive bounds under the approximate smoothness assumption.
Algorithms. We consider stochastic gradient descent on the adversarial surrogate loss, i.e.,

θ_{t+1} = θ_t − α_t ∇θ h(θ_t, z_{i_t}),    (4.2)

where α_t is the step size at iteration t and z_{i_t} is the sample chosen at iteration t. We consider two popular schemes for choosing the example indices i_t. Sampling with replacement: one option is to pick i_t ∼ Uniform{1, . . . , n} at each step. Fixed permutation: the other is to choose a random permutation of {1, . . . , n} and cycle through the examples repeatedly in the order determined by the permutation. Our results hold for both variants.
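For concreteness, the two index-selection schemes can be written as follows (an illustrative sketch; indices are 0-based in the code while the text uses {1, . . . , n}):

```python
import numpy as np

def with_replacement(n, T, seed=0):
    """Sampling with replacement: i_t ~ Uniform{0, ..., n-1} at every step."""
    rng = np.random.default_rng(seed)
    for _ in range(T):
        yield int(rng.integers(n))

def fixed_permutation(n, T, seed=0):
    """Fixed permutation: draw one random permutation and cycle through it."""
    rng = np.random.default_rng(seed)
    perm = rng.permutation(n)
    for t in range(T):
        yield int(perm[t % n])

# The resulting sequence i_1, ..., i_T is plugged into the update (4.2).
print(list(with_replacement(5, 10)))
print(list(fixed_permutation(5, 10)))
```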
Properties of Update Rules. We define Gα,z(θ) = θ − α∇h(θ, z) to be the update rule of SGD. The following lemma holds.

Lemma 4.3. Assume that the function h is η-approximately β-gradient Lipschitz. For all θ1, θ2 and z ∈ Z, we have

1. (αη-approximately (1 + αβ)-expansive.) ‖Gα,z(θ1) − Gα,z(θ2)‖ ≤ (1 + αβ) ‖θ1 − θ2‖ + αη.
2. (αη-approximately non-expansive.) Assume in addition that h(θ, z) is convex in θ for all z ∈ Z; then, for α ≤ 1, we have ‖Gα,z(θ1) − Gα,z(θ2)‖ ≤ ‖θ1 − θ2‖ + αη.
3. (αη-approximately (1 − αγ)-contractive.) Assume in addition that h(θ, z) is γ-strongly convex in θ for all z ∈ Z; then, for α ≤ 1, we have ‖Gα,z(θ1) − Gα,z(θ2)‖ ≤ (1 − αγ) ‖θ1 − θ2‖ + αη.
The proof of Lemma 4.3 is based on Lemma 4.2 and is deferred to Appendix A. Lemma 4.3 bounds the recursive growth of the distance ‖θ1 − θ2‖ from iteration t to t + 1. Based on this, we can recursively bound the distance ‖θ1^T − θ2^T‖ and then obtain the stability-based generalization bounds.
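To preview how Lemma 4.3 yields a stability bound (a back-of-the-envelope sketch in the spirit of (Hardt et al., 2016) under the convex case of Lemma 4.3.2, with sampling with replacement and illustrative constants; the formal statements are Thms. 5.1 to 5.4), note that when both runs pick the same index the distance grows by at most αη, and when they pick the differing sample it grows by at most 2αL, so in expectation δT ≲ Tαη + 2TαL/n:

```python
# Back-of-the-envelope accumulation of the recursion suggested by Lemma 4.3.2
# (convex case, sampling with replacement). All constants are illustrative.
L, L_z, eps = 1.0, 1.0, 8 / 255
n, T, alpha = 50_000, 100_000, 0.01
eta = 2 * L_z * eps                  # approximate-smoothness offset of the adversarial loss

delta = 0.0                          # expected distance E||theta_1^t - theta_2^t||
for _ in range(T):
    same = (1 - 1 / n) * (delta + alpha * eta)   # same index drawn: approximately non-expansive
    diff = (1 / n) * (delta + 2 * alpha * L)     # differing sample drawn: gradients bounded by L
    delta = same + diff

closed_form = T * alpha * eta + 2 * T * alpha * L / n
print(f"unrolled recursion: {delta:.3f}   closed form T*alpha*eta + 2*T*alpha*L/n: {closed_form:.3f}")
# Multiplying by the Lipschitz constant L turns this parameter distance into a
# generalization-gap bound; the eta*T*alpha part is of the same order as the
# additional term in Eq. (1.1).
```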