
and early stopping, but it still has a large generalization gap. Therefore, it is essential to study this
issue from a theoretical perspective. In this paper, we study the robust overfitting issue of adversarial
training using tools from uniform stability analysis.
Uniform stability analysis (Bousquet and Elisseeff, 2002) has been introduced to measure the generalization gap of learning problems, as an alternative to uniform convergence analysis such as the classical VC-dimension (Vapnik and Chervonenkis, 2015) and Rademacher complexity (Bartlett and Mendelson, 2002). The generalization gap can be bounded in terms of uniform argument stability (UAS). Formally, UAS is the gap between the output parameters θ of running an algorithm A on two datasets S and S′ that differ in at most one sample, denoted as δ(S, S′) = ‖θ(S) − θ(S′)‖. In a standard training problem with n training samples, assuming that the loss function is convex, L-Lipschitz, and β-gradient Lipschitz, running stochastic gradient descent (SGD) with step size α ≤ 1/β for T steps yields a UAS bounded by O(LαT/n) (Hardt et al., 2016). Since the generalization gap is controlled by the number of samples n, this bound (at least partially) explains the good generalization of standard training. Beyond (Hardt et al., 2016), Bassily et al. (2020) considered the case in which the loss function is non-smooth. Without the β-gradient Lipschitz assumption, they showed that the UAS is bounded by O(Lα√T + LαT/n) and provided a lower bound showing that the additional term O(Lα√T) is unavoidable. In adversarial settings, to our knowledge, two works have discussed the stability of adversarial training.
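To make the notion concrete, the following minimal sketch (our own illustration, not code from any of the cited works) estimates UAS empirically: it runs SGD with a shared sample sequence on a synthetic logistic-regression problem for a dataset S and a neighboring dataset S′ that differs in one example, then measures ‖θ(S) − θ(S′)‖. The model, data, and hyperparameters are arbitrary assumptions.

```python
# Minimal sketch: empirically estimating uniform argument stability (UAS).
# The logistic-regression model, synthetic Gaussian data, and hyperparameters
# are illustrative assumptions, not a setup from the cited papers.
import numpy as np

def sgd(X, y, steps=1000, alpha=0.01, seed=0):
    """Run SGD on the logistic loss and return the final parameter vector."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    theta = np.zeros(d)
    for _ in range(steps):
        i = rng.integers(n)                                  # pick one sample
        margin = y[i] * (X[i] @ theta)
        grad = -y[i] * X[i] / (1.0 + np.exp(margin))         # logistic-loss gradient
        theta -= alpha * grad
    return theta

rng = np.random.default_rng(0)
n, d = 200, 10
X = rng.normal(size=(n, d))
y = np.sign(rng.normal(size=n))

# Neighboring dataset S': replace exactly one sample of S.
X_prime, y_prime = X.copy(), y.copy()
X_prime[0], y_prime[0] = rng.normal(size=d), -y[0]

# Same seed => both runs visit the same sample indices, mirroring the coupling
# used in stability arguments. delta(S, S') = ||theta(S) - theta(S')||.
delta = np.linalg.norm(sgd(X, y) - sgd(X_prime, y_prime))
print(f"empirical UAS estimate: {delta:.4f}")
```

Averaging this quantity over many choices of the replaced sample (and many seeds) gives an empirical proxy for the O(LαT/n) behavior described above.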
Firstly, Farnia and Ozdaglar (2021) considered the stability of minimax problems. Their work includes a discussion of an algorithm called GDmax (gradient descent on the maximization of the inner problem), which can be viewed as a general form of adversarial training. They establish the stability of GDmax when the inner problem is further assumed to be strongly concave. Under the strong concavity assumption, the outer function is smooth, so the generalization bound O(LαT/n) applies in this case. However, the inner problem is not strongly concave in practice, and the bound O(LαT/n) does not match the poor generalization of adversarial training observed empirically.
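For reference, here is a minimal sketch of GDmax-style adversarial training in the simplest setting we can write down: a linear classifier with logistic loss and an ℓ∞ perturbation of radius ε, for which the inner maximization has a closed form, so each outer step is a stochastic gradient step on the exact adversarial loss. The model, data, and constants are illustrative assumptions, not the setup analyzed by Farnia and Ozdaglar (2021).

```python
# Minimal sketch of GDmax-style adversarial training (stochastic outer steps).
# Linear model + logistic loss + l_inf ball of radius eps: the inner max is
# attained at delta* = -eps * y * sign(theta), so the adversarial example is
# x_adv = x - eps * y * sign(theta). All choices below are illustrative.
import numpy as np

def adversarial_training(X, y, eps=0.1, steps=1000, alpha=0.01, seed=0):
    rng = np.random.default_rng(seed)
    n, d = X.shape
    theta = np.zeros(d)
    for _ in range(steps):
        i = rng.integers(n)
        # Inner problem: exact worst-case perturbation within the l_inf ball.
        x_adv = X[i] - eps * y[i] * np.sign(theta)
        # Outer problem: gradient step on the adversarial (max) loss;
        # by Danskin's theorem the gradient is taken at the maximizer x_adv.
        margin = y[i] * (x_adv @ theta)
        grad = -y[i] * x_adv / (1.0 + np.exp(margin))
        theta -= alpha * grad
    return theta

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 10))
y = np.sign(rng.normal(size=200))
theta_robust = adversarial_training(X, y, eps=0.1)
print("norm of robust parameters:", np.linalg.norm(theta_robust))
```

In deep networks the inner maximum has no closed form and must be approximated (e.g., by projected gradient ascent), and the resulting adversarial loss is in general non-smooth, which is the difficulty the analysis in this paper targets.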
Another stability analysis of adversarial training is the work of Xing et al. (2021a). Before proposing their algorithm, they use the stability bound O(Lα√T + LαT/n) (Bassily et al., 2020) to characterize the issue of adversarial generalization. However, this bound is ε-independent. Consider two cases: ε → 0 and ε large (e.g., 16/255). In the first case, adversarial training is very close to standard training and has good generalization ability; in the second case, adversarial generalization is very poor. Yet the bound is the same in these two different cases. Therefore, it cannot provide an interpretation of adversarial generalization.
In summary, existing stability-based generalization bounds (Hardt et al., 2016; Bassily et al., 2020) provide only a limited interpretation of adversarial generalization. In this paper, we give a new stability analysis of adversarial training using the notion of η-approximate smoothness. We first show that, under the same assumptions as (Hardt et al., 2016; Xing et al., 2021a), even though the outer function (the adversarial loss) is non-smooth, it is η-approximately smooth (see Definition 4.1), where η is a constant depending linearly on the gradient Lipschitz constant of the inner function and the perturbation intensity ε. We then derive stability-based generalization bounds (Thm. 5.1 to 5.4) for stochastic gradient descent (SGD) on this general class of η-approximately smooth functions, which covers the adversarial loss. Our main result can be summarized in the following equation: running SGD on the adversarial loss for T steps with step size α ≤ 1/β, the excess risk, which is the sum of the generalization error and the optimization error, satisfies
\[
\text{Excess Risk} \;\le\; E_{\mathrm{gen}} + E_{\mathrm{opt}} \;\le\; \underbrace{\overbrace{L\eta T\alpha}^{\text{additional}} \;+\; \overbrace{\frac{2L^2 T\alpha}{n} + \frac{D^2}{T\alpha} + L^2\alpha}^{\text{for standard training}}}_{\text{for adversarial training}}. \tag{1.1}
\]
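As a purely numerical illustration of this decomposition, the snippet below evaluates the right-hand side of (1.1) for a few values of T; the constants L, η, D, n, and α are arbitrary placeholders, not values taken from the paper.

```python
# Evaluate the upper bound in Eq. (1.1) with placeholder constants to see how
# the additional adversarial term L*eta*T*alpha grows linearly in T while the
# optimization term D^2/(T*alpha) shrinks. These constants are illustrative.
L, eta, D, n, alpha = 1.0, 0.5, 1.0, 10_000, 0.01

def excess_risk_bound(T):
    additional = L * eta * T * alpha                                   # adversarial-only term
    standard = 2 * L**2 * T * alpha / n + D**2 / (T * alpha) + L**2 * alpha
    return additional + standard

for T in (10, 100, 1_000, 10_000):
    print(f"T = {T:>6}: bound = {excess_risk_bound(T):.3f}")
```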
The excess risk of adversarial training has an additional term LηTα. We also provide a lower bound on the UAS of running SGD on the adversarial loss. Our results suggest that robust test accuracy decreases in ε when T is large, with a speed between Ω(ε√T) and O(εT). This phenomenon is also observed in practice, and it provides an understanding of robust overfitting from the perspective
of uniform stability. Additionally, we show that a few popular techniques for adversarial training
of uniform stability. Additionally, we show that a few popular techniques for adversarial training
(e.g., early stopping, cyclic learning rate, and stochastic weight averaging) are stability-promoting
in theory and also empirically improve adversarial training. Experiments on SVHN, CIFAR-10,
CIFAR-100, and ImageNet confirm our results. Our contributions are listed as follows: