Accelerating Certified Robustness Training via
Knowledge Transfer
Pratik Vaishnavi
Stony Brook University
pvaishnavi@cs.stonybrook.edu
Kevin Eykholt
IBM Research
kheykholt@ibm.com
Amir Rahmati
Stony Brook University
amir@cs.stonybrook.edu
Abstract
Training deep neural network classifiers that are certifiably robust against adversarial attacks is critical to ensuring the security and reliability of AI-controlled systems. Although numerous state-of-the-art certified training methods have been developed, they are computationally expensive and scale poorly with respect to both dataset and network complexity. Widespread usage of certified training is further hindered by the fact that periodic retraining is necessary to incorporate new data and network improvements. In this paper, we propose Certified Robustness Transfer (CRT), a general-purpose framework for reducing the computational overhead of any certifiably robust training method through knowledge transfer. Given a robust teacher, our framework uses a novel training loss to transfer the teacher's robustness to the student. We provide theoretical and empirical validation of CRT. Our experiments on CIFAR-10 show that CRT speeds up certified robustness training by 8× on average across three different architecture generations while achieving comparable robustness to state-of-the-art methods. We also show that CRT can scale to large-scale datasets like ImageNet.
1 Introduction
Deep Neural Networks (DNNs) are susceptible to adversarial evasion attacks [31, 9] that add a small amount of carefully crafted imperceptible noise to an input to reliably trigger misclassification. As a defense, numerous training methods have been proposed [25, 40, 35] to grant a classifier empirical robustness. But in the absence of any provable guarantees for this robustness, these defenses were frequently broken [1, 32]. These failures have motivated the development of training methods that grant certifiable/provable robustness to a classifier, hence safeguarding them against all attacks (known or unknown) within a pre-determined threat model. Such methods are broadly categorized as either deterministic or probabilistic [23]. Deterministic robustness training methods [12, 26, 33, 34, 28, 10, 41, 30] rely on computing provable bounds on the output neurons of a classifier for a given perturbation budget in the input space. However, the deterministic robustness guarantees provided by these methods come at a high computational cost. Probabilistic robustness training methods address this limitation by providing highly probable (e.g., with 0.99 probability) robustness guarantees at a greatly reduced computational cost. Within this category, randomized smoothing-based methods [19, 3, 29, 22, 20, 7, 37, 39, 16, 15] are considered the state-of-the-art for certifiable robustness in the $\ell_2$-space. Even so, these training methods remain an order of magnitude slower than standard training. In commercial applications where constant model re-deployment occurs to provide improvements (see Figure 1), re-training using computationally expensive methods is burdensome.
Figure 1: Evolution of DNN architectures on the ImageNet dataset. We plot the performance (top-1
accuracy) and the number of parameters of a few popular architectures (year of release is noted in
brackets). Newer generations attempt to improve performance and/or reduce network parameters.
In this work, we reduce the training overhead of randomized smoothing-based robustness training methods with minimal impact on the robustness achieved. We propose Certified Robustness Transfer (CRT), a knowledge transfer framework that significantly speeds up the process of training $\ell_2$ certifiably robust image classifiers. Given a pre-trained classifier that is certifiably robust (i.e., the teacher), CRT trains a new classifier (i.e., the student) that has comparable levels of robustness in a fraction of the time required by state-of-the-art methods. CRT brings down the cost of training certifiably robust image classifiers to be comparable to standard training while preserving state-of-the-art robustness. On CIFAR-10, CRT speeds up training by an average of 8× across three different architecture generations compared to a state-of-the-art robustness training method [15]. Furthermore, we show that state-of-the-art robustness training is only necessary to train the initial classifier. Afterward, CRT can be continuously reused to transfer robustness in order to expedite future model re-deployments and greatly reduce costs associated with computational resources. Our contributions can be summarized as follows:
• We present Certified Robustness Transfer (CRT), the first framework, to our knowledge, that can transfer the robustness of a certifiably robust teacher classifier to a new student classifier. CRT greatly reduces the time required to train certifiably robust image classifiers relative to existing state-of-the-art methods while achieving comparable or better robustness.
• We provide a theoretical understanding of CRT, showing how our approach of matching outputs enables robustness transfer between the student and teacher irrespective of the certified robustness training method used to train the teacher.
• On CIFAR-10, we show that CRT trains certifiably robust classifiers on average 8× faster than a state-of-the-art method while having comparable or better Average Certified Radius (by 8% in the best case). Furthermore, CRT reduces the cumulative computational cost of training three classifiers by 87.84%.
• We also show that CRT can be reused in a recursive manner, thus supporting a continuous re-deployment scenario (e.g., in commercial applications). Finally, we show that CRT remains effective on a large-scale dataset, ImageNet.
2 Background
In this section, we briefly introduce certified robustness and discuss notable existing methods for
training certifiably robust image classifiers using randomized smoothing.
2.1 Preliminaries
Problem Setup. Consider a neural network classifier $f$ parameterized by $\theta$ (denoted $f_\theta$) trained to map a given input $x \in \mathbb{R}^d$ to a set of discrete labels $\mathcal{Y}$ using a set of i.i.d. samples $S = \{(x_1, y_1), (x_2, y_2), \dots, (x_n, y_n)\}$ drawn from a data distribution $\mathcal{D}$. The output of the classifier can be written as $f_\theta(x) = \arg\max_{c \in \mathcal{Y}} z^c_\theta(x)$. Here $z_\theta(x)$ is the softmax output of the classifier and $z^c_\theta(x)$ denotes the probability that image $x$ belongs to class $c$.
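To make this notation concrete, the following is a minimal sketch (not part of the original paper) of the base classifier's softmax output $z_\theta(x)$ and its $\arg\max$ prediction $f_\theta(x)$, assuming a hypothetical PyTorch module `net` that returns raw logits:

```python
import torch
import torch.nn.functional as F

def softmax_output(net, x):
    """z_theta(x): softmax output of the base classifier."""
    return F.softmax(net(x), dim=-1)

def predict(net, x):
    """f_theta(x) = argmax_c z^c_theta(x)."""
    return softmax_output(net, x).argmax(dim=-1)
```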
Certified Robustness via Randomized Smoothing. The robustness of the classifier $f_\theta$ for a given input pair $(x, y)$ is defined using the radius of the largest $\ell_2$ ball centered at $x$ within which $f_\theta$ has a constant output $y$. This radius is referred to as the robust radius and it can mathematically be expressed as:

$$R(f_\theta; x, y) = \begin{cases} \inf_{f_\theta(x') \neq f_\theta(x)} \|x' - x\|_2, & \text{when } f_\theta(x) = y \\ 0, & \text{when } f_\theta(x) \neq y \end{cases} \quad (1)$$
Within this $\ell_2$-neighborhood of $x$, $f_\theta$ is considered to be certifiably robust. Therefore, to improve the robustness of a classifier, one needs to maximize this robust radius corresponding to any point sampled from the given data distribution. Directly maximizing the robust radius of a DNN classifier is an NP-hard problem [17]. Therefore, several prior works attempt to derive a lower bound for the robust radius [21, 19, 3]. This lower bound, often termed the certified radius, satisfies the following condition: $0 \leq CR(f_\theta; x, y) \leq R(f_\theta; x, y)$, for any $f_\theta$, $(x, y)$. In this paper, we utilize the certified robustness framework derived by Cohen et al. [3] using randomized smoothing. Given a classifier $f_\theta$, they first define the smooth classifier $g_\theta$ as:
Definition 2.1. For a given (base) classifier $f_\theta$ and $\sigma > 0$, the smooth classifier $g_\theta$ corresponding to $f_\theta$ is defined as follows:

$$g_\theta(x) = \arg\max_{c \in \mathcal{Y}} \; \mathbb{P}_{\eta \sim \mathcal{N}(0, \sigma^2 I)}\big(f_\theta(x + \eta) = c\big) \quad (2)$$
Simply put, $g_\theta$ returns the class $c$ which has the highest probability mass under the Gaussian distribution $\mathcal{N}(x, \sigma^2 I)$. Using Theorem 2.2, they proved that if the smooth classifier correctly classifies a given input $x$, it is certifiably robust at $x$. They also provided an analytical form of the $\ell_2$ certified radius at $x$.
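In practice, the class probabilities under Gaussian noise cannot be computed exactly and are estimated by sampling. The sketch below (an assumption-laden illustration, not the procedure of Cohen et al., which additionally uses hypothesis testing to obtain high-confidence bounds) shows a plain Monte Carlo estimate of $g_\theta(x)$ from Definition 2.1; `net`, `sigma`, `num_samples`, and `num_classes` are illustrative placeholders:

```python
import torch

@torch.no_grad()
def smooth_predict(net, x, sigma=0.25, num_samples=1000, num_classes=10):
    """Monte Carlo estimate of g_theta(x) for a single input x (batch size 1)."""
    counts = torch.zeros(num_classes)
    for _ in range(num_samples):
        noisy = x + sigma * torch.randn_like(x)    # eta ~ N(0, sigma^2 I)
        pred = net(noisy).argmax(dim=-1).item()    # f_theta(x + eta)
        counts[pred] += 1
    return counts.argmax().item()                  # class with highest probability mass
```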
Theorem 2.2. Let $f_\theta: \mathbb{R}^d \mapsto \mathcal{Y}$ be a classifier and $g_\theta$ be its smoothed version (as defined in Definition 2.1). For a given input $x \in \mathbb{R}^d$ and corresponding ground truth $y \in \mathcal{Y}$, if $g_\theta$ correctly classifies $x$ as $y$, i.e.,

$$\mathbb{P}_\eta\big(f_\theta(x + \eta) = y\big) \geq \max_{y' \neq y} \mathbb{P}_\eta\big(f_\theta(x + \eta) = y'\big) \quad (3)$$

then $g_\theta$ is provably robust at $x$ within the certified radius $CR$ given by:

$$CR(g_\theta; x, y) = \frac{\sigma}{2}\Big[\Phi^{-1}\big(\mathbb{P}_\eta(f_\theta(x + \eta) = y)\big) - \Phi^{-1}\big(\max_{y' \neq y} \mathbb{P}_\eta(f_\theta(x + \eta) = y')\big)\Big] \quad (4)$$

where $\Phi$ is the c.d.f. of the standard Gaussian distribution.
This certified radius is a tight lower bound of the robust radius defined in Equation 1, i.e., it is impossible to certify $g_\theta$ at $x$ for a radius larger than $CR$.
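As a worked illustration of Equation 4, the following sketch computes the certified radius from estimated top-two class probabilities under Gaussian noise. It is a simplified illustration with point estimates; the actual certification procedure of Cohen et al. replaces these with high-confidence bounds:

```python
from scipy.stats import norm

def certified_radius(p_top, p_runner_up, sigma):
    """CR = sigma/2 * (Phi^{-1}(p_top) - Phi^{-1}(p_runner_up)); 0 if Eq. 3 fails."""
    if p_top <= p_runner_up:          # Equation 3 violated: no certificate at x
        return 0.0
    return 0.5 * sigma * (norm.ppf(p_top) - norm.ppf(p_runner_up))

# Example: p_top = 0.95, p_runner_up = 0.03, sigma = 0.25 gives a radius of roughly 0.44.
```

The radius grows with $\sigma$ and with the gap between the top class probability and its runner-up, which is exactly what the training methods below try to enlarge.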
2.2 Training Methods for Maximizing Certified Radius
In addition to the theoretical framework discussed above, Cohen et al. [3] also propose a simple yet effective method for training the base classifier in a way that maximizes the $\ell_2$ certified radius of the smooth classifier, as expressed in Equation 4. We include an evaluation of their method in Appendix B.6. Following their work, several other works build upon the randomized smoothing framework and propose training methods that better maximize the $\ell_2$ certified radius of the smooth classifier. Salman et al. [29] proposed combining adversarial training [25] with randomized smoothing (called SmoothAdv). They adapted the vanilla PGD attack to target the smooth classifier $g_\theta$ instead of the base classifier $f_\theta$. Zhai et al. [39] proposed a new robustness loss, a hinge loss that enforces maximization of the soft approximation of the certified radius. Their method (called MACER) is faster than SmoothAdv as it does not use adversarial training. More recently, Jeong et al. [15] proposed training with a convex combination of samples along the direction of adversarial perturbation for each input to regularize over-confident predictions. Their method (called SmoothMix) is the current state-of-the-art in the domain of $\ell_2$ certified robust image classifiers. Finally, we note the Consistency regularization method proposed by Jeong et al. [16], which adds a regularization loss to existing methods that helps better maximize the certified radius.
Table 1: Training on CIFAR-10 using a ResNet110 classifier on a single Nvidia V100 GPU. State-of-
the-art robustness training methods significantly slow down training compared to standard training.
METHOD TRAINING SLOWDOWN FACTOR
SMOOTHADV 46.20×
MACER 20.86×
SMOOTHMIX 4.97×
3 Maximizing Certified Radius via Knowledge Transfer
Although prior works have proposed methods for increasing the certified radius of the smooth classifier, their training overhead is significant, making them much slower than standard training. As we show in Table 1, training a certifiably robust ResNet110 classifier to convergence using SmoothAdv, MACER, and SmoothMix is 46.20×, 20.86×, and 4.97× slower, respectively, compared to training a non-robust classifier with standard training.

Given constant innovations in architecture design (Figure 1) and the influx of new data, which may result in various tweaks to deployed networks that elicit retraining, the large overhead of state-of-the-art robustness training methods makes preserving certified robustness across model re-deployment difficult. Therefore, we propose Certified Robustness Transfer (CRT), a training method that improves the usability of certified robustness training methods by dramatically reducing their training overhead while preserving the certified robustness. Given the base classifier of a pre-trained certifiably robust smooth classifier, we leverage the knowledge transfer framework to guide the training of a new base classifier (and its associated robust smooth classifier).¹ In this section, we describe our method and provide theoretical justification for its effectiveness.
3.1 Transferring Certified Robustness
From Equation 4, it follows that training the base classifier to maximize $\mathbb{P}_\eta(f_\theta(x + \eta) = y)$ for any given input $x$ will result in the maximization of the certified radius associated with the smooth classifier, provided Equation 3 is satisfied. Thus, for the base classifier $f_\theta(x)$, our goal is to maximize the following quantity over the training set:

$$\sum_{i=1}^{n} \mathbb{E}_\eta \, \mathbb{1}\big[f_\theta(x_i + \eta) = y_i\big] \approx \sum_{i=1}^{n} \mathbb{E}_\eta\big[z^{y_i}_\theta(x_i + \eta)\big] \quad (5)$$
In the above equation, like prior works [3, 29, 39], we leverage the fact that the softmax output of a classifier can be treated as a continuous and differentiable approximation of its $\arg\max$ output. Methods like SmoothAdv [29], MACER [39], and SmoothMix [15] that target $\ell_2$ certifiable robustness propose training objectives that maximize this term.
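To make the right-hand side of Equation 5 concrete, the following is a minimal sketch (not the authors' training code) of a Monte Carlo estimate of $\mathbb{E}_\eta[z^{y}_\theta(x+\eta)]$ for a mini-batch, which a training loop could then maximize, e.g., by minimizing its negative; `net`, `sigma`, and `num_noise` are illustrative placeholders:

```python
import torch
import torch.nn.functional as F

def expected_true_class_prob(net, x, y, sigma=0.25, num_noise=4):
    """Monte Carlo estimate of E_eta[z^y_theta(x + eta)] for a batch (x, y)."""
    probs = 0.0
    for _ in range(num_noise):
        noisy = x + sigma * torch.randn_like(x)                  # eta ~ N(0, sigma^2 I)
        z = F.softmax(net(noisy), dim=-1)                        # softmax output z_theta
        probs = probs + z.gather(1, y.unsqueeze(1)).squeeze(1)   # z^y_theta per example
    return (probs / num_noise).mean()                            # average over noise and batch
```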
Now, suppose we have a pre-trained base classifier $f_\phi$. It follows that $\mathbb{E}_\eta[z^{y}_\phi(x + \eta)] \geq 0$. Through straightforward algebraic manipulations (see Appendix A), we derive the following lower bound:

$$\sum_{i=1}^{n} \mathbb{E}_\eta\big[z^{y_i}_\theta(x_i + \eta)\big] \geq -\sum_{i=1}^{n} \mathbb{E}_\eta\big[z^{y_i}_\phi(x_i + \eta) - z^{y_i}_\theta(x_i + \eta)\big] \quad (6)$$
That is to say, for a given input $x_i$, if we minimize the difference between the softmax outputs of the teacher and the student ($f_\phi$ and $f_\theta$) corresponding to the correct label $y_i$, we will maximize Equation 5 for the student. However, to ensure that the student has a non-trivial certified radius, we must also ensure that Equation 3 is satisfied. If we assume that Equation 3 holds true for the teacher (i.e., the base classifier of a certifiably robust smooth classifier), this condition can also be achieved for the student by matching the overall softmax output of the student to that of the teacher.
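The exact CRT loss is described in Section 3.2; as a rough illustration of the output-matching idea above, the sketch below uses an $\ell_2$ distance between teacher and student softmax outputs on Gaussian-perturbed inputs. This is one plausible instantiation under stated assumptions, not necessarily the precise loss used in our experiments; `student`, `teacher`, `sigma`, and `num_noise` are placeholders:

```python
import torch
import torch.nn.functional as F

def output_matching_loss(student, teacher, x, sigma=0.25, num_noise=2):
    """Match student and teacher softmax outputs on Gaussian-perturbed inputs."""
    loss = 0.0
    for _ in range(num_noise):
        noisy = x + sigma * torch.randn_like(x)              # eta ~ N(0, sigma^2 I)
        z_student = F.softmax(student(noisy), dim=-1)        # z_theta(x + eta)
        with torch.no_grad():
            z_teacher = F.softmax(teacher(noisy), dim=-1)    # z_phi(x + eta), teacher frozen
        loss = loss + (z_teacher - z_student).norm(p=2, dim=-1).mean()
    return loss / num_noise
```

Minimizing such a loss drives the student's softmax output toward the teacher's on noisy inputs, which is precisely the mechanism that Equation 6 ties to maximizing the student's certified radius.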
3.2 Certified Robustness Transfer (CRT)
Based on the previous discussion, we now describe our method for training a certifiably robust classifier through knowledge transfer. First, we obtain a pre-trained base classifier $f_\phi$, which has been trained using a randomized smoothing based robustness training method as this maximizes

¹ If no pre-trained classifier is available, we first train an architecture of lower complexity (i.e., fast to train) compared to the target architecture (Section 5.1).