Accelerating Certified Robustness Training via
Knowledge Transfer
Pratik Vaishnavi
Stony Brook University
pvaishnavi@cs.stonybrook.edu
Kevin Eykholt
IBM Research
kheykholt@ibm.com
Amir Rahmati
Stony Brook University
amir@cs.stonybrook.edu
Abstract
Training deep neural network classifiers that are certifiably robust against adversarial attacks is critical to ensuring the security and reliability of AI-controlled systems. Although numerous state-of-the-art certified training methods have been developed, they are computationally expensive and scale poorly with respect to both dataset and network complexity. Widespread usage of certified training is further hindered by the fact that periodic retraining is necessary to incorporate new data and network improvements. In this paper, we propose Certified Robustness Transfer (CRT), a general-purpose framework for reducing the computational overhead of any certifiably robust training method through knowledge transfer. Given a robust teacher, our framework uses a novel training loss to transfer the teacher's robustness to the student. We provide theoretical and empirical validation of CRT. Our experiments on CIFAR-10 show that CRT speeds up certified robustness training by 8× on average across three different architecture generations while achieving comparable robustness to state-of-the-art methods. We also show that CRT can scale to large-scale datasets like ImageNet.
1 Introduction
Deep Neural Networks (DNNs) are susceptible to adversarial evasion attacks [31, 9] that add a small amount of carefully crafted imperceptible noise to an input to reliably trigger misclassification. As a defense, numerous training methods have been proposed [25, 40, 35] to grant a classifier empirical robustness. But in the absence of any provable guarantees for this robustness, these defenses were frequently broken [1, 32]. These failures have motivated the development of training methods that grant certifiable/provable robustness to a classifier, hence safeguarding them against all attacks (known or unknown) within a pre-determined threat model. Such methods are broadly categorized as either deterministic or probabilistic [23]. Deterministic robustness training methods [12, 26, 33, 34, 28, 10, 41, 30] rely on computing provable bounds on the output neurons of a classifier for a given perturbation budget in the input space. However, the deterministic robustness guarantees provided by these methods come at a high computational cost. Probabilistic robustness training methods address this limitation by providing highly probable (e.g., with 0.99 probability) robustness guarantees at a greatly reduced computational cost. Within this category, randomized smoothing-based methods [19, 3, 29, 22, 20, 7, 37, 39, 16, 15] are considered the state-of-the-art for certifiable robustness in the $\ell_2$-space. Even so, these training methods remain an order of magnitude slower than standard training. In commercial applications where constant model re-deployment occurs to provide improvements (see Figure 1), re-training using computationally expensive methods is burdensome.
Figure 1: Evolution of DNN architectures on the ImageNet dataset. We plot the performance (top-1
accuracy) and the number of parameters of a few popular architectures (year of release is noted in
brackets). Newer generations attempt to improve performance and/or reduce network parameters.
In this work, we reduce the training overhead of randomized smoothing-based robustness training methods with minimal impact on the robustness achieved. We propose Certified Robustness Transfer (CRT), a knowledge transfer framework that significantly speeds up the process of training $\ell_2$ certifiably robust image classifiers. Given a pre-trained classifier that is certifiably robust (i.e., the teacher), CRT trains a new classifier (i.e., the student) that has comparable levels of robustness in a fraction of the time required by state-of-the-art methods. CRT brings down the cost of training certifiably robust image classifiers to be comparable to standard training while preserving state-of-the-art robustness. On CIFAR-10, CRT speeds up training by an average of 8× across three different architecture generations compared to a state-of-the-art robustness training method [15]. Furthermore, we show that state-of-the-art robustness training is only necessary to train the initial classifier. Afterward, CRT can be continuously reused to transfer robustness in order to expedite future model re-deployments and greatly reduce costs associated with computational resources. Our contributions can be summarized as follows:
• We present Certified Robustness Transfer (CRT), the first framework, to our knowledge, that can transfer the robustness of a certifiably robust teacher classifier to a new student classifier. CRT greatly reduces the time required to train certifiably robust image classifiers relative to existing state-of-the-art methods while achieving comparable or better robustness.
• We provide a theoretical understanding of CRT, showing how our approach of matching outputs enables robustness transfer between the student and teacher irrespective of the certified robustness training method used to train the teacher.
• On CIFAR-10, we show that CRT trains certifiably robust classifiers on average 8× faster than a state-of-the-art method while having comparable or better Average Certified Radius (by 8% in the best case). Furthermore, CRT reduces the cumulative computational cost of training three classifiers by 87.84%.
• We also show that CRT can be reused in a recursive manner, thus supporting a continuous re-deployment scenario (e.g., in commercial applications). Finally, we show that CRT remains effective on a large-scale dataset, ImageNet.
2 Background
In this section, we briefly introduce certified robustness and discuss notable existing methods for
training certifiably robust image classifiers using randomized smoothing.
2.1 Preliminaries
Problem Setup. Consider a neural network classifier $f$ parameterized by $\theta$ (denoted $f_\theta$) trained to map a given input $x \in \mathbb{R}^d$ to a set of discrete labels $\mathcal{Y}$ using a set of i.i.d. samples $S = \{(x_1, y_1), (x_2, y_2), \dots, (x_n, y_n)\}$ drawn from a data distribution $\mathcal{D}$. The output of the classifier can be written as $f_\theta(x) = \arg\max_{c \in \mathcal{Y}} z^c_\theta(x)$. Here $z_\theta(x)$ is the softmax output of the classifier and $z^c_\theta(x)$ denotes the probability that image $x$ belongs to class $c$.
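To make this notation concrete, the following is a minimal sketch (not part of the original paper) of the base classifier's softmax output $z_\theta(x)$ and its $\arg\max$ prediction $f_\theta(x)$, assuming a hypothetical PyTorch module `net` that returns raw logits:

```python
import torch
import torch.nn.functional as F

def softmax_output(net, x):
    """z_theta(x): softmax output of the base classifier."""
    return F.softmax(net(x), dim=-1)

def predict(net, x):
    """f_theta(x) = argmax_c z^c_theta(x)."""
    return softmax_output(net, x).argmax(dim=-1)
```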
Certified Robustness via Randomized Smoothing. The robustness of the classifier $f_\theta$ for a given input pair $(x, y)$ is defined using the radius of the largest $\ell_2$ ball centered at $x$ within which $f_\theta$ has a constant output $y$. This radius is referred to as the robust radius and it can mathematically be expressed as:

$$R(f_\theta; x, y) = \begin{cases} \inf_{f_\theta(x') \neq f_\theta(x)} \|x' - x\|_2, & \text{when } f_\theta(x) = y \\ 0, & \text{when } f_\theta(x) \neq y \end{cases} \quad (1)$$
Within this $\ell_2$-neighborhood of $x$, $f_\theta$ is considered to be certifiably robust. Therefore, to improve the robustness of a classifier, one needs to maximize this robust radius corresponding to any point sampled from the given data distribution. Directly maximizing the robust radius of a DNN classifier is an NP-hard problem [17]. Therefore, several prior works attempt to derive a lower bound for the robust radius [21, 19, 3]. This lower bound, often termed the certified radius, satisfies the following condition: $0 \leq CR(f_\theta; x, y) \leq R(f_\theta; x, y)$, for any $f_\theta$, $(x, y)$. In this paper, we utilize the certified robustness framework derived by Cohen et al. [3] using randomized smoothing. Given a classifier $f_\theta$, they first define the smooth classifier $g_\theta$ as:
Definition 2.1. For a given (base) classifier $f_\theta$ and $\sigma > 0$, the smooth classifier $g_\theta$ corresponding to $f_\theta$ is defined as follows:

$$g_\theta(x) = \arg\max_{c \in \mathcal{Y}} \; \mathbb{P}_{\eta \sim \mathcal{N}(0, \sigma^2 I)}\big(f_\theta(x + \eta) = c\big) \quad (2)$$
Simply put, $g_\theta$ returns the class $c$ which has the highest probability mass under the Gaussian distribution $\mathcal{N}(x, \sigma^2 I)$. Using Theorem 2.2, they proved that if the smooth classifier correctly classifies a given input $x$, it is certifiably robust at $x$. They also provided an analytical form of the $\ell_2$ certified radius at $x$.
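In practice, the class probabilities under Gaussian noise cannot be computed exactly and are estimated by sampling. The sketch below (an assumption-laden illustration, not the procedure of Cohen et al., which additionally uses hypothesis testing to obtain high-confidence bounds) shows a plain Monte Carlo estimate of $g_\theta(x)$ from Definition 2.1; `net`, `sigma`, `num_samples`, and `num_classes` are illustrative placeholders:

```python
import torch

@torch.no_grad()
def smooth_predict(net, x, sigma=0.25, num_samples=1000, num_classes=10):
    """Monte Carlo estimate of g_theta(x) for a single input x (batch size 1)."""
    counts = torch.zeros(num_classes)
    for _ in range(num_samples):
        noisy = x + sigma * torch.randn_like(x)    # eta ~ N(0, sigma^2 I)
        pred = net(noisy).argmax(dim=-1).item()    # f_theta(x + eta)
        counts[pred] += 1
    return counts.argmax().item()                  # class with highest probability mass
```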
Theorem 2.2. Let $f_\theta: \mathbb{R}^d \mapsto \mathcal{Y}$ be a classifier and $g_\theta$ be its smoothed version (as defined in Definition 2.1). For a given input $x \in \mathbb{R}^d$ and corresponding ground truth $y \in \mathcal{Y}$, if $g_\theta$ correctly classifies $x$ as $y$, i.e.,

$$\mathbb{P}_\eta\big(f_\theta(x + \eta) = y\big) \geq \max_{y' \neq y} \mathbb{P}_\eta\big(f_\theta(x + \eta) = y'\big) \quad (3)$$

then $g_\theta$ is provably robust at $x$ within the certified radius $CR$ given by:

$$CR(g_\theta; x, y) = \frac{\sigma}{2}\Big[\Phi^{-1}\big(\mathbb{P}_\eta(f_\theta(x + \eta) = y)\big) - \Phi^{-1}\big(\max_{y' \neq y} \mathbb{P}_\eta(f_\theta(x + \eta) = y')\big)\Big] \quad (4)$$

where $\Phi$ is the c.d.f. of the standard Gaussian distribution.
This certified radius is a tight lower bound of the robust radius defined in Equation 1, i.e., it is impossible to certify $g_\theta$ at $x$ for a radius larger than $CR$.
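As a worked illustration of Equation 4, the following sketch computes the certified radius from estimated top-two class probabilities under Gaussian noise. It is a simplified illustration with point estimates; the actual certification procedure of Cohen et al. replaces these with high-confidence bounds:

```python
from scipy.stats import norm

def certified_radius(p_top, p_runner_up, sigma):
    """CR = sigma/2 * (Phi^{-1}(p_top) - Phi^{-1}(p_runner_up)); 0 if Eq. 3 fails."""
    if p_top <= p_runner_up:          # Equation 3 violated: no certificate at x
        return 0.0
    return 0.5 * sigma * (norm.ppf(p_top) - norm.ppf(p_runner_up))

# Example: p_top = 0.95, p_runner_up = 0.03, sigma = 0.25 gives a radius of roughly 0.44.
```

The radius grows with $\sigma$ and with the gap between the top class probability and its runner-up, which is exactly what the training methods below try to enlarge.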
2.2 Training Methods for Maximizing Certified Radius
In addition to the theoretical framework discussed above, Cohen et al. [3] also propose a simple yet effective method for training the base classifier in a way that maximizes the $\ell_2$ certified radius of the smooth classifier, as expressed in Equation 4. We include an evaluation of their method in Appendix B.6. Following their work, several other works build upon the randomized smoothing framework and propose training methods that better maximize the $\ell_2$ certified radius of the smooth classifier. Salman et al. [29] proposed combining adversarial training [25] with randomized smoothing (called SmoothAdv). They adapted the vanilla PGD attack to target the smooth classifier $g_\theta$ instead of the base classifier $f_\theta$. Zhai et al. [39] proposed a new robustness loss, a hinge loss that enforces maximization of the soft approximation of the certified radius. Their method (called MACER) is faster than SmoothAdv as it does not use adversarial training. More recently, Jeong et al. [15] proposed training with a convex combination of samples along the direction of adversarial perturbation for each input to regularize over-confident predictions. Their method (called SmoothMix) is the current state-of-the-art in the domain of $\ell_2$ certified robust image classifiers. Finally, we note the Consistency regularization method proposed by Jeong et al. [16], which adds a regularization loss to existing methods that helps better maximize the certified radius.
Table 1: Training on CIFAR-10 using a ResNet110 classifier on a single Nvidia V100 GPU. State-of-
the-art robustness training methods significantly slow down training compared to standard training.
METHOD TRAINING SLOWDOWN FACTOR
SMOOTHADV 46.20×
MACER 20.86×
SMOOTHMIX 4.97×
3 Maximizing Certified Radius via Knowledge Transfer
Although prior works have proposed methods for increasing the certified radius of the smooth classifier, their training overhead is significant, making them much slower than standard training. As we show in Table 1, training a certifiably robust ResNet110 classifier to convergence using SmoothAdv, MACER, and SmoothMix is 46.20×, 20.86×, and 4.97× slower, respectively, compared to training a non-robust classifier with standard training.

Given constant innovations in architecture design (Figure 1) and the influx of new data, which may result in various tweaks to deployed networks that elicit retraining, the large overhead of state-of-the-art robustness training methods makes preserving certified robustness across model re-deployment difficult. Therefore, we propose Certified Robustness Transfer (CRT), a training method that improves the usability of certified robustness training methods by dramatically reducing their training overhead while preserving the certified robustness. Given the base classifier of a pre-trained certifiably robust smooth classifier, we leverage the knowledge transfer framework to guide the training of a new base classifier (and its associated robust smooth classifier).¹ In this section, we describe our method and provide theoretical justification for its effectiveness.
3.1 Transferring Certified Robustness
From Equation 4, it follows that training the base classifier to maximize $\mathbb{P}_\eta(f_\theta(x + \eta) = y)$ for any given input $x$ will result in the maximization of the certified radius associated with the smooth classifier, provided Equation 3 is satisfied. Thus, for the base classifier $f_\theta(x)$, our goal is to maximize the following quantity over the training set:

$$\sum_{i=1}^{n} \mathbb{E}_\eta \, \mathbb{1}\big[f_\theta(x_i + \eta) = y_i\big] \approx \sum_{i=1}^{n} \mathbb{E}_\eta\big[z^{y_i}_\theta(x_i + \eta)\big] \quad (5)$$
In the above equation, like prior works [3, 29, 39], we leverage the fact that the softmax output of a classifier can be treated as a continuous and differentiable approximation of its $\arg\max$ output. Methods like SmoothAdv [29], MACER [39], and SmoothMix [15] that target $\ell_2$ certifiable robustness propose training objectives that maximize this term.
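To make the right-hand side of Equation 5 concrete, the following is a minimal sketch (not the authors' training code) of a Monte Carlo estimate of $\mathbb{E}_\eta[z^{y}_\theta(x+\eta)]$ for a mini-batch, which a training loop could then maximize, e.g., by minimizing its negative; `net`, `sigma`, and `num_noise` are illustrative placeholders:

```python
import torch
import torch.nn.functional as F

def expected_true_class_prob(net, x, y, sigma=0.25, num_noise=4):
    """Monte Carlo estimate of E_eta[z^y_theta(x + eta)] for a batch (x, y)."""
    probs = 0.0
    for _ in range(num_noise):
        noisy = x + sigma * torch.randn_like(x)                  # eta ~ N(0, sigma^2 I)
        z = F.softmax(net(noisy), dim=-1)                        # softmax output z_theta
        probs = probs + z.gather(1, y.unsqueeze(1)).squeeze(1)   # z^y_theta per example
    return (probs / num_noise).mean()                            # average over noise and batch
```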
Now, suppose we have a pre-trained base classifier $f_\phi$. It follows that $\mathbb{E}_\eta[z^{y}_\phi(x + \eta)] \geq 0$. Through straightforward algebraic manipulations (see Appendix A), we derive the following lower bound:

$$\sum_{i=1}^{n} \mathbb{E}_\eta\big[z^{y_i}_\theta(x_i + \eta)\big] \geq -\sum_{i=1}^{n} \mathbb{E}_\eta\big[z^{y_i}_\phi(x_i + \eta) - z^{y_i}_\theta(x_i + \eta)\big] \quad (6)$$
That is to say, for a given input $x_i$, if we minimize the difference between the softmax outputs of the teacher and the student ($f_\phi$ and $f_\theta$) corresponding to the correct label $y_i$, we will maximize Equation 5 for the student. However, to ensure that the student has a non-trivial certified radius, we must also ensure that Equation 3 is satisfied. If we assume that Equation 3 holds true for the teacher (i.e., the base classifier of a certifiably robust smooth classifier), this condition can also be achieved for the student by matching the overall softmax output of the student to that of the teacher.
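The exact CRT loss is described in Section 3.2; as a rough illustration of the output-matching idea above, the sketch below uses an $\ell_2$ distance between teacher and student softmax outputs on Gaussian-perturbed inputs. This is one plausible instantiation under stated assumptions, not necessarily the precise loss used in our experiments; `student`, `teacher`, `sigma`, and `num_noise` are placeholders:

```python
import torch
import torch.nn.functional as F

def output_matching_loss(student, teacher, x, sigma=0.25, num_noise=2):
    """Match student and teacher softmax outputs on Gaussian-perturbed inputs."""
    loss = 0.0
    for _ in range(num_noise):
        noisy = x + sigma * torch.randn_like(x)              # eta ~ N(0, sigma^2 I)
        z_student = F.softmax(student(noisy), dim=-1)        # z_theta(x + eta)
        with torch.no_grad():
            z_teacher = F.softmax(teacher(noisy), dim=-1)    # z_phi(x + eta), teacher frozen
        loss = loss + (z_teacher - z_student).norm(p=2, dim=-1).mean()
    return loss / num_noise
```

Minimizing such a loss drives the student's softmax output toward the teacher's on noisy inputs, which is precisely the mechanism that Equation 6 ties to maximizing the student's certified radius.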
3.2 Certified Robustness Transfer (CRT)
Based on the previous discussion, we now describe our method for training a certifiably robust classifier through knowledge transfer. First, we obtain a pre-trained base classifier $f_\phi$, which has been trained using a randomized smoothing based robustness training method as this maximizes

¹ If no pre-trained classifier is available, we first train an architecture of lower complexity (i.e., fast to train) compared to the target architecture (Section 5.1).