
Double Bubble, Toil and Trouble: Enhancing Certified Robustness through Transitivity

Andrew C. Cullen1,*, Paul Montague2, Shijie Liu1, Sarah M. Erfani1, Benjamin I.P. Rubinstein1

1 School of Computing and Information Systems, University of Melbourne, Parkville, Australia
2 Defence Science and Technology Group, Adelaide, Australia

andrew.cullen@unimelb.edu.au

Abstract

In response to subtle adversarial examples flipping classifications of neural network models, recent research has promoted certified robustness as a solution. There, invariance of predictions to all norm-bounded attacks is achieved through randomised smoothing of network inputs. Today's state-of-the-art certifications make optimal use of the class output scores at the input instance under test: no better radius of certification (under the $L_2$ norm) is possible given only these scores. However, it is an open question as to whether such lower bounds can be improved using local information around the instance under test. In this work, we demonstrate how today's "optimal" certificates can be improved by exploiting both the transitivity of certifications and the geometry of the input space, giving rise to what we term Geometrically-Informed Certified Robustness. By considering the smallest distance to points on the boundary of a set of certifications, this approach improves certifications for more than 80% of Tiny-Imagenet instances, yielding, on average, an increase in the associated certification. When incorporating training time processes that enhance the certified radius, our technique shows even more promising results, with a uniform 4 percentage point increase in the achieved certified radius.

1 Introduction

Learned models, including neural networks, are well known to be susceptible to having their output changed by crafted perturbations to an input that preserve the input's semantic properties. Neural networks not only misclassify these perturbations (known as adversarial examples) but also assign high confidence to these incorrect predictions. These behaviours have been observed across a wide range of models and datasets, and appear to be a product of piecewise-linear interactions [13].

Crafting these adversarial examples typically involves gradient-based optimisation to construct small perturbations. These attacks have been applied to both black- and white-box models, and can be used to target class changes, to attack all classes, or even to introduce backdoors into model behaviour. To mitigate the influence of these attacks, defences have typically been designed to minimise the effect of a specific attack (or attacks). Such defences are known as best response strategies in a Stackelberg security game where the defender leads the attacker. Best response defences inherently favour the attacker, as deployed mitigations can be defeated by identifying undefended attack frameworks. Moreover, the defender typically has to incorporate the defence at training time, and as such cannot respond reactively to newly developed attacks.

To circumvent these limitations, certified guarantees of adversarial robustness can be constructed to identify class-constant regions around an input instance, which guarantee that all instances within a norm-bounded distance (typically $L_2$) are not adversarial examples. Certifications based on randomised smoothing of classifiers around an input point are in a sense optimal: based only on the prediction class scores at the input point, no better radius is in general possible. Despite this, such certifications fail to use readily available, yet still local, information: the certifiability of points near the input of interest. The key insight of this work is that these neighbourhood points may generate certified radii large enough to completely enclose that of a sample point, improving the radius of certification. This process can be extended to use the intersection of the regions of certification of multiple points, and the nature of the input domain itself, to generate larger certifications. This leads to our main contribution, Geometrically-Informed Certified Robustness, which enjoys certifications exceeding those of the hitherto best-case guaranteed approach of Cohen et al. (2019) [8].

36th Conference on Neural Information Processing Systems (NeurIPS 2022).
arXiv:2210.06077v1 [cs.LG] 12 Oct 2022

2 Background and literature review

Bounding mechanisms. Conservative bounds upon the impact of norm-bounded perturbations can be constructed by way of either Interval Bound Propagation (IBP), which propagates interval bounds through the model, or Convex Relaxation, which utilises linear relaxation to construct bounding output polytopes over input bounded perturbations, in a manner that generally provides tighter bounds than IBP. In contrast to Randomised Smoothing, bounding mechanisms employ augmented loss functions during training, which promote tight output bounds at the cost of decreased applicability. Moreover, they both exhibit a time and memory complexity that makes them infeasible for complex model architectures or high-dimensional data [40, 6, 21].

Randomised smoothing. Outside of bounding mechanisms, another common framework for developing certifications leverages randomised smoothing, in which noise is applied to input instances to smooth model predictions, subject to a sampling distribution that is tied to the $L_p$-norm of adversarial perturbations being certified against. In contrast to other robustness mechanisms, this application of the noise is the only architectural change that is required to achieve certification. In the case of $L_2$-norm bounded attacks, Gaussian sampling of the form

$$x_i = x + y_i \quad \text{where} \quad y_i \overset{\text{i.i.d.}}{\sim} \mathcal{N}(0, \sigma^2) \quad \forall i \in \{1, \ldots, N\} \tag{1}$$

is employed for all test-time instances. These $N$ samples are then used to estimate the expected output of the predicted class of $x$ by way of the Monte-Carlo estimator

$$\mathbb{E}_Y\left[\arg\max f_\theta(x + Y) = i\right] \approx \frac{1}{N} \sum_{j=1}^{N} \mathbb{1}\left[\arg\max f_\theta(x_j) = i\right]. \tag{2}$$

While this Monte Carlo estimation of output expectations under randomised smoothing is a test-time process, model sensitivity to random perturbations may be decreased by performing adversarial training on such random perturbations. To mitigate the computational expense of large $N$ sample sizes during each training update, training typically employs single draws from the noise distribution.
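The estimator in Equation (2) can be illustrated with a minimal Python sketch. It is not the authors' implementation: `classify` is a hypothetical stand-in for $\arg\max f_\theta$, and `toy_classifier`, the sample count, and the seed are assumptions chosen only to make the example self-contained.

```python
import random
from collections import Counter

def smoothed_class_frequencies(classify, x, sigma, n_samples, seed=0):
    """Monte Carlo estimate of E_Y[argmax f(x + Y) = i] for each class i (Equation 2).

    `classify` is a hypothetical stand-in for argmax f_theta: it maps a list
    of floats to an integer class label.
    """
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(n_samples):
        # Equation (1): x_j = x + y_j, with y_j drawn i.i.d. from N(0, sigma^2).
        noisy = [xi + rng.gauss(0.0, sigma) for xi in x]
        counts[classify(noisy)] += 1
    return {label: count / n_samples for label, count in counts.items()}

# Toy classifier (an assumption for illustration): class 1 if the first
# coordinate is positive, class 0 otherwise.
def toy_classifier(x):
    return 1 if x[0] > 0.0 else 0

freqs = smoothed_class_frequencies(toy_classifier, [0.5, 0.0], sigma=0.5, n_samples=2000)
print(freqs)
```

For this toy classifier the frequency of class 1 should approach $\Phi(1) \approx 0.84$ as the number of samples grows, since the first coordinate is positive whenever the noise exceeds $-0.5 = -\sigma$.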

Smoothing-based certifications. Based on randomised smoothing, certified robustness can guarantee classification invariance for additive perturbations up to some $L_p$-norm $r$, with recent work also considering rotational and/or translational semantic attacks. $L_p$-norm certifications were first demonstrated by way of differential privacy, with more recent approaches employing Rényi divergence, and parametrising worst-case behaviours. By considering the worst-case $L_2$-perturbations, Cohen et al. (2019) purport that the largest achievable pointwise certification is

$$r = \frac{\sigma}{2}\left(\Phi^{-1}(E_0[x]) - \Phi^{-1}(E_1[x])\right). \tag{3}$$

Here $\{E_0, E_1\}$ are the two largest class expectations (as per Equation (2)), $\sigma$ is the noise standard deviation, and $\Phi^{-1}$ is the inverse normal CDF, or Gaussian quantile function.
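As a worked illustration of Equation (3), the following Python sketch evaluates the radius for given class expectations. The function name `cohen_radius` and the example values are assumptions for illustration; in practice the expectations would themselves be estimates, such as those produced by Equation (2).

```python
from statistics import NormalDist

def cohen_radius(e0, e1, sigma):
    """Equation (3): r = sigma / 2 * (Phi^{-1}(E0) - Phi^{-1}(E1)).

    e0 and e1 are the two largest class expectations and must lie strictly
    between 0 and 1, because the Gaussian quantile function diverges at 0 and 1.
    """
    inv_cdf = NormalDist().inv_cdf  # Gaussian quantile function Phi^{-1}
    return 0.5 * sigma * (inv_cdf(e0) - inv_cdf(e1))

# Example values (assumed): the top class has expectation 0.8, the runner-up 0.2.
r = cohen_radius(0.8, 0.2, sigma=0.5)
print(round(r, 4))
```

With these inputs $\Phi^{-1}(0.8) \approx 0.8416$ and $\Phi^{-1}(0.2) \approx -0.8416$, so $r \approx 0.25 \times 1.6832 \approx 0.4208$. Equal expectations give a radius of zero.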

3 Geometrically-informed certified robustness

While the work contained within this paper can be applied generally, for this work we will focus upon certifications of robustness about $L_2$-norm bounded adversarial perturbations, for which we assume that the difficulty of attacking a model is proportional to the size of the certification, based upon the need to evade both human and machine scrutiny. Thus, constructing larger certifications in such a context is inherently valuable.

Figure 1: Transitive certification exemplars, with panels (a) Transitivity and (b) Multiple Transitivity. The Green, Red, and Black circles represent hyperspheres of radius $r_i$ (by Equation 3) about points $x_i \; \forall i \in \{1, 2, 3\}$. The resulting certifications $r'$ and $r''$ are described within Equations 5 and 12, and $\hat{r}$ is described elsewhere in the paper. The Black line represents the domain boundary.

This specific $L_2$ space is of interest due to both its viability as a defence model, and the provable guarantee that Cohen et al. produce the largest possible certification for any instance. Over the remainder of this section we will document how it is possible to improve upon this provably best-case guarantee by exploiting several properties of certified robustness.

3.1 Exploiting transitivity

While it is provably true that Equation (3) is the largest achievable certification for any point $x$, it is possible to exploit the behaviour of points in the neighbourhood of $x$ in order to enhance the certifiable radius. To achieve this, consider the case of a second point that lies within the certifiable radius of $x$. As both points must correspond to the same class, it then follows that the union of their regions of certification can also be considered as a region of certification, leading to Definition 3.1.

Definition 3.1 (Overlap Properties of Certification). A radius of certification $r_i$ about $x_i$ can be calculated by evaluating Equation 3 at $x_i$. This certification guarantees that no point $x : \|x - x_i\|_P \le r_i$ can induce a change in the predicted class. That this shape is a $d$-dimensional hypersphere for input data $x \in \mathbb{R}^d$ allows us to introduce the notational shorthand

$$B_P(x_i, r_i) = \{x \in \mathbb{R}^d \mid \|x - x_i\|_P \le r_i\}, \qquad S_i = \{x \in \mathbb{R}^d \mid \|x - x_i\|_P = r_i\} \tag{4}$$

to represent the region covered by the hypersphere and its surface. It follows from this definition that if $B_P(x_1, r_1) \cap B_P(x_2, r_2) \ne \emptyset$, which ensures that the class predictions at $x_1$ and $x_2$ match, then the region of certification about $x_1$ can be expressed as $B_P(x_1, r_1) \cup B_P(x_2, r_2)$.

However, typically we are concerned not with the size of the region of classification invariance, but rather with the distance to the nearest adversarial example. If it is possible to find some $x_2$ such that its region of certification completely encircles that of the certification at $x_1$, the following lemma demonstrates that the certification radius about $x_1$ can be increased.

Lemma 3.2 (Set Unions Certified Radius). If $x_1$ and $x_2$ have the same class associated with them and $B_P(x_1, r_1) \subset B_P(x_2, r_2)$, then the nearest possible adversarial example, and thus the certifiable radius, exists at a distance $r' \ge r_1$ from $x_1$, where

$$r' = r_2 - \|x_2 - x_1\|_P. \tag{5}$$

Proof. The closest point on the surface of $B_P(x_2, r_2)$ must lie on the line through $x_1$ and $x_2$. Thus $r' = \min\left(r_2 \pm \|x_2 - x_1\|_P\right)$, which takes the form of Equation (5).

As such, we can recast the task of constructing a certification from being a strictly analytic function to a nonlinear optimisation problem in terms of a second ball with certified radius $r_2$ centred at $x_2$:

$$r' = \max_{x_2 \in [0,1]^d} \; r_2 - \|x_2 - x_1\|_P \tag{6}$$

with Figure 1a providing a two-dimensional exemplar. Crucially, the above formalism does not require obtaining a global optimum, as any $r' > r_1$ yields an improved certification at $x_1$.
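The optimisation in Equation (6) can be sketched in Python as follows. This is an illustration rather than the authors' procedure: plain random search stands in for whatever optimiser is actually used, `radius_at` is a hypothetical callable returning the certified radius at a point (for example, Equation (3) evaluated there), and `toy_radius` is an invented radius field that is a conservative lower bound on the distance to a decision boundary at $x_0 = 0.8$.

```python
import math
import random

def transitive_radius(x1, x2, r2):
    """Equation (5): r' = r2 - ||x2 - x1||_2."""
    return r2 - math.dist(x1, x2)

def search_transitive_radius(x1, r1, radius_at, n_trials=200, step=0.1, seed=0):
    """Random-search stand-in for Equation (6); any r' > r1 is an improvement.

    A candidate is only accepted when r' > r1 >= 0, which implies
    r2 > r1 + ||x2 - x1||, so B(x1, r1) lies inside B(x2, r2) and the two
    points share a class, as Lemma 3.2 requires.
    """
    rng = random.Random(seed)
    best, best_x2 = r1, list(x1)
    for _ in range(n_trials):
        # Perturb x1 and clip back into the input domain [0, 1]^d.
        x2 = [min(1.0, max(0.0, xi + rng.uniform(-step, step))) for xi in x1]
        candidate = transitive_radius(x1, x2, radius_at(x2))
        if candidate > best:
            best, best_x2 = candidate, x2
    return best, best_x2

# Hypothetical radius field: t is the true distance to the boundary x_0 = 0.8,
# and the returned radius never exceeds t, so it is a valid lower bound.
def toy_radius(x):
    t = abs(0.8 - x[0])
    return min(t, t * t / 0.4)

x1 = [0.5, 0.5]
r1 = toy_radius(x1)  # 0.225, although the true distance to the boundary is 0.3
r_prime, x2_best = search_transitive_radius(x1, r1, toy_radius)
print(r1, r_prime)
```

In this toy setting the pointwise radius at `x1` is conservative, and a neighbouring point further from the boundary certifies a ball that encloses it, so the search returns an `r_prime` larger than `r1` while still never exceeding the true distance of 0.3.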

3.2 Multiple transitivity

To further enhance our ability to certify, let us consider the set of points and their associated certifications $\{x_1, r_1, x_2, r_2, \ldots, x_n, r_n\}$. If the union $\hat{B}_P = \cup_{i \in \{1, \ldots, n\}} B_P(x_i, r_i)$ is simply-connected, then the certification across this set can be expressed as $r^{(n)'} = \min_{x \in \partial \hat{B}_P} \|x - x_1\|_P$, where $\partial \hat{B}_P$ is the boundary of $\hat{B}_P$. This can be further simplified by imposing that $x_j \in B_P(x_2, r_2) \; \forall j > 2$ and that $B_P(x_j, r_j) \not\subset B_P(x_k, r_k) \; \forall (j > 2, k > 2)$ to ensure that hyperspheres exist near the boundary of $S_2$, yielding a certification of

$$r^{(n)'} = \min_{x \in \mathcal{S}_n} \|x - x_1\| \quad \text{where} \quad \mathcal{S}_n = S_2 \cap \left(\cup_{i=3}^{n+1} S_i\right) \text{ for } n \ge 2. \tag{7}$$

Here $\mathcal{S}_n$ is a $(d-1)$-dimensional manifold embedded in $\mathbb{R}^d$.

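Lemma 3.3 (Optimal positioning of $x_3$ in the case of $n = 3$). Consider the addition of a new hypersphere at some point $x_3$ with associated radius $r_3$, which has an associated boundary $S_3$. If it is true that

$$B_P(x_3, r_3) \not\subset B_P(x_2, r_2) \quad \text{and} \quad B_P(x_3, r_3) \cap B_P(x_2, r_2) \ne \emptyset \tag{8}$$

$$B_P(x_3, r_3) \not\subset B_P(\tilde{x}, \tilde{r}) \;\; \forall \{\tilde{x} \in [0,1]^d \mid \tilde{x} \ne x_3\} \quad \text{with} \quad \tilde{r} = \frac{\sigma}{2}\left(\Phi^{-1}(E_0[\tilde{x}]) - \Phi^{-1}(E_1[\tilde{x}])\right), \tag{9}$$

then the largest possible certification $r''$ by Equation 7 is achieved at

$$x_3(s) = x_1 + s\, r' \frac{x_1 - x_2}{\|x_1 - x_2\|_2} \quad \text{for some } s \in [0, 1]. \tag{10}$$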

Proof. The closest point to $x_1$ upon $S_2$ is located at

$$x = x_1 + r' \frac{x_1 - x_2}{\|x_1 - x_2\|_2}, \tag{11}$$

where $r'$ is defined by Equation 6. Thus any improved radius of certification $r'' > r'$ is only achievable if $B_P(x_3, r_3)$ satisfies $x \in B_P(x_3, r_3)$ and Equation 8. Then by symmetry, $r''$ is the maximally achievable radius of certification if Equation 9 holds and if $x_3$ is defined by Equation 10.

While finding some $x_3$ satisfying Equations 8 and 10 is trivial, proving Equation 9 would require an exhaustive search of the input space $[0,1]^d$. However, even in the absence of such a search, Equation 10 still provides the framework for a simple search for $x_3$, which follows Figure 1b.
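Equation 10 restricts the search for $x_3$ to a line segment starting at $x_1$ and pointing away from $x_2$. A minimal Python sketch of generating candidates along that segment is given below; the helper name `x3_candidate`, the grid over $s$, and the example coordinates are assumptions for illustration only.

```python
import math

def x3_candidate(x1, x2, r_prime, s):
    """Equation (10): x3(s) = x1 + s * r' * (x1 - x2) / ||x1 - x2||_2."""
    norm = math.dist(x1, x2)
    if norm == 0.0:
        raise ValueError("x1 and x2 must be distinct to define a search direction")
    return [a + s * r_prime * (a - b) / norm for a, b in zip(x1, x2)]

# Example values (assumed): x2 sits to the left of x1, and r' = 0.3.
x1 = [0.5, 0.5]
x2 = [0.4, 0.5]
r_prime = 0.3

# Evenly spaced values of s in (0, 1]; each candidate would then need to be
# certified (Equation 3) and checked against Equation 8.
candidates = [x3_candidate(x1, x2, r_prime, k / 4) for k in range(1, 5)]
print(candidates[-1])
```

The candidate for $s = 1$ coincides with the point of Equation 11, the closest point to $x_1$ on $S_2$, which here is approximately `[0.8, 0.5]`.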

Lemma 3.4 (Certification from two eccentric hyperspheres). If $x_3$ is defined by Equation 10 in a fashion that satisfies Equation 8, then an updated certification can be achieved in terms of some $x_3(s)$ defined by Equation 10 by way of

$$r'' = \max_{s \in [0,1]} \sqrt{\frac{d_2\left(r_3^2 - d_3^2\right) + d_3\left(r_2^2 - d_2^2\right)}{d_2 + d_3}} \tag{12}$$

where

$$d_2 = \|x_2 - x_1\|, \qquad d_3 = \|x_3(s) - x_1\|, \qquad r_3 = \frac{\sigma}{2}\left(\Phi^{-1}(E_0[x_3(s)]) - \Phi^{-1}(E_1[x_3(s)])\right).$$

If Equation 9 holds, then this is the largest achievable certification for $n = 3$.
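For a fixed $s$, the quantity inside the maximisation of Equation 12 can be evaluated directly. The Python sketch below does so for assumed example values; the function name `eccentric_radius` and the intersection guard are illustrative additions rather than part of the lemma.

```python
import math

def eccentric_radius(d2, d3, r2, r3):
    """Inner term of Equation (12) for one value of s.

    d2 = ||x2 - x1||, d3 = ||x3(s) - x1||, and r2, r3 are the certified radii
    at x2 and x3(s). The centres are d2 + d3 apart because x3(s) lies on the
    opposite side of x1 from x2 (Equation 10).
    """
    gap = d2 + d3
    # The two sphere surfaces only meet when neither ball contains the other
    # and the balls are not disjoint.
    if not (abs(r2 - r3) <= gap <= r2 + r3):
        raise ValueError("the surfaces S2 and S3 do not intersect")
    value = (d2 * (r3**2 - d3**2) + d3 * (r2**2 - d2**2)) / gap
    return math.sqrt(value)

# Example values (assumed for illustration).
d2, r2 = 0.1, 0.4
d3, r3 = 0.2, 0.25

r_single = r2 - d2                             # Equation (5): r' = 0.3
r_double = eccentric_radius(d2, d3, r2, r3)    # Equation (12) at this s
print(r_single, r_double)
```

With these values $r'' = \sqrt{0.1075} \approx 0.328$, which exceeds $r' = 0.3$. Carrying out the maximisation over $s$ would mean repeating this evaluation for each candidate $x_3(s)$, re-certifying $r_3$ each time.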

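Proof. By symmetry we can define the arbitrary rotational mapping $f : \mathbb{R}^d \to \mathbb{R}^d$ from $x_i \mapsto y_i$ by way of $y_i = f(x_i - x_1)$, subject to the condition

$$y_{i,j} \ne 0 \text{ for } i \in \{1, 2\} \text{ and } \forall j \ne k, \text{ for some } k \text{ such that } j, k \in \{1, 2, \ldots, d\}; \tag{13}$$

the intersection of the hyperspheres centred about $x_2$ and $x_3$ then occurs at

$$r_3^2 - r_2^2 = \|y - y_3\|^2 - \|y - y_2\|^2 = 2 y_{2,k}(d_2 + d_3) + d_3^2 - d_2^2, \qquad 2 y_{2,k} = \frac{r_3^2 - r_2^2 + d_2^2 - d_3^2}{d_2 + d_3}. \tag{14}$$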
