Double Bubble, Toil and Trouble: Enhancing
Certified Robustness through Transitivity
Andrew C. Cullen¹, Paul Montague², Shijie Liu¹,
Sarah M. Erfani¹, Benjamin I.P. Rubinstein¹
¹ School of Computing and Information Systems, University of Melbourne, Parkville, Australia
² Defence Science and Technology Group, Adelaide, Australia
andrew.cullen@unimelb.edu.au
Abstract
In response to subtle adversarial examples flipping classifications of neural network
models, recent research has promoted certified robustness as a solution. There,
invariance of predictions to all norm-bounded attacks is achieved through randomised
smoothing of network inputs. Today's state-of-the-art certifications make optimal
use of the class output scores at the input instance under test: no better radius of
certification (under the $L_2$ norm) is possible given only these scores. However, it
is an open question as to whether such lower bounds can be improved using local
information around the instance under test. In this work, we demonstrate how
today's "optimal" certificates can be improved by exploiting both the transitivity
of certifications and the geometry of the input space, giving rise to what we term
Geometrically-Informed Certified Robustness. By considering the smallest distance
to points on the boundary of a set of certifications, this approach improves
certifications for more than 80% of Tiny-Imagenet instances, yielding an average 5%
increase in the associated certification. When incorporating training time processes
that enhance the certified radius, our technique shows even more promising results,
with a uniform 4 percentage point increase in the achieved certified radius.
1 Introduction
Learned models, including neural networks, are well known to be susceptible to having their
output changed by crafted perturbations to an input that preserve the input's semantic
properties [2]. Neural networks not only misclassify these perturbations, known as adversarial
examples, but they also assign high confidence to these incorrect predictions. These behaviours
have been observed across a wide range of models and datasets, and appear to be a product of
piecewise-linear interactions [13].
Crafting these adversarial examples typically involves gradient-based optimisation to construct
small perturbations. These attacks have been applied to both black- and white-box models [31],
and can be used to target class changes, to attack all classes [10], or even introduce backdoors
into model behaviour [5]. To mitigate the influence of these attacks, defences have typically
been designed to minimise the effect of a specific attack (or attacks). Such defences are known
as best response strategies in a Stackelberg security game where the defender leads the attacker.
Best response defences inherently favour the attacker, as deployed mitigations can be defeated by
identifying undefended attack frameworks. Moreover, the defender typically has to incorporate the
defence at training time, and as such cannot respond reactively to newly developed attacks.
To circumvent these limitations, certified guarantees of adversarial robustness can be constructed
to identify class-constant regions around an input instance, which guarantee that all instances
within a norm-bounded distance (typically $L_2$) are not adversarial examples. Certifications based
on randomised smoothing of classifiers around an input point are in a sense optimal [8]: based only
on the prediction class scores at the input point, no better radius is in general possible. Despite
this, such certifications fail to use readily available, yet still local, information: the
certifiability of points nearby to the input of interest. The key insight of this work is that these
neighbourhood points may generate a certified radius large enough to completely enclose that of a
sample point, improving the radius of certification. This process can be extended to use the
intersection of the regions of certification of multiple points, and the nature of the input domain
itself, to generate larger certifications. This leads to our main contribution, Geometrically-Informed
Certified Robustness, which enjoys certifications exceeding those of the hitherto best-case guaranteed
approach of Cohen et al. (2019) [8].
36th Conference on Neural Information Processing Systems (NeurIPS 2022).
arXiv:2210.06077v1 [cs.LG] 12 Oct 2022
2 Background and literature review
Bounding mechanisms
Conservative bounds upon the impact of norm-bounded perturbations can be constructed by way of
either Interval Bound Propagation (IBP), which propagates interval bounds through the model; or
Convex Relaxation, which utilises linear relaxation to construct bounding output polytopes over
input bounded perturbations [34, 26, 41, 45, 46, 37, 28], in a manner that generally provides
tighter bounds than IBP [25]. In contrast to Randomised Smoothing, bounding mechanisms employ
augmented loss functions during training, which promote tight output bounds [42] at the cost of
decreased applicability. Moreover, they both exhibit a time and memory complexity that makes
them infeasible for complex model architectures or high-dimensional data [40, 6, 21].
Randomised smoothing
Outside of bounding mechanisms, another common framework for developing certifications leverages
randomised smoothing [20], in which noise is applied to input instances to smooth model
predictions, subject to a sampling distribution that is tied to the $L_P$-norm of adversarial
perturbations being certified against. In contrast to other robustness mechanisms, this
application of the noise is the only architectural change that is required to achieve
certification. In the case of $L_2$-norm bounded attacks, Gaussian sampling of the form
$$\mathbf{x}'_i = \mathbf{x} + \mathbf{y}_i \quad \text{where} \quad \mathbf{y}_i \overset{\text{i.i.d.}}{\sim} \mathcal{N}(0, \sigma^2) \quad \forall\, i \in \{1, \ldots, N\} \tag{1}$$
is employed for all test-time instances. These $N$ samples are then used to estimate the expected
output of the predicted class of $\mathbf{x}$ by way of the Monte-Carlo estimator
$$\mathbb{E}_{\mathbf{Y}}\left[\arg\max f_\theta(\mathbf{x} + \mathbf{Y}) = i\right] \approx \frac{1}{N} \sum_{j=1}^{N} \mathbb{1}\left[\arg\max f_\theta(\mathbf{x}'_j) = i\right]. \tag{2}$$
While this Monte Carlo estimation of output expectations under randomised smoothing is a test-time
process, model sensitivity to random perturbations may be decreased by performing adversarial
training on such random perturbations. To mitigate the computational expense of large $N$ sample
sizes during each training update, training typically employs single draws from the noise
distribution.
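The sampling and counting steps of Equations (1) and (2) can be sketched in a few lines. This is a minimal illustration only: `toy_classifier` is a hypothetical stand-in for $\arg\max f_\theta$, and the function name, the sample count, and the value of `sigma` are assumptions chosen for the example rather than values taken from the paper.

```python
import random
from collections import Counter


def smoothed_class_frequencies(classify, x, sigma, n_samples, rng):
    """Monte Carlo estimate of E_Y[argmax f(x + Y) = i] for each class i (Equation 2)."""
    counts = Counter()
    for _ in range(n_samples):
        # Equation (1): add i.i.d. Gaussian noise to every input coordinate.
        noisy = [xk + rng.gauss(0.0, sigma) for xk in x]
        counts[classify(noisy)] += 1
    return {label: count / n_samples for label, count in counts.items()}


def toy_classifier(x):
    # Hypothetical stand-in for arg max f_theta: class 0 to the left of x[0] = 0.5.
    return 0 if x[0] < 0.5 else 1


rng = random.Random(0)
freqs = smoothed_class_frequencies(toy_classifier, [0.2, 0.7], sigma=0.25,
                                   n_samples=2000, rng=rng)
```

The returned dictionary maps each observed class label to its empirical frequency, so the two largest values play the role of the class expectations used by the certification that follows.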
Smoothing-based certifications
Based on randomised smoothing, certified robustness can guarantee classification invariance for
additive perturbations up to some $L_p$-norm $r$, with recent work also considering rotational
and/or translational semantic attacks [23, 7]. $L_p$-norm certifications were first demonstrated
by way of differential privacy [20, 11], with more recent approaches employing Rényi divergence
[22], and parametrising worst-case behaviours [8, 33]. By considering the worst-case
$L_2$-perturbations, Cohen et al. (2019) purport that the largest achievable pointwise
certification is
$$r = \frac{\sigma}{2}\left(\Phi^{-1}\left(E_0[\mathbf{x}]\right) - \Phi^{-1}\left(E_1[\mathbf{x}]\right)\right). \tag{3}$$
Here $\{E_0, E_1\}$ are the two largest class expectations (as per Equation (2)), $\sigma$ is the
noise, and $\Phi^{-1}$ is the inverse normal CDF, or Gaussian quantile function.
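As a worked instance of Equation (3), the sketch below evaluates the radius from two given class expectations using the standard library's Gaussian quantile function. The function name and the example values of `e0`, `e1`, and `sigma` are illustrative assumptions; the sketch also takes the expectations as exact inputs and applies no statistical correction for Monte Carlo sampling error.

```python
from statistics import NormalDist


def certified_radius(e0, e1, sigma):
    """Equation (3): r = (sigma / 2) * (Phi^{-1}(E0) - Phi^{-1}(E1))."""
    inv_cdf = NormalDist().inv_cdf  # Gaussian quantile function; needs 0 < p < 1
    return 0.5 * sigma * (inv_cdf(e0) - inv_cdf(e1))


# Illustrative values: the top class wins 90% of the noisy draws, the runner-up 10%.
r = certified_radius(0.9, 0.1, sigma=0.5)
```

With these example inputs the radius is roughly 0.64; it grows with the gap between the two expectations and scales linearly in `sigma`.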
3 Geometrically-informed certified robustness
While the work contained within this paper can be applied generally, for this work we will focus
upon certifications of robustness about $L_2$-norm bounded adversarial perturbations, for which we
assume that the difficulty of attacking a model is proportional to the size of the certification,
based upon the need to evade both human and machine scrutiny [12]. Thus, constructing larger
certifications in such a context is inherently valuable.
[Figure 1, panels: (a) Transitivity; (b) Multiple Transitivity; (c) Boundary Treatment]
Figure 1: Transitive certification exemplars. The Green, Red, and Black circles represent
hyperspheres of radius $r_i$ (by Equation 3) about points $\mathbf{x}_i$, $i \in \{1, 2, 3\}$.
The resulting certifications $r'$, $r''$, and $\hat{r}$ are described within Equations 5, 12,
and 16. The Black line represents the domain boundary.
This specific $L_P$ space is of interest due to both its viability as a defence model, and the
provable guarantee that Cohen et al. produce the largest possible certification for any instance
[8]. Over the remainder of this section we will document how it is possible to improve upon this
provably best-case guarantee by exploiting several properties of certified robustness.
3.1 Exploiting transitivity
While it is provably true that Equation (3) is the largest achievable certification for any point
$\mathbf{x}$, it is possible to exploit the behaviour of points in the neighbourhood of
$\mathbf{x}$ in order to enhance the certifiable radius. To achieve this, consider the case of a
second point $\mathbf{x}'$ that exists within the certifiable radius of $\mathbf{x}$. As both
points must correspond to the same class, it then follows that the union of their regions of
certification can also be considered as a region of certification, leading to Definition 3.1.
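Definition 3.1 (Overlap Properties of Certification). A radius of certification $r_i$ about
$\mathbf{x}_i$ can be calculated by evaluating Equation 3 at $\mathbf{x}_i$. This certification
guarantees that no point $\mathbf{x} : \|\mathbf{x} - \mathbf{x}_i\|_P \leq r_i$ can induce a
change in the predicted class. That this shape is a $d$-dimensional hypersphere for input data
$\mathbf{x} \in \mathbb{R}^d$ allows us to introduce the notational shorthand
$$B_P(\mathbf{x}_i, r_i) = \{\mathbf{x} \in \mathbb{R}^d \mid \|\mathbf{x} - \mathbf{x}_i\|_P \leq r_i\} \qquad S_i = \{\mathbf{x} \in \mathbb{R}^d \mid \|\mathbf{x} - \mathbf{x}_i\|_P = r_i\} \tag{4}$$
to represent the region covered by the hypersphere and its surface. It follows from this
definition that if $B_P(\mathbf{x}_1, r_1) \cap B_P(\mathbf{x}_2, r_2) \neq \emptyset$, which
ensures that the class predictions at $\mathbf{x}_1$ and $\mathbf{x}_2$ match, then the region of
certification about $\mathbf{x}_1$ can be expressed as
$B_P(\mathbf{x}_1, r_1) \cup B_P(\mathbf{x}_2, r_2)$.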
However, typically we are concerned not with the size of the region of classification invariance,
but rather the distance to the nearest adversarial example. If it is possible to find some
$\mathbf{x}'$ such that its region of certification completely encircles that of the certification
at $\mathbf{x}$, the following lemma demonstrates that the certification radius about
$\mathbf{x}$ can be increased.
Lemma 3.2 (Set Unions Certified Radius). If $\mathbf{x}_1$ and $\mathbf{x}_2$ have the same class
associated with them and $B_P(\mathbf{x}_1, r_1) \subset B_P(\mathbf{x}_2, r_2)$, then the nearest
possible adversarial example (and thus, the certifiable radius) exists at a distance $r' \geq r$
from $\mathbf{x}_1$, where
$$r' = r_2 - \|\mathbf{x}_2 - \mathbf{x}_1\|_P. \tag{5}$$
Proof. The closest point on the surface of $B_P(\mathbf{x}_2, r_2)$ to $\mathbf{x}_1$ must exist
on the vector between $\mathbf{x}_1$ and $\mathbf{x}_2$. Thus
$r' = \min\left(r_2 \pm \|\mathbf{x}_2 - \mathbf{x}_1\|_P\right)$, which takes the form of
Equation (5).
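Lemma 3.2 reduces to a containment test followed by a subtraction, which can be sketched as follows for the $L_2$ case. The function name and the fallback of returning $r_1$ when containment fails are assumptions made for this illustration.

```python
import math


def transitive_radius(x1, r1, x2, r2):
    """Return r' = r2 - ||x2 - x1||_2 when B(x1, r1) lies inside B(x2, r2), else r1."""
    gap = math.dist(x1, x2)
    # B(x1, r1) is contained in B(x2, r2) exactly when its farthest point from x2,
    # at distance gap + r1, is still within r2. Containment implies the two
    # certified regions overlap, so both centres share a predicted class.
    if gap + r1 <= r2:
        return r2 - gap  # Equation (5)
    return r1


enclosed = transitive_radius([0.5, 0.5], 0.1, [0.6, 0.5], 0.3)      # improves to about 0.2
not_enclosed = transitive_radius([0.5, 0.5], 0.1, [0.9, 0.5], 0.3)  # stays at 0.1
```

In the first call the larger ball swallows the smaller one, so the certificate at the first point doubles; in the second the balls are too far apart and the original radius is kept.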
As such, we can recast the task of constructing a certification from being a strictly analytic
function to the nonlinear optimisation problem in terms of a second ball with certified radius
$r_2$ centred at $\mathbf{x}_2$
$$r' = \max_{\mathbf{x}_2 \in [0,1]^d} r_2 - \|\mathbf{x}_2 - \mathbf{x}_1\|_P \tag{6}$$
with Figure 1a providing a two-dimensional exemplar. Crucially, the above formalism does not
require obtaining a global optimum, as any $r' > r_1$ yields an improved certification at
$\mathbf{x}_1$.
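Because any feasible $\mathbf{x}_2$ with $r' > r_1$ already helps, even a crude search can illustrate Equation (6). The sketch below uses naive random search, swapped in purely for illustration and not claimed to be the optimiser used in the paper. `toy_radius` is a hypothetical replacement for evaluating Equation (3) on a real model, and candidates are restricted to the certified ball around $\mathbf{x}_1$ so that the class at $\mathbf{x}_2$ is guaranteed to match.

```python
import math
import random


def improve_by_random_search(x1, r1, radius_fn, n_trials, step, rng):
    """Search for x2 in [0, 1]^d maximising r2 - ||x2 - x1||_2 (Equation 6)."""
    best = r1
    for _ in range(n_trials):
        # Perturb x1 and clip the candidate back into the input domain [0, 1]^d.
        x2 = [min(1.0, max(0.0, xk + rng.uniform(-step, step))) for xk in x1]
        gap = math.dist(x1, x2)
        if gap > r1:
            continue  # stay inside B(x1, r1) so that x2 shares the class of x1
        # Any value above r1 implies r2 >= r1 + gap, i.e. B(x1, r1) sits inside B(x2, r2).
        best = max(best, radius_fn(x2) - gap)
    return best


def toy_radius(x):
    # Hypothetical radius; never exceeds the distance |x[0] - 0.5| to the toy boundary.
    return 2.0 * (x[0] - 0.5) ** 2


rng = random.Random(0)
x1 = [0.2, 0.5]
r1 = toy_radius(x1)
best = improve_by_random_search(x1, r1, toy_radius, n_trials=500, step=r1, rng=rng)
```

The search never returns less than `r1`, which mirrors the observation above that a global optimum is not required.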
3.2 Multiple transitivity
To further enhance our ability to certify, let us consider the set of points and their associated
certifications $\{\mathbf{x}_1, r_1, \mathbf{x}_2, r_2, \ldots, \mathbf{x}_n, r_n\}$. If the union
$\hat{B}_P = \bigcup_{i \in \{1, \ldots, n\}} B_P(\mathbf{x}_i, r_i)$ is simply-connected, then
the certification across this set can be expressed as
$r^{(n)\prime} = \min_{\mathbf{x} \in \partial\hat{B}_P} \|\mathbf{x} - \mathbf{x}_1\|_P$, where
$\partial\hat{B}_P$ is the boundary of $\hat{B}_P$. This can be further simplified by imposing
that $\mathbf{x}_j \in B_P(\mathbf{x}_2, r_2) \;\forall\, j > 2$ and that
$B_P(\mathbf{x}_j, r_j) \not\subset B_P(\mathbf{x}_k, r_k) \;\forall\, (j > 2, k > 2)$ to ensure
that hyperspheres exist near the boundary of $S_2$ and yielding a certification of
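$$r^{(n)\prime} = \min_{\mathbf{x} \in S_n} \|\mathbf{x} - \mathbf{x}_1\| \quad \text{where} \quad S_n = S_2 \cap \left(\bigcup_{i=3}^{n+1} S_i\right) \text{ for } n \geq 2. \tag{7}$$
Here $S_n$ is a $(d-1)$-dimensional manifold embedded in $\mathbb{R}^d$.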
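Lemma 3.3 (Optimal positioning of $\mathbf{x}_3$ in the case of $n = 3$). Consider the addition of
a new hypersphere at some point $\mathbf{x}_3$ with associated radius $r_3$, which has an
associated boundary $S_3$. If it is true that
$$B_P(\mathbf{x}_3, r_3) \not\subset B_P(\mathbf{x}_2, r_2) \quad \text{and} \quad B_P(\mathbf{x}_3, r_3) \cap B_P(\mathbf{x}_2, r_2) \neq \emptyset \tag{8}$$
$$B_P(\mathbf{x}_3, r_3) \not\subset B_P(\tilde{\mathbf{x}}, \tilde{r}) \;\; \forall \{\tilde{\mathbf{x}} \in [0,1]^d \mid \tilde{\mathbf{x}} \neq \mathbf{x}_3\} \quad \text{with} \quad \tilde{r} = \frac{\sigma}{2}\left(\Phi^{-1}(E_0[\tilde{\mathbf{x}}]) - \Phi^{-1}(E_1[\tilde{\mathbf{x}}])\right), \tag{9}$$
then the largest possible certification $r''$ by Equation 7 is achieved at
$$\mathbf{x}_3(s) = \mathbf{x}_1 + s\, r' \frac{\mathbf{x}_1 - \mathbf{x}_2}{\|\mathbf{x}_1 - \mathbf{x}_2\|_2} \quad \text{for some } s \in [0, 1]. \tag{10}$$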
Proof. The closest point to $\mathbf{x}_1$ upon $S_2$ is located at
$$\breve{\mathbf{x}} = \mathbf{x}_1 + r' \frac{\mathbf{x}_1 - \mathbf{x}_2}{\|\mathbf{x}_1 - \mathbf{x}_2\|_2}, \tag{11}$$
where $r'$ is defined by Equation 6. Thus any improved radius of certification $r'' > r'$ is only
achievable if $B_P(\mathbf{x}_3, r_3)$ satisfies $\breve{\mathbf{x}} \in B_P(\mathbf{x}_3, r_3)$
and Equation 8. Then by symmetry, $r''$ is the maximally achievable radius of certification if
Equation 9 holds and if $\mathbf{x}_3$ is defined by Equation 10.
While finding some $\mathbf{x}_3$ satisfying Equations 8 and 10 is trivial, proving Equation 9
would require an exhaustive search of the input space $[0,1]^d$. However, even in the absence of
such a search, Equation 10 still provides the framework for a simple search for $\mathbf{x}_3$,
which follows Figure 1b.
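The one-dimensional family of candidates in Equation (10) is easy to enumerate. The sketch below places $\mathbf{x}_3(s)$ on the ray leaving $\mathbf{x}_1$ directly away from $\mathbf{x}_2$, with $s = 1$ recovering the nearest surface point of Equation (11). The function name, the example coordinates, and the five-point grid over $s$ are assumptions for illustration.

```python
import math


def x3_candidate(x1, x2, r_prime, s):
    """Equation (10): x3(s) = x1 + s * r' * (x1 - x2) / ||x1 - x2||_2."""
    norm = math.dist(x1, x2)
    return [a + s * r_prime * (a - b) / norm for a, b in zip(x1, x2)]


x1, x2, r2 = [0.5, 0.5], [0.6, 0.5], 0.3
r_prime = r2 - math.dist(x1, x2)                 # Equation (5)
nearest = x3_candidate(x1, x2, r_prime, 1.0)     # Equation (11): closest point on S2
candidates = [x3_candidate(x1, x2, r_prime, k / 4) for k in range(5)]
```

Each candidate would then be scored by evaluating its own certified radius, which reduces the search for $\mathbf{x}_3$ to a sweep over the scalar $s$.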
Lemma 3.4 (Certification from two eccentric hyperspheres). If $\mathbf{x}_3(s)$ is defined by
Equation 10 in a fashion that satisfies Equation 8, then an updated certification can be achieved
by way of
$$r'' = \max_{s \in [0,1]} \sqrt{\frac{d_2\left(r_3^2 - d_3^2\right) + d_3\left(r_2^2 - d_2^2\right)}{d_2 + d_3}} \tag{12}$$
where
$$d_2 = \|\mathbf{x}_2 - \mathbf{x}_1\| \qquad d_3 = \|\mathbf{x}_3(s) - \mathbf{x}_1\| \qquad r_3 = \frac{\sigma}{2}\left(\Phi^{-1}\left(E_0[\mathbf{x}_3(s)]\right) - \Phi^{-1}\left(E_1[\mathbf{x}_3(s)]\right)\right).$$
If Equation 9 holds, then this is the largest achievable certification for $n = 3$.
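The closed form in Equation (12) can be evaluated directly and swept over $s$, as in the sketch below. The function names, the grid over $s$, the constant `radius_fn` standing in for Equation (3), and the explicit check that the two surfaces actually meet are all assumptions added for this illustration; the sketch skips any candidate whose surface does not cross $S_2$.

```python
import math


def two_sphere_radius(d2, r2, d3, r3):
    """Expression inside the max of Equation (12) for one fixed s."""
    return math.sqrt((d2 * (r3**2 - d3**2) + d3 * (r2**2 - d2**2)) / (d2 + d3))


def surfaces_intersect(d2, r2, d3, r3):
    # x3(s) lies on the ray from x2 through x1, so the centres are d2 + d3 apart.
    return abs(r2 - r3) <= d2 + d3 <= r2 + r3


def sweep_s(x1, x2, r2, radius_fn, n_steps):
    """Grid search over s in (0, 1]; falls back to r' if no candidate helps."""
    d2 = math.dist(x1, x2)
    r_prime = r2 - d2
    best = r_prime
    for k in range(1, n_steps + 1):
        s = k / n_steps
        x3 = [a + s * r_prime * (a - b) / d2 for a, b in zip(x1, x2)]  # Equation (10)
        d3, r3 = math.dist(x3, x1), radius_fn(x3)
        if surfaces_intersect(d2, r2, d3, r3):
            best = max(best, two_sphere_radius(d2, r2, d3, r3))
    return best


x1, x2, r2 = [0.5, 0.5], [0.6, 0.5], 0.3
r_prime = r2 - math.dist(x1, x2)
# Hypothetical constant radius of 0.25 at every candidate x3(s).
best = sweep_s(x1, x2, r2, radius_fn=lambda x: 0.25, n_steps=4)
```

Geometrically, the expression is the distance from $\mathbf{x}_1$ to the set where the two sphere surfaces cross, which is the nearest point of the combined boundary when $\mathbf{x}_3(s)$ is positioned as in Equation (10).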
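Proof. By symmetry we can define the arbitrary rotational mapping
$f : \mathbb{R}^d \to \mathbb{R}^d$ from $\mathbf{x}_i \mapsto \mathbf{y}_i$ by way of
$\mathbf{y}_i = f(\mathbf{x}_i - \mathbf{x}_1)$, subject to the condition
$$y_{i,j} \neq 0 \text{ for } i \in \{1, 2\} \text{ and } j \neq k, \text{ for some } k \text{ such that } j, k \in \{1, 2, \ldots, d\}. \tag{13}$$
Then the intersection of the hyperspheres centred about $\mathbf{x}_2$ and $\mathbf{x}_3$ occurs at
$$\begin{aligned} r_3^2 - r_2^2 &= \|\mathbf{y} - \mathbf{y}_3\|^2 - \|\mathbf{y} - \mathbf{y}_2\|^2 \\ &= 2 y_{2,k}(d_2 + d_3) + d_3^2 - d_2^2 \\ 2 y_{2,k} &= \frac{r_3^2 - r_2^2 + d_2^2 - d_3^2}{d_2 + d_3}. \end{aligned} \tag{14}$$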