
class scores at the input point, no better radius is in general possible. Despite this, such certifications
fail to use readily available—yet still local—information: the certifiability of points nearby to the
input of interest. The key insight of this work is that these neighbourhood points may generate
certified regions large enough to completely enclose that of a sample point, improving the radius of
certification. This process can be extended to use the intersection of the regions of certification of
multiple points, and the nature of the input domain itself to generate larger certifications. This leads
to our main contribution—Geometrically-Informed Certified Robustness—that enjoys certifications
exceeding those of the hitherto best-case guaranteed approach of Cohen et al. (2019) [8].
2 Background and literature review
Bounding mechanisms
Conservative bounds upon the impact of norm-bounded perturbations can
be constructed by way of either Interval Bound Propagation (IBP), which propagates interval bounds
through the model; or Convex Relaxation, which utilises linear relaxations to construct bounding output
polytopes over bounded input perturbations [34, 26, 41, 45, 46, 37, 28], in a manner that generally
provides tighter bounds than IBP [25]. In contrast to Randomised Smoothing, bounding mechanisms
employ augmented loss functions during training, which promote tight output bounds [42] at the cost
of decreased applicability. Moreover, both approaches exhibit a time and memory complexity that makes
them infeasible for complex model architectures or high-dimensional data [40, 6, 21].
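As a rough illustration of the interval arithmetic underlying IBP, the sketch below propagates elementwise bounds through a toy affine-ReLU network; the helper names and the randomly initialised network are assumptions for illustration only, not drawn from the cited works.

```python
import numpy as np

def ibp_affine(lower, upper, W, b):
    """Propagate elementwise interval bounds [lower, upper] through x -> W @ x + b."""
    centre = (upper + lower) / 2.0
    radius = (upper - lower) / 2.0
    new_centre = W @ centre + b
    new_radius = np.abs(W) @ radius
    return new_centre - new_radius, new_centre + new_radius

def ibp_relu(lower, upper):
    """ReLU is monotone, so interval bounds pass through elementwise."""
    return np.maximum(lower, 0.0), np.maximum(upper, 0.0)

# Bounds on the logits of a tiny two-layer network under an L-infinity
# perturbation of radius eps about the input x.
rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(8, 4)), np.zeros(8)
W2, b2 = rng.normal(size=(3, 8)), np.zeros(3)
x, eps = rng.normal(size=4), 0.1

l, u = ibp_affine(x - eps, x + eps, W1, b1)
l, u = ibp_relu(l, u)
l, u = ibp_affine(l, u, W2, b2)  # l, u now bound every logit over the perturbation ball
```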
Randomised smoothing
Outside of bounding mechanisms, another common framework for developing
certifications leverages randomised smoothing [20], in which noise is applied to input
instances to smooth model predictions, subject to a sampling distribution that is tied to the $L_p$-norm
of the adversarial perturbations being certified against. In contrast to other robustness mechanisms, this
application of noise is the only architectural change required to achieve certification. In the
case of $L_2$-norm bounded attacks, Gaussian sampling of the form
$$x'_i = x + y_i \quad \text{where} \quad y_i \overset{\text{i.i.d.}}{\sim} \mathcal{N}(0, \sigma^2) \quad \forall i \in \{1, \ldots, N\} \tag{1}$$
is employed for all test-time instances. These $N$ samples are then used to estimate the expected
output of the predicted class of $x$ by way of the Monte-Carlo estimator
$$\mathbb{E}_Y\big[\arg\max f_\theta(x + Y) = i\big] \approx \frac{1}{N} \sum_{j=1}^{N} \mathbf{1}\big[\arg\max f_\theta(x'_j) = i\big]. \tag{2}$$
While this Monte-Carlo estimation of output expectations under randomised smoothing is a test-time
process, model sensitivity to random perturbations may be decreased by performing adversarial
training upon such random perturbations. To mitigate the computational expense of large sample
sizes $N$ during each training update, training typically employs single draws from the noise distribution.
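A minimal sketch of the Monte-Carlo estimator in Equation (2) is shown below; the function name, the generic classifier `f_theta` (assumed to return a vector of logits for a single input), and its interface are illustrative assumptions rather than part of any cited implementation.

```python
import numpy as np

def smoothed_class_expectations(f_theta, x, sigma, n_samples, n_classes, rng=None):
    """Monte-Carlo estimate of E_Y[arg max f_theta(x + Y) = i] for each class i,
    with Y ~ N(0, sigma^2 I) drawn as in Equation (1)."""
    if rng is None:
        rng = np.random.default_rng()
    counts = np.zeros(n_classes)
    for _ in range(n_samples):
        noisy = x + rng.normal(scale=sigma, size=x.shape)  # x'_j = x + y_j
        counts[np.argmax(f_theta(noisy))] += 1             # 1[arg max f_theta(x'_j) = i]
    return counts / n_samples                              # Equation (2)
```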
Smoothing-based certifications
Based on randomised smoothing, certified robustness can guarantee
classification invariance for additive perturbations up to some $L_p$-norm $r$, with recent work also
considering rotational and/or translational semantic attacks [23, 7]. $L_p$-norm certifications were first
demonstrated by way of differential privacy [20, 11], with more recent approaches employing Rényi
divergence [22] and parametrising worst-case behaviours [8, 33]. By considering the worst-case
$L_2$-perturbations, Cohen et al. (2019) show that the largest achievable pointwise certification is
$$r = \frac{\sigma}{2}\left(\Phi^{-1}\big(E_0[x]\big) - \Phi^{-1}\big(E_1[x]\big)\right). \tag{3}$$
Here $\{E_0, E_1\}$ are the two largest class expectations (as per Equation (2)), $\sigma$ is the noise scale, and $\Phi^{-1}$
is the inverse normal CDF, or Gaussian quantile function.
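Evaluating Equation (3) is straightforward once the class expectations have been estimated. The sketch below assumes SciPy's `norm.ppf` for the Gaussian quantile function; note that Cohen et al. [8] certify with respect to confidence bounds on the estimated class probabilities, a step omitted from this illustration.

```python
from scipy.stats import norm

def certified_radius(e0, e1, sigma):
    """L2 certified radius of Equation (3): r = (sigma / 2) * (Phi^{-1}(E0) - Phi^{-1}(E1)).
    Returns 0 when no certification is possible (E0 <= E1)."""
    if e0 <= e1:
        return 0.0
    return 0.5 * sigma * (norm.ppf(e0) - norm.ppf(e1))

# For example, with E0 = 0.9, E1 = 0.05 and sigma = 0.5:
# r = 0.25 * (1.2816 - (-1.6449)) ≈ 0.73
```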
3 Geometrically-informed certified robustness
While the work contained within this paper can be applied more generally, we focus upon
certifications of robustness against $L_2$-norm bounded adversarial perturbations. In this setting we assume
that the difficulty of attacking a model is proportional to the size of the certification, based upon the
need to evade both human and machine scrutiny [12]. Thus, constructing larger certifications in such
a context is inherently valuable.