Towards Global Neural Network Abstractions with
Locally-Exact Reconstruction
Edoardo Manino (a,*), Iury Bessa (b), Lucas Cordeiro (a)
(a) University of Manchester, Department of Computer Science, Oxford Road, Manchester M13 9PL, United Kingdom
(b) Federal University of Amazonas, Department of Electricity, Avenida General Rodrigo Octávio Jordão Ramos, 1200 - Coroado I, Manaus - AM, 69067-005, Brazil
* Corresponding author. Email addresses: edoardo.manino@manchester.ac.uk (Edoardo Manino), iurybessa@ufam.edu.br (Iury Bessa), lucas.cordeiro@manchester.ac.uk (Lucas Cordeiro)
Abstract
Neural networks are a powerful class of non-linear functions. However, their
black-box nature makes it difficult to explain their behaviour and certify
their safety. Abstraction techniques address this challenge by transforming
the neural network into a simpler, over-approximated function. Unfortu-
nately, existing abstraction techniques are slack, which limits their applica-
bility to small local regions of the input domain. In this paper, we propose
Global Interval Neural Network Abstractions with Center-Exact Reconstruc-
tion (GINNACER). Our novel abstraction technique produces sound over-
approximation bounds over the whole input domain while guaranteeing exact
reconstructions for any given local input. Our experiments show that GIN-
NACER is several orders of magnitude tighter than state-of-the-art global
abstraction techniques, while being competitive with local ones.
Keywords: Neural Networks, Abstract Interpretation, Global Abstraction.
1. Introduction
Neural networks have been applied to a large number of predictive tasks.
The main reason for their versatility is their ability to approximate a large
class of nonlinear functions [1, 2]. However, at the same time, trained neural
networks tend to have a black-box nature, which limits our ability to explain
their behaviour [3]. More worryingly, even when we have a priori formal
expectations on their behaviour, e.g., robustness to input noise, the act of
certifying whether a neural network satisfies our requirements comes at a
very high computational cost [4, 5].
A common strategy in dealing with the complexity of analysing neural
networks is transforming them into equivalent models described by better-
behaved classes of functions. Examples from theoretical works include max-
affine splines [6], residual networks with one neuron per layer [7], prototypical
“sawtooth” rectified linear unit (ReLU) networks [8, 9] and Radon or ridgelet
integral representations [10, 11]. However, the purpose of these transforma-
tions is to prove properties over whole classes of network architectures, e.g.,
the existence of approximating networks with given error bounds, rather than
providing a constructive algorithm that outputs such a network.
In stark contrast with these works, there exists a variety of approximate
techniques that can compress the size of a given neural network. Pruning
techniques can remove network parameters whose contribution is negligi-
ble [12, 13]. Distillation techniques train a simpler network to approximate
the output of the hidden layers of the original network [14]. Quantisation
techniques reduce the accuracy of arithmetic operations to take advantage
of hardware acceleration [15]. Frequently, these techniques are combined to
minimise the computational cost of executing a neural network in terms of
power, latency or memory [16, 17]. However, they offer no formal guarantees
on the approximation errors introduced during the compression process.
Ideally, we would like to compress any given neural network while keeping
formal guarantees on the original model. A recent development in such di-
rection is lossless compression [18, 19, 20]. The algorithms in this family can
identify which ReLU activation functions in the network are always active or
inactive and remove them accordingly. Unfortunately, lossless algorithms can
only produce equivalent networks; thus, their compression scope is somewhat
limited [21].
More promising techniques can be grouped under the umbrella term of
abstractions [22, 23]. An abstraction is a set-valued mapping, i.e., a function
whose output is a set instead of a point. In particular, this output set is
guaranteed to be an over-approximation of the original and more complex
function. More formally, for a given abstraction’s domain of validity, the im-
age of the original function is always within the abstraction’s output set [24].
Depending on that domain of validity, we discriminate between local and
global abstractions.
On the one hand, local abstractions are valid for a small region of the
input domain only [25, 26, 27, 28]. As such, they are suitable for verifying
the robustness of neural networks to local perturbations. Typically, their
design is neuron-centric: each activation function is over-approximated by
bounding its output between a set of linear [29] or quadratic constraints [26].
Still, these abstractions’ main goal is not to reduce the network complexity
but to encode it into a set of constraints that a solver can handle.
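As a concrete illustration of this neuron-centric style, the sketch below implements the standard triangle relaxation of a single ReLU, assuming its pre-activation bounds $l < 0 < u$ are already known. It is a generic example of the linear constraints in [29], not the specific encoding of any particular tool.

```python
import numpy as np

def relu_triangle_relaxation(l, u):
    """Linear bounds for y = max(0, x) on [l, u], assuming l < 0 < u.

    Returns (a_up, b_up, a_lo, b_lo) such that
        a_lo * x + b_lo <= max(0, x) <= a_up * x + b_up   for every x in [l, u].
    """
    a_up = u / (u - l)          # upper bound: chord through (l, 0) and (u, u)
    b_up = -a_up * l
    # One common lower-bound choice: the tighter of y = 0 and y = x.
    a_lo, b_lo = (1.0, 0.0) if u >= -l else (0.0, 0.0)
    return a_up, b_up, a_lo, b_lo

# Sanity check on a grid of pre-activation values.
l, u = -2.0, 3.0
a_up, b_up, a_lo, b_lo = relu_triangle_relaxation(l, u)
x = np.linspace(l, u, 101)
y = np.maximum(0.0, x)
assert np.all(a_lo * x + b_lo <= y + 1e-9) and np.all(y <= a_up * x + b_up + 1e-9)
```

The lower-bound selection above is the usual area-minimising heuristic; other relaxations, e.g. the quadratic constraints of [26], trade tightness for solver support.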
On the other hand, global abstractions are valid for any input [30, 31].
In this regard, Prabhakar and Afzal [30] propose to transform the original
neural network so that it can operate in interval arithmetic [32] and main-
tain an upper and lower bound on the value of each neuron. Furthermore, by
over-approximating the individual intervals, the authors can merge similar
neurons and reduce the overall complexity of the global abstraction func-
tion. Similarly, Elboher et al. [31] introduce a global abstraction algorithm
that creates multiple copies of each neuron and then merges them together.
The advantage of their algorithm is that the abstraction function is itself an-
other neural network, which grants a high degree of integration with existing
verification tools [33].
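To give a flavour of the interval-arithmetic view, the following sketch propagates an input box through a fully-connected ReLU network. It only illustrates the underlying mechanism, not the neuron-merging constructions of [30] or [31], and the toy network weights are made up for the example.

```python
import numpy as np

def interval_forward(weights, biases, x_lo, x_hi):
    """Propagate an input box [x_lo, x_hi] through a fully-connected ReLU network.

    weights: list of (out_dim, in_dim) arrays; biases: list of (out_dim,) arrays.
    Returns sound element-wise output bounds (y_lo, y_hi).
    """
    lo, hi = np.asarray(x_lo, float), np.asarray(x_hi, float)
    for k, (W, b) in enumerate(zip(weights, biases)):
        W_pos, W_neg = np.maximum(W, 0.0), np.minimum(W, 0.0)
        # Interval matrix-vector product: positive weights take the matching bound,
        # negative weights take the opposite one.
        new_lo = W_pos @ lo + W_neg @ hi + b
        new_hi = W_pos @ hi + W_neg @ lo + b
        if k < len(weights) - 1:            # ReLU on hidden layers only
            new_lo, new_hi = np.maximum(new_lo, 0.0), np.maximum(new_hi, 0.0)
        lo, hi = new_lo, new_hi
    return lo, hi

# Toy 2-2-1 network, purely illustrative.
Ws = [np.array([[1.0, -1.0], [0.5, 2.0]]), np.array([[1.0, -1.0]])]
bs = [np.array([0.0, -1.0]), np.array([0.5])]
print(interval_forward(Ws, bs, [-1.0, -1.0], [1.0, 1.0]))
```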
A significant drawback of local and global abstraction techniques is their
slackness [34, 31]. In general, there is a trade-off between the complexity of
the abstracted neural network and the size of the over-approximation margin
it produces. In practice, even abstracting a few neurons may generate large
output sets that become meaningless. This phenomenon is not surprising:
established techniques from control theory, like Taylor approximations [35]
and Bernstein polynomials [36], are known to suffer from the same over-
approximation explosion when applied to highly non-linear functions.
This paper proposes an algorithm to compute Global Interval Neural
Network Abstractions with Center-Exact Reconstruction (GINNACER). Our
approach to the complexity-approximation trade-off is novel: we focus on
guaranteeing exact reconstruction for a given input centroid while keeping
sound over-approximation bounds elsewhere. This way, we obtain tighter
global abstractions in a specific local region of the input. Empirically, our
GINNACER abstractions are orders of magnitude tighter than state-of-the-
art global abstractions, while remaining on par with local abstractions.
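This contract can be stated operationally: the abstraction must bound the original network everywhere and collapse to an exact reconstruction at the chosen centroid. The snippet below is only an empirical sanity check of that contract (sampling can falsify soundness but never prove it); the function names are illustrative and do not come from the paper's code.

```python
import numpy as np

def check_contract(f, g_lo, g_hi, x0, sample_input, n_samples=10_000, tol=1e-6):
    """Probe the two GINNACER requirements for a candidate abstraction (g_lo, g_hi).

    1. Global soundness: g_lo(x) <= f(x) <= g_hi(x) on every sampled input.
    2. Center-exact reconstruction: g_lo(x0) == f(x0) == g_hi(x0).
    """
    x0 = np.asarray(x0, dtype=float)
    exact = (np.allclose(g_lo(x0), f(x0), atol=tol)
             and np.allclose(g_hi(x0), f(x0), atol=tol))
    for _ in range(n_samples):
        x = sample_input()
        y = f(x)
        if np.any(g_lo(x) > y + tol) or np.any(y > g_hi(x) + tol):
            return False, exact   # soundness violated at x
    return True, exact

# Deliberately trivial toy abstraction of f(x) = sin(x), exact at x0 = 1.
f    = lambda x: np.sin(x)
g_lo = lambda x: np.sin(x) - np.abs(x - 1.0)
g_hi = lambda x: np.sin(x) + np.abs(x - 1.0)
print(check_contract(f, g_lo, g_hi, x0=1.0,
                     sample_input=lambda: np.random.uniform(-5, 5)))
```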
More specifically, our contributions are the following:
• We introduce the Inactive Canonical Form (ICF), an exact transformation of ReLU networks where all activation functions are inactive for a specific input centroid.
• We introduce a layer-wise abstraction that can handle interval inputs and outputs and uses fewer ReLU activations than the original layer.
• We introduce the GINNACER algorithm, which computes an individual abstraction for each layer of the network. The abstracted layers are aligned so that their output intervals have size zero for a specific input centroid.
• We compare GINNACER with state-of-the-art global and local abstractions. Our comparison is not dependent on a downstream task (e.g., robustness verification) like in previous studies but measures the quality of the abstraction in isolation.
The rest of this paper is organized as follows. In Section 2 we clarify the
differences between GINNACER and the existing literature. In Section 3 we
introduce a motivating example and formally state the GINNACER require-
ments. In Section 4, we present our GINNACER algorithm and provide proof
of its theoretical properties. In Section 5, we compare GINNACER against
the state-of-the-art and study its empirical properties. Finally, in Section 6,
we discuss the potential uses of GINNACER, the connections with related
research fields and a roadmap for future work. The data, neural networks
and code of our experiments are freely available at [37].
2. Related Work
Due to the functional complexity of modern neural networks, there have
been many efforts to cast them to a simpler class of functions. The pur-
pose and application of these neural transformation methods are very dif-
ferent in nature. Here, we propose a high-level taxonomy that focuses on
the mathematical relationship between the original neural network $f$ and its
transformed version $g$.
First, we have neural network approximations, where $f(x) \approx g(x)$. In this
category, neural network compression methods are the most prominent [16,
17]. Their goal is to reduce the computational requirements of the neural
network at inference time. These methods include pruning individual weights
or neurons [12, 13], executing the neural network in integer arithmetic or
any similar quantized representation [15], or training a smaller network to
mimic the hidden representations of the original one [14]. A more recent
development in the approximation category comes from privacy-preserving
machine learning, where it has been proposed to approximate the non-linear
activation functions of a neural network with high-degree polynomials [38,
39]. This approximation allows inference on private inputs via homomorphic
encryption. In general, neural network approximations give no guarantees
on the approximation error and are usually evaluated in terms of the loss of
predictive accuracy with respect to the original network.
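As a concrete instance of such an approximation, a low-degree polynomial surrogate for the sigmoid can be obtained by a least-squares fit over the input range expected at inference time. The range and degree below are arbitrary choices for illustration, and the only quality measure is the empirical maximum error on that range, not a formal bound.

```python
import numpy as np

# Fit a degree-3 polynomial to the sigmoid over an assumed input range [-6, 6].
x = np.linspace(-6.0, 6.0, 1001)
sigmoid = 1.0 / (1.0 + np.exp(-x))
coeffs = np.polyfit(x, sigmoid, deg=3)   # least-squares fit
poly = np.polyval(coeffs, x)

# Only an empirical error estimate: nothing is guaranteed outside [-6, 6].
print("max |sigmoid - poly| on [-6, 6]:", float(np.max(np.abs(sigmoid - poly))))
```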
Second, we have equivalent transformations, where $f(x) = g(x)$, $\forall x$. On
the one hand, some of these efforts are theoretical in nature, and their pur-
pose is proving some general mathematical properties of neural networks [6].
On the other hand, there have been efforts towards constructing lossless com-
pression algorithms for neural networks [18, 19, 20]. These algorithms can
identify the neurons whose activation function state is either active or in-
active for all possible inputs x. Then, these neurons can be safely replaced
with a linear function or removed altogether. However, only a minority of
the neurons in the original network can be processed in this way.
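A common way to detect such stably active or inactive neurons is to bound each pre-activation over the whole input domain, e.g. with interval arithmetic as in the earlier sketch. The layer-level check below illustrates the idea; the thresholding logic is generic and not the exact procedure of [18, 19, 20].

```python
import numpy as np

def stable_relu_mask(W, b, in_lo, in_hi):
    """Classify each ReLU of a dense layer given an input box [in_lo, in_hi].

    Returns boolean masks (always_active, always_inactive); neurons in neither
    mask are unstable and must keep their non-linearity.
    """
    W_pos, W_neg = np.maximum(W, 0.0), np.minimum(W, 0.0)
    pre_lo = W_pos @ in_lo + W_neg @ in_hi + b   # lower pre-activation bound
    pre_hi = W_pos @ in_hi + W_neg @ in_lo + b   # upper pre-activation bound
    always_active = pre_lo >= 0.0    # ReLU acts as the identity: replace by a linear map
    always_inactive = pre_hi <= 0.0  # ReLU always outputs zero: neuron can be removed
    return always_active, always_inactive

# Toy layer with one stably active, one stably inactive and one unstable neuron.
W = np.array([[2.0, 0.0], [-1.0, -1.0], [0.3, -0.2]])
b = np.array([1.0, -3.0, 0.0])
print(stable_relu_mask(W, b, np.array([0.0, 0.0]), np.array([1.0, 1.0])))
```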
Third, we have global functional abstractions, where $\underline{g}(x) \leq f(x) \leq \overline{g}(x)$, $\forall x$. The main challenge is maintaining a valid upper and lower bound
for any $x$ in the input domain. Prabhakar et al. [30] achieve this by al-
lowing the entire neural network to operate in interval arithmetic. Instead,
Elboher et al. [31] propose to keep explicit copies of each neuron that carry
the upper and lower bound information. Both methods can merge neurons
with similar weights, thus yielding a pair of functions $\underline{g}, \overline{g}$ that contains fewer
non-linearities. However, the functions $\underline{g}, \overline{g}$ produced by the aforementioned
methods tend to be quite slack, as we show in our experiments (see Sec-
tions 3.1 and 5.3). In contrast, GINNACER yields tighter abstractions while
maintaining global soundness.
Fourth, we have local functional abstractions, where $\underline{g}(x) \leq f(x) \leq \overline{g}(x)$
for a small sub-domain $x \in \mathcal{D}$. These have the advantage that they only need
to represent a small portion of the original function f. Traditionally, non-
linear systems have been analysed with Taylor models [40], which give a local
Taylor polynomial approximation and a corresponding worst error interval.
This corpus of techniques has been applied to neural networks with a fair de-
gree of success [35, 41]. In contrast, more recent advances in local functional
abstractions for neural networks have been driven by linear techniques. For