Partially Oblivious Neural Network Inference
Panagiotis Rizomiliotis, Christos Diou, Aikaterini Triakosia,
Ilias Kyrannas and Konstantinos Tserpes
Department of Informatics and Telematics,
Harokopio University of Athens
Omirou 9, 17778, Athens, Greece
{prizomil,cdiou,ktriakos,kyrannas,tserpes}@hua.gr
Abstract
Oblivious inference is the task of outsourcing an ML
model, such as a neural network, without disclosing critical
and sensitive information, like the model's parameters.
One of the most prominent solutions for secure oblivious
inference is based on powerful cryptographic
tools, like Homomorphic Encryption (HE) and/or
multi-party computation (MPC). Even though the implementation
of oblivious inference systems has improved
impressively over the last decade, there are
still significant limitations on the ML models that they
can practically support, especially when both the
ML model's and the input data's confidentiality must
be protected. In this paper, we introduce the notion of
partially oblivious inference. We empirically show that
for neural network models, like CNNs, some information
leakage can be acceptable. We therefore propose a
novel trade-off between security and efficiency. In our
research, we investigate the impact of partial leakage of
the CNN model's weights on security and inference
runtime performance. We experimentally demonstrate
that in a CIFAR-10 network we can leak up to
80% of the model's weights with practically no security
impact, while the necessary HE multiplications are
performed four times faster.
1 Introduction
Artificial intelligence (AI), and in particular machine
learning (ML) technology, is transforming almost every
business in the world. ML provides the ability to obtain
deep insights from data sets and to create models
that outperform any human or expert system in critical
tasks, like face recognition, medical diagnosis, and
financial prediction. Many companies offer such ML-based
operations as a service (Machine Learning as a
Service, MLaaS). MLaaS allows clients to benefit
from ML models without the cost of building and
maintaining an in-house ML system. There are three
parties involved in the transaction: the data owner,
the model owner, and the infrastructure provider.
However, the use of ML models raises crucial security
and privacy concerns. The data set used for
ML model training and/or the MLaaS client's input
in the inference phase can leak sensitive personal or
business information. To complete the picture of security
threats, in several applications, such as medical or
financial ones, the ML models are considered the MLaaS
provider's intellectual property, and they must be protected.
Oblivious inference is the task of running an ML
model without disclosing the client's input, the model's
prediction, and/or while protecting the ownership of the
trained model. This field of research is also referred to
as privacy-preserving machine learning (PPML).
Several solutions for oblivious inference have been
proposed that utilize powerful cryptographic tools, like
Multi-Party Computation (MPC) primitives and
Homomorphic Encryption (HE) schemes. MPC-based
protocols facilitate the computation of an arbitrary
function on private inputs from multiple parties. These
protocols have significant communication overhead, as
they are highly interactive. On the other hand, HE
cryptography allows computations to be performed on
encrypted data, but with significant computation and
storage overhead.
Several PPML schemes have been proposed that are
either based solely on one of these technologies or
leverage a combination of them (hybrid schemes).
So far, the literature has focused on two attack models: it
is assumed either that the model owner is also the infrastructure
provider or that the ML model in use is publicly known.
This is a reasonable choice,
as in both cases the ML model's weights can be used
in plaintext form. That is, the scheme designers
avoid expensive computations between ciphertexts and
thus introduce inference systems that are practical.
In this paper, we consider use cases in which the
ML model's confidentiality must be protected. The
service provider wants to outsource the ML prediction
computation (for instance, to a cloud provider or to an
edge device). However, the ML model constitutes intellectual
property, and its privacy must be preserved.
Protecting both the client's input data and the
model's privacy can prohibitively increase the computational
complexity, as all the computations must be
performed on encrypted data. As a rough
estimate, the runtime of a single HE multiplication
is about ten times higher when it is performed between two
ciphertexts than when it is performed between a
plaintext and a ciphertext. At the same time, HE multiplications
between encrypted data (ciphertexts) significantly
increase the accumulated level of noise and
limit the applicability of the HE schemes. Thus,
they must be avoided whenever possible.
arXiv:2210.15189v1 [cs.CR] 27 Oct 2022
Building on this observation, we introduce the notion
of partially oblivious (PO) inference. In a PO inference
system, the ML model owner decides to leak
some of the model's weights in order to improve the
efficiency of the inference process. PO inference can
be seen as a generalization of oblivious inference that
offers a trade-off between security and efficiency.
PO inference systems lie between two extreme use
cases: the most secure but least efficient, in which
all the ML model weights are encrypted, and the least
secure but most efficient, in which all the weights
are revealed. The optimal point of equilibrium between
efficiency and security depends on the use case.
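The trade-off above can be illustrated with a hypothetical back-of-the-envelope cost model (not the paper's actual cost analysis): if a fraction of the weights is revealed, the corresponding multiplications become cheap plaintext-ciphertext operations, while the rest remain expensive ciphertext-ciphertext ones. The 10x cost ratio below is taken from the rough estimate given earlier; the function name and structure are illustrative.

```python
# Back-of-the-envelope cost model for partially oblivious inference,
# using the rough estimate that a ciphertext-ciphertext HE
# multiplication costs about 10x a plaintext-ciphertext one.
CT_CT_COST = 10.0  # relative cost of a ciphertext-ciphertext multiplication
PT_CT_COST = 1.0   # relative cost of a plaintext-ciphertext multiplication

def relative_mul_cost(leaked_fraction):
    """Average per-weight multiplication cost when a fraction of the
    model's weights is revealed (and thus multiplied in plaintext)."""
    return (leaked_fraction * PT_CT_COST
            + (1.0 - leaked_fraction) * CT_CT_COST)

for p in (0.0, 0.5, 0.8, 1.0):
    print(p, relative_mul_cost(p))
# Leaking 80% of the weights drops the average multiplication cost
# from 10.0 to 2.8, roughly consistent with the ~4x speed-up reported.
```
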
Our contributions are summarized as follows:
1. We introduce the notion of Partially Oblivious inference
for ML models.
2. We provide a security definition for the evaluation
of the information leakage impact. In our analysis,
the attacker is passive ("honest-but-curious") and
aims to compute a model that simulates the
protected one as accurately as possible. We use
accuracy improvement as our security metric.
3. As a proof-of-concept use case, we apply the notion
of PO inference to protect Convolutional Neural
Network (CNN) inference.
4. We experimentally measure the security and performance
trade-off. We use two models trained
on the MNIST [15] and CIFAR-10 [14] datasets,
respectively. For the PO inference implementation,
a Gazelle-like [12] approach is used. Impressively,
it is shown that in some scenarios, leakage
of more than 80% of the model's weights can be
acceptable.
The paper is organized as follows. In Section 2, the
necessary background is provided. In Section 3, we
analyze our motivation, introduce the attack
model and the security definition for PO inference, and
demonstrate the application of PO inference to
CNN models. Finally, in Section 4, we implement and
evaluate the two CNN models, and in Section 5, we
conclude the paper.
1.1 Related work
There are several lines of work on PPML systems that
leverage advanced cryptographic tools, like MPC and
HE. The most promising solutions are hybrid, using
HE to protect the linear layers and MPC to protect
the non-linear layers.
CryptoNets [10] was the first scheme to deploy the
HE primitive for PPML on the MNIST benchmark. In
the same research line, CHET [9], SEALion [22] and
Faster CryptoNets [7] use HE and retrained networks.
There are HE-based schemes that use pre-trained networks,
like Chimera [4] and Pegasus [16]. In the pre-trained
PPML category, we can find several proposals
that use only MPC schemes, like ABY3 [18] and
XONN [20].
The most promising type of PPML system is hybrid,
i.e., proposals that use both MPC and HE
schemes. Hybrid HE-MPC schemes provide an elegant
solution for pre-trained networks: MPC is responsible
for the non-linear part (activation functions) and
HE for the linear transformations (fully connected and
convolutional layers). Gazelle [12] is a state-of-the-art hybrid
scheme for CNN prediction, and several works have followed,
like Delphi [17], nGraph-HE [3], nGraph-HE2
[2], and PlaidML-HE [5]. All these schemes assume
that either the model owner runs the model locally or
the ML model is publicly known.
There are several open-source HE libraries that implement
the operations of HE schemes and offer a
higher-level API [23], and there is an ongoing effort to
standardize APIs for HE schemes [1]. However, dealing
directly with HE libraries and operations is still
a very challenging task for developers. To
facilitate developers' work, HE compilers have been
proposed to offer a high-level abstraction. A
nice overview of existing HE compilers is given in [23].
2 Background
2.1 Homomorphic Encryption
In the last decade, the performance of HE schemes has
improved impressively, by up to several orders of magnitude,
thanks to advances in theory and to more
efficient implementations. However, HE is still significantly
slower than plaintext computation, and realizing
HE-based computations is complex for the non-expert.
Modern HE schemes fall into one of two main categories:
schemes that compute logical gates, and are
thus most efficient for generic applications,
and schemes that operate naturally on arithmetic or
p-ary circuits, and are thus used for the evaluation
of polynomial functions. The CKKS [6] scheme
belongs to the second category. As it operates on arithmetic
circuits over approximate complex and real numbers,
CKKS is well suited to machine learning applications.
We use it in our experiments.
Following the latest version of the HE Standard [1], all
schemes must support the following types of operations:
key and parameter management, encryption
and decryption, HE evaluation of additions
and multiplications, and noise management.
2.2 HE evaluation operations cost
Practically all modern HE schemes are based on
the hardness of the Learning With Errors (LWE) problem
[19] and its polynomial ring variant. Depending on
the scheme, the plaintexts, the keys, and the ciphertexts
are elements of Z_q^n or Z_q[X]/(X^n + 1), i.e., they are
either vectors of integers or polynomials with integer
coefficients.
To protect a message m, a randomly selected
vector (or polynomial) e is drawn from a distribution
and added to produce a noisy version of m. The
level B of this added noise must always lie between
two bounds, B_min and B_max. When B < B_min, the
ciphertext cannot protect the message, while when B >
B_max, the noise cannot be removed and the correct
message can no longer be retrieved.
Thus, it is crucial to manage the level of noise induced
by the HE operations. It has been demonstrated
that the best noise-management approach is to treat
the ciphertext's noise level B as an invariant. That is,
after each HE operation, the level of noise must
remain close to B.
In the CKKS [6] scheme, a ciphertext is a pair
of polynomials c = (c_0, c_1) over the polynomial ring
Z_q[X]/(X^N + 1), for appropriately selected integers q
and N. The four main evaluation operations of the CKKS
scheme are summarized as follows:
1. Plaintext-Ciphertext Addition
Let m and m' be two plaintexts and c' = (c'_0, c'_1)
be the encryption of m'. The output of the
addition is c_out = (m + c'_0, c'_1); it decrypts to
m + m' and the noise level remains B.
2. Ciphertext Addition
Let c = (c_0, c_1) and c' = (c'_0, c'_1) be the encryptions
of plaintexts m and m'. The output of the
addition is c_out = (c_0 + c'_0, c_1 + c'_1) and it is a
ciphertext of m + m' (approximately, with good
accuracy). The level of noise is upper bounded by
2B.
3. Plaintext-Ciphertext Multiplication
Let m and m' be two plaintexts and c' = (c'_0, c'_1) be
the encryption of m'. The output c_out =
(m · c'_0, m · c'_1) decrypts to m · m' and the level of
noise is mB.
4. Ciphertext Multiplication
Let c = (c_0, c_1) and c' = (c'_0, c'_1) be the encryptions
of plaintexts m and m'. The output of the
multiplication consists of three polynomials, c_out = (c_0 ·
c'_0, c_0 · c'_1 + c'_0 · c_1, c_1 · c'_1), and the noise level is B^2.
It is clear that ciphertext multiplication is the
problematic operation. The number of ciphertext polynomials
increases linearly (one more polynomial after each
multiplication) and the noise level increases exponentially
(it becomes B^(2^L) after L consecutive multiplications).
To manage this growth in size and noise, two
refresh-type operations are applied. To bring
the dimension of the output ciphertext back to two,
the relinearization algorithm is used. The resulting
ciphertext c''_out is an encryption of m · m' and
its noise level is B^2. For noise management, an
algorithm called rescale (or modulus switching in other
HE schemes) is used. However, it can be applied only
a limited and predetermined number of times, usually
equal to the multiplicative depth L of the arithmetic
circuit.
Both algorithms, rescaling and relinearization, are
costly in terms of computational complexity, and both
are applied after each multiplication between
two ciphertexts. Relinearization has approximately the
same computational cost as ciphertext multiplication,
and it requires an evaluation key. The evaluation
keys are created by the encryptor and passed to
the evaluator.
To summarize, the HE multiplication between
ciphertexts is a very costly operation in terms of computational
overhead and noise management. Compared
to ciphertext multiplication, the other three HE
evaluation operations are practically free.
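The exponential noise growth and the role of rescaling can be illustrated with a short back-of-the-envelope calculation. All numbers below are toy values chosen for illustration, not parameters of a real scheme.

```python
# Toy illustration of noise growth under consecutive ciphertext
# multiplications, with and without rescaling.
B = 2.0          # initial noise level
B_MAX = 2 ** 60  # toy upper bound on tolerable noise

# Without rescaling: noise squares at every multiplication, B -> B^(2^L).
noise = B
depth_reached = 0
while noise ** 2 <= B_MAX:
    noise = noise ** 2
    depth_reached += 1
print(depth_reached)  # only a handful of multiplications fit

# With rescaling: noise is kept close to the invariant B after each
# multiplication, so depth is limited only by the number of allowed
# rescale operations (the circuit's multiplicative depth L).
L_levels = 10
noise = B
for _ in range(L_levels):
    noise = noise ** 2   # multiplication squares the noise...
    noise = B            # ...and rescaling restores the invariant (idealized)
print(noise)  # stays at the invariant B
```
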
2.3 Plaintext Packing
One of the main features of some HE schemes that
dramatically improves performance is plaintext packing (also
referred to as batching). It allows several scalar values
to be encoded in the same plaintext. Thus, for schemes
with a cyclotomic polynomial of degree N, it is possible
to store up to N/2 values in the same plaintext (we
refer to them as slots). Homomorphic operations
can then be performed component-wise in Single Instruction
Multiple Data (SIMD) manner. This encoding
has several limitations, since there is no random-access
operation and only cyclic rotations of the slots are
allowed.
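The slot model can be sketched as follows. This is a plain-Python simulation of the SIMD semantics only (no encryption); the slot count and helper names are illustrative.

```python
# Toy sketch of SIMD-style plaintext packing: several scalars share one
# "plaintext", operations act slot-wise, and only cyclic rotations are
# available (no random slot access).
SLOTS = 4  # a real CKKS plaintext with ring degree N offers N/2 slots

def pack(values):
    assert len(values) <= SLOTS
    return values + [0.0] * (SLOTS - len(values))

def slot_add(a, b):
    return [x + y for x, y in zip(a, b)]

def slot_mul(a, b):
    return [x * y for x, y in zip(a, b)]

def rotate(a, k):
    """Cyclic left rotation by k slots: the only data movement allowed."""
    k %= SLOTS
    return a[k:] + a[:k]

a = pack([1.0, 2.0, 3.0, 4.0])
b = pack([10.0, 20.0, 30.0, 40.0])
print(slot_mul(a, b))   # [10.0, 40.0, 90.0, 160.0]
print(rotate(a, 1))     # [2.0, 3.0, 4.0, 1.0]

# Slot-wise products plus log(SLOTS) rotate-and-add steps yield an
# inner product in every slot -- the core trick behind packed linear layers.
s = slot_mul(a, b)
for shift in (2, 1):
    s = slot_add(s, rotate(s, shift))
print(s[0])  # 300.0 = 1*10 + 2*20 + 3*30 + 4*40
```
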
There are various choices for plaintext packing in
ML, i.e., how the input data and the model weights are
organized in plaintexts (or ciphertexts). Depending on
the workload, there are two main packing approaches:
batch-axis packing and inter-axis packing.
Batch-axis packing is used by CryptoNets,
nGraph-HE and nGraph-HE2. It packs a 4D tensor
of shape (B, C, H, W), where B is the batch size, C the
number of channels, and H, W the height and width of
the input, along the batch axis. That is, each plaintext
(or ciphertext) holds B slots, and C · H · W of them
are needed. This approach assumes that B inputs are
available for each inference operation.
On the other hand, inter-axis packing is used when
each input is processed separately, i.e., it is not necessary
to collect B inputs before performing a prediction
(this is common in medical diagnosis). There are several
packing choices, all of which encode scalars from
the same input. This approach is used by Gazelle, in
which a different packing is used for each type of linear
transformation. We will use inter-axis packing in our
analysis. In Section 3.4, we provide more details on the
different inter-axis packing choices.
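The ciphertext counts implied by the two approaches can be contrasted with a small hypothetical helper. The function names are ours, and the CIFAR-10-sized shape and slot count below are illustrative examples, not the paper's experimental configuration.

```python
# Hypothetical helper contrasting ciphertext counts for the two packing
# strategies described above.
def batch_axis_ciphertexts(B, C, H, W):
    # One ciphertext per scalar position, each holding B batch slots.
    return C * H * W

def inter_axis_ciphertexts(C, H, W, slots):
    # All scalars come from a single input; pack them slots at a time.
    total = C * H * W
    return -(-total // slots)  # ceiling division

# CIFAR-10-sized input (3 channels, 32x32) with N = 8192 -> 4096 slots.
print(batch_axis_ciphertexts(1, 3, 32, 32))     # 3072 ciphertexts
print(inter_axis_ciphertexts(3, 32, 32, 4096))  # 1 ciphertext
```

This makes the trade-off concrete: batch-axis packing amortizes well only when B inputs are available at once, while inter-axis packing keeps single-input latency low.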
2.4 CNN models
Neural-network inference has been identified as the
main application area for privacy-preserving technology,
and especially for HE and MPC schemes, as we
have seen in Section 1.1. However, there are practical
limits to the complexity of the use cases that can be