Classically Approximating Variational Quantum Machine Learning with Random Fourier Features Jonas Landman1 2Slimane Thabet3 4Constantin Dalyac3 4Hela Mhiri3 5and Elham Kashe1 3

2025-05-01 0 0 1.82MB 17 页 10玖币

侵权投诉

Classically Approximating Variational Quantum Machine Learning with Random

Fourier Features

Jonas Landman,1, 2 Slimane Thabet,3, 4 Constantin Dalyac,3, 4 Hela Mhiri,3, 5 and Elham Kasheﬁ1, 3

1School of Informatics, University of Edinburgh, 10 Crichton Street, Edinburgh, United Kingdom

2QC Ware, Palo Alto, USA and Paris, France

3Laboratoire d’Informatique de Paris 6, CNRS, Sorbonne Universit´e, 4 Place Jussieu, 75005 Paris, France

4PASQAL SAS, 7 avenue L´eonard de Vinci, 91300 Massy, France

5ENSTA Paris, Institut polytechnique de Paris

(Dated: October 25, 2022)

Many applications of quantum computing in the near term rely on variational quantum circuits

(VQCs). They have been showcased as a promising model for reaching a quantum advantage

in machine learning with current noisy intermediate scale quantum computers (NISQ). It is often

believed that the power of VQCs relies on their exponentially large feature space, and extensive works

have explored the expressiveness and trainability of VQCs in that regard. In our work, we propose

a classical sampling method that may closely approximate a VQC with Hamiltonian encoding,

given only the description of its architecture. It uses the seminal proposal of Random Fourier

Features (RFF) and the fact that VQCs can be seen as large Fourier series. We provide general

theoretical bounds for classically approximating models built from exponentially large quantum

feature space by sampling a few frequencies to build an equivalent low dimensional kernel, and we

show experimentally that this approximation is eﬃcient for several encoding strategies. Precisely, we

show that the number of required samples grows favorably with the size of the quantum spectrum.

This tool therefore questions the hope for quantum advantage from VQCs in many cases, but

conversely helps to narrow the conditions for their potential success. We expect VQCs with various

and complex encoding Hamiltonians, or with large input dimension, to become more robust to

classical approximations.

INTRODUCTION

Many applications of quantum computing in the near

term rely on variational quantum circuits (VQCs) [1,2],

and in particular for solving machine learning (ML)

tasks. VQCs are trained using classical optimization of

their gates’ parameters, a method borrowed from clas-

sical neural networks. Many early works have shown

promising results, both empirically and in theory [3–5].

However, whether these variational methods can provide

a quantum advantage in the general case with a scaling

number of qubits has not been proven.

What would make VQCs advantageous compared to

classical algorithms for machine learning on classical

data? The most common and intuitive answer is the

formation of a large feature space, due to the projection

of data points in an exponentially large Hilbert space.

Understanding what is happening in this Hilbert space

[6], and most importantly, knowing how we can exploit its

size is an important question for this ﬁeld. When it comes

to trainability, it has been shown that these exponential

spaces are in fact drawbacks [7]. Regarding expressivity,

it is crucial to understand what kind of functions VQCs

can learn better than a classical algorithm.

One of the biggest challenges we face while trying to

answer these questions is to ﬁnd the fair comparison be-

tween VQCs and classical ML algorithms. Recent works

[6,8,9] have shown equivalence between VQCs and both

Fourier series and kernel methods. In fact, VQCs are ex-

tremely powerful Fourier series: the functions they can

learn are predetermined by a set of frequencies which can

become numerous with the dimension of the input, and

the number and complexity of the quantum gates used.

In this work, we adapt Random Fourier Features

(RFF) [10], a successful classical sampling algorithm

aimed at eﬃciently approximating some large classical

kernel methods. We designed three diﬀerent RFF strate-

gies to approximate VQCs. For each one, we analyze

its eﬃciency in reproducing results obtained by a VQC.

To do so, we have studied in details the expressivity of

VQCs, understood where their power could come from,

and compared it each time with RFF.

Our method consists in analyzing the encoding gates

of the VQC to extract the ﬁnal frequencies of its model

and sample from them. Notably, if the VQCs possesses

simple encoding gates such as Pauli gates, we show that

the large quantum feature space is not fully exploited,

making RFF even more eﬃcient. If the number of fre-

quencies in the VQC grows exponentially, the number of

samples required by RFF grows only linearly. Finally, we

have empirically compared VQCs and RFF on real and

artiﬁcial use cases. On these, RFF were able to match

the VQC’s answer, and sometimes outperform it. Some

hope resides for VQCs with non-diagonalizable encoding

Hamiltonians, preventing the use of usual RFF. However,

even in this case, we provide an alternative RFF strategy,

along with theoretical approximation bounds.

Our conclusion is that, despite the potentially expo-

nential number of frequencies in the functions that a

VQC can create, Random Fourier Features can be used

to approximate the same resulting function on a given

dataset, in many cases. This restrains the regime in

arXiv:2210.13200v1 [quant-ph] 24 Oct 2022

FIG. 1: Random Fourier Features as a classical approximator of quantum models. Instead of training a

Variational Quantum Circuit by using a quantum computer, we propose to train a classical kernel built by sampling

a few frequencies of the quantum model. These frequencies can be derived from the quantum circuit architecture, in

particular from the encoding gates. Using Random Fourier Features, one can build a classical model which performs

as good as the quantum model with a bounded error and a number of samples that grows nicely.

which VQCs can oﬀer a true advantage over classical in

learning tasks over classical data, or invites us to change

the way VQCs are deﬁned.

The manuscript is structured as follows. In Section I,

we recall the properties of variational quantum circuits,

and the fact that they can be expressed as Fourier series

by detailing the structure of their spectrum. In Section

II, we introduce the classical method of Random Fourier

Features. In Section III, we ﬁnally present three ways

of applying RFF to approximate VQC’s models. These

theoretical proposals are tested over real and artiﬁcial

datasets in Section IV.

I. SPECTRAL PROPERTIES OF VARIATIONAL

QUANTUM CIRCUITS

A. Deﬁnitions

We consider a standard ML task where a function f,

named model, must be optimized to map data points

to their target values. The data used for the train-

ing is made of Mpoints x= (x1, . . . , xd) in X=Rd

along with their target values yin Y=R. We deﬁne a

quantum model as the family of parametrized functions

f: (X,Θ) −→ Y, such that

f(x;θ) = h0|U(x;θ)†OU(x;θ)|0i(1)

where U(x;θ) is a unitary that represents the

parametrized quantum circuits, θrepresents the train-

able parameters from a space Θ, and Ois an observable.

We can always describe the parametrized quantum cir-

cuit as a series of two types of gates. The ﬁrst are called

encoding gates as they only depend on input data val-

ues, whereas the trainable gates depend on internal pa-

rameters that are optimized during training. A typical

instance of a Variational Quantum Circuit (VQC) is illus-

trated in Fig.3. In the recent literature, these gates are

often grouped as layers, which is not mandatory, since

any circuit can be sliced into alternating sequences of en-

coding and training blocks (even if containing a single

gate).

Any quantum unitary implements the evolution of a

quantum system under a Hamiltonian. Thus, we choose

to write the `th encoding gates as exp(−ixiH`), where xi

is one of the dcomponents of x, and H`is a Hamiltonian

matrix of size 2pif pis the number of qubits this gate

acts on. We will note Lthe number of encoding gates

for each dimension of x(the same for each dimension, for

notation simplicity).

In this framework, the aim is to ﬁnd the optimal map-

ping between data points and their target values. This

is done by optimizing the parameters θto ﬁnd the best

guess f∗such that

f∗= argmin

i=1

l(f(xi;θ), yi) (2)

where lis a cost function adapted to the task. For a

standard regression tasks, we can choose l(z, y) = |z−y|2

B. Quantum Models are Large Fourier Series

We know since [6] that the family of quantum models

deﬁned in Eq.(1) can be rewritten as a Fourier series:

f(x;θ) = X

ω∈Ω

cωeiωx (3)

where the spectrum Ω of frequencies is determined by

the ensemble of eigenvalues of the encoding Hamiltoni-

ans and the coeﬃcients cωdepend on the parametrized

ansatz, as pictured in Fig.3.

In order to familiarize the reader with the structure of

the spectrum, we explicitly build Ω in the case of a one

dimensional data input (X=R) and with a variational

circuit containing only Lencoding gates. The accessible

frequency spectrum Ω is the ensemble of all the diﬀer-

ences between all possible sums of the eigenvalues of the

encoding gates as shown in Fig.2.

We note λk

`the kth eigenvalue of the `th encoding

Hamiltonian H`having d`eigenvalues. We use the multi-

index i= (i1, . . . , iL) indicating which eigenvalue is taken

from each encoding Hamiltonian. We deﬁne Λias

Λi=λi1

1+· · · +λiL

L(4)

Finally, we can express Ω, the set of frequencies as in

Eq.5:

Ω = Λi−Λj,i,j∈

`=1

[|1, d`|],(5)

The simplest case is called Pauli encoding, when all

encoding Hamiltonians are in fact Pauli matrices (e.g.

encoding gates RZ(x) = e−ix

2σZ) as in [6,11]. In this

case, all the eigenvalues are λ=±1/2, and therefore, the

Λiare all the integers (or half-integers, if Lis odd) in

[−L/2, L/2]. It follows that the set of distinct values in

Ω is simply the set of integers in J−L, LK. In this case,

there are many redundant frequencies, due to the fact

that all Pauli eigenvalues are the same. Namely, only

2L+1 distinct frequencies among the 22Lpossible values

of Λi−Λj. As shown in Fig.2, more various eigenvalues

would create more distinct frequencies in the end. In

the rest of the paper, Ω will denote the set of distinct

frequencies, without redundancy.

In our experiments (Section IV), we observe an im-

portant phenomenon: redundant frequencies are likely

to have high coeﬃcients. Unique frequencies might of-

ten have in contrast small coeﬃcients, reducing the po-

tential expressivity of the VQC. We see that the redun-

dancy might therefore play an important role in the ex-

pressivity of VQCs, and leave theoretical proof for future

work. In fact, the impact of the frequencies redundancy

on the quantum model has been recently highlighted in

[12] where an analytical correspondence between a fre-

quency redundancy and its Fourier coeﬃcient has been

proven for a simpliﬁed class of parameterized quantum

models.

These arguments give some intuition on why one

should use encoding gates from Hamiltonians with rich

and various eigenvalues, by taking complex interactions

over many qubits. A global Hamiltonian over nqubits,

hard to implement, could potentially have 2ndistinct

eigenvalues, thus enlarging Ω and avoiding redundancy.

Another approach from [13] consists in adding scaling

factors in the Pauli encoding gate to modify their eigen-

values and avoid redundancy. It results in an exponential

number of integer frequencies, with respect to L, with

many very high frequencies.

FIG. 2: From encoding Hamiltonians to

Frequencies. The frequencies composing the VQC

model (on one dimensional input) come from all the

combinations of eigenvalues from each encoding

Hamiltonians. This can be seen as a tree, with L= 3

Hamiltonians in this ﬁgure. We also see potential

redundancy in the leaves.

We can now generalize, if we now suppose that X=

Rd, such that we encode a vector x= (x1, . . . , xd)

in our quantum model, then Ω becomes the following

d−dimensional Cartesian product

Ω=Ω1×Ω2× · · · × Ωd(6)

where each Ωκis deﬁned as in Eq.5on its own set of

Hamiltonians.

In this context, note that the frequencies ωare now

vectors in Rdand there are ddiﬀerent trees to build Ω (see

Fig. 2). Note that for notation simplicity, we assumed

that Lgates were applied on each input’s component,

but it can be generalized to any number of gates per

dimension.

We therefore see that the size of the spectrum |Ω|can

potentially grow exponentially with the number of en-

coding gates and the dimension of the input data. For

instance, if we consider a d-dimensional vector xand L

Pauli-encoding gates for each dimension in such a way

that there are Ld encoding gates in the VQC. According

to equation (6), the size of the spectrum Ω would scale as

O(Ld), which becomes quickly intractable as dincreases.

As an example, the spectrum associated to a VQC with

L= 20 encoding gates and d= 16 would require more

than one hundred times the world’s storage data capacity

available in 2007 to be stored [14].

However, can such large Fourier series be approximated

with classical methods? It would consist in building a

classical approximator ˜

fas

f(x) = X

ω∈˜

Ω

˜cωeiωx (7)

such that ˜

Ω is of tractable size and the two solution are

close, written as 



f(x)−˜

f(x)



≤ε, using a given error

measure (inﬁnite or `2norms, for instance).

In the cases where the construction of such a model

is possible, it would imply that although classically sim-

ulating the encoding VQC might not be possible, the

quantum model that emerges from it can be eﬃciently

approximated in a classical way.

C. Quantum models are shift-invariant kernel

methods

As the quantum model is a real-valued function, it fol-

lows that ω∈Ω implies −ω∈Ω and cω=c∗

−ω. We

express the Fourier series of the quantum model as a sum

of trigonometric functions by deﬁning for every ω∈Ω:

aω:= cω+c−ω∈R(8)

bω:= 1

i(cω−c−ω)∈R(9)

such that

f(x;θ) = X

ω∈Ω+

cωeiωx +c−ωe−iωx

ω∈Ω+

aωcos(ωx) + bωsin(ωx)

(10)

where Ω+contains only half of the frequencies from

Ω. Considering only Pauli gates, if d= 1, we simply

have Ω = J−L, LKand Ω+=J0, LK. In dimension d, we

have Ω = J−L, LKdand Ω+is built by keeping half of the

frequencies (after removing those of opposite sign), plus

the null vector. In the end, we have

|Ω+|=(2L+ 1)d−1

2+ 1 (11)

With a more general encoding scheme, if there is a diﬀer-

ent number of distinct positive frequencies per dimension,

the formula is diﬀerent but is built similarly.

In the following parts, we will focus solely on Ω+and

conveniently drop the + subscript.

FIG. 3: Variational Quantum Circuits give rise to

Fourier Series. In a quantum machine learning task,

classical data is encoded in a subset of variational gates

of a quantum circuit (green), while blue gates are

trainable.

We can also deﬁne the feature map of the quantum

model [8] as

f(x;θ) = hψ(x;θ)|O|ψ(x;θ)i=w(θ)Tφ(x) (12)

where φ(x) is the feature vector, the mapping of the

initial input into a larger feature space, where the new

distribution of the data is supposed to make the classiﬁ-

cation (or regression) solvable with only a linear model.

This linear model is in fact the inner product between

φ(x) and a trainable weight vector w. In the case of

VQCs, we can explicitly express them as:

φ(x) = 1

p|Ω|





cos(ωTx)

sin(ωTx)





ω∈Ω

,w(θ) = 





aω

bω





ω∈Ω

(13)

If the spectrum Ω is known and accessible, one can ﬁt

the quantum model by retrieving the coeﬃcients aω, bω

associated to each frequency ω. This can be done by

using general linear ridge regression techniques (see Ap-

pendix A). Interestingly, there exists a dual formulation

of the linear ridge regression that depends entirely on the

kernel function associated to the model [15]. In our case,

the related kernel function is deﬁned by:

k(x, x0) = hφ(x), φ(x0)i

|Ω|X

ω∈Ω

cos(ωx) cos(ωx0) + sin(ωx) sin(ωx0)

|Ω|X

ω∈Ω

cos(ω(x−x0))

(14)

文档加载中……请稍候！
如果长时间未打开，您也可以点击刷新试试。

下载文档到电脑，查找使用更方便

10 玖币 0人已下载

立即下载

摘要：

ClassicallyApproximatingVariationalQuantumMachineLearningwithRandomFourierFeaturesJonasLandman,1,2SlimaneThabet,3,4ConstantinDalyac,3,4HelaMhiri,3,5andElhamKashe1,31SchoolofInformatics,UniversityofEdinburgh,10CrichtonStreet,Edinburgh,UnitedKingdom2QCWare,PaloAlto,USAandParis,France3Laboratoired'Inf...

展开>> 收起<<

Classically Approximating Variational Quantum Machine Learning with Random Fourier Features Jonas Landman1 2Slimane Thabet3 4Constantin Dalyac3 4Hela Mhiri3 5and Elham Kashe1 3.pdf

共17页,预览4页

还剩页未读，继续阅读

声明：本站为文档C2C交易模式，即用户上传的文档直接被用户下载，本站只是中间服务平台，本站所有文档下载所得的收益归上传人(含作者)所有。玖贝云文库仅提供信息存储空间，仅对用户上传内容的表现方式做保护处理，对上载内容本身不做任何修改或编辑。若文档所含内容侵犯了您的版权或隐私，请立即通知玖贝云文库，我们立即给予删除！

Classically Approximating Variational Quantum Machine Learning with Random Fourier Features Jonas Landman1 2Slimane Thabet3 4Constantin Dalyac3 4Hela Mhiri3 5and Elham Kashe1 3

相关推荐

开通VIP享超值会员特权

作者详情

相关内容

热门标签

举报选择: