Protocols for classically training quantum generative models on probability
distributions
Sachin Kasture,1 Oleksandr Kyriienko,2 and Vincent E. Elfving1
1PASQAL, 7 rue Léonard de Vinci, 91300 Massy, France
2Department of Physics and Astronomy, University of Exeter,
Stocker Road, Exeter EX4 4QL, United Kingdom
(Dated: October 9, 2023)
Quantum Generative Modelling (QGM) relies on preparing quantum states and generating samples from these states as hidden (or known) probability distributions. As distributions from some classes of quantum states (circuits) are inherently hard to sample classically, QGM represents an excellent testbed for quantum supremacy experiments. Furthermore, generative tasks are increasingly relevant for industrial machine learning applications, making QGM a strong candidate for demonstrating a practical quantum advantage. However, this requires that quantum circuits be trained to represent industrially relevant distributions, and the corresponding training stage carries an extensive cost on current quantum hardware. In this work, we propose protocols for classical training of QGMs based on circuits of a specific type that admit an efficient gradient computation while remaining hard to sample from. In particular, we consider Instantaneous Quantum Polynomial (IQP) circuits and their extensions. Characterizing their classical simulability in terms of time complexity, sparsity, and anti-concentration properties, we develop a classically tractable way of simulating their output probability distributions, allowing classical training towards a target probability distribution. The corresponding quantum sampling from IQP circuits can then be performed efficiently, while classical sampling remains hard. We numerically demonstrate the end-to-end training of IQP circuits using probability distributions for up to 30 qubits on a regular desktop computer. When applied to industrially relevant distributions, this combination of classical training with quantum sampling represents an avenue for reaching advantage in the noisy intermediate-scale quantum (NISQ) era.
I. INTRODUCTION
Recent breakthrough works in quantum computing demonstrated an improved scaling for sampling problems, leading to so-called quantum supremacy [1]. While the exact boundary of classical simulation is still to be established [2], for the carefully selected tasks of random circuit sampling and boson sampling [3,4] one achieves an exponential separation for the task of generating samples (bit strings from measurement read-out). Generally, the task of generating samples from underlying probability distributions is the basis of generative modelling, and represents a highly important part of classical machine learning. Its quantum version, Quantum Generative Modelling (QGM), relies on training quantum circuits as adjustable probability distributions that can model sampling from some particular desired distribution [5-10]. QGM exploits the inherent superposition properties of a quantum state generated from a parameterized unitary, along with the probabilistic nature of quantum measurements, to efficiently sample from a trainable model. QGM has potential applications in generating samples from solutions of stochastic differential equations (SDEs) for simulating financial and diffusive physical processes [11,12], scrambling data using a quantum embedding for anonymization [13-15], and generating solutions to graph-based problems like maximum independent set or maximum clique [16,17], among many others. Given the successes of quantum sampling, this makes QGM a promising contender for achieving a quantum advantage. However, to date demonstrating QGM of practical significance has
[Figure 1: training pipeline contrasting implicit models (QCBM trained directly on input data samples {Xj} with a sample-based loss such as MMD) against explicit models (QCBM and DQGM trained on a PDF estimate p(x), possibly under differential constraints such as SDE or Fokker-Planck equations, with a distribution-based MSE loss and gradients computed classically on CPU/GPU or on quantum hardware), followed by quantum sampling via trained circuits Uθ, with a feature map Uφ(x) and a QFT layer in the DQGM case.]
FIG. 1. Schematic of different steps in training a quantum
generative model using QCBM and DQGM architectures. In
a conventional QCBM setup, the loss and gradient estimation
is done using input data samples directly, while in our work we
focus on an explicit version of QCBM and DQGM, allowing
for classical training.
eluded the field, as training specific generative models is a complex, time-intensive task.
The key asset of quantum generative modelling is that quantum measurements (collapse to one of the eigenstates of a measurement operator) provide a new sample with each shot. Depending on the hardware platform, generating one sample can take on the order of a few hundred microseconds to several milliseconds [18-24]. For large probability distributions, represented by entangled 50+ qubit registers, performing the classical inversion is indeed much more costly. However, often the challenge comes from the inference side, when quantum circuits are required to match specific distributions. Typically, training of quantum generative models utilizes gradient-based parametric learning, similarly to the training of deep neural networks [25]. Parameterized quantum circuits (also referred to as quantum neural networks, QNNs) are trained by estimating the gradient with respect to the gate parameters θ. Moreover, the gradient of the full QGM loss has to be estimated with respect to θ. For quantum devices this can be done by the parameter-shift rule and its generalizations [26-28], where the number of circuit estimations increases linearly with the number of parameters. The overall training cost corresponds to the measurement of the loss function at each iteration step. In the case of a quantum circuit Born machine the loss may correspond to the Kullback-Leibler (KL) divergence [9], Sinkhorn divergence [29], or maximum mean discrepancy (MMD) [30], and may require extensive sampling for resolving the loss as an average. For quantum generative adversarial networks (QGANs) the loss minimization is substituted by a minimax game [31,32] requiring multi-circuit estimation. In all cases convergence is not guaranteed, due to the exponential reduction of gradients [33].
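To make this hardware training cost concrete, below is a minimal sketch of parameter-shift gradient estimation. The `expectation` function is a hypothetical stand-in for a shot-based hardware estimate (here a toy trigonometric landscape), and the ±π/2 shift assumes standard rotation gates with generator eigenvalues ±1/2.

```python
import numpy as np

def expectation(theta: np.ndarray) -> float:
    """Hypothetical stand-in for a hardware estimate of <psi(theta)|C|psi(theta)>
    obtained from a finite number of measurement shots. Here: a toy trigonometric
    landscape of the kind produced by rotation gates."""
    return float(np.cos(theta[0]) * np.sin(theta[1]))

def parameter_shift_grad(theta: np.ndarray) -> np.ndarray:
    """Parameter-shift rule for gates exp(-i*theta_k*G/2) with G^2 = I:
    dE/dtheta_k = [E(theta_k + pi/2) - E(theta_k - pi/2)] / 2.
    Two circuit evaluations per parameter, hence a cost linear in the
    number of parameters, as discussed above."""
    grad = np.zeros_like(theta)
    for k in range(len(theta)):
        shift = np.zeros_like(theta)
        shift[k] = np.pi / 2
        grad[k] = 0.5 * (expectation(theta + shift) - expectation(theta - shift))
    return grad

theta = np.array([0.3, 1.2])
# Matches the analytic gradient [-sin(0.3)sin(1.2), cos(0.3)cos(1.2)] for this
# toy landscape, since the rule is exact for rotation-type gates.
print(parameter_shift_grad(theta))
```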
In this work we investigate the possibility of training the parameters of quantum generative models classically, while still retaining the quantum advantage in sampling [34]. For instance, the possibility of classical training for a different paradigm was shown for Gaussian Boson Sampling (GBS) devices [35], but under the conditions of fixing an initial set of samples and non-universal operation. For digital quantum computing, results from previous works [36,37] motivate the possibility that estimating probability densities classically can be feasible without losing the sampling complexity. We explore this possibility in more detail and use it to develop methods for training circuits classically to output a desired distribution using a gradient-based approach. We show numerically that our method is feasible for up to 30 qubits on a regular desktop computer. We explore different families of quantum circuits in detail and perform numerical studies of their sampling complexity and expressivity. For the expressivity studies in particular, we look at training a Differentiable Quantum Generative Model (DQGM) [38-40] architecture, which allows training in the latent (or 'frequency') space and sampling in the bit basis. This presents a good testing ground for applying the proposed method to explicit quantum generative models. We also show that QCBMs can be trained classically for certain distributions, while remaining hard to sample from. Our protocols contribute towards tools for achieving a practical advantage in sampling once the target distributions are chosen carefully. We highlight the differences between well-known strategies for QGM and the method discussed in this paper in Fig. 1.
II. METHODS
A. Preliminaries: QCBM and DQGM as implicit
vs explicit generative models
Generally, there are two types of generative models. Explicit generative models assume direct access (or 'inference') to probability density functions (PDFs). At the same time, implicit generative models are described by hidden parametric distributions, where samples are produced by transforming randomness via an inversion procedure. These two types of models have crucial differences. For example, training explicit models involves loss functions measuring distances between a given PDF, $p_{\text{target}}(x)$, and a model PDF, $p_{\text{model}}(x)$, for example with a mean squared error (MSE) loss defined as
$$\mathcal{L}_{\text{MSE}} = \sum_x \left| p_{\text{model}}(x) - p_{\text{target}}(x) \right|^2, \qquad (1)$$
where explicit knowledge of $p_{\text{target}}(x)$ is used. On the other hand, training implicit models involves comparing the samples generated by the model with given data samples (e.g. with an MMD loss [30]). The MMD loss is defined as
$$\mathcal{L}_{\text{MMD}} = \mathbb{E}_{x\sim p_{\text{model}},\, y\sim p_{\text{model}}} K(x,y) - 2\,\mathbb{E}_{x\sim p_{\text{model}},\, y\sim p_{\text{target}}} K(x,y) + \mathbb{E}_{x\sim p_{\text{target}},\, y\sim p_{\text{target}}} K(x,y), \qquad (2)$$
where $K(x,y)$ is an appropriate kernel function. The MMD loss thus measures the distance between two probability distributions using only samples drawn from the respective distributions. In the context of QGM, the QCBM is an excellent example of implicit training, where typically an MMD-like loss function is used. On the other hand, recent work showcases how explicit quantum models such as DQGM [38] and Quantum Quantile Mechanics [11] benefit from functional access to the model probability distributions, allowing input-differentiable quantum models [39,40] to solve stochastic differential equations or to model distributions with differential constraints.
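As an illustration of the sample-based (implicit) loss of Eq. (2), here is a minimal sketch of an empirical MMD estimate. The Gaussian kernel and one-dimensional samples are our choices for concreteness, and the estimator keeps the (biased) diagonal terms for brevity.

```python
import numpy as np

def gaussian_kernel(x, y, sigma=1.0):
    """K(x, y) = exp(-(x - y)^2 / (2 sigma^2)), evaluated on all sample pairs."""
    d2 = (x[:, None] - y[None, :]) ** 2
    return np.exp(-d2 / (2 * sigma**2))

def mmd_loss(model_samples, target_samples, sigma=1.0):
    """Empirical (biased) estimate of Eq. (2) from two sample sets."""
    kxx = gaussian_kernel(model_samples, model_samples, sigma).mean()
    kxy = gaussian_kernel(model_samples, target_samples, sigma).mean()
    kyy = gaussian_kernel(target_samples, target_samples, sigma).mean()
    return kxx - 2 * kxy + kyy

rng = np.random.default_rng(0)
x = rng.normal(0.0, 1.0, size=500)  # samples from the model distribution
y = rng.normal(0.5, 1.0, size=500)  # samples from the target distribution
print(mmd_loss(x, y))  # positive when the two distributions differ
```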
Let us consider a quantum state $|\Psi\rangle$ created by applying a quantum circuit $\hat{U}$ (which can be parameterized) to a zero basis state. For a general $\hat{U}$, simulating the output PDF values that follow the Born rule, $p_{\text{model}}(x) = |\langle x|\Psi\rangle|^2$, and producing samples from $|\Psi\rangle$ are both classically hard.
[Figure 2: workflow flowchart. Start with a family known to admit additive polynomial estimation of probabilities; identify a feature map and variational ansatz that allow such estimation while being classically hard to sample from; check the time complexity and compare with other IQP-like circuits; measure the sparsity of the distribution to verify quantum advantage for sampling; measure anti-concentration and t-sparseness of the probability distributions, or (for fewer than 20 qubits) the cross-entropy difference to the Porter-Thomas distribution using sampling. If sampling remains hard, classically simulate the circuits to estimate p(x) and vary the circuit parameters to minimize the loss for the DQGM training circuit or QCBM circuit, then use the optimized parameters in the DQGM sampling circuit or QCBM circuit to generate samples.]
FIG. 2. Workflow used for classical training of quantum samplers, both in the differentiable quantum generative modelling (DQGM) and the quantum circuit Born machine (QCBM) settings. First, we identify circuits $\hat{U}$ suitable for classically tractable probability estimation and hard sampling (see the right chart). For this we: 1) check that a circuit admits additive polynomial estimation of probability; 2) compare the time complexity for exact probability calculation with known circuit families; 3) verify the sparseness of the probability distributions, depending on the number of qubits n; 4) measure anti-concentration/t-sparseness or the cross-entropy difference using sampling. Analysing these properties, we confirm whether $\hat{U}$ admits classical training and computationally hard sampling. After the classical training is performed variationally by minimizing the DQGM/QCBM loss, we use the optimized parameters for quantum sampling circuits.
But what if estimating the PDF for a certain $\hat{U}$ to sufficient accuracy is classically tractable? In this case one can use explicit training, and at the inference stage have access not only to probabilities but also to the capacity to sample efficiently via quantum measurements. This scenario describes the potential for classical training of quantum generative models.
To enable the classical training, we propose the strategy described in the schematic shown in Fig. 2. Here, our goal is finding circuits satisfying 'classical training + quantum sampling' conditions.
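A minimal sketch of this strategy's training stage is given below, assuming access to some classically tractable probability estimator `estimate_prob` (for instance, the IQP estimator sketched in the next subsection). The finite-difference gradient-descent optimizer is our simplification for illustration, not the specific optimizer used in this work.

```python
import numpy as np

def classical_train(estimate_prob, theta0, p_target, xs, lr=0.1, eps=1e-3, steps=200):
    """Minimize the MSE loss of Eq. (1), sum_x |p_model(x) - p_target(x)|^2,
    entirely on a classical machine; only sampling is deferred to the quantum
    device, which runs the circuit at the optimized parameters."""
    theta = np.array(theta0, dtype=float)

    def loss(th):
        p_model = np.array([estimate_prob(th, x) for x in xs])
        return np.sum((p_model - p_target) ** 2)

    for _ in range(steps):
        grad = np.zeros_like(theta)
        for k in range(len(theta)):  # central finite differences
            dk = np.zeros_like(theta)
            dk[k] = eps
            grad[k] = (loss(theta + dk) - loss(theta - dk)) / (2 * eps)
        theta -= lr * grad
    return theta

# Toy demo: one qubit with p(0) = cos^2(theta/2), p(1) = sin^2(theta/2).
toy = lambda th, x: np.cos(th[0] / 2) ** 2 if x == 0 else np.sin(th[0] / 2) ** 2
print(classical_train(toy, [0.1], np.array([0.2, 0.8]), xs=[0, 1]))
```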
B. π-simulable circuits that are hard to sample
We note that certain families of quantum circuits, including Clifford [41,42] and match-gate sequences [43-45], admit a classically tractable estimation of the probabilities $p_{\text{model}}(x)$. At the same time, for these circuits the generation of samples is also classically 'easy', thus limiting the potential for achieving a quantum advantage. However, there exist families of quantum circuits that allow for a complexity separation between the two tasks. For instance, in Ref. [36] the authors show that one can estimate probabilities for IQP (Instantaneous Quantum Polynomial) circuits [46,47] up to an additive polynomial error, while retaining a classical hardness of sampling. For a typical IQP circuit with input $|0^n\rangle$, the amplitude to obtain a certain bit string $x$ at the output is given by
$$\Psi(x) = \langle x|\hat{H}\hat{U}\hat{H}|0^n\rangle, \qquad (3)$$
where $\hat{U}$ consists of single- and two-qubit Z-basis rotations and $\hat{H}$ represents the Hadamard gate applied to all the qubits.
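Since $\hat{U}$ is diagonal in the Z basis, Eq. (3) reduces to $\Psi(x) = \mathbb{E}_z[e^{i\theta(z)}(-1)^{x\cdot z}]$ over uniformly random bit strings $z$, where $\theta(z)$ is the phase polynomial of the circuit. A minimal Monte Carlo sketch of the resulting additive-error estimate follows; the phase-polynomial convention (linear terms $a_i$ and pairwise couplings $b_{ij}$, standing in for the Rz/Rzz angles up to offsets and factors) is our assumption, and this is one standard route to such an estimate, not necessarily the exact estimator of Ref. [36].

```python
import numpy as np

def iqp_amplitude_mc(x, a, b, n_samples=200_000, seed=1):
    """Monte Carlo estimate of Psi(x) = <x| H U H |0^n> for a diagonal U that
    applies theta(z) = sum_i a_i z_i + sum_{i<j} b_ij z_i z_j. Using
    Psi(x) = E_z[exp(i theta(z)) (-1)^(x.z)] over uniform bit strings z,
    N samples give additive error O(1/sqrt(N)): polynomial precision."""
    rng = np.random.default_rng(seed)
    n = len(x)
    z = rng.integers(0, 2, size=(n_samples, n))  # uniform bit strings
    theta = z @ a + np.einsum('si,ij,sj->s', z, np.triu(b, 1), z)
    signs = (-1.0) ** (z @ x)  # (-1)^(x.z)
    return np.mean(signs * np.exp(1j * theta))

n = 12
rng = np.random.default_rng(0)
a = rng.uniform(0, 2 * np.pi, n)        # single-qubit phase coefficients
b = rng.uniform(0, 2 * np.pi, (n, n))   # pairwise (Ising-type) couplings
x = np.zeros(n, dtype=int)
psi0 = iqp_amplitude_mc(x, a, b)
# Note the error is additive: individual probabilities ~2^-n can be smaller
# than the Monte Carlo error, which is exactly the regime discussed in the text.
print(abs(psi0) ** 2)
```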
Moreover, it is known to be classically hard to even approximately sample from IQP circuits in the average case [48]. Therefore, such circuits offer an opportunity for explicit training of a quantum generative model, and a potential quantum advantage in sampling. IQP circuits are by their structure also strongly related to the so-called forrelation problem [49-51], differing only by an additional $\hat{H}$ layer in the middle. This corresponds to calculating the overlap
$$\Phi = \langle 0^n|\hat{H}\hat{U}_2\hat{H}\hat{U}_1\hat{H}|0^n\rangle \qquad (4)$$
after the action of the quantum circuit $\hat{U}_F := \hat{H}\hat{U}_2\hat{H}\hat{U}_1\hat{H}$, where $\hat{U}_1$, $\hat{U}_2$ are circuits that consist of Z-basis rotations $\hat{R}_z$ and Ising-type propagators $\hat{R}_{zz}$. Hereafter, $\hat{H}$ is a layer of Hadamard gates applied to all $n$ qubits. It was shown that $\Phi$ can be calculated efficiently classically. Therefore, from the squared forrelation $|\Phi|^2$ one can estimate the probability to obtain $|0^n\rangle$ at the output after the action of $\hat{U}_F$, provided the input is $|0^n\rangle$. At the same time, in the following we show that a variant of forrelation can be used for achieving the sampling advantage.
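For small $n$, the overlap of Eq. (4) can also be evaluated by brute-force statevector simulation, which is useful as a cross-check of any polynomial estimator (though it scales exponentially and so only serves as a sanity check). The phase-polynomial parameterization of $\hat{U}_1$, $\hat{U}_2$ below follows the same assumed convention as the previous sketch.

```python
import numpy as np

def hadamard_layer(n):
    """Dense matrix for a layer of Hadamards on n qubits."""
    h = np.array([[1.0, 1.0], [1.0, -1.0]]) / np.sqrt(2)
    H = np.array([[1.0]])
    for _ in range(n):
        H = np.kron(H, h)
    return H

def diagonal_iqp(n, a, b):
    """Diagonal unitary exp(i theta(z)) with theta(z) = a.z + z^T triu(b,1) z."""
    z = (np.arange(2**n)[:, None] >> np.arange(n)[::-1]) & 1  # bit table
    theta = z @ a + np.einsum('si,ij,sj->s', z, np.triu(b, 1), z)
    return np.diag(np.exp(1j * theta))

def forrelation(n, a1, b1, a2, b2):
    """Phi = <0^n| H U2 H U1 H |0^n> of Eq. (4), by dense matrix products."""
    H = hadamard_layer(n)
    psi = H @ diagonal_iqp(n, a2, b2) @ H @ diagonal_iqp(n, a1, b1) @ H[:, 0]
    return psi[0]

n = 6  # 2^n-dimensional matrices: exponential cost, small n only
rng = np.random.default_rng(2)
a1, a2 = rng.uniform(0, 2 * np.pi, (2, n))
b1, b2 = rng.uniform(0, 2 * np.pi, (2, n, n))
phi = forrelation(n, a1, b1, a2, b2)
print(abs(phi) ** 2)  # probability of reading |0^n> at the output of U_F
```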
Classical estimation of probabilities using the forrelation has been shown to be possible [52] under the condition that the circuit's entangling properties are constrained. To understand this, we use the concept of connectivity from graph theory. Consider a graph G with n nodes, where each node represents a qubit in a quantum circuit. Two nodes are connected by an edge if there is a two-qubit entangling gate between the corresponding qubits in the circuit. Typically, IQP circuits have all-to-all connectivity, realized by two-qubit entangling gates. However, if