A super-polynomial quantum-classical separation for density modelling
Niklas Pirnay,1 Ryan Sweke,2,* Jens Eisert,2,3 and Jean-Pierre Seifert1,4
1Electrical Engineering and Computer Science, Technische Universität Berlin, Berlin, 10587, Germany
2Dahlem Center for Complex Quantum Systems,
Freie Universität Berlin, Berlin, 14195, Germany
3Fraunhofer Heinrich Hertz Institute, 10587 Berlin, Germany
4Fraunhofer SIT, D-64295 Darmstadt, Germany
(Dated: October 28, 2022)
*Currently at IBM Quantum, Almaden Research Center, San Jose, CA 95120, USA.
Density modelling is the task of learning an unknown probability density function from samples,
and is one of the central problems of unsupervised machine learning. In this work, we show that there
exists a density modelling problem for which fault-tolerant quantum computers can offer a super-
polynomial advantage over classical learning algorithms, given standard cryptographic assumptions.
Along the way, we provide a variety of additional results and insights, of potential interest for
proving future distribution learning separations between quantum and classical learning algorithms.
Specifically, we (a) provide an overview of the relationships between hardness results in supervised
learning and distribution learning, and (b) show that any weak pseudo-random function can be used
to construct a classically hard density modelling problem. The latter result opens up the possibility
of proving quantum-classical separations for density modelling based on weaker assumptions than
those necessary for pseudo-random functions.
I. INTRODUCTION
The task of learning a representation of a probability distribution from samples is of importance in a wide variety
of contexts, from the natural sciences to industry. As such, a central focus of modern machine learning is to develop
algorithms and models for this task. Of particular importance is the distinction between density modelling and
generative modelling. In density modelling, the task is to learn an evaluator for a distribution – i.e., a function which, given a sample, returns the probability weight assigned to that sample by the underlying distribution. As such,
density modelling is sometimes referred to, as we do here, as evaluator learning. In generative modelling, the task is
to learn a generator for a distribution – i.e., a function which, given a (uniformly) random input seed, outputs a sample distributed according to the target distribution. As a result, generative modelling is sometimes referred to as
generator learning. As an example, in a generative modelling problem the goal might be to generate images of cats
or dogs, while the associated density modelling problem would be to evaluate the probability that an image depicts a
cat or a dog. It is important to stress that these two learning tasks are indeed fundamentally different. That is to say,
efficiently learning a generator for a distribution does not imply efficiently learning an evaluator and vice versa [1].
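As a toy illustration of this distinction, the following minimal Python sketch (our own illustrative code, using a made-up distribution; it is not a construction from this work) contrasts the two interfaces: a generator consumes randomness and outputs samples, while an evaluator consumes a sample and outputs its probability.

    import random

    # A made-up distribution D over 2-bit strings, for illustration only.
    D = {"00": 0.5, "01": 0.25, "10": 0.25, "11": 0.0}

    def generator() -> str:
        # Generator interface: consume randomness, output a sample drawn from D.
        return random.choices(list(D.keys()), weights=list(D.values()))[0]

    def evaluator(x: str) -> float:
        # Evaluator interface: consume a sample, output the probability D assigns to it.
        return D[x]

Learning a generator means recovering a function behaving like generator() above, while learning an evaluator means recovering a function behaving like evaluator(); as just noted, neither task efficiently implies the other.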
Given the incredible success of modern machine learning, and the rapidly increasing availability of quantum compu-
tational devices, a natural question is whether or not quantum devices can provide any advantage in this domain [2–5].
Most current research in this direction is of a heuristic nature [6, 7]. However, there also exists an emerging body
of results which provide examples of machine learning tasks for which one can rigorously prove a meaningful separation between the power of classical and quantum computational devices [8–11], even though such rigorous separations remain rather scarce [12]. Among these, the work of Ref. [11] has shown rigorously that one can obtain
a quantum advantage in generative modelling by constructing a distribution class which is (a) provably hard to gen-
erator learn classically, but (b) efficiently generator learnable using a fault-tolerant quantum computer. Importantly
however, Ref. [11] has not addressed the related task of density modelling.
In this work, we close this gap by showing that the class of probability distributions constructed in Ref. [11] in fact
also allows one to demonstrate a super-polynomial quantum-classical separation for density modelling. Additionally,
along the way we provide a variety of insights and additional results, which may be of independent interest for
constructing future quantum-classical separations in distribution learning. More specifically, in this work we do the
following:
1. Any quantum-classical separation requires a proof of classical hardness. The generative modelling separation
of Ref. [11] relies crucially on a result from Ref. [1] which shows that from any pseudo-random function (PRF)
one can construct a distribution class which is provably hard to generator learn classically. We strengthen this
fundamental tool, by showing that for the case of density modelling, any weak PRF can be used to construct
a distribution class which is provably hard to evaluator learn classically. This opens up the door for proving
classical hardness results for density modelling, based on weaker assumptions than those necessary for candidate
PRF constructions. In particular, the hope is that one may be able to prove classical hardness using assumptions
which do not immediately also rule out the possibility of efficient learning algorithms running on near-term
quantum devices.
2. We prove a super-polynomial quantum-classical separation for density modelling using the distribution class
from Ref. [11]. As the distribution class from Ref. [11] is constructed from a PRF, the classical hardness
follows immediately from the above-mentioned insight that even weak PRFs are sufficient for density modelling
hardness. As such, what remains is to provide an efficient quantum evaluator-learner for this distribution class,
and we show that a simple modification of the quantum generator-learner from Ref. [11] is sufficient to achieve
this.
3. The majority of work in computational learning theory has focused on the task of supervised learning of Boolean functions. As such, it is natural to ask to what extent hardness results and quantum-classical separations in supervised learning can be leveraged to obtain separations for distribution learning. We provide
an overview of the extent to which this is or is not possible, for both generative and density modelling. Once
again, the hope is that this provides a toolbox for proving future quantum-classical separations in distribution
learning.
This work is structured as follows: We start below in Section II by providing some essential definitions and back-
ground from both computational learning theory and cryptography. We note that as this work to a large extent
generalizes and extends Ref. [11], we do not provide all necessary background here, and we refer often to Ref. [11]
for a variety of definitions and constructions. Given the necessary background we proceed in Section III to present
a variety of techniques – both known and novel – for proving hardness results in distribution learning from hardness
results in supervised learning. Of particular interest is Section III B, in which we show that one can use any weak
PRF to construct a distribution class which is classically hard to evaluator learn. Using these tools, we then show in
Section IV a super-polynomial quantum-classical separation for density modelling, using the distribution class from
Ref. [11]. Finally, we conclude in Section V with a discussion and outlook.
II. BACKGROUND
To show a quantum-classical learning separation one needs, at the highest level, to prove two things: hardness for classical learners, and efficiency of a quantum learner. To introduce the necessary formalism, we start in this section by providing an overview of the PAC framework for learning both functions and distributions. Given this, we then present a construction from Ref. [1] which allows one to define distribution classes from function classes in a way which facilitates the conversion of function learning hardness into distribution learning hardness. Finally, we introduce weakly-secure (weak) pseudo-random functions, which will later be used, via the aforementioned construction from Ref. [1], to construct distribution classes for which the density modelling problem is provably hard for classical learning algorithms. In what follows, we denote:
δ(x, y): the indicator function, evaluating to 1 if and only if x = y,
x ∼ U(X): a sample x drawn from the uniform distribution over a set X (sometimes X is omitted when it is clear from the context),
poly(a, b): some polynomial in a and b, or sometimes the set of all polynomials in a and b,
{0,1}^n: the set of bit strings of length n,
a‖b: the concatenation of bit strings a and b,
1^n: the bit string consisting of n ones,
D(x): the probability mass assigned to x, if D is a probability distribution,
d_TV(P, Q): the total variation distance between distributions P and Q,
Z_q: the residue class ring Z/qZ.
Before introducing the formalism for analyzing distribution learning, we first recall Valiant's PAC learning framework for function learning [13], which since its proposal has been the standard framework for rigorously analyzing the complexity of supervised learning problems [14]. In it, we are concerned with learning some class of functions that map n bits to m bits. At a high level, for any target function in the class, a learning algorithm, when given some sort of oracle access to the unknown target function, should with high probability output a hypothesis function that is close to the target function.
For this function learning task, we distinguish between two different types of oracle access to the function f that is to be learned.¹ Firstly, the membership query oracle to f, denoted MQ(f), which when queried with x yields the tuple (x, f(x)). We denote this via

query[MQ(f)](x) = (x, f(x)).    (1)

Membership query access corresponds to the ability to evaluate f on chosen points. We sometimes refer to the membership query oracle in general, without any fixed function, simply as MQ. Secondly, the η-noisy random example oracle REX(f, P, η) for f is defined via

query[REX(f, P, η)] = (x, f(x))    with probability P(x)(1 − η),
                      (x, ¬f(x))   with probability P(x) η,        (2)

where P is a probability distribution over inputs, η ∈ [0, 1] is a noise rate, and ¬f(x) is any element in the image space of f except f(x). If η = 0, we also write REX(f, P). At a high level, this oracle generates random (possibly noisy) input-output tuples from f. If we refer to the random example oracle in general, without any fixed function, we write REX(P, η). Algorithmically, we consider a query to MQ or REX to take unit time.
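To make the two oracle types concrete, here is a minimal Python sketch (our own illustrative code; the paper defines the oracles abstractly). We restrict to a single output bit, so that ¬f(x) is simply the flipped bit 1 − f(x).

    import random
    from typing import Callable, Dict, Tuple

    def MQ(f: Callable[[str], int], x: str) -> Tuple[str, int]:
        # Membership query oracle, Eq. (1): evaluate f on a chosen point x.
        return (x, f(x))

    def REX(f: Callable[[str], int], P: Dict[str, float], eta: float) -> Tuple[str, int]:
        # Noisy random example oracle, Eq. (2): draw x ~ P, then with
        # probability eta return a corrupted label instead of f(x).
        x = random.choices(list(P.keys()), weights=list(P.values()))[0]
        y = f(x)
        if random.random() < eta:
            y = 1 - y  # for one output bit, the only image element other than f(x)
        return (x, y)

Note that the learner chooses the query point x when calling MQ, whereas REX draws x itself; this difference in power is precisely what distinguishes the two access models.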
We can now give the definition of a PAC learning algorithm for function classes. Note that here we use the notation O(f, D) to denote some oracle, which might be either MQ(f) or REX(f, D).

Definition 1 ((ε, δ, O)-PAC function learner for F). Let F be a class of functions, with f : {0,1}^n → {0,1}^m for all f ∈ F. Given some fixed ε, δ ∈ (0, 1), an algorithm A is an (ε, δ, O, D)-PAC function learner for F if, for all f ∈ F, when given oracle access O(f, D), with probability at least 1 − δ, A outputs a hypothesis h satisfying

Pr_{x∼D}[f(x) ≠ h(x)] ≤ ε.    (3)

The algorithm A is an (ε, δ, O)-PAC function learner for F if it is an (ε, δ, O, D)-PAC function learner for F for all distributions D. We call A an efficient (ε, δ, O)-PAC function learner for F if the time complexity of A is O(poly(n)). We call F (ε, δ, O)-PAC-hard if there exists no efficient (ε, δ, O)-PAC function learner for it.
The above definition refers to fixed accuracy and probability parameters (ε, δ), but we note that if these parameters are considered as variables, then an efficient learner is taken as one with time complexity O(poly(n, 1/ε, 1/δ)). In this case, the algorithm will be efficient with respect to the definition above for any ε, δ = Ω(1/poly(n)). Additionally, we stress that the learning algorithm could be either classical or quantum. Indeed, as we will see in this work, it is possible that there exists an efficient quantum learning algorithm for a given class, but no efficient classical learning algorithm.
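To make Definition 1 concrete, the following sketch (our own illustrative code, with hypothetical target and hypothesis functions) estimates the left-hand side of Eq. (3) empirically and checks it against ε.

    import random

    def empirical_error(f, h, sample_D, num_samples: int = 10_000) -> float:
        # Estimate Pr_{x ~ D}[f(x) != h(x)] from num_samples draws from D.
        samples = (sample_D() for _ in range(num_samples))
        disagreements = sum(f(x) != h(x) for x in samples)
        return disagreements / num_samples

    # Hypothetical example: f is the parity of a 4-bit string, h agrees with f
    # everywhere, and D is uniform over {0,1}^4, so the estimated error is 0.
    f = lambda x: x.count("1") % 2
    h = lambda x: x.count("1") % 2
    sample_D = lambda: "".join(random.choice("01") for _ in range(4))
    assert empirical_error(f, h, sample_D) <= 0.05  # eps = 0.05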
We would now like to generalize the PAC framework for learning functions to the natural and important problem of learning distributions. To formulate this problem rigorously, it is necessary to first introduce the different possible representations of a distribution that one might want to learn, namely generators and evaluators:

Definition 2 (Generator and evaluator for D). Let D be a discrete probability distribution over {0,1}^n. A generator for D is any function GEN_D : {0,1}^m → {0,1}^n that on uniformly random inputs outputs samples according to D, i.e.,

Pr_{x∼U({0,1}^m)}[GEN_D(x) = y] = D(y).    (4)

An evaluator for D is any function EVAL_D : {0,1}^n → [0, 1] that evaluates the probability mass assigned to an event with respect to D, i.e.,

EVAL_D(x) = D(x).    (5)
¹ We note that one can also consider many other types of oracle access, such as, for example, statistical query access [15].
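To make Definition 2 concrete, here is a minimal sketch (our own illustrative code; the seed length m = 2 and the particular map below are made-up choices) of a generator and an evaluator for a small distribution over {0,1}^2, together with an exhaustive check of Eq. (4).

    import itertools
    from collections import Counter

    def GEN_D(seed: str) -> str:
        # Deterministic map from uniform 2-bit seeds to samples: two of the
        # four seeds map to "00", so D("00") = 1/2 and D("01") = D("10") = 1/4.
        return {"00": "00", "01": "00", "10": "01", "11": "10"}[seed]

    def EVAL_D(x: str) -> float:
        # Evaluator returning the probability mass D(x), as in Eq. (5).
        return {"00": 0.5, "01": 0.25, "10": 0.25, "11": 0.0}[x]

    # Check Eq. (4): push the uniform distribution over all seeds through
    # GEN_D and compare the resulting probabilities to the evaluator's output.
    seeds = ["".join(bits) for bits in itertools.product("01", repeat=2)]
    counts = Counter(GEN_D(s) for s in seeds)
    for x in ["00", "01", "10", "11"]:
        assert counts[x] / len(seeds) == EVAL_D(x)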