Deep Neural Networks as the Semi-classical Limit of Topological Quantum Neural
Networks:
The problem of generalisation
Antonino Marcianò,1 Emanuele Zappala,2 Tommaso Torda,3 Matteo Lulli,4 Stefano Giagu,3 Chris Fields,5 Deen Chen,6 and Filippo Fabrocini7

1 Center for Field Theory and Particle Physics & Department of Physics, Fudan University, Jingwan campus, Jingsan Rd, 200433 Shanghai, China; Laboratori Nazionali di Frascati INFN, Via Enrico Fermi, 54, 00044 Frascati (Rome), Italy, EU; and INFN sezione Roma “Tor Vergata”, 00133 Rome, Italy, EU
2 Department of Mathematics and Statistics, Idaho State University, Physical Science Complex, 921 S. 8th Ave., Stop 8085, Pocatello, ID 83209
3 Department of Physics, University of Rome “La Sapienza”, Piazzale Aldo Moro, 5, 00185 Rome, Italy; and Laboratori Nazionali di Frascati INFN, Via Enrico Fermi, 54, 00044 Frascati (Rome), Italy
4 Department of Mechanics and Aerospace Engineering, Southern University of Science and Technology, 1088 Xueyuan Avenue, 518055 Shenzhen, China
5 Allen Discovery Center, Tufts University, 200 College Avenue, 02155 Medford, MA, US
6 College of Design and Innovation, Tongji University, 281 Fuxin Rd, 200092 Shanghai, China
7 College of Design and Innovation, Tongji University, 281 Fuxin Rd, 200092 Shanghai, China; and Institute for Computing Applications "Mario Picone", Italy National Research Council, Via dei Taurini, 19, 00185 Rome, Italy, EU

Electronic addresses: marciano@fudan.edu.cn, emanuelezappala@isu.edu, tommaso.torda@uniroma1.it, lulli@sustech.edu.cn, fieldsres@gmail.com, deenchen@tongji.edu.cn, fabrocini@tongji.edu.cn
Deep Neural Networks lack a principled model of their operation. A novel framework for supervised learning based on Topological Quantum Field Theory, one that looks particularly well suited for implementation on quantum processors, has recently been explored. We propose using this framework to understand the problem of generalisation in Deep Neural Networks. More specifically, in this approach Deep Neural Networks are viewed as the semi-classical limit of Topological Quantum Neural Networks. A framework of this kind explains the overfitting behavior of Deep Neural Networks during the training step and their corresponding generalisation capabilities. We explore the paradigmatic case of the perceptron, which we implement as the semi-classical limit of Topological Quantum Neural Networks. We apply a novel algorithm we developed, showing that it obtains results similar to those of standard neural networks, but without the need for training (optimisation).
I. INTRODUCTION
The problem of generalisation for deep neural networks
(DNNs), i.e. neural networks with several hidden layers,
is the problem of understanding how DNNs can success-
fully generalise. DNNs can achieve test errors that are very close to their training errors even when they have sufficiently many model parameters to demonstrably “memorise” the training data [1–3]. In other words, the prob-
lem of generalisation is the problem of understanding why
(at least some) overparametrised DNNs do not fail to
generalise when they display an overfitting regime. Gen-
eralisation failure would be expected from the standard
bias-variance trade-off; however, many DNNs evince a
“double-descent” behavior, in which the generalisation
performance increases, instead of decreasing, as the num-
ber of model parameters grows [4]. As shown by Zhang
et al. [1], the methods of conventional generalisation the-
ory, i.e. statistical learning theory, demonstrably fail to
explain this behavior. The problem of generalisation has
generated a rich literature [5–17], but remains unsolved.
As Kevin Hartnett [18] has written in Quanta Magazine:
When we design a skyscraper, we expect
it will perform to specification: that the
tower will support so much weight and be
able to withstand an earthquake of a certain
strength. But with one of the most important
technologies of the modern world, we are ef-
fectively building blind. We play with differ-
ent designs, tinker with different setups, but
until we take it out for a test run, we do not
really know what it can do or where it will
fail.
We do not really know, in fact, after one or even sev-
eral test runs: it always remains the case that the “next”
generalisation problem will reveal failure. A principled
approach for studying generalisation in DNNs that goes
beyond conventional generalisation theory is, therefore,
needed.
Here we bring to completion our previous conjecture
[19] that DNN architectures can be considered the semi-
classical limit of a generalised quantum neural-network
(QNN) architecture, the topological quantum neural net-
work (TQNN). To this purpose, we elaborate on the
theoretical framework we proposed in [19], and provide
empirical evidence to support our thesis, ultimately de-
riving a novel answer to the problem of generalisation.
TQNNs differ from conventional QNNs [20,21] by allow-
ing the number of “layers” and their connection-topology
to vary arbitrarily, provided only that the input and out-
put boundary conditions are preserved. The full gen-
erality of the TQNN framework as a representation of
computational processes acting on classical data to yield
classical outputs has been recently proven [22]. We pro-
pose, in particular, that DNNs generalise because they
are classical limits of TQNNs. DNNs are, from this per-
spective, implementations of distributed, high-entropy,
error-correcting codes that can be seen as classical lim-
its of the quantum error-correcting codes (QECCs) [23]
implemented by TQNNs [24]. Increasing the model ca-
pacity of a DNN increases its generalisation ability be-
cause it increases the redundancy available in the code
space. We demonstrate, using the perceptron architec-
ture as a simplified example, that a TQNN can be used
to directly compute the parameters needed to generalise
from a training set, in the absence of any actual train-
ing of the network. This raises the possibility of replacing
data-intensive training of DNNs with quantum computa-
tions, with a significant increase in efficiency and decrease
in operational costs.
The plan of the paper is the following. In §II, we
provide a formal definition of generalisation (§II A), and
briefly review some expectations of conventional general-
isation theory and how they have been observed to fail
in DNNs (§II B). We then show, via a simple thought
experiment, how generalisation can be expected to im-
prove as parameters are added when a QECC is employed
(§II C). We provide a brief review of topological quantum
field theory (TQFT), the theoretical framework under-
lying TQNNs, in §III, and summarise the TQNN con-
struction in §IV. In §V we discuss the theoretical foun-
dations of the correspondence between generalisation and
the semi-classical limit of TQNNs. Specifically, in §V A
we introduce the concept of generalisation as the semi-
classical limit of TQNNs. In §V B we discuss the paradig-
matic case of the sum over quantum histories of a one-
particle state, considered in analogy with a perceptron.
We show that generalisation arises naturally from the
path-integral formalism. In §V C we extend this con-
struction to any TQNN, by rephrasing the path-integral
realising the sum over quantum histories for the percep-
tron, with generic spin-network states. In §VI we computationally showcase the theory of the perceptron as a semi-classical limit of a TQNN. In §VII we discuss possible per-
spectives to develop the current theoretical framework.
Finally, in §VIII we spell out our conclusions.
II. BACKGROUND AND MOTIVATION
A. The generalisation bound problem
Informally, generalisation is the ability to identify
members of some set (or class) D after being exposed to the members of some proper subset S ⊂ D. The generalisation ability of machine learning (ML) systems is tested by exposing them to new members of D after training via backpropagation (or a similar algorithm) on the members of S. Whether they have identified new members of D correctly is determined by human knowledge of D.
This knowledge is, however, not explicitly characterised:
if we had an algorithmic specification of the members of
D, we could encode that specification directly and ML
would not be necessary. We can make this dependence
on non-algorithmic human knowledge explicit by defining
generalisation as follows.
We can, without loss of generality, treat D as a set of N-bit strings, i.e. D = {0,1}^N. Let S ⊂ D be the training set. Let Y = {0,1}^M, with M ≤ N, be the set of possible “classes” or “answers” to be obtained by acting on D. In most cases of interest, M ≪ N, e.g. classifying handwritten characters, distinguishing cats from dogs, or identifying individual people from photographs.
Now consider two agents A and B, who implement functions f_A, f_B : D → Y and g_A, g_B : S → Y respectively, where we assume g_A = f_A|_S and g_B = f_B|_S. This latter assumption is true even if A and/or B is trained on a set S′ ⊇ S that also contains noise or distractors, as in [2]. We can call A the “ground truth” observer (e.g. a human) and B the “test” observer (e.g. a ML system). The “training error” is d(g_A, g_B), where d is some metric on the relevant function space, and the “test error” or “generalisation error” is d(f_A, f_B). We can measure d(g_A, g_B) because we can explicitly list every instance of g_A and g_B acting on S. Both f_A and f_B are unknown, so we cannot compute d(f_A, f_B).
We can now state the generalisation bound problem
(GBP) as: given d(g_A, g_B) ≤ ϵ, determine an upper bound β ≥ ϵ such that d(f_A, f_B) ≤ β. There is an a priori upper bound on β: 2^N, corresponding to A and B disagreeing on every instance in D. In this case, ϵ = 2^M. Our goal is to make β → ϵ as ϵ → 0, which corresponds to both optimal learning and S being optimally representative of D.
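To make the definitions above concrete, the following toy sketch instantiates D = {0,1}^N with N = 8, fixes an arbitrary ground-truth rule f_A and an arbitrary imperfect model f_B (both are illustrative assumptions, not part of the construction above), and uses the disagreement rate as one possible choice of the metric d. The training error is computable by listing S; the test error is computable here only because this toy D is fully enumerable.

```python
import itertools
import random

random.seed(0)

N = 8                                    # D = {0,1}^N
D = list(itertools.product([0, 1], repeat=N))

def f_A(x):                              # "ground truth" observer A (illustrative rule)
    return sum(x) % 2                    # parity of the bit string

def f_B(x):                              # "test" observer B (an imperfect model)
    return sum(x[:6]) % 2                # ignores the last two bits

S = random.sample(D, 32)                 # training set S, a proper subset of D

def d(u, v, X):                          # disagreement rate: one choice of the metric d
    return sum(u(x) != v(x) for x in X) / len(X)

train_error = d(f_A, f_B, S)             # d(g_A, g_B): computable, since S is finite and listed
test_error = d(f_A, f_B, D)              # d(f_A, f_B): computable only because this toy D is known

print(f"training error d(g_A, g_B) = {train_error:.3f}")
print(f"test error     d(f_A, f_B) = {test_error:.3f}")
```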
It is worth noting that the GBP cannot be solved in
the general case. The set D may not be computable; we know, for example, that the generalisation from some set S of programs that halt to the set D of all programs that
halt is not computable [25]. There is, moreover, every
reason to expect that human judges may disagree about
edge cases for some classes of objects, and hence about
membership in some candidate sets D, rendering test er-
ror in such cases observer-dependent. It is also worth
noting that humans are every bit as much black boxes
as any ML system — the “explanation problem” [26,27]
for humans is even harder than it is for ML systems,
because experimenting on humans is more difficult than
experimenting on machines.
B. How conventional generalisation theory fails to
explain DNN generalisation
The goal of generalisation theory is to explain and
justify why and how minimising the empirically mea-
surable training error d(g) = d(g_A, g_B) is a reasonable approach to minimising the test error d(f) = d(f_A, f_B) by analysing the generalisation gap, i.e. d(f) − d(g). In practice, the generalisation gap is typically taken to be f_B − g_B, i.e. the “gap” is taken to characterise the performance of the ML system being tested. This implicitly assumes that g_A = f_A, i.e. that the “ground truth” observer A can identify all elements of D; we have seen above that this assumption can be incorrect. With this assumption, the gap results entirely from the dependence of the trained predictor g_S (here we drop the “agent” sub-
script for convenience) on the training set S. However,
as noted above, S might not be sufficiently representative of D to direct the learner towards a good classifier for all
of D. In a ML context one is, moreover, more interested
in the generalisation error than in the empirical train-
ing error; the main goal is evaluating the performance
of the model on the unseen cases, i.e. on D\S. If we
look at the problem from this point of view, then the
challenge comes from the mismatch between the optimi-
sation task of minimising the empirical or training risk
and the machine learning task of minimising the true or
test risk. Treating D as a probability distribution over possible objects x, we can represent this mismatch as:

L_{D,f}(g_S) = P_{x∼D}[g_S(x) ≠ f(x)] = D[{x : g_S(x) ≠ f(x)}],

where the error due to using g_S is the probability of randomly drawing an example x, according to the distribution D, for which g_S(x) ≠ f(x), measured with respect to the probability distribution D and the correct labeling function f. Here P is the probability of a random variable, and x ∼ D represents sampling x according to D.
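As a minimal numerical illustration of this mismatch (with an invented distribution D, target f and predictor g_S, chosen only for the sketch), the snippet below memorises a finite training set perfectly, so the empirical risk vanishes, while a Monte Carlo estimate of L_{D,f}(g_S) stays close to chance level:

```python
import random

random.seed(1)

def sample_D():                              # x ~ D: here, uniform over {0,1}^16 (an assumption)
    return tuple(random.randint(0, 1) for _ in range(16))

f = lambda x: sum(x) % 2                     # correct labelling function f (illustrative)

S = [sample_D() for _ in range(200)]         # training set S
lookup = {x: f(x) for x in S}                # g_S memorises ("interpolates") S ...
g_S = lambda x: lookup.get(x, 0)             # ... and guesses 0 on unseen inputs

train_risk = sum(g_S(x) != f(x) for x in S) / len(S)
fresh = [sample_D() for _ in range(100_000)]
true_risk = sum(g_S(x) != f(x) for x in fresh) / len(fresh)   # Monte Carlo estimate of L_{D,f}(g_S)

print(f"empirical (training) risk = {train_risk:.3f}")        # 0.000: S is fitted perfectly
print(f"estimated true risk      ~= {true_risk:.3f}")         # ~0.5: no generalisation at all
```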
The dependence of the predictor g_S on the training data set S has led to the definition of several complexity measures providing bounds on the test error or on the sample complexity. Instances of such measures are the VC-dimension and the Rademacher complexity. In particular, while the VC-dimension provides an upper bound on the test error, the Rademacher complexity measures the richness of a class of predictors with respect to a probability distribution. These results provide bounds on the test error that depend on the complexity or capacity of the class of functions G from which g_S is drawn. In this sense, generalisation theory and capacity control are strictly related. Capacity control consists in using models that are rich enough to fit the data well, while avoiding models so rich that they overfit.
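For reference, one standard form of such bounds (the constants vary across references, so the expressions below should be read as indicative rather than as the sharpest available statements) is: with probability at least 1 − δ over the draw of an n-element training sample S, uniformly over predictors g in the class G,

```latex
L_{D}(g) \;\le\; L_{S}(g) + \sqrt{\frac{8\left(d_{\mathrm{VC}}(G)\,\ln(2en/d_{\mathrm{VC}}(G)) + \ln(4/\delta)\right)}{n}},
\qquad
L_{D}(g) \;\le\; L_{S}(g) + 2\,\mathfrak{R}_{n}(G) + \sqrt{\frac{\ln(1/\delta)}{2n}},
```

where L_D and L_S denote the true and empirical risks, d_VC(G) is the VC-dimension, and R_n(G) is the Rademacher complexity of G on samples of size n. Both bounds degrade as the capacity term grows relative to n, which is precisely the behaviour that overparametrised DNNs appear to defy.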
The empirical success of DNNs, which are typically
overparametrised models [28], challenges the traditional
complexity measures. Indeed, according, for instance, to the VC-dimension, the discrepancy between training error and generalisation error is bounded from above by a quantity that grows at least linearly as the number of adjustable parameters grows, but shrinks as the number of training examples increases. As a consequence, the traditional complexity measures tend to control the capacity by minimising the number of parameters [29–31]. Moreover, experimental results show that DNNs
that have been trained to interpolate the training data
achieve a near-optimal test result even when the train-
ing data have been corrupted by a massive amount of
noise [2]. As Poggio et al. [3] write, the main puzzle of DNNs revolves around the absence of overfitting despite
large overparametrisation and despite the large capacity
demonstrated by zero training error on randomly labeled
data. The puzzle is, in other words, the fact that DNNs can implement an operation L : (r, g_S) ↦ f, where r is some initial, e.g. random, network state, which (at least approximately) correctly generalises to the desired f from a function g that is not the restriction of f to the total training set S, but is rather a noise function, namely a function of randomised data, on S.
Historically, this puzzle was raised by a seminal paper
of Zhang et al. [1] mentioned earlier; see also [32]. This
paper showed that successful deep model classes have suf-
ficient capacity to memorise randomised data sets while
having the ability to produce zero training error for par-
ticular natural datasets, e.g. CIFAR-10. The authors
also empirically observed that explicit regularisation on
the norm of weights seemed to be unnecessary to obtain
small test errors, in contradiction to the “traditional wis-
dom” of generalisation theory. In the case of DNNs, the
large capacity of the model looks sufficient to memorise
the training data by brute force. This behavior conflicts
with conventional generalisation theory, since learning by
explicit memorisation of training examples should not
imply generalisation capabilities. Generalising is tradi-
tionally understood as learning some underlying rule as-
sociated with the data generation process, i.e. learning
some compact representation of the training function g_S,
and therefore being able to extrapolate that rule from
the training data to new unseen data. Moreover, as we
have already mentioned, a result of this kind is a chal-
lenge to traditional complexity measures and, in general,
to computational learning theory, since none of the exist-
ing bounds produces non-trivial results for interpolating
solutions of the sort generated by such DNN models.
Several efforts have recently focused on achieving non-
vacuous generalisation bounds for DNN models [11,33,
34]. In the light of this situation, Belkin et al. [4] pro-
pose to subsume the traditional U-shaped risk curve de-
scribing the trade-off between underfitting and overfitting with a double-descent risk curve (Figure 1).

FIG. 1: Traditional U-shaped risk curve, describing the trade-off between underfitting and overfitting, and double-descent risk curve, from [4, Figure 1].

From the
standard viewpoint, when the number of parameters N
is much smaller than the sample size n, i.e. N ≪ n, the traditional complexity measures assume that the training risk is close to the test risk. However, according to the double-descent risk curve that the authors propose, progressively increasing N, and thus the class capacity, enriches the function classes until they are rich enough to achieve zero training error. As a consequence, near-perfect-fit functions can be progressively constructed. These typically have a smaller norm and are thus simpler in the sense of Occam’s razor. A view of this kind contradicts the traditional framework, according to which, by increasing the class capacity, the predictors will perfectly interpolate the data; in this case, once an overfitting regime is reached, a very high generalisation risk is incurred as well. Yet, as the authors state, increasing the function class capacity further beyond this point leads to decreasing test risk, typically below the risk achieved when balancing underfitting and overfitting under the traditional approach.
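The double-descent behaviour can be reproduced in a few lines with a random-feature regression experiment in the spirit of [4]; everything below (target function, noise level, feature counts, ReLU features, minimum-norm least-squares fit) is an illustrative choice, not the authors' protocol. The test error typically peaks near the interpolation threshold, where the number of features roughly equals the number of training points, and then decreases again as the model is overparametrised:

```python
import numpy as np

rng = np.random.default_rng(0)

n_train, n_test, d = 40, 1000, 5
f_true = lambda X: np.sin(X @ np.ones(d))            # unknown target function (assumed)

X_tr = rng.normal(size=(n_train, d))
y_tr = f_true(X_tr) + 0.1 * rng.normal(size=n_train) # noisy training labels
X_te = rng.normal(size=(n_test, d))
y_te = f_true(X_te)

def features(X, W):
    return np.maximum(X @ W, 0.0)                    # random ReLU features

for n_feat in [5, 10, 20, 40, 80, 160, 640, 2560]:   # crosses the interpolation threshold at 40
    W = rng.normal(size=(d, n_feat)) / np.sqrt(d)
    Phi_tr, Phi_te = features(X_tr, W), features(X_te, W)
    beta = np.linalg.pinv(Phi_tr) @ y_tr             # minimum-norm (pseudo-inverse) solution
    train_mse = np.mean((Phi_tr @ beta - y_tr) ** 2)
    test_mse = np.mean((Phi_te @ beta - y_te) ** 2)
    print(f"{n_feat:5d} features: train MSE {train_mse:.3f}, test MSE {test_mse:.3f}")
```

The use of the pseudo-inverse is the key design choice here: past the interpolation threshold it selects the minimum-norm interpolating solution, which is what produces the second descent.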
This view matches recent results from Poggio et al. [3]
and Bartlett et al. [35] according to which overparametri-
sation leads to “benign overfitting” in which an accurate
level of generalisation is achieved despite a near-perfect
fit to training data. In particular, according to these au-
thors, overparametrisation enables gradient techniques to
impose regularisation implicitly while leading to accurate
predictions despite their overfitting behaviour. Such behaviour favours weight matrices with a smaller norm, and hence a more distributed representation. This is a result
that matches the thesis that the generalisation perfor-
mance also depends on the size of the weights [12]. Both
of these assumptions guide our approach as well, even
though they will be embedded into a theoretical frame-
work totally different from classical complexity theory.
C. A quantum approach to generalisation
Quantum computing introduces a new and fundamen-
tally nonclassical resource for the representation of data:
quantum coherence, typically in the form of quantum
entanglement [36]. Just as classical computers employ
additional bits — from single-bit parity checks to full
backup copies — to detect and correct errors and hence
enable robustness in the face of noise or other perturba-
tions, quantum computers employ additional quantum
bits (qbits) to implement QECCs. The extreme sensi-
tivity of quantum states to environmental perturbations,
which induce decoherence [37] and hence “erase” data,
requires that effective QECCs have large dimensional-
ity, typically at least two qbits for every qbit to be pro-
tected [38]. Adding dimensionality, i.e. adding qbits to
the code-space of the implemented QECC, to a quantum
computer increases the range of perturbations against
which the encoded data will be protected [23].
The relevance of a high-dimensionality QECC to the
generalisation ability of a QNN can be made obvious
with a simple thought experiment. Consider a two-player
game in which A and B share a set Y of classical tokens, and share a quantum channel X. A selects a token y ∈ Y and encodes it into X using some quantum operator Q_A, which is nondeterministic in the sense that Q_A(y) depends on the immediately-previous state of X. B then uses an operator Q_B to read a token y′ from X. B sends y′ to A, and A sends back either ‘yes’ or ‘no’. B’s objective is to vary Q_B so as to receive ‘yes’ answers from A. Hence asymptotically, B’s objective is to make Q_B → Q_A.
This game is an example of local operations, classical
communication (LOCC) protocol [39]. The local opera-
tions are the encoding and measurement steps that em-
ploy Q_A and Q_B, respectively; the classical communication is the exchange of classical data (y′) and ‘yes’ or ‘no’ answers. The quantum channel subserving any successful LOCC protocol must implement a QECC to protect the quantum encoding of classical data [24]. In this game, X implements a QECC, with Q_A and Q_B the encoding
and decoding operators. The dimension dim(X) must,
therefore, be much larger than dim(Y); how much larger
depends on what range of perturbations the code protects
against.
Suppose A and B play this game for n rounds, at the end of which B is consistently getting ‘yes’ from A. What can be said about further rounds of the game? This is, clearly, the question of generalisation: the question of how similar Q_B is to Q_A after n rounds of “training”. If the QECC implemented by X protects against some set B_α of perturbations, and any encoding x = Q_A(y) is such that there exists B_α such that x = B_α(x_i), with x_i ∈ {x_1, . . . , x_n}, then B will receive a ‘yes’ after decoding x. This corresponds to the x_1 . . . x_n being representative of Q_A(y) for arbitrary y and arbitrary previous |X⟩, given the set B_α. If we know the B_α, we can find a lower bound on dim(X). Increasing dim(X) increases the set of perturbations that X can correct against. Here it is clear that the generalisation problem is not solvable in the general case: a general solution of the generalisation problem would be a QECC that protects against a provably-complete set of perturbations mapping S to D for arbitrary S and D. If D is unknown, no set of perturbations can be proved to be complete.
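A purely classical analogue of this redundancy argument, using a repetition code rather than a genuine QECC (the code length r and the flip probability p below are arbitrary choices), already shows how enlarging the code space enlarges the set of perturbations that can be corrected:

```python
import random

random.seed(2)

# A length-r repetition code protects one logical bit against up to floor((r-1)/2)
# bit flips, so increasing the redundancy r increases the range of correctable noise.
def decode_failure_rate(r, p, trials=20000):
    failures = 0
    for _ in range(trials):
        bit = random.randint(0, 1)
        codeword = [bit] * r                                   # encode with redundancy r
        noisy = [b ^ (random.random() < p) for b in codeword]  # independent bit-flip noise
        decoded = int(sum(noisy) > r / 2)                      # majority-vote decoding
        failures += decoded != bit
    return failures / trials

for r in [1, 3, 5, 9, 17]:
    print(f"r = {r:2d}: decoding failure rate ~= {decode_failure_rate(r, p=0.2):.4f}")
```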
In this thought experiment, the training and test sets
S and D are, effectively, measurements of the state in the
“middle” of X, reflecting the fact that from a quantum-
computing perspective, the “world” outside the commu-
nicating agents is a quantum information channel. We
can think of A and B — a human and a ML system undergoing training — as converging on a shared language — a shared set of concepts. The goal is for B’s concepts — elements of Y — to refer to the same things as A’s concepts; they approach agreement asymptotically by playing the above game. They each employ a quantum channel, the physical world, to generate a classical encoding, D. They take turns showing each other an instance of D and naming a concept. This is how, for
example, teaching a child to speak a language works.
In this quantum setting, there is no overfitting prob-
lem, because X does not separate into components that individually encode particular inferences. Instead, X em-
ploys a high-dimensional entangled state to maximally
distribute the representation of both the data and the
implemented computation (in the above case, an Iden-
tity map). A QECC is, in other words, as far from a
classical look-up table as one can get; it effectively takes
such a table and fully entangles it. Hence for any channel that implements a QECC, the test risk goes to zero as the
dimensionality, and hence the range of perturbations the
QECC protects against, increases.
As mentioned earlier, DNNs are the classical limits
of TQNNs, a generalisation of conventional quantum-
gate based, layered QNNs [19]. As quantum comput-
ers, TQNNs employ QECCs to achieve robustness [24].
Hence TQNNs, and DNNs as their classical limits, gen-
eralise better as their dimensionality increases. Indeed, minimising the norm of a DNN’s weight matrix — maximising
the classical entropy of the representation — is the clas-
sical limit of the use of entanglement to achieve a maxi-
mally distributed quantum encoding.
III. BASICS ON TOPOLOGICAL QUANTUM
FIELD THEORY
In this section we recall some basic facts about TQFTs
and give a general overview of the mathematical formal-
ism used in this article. While TQFTs use the language
of category theory, we will not delve into categorical ap-
proaches, but simply explain the meaning of certain cat-
egorical notions used in TQFT. We will also provide a
physical perspective on TQFTs, which is the driving motivation of this work, namely the application of this framework to machine learning (in the semi-classical limit).
A. Mathematical foundation
In general mathematical terms, using the Atiyah-Segal
axioms [40,41], a TQFT is a functor from the category of
cobordisms to the category of vector spaces. This defini-
tion, while very concise, is not particularly illuminating.
We will therefore explain the most important points of
this definition in a simplified manner.
First, a cobordism is a manifold of dimension n + 1
whose boundary is a union of manifolds of dimension n.
As a simple example one can think of a cylinder. In fact,
a cylinder is a manifold of dimension 2, while a circle is a
manifold of dimension 1, and the boundary of a cylinder
is a union of two circles. By category, in this context,
it is meant that it is possible to perform operations be-
tween manifolds, such as glueing manifolds along their
boundaries. One can therefore imagine, for example, glueing a cylinder to another cylinder along circles that constitute part of their boundaries. In this way, we can com-
pose manifolds by glueing along homeomorphic bound-
aries. This procedure is understood to be orientation-
preserving. In addition, we have another type of opera-
tion, the “tensor product”, which consists of taking the
disjoint union of manifolds.
The category of vector spaces in the context of TQFT
simply refers to the class of vector spaces over a given
field, e.g. C, and the linear maps between them. Com-
position is the usual composition of maps. The tensor
product, which is the second operation defined for cobor-
disms, is just the regular tensor product of vector spaces
and linear maps.
With these definitions in place, we are in the posi-
tion to state what a functor between cobordisms and
vector spaces is, and therefore to give a more descrip-
tive definition of TQFT. Such a functor is an assignment
of n-dimensional manifolds to vector spaces, and cobor-
disms between manifolds to linear maps between the cor-
responding vector spaces. One can think of a functor as
a translation of the terminology from a category (the one
of cobordisms in our case) to another category (the one
of vector spaces in our case). The correspondence should
satisfy certain coherence axioms, in the sense that com-
positions are sent to compositions, and tensor products
are sent to tensor products.
If we denote by Cob the category of cobordisms, and by Vec_k the category of vector spaces over a field k, then a TQFT is a functor F : Cob → Vec_k, meaning that F(M) is a vector space for each n-dimensional manifold M, and any cobordism Σ with boundary M_1 ⊔ M_2 corresponds to a linear map F(Σ) : F(M_1) → F(M_2).
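The functorial bookkeeping can be mimicked in a few lines of linear algebra; the sketch below only illustrates the composition and tensor-product rules (the vector space assigned to a circle and the matrices standing in for cobordisms are arbitrary choices), and is not a construction of an actual TQFT:

```python
import numpy as np

# Each circle is assigned the vector space C^2; a disjoint union of circles gets the
# tensor product of the factors; a cobordism is recorded as a linear map; glueing of
# cobordisms is composition of maps; disjoint union of cobordisms is the Kronecker product.
circle = 2                                        # dim F(S^1), chosen arbitrarily

def disjoint_union(A, B):                         # F(M1 ⊔ M2) = F(M1) ⊗ F(M2)
    return np.kron(A, B)

def glue(second, first):                          # F(Σ2 ∘ Σ1) = F(Σ2) F(Σ1)
    return second @ first

cylinder = np.eye(circle)                         # the cylinder acts as the identity cobordism
pair_of_pants = np.random.default_rng(0).normal(size=(circle, circle**2))  # two circles -> one circle

# Functoriality checks: glueing a cylinder onto a cobordism changes nothing, and the
# tensor product of identities is the identity on the tensor-product space.
assert np.allclose(glue(cylinder, pair_of_pants), pair_of_pants)
assert np.allclose(disjoint_union(cylinder, cylinder), np.eye(circle**2))
print("composition and tensor-product axioms hold in this toy model")
```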