Hierarchical quantum circuit representations for neural architecture search
Matt Lourens,1 Ilya Sinayskiy,2,3 Daniel K. Park,4 Carsten Blank,5 and Francesco Petruccione3,6,1
1Physics Department, Stellenbosch University, Stellenbosch, South Africa
2School of Chemistry and Physics, University of KwaZulu-Natal, Durban, South Africa
3National Institute for Theoretical and Computational Sciences (NITheCS), South Africa
4Department of Statistics and Data Science, Yonsei University, Seoul, Korea
5Data Cybernetics, Landsberg, Germany
6School of Data Science and Computational Thinking,
Stellenbosch University, Stellenbosch, South Africa
Machine learning with hierarchical quantum circuits, usually referred to as Quantum
Convolutional Neural Networks (QCNNs), is a promising prospect for near-term quantum
computing. The QCNN is a circuit model inspired by the architecture of Convolutional
Neural Networks (CNNs). CNNs are successful because they do not need manual feature
design and can learn high-level features from raw data. Neural Architecture Search
(NAS) builds on this success by learning network architecture and achieves
state-of-the-art performance. However, applying NAS to QCNNs presents unique
challenges due to the lack of a well-defined search space. In this work, we propose a
novel framework for representing QCNN architectures using techniques from NAS, which
enables search space design and architecture search. Using this framework, we generate
a family of popular QCNNs, those resembling reverse binary trees. We then evaluate
this family of models on a music genre classification dataset, GTZAN, to justify the
importance of circuit architecture. Furthermore, we employ a genetic algorithm to
perform Quantum Phase Recognition (QPR) as an example of architecture search with our
representation. This work provides a way to improve model performance without
increasing complexity and to jump around the cost landscape to avoid barren plateaus.
Finally, we implement the framework as an open-source Python package to enable
dynamic QCNN creation and facilitate QCNN search space design for NAS.
INTRODUCTION
Machine learning using trainable quantum circuits
provides promising applications for quantum computing
[1–4]. Among various parameterized quantum circuit
(PQC) models, the Quantum Convolutional Neural
Network (QCNN) introduced in Ref [5] stands out for
its shallow circuit depth, absence of barren plateaus
[6], and good generalisation capabilities [7]. It has
been implemented experimentally [8] and combines
techniques from Quantum Error Correction (QEC),
Tensor Networks (TNs) and deep learning. Research at
this intersection has been fruitful, yielding deep learning
solutions for quantum many-body problems [9–12],
quantum-inspired insights for deep learning [13–15] and
equivalences between them [16–18]. Deep learning has
been widely successful in recent years with applications
spanning from content filtering and product recommendations
to aided medical diagnosis and scientific
research. Its main characteristic, learning features from
raw data, eliminates the need for manual feature design
by experts [19]. AlexNet [20] demonstrated this and
marked the shift in focus from feature design to
architecture design [21]. Naturally, the next step is learning
network architecture, which Neural Architecture Search
(NAS) aims to achieve [22]. NAS has already produced
state-of-the-art deep learning models with automatically
designed architectures [21, 23–25].
Corresponding author: lourensmattj@gmail.com
NAS consists of three main components: the search space,
the search strategy, and the performance estimation
strategy [22]. The search
space defines the set of possible architectures that a
search algorithm can consider, and carefully designed
search spaces help improve search efficiency and reduce
computational complexity [26]. Search space design
often involves encoding architectures using a cell-based
representation. Usually, a set of primitive operations,
such as convolutions or pooling, is combined into a
cell to capture some design motif (a compute graph).
Different cells are then stacked to form a complete
architecture. Cell-based representations are popular
because they can capture repeated motifs and modular
design patterns, which are often seen in successful
hand-crafted architectures. Similar patterns also appear
in quantum circuit designs [5, 27–31]. For example,
Grant et al. [27] use hierarchical architectures based
on tensor networks to classify classical and quantum
data. Similarly, Cong et al. [5] use the multiscale
entanglement renormalisation ansatz (MERA) as an
instance of their proposed QCNN and discuss
generalisations for quantum analogues of convolution and pooling
operations. In this work, we formalise these design
patterns by providing a hierarchical representation for
QCNNs, thereby capturing their architecture in such a way
as to facilitate search space design for NAS with PQCs.
The QCNN belongs to the class of hybrid quantum-
classical algorithms, in which a quantum computer
executes the circuit, and a classical computer optimises
its parameters. Two key factors must be considered
when using PQCs for machine learning: the method
arXiv:2210.15073v3 [quant-ph] 7 May 2023
FIG. 1: The machine learning pipeline we implemented for music genre classification. Given an audio signal of a song
(a), we generate two forms of data: tabular (b) and image (c). Each form has data preprocessing applied before being
encoded into a quantum state (d). The QCNN circuit shown in (d) favours Principal Component Analysis (PCA)
because qubits are pooled from bottom to top, and principal components are encoded from top to bottom. This
architecture is an instance of the reverse binary tree family that we generated with our framework.
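The caption's point about PCA and pooling direction admits a compact sketch: reduce the tabular features to eight principal components (one per qubit, most informative first) and turn them into rotation angles for qubit encoding. This is an illustration only; the function names and the [0, π] rescaling are our choices, not the paper's exact preprocessing.

```python
import numpy as np

# Sketch of the preprocessing in Fig. 1 (b) -> (d): reduce tabular features
# to N = 8 principal components and assign them to qubits top to bottom, so
# the most informative components sit on qubits that survive pooling longest.
# Circuit construction itself is left to whichever quantum SDK is in use.

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 30))  # stand-in for the tabular audio features

def pca(X, n_components=8):
    Xc = X - X.mean(axis=0)                # centre the data
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T        # scores, ordered by explained variance

Z = pca(X)
sample = Z[0]
# Rescale one sample to rotation angles in [0, pi], component 1 -> qubit 1.
angles = np.pi * (sample - Z.min(0)) / (Z.max(0) - Z.min(0))
print(angles.shape)  # one rotation angle per qubit
```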
of data encoding (feature map) [32, 33] and the choice
of a quantum circuit [34–36]. Both the challenge and
objective are to find a suitable quantum circuit for a
given feature map that is expressive and trainable [33].
The typical approach to finding a circuit is to keep the
architecture (gate layout) fixed and to optimise continuous
parameters such as rotation angles. Optimising the
architecture is referred to as a variable structure ansatz
in the literature and is generally not the focus because of
its computational complexity [2]. However, the architecture
of a circuit can improve its expressive power and the
effectiveness of initialisation techniques [28]. Also, the
QCNN’s defining characteristic is its architecture, which
we found to impact model performance significantly.
Therefore, we look towards NAS to optimise
architecture in a quantum circuit setting. This approach,
sometimes referred to as quantum architecture search
(QAS) [37, 38], has shown promising results for the
variational quantum eigensolver (VQE) [39–42], the
quantum approximate optimisation algorithm (QAOA)
[43, 44] and general architecture search [37, 38, 45, 46].
However, these approaches are often task-specific or
impose additional constraints, such as circuit topology
or allowable gates, to make them computationally
feasible. To the best of the authors’ knowledge, there is
currently no framework that can generate hierarchical
architectures such as the QCNN without imposing such
constraints.
One problem with the cell-based representation for
NAS is that the macro architecture, the sequence of
cells, is fixed and must be chosen [22]. Recently, Liu
et al. [26] proposed a hierarchical representation as
a solution, where a cell sequence acts as the third
level of a multi-level hierarchy. In this representation,
lower-level motifs act as building blocks for higher-level
ones, allowing both macro and micro architecture to be
learned. In this work, we follow a similar approach and
represent a QCNN architecture as a hierarchy of directed
graphs. On the lowest level are primitive operations such
as convolutions and pooling. The second level consists
of sequences of these primitives, such as convolution-
pooling or convolution-convolution units. Higher-level
motifs then contain sequences of these lower-level motifs.
For example, the third level could contain a sequence
of three convolution-pooling units, as seen in Figure 1d.
For the primitives, we define hyperparameters such as
strides and pooling filters that control their architectural
effect. This way, the representation can capture design
motifs on multiple levels, from the distribution of gates
in a single layer to overall hierarchical patterns such as
tensor tree networks. We demonstrate this by generating
FIG. 2: An overview of our architectural representation for QCNNs. From a given set of
gates, we build two-qubit unitary ansatzes. The representation then captures design
motifs M^l_k on different levels l of the hierarchy. On the lowest level l = 1, we
define primitives which act as building blocks for the architecture. For example, a
convolution operation with stride one is encoded as the directed graph M^1_1. The
directed graph M^1_3 is a pooling operation that measures the bottom half of the
circuit. Combined, they form the level-two motif (e): a convolution-pooling unit
M^2_1. Higher-level motifs consist of combinations of lower-level motifs up until the
final level l = L, which contains only one motif M^L_1, the complete QCNN
architecture. M^L_1 is a hierarchy of directed graphs fully specifying how to spread
the unitary ansatzes across the circuit. The two lines of code (e) and (f) show the
power of this representation: they are all that is required to create the entire QCNN
circuit from Figure 1 (d). The code comes from the Python package we implemented
based on the work of this paper. It facilitates dynamic QCNN creation and search
space design.
a family of QCNN architectures based on popular motifs
in literature. We then benchmark this family of models
and show that alternating architecture has a greater
impact on model performance than other modelling
components. By alternating architecture we mean the
following: given a quantum circuit that consists of n
unitary gates, an altered architecture consists of the
same n gates rearranged in a different way on the circuit.
The types of rearrangement include changing which
qubits the gates act upon, altering the order of gate
occurrences, or adjusting larger architectural motifs,
such as pooling specific qubits (no longer using them)
while leaving others available for subsequent gates.
We create architectural families to show the impact
of alternating architecture: any two instances of a
family contain exactly the same unitaries, just applied
in a different order on different qubits. Consider the
machine learning pipeline for classifying musical genres
from audio signals, seen in Figure 1. We start with a
30-second recording of a song (Figure 1a) and transform
it in two ways. The first is tabular form (Figure
1b), derived from standard digital signal processing
statistics of the audio signal. The second is image
form (Figure 1c), constructed using a Mel frequency
spectrogram. Both datasets are benchmarked separately,
with their own data preprocessing and encoding tech-
niques applied. For the tabular data, we test Principal
Component Analysis (PCA) and tree-based feature
selection before encoding it in a quantum state using
either qubit, IQP, or amplitude encoding. Once encoded,
we choose two-qubit unitary ansatzes U_m and V_m for
the convolution and pooling primitives m = 1, 2, ..., 6,
as shown in Figure 1d. We show example ansatzes in
Appendix A and test them across different instances of
an architecture family. Of all the components in this
pipeline, alternating architecture, that is, changing how
each U_m and each V_m are spread across the circuit, had
the greatest impact on model performance. In addition
to our theoretical framework, we implement it as an
open-source Python package to enable dynamic QCNN
creation and facilitate search space design for NAS.
It allows users to experimentally determine suitable
architectures for specific modelling setups, such as
finding circuits that perform well under a specific noise
or hardware configuration, which is particularly relevant
in the Noisy Intermediate-Scale Quantum (NISQ) [47]
era. Additionally, as more qubits become available, the
hierarchical nature of our framework provides a natural
way to scale up the same model. In summary, our
contributions are the architectural representation for
QCNNs, a Python package for dynamic QCNN creation,
and experimental results on the potential advantage of
architecture search in a quantum setting.
The remainder of this paper is structured as follows:
we begin with our main results by summarising
the architectural representation for QCNNs and then
show the effect of alternating architecture, justifying its
importance. We then provide an example of architecture
search with our representation by employing an
evolutionary algorithm to perform QPR. Following this,
we give details of our framework by providing a
mathematical formalism for the representation and describing its
use. Next, with the formalism at hand, we show how it
facilitates search space design by describing the space we
created for the benchmark experiments. We then discuss
generalisations of the formalism and the applicability of
our representation with search algorithms. After this
we elaborate on our experimental setup in the Methods
Section. Finally, we discuss applications and future
steps.
RESULTS
Architectural Representation
Figure 2 shows our architectural representation for
QCNNs. We define two-qubit unitary ansatzes from a
given set of gates, and capture design motifs M^l_k on
different levels l of the hierarchy. On the lowest level
l = 1, we define primitives which act as building blocks
for the architecture. For example, a convolution operation
with stride one is encoded as the directed graph M^1_1,
and with stride three as M^1_2. The directed graph M^1_3
is a pooling operation that measures the bottom half of
the circuit, and M^1_4 measures from the inside outwards.
Combined, they can form higher-level motifs such as
convolution-pooling units M^2_1 (e), convolution-convolution
units M^2_2, or convolution-pooling-convolution units M^2_3.
The highest level l = L contains only one motif M^L_1,
the complete QCNN architecture. M^L_1 is a hierarchy of
directed graphs fully specifying how to spread the
unitary ansatzes across
the circuit. This hierarchical representation is based on
the one from Liu et al. [26] for deep neural networks
(DNNs), and allows for the capture of modularised de-
sign patterns and repeated motifs. The two lines of code
(e) and (f) show the power of this representation: they
are all that is required to create the entire QCNN circuit
from Figure 1 (d). The code comes from the Python
package we implemented based on the work of this pa-
per. It facilitates dynamic QCNN creation and search
space design.
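The hierarchy can be sketched in a few lines of plain Python. The sketch below is illustrative only: the names `convolution`, `pooling`, and `execute` are ours and do not reflect the API of the package described above. It shows the core idea that a level-1 motif maps the currently available qubits to two-qubit unitary placements, and that higher-level motifs are sequences of lower-level ones.

```python
# Minimal sketch of the hierarchical QCNN representation (illustrative only;
# it mirrors the idea of the paper's representation, not its actual API).

def convolution(qubits, stride=1):
    """Level-1 motif: place a two-qubit unitary from each qubit to the one
    `stride` positions further along (circularly). All qubits survive."""
    n = len(qubits)
    edges = [(qubits[i], qubits[(i + stride) % n]) for i in range(n)]
    return edges, qubits

def pooling(qubits, filter_fn):
    """Level-1 motif: measure the qubits selected by `filter_fn` and control
    a unitary on a remaining qubit; only unmeasured qubits survive."""
    measured = [q for q in qubits if filter_fn(q, qubits)]
    remaining = [q for q in qubits if q not in measured]
    edges = [(m, remaining[i % len(remaining)]) for i, m in enumerate(measured)]
    return edges, remaining

def execute(motifs, qubits):
    """Apply a sequence of motifs (a higher-level motif) to the circuit."""
    layout = []
    for motif in motifs:
        edges, qubits = motif(qubits)
        layout.append(edges)
    return layout, qubits

# A level-2 convolution-pooling unit, repeated three times as a level-3 motif
# (one reading of the "even" filter: measure every second qubit).
even = lambda q, qs: qs.index(q) % 2 == 0
conv_pool = [lambda qs: convolution(qs, 1), lambda qs: pooling(qs, even)]
layout, final = execute(conv_pool * 3, list(range(8)))
print(final)  # -> [7]: one qubit left after pooling 8 -> 4 -> 2 -> 1
```

The returned `layout` is the hierarchy flattened into per-motif edge lists, i.e. exactly the information needed to spread the unitary ansatzes across a circuit in any quantum SDK.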
Architectural impact
The details regarding the specific notation and
representation of the framework are given after this
section; first, we justify the framework with the
following experimental results. In Appendix C we also
give background on QCNNs and quantum machine learning
for more context. To illustrate the impact of architecture
on model performance, we compare the fixed architecture
from the experiments of Hur et al. [29] to other
architectures in the same family while keeping all other
components the same. The only difference in each
comparison is the architecture (how the unitaries are spread across
the circuit). The architecture in [29] is represented
within our framework as (s_c, F, s_p) = (1, even, 0) ↦
Qfree(8) + (Qconv(1) + Qpool(0, F_even)) × 3; see
Algorithm 1. To evaluate their performance, we use the
country vs rock genre pair, which proved to be one of the
most difficult classification tasks from the 45 possible
combinations. We compare eight unitary ansatzes with
different levels of complexity, as shown in Figure A.1.
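The triple (s_c, F, s_p) compactly specifies an architecture: convolution stride, pooling filter, and pooling stride. As a rough illustration of the six filter names appearing in Tables I and II, one plausible reading (ours, not the definition fixed by Algorithm 1 of the paper) treats each filter as a rule for which half of the qubits gets measured away.

```python
# One plausible reading of the six pooling filters named in the text
# (even, odd, left, right, inside, outside): each selects half of the
# qubits to measure away. The precise semantics are defined by the
# paper's Algorithm 1; this sketch only illustrates the idea.

def pool_filter(name, qubits):
    n = len(qubits)
    half = n // 2
    quarter = half // 2
    selectors = {
        "even":    qubits[0::2],                  # every second qubit, from 0
        "odd":     qubits[1::2],                  # every second qubit, from 1
        "left":    qubits[:half],                 # top/left half of register
        "right":   qubits[half:],                 # bottom/right half
        "inside":  qubits[quarter:n - quarter],   # middle block
        "outside": qubits[:quarter] + qubits[n - quarter:],  # outer edges
    }
    measured = set(selectors[name])
    return [q for q in qubits if q not in measured]  # surviving qubits

qubits = list(range(8))
for f in ("even", "odd", "left", "right", "inside", "outside"):
    print(f, pool_filter(f, qubits))
```

Under this reading, every filter halves the register, so repeating a convolution-pooling unit three times on eight qubits always terminates in a single measured qubit, whichever filter is chosen.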
Table I shows the results of the comparisons; the
reference architecture is the one described above, and the
discovered alteration was found via random search. The
first important result: through random search of the
architecture space, we improved the performance of every
ansatz, in one case by 18.05%. Here, ansatz refers to the
two-qubit unitary used for the convolution operation
of a model. For example, the model in Figure 1 (d) is
described by (1, right, 0), and ansatz A.1a corresponds
to U_1, U_2 and U_3 being circuit A.1a from Appendix A.
Each value represents the average model accuracy and
standard deviation from 30 separate trained instances
on the same held-out test set.
The second important result is that alternating
architecture can improve model performance without
increasing complexity. For instance, the best-performing
model for the reference architecture uses ansatz A.1g,
which has an average accuracy of 73.24%. However, this
ansatz causes the model to have 10 × 3 = 30 parameters.
In contrast, by alternating the architecture with the
Architecture vs Ansatz

Ansatz, # Params   Reference      New alteration   Δ        (s_c, F, s_p)
A.1a,  6           65.37 ± 2.8    75.14 ± 1.7      +9.77    (6, left, 2)
A.1b,  6           56.34 ± 3.2    70.46 ± 1.0      +14.12   (1, odd, 3)
A.1c, 12           52.69 ± 3.8    70.74 ± 1.3      +18.05   (1, odd, 0)
A.1d, 18           67.13 ± 1.5    77.87 ± 2.4      +9.87    (1, outside, 2)
A.1e, 18           67.87 ± 2.5    73.61 ± 1.8      +5.74    (6, left, 0)
A.1f, 18           69.21 ± 2.6    74.80 ± 2.8      +5.59    (1, left, 3)
A.1g, 30           73.24 ± 2.9    79.47 ± 2.2      +6.23    (2, left, 1)
A.1h, 30           69.35 ± 4.1    71.71 ± 3.7      +2.36    (2, left, 1)
TABLE I: The average accuracy and standard deviation
of the country vs rock genre pair on a held-out test set af-
ter 30 separate trained instances. All architectures come
from the family of reverse binary trees, generated with
algorithm 1. The "reference" architecture is the one used
in the experiments of Ref [29] and the "alteration" was
found through random search within the same family.
The unitary ansatzes also come from Ref [29], which
is based on previous studies that benchmarked PQCs
[27, 48, 49].
simplest ansatz A.1a, the model outperformed the best
reference model with an average accuracy of 75.14%
while only having 3 × 2 = 6 parameters. The parameter
counts come from each model having N = 8 qubits and
the same number of unitaries, 3N − 2 = 3(8) − 2 = 22,
of which 13 are for convolutions. See the search space
design section and Algorithm 1 for more details. A
model has three convolutions, and each convolution
shares weights between its two-qubit unitaries. This
means that the two-qubit unitary ansatz primarily
determines the number of parameters to optimise for
a model. For example, a model with ansatz A.1a has
2 × 3 = 6 parameters to optimise because ansatz A.1a
has two parameters.
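This bookkeeping can be checked mechanically. The helpers below encode the counts stated above (3N − 2 two-qubit unitaries for N qubits; trainable parameters = ansatz parameters × three weight-shared convolutions); the function names are ours, for illustration.

```python
# Parameter bookkeeping for the N = 8 reverse-binary-tree models described
# in the text: 3N - 2 = 22 two-qubit unitaries in total, of which 13 belong
# to the three convolutions. Weight sharing within each convolution means
# trainable parameters = (ansatz parameters) x (number of convolutions).

def total_unitaries(n_qubits):
    return 3 * n_qubits - 2          # count from the search space design section

def trainable_params(ansatz_params, n_convolutions=3):
    return ansatz_params * n_convolutions

print(total_unitaries(8))    # -> 22
print(trainable_params(2))   # ansatz A.1a has 2 parameters -> 6
print(trainable_params(10))  # ansatz A.1g has 10 parameters -> 30
```

The printed counts match the # Params column of Table I: six for A.1a and thirty for A.1g.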
Another interesting result concerns ansatz A.1c: the
reference architecture could only obtain an average
accuracy of 52.69%, indicating an inability to find any
kind of local minimum during training, which suggests a
barren plateau. However, the altered architecture was
able to find a local minimum and improve the average
accuracy by 18.05%.
We would like to note that our primary objective
in these experiments is to demonstrate the potential
for performance improvement. As such, we only conducted
random search for approximately 2 hours on an
i7-1165G7 processor for each ansatz. Consequently, for
higher-parameter ansatzes, which correspond to longer
training times, the search space was less explored. This
is likely the reason behind the observed decrease in
performance improvement for larger-parameter ansatzes.
Therefore, the observed improvements are all lower
bounds for the potential performance increase from
alternating architecture. We anticipate that significantly
better architectures may still exist within the space.

Performance across architecture search space

Convolution stride s_c
F, s_p       1      2      3      4      5      6      7      Avg
even        67.01  63.63  60.76  64.93  59.98  63.10  59.49  62.81
   0        65.97  58.68  56.25  66.67  62.85  59.72  63.43  61.88
   1        66.32  66.32  63.54  60.07  61.46  71.88  54.17  63.73
   2        66.67  60.76  60.07  68.06  54.17  58.80  63.89  61.81
   3        69.10  68.75  63.19  64.93  61.46  60.19  56.48  63.84
inside      66.41  71.96  58.25  54.25  69.27  68.15  60.53  64.18
   0        65.28  72.22  60.07  49.65  70.49  68.40  60.65  63.94
   1        67.01  71.18  58.68  55.90  66.32  68.40  60.19  64.09
   2        68.40  71.53  58.33  51.74  71.88  68.98  58.80  64.26
   3        64.93  72.92  55.90  59.72  68.40  66.67  62.50  64.42
left        62.85  61.63  59.38  59.03  51.56  72.52  72.45  62.22
   0        66.67  67.01  56.94  61.46  52.08  71.18  73.61  63.79
   1        59.03  62.15  52.78  57.99  52.08  71.18  73.61  60.80
   2        63.19  63.19  63.19  60.76  51.74  75.93  71.76  63.51
   3        62.50  54.17  64.58  55.90  50.35  72.69  70.83  60.79
odd         61.11  68.75  63.37  62.76  64.67  60.52  57.99  62.96
   0        60.76  71.88  63.19  58.33  63.54  59.38  57.87  62.29
   1        63.54  67.36  64.58  63.54  64.24  62.50  59.26  63.73
   2        60.42  70.14  64.58  65.97  69.10  58.80  56.94  64.16
   3        59.72  65.62  61.11  63.19  61.81  61.11  57.87  61.65
outside     60.68  65.80  65.54  57.12  62.15  59.83  67.13  62.51
   0        67.36  59.72  71.88  54.17  67.01  60.07  70.37  64.15
   1        53.47  69.79  62.15  56.25  61.11  58.33  70.83  61.49
   2        57.99  70.83  60.07  61.11  59.03  59.26  66.67  62.07
   3        63.89  62.85  68.06  56.94  61.46  61.57  60.65  62.29
right       70.05  65.63  64.41  53.65  68.66  63.69  60.65  63.94
   0        70.14  63.54  64.58  50.00  68.40  61.11  62.96  62.96
   1        69.79  67.71  64.58  69.10  68.06  67.01  57.87  66.62
   2        70.14  62.15  63.89  43.75  68.75  62.04  61.57  61.75
   3        70.14  69.10  64.58  51.74  69.44  64.35  60.19  64.37
Avg         64.68  66.23  61.95  58.62  62.72  64.69  63.04  63.11

TABLE II: Country vs rock average accuracy within the
reverse binary tree search space, all with A.1a as ansatz.
The convolution stride s_c is shown on the horizontal axis
and the combinations of pooling filter F and stride s_p on
the vertical (the rows 0–3 under each filter give s_p = 0,
1, 2, 3; the row beside each filter name is its average over
s_p). The best pooling filter and convolution stride
combinations are presented in bold along with the overall
best architecture (s_c, F, s_p) = (6, left, 2).
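The 168 exhausted architectures for ansatz A.1a match a grid over the triple (s_c, F, s_p): seven convolution strides, six pooling filters, and four pooling strides (7 × 6 × 4 = 168, consistent with Table II). A budget-limited random search over this space can then be sketched as follows, where `evaluate` is a placeholder for training a QCNN and returning its test accuracy, not our actual training loop.

```python
import itertools
import random

# The (s_c, F, s_p) search space of reverse binary trees: 7 convolution
# strides x 6 pooling filters x 4 pooling strides = 168 architectures,
# matching the grid of Table II.
filters = ("even", "odd", "left", "right", "inside", "outside")
space = list(itertools.product(range(1, 8), filters, range(4)))

def evaluate(arch):
    sc, F, sp = arch
    return random.random()  # stand-in for "train model, return accuracy"

random.seed(0)
budget = 20  # random search under a small compute budget
best = max(random.sample(space, budget), key=evaluate)
print(best)  # best (s_c, F, s_p) found within the budget
```

Exhausting the space, as was done for A.1a, simply means replacing the sampled subset with `space` itself.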
Table II presents the performance of the family of
reverse binary trees (as described in Algorithm 1) for
ansatz A.1a. Due to its quick training time, ansatz A.1a
was the only case for which we managed to exhaust the
search space (168 architectures). In the search space
design section, we discuss how the size of the family can
be easily increased or decreased. Each value represents