Hierarchical quantum circuit representations for neural architecture search
Matt Lourens,1 Ilya Sinayskiy,2,3 Daniel K. Park,4 Carsten Blank,5 and Francesco Petruccione3,6,1
1Physics Department, Stellenbosch University, Stellenbosch, South Africa
2School of Chemistry and Physics, University of KwaZulu-Natal, Durban, South Africa
3National Institute for Theoretical and Computational Sciences (NITheCS), South Africa
4Department of Statistics and Data Science, Yonsei University, Seoul, Korea
5Data Cybernetics, Landsberg, Germany
6School of Data Science and Computational Thinking,
Stellenbosch University, Stellenbosch, South Africa
Machine learning with hierarchical quantum circuits, usually referred to as Quantum
Convolutional Neural Networks (QCNNs), is a promising prospect for near-term quantum
computing. The QCNN is a circuit model inspired by the architecture of Convolutional
Neural Networks (CNNs). CNNs are successful because they do not need manual feature
design and can learn high-level features from raw data. Neural Architecture Search
(NAS) builds on this success by learning network architecture and achieves
state-of-the-art performance. However, applying NAS to QCNNs presents unique
challenges due to the lack of a well-defined search space. In this work, we propose a
novel framework for representing QCNN architectures using techniques from NAS, which
enables search space design and architecture search. Using this framework, we generate
a family of popular QCNNs, those resembling reverse binary trees. We then evaluate
this family of models on a music genre classification dataset, GTZAN, to justify the
importance of circuit architecture. Furthermore, we employ a genetic algorithm to
perform Quantum Phase Recognition (QPR) as an example of architecture search with our
representation. This work provides a way to improve model performance without
increasing complexity and to jump around the cost landscape to avoid barren plateaus.
Finally, we implement the framework as an open-source Python package to enable
dynamic QCNN creation and facilitate QCNN search space design for NAS.
INTRODUCTION
Machine learning using trainable quantum circuits
provides promising applications for quantum computing
[1–4]. Among various parameterized quantum circuit
(PQC) models, the Quantum Convolutional Neural
Network (QCNN) introduced in Ref [5] stands out for
its shallow circuit depth, absence of barren plateaus
[6], and good generalisation capabilities [7]. It has
been implemented experimentally [8] and combines
techniques from Quantum Error Correction (QEC),
Tensor Networks (TNs) and deep learning. Research at
this intersection has been fruitful, yielding deep learning
solutions for quantum many-body problems [9–12],
quantum-inspired insights for deep learning [13–15] and
equivalences between them [16–18]. Deep learning has
been widely successful in recent years with applications
spanning from content filtering and product recommendations
to aided medical diagnosis and scientific
research. Its main characteristic, learning features from
raw data, eliminates the need for manual feature design
by experts [19]. AlexNet [20] demonstrated this and
marked the shift in focus from feature design to
architecture design [21]. Naturally, the next step is learning
network architecture, which Neural Architecture Search
(NAS) aims to achieve [22]. NAS has already produced
state-of-the-art deep learning models with automatically
designed architectures [21, 23–25].
Corresponding author: lourensmattj@gmail.com
NAS consists of three main components: the search space,
the search strategy, and the performance estimation
strategy [22]. The search
space defines the set of possible architectures that a
search algorithm can consider, and carefully designed
search spaces help improve search efficiency and reduce
computational complexity [26]. Search space design
often involves encoding architectures using a cell-based
representation. Usually, a set of primitive operations,
such as convolutions or pooling, is combined into a
cell to capture some design motif (a compute graph).
Different cells are then stacked to form a complete
architecture. Cell-based representations are popular
because they can capture repeated motifs and modular
design patterns, which are often seen in successful
hand-crafted architectures. Similar patterns also appear
in quantum circuit designs [5, 27–31]. For example,
Grant et al. [27] use hierarchical architectures based
on tensor networks to classify classical and quantum
data. Similarly, Cong et al. [5] use the multiscale
entanglement renormalisation ansatz (MERA) as an
instance of their proposed QCNN and discuss
generalisations for quantum analogues of convolution and pooling
operations. In this work, we formalise these design
patterns by providing a hierarchical representation for
QCNNs, thereby capturing their architecture in such a way
as to facilitate search space design for NAS with PQCs.
The QCNN belongs to the class of hybrid quantum-
classical algorithms, in which a quantum computer
executes the circuit, and a classical computer optimises
its parameters. Two key factors must be considered
when using PQCs for machine learning: the method
arXiv:2210.15073v3 [quant-ph] 7 May 2023
FIG. 1: The machine learning pipeline we implemented for music genre classification. Given an audio signal of a song
(a), we generate two forms of data: tabular (b) and image (c). Each form has data preprocessing applied before being
encoded into a quantum state (d). The QCNN circuit shown in (d) favours Principal Component Analysis (PCA)
because qubits are pooled from bottom to top, and principal components are encoded from top to bottom. This
architecture is an instance of the reverse binary tree family that we generated with our framework.
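The caption's point about PCA and pooling direction admits a compact sketch: reduce the tabular features to eight principal components (one per qubit, most informative first) and turn them into rotation angles for qubit encoding. This is an illustration only; the function names and the [0, π] rescaling are our choices, not the paper's exact preprocessing.

```python
import numpy as np

# Sketch of the preprocessing in Fig. 1 (b) -> (d): reduce tabular features
# to N = 8 principal components and assign them to qubits top to bottom, so
# the most informative components sit on qubits that survive pooling longest.
# Circuit construction itself is left to whichever quantum SDK is in use.

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 30))  # stand-in for the tabular audio features

def pca(X, n_components=8):
    Xc = X - X.mean(axis=0)                # centre the data
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T        # scores, ordered by explained variance

Z = pca(X)
sample = Z[0]
# Rescale one sample to rotation angles in [0, pi], component 1 -> qubit 1.
angles = np.pi * (sample - Z.min(0)) / (Z.max(0) - Z.min(0))
print(angles.shape)  # one rotation angle per qubit
```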
of data encoding (feature map) [32, 33] and the choice
of a quantum circuit [34–36]. Both the challenge and
objective are to find a suitable quantum circuit for a
given feature map that is expressive and trainable [33].
The typical approach to finding a circuit is to keep the
architecture (gate layout) fixed and to optimise continuous
parameters such as rotation angles. Optimising the
architecture is referred to as a variable structure ansatz
in the literature and is generally not the focus because of
its computational complexity [2]. However, the architecture
of a circuit can improve its expressive power and the
effectiveness of initialisation techniques [28]. Also, the
QCNN’s defining characteristic is its architecture, which
we found to impact model performance significantly.
Therefore, we look towards NAS to optimise
architecture in a quantum circuit setting. This approach,
sometimes referred to as quantum architecture search
(QAS) [37, 38], has shown promising results for the
variational quantum eigensolver (VQE) [39–42], the
quantum approximate optimisation algorithm (QAOA)
[43, 44] and general architecture search [37, 38, 45, 46].
However, these approaches are often task-specific or
impose additional constraints, such as circuit topology
or allowable gates, to make them computationally
feasible. To the best of the authors’ knowledge, there is
currently no framework that can generate hierarchical
architectures such as the QCNN without imposing such
constraints.
One problem with the cell-based representation for
NAS is that the macro architecture, the sequence of
cells, is fixed and must be chosen [22]. Recently, Liu
et al. [26] proposed a hierarchical representation as
a solution, where a cell sequence acts as the third
level of a multi-level hierarchy. In this representation,
lower-level motifs act as building blocks for higher-level
ones, allowing both macro and micro architecture to be
learned. In this work, we follow a similar approach and
represent a QCNN architecture as a hierarchy of directed
graphs. On the lowest level are primitive operations such
as convolutions and pooling. The second level consists
of sequences of these primitives, such as convolution-
pooling or convolution-convolution units. Higher-level
motifs then contain sequences of these lower-level motifs.
For example, the third level could contain a sequence
of three convolution-pooling units, as seen in Figure 1d.
For the primitives, we define hyperparameters such as
strides and pooling filters that control their architectural
effect. This way, the representation can capture design
motifs on multiple levels, from the distribution of gates
in a single layer to overall hierarchical patterns such as
tensor tree networks. We demonstrate this by generating
FIG. 2: An overview of our architectural representation for QCNNs. From a given set of
gates, we build two-qubit unitary ansatzes. The representation then captures design
motifs M^l_k on different levels l of the hierarchy. On the lowest level l = 1, we
define primitives which act as building blocks for the architecture. For example, a
convolution operation with stride one is encoded as the directed graph M^1_1. The
directed graph M^1_3 is a pooling operation that measures the bottom half of the
circuit. Combined, they form the level-two motif (e): a convolution-pooling unit
M^2_1. Higher-level motifs consist of combinations of lower-level motifs up until the
final level l = L, which contains only one motif M^L_1, the complete QCNN
architecture. M^L_1 is a hierarchy of directed graphs fully specifying how to spread
the unitary ansatzes across the circuit. The two lines of code (e) and (f) show the
power of this representation: they are all that is required to create the entire QCNN
circuit from Figure 1 (d). The code comes from the Python package we implemented
based on the work of this paper. It facilitates dynamic QCNN creation and search
space design.
a family of QCNN architectures based on popular motifs
in literature. We then benchmark this family of models
and show that alternating architecture has a greater
impact on model performance than other modelling
components. By alternating architecture we mean the
following: given a quantum circuit that consists of n
unitary gates, an altered architecture consists of the
same n gates rearranged in a different way on the circuit.
The types of rearrangement include changing which
qubits the gates act upon, altering the order of gate
occurrences, or adjusting larger architectural motifs,
such as pooling specific qubits (no longer using them)
while leaving others available for subsequent gates.
We create architectural families to show the impact
of alternating architecture: any two instances of a
family contain exactly the same unitaries, just applied
in a different order on different qubits. Consider the
machine learning pipeline for classifying musical genres
from audio signals, seen in Figure 1. We start with a
30-second recording of a song (Figure 1a) and transform
it in two ways. The first is tabular form (Figure
1b), derived from standard digital signal processing
statistics of the audio signal. The second is image
form (Figure 1c), constructed using a Mel frequency
spectrogram. Both datasets are benchmarked separately,
with their own data preprocessing and encoding tech-
niques applied. For the tabular data, we test Principal
Component Analysis (PCA) and tree-based feature
selection before encoding it in a quantum state using
either qubit, IQP, or amplitude encoding. Once encoded,
we choose two-qubit unitary ansatzes U_m and V_m for
the convolution and pooling primitives m = 1, 2, ..., 6,
as shown in Figure 1d. We show example ansatzes in
Appendix A and test them across different instances of
an architecture family. Of all the components in this
pipeline, alternating architecture, that is, changing how
each U_m and each V_m are spread across the circuit, had
the greatest impact on model performance. In addition
to our theoretical framework, we implement it as an
open-source Python package to enable dynamic QCNN
creation and facilitate search space design for NAS.
It allows users to experimentally determine suitable
architectures for specific modelling setups, such as
finding circuits that perform well under a specific noise
or hardware configuration, which is particularly relevant
in the Noisy Intermediate-Scale Quantum (NISQ) [47]
era. Additionally, as more qubits become available, the
hierarchical nature of our framework provides a natural
way to scale up the same model. In summary, our
contributions are the architectural representation for
QCNNs, a Python package for dynamic QCNN creation,
and experimental results on the potential advantage of
architecture search in a quantum setting.
The remainder of this paper is structured as follows:
we begin with our main results by summarising
the architectural representation for QCNNs and then
show the effect of alternating architecture, justifying its
importance. We then provide an example of architecture
search with our representation by employing an
evolutionary algorithm to perform QPR. Following this,
we give details of our framework by providing a
mathematical formalism for the representation and describing its
use. Next, with the formalism at hand, we show how it
facilitates search space design by describing the space we
created for the benchmark experiments. We then discuss
generalisations of the formalism and the applicability of
our representation with search algorithms. After this
we elaborate on our experimental setup in the Methods
Section. Finally, we discuss applications and future
steps.
RESULTS
Architectural Representation
Figure 2 shows our architectural representation for
QCNNs. We define two-qubit unitary ansatzes from a
given set of gates, and capture design motifs M^l_k on
different levels l of the hierarchy. On the lowest level
l = 1, we define primitives which act as building blocks
for the architecture. For example, a convolution operation
with stride one is encoded as the directed graph M^1_1,
and with stride three as M^1_2. The directed graph M^1_3
is a pooling operation that measures the bottom half of
the circuit, and M^1_4 measures from the inside outwards.
Combined, they can form higher-level motifs such as
convolution-pooling units M^2_1 (e), convolution-convolution
units M^2_2, or convolution-pooling-convolution units M^2_3.
The highest level l = L contains only one motif M^L_1,
the complete QCNN architecture. M^L_1 is a hierarchy of
directed graphs fully specifying how to spread the
unitary ansatzes across
the circuit. This hierarchical representation is based on
the one from Liu et al. [26] for deep neural networks
(DNNs), and allows for the capture of modularised de-
sign patterns and repeated motifs. The two lines of code
(e) and (f) show the power of this representation: they
are all that is required to create the entire QCNN circuit
from Figure 1 (d). The code comes from the Python
package we implemented based on the work of this pa-
per. It facilitates dynamic QCNN creation and search
space design.
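The hierarchy can be sketched in a few lines of plain Python. The sketch below is illustrative only: the names `convolution`, `pooling`, and `execute` are ours and do not reflect the API of the package described above. It shows the core idea that a level-1 motif maps the currently available qubits to two-qubit unitary placements, and that higher-level motifs are sequences of lower-level ones.

```python
# Minimal sketch of the hierarchical QCNN representation (illustrative only;
# it mirrors the idea of the paper's representation, not its actual API).

def convolution(qubits, stride=1):
    """Level-1 motif: place a two-qubit unitary from each qubit to the one
    `stride` positions further along (circularly). All qubits survive."""
    n = len(qubits)
    edges = [(qubits[i], qubits[(i + stride) % n]) for i in range(n)]
    return edges, qubits

def pooling(qubits, filter_fn):
    """Level-1 motif: measure the qubits selected by `filter_fn` and control
    a unitary on a remaining qubit; only unmeasured qubits survive."""
    measured = [q for q in qubits if filter_fn(q, qubits)]
    remaining = [q for q in qubits if q not in measured]
    edges = [(m, remaining[i % len(remaining)]) for i, m in enumerate(measured)]
    return edges, remaining

def execute(motifs, qubits):
    """Apply a sequence of motifs (a higher-level motif) to the circuit."""
    layout = []
    for motif in motifs:
        edges, qubits = motif(qubits)
        layout.append(edges)
    return layout, qubits

# A level-2 convolution-pooling unit, repeated three times as a level-3 motif
# (one reading of the "even" filter: measure every second qubit).
even = lambda q, qs: qs.index(q) % 2 == 0
conv_pool = [lambda qs: convolution(qs, 1), lambda qs: pooling(qs, even)]
layout, final = execute(conv_pool * 3, list(range(8)))
print(final)  # -> [7]: one qubit left after pooling 8 -> 4 -> 2 -> 1
```

The returned `layout` is the hierarchy flattened into per-motif edge lists, i.e. exactly the information needed to spread the unitary ansatzes across a circuit in any quantum SDK.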
Architectural impact
The details regarding the specific notation and
representation of the framework are given after this
section; first, we justify the framework with the
following experimental results. In Appendix C we also
give background on QCNNs and quantum machine learning
for more context. To illustrate the impact of architecture
on model performance, we compare the fixed architecture
from the experiments of Hur et al. [29] to other
architectures in the same family while keeping all other
components the same. The only difference in each
comparison is the architecture (how the unitaries are spread across
the circuit). The architecture in [29] is represented
within our framework as (s_c, F, s_p) = (1, even, 0) ↦
Qfree(8) + (Qconv(1) + Qpool(0, F_even)) × 3; see
Algorithm 1. To evaluate their performance, we use the
country vs rock genre pair, which proved to be one of the
most difficult classification tasks from the 45 possible
combinations. We compare eight unitary ansatzes with
different levels of complexity, as shown in Figure A.1.
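The triple (s_c, F, s_p) compactly specifies an architecture: convolution stride, pooling filter, and pooling stride. As a rough illustration of the six filter names appearing in Tables I and II, one plausible reading (ours, not the definition fixed by Algorithm 1 of the paper) treats each filter as a rule for which half of the qubits gets measured away.

```python
# One plausible reading of the six pooling filters named in the text
# (even, odd, left, right, inside, outside): each selects half of the
# qubits to measure away. The precise semantics are defined by the
# paper's Algorithm 1; this sketch only illustrates the idea.

def pool_filter(name, qubits):
    n = len(qubits)
    half = n // 2
    quarter = half // 2
    selectors = {
        "even":    qubits[0::2],                  # every second qubit, from 0
        "odd":     qubits[1::2],                  # every second qubit, from 1
        "left":    qubits[:half],                 # top/left half of register
        "right":   qubits[half:],                 # bottom/right half
        "inside":  qubits[quarter:n - quarter],   # middle block
        "outside": qubits[:quarter] + qubits[n - quarter:],  # outer edges
    }
    measured = set(selectors[name])
    return [q for q in qubits if q not in measured]  # surviving qubits

qubits = list(range(8))
for f in ("even", "odd", "left", "right", "inside", "outside"):
    print(f, pool_filter(f, qubits))
```

Under this reading, every filter halves the register, so repeating a convolution-pooling unit three times on eight qubits always terminates in a single measured qubit, whichever filter is chosen.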
Table I shows the results of the comparisons; the
reference architecture is the one described above, and the
discovered alteration was found via random search. The
first important result: through random search of the
architecture space, we improved the performance of every
ansatz, in one case by 18.05%. Here, ansatz refers to the
two-qubit unitary used for the convolution operation
of a model. For example, the model in Figure 1 (d) is
described by (1, right, 0), and ansatz A.1a corresponds
to U_1, U_2 and U_3 being circuit A.1a from Appendix A.
Each value represents the average model accuracy and
standard deviation from 30 separate trained instances
on the same held-out test set.
The second important result is that alternating
architecture can improve model performance without
increasing complexity. For instance, the best-performing
model for the reference architecture uses ansatz A.1g,
which has an average accuracy of 73.24%. However, this
ansatz causes the model to have 10 × 3 = 30 parameters.
In contrast, by alternating the architecture with the
Architecture vs Ansatz

Ansatz, # Params   Reference      New alteration   Δ        (s_c, F, s_p)
A.1a,  6           65.37 ± 2.8    75.14 ± 1.7      +9.77    (6, left, 2)
A.1b,  6           56.34 ± 3.2    70.46 ± 1.0      +14.12   (1, odd, 3)
A.1c, 12           52.69 ± 3.8    70.74 ± 1.3      +18.05   (1, odd, 0)
A.1d, 18           67.13 ± 1.5    77.87 ± 2.4      +9.87    (1, outside, 2)
A.1e, 18           67.87 ± 2.5    73.61 ± 1.8      +5.74    (6, left, 0)
A.1f, 18           69.21 ± 2.6    74.80 ± 2.8      +5.59    (1, left, 3)
A.1g, 30           73.24 ± 2.9    79.47 ± 2.2      +6.23    (2, left, 1)
A.1h, 30           69.35 ± 4.1    71.71 ± 3.7      +2.36    (2, left, 1)
TABLE I: The average accuracy and standard deviation
of the country vs rock genre pair on a held-out test set af-
ter 30 separate trained instances. All architectures come
from the family of reverse binary trees, generated with
algorithm 1. The "reference" architecture is the one used
in the experiments of Ref [29] and the "alteration" was
found through random search within the same family.
The unitary ansatzes also come from Ref [29], which
is based on previous studies that benchmarked PQCs
[27, 48, 49].
simplest ansatz A.1a, the model outperformed the best
reference model with an average accuracy of 75.14%
while only having 3 × 2 = 6 parameters. The parameter
counts come from each model having N = 8 qubits and
the same number of unitaries, 3N − 2 = 3(8) − 2 = 22,
of which 13 are for convolutions. See the search space
design section and Algorithm 1 for more details. A
model has three convolutions, and each convolution
shares weights between its two-qubit unitaries. This
means that the two-qubit unitary ansatz primarily
determines the number of parameters to optimise for
a model. For example, a model with ansatz A.1a has
2 × 3 = 6 parameters to optimise because ansatz A.1a
has two parameters.
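This bookkeeping can be checked mechanically. The helpers below encode the counts stated above (3N − 2 two-qubit unitaries for N qubits; trainable parameters = ansatz parameters × three weight-shared convolutions); the function names are ours, for illustration.

```python
# Parameter bookkeeping for the N = 8 reverse-binary-tree models described
# in the text: 3N - 2 = 22 two-qubit unitaries in total, of which 13 belong
# to the three convolutions. Weight sharing within each convolution means
# trainable parameters = (ansatz parameters) x (number of convolutions).

def total_unitaries(n_qubits):
    return 3 * n_qubits - 2          # count from the search space design section

def trainable_params(ansatz_params, n_convolutions=3):
    return ansatz_params * n_convolutions

print(total_unitaries(8))    # -> 22
print(trainable_params(2))   # ansatz A.1a has 2 parameters -> 6
print(trainable_params(10))  # ansatz A.1g has 10 parameters -> 30
```

The printed counts match the # Params column of Table I: six for A.1a and thirty for A.1g.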
Another interesting result concerns ansatz A.1c: the
reference architecture could only obtain an average
accuracy of 52.69%, indicating an inability to find any
kind of local minimum during training, which suggests a
barren plateau. However, the altered architecture was
able to find a local minimum and improve the average
accuracy by 18.05%.
We would like to note that our primary objective
in these experiments is to demonstrate the potential
for performance improvement. As such, we only conducted
random search for approximately 2 hours on an
i7-1165G7 processor for each ansatz. Consequently, for
higher-parameter ansatzes, which correspond to longer
training times, the search space was less explored. This
is likely the reason behind the observed decrease in
performance improvement for larger-parameter ansatzes.
Therefore, the observed improvements are all lower
bounds for the potential performance increase from
alternating architecture. We anticipate that significantly
better architectures may still exist within the space.

Performance across architecture search space

Convolution stride s_c
F, s_p       1      2      3      4      5      6      7      Avg
even        67.01  63.63  60.76  64.93  59.98  63.10  59.49  62.81
   0        65.97  58.68  56.25  66.67  62.85  59.72  63.43  61.88
   1        66.32  66.32  63.54  60.07  61.46  71.88  54.17  63.73
   2        66.67  60.76  60.07  68.06  54.17  58.80  63.89  61.81
   3        69.10  68.75  63.19  64.93  61.46  60.19  56.48  63.84
inside      66.41  71.96  58.25  54.25  69.27  68.15  60.53  64.18
   0        65.28  72.22  60.07  49.65  70.49  68.40  60.65  63.94
   1        67.01  71.18  58.68  55.90  66.32  68.40  60.19  64.09
   2        68.40  71.53  58.33  51.74  71.88  68.98  58.80  64.26
   3        64.93  72.92  55.90  59.72  68.40  66.67  62.50  64.42
left        62.85  61.63  59.38  59.03  51.56  72.52  72.45  62.22
   0        66.67  67.01  56.94  61.46  52.08  71.18  73.61  63.79
   1        59.03  62.15  52.78  57.99  52.08  71.18  73.61  60.80
   2        63.19  63.19  63.19  60.76  51.74  75.93  71.76  63.51
   3        62.50  54.17  64.58  55.90  50.35  72.69  70.83  60.79
odd         61.11  68.75  63.37  62.76  64.67  60.52  57.99  62.96
   0        60.76  71.88  63.19  58.33  63.54  59.38  57.87  62.29
   1        63.54  67.36  64.58  63.54  64.24  62.50  59.26  63.73
   2        60.42  70.14  64.58  65.97  69.10  58.80  56.94  64.16
   3        59.72  65.62  61.11  63.19  61.81  61.11  57.87  61.65
outside     60.68  65.80  65.54  57.12  62.15  59.83  67.13  62.51
   0        67.36  59.72  71.88  54.17  67.01  60.07  70.37  64.15
   1        53.47  69.79  62.15  56.25  61.11  58.33  70.83  61.49
   2        57.99  70.83  60.07  61.11  59.03  59.26  66.67  62.07
   3        63.89  62.85  68.06  56.94  61.46  61.57  60.65  62.29
right       70.05  65.63  64.41  53.65  68.66  63.69  60.65  63.94
   0        70.14  63.54  64.58  50.00  68.40  61.11  62.96  62.96
   1        69.79  67.71  64.58  69.10  68.06  67.01  57.87  66.62
   2        70.14  62.15  63.89  43.75  68.75  62.04  61.57  61.75
   3        70.14  69.10  64.58  51.74  69.44  64.35  60.19  64.37
Avg         64.68  66.23  61.95  58.62  62.72  64.69  63.04  63.11

TABLE II: Country vs rock average accuracy within the
reverse binary tree search space, all with A.1a as ansatz.
The convolution stride s_c is shown on the horizontal axis
and the combinations of pooling filter F and stride s_p on
the vertical (the rows 0–3 under each filter give s_p = 0,
1, 2, 3; the row beside each filter name is its average over
s_p). The best pooling filter and convolution stride
combinations are presented in bold along with the overall
best architecture (s_c, F, s_p) = (6, left, 2).
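The 168 exhausted architectures for ansatz A.1a match a grid over the triple (s_c, F, s_p): seven convolution strides, six pooling filters, and four pooling strides (7 × 6 × 4 = 168, consistent with Table II). A budget-limited random search over this space can then be sketched as follows, where `evaluate` is a placeholder for training a QCNN and returning its test accuracy, not our actual training loop.

```python
import itertools
import random

# The (s_c, F, s_p) search space of reverse binary trees: 7 convolution
# strides x 6 pooling filters x 4 pooling strides = 168 architectures,
# matching the grid of Table II.
filters = ("even", "odd", "left", "right", "inside", "outside")
space = list(itertools.product(range(1, 8), filters, range(4)))

def evaluate(arch):
    sc, F, sp = arch
    return random.random()  # stand-in for "train model, return accuracy"

random.seed(0)
budget = 20  # random search under a small compute budget
best = max(random.sample(space, budget), key=evaluate)
print(best)  # best (s_c, F, s_p) found within the budget
```

Exhausting the space, as was done for A.1a, simply means replacing the sampled subset with `space` itself.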
Table II presents the performance of the family of
reverse binary trees (as described in Algorithm 1) for
ansatz A.1a. Due to its quick training time, ansatz A.1a
was the only case for which we managed to exhaust the
search space (168 architectures). In the search space
design section, we discuss how the size of the family can
be easily increased or decreased. Each value represents