
Reusability Report: Comparing gradient descent and monte carlo tree search
optimization of quantum annealing schedules
Matteo M. Wauters1and Evert van Nieuwenburg1, 2
1Niels Bohr Institute, University of Copenhagen, Copenhagen 2100, Denmark
2Lorentz Institute and Leiden Institute of Advanced Computer Science,
Leiden University, P.O. Box 9506, 2300 RA Leiden, The Netherlands
(Dated: October 10, 2022)
ARISING FROM Yu-Qin Chen et al. Na-
ture Machine Intelligence https://doi.org/10.1038/
s42256-022-00446-y (2022).
The AI system AlphaZero [1,2] famously learned to
play complex and strategic games such as Go and Chess
and achieved super-human performance in these tasks.
AlphaZero uses a combination of searching through the
possible states of a game (Monte Carlo Tree Search) and
a neural network to guide that search. The result is an
algorithm that learns to choose sequences of moves that
lead to a victory. In Ref. [3], authors Chen et al., set
out to employ this same set of techniques to solve a spe-
cific class of combinatorial optimisation problems, the
3-satisfiability (3-SAT) problems.
3-SAT is an NP-hard problem, where the task is to de-
cide whether there exists a choice of nbinary variables
bi=1...n that simultaneously satisfy a set of mclauses of
3 variables each. A clause for a 3-SAT problem is of the
form C= (b2OR not b4OR b6). The authors focus on
the case of m/n = 3, for which there are always three
times as many clauses as there are variables. This ra-
tio is smaller than the critical value m/n ≈4.2, above
which the ratio of satisfiable expressions drops to zero [4],
however, it corresponds to a set of hard instances char-
acterised by a unique solution. The authors provide a
dataset that has one or several of such instances for dif-
ferent sizes of 3-SAT.
A 3-SAT problem of nvariables can be encoded into
a Hamiltonian of nqubits, spanning a Hilbert space of
dimension N= 2n, whose groundstate provides the so-
lution. Solving the 3-SAT problem is then equivalent
to finding that groundstate, for which several algorithms
exist. The algorithm of choice here is the technique of
quantum annealing (QA) [5,6].
In quantum annealing, finding the groundstate is done
by starting in a known (and easily prepared) groundstate
of an initial Hamiltonian H0, and then slowly (adiabati-
cally) interpolating to the desired final Hamiltonian Hf.
That is, we perform
H(t) = (1 −s(t))H0+s(t)Hf,
for tgoing from 0 to T, and s(t) satisfying s(0) = 0 and
s(T) = 1. Finding the optimal annealing path – the ac-
tual form for s(t) – is the central task. Starting from
the groundstate of H0, and changing s(t) adiabatically
will keep the system in the instantaneous groundstate
50 100 150 200 250 300
T
0.2
0.4
0.6
0.8
1.0
1.2
fidelity
n= 11, M = 5
BFGS
linear QA
MCTS
Figure 1. Fidelity at the end of the quantum annealing pro-
cess vs annealing time Tfor 3-SAT instances with n= 11
variables and M= 5 frequency component in the annealing
schedule s(t). We compare a simple linear schedule (orange
triangles), gradient descent with the BFGS algorithm (blue
circles), and MCTS (red squares). Data are averaged over
the 18 instances provided in the GitHub repository; the er-
rorbars are due to the large difference in performance between
individual instances.
of the full H(t). At t=Tthen, we will have obtained
the groundstate of Hfand hence the solution to our 3-
SAT problem. At the same time, going slowly means the
annealing takes more time, and we thus have an inher-
ent trade-off between speed and accuracy: the perfect
place for an optimisation algorithm to help out. Fol-
lowing the original paper [3], we choose the fidelity as
the figure of merit for the accuracy of the algorithm,
which simply measures the overlap of the solution ob-
tained with the true solution (which for benchmarking
purposes is known). A fidelity of 1 means that the algo-
rithm achieved the perfect solution, whereas a fidelity of
0 is completely off. For more general purposes, when the
ground state is not known and it might be degenerate or
quasi-degenerate, the energy is a better figure of merit.
For the specific problem at hand, however, we checked
that the fidelity and the energy provide equivalent esti-
mates of the algorithm’s accuracy.
The way Chen et al. approach the problem of optimis-
ing s(t), is by expanding it as a Fourier series with M
frequencies ωk=πk/T , and then optimise the choice of
arXiv:2210.03411v1 [quant-ph] 7 Oct 2022