efficient and systematic approaches.
In this work we present a machine learning based
method that automates the process of comparing numer-
ical simulation and experimental data, so the physical
parameters of the band structure that gave rise to a par-
ticular experimental dataset can be determined with min-
imal human effort. Recently machine learning and arti-
ficial neural network techniques have seen various appli-
cations in the realm of physical sciences [13]. In con-
densed matter physics, artificial neural networks have
been used to represent quantum states [14, 15] and learn
these states from available data [16, 17]. In a different
direction, machine learning models were recently used for photonic crystal band diagram prediction and gap optimisation [18–20]. Despite a large number of more theoretical applications, machine learning approaches are only starting to be employed in the analysis of experimental
data. Recent examples include identification of quantum
phase transitions [21] and hidden orders from experimen-
tal images [22]. These few examples highlight the strong potential of machine learning based approaches to experimental data, which we further exploit in the present work.
A conceptual overview of our approach is shown in
Fig. 1. To extract the band structure parameters from
experimental data, we first train a deep neural net-
work (DNN) [23, 24] that solves the forward problem by
replicating the numerical calculation of the DOS (Sec-
tion II A). To this end we use the simulation of the ex-
perimental data shown in Fig. 1(a). In the particular
example of penetration field capacitance data considered
here, the simulator uses the band structure parameters,
the asymmetry potential between two edges of the system
(physically equivalent to transverse electric field) and the
chemical potential as input parameters. As output we obtain the charge density, from which we determine the DOS by differentiating the density with respect to the chemical potential. A set of simulated data is used to train the DNN in
Fig. 1(b). Constructed to efficiently replace the data simulator, the DNN acts as a function that takes the band structure parameters, the asymmetry potential, and the charge density directly as input, and outputs the corresponding DOS. It learns from a large dataset of simulation results, its output optimised to always match that of the simulator. The resulting DNN
represents a fast and differentiable replacement for the
physical simulation. It can therefore be used to efficiently
solve the inverse problem (Section II B). In particular, the
values of parameters that gave rise to a given dataset are
extracted using gradient-based optimisation in Fig. 1(c),
where we iteratively modify the band structure parame-
ters until the DNN’s output matches the provided DOS
values.
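The gradient-based extraction of Fig. 1(c) can be illustrated with a toy sketch: a simple differentiable function stands in for the trained DNN, and plain gradient descent on its inputs recovers the parameters that produced a given target curve. All names, parameter values, and the surrogate itself are hypothetical, chosen only to keep the example self-contained; they are not the network or parametrisation used in this work.

```python
import numpy as np

# Toy differentiable stand-in for the trained DNN surrogate: it maps two
# hypothetical parameters to a DOS-like curve on a fixed grid.
def surrogate(params, grid):
    amplitude, centre = params
    return amplitude * np.exp(-(grid - centre) ** 2)

def loss_grad(params, grid, target):
    # Analytic gradient of 0.5 * ||surrogate - target||^2 w.r.t. params.
    amplitude, centre = params
    g = np.exp(-(grid - centre) ** 2)
    residual = amplitude * g - target
    d_amp = np.sum(residual * g)
    d_cen = np.sum(residual * amplitude * g * 2.0 * (grid - centre))
    return np.array([d_amp, d_cen])

grid = np.linspace(-3.0, 3.0, 200)
true_params = np.array([1.5, 0.4])       # "unknown" parameters to recover
target = surrogate(true_params, grid)    # plays the role of the measured DOS

params = np.array([0.5, -0.5])           # initial guess
for _ in range(5000):                    # iteratively adjust the inputs
    params -= 0.01 * loss_grad(params, grid, target)
```

In the actual pipeline the gradient with respect to the inputs would come from automatic differentiation through the trained network rather than from a hand-derived expression.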
The task of mapping a vector of inputs (e.g. band
structure parameters) to a continuous output (e.g. DOS)
is known as regression in machine learning [24]. A DNN
implements such a mapping as a series of chained ma-
trix multiplications (‘layers’) interleaved with element-
wise non-linear functions (‘activations’). Each layer multiplies the vector of outputs from the previous layer by a weight matrix to give an updated vector [23]; the final layer typically yields a single value. The weight matrices are optimized (‘trained’) using first-order optimization (e.g. gradient descent), such that the overall mapping from inputs to output approximates some function defined by a training dataset of inputs and desired outputs. The celebrated universal approximation theorem [25] shows that a neural network with just two layers (but unbounded width) can approximate any continuous
function. More recently, it has been shown that this is
also true for a neural network of bounded width (but un-
bounded depth) [26]. In practice, even finite-sized DNNs
have proven very successful in approximating complex
functions in many domains of science. One of our contri-
butions is to show that a DNN can also provide an accurate
estimate of DOS given band-structure parameters, field
strength, and chemical potential.
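This layered structure can be sketched in a few lines. The layer sizes and random weights below are illustrative only, not those of the network trained in this work; an untrained forward pass simply shows the chained matrix multiplications and element-wise activations described above.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative layer sizes: a few input parameters mapped to a single
# output value, as in a regression from physical inputs to a DOS estimate.
sizes = [4, 32, 32, 1]
weights = [rng.standard_normal((m, n)) * np.sqrt(2.0 / m)
           for m, n in zip(sizes[:-1], sizes[1:])]
biases = [np.zeros(n) for n in sizes[1:]]

def forward(x):
    # Chained matrix multiplications interleaved with element-wise
    # non-linearities (here ReLU); the final layer is left linear and
    # yields a single value.
    for w, b in zip(weights[:-1], biases[:-1]):
        x = np.maximum(x @ w + b, 0.0)
    return x @ weights[-1] + biases[-1]

output = forward(rng.standard_normal(4))
print(output.shape)  # (1,)
```

Training then adjusts `weights` and `biases` by gradient descent on a loss measuring the mismatch between `forward` and the simulator's output.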
Since our final goal is to determine the band-structure
parameters from experimental measurements of penetra-
tion capacitance, it might seem natural to train the DNN
for exactly this task (the inverse problem), instead of
the forward problem. However, this is not possible in
practice. We have only a single experimental dataset,
for which the parameters are unknown, whereas machine
learning techniques require a large dataset of training ex-
amples (with the true output known) to learn the desired
mapping from. If instead we trained on easy-to-obtain
simulated data, the resulting model would not work on
experimental data since the latter is significantly different
from the former both in terms of the relative magnitude
of features in the data, and the locations of features such
as edges. These differences may arise since the simula-
tor uses a simplified effective model of the material and
does not account for screening at the microscopic level,
disorder, strain, experimental uncertainties, and possibly
other ingredients. Indeed, statistical learning theory requires that the training and test data be drawn from the same distribution if a trained model is to work on
the latter [27].
As a specific application, we demonstrate the frame-
work outlined above on Bernal stacked (ABA) tri-
layer graphene. For this material, both a band structure parametrization [28, 29] and high-quality experimental measurements are readily available [11], calling for an
accurate extraction of the band structure parameters.
The determination of the band structure was performed by tour-de-force manual fitting in an earlier work [11],
thus allowing us to benchmark our approach. First, we
train the DNN and show it gives an efficient and accu-
rate surrogate for numerical calculation of the DOS for
this system, for a wide range of band structure param-
eters (Section III C). Next, we use the DNN for auto-
matically solving the inverse problem of determining the
physical parameters giving rise to certain values of the
penetration capacitance (Section III D). Finally, we ap-
ply this to experimental data from Ref. [11] by exploit-