
ies in chemistry and materials science so far. Lifting the
curse of cubic scaling has been a long-standing challenge.
Prior works have attempted to overcome this challenge
either through an orbital-free formulation of DFT [20]
or through algorithmic developments known as linear-scaling
DFT [21, 22]. Neither of these paths has led to a general
solution to this problem. More recently, other works
have explored leveraging ML techniques to circumvent
the inherent bottleneck of the DFT algorithm. These
works have used kernel ridge regression [23] or neural
networks [24, 25], but have remained at the conceptual
level and are applicable only to model systems, small
molecules, and low-dimensional solids.
Despite all these efforts, computing the electronic
structure of matter at large scales while maintaining first-
principles accuracy has remained an elusive goal so far.
We provide a solution to this long-standing challenge in
the form of a linear-scaling ML surrogate for DFT. Our
algorithm enables accurate predictions of the electronic
structure of materials at any length scale.
RESULTS
Ultra-large scale electronic structure predictions
with neural networks
In this work, we circumvent the computational bottle-
neck of DFT calculations by utilizing neural networks in
local atomic environments to predict the local electronic
structure. Thereby, we achieve the ability to compute the
electronic structure of matter at any length scale with
minimal computational effort and at the first-principles
accuracy of DFT.
To this end, we train a feed-forward neural network $M$
that performs a simple mapping

$$\tilde{d}(\epsilon, \boldsymbol{r}) = M\big(B(J, \boldsymbol{r})\big)\,, \qquad (1)$$

where the bispectrum coefficients $B$ of order $J$ serve as
descriptors that encode the positions of atoms relative
to every point in real space $\boldsymbol{r}$, while $\tilde{d}$ approximates the
local density of states (LDOS) $d$ at energy $\epsilon$. The LDOS
encodes the local electronic structure at each point in
real space and energy. More specifically, the LDOS can
be used to calculate the electronic density $n$ and the density
of states $D$, two important quantities which enable access
to a range of observables such as the total free energy $A$
itself [26], i.e.,

$$A[n, D] = A\big[n[d], D[d]\big] = A[d]\,. \qquad (2)$$
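To make the mapping of Eq. (1) concrete, the following is a minimal
PyTorch sketch of such a feed-forward network; the layer widths and
the descriptor and energy-grid dimensions are illustrative placeholders,
not the architecture used in this work.

```python
import torch
import torch.nn as nn

# Illustrative sizes (placeholders, not the settings of this work):
# n_desc   - number of bispectrum coefficients B per real-space grid point
# n_energy - number of energy grid points on which the LDOS is sampled
n_desc, n_energy = 91, 250

# M maps the descriptors at one grid point r to the LDOS values of Eq. (1).
M = nn.Sequential(
    nn.Linear(n_desc, 400),
    nn.LeakyReLU(),
    nn.Linear(400, 400),
    nn.LeakyReLU(),
    nn.Linear(400, n_energy),  # one output per energy grid point
)

# One descriptor vector per grid point; random values stand in for real data.
B = torch.randn(4096, n_desc)
ldos_prediction = M(B)  # shape (4096, n_energy)
```

Training then amounts to regressing these outputs against LDOS data
obtained from conventional DFT calculations on small simulation cells.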
The key point is that the neural network is trained
locally on a given point in real space and therefore has
no awareness of the system size. Our working assumption
relies on the nearsightedness of the electronic
structure [27], which sets a characteristic length scale beyond
which effects on the electronic structure decay rapidly
with distance. Since the mapping defined in Eq. (1) is
purely local, i.e., performed individually for each point
in real space, the resulting workflow is scalable across
the real-space grid, highly parallel, and transferable to
different system sizes. Non-locality is factored into the
model via the bispectrum descriptors, which are calcu-
lated by including information from adjacent points in
space up to a specified cutoff radius consistent with the
aforementioned length scale.
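Because the mapping of Eq. (1) acts on each grid point independently,
inference can be partitioned across the real-space grid in arbitrary
chunks. The sketch below illustrates this property; the function and
parameter names are hypothetical and not the MALA interface.

```python
import torch

def predict_ldos_chunked(model, descriptors, chunk_size=100_000):
    """Evaluate the local mapping of Eq. (1) chunk by chunk.

    Every grid point is independent, so the grid may be partitioned
    arbitrarily -- into batches here, or across MPI ranks or GPUs in
    a production setting -- without changing the result.
    """
    outputs = []
    with torch.no_grad():
        for chunk in torch.split(descriptors, chunk_size):
            outputs.append(model(chunk))
    return torch.cat(outputs)
```

For a fixed grid spacing, the number of grid points grows linearly with
the simulation volume and hence with the number of atoms, which is what
renders the overall workflow linear-scaling.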
The individual steps of our computational workflow
are visualized in Fig. 1. They combine the calculation
of bispectrum descriptors to encode the atomic density,
the training and evaluation of neural networks to
predict the LDOS, and finally the post-processing of
the LDOS into physical observables. The entire workflow
is implemented end-to-end as a software package called
Materials Learning Algorithms (MALA) [28], where we
employ interfaces to popular open-source software pack-
ages, namely LAMMPS [29] (descriptor calculation), Py-
Torch [30] (neural network training and inference), and
Quantum ESPRESSO [31] (post-processing of the elec-
tronic structure data to observables).
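As a sketch of the post-processing stage, the standard LDOS relations
$n(\boldsymbol{r}) = \int f(\epsilon)\, d(\epsilon, \boldsymbol{r})\, d\epsilon$ and
$D(\epsilon) = \int d(\epsilon, \boldsymbol{r})\, d^3r$ can be evaluated
numerically as follows, assuming the LDOS is sampled on regular energy and
real-space grids; the function and argument names are illustrative, not the
actual MALA or Quantum ESPRESSO interfaces.

```python
import numpy as np

def ldos_to_observables(ldos, energies, occupations, grid_volume):
    """Integrate the LDOS d(eps, r) into the density n(r) and the DOS D(eps).

    ldos        : shape (n_gridpoints, n_energy), LDOS on regular grids
    energies    : shape (n_energy,), energy grid eps
    occupations : shape (n_energy,), Fermi occupations f(eps)
    grid_volume : volume element of a single real-space grid point
    """
    # n(r): occupation-weighted integral of the LDOS over the energy grid
    density = np.trapz(occupations * ldos, energies, axis=1)
    # D(eps): integral of the LDOS over the real-space grid
    dos = ldos.sum(axis=0) * grid_volume
    return density, dos
```

Observables such as the total free energy $A[d]$ of Eq. (2) then follow
from these two ingredients.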
We illustrate our workflow by computing the electronic
structure of a sample material containing more
than 100,000 atoms. The employed ML model is a feed-
forward neural network that is trained on simulation cells
containing 256 beryllium atoms. In Fig. 2, we showcase
how our framework predicts multiple observables at pre-
viously unattainable scales. Here, we show an atomic
snapshot containing 131,072 beryllium atoms at room
temperature into which a stacking fault has been intro-
duced, i.e., three atomic layers have been shifted laterally,
changing the local crystal structure from hcp to fcc. Our
ML model is then used to predict both the electronic
densities and energies of this simulation cell with and
without the stacking fault. As expected, our ML predic-
tions reflect the changes in the electronic density due to
the changes in the atomic geometry. The energetic differ-
ences associated with such a stacking fault are expected
to follow a behavior $\sim N^{-1/3}$, where $N$ is the number of
atoms. By calculating the energy of progressively larger
systems with and without a stacking fault, we find that
this expected behavior is indeed obeyed quite closely by
our model (Fig. 2b).
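As an illustration of how this check can be carried out numerically, the
sketch below fits predicted energy differences against $N^{-1/3}$; the
function name is hypothetical, and the inputs are to be supplied from
actual model predictions.

```python
import numpy as np

def check_stacking_fault_scaling(n_atoms, delta_energies):
    """Fit energy differences against N^(-1/3) and report fit residuals.

    n_atoms        : system sizes N of the progressively larger cells
    delta_energies : predicted energy differences (faulted minus ideal)
    A good linear fit in x = N^(-1/3) indicates the expected behavior.
    """
    x = np.asarray(n_atoms, dtype=float) ** (-1.0 / 3.0)
    y = np.asarray(delta_energies, dtype=float)
    slope, intercept = np.polyfit(x, y, deg=1)
    residuals = y - (slope * x + intercept)
    return slope, intercept, residuals
```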
Our findings open up the possibility of training models
for specific applications at scales previously unattainable
with traditional simulation methods. Our ML predictions
on the 131,072-atom system take 48 minutes on
150 standard CPUs; the resulting computational cost
of roughly 121 CPU hours (CPUh) is comparable to a
conventional DFT calculation for a few hundred atoms.
The computational cost of our ML workflow lies two
orders of magnitude below that of existing linear-scaling
DFT codes, i.e., codes scaling as $\sim N$ [32], which employ
approximations in terms of the density matrix. Standard
DFT codes scale even more un-
favorably as $\sim N^3$, which renders simulations like the