Predicting electronic structures at any length scale with machine learning
Lenz Fiedler,1 Normand Modine,2 Steve Schmerler,3 Dayton J. Vogel,2 Gabriel A. Popoola,4 Aidan Thompson,5 Sivasankaran Rajamanickam,5 and Attila Cangi1
1 Center for Advanced Systems Understanding, Helmholtz-Zentrum Dresden-Rossendorf, Untermarkt 20, Görlitz, 02826, Saxony, Germany
2 Computational Materials and Data Science, Sandia National Laboratories, 1515 Eubank Blvd, Albuquerque, 87123, NM, USA
3 Information Services and Computing, Helmholtz-Zentrum Dresden-Rossendorf, Bautzner Landstraße 400, Dresden, 01328, Saxony, Germany
4 Elder Research, Inc., 300 West Main Street, Charlottesville, 22903, VA, USA
5 Center for Computing Research, Sandia National Laboratories, 1515 Eubank Blvd, Albuquerque, 87123, NM, USA
(Dated: December 12, 2022)
ABSTRACT
The properties of electrons in matter are of fundamental importance. They give rise to virtually all material properties and determine the physics at play in objects ranging from semiconductor devices to the interior of giant gas planets. Modeling and simulation of such diverse applications rely primarily on density functional theory (DFT), which has become the principal method for predicting the electronic structure of matter. While DFT calculations have proven to be very useful, their computational scaling limits them to small systems. We have developed a machine learning framework for predicting the electronic structure on any length scale. It shows up to three orders of magnitude speedup on systems where DFT is tractable and, more importantly, enables predictions on scales where DFT calculations are infeasible. Our work demonstrates how machine learning circumvents a long-standing computational bottleneck and advances materials science to frontiers intractable with any current solutions.
INTRODUCTION
Electrons are elementary particles of fundamental importance. Their quantum mechanical interactions with each other and with atomic nuclei give rise to the plethora of phenomena we observe in chemistry and materials science. Knowing the probability distribution of electrons in molecules and materials, i.e., their electronic structure, provides insights into the reactivity of molecules, the structure and the energy transport inside planets, and how materials break. Hence, both an understanding of and the ability to manipulate the electronic structure in a material propel novel technologies impacting both industry and society. In light of the global challenges related to climate change, green energy, and energy efficiency, the most notable applications that require an explicit insight into the electronic structure of matter include the search for better batteries [1, 2] and the identification of more efficient catalysts [3, 4]. The electronic structure is furthermore of great interest to fundamental physics, as it determines the Hamiltonian of an interacting many-body quantum system [5] and is observable using experimental techniques [6].
The quest for predicting the electronic structure of matter dates back to Thomas [7], Fermi [8], and Dirac [9], who formulated the very first theory in terms of electron density distributions. While computationally cheap, their theory was not useful for chemistry or materials science due to its lack of accuracy, as pointed out by Teller [10]. Subsequently, based on a mathematical existence proof [5], the seminal work of Kohn and Sham [11] provided a smart reformulation of the electronic structure problem in terms of modern density functional theory (DFT) that has led to a paradigm shift. Due to the balance of accuracy and computational cost it offers, DFT has revolutionized chemistry, with the Nobel Prize in 1998 to Kohn [12] and Pople [13] marking its breakthrough. It is the reason DFT remains by far the most widely used method for computing the electronic structure of matter. With the advent of exascale high-performance computing systems, DFT continues reshaping computational materials science at an even bigger scale [14, 15]. However, even with an exascale system, the scale one can achieve with DFT is limited due to its cubic scaling with system size. We address this limitation and demonstrate, for the first time, that an approach based on machine learning can predict electronic structures at any length scale.
In principle, DFT is an exact method, even though in practice the exchange-correlation functional needs to be approximated [16]. Sufficiently accurate approximations do exist for useful applications, and the search for ever more accurate functionals that extend the scope of DFT is an active area of research [17] in which methods of artificial intelligence and machine learning (ML) have led to great advances in accuracy [18, 19] without addressing the scaling limitation.
Despite these initial successes, DFT calculations are inherently hampered by their computational cost. The standard algorithm scales as the cube of the system size, limiting routine calculations to problems comprised of only a few hundred atoms. This is a fundamental limitation that has so far impeded large-scale computational studies in chemistry and materials science. Lifting the curse of cubic scaling has been a long-standing challenge. Prior works have attempted to overcome this challenge in terms of either an orbital-free formulation of DFT [20] or algorithmic developments known as linear-scaling DFT [21, 22]. Neither of these paths has led to a general solution to this problem. More recently, other works have explored leveraging ML techniques to circumvent the inherent bottleneck of the DFT algorithm. These have used kernel ridge regression [23] or neural networks [24, 25], but they have remained at the conceptual level and are applicable only to model systems, small molecules, and low-dimensional solids.
Despite all these efforts, computing the electronic structure of matter at large scales while maintaining first-principles accuracy has remained an elusive goal. We provide a solution to this long-standing challenge in the form of a linear-scaling ML surrogate for DFT. Our algorithm enables accurate predictions of the electronic structure of materials at any length scale.
RESULTS
Ultra-large scale electronic structure predictions with neural networks
In this work, we circumvent the computational bottleneck of DFT calculations by utilizing neural networks in local atomic environments to predict the local electronic structure. Thereby, we achieve the ability to compute the electronic structure of matter at any length scale with minimal computational effort and at the first-principles accuracy of DFT.
To this end, we train a feed-forward neural network $M$ that performs a simple mapping

  $\tilde{d}(\epsilon, \mathbf{r}) = M(B(J, \mathbf{r}))$,   (1)

where the bispectrum coefficients $B$ of order $J$ serve as descriptors that encode the positions of atoms relative to every point in real space $\mathbf{r}$, while $\tilde{d}$ approximates the local density of states (LDOS) $d$ at energy $\epsilon$. The LDOS encodes the local electronic structure at each point in real space and energy. More specifically, the LDOS can be used to calculate the electronic density $n$ and the density of states $D$, two important quantities which enable access to a range of observables such as the total free energy $A$ itself [26], i.e.,

  $A[n, D] = A[n[d], D[d]] = A[d]$.   (2)
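To make the mapping of Eq. (1) concrete, the following is a minimal PyTorch sketch of such a per-grid-point model. The descriptor length, energy grid size, and layer widths are illustrative placeholders, not the architecture used in this work.

import torch
import torch.nn as nn

N_DESCRIPTORS = 91  # hypothetical length of the bispectrum vector B(J, r)
N_ENERGIES = 250    # hypothetical number of energy grid points for d(eps, r)

class LDOSNet(nn.Module):
    """Feed-forward map M from local descriptors to the local LDOS, Eq. (1)."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(N_DESCRIPTORS, 400), nn.LeakyReLU(),
            nn.Linear(400, 400), nn.LeakyReLU(),
            nn.Linear(400, N_ENERGIES),
        )

    def forward(self, b):
        # b: (n_grid_points, N_DESCRIPTORS) -> LDOS: (n_grid_points, N_ENERGIES)
        return self.net(b)

model = LDOSNet()
descriptors = torch.randn(1024, N_DESCRIPTORS)  # one batch of grid points
ldos = model(descriptors)                       # predicted d(eps, r) per point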
The key point is that the neural network is trained locally on a given point in real space and therefore has no awareness of the system size. Our underlying working assumption relies on the nearsightedness of the electronic structure [27], which sets a characteristic length scale beyond which effects on the electronic structure decay rapidly with distance. Since the mapping defined in Eq. (1) is purely local, i.e., performed individually for each point in real space, the resulting workflow is scalable across the real-space grid, highly parallel, and transferable to different system sizes. Non-locality is factored into the model via the bispectrum descriptors, which are calculated by including information from adjacent points in space up to a specified cutoff radius consistent with the aforementioned length scale.
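A short sketch of why this locality yields size transferability: the trained model (the LDOSNet above) is evaluated independently for each grid point, so a grid of any size can be processed in chunks, with the chunk size chosen freely.

import torch

def predict_ldos(model, descriptors, chunk_size=65536):
    """Evaluate the per-point model over an arbitrarily large descriptor grid.

    descriptors: (n_grid_points, n_descriptors); n_grid_points grows linearly
    with the number of atoms, but the model itself never sees that number."""
    outputs = []
    with torch.no_grad():
        for chunk in torch.split(descriptors, chunk_size):
            outputs.append(model(chunk))
    return torch.cat(outputs)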
The individual steps of our computational workflow are visualized in Fig. 1. They include the calculation of bispectrum descriptors to encode the atomic density, the training and evaluation of neural networks to predict the LDOS, and finally, the post-processing of the LDOS into physical observables. The entire workflow is implemented end-to-end as a software package called Materials Learning Algorithms (MALA) [28], where we employ interfaces to popular open-source software packages, namely LAMMPS [29] (descriptor calculation), PyTorch [30] (neural network training and inference), and Quantum ESPRESSO [31] (post-processing of the electronic structure data into observables).
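As an illustration of the final post-processing step, the sketch below reduces a predicted LDOS to the electronic density and the density of states via occupation-weighted integrals. It uses simple Riemann sums on uniform grids and omits the actual integration and normalization details of the MALA implementation.

import numpy as np

def fermi_dirac(energies, mu, temperature_ev):
    # Fermi-Dirac occupation f(eps) at electronic temperature k_B*T (in eV)
    return 1.0 / (1.0 + np.exp((energies - mu) / temperature_ev))

def ldos_to_observables(ldos, energies, grid_volume_element, mu, temperature_ev):
    """Reduce ldos (n_grid_points, n_energies) to n(r) and D(eps)."""
    d_eps = energies[1] - energies[0]
    occ = fermi_dirac(energies, mu, temperature_ev)
    # n(r): occupation-weighted energy integral of the LDOS at each grid point
    density = (ldos * occ).sum(axis=1) * d_eps
    # D(eps): integral of the LDOS over the real-space grid
    dos = ldos.sum(axis=0) * grid_volume_element
    return density, dos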
We illustrate our workflow by computing the electronic structure of a sample material that contains more than 100,000 atoms. The employed ML model is a feed-forward neural network trained on simulation cells containing 256 beryllium atoms. In Fig. 2, we showcase how our framework predicts multiple observables at previously unattainable scales. Here, we show an atomic snapshot containing 131,072 beryllium atoms at room temperature into which a stacking fault has been introduced, i.e., three atomic layers have been shifted laterally, changing the local crystal structure from hcp to fcc. Our ML model is then used to predict both the electronic densities and energies of this simulation cell with and without the stacking fault. As expected, our ML predictions reflect the changes in the electronic density due to the changes in the atomic geometry. The energetic differences associated with such a stacking fault are expected to follow a behavior $\propto N^{1/3}$, where $N$ is the number of atoms. By calculating the energy of progressively larger systems with and without a stacking fault, we find that this expected behavior is indeed obeyed quite closely by our model (Fig. 2b).
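A quick consequence of the stated $\propto N^{1/3}$ behavior, useful as a sanity check when comparing progressively doubled systems:

# Each doubling of the atom count should scale the faulted-vs-unfaulted
# energy difference by 2**(1/3), if the N**(1/3) behavior holds.
for doublings in range(1, 4):
    factor = 2.0 ** (doublings / 3.0)
    print(f"{2 ** doublings}x atoms -> energy difference grows by {factor:.3f}x")
# 2x atoms -> energy difference grows by 1.260x
# 4x atoms -> energy difference grows by 1.587x
# 8x atoms -> energy difference grows by 2.000x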
Our findings open up the possibility of training models for specific applications on scales previously unattainable with traditional simulation methods. Our ML predictions on the 131,072-atom system take 48 minutes on 150 standard CPUs; the resulting computational cost of roughly 121 CPU hours (CPUh) is comparable to a conventional DFT calculation for a few hundred atoms. The computational cost of our ML workflow is orders of magnitude below currently existing linear-scaling DFT codes, i.e., codes scaling as $\propto N$ [32], which employ approximations in terms of the density matrix. Their computational cost lies two orders of magnitude above our approach. Standard DFT codes scale even more unfavorably, as $\propto N^3$, which renders simulations like the one presented here completely infeasible.

FIG. 1. Overview of the MALA framework. ML models created via this workflow can be trained on data from popular first-principles simulation codes such as Quantum ESPRESSO [31]. The pictograms below the individual workflow steps show, from left to right: the calculation of local descriptors at an arbitrary grid point (green) based on information at adjacent grid points (grey) within a certain cutoff radius (orange), with an atom shown in red; a neural network; and the electronic structure, exemplified here as a contour plot of the electronic density for a cell containing aluminum atoms (red).
Common research directions for utilizing ML in the realm of electronic structure theory focus either on predicting energies and forces of extended systems (ML interatomic potentials [33]) or on directly predicting observables of interest such as polarizabilities [34]. MALA models are not limited to single observables and even give insight into the electronic structure itself, from which a range of relevant observables, including the total free energy, the density of states, the electronic density, and atomic forces, follow.
The utility of our ML framework for chemistry and materials science relies on two key aspects. It needs to scale well with system size, up to the 100,000-atom scale and beyond. Furthermore, it also needs to maintain accuracy as we run inferences on increasingly large systems. Both issues are addressed in the following.
Computational scaling
The computational cost of conventional DFT calculations scales as $\propto N^3$. Improved algorithms can enable an effective $\propto N^2$ scaling in certain cases over certain size ranges [37]. In either case, one is faced with an increasingly insurmountable computational cost for systems involving more than a few thousand atoms. As illustrated in Fig. 3a, conventional DFT calculations (here using the Quantum ESPRESSO [31] software package) are subject to this scaling behavior. In contrast, the computational cost of using MALA models for size extrapolation (as shown in Fig. 3b) grows linearly with the number of atoms and incurs a significantly smaller computational overhead. We observe speed-ups of up to three orders of magnitude for atom counts up to which DFT calculations are computationally tractable.
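The growth of this speedup follows directly from the scaling exponents: with DFT at $\propto N^3$ and the ML workflow at $\propto N$, the speedup ratio grows as $N^2$. The sketch below illustrates this; the anchor point (a 1,000x speedup at 1,000 atoms) is a hypothetical value chosen for illustration, not a measurement from Fig. 3.

def speedup(n_atoms, n_ref=1_000, speedup_ref=1e3):
    # DFT cost ~ N**3, ML workflow cost ~ N  =>  speedup ratio ~ N**2,
    # anchored at a hypothetical reference point (speedup_ref at n_ref atoms).
    return speedup_ref * (n_atoms / n_ref) ** 2

for n in (1_000, 10_000, 100_000):
    print(f"N = {n:>7}: estimated speedup ~ {speedup(n):.0e}x")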
MALA model inference consists of three steps. First, the descriptor vectors are calculated on a real-space grid; then, the LDOS is computed using a pre-trained neural network for the given input descriptors; and finally, the LDOS is post-processed to compute electronic densities, total energies, and other observables. The first two parts of this workflow trivially scale as $\propto N$, since they strictly perform operations per grid point, and the real-space simulation grid grows linearly with $N$.
Obtaining linear scaling for the last part of the workflow, which includes processing the electronic density into the total free energy, is less trivial, since it requires the evaluation of both the ion-ion energy and the exchange-correlation energy, which for the pseudopotentials we employ includes the calculation of non-linear core corrections. While both of these terms can be shown to scale linearly with system size in principle, in practice this requires the addition of a few custom routines, as further outlined in the methods section.
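Those custom routines are detailed in the methods section and not reproduced here. As a generic illustration of the standard technique behind linear-scaling pairwise terms, the sketch below bins atoms into cells of the cutoff size so that each atom only visits neighboring cells, turning a naive O(N^2) pair sum into O(N). The function name, the orthorhombic no-PBC setup, and the short-ranged pair interaction are simplifying assumptions; long-ranged ion-ion Coulomb terms require additional treatment (e.g., Ewald-type summation).

import numpy as np

def pair_energy_linear(positions, box, cutoff, pair_fn):
    """O(N) short-ranged pair energy via a cell list (orthorhombic box, no PBC)."""
    box = np.asarray(box, dtype=float)
    # Choose cells at least one cutoff wide, so neighbors lie in adjacent cells.
    n_cells = np.maximum((box // cutoff).astype(int), 1)
    cell_of = np.minimum((positions / box * n_cells).astype(int), n_cells - 1)
    cells = {}
    for i, c in enumerate(map(tuple, cell_of)):
        cells.setdefault(c, []).append(i)
    energy = 0.0
    for (cx, cy, cz), members in cells.items():
        for dx in (-1, 0, 1):
            for dy in (-1, 0, 1):
                for dz in (-1, 0, 1):
                    for j in cells.get((cx + dx, cy + dy, cz + dz), ()):
                        for i in members:
                            if j <= i:
                                continue  # count each unordered pair once
                            r = np.linalg.norm(positions[i] - positions[j])
                            if r < cutoff:
                                energy += pair_fn(r)
    return energy

# Usage with a hypothetical Lennard-Jones-like pair term:
# e = pair_energy_linear(pos, box=[50.0, 50.0, 50.0], cutoff=6.0,
#                        pair_fn=lambda r: 4.0 * ((1 / r) ** 12 - (1 / r) ** 6))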