
ies in chemistry and materials science so far. Lifting the
curse of cubic scaling has been a long-standing challenge.
Prior works have attempted to overcome this challenge
either through an orbital-free formulation of DFT [20]
or through algorithmic developments known as linear-scaling
DFT [21, 22]. Neither of these paths has led to a general
solution to this problem. More recently, other works
have explored leveraging ML techniques to circumvent
the inherent bottleneck of the DFT algorithm. These
works have used kernel ridge regression [23] or neural
networks [24, 25], but have remained at the conceptual
level and are applicable only to model systems, small
molecules, and low-dimensional solids.
Despite all these efforts, computing the electronic
structure of matter at large scales while maintaining first-
principles accuracy has remained an elusive goal so far.
We provide a solution to this long-standing challenge in
the form of a linear-scaling ML surrogate for DFT. Our
algorithm enables accurate predictions of the electronic
structure of materials at any length scale.
RESULTS
Ultra-large scale electronic structure predictions
with neural networks
In this work, we circumvent the computational bottle-
neck of DFT calculations by utilizing neural networks in
local atomic environments to predict the local electronic
structure. Thereby, we achieve the ability to compute the
electronic structure of matter at any length scale with
minimal computational effort and at the first-principles
accuracy of DFT.
To this end, we train a feed-forward neural network $M$
that performs a simple mapping

$$\tilde{d}(\epsilon, \boldsymbol{r}) = M\big(B(J, \boldsymbol{r})\big)\,, \qquad (1)$$

where the bispectrum coefficients $B$ of order $J$ serve as
descriptors that encode the positions of atoms relative
to every point in real space $\boldsymbol{r}$, while $\tilde{d}$ approximates the
local density of states (LDOS) $d$ at energy $\epsilon$. The LDOS
encodes the local electronic structure at each point in
real space and energy. More specifically, the LDOS can
be used to calculate the electronic density $n$ and the density
of states $D$, two important quantities which enable access
to a range of observables such as the total free energy $A$
itself [26], i.e.,

$$A[n, D] = A\big[n[d], D[d]\big] = A[d]\,. \qquad (2)$$
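To make the mapping of Eq. (1) concrete, the following is a minimal
PyTorch sketch of such a feed-forward network; the layer widths and
the descriptor and energy-grid dimensions are illustrative placeholders,
not the architecture used in this work.

```python
import torch
import torch.nn as nn

# Illustrative sizes (placeholders, not the settings of this work):
# n_desc   - number of bispectrum coefficients B per real-space grid point
# n_energy - number of energy grid points on which the LDOS is sampled
n_desc, n_energy = 91, 250

# M maps the descriptors at one grid point r to the LDOS values of Eq. (1).
M = nn.Sequential(
    nn.Linear(n_desc, 400),
    nn.LeakyReLU(),
    nn.Linear(400, 400),
    nn.LeakyReLU(),
    nn.Linear(400, n_energy),  # one output per energy grid point
)

# One descriptor vector per grid point; random values stand in for real data.
B = torch.randn(4096, n_desc)
ldos_prediction = M(B)  # shape (4096, n_energy)
```

Training then amounts to regressing these outputs against LDOS data
obtained from conventional DFT calculations on small simulation cells.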
The key point is that the neural network is trained
locally on a given point in real space and therefore has
no awareness of the system size. Our working assumption
relies on the nearsightedness of the electronic
structure [27], which sets a characteristic length scale beyond
which effects on the electronic structure decay rapidly
with distance. Since the mapping defined in Eq. (1) is
purely local, i.e., performed individually for each point
in real space, the resulting workflow is scalable across
the real-space grid, highly parallel, and transferable to
different system sizes. Non-locality is factored into the
model via the bispectrum descriptors, which are calcu-
lated by including information from adjacent points in
space up to a specified cutoff radius consistent with the
aforementioned length scale.
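Because the mapping of Eq. (1) acts on each grid point independently,
inference can be partitioned across the real-space grid in arbitrary
chunks. The sketch below illustrates this property; the function and
parameter names are hypothetical and not the MALA interface.

```python
import torch

def predict_ldos_chunked(model, descriptors, chunk_size=100_000):
    """Evaluate the local mapping of Eq. (1) chunk by chunk.

    Every grid point is independent, so the grid may be partitioned
    arbitrarily -- into batches here, or across MPI ranks or GPUs in
    a production setting -- without changing the result.
    """
    outputs = []
    with torch.no_grad():
        for chunk in torch.split(descriptors, chunk_size):
            outputs.append(model(chunk))
    return torch.cat(outputs)
```

For a fixed grid spacing, the number of grid points grows linearly with
the simulation volume and hence with the number of atoms, which is what
renders the overall workflow linear-scaling.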
The individual steps of our computational workflow
are visualized in Fig. 1. They combine the calculation
of bispectrum descriptors to encode the atomic density,
the training and evaluation of neural networks to
predict the LDOS, and finally the post-processing of
the LDOS into physical observables. The entire workflow
is implemented end-to-end as a software package called
Materials Learning Algorithms (MALA) [28], where we
employ interfaces to popular open-source software pack-
ages, namely LAMMPS [29] (descriptor calculation), Py-
Torch [30] (neural network training and inference), and
Quantum ESPRESSO [31] (post-processing of the elec-
tronic structure data to observables).
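As a sketch of the post-processing stage, the standard LDOS relations
$n(\boldsymbol{r}) = \int f(\epsilon)\, d(\epsilon, \boldsymbol{r})\, d\epsilon$ and
$D(\epsilon) = \int d(\epsilon, \boldsymbol{r})\, d^3r$ can be evaluated
numerically as follows, assuming the LDOS is sampled on regular energy and
real-space grids; the function and argument names are illustrative, not the
actual MALA or Quantum ESPRESSO interfaces.

```python
import numpy as np

def ldos_to_observables(ldos, energies, occupations, grid_volume):
    """Integrate the LDOS d(eps, r) into the density n(r) and the DOS D(eps).

    ldos        : shape (n_gridpoints, n_energy), LDOS on regular grids
    energies    : shape (n_energy,), energy grid eps
    occupations : shape (n_energy,), Fermi occupations f(eps)
    grid_volume : volume element of a single real-space grid point
    """
    # n(r): occupation-weighted integral of the LDOS over the energy grid
    density = np.trapz(occupations * ldos, energies, axis=1)
    # D(eps): integral of the LDOS over the real-space grid
    dos = ldos.sum(axis=0) * grid_volume
    return density, dos
```

Observables such as the total free energy $A[d]$ of Eq. (2) then follow
from these two ingredients.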
We illustrate our workflow by computing the electronic
structure of a sample material containing more
than 100,000 atoms. The employed ML model is a feed-
forward neural network that is trained on simulation cells
containing 256 beryllium atoms. In Fig. 2, we showcase
how our framework predicts multiple observables at pre-
viously unattainable scales. Here, we show an atomic
snapshot containing 131,072 beryllium atoms at room
temperature into which a stacking fault has been intro-
duced, i.e., three atomic layers have been shifted laterally,
changing the local crystal structure from hcp to fcc. Our
ML model is then used to predict both the electronic
densities and energies of this simulation cell with and
without the stacking fault. As expected, our ML predic-
tions reflect the changes in the electronic density due to
the changes in the atomic geometry. The energetic differ-
ences associated with such a stacking fault are expected
to follow a behavior $\sim N^{-1/3}$, where $N$ is the number of
atoms. By calculating the energy of progressively larger
systems with and without a stacking fault, we find that
this expected behavior is indeed obeyed quite closely by
our model (Fig. 2b).
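As an illustration of how this check can be carried out numerically, the
sketch below fits predicted energy differences against $N^{-1/3}$; the
function name is hypothetical, and the inputs are to be supplied from
actual model predictions.

```python
import numpy as np

def check_stacking_fault_scaling(n_atoms, delta_energies):
    """Fit energy differences against N^(-1/3) and report fit residuals.

    n_atoms        : system sizes N of the progressively larger cells
    delta_energies : predicted energy differences (faulted minus ideal)
    A good linear fit in x = N^(-1/3) indicates the expected behavior.
    """
    x = np.asarray(n_atoms, dtype=float) ** (-1.0 / 3.0)
    y = np.asarray(delta_energies, dtype=float)
    slope, intercept = np.polyfit(x, y, deg=1)
    residuals = y - (slope * x + intercept)
    return slope, intercept, residuals
```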
Our findings open up the possibility of training models
for specific applications at scales previously unattainable
with traditional simulation methods. Our ML predictions
on the 131,072-atom system take 48 minutes on
150 standard CPUs; the resulting computational cost
of roughly 121 CPU hours (CPUh) is comparable to a
conventional DFT calculation for a few hundred atoms.
The computational cost of our ML workflow lies two
orders of magnitude below that of existing linear-scaling
DFT codes, i.e., codes scaling as $\sim N$ [32], which employ
approximations in terms of the density matrix. Standard
DFT codes scale even more un-
favorably as $\sim N^3$, which renders simulations like the