Accurate neural-network-based fitting of
full-dimensional two-body potential energy
surfaces
Artem A. Finenko
Department of Chemistry, Lomonosov Moscow State University, GSP-1, 1-3 Leninskiye
Gory, Moscow 119991, Russia
Institute of Quantum Physics, Irkutsk National Research Technical University, 83
Lermontov str., Irkutsk 664074, Russia
Institute of Applied Physics of the Russian Academy of Sciences, 46 Ulyanov str., Nyzhny
Novgorod 603950, Russia
E-mail: artfin@mail.ru
arXiv:2210.09970v1 [physics.chem-ph] 18 Oct 2022
Abstract
We describe the development of machine-learned potentials of atmospheric gases
with flexible monomers for molecular simulations. A recently suggested permutationally
invariant polynomial neural network (PIP-NN) approach is utilized to represent
the full-dimensional two-body component of the dimer energy. To ensure the asymptotic
zero-interaction limit, a tailored subset of the full invariant polynomial basis set
is utilized, and the polynomial variables are modified to better reproduce the correct
asymptotic behavior at long range. The new technique is used to build full-dimensional
potentials for the two-body N2-Ar and N2-CH4 interactions by fitting databases of
ab initio energies calculated at the coupled-cluster level of theory. The second virial
coefficients with full account of molecular flexibility effects are then calculated within
the classical framework using the PIP-NN potential surfaces. To showcase the advantages
of the PIP-NN method, we compare its accuracy and computational efficiency to
several kernel-based and neural-network-based approaches using the MD17 database of
energies and forces for ethanol. For large training set sizes, the PIP-NN models attain
the best accuracy among the examined models, and the computation time is shown to be
comparable to that of the PIP regression model and several orders of magnitude shorter
than that of the quickest alternatives.
Introduction
A potential energy surface (PES) is a key concept in computational chemistry that enables
an understanding of processes on the atomistic scale. The continuous changes within atomistic
systems are often thought of as being driven by a multidimensional energy landscape, which
is determined by the spatial positions of the atoms. Offering unprecedented insights into
complex physical and chemical phenomena, molecular dynamics (MD) simulation is indispensable
in the computational chemistry toolbox. One of the widely recognized issues in MD simulations
is the insufficient accuracy of the underlying interatomic potentials, which hinders truly
predictive modeling of the properties of molecular systems. Although the potential energy values
could, in principle, be obtained on-the-fly using pertinent electronic structure calculations,
more efficient approaches are often required to characterize relevant observables, for example,
(ro-)vibrational spectra, rate constants, etc. Thanks to modern computational resources, a
large number of highly precise electronic structure calculations can be carried out at various
points across the configuration space.1 Despite significant advances, representing a global
PES from these ab initio points poses a difficult challenge. Thus, one of the key focuses
of computational chemistry has been on the development of efficient methods to generate
high-quality representations of global PESs for molecules. Various non-machine-learning (non-ML)
approaches, such as modified Shepard interpolation,2 interpolating moving least-squares (IMLS),3
reproducing kernel Hilbert space interpolation,4 and permutationally invariant polynomial (PIP)
regression,5,6 to name a few, have been
successful at constructing global PESs. Machine learning, and neural networks in particular,
has recently shown great promise as a tool for building both flexible and computationally
efficient models of PESs.7–9 Combining techniques from the ML and non-ML perspectives has
been demonstrated to be effective; PIP-IMLS10 and PIP-NN11,12 are excellent examples of
this synergy.
In the current study, we undertake the development of ML PESs for molecular simulations
of atmospheric gases with flexible monomers. Collision processes in atmospheric gases mod-
ify the radiative properties of moieties by shifting and broadening spectral lines as well as
leading to the emergence of collision-induced bands. The advancement of the observational
and retrieval capabilities of terrestrial and planetary remote-sensing missions poses new chal-
lenges for theoretical research to improve the representation of these phenomena. Starting
from potential energy and dipole moment surfaces, contemporary state-of-the-art theories
permit first-principles simulations of spectral features that emerge in real gases (see the review
by Hartmann et al.13 and references therein). One of the hurdles affecting remote-sensing
retrievals is the lack of or insufficient accuracy of models describing continuum absorption.
In Refs. 14 and 15, we developed a first-principles trajectory-based approach to simulate
collision-induced absorption. In light of the approach’s effectiveness in simulating far-infrared
absorption in the cases of N2-N2,14 CO2-Ar,15 and N2-CH4,16 we seek to simulate the forbidden
rotovibrational fundamental and overtone bands of molecule pairs consisting of dipoleless
moieties. Following the many-body expansion,17 we specifically focus on the full-dimensional
representation of the two-body term, $V_{\mathrm{2b}}$, which corresponds to the interaction energy
between two molecules. The two-body component is defined as the difference between the total
energy of the dimer, $V_{\mathrm{dimer}}$, and the one-body energies of the isolated monomers,
$V_{\mathrm{mon1}}$ and $V_{\mathrm{mon2}}$:
$$V_{\mathrm{2b}} = V_{\mathrm{dimer}} - V_{\mathrm{mon1}} - V_{\mathrm{mon2}}. \tag{1}$$
Note that in this context dimer refers to a molecular pair that is not necessarily in a bound
state. Keep in mind that the two-body component of the PES can be paired with various
monomer potentials depending on the level of fidelity required for a particular application,
making the global PES model configurable and adaptable.
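The bookkeeping behind Eq. (1) can be illustrated with a minimal sketch. The function names and the numerical energies below are made up for illustration only and do not come from the paper; the point is simply that the two-body term vanishes when the dimer energy equals the sum of the monomer energies, and that it can be recombined with any choice of monomer potentials.

```python
def two_body_energy(v_dimer, v_mon1, v_mon2):
    """Eq. (1): dimer energy minus the energies of the two isolated monomers."""
    return v_dimer - v_mon1 - v_mon2


def total_energy(v_2b, v_mon1, v_mon2):
    """Recombine a two-body term with (possibly different) monomer potentials."""
    return v_2b + v_mon1 + v_mon2


# Made-up energies (arbitrary units), chosen only to show the arithmetic.
v_mon1, v_mon2 = -109.4000, -527.1000
v_dimer = -636.5005

v_2b = two_body_energy(v_dimer, v_mon1, v_mon2)

# Non-interacting limit: the dimer energy is just the sum of monomer energies.
v_2b_far = two_body_energy(v_mon1 + v_mon2, v_mon1, v_mon2)

print(v_2b, v_2b_far)
```

Because the monomer energies enter only through `total_energy`, a fitted two-body model of this kind could in principle be paired with a cheaper or a more accurate monomer potential without refitting.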
The machine-learning potential is typically built from two components: the molecular
descriptor and the ML model. The descriptor provides a mapping that associates an atomic
configuration, which is identified by the positions and chemical identities of atoms, with a
point in a feature space.18 Although the Cartesian coordinates of the atoms encode all the
information about the structure of a molecular system, they are poorly suited as direct
input to the ML model because they change under operations that leave the energy unchanged:
the potential energy surface is invariant
to overall translation and rotation as well as to permutations of identical atoms. Generally
speaking, ML models that account for symmetry either using descriptors or through clever
architecture tend to be more data-efficient and accurate. In this paper, we use PIPs as a type
of descriptor that will allow us to account for translational, rotational, and permutational
symmetry.
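To make the symmetry argument concrete, the following sketch builds a few low-order invariants for a hypothetical A2B system (atoms 0 and 1 identical, atom 2 distinct) and checks that they do not change under translation, rotation, or exchange of the identical atoms. The use of Morse-type variables exp(-r/lam), the value of `lam`, and the particular polynomials are assumptions made for illustration; they are not taken from this section and are not the basis set used in the paper.

```python
import math


def morse_variables(coords, lam=2.0):
    """Map Cartesian coordinates to y_ij = exp(-r_ij / lam) for every atom pair."""
    y = {}
    for i in range(len(coords)):
        for j in range(i + 1, len(coords)):
            y[(i, j)] = math.exp(-math.dist(coords[i], coords[j]) / lam)
    return y


def pip_features(coords):
    """Low-order invariants for an A2B system (atoms 0 and 1 are identical)."""
    y = morse_variables(coords)
    # Swapping atoms 0 and 1 exchanges y02 and y12, so only symmetric
    # combinations of these two variables are allowed.
    return (y[(0, 1)], y[(0, 2)] + y[(1, 2)], y[(0, 2)] * y[(1, 2)])


def rotate_z(p, angle):
    """Rotate a point about the z axis."""
    c, s = math.cos(angle), math.sin(angle)
    return (c * p[0] - s * p[1], s * p[0] + c * p[1], p[2])


geom = [(0.0, 0.0, 0.0), (1.1, 0.0, 0.0), (0.3, 2.5, 0.7)]
# Translate, then rotate the whole system.
moved = [rotate_z((x + 1.0, y - 2.0, z + 0.5), 0.8) for x, y, z in geom]
# Exchange the two identical atoms.
swapped = [geom[1], geom[0], geom[2]]

for label, g in (("original", geom), ("moved", moved), ("swapped", swapped)):
    print(label, pip_features(g))
```

The three printed feature tuples agree up to rounding, whereas the raw Cartesian coordinates differ in all three cases.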
In recent years, numerous ML methods have been applied to fit electronic energies (see
review19 and references therein). These ML methods produce mappings that can be charac-
terized as parametric or nonparametric.20 The assumption made by parametric algorithms
is that the mapping function has a predefined functional form with a fixed number of
parameters. A typical example is a neural network (NN). Nonparametric algorithms, on the other
hand, do not make this assumption: the more training data are given, the more complex
the model becomes and the more parameters it contains. Such is the case with kernel
methods like Gaussian process regression and kernel ridge regression. Assuming the NN’s
structure is fixed, the prediction cost for NN-based methods becomes independent of the
size of the training set, in contrast to kernel-based approaches where the prediction cost
scales linearly with the size of the training set. Given our goal of performing large-scale
collision simulations with ML potentials, which necessitates robust force models, we chose
an NN-based method.
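The difference in prediction cost can be sketched with two toy one-dimensional predictors. This is an illustrative operation count, not a benchmark: the Gaussian kernel, the single hidden layer, and all parameter values are placeholders assumed here rather than models from the paper.

```python
import math


def kernel_predict(x, train_x, alphas, sigma=1.0):
    """Kernel-model prediction: one kernel evaluation per training point."""
    total, n_evals = 0.0, 0
    for xi, a in zip(train_x, alphas):
        total += a * math.exp(-((x - xi) ** 2) / (2.0 * sigma**2))
        n_evals += 1
    return total, n_evals


def nn_predict(x, w1, b1, w2, b2):
    """One-hidden-layer network: the cost depends only on the number of weights."""
    hidden = [math.tanh(w * x + b) for w, b in zip(w1, b1)]
    n_mults = len(w1) + len(w2)
    return sum(w * h for w, h in zip(w2, hidden)) + b2, n_mults


# Placeholder parameters; a real model would obtain these from training.
w1, b1 = [0.5, -1.2, 0.3], [0.0, 0.1, -0.2]
w2, b2 = [1.0, 0.7, -0.4], 0.05

for n_train in (10, 100, 1000):
    train_x = [i / n_train for i in range(n_train)]
    alphas = [1.0 / n_train] * n_train
    _, kernel_cost = kernel_predict(0.5, train_x, alphas)
    _, nn_cost = nn_predict(0.5, w1, b1, w2, b2)
    print(n_train, kernel_cost, nn_cost)
```

The kernel count grows in step with the training set size, while the network count stays fixed once the architecture is chosen.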
To assess the computational efficiency of the PIP-NN method, we conducted a showcase
study on the ethanol molecule using the MD17 database of energies and forces. A range
of ML approaches have been compared by Pinheiro et al.21 based on the MD17-ethanol
database. Using the suggested protocol based on the analysis of learning curves, we compare
our implementation of the PIP-NN method to other approaches, in particular, symmetrized
gradient-domain machine learning (sGDML),22,23 a deep neural network with the PhysNet
architecture,9 the Gaussian approximation potential,24 and PIP regression,6 on the basis of
precision and prediction times.
The rest of the paper is organized as follows. The structure of the PIP-NN model and a
few modifications that we explore in this paper are described in the next section, along with
the generation of interaction energy data sets and an explanation of the training procedure.
The Results section discusses how PIP-NN models performed on these data sets as well as
on the MD17-ethanol data set. The Summary and Conclusions section covers concluding
remarks.
Methods
In this work, we adapt the PIP-NN approach put forward by Guo and co-workers11,25 to
represent multidimensional PESs, characterized by two-body, non-covalent interactions. We
begin by outlining the PIP-NN model’s structure and how the PIP basis could be altered
to be able to reproduce essential properties pertinent to the two-body component of the
PES. Next, we explain how the variables of invariant polynomials could be changed to more
accurately map the long-range region. Finally, we discuss the training procedure and the
development of interaction energy data sets.