The MRChem multiresolution analysis code for molecular electronic structure calculations: performance and scaling properties

Peter Wind, Magnar Bjørgve, Anders Brakestad, Gabriel A. Gerez S., Stig Rune Jensen,
Roberto Di Remigio Eikås, and Luca Frediani
Department of Chemistry, UiT - The Arctic University of Norway, N-9037 Tromsø,
Norway
Algorithmiq Ltd, Kanavakatu 3C, FI-00160 Helsinki, Finland
E-mail: peter.wind@uit.no; stig.r.jensen@uit.no; luca.frediani@uit.no
Abstract
MRChem is a code for molecular electronic structure calculations based on a multiwavelet
adaptive basis representation. We describe our implementation strategy and present several
benchmark calculations. Systems comprising more than a thousand orbitals are investigated
at the Hartree–Fock level of theory, with an emphasis on scaling properties. With our
design, terms that formally scale quadratically with the system size in effect scale better
because of the implicit screening introduced by the inherent adaptivity of the method: all
operations are performed to the requested precision, which serves the dual purpose of
minimizing the computational cost and controlling the final error precisely. Comparisons
with traditional software based on Gaussian-type orbitals show that MRChem can be
competitive with respect to performance.
arXiv:2210.01011v2 [physics.chem-ph] 9 Nov 2022
1 Introduction
Gaussian-type orbitals (GTOs) and more generally Linear Combination of Atomic Orbitals
(LCAOs) [1] are well established as a standard for ab initio molecular electronic structure
calculations. As their shape is closely related to the electronic structure of atoms, even very
small basis sets with only a few functions per Molecular Orbital (MO) can give reasonable
results for describing molecular properties. However, for extended systems the description of
each orbital still requires the contributions from the entire basis in order to ensure
orthogonality. Without further precautions, even when using localized orbitals, a large
proportion of the coefficients will be very small for those systems.
In a Multiresolution Analysis (MRA) framework like multiwavelets (MWs), the basis can
adapt according to the function described (for an in-depth review of the MRA method in
the field of Quantum Chemistry, see Ref. 2). The available basis is in principle infinite and
complete, and, in practice, it is dynamically adapted to the local shape of the function and the
required precision. This can require the real-space basis to comprise millions of elementary
functions for each orbital. In this sense, the method starts with a big handicap compared
to LCAO basis sets. On the other hand, the exponential growth of available computational
resources has in recent years enabled MRA calculations on systems comprising thousands of
atoms [3,4].
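The idea of a basis that adapts to the function and to the requested precision can be illustrated with a minimal one-dimensional sketch. It is not the multiwavelet algorithm used in MRChem: it assumes a simple bisection scheme in which each cell carries a polynomial interpolant at Gauss-Legendre nodes, and a cell is split whenever the sampled interpolation error exceeds the threshold `eps`. The names `local_error`, `adaptive_grid` and `cusp` are hypothetical.

```python
import numpy as np
from numpy.polynomial import legendre


def local_error(f, a, b, order):
    """Sampled max error of a degree-`order` interpolant of f on the cell [a, b]."""
    nodes, _ = legendre.leggauss(order + 1)  # Gauss-Legendre nodes on [-1, 1]

    def to_cell(t):
        # Map reference coordinates in [-1, 1] to the cell [a, b].
        return 0.5 * (b - a) * t + 0.5 * (a + b)

    # order + 1 nodes and degree `order`: the fit is an exact interpolation.
    coeffs = legendre.legfit(nodes, f(to_cell(nodes)), order)
    probes = np.linspace(-1.0, 1.0, 4 * (order + 1))
    return np.max(np.abs(legendre.legval(probes, coeffs) - f(to_cell(probes))))


def adaptive_grid(f, a, b, eps, order=5, max_depth=20, depth=0):
    """Bisect [a, b] until every cell represents f to within eps (or max_depth)."""
    if depth >= max_depth or local_error(f, a, b, order) <= eps:
        return [(a, b)]
    mid = 0.5 * (a + b)
    left = adaptive_grid(f, a, mid, eps, order, max_depth, depth + 1)
    right = adaptive_grid(f, mid, b, eps, order, max_depth, depth + 1)
    return left + right


def cusp(x):
    """Toy orbital-like function with a cusp at the origin."""
    return np.exp(-5.0 * np.abs(x))


cells_loose = adaptive_grid(cusp, -8.0, 8.0, eps=1e-3)
cells_tight = adaptive_grid(cusp, -8.0, 8.0, eps=1e-6)
```

In this sketch the cells are small only near the cusp and stay wide where the function is negligible, and tightening `eps` increases the number of cells. This is the sense in which the size of the representation follows both the local shape of the function and the required precision.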
Two main challenges need to be addressed in order to achieve adequate performance:
the large number of operations to perform and the large memory footprint. The former
is addressed by limiting the numerical operations to those that are strictly necessary to
achieve the requested precision: rigorous error control at each elementary operation is the
main strength of a MW approach, enabling effective screening. The latter is addressed by
algorithmic design: beyond a certain system size, not all data can be stored in local (i.e.
fast access) memory. On modern computers, data access is generally more time consuming
than the mathematical operations, especially if the data is not available locally [a]. The
computer implementation must be able to copy large amounts of data efficiently between
compute nodes, and the algorithm must be devised to reuse the data already available on
the compute node or in cache memory when possible. In this article, we present the
main implementation features of our MRA code, MRChem [5], that tackle those challenges,
thus enabling calculations on large systems at the Hartree–Fock (HF) level.
Beyond the effective thresholding that screens numerically negligible operations, the large
computational cost is addressed by parallelization either at the orbital level or through
real-space partitioning. This dual strategy minimizes the relative cost of communication
and the overall computational cost. Further, the most intensive mathematical operations
are expressed as matrix multiplications, which allows for optimal efficiency. Using localized
orbitals, the adaptivity of the MW description will significantly reduce the computational
effort required to compute the terms involving remote orbitals. Within a MRA framework,
the operators will exhibit an intrinsic sparsity, even if the orbitals have contributions from
remote regions. This is achieved because the different length scales are naturally separated
through the adaptive grid representation. This opens the way for a method that scales
linearly with system size (N), where the linearity arises naturally from the methodology,
rather than being obtained with ad hoc techniques, such as fast-multipole methods [6] for
Fock matrix construction, or purification of the density matrix [7].
The large memory requirement is addressed by storing the data in a distributed “memory
bank”, where orbitals can be sent and received independently by any CPU. The total size
of the memory bank is then only limited by the overall memory available on the entire
computer cluster. Benchmark calculations at the HF level show that MRChem is able
to describe systems with thousands of electrons. The implementation exhibits near-linear
scaling properties and it can also be competitive with state-of-the-art electronic structure
software based on LCAO methods.
[a] The online interactive chart at
https://colin-scott.github.io/personal_website/research/interactive_latency.html
(accessed 2022-10-29) is useful to understand the relative performance of on-chip vs.
off-chip memory accesses. In particular, it shows how the evolution of memory chips has
been lagging behind that of CPUs.
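The claim that formally quadratic terms can scale better once negligible contributions are screened can be illustrated with a toy count of orbital pairs. The sketch below is only an analogy: it assumes normalized one-dimensional Gaussians on an evenly spaced chain and an explicit overlap threshold, whereas in the MW approach the screening follows from the adaptive representation itself. The function name and its parameters are hypothetical.

```python
import numpy as np


def significant_pairs(n_orbitals, spacing=1.0, exponent=1.0, threshold=1e-6):
    """Count pairs i < j of chain-localized Gaussians with overlap above threshold."""
    centers = spacing * np.arange(n_orbitals)
    dist = np.abs(centers[:, None] - centers[None, :])
    # Overlap of two normalized Gaussians exp(-exponent * (x - c)^2) at distance d.
    overlap = np.exp(-0.5 * exponent * dist**2)
    # Keep the strict upper triangle so that each pair is counted once.
    return int(np.count_nonzero(np.triu(overlap > threshold, k=1)))


for n in (100, 200, 400):
    total = n * (n - 1) // 2
    print(n, total, significant_pairs(n))
```

Doubling the chain length roughly quadruples the total number of pairs but only about doubles the number of pairs that survive the threshold, because each orbital overlaps significantly with a fixed number of neighbours.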
2 Solving the Hartree–Fock equations with multiwavelets
We consider the Self-Consistent Field (SCF) equations of the HF method:
$$(T + V)\phi_i = \sum_{j}^{N_{occ}} F_{ij}\phi_j, \tag{1}$$
where the Fock matrix elements $F_{ij}$ are obtained as
$$F_{ij} = \langle \phi_i | T + V | \phi_j \rangle. \tag{2}$$
In the above equations, $T = -\frac{1}{2}\nabla^2$ is the kinetic energy operator, and the
potential $V$ contains the nuclear-electron attraction $V_{nuc}$, the Coulomb repulsion $J$,
and the exact exchange $K$. To solve the SCF equations within a MW framework, we rewrite
the differential equation (1) as an integral equation:
$$\phi_i = -2 H^{\mu_i} \star \Big[ V\phi_i - \sum_{j \neq i}^{N_{occ}} F_{ij}\phi_j \Big], \tag{3}$$
where $\star$ stands for the convolution product. This form is obtained by recalling that the
shifted kinetic energy operator $T - \epsilon$ has a closed-form Green’s function [8,9]:
$$(T - \epsilon)^{-1} = 2 H^{\mu}, \qquad H^{\mu} = \frac{e^{-\mu r}}{r}, \qquad \mu = \sqrt{-2\epsilon}. \tag{4}$$
By setting $\epsilon_i = F_{ii}$ and $\mu_i = \sqrt{-2F_{ii}}$, the diagonal term is excluded
from the summation in Eq. (3). It can be shown that iterating on the integral equation
corresponds to a preconditioned steepest descent. Practical applications show that this
approach is comparable in efficiency (as measured in the rate of convergence of the SCF
iterations) to more traditional methods [10].
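The step from Eq. (1) to Eq. (3) can be written out explicitly. The following is a short sketch that uses only the definitions above. It assumes $F_{ii} < 0$, so that $\mu_i$ is real, and it moves the diagonal term $F_{ii}\phi_i$ to the left-hand side before applying the Green's function of Eq. (4).

```latex
\begin{align*}
(T + V)\phi_i &= F_{ii}\phi_i + \sum_{j \neq i}^{N_{occ}} F_{ij}\phi_j \\
(T - F_{ii})\phi_i &= -\Big[ V\phi_i - \sum_{j \neq i}^{N_{occ}} F_{ij}\phi_j \Big] \\
\phi_i &= -(T - \epsilon_i)^{-1}\Big[ V\phi_i - \sum_{j \neq i}^{N_{occ}} F_{ij}\phi_j \Big],
  \qquad \epsilon_i = F_{ii} \\
\phi_i &= -2 H^{\mu_i} \star \Big[ V\phi_i - \sum_{j \neq i}^{N_{occ}} F_{ij}\phi_j \Big],
  \qquad \mu_i = \sqrt{-2\epsilon_i}
\end{align*}
```

The last line is Eq. (3): the minus sign comes from moving the potential and coupling terms to the right-hand side, and the factor 2 comes from Eq. (4).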
A MW representation does not have a fixed basis. It is therefore not possible to express
functions and operators as vectors and matrices of a predefined size, and the virtual space is
to be regarded as infinite. Still, each function has a finite, truncated representation defined
by the chosen precision threshold, but this representation will in general be different for
different functions. It is therefore necessary to develop working equations which allow the
direct optimization of orbitals. On the other hand, only occupied orbitals are constructed
and the coupling through the Fock matrix in Eqs. (1) and (3) is therefore limited in size.
Another advantage is that, to within the requested precision, the result can be formally
considered exact, and exploiting formal results valid in a complete basis, most notably the
Hellmann–Feynman theorem, becomes straightforward [11].
Differential operators such as the Laplacian pose a fundamental problem for function
representations which are not continuous, and a naive implementation leads to numerical
instabilities. Robust approximations are nowadays available [12], but for repeated iterations
avoiding the use of the Laplacian operator is still an advantage. This is the main reason for
using the integral form in Eq. (3) rather than the differential form.
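The iteration on the integral equation can be tried out on a one-dimensional toy problem. The sketch below is not MRChem code and makes several assumptions: a single orbital (so the Fock coupling sum is empty), a uniform grid instead of multiwavelets, the potential $V(x) = -\mathrm{sech}^2(x)$, whose exact ground-state energy is $-1/2$, and the one-dimensional kernel $e^{-\mu|x|}/\mu$ of $(T - \epsilon)^{-1}$ in place of the three-dimensional kernel of Eq. (4). The energy is updated from $\langle T \rangle = \frac{1}{2}\int |\phi'|^2 dx$ by finite differences, which is an illustrative shortcut only.

```python
import numpy as np


def energy_expectation(phi, v, dx):
    """Rayleigh quotient of T + V, with <T> = 1/2 * integral of |phi'|^2."""
    dphi = np.gradient(phi, dx)
    kinetic = 0.5 * np.sum(dphi**2) * dx
    potential = np.sum(v * phi**2) * dx
    return (kinetic + potential) / (np.sum(phi**2) * dx)


def solve_ground_state(x, v, n_iter=30):
    """Iterate phi <- -(T - energy)^(-1) [V phi] on a uniform grid."""
    dx = x[1] - x[0]
    dist = np.abs(x[:, None] - x[None, :])
    phi = np.exp(-0.5 * x**2)  # Gaussian starting guess
    energy = energy_expectation(phi, v, dx)
    for _ in range(n_iter):
        mu = np.sqrt(-2.0 * energy)  # requires a bound state, energy < 0
        kernel = np.exp(-mu * dist) / mu  # 1D Green's function of T - energy
        phi = -(kernel @ (v * phi)) * dx  # convolution as a matrix product
        phi /= np.sqrt(np.sum(phi**2) * dx)
        energy = energy_expectation(phi, v, dx)
    return energy, phi


x = np.linspace(-16.0, 16.0, 1601)
v = -1.0 / np.cosh(x) ** 2
energy, phi = solve_ground_state(x, v)
print(energy)
```

The update step itself applies no differential operator to the orbital: the new orbital is obtained purely by convolving $V\phi$ with the Green's function kernel. The energy is expected to approach the exact value $-1/2$ up to the discretization error of the grid.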
3 Implementation details and parallelization strategy
MW calculations can be computationally demanding. The total number of elementary
operations, the large amount of memory required, and the capacity of the network are all
important bottlenecks.
In practice, a supercomputer is required for calculations on large systems. For example,
the representation of one orbital can demand between 10 and 500 MB of data, depending
on the kind of orbital and the precision requested (see Section 4.2.1 for details). Moreover,
several sets of functions are necessary in a single SCF iteration (orbitals at the previous
iterations, right-hand sides of the equations, etc.). Large systems will eventually require more
memory than is locally available on a single compute node in a cluster, and distributed
data storage becomes mandatory. At the same time, the SCF equations will require pairs of
orbitals, i.e. all the data must at some point be accessible locally.
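The memory argument above can be turned into a back-of-the-envelope estimate. The sketch below uses the 10 to 500 MB per orbital range quoted in the text. The number of function sets held during an SCF iteration (3) and the memory per compute node (256 GB) are assumptions chosen for illustration only, and both function names are hypothetical.

```python
import math


def memory_estimate_gb(n_orbitals, mb_per_orbital, n_function_sets=3):
    """Total storage in GB for n_function_sets sets of n_orbitals functions."""
    return n_orbitals * mb_per_orbital * n_function_sets / 1000.0


def min_nodes(total_gb, node_memory_gb):
    """Smallest number of nodes whose combined memory holds total_gb."""
    return math.ceil(total_gb / node_memory_gb)


# 1000 orbitals at both ends of the quoted range; 256 GB per node is assumed.
for mb_per_orbital in (10, 500):
    total = memory_estimate_gb(1000, mb_per_orbital)
    print(mb_per_orbital, total, min_nodes(total, 256))
```

Under these assumptions, a thousand small orbitals fit on a single node, whereas a thousand large orbitals need about 1.5 TB and therefore have to be distributed over several nodes.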