dynamical core Jack D. Betteridge1 Colin J. Cotter1 Thomas H. Gibson26 Matthew J. Griﬃth34

2025-08-18 1 0 665.49KB 24 页 10玖币

侵权投诉

arXiv:2210.11797v2 [physics.comp-ph] 19 Jun 2023

Hybridised multigrid preconditioners for a compatible ﬁnite element

dynamical core

Jack D. Betteridge1, Colin J. Cotter1, Thomas H. Gibson2,6, Matthew J. Griﬃth3,4,

Thomas Melvin5, and Eike H. M¨uller3,*

1Imperial College, London, United Kingdom

2National Center for Supercomputing Applications, Urbana, IL, USA

3University of Bath, UK

4European Centre for Medium Range Weather Forecasts, Reading, UK

5Met Oﬃce, Exeter, United Kingdom

6Advanced Micro Devices, Inc., Austin, TX, United States

*Email: e.mueller@bath.ac.uk

June 21, 2023

Abstract

Compatible ﬁnite element discretisations for the atmospheric equations of motion have recently attracted consid-

erable interest. Semi-implicit timestepping methods require the repeated solution of a large saddle-point system of

linear equations. Preconditioning this system is challenging since the velocity mass matrix is non-diagonal, leading

to a dense Schur complement. Hybridisable discretisations overcome this issue: weakly enforcing continuity of

the velocity ﬁeld with Lagrange multipliers leads to a sparse system of equations, which has a similar structure

to the pressure Schur complement in traditional approaches. We describe how the hybridised sparse system can

be preconditioned with a non-nested two-level preconditioner. To solve the coarse system, we use the multigrid

pressure solver that is employed in the approximate Schur complement method previously proposed by the some

of the authors. Our approach signiﬁcantly reduces the number of solver iterations. The method shows excellent

performance and scales to large numbers of cores in the Met Oﬃce next-generation climate- and weather prediction

model LFRic.

keywords: Numerical methods and NWP, Dynamics, Atmosphere, Multigrid, Parallel Scalability, Linear Solvers,

Finite Element Discretisation, Hybridisation

1 Introduction

Over the past decade, ﬁnite element discretisations of the governing equations for atmospheric motion have become

increasingly popular. A majority of the focus has been centred on high-order continuous Galerkin (CG), spectral

element, and discontinuous Galerkin (DG) methods (for an overview, see [1,2,3,4,5,6,7]). More recently, a theory

of mixed-ﬁnite element discretisations, known as mimetic or compatible ﬁnite element methods [8,9,10,11,12,13], has

been developed within the context of simulating geophysical ﬂuid dynamics. Compatible ﬁnite element elements are

constructed from discrete spaces that satisfy an L2de-Rham co-homology in which diﬀerential operators surjectively

map from one space to another [14,15,16]. Discretisations arising from this framework have a long history in

both numerical analysis and applications ranging from structural mechanics to porous media ﬂows, see [17,18] for

a summary of contributions. The embedded property of these discrete spaces deﬁnes a ﬁnite element extension of

the Arakawa C-grid staggered ﬁnite diﬀerence approaches [19], while avoiding the need for discretising on structured,

orthogonal meshes (such as traditional latitude-longitude meshes). The realization made by Cotter and Shipton in

[8] that compatible ﬁnite element methods respect essential geostrophic balance relations for steady-state solutions of

the rotating shallow water equations, combined with other properties making them analogous to the Arakawa C-grid,

has galvanized an eﬀort to develop similar techniques for numerical weather prediction. This framework of compatible

ﬁnite elements is a core area of research in the UK Met Oﬃce’s eﬀort to redesign an operational dynamical core,

known as the Gung-Ho project. A key result of this project has been the development of the LFRic model [20,21],

which solves the compressible Euler equations for a perfect gas in a rotating frame. The model uses a compatible ﬁnite

element discretisation on a hexahedral grid based on tensor product spaces derived from the classical Raviart-Thomas

(RT) method [22].

1.1 Semi-implicit time discretisation of the compressible Euler equations

Atmospheric ﬂows are characterised by a separation of scales: although fast acoustic and inertia-gravity waves carry

very little energy, they need to be represented in the model since they drive slower large-scale balanced ﬂow through

non-linear interactions. As a result, the timestep size in a model based on fully explicit timestepping would be

prohibitively small due to the CFL condition imposed by the fast acoustic waves. To overcome this issue, the LFRic

model employs a semi-implicit time discretisation which treats the fast waves implicitly in all spatial directions. This

allows simulations with a signiﬁcantly larger timestep size, which can potentially reduce the overall model runtime.

The drawback of semi-implicit methods is that they require the solution of a non-linear system in every model

timestep. Since the implicit system is only mildly non-linear, it can be solved eﬃciently with a quasi-Newton-type

iteration, which in turn requires the repeated solution of a linear system. This linear solve is one of the computational

bottlenecks of the LFRic model. While compatible ﬁnite element discretisations have several useful properties for

geophysical ﬂow applications, the eﬃcient implementation of solver algorithms for the innermost linear system poses

a signiﬁcant challenge. This is primarily due to the fact that, unlike for standard ﬁnite diﬀerence discretisations, the

linear system takes the form of an indeﬁnite mixed problem with saddle-point structure. To deliver forecasts under

strict operational time constraints, solvers must eﬃciently scale to many cores on distributed-memory clusters while

being algorithmically robust.

1.2 Hybridisation and related work on preconditioners

There are a number of possible approaches for handling the linear saddle-point system arising from compatible ﬁ-

nite element discretisations. This includes H(div)-multigrid [23], requiring complex overlapping Schwarz smoothers;

auxiliary space multigrid [24]; and global Schur-complement preconditioners, requiring inverting both a velocity mass

matrix and an elliptic Schur-complement operator. So far, a pragmatic solution has been employed in LFRic: the

saddle-point system is preconditioned with an approximate Schur-complement method which is obtained by lumping

the velocity mass matrix [25,26]. The resulting elliptic Schur-complement system in the piecewise constant pressure

space is solved with a bespoke multigrid algorithm that exploits the tensor-product structure of the function spaces,

following [27]. The drawback of this approach is that it requires the repeated, expensive application of the saddle-point

operator for all four prognostic variables in a preconditioned Krylov-iteration.

To address this issue, we focus on a technique known as hybridisation [28,29] in this paper. Hybridisation utilises

element-wise static condensation to obtain a reduced problem on the mesh interfaces. Once solved, recovery of the

prognostic variables is achieved by solving purely local linear systems. Although the reduced, sparse system is smaller

than the original saddle-point system for the prognostic variables, it has to be solved eﬃciently. Non-nested algorithms

[30] have been proposed for the solution of linear systems that arise from the hybridisation of mimetic ﬁnite element

discretisations. The authors of [31] describe a non-nested multigrid method for the solution of the hybridisable mixed

formulation of a second order elliptic boundary value problem, while [32] use a similar approach for the DG discetisation

of the same equation. Both papers employ a conforming coarse level space based on piecewise linear elements; this

space allows the construction of a standard multigrid method with h-coarsening. The hybridised method has also

been employed to solve the linear system that arises in an IMEX-DG discretisation of the shallow water equations in

[33]. While the authors of [33] use a direct solver for the resulting elliptic system on the mesh interfaces, a non-nested

multigrid algorithm similar to [32] is used for the IMEX-DG discretisation of the shallow water equations in [34].

1.3 Main contributions

In this paper, we describe how a hybridisable discretisation of the compressible Euler equations in LFRic allows the

eﬃcient solution of the saddle-point system for the prognostic variables if it is combined with a non-nested multigrid

preconditioner. Our work is motivated by two scientiﬁc questions:

1. Can hybridisation be used to solve the saddle-point system in LFRic more eﬃciently?

2. What performance gains can be achieved by using bespoke non-nested multigrid preconditioners?

The new contributions of our work are:

i. We decribe how the exact elimination of density and potential temperature allows the construction of a re-

duced saddle-point system for Exner pressure and momentum; solving this system with the approximate Schur-

complement approach with a pressure multigrid preconditioner improves on the results in [25] and forms the

basis for our hybridisable discretisation.

ii. We derive a hybridisable discretisation of the compressible Euler equations and use static condensation to

construct an elliptic system for the unknowns on the mesh skeleton.

iii. We introduce a non-nested multigrid algorithm for the solution of this elliptic system; since the unknowns on

the mesh skeleton approximate pressure, we use the pressure multigrid algorithm from [25] to solve the piecewise

constant pressure equation on the coarse level of the multigrid hierarchy.

iv. We compare the performance of diﬀerent solver setups to demonstrate the impact of hybridisation and the use

of a multigrid preconditioner, thus providing empirical evidence to answer our two central scientiﬁc questions

above.

v. We demonstrate the excellent parallel scalability of our implementation on up to 13824 cores.

The choice of a piecewise constant coarse level space in the non-nested multigrid preconditioner is mainly motivated

by practical considerations, since it allows the re-use of the eﬃcient pressure multigrid algorithm in [25]. Although a

rigorous theoretical justiﬁcation of this approach is beyond the scope of this paper, our numerical results show that it

works very well in practice. While for the present, lowest order discretisation employed in LFRic, our hybridised solver

is not able to beat the optimised approximate Schur-complement approach in [25], we will argue in Section 4that this

picture is likely to change for higher-order discretisations. This is due to the dramatic increase in condition number

and problem size for the resulting mixed operators. In situations like this, it has been show that static condensation

approachs can provide signiﬁcant beneﬁt [35]. We therefore believe that the hybridised solver introduced in this paper

has signiﬁcant potential to accelerate semi-implicit compatible ﬁnite element models for atmospheric ﬂuid dynamics.

Structure. The rest of this paper is organised as follows: in Section 2we describe the mixed ﬁnite element discreti-

sation of the compressible Euler equations that is used in LFRic and show how potential temperature and density can

be eliminated to obtain a smaller linear system for Exner pressure and velocity only. We also explain how hybridisation

allows the construction of a sparse Schur complement system. After reviewing the approximate Schur-complement

solver and pressure multigrid algorithm from [25], a new non-nested multigrid algorithm for solving this system is

introduced in this section. Following a description of our implementation and the model setup, we compare the perfor-

mance of diﬀerent solver algorithms for a baroclinic test case in Section 3, which also contains results from massively

parallel scaling runs. We conclude and outline avenues for further work in Section 4. Further technical details on the

LFRic implementation and parameter tuning can be found in the appendices.

2 Methods

2.1 Model equations

To obtain an update for the non-linear outer iteration in a semi-implicit timestepping scheme we need to compute

increments in Exner pressure Π′, potential temperature θ′, density ρ′and velocity u′. This can be achieved by

linearising the compressible Euler equations around a suitable reference state and solving the following four equations:

[u′+τu∆t(µb

z(b

z·u′) + 2Ω×u′)] + τu∆tcp(θ′∇Π∗+θ∗∇Π′) = ru,(1a)

ρ′+τρ∆t∇ · (ρ∗u′) = rρ,(1b)

θ′+τθ∆tu′· ∇θ∗=rθ,(1c)

1−κ

Π′

Π∗−ρ′

ρ∗−θ′

θ∗=rΠ.(1d)

Here ∆tis the timestep size, τu, τρ, τθare relaxation parameters, the unit normal in the vertical direction is denoted

by b

z, and we have µ= 1 + τuτθ∆t2N2with the Brunt-V¨ais¨al¨a frequency N2. The right-hand sides ru,rρ,rθ, and

rΠare residuals obtained from the linearisation process. In LFRic the reference states θ∗,ρ∗and Π∗are the ﬁelds

at the previous timestep. The expressions on the right-hand side depend on the current non-linear iterate, but their

exact form is not relevant for the following derivation. The equations have to be solved in a thin spherical shell Ω,

with suitable boundary conditions at the top and bottom of the atmosphere. For a further discussion of the equations

we refer the reader to [36] and [25, Section 3.3]. It should be noted that in [25], Eqs. (1a) - (1d) are discretised to

obtain a 4 ×4 block matrix system, which is solved with a suitable preconditioned Krylov subspace method. Since

each application of the 4 ×4 block matrix is expensive, we pursue a diﬀerent approach here: by eliminating θ′and ρ′

ﬁrst, we obtain a smaller 2 ×2 block system for velocity u′and Exner pressure Π′only. This system is then solved

with a preconditioned Krylov subspace method or the hybridised approach described in Section 2.4, followed by the

reconstruction of θ′and ρ′from u′and Π′. Since the elimination and recovery of θ′and ρ′is only necessary once per

linear solve, this increases eﬃciency.

Analytic elimination of potential temperature. Eq. (1c) can be used to eliminate θ′from Eqs. (1a), (1b) and

(1d), leading to the following system of three equations:

u′+τu∆t(µb

z(b

z·u′) + 2Ω×u′)−τuτθ∆t2cp(u′· ∇θ∗)∇Π∗+τu∆tcpθ∗∇Π′=ru≡ru−τu∆tcp∇Π∗rθ,(2a)

ρ′+τρ∆t∇ · (ρ∗u′) = rρ,(2b)

1−κ

Π′

Π∗−ρ′

ρ∗+τθ∆tu′· ∇θ∗

θ∗=rΠ≡rΠ+rθ

θ∗.(2c)

2.2 Mixed ﬁnite element discretisation

To discretise Eqs. (2a) - (2c), we proceed as in [25, Section 3.2] and introduce ﬁnite element spaces on a hexahedral

grid Ωhcovering the domain Ω: W2is the Raviart-Thomas [37] space RTpand W3=QDG

pis the scalar discontinuous

Galerkin space, where in both cases pis the order of the discretisation. The velocity space W2=Wh

2⊕Wz

2can be

written as the direct sum of a space Wh

2, which only contains vectors pointing in the horizontal, and a space Wz

2with

purely vertical ﬁelds (see [25, Figure 1]). Recall further that the space Wz

2is continuous in the vertical direction,

but discontinuous in the horizontal direction, while Wh

2is horizontally continuous and vertically discontinuous. To

represent the reference proﬁle for potential temperature, a third space Wθis introduced. Wθis the scalar-valued

equivalent of Wz

2and has the same continuity, which is analogous to using a Charney-Phillips staggered vertical

discretisation. A further discussion of the function spaces can be found in [20].

We now assume that u′∈W2, Π′, ρ′,Π∗, ρ∗∈W3and θ∗∈Wθ. Multiplying Eqs. (2a) - (2c) by suitable test

functions w∈W2and φ, ψ ∈W3and integrating over Ωhleads to the following weak form of the problem:

˘m2(w,u′)−q22(w,u′) + g(w,Π′) = ru(w),

d(φ, u′) + m3(φ, ρ′) = rρ(φ),for all w∈W2and all φ, ψ ∈W3.

q32(ψ, u′)−mρ

3(ψ, ρ′) + mΠ

3(ψ, Π′) = rΠ(ψ),

(3)

Writing h·iΩhfor integrals over the mesh Ωhand h·iEhfor integrals over the mesh skeleton Eh, i.e.

hfiΩh=X

K∈ΩhZK

f dx, hgiEh=X

∂K∈EhZ∂K

g dS,

the eight bilinear forms on the left-hand side of Eq. (3) are

˘m2(w,u′) = hw·(u′+τu∆t(µb

z(b

z·u′) + 2Ω×u′))iΩh,

q22(w,u′) = τuτθ∆t2Dcp(b

z·w)∂zb

Π∗(b

z·u′)∂zθ∗EΩh

g(w,Π′) = τu∆tcph[[θ∗w]]{{Π′}}iEh− h∇ · (θ∗w)Π′iΩh,

d(φ, u′) = τρ∆thφ∇ · (ρ∗u′)iΩh,

m3(φ, ρ′) = hφρ′iΩh,

q32(ψ, u′) = τθ∆tψ∂zθ∗

θ∗(b

z·u′)Ωh

mρ

3(ψ, ρ′) = ψρ′

ρ∗Ωh

mΠ

3(ψ, Π′) = 1−κ

κψΠ′

Π∗Ωh

(4)

Since it is important for the discussion of hybridisation in Section 2.4, we explain the derivation of the weak form

g(·,·) in more detail. For this, consider the term θ∗∇Π′in Eq. (1a). The function space W3is discontinuous and the

derivative of Π′can not be evaluated directly. Instead, we integrate the term by parts after it has been multiplied by

a test function w∈W2. Considering a single cell Kof the mesh, this leads to two integrals:

θ∗w· ∇Π′dx =−ZK

∇ · (θ∗w) Π′dx +Z∂K

(θ∗w·n) Π′|∂K dS, (5)

where nis the unit outward normal vector at each point of the cell surface ∂K. In the surface integral, we need to

decide how to evaluate the ﬁeld Π′|∂K , which is not well-deﬁned since W3is discontinuous. The natural choice is to

take the average {{Π′}} := 1

2Π′

++ Π′

−of the ﬁelds in the two cells K+,K−that are adjacent to a particular facet.

Summing the right-hand side of Eq. (5) over all cells of the mesh then leads to the expression in g(·,·), namely

− h∇ · (θ∗w) Π′iΩh+h[[θ∗w]]{{Π′}}iEh.(6)

In this expression [[θ∗w]] ≡θ∗

+(w+·n+) + θ∗

−(w−·n−) and the subscripts “+” and “−” are again used to identify

quantities deﬁned on the two cells that are adjacent to a particular facet. Note that although here w+·n+=−w−·n−

as n+=−n−and the normal components of the velocity ﬁeld are continuous across facets, this is no longer true in

our hybridisable formulation (see Section 2.4).

To obtain the bilinear forms q22(·,·) and q32(·,·), the following approximations have been made: when evaluating

∇Π∗in Eq. (2a), the reference proﬁle is projected into the function space Wθto obtain b

Π∗∈Wθ. Since Wθis only

continuous in the vertical direction, the horizontal components of the derivatives ∇θ∗and ∇Π∗in Eqs. (2a) and

(2c) are dropped to obtain b

z∂zθ∗and b

z∂zb

Π∗. This is a sensible approximation because the reference proﬁles vary

predominantly in the vertical direction.

The linear forms on the right-hand side of Eq. (3) are given by ru(w) = hw·ruiΩh,rρ(φ) = hφrρiΩhand

rΠ(ψ) = hψrΠiΩh.

In general, the integrals in Eq. (4) are approximated by n-point Gaussian quadrature rules (with n=p+ 3 for

an element of order p), in each coordinate direction for the volume integrals h·iΩhand in two dimensions restricted to

each facet for the facet integrals h·iEh. This value of nis chosen as it is generally suﬃcient to integrate the terms in

Eq. (4) exactly on an aﬃne mesh. However, since it desirable for the equation of state rΠin Eq. (2c) to hold exactly,

the bilinear forms q32,mρ

3and mΠ

3in Eq. (4) are instead evaluated at the nodal points of the W3space, such that the

diagnostic relationship rΠbetween ρ, θ, Π holds exactly at these nodal points. For p= 0 elements this can be simply

enforced by using a one-point quadrature rule with the quadrature point located in the cell centre. Since for p= 0 the

test functions ψ∈W3are constant, this is equivalent to enforcing Eq. (2c) in strong form at the quadrature points.

Matrix formulation. Choosing suitable basis functions vj(χ)∈W2,σj(χ)∈W3that depend on the spatial

coordinate χ∈Ωh, the ﬁelds u′,ρ′and Π′can be expanded as

u′(χ) = X

jeu′

jvj(χ)∈W2, ρ′(χ) = X

jeρ′

jσj(χ)∈W3,Π′(χ) = X

Π′

jσj(χ)∈W3,(7)

where for each quantity a′the corresponding dof (degrees of freedom)-vector is given by ea′= [a1, a2,...]. This allows

expressing the weak form in Eq. (3) as a 3 ×3 matrix equation for the dof-vectors eu′,eρ′and e

Π′:





M2−Q22 0G

D M30

Q32 −Mρ

3MΠ





eu′

eρ′

Π′

=

Ru

Rρ

RΠ

.(8)

The entries of the matrices ˘

M2,Q22,G,D,M3,Q32,Mρ

3and MΠ

3are obtained by evaluating the bilinear forms in

Eq. (4) for the basis functions, for example ( ˘

M2)ij = ˘m2(vi,vj) and Gij =g(vi, σj). Similarly, the components of the

residual (dual) vectors Ru,Rρand RΠon the right-hand side of Eq. (8) are obtained by evaluating the linear forms

ru,rρand rΠfor the basis functions, for example (Ru)i=ru(vi).

Elimination of density. Finally, the density eρ′is eliminated algebraically from Eq. (8) by using e

ρ′=M−1

3(Rρ−

u′) to obtain a 2 ×2 matrix system for velocity eu′and Exner pressure e

Π′only:

˘

M2−Q22 G

Q32 +Mρ

3M−1

3D MΠ

3eu′

Π′=Ru

RΠ≡Ru

RΠ+Mρ

3M−1

3Rρ(9)

In the next section, we review a suitable preconditioned Krylov subspace method for solving Eq. (9) (namely the

approximate Schur complement pressure multigrid method in [25]) and then introduce an alternative solution method

based on hybridisation in Section 2.4.

2.3 Pressure multigrid for approximate Schur complement solver

The traditional approach for solving Eq. (9), which is pursued in the ENDGame dynamical core [36], is to eliminate

the velocity eu′and then solve the resulting elliptic Schur complement equation for the Exner pressure increment e

Π′.

For the ﬁnite element discretisation considered here, however, the matrix ˘

M2−Q22 is not diagonal. This makes this

approach unfeasible since the resulting Schur complement would be dense because it contains the inverse of ˘

M2−Q22.

As in [25], we address this issue by constructing an approximate Schur complement which is sparse, and then using this

to precondition a Krylov-subspace iteration over the mixed system in Eq. (9). To construct the approximate Schur

complement, we replace ˘

M2−Q22 by a suitable lumped, diagonal approximation ˚

M2, which results in the following

2×2 system: ˚

M2G

Q32 +Mρ

3M−1

3D MΠ

3eu′

Π′=Ru

RΠ.(10)

Eq. (10) has a sparse Schur complement and can be solved as follows:

Step 1: Compute the modiﬁed right-hand side

B=RΠ−Q32 +Mρ

3M−1

3D˚

M−1

2Ru(11)

Step 2: Solve the sparse elliptic system

Π′=Bwith H=−Q32 +Mρ

3M−1

3D˚

M−1

2G+MΠ

3(12)

文档加载中……请稍候！
如果长时间未打开，您也可以点击刷新试试。

下载文档到电脑，查找使用更方便

10 玖币 0人已下载

立即下载

摘要：

arXiv:2210.11797v2[physics.comp-ph]19Jun2023HybridisedmultigridpreconditionersforacompatibleﬁniteelementdynamicalcoreJackD.Betteridge1,ColinJ.Cotter1,ThomasH.Gibson2,6,MatthewJ.Griﬃth3,4,ThomasMelvin5,andEikeH.M¨uller3,*1ImperialCollege,London,UnitedKingdom2NationalCenterforSupercomputingApplication...

展开>> 收起<<

dynamical core Jack D. Betteridge1 Colin J. Cotter1 Thomas H. Gibson26 Matthew J. Griﬃth34.pdf

共24页,预览5页

还剩页未读，继续阅读

声明：本站为文档C2C交易模式，即用户上传的文档直接被用户下载，本站只是中间服务平台，本站所有文档下载所得的收益归上传人(含作者)所有。玖贝云文库仅提供信息存储空间，仅对用户上传内容的表现方式做保护处理，对上载内容本身不做任何修改或编辑。若文档所含内容侵犯了您的版权或隐私，请立即通知玖贝云文库，我们立即给予删除！

dynamical core Jack D. Betteridge1 Colin J. Cotter1 Thomas H. Gibson26 Matthew J. Griﬃth34

相关推荐

开通VIP享超值会员特权

作者详情

相关内容

热门标签

举报选择: