
We point out that the need to solve a sequence of problems differs from solving
a single problem in several significant ways. First, solver set-up costs are typically
amortized over thousands of right-hand sides and are therefore largely irrelevant to our
cost concerns. Second, the solution is typically devoid of significant low wave-number
content because we solve only for a perturbed solution, $\delta u^m := u^m - \bar{u}$, where $\bar{u}$ is an
initial guess. If we take $\bar{u} = u^{m-1}$ then the initial residual $r_0 = b - A u^{m-1} = O(\Delta t)$.
This result is improved to $O(\Delta t^l)$ by projecting $u^m$ onto the space of prior solutions,
$\{u^{m-1}, \dots, u^{m-l}\}$ [8,16]. Finally, with an initially small residual, GMRES is likely to
converge in just a few iterations, which obviates the need for restarts and mitigates
the $O(k^2)$ complexity terms in a $k$-iteration GMRES solve. This latter observation
puts less pressure on requiring a symmetric preconditioner since one can retain the
full benefits of using Krylov subspace projection (KSP) without resorting to conjugate
gradient iteration. With these circumstances in mind, we will drop the superscript $m$
in the sequel.
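To illustrate the projection step, a minimal NumPy sketch is given below. It assumes $A$ is symmetric positive definite (as for the pressure Poisson problem), uses a simple Gram-Schmidt process in the $A$-inner product, and omits the stabilization details of [8,16]; the function and variable names are illustrative only.

def projected_initial_guess(A, b, prior_solutions, tol=1e-14):
    import numpy as np
    # Build an A-orthonormal basis of span{u^{m-1}, ..., u^{m-l}}
    # via Gram-Schmidt in the A-inner product (A assumed SPD).
    basis = []
    for u in prior_solutions:
        w = u.astype(float).copy()
        for v in basis:
            w -= (v @ (A @ w)) * v
        nrm = np.sqrt(w @ (A @ w))
        if nrm > tol:
            basis.append(w / nrm)
    # Since (u, v)_A = (b, v) whenever A u = b, the projection
    # coefficients are computable from the new right-hand side alone.
    u_bar = np.zeros_like(b, dtype=float)
    for v in basis:
        u_bar += (v @ b) * v
    return u_bar

With $\bar{u}$ in hand, one solves $A\,\delta u = b - A\bar{u}$ and sets $u^m = \bar{u} + \delta u$.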
We note that Chebyshev smoothers have received considerable attention recently. Kronbichler
and co-workers [20,12,11] have employed Chebyshev smoothing for discontinuous
Galerkin discretizations of the NS equations. Rudi and coworkers employ
algebraic multigrid (AMG) with Chebyshev smoothing [39]. Similarly, Chebyshev
smoothing is considered by Sundar and coworkers as a multigrid smoother for high-
order continuous finite element discretizations [43].
A major difference here is that we consider Chebyshev in conjunction with addi-
tive Schwarz methods (ASM) [45,13,24,35] and restrictive additive Schwarz (RAS)
[5] in place of point-Jacobi smoothing. The principal idea is to use ASM or RAS to
eliminate high wave number content. In the case of the spectral element method, local
Schwarz solves can be effected at a cost that is comparable to forward operator eval-
uation through the use of fast diagonalization [25,15,24]. Another critical aspect of
the current context is that many of our applications are targeting exascale platforms
and beyond, where computation is performed on tens of thousands of GPUs and the
relative cost of global communication, and hence of coarse-grid solves, is high [31]. In
such cases, it often pays to have high-quality, broad-bandwidth smoothing, such
as provided by Chebyshev, in order to reduce the number of visits to the bottom of
the V-cycle where the expensive coarse-grid solve is invoked.
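For context, the fast-diagonalization construction behind these local solves can be summarized as follows; the statement below is the standard two-dimensional form in generic notation, not necessarily that of [25,15,24]. If the local operator is separable,
\[
A_e \;=\; B_y \otimes A_x \,+\, A_y \otimes B_x,
\]
with $A_*$ and $B_*$ the one-dimensional stiffness and mass matrices, and if $S_*$ solves the generalized eigenproblem $A_* S_* = B_* S_* \Lambda_*$ with $S_*^T B_* S_* = I$, then
\[
A_e^{-1} \;=\; (S_y \otimes S_x)\,\bigl(I \otimes \Lambda_x + \Lambda_y \otimes I\bigr)^{-1}\,(S_y^T \otimes S_x^T),
\]
so each local Schwarz solve reduces to a few small tensor contractions and a diagonal scaling, at a cost comparable to that of a forward operator evaluation.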
Here, we explore a seemingly simple question: Given $2k$ smoothing iterations,
what is the optimal choice of $m$ pre-smoothing and $n$ post-smoothing applications,
where $m + n = 2k$? More specifically, in the Chebyshev context, the question is
what order $m$ pre-smoothing and order $n$ post-smoothing should be used at the same
cost per iteration. A further important question is, What kind
of Chebyshev smoothing should be used? One could use standard 1st-kind Chebyshev
polynomials with tuned parameters. (Recall, we can afford significant tuning over-
head.) Or, one could use standard or optimized 4th-kind Chebyshev polynomials that
were proposed in recent work by Lottes. (See [23] and references therein.) We explore
these questions under several different conditions: using finite difference and spectral
element discretizations, and using Jacobi, ASM, or RAS as the basic smoother.
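To fix ideas, the following sketch shows a standard 1st-kind Chebyshev smoother built on top of an underlying smoother $M^{-1}$ (Jacobi, ASM, or RAS). The recurrence is the common three-term form; the names and the eigenvalue bounds lam_min and lam_max (for $M^{-1}A$) are illustrative assumptions, and the 4th-kind and optimized 4th-kind variants of [23] differ in the recurrence coefficients rather than in this overall structure.

def chebyshev_smoother(A, M_inv, b, x, k, lam_min, lam_max):
    # Degree-k 1st-kind Chebyshev smoothing of A x = b.  A and M_inv are
    # callables applying the operator and the underlying smoother
    # (e.g., Jacobi, ASM, or RAS) to a vector.  lam_min, lam_max bound the
    # eigenvalues of M_inv(A(.)) targeted by the polynomial (tunable).
    theta = 0.5 * (lam_max + lam_min)   # center of the target interval
    delta = 0.5 * (lam_max - lam_min)   # half-width of the target interval
    sigma = theta / delta
    rho = 1.0 / sigma

    r = b - A(x)
    d = M_inv(r) / theta
    for _ in range(k - 1):
        x = x + d
        r = r - A(d)
        rho_new = 1.0 / (2.0 * sigma - rho)
        d = rho_new * rho * d + (2.0 * rho_new / delta) * M_inv(r)
        rho = rho_new
    return x + d

In practice the lower bound is often taken as a fraction of the estimated $\lambda_{\max}$ (e.g., an interval $[\lambda_{\max}/\alpha, \lambda_{\max}]$ for some $\alpha > 1$), so that the polynomial concentrates its damping on the upper part of the spectrum; the specific choice is one of the tunable parameters alluded to above.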
The structure of this paper is as follows. Section 2 outlines the multigrid V-cycle
and Chebyshev smoothers. 2D finite difference Poisson results on varying aspect ra-
tio grids, along with a comparison between theoretical and observed multigrid error
contraction rates, are presented in section 3. Spectral element (SE)-based pressure
Poisson preconditioning schemes implemented in the scalable open-source CFD code,
nekRS [14], are presented in section 4. nekRS started as a fork of libParanumal [7]
and uses highly optimized kernels based on the Open Concurrent Compute Abstrac-