Optimal Chebyshev Smoothers and One-sided V-cycles

2025-04-29 0 0 3.81MB 28 页 10玖币
侵权投诉
OPTIMAL CHEBYSHEV SMOOTHERS AND ONE-SIDED V-CYCLES
MALACHI PHILLIPSAND PAUL FISCHER
Abstract. The solution to the Poisson equation arising from the spectral element discretization
of the incompressible Navier-Stokes equation requires robust preconditioning strategies. One such
strategy is multigrid. To realize the potential of multigrid methods, effective smoothing strategies
are needed. Chebyshev polynomial smoothing proves to be an effective smoother. However, there
are several improvements to be made, especially at the cost of symmetry. For the same cost per
iteration, a symmetric V-cycle with korder Chebyshev polynomial smoothing may be substituted
with a one-sided V-cycle with order 2kChebyshev polynomial smoothing, wherein the smoother is
omitted on the up-leg of the V-cycle. The choice of omitting the post-smoother in favor of higher order
Chebyshev pre-smoothing is shown to be advantageous in cases where the multigrid approximation
property constant, C, is large. Results utilizing Lottes’s fourth-kind Chebyshev polynomial smoother
are shown. These methods demonstrate substantial improvement over the standard Chebyshev
polynomial smoother. The authors demonstrate the effectiveness of this scheme in p-geometric
multigrid, as well as a 2D model problem with finite differences.
Key words. multigrid, smoothers, parallel computing
MSC codes. 76D05, 65N55, 65F08
1. Introduction. Chebyshev smoothing was introduced in the context of paral-
lel multigrid (MG) methods in [1], where it was established that Chebyshev smoothing
was competitive with Gauss-Seidel smoothing even in serial computing applications.
Here, we explore several variations on Chebyshev smoothing for the Poisson problem
in general domains. Our primary aim is to develop fast highly-scalable solvers for the
pressure sub-step in time advancement of the Navier-Stokes (NS) equations, partic-
ularly for discretizations based on the spectral element method (SEM). Many of the
finding, however, would apply in more general settings.
Our target problem is to solve a sequence of Poisson problems,
−∇2˜u=˜
ffor ˜u, ˜
flRd7→ lR.(1.1)
The weak formulation is written as: find um(x)XN
0⊂ H1
0such that
Z
v· ∇u dV =Z
v fmdV vXN
0,(1.2)
where fm(x) is the data and um(x) is the corresponding solution field at some time
instant tm,m= 1,2, . . . Here, lRdis the computational domain in d(=1, 2, or
3) space dimensions; H1
0(Ω) is the standard Sobolev space comprising functions that
vanish on a subset of the boundary, DΩ, are square-integrable on Ω, and whose
gradient is also square-integrable; and XN
0= span{φj(x)}is the finite-dimensional
trial/test space associated with a Galerkin formulation of the Poisson problem. The
discrete problem statement is expressed as Aum=bm, where umis the vector of basis
coefficients at tmand Ais the symmetric-positive definite (SPD) matrix with
aij =Z
φi· ∇φjdV.(1.3)
This work was funded by the Exascale Computing Project under contract no. 17-SC-20-SC
Department of Computer Science, University of Illinois at Urbana-Champaign, Urbana IL 61801
(malachi2@illinois.edu).
Department of Mechanical Science and Engineering, University of Illinois at Urbana-Champaign,
Urbana IL 61801 (fischerp@illinois.edu).
1
arXiv:2210.03179v2 [math.NA] 23 Feb 2023
2M. PHILLIPS AND P. FISCHER
We point out that the need to solve a sequence of problems differs from solving
a single problem in several significant ways. First, solver set-up costs are typically
amortized over thousands of right-hand sides and are therefore largely irrelevant to our
cost concerns. Second, the solution is typically devoid of significant low wave-number
content because we solve only for a perturbed solution, δum:= um¯u, where ¯uis an
initial guess. If we take ¯u=um1then the initial residual r0=bAum1=O(∆t).
This result is improved to O(∆tl) by projecting umonto the space of prior solutions,
{um1. . . uml}[8,16]. Finally, with an initially small residual, GMRES is likely to
converge in just a few iterations, which obviates the need for restarts and mitigates
the O(k2) complexity terms in a k-iteration GMRES solve. This latter observation
puts less pressure on requiring a symmetric preconditioner since one can retain the
full benefits of using Krylov subspace projection (KSP) without resorting to conjugate
gradient iteration. With these circumstances in mind, we will drop the superscript m
in the sequel.
We note that Chebyshev smoothers have gained a lot of attention recently. Kro-
nbichler and co-workers [20,12,11] have employed Chebyshev smoothing for discon-
tinuous Galerkin discretizations of the NS equations. Rudi and coworkers employ
algebraic multigrid (AMG) with Chebyshev smoothing [39]. Similarly, Chebyshev
smoothing is considered by Sundar and coworkers as a multigrid smoother for high-
order continuous finite element discretizations [43].
A major difference here is that we consider Chebyshev in conjunction with addi-
tive Schwarz methods (ASM) [45,13,24,35] and restrictive additive Schwarz (RAS)
[5] in place of point-Jacobi smoothing. The principal idea is to use ASM or RAS to
eliminate high wave number content. In the case of the spectral element method, local
Schwarz solves can be effected at a cost that is comparable to forward operator eval-
uation through the use of fast diagonalization [25,15,24]. Another critical aspect of
the current context is that many of our applications are targeting exascale platforms
and beyond, where compute is performed on tens of thousands of GPUs for which the
relative cost of global communication and hence, coarse-grid solves, is high [31]. In
such cases, it often pays to have high-quality and broad bandwidth smoothing, such
as provided by Chebyshev, in order to reduce the number of visits to the bottom of
the V-cycle where the expensive coarse-grid solve is invoked.
Here, we explore a seemingly simple question: Given 2ksmoothing iterations,
what is the optimal choice of mpre-smoothing and npost-smoothing applications,
where m+n= 2k?More specifically, in the Chebyshev context, the question is
what order mpre-smoothing and order npost-smoothing should be used at the same
cost per iteration. An additional and important point to this question is, What kind
of Chebyshev smoothing should be used? One could use standard 1st-kind Chebyshev
polynomials with tuned parameters. (Recall, we can afford significant tuning over-
head.) Or, one could use standard or optimized 4th-kind Chebyshev polynomials that
were proposed in recent work by Lottes. (See [23] and references therein.) We explore
these questions under several different conditions: using finite differences and spectral
elements discretizations and using Jacobi, ASM, or RAS as the basic smoother.
The structure of this paper is as follows. Section 2 outlines the multigrid V-cycle
and Chebyshev smoothers. 2D finite difference Poisson results on varying aspect ra-
tio grids, along with a comparison between theoretical and observed multigrid error
contraction rates, are presented in section 3. Spectral element (SE)-based pressure
Poisson preconditioning schemes implemented in the scalable open-source CFD code,
nekRS [14], are presented in section 4. nekRS started as a fork of libParanumal [7]
and uses highly optimized kernels based on the Open Concurrent Compute Abstrac-
OPT. CHEBY. SM. AND V-CYCLE 3
tion (OCCA) [29]. Special focus is given to performance on large-scale GPU-based
platforms such as OLCF’s Summit. Cases for the stationary Poisson and pressure
Poisson problem arising from the NS equations are shown in section 5. Results for
the cases in section 5 are shown in section 6.Section 7 concludes the paper.
2. Multigrid and Chebyshev Smoothers. Let us consider the V-cycle algo-
rithm to solve the SPD matrix A. Suppose that levels j= 0, . . . , ` are used in the
V-cycle, with A=A0. Let us denote the interpolation operator mapping entries from
grid j+ 1 to jby Pj
j+1 for j= 0, . . . , ` 1. The sequence of matrices corresponding
to each level are typically constructed in a Galerkin fashion, with
(2.1) Aj+1 =Pj
j+1TAjPj
j+1, j = 0, . . . , ` 1.
The multiplicative error propagator for a single V-cycle is given recursively by
Ej=IM1
jAj
=G0
jIPj
j+1M1
j+1 Pj
j+1TAjGj, j = 0, . . . , ` 1,(2.2)
with M1
`:=A1
`.Gjand G0
jare the smoother iteration matrices for the pre- and
post-smoothing iteration matrices, respectively. For example, Gj= (IωSjAj)k
corresponds to ksteps of the simple smoothing iteration
(2.3) xi+1j= (xi)j+ωSj(bjAj(xi)j),
where Sjis the smoother for the jth level, such as Jacobi with Sj= diag(Aj)1Aj.G0
j
and Gjare not necessarily the same – in fact, we will consider the choice of G0
j=Gj
(symmetric post-smoothing) as well as G0
j=I(omitting post-smoothing), among
others.
For later use, we will need to consider the one-sided V-cycle, which is sufficient
for the analysis of general V-cycles [28,23]. Similar as before, the “fine-to-coarse”
(E&)j=I(M&)1
jAj
=IPj
j+1 (M&)1
j+1 Pj
j+1TAjGj, j = 0, . . . , ` 1,(2.4)
and “coarse-to-fine”
(E%)j=I(M%)1
jAj
=G0
jIPj
j+1 (M%)1
j+1 Pj
j+1TAj, j = 0, . . . , ` 1,(2.5)
error propagators are defined, with (M&)1
`= (M%)1
`=A1
`. Let E&= (E&)0and
E%= (E%)0. The general V-cycle, therefore, is the product of the “fine-to-coarse”
and “coarse-to-fine” error propagators,
(2.6) EV=E%E&.
Given kjiterations of the smoothing iteration in (2.3) for level j, is it possible
to construct a better order kjpolynomial than pkj(SjAj) = Gkj= (IωSjAj)kj?
Following [40,1], we wish to solve:
(2.7) min
pkPk,pk(0)=1 max
λ[λminmax ]|p(t)|.
4M. PHILLIPS AND P. FISCHER
where Pkis the space of all polynomials with degree less than or equal to k. Taking
Eto be the interval [λmin, λmax], the solution to the minimax problem in (2.7) are
the shifted and scaled Chebyshev polynomials of the 1st-kind
(2.8) ˆ
Tk(λ) = 1
σk
Tkθλ
δwith σk:=Tkθ
δ.
Tk(·) is the Chebyshev polynomial of the 1st-kind of order k;θis the midpoint of the
interval [λmin, λmax],
θ=λmin +λmax
2;
and δis the mid-width of the interval,
δ=λmax λmin
2.
The Chebyshev polynomials of the 1st-kind enjoy a three-term recurrence relation
that is used to derive Algorithm 2.1 [40]. While λmax is taken to be the largest
eigenvalue of SjAj, how should λmin be chosen? λmin is chosen as a small factor of
λmax. Previous works considered factors such as 1/30 [1], 3/10 [2], 1/4 [43], and 1/6
[47]. Two approaches utilizing the 1st-kind Chebyshev smoother are considered in
this study. The first uses λmin = 0.1λmax and is denoted as 1st-Cheb. The second
optimizes the for λmin and is denoted as 1st-Cheb, λopt
min.
Algorithm 2.1 Chebyshev smoother, 1st-kind
θ=1
2(λmax +λmin), δ=1
2(λmax λmin), σ=θ
δ,ρ0=1
σ
x0=x, r0=S(bAx0), d0=1
θr0
for i= 1,...,k1do
xi=xi1+di1
ri=ri1
SAdi1,ρi=1
2σρi1
di=ρiρi1di1+2ρi
δri
end for
xk=xk1+dk1
return xk
Is it possible to further improve the polynomial smoother and remove the ad-hoc
λmin parameter? The polynomial smoother can be chosen in such a way to minimize
an error bound [23], such as the two-level bound proposed by Hackbusch [17] in (2.9).
Without loss of generality, let ρ(SA) = 1. Let G=Gk(SA) be a k-order polynomial
in SA. The two-level bound from Hackbusch [17] is re-written as [23]:
kE&kA=
(IP A1
cPTA)Gk
A
C1/2
0sup
01
λ1/2|pk(λ)|,(2.9)
where C0is the multigrid approximation property constant, which for a given level j
OPT. CHEBY. SM. AND V-CYCLE 5
is
Cj:=
A1
jPj
j+1A1
j+1 Pj
j+1T
2
Aj,Sj
:= sup
kfkSj
1
A1
jPj
j+1A1
j+1 Pj
j+1Tf
2
Aj
.(2.10)
Solving the weighted minimax problem in (2.9) yields the solution [23]:
(2.11) pk(λ) = 1
2k+ 1Wk(1 2λ),
where Wkis the Chebyshev polynomial of the 4th-kind of order k. Chebyshev poly-
nomials of the 4th-kind satisfy the same recurrence as that of the first-kind, with
different initial conditions [23,27]:
(2.12) Wn(x) = 2xWn1(x)Wn2(x) with W0(x)=1, W1(x) = 2x+ 1.
Following this recurrence and relaxing the ρ(SA) = 1 assumption, a similar iterative
algorithm to Algorithm 2.1 can be derived as Algorithm 2.2. The inclusion of the βi
parameters in Algorithm 2.2 comes as a result of optimizing (2.14) in the context of the
multi-level error bound from Lemma 2.1 [23]1. For the standard 4th-kind Chebyshev
polynomial, βi:= 1 for all i. The simple smoothing iteration with ω= 3/2, 1st-kind
Chebyshev smoothing with λmin = 0.3λmax, and the 4th-kind Chebyshev smoothing
polynomials, with and without optimal βi, are shown in Figure 1. Minimizing (2.14)
with respect to λmin in the 1st-kind Chebyshev smoothing iteration is also shown.
Algorithm 2.2 Chebyshev smoother, (Opt.) 4th-kind
x0=x,r0=bAx0
d0=4
3
1
λmax
Sr0
for i= 1,...,k1do
xi=xi1+βidi1, ri=ri1
Adi1
di=2i1
2i+ 3 di1+8i+ 4
2i+ 3
1
λmax
Sri
end for
xk=xk1+βkdk1
return xk
Given 2ksmoothing steps, what is the optimal way to distribute the number of
smoothing steps for the pre-smoother, m, and the post-smoother, n, with m+n= 2k?
The multi-level error bound in Lemma 2.1, developed by Lottes in [23], provides a
theoretical framework to answer the question.
Lemma 2.1. Let the smoother iteration (on each level j) be given by
Gj=pkj(SjAj)
where Sjis SPD and scaled such that ρ(SjAj) = 1, and pkj(x)is a kj-order polynomial
satisfying pkj(0) = 1 and |pkj(x)|<1for 0< x 1, possibly different on each level.
1Tabulated βicoefficients for the 4th-kind Chebyshev polynomials are available in Table 5.
摘要:

OPTIMALCHEBYSHEVSMOOTHERSANDONE-SIDEDV-CYCLESMALACHIPHILLIPSyANDPAULFISCHERyzAbstract.ThesolutiontothePoissonequationarisingfromthespectralelementdiscretizationoftheincompressibleNavier-Stokesequationrequiresrobustpreconditioningstrategies.Onesuchstrategyismultigrid.Torealizethepotentialofmultigrid...

展开>> 收起<<
Optimal Chebyshev Smoothers and One-sided V-cycles.pdf

共28页,预览5页

还剩页未读, 继续阅读

声明:本站为文档C2C交易模式,即用户上传的文档直接被用户下载,本站只是中间服务平台,本站所有文档下载所得的收益归上传人(含作者)所有。玖贝云文库仅提供信息存储空间,仅对用户上传内容的表现方式做保护处理,对上载内容本身不做任何修改或编辑。若文档所含内容侵犯了您的版权或隐私,请立即通知玖贝云文库,我们立即给予删除!
分类:图书资源 价格:10玖币 属性:28 页 大小:3.81MB 格式:PDF 时间:2025-04-29

开通VIP享超值会员特权

  • 多端同步记录
  • 高速下载文档
  • 免费文档工具
  • 分享文档赚钱
  • 每日登录抽奖
  • 优质衍生服务
/ 28
客服
关注