Parallel-in-Time with Fully Finite Element Multigrid for 2-D Space-fractional Diffusion Equations

Parallel-in-Time with Fully Finite Element Multigrid for 2-D Space-fractional Diffusion Equations

[
Abstract

The paper investigates a non-intrusive parallel time integration with multigrid for space-fractional diffusion equations in two spatial dimensions. We firstly obtain a fully discrete scheme via using the linear finite element method to discretize spatial and temporal derivatives to propagate solutions. Next, we present a non-intrusive time-parallelization and its two-level convergence analysis, where we algorithmically and theoretically generalize the MGRIT to time-dependent fine time-grid propagators. Finally, numerical illustrations show that the obtained numerical scheme possesses the saturation error order, theoretical results of the two-level variant deliver good predictions, and significant speedups can be achieved when compared to parareal and the sequential time-stepping approach.

Yue X Q et. al.]X. Q. Yue, S. Shu, X. W. Xu, W. P. Bu and K. J. Pan111Corresponding author.

  • Parallel-in-Time with Fully Finite Element Multigrid for 2-D Space-fractional Diffusion Equations

  • [

  •  School of Mathematics and Computational Science, Hunan Key Laboratory for Computation and Simulation in Science and Engineering, Xiangtan University, Hunan 411105, China
     Institute of Applied Physics and Computational Mathematics, Beijing 100088, China
     School of Mathematics and Statistics, Central South University, Hunan 410083, China

  • AMS subject classifications: 35R11, 65F10, 65F15, 65N55

  • Key words: Space-fractional diffusion equations, space-time finite element, multigrid-in-time, parallel computing.

1 Introduction

In recent years, mathematical models with fractional derivatives and integrals attract a wide interest of scientists in a variety of fields (including physics, biology and chemistry, etc.), owing to their potences in descriptions of memory and heredity [1]. Particularly, fractional diffusion equations are shown to afford investigations on subdiffusive phenomena and Lévy fights [2]. In this article, we are interested in a class of two-dimensional space-fractional diffusion equations (SFDEs)

(1.1)
(1.2)
(1.3)

with orders , , constants , , solution domain , and Riesz fractional derivatives

where

Since closed-form analytical solutions of fractional models are rarely accessible in practice, the numerical solutions become very prevalent to empower their successful applications. In literatures, numerical methods of SFDEs proposed to achieve high accuracy and efficiency include finite difference [3, 4, 5, 6, 7, 8, 9, 10], finite element (FE) [11, 12, 13, 14, 15, 16], finite volume [17, 18, 19] and spectral (element) [20, 21, 22] methods. It must be emphasized that no matter which discretization is applied, there usually persists intensive computational task in nonlocality caused by fractional differential operators [23]. Numerous scholars are working to identify fast algorithms most appropriate to tackle this challenge, see [24, 25, 26, 27, 28, 29] and related references therein. Except for these fast solutions, parallel computing can be viewed as another potential technique or even a basic strategy. Gong et al. presented MPI-based and GPU-based parallel algorithms for one-dimensional Riesz SFDEs [30, 31], whose speedups are both achieved by spatial parallelism with sequential time-stepping approach, using some time propagator to integrate from one time to the next. However, future computing speed must rely on the increased concurrency provided by more, instead of faster, processors. An immediate consequence of this is that solution algorithms, limited to spatial parallelism, for problems with evolutionary behavior entail long overall computation time, often exceeding computing resources available to resolve multidimensional SFDEs. Thus, algorithms achieving parallelism in time have been of especially high demand over the past decade. Currently, parareal in time [32] and multigrid-reduction-in-time (MGRIT) [33] are two active choices. Wu et al. analyzed convergence properties of the parareal algorithm for SFDEs via certain underlying ODEs in constant time-steps [34, 35], but they uninvolved large-scale testings. Observe that parareal can be interpreted as a two-level multigrid (reduction) method [33, 36], its concurrency is still limited because of the large sequential coarse-grid solve. MGRIT, a non-intrusively and truly multilevel algorithm implemented in the open-source XBraid [37] with optimal parallel communication, counteracts this and enables us to approximate simultaneously the evolution over all time points. The non-intrusive nature of MGRIT relies upon modalities of fine and coarse time-grid propagators rather than their internals. It has been proven to be effective and analyzed sharply in the two-level setting for integer order basic parabolic and hyperbolic problems [38], however, with limitations on analysis that they only consider linear problems and the temporal grid is uniform, i.e., fine time-grid propagators are all the same. Furthermore, from the survey of references, there are no calculations of the MGRIT algorithm to SFDEs, nor to FE discretizations of parabolic and hyperbolic problems in time.

The main aim of the paper is to propose and analyze a non-intrusive optimal-scaling MGRIT algorithm for space-time FE discretizations of problem (1.1)-(1.3) in uniform and nonuniform temporal partitions, where we shall extend the scope of the MGRIT method and develop a library of MGRIT modifications to time-dependent propagators. The outline of our presentation proceeds as follows. In Section 2, we derive a fully discrete FE scheme. Section 3 introduces the MGRIT solver followed by its two-level convergence analysis to the obtained scheme. In Section 4, we report some numerical results to illustrate optimal convergence rates in both space and time, present theoretical confirmations and analyze weak and strong scaling studies to show benefits. Finally, relevant results are summarized and follow-up work are drawn in Section 5.

2 Space-time FE discretization for SFDEs

This section deals with the construction of our space-time FE scheme for SFDEs. Here we denote by and the inner product and its associated norm on . First, we introduce some fractional derivative spaces.

Definition 2.1.

(Left and right fractional derivative spaces) For a given constant , define norms

Let and be closures of in regard to and , respectively.

Definition 2.2.

(Fractional Sobolev space) For a given constant , define the norm

where is the Fourier transform of . Let be the closure of in norm sense.

Remark 2.1.

The mathematical equivalence among , and with related norms can be proved to the case where , analogously to the work [39].

Utilizing Lemma 5 in [11], the weak formulation of problem (1.1)-(1.3) is derived in reference to : given , and , hunting for subject to and

(2.1)

where

(2.2)

For the depiction of our space-time FE discretization, we define a (possibly nonuniform) temporal partition and a uniform spatial triangulation with constant spacings and , let be the set of interior mesh points in , denote the -th subintervals and for , and . We choose the usual spaces in tensor products

where , and with as the space of all polynomials of degree .

Now a fully discrete FE approximation for (2.1) is singled out immediately: for , to seek the solution satisfying and

(2.3)

where the function approximates to the initial , and

Note that and can be respectively written as and , with being the Lagrange linear shape function at point , as well as

By simple algebraic calculations, we deduce

Hence, to compute satisfying Eq. (2.3) we solve

(2.4)

where initial guess with , vector with , mass matrix takes the block-tridiagonal form

with

stiffness matrices

and

(2.5)

with the permutation matrix produced in terms of the identity matrix by

Toeplitz matrices () whose entries are in forms

and

Remark 2.2.

It is worthwhile to point out that is introduced for a more economical memory requirement, since is naturally of the full block form, while (2.5) computes without explicit generation of or .

3 Time-parallelization and its two-level convergence analysis

This section is devoted to the time-parallelization for the forward time-marching loop (2.4) as well as the two-level convergence analysis.

3.1 The MGRIT algorithm

Consider the forward problem (2.4), a one-step method of Eq. (1.1), which is equivalent to the block unit lower bidiagonal system

(3.1)

where the -th time-grid propagator

(3.2)

and

Obviously, the inverse in Eq. (3.2) corresponds to a spatial solve.

Figure 1: Schematic of the update sequence during FCF-relaxation with the coarsening factor .

Regarding the MGRIT algorithm to solve the global space-time problem (3.1), various components have to be chosen. Let be the coarsening factor in time and , we define a coarse mesh , . In this setting, all are C-points and the others are F-points. FCF-relaxation depicted in Fig. 1, an initial F-relaxation followed by a C-relaxation and then a second F-relaxation, is often the most reliable choice to produce optimal and scalable multilevel iterations [33]. Define the injection at C-points as our restriction operator, and the injection followed by an F-relaxation over the fine-grid operator as our interpolation operator. The multilevel hierarchy can be constructed by applying the above processes recursively. Both sequential time-stepping and MGRIT are in terms of spatial solve, but MGRIT is highly concurrent. About more processors are actually required in temporal concurrency to outweigh the extra work of MGRIT, where is the number of MGRIT iterations [33]. Below is the MGRIT V-cycle algorithm, where , , , , and () correspond to the -th coarse-scale time re-discretization, restriction and interpolation, respectively.

Algorithm 1.

MGRIT algorithm with V-cycle: .

Step 1

Apply FCF-relaxation to .

Step 2

Restrict the residual .

Step 3

If , then solve ;
Else perform .

Step 4

Do the coarse-grid correction .

Remark 3.1.

To save computational work, Step 4 of Algorithm 1 is done by just correcting C-point values, and updating F-point values only when the Euclidean norm of the residual is small enough. The reason is that the correction at F-points is equivalent to an F-relaxation, which will be performed in Step 1 of the subsequent iteration.

3.2 Implementation details

For numerical experiments, the general-purpose parallel-in-time library XBraid [37] was employed. Eq. (3.1) by uniform temporal partitions can be solved by XBraid in a straightforward way, whereas modifications on XBraid was done to time-dependent propagators as part of this study. Wrapper routines were written in C and integrals were calculated by a quadrature formula.

One of the most noteworthy is that we split processors and communicators into spatial and temporal groups for purpose of running parallelized modules in space, time or both. In view of the linearity of Eq. (1.1) and Toeplitz structures of , and , all submatrices and (; ) are only set up once to reduce computational work. The space-time approximation is obtained until the space-time residual norm in the discrete sense is less than the absolute halting tolerance , where all spatial solves are accomplished by the Conjugate Gradient algorithm provided in the HYPRE library [40] with as the relative tolerance for stopping and the Euclidean norm used to measure solution progress. Since the procedure involves only matrix-by-vector multiplications, matrices , and are kept in unassembled form to save on memory. We skip any work on the first MGRIT down cycle. In addition, we choose random initial guesses for the entire temporal grid hierarchy, except that the initial condition (1.3) is used at on the finest grid.

3.3 Two-level convergence analysis

Setting , Algorithm 1 in this case reduces to a two-level scheme, whose error propagator is characterized in the following lemma.

Lemma 3.1.

Let be the error of (3.1) and the coarsening factor. Define with . Then, after an iteration of the two-level version of Algorithm 1, the new error at C-points satisfies

where is the coarse time-grid propagator at the coarse time-scale point .

Proof.

Let be the approximation, we have the update sequence during FCF-relaxation

Notice that the exact solution can be written in the form

which follows from the recursion (2.4). Hence, the residual at the C-points becomes

Then we can get the coarse-grid solution

which gives

(3.3)

It follows by the subsequent correction at C-points that

The desired result follows immediately by plugging (3.3) into the above equation. ∎

It is important to note that is introduced to approximate the ideal coarse time-stepper . An obvious and effective choice of is to re-discretize problem (1.1)-(1.3) on the coarse time-grid, i.e.,

(3.4)

At this point, a coarse time-step is roughly as expensive to solve as a fine time-step.

Next we wish to establish the error reduction factor of the two-level version of Algorithm 1, similar to the one in [38], based on the fact that defined by (3.2) and defined by (3.4) can be simultaneously diagonalized by a unitary matrix , which satisfies

(3.5)

This will yield the following two lemmas.

Lemma 3.2.

The eigenvalue in Eq. (3.5) is real and positive for .

Proof.

Note that the matrix

is symmetric and similar to , which, together with the positive definiteness of defined by (2) (see reference [11] for a proof), imply that the result is true. ∎

Lemma 3.3.

All time-grid propagators (3.2) and (3.4) are stable.

Proof.

For and , let

(3.6)

Then it is easy to verify that the following relations hold

(3.7)

By Lemma 3.2, we conclude that