Parallel-in-Time with Fully Finite Element Multigrid for 2-D Space-fractional Diffusion Equations
The paper investigates a non-intrusive parallel time integration with multigrid for space-fractional diffusion equations in two spatial dimensions. We firstly obtain a fully discrete scheme via using the linear finite element method to discretize spatial and temporal derivatives to propagate solutions. Next, we present a non-intrusive time-parallelization and its two-level convergence analysis, where we algorithmically and theoretically generalize the MGRIT to time-dependent fine time-grid propagators. Finally, numerical illustrations show that the obtained numerical scheme possesses the saturation error order, theoretical results of the two-level variant deliver good predictions, and significant speedups can be achieved when compared to parareal and the sequential time-stepping approach.
Yue X Q et. al.]X. Q. Yue, S. Shu, X. W. Xu, W. P. Bu and K. J. Pan111Corresponding author.
Parallel-in-Time with Fully Finite Element Multigrid for 2-D Space-fractional Diffusion Equations
School of Mathematics and Computational Science, Hunan Key Laboratory for Computation and Simulation in Science and Engineering, Xiangtan University, Hunan 411105, China
Institute of Applied Physics and Computational Mathematics, Beijing 100088, China
School of Mathematics and Statistics, Central South University, Hunan 410083, China
AMS subject classifications: 35R11, 65F10, 65F15, 65N55
Key words: Space-fractional diffusion equations, space-time finite element, multigrid-in-time, parallel computing.
In recent years, mathematical models with fractional derivatives and integrals attract a wide interest of scientists in a variety of fields (including physics, biology and chemistry, etc.), owing to their potences in descriptions of memory and heredity . Particularly, fractional diffusion equations are shown to afford investigations on subdiffusive phenomena and Lévy fights . In this article, we are interested in a class of two-dimensional space-fractional diffusion equations (SFDEs)
with orders , , constants , , solution domain , and Riesz fractional derivatives
Since closed-form analytical solutions of fractional models are rarely accessible in practice, the numerical solutions become very prevalent to empower their successful applications. In literatures, numerical methods of SFDEs proposed to achieve high accuracy and efficiency include finite difference [3, 4, 5, 6, 7, 8, 9, 10], finite element (FE) [11, 12, 13, 14, 15, 16], finite volume [17, 18, 19] and spectral (element) [20, 21, 22] methods. It must be emphasized that no matter which discretization is applied, there usually persists intensive computational task in nonlocality caused by fractional differential operators . Numerous scholars are working to identify fast algorithms most appropriate to tackle this challenge, see [24, 25, 26, 27, 28, 29] and related references therein. Except for these fast solutions, parallel computing can be viewed as another potential technique or even a basic strategy. Gong et al. presented MPI-based and GPU-based parallel algorithms for one-dimensional Riesz SFDEs [30, 31], whose speedups are both achieved by spatial parallelism with sequential time-stepping approach, using some time propagator to integrate from one time to the next. However, future computing speed must rely on the increased concurrency provided by more, instead of faster, processors. An immediate consequence of this is that solution algorithms, limited to spatial parallelism, for problems with evolutionary behavior entail long overall computation time, often exceeding computing resources available to resolve multidimensional SFDEs. Thus, algorithms achieving parallelism in time have been of especially high demand over the past decade. Currently, parareal in time  and multigrid-reduction-in-time (MGRIT)  are two active choices. Wu et al. analyzed convergence properties of the parareal algorithm for SFDEs via certain underlying ODEs in constant time-steps [34, 35], but they uninvolved large-scale testings. Observe that parareal can be interpreted as a two-level multigrid (reduction) method [33, 36], its concurrency is still limited because of the large sequential coarse-grid solve. MGRIT, a non-intrusively and truly multilevel algorithm implemented in the open-source XBraid  with optimal parallel communication, counteracts this and enables us to approximate simultaneously the evolution over all time points. The non-intrusive nature of MGRIT relies upon modalities of fine and coarse time-grid propagators rather than their internals. It has been proven to be effective and analyzed sharply in the two-level setting for integer order basic parabolic and hyperbolic problems , however, with limitations on analysis that they only consider linear problems and the temporal grid is uniform, i.e., fine time-grid propagators are all the same. Furthermore, from the survey of references, there are no calculations of the MGRIT algorithm to SFDEs, nor to FE discretizations of parabolic and hyperbolic problems in time.
The main aim of the paper is to propose and analyze a non-intrusive optimal-scaling MGRIT algorithm for space-time FE discretizations of problem (1.1)-(1.3) in uniform and nonuniform temporal partitions, where we shall extend the scope of the MGRIT method and develop a library of MGRIT modifications to time-dependent propagators. The outline of our presentation proceeds as follows. In Section 2, we derive a fully discrete FE scheme. Section 3 introduces the MGRIT solver followed by its two-level convergence analysis to the obtained scheme. In Section 4, we report some numerical results to illustrate optimal convergence rates in both space and time, present theoretical confirmations and analyze weak and strong scaling studies to show benefits. Finally, relevant results are summarized and follow-up work are drawn in Section 5.
2 Space-time FE discretization for SFDEs
This section deals with the construction of our space-time FE scheme for SFDEs. Here we denote by and the inner product and its associated norm on . First, we introduce some fractional derivative spaces.
(Left and right fractional derivative spaces) For a given constant , define norms
Let and be closures of in regard to and , respectively.
(Fractional Sobolev space) For a given constant , define the norm
where is the Fourier transform of . Let be the closure of in norm sense.
The mathematical equivalence among , and with related norms can be proved to the case where , analogously to the work .
For the depiction of our space-time FE discretization, we define a (possibly nonuniform) temporal partition and a uniform spatial triangulation with constant spacings and , let be the set of interior mesh points in , denote the -th subintervals and for , and . We choose the usual spaces in tensor products
where , and with as the space of all polynomials of degree .
Now a fully discrete FE approximation for (2.1) is singled out immediately: for , to seek the solution satisfying and
where the function approximates to the initial , and
Note that and can be respectively written as and , with being the Lagrange linear shape function at point , as well as
By simple algebraic calculations, we deduce
Hence, to compute satisfying Eq. (2.3) we solve
where initial guess with , vector with , mass matrix takes the block-tridiagonal form
with the permutation matrix produced in terms of the identity matrix by
Toeplitz matrices () whose entries are in forms
It is worthwhile to point out that is introduced for a more economical memory requirement, since is naturally of the full block form, while (2.5) computes without explicit generation of or .
3 Time-parallelization and its two-level convergence analysis
This section is devoted to the time-parallelization for the forward time-marching loop (2.4) as well as the two-level convergence analysis.
3.1 The MGRIT algorithm
where the -th time-grid propagator
Obviously, the inverse in Eq. (3.2) corresponds to a spatial solve.
Regarding the MGRIT algorithm to solve the global space-time problem (3.1), various components have to be chosen. Let be the coarsening factor in time and , we define a coarse mesh , . In this setting, all are C-points and the others are F-points. FCF-relaxation depicted in Fig. 1, an initial F-relaxation followed by a C-relaxation and then a second F-relaxation, is often the most reliable choice to produce optimal and scalable multilevel iterations . Define the injection at C-points as our restriction operator, and the injection followed by an F-relaxation over the fine-grid operator as our interpolation operator. The multilevel hierarchy can be constructed by applying the above processes recursively. Both sequential time-stepping and MGRIT are in terms of spatial solve, but MGRIT is highly concurrent. About more processors are actually required in temporal concurrency to outweigh the extra work of MGRIT, where is the number of MGRIT iterations . Below is the MGRIT V-cycle algorithm, where , , , , and () correspond to the -th coarse-scale time re-discretization, restriction and interpolation, respectively.
MGRIT algorithm with V-cycle: .
- Step 1
Apply FCF-relaxation to .
- Step 2
Restrict the residual .
- Step 3
If , then solve ;
Else perform .
- Step 4
Do the coarse-grid correction .
To save computational work, Step 4 of Algorithm 1 is done by just correcting C-point values, and updating F-point values only when the Euclidean norm of the residual is small enough. The reason is that the correction at F-points is equivalent to an F-relaxation, which will be performed in Step 1 of the subsequent iteration.
3.2 Implementation details
For numerical experiments, the general-purpose parallel-in-time library XBraid  was employed. Eq. (3.1) by uniform temporal partitions can be solved by XBraid in a straightforward way, whereas modifications on XBraid was done to time-dependent propagators as part of this study. Wrapper routines were written in C and integrals were calculated by a quadrature formula.
One of the most noteworthy is that we split processors and communicators into spatial and temporal groups for purpose of running parallelized modules in space, time or both. In view of the linearity of Eq. (1.1) and Toeplitz structures of , and , all submatrices and (; ) are only set up once to reduce computational work. The space-time approximation is obtained until the space-time residual norm in the discrete sense is less than the absolute halting tolerance , where all spatial solves are accomplished by the Conjugate Gradient algorithm provided in the HYPRE library  with as the relative tolerance for stopping and the Euclidean norm used to measure solution progress. Since the procedure involves only matrix-by-vector multiplications, matrices , and are kept in unassembled form to save on memory. We skip any work on the first MGRIT down cycle. In addition, we choose random initial guesses for the entire temporal grid hierarchy, except that the initial condition (1.3) is used at on the finest grid.
3.3 Two-level convergence analysis
Setting , Algorithm 1 in this case reduces to a two-level scheme, whose error propagator is characterized in the following lemma.
Let be the approximation, we have the update sequence during FCF-relaxation
Notice that the exact solution can be written in the form
which follows from the recursion (2.4). Hence, the residual at the C-points becomes
Then we can get the coarse-grid solution
It follows by the subsequent correction at C-points that
The desired result follows immediately by plugging (3.3) into the above equation. ∎
At this point, a coarse time-step is roughly as expensive to solve as a fine time-step.
Next we wish to establish the error reduction factor of the two-level version of Algorithm 1, similar to the one in , based on the fact that defined by (3.2) and defined by (3.4) can be simultaneously diagonalized by a unitary matrix , which satisfies
This will yield the following two lemmas.
The eigenvalue in Eq. (3.5) is real and positive for .
For and , let
Then it is easy to verify that the following relations hold
By Lemma 3.2, we conclude that