A preconditioning technique for all-at-once system from the nonlinear tempered fractional diffusion equation

# A preconditioning technique for all-at-once system from the nonlinear tempered fractional diffusion equation

Yong-Liang Zhao Pei-Yong Zhu Xian-Ming Gu Xi-Le Zhao Huan-Yan Jian School of Mathematical Sciences,
University of Electronic Science and Technology of China, Chengdu, Sichuan 611731, P.R. China
School of Economic Mathematics/Institute of Mathematics,
Southwestern University of Finance and Economics, Chengdu, Sichuan 611130, P.R. China
###### Abstract

An all-at-once linear system arising from the nonlinear tempered fractional diffusion equation with variable coefficients is studied. Firstly, the nonlinear and linearized implicit schemes are proposed to approximate such the nonlinear equation with continuous/discontinuous coefficients. The stabilities and convergences of the two schemes are proved under several suitable assumptions, and numerical examples show that the convergence orders of these two schemes are in both time and space. Secondly, a nonlinear all-at-once system is derived based on the nonlinear implicit scheme, which may suitable for parallel computations. Newton’s method, whose initial value is obtained by interpolating the solution of the linearized implicit scheme on the coarse space, is chosen to solve such the nonlinear all-at-once system. To accelerate the speed of solving the Jacobian equations appeared in Newton’s method, a robust preconditioner is developed and analyzed. Numerical examples are reported to demonstrate the effectiveness of our proposed preconditioner. Meanwhile, they also imply that such the initial guess for Newton’s method is more suitable.

###### keywords:
Nonlinear tempered fractional diffusion equation, All-at-once system; Newton’s method, Krylov subspace method, Toeplitz matrix, banded Toeplitz preconditioner
###### Msc:
[2010] 65L05, 65N22, 65F10

## 1 Introduction

In this work, we mainly focus on solving the all-at-once system arising from the nonlinear tempered fractional diffusion equation (NL-TFDE):

where , , and is a source term which satisfies the Lipschitz condition:

 ∣f(r1,x,t)−f(r2,x,t)∣≤L∣r1−r2∣, for~{}% all r1,r2 over [a,b]×[0,T].

The variants of the left and right Riemann-Liouville tempered fractional derivatives are respectively defined as cartea2007fluid (); baeumer2010tempered (); meerschaert2012stochastic ()

and

 DxDα,λbu(x,t)=DxDα,λbu(x,t)+αλα−1∂u(x,t)∂x−λαu(x,t), (1.3)

where and are the left and right Riemann-Liouville tempered fractional derivatives defined respectively as cartea2007fluid (); baeumer2010tempered ()

 DxDα,λbu(x,t)=eλxΓ(2−α)∂2∂x2∫bxe−λξu(ξ,t)(ξ−x)1−αdξ.

If , they reduce to the Riemann-Liouville fractional derivatives podlubny1998fractional ().

Tempered fractional diffusion equations (TFDEs) are exponentially tempered extension of fractional diffusion equations. In recent several decades, the TFDEs are widely used across various fields, such as statistical physics cartea2007fluid (); chakrabarty2011tempered (); zheng2015numerical (), finance carr2002fine (); carr2003stochastic (); wang2015circulant (); zhang2016numerical () and geophysics baeumer2010tempered (); meerschaert2008tempered (); metzler2004restaurant (); zhang2011gaussian (); zhang2012linking (). Unfortunately, it is difficult to obtain the analytical solutions of TFDEs, or the obtained analytical solutions are less practical. Hence, numerical methods such as finite difference method zhao2018fast (); gu2017fast () and finite element method li2018fast () become important approaches to solve TFDEs. There are limited works addressing the finite difference schemes for the TFDEs. Baeumera and Meerschaert baeumer2010tempered () provided finite difference and particle tracking methods for solving the TFDEs on a bounded interval. The stability and second-order accuracy of the resulted schemes are discussed. Cartea and del-Castillo-Negrete cartea2007fractional () proposed a general finite difference scheme to numerically solve a Black-Merton-Scholes model with tempered fractional derivatives. Marom and Momoniat marom2009comparison () compared the numerical solutions of three fractional partial differential equations (FDEs) with tempered fractional derivatives that occur in finance. However, the stabilities of their proposed schemes are not proved. Recently, Li and Deng li2016high () derived a series of high order difference approximations (called tempered-WSGD operators) for the tempered fractional calculus. They also used such operators to numerically solve the TFDE, and the stability and convergence of the obtained numerical schemes are proved.

Similar to the fractional derivatives, the tempered fractional derivatives are nonlocal. Thus the discretized systems for TFDEs usually accompany a full (or dense) coefficient matrix. Traditional methods (e.g., Gaussian elimination) to solve such systems need computational cost is of and storage requirement is of , where is the number of space grid points. Fortunately, the coefficient matrix always holds a Toeplitz-like structure. It is well known that Toeplitz matrices possess great structures and properties, and their matrix-vector multiplications can be computed in operations via fast Fourier transform (FFT) ng2004iterative (); chan2007introduction (). With this truth, the memory requirement and computational cost of Krylov subspace methods are and , respectively. However, the convergence rate of the Krylov subspace methods will be slow, if the coefficient matrix is ill-conditioned. To address this problem, Wang et al. wang2015circulant () proposed a circulant preconditioned generalized minimal residual method (PGMRES) to solve the discretized linear system, whose computational cost is of . Lei et al. lei2018fast () proposed fast solution algorithms for solving TFDEs in one-dimensional (1D) and two-dimensional (2D). In their article, for 1D case, a circulant preconditioned iterative method and a fast-direct method are developed, and the computational complexity of both methods are in each time step. For 2D case, such two methods were extended to fast solve their alternating direction implicit (ADI) scheme, and the complexity of both methods are in each time step. For many other studies about Toeplitz-like systems, see qu2016cscs (); gu2015k (); gu2015strang (); zhao2012dct () and the references therein.

Actually, all the aforementioned fast implementations for TFDEs are developed based on the time-stepping schemes, which are not suitable for parallel computations. If all the time steps are stacked in a vector, the all-at-once system is obtained and it is suitable for parallel computations, see Martin2017time (); wu2018parareal (). To the best of our knowledge, such the system arising from the FDEs or the partial differential equations have been studied by many researchers Martin2015time (); Banjai2012multistep (); McDonald2018pde (); ke2015fast (); lu2015fast (); huang2017fast (); lu2018approximate (); zhao2018limited (). However, the all-at-once system arising from the TFDEs is less studied. In this work, a preconditioning technique is designed for such the system arising from the NL-TFDE (1.1). The rest of this paper is organized as follows: in Section 2, the nonlinear and linearized implicit schemes are derived by utilizing the finite difference method. Then, the nonlinear all-at-once system is obtained from the nonlinear implicit scheme. The stabilities and convergences of such two schemes are analyzed in Section 3. A preconditioning technique is designed in Section 4 to accelerate solving such the all-at-once system. In Section 5, numerical examples are provided to illustrate the first-order convergences of the two implicit schemes and show the performance of our preconditioning strategy for solving such the system. Concluding remarks are given in Section 6.

## 2 Two implicit schemes and all-at-once system

In this section, the nonlinear and linearized implicit schemes are proposed to approach Eq. (1.1). Then, the all-at-once system is obtained from the nonlinear one.

### 2.1 Two implicit schemes

In order to derive the proposed schemes, we first introduce the mesh , where and . Let represents the numerical approximation of . Then the variants of the Riemann-Liouville tempered fractional derivatives defined in Eqs. (1.2)-(1.3) for can be approximated respectively as sabzikar2015tempered (); li2016high ():

 DxDα,λbu(x,t)|(x,t)=(xi,tj)=1hαN−i+1∑k=0g(α)kuji+k−1+αλα−1δxuji+O(h), (2.2)

where

 δxuji=uji−uji−1handg(α)k={~g(α)1−ehλ(1−e−hλ)α,k=1,~g(α)ke−(k−1)hλ,k≠1

with .

As for the time discretization, the backward Euler method is used. Combining Eqs. (2.1) and (2.2), the following first-order nonlinear implicit Euler scheme (NL-IES) is obtained:

 uji−w1(d+,ii+1∑k=0g(α)kuji−k+1+d−,iN−i+1∑k=0g(α)kuji+k−1)+w2(d+,i−d−,i)(uji−uji−1)=uj−1i+τfju,i, (2.3)

in which , , and . Applying the formula to Eq. (2.3) and omitting the small term, it gets the first-order linearized implicit Euler scheme (L-IES):

 uji−w1(d+,ii+1∑k=0g(α)kuji−k+1+d−,iN−i+1∑k=0g(α)kuji+k−1)+w2(d+,i−d−,i)(uji−uji−1)=uj−1i+τfj−1u,i. (2.4)

The stabilities and first-order convergences of schemes (2.3)-(2.4) will be discussed in Section 3.

### 2.2 The all-at-once system

Several auxiliary notations are introduced before deriving the all-at-once system: and represent the identity and zero matrices of suitable orders, respectively.

 uj=[uj1,uj2,⋯,ujN−1]T,fju=[fju,1,fju,2,⋯,fju,N−1]T,
 D±=diag(d±,1,d±,2,⋯,d±,N−1),B=%tridiag(−1,1,0),
 u=⎡⎢ ⎢ ⎢ ⎢ ⎢⎣u1u2⋮uM⎤⎥ ⎥ ⎥ ⎥ ⎥⎦, f(u)=⎡⎢ ⎢ ⎢ ⎢ ⎢ ⎢⎣f1uf2u⋮fMu⎤⎥ ⎥ ⎥ ⎥ ⎥ ⎥⎦, v=⎡⎢ ⎢ ⎢ ⎢⎣u00⋮0⎤⎥ ⎥ ⎥ ⎥⎦, G=⎡⎢ ⎢ ⎢ ⎢ ⎢ ⎢ ⎢ ⎢ ⎢ ⎢ ⎢ ⎢⎣g(α)1g(α)00⋯0g(α)2g(α)1g(α)0⋱0⋮⋱⋱⋱⋮g(α)N−2⋯⋱⋱g(α)0g(α)N−1g(α)N−2⋯g(α)2g(α)1⎤⎥ ⎥ ⎥ ⎥ ⎥ ⎥ ⎥ ⎥ ⎥ ⎥ ⎥ ⎥⎦.

In this work, the all-at-once system is derived based on Eq. (2.3), which can be expressed as:

 Au=τf(u)+v, (2.5)

in which is a bi-diagonal block matrix with . Obviously, is a Toeplitz-like matrix and its storage requirement is of .

For this nonlinear all-at-once system, we prefer to utilize Newton’s method kelley2003solving (). Such the method requires to solve the equation with Jacobian matrix at each iterative step, and the computation of solving these equations consumes most of the method. Before applying Newton’s method to solve the system (2.5), two essential problems need to be addressed:

• How to find a good enough initial value?

• How to solve the Jacobian equations efficiently?

Here, a strategy is provided to address these two problems. For the first problem, the initial value of Newton’s method is constructed by interpolating the solution of L-IES (2.4) on the coarse mesh. Numerical experiences in Section 5 show that it is a good enough initial value. For the second problem, the Jacobian matrix of (2.5) is a bi-diagonal block matrix. More precisely, such the matrix is the sum of a diagonal block matrix and a bi-diagonal block matrix, whose blocks are Toeplitz-like matrices. Based on this special structure, the preconditioned Krylov subspace methods, such as preconditioned biconjugate gradient stabilized (PBiCGSTAB) method van1992bicgstab (), are employed to solve the Jacobian equations appeared in Newton’s method. The details will be discussed in Section 4.

## 3 Stabilities and convergences of (2.3) and (2.4)

In this section, the stabilities and convergences of NL-IES (2.3) and L-IES (2.4) are studied. Let be the approximation solution of in Eqs. (2.3) or (2.4) and be the error satisfying equation

 eji−w1(d+,ii+1∑k=0g(α)keji−k+1+d−,iN−i+1∑k=0g(α)keji+k−1)+w2(d+,i−d−,i)(eji−eji−1)=ej−1i+τ(fjU,i−fju,i) (%for NL-IES)

or

 eji−w1(d+,ii+1∑k=0g(α)keji−k+1+d−,iN−i+1∑k=0g(α)keji+k−1)+w2(d+,i−d−,i)(eji−eji−1)=ej−1i+τ(fj−1U,i−fj−1u,i) (for L-IES),

in which . To prove the stabilities and convergences of (2.3) and (2.4), the following results given in sabzikar2015tempered (); zhuang2009numerical () are required.

###### Lemma 3.1

(sabzikar2015tempered () ) The coefficients , for , satisfy:

 ⎧⎪ ⎪⎨⎪ ⎪⎩g(α)1<0, g(α)k>0 (for k≠1);∞∑k=0g(α)k=0, j∑k=0g(α)k<0 (for j≥1).
###### Lemma 3.2

(zhuang2009numerical (), discrete Gronwall inequality) Suppose that , and

 ηk+1≤ρηk+τ~fk, ρ=1+C0τ, η0=0,

where is a constant, then

 ηk+1≤exp(C0tk)k∑j=0τ~fj.

### 3.1 The stabilities of (2.3) and (2.4)

Denote , and assume that

 ∥∥Ej∥∥∞=∣eℓj∣=max1≤ℓ≤N−1∣ejℓ∣ (0≤j≤M,1≤ℓj≤N−1).

Then the next theorem is established.

###### Theorem 3.1

Suppose , then the L-IES (2.4) is stable, and we have

 ∥∥Ej∥∥∞≤exp(TL)∥∥E0∥∥∞, for j=1,⋯,M.

Proof. From Lemma 3.1, and . Then

 ∣ejℓj∣≤⎡⎣1−w1⎛⎝d+,ℓjℓj+1∑k=0g(α)k+d−,ℓjN−ℓj+1∑k=0g(α)k⎞⎠⎤⎦∣ejℓj∣+w2(d+,ℓj−d−,ℓj)×(∣ejℓj∣−∣ejℓj∣)≤∣ejℓj∣−w1⎛⎝d+,ℓjℓj+1∑k=0g(α)k∣ejℓj−k+1∣+d−,ℓjN−ℓj+1∑k=0g(α)k∣ejℓj+k−1∣⎞⎠+w2(d+,ℓj−d−,ℓj)(∣ejℓj∣−∣ejℓj−1∣)=∣ejℓj∣+∣−w1d+,ℓjg(α)1ejℓj∣+∣−w1d−,ℓjg(α)1ejℓj∣+∣w2(d+,ℓj−d−,ℓj)ejℓj∣−ℓj+1∑k=0,k≠1∣−w1d+,ℓjg(α)kejℓj−k+1∣−N−ℓj+1∑k=0,k≠1∣−w1d−,ℓjg(α)kejℓj+k−1∣−∣−w2(d+,ℓj−d−,ℓj)ejℓj−1∣=∣ejℓj−w1(d+,ℓjg(α)1ejℓj+d−,ℓjg(α)1ejℓj)+w2(d+,ℓj−d−,ℓj)ejℓj∣−ℓj+1∑k=0,k≠1∣−w1d+,ℓjg(α)kejℓj−k+1∣−N−ℓj+1∑k=0,k≠1∣−w1d−,ℓjg(α)kejℓj+k−1∣−∣−w2(d+,ℓj−d−,ℓj)ejℓj−1∣≤∣ejℓj−w1⎛⎝d+,ℓjℓj+1∑k=0g(α)kejℓj−k+1+d−,ℓjN−ℓj+1∑k=0g(α)kejℓj+k−1⎞⎠+w2(d+,ℓj−d−,ℓj)(ejℓj−ejℓj−1)∣=∣ej−1ℓj+τ(fj−1U,ℓj−fj−1u,ℓj)∣≤(1+τL)∣ej−1ℓj∣≤(1+τL)∥∥Ej−1∥∥∞.

The above inequality implies . Then

 ∥∥Ej∥∥∞≤(1+τL)j∥∥E0∥∥∞≤exp(TL)∥∥E0∥∥∞.

From the above proof, it can be find that if , the following result is true.

###### Corollary 1

Suppose and , then the NL-IES (2.3) is stable, and it obtains

 ∥∥Ek∥∥∞≤C1∥∥E0∥∥∞, for k=1,⋯,M,

where is a positive constant.

Proof. Based on the proof of Theorem 3.1, it yields

 (1−τL)∥∥Ej∥∥∞≤∥∥Ej−1∥∥∞.

Summing up for from to and using Lemma 3.4 in zhao2018fast (), it gets

 ∥∥Ek∥∥∞≤exp(TL1−τL)1−τL∥∥E0∥∥∞.

Note that

 limτ→0exp(TL1−τL)1−τL=exp(TL).

Hence, there is a positive constant such that

 exp(TL1−τL)1−τL≤C1,

thereby .

In Corollary 1, it is worth to notice that the assumption is independent of the spatial size . Actually, the smaller time step size is, the easier such assumption can be satisfied.

### 3.2 The convergences of (2.3) and (2.4)

In this subsection, the convergences of (2.3) and (2.4) are studied. Let satisfies

 ξji−w1(d+,ii+1∑k=0g(α)kξji−k+1+d−,iN−i+1∑k=0g(α)kξji+k−1)+w2(d+,i−d−,i)(ξji−ξji−1)=ξj−1i+τ(f(u(xi,tj),xi,tj)−fju,i)+Rji (for NL-IES)

or

 ξji−w1(d+,ii+1∑k=0g(α)kξji−k+1+d−,iN−i+1∑k=0g(α)kξji+k−1)+w2(d+,i−d−,i)(ξji−ξji−1)=ξj−1i+τ(f(u(xi,tj−1),xi,tj−1)−fj−1u,i)+Rji (for L-IES),

where ( is a positive constant). Denote and . Similar to the proof of Theorem 3.1, the following theorem about the convergence of L-IES can be established.

###### Theorem 3.2

Assume that and the problem (1.1) has a sufficiently smooth solution . is the numerical solution of (2.4). Then there is a positive constant such that

 ∥∥ξj∥∥∞≤C(τ+h), j=1,2,⋯,M.

Proof. Same technique in Theorem 3.1 is utilized, then it yields

Using Lemma 3.2, it gets

 ∥∥ξj∥∥∞≤exp(TL)TC2(τ+h)≤C(τ+h).

###### Corollary 2

Assume that , and the problem (1.1) has a sufficiently smooth solution . is the numerical solution of (2.3). Then

 ∥∥ξj∥∥∞≤C(τ+h), j=1,2,⋯,M,

where is a positive constant.

Proof. According to Corollary 1, we have

 (1−τL)∥∥ξj∥∥∞≤∥∥ξj−1∥∥∞+C2(τ2+τh).

Similarly, it arrives

 ∥∥ξj∥∥∞≤exp(TL1−τL)1−τLTC2(τ+h)≤C(τ+h), j=1,2,⋯,M,

in which is employed.

It is interesting to note that if represents the error between NL-IES (2.3) and L-IES (2.4), then it also satisfies . This can be proved easily through the condition . In the next section, fast implementations are designed to solve (2.5).

## 4 The preconditioned iterative method

The Jacobian matrix of (2.5) can be treated as the sum of a diagonal block matrix and a block bi-diagonal matrix with Toeplitz-like blocks, then its matrix-vector multiplication can be done by FFT in operations. Such the technique truly reduces the computational cost of a Krylov subspace method, such as the biconjugate gradient stabilized (BiCGSTAB) method, but the convergence rate of this method is slow when the Jacobi matrix is very ill-conditioned. In order to speed up the convergence rate of the Krylov subspace method, a preconditioner is proposed and analyzed in this section, in which

 Aℓ=I−w1(D+Gℓ+D−GTℓ)+w2(D+−D−)B with Gℓ=⎡⎢ ⎢ ⎢ ⎢ ⎢ ⎢ ⎢ ⎢ ⎢ ⎢ ⎢ ⎢ ⎢ ⎢⎣g(α)1g(α)0g(α)2g(α)1g(α)0⋮⋱⋱⋱g(α)ℓ⋯g(α)2g(α)1g(α)0⋱⋱⋱⋱⋱g(α)ℓ⋯g(α)2g(α)1⎤⎥ ⎥ ⎥ ⎥ ⎥ ⎥ ⎥ ⎥ ⎥ ⎥ ⎥ ⎥ ⎥ ⎥⎦.

Noticing the properties of given in Lemma 3.1, the following result about is true.

###### Theorem 4.1

The preconditioner is a nonsingular matrix.

Proof. Obviously, we only need to proof the nonsingularity of . Let

 H=Aℓ+ATℓ2=I−ω12(D++D−)(Gℓ+GTℓ)+ω22(D+−D−)(B+BT)

and be the arbitrary eigenvalue of . According to Lemma 3.1 and Gershgorin circle theorem varga2010gervsgorin (), it arrives at

This implies that all eigenvalues of are larger than 1. Then, the desired result is achieved.

Unfortunately, it is difficult to theoretically investigate the eigenvalue distributions of the preconditioned Jacobian matrix, but we still can give some figures to illustrate the clustering eigenvalue distributions of several specified preconditioned matrices in the next section. For convenience, let be the approximation of obtained in the -th Newton iterative step, the Jacobian matrix in the -th Newton iterative step is denoted as . With these auxiliary notations, the preconditioned Newton’s method can be summarized as below:

In fact, this algorithm can be viewed as a simple two-grid method, our readers can refer to xu1994twogrid (); xu1996twogrid (); kim2018twogrid () for details.

## 5 Numerical examples

Two numerical experiments presented in this section have a two-fold objective. On the one hand, they illustrate that the convergence orders of our two implicit schemes (2.3)-(2.4) are 1. On the other hand, they show the performance of the preconditioner proposed in Section 4. In Algorithm 1, for generating the initial guess , the MATLAB build-in function “” is employed in this work. The and in Algorithm 1 are fixed as and , respectively. For the PBiCGSTAB method (or the BiCGSTAB method), it terminates if the relative residual error satisfies or the iteration number is more than , where is the residual vector of the linear system after iteration, and the initial guess of the PBiCGSTAB method (or the BiCGSTAB method) is chosen as the zero vector. ”” represents that our proposed preconditioned iterative method in Section 4 is utilized to solve (2.5). “BS” (or “”) means that the PBiCGSTAB method in Step 3 of Algorithm 1 is replaced by the MATLAB’s backslash method (or the BiCGSTAB method). Some other notations, which will appear in later, are given:

 Err(τ,h)=max0≤j≤M∥ξj∥∞,
 Order1=log2Err(2τ,h)/Err(τ,h),Order2=log2Err(τ,2h)/Err(τ,h).

” represents the number of iterations required by Algorithm 1. “” denotes the average number of iterations required by the PBiCGSTAB method (or the BiCGSTAB method) in Algorithm 1, i.e.,

 Iter2=Iter1∑m=1Iter2(m)/Iter1,

where is the number of iterations required by such the method in the -th iterative step of Algorithm 1. “Time” denotes the total CPU time in seconds for solving the system (2.5). “” means the maximum iterative step is reached but not convergence, and“” means out of memory.

All experiments were performed on a Windows 10 (64 bit) PC-Intel(R) Core(TM) i7-8700k CPU 3.70 GHz, 16 GB of RAM using MATLAB R2016a.

Example 1. We consider Eq. (1.1) with , the initial value , the nonlinear source term and the continuous coefficients and . Obviously, it is hard to obtain the exact solution of Eq. (1.1). Thus, the numerical solution computed from the finer mesh () is treated as the exact solution.