
# hp-Adaptive Galerkin Time Stepping Methods for Nonlinear Initial Value Problems

Irene Kyza Department of Mathematics, University of Dundee, Nethergate, Dundee, DD1 4HN, United Kingdom  and  Stephen Metcalfe and Thomas P. Wihler Mathematisches Institut, Universität Bern, Sidlerstr. 5, CH-3012 Bern, Switzerland
###### Abstract.

This work is concerned with the derivation of an a posteriori error estimator for Galerkin approximations to nonlinear initial value problems with an emphasis on finite-time existence in the context of blow-up. The structure of the derived estimator leads naturally to the development of both h- and hp-versions of an adaptive algorithm designed to approximate the blow-up time. The adaptive algorithms are then applied in a series of numerical experiments, and the rate of convergence to the blow-up time is investigated.

###### Key words and phrases:
Initial value problems in Hilbert spaces, Galerkin time stepping schemes, high-order methods, blow-up singularities.
###### 2010 Mathematics Subject Classification:
65J08, 65L05, 65L60
The authors acknowledge the support of the Swiss National Science Foundation (SNF)

## 1. Introduction

In this paper we focus on continuous Galerkin (cG) and discontinuous Galerkin (dG) time stepping discretizations as applied to abstract initial value problems of the form

 u′(t) =F(t,u(t)),t∈(0,T),u(0)=u0. (1.1)

Here, u:[0,T]→H, for some T>0, denotes the unknown solution with values in a real Hilbert space H with inner product (·,·)H and induced norm ∥·∥H. The initial value u0∈H prescribes the value of the solution at t=0, and F:[0,T]×H→H is a possibly nonlinear operator. We emphasize that we include the case of F being unbounded in the sense that

 ∥F(t,x)∥H/∥x∥H→∞ as ∥x∥H→∞,0≤t≤T.

Note that the existence interval for solutions may be arbitrarily small even for smooth F. Indeed, for certain data the solution of (1.1) can become unbounded in finite time, i.e.,

 ∥u(t)∥H<∞ for 0≤t<T∞, and ∥u(t)∥H→∞ as t↗T∞.

This effect is commonly termed finite-time blow-up or sometimes just blow-up and the quantity T∞ is called the blow-up time.

The main contributions of this paper are as follows:

1. The derivation of conditional a posteriori error bounds for hp-cG and hp-dG approximations to the nonlinear initial value problem (1.1).

2. The design of efficient adaptive algorithms that lead to accurate approximation of the blow-up time in the case where problem (1.1) exhibits finite time blow-up.

To the best of our knowledge, this is the first time that an adaptive algorithm has been developed for hp-cG and hp-dG time-stepping methods based on rigorous a posteriori error control for problems of the form (1.1). The adaptive procedure that we propose includes both h-adaptive and hp-adaptive variants. Indeed, one of the contributions of this paper is to motivate how we can choose effectively between h- or p-adaptivity locally while taking into account the possible singular behavior of the problem under consideration. In this sense, these results extend the h-adaptive algorithm analyzed for some special cases of (1.1) and Euler-type time discretizations in . In particular, the inclusion of higher order time-stepping schemes and hp-adaptivity allows us to overcome key limitations encountered in .

A posteriori error estimators for linear problems tend to be unconditional, that is, they always hold independent of the problem data and the size of the discretization parameters. For nonlinear problems, the situation is more complicated since the existence of a solution to an appropriate error equation (and, thus, of an error bound) usually requires that either the data or the discretization parameters are sufficiently small. As a result, a posteriori error estimators for nonlinear problems tend to be conditional, that is, they only hold provided that an a posteriori verifiable condition (which can be either explicit or implicit) is satisfied. For nonlinear time-dependent problems, there are two commonly used approaches for deriving conditional a posteriori error bounds: continuation arguments, cf. [4, 13], and fixed point arguments, cf. [14, 5]. The a posteriori error bounds that we derive here are obtained by utilizing a local continuation argument.

Galerkin time stepping methods for initial value problems are based on weak formulations and for both the cG and dG time stepping schemes, the test spaces consist of polynomials that are discontinuous at the time nodes. In this way, the discrete Galerkin formulations decouple into local problems on each time step and the discretizations can therefore be understood as implicit one-step schemes. In the literature, Galerkin time stepping schemes have been extensively analyzed for ordinary differential equations (ODEs), cf.  [2, 7, 6, 8, 9, 12].

A key feature of Galerkin time stepping methods is their great flexibility with respect to the size of the time steps and the local approximation orders which lends these schemes well to an hp-framework. The hp-versions of the cG and dG time stepping schemes were introduced and analyzed in the works [16, 17, 19, 21]. In particular, in the articles [16, 21], which focus on initial value problems with uniform Lipschitz nonlinearities, the use of the Banach fixed point theorem made it possible to prove existence and uniqueness results for the discrete Galerkin solutions which are independent of the local approximation orders; these results have been extended to discrete Peano-type existence results in the context of more general nonlinearities in . We emphasize that the hp-approach is well known for its ability to approximate smooth solutions with possible local singularities at high algebraic or even exponential rates of convergence; see, e.g., [3, 17, 18, 20] for the numerical approximation of problems with start-up singularities. In light of this, a main aim of this paper is to establish through numerical experiments whether or not hp-refinement, utilizing the derived a posteriori error estimator, can lead to exponential convergence towards the blow-up time for the case where (1.1) exhibits finite-time blow-up.

### Outline

The remainder of our article is organized as follows. In Section 2, we introduce the hp-cG and hp-dG time stepping schemes while in Section 3 we develop a posteriori error bounds for these schemes. We design both h- and hp-version adaptive algorithms to approximate the blow-up time in Section 4 before applying them to some numerical experiments in Section 5. Finally, we draw conclusions and comment on possible future research in Section 6.

### Notation

Let H denote a real Hilbert space with inner product (·,·)H and induced norm ∥·∥H as before. Given an interval I⊆ℝ, the Bochner space C(¯I;H) consists of all functions that are continuous on ¯I with values in H. Moreover, for 1≤p≤∞, we define the norm

 ∥u∥Lp(I;H):=(∫I∥u(t)∥pH dt)^{1/p} if 1≤p<∞, and ∥u∥L∞(I;H):=ess supt∈I∥u(t)∥H if p=∞,

and we let Lp(I;H) be the space of all measurable functions u:I→H such that the corresponding norm is bounded. Note that L2(I;H) is a Hilbert space with inner product and norm given by

 (u,v)L2(I;H):=∫I(u(t),v(t))H dt, and ∥u∥L2(I;H):=(∫I∥u(t)∥2H dt)^{1/2},

respectively.
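As a concrete illustration of these Bochner norms (our own sketch, not part of the paper), the following Python snippet approximates ∥u∥Lp(I;H) for the finite-dimensional case H=ℝd by trapezoidal quadrature; the helper name `bochner_norm` is hypothetical.

```python
import numpy as np

def bochner_norm(u, a, b, p=2, n=2001):
    """Approximate the Bochner norm ||u||_{L^p(I;H)} on I = (a, b) for
    H = R^d by trapezoidal quadrature; `u` maps t to a vector in R^d."""
    t = np.linspace(a, b, n)
    vals = np.array([np.linalg.norm(u(s)) for s in t])  # ||u(t)||_H pointwise
    if p == np.inf:
        return vals.max()  # discrete stand-in for the essential supremum
    # trapezoidal rule for (int_I ||u(t)||_H^p dt)^(1/p)
    integral = np.sum((vals[1:] ** p + vals[:-1] ** p) * np.diff(t)) / 2
    return integral ** (1.0 / p)
```

For instance, for u(t)=(t,0) on I=(0,1) this returns values close to 1/√3 for p=2 and 1 for p=∞.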

## 2. Galerkin Time Discretizations

In this section, we briefly recall the hp-cG and hp-dG time stepping schemes for the discretisation of (1.1). To this end, on the open interval (0,T), we introduce a set of time nodes, 0=t0<t1<⋯<tM=T, which define a time partition of (0,T) into M open time intervals Im=(tm−1,tm), m=1,…,M. The length km=tm−tm−1 (which may be variable) of the time interval Im is called the time step length. Furthermore, to each interval Im we associate a polynomial degree rm which takes the role of a local approximation order. Then, given a (real) Hilbert space X and some r∈ℕ0, the set

 Pr(J;X):={p∈C(¯J;X): p(t)=∑_{i=0}^{r} xi t^i, xi∈X},

signifies the space of all polynomials of degree at most r on an interval J with values in X.

In practical computations, the Hilbert space H on which (1.1) is based will typically be replaced by a finite-dimensional subspace Hm⊆H on each interval Im, m=1,…,M. In this context, it is useful to define the H-orthogonal projection πm:H→Hm given by

 x↦πmx,(x−πmx,y)H=0∀y∈Hm.
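To make the H-orthogonal projection concrete in the finite-dimensional setting H=ℝn (our own illustration, not from the paper), the following sketch realizes πm via an orthonormalized basis of Hm; the helper name is hypothetical.

```python
import numpy as np

def h_orthogonal_projection(basis, x):
    """Project x in H = R^n onto H_m = span(columns of `basis`) with
    respect to the Euclidean inner product: pi_m x = sum_i (x, q_i) q_i."""
    Q, _ = np.linalg.qr(basis)  # orthonormal basis of H_m
    return Q @ (Q.T @ x)
```

By construction, the result satisfies the defining orthogonality (x − πm x, y)H = 0 for every y ∈ Hm.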

### 2.1. hp-cG Time Stepping

With the above definitions, the (fully-discrete) hp-cG time marching scheme is iteratively given as follows: For m=1,…,M, we seek U|Im∈Prm(Im;Hm) such that

 ∫Im(dUdt,V)Hdt=∫Im(F(t,U),V)Hdt∀V∈Prm−1(Im;Hm), (2.1)

with the initial condition U(t+m−1)=πmU(t−m−1) for m≥2, and U(t+0)=π1u0 for m=1; here, the one-sided limits of a piecewise continuous function U at each time node tm are given by

 U(t+m):=lims↘0U(tm+s),U(t−m):=lims↘0U(tm−s).

Note that in order to enforce the initial condition on each time step, the local trial space has one degree of freedom more than the local test space in the hp-cG scheme. Furthermore, if Hm=H for all m=1,…,M, we remark that the hp-cG solution U is globally continuous on [0,T].

The strong form of (2.1) on Im is

 U′(t)=Πrm−1mF(t,U(t)), (2.2)

where Πrm−1m denotes the L2-projection operator onto Prm−1(Im;Hm); see  for details. Whilst the strong form (2.2) can be exploited for the purpose of deriving a posteriori error estimates, it is well known that employing such a straightforward approach leads to suboptimal error estimates, cf. . This issue will be addressed in the derivation of our error bound.

### 2.2. hp-dG Time Stepping

The (fully-discrete) hp-dG time marching scheme is iteratively given as follows: For m=1,…,M, we seek U|Im∈Prm(Im;Hm) such that

 ∫Im(dUdt,V)Hdt+([[U]]m−1,V(t+m−1))H=∫Im(F(t,U),V)Hdt∀V∈Prm(Im;Hm), (2.3)

where the discontinuity jump of U at tm−1, m≥1, is given by [[U]]m−1:=U(t+m−1)−U(t−m−1) with U(t−0):=u0. We emphasize that, in contrast to the continuous Galerkin formulation, the trial and test spaces are the same for the dG scheme; this is due to the fact that the initial values are weakly imposed (by means of an upwind flux) on each time interval.

In order to find the strong formulation of (2.3), we require the use of a lifting operator. More precisely, given some real Hilbert space X, we define Lrmm:X→Prm(Im;X) for z∈X uniquely through

 z↦Lrmmz,∫Im(Lrmmz,V)Hdt=(z,V(t+m−1))H∀V∈Prm(Im;X), (2.4)

cf. [19, Section 4.1]. Then, in view of this definition with X=Hm and proceeding as in , we obtain the strong formulation of the dG time stepping method (2.3) on Im, viz.,

 U′(t)+(Lrmmπm[[U]]m−1)(t)=ΠrmmF(t,U(t)), (2.5)

where Πrmm denotes the L2-projection onto Prm(Im;Hm).

## 3. A Posteriori Error Analysis

The goal of this section is to derive a posteriori error bounds for the hp-cG and hp-dG approximations to (1.1). To this end, we require some structural assumptions on the nonlinearity F. Specifically, F is assumed to be continuous and to satisfy the local H-Lipschitz estimate

 ∥F(t,v)−F(t,w)∥H≤L(t,∥v∥H,∥w∥H)∥v−w∥H∀t∈[0,T]∀v,w∈H. (3.1)

Here, L=L(t,a,b) is a known function, defined for t∈[0,T] and a,b≥0, that is continuous and monotone increasing in the second and third arguments. Under these assumptions on F, problem (1.1) admits a unique (local) solution u.

### 3.1. Preliminary Error Bound

In order to remedy the suboptimal error estimates that would result from using (2.2) and (2.5) directly, we follow the approach proposed in  by introducing the reconstruction ˆU of U which is defined over each closed interval ¯Im, m=1,…,M, by

 ˆUm(t):=πmU(t−m−1)+∫ttm−1ΠrmmF(s,U)ds. (3.2)

For t∈¯Im, m=1,…,M, we define the error e(t):=u(t)−U(t) where U is either the hp-cG solution from (2.1) or the hp-dG solution from (2.3). Since we will be dealing with the reconstruction ˆU, it will also be necessary to introduce the reconstruction error given by ˆem(t):=u(t)−ˆUm(t), t∈¯Im, m=1,…,M. We will proceed with the error analysis by first proving an L∞-error bound for ˆem.

To formulate the error equation, we begin by substituting (1.1) into the definition of ˆem, viz.,

 ˆem(t)=u(tm−1)−ˆUm(t)+∫ttm−1F(s,u)ds. (3.3)

 ˆem(t)=ˆem(tm−1)+R(t)+∫ttm−1(F(s,u)−F(s,ˆUm))ds, (3.4)

where R denotes the residual given by

 R(t):=ˆUm(tm−1)−ˆUm(t)+∫ttm−1F(s,ˆUm)ds,t∈Im. (3.5)

Using the triangle inequality and Bochner’s Theorem implies that

 ∥ˆem(t)∥H≤∥ˆem(tm−1)∥H+∥R(t)∥H+∫ttm−1∥F(s,u)−F(s,ˆUm)∥Hds. (3.6)

Moreover, applying the local -Lipschitz estimate (3.1) together with the monotonicity of yields

 ∥ˆem(t)∥H≤∥ˆem(tm−1)∥H+ηresm+∫ttm−1L(s,∥ˆem∥H+∥ˆUm∥H,∥ˆUm∥H)∥ˆem∥Hds, (3.7)

for t∈¯Im. Here, ηresm denotes the residual estimator given by ηresm:=maxt∈¯Im∥R(t)∥H.

On the first interval, recalling that U(t−0)=u0, we can estimate ∥ˆe1(t0)∥H directly by ηproj0 where the projection estimator is given by

 ηprojm:=∥U(t−m)−πm+1U(t−m)∥H,m≥0. (3.8)

For later intervals, the unknown term ∥ˆem+1(tm)∥H needs to be replaced with the known term ∥ˆem(tm)∥H. To this end, for m≥1, we have

 ˆem+1(tm)−ˆem(tm)=ˆUm(tm)−ˆUm+1(tm)=πmU(t−m−1)−πm+1U(t−m)+∫ImΠrmmF(s,U)ds. (3.9)

Recall that the hp-cG method (2.1) satisfies the strong form (2.2) on Im. Thus, we have that

 U(t−m)=πmU(t−m−1)+∫ImΠrm−1mF(s,U)ds. (3.10)

By the definition of the L2-projection, an equivalent formulation of the above is

 U(t−m)=πmU(t−m−1)+∫ImΠrmmF(s,U)ds. (3.11)

Substituting this into (3.9) implies that for the hp-cG method we have

 ˆem+1(tm)−ˆem(tm)=U(t−m)−πm+1U(t−m). (3.12)

For the hp-dG method (2.3), we have the strong form (2.5) on Im. Thus, it follows that

 U(t−m)=U(t+m−1)+∫ImΠrmmF(s,U)ds−∫Im(Lrmmπm[[U]]m−1)(s)ds. (3.13)

By definition of the lifting operator Lrmm from (2.4), we arrive at

 U(t−m)=U(t+m−1)−πm[[U]]m−1+∫ImΠrmmF(s,U)ds. (3.14)

Equivalently,

 U(t−m)=πmU(t−m−1)+∫ImΠrmmF(s,U)ds. (3.15)

Since this is of the same form as for the hp-cG method, for the hp-dG method we also have that

 ˆem+1(tm)−ˆem(tm)=U(t−m)−πm+1U(t−m). (3.16)

Applying the triangle inequality to (3.12) and (3.16) yields

 ∥ˆem+1(tm)∥H≤∥ˆem(tm)∥H+ηprojm, (3.17)

with ηprojm from (3.8). Substituting this result into (3.7) gives

 ∥ˆem(t)∥H≤ψm+∫ttm−1L(s,∥ˆem∥H+∥ˆUm∥H,∥ˆUm∥H)∥ˆem∥Hds, (3.18)

where ψm is chosen such that

 ψm≥{ηprojm−1+ηresmif m=1∥ˆem−1(tm−1)∥H+ηprojm−1+ηresmif m≠1. (3.19)

Finally, applying Gronwall’s inequality to (3.18) for t∈¯Im, m=1,…,M, yields the following result.

###### Proposition 3.20.

For the cG and dG time stepping schemes (2.1) and (2.3), respectively, there holds the error bound

 ∥ˆem(t)∥H≤Gm(t)ψm,t∈¯Im,1≤m≤M, (3.21)

where ψm satisfies (3.19) and Gm is given by

 Gm(t):=exp(∫ttm−1L(s,∥ˆem∥H+∥ˆUm∥H,∥ˆUm∥H)ds). (3.22)

### 3.2. Continuation Argument

In order to transform (3.21) into an a posteriori error bound, we will employ a continuation argument, cf., e.g., . To this end, for m=1,…,M, we define the set

 Im:={t∈¯Im:∥ˆem∥C([tm−1,t];H)≤δψm}, (3.23)

where δ>1 is a parameter to be chosen. Note that the set Im is closed since ˆem is time-continuous and, obviously, bounded. Moreover, Im is assumed to be non-empty; this is certainly true for the first interval since ∥ˆe1(t0)∥H≤ψ1<δψ1 and is a posteriori verifiable for later intervals. To state the full error bound, we require some additional definitions. Specifically, define the function φm by

 φm(δ):=exp(∫ImL(s,δψm+∥ˆUm∥H,∥ˆUm∥H)ds)−δ. (3.24)

Additionally, if it exists, we introduce

 δm:=inf{δ>1:φm(δ)<0}∈[1,∞), (3.25)

for m=1,…,M and we let δ0:=1 for convenience.

We are now ready to establish the following conditional a posteriori error bound for both Galerkin time stepping methods.

###### Theorem 3.26.

For any 1≤m≤M, if δ1,…,δm exist then the hp-cG scheme (2.1) and the hp-dG scheme (2.3) satisfy the a posteriori error bound

 ∥ˆem∥L∞(Im;H)≤δmψm, (3.27)

on ¯Im.

###### Proof.

We omit the trivial case ψm=0. Since δm exists, there is some δ>δm with φm(δ)<0. Suppose that ψm exists; then ∥ˆem(tm−1)∥H≤ψm<δψm so the set Im from (3.23) is non-empty, closed and bounded and thus has some maximal time t⋆∈¯Im. Let us first make the assumption that t⋆<tm and work towards a contradiction. Substituting the error bound from (3.23) into (3.21) and using the monotonicity of L implies that

 ∥ˆem∥C([tm−1,t⋆];H)≤exp(∫ImL(s,δψm+∥ˆUm∥H,∥ˆUm∥H)ds)ψm=(φm(δ)+δ)ψm<δψm.

This is a contradiction since we had assumed that the set Im had maximal time t⋆<tm. Hence, it follows that t⋆=tm. Taking the limit δ↘δm we deduce (3.27). To complete the proof, we note that we can conclude by recursion that ψm exists if δ1,…,δm−1 exist and ψ1 exists. Since ψ1 exists trivially and δ1,…,δm exist by premise, the original assumption that ψm exists is unneeded. ∎

An important question arises with regard to Theorem 3.26: is it possible that δm never exists for certain nonlinearities F? The following lemma provides an answer to this question.

###### Lemma 3.28.

For any 1≤m≤M and any ϵ>0, if δ1,…,δm−1 exist and the time step length km is chosen sufficiently small then the set {δ>1:φm(δ)<0} is non-empty, δm from (3.25) exists and δm≤1+ϵ.

###### Proof.

Since exist, Theorem 3.26 implies that exists and thus is well-defined. For fixed and upon setting , we can choose  small enough so that

 exp(∫ImL(s,δ⋆ψm+∥ˆUm∥H,∥ˆUm∥H)ds)<1+ϵ.

A quick calculation reveals that φm(δ⋆)<0. Therefore, for km sufficiently small, the set {δ>1:φm(δ)<0} is non-empty. Furthermore, since φm is continuous and φm(δ⋆)<0, it follows that δm exists and satisfies δm≤δ⋆. ∎

In some sense, ˆU is a better approximation to u than U; thus, from a practical standpoint, it is often best to use Theorem 3.26 directly. However, for some applications a bound on the error e rather than on the reconstruction error ˆem may be required, so we introduce the following corollary.

###### Corollary 3.29.

For any 1≤m≤M, if δ1,…,δm exist then the hp-cG scheme (2.1) and the hp-dG scheme (2.3) satisfy the a posteriori error bound

 ∥e∥L∞(Im;H)≤δmψm+∥U−ˆUm∥L∞(Im;H), (3.30)

on ¯Im.

###### Proof.

The bound follows directly from rewriting the error, viz., e=(u−ˆUm)+(ˆUm−U), the triangle inequality and the reconstruction error bound of Theorem 3.26. ∎

### 3.3. Computable Error Bound

In order to yield fully computable error bounds we must give an explicit characterization of ψm from (3.19). Theorem 3.26 implies that

 ∥ˆem−1(tm−1)∥H≤∥ˆem−1∥L∞(Im−1;H)≤δm−1ψm−1. (3.31)

Thus, we can define  by

 ψm:={ηprojm−1+ηresmif m=1δm−1ψm−1+ηprojm−1+ηresmif m≠1. (3.32)

Defining ψm in this way yields a recursive error estimator and makes the error bounds of Theorem 3.26 and Corollary 3.29 fully computable.
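For illustration, the recursion (3.32) can be evaluated as in the following sketch (our own, with hypothetical input lists; the estimator values are assumed to be already computed):

```python
def psi_sequence(eta_proj, eta_res, delta):
    """Evaluate psi_m from (3.32) for m = 1..M.

    0-based inputs: eta_proj[m-1] = eta^proj_{m-1}, eta_res[m-1] = eta^res_m,
    delta[m-1] = delta_m.  Returns [psi_1, ..., psi_M]."""
    psi = []
    for m in range(1, len(eta_res) + 1):
        carry = delta[m - 2] * psi[-1] if m > 1 else 0.0  # delta_{m-1} psi_{m-1}
        psi.append(carry + eta_proj[m - 1] + eta_res[m - 1])
    return psi
```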

In order to develop adaptive algorithms that can approximate the blow-up time of a nonlinear initial value problem, it is important to interpret the role that δm plays in the error bound of Theorem 3.26. Recalling the bound ∥ˆem∥L∞(Im;H)≤δmψm and applying (3.32), we see that the reconstruction error on Im satisfies

 ∥ˆem∥L∞(Im;H)≤δm(δm−1ψm−1+ηprojm−1+ηresm). (3.33)

Of the three components that make up the error estimator, the term δm−1ψm−1 is a bound for the error on the previous time step while ηprojm−1+ηresm represents the local (additive) contribution to the error estimator on the current time step. The correct interpretation of δm, then, is that it is an a posteriori approximation to the growth rate of the error on Im. In fact, the following corollary shows that δm is the expected local growth rate for globally H-Lipschitz F.

###### Corollary 3.34.

Suppose that F is globally H-Lipschitz with constant L>0; then δm=eLkm and, thus, the error bound of Theorem 3.26 holds unconditionally on ¯Im, viz.,

 ∥ˆem∥L∞(Im;H)≤eLkmψm. (3.35)
###### Proof.

From the definition of the function φm, it follows that φm(δ)=eLkm−δ. Therefore, δm=eLkm. Since δm exists and is finite for any km>0, we conclude that the error bound of Theorem 3.26 holds unconditionally on ¯Im. ∎

## 4. Adaptive Algorithms

The error estimators derived in the previous section will form the basis of an adaptive strategy to estimate the blow-up time of (1.1). In particular, we will consider both an h-adaptive approach and an hp-adaptive approach. For the remainder of this section, we assume that Hm=H, m=1,…,M, for simplicity but remark that both adaptive algorithms can be easily modified to account for variable finite-dimensional subspaces.

The first difficulty encountered in the construction of an h-adaptive algorithm is that both (2.1) and (2.3) are implicit methods, which means that the existence of a numerical solution is not guaranteed. It is tempting to conduct an existence analysis such as in  to obtain bounds on the length of the time steps needed in order to yield a numerical solution; however, such bounds are inherently pessimistic. Since existence can be determined a posteriori, our h-adaptive algorithm just reduces the time step length until a numerical solution exists.

The second difficulty is how to approximate δm; unfortunately, it is impossible to give a precise characterization for how to do this for any given F, primarily because φm may be ‘pathological’. Fortunately, however, most nonlinearities of interest do not fall into this category. Suppose that F is chosen such that φm has, at most, a small finite number of zeros; then it should be possible to approximate the zeros of φm numerically. Since δm satisfies φm(δm)=0, we then only need to check the roots of φm and verify that one of our numerical approximations satisfies the defining property (3.25). In our numerical experiments, we employ a Newton method to find δm with an initial guess close to one on I1 (the proof of Lemma 3.28 implies that δm≤1+ϵ for km sufficiently small) and an initial guess close to δm−1 on Im for m≥2; this approach works well for the studied problems.
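This root-finding step can be sketched as follows (our illustration only; the frozen-coefficient φm below is a simplifying assumption, taking L(t,a,b)=a+b as for F(u)=u² and treating ∥ˆUm∥H as a constant w on Im):

```python
import math

def delta_newton(phi, dphi, guess=1.0, iters=100, tol=1e-12):
    """Approximate delta_m, the smallest root of phi_m beyond 1, by Newton's
    method started close to one, as suggested by Lemma 3.28 for small k_m."""
    d = guess
    for _ in range(iters):
        step = phi(d) / dphi(d)
        d -= step
        if abs(step) < tol:
            break
    return d

# Illustrative frozen-coefficient phi_m with k = k_m, psi = psi_m, w = ||U_m||:
# phi(delta) = exp(k*(delta*psi + 2*w)) - delta.
k, psi, w = 0.01, 0.1, 1.0
phi = lambda d: math.exp(k * (d * psi + 2 * w)) - d
dphi = lambda d: k * psi * math.exp(k * (d * psi + 2 * w)) - 1.0
```

Since φm is convex here with φm(1)>0, Newton started near one converges to the smaller of the two roots, which is the infimum in (3.25).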

As is standard in adaptive finite element algorithms for linear problems, the time step is halved and the numerical solution recomputed until ηresm is below the tolerance; however, we must also account for the nonlinear scaling that enters through δm. The structure of the error estimator implies that the interval I1 is the most important since the residual estimator on I1 propagates through the remainder of the error estimator. Similarly, each successive subinterval is less important than the previous subinterval with the term δm measuring the extent to which this is true. To account for this, we increase the tolerance by the factor δm after computations on Im are complete.

We then advance to the next interval using the previous time step length as a reference and continue in this way until δm no longer exists; the adaptive algorithm is then terminated and it outputs the total number of degrees of freedom (DoFs) used along with the sum of all the time step lengths, ∑m km, as an estimate for the blow-up time T∞. The pseudocode for the h-adaptive algorithm is given in Algorithm 1.
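The overall control flow just described can be sketched as follows (a structural sketch only, not the paper's Algorithm 1; the callbacks `solve_step` and `estimators` are hypothetical stand-ins for the Galerkin solver and the estimators of Section 3):

```python
def h_adaptive_blowup(solve_step, estimators, tol0, k0, max_steps=100000):
    """Structural sketch of the h-adaptive algorithm.

    solve_step(t, k)    -> discrete solution on (t, t+k), or None
    estimators(t, U, k) -> (eta_res, delta_m or None)
    Returns the sum of accepted step lengths as the blow-up time estimate."""
    t, tol, k = 0.0, tol0, k0
    for _ in range(max_steps):
        U = solve_step(t, k)
        if U is None:            # no discrete solution: reduce the step
            k /= 2
            continue
        eta_res, delta = estimators(t, U, k)
        if delta is None:        # delta_m no longer exists: terminate
            break
        if eta_res > tol:        # residual estimator too large: bisect k_m
            k /= 2
            continue
        t += k                   # accept the step; keep k as reference
        tol *= delta             # rescale the tolerance by the factor delta_m
    return t
```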

The basic outline of the hp-adaptive algorithm will be the same as that of the h-adaptive algorithm; the only difficulty lies in how we choose locally between h-refinement and p-refinement. The theory of the hp-FEM suggests that p-refinement is superior to h-refinement if the solution is ‘smooth enough’, so we perform p-refinement if U is smooth; otherwise, we do h-refinement. The pseudocode for the hp-adaptive algorithm is very similar to Algorithm 1; the difference lies in replacing the simple time step bisections in lines (8:) and (19:) by the following procedure:

if U is smooth then
 Set rm ← rm + 1.
else
 Set km ← km/2.
end if
Compute U on Im and determine δm.

In the numerical experiments, we consider only real-valued ODEs, and so we remark specifically on the estimation of smoothness in this special case. There are many ways to estimate smoothness of the numerical solution (see  for an overview); we choose to use a computationally simple approach from  based on Sobolev embeddings. Here, the basic idea is to monitor the constant in the Sobolev embedding H1(Im)↪L∞(Im) in order to classify a given function as locally smooth or not. Specifically, we define the smoothness indicator θm by

 θm[u]:=∥u∥L∞(Im)(km^{−1/2}∥u∥L2(Im)+(1/√2)km^{1/2}∥u′∥L2(Im))^{−1} if u≢0, and θm[u]:=1 if u≡0, (4.1)

which, intuitively, takes values close to one if u is smooth and values close to zero otherwise; see the reasoning of  for details. Following , we characterize U as smooth if

 θm[d^{rm−1}U/dt^{rm−1}]≥θ⋆; (4.2)

here, suitable values of θ⋆ were observed to produce the best results in our numerical experiments below.
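A direct numerical evaluation of (4.1) for real-valued functions might look as follows (our sketch; quadrature by the trapezoidal rule, with u and its derivative supplied as callables):

```python
import numpy as np

def smoothness_indicator(u, du, a, b, n=2001):
    """Smoothness indicator theta_m[u] from (4.1) on I_m = (a, b),
    approximated by trapezoidal quadrature; u, du are vectorized callables."""
    k = b - a
    t = np.linspace(a, b, n)
    ut, dut = u(t), du(t)
    trap = lambda f: np.sum((f[1:] + f[:-1]) * np.diff(t)) / 2
    linf = np.abs(ut).max()
    if linf == 0.0:
        return 1.0                       # the case u == 0 on I_m
    l2u = np.sqrt(trap(ut ** 2))         # ||u||_{L2(I_m)}
    l2du = np.sqrt(trap(dut ** 2))       # ||u'||_{L2(I_m)}
    return linf / (k ** -0.5 * l2u + k ** 0.5 * l2du / np.sqrt(2))
```

As expected, a constant function scores one, while a highly oscillatory function scores close to zero.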

We conclude this section with a corollary on the magnitude of the reconstruction error under either the h-adaptive algorithm or the hp-adaptive algorithm. In order to state the corollary, we require some additional notation. To that end, we denote the initial tolerance by tol⋆ and define the a posteriori approximation to the growth rate of the error up to and including Im by

 ˆδm:=m∏i=0δi.

We are now ready to state the corollary.

###### Corollary 4.3.

Suppose that ηresm≤ˆδm−1tol⋆ for all 1≤m≤M and that δ1,…,δM exist; then under the h-adaptive algorithm or the hp-adaptive algorithm, the reconstruction error satisfies

 max1≤m≤M∥ˆem∥L∞(Im;H)≤MˆδMtol⋆.
###### Proof.

To begin the proof, we will first prove by induction that ψm≤mˆδm−1tol⋆. For m=1, we have that ψ1=ηres1≤tol⋆=ˆδ0tol⋆ (note that ηprojm=0 for all m since Hm=H), so the base case is verified. Assuming that the bound is true for m−1, then recalling the definition of ψm from (3.32) as well as the scaling nature of the tolerances in the h-adaptive and hp-adaptive algorithms, that is, ηresm≤ˆδm−1tol⋆, we have

 ψm=δm−1ψm−1+ηresm≤(m−1)δm−1ˆδm−2tol⋆+ˆδm−1tol⋆=(m−1)ˆδm−1tol⋆+ˆδm−1tol⋆=mˆδm−1tol⋆.

Thus the stated bound holds for any 1≤m≤M. Theorem 3.26 combined with this bound yields

 max1≤m≤M∥ˆem∥L∞(Im;H)≤max1≤m≤Mδmψm≤max1≤m≤Mmˆδmtol⋆=MˆδMtol⋆.

This completes the proof. ∎

## 5. Numerical Experiments

In this section, we will apply the adaptive algorithms developed in the previous section to two real-valued initial value problems. In both numerical experiments, we approximate the implicit Galerkin methods (2.1) and (2.3) with an explicit Picard-type iteration; cf. .
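Such a Picard-type iteration can be sketched in the lowest-order dG case as follows (our own illustration with rm=0 and H=ℝ; the midpoint quadrature and the divergence guard are simplifying choices of this sketch):

```python
def dg0_step_picard(F, t0, u0, k, iters=50, tol=1e-12):
    """One dG(0) step for u' = F(t, u) on (t0, t0 + k), solved by Picard
    iteration.  With a constant trial function the scheme reads
    U_m = U_{m-1} + int_{I_m} F(t, U_m) dt; midpoint quadrature is used.
    Returns None if the iteration diverges or fails to converge, mimicking
    the a posteriori detection of non-existence of a discrete solution."""
    u = u0
    for _ in range(iters):
        u_new = u0 + k * F(t0 + k / 2, u)
        if not abs(u_new) < 1e150:       # divergence guard
            return None
        if abs(u_new - u) < tol:
            return u_new
        u = u_new
    return None
```

For small k the iteration converges to the discrete solution, while for large k (e.g. near blow-up) it fails, which is exactly the signal the adaptive algorithms use to halve the time step.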

### 5.1. Example 1

In this numerical example, we consider (1.1) with the polynomial nonlinearity F(u)=u2. Through separation of variables, the exact solution is given by

 u(t)=u01−u0t,

which has blow-up time given by T∞=1/u0 for u0>0. Note that for any v1,v2∈ℝ, the nonlinearity satisfies

 |F(v1)−F(v2)|=|v21−v22|≤|v1−v2|(|v1|+|v2|), (5.1)

so we set