# An hp-Adaptive Newton-Galerkin Finite Element Procedure for Semilinear Boundary Value Problems

Mario Amrein Lucerne University of Applied Sciences and Arts, CH-6002 Luzern, Switzerland Jens M. Melenk Institut für Analysis und Scientific Computing, TU Wien, A-1040 Wien, Austria  and  Thomas P. Wihler Mathematics Institute, University of Bern, CH-3012 Switzerland
###### Abstract.

In this paper we develop an -adaptive procedure for the numerical solution of general, semilinear elliptic boundary value problems in 1d, with possible singular perturbations. Our approach combines both a prediction-type adaptive Newton method and an -version adaptive finite element discretization (based on a robust a posteriori residual analysis), thereby leading to a fully -adaptive Newton-Galerkin scheme. Numerical experiments underline the robustness and reliability of the proposed approach for various examples.

###### 2010 Mathematics Subject Classification:
49M15,58C15,65N30
TW was supported by the Swiss National Science Foundation (SNF), Grant No. 200021-162990.

## 1. Introduction

The purpose of this paper is the development of a numerical approximation procedure for semilinear elliptic boundary value problems, with possible singular perturbations. More precisely, for a fixed parameter  (possibly with ), and a continuously differentiable function , we consider the problem of finding a function  in an interval , , which satisfies

 −ϵu′′(x) =f(x,u(x)) for x∈Ω,u(a)=u(b)=0. (1.1)

Solutions of (1.1) are typically not unique (even infinitely many solutions may exist), and, in the singularly perturbed case, may exhibit boundary layers, interior shocks, and (multiple) spikes. The existence of multiple solutions due to the nonlinearity of the problem and/or the appearance of singular effects constitute two challenging issues when solving problems of this type numerically; see, e.g.,[23, 27].

Problem formulation and notation: Henceforth, we suppose that a (not necessarily unique) solution  of (1.1) exists; here, we denote by the standard Sobolev space of functions in  with zero trace on . Furthermore, signifying by  the dual space of , and upon defining the map through

 ⟨Fϵ(u),v⟩:=∫Ω{ϵu′v′−f(u)v}dx∀v∈X, (1.2)

where is the dual product in , the above problem (1.1) can be written as a nonlinear operator equation in :

 u∈X:Fϵ(u)=0. (1.3)

In addition, for open subsets , we introduce the norm

 |||u|||ϵ,D:=(ϵ∥∥u′∥∥20,D+∥u∥20,D)\nicefrac12,

where  denotes the -norm on . Note that, in the case of , with given , i.e., when (1.1) is linear and strongly elliptic, the norm is a natural energy norm on . Frequently, for , the subindex ‘’ will be omitted. Furthermore, the associated dual norm of  from (1.2) is given by

 |||Fϵ(u)|||X′,ϵ=supv∈X|||v|||ϵ=1∫Ω{ϵu′v′−f(u)v}dx.

Throughout this work we shall use the abbreviation to mean , for a constant independent of the mesh size , the local polynomial degree , and of .

Outline: In Section 2 we will consider the Newton method within the context of dynamical systems in general Banach spaces, and present a prediction strategy for controlling the Newton step size parameter. Furthermore, Section 3 focuses on the application of the Newton methodology to semilinear elliptic problems as well as on the discretization of the problems under consideration by -finite element methods; in addition, we derive an -robust a posteriori residual analysis. An -adaptive procedure and a series of numerical experiments illustrating the performance of the proposed idea will be presented in Section 5. Finally, we summarize our findings in Section 6.

## 2. Adaptive Newton Methods in Banach Spaces

In the following section we shall briefly revisit an adaptive prediction-type Newton algorithm from [3].

### 2.1. Abstract Framework

Let be two Banach spaces, with norms  and , respectively. Given an open subset , and a (possibly nonlinear) operator , we consider the nonlinear operator equation

 F(u)=0, (2.1)

for some unknown zeros . Supposing that the Fréchet derivative  of  exists in  (or in a suitable subset), the classical Newton method for solving (2.1) starts from an initial guess , and generates the iterates

 un+1=un+δn,n≥0, (2.2)

where the update  is implicitly given by the linear equation

 F′(un)δn=−F(un),n≥0.

Naturally, for this iteration to be well-defined, we need to assume that  is invertible for all , and that .

### 2.2. An Adaptive Prediction Strategy

In order to improve the reliability of the Newton method (2.2) in the case that the initial guess  is relatively far away from a root of , , introducing some damping in the Newton method is a well-known remedy. In that case (2.2) is rewritten as

 un+1=un−ΔtnF′(un)−1F(un),n≥0, (2.3)

where , , is a damping parameter that may be adjusted adaptively in each iteration step.

Provided that  is invertible on a suitable subset of , we define the Newton transform by

 u↦NF(u):=−F′(u)−1F(u).

Then, rearranging terms in (2.3), we notice that

 un+1−unΔtn=NF(un),n≥0,

i.e., (2.3) can be seen as the discretization of the Davydenko-type system,

 ˙u(t)=NF(u(t)),t≥0,u(0)=u0, (2.4)

by the forward Euler scheme, with step size .

For , the solution  of (2.4), if it exists, defines a trajectory in  that starts at , and that will potentially converge to a zero of  as . Indeed, this can be seen (formally) from the integral form of (2.4), that is,

 F(u(t))=F(u0)e−t,t≥0,

which implies that  as .

Now taking the view of dynamical systems, our goal is to compute an upper bound for the value of the step sizes  from (2.3), , so that the discrete forward Euler solution  from (2.3) stays reasonably close to the continuous solution of (2.4). Specifically, for a prescribed tolerance , a Taylor expansion analysis (see [3, Section 2] for details) reveals that

 u(t)=u0+tNF(u0)+t22hηh+O(t3)+O(t2h∥NF(u0)∥2X),

where, for any sufficiently small , we let . Hence, after the first time step of length  there holds

 u(Δt0)−u1=Δt202hηh+O(Δt30)+O(Δt20h∥NF(u0)∥2X), (2.5)

where  is the forward Euler solution from (2.3). Therefore, upon setting

 Δt0=√2τh∥ηh∥−1X,

we arrive at

 ∥u(Δt0)−u1∥X≤τ+O(Δt30)+O(Δt20h∥NF(u0)∥2X).

In order to balance the -terms in (2.5) it is sensible to make the choice

 h=O(Δt0∥NF(u0)∥−2X),

i.e.,

 h=γΔt0∥NF(u0)∥−2X, (2.6)

for some parameter . This leads to the following adaptive Newton algorithm.

###### Algorithm 2.1.

Fix a tolerance as well as a parameter , and set .

1:Start the Newton iteration with an initial guess .
2:if  then choose based on [3, Algorithm 2.1],
3:else if  then
4:     let , and based on (2.6);
5:     define the Newton step size
 Δtn=min{√2τhn∥NF(un+hnNF(un))−NF(un)∥−1X,1}. (2.7)
6:end if
7:Compute  based on the Newton iteration (2.3), and go to (3:) with .
###### Remark 2.2.

1. We remark that the purpose of the above derivations is to work under minimal structural assumptions on the dynamics caused by the Newton transform. This is particularly important in the context of nonlinear singularly perturbed problems (with ): Indeed, for such problems (as ) the inverse of the derivative of the associated nonlinear operator will typically become highly irregular (even on a local level), and, for instance, local or global Lipschitz type or boundedness assumptions (involving, e.g., second derivative information) will in general not be available with practically feasible constants. We remark, however, that more sophisticated step size prediction schemes can be derived if the structure of the nonlinearities can be exploited further; see, in particular, the monograph [7].

2. The minimum in (2.7) ensures that the step size  is chosen to be 1 whenever possible. Indeed, this is required in order to guarantee quadratic convergence of the Newton iteration close to a root (provided that the root is simple).

3. The preset tolerance  in the above adaptive strategy will typically be fixed a priori. Here, for highly nonlinear problems featuring numerous or even infinitely many solutions, it is typically mandatory to select  small in order to remain within the attractor of the given initial guess. This is particularly important if the starting value is relatively far away from a solution.

## 3. Newton-Galerkin hp-FEM for Semilinear Elliptic Problems

The purpose of this section is to derive an -version Newton-Galerkin finite element formulation for the solution of (1.1). To this end, we will apply the abstract adaptive Newton framework from the previous Section 2, and apply an -discretization methodology for the resulting sequence of (locally) linearized problems.

### 3.1. Newton Linearization

In order to apply an adaptive Newton method as introduced in Section 2 to the nonlinear problem (1.3), note that the Fréchet-derivative of from (1.3) at  is given by

 ⟨F′ϵ(u)w,v⟩=∫Ω{ϵw′v′−f′(u)wv}dx,v,w∈X=H10(Ω),

where we write . We note that, if , then is a well-defined linear and bounded mapping from  to ; see [3, Lemma A.1].

Then, given an initial guess  for (1.3), the Newton method (2.3) is to find  from , , such that

 F′ϵ(un)(un+1−un)=−ΔtnFϵ(un),

in . Equivalently,

 aϵ(un;un+1,v)=aϵ(un;un,v)−Δtnℓϵ(un;v)∀v∈X, (3.1)

where, for fixed ,

 aϵ(u;w,v) :=∫Ω{ϵw′v′−f′(u)wv}dx, lϵ(u;v) :=∫Ω{ϵu′v′−f(u)v}dx

are bilinear and linear forms on  and , respectively.

###### Remark 3.1.

Incidentally, if there are constants  with such that holds for all , where is the (best) constant in the Poincaré inequality on , i.e.,

 ∥w∥0≤CP∥w′∥0∀w∈X, (3.2)

see [14, Theorem 257], then, it is elementary to show that the formulation (3.1) is a linear second-order diffusion-reaction problem, which is coercive on . In consequence, for any given , it has a unique solution ; see [1] for details.

### 3.2. Linearized Finite Element Approach

In order to provide a numerical approximation of (1.1), we will discretize the weak formulation (3.1) by means of an -finite element method, which, in combination with the Newton iteration, constitutes a Newton-Galerkin approximation scheme. Furthermore, in view of deriving an -adaptive algorithm, we will develop an -robust -version a posteriori residual analysis for the linear finite element formulation.

#### 3.2.1. hp-Spaces

We introduce a partition  of  (open) elements , , on , with . The length of an element  is denoted by , . For each element , it will be convenient to introduce the patch as the union of  and of the elements adjacent to it. In addition, to each element  we associate a polynomial degree , . These numbers are stored in a polynomial degree vector . Then, we define an -finite element space by

 Vhp(T,p)={v∈H10(Ω):v|Kj∈Ppj(Kj),j=1,2,…,N},

where, for , we denote by  the space of all polynomials of degree at most . We say that the pair of a partition and of a degree vector is -shape regular, for some constant  independent of , if

 μ−1hj+1≤hj≤μhj+1,μ−1pj+1≤pj≤μpj+1,j=1,…,N−1, (3.3)

i.e., if both the element sizes and polynomial degrees of neighboring elements are comparable.

#### 3.2.2. Linear hp-Fem

We define the numerical approximation of (1.1) in an iterative way as follows: For , we let  be an initial -discretization space, and a suitable initial guess for the Newton iteration. Furthermore, for , and given , we find a finite element approximation  of (3.1) as the solution of the weak formulation

 aϵ(uhpn;uhpn+1,v)=aϵ(uhpn;uhpn,v)−Δtℓϵ(uhpn;v)∀v∈Vhp(T,p). (3.4)

Here, takes the role of a (possibly varying) step size parameter in the adaptive Newton scheme as described earlier in Section 2.2.

## 4. A Posteriori Residual Analysis

Recalling (3.4), for given , let us introduce the function

 u(Δt,hp)n+1:=uhpn+1−(1−Δt)uhpn, (4.1)

as well as

 fΔt(uhpn+1):=Δtf(uhpn)+f′(uhpn)(uhpn+1−uhpn). (4.2)

Then, rearranging terms, we can rewrite (3.4) in the following form:

 ϵ∫Ω(u(Δt,hp)n+1)′v′dx =∫ΩfΔt(uhpn+1)vdx∀v∈Vh0. (4.3)

The aim of this section is to derive a posteriori residual-based bounds for (4.3). In order to measure the discrepancy between the finite element discretization (3.4) and the original problem (1.1), an obvious quantity to bound is the residual  in . In order to proceed in this direction, we notice that the adaptively chosen damping parameter  in the Newton method (3.4) will equal 1 sufficiently close to a root of . For this reason, as was proposed in [3], we may focus on the ‘shifted’ residual  in  instead; indeed, estimating this quantity turns out to be very natural in view of the linearized finite element discretization (4.3).

### 4.1. Robust Interpolation Estimates

For the purpose of deriving an (upper) a posteriori residual estimate that is robust with respect to the singular perturbation parameter  (for ) as well as optimally scaled with respect to the local element sizes  and polynomial degrees , it is crucial to construct an interpolation operator that is simultaneously - and -stable. This will be accomplished in the current section; see Proposition 4.1 and Corollary 4.2 below.

###### Proposition 4.1.

Let the pair be -shape regular; see (3.3). Then, there exists an interpolation operator  such that, for any , and for all , there holds

 (4.4)

Here, is a constant that depends solely on ; in particular, it is independent of , , and of .

###### Proof.

Let us, without loss of generality, assume that . The result can be shown with the techniques employed in the higher-dimensional case in [17, Cor. 3.7]. In the present, one-dimensional case, a significantly simpler argument can be brought to bear, which allows us to make the paper self-contained. Let and and , be the standard piecewise linear hat functions associated with the nodes , . The extra nodes and define in a natural way the elements and . The (open) patches , , are given by the supports of the functions , i.e., .

Polynomial approximation (see, e.g., [19, Proposition A.2]) gives the existence of an interpolation operator that is uniformly (in ) stable, i.e., for all and has the following properties for :

 (p+1)∥v−Jpv∥L2(−1,1)+∥(v−Jpv)′∥L2(−1,1)≤C∥v′∥L2(−1,1).

Furthermore, if is antisymmetric with respect to the midpoint , then can be assumed to be antisymmetric as well, i.e., (this follows from studying the antisymmetric part of the original function ).

The approximation is now constructed with the aid of a “partition of unity argument” as described in [20, Theorem 2.1]. For and , extend anti-symmetrically, i.e., for and for . Then is defined on each patch , . For each patch , let (with the understanding and ). The above operator then induces for each patch by scaling an operator with the following properties:

 p′i+1hi∥v−Jiv∥L2(ωi)+∥(v−Jiv)′∥L2(ωi)≤C∥v′∥L2(ωi);

here, we have exploited the -shape regularity of the mesh. We note that and . Also, the operators are uniformly (in the polynomial degree) stable in . The approximation is now taken to be . The desired approximation properties follow now from [20, Theorem 2.1]. ∎

The above proposition implies the following bounds.

###### Corollary 4.2.

For , the interpolant from Proposition 4.1 satisfies

 ∥∥v−πVhp(T,p)v∥∥2L2(Kj)≤C2Iαj|||v|||2ϵ,˜Kj,j=1,2,…,N,

and

 ∣∣(v−πVhp(T,p)v)(xj)∣∣2≤C2Iγj(|||v|||2ϵ,˜Kj+|||v|||2ϵ,˜Kj+1),j=1,2,…,N−1.

Here, we let

 αj =min{ϵ−1h2jp−2j,1},j=1,…,N, (4.5)

and, with

 βj =αjh−1j+2√ϵ−1αj, (4.6)

we define

 γj=βjβj+1βj+βj+1, (4.7)

for , and . Moreover, is the constant from (4.4).

###### Proof.

We proceed along the lines of [26]. Using the bounds from Proposition 4.1, for each element , we have that

 ∥∥v−πVhp(T,p)v∥∥2L2(Kj) ≤C2Ih2jϵp2jϵ∥∥v′∥∥2L2(˜Kj).

Furthermore,

 ∥∥v−πVhp(T,p)v∥∥2L2(Kj) ≤C2I∥v∥2L2(˜Kj).

Combining these two estimates, yields the first bound.

In order to prove the second estimate, we apply, for , a multiplicative trace inequality; see [21, Lemma A.1]):

 ∣∣(v−πVhp(T,p)v)(xj)∣∣2 ≤h−1j∥∥v−πVhp(T,p)v∥∥2L2(Kj) +2∥∥v−πVhp(T,p)v∥∥L2(Kj)∥∥(v−πVhp(T,p)v)′∥∥L2(Kj).

Then, invoking the above bounds as well as the estimates from Proposition 4.1, we get

 ∣∣(v−πVhp(T,p)v)(xj)∣∣2 ≤C2I(αjh−1j|||v|||2ϵ,˜Kj+2√αj|||v|||ϵ,˜Kj∥∥v′∥∥L2(˜Kj)) ≤C2I(αjh−1j|||v|||2ϵ,˜Kj+2√ϵ−1αj|||v|||2ϵ,˜Kj) ≤C2Iβj|||v|||2ϵ,˜Kj,

with from (4.6). Since  is also a boundary point of , we similarly obtain that

 ∣∣(v−πVhp(T,p)v)(xj)∣∣2≤C2Iβj+1|||v|||2ϵ,˜Kj+1.

Therefore,

 ∣∣(v−πVhp(T,p)v)(xj)∣∣2 =βj+1βj+βj+1∣∣(v−πVhp(T,p)v)(xj)∣∣2 +βjβj+βj+1∣∣(v−πVhp(T,p)v)(xj)∣∣2 ≤C2Iγj(|||v|||2ϵ,˜Kj+|||v|||2ϵ,˜Kj+1),

with  from (4.7). Thus, we have shown the second estimate. ∎

### 4.2. Robust A Posteriori Residual Estimate

Based on the previous interpolation analysis in Section 4.1, we are now in a position to prove the following main result.

###### Theorem 4.3.

For  from (4.1) there holds the upper a posteriori bound

 ∣∣∣∣∣∣Fϵ(u(Δt,hp)n+1)∣∣∣∣∣∣2X′,ϵ≼δ2n,Ω+N∑j=1η2n,j, (4.8)

with

 (4.9)

for , and

 δn,Ω:=∥∥fΔt(uhpn+1)−f(u(Δt,hp)n+1)∥∥0,Ω. (4.10)

Here, and  are defined in (4.5) and (4.7), respectively, and we set .

###### Remark 4.4.

We emphasize that the constants  and  appearing in the error indicators  from (4.9) remain bounded as  (and ). In addition, we note that , and that .

###### Remark 4.5.

For linear problems, the (adaptive) Newton linearization is redundant. In particular, upon setting , we infer that  in (4.8), and the residual is bounded by the local error indicators  only. These quantities, in turn, can be exploited for the purpose of designing an -adaptive mesh refinement strategy, as outlined, for example, in Section 5.1 below. We emphasize that this approach has been studied earlier in the report [21], where a number of numerical tests in the context of -adaptive FEM for singularly perturbed linear problems have been performed; in particular, these experiments have underlined the robustness of the bound (4.8) as  in the linear case. In Section 5, we will extend this idea by proposing a suitable interplay between -mesh refinements and the (adaptive) Newton linearization from Algorithm 2.1.

###### Proof of Theorem 4.3.

Given  and  from (4.1) and (4.2), respectively, and any , there holds the identity

 (4.11)

where

 aj 1≤j≤N−1, bj :=∫Kj{fΔt(uhpn+1)+ϵ(u(Δt,hp)n+1)′′}(πVhp(T,p)v−v)dx, 1≤j≤N, (4.12) cj :=∫Kj{fΔt(uhpn+1)−f(u(Δt,hp)n+1)}vdx, 1≤j≤N;

see [3, Proposition 4.3].

We estimate the terms , , and individually. First let . Then, from (4.12) can be estimated using Corollary 4.2 as follows:

 |aj|

Applying the Cauchy-Schwarz inequality leads to

 ∣∣ ∣∣N−1∑j=1aj∣∣ ∣∣

Furthermore, again using Corollary 4.2, we see that

 ∣∣ ∣∣N∑j=1bj∣∣ ∣∣ ≼N∑j=0α\nicefrac12j∥∥fΔt(uhpn+1)+ϵ(u(Δt,hp)n+1)′′∥∥0,Kj|||v|||ϵ,˜Kj ≼(N∑j=1αj∥∥fΔt(uhpn+1)+ϵ(u(Δt,hp)n+1)′′∥∥20,Kj)\nicefrac12(N∑j=1|||v|||2ϵ,˜Kj)\nicefrac12 ≼(N∑j=1αj∥∥fΔt(uhpn+1)+ϵ(u(Δt,hp)n+1)′′∥∥20,Kj)\nicefrac12|||v|||ϵ.

Similarly, there holds

 ∣∣ ∣∣N∑j=1cj∣∣ ∣∣ ≼N∑j=1∥∥fΔt(uhpn+1)−f(u(Δt,hp)n+1)∥∥0,Kj∥v∥0,Kj ≼(N∑j=1∥∥fΔt(uhpn+1)−f(u(Δt,hp)n+1)∥∥20,Kj)\nicefrac12|||v|||ϵ.

Now, applying the Cauchy-Schwarz inequality to (4.11) we see that

 ∣∣⟨Fϵ(u(Δt,hp)n+1),v⟩∣∣ ≼(δ2n,Ω+N∑j=1η2n,j)\nicefrac12|||v|||ϵ.

Dividing by , and taking the supremum for all , completes the proof. ∎

###### Remark 4.6.

Suppose that there exist constants  and , with  being the Poincaré constant from (3.2), such that

 (f(x)−f(y))(x−y)≤−λ(x−y)2,|f(x)−f(y)|≤L|x−y|,

for all . Then, the residual norm and the energy norm of the error are equivalent; see [3, Theorem 4.5] for details.

###### Remark 4.7.

Following along the lines of [3, §4.4.2] and [21, Appendix], it is possible to prove -robust local lower residual bounds in terms of the error indicators  and the data oscillation terms. This approach, however, results in local efficiency bounds that will be slightly suboptimal with respect to the local polynomial degrees due to the need of applying -dependent norm equivalences (involving cut-off functions).

## 5. An hp-Adaptive Newton-Galerkin Procedure

In this section, we will discuss how the a posteriori bound from Theorem 4.3 can be exploited for the purpose of designing an -adaptive Newton-Galerkin algorithm for the numerical solution of (1.1). We will begin by revisiting an -adaptive testing strategy from [11] (see also [28, 12]).

In order to -refine the -finite element space , we shall apply an -adaptive algorithm which is based on the following two ingredients.

#### (a) Element marking:

The elementwise error indicators  from Theorem 4.3 are employed in order to mark elements for refinement. More precisely, we fix a parameter , and select elements to be refined according to the Dörfler marking criterion:

 ϑN∑j=1η2n,j≤M∑j′=1η2n,j′. (D)

Here, the indices  are chosen such that the error indicators  from (4.9) are sorted in descending order, and  is minimal.

#### (b) hp-refinement criterion:

The decision of whether a marked element in step (a) is refined with respect to  (element bisection) or  (increasing the local polynomial order by 1) is based on a smoothness testing approach. Specifically, if the (numerical) solution is considered smooth on a marked element , then the polynomial degree is increased by 1 on that particular element (no element bisection), otherwise the element is bisected (retaining the current polynomial degree  on both subelements). In order to evaluate the smoothness of the solution  on a marked element , we employ an elementwise smoothness indicator as introduced in [11, Eq. (3)],

 (F)

if