hp-Adaptive Newton-Galerkin FEM

An -Adaptive Newton-Galerkin Finite Element Procedure for Semilinear Boundary Value Problems

Mario Amrein Lucerne University of Applied Sciences and Arts, CH-6002 Luzern, Switzerland Jens M. Melenk Institut für Analysis und Scientific Computing, TU Wien, A-1040 Wien, Austria  and  Thomas P. Wihler Mathematics Institute, University of Bern, CH-3012 Switzerland

In this paper we develop an -adaptive procedure for the numerical solution of general, semilinear elliptic boundary value problems in 1d, with possible singular perturbations. Our approach combines both a prediction-type adaptive Newton method and an -version adaptive finite element discretization (based on a robust a posteriori residual analysis), thereby leading to a fully -adaptive Newton-Galerkin scheme. Numerical experiments underline the robustness and reliability of the proposed approach for various examples.

Key words and phrases:
Adaptive Newton methods, semilinear elliptic problems, singularly perturbed problems, adaptive finite element methods, -FEM, -adaptivity.
2010 Mathematics Subject Classification:
TW was supported by the Swiss National Science Foundation (SNF), Grant No. 200021-162990.

1. Introduction

The purpose of this paper is the development of a numerical approximation procedure for semilinear elliptic boundary value problems, with possible singular perturbations. More precisely, for a fixed parameter  (possibly with ), and a continuously differentiable function , we consider the problem of finding a function  in an interval , , which satisfies


Solutions of (1.1) are typically not unique (even infinitely many solutions may exist), and, in the singularly perturbed case, may exhibit boundary layers, interior shocks, and (multiple) spikes. The existence of multiple solutions due to the nonlinearity of the problem and/or the appearance of singular effects constitute two challenging issues when solving problems of this type numerically; see, e.g.,[23, 27].

Newton-Galerkin Methods: In order to address the numerical solution of nonlinear singularly perturbed problems of type (1.1), a Newton-Galerkin approach has recently been proposed in [2, 3]. It is built upon two main ideas: Firstly, adaptive Newton methods are applied on the continuous level in order to approximate the nonlinear differential equation (1.1) by a sequence of linearized problems. Secondly, the resulting linear problems are discretized by means of an adaptive finite element procedure which, in turn, is based on an -robust a posteriori error analysis. Then, providing an appropriate interplay between the adaptive Newton method and the adaptive finite element approach, by which we either perform a Newton step (if the Newton linearization effect dominates) or refine the current finite element mesh based on some suitable a posteriori error indicators (in case that the finite element discretization causes the main source of error), a fully adaptive Newton-Galerkin scheme is obtained; see also the related works on monotone nonlinear problems [4, 9, 6], or the articles [5, 13] dealing with modelling errors in linearized models. Our numerical results in [3] reveal that sensible solutions can be found even in the singularly perturbed case, and that our scheme is reliable in the sense that the adaptive Newton-Galerkin method is able to converge to solutions which preserve the qualitative structure of (reasonably chosen) initial guesses (see also [2, 24]).

-FEM and -adaptivity: The aim of the present paper is to extend our work in [3] from a low-order to an -adaptive Newton-Galerkin approach. For simplicity of exposition, we focus on the 1d case, however, we mention that an extension to the multi-d case is possible as well. Exploiting the -framework may potentially lead to exponential rates of convergence in the numerical approximation. For the purpose of -adaptivity, we carry out an -robust -version a posteriori residual analysis for the FEM discretizations of the Newton linearizations. We remark that using -adaptivity, in contrast to the adaptive low-order methodology developed in the previous works [9, 2, 3], introduces an additional aspect of adaptivity that may further improve the performance of the Newton-Galerkin approach. In the -literature, several adaptive strategies and algorithms have been proposed; see, e.g., the overview article [22] and the references therein. In particular, we point to the smoothness estimation techniques proposed in [8, 15, 16, 18], and the related approach [11, 12, 28] involving continuous Sobolev embeddings. The latter works will be applied in the current paper in order to drive the -adaptive finite element refinements; in combination with the adaptive Newton linearizations this will lead to a fully -adaptive Newton-Galerkin procedure.

Problem formulation and notation: Henceforth, we suppose that a (not necessarily unique) solution  of (1.1) exists; here, we denote by the standard Sobolev space of functions in  with zero trace on . Furthermore, signifying by  the dual space of , and upon defining the map through


where is the dual product in , the above problem (1.1) can be written as a nonlinear operator equation in :


In addition, for open subsets , we introduce the norm

where  denotes the -norm on . Note that, in the case of , with given , i.e., when (1.1) is linear and strongly elliptic, the norm is a natural energy norm on . Frequently, for , the subindex ‘’ will be omitted. Furthermore, the associated dual norm of  from (1.2) is given by

Throughout this work we shall use the abbreviation to mean , for a constant independent of the mesh size , the local polynomial degree , and of .

Outline: In Section 2 we will consider the Newton method within the context of dynamical systems in general Banach spaces, and present a prediction strategy for controlling the Newton step size parameter. Furthermore, Section 3 focuses on the application of the Newton methodology to semilinear elliptic problems as well as on the discretization of the problems under consideration by -finite element methods; in addition, we derive an -robust a posteriori residual analysis. An -adaptive procedure and a series of numerical experiments illustrating the performance of the proposed idea will be presented in Section 5. Finally, we summarize our findings in Section 6.

2. Adaptive Newton Methods in Banach Spaces

In the following section we shall briefly revisit an adaptive prediction-type Newton algorithm from [3].

2.1. Abstract Framework

Let be two Banach spaces, with norms  and , respectively. Given an open subset , and a (possibly nonlinear) operator , we consider the nonlinear operator equation


for some unknown zeros . Supposing that the Fréchet derivative  of  exists in  (or in a suitable subset), the classical Newton method for solving (2.1) starts from an initial guess , and generates the iterates


where the update  is implicitly given by the linear equation

Naturally, for this iteration to be well-defined, we need to assume that  is invertible for all , and that .

2.2. An Adaptive Prediction Strategy

In order to improve the reliability of the Newton method (2.2) in the case that the initial guess  is relatively far away from a root of , , introducing some damping in the Newton method is a well-known remedy. In that case (2.2) is rewritten as


where , , is a damping parameter that may be adjusted adaptively in each iteration step.

Provided that  is invertible on a suitable subset of , we define the Newton transform by

Then, rearranging terms in (2.3), we notice that

i.e., (2.3) can be seen as the discretization of the Davydenko-type system,


by the forward Euler scheme, with step size .

For , the solution  of (2.4), if it exists, defines a trajectory in  that starts at , and that will potentially converge to a zero of  as . Indeed, this can be seen (formally) from the integral form of (2.4), that is,

which implies that  as .

Now taking the view of dynamical systems, our goal is to compute an upper bound for the value of the step sizes  from (2.3), , so that the discrete forward Euler solution  from (2.3) stays reasonably close to the continuous solution of (2.4). Specifically, for a prescribed tolerance , a Taylor expansion analysis (see [3, Section 2] for details) reveals that

where, for any sufficiently small , we let . Hence, after the first time step of length  there holds


where  is the forward Euler solution from (2.3). Therefore, upon setting

we arrive at

In order to balance the -terms in (2.5) it is sensible to make the choice



for some parameter . This leads to the following adaptive Newton algorithm.

Algorithm 2.1.

Fix a tolerance as well as a parameter , and set .

1:Start the Newton iteration with an initial guess .
2:if  then choose
based on [3, Algorithm 2.1],
3:else if  then
4:     let , and based on (2.6);
5:     define the Newton step size
6:end if
7:Compute  based on the Newton iteration (2.3), and go to (3:) with .
Remark 2.2.

The following comments are noteworthy.

  1. We remark that the purpose of the above derivations is to work under minimal structural assumptions on the dynamics caused by the Newton transform. This is particularly important in the context of nonlinear singularly perturbed problems (with ): Indeed, for such problems (as ) the inverse of the derivative of the associated nonlinear operator will typically become highly irregular (even on a local level), and, for instance, local or global Lipschitz type or boundedness assumptions (involving, e.g., second derivative information) will in general not be available with practically feasible constants. We remark, however, that more sophisticated step size prediction schemes can be derived if the structure of the nonlinearities can be exploited further; see, in particular, the monograph [7].

  2. The minimum in (2.7) ensures that the step size  is chosen to be 1 whenever possible. Indeed, this is required in order to guarantee quadratic convergence of the Newton iteration close to a root (provided that the root is simple).

  3. The preset tolerance  in the above adaptive strategy will typically be fixed a priori. Here, for highly nonlinear problems featuring numerous or even infinitely many solutions, it is typically mandatory to select  small in order to remain within the attractor of the given initial guess. This is particularly important if the starting value is relatively far away from a solution.  

3. Newton-Galerkin -FEM for Semilinear Elliptic Problems

The purpose of this section is to derive an -version Newton-Galerkin finite element formulation for the solution of (1.1). To this end, we will apply the abstract adaptive Newton framework from the previous Section 2, and apply an -discretization methodology for the resulting sequence of (locally) linearized problems.

3.1. Newton Linearization

In order to apply an adaptive Newton method as introduced in Section 2 to the nonlinear problem (1.3), note that the Fréchet-derivative of from (1.3) at  is given by

where we write . We note that, if , then is a well-defined linear and bounded mapping from  to ; see [3, Lemma A.1].

Then, given an initial guess  for (1.3), the Newton method (2.3) is to find  from , , such that

in . Equivalently,


where, for fixed ,

are bilinear and linear forms on  and , respectively.

Remark 3.1.

Incidentally, if there are constants  with such that holds for all , where is the (best) constant in the Poincaré inequality on , i.e.,


see [14, Theorem 257], then, it is elementary to show that the formulation (3.1) is a linear second-order diffusion-reaction problem, which is coercive on . In consequence, for any given , it has a unique solution ; see [1] for details.  

3.2. Linearized Finite Element Approach

In order to provide a numerical approximation of (1.1), we will discretize the weak formulation (3.1) by means of an -finite element method, which, in combination with the Newton iteration, constitutes a Newton-Galerkin approximation scheme. Furthermore, in view of deriving an -adaptive algorithm, we will develop an -robust -version a posteriori residual analysis for the linear finite element formulation.

3.2.1. -Spaces

We introduce a partition  of  (open) elements , , on , with . The length of an element  is denoted by , . For each element , it will be convenient to introduce the patch as the union of  and of the elements adjacent to it. In addition, to each element  we associate a polynomial degree , . These numbers are stored in a polynomial degree vector . Then, we define an -finite element space by

where, for , we denote by  the space of all polynomials of degree at most . We say that the pair of a partition and of a degree vector is -shape regular, for some constant  independent of , if


i.e., if both the element sizes and polynomial degrees of neighboring elements are comparable.

3.2.2. Linear -Fem

We define the numerical approximation of (1.1) in an iterative way as follows: For , we let  be an initial -discretization space, and a suitable initial guess for the Newton iteration. Furthermore, for , and given , we find a finite element approximation  of (3.1) as the solution of the weak formulation


Here, takes the role of a (possibly varying) step size parameter in the adaptive Newton scheme as described earlier in Section 2.2.

4. A Posteriori Residual Analysis

Recalling (3.4), for given , let us introduce the function


as well as


Then, rearranging terms, we can rewrite (3.4) in the following form:


The aim of this section is to derive a posteriori residual-based bounds for (4.3). In order to measure the discrepancy between the finite element discretization (3.4) and the original problem (1.1), an obvious quantity to bound is the residual  in . In order to proceed in this direction, we notice that the adaptively chosen damping parameter  in the Newton method (3.4) will equal 1 sufficiently close to a root of . For this reason, as was proposed in [3], we may focus on the ‘shifted’ residual  in  instead; indeed, estimating this quantity turns out to be very natural in view of the linearized finite element discretization (4.3).

4.1. Robust Interpolation Estimates

For the purpose of deriving an (upper) a posteriori residual estimate that is robust with respect to the singular perturbation parameter  (for ) as well as optimally scaled with respect to the local element sizes  and polynomial degrees , it is crucial to construct an interpolation operator that is simultaneously - and -stable. This will be accomplished in the current section; see Proposition 4.1 and Corollary 4.2 below.

Proposition 4.1.

Let the pair be -shape regular; see (3.3). Then, there exists an interpolation operator  such that, for any , and for all , there holds


Here, is a constant that depends solely on ; in particular, it is independent of , , and of .


Let us, without loss of generality, assume that . The result can be shown with the techniques employed in the higher-dimensional case in [17, Cor. 3.7]. In the present, one-dimensional case, a significantly simpler argument can be brought to bear, which allows us to make the paper self-contained. Let and and , be the standard piecewise linear hat functions associated with the nodes , . The extra nodes and define in a natural way the elements and . The (open) patches , , are given by the supports of the functions , i.e., .

Polynomial approximation (see, e.g., [19, Proposition A.2]) gives the existence of an interpolation operator that is uniformly (in ) stable, i.e., for all and has the following properties for :

Furthermore, if is antisymmetric with respect to the midpoint , then can be assumed to be antisymmetric as well, i.e., (this follows from studying the antisymmetric part of the original function ).

The approximation is now constructed with the aid of a “partition of unity argument” as described in [20, Theorem 2.1]. For and , extend anti-symmetrically, i.e., for and for . Then is defined on each patch , . For each patch , let (with the understanding and ). The above operator then induces for each patch by scaling an operator with the following properties:

here, we have exploited the -shape regularity of the mesh. We note that and . Also, the operators are uniformly (in the polynomial degree) stable in . The approximation is now taken to be . The desired approximation properties follow now from [20, Theorem 2.1]. ∎

The above proposition implies the following bounds.

Corollary 4.2.

For , the interpolant from Proposition 4.1 satisfies


Here, we let


and, with


we define


for , and . Moreover, is the constant from (4.4).


We proceed along the lines of [26]. Using the bounds from Proposition 4.1, for each element , we have that


Combining these two estimates, yields the first bound.

In order to prove the second estimate, we apply, for , a multiplicative trace inequality; see [21, Lemma A.1]):

Then, invoking the above bounds as well as the estimates from Proposition 4.1, we get

with from (4.6). Since  is also a boundary point of , we similarly obtain that


with  from (4.7). Thus, we have shown the second estimate. ∎

4.2. Robust A Posteriori Residual Estimate

Based on the previous interpolation analysis in Section 4.1, we are now in a position to prove the following main result.

Theorem 4.3.

For  from (4.1) there holds the upper a posteriori bound




for , and


Here, and  are defined in (4.5) and (4.7), respectively, and we set .

Remark 4.4.

We emphasize that the constants  and  appearing in the error indicators  from (4.9) remain bounded as  (and ). In addition, we note that , and that .  

Remark 4.5.

For linear problems, the (adaptive) Newton linearization is redundant. In particular, upon setting , we infer that  in (4.8), and the residual is bounded by the local error indicators  only. These quantities, in turn, can be exploited for the purpose of designing an -adaptive mesh refinement strategy, as outlined, for example, in Section 5.1 below. We emphasize that this approach has been studied earlier in the report [21], where a number of numerical tests in the context of -adaptive FEM for singularly perturbed linear problems have been performed; in particular, these experiments have underlined the robustness of the bound (4.8) as  in the linear case. In Section 5, we will extend this idea by proposing a suitable interplay between -mesh refinements and the (adaptive) Newton linearization from Algorithm 2.1.  

Proof of Theorem 4.3.

Given  and  from (4.1) and (4.2), respectively, and any , there holds the identity




see [3, Proposition 4.3].

We estimate the terms , , and individually. First let . Then, from (4.12) can be estimated using Corollary 4.2 as follows:

Applying the Cauchy-Schwarz inequality leads to

Furthermore, again using Corollary 4.2, we see that

Similarly, there holds

Now, applying the Cauchy-Schwarz inequality to (4.11) we see that

Dividing by , and taking the supremum for all , completes the proof. ∎

Remark 4.6.

Suppose that there exist constants  and , with  being the Poincaré constant from (3.2), such that

for all . Then, the residual norm and the energy norm of the error are equivalent; see [3, Theorem 4.5] for details.  

Remark 4.7.

Following along the lines of [3, §4.4.2] and [21, Appendix], it is possible to prove -robust local lower residual bounds in terms of the error indicators  and the data oscillation terms. This approach, however, results in local efficiency bounds that will be slightly suboptimal with respect to the local polynomial degrees due to the need of applying -dependent norm equivalences (involving cut-off functions).  

5. An -Adaptive Newton-Galerkin Procedure

In this section, we will discuss how the a posteriori bound from Theorem 4.3 can be exploited for the purpose of designing an -adaptive Newton-Galerkin algorithm for the numerical solution of (1.1). We will begin by revisiting an -adaptive testing strategy from [11] (see also [28, 12]).

5.1. -Adaptive Refinement Procedure

In order to -refine the -finite element space , we shall apply an -adaptive algorithm which is based on the following two ingredients.

(a) Element marking:

The elementwise error indicators  from Theorem 4.3 are employed in order to mark elements for refinement. More precisely, we fix a parameter , and select elements to be refined according to the Dörfler marking criterion:


Here, the indices  are chosen such that the error indicators  from (4.9) are sorted in descending order, and  is minimal.

(b) -refinement criterion:

The decision of whether a marked element in step (a) is refined with respect to  (element bisection) or  (increasing the local polynomial order by 1) is based on a smoothness testing approach. Specifically, if the (numerical) solution is considered smooth on a marked element , then the polynomial degree is increased by 1 on that particular element (no element bisection), otherwise the element is bisected (retaining the current polynomial degree  on both subelements). In order to evaluate the smoothness of the solution  on a marked element , we employ an elementwise smoothness indicator as introduced in [11, Eq. (3)],