An improved method for solving quasilinear convection-diffusion problems on a coarse mesh
Abstract.
A method is developed for solving quasilinear convection-diffusion problems starting on a coarse mesh where the data and solution-dependent coefficients are unresolved, the problem is unstable, and approximation properties do not hold. The Newton-like iterations of the solver are based on the framework of regularized pseudo-transient continuation, where the proposed time integrator is a variation on the Newmark strategy, designed to introduce controllable numerical dissipation and to reduce the fluctuation between the iterates in the coarse-mesh regime where the data is rough and the linearized problems are badly conditioned and possibly indefinite. An algorithm and updated marking strategy are presented to produce a stable sequence of iterates as boundary and internal layers in the data are captured by adaptive mesh partitioning. The method is suitable for use in an adaptive framework making use of local error indicators to determine mesh refinement and targeted regularization. Derivation and q-linear local convergence of the method are established, and numerical examples demonstrate the theory, including the predicted rate of convergence of the iterations.
Key words and phrases:
Nonlinear diffusion, convection-diffusion, quasilinear equations, pseudo-transient continuation, Newton-like methods, adaptive methods.

1. Introduction
This paper builds on the framework of [21] and develops a nonlinear solver suitable for use in adaptive methods for quasilinear elliptic problems. The method is designed to stabilize the linearizations of nonlinear diffusion and convection-diffusion problems, especially when there may be steep internal or boundary layers in the problem data. The sequence of linear problems encountered by a Newton-like method under these circumstances takes the form of convection-diffusion or reaction-convection-diffusion equations, and the sequence of approximate solutions is subject to spikes, overshoots, and spurious oscillations in the convection-dominated regime.
The present approach builds on a regularized version of the pseudo-transient continuation method as in, for instance, [1, 19, 6, 12, 7, 21] and the references therein, and on each mesh refinement seeks a steady-state solution of the nonlinear evolution problem , for positive semidefinite regularization term , in the interest of solving . In this analysis, further stabilization is introduced into the time discretization to address the problem of nonphysical oscillations. Time discretizations featuring controllable high-frequency numerical dissipation are well known in the finite element analysis of structures, as in for instance [20, 16, 15, 17, 4], and a variation related to these methods, referred to here as the split Newmark update, is presently introduced. This method makes use of the controllable dissipation of the Newmark update and further controls fluctuations between the iterates by freezing a small fraction of the linearization about a point with favorable properties. The split Newmark method is derived, linear local convergence with a predictable rate is established, and the method is demonstrated on three variations of a model problem with steep internal layers.
The goal of the solver in the adaptive setting is to start on a coarse mesh, where the problem data is generally not resolved, and produce a sequence of transitional states which may not be accurate solutions to the coarse-mesh problems, but do allow a posteriori error indicators to detect the layers present in the problem data and refine the mesh to be globally fine enough for stability and locally fine enough to achieve accuracy and efficiency. The sequence of approximate problems can be classified into three phases: the initial phase, where the mesh is globally too coarse, quadrature error is high, and many features of the problem data remain undetected by the discretization; the pre-asymptotic phase, where the mesh is fine enough for stability but the problem data is only partially resolved; and the asymptotic regime, where the standard existence, uniqueness, and approximation properties hold. In the coarse-mesh regime the solver uses as much stabilization as necessary to produce smooth transitional solutions; in the pre-asymptotic regime the solver adaptively reduces the added stabilization, increasing both accuracy and the convergence rate as the data is resolved and the approximate problem becomes less rough; in the asymptotic phase the solver limits to a standard Newton method, where the initial guess interpolated from the previous refinement is a good approximation to the solution and the iterations converge quadratically.
The requirements of the adaptive method are that a local error indicator is computed on each refinement and a reasonable marking strategy in the sense of [22] is employed to determine the next mesh refinement. A modification of the standard adaptive marking strategy is proposed in which the marked set is determined in two parts: one that refines the elements with the largest indicators, and the other that refines a subset of the coarsest elements of the mesh when the residual from the final Newton-like iteration is significant. The method reported here improves on [21] both in terms of efficiency and in terms of the strengths of near-singularities it is able to resolve.
The remainder of the paper is organized as follows. Section 3 shows the derivation of the pseudo-transient Newmark and split Newmark methods, (2.8) and (2.9). Section 4 demonstrates local linear convergence of the residual for these methods, both with rate , with the parameter from the Newmark update. Section 5 describes a basic algorithm to implement the solver in an adaptive method, and Section 6 contains the results of numerical experiments using the described adaptive algorithm and (2.9).
The following notation is used in the remainder of the paper. The function refers to a specific problem or problem class and is used in the formal discussion of Newton-like methods. The th iteration subordinate to the th partition is denoted , while is the th iteration on a fixed partition and is the final iteration on the th mesh, taken as the approximate solution on . In defining the weak and bilinear forms in the next section, , and similarly for vector-valued functions.
2. Target problem class
The nonlinear solver is developed to approximate solutions to the nonlinear problem = 0, for polygonal domain and with for real Banach spaces , particularly where takes the form of a quasilinear convection-diffusion or diffusion problem in divergence form
(2.1) 
or
(2.2) 
Multiplication by a test function and integration by parts over the divergence term yields the weak form of (2.1)
(2.3) 
and linearizing about at yields the bilinear form induced by
(2.4) 
For and bounded away from zero with bounded as in [3], (2.2) has a unique solution , with . Both (2.1) and (2.2) fit into the context of [24] with the assumption that is bounded and is an isomorphism, in which case the solution is an isolated solution.
The discretized equation is: find such that , where and are discrete finite element spaces with respect to triangulation , where the family of triangulations is regular and quasiuniform in the sense of [5]. Existence, uniqueness, and approximation properties of the discrete problems induced by (2.1) and (2.2) are found in [3], [24] and the references therein, assuming the mesh is fine enough. The problem is then to start on a coarse mesh, one that is not fine enough in terms of data resolution, stability, or approximation properties, and build one that is. The goal of the solver is to navigate from a coarse to a sufficiently fine mesh where the approximation properties do hold, and to do so by building both an efficient adaptive mesh and a reasonable initial guess to start the Newton-like iterations on each refinement along the way.
The linearization of (2.2) has the form of a convection-diffusion equation, and (2.4), the linearized form of (2.1), has the form of a convection-reaction-diffusion equation with
(2.5)  
(2.6)  
(2.7) 
Using a standard Newton method to solve the nonlinear (2.3) with (2.4) generally does not work when there are steep layers present in the problem data, and coarse-mesh approximations of the problem are observed to be indefinite and ill-conditioned, consistent with the observations in [14]. Many of the problems encountered, including the formation of spurious spikes, overshoots, and instability, are symptomatic of the corresponding linear convection-dominated problems [9]. Techniques from the finite element analysis of structures [20, 16, 15, 17, 4] use numerical integrators featuring high-frequency dissipation to capture lower-frequency modes of the solution. The approach here investigates the use of the Newmark update and a stabilized variation of it as the time integrators of a pseudo-transient continuation-like method as in [8, 1, 6, 19, 21].
The next section develops the two following methods to improve the convergence of the coarse-mesh iterates in the sequence of linearized problems. A positive semidefinite is used to target specific degrees of freedom for regularization, and the role of may be seen either from the homotopy perspective, as a modification of the path between an initial and that solves , or from the regularization viewpoint, as a penalty against certain characteristics of the iterates. In what follows, is chosen based on the Laplacian to penalize against high curvature, with degrees of freedom selected for regularization based on an a posteriori error indicator. The selective approach to regularization distinguishes the method presented here and in [21] from the method of pseudo-transient continuation found elsewhere in the literature. For , a Newmark update generalizing a backward-Euler discretization yields the iteration
(2.8) 
The proposed split Newmark update yields the iteration,
(2.9) 
and analysis and demonstration of this last method is the focus of the remainder of the paper. Local convergence of (2.8) is discussed along the way and local convergence of (2.9) is established by a perturbation of that result. The method based on (2.9) was also tested and found effective on a shifted Laplacian
(2.10) 
and the related
(2.11) 
problems similar to those investigated in [2] and [13], which reside outside the current target class but still benefit from the regularization techniques described here when starting the adaptive method from a coarse mesh, especially if the coefficient .
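The selective regularization described above, a Laplacian-based penalty restricted to degrees of freedom flagged by an a posteriori indicator, can be sketched in a few lines. The following fragment is an illustrative construction, assuming a precomputed Laplacian stiffness matrix and a list of flagged degrees of freedom; the function name and the exact form of the regularization operator used in the experiments (Section 6) are assumptions, not taken from the paper.

```python
import numpy as np

def targeted_regularization(K_laplace, flagged_dofs, n_dofs):
    """Illustrative targeted regularization operator: restrict a Laplacian
    stiffness matrix to the degrees of freedom flagged by an a posteriori
    error indicator.  D is a diagonal selector, so D @ K @ D is symmetric
    positive semidefinite whenever K is, and acts only on flagged dofs."""
    D = np.zeros((n_dofs, n_dofs))
    D[flagged_dofs, flagged_dofs] = 1.0  # select targeted dofs
    return D @ K_laplace @ D
```

For example, with the one-dimensional Laplacian stencil and only the middle node flagged, the resulting operator penalizes curvature at that node alone while leaving the remaining degrees of freedom unregularized.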
3. Continuation methods
The homotopy or pseudo-transient continuation method of stabilizing the Newton-like iterations for finding the solution of is developed by discretizing the ODE
(3.1) 
with a positive semidefinite linear functional . In much of the literature, is taken to be the identity or a scaled version thereof [8, 19] and the references therein; however, the theory is developed for positive-definite functionals other than the identity in [1], where it is referred to as the “s” method, and for positive semidefinite functionals in [6, 21]. This idea can be generalized to discretizing the ODE based on the normal equations formulation of (3.1)
(3.2) 
with adjoint the formal adjoint of and the formal adjoint of . As shown in [21], the discretization of (3.2) corresponds to the method of Tikhonov regularization.
As discussed in for instance [1, 8, 19, 6], letting be a finite-dimensional approximation to with , the standard method is to discretize (3.1) by a backward-Euler approximation to and a linearization of about . In the case of discretizing (3.2), is approximated by . To increase the stability of the linearized system, other discretizations of (3.2) are presently considered. The backward-Euler time discretization is replaced by the more general Newmark update, and the linearization of is split about two distinct points, one the latest iterate and the other yielding a Jacobian with favorable properties.
By the original method of backward-Euler discretization and first-order Taylor expansion about the previous iterate, the resulting Newton-like iteration is given by
(3.3) 
This method increases the stability of the linear system for positive definite and possibly semidefinite , but runs the risk of shifting the spectrum of the approximate Jacobian towards zero, making the conditioning significantly worse in the case that is indefinite and resulting in large fluctuations between the iterates. To cope with this situation, which occurs in the coarse-mesh approximation of quasilinear problems, the iteration based on the normal equations, enforcing the shift of the spectrum away from zero, is given by
(3.4) 
This method is introduced in [21], where it is shown that (3.4) is also found by minimizing the Tikhonov functional for
(3.5) 
with and the norm. As in [10], the necessary and sufficient condition for the minimizer is , yielding (3.4). While successfully increasing the stability of the system, this method remains unsatisfactory due to the increased complexity of solving the system based on the normal equations. A new method is now introduced which adds stability while preserving the sparsity of the system. The cost, as shown in Section 4, is trading the asymptotically quadratic convergence of (3.3) for linear convergence. The proposed algorithm of Section 5 updates the parameter of the Newmark method on each mesh refinement as the sequence of linear systems stabilizes, until the method reduces to the original (3.3), for which quadratic convergence of the error is observed.
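To make the two iterations above concrete, the following finite-dimensional sketch assumes the familiar pseudo-transient forms: the backward-Euler step solves (R/dt + F'(u)) delta = -F(u), in the spirit of (3.3), and the normal-equations variant solves the Tikhonov optimality condition, in the spirit of (3.4)-(3.5). The scalings and all names here are assumptions for illustration; the paper's exact formulas are not reproduced.

```python
import numpy as np

def psitc_step(F, J, u, R, dt):
    """One regularized pseudo-transient Newton-like step in the assumed
    backward-Euler form (R/dt + J(u)) delta = -F(u); cf. (3.3)."""
    delta = np.linalg.solve(R / dt + J(u), -F(u))
    return u + delta

def psitc_normal_step(F, J, u, R, dt):
    """Normal-equations (Tikhonov) variant in the spirit of (3.4)/(3.5):
    the optimality condition of the assumed functional
    ||F(u) + J delta||^2 + (1/dt) ||R delta||^2 is
    (J^T J + R^T R / dt) delta = -J^T F(u)."""
    A = J(u)
    delta = np.linalg.solve(A.T @ A + (R.T @ R) / dt, -A.T @ F(u))
    return u + delta
```

On a small monotone test problem such as F(u) = u + u^3 - c, iterating either step with R = I drives the residual to zero, with the normal-equations variant paying the predicted cost of a denser, squared-condition-number system.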
3.1. Time discretization by the Newmark method
The Newmark method [20] discretizes the time derivative by
(3.6) 
For , (3.6) reduces to the backward-Euler discretization described above. As discussed in [20], this time integrator is second-order accurate for and introduces nonphysical damping of the high-frequency modes for , proportional to . More sophisticated and collocation methods as in [16, 15, 4] incorporating the Newmark update are designed to further control numerical dissipation across targeted frequencies. In the current context, the improved resolution, and risk of overshoot, of these methods designed with two time derivatives in mind is potentially of interest, but their implementation in a pseudo-transient continuation setting is beyond the scope of this article. For stabilizing the transitional states of the sequence of coarse-mesh problems, damping of the high-frequency oscillations is the desirable property, and more important than a higher order of accuracy.
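The damping claims above can be checked on the scalar test problem u' = lam*u. The sketch below assumes the one-derivative Newmark update takes the generalized theta form u_{n+1} = u_n + dt*((1-gamma)*u'_n + gamma*u'_{n+1}), which is consistent with the stated properties (gamma = 1 recovers backward Euler, gamma = 1/2 is second-order, gamma > 1/2 introduces dissipation); this assumed form is an illustration, not a reproduction of (3.6).

```python
def newmark_theta_step(u, udot, lam, dt, gamma):
    """One step of the assumed one-derivative Newmark (generalized theta)
    update  u_{n+1} = u_n + dt*((1-gamma)*udot_n + gamma*udot_{n+1})
    on the linear test problem udot = lam*u, solving the implicit
    relation for u_{n+1}.  gamma = 1 recovers backward Euler."""
    u_new = (u + dt * (1.0 - gamma) * udot) / (1.0 - gamma * lam * dt)
    return u_new, lam * u_new

def amp(gamma, z=-100.0):
    """Amplification factor of the scheme on a stiff mode with lam*dt = z;
    values near zero mean strong high-frequency dissipation."""
    return (1.0 + (1.0 - gamma) * z) / (1.0 - gamma * z)
```

On a stiff mode with lam*dt = -100, the backward-Euler choice gamma = 1 gives an amplification factor of about 0.01, while the second-order choice gamma = 1/2 gives about -0.96: the stiff mode is barely damped and oscillates in sign, illustrating why extra dissipation is preferred on rough coarse-mesh data.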
Solving (3.6) for
(3.7) 
Applying and to (3.7)
(3.8) 
Linearizing about , we obtain the Newton-like iteration
(3.9) 
For , this method comparatively scales down the influence of the right-hand side data; the danger remains, and is arguably increased, of shifting an indefinite left-hand side operator spectrum towards zero. By the same approach, the iteration based on the normal equations is given by
(3.10) 
Due to the overdamping effect of setting the parameter , the iterations (3.9) and (3.10) stabilize the solution in some situations where the iteration (3.4) is infeasible. Similarly to (3.4)-(3.5), the solution of (3.10) is seen to be the minimizer of the generalized Tikhonov-type functional
(3.11) 
with .
3.2. The Newmark update with splitting
The Newmark discretization of the previous section successfully introduces high-frequency dissipation, increasing the stability of the linearized system, but suffers the same drawback as the backward-Euler discretization: in the case of a possibly indefinite Jacobian, it requires the formulation based on the normal equations to control highly unstable sequences of iterates. The split Newmark update runs without the use of the normal equations: effectively freezing a small fraction of the Jacobian at a point with favorable properties dramatically reduces the fluctuations between iterates in the coarse-mesh regime, where boundary and internal layers are only partially resolved.
As seen in the derivation, the method can be thought of as splitting the linearization of about two points, or more precisely as approximating by a combination of and , where the first is used in the backward-Euler update, the first two are used in the Newmark update, with the second adding control of numerical dissipation, and the third is introduced here to reduce the fluctuation of the Jacobian outside the domain of convergence of the Newton-like iterations. The method is derived as follows, and is demonstrated in Section 6 to work without the use of the normal equations in situations where other methods, including those involving the normal equations, are seen to fail.
Starting with the Newmark update (3.6), each of the time derivative terms on the right is split into two parts
(3.12) 
Solving for and applying the relation
(3.13) 
Applying the relation on the left and linearizing about on the right
(3.14) 
Linearizing both about fixed , . Applying this and multiplying through by
(3.15) 
Equation (3.15) is the basis for the split Newmark update. It contains four parameters: ; while some guidance is provided, a detailed analysis of these parameters will be addressed in subsequent work by the author. In the current results is used, and the solution from the previous refinement interpolated onto the current mesh has also been observed to work; is chosen as in [21] for the backward-Euler discretization; may be set adaptively, increased to add stability then decreased to speed convergence; and is adaptively sent towards one based on the norm of the latest residual on each Newton-like iteration. It is observed in numerical experiments that should be close to one, and as seen in Section 4 this is necessary for the asymptotic linear convergence at the rate , in agreement with the rate found using the Newmark discretization without the splitting.
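A schematic reading of the split update: the Jacobian is evaluated mostly at the current iterate and partly at a frozen point with favorable properties, with a regularization term added. The sketch below is not the exact iteration (3.15); the blending A = alpha*R + s*J(u) + (1-s)*J(u_bar), with the Newmark weighting absorbed into alpha, and all parameter names are an illustrative interpretation of the text.

```python
import numpy as np

def split_newmark_step(F, J, u, u_bar, R, alpha, s):
    """Schematic split-Jacobian Newton-like step: blend the Jacobian at
    the current iterate u (weight s, close to one) with the Jacobian
    frozen at a point u_bar with favorable properties (weight 1-s), plus
    targeted regularization alpha*R.  An illustrative reading of (3.15),
    not its exact form."""
    A = alpha * R + s * J(u) + (1.0 - s) * J(u_bar)
    return u + np.linalg.solve(A, -F(u))
```

Since the frozen part of the operator does not follow the iterate, fluctuations of the approximate Jacobian between iterations are damped, at the cost of the linear, rather than quadratic, convergence established in Section 4.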
4. Local convergence
Local convergence of the residual is established for both algorithms: (2.8), based on the Newmark update, and (2.9), based on the Newmark update with splitting of the Jacobian. The second result follows as a perturbation of the first for close to unity, relying on a weaker set of assumptions. Both results require the same Lipschitz condition on the Jacobian. Denote the open ball about by .
Assumption 4.1.
There exist so that for all
Assumption 4.2.
First, linear convergence of the residual is established with rate , for . Linear and asymptotically quadratic convergence for the case is shown in [21], following from convergence of the error. In the present discussion, convergence of the error is neither used nor shown: it is observed in numerical experiments that the residual decreases at the predicted rate over iterations where may not yet be decreasing; in contrast, for the case the same quantity displays asymptotically quadratic convergence to zero together with the residual. The following proof is a variation of Theorem 2.12 of [7], where Assumptions 4.1 and 4.2 replace the affine contravariant Lipschitz condition used in that version of the Newton-Mysovskikh theorem for the standard Newton method.
Theorem 4.3.
Let and let Assumptions 4.1 and 4.2 hold. Define the open set by
(4.1) 
and suppose ; then by iteration (2.8), , and the sequence of residuals converges linearly to zero with asymptotic rate .
Proof.
Let . Then iteration (2.8) is given by
(4.2) 
By the integral meanvalue theorem
(4.3) 
Applying Assumption 4.2 (3) to the second term and the Lipschitz condition 4.1 to the third term of (4.3)
(4.4) 
By the iteration (4.2) and Assumption 4.2 (2), , yielding
By the assumption , and for
(4.6) 
for each such that for all . By the logic of [7] Theorem 2.12, proceed by contradiction and assume ; then there is a smallest with . For that
(4.7) 
a contradiction. This shows . To establish the rate of convergence, set in (4.6):
(4.8) 
which shows both that and the asymptotic linear rate of as . ∎
Remark 4.4.
It is observed that the method converges with the predicted rate even when the iterates leave the set given by (4.1). The lapse in the theory appears to be the bound on , which apparently converges within a smaller set as compared to the residual.
The asymptotic convergence of the split algorithm is addressed in the next theorem as a perturbation of Theorem 4.3. While it is not necessary in the proof for to be positive definite, is ideally chosen so that improves the conditioning of the approximate Jacobian, allowing Assumption 4.5 to hold with better constants than 4.2.
Assumption 4.5.
(cf. Assumption 4.2). There exist so that for , fixed , positive semidefinite , , and for all , then for all :

(1) is invertible.
(2) .
(3) .
As with Assumption 4.2 (3), the third clause agrees with the similar stability bound of [6] Assumption 2.3, with replaced by . Local convergence of the split Newmark algorithm is established with the same asymptotic rate as in the previous result.
Theorem 4.6.
Let and let Assumptions 4.1 and 4.5 hold. Define by
(4.9) 
for a given . Then there exists such that for in the open set given by
(4.10) 
and defined by iteration (2.9), it holds that , and the sequence of residuals converges linearly to zero with asymptotic rate .
Proof.
Let . Then iteration (2.9) is given by
(4.11) 
Much of the proof parallels Theorem 4.3, and is summarized here. Starting with the integral meanvalue theorem
(4.12) 
By iteration (4.11)
(4.13) 
Bounding the second term on the right of (4.13) by Assumption 4.5 (3)
(4.14) 
and by Assumptions 4.1 and 4.5 (2), then applying and
(4.15) 
Applying (4.9) to the quantity under the assumption and expanding in orders of ,
(4.16) 
Similarly
(4.17) 
Applying (4.16) and (4.17) to (4.15), then for any fixed
(4.18) 
Supposing
(4.19) 
it follows that
(4.20) 
so there exists for which implies
(4.21) 
The result follows by the same logic as in the proof of Theorem 4.3. ∎
Remark 4.7.
It is observed in some problems that the residual can decrease at a rate slightly better than the one predicted when , whereas for the rate is as predicted.
5. Algorithm
The solver using the split Newmark iteration (2.9) may be implemented in an adaptive method according to the following basic algorithm. To exploit both the stability of method (2.9) and the quadratic convergence of (3.3), the parameters are updated on each refinement, and the parameters are updated on each iteration of the solver. The main steps of the adaptive method are as follows, with the exit criteria and parameter updates specified below. An example of a targeted regularization term based on a posteriori error indicators is given in Section 6.
Algorithm 5.1 (Basic algorithm for (2.9)).
Criteria 5.2 (Exit criteria).
Given a user-set tolerance , an accepted rate of convergence given by for some constant , e.g., , and a maximum number of iterations either chosen as a constant or based on the predicted rate of convergence, exit the solver on partition after calculating iterate when one of the following conditions holds.
(1) The residual is reduced below the tolerance: .
(2) Each of the following holds: the residual is decreasing, ; AND it has decreased below the level given by the previous iterate and, where defined, the final residual of the previous partition, ; AND the residual is decreasing at or close to the predicted rate, ; AND the reduction between iterates is slowing, .
(3) The maximum number of iterations is exceeded.
In terms of the three phases of the solution process in [21], the final asymptotic regime is characterized by the iterations terminating by Criteria 5.2 (1); iterations in the pre-asymptotic regime terminate with Criteria 5.2 (2); and iterations terminate with a mix of Criteria 5.2 (2) and (3) in the initial phase.
The second exit criterion 5.2 (2) merits some explanation, as it allows the iteration to end once reduction of the residual has slowed, indicating the iterate has attained a reasonably stable configuration from which a good prediction about where to refine the mesh may be determined. The first two clauses require that the residual is decreasing, and has decreased below the level given by the previous iterate, with at least as much decrease from the initial iteration on the current partition and from the residual of the previous partition, if it is well defined. The third criterion requires error reduction at or close to the predicted rate, and the last that the error reduction between iterates is slowing down. These four criteria together assure the sequence of transitional states is not getting further, in the sense of the solver's residual, from a converged solution, and prevent situations such as spikes propagating indefinitely across a sequence of partitions. While spikes, overshoots, or other undesirable characteristics may propagate through several refinements, such solutions will eventually fail to reduce the residual at the specified rate. When the adaptive mesh is fine enough, these characteristics are observed to smooth out; otherwise the solver eventually fails, and the solution is reset and started again on a finer mesh.
One of the main improvements of (2.9) over the regularized method of [21] is the generation of more stable sequences of transitional iterates through the pre-asymptotic phase. The sequence of pre-asymptotic approximate problems only partially resolves the data, and as internal layers are uncovered over several refinements the conditioning of these problems is generally poor. The high-frequency dissipation of the Newmark strategy, combined with the stabilization of the iterates produced by the splitting, allows sequences of solutions to propagate through this regime and the mesh to be marked via error indicators, leading to the accurate solution of the problem on coarser meshes than would be possible if starting the solver on a mesh that is uniformly fine enough to resolve the data. In order to make use of the stability and dissipation properties of larger values of , as well as the asymptotically quadratic convergence if the iterations are stable for , the following minimal guidelines are presented, based on the termination criteria above.
Update 5.3 (Newmark parameter ).
In practice, the residual tends to get reduced below tolerance once . In the results of Section 6, is decreased by two when the target rate is achieved, decreased by one if a stable rate lower than predicted is achieved, increased by one when the solver fails, and increased by two if the maximum number of iterations is exceeded while the iterations are converging below the acceptable rate. An initial should be chosen large enough to see error reduction on the initial mesh, and not significantly larger.
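The rules of Update 5.3 can be written as a small dispatch on the outcome of the solve. The sketch below interprets "increased/decreased by one or two" as multiples of a unit step and clamps the parameter at one, where the method reduces to (3.3); the step size, bounds, and outcome labels are all hypothetical.

```python
def update_gamma(gamma, outcome, step=1.0, gamma_min=1.0, gamma_max=10.0):
    """Heuristic Newmark-parameter update following Update 5.3.  The unit
    of adjustment and the bounds are hypothetical; gamma_min = 1 reflects
    that the method reduces to the unmodified iteration (3.3) there."""
    if outcome == "target_rate":         # predicted rate achieved
        gamma -= 2.0 * step
    elif outcome == "stable_slow_rate":  # stable, but below predicted rate
        gamma -= step
    elif outcome == "solver_failed":
        gamma += step
    elif outcome == "max_iters_slow":    # max iterations hit, slow convergence
        gamma += 2.0 * step
    return min(max(gamma, gamma_min), gamma_max)
```

The asymmetry, decreasing faster on success than failure-driven increases, mirrors the intent that the solver should relax toward the quadratically convergent iteration as soon as the linear systems stabilize.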
The regularization parameter is updated by the method described in [21], repeated here for convenience. In accordance with the convergence Theorems 4.3 and 4.6, this strategy assures so long as the residual is decreasing.
Update 5.4 (Tikhonov regularization parameter ).
Set . For ,
To reduce rapid fluctuation of , correct to ensure in the case that and if .
5.1. Marking strategy
An a posteriori error indicator is assumed available to determine adaptive mesh refinement and as one option for determining a targeted regularization term . In the current results, standard local residual-based element indicators as in for instance [23, 11] are used for both of these purposes, further described in Section 6. Other approaches to solver- and problem-specific regularization and mesh refinement are currently under investigation by the author.
The goal of the marking strategy is to build a mesh that is globally fine enough for stability, and locally as fine as necessary to achieve the desired accuracy. To improve the efficiency of the method by increasing the stability of the transitional states in the preasymptotic phase, the following marking strategy is presented. Based on exit criteria 5.2 there are three possible outcomes of the nonlinear solve on refinement .
In the case of 5.2 (3), the coarsest set of elements in the mesh is refined. Failure of the solver reflects a global rather than a local problem. Starting on a sufficiently coarse mesh, several resets are expected.
In the case of 5.2 (2), the error indicators are computed, and the mesh is refined both according to the elements with the largest local indicators, and according to the coarsest elements with the largest local indicators. This strategy allows local refinement to take place in order to capture boundary and internal layers, attaining eventual accuracy and efficiency, while also building the adaptive mesh fine enough to achieve stability. In meshes that are too coarse to resolve the data, and where the approximate problem may have coefficients based on highly inaccurate approximate solutions, nonphysical overshoots often develop in the iterates; while the error indicators in the vicinity of these spikes may be high, refining primarily in these regions exacerbates the problem. As in case 5.2 (3), a large residual predicts a global issue with the mesh. However, valuable information about the near-singularities in the problem data can be predicted from the nonconverged iterates, so some local refinement can build a more efficient mesh.
In the case of 5.2 (1), the error indicators are computed, and the mesh is refined with respect to the largest local indicators. Any reasonable marking procedure [22] for cases 5.2 (1)-(2) may be applied; in particular, for case 5.2 (2), both the element with the largest local indicator and the coarsest element with the largest local indicator must be marked. In the current results, the Dörfler marking strategy is used with parameter , which for case 5.2 (2) is split into , and the marked sets are chosen by sets of least cardinality with
(5.1) 
and the marked set . In case 5.2 (1), and . A heuristic choice of is given in Section 6.
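The two-part Dörfler strategy can be sketched directly: one bulk-chasing set over all elements with the first parameter, and a second set with the second parameter restricted to the coarsest elements. The exact criterion (5.1) is not reproduced here; in this sketch both goals are measured against the total estimator, and "coarsest" is read as largest element diameter, both illustrative choices.

```python
def dorfler_mark(eta, theta, candidates=None):
    """Bulk-chasing (Doerfler) marking: greedily take elements in order of
    decreasing indicator until the marked set carries a theta-fraction of
    the total squared estimator.  If candidates is given, marking is
    restricted to that subset (the goal is still a fraction of the total,
    so the whole subset may end up marked)."""
    idx = list(range(len(eta))) if candidates is None else list(candidates)
    order = sorted(idx, key=lambda i: -eta[i])
    goal = theta * sum(e * e for e in eta)
    marked, acc = [], 0.0
    for i in order:
        if acc >= goal:
            break
        marked.append(i)
        acc += eta[i] ** 2
    return marked

def split_mark(eta, diam, theta1, theta2):
    """Two-part marked set for case 5.2 (2): a Doerfler set over all
    elements (theta1) plus one restricted to the coarsest elements,
    read here as those of maximal diameter (theta2)."""
    h_max = max(diam)
    coarse = [i for i, h in enumerate(diam) if h == h_max]
    return sorted(set(dorfler_mark(eta, theta1)) |
                  set(dorfler_mark(eta, theta2, coarse)))
```

With indicators eta = [4, 1, 1, 1] and the last two elements coarsest, standard marking with theta = 0.5 selects only the dominant element, while the split strategy additionally marks coarse elements, reflecting the goal of building global stability alongside local accuracy.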
6. Numerical Examples
The nonlinear solver (2.9) is demonstrated on three problems with different structure in their internal layers. In Example 6.2, results are reported for a model nonlinear convection-diffusion problem of the form = 0, which has a smooth sinusoidal solution. The first variation on the model problem, Example 6.3, demonstrates the algorithm on the same differential operator with the problem data chosen to generate a higher-frequency solution. Example 6.4 shows the results for a related nonlinear diffusion problem with two concentric internal layers. The adaptive finite element method is implemented using the finite element library FETK [18], and a direct solver is used on each linear system. Both trial and test spaces are taken as the linear finite element space consisting of Lagrange finite elements over partition that satisfy the homogeneous Dirichlet boundary conditions.
The local a posteriori residualbased indicator, and corresponding jumpbased indicator for element