An improved method for solving quasilinear convection diffusion problems on a coarse mesh
A method is developed for solving quasilinear convection diffusion problems starting on a coarse mesh where the data and solution-dependent coefficients are unresolved, the problem is unstable and approximation properties do not hold. The Newton-like iterations of the solver are based on the framework of regularized pseudo-transient continuation where the proposed time integrator is a variation on the Newmark strategy, designed to introduce controllable numerical dissipation and to reduce the fluctuation between the iterates in the coarse mesh regime where the data is rough and the linearized problems are badly conditioned and possibly indefinite. An algorithm and updated marking strategy is presented to produce a stable sequence of iterates as boundary and internal layers in the data are captured by adaptive mesh partitioning. The method is suitable for use in an adaptive framework making use of local error indicators to determine mesh refinement and targeted regularization. Derivation and q-linear local convergence of the method is established, and numerical examples demonstrate the theory including the predicted rate of convergence of the iterations.
Key words and phrases:Nonlinear diffusion, convection diffusion, quasilinear equations, pseudo-transient continuation, Newton-like methods, adaptive methods.
This paper builds on the framework of  and develops a nonlinear solver suitable for use in adaptive methods for quasilinear elliptic problems. The method is developed to stabilize the linearizations of nonlinear diffusion and convection diffusion problems, especially when there may be steep internal or boundary layers in the problem data. The sequence of linear problems encountered by a Newton-like method under these circumstances takes the form of convection diffusion or reaction convection diffusion and the sequence of approximate solutions is subject to spikes, overshoots and spurious oscillations in the convection-dominated regime.
The present approach builds on a regularized version of the pseudo-transient continuation method as in for instance [1, 19, 6, 12, 7, 21] and the references therein, and on each mesh refinement seeks a steady-state solution of the nonlinear evolution problem , for positive semidefinite regularization term in the interest of solving . In this analysis, further stabilization is introduced into the time discretization to address the problem of nonphysical oscillations. Time discretizations featuring controllable high-frequency numerical dissipation are well known in the finite element analysis of structures as in for instance [20, 16, 15, 17, 4], and a variation related to these methods referred to here as the -split Newmark update is presently introduced. This method makes use of the controllable dissipation of the Newmark update and further controls fluctuations between the iterates by freezing a small fraction the linearization about a point with favorable properties. The -split Newmark method is derived, -linear local convergence with a predictable rate is established, and the method is demonstrated on three variations of a model problem with steep internal layers.
The goal of the solver in the adaptive setting is to start on a coarse mesh where the problem data is generally not resolved and produce a sequence of transitional states which may not be accurate solutions to the coarse mesh problems, but do allow a posteriori error indicators to detect the layers present in the problem data and refine the mesh to be globally fine enough for stability and locally fine enough to achieve accuracy and efficiency. The sequence of approximate problems can be classified into three phases, the initial phase where the mesh is globally too coarse and quadrature error is high and many features of the problem data remain undetected by the discretization, the pre-asymptotic phase where the mesh is fine enough for stability but the problem data is only partially resolved, and the asymptotic regime where the standard existence, uniqueness and approximation properties hold. In the coarse mesh regime the solver uses as much stabilization as necessary to produce smooth transitional solutions; in the pre-asymptotic regime the solver adaptively reduces the added stabilization increasing both accuracy and the convergence rate as the data is resolved and the approximate problem becomes less rough; in the asymptotic phase the solver limits to a standard Newton method where the initial guess interpolated from the previous refinement is a good approximation to the solution and the iterations converge quadratically.
The requirements of the adaptive method are that a local error indicator is computed on each refinement and a reasonable marking strategy in the sense of  is employed to determine the next mesh refinement. A modification of the standard adaptive marking strategy is proposed in which the marked set is determined by two parts: one that refines the elements with the largest indicators, and the other that refines a subset of the coarsest elements of the mesh when the residual from the final Newton-like iteration is significant. The method reported here improves on  both in terms of efficiency and in terms of the strengths of near-singularities it is able to resolve.
The remained of the paper is organized as follows. Section 3 shows the derivation of the pseudo-transient Newmark and -split Newmark methods, (2.8) and (2.9). Section 4 demonstrates local -linear convergence of the residual for these methods, both with rate , with the parameter from the Newmark update. Section 5 describes a basic algorithm to implement the solver in an adaptive method, and Section 6 contains the results of numerical experiments using the described adaptive algorithm and (2.9).
The following notation is used in the remainder of the paper. The function refers to a specific problem or problem class and is used in the formal discussion of Newton-like methods. The th iteration subordinate to the th partition is denoted , while is the th iteration on a fixed partition and is the final iteration on the th mesh, taken as the approximate solution on . In defining the weak and bilinear forms in the next section , and similarly for vector-valued functions.
2. Target problem class
The nonlinear solver is developed to approximate solutions to the nonlinear problem = 0, for polygonal domain and with for real Banach spaces , particularly where takes the form of a quasilinear convection diffusion or diffusion problem in divergence form
Multiplication by a test function and integration by parts over the divergence term yields the weak form of (2.1)
and linearizing about at yields the bilinear form induced by
For and bounded away from zero with bounded as in , then (2.2) has a unique solution , with . Both (2.1) and (2.2) fit into the context of  with the assumption that is bounded and is an isomorphism, in which case the solution is an isolated solution.
The discretized equation is, find such that where and are discrete finite element spaces with respect to triangulation , where the family of triangulations is regular and quasi-uniform in the sense of . Existence, uniqueness and approximation properties of the discrete problems induced by (2.1) and (2.2) are found in ,  and the references therein, assuming the mesh is fine enough. The problem is then to start on a coarse mesh; one that is not fine enough in terms of data resolution, stability or approximation properties, and build one that is. The goal of the solver is to navigate from a coarse to a sufficiently fine mesh where the approximation properties do hold, and to do so by building an both an efficient adaptive mesh and a reasonable initial guess to start the Newton-like iterations on each refinement along the way.
Using a standard Newton method to solve the nonlinear (2.3) with (2.4) generally does not work when there are steep layers present in the problem data, and coarse mesh approximations of the problem are observed to be indefinite and ill-conditioned, consistent with the observations in . Many of the problems encountered including formation of spurious spikes, overshoots and instability are symptomatic of the corresponding linear convection-dominated problems . Techniques form from the finite element analysis of structures [20, 16, 15, 17, 4] use numerical integrators featuring high-frequency dissipation to capture lower frequency modes of the solution. The approach here investigates the use of the Newmark update and a stabilized variation of it as the time-integrators of a pseudo-transient continuation-like method as in [8, 1, 6, 19, 21].
The next section develops the two following methods to improve the convergence of the coarse-mesh iterates in the sequence of linearized problems. A positive semidefinite is used to target specific degrees of freedom for regularization, and the role of may be seen either from the homotopy perspective as a modification of the path between an initial and that solves , or from the regularization viewpoint as a penalty against certain characteristics of the iterates. In what follows, is chosen based on the Laplacian to penalize against high curvature, with degrees of freedom selected for regularization based on an a posteriori error indicator. The selective approach to regularization distinguishes the method presented here and in  from the method of pseudo-transient continuation found elsewhere in the literature. For , a Newmark update generalizing a backward Euler discretization yields the iteration
The proposed -split Newmark update yields the iteration,
and analysis and demonstration of this last method is the focus of the remainder of the paper. Local convergence of (2.8) is discussed along the way and local convergence of (2.9) is established by a perturbation of that result. The method based on (2.9) was also tested and found effective on a shifted -Laplacian
and the related
problems similar to those investigated in  and , which reside outside the current target class but still benefit from the regularization techniques described here when starting the adaptive method from a coarse mesh, especially if the coefficient .
3. Continuation methods
The homotopy or pseudo-transient continuation method of stabilizing the Newton-like iterations for finding the solution of is developed by discretizing the ODE
with a positive semidefinite linear functional . In much of the literature, is taken to be the identity or a scaled version thereof [8, 19], and the references therein; however, the theory is developed for positive-definite functionals other than the identity  where it is referred to as the “s” method, and positive semi-definite functionals in [6, 21]. This idea can be generalized to discretizing the ODE based on the normal equations formulation of (3.1)
As discussed in for instance [1, 8, 19, 6], letting a finite dimensional approximation to with , the standard method is to discretize (3.1) by a backward-Euler approximation to and a linearization of about . In the case of discretizing (3.2), is approximated by . To increase stability of the linearized system, other discretizations of (3.2) are presently considered. The backward-Euler time discretization is replaced by the more general Newmark update, and the linearization of is split about two distinct points, one the latest iterate and the other yielding a Jacobian with favorable properties.
By the original method of backward-Euler discretization and first order Taylor expansion about the previous iterate, the resulting Newton-like iteration is given by
This method increases the stability of the linear system for positive definite and possibly semi-definite , but runs the risk of shifting the spectrum of the approximate Jacobian towards zero, making the condition significantly worse in the case that is indefinite, resulting in large fluctuations between the iterates. To cope with this situation which occurs in the coarse mesh approximation of quasilinear problems, the iteration based on the normal equations enforcing the shift of the spectrum away from zero is given by
with and the norm. As in , the necessary and sufficient condition for the minimizer is , yielding (3.4). While successfully increasing the stability of the system, this method remains unsatisfactory due to the increased complexity of solving the system based on the normal equations. A new method is now introduced which adds stability while preserving the sparsity of the system. The cost, as shown in Section 4 is trading the asymptotically quadratic convergence of (3.3) for -linear convergence. The proposed algorithm of Section 5 updates the parameter of the Newmark method on each mesh refinement as the sequence of linear systems stabilize until the method reduces to the original (3.3) for which quadratic convergence of the error is observed.
3.1. Time discretization by the Newmark method
The Newmark method  discretizes the time derivative by
For , (3.6) reduces to the backward Euler discretization described above. As discussed in  this time integrator is second order accurate for and introduces nonphysical damping of the high frequency modes for proportional to . More sophisticated and collocation methods as in [16, 15, 4] incorporating the Newmark update are designed to further control numerical dissipation across targeted frequencies. In the current context the improved resolution and risk of overshoot of these methods designed with two time derivatives in mind is potentially of interest but their implementation in a pseudo-transient continuation setting is beyond the scope of this article. For stabilizing the transitional states of the sequence of coarse-mesh problems, damping of the high-frequency oscillations is the desirable property and more important than a higher order of accuracy.
Solving (3.6) for
Applying and to (3.7)
Linearizing about , obtain the Newton-like iteration
For this method comparatively scales down the influence of the right-hand side data; the danger is still allowing and arguably increasing the likeliness of shifting an indefinite left-hand side operator spectrum towards zero. By the same approach, the iteration based on the normal equations is given by
Due to the overdamping effect of setting the parameter , the iterations (3.9) and (3.10) stabilize the solution in some situations where the iteration (3.4) is infeasible. Similarly to (3.4)- (3.5), the solution of (3.10) is seen to be the minimizer of the generalized Tikhonov-type functional
3.2. The Newmark update with -splitting
The Newark discretization of the previous section successfully introduces high-frequency dissipation increasing the stability of the linearized system but suffers the same drawback as the backward-Euler discretization requiring the formulation based on the normal equations in the case of a possibly indefinite Jacobian to control highly unstable sequences of iterates. The -split Newmark update runs without the use of the normal equations: effectively freezing a small fraction of the Jacobian at a point with favorable properties dramatically reduces the fluctuations between iterates in the coarse mesh regime where boundary and internal layers are only partially resolved.
As seen in the derivation, the method can be thought of as splitting the linearization of about two points, or more precisely as approximating by a combination of and where the first is used in the backward-Euler update, the first two are used in the Newmark update with the second adding control of numerical dissipation; and the third is introduced here to reduce the fluctuation of the Jacobian outside the domain of convergence of the Newton-like iterations. The method is derived as follows, and is demonstrated in Section 6 to work without the use of the normal equations in situations where other methods including those involving the normal equations are seen to fail.
Starting with the Newmark update (3.6), each of the time derivative terms on the right is split into two parts
Solving for and applying the relation
Applying the relation on the left and linearizing about on the right
Linearizing both about fixed , . Applying this and multiplying through by
Equation (3.15) is the basis for the -split Newmark update. It contains four parameters: ; while some guidance is provided, a detailed analysis of these parameters will be addressed in subsequent work by the author. In the current results is used, and the solution from the previous refinement interpolated onto the current mesh has also been observed to work; is chosen as in  for the backward-Euler discretization, may be set adaptively, increased to add stability then decreased to speed convergence; and is adaptively sent towards one based on the norm of the latest residual on each Newton-like iteration. It is observed in numerical experiments that should be close to one, and as seen in Section 4 this is a necessary for the asymptotic -linear convergence at the rate , in agreement with the rate found using the Newmark discretization without the -splitting.
4. Local convergence
Local convergence of the residual is established for both algorithms (2.8) based on the Newmark update, and (2.9) based on the Nemark update with -splitting of the Jacobian. The second result follows as a perturbation of the first for close to unity, relying on a weaker set of assumptions. Both results require the same Lipschitz condition on the Jacobian. Denote the open ball about by .
There exist so that for all
First, -linear convergence of the residual is established with rate , for . Linear and asymptotic quadratic convergence for the case is shown in , following from convergence of the error. In the present discussion, convergence of the error is neither used nor shown: it is observed in numerical experiments that the residual decreases at the predicted rate over iterations where may not yet be decreasing; in contrast, for the case the same quantity displays asymptotically quadratic convergence to zero together with the residual. The following proof is a variation of Theorem 2.12 of , where Assumptions 4.1 and 4.2 replace the affine contravariant Lipschitz condition used in that version of the Newton-Mysovskikh Theorem for the standard Newton method.
and suppose ; then by iteration (2.8), , and the sequence of residuals converges -linearly to zero with asymptotic rate .
Let . Then iteration (2.8) is given by
By the integral mean-value theorem
By the assumption , and for
for each such that for all . By the logic of  Theorem 2.12, proceed by contradiction and assume ; then there is a smallest with . For that
a contradiction. This shows . To establish the rate of convergence, set in 4.6.
which shows both that and the asymptotic -linear rate of as . ∎
It is observed that the method converges with the predicted rate when the iterates as given by (4.1). The lapse in the theory appears to be the bound on which apparently converges within a smaller set as compared to the residual.
The asymptotic convergence of the -split algorithm is addressed in the next theorem as a perturbation of Theorem 4.3. While it is not necessary in the proof for to be positive definite, is ideally chosen so that improves the condition of the approximate Jacobian, allowing Assumptions 4.5 to hold with better constants than 4.2.
(c.f. Assumption 4.2). There exist so that for , fixed , positive semidefinite , , and for all , then for all :
As with Assumption 4.2 (3), the third clause agrees with the similar stability bound of  Assumption 2.3, with replaced by . Local convergence of the -split Newmark algorithm is established with the same asymptotic rate as in the previous result.
for a given . Then there exists such that for in the open set given by
and defined by iteration (2.9), it hold that , and the sequence of residuals converges -linearly to zero with asymptotic rate .
Let . Then iteration (2.9) is given by
Much of the proof parallels Theorem 4.3, and is summarized here. Starting with the integral mean-value theorem
By iteration (4.11)
Applying (4.9) to the quantity under the assumption and expanding in orders of ,
it follows that
so there exists for which implies
The result follows by the same logic as in the proof of Theorem 4.3. ∎
It is observed in some problems that the residual can decrease at a rate slightly better than the one predicted when , whereas for the rate is as predicted.
The solver using the -Newmark iteration (2.9) may be implemented in an adaptive method according to the following basic algorithm. To exploit both the stability of method (2.9) and the quadratic convergence of (3.3), the parameters are updated on each refinement, and the parameters are updated on each iteration of the solver. The main steps of the adaptive method are as follows with the exit criteria and parameter updates specified below. An example of a targeted regularization term based on a posteriori error indicators is given in Section 6.
Algorithm 5.1 (Basic algorithm for (2.9)).
Criteria 5.2 (Exit criteria).
Given a user set tolerance , an accepted rate of convergence given by for some constant , e.g., , and a maximum number of iterations either chosen as a constant or based on the predicted rate of convergence, exit the solver on partition after calculating iterate when one of the conditions holds.
Maximum number of iterations exceeded.
In terms of the three phases of the solution process in , the final asymptotic regime is characterized by the iterations terminating by Criteria 5.2 (1); iterations in the pre-asympototic regime terminate with Criteria 5.2 (2); and iterations terminate with a mix of Criteria 5.2 (2) and (3) in the initial phase.
The second exit criterion 5.2 (2) merits some explanation, as it allows the iteration to end once reduction of the residual has slowed indicating the iterate has a attained a reasonably stable configuration from which a good prediction about where to refine the mesh may be determined. The first two clauses require the residual is decreasing, and has decreased below the level given by the previous iterate with at least as much decrease from the initial iteration on the current partition and the residual from the previous partition, if it is well defined. The third criterion requires error reduction at or close to the predicted rate, and the last that the error reduction between iterates is slowing down. These four criteria together assure the sequence of transitional states is not getting further in the sense of the solver’s residual from a converged solution, and prevent situations such as spikes propagating indefinitely across a sequence of partitions. While spikes, overshoots or other undesirable characteristics may propagate through several refinements, such solutions will eventually not reduce the residual at the specified rate. When the adaptive mesh is fine enough, these characteristics are observed to smooth out, and otherwise the solver eventually fails and the solution is reset and started again on a finer mesh.
One of the main improvements of (2.9) over the regularized method of  is the generation of more stable sequences of transitional iterates through the pre-asympototic phase. The sequence of pre-asymptotic approximate problems only partially resolve the data and as internal layers are uncovered over several refinements the condition of these problems is generally bad. The high-frequency dissipation of the Newmark strategy combined with the stabilization of the iterates produced by the -splitting allow sequences of solutions to propagate through this regime and the mesh to be marked via error indicators leading to the accurate solution of the problem on otherwise coarser meshes than possible if starting the solver on a mesh that is uniformly fine enough to resolve the data. In order to make use of the stability and dissipation properties of larger values of as well as the asymptotically quadratic convergence if the iterations are stable for , the following minimal guidelines are presented, based on the termination criteria above.
Update 5.3 (Newmark parameter ).
In practice, the residual tends to get reduced below tolerance once . In the results of Section 6, is decreased by two when the target rate is achieved, decreased by one if a stable rate lower than predicted is achieved, increased by one when the solver fails, and by two if the maximum number of iteration is exceeded while the iterations are converging below the acceptable rate. An initial should be chosen large enough to see error reduction on the initial mesh, and not significantly larger.
The regularization parameter is updated by the method described in , repeated here for convenience. In accordance with the convergence Theorems 4.3 and 4.6, this strategy assures so long as the residual is decreasing.
Update 5.4 (Tikhonov regularization parameter ).
Set . For ,
To reduce rapid fluctuation of , correct to ensure in the case that and if .
5.1. Marking strategy
An a posteriori error indicator is assumed available to determine adaptive mesh refinement and as one option for determining a targeted regularization term . In the current results, standard local residual-based element indicators as in for instance [23, 11] are used for both of these purposes, further described in Section 6. Other approaches to solver- and problem-specific regularization and mesh refinement are currently under investigation by the author.
The goal of the marking strategy is to build a mesh that is globally fine enough for stability, and locally as fine as necessary to achieve the desired accuracy. To improve the efficiency of the method by increasing the stability of the transitional states in the pre-asymptotic phase, the following marking strategy is presented. Based on exit criteria 5.2 there are three possible outcomes of the nonlinear solve on refinement .
In the case of 5.2 (3), the coarsest set of elements in the mesh are refined. Failure of the solver reflects a global rather than a local problem. Starting on a sufficiently coarse mesh, several resets are expected.
In the case of 5.2 (2) the error indicators are computed, and the mesh is refined both according to the elements with the largest local indicators, and according to the coarsest elements with the largest local indicators. This strategy allows local refinement to take place in order to capture boundary and internal layers to attain eventual accuracy and efficiency while also building the adaptive mesh fine enough to achieve stability. In meshes that are too coarse to resolve the data and where the approximate problem may have coefficients based on highly inaccurate approximate solutions, nonphysical overshoots often develop in the iterates; while the error indicators in the vicinity of these spikes may be high, refining primarily in these regions exacerbates the problem. As in case 5.2 (3), a large residual predicts a global issue with the mesh. However, valuable information about the near-singularities in the problem data can be predicted from the non-converged iterates, so some local refinement can build a more efficient mesh.
In the case of 5.2 (1) the error indicators are computed, and the mesh is refined with respect to the largest local indicators. Any reasonable marking procedure  for cases 5.2 (1)-(2) may be applied; in particular for case 5.2 (2) both the element with the largest local indicator must be marked, as well as the coarsest element with the largest local indicator. In the current results, the Dörfler marking strategy is used with parameter , which for case 5.2 (2) is split into and the marked sets are chosen by sets of least cardinality with
6. Numerical Examples
The nonlinear solver (2.9) is demonstrated on three problems with different structure in their internal layers. In Example 6.2 results are reported for a model nonlinear convection-diffusion problem of the form = 0 which has a smooth sinusoidal solution. The first variation on the model problem, Example 6.3 demonstrates the algorithm on the same differential operator with the problem data chosen to generate a higher frequency solution. Example 6.4 shows the results for a related nonlinear diffusion problem with two concentric internal layers. The adaptive finite element method is implemented using the finite element library FETK  and a direct solver is used on each linear system. Both trial and test spaces are taken as the linear finite element space consisting of Lagrange finite elements over partition that satisfy the homogeneous Dirichlet boundary conditions.
The local a posteriori residual-based indicator, and corresponding jump-based indicator for element