A Jacobian-free Newton-Krylov method for time-implicit multidimensional hydrodynamics

A Jacobian-free Newton-Krylov method for time-implicit multidimensional hydrodynamics

Physics-based preconditioning for sound waves and thermal diffusion
M. Viallet    T. Goffrey    I. Baraffe    D. Folini    C. Geroux    M. V. Popov    J. Pratt    R.Walder
Received; accepted
Key Words.:
Hydrodynamics - Methods: numerical - Stars: interiors
11institutetext: MaxPlanckInstitut für Astrophysik, Karl Schwarzschild Strasse 1, Garching, D-85741, Germany
11email: mviallet@mpa-garching.mpg.de
22institutetext: College of Engineering, Mathematics and Physical Sciences, University of Exeter, Exeter, EX4 4QL, UK 33institutetext: Ecole Normale Supérieure, Lyon, CRAL, UMR CNRS 5574, Université de Lyon, France

This work is a continuation of our efforts to develop an efficient implicit solver for multidimensional hydrodynamics for the purpose of studying important physical processes in stellar interiors, such as turbulent convection and overshooting. We present an implicit solver resulting from the combination of a Jacobian-Free Newton-Krylov method and a preconditioning technique tailored for the inviscid, compressible equations of stellar hydrodynamics. We assess the accuracy and performance of the solver for both 2D and 3D problems, for Mach numbers down to . Although our applications concern flows in stellar interiors, the method can be applied to general advection and/or diffusion dominated flows. The method presented in this paper opens up new avenues in 3D modeling of realistic stellar interiors allowing the study of important problems in stellar structure and evolution.

1 Introduction

The transport of heat, chemical species, and angular momentum in stellar interiors is governed by three dimensional, nonlinear (magneto-)hydrodynamical processes which develop over a wide range of temporal and spatial scales. The study of these processes with numerical simulations is a powerful way to improve our understanding of stellar structure and stellar evolution. Unfortunately, the integration of the compressible hydrodynamical equations with time explicit methods comes with a constraint on the time step resulting from the propagation of sound waves. This is the well-known Courant-Friedrich-Lewy (CFL) stability condition. We define CFL as the ratio between the time step and the largest explicit time step allowed by the CFL condition:

(1)

where is the time step, the mesh spacing, the adiabatic sound speed, and the flow velocity. Time explicit methods require CFL. This results in values of the time step which are small compared to the typical time scale of the relevant processes (e.g., the convective turnover time scale), making this approach computationally demanding. Nevertheless, an explicit time integration method remains the method of choice for multi-dimensional hydrodynamics in the astrophysical community (see e.g. Bazán et al. 2003; Meakin & Arnett 2007; Mocák et al. 2011; Herwig et al. 2014). A way to overcome this limitation is to rely on sound-proof models, which filter sound waves. Popular sound-proof models are the Boussinesq, the anelastic, or the pseudo-incompressible models (see e.g. Glatzmaier 2013, for a review). The use of such models, however, comes at the cost of a restricted range of applications due to the underlying approximations. Ideally, one seeks a way to efficiently solve the hydrodynamical equations regardless of the wide range of physical conditions characterizing stellar interiors (e.g. density stratification and a wide range of Mach numbers).

The MUlti-dimensional Stellar Implicit Code (MUSIC) follows a different approach by solving implicitly the fully compressible hydrodynamical equations (Viallet et al. 2011, 2013). The challenge for an implicit solver lies in the necessity of solving a large nonlinear system at each time step. In Viallet et al. (2013), the best performance was obtained with Newton-Krylov methods, which combine the Newton-Raphson method with an iterative linear solver. It was shown that the iterative solver requires preconditioning in order to achieve fast convergence for large CFL. In fact, within the framework of Newton-Krylov methods, the preconditioner is the crucial ingredient of the implicit solver. One of the important performance bottlenecks that was identified by the authors, particularly when considering three-dimensional calculations, is the inefficiency of “black-box” algebraic preconditioning techniques such as incomplete LU factorizations (ILU) for large CFL number computations. Furthermore, the memory requirement for the storage of the Jacobian matrix and the ILU factorization increases significantly in 3D, restricting the range of problems that can be addressed with such a method.

In this paper, we present an implicit solver which aims at overcoming these limitations. This is achieved by combining a Jacobian-free Newton-Krylov method with a preconditioner that is tailored for our physical equations, as described in more detail in Sect. 2. In Sect. 3, we design semi-implicit schemes that treat sound waves and thermal diffusion implicitly; in Sect. 4, we show how these semi-implicit schemes can be utilized to form an efficient preconditioner for the Newton-Krylov method. We present in Sect. 5 results that illustrate the performance of the solver for idealized test problems and for stellar interiors. We conclude in Sect. 6.

2 Numerical Description

MUSIC solves the equations describing the evolution of density, momentum, and internal energy, taking into account external gravity and thermal diffusion:

(2)
(3)
(4)

where is the density, the specific internal energy, the velocity, the gas pressure, the temperature, the gravitational acceleration, and the thermal conductivity. The system of equations is closed with an equation of state (EOS). For stellar interiors, these equations describe radiation-hydrodynamics in the diffusion limit. This is appropriate when the plasma is optically thick. In this case, the thermal conductivity due to photons is given by

(5)

where is the Rosseland mean opacity, and the Stefan-Boltzmann constant. Furthermore, the EOS includes the contribution of radiation to the internal energy and pressure. Optionally, MUSIC can solve the total energy equation in place of the internal energy equation:

(6)

where is the specific total energy.

We follow the method of lines and perform the spatial discretization independently of the time discretization (see e.g. LeVeque 2007). The spatial discretization is performed using a finite volume method with staggered velocity components located at cell interfaces. The numerical fluxes are calculated using an upwind, monotonicity preserving method of van Leer (1974). The resulting scheme is second-order in space and total variation diminishing.

The “conserved” variables, for which the conservation equations (24) & (6) are solved, are represented as the column vector when solving the internal energy equation, or when solving the total energy equation. In MUSIC, the unknowns are different from the conserved variables , and are represented as the column vector when solving the internal energy equation or when solving the total energy equation.

The spatial discretization yields a system of ordinary differential equations:

(7)

where contains the flux differencing and source terms.

The time discretization is carried out using the second-order Crank-Nicolson method:

(8)

We define the nonlinear residual function as

(9)

so that

(10)

defines the solution at time step . Equation (10) is solved with a Newton-Raphson method. At each Newton iteration, a linear problem of the form

(11)

must be solved, where is the Jacobian matrix.

The use of a Krylov iterative method like GMRES (Saad & Schultz 1986) is a standard practice for solving Eq. (11) when the matrix is large. However, Viallet et al. (2013) find that, for CFL, the iterative method requires preconditioning to remain effective. ILU factorizations can perform an adequate job at modest CFL numbers (CFL), but becomes inefficient at larger values of the CFL number. Furthermore, we find that such a preconditioning technique significantly increases the memory requirements.

In this work, we adopt an approach in which the Jacobian matrix is never explicitly formed. Jacobian-free Newton-Krylov (JFNK) methods are a popular choice for the resolution of large nonlinear system of equations, see Knoll & Keyes (2004) for a review. Since we do not form the Jacobian matrix, algebraic preconditioning techniques, such as ILU factorizations, have to be abandoned. Preconditioning the JFNK method, particularly when the CFL number is large, remains important for performance. One of the main goals of this paper is the design of an efficient preconditioner adapted specifically to the physics of stellar interiors.

From a physical point of view, at large CFL numbers (CFL) waves propagate over a large portion, if not all of the computational domain during a single time step. Effectively, it is as if information propagates at an infinite speed, as in parabolic problems. This changes the mathematical nature of the problem, i.e. from hyperbolic to parabolic, resulting in numerical stiffness. To be efficient, the numerical method has to take this property into consideration. Multi-grid methods attempt to exploit this property by exchanging information between the large and small scales, see for example Kifonidis & Müller (2012). We adopt another approach which consists of using legacy methods, known as semi-implicit (SI) schemes, as preconditioners for the Krylov solver. This strategy is known as “physics-based preconditioning” (PBP), as the preconditioner is tailored for the physical problem, see e.g. Mousseau et al. (2000); Knoll & Keyes (2004); Reisner et al. (2005); Park et al. (2009). SI schemes treat implicitly only the terms that are responsible for numerical stiffness. The scheme derived this way is numerically stable for CFL greater than one, as the stiff physics is treated implicitly. However, the accuracy of the solution obtained by the SI scheme is usually quite poor due to the approximations involved in the derivation of the scheme (see Sect. 3). Good accuracy can be achieved by embedding such a scheme within a Newton-Krylov method as a preconditioner. This work closely follows Park et al. (2009), adapting their method to our numerical scheme and physical equations.

3 Semi-implicit schemes for gas dynamics

In this section, we derive SI schemes for the hydrodynamic equations (24) & (6) which treat sound waves and thermal diffusion implicitly. The remaining terms (e.g. advection) are treated explicitly. In this section, our only concern is to design schemes that are stable and inexpensive, rather than accurate. Later, we will use these schemes as preconditioners for a fully implicit and accurate method.

3.1 Equations for , , and

Our SI schemes are derived from the evolution equations for the primitive variables . These are

(12)
(13)
(14)

where and are the generalized adiabatic indices for a general equation of state. For a perfect gas without any internal degrees of freedom, these adiabatic indices reduce to , where is the usual adiabatic index. The detailed derivation of the pressure equation, Eq. (12), is given in Appendix A.1.

To simplify the notation and without loss of generality, we will consider the one-dimensional version of these equations:

(15)
(16)
(17)

where we assumed that , where is a unity vector in the -direction. Extension of the numerical scheme to higher dimensions is straightforward.

3.2 Transformation matrices

Having introduced the conserved variables , the independent variables , and the primitive variables , we will need the transformation matrices and , which are defined as:

(18)
(19)

These matrices are given below for both the case of the internal and total energy equations, and the details of their derivation is postponed to Appendix B.

3.2.1 Internal energy equation

When solving for the internal energy equation, and , the transformation matrices take the following form:

(20)

and

(21)

The required derivatives are those typically provided by EOS routines.

3.2.2 Total energy equation

When solving for the total energy equation, and , the matrices take the form:

(22)

and

(23)

3.3 SI scheme for sound waves

We first design a SI scheme that treats sound waves implicitly. In Sect. 3.3.1 we start with deriving the propagation equation for adiabatic acoustic fluctuations, which identifies the terms in the equations that need to be treated implicitly.

3.3.1 Propagation equation for acoustic fluctuations

In this section we neglect thermal diffusion and gravity. We linearize the 1D equations (15) and (17) around a uniform background state:

(24)
(25)
(26)

Keeping only linear terms in the perturbations, we obtain:

(27)
(28)

Next, we take the derivatives of Eq. (27) with respect to and Eq. (28) with respect to . We substitute the result of the differentiation of Eq. (28) into the result of the differentiation of Eq. (27) to eliminate and obtain the wave equation that describes the adiabatic propagation of sound waves:

(29)

where is the adiabatic sound speed.

The terms on the r.h.s of Eqs. (27) & (28) are responsible for the propagation of sound waves. To overcome the corresponding CFL limit, we will treat them implicitly in the following section.

3.3.2 Pressure equation

To treat sound waves implicitly, Sect. 3.3.1 suggests that we treat the “” term in the pressure equation (Eq. 15) implicitly, with a simple backward Euler method:

(30)

Here , being the temporal index. We use Picard linearization in order to keep the scheme linear111Picard linearization refers to the fact that we write instead of when applying the implicit discretization. This is an approximation, but it has the advantage of keeping the scheme linear in the new variables.. All other terms in the equation are treated explicitly using the forward Euler method. Using Eq. (28), we approximate with

(31)

which we substitute in Eq. (3.3.2) to obtain

(32)

where is the adiabatic sound speed evaluated at time step . The right hand side of Eq. (3.3.2) corresponds to the explicit discretization of the original equation, but on the left hand side a Laplacian operator illustrates the parabolic character of this equation.

3.3.3 Internal energy equation

We approach the internal energy equation, Eq. (16), in the same way as the pressure equation. Advection and thermal diffusion terms are discretized using an explicit scheme, and the compressional work is discretized using an implicit scheme. This produces:

(33)

where . Again, we used Picard linearization to discretize the compressional work. We use again Eq. (31) to eliminate in Eq. (33) to obtain

(34)

The resulting equation is similar in form to the implicit version of the pressure equation, Eq. (3.3.2), as it contains the Laplacian of the pressure field.

3.3.4 Velocity equation

We discretize the pressure gradient in the velocity equation, Eq. (17), implicitly, using Picard linearization. All remaining terms are discretized explicitly. We obtain

(35)

where .

3.3.5 -form” of the equations

By replacing for all implicit terms in Eqs. (3.3.2, 3.3.3, 35), rather than only those terms involving time derivatives, one obtains the following system of equations:

(36)
(37)
(38)

where we introduced the following residual functions:

(39)
(40)
(41)

The solution at time satisfies , , and .

We write the system in a matrix form:

(42)

with and . This formulation of the equations is known as the “form”. The block structure of is:

(43)

In this form, the system can be solved by operator splitting: the pressure equation (36) is first solved for , is deduced from the internal energy equation (37), and is deduced from the velocity equation (38). Note that in the energy equation, one can use the equality

(44)

which is obtained from the pressure equation, rather than writing the Laplacian of explicitly. In Sect. 3.5, we discuss how we solve numerically the parabolic equation for .

3.4 SI scheme for sound waves and thermal diffusion

When thermal diffusion is important, it can cause numerical stiffness. In this case, it also needs to be treated implicitly. This can be easily implemented in the framework of the previous section: all that is required is to treat thermal diffusion implicitly in the pressure and internal energy equations. Equation (3.3.2) now becomes:

(45)

and Eq. (3.3.3) becomes:

(46)

In both Eqs. (3.4) and (3.4), we use Picard linearization to treat the diffusion term.

The new system for variables in -form is

(47)
(48)
(49)

where the residuals are unchanged, and given in Eqs. (39), (40) & (41).

We use the linearized equation-of-state to express as

(50)

where is the specific heat capacity at constant volume222Here it is understood that partial derivatives are evaluated at time step .. In general, the contribution due to density fluctuations is much smaller333It is zero for a perfect gas, for which . and we neglect them:

(51)

This approximation is used to replace with in the previous system to obtain

(52)
(53)
(54)

We now have a system of two coupled parabolic equations, as seen from the block structure of the matrix :

(55)

The solution strategy of a system of equations in which thermal diffusion terms are treated implicitly is therefore more complicated than in the previous section: the two coupled parabolic equations (52) and (53) have to be solved jointly for and , and finally is obtained from Eq. (54).

3.5 Numerical solution of the parabolic system

For both SI schemes presented previously, the numerical solution of a system of linear parabolic equations is required. This is accomplished in MUSIC using the Trilinos library (see Heroux et al. 2005). Specifically, MUSIC uses the iterative linear solver GMRES implemented in the package AztecOO to solve the parabolic system. The convergence of the linear solver is checked based on the criterion:

(56)

where is the system matrix, the solution vector, the r.h.s. of the linear system, and controls the accuracy of the solution. When setting , we find that the preconditioner has the same performances as when we use a direct solver to solve the parabolic system. However, in practice it is not necessary to solve the parabolic problem with such accuracy, as the preconditioner is only meant to provide an approximate solution of the problem. The results presented in this work were obtained by adopting a value . This value ensures an accuracy that is sufficient for the purpose of preconditioning. It is possible that the performance could be improved by adopting even larger values of , as the decrease in the quality of the preconditioner could be mitigated by the decrease in its computational cost. This is left for future investigation. A multi-level preconditioner is applied to speed up convergence of the linear solver. We use the ML package of the Trilinos library to setup a multi-level preconditioner (Gee et al. 2006). Our preconditioner is based on the default parameters provided for the smoothed-aggregation setup in the ML package (parameters set “SA”) with the following two modifications. First, instead of using the default method to estimate the eigenvalues of the matrix we use the 1-norm of the matrix. The default method of estimating the eigenvalues used a method based on a conjugate-gradient solution of the system, seeded by a random vector. This random vector caused simulations continued after restarting to differ from simulations without the restart, removing the ability to reproduce results. Second, we reduce the damping factor for the precondition from the default of 1.33 to 1.2, which in our case, results in fewer iterations for the parabolic solver to converge.

3.6 Time-stepping with SI schemes

The SI schemes designed in this section can be used as time-stepping methods to solve Eqs. (24) & (6). The time-marching algorithm is:

  1. Given the solution at time step , , compute (see Eq. 9);

  2. Transform into a residual for the primitive variables :

    (57)
  3. Compute corresponding to the desired SI scheme and solve

    (58)

    for ;

  4. Transform into :

    (59)
  5. Set .

The scheme is linear (we used Picard linearization to deal with nonlinear terms) and only first-order in time (we used the forward/backward Euler methods). However the CFL limit is less restrictive as the terms associated with acoustic fluctuations were discretized implicitly. Similarly, thermal diffusion does not imply any stability restriction on the time step if the second SI scheme is used. However, since the advective terms were discretized explicitly, a time step restriction based on the flow speed remains.

3.7 2D isentropic vortex test

Figure 1: Convergence tests of the SI scheme treating sound waves implicitly - advection of an isentropic vortex at different Mach numbers. The continuous lines show the norm of the errors measured in the velocity component parallel to the direction of advection.

In this section, we test the SI scheme for sound waves. We use the isentropic vortex advection test originally proposed in Yee et al. (2000), and we adopt a setup similar to the one used by Kifonidis & Müller (2012) and Viallet et al. (2013) to test the accuracy of the SI scheme.

The initial state consists of an isentropic vortex (i.e. zero entropy perturbation) embedded in an uniform flow of norm . We use a Cartesian system of coordinates where the -axis is taken in the direction of the flow. The vortex corresponds to the following perturbations in the state variables:

(60)
(61)

where , (we set the gas constant and work with dimensionless quantities), is the adiabatic index, and the vortex strength. We use and , with initial conditions

(62)
(63)
(64)
(65)

where the subscript indicates the background value. The sound speed of the background is . The maximum velocity of the vortex is . We define the vortex Mach number as . By varying , we change the Mach number of the flow. We consider , which corresponds to , , , respectively.

The computations are performed on a 2D Cartesian domain . Initially, the vortex is centered on the origin. The vortex is advected until . The exact solution corresponds to the initial vortex profile being shifted by a distance equal to in the direction. To test the accuracy of the scheme, we compare the velocity in the direction of advection, , with the expected analytic solution . Kifonidis & Müller (2012) and Viallet et al. (2013) used the density field to monitor the error, but here the background density is changed when is changed, which is not the case for the velocity field. We monitor three different norms of the error:

(66)
(67)
(68)

where , are the grid dimensions, and the indices , range over the simulation grid.

We introduce the advective CFL number:

(69)

where is the time step and the mesh spacing. For a vortex advected at , the advective CFL number provides a measure of the number of grid cells crossed per time step .

To characterise the accuracy of the SI scheme, we perform a temporal convergence study. The resolution of the domain is set to and we choose different time steps in order to cover a broad range of CFL, between and 3. This corresponds to CFL as large as . Later, we will compare the result with a more accurate time integration method. We do not study convergence with spatial resolution here, as our spatial method remains the same in all schemes presented in this work, and is unchanged as compared to previous publications. A spatial resolution study is presented in Viallet et al. (2011).

We evolve the isentropic vortex varying both the Mach number and CFL, and monitor the numerical errors. We expect two behavioral regimes. At low values of CFL, the error should be approximately independent of the time step, as the spatial error dominates. At higher values of CFL, the temporal error should dominate, and be proportional to as the SI scheme is first-order in time. The results are presented in Fig. 1. The expected behavior is recovered, although the flat regime at low values of the time step is not clearly seen. Temporal truncation errors remain significant for small time steps, as a result of the approximations introduced when designing the SI scheme. The first-order character of the temporal discretization appears clearly for larger values of the time step. However, the most important conclusion is that the numerical error is independent of the Mach number. Effectively, we achieved our goal of designing a scheme that is independent of the stiffness of the background pressure field. We stress that this behavior is observed over a range of Mach numbers spanning five orders of magnitude.

Finally, although we successfully removed the stability constraint on the time step caused by sound waves, there is still a CFL-like condition based on the advective velocity. Such a stability limit is not evident from Fig. 1, as only a few models are computed for the largest time steps. Empirically, we determined that the SI scheme becomes unstable for CFL.

4 Jacobian-free Newton-Krylov method and physics-based preconditioning

4.1 Newton-Krylov method

To solve the nonlinear system of equations, , resulting from our fully implicit method we perform Newton-Raphson iterations. The Newton-Raphson procedure is initiated by taking an initial guess for the solution, typically . At the -th Newton-Raphson iteration, the solution of a linear system is required:

(70)

where . The variable is the solution at iteration , and

(71)

is the Jacobian matrix at iteration .

The components of and can have considerably different numerical values as they represent different physical quantities in different units. For instance, densities can have typical values around g/cc, whereas specific internal energies have values around erg/g. Also, due to the stratification of stellar interiors, some variables, such as the density, can vary by several orders of magnitude throughout the domain. Such a wide range of values can cause numerical difficulties due to round-off errors. Therefore, before the system (70) can be solved, it is necessary to scale it. We introduce two diagonal matrices and to scale Eq. (70):

(72)

As and are diagonal matrices, we use the same symbol to represent their diagonal entries as a vector. The size of these vectors is equal to the number of variables multiplied by the number of cells. Each cell is treated in the same way, and the definitions of and only differ for different variables:

where is the adiabatic sound speed computed from the solution at iteration . represents the typical value of the unknown vector , and attempts to remove both the effects of units and stratification. We follow a similar idea for and use the typical value of the conserved variables to scale the residual vector . However, as velocities can be arbitrarily small, it is necessary to introduce a minimum velocity, here measured relative to the sound speed using the parameters in the definitions of and . The work described in Viallet et al. (2011) and Viallet et al. (2013) used . After testing, we found that and gives good performance for a wide range of Mach numbers, typically , see discussion in Sect. 5.

The Newton-Raphson procedure is terminated when the relative corrections fall below a certain value :

(73)

In Viallet et al. (2013), it was shown that the nonlinear tolerance has to be chosen small enough so that the truncation errors of the scheme dominate the numerical error. We follow their recommendation and set . Finally, if Eq. (73) is already fulfilled at the first iteration, we enforce a second Newton-Raphson iteration. For the sake of clarity, we drop from now on the superscript of the outer nonlinear iteration of the Newton-Raphson procedure, and we do not carry the scaling matrices and in the notation in the rest of the paper.

We use the GMRES method to solve iteratively Eq. (72). We start from an initial guess , and we define the initial residual as . In practice, we choose so that . At the -th iteration, the GMRES method seeks an approximation of the solution by solving a minimization problem in the -th Krylov space of :

(74)

The dimension of the search space increases at each iteration until convergence is achieved.The convergence of the linear solver is tested with the criterion

(75)

where is a parameter that determines the accuracy of the solution. Typical values of that are used in this paper are and , see discussion in Sect. 5.

4.2 Jacobian-free approach

To build successive Krylov spaces, the GMRES algorithm computes the action of the Jacobian matrix on a vector. This is the only use of the Jacobian operator, and we take advantage of the fact that this operation can be approximated by finite-differencing:

(76)

where is a small number. We rely on the implementation of matrix-free operators available from the Trilinos package NOX. This package contains two preset options for calculating :

(77)

and,

(78)

In both cases, is a small parameter with a default value of . The standard choice in MUSIC is to use Eq. (77), as it gives the best results (see discussion in Sect. 5).

In this Jacobian-free approach, the Jacobian matrix is not needed explicitly, lowering the memory cost of the scheme. Instead, computing the action of the Jacobian on a given vector requires one evaluation of the nonlinear residual in Eq. (76), assuming that has been already computed and stored.

When has a large condition number, the Krylov method fails to converge in an acceptable number of iterations (a few dozen) as the Krylov space is dominated by the direction of the eigenvector associated with the largest eigenvalue. In such cases, preconditioning is necessary. In this work, we use the SI schemes presented in Sect. 3 as preconditioners for the Krylov method. This is detailed in the next section.

4.3 Right-preconditioning of GMRES with SI schemes

Right-preconditioning of system (72) corresponds to solving the equivalent system:

(79)
(80)

where is the preconditioning matrix. is an intermediate solution vector, which once known, is used to find the solution . If the preconditioning matrix is a good approximation of , i.e. has a low condition number, the Krylov space of is better suited to construct an approximation of the solution

(81)

Once a suitable solution has been found in the search space, based on the same convergence criterion as (75), a final linear system, Eq. (80), has to be solved to get the actual solution .

The key part of the right-preconditioning process is the application of on a Krylov vector , provided by GMRES. This operation is required at each iteration to build the successive Krylov spaces. In right-preconditioning, is computed in two steps:

  1. Solve for ;

  2. Apply to .

The first step requires the inversion of a linear system; the second step requires the action of the Jacobian on the vector and is approximated by a finite-difference formula (Jacobian-free approach).

The basic idea of physics-based preconditioning is to interpret the system in step 1 above as a system corresponding to a linear time-stepping scheme written in -form:

(82)

where (,) describes a numerical scheme that approximates the full nonlinear scheme (,). Another way to understand physics-based preconditioning is that Eq. (82) defines a mapping from residuals to perturbations . Therefore, the Jacobian matrix is always applied to a to yield a residual vector which is used to build Krylov spaces.

The SI schemes designed in Sect. 3 are good candidates for the scheme in Eq. (82). These schemes provide a good approximation of the solution (i.e. ), and most importantly they remove the numerical stiffness by solving the stiff physics (sounds waves and thermal diffusion) implicitly. Physics-based preconditioner therefore “injects” physical insight at the heart of the linear method, improving its convergence. However, the Krylov vector is a residual for variables , and it needs to be transformed into a residual for variables before a SI scheme can be used. Furthermore, the SI scheme provides , which needs to be transformed into before can be applied. As in the time-stepping algorithm described in 3.6, we use the matrices derived in Sect. 3.2 to do these transformations. The complete algorithm to use the SI as a preconditioner is:

  1. Input: the GMRES method provides a vector . can be interpreted as a residual vector for the conservative variables , which we denote ;

  2. Transform into a residual for the primitive variables :

    (83)
  3. Apply the SI scheme to get :

    (84)
  4. Transform into :

    (85)
  5. Compute using the Jacobian-free method.

  6. Output: vector , which is provided to GMRES to build successive Krylov spaces.

We note that the scheme is not fully matrix free: the SI scheme requires the resolution of a linear problem for which the matrix is explicitly formed and stored. However, thanks to the simplifications made in deriving the SI scheme, the matrix system is significantly smaller and more sparse than the Jacobian matrix. This keeps memory demand low.

In the remainder of the paper, the combination of the Jacobian-free Newton-Krylov method with physics-based preconditioning presented in this section will be referred to as the JFNK+PBP method.

Figure 2: Same as Fig. 1, but using the JFNK+PBP scheme to advect the isentropic vortex.

5 Results

In this section, we assess the performance of our JFNK+PBP method in both 2D and 3D. In Sect. 5.1, we test the accuracy and efficiency of the method using idealized tests that use an ideal-gas equation of state and a Cartesian geometry. In Sect. 5.2, we test the method to model stellar interiors in a spherical geometry. An important goal of this section is to demonstrate the good performance, robustness and accuracy of the solver for a wide range of Mach numbers, typically from down to .

5.1 Ideal Test Cases

5.1.1 2D isentropic vortex

We first investigate the accuracy of the solver by considering the 2D isentropic vortex problem that we used to test the SI scheme in Sect. 3.7. We perform the same set of runs with the JFNK+PBP method, and the computed errors are shown in Fig. 2. Comparing with the error of the semi-implicit scheme (see Fig. 1), the JFNK+PBP scheme achieves an overall reduction in the error. The range of time steps where spatial truncation errors dominate is larger, and we observe the second-order character of the temporal error at large time steps. The use of a SI scheme as a preconditioner does not impact the overall accuracy of the JFNK+PBP method. Figure 2 also shows that the results of the JFNK+PBP method are independent of the Mach number, and this desirable property of the SI scheme has been inherited by the nonlinear method.

Four free parameters enter the JFNK+PBP method: the choice of perturbation strategy (Eqs. 77 & 78); the two scaling coefficients for the velocity components of the solution and residual vectors (parameters and in Sect. 4.1); the tolerance required for the solution of the Jacobian equation (parameter in Eq. 75). These free parameters were determined by testing.

We find that the accuracy of the solver for the higher end of the Mach number range being considered () is good regardless of the choice of perturbation strategy. However, for lower Mach numbers () we find that only Eq. (77), with , is able to yield accurate results. When using Eq. (78), the Jacobian operator is poorly approximated, regardless of the value of , resulting in a failure of the nonlinear method.

In the scaling of the linear system, we find that a value of gives the most consistent errors across Mach numbers for the range being considered, i.e. . However, this range can be adjusted by tuning to the problem at hand, with a higher value producing more accurate solutions for high Mach number flows. We find that enables us to obtain accurate results for the range of Mach numbers being considered.

We find that a linear tolerance produces solutions with similar errors for the full range of Mach numbers considered in this work. A choice of produces similar results for the higher Mach numbers, but the quality of the solutions for low Mach numbers degrades seriously.

Next, we assess the efficiency of the SI scheme as a preconditioner. This is done by considering the number of GMRES iterations necessary to reach convergence without preconditioning and with physics-based preconditioning, for different values of CFL and different Mach numbers (the linear tolerance is set to ). When solving the unpreconditioned linear system, we found that for the majority of linear problems, particularly for higher Mach numbers, fails to converge. Instead, we present for the unpreconditioned case the convergence behavior for , as a best case scenario. When solving the linear system with physics-based preconditioning, we use the optimal parameters described previously.

The results of convergence tests for the iterative method are shown in Fig. 3. For these tests, the simulations are run for 100 time steps. Without preconditioning, the different Mach number cases behave similarly: the number of GMRES iterations increases rapidly for CFL, and above CFL no convergence is achieved despite the large number of iterations allowed. Such behavior is due to the stiffness of acoustic waves which increases with CFL. Our physics-based preconditioner is tailored to treat this effect, and the improvement is demonstrated in the right panel of Fig. 3, as compared to the left panel without preconditioning. In each case, the increase in the number of GMRES iterations takes place at larger values of the time step. With physics-based preconditioning, the failure of the linear solver is now coming from the unstable behavior of the SI scheme for too large a CFL.

Figure 3: Convergence of the GMRES solver without preconditioning (left panels) and with the physics-based preconditioner (right panels). The upper panels correspond to the 2D isentropic vortex, and the lower panels to the 3D Taylor-Green vortex. In both cases, the Mach numbers considered are , , , (the right panels assume the same legend as the left ones). The maximum allowed number of GMRES iterations was set to 300. The mean values of the number of iterations for convergence is plotted, with shaded areas showing maximum and minimum values. For each Mach number, the location of CFL corresponding to CFL is shown by a vertical dashed line.

5.1.2 3D Taylor-Green vortex

We consider the Taylor-Green vortex problem to test our physics-based preconditioner for an adiabatic (i.e. no thermal diffusion) flow in 3D. We consider a Cartesian domain , where is a lengthscale that sets the physical size of the domain. The initial conditions for the velocity field are

(86)
(87)
(88)

The domain has a uniform density of . The initial pressure field is

(89)

which ensures that

(90)

i.e. the initial conditions do not induce any acoustic modes. The initial amplitude of the vortex is measured in terms of the Mach number , where is the adiabatic sound speed. The adiabatic index is taken as . We take as our unit of length, as our unit of velocity, and as our unit of mass. In this normalisation, time is measured in units of , and energy density in units of . We change to vary the Mach number in the range . We consider a numerical domain with a resolution of . For this test case, we define the advective CFL number as