Krylov integrators for Hamiltonian systems

# Krylov integrators for Hamiltonian systems

Timo Eirola and Antti Koskela
In memory of Timo Eirola (1951–2016)
###### Abstract

We consider Arnoldi like processes to obtain symplectic subspaces for Hamiltonian systems. Large systems are locally approximated by ones living in low dimensional subspaces; we especially consider Krylov subspaces and some extensions. This will be utilized in two ways: solve numerically local small dimensional systems or in a given numerical, e.g. exponential, integrator, use the subspace for approximations of necessary functions. In the former case one can expect an excellent energy preservation. For the latter this is so for linear systems. For some second order exponential integrators we consider these two approaches are shown to be equivalent. In numerical experiments with nonlinear Hamiltonian problems their behaviour seems promising.

## 1 Introduction

Symplectic methods have shown to be very effective in long time integration of Hamiltonian systems (see [11]). Many of them are implicit and necessitate the solution of systems of equations. If the differential equation system is large and sparse, a natural approach is to use Krylov subspace techniques to approximate solution of the algebraic equations.

A related approach is to use Krylov approximations of the matrix exponential in the so-called exponential integrators (see [13]). This has turned out to be a superior technique for many large systems of ordinary differential equations.

Krylov subspace techniques can be viewed as local low dimensional approximations of the large system. For Hamiltonian systems the standard Arnoldi type iterations produce low dimensional systems that are no longer Hamiltonian. In this paper special attention is paid to produce subspaces with symplectic bases. Also the time symmetry of the Hamiltonian systems taken into account when producing the bases.

This is an extended version of the slides by Eirola, presented at a Workshop on Exponential Integrators in Innsbruck in 2004 (see [7]). Part of the material was introduced in the master’s thesis of the second author [16] which was supervised by Eirola. The original ideas of Eirola came from considering linear Hamiltonian systems in as –linear111 These are of the form: for , systems in (see [8]). The slides [7] were using that language, but the present version is written in a more standard form.

## 2 Hamiltonian systems

Given a smooth function , consider the Hamiltonian system

 \em x′(t)=\em J−1∇H(\em x(t)) ,\em x(0)=\em x0 , (1)

where . A matrix is called Hamiltonian, if , and symplectic, if . The Jacobian of is a symmetric matrix at every point. Thus is a Hamiltonian matrix.

We assume that (1) has a unique solution and write it . Then

• Energy is preserved: is constant in .

• For every the mapping is symplectic (or canonical), that is, its derivative is a symplectic matrix at every point.

• The mapping is time symmetric, i.e. for every .

Symplectic integrators produce symplectic one step maps for Hamiltonian systems (see [11]). For example, the implicit midpoint rule

 \em xj+1=\em xj+h\em J−1∇H((\em xj+\em xj+1)/2)

is such. For linear systems, i.e., when is of the form , the energy is also preserved in the numerical solution with this and many other symplectic methods. One step methods are called symmetric the map given by the integrator is time symmetric, i.e. chaning to is equivalent to switching and . The implicit midpoint rule, for example, is symmetric.

For large systems implicit methods may become expensive. In this paper we will consider several low dimensional Hamiltonian approximations and the use of implicit methods or exponential integrators for these.

## 3 Symplectic subspaces and low dimensional approximations

Recall some basic definitions and properties in . Denote the nondegenerate skew-symmetric bilinear form . A subspace is isotropic, if for all , and a subspace is symplectic, if for every nonzero there exists such that . Then the dimension of is even. A basis is called symplectic, or a Darboux basis, if for all holds and .

If is an isotropic subspace with an orthonormal basis , then are also –orthogonal and is a symplectic subspace and is a symplectic basis of .

We call also a matrix symplectic, if (pointing out the dimensions) . Then is a left inverse of if and only if U is symplectic.

We will consider local approximations of the Hamiltonian system

 \em x′(t)=\em f(\em x(t))=% \em J−1∇H(\em x(t)) .

Assume that at a point we are given a symplectic matrix . Consider the Hamiltonian system in corresponding to the function . Then we get

 ξ′=\em J−1∇η(ξ)=\em J−1\em UT∇H(\em x0+\em Uξ) ,

which is Hamiltonian in . Set . Then

 ξ′(t)=\em U†\em f(\em x0+\em Uξ(t)) . (2)

One strategy is to solve (2) numerically from up to and set . Clearly, if we use an energy preserving scheme for the system (2), we will conserve the energy of the large system too, i.e. .

Note that if the sets of constant energy of the original system are bounded, then they are such for the small dimensional approximations too. This implies that the approximations inherit stability of equilibria in a natural way.

In case that at we are given a matrix with orthonormal columns we set in (2). Then the system is not necessarily Hamiltonian.

We will consider also another strategy which is, instead of solving low-dimesional systems, we approximate suitable functions of a numerical method in the low dimensional space . As we will see, for the exponential integrators we consider these two approaches are equivalent.

The idea of approximating a Hamiltonian system by another of smaller dimension is not new. See, for example the discussion in [17]. A novelty here is to use local (later Krylov) approximations.

If is symplectic and does not depend on , then it is not difficult to prove, for example, that using the implicit midpoint rule for (2) induces a map that is symplectic in , that is

 ω(\em Dψ(\em x0)\em d% ,\em Dψ(\em x0)˜\em d)=ω(\em d,˜\em d)for all\em d,˜\em d∈R(\em U) .

But in order to get efficient algorithms we let to depend on and then this approach generally does not produce a symplectic map.

## 4 Exponential integrators

The use of approximations of the matrix exponential as a part of time propagation methods for differential equations has turned out to be very effective (see e.g. [13]). We will consider application of three second order exponential integrators to the Hamiltonian system (1). In what follows, the matrix H will denote the Jacobian of the right hand side of (1) at , i.e., . The methods can be seen as exponential integrators applied to semilinear equations which are local linearizations of (1). In the literature methods of this type are also called exponential Rosenbrock methods [14].

### 4.1 Exponential Euler method

As a first method we consider the exponential Euler method (EE) 222We use the shorthand notation etc.

 \em x+=\em x+hϕ(h\em H)% \em f(\em x) , (3)

where and .

Note that if is linear, then , i.e., the method gives exact values of the solution.

Assume now that the system is Hamiltonian in and is symplectic. Then is a Hamiltonian matrix as well as for .

If we use the exponential Euler method (3) for the low dimensional system (2) we produce

 \em x+=\em x+h\em Uϕ(h% \em F)\em U†\em f(\em x) . (4)

For linear problems this will preserve energy exactly:

###### Lemma 1.

Assume the system is of the form . Then the exponential Euler method (4) preserves energy, i.e., .

###### Proof.

The local problem now is (see (2))

 ξ′(t)=\em Fξ(t)+\em U†(\em H\em x+\em c) ,ξ(0)=0 .

Then , i.e., the exponential Euler approximation gives the exact solution for the problem in . Hence the energy is preserved in the small system and consequently also for . ∎

### 4.2 Explicit exponential midpoint rule

We consider next the explicit exponential midpoint rule (see [13])

 \em x+=\em x+eh\em H(\em x−−\em x)+2hϕ(h\em H)\em f(% \em x) , (5)

where . For linear Hamiltonian problems this gives

 \em x+ =\em x+eh\em H(\em x−−\em x)+2(eh\em H−\em I)\em x+2hϕ(h\em H)\em c =eh\em H(\em x−+\em x)−\em x+2hϕ(h\em H)\em c ,

i.e., , where is the solution of , . Hence the energy of the averages is preserved:

 H(12(\em x++\em x))=H(12(%\emx+\em x−)) . (6)
###### Remark 2.

In [9] it was noticed that the explicit midpoint rule (for the homogeneous problem)

 \em x+=\em x−+2h\em H\em x

preserves another quantity: . Equation (6) implies this, too.

Again we approximate (5) with

 \em x+=\em x+\em Ueh\em F\em U†(\em x−−\em x)+2h\em Uϕ(h\em F)\em U†% \em f(\em x) . (7)

For this we have

###### Theorem 3.

Let be linear, symplectic, and assume that is in the range of . Then (6) holds for the scheme (7).

###### Proof.

Now . Write . By the assumption there exists such that . Then . From (7) and we get

 ˆ\em x+ =\em x−\em Ueh\em Fζ+h\em Uϕ(h\em F)\em U% †(\em H(ˆ\em x+\em Uζ)+\em c) =ˆ\em x+h\em Uϕ(h\em F% )\em U†(\em Hˆ\em x+\em c)+\em U[\em I−eh\em F+hϕ(h\em F)\em F]ζ =ˆ\em x+h\em Uϕ(h\em F% )\em U†(\em Hˆ\em x+\em c) .

Thus the -vectors propagate according to the exponential Euler method (4) and we get the result by Lemma 1. ∎

In case the column space of contains the vector , we also have the following.

###### Lemma 4.

Assume that is a full rank matrix at with a left inverse , and that contains . Then, the approximate explicit exponential midpoint rule is symmetric.

###### Proof.

Multiplying (7) with gives

 \em Ue−h\em F\em U†(% \em x+−\em x)=\em x−−\em x+2h\em Ue−h\em Fϕ(h\em F)\em U% †\em f(\em x) .

Since we get

 \em x−=\em x+\em Ue−h\em F\em U†(\em x+−\em x)−2h\em Uϕ(−h\em F)\em U†% \em f(\em x) .

Thus the steps backward can be taken by replacing with .

### 4.3 Implicit exponential midpoint rule

As a third exponential integrator, we consider implicit exponential midpoint rule (IEMP)

 0 =eh\em H(\em x−ˆ\em x)+hϕ(h\em H)\em f(ˆ\em x) , (8) \em x+ =ˆ\em x+e2h\em H(\em x−ˆ\em x)+2hϕ(2h\em H)\em f(ˆ\em x)

(see [3]). This gives a symmetric method when the linear part of is fixed. When H comes from a linearization of a nonlinear Hamiltonian system (1), the method is symmetric if , where satisfies (8).

For linear systems of the form , the second equation of (8) can be written equivalently as . Then, propagates according to the exponential Euler method and the energy is preserved in case is symplectic (Lemma 1).

When we apply (8), the total approximation is symmetric if is evaluated at the midpoint .

###### Lemma 5.

Assume that is a full rank matrix with a left inverse . Suppose is evaluated at , where satisfies (8). Consider the approximation

 \em x+=\em x+\em Uξ+, (9)

where is obtained from applying (8) to the local system. Then, (9) gives a symmetric method.

###### Proof.

Applying (8) to the local system (2) gives

 0 =−eh\em Hξ+hϕ(h\em F% )\em U†\em f(\em x+\em U% ξ) (10) ξ+ =ξ−e2h\em Fξ+2hϕ(2h\em F)\em U†\em f(% \em x+\em Uξ).

We show that (10) leads to a symmetric approximation of the full system. Multiplying the upper equation of (10) by , and using the relation gives

 −eh\em Hξ+2hϕ(2h\em F)\em U†\em f(\em x+\em Uξ)=hϕ(h\em F)\em U†\em f(\em x+\em Uξ) .

Combining this and both equations of (10) gives

 ξ+=ξ+eh\em Fξ.

Multiplying this from left by and adding gives

 \em x+−ˆ\em x=\em Ueh\em F% \em U†(ˆ\em x−\em x) ,

where and . Replacing here with and multiplying from the left by shows the symmetry. ∎

Crucial for the symmetry of EIMP is that the Jacobian and the basis are the same when considering stepping from to and vice versa. This is the case if is evaluated at , and is generated using the Krylov subspace methods described in 5.

Our numerical strategy is to perform one time step using the exponential Euler method from to in order to approximate the midpoint . Then after evaluating the Jacobian and forming the basis at we solve the implicit equation using fixed point iteration and perform the step of size to obtain and .

## 5 Forming the local basis using Krylov subspace methods

We discuss next the approximation of matrix valued functions using Krylov subspace methods and show how they are naturally connected to the local approximation discussed in Section 4.

When matrix is large but the operation inexpensive, it is reasonable to use Krylov subspace methods. These work in Krylov subspaces

 Kk(\em A,\em v)=span{\em v,\em A\em v,\em A2\em v,…,\em Ak−1\em v}.

Then we have . The Arnoldi iteration uses the Gram-Schmidt process and produces an orthonormal basis for . Denote and , which is a Hessenberg matrix.

If the iteration stops, i.e. , then

• and for all ,

• and for the spectra we have ,

• If has convergence radius larger than the spectral radius of and , then

 φ(\em A)\em w=\em Qkφ(\em Fk)\em QTk\em w .

The effectivity of Krylov subspace methods is based on the fact that if the component of orthogonal to is small, then things are approximately as above and this can happen already for a reasonable size . Thus it is reasonable to consider the approximation

 φ(\em A)\em v=\em Qkφ(\em Fk)\em QTk\em w (11)

which was used already in [5] and [10]. We refer to [12] for a detailed error analysis.

We show next how the Krylov approximation (11) is naturally connected to the strategy of applying exponential integrators to the local system (2).

### 5.1 Equivalence of the Krylov and the local system approximations

Consider the local system (2) corresponding to the basis , where gives an orthonormal basis for . Recall from Section 4 the strategy of solving the local system (2), i.e.,

 ξ′(t)=\em U†\em f(\em x0+\em Uξ(t))

numerically from up to and setting . As shown in Subsection 4.1, applying the exponential Euler method to the local system gives the approximation

 \em x+=\em x+h\em Uϕ(h\em F)\em U†\em f(\em x).

We immidiately see from (11) that this is the Krylov subspace approximation of the exponential Euler step (3).

As shown in subsection 4.2, applying the exponential explicit midpoint rule gives

 \em x+=\em x+\em Ueh\em F\em U†(\em x−−\em x)+2h\em Uϕ(h\em F)\em U†% \em f(\em x).

As for the explicit Euler method, the resulting formula can be seen as a Krylov subspace approximation (11) of the EEMP step (5). In order to satisfy the assumptions of Lemma 4 the vector has to be in the range of U. This is addressed in Subsection 5.2.1.

Similarly, if we perform a Krylov approximation of the IEMP step (8), and denote and , we get the small dimensional system (10).

The present concern, however, is that if above is a Hamiltonian matrix, this does not necessarily bring any special structure to or .

### 5.2 Symplectic Krylov processes

In order to obtain good local approximations for a Hamiltonian system with linear part we would like to have

• A symplectic subspace with a corresponding basis.

• in order to have polynomials of applied to represented in . We expect this to be worth pursuing for approximations of .

Consider first the Krylov subspace corresponding to and :

 Kk(\em H,\em v)=span{\em v,\em H\em v,\em H2\em v,…,\em Hk−1\em v} ,

and set .

• Now , generally.

• If is a degree polynomial, then .

The construction of a symplectic basis for is slightly more complicated than the standard Arnoldi process:

We just reorthogonalize with respect to and – the –orthogonal vectors provided by the standard Arnoldi. The result is a symplectic and orthonormal matrix.

1. for do
,
if , set ,   ,
,
.
else stop.

2. Set , .

Here the columns of form an orthonormal basis for and those of a symplectic basis for .

###### Remark 6.

There is a way to construct matrix more economically from the computations of step 2. but anyway the reorthogonalization stays the costly part of this approach.

Isotropic Arnoldi

Mehrmann and Watkins [18] suggest the isotropic Arnoldi process, which is a direct and –orthogonalization in the Arnoldi process:

1. for do
,
if , set ,   ,
else stop.

2. Set , .

Here we obtain a symplectic matrix with orthonormal columns. However its range does not necessarily contain the Krylov subspace . Thus, generally, this iteration does not have the property that is in the span of for every polynomial of degree . Since our present aim is to approximate operator functions that have converging power series, this iteration can be expected to be less effective for our purposes333Mehrmann and Watkins use the iteration for a quite different purpose: eigenpairs of skew-Hamiltonian/Hamiltonian pencils.. We will see this in the numerical tests.

Also there is a possibility of a breakdown: after orthogonalization we may get without obtaining any useful information about .

Hamiltonian Lanczos
Benner, Faßbender, and Watkins have several versions of Hamiltonian Lanczos processes (see [1], [20]). The following is a promising one from Watkins (Algorithm 4 of [20]):

1. for do
if , then

if , then

if then stop (breakdown).

,
if , then

2. Form the matrices
,
where   and
Then is symplectic, its range contains the vectors , and .

Due to short recursion this is an economic iteration. But it has similar problems as the usual biorthogonal Lanczos, e.g., near breakdowns and loss of orthogonality. These can be partly circumvented. For small this may be a good choice.

By the very construction of these symplectic maps we get the following:

###### Proposition 7.

Combining any of the symplectic Krylov processes with a method that preserves energy for the small dimensional system (2) will preserve the energy of the original system, too.

In the numerical experiments we will use four algorithms to produce Krylov subspaces: the standard Arnoldi iteration in and the three symplectic ones: symplectic Arnoldi, isotropic Arnoldi, and the Hamiltonian Lanczos process. With respect to their costs to produce a space of fixed dimension they can be ordered as

Hamiltonian Lanczos Arnoldi Isotropic Arnoldi Symplectic Arnoldi

The main weaknesses of each of these are

• Arnoldi in : the approximation is not Hamiltonian.

• Hamiltonian Lanczos: breakdown, early loss of symplecticity.

• Isotropic Arnoldi: does not include a Krylov subspace.

• Symplectic Arnoldi: expensive.

#### 5.2.1 Adding a vector to the basis

When using the EEMP method (5), the vector needs to be added to the basis at each time step. For orthogonal and/or isotropic basis this is straightforward. For the symplectic basis, a symplectic version of the Gram–Schmidt algorithm adds x and Jx to the basis . This algorithm is shown in the following pseudocode. Here we denote . Here the symplectic orthogonalization can also be performed in modified Gram-Schmidt manner, one vector at a time. Notice also that in the second step the vector can be scaled with any constant.

## 6 Numerical tests

We compare numerically the three exponential time integrators of Section 4 and the four Arnoldi like processes of Section 5 to produce the local basis . We apply the methods to large sparse Hamiltonian systems which are obtained from finite difference discretizations of one dimensional nonlinear wave equations. For ease of presentation we first illustrate by an example our approach of deriving large sparse Hamiltonian systems from Hamiltonian PDEs. For further examples we refer to [4].

### 6.1 Spatial discretization of Hamiltonian PDEs

As an example consider the nonlinear Klein-Gordon equation in one dimension,

 utt=uxx−f(u), (12)

where is scalar valued periodic function for some ) and is a smooth function. Setting and , the equation (12) can be viewed as a Hamiltonian system

 \em J\em ut=δHδ\em u,

where

 \em J=[01−10],δHδ\em u=(∂H∂u,∂H∂v)T,

and and denote the functional derivatives of the Hamiltonian

 H(\em u)=L∫0[12v2−12u2x+F(u)]dx,F′(u)=f(u). (13)

To obtain a Hamiltonian system approximating the equation (12), we perform a discretization with respect to the spatial variable on an equidistant grid with an interval , , and denote by and , the approximations to and . For the second derivative we use the central difference approximation

 uxx(iΔx,t)≈qi−1(t)−2qi(t)+qi+1(t)(Δx)2.

Expressing the approximations as vectors

 \em q(t)=⎡⎢ ⎢⎣q1(t)⋮qn(t)⎤⎥ ⎥⎦and\em p(t)=⎡⎢ ⎢⎣p1(t)⋮pn(t)⎤⎥ ⎥⎦,

we get the approximation of the PDE in matrix form as

 \em p′(t)=Δn\em q(t)−\em f(\em q(t)),

where and is the discretized Laplacian with periodic boundary conditions,

 Δn=n2⎡⎢ ⎢ ⎢ ⎢ ⎢ ⎢ ⎢⎣−2111−21⋱⋱⋱1−2111−2⎤⎥ ⎥ ⎥ ⎥ ⎥ ⎥ ⎥⎦. (14)

Defining the Hamiltonian function

 H(\em q,\em p)=12\em pT% \em p−12\em qTΔh\em q+n∑i=1F(qi), (15)

we see that

 \em p′(t)=−∇\em qH(\em q(t),\em p(t)),

and by setting , we have the Hamiltonian system

 \em x′(t)=\em J−1∇H(\em x(t)),\em x(0)=\em x0, (16)

where ,