Feedback Integrators

# Feedback Integrators

Dong Eui Chang 111Corresponding author. Department of Applied Mathematics, University of Waterloo, Waterloo, Ontario N2L 3G1, Canada. dechang@uwaterloo.ca    Fernando Jiménez222Department of Applied Mathematics, University of Waterloo, Waterloo, Ontario, N2L 3G1, Canada. fjimenez@uwaterloo.ca    Matthew Perlmutter333Departamento de Matemática, Universidade Federal de Minas Gerais, Belo Horizonte, Brazil. matthew@mat.ufmg.br
To appear in Journal of Nonlinear Science. http://dx.doi.org/10.1007/s00332-016-9316-7
###### Abstract

A new method is proposed to numerically integrate a dynamical system on a manifold such that the trajectory stably remains on the manifold and preserves first integrals of the system. The idea is that given an initial point in the manifold we extend the dynamics from the manifold to its ambient Euclidean space and then modify the dynamics outside the intersection of the manifold and the level sets of the first integrals containing the initial point such that the intersection becomes a unique local attractor of the resultant dynamics. While the modified dynamics theoretically produces the same trajectory as the original dynamics, it yields a numerical trajectory that stably remains on the manifold and preserves the first integrals. The big merit of our method is that the modified dynamics can be integrated with any ordinary numerical integrator such as Euler or Runge-Kutta. We illustrate this method by applying it to three famous problems: the free rigid body, the Kepler problem and a perturbed Kepler problem with rotational symmetry. We also carry out simulation studies to demonstrate the excellence of our method and make comparisons with the standard projection method, a splitting method and Störmer-Verlet schemes.

## 1 Introduction

Given a dynamical system on a manifold with first integrals, it is important for a numerical integrator to preserve the manifold structure and the first integrals of the equations of motion. This has been the focus of much effort in the development of numerical integration schemes [3]. In this paper we do not propose any specific numerical integration scheme, but rather propose a new paradigm of integration that can faithfully preserve conserved quantities with existing numerical integration schemes.

The main idea in our paradigm is as follows. Consider a dynamical system on a manifold with first integrals , . Assume that we can embed the manifold into Euclidean space and extend the first integrals to a neighborhood of in . For an arbitrary point , consider the set

 Λ={x∈U∣x∈M,fi(x)=fi(x0),i=1,…,ℓ}

which is the intersection of with all the level sets of the first integrals containing the point , and is an invariant set of the dynamical system. We then extend the dynamical system from to and then modify the dynamics outside of such that the set becomes a unique local attractor of the extended, modified system. Since the dynamics have not changed on by the extension and modification to , both the original system on and the extended, modified system on produce the same trajectory for the initial point . Numerically, however, integrating the extended system has the following advantage: if the trajectory deviates from at some numerical integration step, then it will get pushed back toward the attractor in the extended, modified dynamics, thus remaining on the manifold and preserving all the first integrals. It can be rigorously shown that the discrete-time dynamical system derived from any one-step numerical integrator with uniform step size for the extended, modified continuous-time system indeed has an attractor that contains the set in its interior and converges to as . In this paper we shall use the word, preserve, in this sense. It is noteworthy that the numerical integration of the extended dynamics can be carried out with any ordinary integrator and is done in one global Cartesian coordinate system on . We find conditions for applicability of this method and implement the result on the following three examples: the free rigid body dynamics, the Kepler problem, and a perturbed Kepler problem with rotational symmetry. We also carry out simulation studies to show the excellence of our new paradigm of integration for numerical preservation of conserved quantities in comparison with other well-known integration schemes, such as projection and splitting methods and symplectic Störmer-Verlet integrators.

## 2 Theory

Consider a dynamical system on an open subset of :

 ˙x=X(x), (1)

where is a vector field on . Let us make the following assumptions:

• There is a function such that for all , , and

 ∇V(x)⋅X(x)=0 (2)

for all .

• There is a positive number such that is a compact subset of .

• The set of all critical points of in is equal to .

Adding the negative gradient of to (1), let us consider the following dynamical system on :

 ˙x=X(x)−∇V(x). (3)

Since is the minimum value of , for all . Hence, the two vector fields and coincide on .

###### Theorem 2.1.

Under assumptions A1 – A3, every trajectory of (3) starting from a point in stays in for all and asymptotically converges to the set as . Furthermore, is an invariant set of both (1) and (3).

###### Proof.

Let be a trajectory of (3) starting from a point in . By A1

 ddtV(x(t))=∇V(x(t))⋅(X(x(t))−∇V(x(t)))=−|∇V(x)|2≤0 (4)

for all . Hence, is a positively invariant set of (3). From (4) and A3, it follows that . Hence, by LaSalle’s invariance principle [6], converges asymptotically to as , where A2 is used for compactness of . The invariance of follows from (2) and the coincidence of (1) and (3) on . ∎

Let us find a higher-order condition than that in assumption A3 so that A3 can be relaxed. For the function and the vector field in the statement of assumption A1, which are now both assumed to be of , let

 S={x∈U∣∣∣Xk∂V∂xi=0∀k≥0,1≤i≤n}, (5)

where , and denotes the th order directional derivative of along , i.e.,

 X0∂V∂xi=∂V∂xi;X∂V∂xi=X⋅∇∂V∂xi;Xk∂V∂xi=X(Xk−1∂V∂xi),k≥2.

Consider the following assumption in place of A3:

• .

The following theorem generalizes Theorem 2.1:

###### Theorem 2.2.

Under assumptions A1, A2 and A3, every trajectory of (3) starting in stays in for all and asymptotically converges to the set as . Furthermore, is an invariant set of both (1) and (3).

###### Proof.

Consider the dynamics (3). It is easy to show that is a positively invariant set of the dynamics. Let be the largest invariant set in . Let be an arbitrary trajectory in . Since as shown in the proof of Theorem 2.1, the trajectory satisfies , i.e.,

 ∂V∂xi(x(t))=0 (6)

for all and . Since along , the trajectory satisfies

 ˙x(t)=X(x(t)) (7)

for all . By differentiating (6) repeatedly in and using (7) on each differentiation, we can show that the trajectory satisfies

 Xk∂V∂xi=0

for all , and . Thus, the entire trajectory is contained in the set defined in (5), implying , from which and A3 it follows . Hence, by LaSalle’s invariance principle, every trajectory starting in asymptotically converges to and thus to as .

The invariance of follows from (2) and the coincidence of (1) and (3) on .∎

###### Remark 2.3.

1. If condition (2) is replaced by in assumption A1, then Theorems 2.1 and 2.2 still hold provided that the invariance of is replaced by positive invariance in the statement of the theorems.

2.Theorems 2.1 and 2.2 still hold with the use of the following modified dynamics

 ˙x=X(x)−A(x)∇V(x)

instead of (3), where is an matrix-valued function with its symmetric part positive definite at each .

3. From the control viewpoint, the added term in (3) can be regarded as a negative feedback control to asymptotically stabilize the set for the control system with control .

Suppose that assumptions A1, A2 and A3 (or A3 instead of A3) hold and that we want to integrate the dynamics (1) for an initial point . Since is positively invariant, the trajectory must remain in for all . Recall that the two dynamics (1) and (3) coincide on , so we can integrate (3) instead of (1) for the initial condition. Though there is no theoretical difference between the two integrations, integrating (3) has a numerical advantage over integrating (1). Suppose that the trajectory numerically deviates from the positively invariant set during integration. Then the dynamics (3) will push the trajectory back toward since is the attractor of (3) in whereas the dynamics (1) will leave the trajectory outside of which would not happen in the exact solution. It is noteworthy that this integration strategy is independent of the choice of integration schemes. In the Appendix we show that any one-step numerical integrator, as a discrete-time dynamical system, with uniform step size for (3) has an attractor that contains in its interior and converges to as .

Let us now apply this integration strategy to numerically integrate dynamics on a manifold while preserving its first integrals and the domain manifold. Consider a manifold and dynamics

 ˙x=X(x) (8)

on that have first integrals , . Suppose that is an embedded manifold in as a level set of a function for some , and that both the dynamics (8) and the functions , extend to an open neighborhood of in . Our goal is to numerically integrate (8) with an initial condition while preserving the manifold and the first integrals. Let

 f=(f0,f1,…,fℓ):Rn→Rr+ℓ (9)

and define a function by

 V(x)=12(f(x)−f(x0))TK(f(x)−f(x0)), (10)

where is an constant symmetric positive definite matrix. Notice that

 V−1(0)={x∈U∣x∈M,fi(x)=fi(x0),i=1,…,ℓ},

and that is invariant under the flow of (8). Or, more generally we can define a function as where is a non-negative function that takes the value of only at . If the function satisfies assumptions A1, A2 and A3 (or A3 instead of A3), then by Theorem 2.1 (or Theorem 2.2), is the local attractor of the modified dynamics

 ˙x=X(x)−∇V(x) (11)

that coincide with the original dynamics (8) on .

The following lemma provides a sufficient condition under which the function defined in (10) satisfies assumptions A2 and A3:

###### Lemma 2.4.

Consider the functions and defined in (9) and (10). If is compact and the Jacobian matrix of has rank for all , then there is a number such that assumptions A2 and A3 hold.

###### Proof.

By compactness of and the regularity of , there is a bounded open set such that , and (x) has rank for all , where denotes the closure of . Consider now the gradient of . An easy calculation shows that,

 ∇V(x)=Df(x)TK(f(x)−f(x0)).

Now, since for all , is onto as a linear map, is therefore one to one. It follows that, for ,

 ∇V(x)=0⟺f(x)−f(x0)=0⟺x∈V−1(0). (12)

In other words, the set of all critical points of in is equal to . Since the boundary of , being closed and bounded, is compact and , the minimum value, denoted by , of on is positive. If necessary, restrict the function to , replacing its original domain with . Then, there is a positive number less than such that . Therefore, assumption A3 holds for this number . Since the closed set is contained in the bounded set , it is compact, which implies that assumption A2 holds. ∎

###### Theorem 2.5.

For the functions and defined in (9) and (10), if satisfies (2) for all , the set is compact and the Jacobian matrix is onto for all , then there is a number such that every trajectory starting in remains in for all and asymptotically converges to as .

###### Theorem 2.6.

For the functions and defined in (9) and (10), if satisfies (2) for all , the set is compact and there is an open subset of containing such that the Jacobian matrix is onto for all , then there is a number such that every trajectory starting in remains in for all and asymptotically converges to as .

###### Proof.

Modify the proof of Lemma 2.4 appropriately. ∎

As discussed above, we can integrate (11) instead of (8) for the initial condition , which will yield a trajectory that is expected to numerically well remain on the manifold and preserve the values of the first integrals , . It is noteworthy that the integration is carried out in one Cartesian coordinate system on rather than over local charts on the manifold which would take additional computational costs for coordinate changes between local charts. In the following section, we will apply this strategy to the free rigid body dynamics, the Kepler problem and a perturbed Kepler problem with rotational symmetry to integrate the dynamics preserving their first integrals and domain manifolds.

## 3 Applications

### 3.1 The Free Rigid Body

Consider the free rigid body dynamics:

 ˙R =R^Ω, (13a) ˙Ω =I−1((IΩ)×Ω), (13b)

where ; is the moment of inertia matrix; and

 ^Ω=⎡⎢⎣0−Ω3Ω2Ω30−Ω1−Ω2Ω10⎤⎥⎦ (14)

for

 Ω=⎡⎢⎣Ω1Ω2Ω3⎤⎥⎦.

Since , from here on we assume that the rigid body dynamics are defined on the Euclidean space and that the matrix denotes a matrix, not necessarily in . This is the extension of the dynamics step.

Define two functions and by

 E(Ω) =12ΩTIΩ, (15) π(R,Ω) =RIΩ, (16)

where represents the kinetic energy of the free rigid body and the spatial angular momentum vector when . These quantities are first integrals of (13). Choose any

 R0∈SO(3),Ω0∈R3∖{(0,0,0)},

and let

 E0=E(Ω0)>0,π0=π(R0,Ω0)∈R3∖{(0,0,0)}. (17)

Define an open set by

 U={(R,Ω)∈R3×3×R3∣det(R)>0}

and a function by

 V(R,Ω)=k04∥RTR−I∥2+k12|E(Ω)−E0|2+k22|π(R,Ω)−π0|2 (18)

for , where , are constants, and is the 2-norm defined by for a matrix . Observe that we are endowing the space with the standard inner product, and that the trace norm is precisely the norm induced on by this inner product. We compute all gradients that follow with respect to this inner product. Notice that

 V−1(0)={(R,Ω)∈R3×3×R3∣R∈SO(3),E(Ω)=E0,π(R,Ω)=π0}.
###### Lemma 3.1.

The gradient of the function (18) is given by

 ∇RV =k0R(RTR−I)+k2(π(R,Ω)−π0)ΩTI, (19a) ∇ΩV =k1(E(Ω)−E0)IΩ+k2IRT(π(R,Ω)−π0). (19b)
###### Proof.

Straightforward. ∎

The following lemma shows that the function satisfies assumption A1 stated in §2.

###### Lemma 3.2.

The function satisfies

 ⟨(∇RV,∇ΩV),(R^Ω,I−1((IΩ)×Ω))⟩=0. (20)
###### Proof.

One can compute

 ⟨∇RV,(R^Ω)⟩ =trace(^ΩTRT(k0R(RTR−I)+k2(π(R,Ω)−π0)ΩTI)) =−k2ΩTI^ΩRT(π(R,Ω)−π0),

where, in the third equality we use the fact that for symmetric and antisymmetric, .

Next, we compute,

 ⟨∇ΩV,I−1((IΩ)×Ω)⟩ =⟨k1(E(Ω)−E0)IΩ+k2IRT(π(R,Ω)−π0),I−1((IΩ)×Ω)⟩ =k1(E(Ω)−E0)⟨(IΩ)×Ω,Ω⟩+k2⟨RT(π(R,Ω)−π0),(IΩ)×Ω⟩ =k2⟨IΩ,Ω×RT(π(R,Ω)−π0)⟩ =k2ΩTI^ΩRT(π(R,Ω)−π0).

Hence,

 ⟨(∇RV,∇ΩV),(R^Ω,I−1((IΩ)×Ω))⟩=⟨∇RV,R^Ω⟩+⟨∇ΩV,I−1((IΩ)×Ω)⟩=0.

The following lemma shows that the function satisfies assumptions A2 and A3 stated in §2.

###### Lemma 3.3.

There is a number satisfying

 0

such that is a compact subset of and the set of all critical points of in is equal to .

###### Proof.

It is obvious that there is a number satisfying (21) such that becomes a compact set in . For such a number , the matrix is invertible for every . Since is the minimum value of , every point in is a critical point of .

Let be a critical point of in . By Lemma 3.1 it satisfies

 k0R(RTR−I)+k2(π−π0)ΩTI =0, (22a) k1(E−E0)IΩ+k2IRT(π−π0) =0, (22b)

where

 π=π(R,Ω),E=E(R,Ω).

Post-multiplying (22a) by and pre-multiplying (22b) by yield

 k0R(RTR−I)RT+k2(π−π0)πT =0, (23a) k1(E−E0)π+k2RIRT(π−π0) =0, (23b)

since . Notice that would imply , contradicting . Hence, . It follows from (22) that if any of the three equations

 RTR−I=0,π−π0=0,E−E0=0

holds, then the three of them all hold. Thus

 RTR≠I,π≠π0,E≠E0 (24)

since . Since the matrix in (22a) has rank 1 and the matrix is symmetric, there exist a unit vector and a number such that

 RTR−I=κuuT. (25)

Substitution of (25) into (22a) and (23a) yields

 k0κRuuT+k2(π−π0)ΩTI=0, k0κRuuTRT+k2(π−π0)πT=0,

which implies

 Ru∥π∥π0,u∥IΩ, (26)

where the symbol means ‘is parallel to.’ Hence, we can express and as

 R =w1uT1+w2uT2+aeπ0uT, (27) π =bπ0, (28)

for some numbers , and vectors , where and the vectors and can be any vectors such that becomes an orthonormal basis for . Substitution of (27) into (25) implies that is an orthonormal basis for . Substitution of (27) and (28) into (22b) implies , which together with in (26), implies , i.e., is an eigenvector of . We can now choose or re-define the unit vectors and such that they become eigenvectors of the symmetric matrix , too. In the orthonormal basis , we can now write the moment of inertia matrix as

 I=I1u1uT1+I2u2uT2+I3uuT,

where are the eigenvalues of , which are all positive, corresponding to the eigenvectors , respectively. It is then easy to see that equations (23) imply

 k0a2(a2−1)+k2|π0|2b(b−1) =0, (29a) k1(|π0|2b22I3a2−E0)b+k2I3a2(b−1) =0, (29b)

where we have used .

We consider the following two separate cases: and . Suppose . If , then

 V(R,Ω)≥k22|π−π0|2=k22(|b|+1)2|π0|2>c

by (21), which contradicts . Hence, . If , then equation (29a) implies , but equation (29b) implies , implying . This cannot be compatible with . Hence, is ruled out. Similarly, can be ruled out. Hence, , which implies contradicting (24). Thus, when , there are no critical points of in .

Suppose . We analyze equations (29) using a continuity argument. At , (29a) implies or , neither of which satisfies (29b) at . Thus, by continuity there exists a number with such that for any with there is no number satisfying both (29a) and (29b). Hence, . We now shrink the number such that it not only satisfies (21) but also . For such a number , we have

 V(R,Ω)≥k04∥RTR−I∥2=k04∥(a2−1)uuT∥2≥k04δ2>c,

which contradicts . Hence, when , there are no critical points of in for some .

Therefore, there exists a number such that is the set of all critical points of in . ∎

Consider the dynamics

 ˙R =R^Ω−k0R(RTR−I)−k2(π(R,Ω)−π0)ΩTI, (30a) ˙Ω =I−1((IΩ)×Ω)−k1(E(Ω)−E0)IΩ−k2IRT(π(R,Ω)−π0), (30b)

which correspond to (3). From Theorem 2.1 and Lemmas 3.2 and 3.3 comes the following theorem:

###### Theorem 3.4.

There is a number such that every trajectory of (30) starting from a point in stays in for all and asymptotically converges to the set

 V−1(0)={(R,Ω)∈R3×3×R3∣R∈SO(3),E(Ω)=E0,π(R,Ω)=π0}

as , where the function is defined in (18). Furthermore, is an invariant set of both (13) and (30).

### 3.2 The Kepler Problem

The two-body dynamics in the Kepler problem are given in the usual barycentric coordinates by

 ˙x =v, (31a) ˙v =−μx|x|3, (31b)

where is the position vector, is the velocity vector and is the gravitational parameter. Define two functions and by

 L(x,v) =x×v, (32) A(x,v) =v×(x×v)−μx|x|, (33)

where is called the angular momentum vector and is called the Laplace-Runge-Lenz vector. It is known that both and are first integrals of the two-body dynamics (31) and they are orthogonal to each other, i.e.,

 L(x,v)⊥A(x,v)

for all . The energy function

 E(x,v)=12|v|2−μ|x|

satisfies

 |A(x,v)|2=μ2+2E(x,v)|L(x,v)|2 (34)

for all , implying that the energy is also a first integral of the two-body dynamics (31). It is also known that a non-degenerate elliptic Keplerian orbit is uniquely determined by a pair that satisfies , and [2].

Fix a non-degenerate elliptic Keplerian orbit, i.e., a pair of vectors that satisfies

 L0⊥A0,|L0|≠0,|A0|<μ.

Define a function by

 V(x,v)=k12|L(x,v)−L0|2+k22|A(x,v)−A0|2 (35)

for , where and . Notice that

 V−1(0)={(x,v)∈R30×R3∣L(x,v)=L0,A(x,v)=A0},

which is the non-degenerate Keplerian elliptic orbit whose angular momentum vector and Laplace-Runge-Lenz vector are and , respectively.

###### Lemma 3.5.

The gradient of the function defined in (35) is given by

 ∇xV =k1v×ΔL+k2(v×(ΔA×v)−μ|x|ΔA+μ|x|3xxTΔA), ∇vV =k1ΔL×x+k2((x×v)×ΔA+x×(v×ΔA)),

where and .

The following lemma shows that the function defined in (35) satisfies assumptions A1 and A2 stated in §2.

###### Lemma 3.6.

1. The function satisfies

 ⟨(∇xV,∇vV),(v,−μx/|x|3)⟩=0.

2. For any number satisfying

 0

the set is a compact set in .

###### Proof.

The first fact is a straightforward calculation using the previous Lemma. For the second, the essential idea is that the fibers of are homeomorphic to circles, corresponding to the elliptic orbits, and are therefore compact. For a detailed proof of the second statement, refer to Corollary 2.2 in [2]. ∎

The following lemma shows that the function defined in (35) satisfies assumption A3 stated in §2.

###### Lemma 3.7.

For any number satisfying (36) the set of all critical points of in is equal to .

###### Proof.

Choose an arbitrary number satisfying (36). Let be an arbitrary critical point of in . For notational convenience, let us write

 L=L(x,v),A=A(x,v)

suppressing the dependence on . By Lemma 3.5, the critical point satisfies

 0 =k1v×ΔL+k2(v×(ΔA×v)−μ|x|ΔA+μ|x|3xxTΔA), (37a) 0 =k1ΔL×x+k2((x×v)×ΔA+x×(v×ΔA)). (37b)

If , then , contradicting . Hence, , which together with (32) implies that the three vectors form a basis for . The dot product of (37b) with yields

 0=x⋅((x×v)×ΔA)=ΔA⋅(x×L),

so there are numbers and such that

 ΔA=ax+bL. (38)

Substitution of (38) into (37) gives

 0 =v×(k1ΔL+k2(aL−bv×L+bμ|x|x)), 0 =(k1ΔL+k2(2aL−bv