Second-order variational equations for N-body simulations

# Second-order variational equations for N-body simulations

Hanno Rein, Daniel Tamayo
Department of Physical and Environmental Sciences, University of Toronto at Scarborough, Toronto, Ontario M1C 1A4, Canada
Department of Astronomy and Astrophysics, University of Toronto, Toronto, Ontario, M5S 3H4, Canada
Canadian Institute for Theoretical Astrophysics, 60 St. George St, University of Toronto, Toronto, Ontario M5S 3H8, Canada
Centre for Planetary Sciences Fellow
Submitted: 14 February 2016, Accepted: 10 March 2016.
###### Abstract

First-order variational equations are widely used in -body simulations to study how nearby trajectories diverge from one another. These allow for efficient and reliable determinations of chaos indicators such as the Maximal Lyapunov characteristic Exponent (MLE) and the Mean Exponential Growth factor of Nearby Orbits (MEGNO).

In this paper we lay out the theoretical framework to extend the idea of variational equations to higher order. We explicitly derive the differential equations that govern the evolution of second-order variations in the -body problem. Going to second order opens the door to new applications, including optimization algorithms that require the first and second derivatives of the solution, like the classical Newton’s method. Typically, these methods have faster convergence rates than derivative-free methods. Derivatives are also required for Riemann manifold Langevin and Hamiltonian Monte Carlo methods which provide significantly shorter correlation times than standard methods. Such improved optimization methods can be applied to anything from radial-velocity/transit-timing-variation fitting to spacecraft trajectory optimization to asteroid deflection.

We provide an implementation of first and second-order variational equations for the publicly available REBOUND integrator package. Our implementation allows the simultaneous integration of any number of first and second-order variational equations with the high-accuracy IAS15 integrator. We also provide routines to generate consistent and accurate initial conditions without the need for finite differencing.

###### keywords:
methods: numerical — gravitation — planets and satellites: dynamical evolution and stability

## 1 Introduction

Calculating the orbital motion of planets and predicting the position of planets in the night sky is one of astronomy’s oldest reoccurring tasks. Today this is considered a solved problem, a simple application of Newtonian physics. Typically the dynamical system is solved by numerically integrating forward in time using an -body integrator. Different techniques are available to do this very accurately over both short (see e.g. Rein & Spiegel, 2015) and long (see e.g. Wisdom & Holman, 1991; Rein & Tamayo, 2015) timescales.

But often the solutions of even very simple dynamical systems are complex, in some cases exhibiting chaos. This means that small perturbations in the initial conditions lead to exponentially diverging solutions at late times. The solar system is one such chaotic dynamical system (Roy et al., 1988; Sussman & Wisdom, 1988; Laskar & Gastineau, 2009). One way to characterize chaotic systems is to numerically determine the Maximal Lyapunov characteristic Exponent (MLE), which measures the rate of exponential divergence between neighbouring trajectories in phase space.

Calculating how particle trajectories vary with respect to their initial conditions is therefore an important numerical task in modern celestial mechanics. It is also immediately relevant to orbital fitting and optimization. For example, when fitting an -body simulation of a planetary system to data, one might want to calculate the derivative of the value with respect to a planet’s initial orbital eccentricity.

The MLE, or more generally any derivative with respect to initial conditions, can be calculated by running a separate -body simulation with shadow particles, where the initial conditions of one or more particles are slightly perturbed. Measuring how fast the distance in phase space of the shadow particles with respect to their unperturbed counterparts grows then yields the MLE (Benettin et al., 1976).

However, it is well known that there are problems associated with this shadow-particle method (Tancredi et al., 2001). On the one hand, we want to start the shadow particles close so that we obtain a local measure of the divergence of trajectories, and so that as the paths begin to drift apart, there are several decades over which to characterize the rate of divergence. On the other hand, the closer we put the shadow particles, the more digits we lose to numerical roundoff error. One workaround is to periodically rescale the separation vectors to keep shadow particles nearby their real counterparts (see, e.g., Sec. 9.3.4 of Murray & Dermott, 2000). However, we show in Sec. 5.4 that the use of shadow particles requires problem-dependent fine-tuning and that the problem is exacerbated when computing higher-order derivatives.

Luckily, instead of integrating a separate simulation of shadow particles, one can also use variational equations to measure divergences. Rather than differencing two nearly equal trajectories, one instead derives a new linearized dynamical system that directly evolves the small distance between two initially offset particles. These variational equations are scale-free and circumvent the numerical pitfalls associated with the shadow-particle method (Tancredi et al., 2001).

First-order variational equations have been widely discussed and applied in the literature (e.g., Mikkola & Innanen, 1999; Tancredi et al., 2001; Cincotta et al., 2003). In this paper we derive second-order variational equations for the -body problem for the first time. These provide the second derivatives of the solution with respect to the initial conditions. Although mathematically straightforward to calculate, the number of terms and therefore the complexity rises significantly. As we will see below, some terms involve 7 different (summation) indices.

Our work opens up many new opportunities for a variety of applications. Perhaps most importantly, it is now straightforward to implement derivative-based optimization methods. While the first derivatives provide a gradient that yields the direction towards a local minimum on a landscape, the second derivatives provide the scale for how far one should move to reach the minimum. This can, among other things, dramatically improve fitting algorithms for radial velocity and transit planet searches, posterior estimation using Markov Chain Monte Carlo (MCMC) methods and even spacecraft trajectory optimization.

We begin in Sec. 2 with a formal introduction to variational equations that generalizes to higher order. In Sec. 3 we specialize to the case we are interested in, the -body problem. For completeness, in Sec. 3.1 we rederive the first-order variational equations for the -body problem. We then go one step further in Sec. LABEL:sec:derivation2 and derive the second-order variational equations.

We have implemented the second-order variational equations within the REBOUND package and make them freely available. REBOUND is a very modular -body package written in C99 and comes with an optional python interface. We have abstracted the complexity of higher order variations significantly, and summarize our adopted syntax in Sec. 4. Obtaining consistent initial conditions for variational equations in terms of Keplerian orbital elements without relying on finite difference is non-trivial. We have therefore also implemented several convenience methods for this purpose.

In Sec. 5 we demonstrate how second-order variational equations and Newton’s method can be used to fit observational data to a dynamical -body model. Finally, we compare variational and finite-difference methods in Sec. 5.4, and conclude in Sec. 6 by outlining the next steps in using higher order variational equations efficiently in optimization problems and MCMC methods.

## 2 Derivation of differential equations

### 2.1 Variational Equations

In this section, we define what we mean by variational equations and introduce our notation. We follow the work of Morales-Ruiz et al. (2007) and start with an analytic differential equation of the form

 ˙x=X(x). (1)

In the case that we are interested in later, encodes the 3 position and 3 velocity coordinates for each particle. is then a vector field on . The dot represents a time derivative. Given a suitable set of initial conditions , an -body simulation allows us to calculate (or at least approximate) the solution to Eq 1. We denote this solution , a dimensional vector that depends on the initial conditions and time .

Our goal is to estimate the solution vector for different initial conditions, i.e. we want to approximate . One way to do that is to simply solve the differential equation in Eq. 1 with the new initial conditions . However, depending on the problem, finding the new solution with an -body integration can be either very inefficient or inaccurate111In particular, it might be inaccurate if we are interested in the difference of the two solutions. See the discussion in Sec. 1 about shadow particles..

Thus, we are looking for a better way to estimate solutions for the initial conditions in a neighbourhood of . The approach we consider here uses the fact that one can expand around a reference solution in a power series. For simplicity, we first consider the case of varying a single scalar parameter on which the initial conditions depend, . If , then . This could correspond to varying a Cartesian component of , or a parameter that mixes Cartesian components, such as a planet’s orbital eccentricity. Then each component of can be expanded around the reference solution as a power series in ,

 ϕ(y0,t)=∑m≥01m!ϕ(m)αm, (2)

In the above equation, , i.e. the reference solution, and

 ϕ(m)≡∂mϕ(y0(α),t)∂αm∣∣∣α=0, (3)

i.e. a vector of the -th derivative of each of the reference solution’s components with respect to the parameter . For sufficiently small and , this approximation is accurate even if we terminate the series at a finite . The precise domain on which the solution can be trusted depends on the system and the initial conditions. For example, in chaotic dynamical systems, might grow exponentially fast, limiting the domain to relatively short times or small .

In conclusion, if one can obtain the , one can approximate all nearby solutions of . Each is a function of time and must be numerically integrated. We therefore seek their governing differential equations.

We henceforth denote the reference solution simply as . The solution , by definition, satisfies the original Eq. 1, in other words . We now take the derivative of this equation with respect , the parameter we are varying,

 ∂∂t∂ϕ∂α=∂X(ϕ)∂α=∑b∂X(ϕ)∂ϕb∂ϕb∂α, (4)

where we changed the order of the derivatives and made use of the chain rule. The summation index runs over all elements of the vector . The derivative of with respect to is the we seek for use in Eq. 2. Let us define the by matrix with components

 X(1)ab(ϕ)≡∂Xa(ϕ)∂ϕb. (5)

Using the matrix we then arrive at a compact set of differential equations for the vector :

 ˙ϕ(1)=X(1)(ϕ)ϕ(1). (6)

This equation is the first-order variational equation. We will later calculate the components of the matrix explicitly. We then solve for the vector by integrating the differential equation numerically. Note that depends on the time-dependent reference solution but not on . It is a linear operator acting on .

Repeating the steps above but differentiating Eq. 1 twice instead of once, we can write down the differential equation for . Because we apply the chain rule in the process, one finds that the time derivative of the second-order variations depends not only on but also on . Explicitly, the differential equation after two derivatives becomes

 ∂∂t∂2ϕ∂α2=∂2X(ϕ)∂α2=∑b,c∂2X(ϕ)∂ϕb∂ϕc∂ϕb∂α∂ϕc∂α+∑b∂X(ϕ)∂ϕb∂2ϕb∂α2. (7)

Defining the by by tensor with components

 X(2)abc(ϕ)≡∂Xa(ϕ)∂ϕb∂ϕc, (8)

and using a short hand notation that suppresses the summation indices as well as the function arguments (we give explicit component forms for the general case in Sec. 2.3), we have

 ˙ϕ(1) =X(1)ϕ(1), ˙ϕ(2) =X(1)ϕ(2)+X(2)[ϕ(1)]2. (9)

This set of equations is not linear anymore. But note that the linear term of the second line is the same as in the first line.

Higher order equations can be constructed in a straightforward way. Using the shorthand notation makes this particularly easy. One can reintroduce the indices at the end of the calculation. In this paper, we will only use variational equations up to second order (see e.g. Morales-Ruiz et al., 2007, for equations up to order 3).

### 2.2 Initial conditions

To integrate a differential equation forward in time, one needs appropriate initial conditions. To obtain the initial conditions for and , one simply applies the chain rule to Eq. 3 and evaluates it at ,

 ϕ(1)(x0,0) =∂y0(α)∂α,∣∣∣α=0 (10) ϕ(2)(x0,0) (11)

In the case where the varied parameter corresponds to a Cartesian component, choosing the initial conditions for and is straightforward. If we assume the varied parameter has the coordinate index , then the initial conditions for in component form are

 ϕa(y0(α),0)=ϕa(x0,0)+αδab, (12)

where is the Kronecker delta. Thus

 ϕ(1)a(x0,0)=δabandϕ(2)a(x0,0)=0. (13)

In practice, the function can be very complicated. As an example, let us consider a planetary system with one planet of mass on an initially circular and coplanar orbit around a star with mass . The initial conditions of the planet might then be defined through the semi-major axis as

 ⎛⎜⎝rxryrz⎞⎟⎠=⎛⎜⎝a00⎞⎟⎠,⎛⎜⎝vxvyvz⎞⎟⎠=⎛⎜⎝0√G(m+M)/a0⎞⎟⎠. (14)

If we vary the initial semi-major axis by some length , then the initial conditions for the first-order variation are given by Eq. 10, in our case

 ⎛⎜⎝\mathbbmrx\mathbbmry\mathbbmrz⎞⎟⎠(1)=⎛⎜⎝100⎞⎟⎠,⎛⎜⎝\mathbbmvx\mathbbmvy\mathbbmvz⎞⎟⎠(1) =⎛⎜ ⎜⎝0−12√G(m+M)/a30⎞⎟ ⎟⎠. (15)

The initial conditions for the second-order variation are given by Eq. 11, which for the present case are

 ⎛⎜⎝\mathbbmrx\mathbbmry\mathbbmrz⎞⎟⎠(2)=⎛⎜⎝000⎞⎟⎠,⎛⎜⎝\mathbbmvx\mathbbmvy\mathbbmvz⎞⎟⎠(2) =⎛⎜ ⎜⎝034√G(m+M)/a50⎞⎟ ⎟⎠. (16)

The components of and that we calculated above correspond to the planet. All components corresponding to the star are 0.

The initialization can quickly get complicated. Suppose we work in the centre-of-mass frame. Then the star’s initial conditions will also depend on the semi-major axis of the planet. Similarly, if we add an additional outer planet and work in Jacobi coordinates, the outer planet’s initial conditions depend on the inner planet’s orbital parameters. For that reason we’ve implemented convenience functions for the initialization of orbits which we present later in Sec. 4.3.

### 2.3 Multiple sets of variational equations

The above derivation of variational equations can be straightforwardly generalized when varying multiple parameters. Consider varying the initial value of separate parameters . Here and in the rest of this paper Greek variables indicate to variations with respect to one parameter and will run over the interval . We write all equations in this section in component form for direct comparison with our later results.

When varying several parameters, the coupled set of differential equations, Eq. 9, becomes

 ˙ϕ(1)a,ξ =∑bX(1)abϕ(1)b,ξ, ˙ϕ(2)a,ξη =∑bX(1)abϕ(2)b,ξη+∑b,cX(2)abcϕ(1)b,ξϕ(1)c,η, (17)

where , and . Therefore, when varying parameters, there are sets of first-order variational equations, one for each of the vectors . There are sets of second-order variational equations (one for each of the vectors ). Each set of variational equations has components.

Once numerically integrated, these variations can then be plugged into a multi-variate power series expansion analog to Eq. 2 to obtain trajectories for arbitrary nearby initial conditions. Explicitly, to second order,

 ϕa(y0,t)≈ϕa(x0,t)+∑ξϕ(1)a,ξαξ+12∑ξ,ηϕ(2)a,ξηαξαη. (18)

Note that because derivatives commute we find that . Thus the total number of differential equations we need to integrate for the second order variations can be reduced from to . This is in addition to the differential equations for the reference simulation and equations for the first order variations.

### 2.4 Index convention

As we saw above, the number of indices in second-order expressions is high. We therefore adopt a consistent index notation for the remainder of this paper. Specifically, we will consider a dynamical system consisting of particles and use the indices and to label different particles. These indices thus run from to . The indices and label coordinate axes. Above, these indices ran over the coordinates of the -body system. We will find below that for the -body system, it is simpler to consider positions and velocities separately. Therefore, in what follows and will run over the Cartesian , , and components only. As before, we will also make use of Greek characters and to indicate different sets of variational equations corresponding to different varied parameters (different variations). In the following sections we explicitly write summation symbols, i.e., we do not use a summation convention over repeating indices.

## 3 Variational Equations for the N-body System

Let us now derive the differential equations from above for a specific problem: the dynamical system of gravitationally interacting particles. The differential equation for the -body problem in vector notation is

 ¨ri = −∑j,j≠iGmjrijr3ij (19)

where and is the norm of . We can also write the equation in component form

 ˙vi,a = −∑j,j≠iGmjrij,ar3ij. (20)

where is the relative position between particles and . This is a second-order differential equation. However, note that the first time-derivative of the position is just the velocity, . The differential equation can thus easily be brought into the form of Eq. 1 by introducing

 x≡(r0,xr0,y…rN−1,yrN−1,zv0,xv0,y…vN−1,yvN−1,z)T (21)

such that

 (22)

We end up with a first-order differential equation with twice as many variables as Eq. 19 ( compared to ). This set of differential equations together with suitable initial conditions completely describes the -body problem.

### 3.1 First-order variational equations

To derive the first-order variational equation for the -body problem, we start by differentiating Eq. 20 with respect to . To be as explicit as possible we do this in component form. We end up with an equation with four indices. Two of the indices run over coordinates, and two over particles,

 ∂˙vi,a∂rj,b =∂∂rj,b⎛⎝−∑k,k≠iGmkrik,ar3ik.⎞⎠ =−∑k,k≠iGmk(δij−δkj)δabr3ik+∑k,k≠i3Gmkrik,arik,br5ik(δij−δkj) =⎧⎪ ⎪ ⎪⎨⎪ ⎪ ⎪⎩∑k,k≠i(−Gmkδabr3ik+3Gmkrik,arik,br5ik) if i=j+Gmjδabr3ij−3Gmjrij,arij,br5ij if i≠j. (23)

The above expression gives us the matrix elements of (Eq. 5). Note that two indices and combined correspond to the row of that matrix, the other two ( and ) correspond to the column of the matrix.

We also want to consider the influence of varying masses. Note that one can think of the masses as part of the initial conditions. However, we assume that the masses do not vary with time after the system has been initialized, thus . We need the derivative of the force with respect to the mass:

 ∂˙vi,a∂mj=∂∂mj⎛⎝−∑k,k≠iGmkrik,ar3ik⎞⎠=⎧⎨⎩−Grij,ar3ijif j≠i0if j=i. (24)

We can now write down the differential equation for the first-order variational equation using Eq. 6. We could do this in terms of and its components, but choose to use two separate vectors for the variational position and velocity components (to be consistent with Eq. 19). Variational quantities are denoted by double-striped symbols , and . First-order variational quantities receive the superscript . Thus, one might write in terms of and :

 (25)

which should be compared to Eq. 21. The units of and depend on the variation we are considering. In general the units are not the same as those of and (see Eq. 10). We end up with the following set of equations for the components of (corresponding to the second half of the components of ):

 ˙\mathbbmv(1)i,a =∑j∑b∂˙vi,a∂rj,b\mathbbmr(1)j,b+∑j∂˙vi,a∂mj\mathbbmm(1)j, =∑j,j≠i∑b∂˙vi,a∂rj,b\mathbbmr(1)j,b,+∑b∂˙vi,a∂ri,b\mathbbmr(1)i,b+∑j∂˙vi,a∂mj\mathbbmm(1)j, =∑j,j≠i[∑b(−Gmjδabr3ij+3Gmjrij,arij,br5ij)\mathbbmr(1)ij,b−Grij,ar3ij\mathbbmm(1)j], (26)

where . We can rewrite this in vector notation, which allows us to drop the indices and :

 ˙\mathbbmv(1)i = ∑j,j≠i(−Gmjr3ij\mathbbmr(1)ij+3Gmjrijr5ij(rij⋅\mathbbmr(1)ij)−Grijr3ij\mathbbmm(1)j) (27)

The represent the usual vector product. The equations for the variational positions, (the first half of the components of ), are significantly easier to write down:

 ˙\mathbbmr(1)i = \mathbbmv(1)i (28)

As mentioned before, the masses are assumed to be constant throughout a simulation; thus, the variational equation for the mass coordinates are not dynamic:

 ˙\mathbbmm(1)i = 0 (29)

The solutions are trivial, , and we therefore do not evolve the quantities .

## 4 Implementation

We have implemented first and second-order variational equations into the -body code REBOUND (Rein & Liu, 2012). REBOUND is very modular and allows the user to choose from different numerical integrators. What we describe here has been tested for the high-accuracy integrator IAS15, which is based on a 15th-order Gauß-Radau quadrature (Rein & Spiegel, 2015). First-order variational equations have also been implemented for the symplectic WHFast integrator (Rein & Tamayo, 2015) as a symplectic tangent map (Mikkola & Innanen, 1999). In principle, higher-order variational equations could also be implemented as a symplectic tangent map. However, the complexity of such a higher-order tangent map goes beyond what we expect to be useful in practice. We therefore exclusively focus on the general-purpose IAS15 integrator for the remainder of this paper.

We implement the variational equations in terms of variational particles. This provides an elegant implementation where variational particles follow a structurally similar (though conceptually different) set of differential equations to the real particles (cf. Eqs. 27 and LABEL:eq:vdot2ndorder). For each first and second order variation that we consider we add such variational particles to the simulation. The Cartesian components of a variational particle are then the derivatives of the corresponding real particle’s components with respect to the parameter we are varying. This implies that the units for different variations will vary (compare with Eq. 3).

### 4.1 Initialization routines

In addition to the variational equations themselves, we have implemented convenience methods for initializing the variational particles. If one is interested in varying one of the cartesian coordinates of a particle, initializing variational particles is as easy as setting all of the coordinates to 0 except one which is set to 1, see Eq. 13. However, as shown above, varying parameters that are non-linear functions of the cartesian coordinates involves calculating first and second derivatives and can quickly become cumbersome. We are particularly interested in applications involving planetary systems. We therefore provide routines that allow the initialization of variational particles with respect to changing a particle’s mass , as well as its orbit’s semi-major axis , eccentricity , inclination , longitude of the ascending node , argument of pericenter and true anomaly .

Since we are doing this to second order for 7 orbital elements222This includes the mass of the particle., we thus have different functions. In principle one could also initialize variational particles by calculating finite differences, i.e. creating a second particle with one orbital parameter shifted by a small amount , subtracting each component from the un-shifted particle and then dividing by . The problem is that this procedure easily leads to numerical issues as the shift needs to be small enough to be in the linear (quadratic) regime, but large enough to avoid any rounding error due to limited floating point precision. Our functions that calculate the derivatives analytically avoid this issue. Our current implementation does not support Jacobi coordinates and assumes that all parameters are given with respect to a fixed central object (heliocentric frame).

We have also implemented a routine that moves the entire system to the centre of mass frame and corrects the variational particles’ positions and velocity coordinates consistently.

It is worth pointing out that automatic differentiation (AD) would be well suited to automate this task for even more complicated initialization routines (Neidinger, 2010).

### 4.2 Test particles

We call a particle in an -body simulation a test particle if it does not affect other particles in the simulation because it has no mass, . If one is interested in the effect of varying the initial conditions of test particles, then the variational equations simplify significantly. Because variations of a test particle do not affect the variations of other particles, one can reduce the dimensionality of the first-order variational differential equation, Eq. 6, from to . This speeds up the calculation significantly and we have implemented this as an optional flag that can be set when a set of variational equations is initialized. This might become particularly useful if an approximation of the derivatives is sufficient for a given application.

### 4.3 Syntax

Here, we briefly demonstrate how to initialize and run a simulation using the python interface to REBOUND. We do this because the layer of abstraction that we came up with to hide the complicated expressions for second order variational equations is essential in making this tool useable in a real world scenario. We provide the full documentation for how to use variational equations within REBOUND online at http://rebound.readthedocs.org.

A simulation of one planet orbiting a central star can be setup with the following code in python:

import rebound
sim = rebound.Simulation()

By default REBOUND uses units in which . One set of first-order variational particles can be set up with a single command:

The var_i object contains all the information of this set of variational particles, e.g. the order, the location of variational particles, etc. By default, the variational particles’ position, velocity and mass coordinates are initialized to zero. For this example, let us assume that we want to vary the planet’s semi-major axis. We initialise the planet’s variational particle using the following command:

var_i.vary(1,"a")

In the background, this command first calculates the orbital parameters of the particle with index 1 (the planet) in heliocentric coordinates. Then, the variational particle is initialized using the analytic derivative with respect to the semi-major axis, see Eq.15. We can now integrate the system forward in time for, say, 100 time units:

sim.integrate(100.)

The planet’s x-position after the integration can be accessed via sim.particles[1].x. Let us use the result of our integration to estimate the planet’s x-position assuming its initial semi-major axis was shifted by . This can be achieved with the following code

Delta_a = 0.01
print sim.particles[1].x
+ Delta_a * var_i.particles[+1].x

which should be compared with Eq. 2. To go beyond first order and include second-order variational equations, we setup the second order variational equations (before the integration) with

first_order=var_i)

Note that we need to specify the the corresponding first-order variational particles. This is because second-order variational particles depend on the first-order variational particles and in principle there can be many different first oder variational particles for different parameters. The initialization of the particle is identical to before

var_ii.vary(1,"a")

The final position can then be estimated by applying Eq. 2 as before, but now accurate to second order,

print sim.particles[1].x
+ Delta_a * var_i.particles[1].x
+ 0.5*Delta_a*Delta_a * var_ii.particles[1].x

More complicated and realistic examples are available in the documentation at http://rebound.readthedocs.org.

## 5 Tests

In this section we present various tests of our implementation. These show not only that the implementation is working correctly, but also what second-order variational equations can be used for. We plan to follow up on several of these ideas in much more detail in future work.

### 5.1 Varying one orbital parameter

As a first test, we study a two-planet system and vary the initial semi-major axis of the outer planet. We use the first and second-order variational equations to approximate the -position of the inner planet after 10 orbits using Eq. 18. The inner planet’s position changes with time because of the planet-star as well as the planet-planet interactions.

We plot the results in Fig. 1. The bold black line corresponds to the final -position of the inner planet using a direct -body integration. The results for both the first and second-order variational equations are shown as a green dashed and blue dotted line, respectively. To arrive at these approximations, only one -body simulation with variational equations was run. Note that this is in contrast to 400 individual -body simulations which were carried out to generate the black curve. The red dot indicates the initial semi-major axis used for the single run with variational equations.

As the plot clearly shows, we can use the results from second-order variational equations to accurately predict the final position of the inner planet to within a few percent for initial conditions that are not too far from the original simulation. Also note that as expected, the second-order variational equations give a significantly better estimate than the first-order equations alone.

### 5.2 Optimization problem with one one orbital parameter

We continue to work with the above two-planet system. We now attempt to find the initial semi-major axis of the outer planet that minimizes the coordinate of the inner planet at the end of the simulation. This test therefore represents a simple case of a wide range of optimization problems. Instead of minimizing the coordinate of the planet, we could also minimize the distance to another planet, or maximize the velocity. Furthermore, one could replace one of the planets with a spacecraft and then search for an optimal spacecraft trajectory that uses a minimal amount of fuel to reach a final point, and so on.

We use the standard Newton’s method to find the optimal value, . For that we need the first and second derivatives of the planet’s position (we are looking for the root of the first derivative). We calculate these using the variational equations. As a starting point in Newton’s method, we use the red dot in Fig. 1.

We plot the results in Fig. 2. The vertical axis shows the relative position offset as a function of the iteration. After four iterations, the method has converged to machine precision. With any derivative free method such as the bisection method we would need more iterations to achieve machine precision.

### 5.3 Fitting a radial velocity curve

We now present a more complicated example in which we attempt to fit the reflex motion of a star in a two-planet system to a synthetic radial velocity data set. In Fig. 3 we show the synthetic radial velocity curve of the star as a function of time (the units are irrelevant for this discussion). The red dots show where an observation is taken. For simplicity we only vary two orbital parameters, the semi-major axis and the eccentricity of the inner planet. All other parameters are the same as in the reference simulation. Our goal is to match these synthetic observations and thus to find the true parameters , starting from an arbitrary initial guess of semi-major axis and eccentricity, .

We label the synthetic observation at time with and the radial velocity in our simulation with . The problem can be expressed again as an optimization problem by defining a goodness of fit, e.g.,

 χ2=∑i(vi−oi)2, (40)

which we try to minimize. Note that are functions of the initial and . More complicated and realistic functions that take into account observational uncertainties can be easily constructed, but we here work with the simplest case. The chain rule yields

 ∂∂aχ2 =2∑i(vi−oi)∂vi∂a (41) ∂2∂a2χ2 =2∑i((vi−oi)∂2vi∂a2+(∂vi∂a)2) (42)

and similar expressions for the derivatives with respect to and the cross-term . The second-order variational equations are used to calculate the derivatives involving . We can then use the standard Newton’s method to iterate and find the extremum in :

 (an+1en+1)=(anen)−⎛⎜ ⎜⎝∂2χ2∂a2∂2χ2∂a∂e∂2χ2∂a∂e∂2χ2∂e2⎞⎟ ⎟⎠−1⋅⎛⎜⎝∂χ2∂a∂χ2∂e⎞⎟⎠. (43)

The matrix in the above equation is the inverse of the Hessian of , . Newton’s method will only converge where is convex, or in other words where the matrix is positive definite. To increase the convergence region we use a trick to ensure that is positive definite everywhere by using the softabs metric of the Hessian , instead of itself (Betancourt, 2013). The modified Newton’s method becomes

 (an+1en+1)=(anen)−\raisebox0.3pt$≀${H}\raisebox0.3pt$≀$−1⋅⎛⎜⎝∂χ2∂a∂χ2∂e⎞⎟⎠. (44)

In Fig. 4 we plot the relevant part of the parameter space. The colours and contours correspond to the logarithm of . We start the iteration at and converge to the true minimum within machine precision in less than ten iterations. Newton’s method converges to the global minimum for most nearby starting values (those near the centre in Fig. 4). If the initial conditions are far from the global optimum, then, as expected, the method might not converge to the global minimum. We note that this problem of non-convergence is a feature of the adopted optimization algorithm (Newton’s method), and not of the variational equations. In particular, even in cases where Newton’s method does not converge, the derivatives are calculated exactly (to machine precision).

For the above reasons, the method presented in this example is not well suited for finding the global minimum within a complex parameter space. Other methods such as simulated annealing or parallel tempering are most likely faster and more reliable. However, as we discuss below, a combination of methods is a promising future area of research if one can make use the second-order variational equations to converge to a local optimum within almost constant time (or iterations)333Newton’s method converges quadratically, i.e. the number of significant digits roughly doubles after every iteration. Thus, if we are close to a local minimum and we work in double floating point precision with 16 significant digits, we need  iterations to converge to machine precision..

### 5.4 Comparison to a finite difference approach

In the above optimization problem, we use variational equations to calculate the first and second derivatives of . One can also use a finite difference approach to estimate the derivatives. As we show in this section, this is not viable in most scenarios as two separate competing constraints require fine tuning of the finite difference parameters.

Let us try to calculate all the first and second order derivatives that we need in the radial velocity fit problem from Sec. 5.5: , , , and the cross term . To use the finite difference method, we need to choose a finite initial differences and . These are then used to initialize the orbits of shadow particles which are integrated using the normal equations of motion.

The actual value of and is crucial. It has to be small enough to ensure the simulation remains in a linear regime (quadratic for second order). However, making the finite differences too small results in loss of accuracy due to finite floating point precision. Thus there is an optimum between these two competing effects. The precise value is problem specific. For the radial velocity test case we find an optimum around for first-order derivatives and for second-order derivatives.

This problem is illustrated in Fig. 5. We plot the relative error of the first and second-order derivative as a function of the initial finite differences and . One can see that the best possible estimate of the first derivative is only accurate to within . Worse yet, the best estimate of the second derivative is only accurate to within . Using the finite difference approach we cannot obtain a better estimate.

The problem gets even worse if one is interested in the cross-term in the Jacobian, e.g. . The relative error of this quantity is plotted in the right panel of Fig. 5. The cross term depends on both finite differences and . There is only a small area in the / space that gives reasonably accurate results. Finding the best combination of the initial finite differences is difficult, requires problem-specific fine tuning and becomes quickly infeasible, especially for applications where a wide range of the physical parameter space is explored.

None of these problems exist using the variational equation approach that we present in this paper. Note that we also do not use finite difference for the initialization of variational particles. The entire framework does not contain any small parameters that could lead to numerical problems (cf. ).

It is worth pointing out that the IAS15 integrator that we use for the normal -body integration as well as for the variational equations is accurate to machine precision (Rein & Spiegel, 2015). Furthermore, the energy error in long-term simulations grows sub-linearly and follows Brouwer’s Law (Newcomb, 1899; Brouwer, 1937). IAS15 is therefore as exact as any integrator can possibly be444To within a constant factor of a few., using only double precision arithmatic. This statement also applies to the variational equations and therefor to the derivatives we calculate with their help. All derivatives are exact up to machine precision. They can not be calculated more accurately without going to extended precision.

### 5.5 Runtime

One thing to keep in mind for optimization problems is the computational complexity of a simulation with first and second-order variational equations. If there are particles and free parameters, then the computation time for a simulation with second-order variational equations scales as .

We tested this scaling in a simulation of two planets in which we vary all 14 planet parameters (all orbital parameters and the masses). The results are plotted in Fig. 6 and agree with our estimate.

If every parameter of every particle is varied, one ends up with a runtime that scales approximately as . This indicates that using variational equations might only be competitive when combined with other methods. However, if another method brings us close to a local minimum (using, e.g., simulated annealing, parallel tempering), then an approach based on variational equations can converge to the local optimum within just a few iterations, in (almost) constant time.

## 6 Conclusions

In this paper we presented the theoretical framework for using second-order variational equations in -body simulations to estimate how particle trajectories vary with respect to their initial conditions. We described a flexible implementation of these equations within the REBOUND integrator package.

A major motivation for developing first-order variational equations was to overcome the numerical inaccuracies associated with finite-difference methods that use shadow particles (e.g., Tancredi et al., 2001). We showed in Sec. 5.4 that this problem is exacerbated at second order, requiring careful problem-dependent fine tuning. Additionally, the number of shadow particles required by the finite-difference approach is always the same as the corresponding number of variational equations to follow. The variational approach is therefore much more robust and effectively equal in speed.

An important application for second-order variational equations is in solving optimization problems. First derivatives furnish only the right direction to move in a parameter landscape toward a minimum; second derivatives provide a scale for how far one must jump to reach that minimum. If near a minimum, the first and second derivatives furnished by the variational equations can converge to within machine precision of the minimum in just a few iterations. We illustrated this behaviour in both a simple two-planet case (Sec. 5.2) and in the fitting of a radial velocity curve (Sec. 5.5). Variational equations might also be applied to spacecraft trajectory optimization or asteroid deflection.

The optimization problem presented in this paper uses second order variational equations in connection with the classical optimization algorithm of Newton. One can also use the second-order variational equations in connection with a Markov Chain Monte Carlo (MCMC) method. Specifically, for both Riemann Manifold Langevin and Hamiltonian Monte Carlo methods, higher order derivatives, and therefore higher order variational equations, are essential (Girolami & Calderhead, 2011). A full discussion of these MCMC methods and their application goes beyond the scope of this paper but we note that our initial tests of these methods show great promise. In particular, we observe very short auto-correlation times when using a Riemann Manifold Langevin MCMC to sample the posterior of radial velocity curves.

For a long time, first-order variational equations have been widely used to calculate Lyapunov exponents and the Mean Exponential Growth of Nearby Orbits (MEGNO, Cincotta et al. 2003) in the astrophysics community (Tancredi et al., 2001; Hinse et al., 2010). We speculate that higher-order variational equations may be able to improve such chaos indicators. Since only one set of variational equations is needed for the calculation of the Lyapunov exponent, including second-order variational equations will keep the numerical scaling and only increase the computational cost by 50%.

The latest version of REBOUND includes the second-order variational equations and can be downloaded at https://github/com/hannorein/rebound. The package is free to use under an open source license. We provide also a git repository with jupiter notebooks to reproduce the figures in this paper at https://github.com/hannorein/variations. The notebooks can be run interactively in the web browser without the need to install any software locally.

## Acknowledgments

We thank Eric Ford, Benjamin Nelson and Scott Tremaine for many helpful discussions and Gottfried Wilhelm Leibniz for the chain rule. This research has been supported by the NSERC Discovery Grant RGPIN-2014-04553. D.T. is grateful for support from the Jeffrey L. Bishop Fellowship. This research made use of iPython (Pérez & Granger, 2007), SciPy (Jones et al., 2001–) and matplotlib (Hunter, 2007).

## References

• Benettin et al. (1976) Benettin, G., Galgani, L., & Strelcyn, J.-M. 1976, Phys. Rev. A, 14, 2338
• Betancourt (2013) Betancourt, M. 2013, in Lecture Notes in Computer Science, Vol. 8085, Geometric Science of Information, ed. F. Nielsen & F. Barbaresco (Springer Berlin Heidelberg), 327–334
• Brouwer (1937) Brouwer, D. 1937, AJ, 46, 149
• Cincotta et al. (2003) Cincotta, P., Giordano, C., & Simó, C. 2003, Physica D: Nonlinear Phenomena, 182, 151
• Girolami & Calderhead (2011) Girolami, M. & Calderhead, B. 2011, Journal of the Royal Statistical Society: Series B (Statistical Methodology), 73, 123
• Hinse et al. (2010) Hinse, T. C., Christou, A. A., Alvarellos, J. L. A., & Goździewski, K. 2010, MNRAS, 404, 837
• Hunter (2007) Hunter, J. D. 2007, Computing In Science & Engineering, 9, 90
• Jones et al. (2001–) Jones, E., Oliphant, T., Peterson, P., et al. 2001–, SciPy: Open source scientific tools for Python, [Online; accessed 2016-02-12]
• Laskar & Gastineau (2009) Laskar, J. & Gastineau, M. 2009, Nat, 459, 817
• Mikkola & Innanen (1999) Mikkola, S. & Innanen, K. 1999, Celestial Mechanics and Dynamical Astronomy, 74, 59
• Morales-Ruiz et al. (2007) Morales-Ruiz, J. J., Ramis, J.-P., & Simo, C. 2007, in Annales scientifiques de l’Ecole normale superieure, Vol. 40/6, 845–884
• Murray & Dermott (2000) Murray, C. D. & Dermott, S. F. 2000, Solar System Dynamics (Cambridge University Press)
• Neidinger (2010) Neidinger, R. D. 2010, SIAM Review, 52, 545
• Newcomb (1899) Newcomb, S. 1899, Astronomische Nachrichten, 148, 321
• Pérez & Granger (2007) Pérez, F. & Granger, B. E. 2007, Computing in Science and Engineering, 9, 21
• Rein & Liu (2012) Rein, H. & Liu, S.-F. 2012, A&A, 537, A128
• Rein & Spiegel (2015) Rein, H. & Spiegel, D. S. 2015, MNRAS, 446, 1424
• Rein & Tamayo (2015) Rein, H. & Tamayo, D. 2015, MNRAS, 452, 376
• Roy et al. (1988) Roy, A. E., Walker, I. W., MacDonald, A. J., Williams, I. P., & Fox, K. 1988, Vistas in Astronomy, 32, 95
• Sussman & Wisdom (1988) Sussman, G. J. & Wisdom, J. 1988, Science, 241, 433
• Tancredi et al. (2001) Tancredi, G., Sánchez, A., & Roig, F. 2001, AJ, 121, 1171
• Wisdom & Holman (1991) Wisdom, J. & Holman, M. 1991, AJ, 102, 1528
You are adding the first comment!
How to quickly get a good reply:
• Give credit where it’s due by listing out the positive aspects of a paper before getting into which changes should be made.
• Be specific in your critique, and provide supporting evidence with appropriate references to substantiate general statements.
• Your comment should inspire ideas to flow and help the author improves the paper.

The better we are at sharing our knowledge with each other, the faster we move forward.
The feedback must be of minimum 40 characters and the title a minimum of 5 characters