Optimal bounds and extremal trajectories for
time averages in nonlinear dynamical systems
For any quantity of interest in a system governed by ordinary differential equations, it is natural to seek the largest (or smallest) long-time average among solution trajectories, as well as the extremal trajectories themselves. Upper bounds on time averages can be proved a priori using auxiliary functions, the optimal choice of which is a convex optimization problem. We prove that the problems of finding maximal trajectories and minimal auxiliary functions are strongly dual. Thus, auxiliary functions provide arbitrarily sharp upper bounds on time averages. Moreover, any nearly minimal auxiliary function provides phase space volumes in which all nearly maximal trajectories are guaranteed to lie. For polynomial equations, auxiliary functions can be constructed by semidefinite programming, which we illustrate using the Lorenz system.
Keywords: Nonlinear dynamical systems, Time averages, Ergodic theory, Semidefinite programming, Sum-of-squares polynomials, Lorenz equations
1. Introduction For dynamical systems governed by ordinary differential equations (ODEs) whose solutions are complicated and perhaps chaotic, the primary interest is often in long-time averages of key quantities. Time averages can depend on initial conditions, so it is natural to seek the largest or smallest average among all trajectories, as well as the extremal trajectories that realize them. For various purposes including the control of chaos Yang et al. (2000), it is valuable to know extremal trajectories regardless of their stability. In other situations one is interested only in stable trajectories, but determining extrema only among these can be prohibitively difficult. The next best option is to determine extrema among all trajectories.
One common way to seek extremal time averages is to construct a large number of candidate trajectories. However, for many nonlinear systems it is challenging both to compute trajectories and to determine that the extremal ones have not been overlooked. In this Letter we study an alternative approach that is broadly applicable and often more tractable: constructing sharp a priori bounds on long-time averages. We focus on upper bounds; lower bounds are analogous.
The search for an upper bound on a long-time average can be posed as a convex optimization problem Chernyshenko et al. (2014), as described in the next section. Its solution requires no knowledge of trajectories. What is optimized is an auxiliary function defined on phase space, similar to but distinct from Lyapunov functions in stability theory. We prove here that the best bound produced by solving this convex optimization problem coincides exactly with the extremal long-time average. That is, arbitrarily sharp bounds on time averages can be produced using increasingly optimal auxiliary functions. Moreover, nearly optimal auxiliary functions yield volumes in phase space where maximal and nearly maximal trajectories must reside. Whether such auxiliary functions can be computed in practice depends on the system being studied, but when the ODE and quantity of interest are polynomial, auxiliary functions can be constructed by solving semidefinite programs (SDPs) Chernyshenko et al. (2014); Fantuzzi et al. (2016); Goluskin (2017). The resulting bounds can be arbitrarily sharp. We illustrate these methods using the Lorenz system Lorenz (1963).
Consider a well-posed autonomous ODE on ,
whose solutions are continuously differentiable in their initial conditions. To guarantee this, we assume that is continuously differentiable. Given a continuous quantity of interest , we define its long-time average along a trajectory with initial condition by
Time averages could be defined using instead; our results hold mutatis mutandis 111The and averages need not coincide on every trajectory, but their maxima over trajectories do..
Let be a closed bounded region such that trajectories beginning in remain there. In a dissipative system could be an absorbing set; in a conservative system could be defined by constraints on invariants. We are interested in the maximal long-time average among all trajectories eventually remaining in :
As shown below, there exist attaining the maximum. The fundamental questions addressed here are: what is the value of , and which trajectories attain it?
2. Bounds by convex optimization Upper bounds on long-time averages can be deduced using the fact that time derivatives of bounded functions average to zero. Given any initial condition in and any in the class of continuously differentiable functions on 222Here denotes functions on admitting a continuously differentiable extension to a neighborhood of .,
This generates an infinite family of functions with the same time average as since for all such
Bounding the righthand side pointwise gives
for all initial conditions and auxiliary functions . Expression (6) is useful since no knowledge of trajectories is needed to evaluate the righthand side.
To obtain the optimal bound implied by (6), we minimize the righthand side over and maximize the lefthand side over :
The minimization over auxiliary functions in (7) is convex, although minimizers need not exist. The main mathematical result of this Letter is that the lefthand and righthand optimizations are dual variational problems, and moreover that strong duality holds, meaning that (7) can be improved to an equality:
Thus, arbitrarily sharp bounds on the maximal time average can be obtained using increasingly optimal .
The auxiliary function method is not the same as the various Lyapunov-type methods used to show stability or boundedness in ODE systems. However, in instances where approaches infinity as , auxiliary functions that imply finite upper bounds also imply the existence of trapping sets by the following argument. Suppose is an auxiliary function for which the maximum of over is no larger than . Then,
as . Expression (9) is a typical Lyapunov-type condition implying that all sufficiently large sublevel sets of must be trapping sets.
The remainder of this Letter is organized as follows. The next section describes how nearly optimal can also be used to locate maximal and nearly maximal trajectories in phase space. The section after illustrates these ideas using the Lorenz system, for which we have constructed nearly optimal by solving SDPs. The final section proves the strong duality (8) and establishes the existence of maximal trajectories.
3. Near optimizers In light of the duality (8), an initial condition and auxiliary function are optimal if and only if they satisfy
In such cases, is within of being a sharp upper bound on , while the trajectory starting at achieves a time average within of .
for such trajectories. The integrand in (12) is nonnegative, and the fraction of time it exceeds can be estimated. Consider the set where the integrand is no larger than ,
Let denote the fraction of time during which . For any trajectory obeying (12), this time fraction is bounded below as
This follows from an application of Markov’s inequality: as the integrand in (12) is nonnegative,
In practice, it may not be known if there exist trajectories satisfying (11) for a given and . Still, the estimate (14) says that any such trajectories would lie in for a fraction of time no smaller than . The conclusion is strongest when , but if is too large the volume is large and featureless, failing to distinguish nearly maximal trajectories. The result is most informative when is nearly optimal so that there exist trajectories where with not too large.
If a minimal exists, its set is related to maximal trajectories. Any such trajectory achieves in (12). If it is a periodic orbit, for instance, it must lie in . Thus is determined up to a constant on maximal orbits. More generally, must satisfy
for all . It is tempting to conjecture that coincides with maximal trajectories but, as described at the end of the next section, can also contain points not on any maximal trajectory.
4. Nearly optimal bounds and orbits in the Lorenz system When and are polynomials, can be optimized computationally within a chosen polynomial ansatz by solving an SDP Chernyshenko et al. (2014); Fantuzzi et al. (2016); Goluskin (2017). The bound follows from (6) if for all . A sufficient condition for this is that is a sum of squares (SOS) of polynomials. The latter is equivalent to an SDP and is often computationally tractable Parrilo (2003); Lasserre (2010).
It does not follow from the strong duality result (8) that bounds computed by SOS methods can be arbitrarily sharp. This is because requiring the polynomial to be SOS is generally stronger than requiring it to be nonnegative Hilbert (1888); Parrilo (2003). Nonetheless, in the few examples where time averages have been bounded using SOS methods Fantuzzi et al. (2016); Goluskin (2017), the bounds either are sharp or appear to become sharp as the polynomial degree of increases.
The remainder of this section presents the results of SOS bounding computations for the Lorenz system at the standard chaotic parameters . We obtain nearly sharp bounds on the maximal time average of , as well as approximations to maximal trajectories. Because there exist compact absorbing balls Lorenz (1979), maximization over such in (3) is equivalent to maximization over . As reported in Goluskin (2017), searching among the periodic orbits computed by Viswanath Viswanath (2003) suggests that the maximal average is attained by the shortest periodic orbit—the black curves in Fig. 1. We have used SOS methods to construct nearly optimal and accompanying upper bounds . Similar results for various in the Lorenz system appear in Goluskin (2017), along with a more detailed discussion of computational implementation. Here we report more precise computations for , obtained using the multiple precision SDP solver SDPA-GMP Yamashita et al. (2010); Nakata (2010). Conversion of SOS conditions to SDPs was automated by YALIMP Löfberg (2004, 2009), which was interfaced with the solver via mpYALMIP Fantuzzi (2017).
|Degree of||Upper bound|
Table 1 reports upper bounds computed by solving SDPs that produce optimal of various polynomial degrees. As the degree of increases, the bounds approach the value of on the shortest periodic orbit to within 7 significant figures. This suggests that is maximized on this orbit, and it reflects the sharpness of the bounds asserted by the duality (8). We do not report the lengthy expressions for these ; some simpler examples appear in Goluskin (2017).
To demonstrate how the volumes defined in (13) approximate maximal trajectories, we consider the polynomials of degrees 6 and 10 that produce the bounds in Table 1. For the maximum in the definition of we use the corresponding , which bounds it from above. In each case we find that is within of the true maximum over any ball containing the attractor.
Figure 1a shows the volume for the degree-6 , as well as the orbit that appears to maximize . The volume captures the rough location and shape of the orbit while omitting much of the strange attractor, but this is not optimal enough to yield strong quantitative statements. It follows from (14) that any trajectory where is within of the upper bound must lie inside for a fraction of time no less than . However, there are no trajectories on which this is close to unity; the higher-degree bounds in Table 1 preclude any trajectories with .
The degree-10 gives a significantly refined picture of maximal and nearly maximal trajectories for . Figure 1b shows the volume defined using this . It follows from (14) that any trajectory where comes within of must lie in for a fraction of time no less than . There exist trajectories on which this is nearly unity: on the shortest periodic orbit is only smaller than . Any trajectory where is so large must spend at least 99.97% of its time in .
Finding maximal trajectories directly may be intractable in many systems. We propose that the next best option is to compute volumes like those in Fig. 1. However, we caution that finding points in a set defined by (13) can itself be difficult, even for polynomials.
As the auxiliary functions producing upper bounds on approach optimality, the integrand in (12) approaches zero almost everywhere on maximal trajectories. This can be seen in Fig. 2a, where the integrand is plotted along the shortest periodic orbit in the Lorenz system for our polynomials of degrees 6, 8, and 10. Along other orbits where is large but not maximal, is less strongly constrained. As an example, we consider the periodic orbit computed in Viswanath (2003) that winds around the two wings of the Lorenz attractor with symbol sequence . On this orbit is smaller than the maximum by approximately 2798. The integral in (12) remains between 0 and 2798 as approaches optimality but need not approach 0 on this orbit. In our computations it does not, as seen in Fig. 2b.
Although the auxiliary polynomials yielding the bounds on in Table 1 approach optimality, they are not exactly optimal. Optimal which are polynomial have been constructed to prove sharp bounds on other averages in the Lorenz system, including , , and Malkus (1972); Knobloch (1979); Goluskin (2017). These averages are maximized on the two nonzero equilibria; in each case the set corresponding to is the line through these equilibria. These notably include points not on any maximal trajectory. In contrast, for the shortest periodic orbit appears to be maximal. This conjecture could be proved by constructing a whose contains the shortest orbit. If such a exists, it would necessarily be optimal. However, we expect that this orbit is non-algebraic and that no polynomial can be optimal.
5. Proof of duality To prove the strong duality (8) we require several facts from ergodic theory, which are provable by standard methods as in Walters (1982). (See also (Phelps, 2001, Chap. 12).) Let denote the flow map for the ODE (1). By assumption, is well-defined on for all and is continuously differentiable there. Let denote the space of Borel probability measures on . A measure is invariant with respect to if for all Borel sets and all . Such a measure is ergodic if to any invariant Borel set it assigns measure either zero or one. The set of invariant probability measures on is nonempty, convex, and weak- compact; its extreme points are ergodic.
In (17a) we reformulate our problem as a maximization over invariant measures, whose analogue for discrete maps is the topic of the field of ergodic optimization Jenkinson (2006). The remainder of this section is devoted to proving the first three equalities (17a)–(17c), along with the fact that the maximum in (17a) is attained. The final equality (17d) is evident since, for each , the supremum in (17c) is attained by a suitable Dirac measure.
We begin by proving (17a). We claim that the right hand problem appearing there is a concave relaxation of the lefthand problem, and that it attains the same maximum. To see this, note first that for each initial condition in there exists an invariant probability measure that attains . Thus,
The righthand problem in (18) is a maximization of a continuous linear functional over a compact convex subset of , so it achieves its maximum at an extreme point (Lax, 2002, Chap. 13), which is an ergodic invariant measure. By Birkhoff’s ergodic theorem Walters (1982),
To prove the second equality (17b) we require the following equivalence of Lagrangian and Eulerian notions of invariance: a Borel probability measure is invariant with respect to by the usual (Lagrangian) definition if and only if the vector-valued measure is weakly divergence-free. The latter condition, which we denote by , means that
for all smooth and compactly supported . This is an Eulerian characterization of invariance.
The fact that is equivalent to invariance is quickly proved using the flow semigroup identity, which states that for all and . It follows that
for all smooth and compactly supported . If , the righthand side of (21) vanishes, so is invariant. Conversely, if is invariant then the lefthand side of (21) vanishes for all , and at we find the statement that is weakly divergence-free.
Only measures for which can give values larger than in (17b). As shown above, if and only if is invariant. Therefore, since there always exists at least one invariant probability measure,
Thus (17b) is proven. In other words,
is a Lagrangian for the constrained maximization appearing on the righthand side of (23).
Finally, we prove the equality (17c). In terms of the Lagrangian , we must show that
The fact that the order of and can be reversed without introducing a so-called duality gap is not trivial; it is at the heart of our proof of the strong duality (8). This reversal relies on properties of the Lagrangian and the spaces and .
The desired equality (25) can be proved using any of several abstract minimax theorems from convex analysis. Here we apply a fairly general infinite-dimensional version due to Sion Sion (1958). We follow the notation of its statement in the introduction of Komiya (1988), which contains an elementary proof. Let in the weak- topology. It is a compact convex subset of a linear topological space. Let in the -norm topology, which is itself a linear topological space. Take and observe that is upper semicontinuous and quasi-concave on for each , and that is lower semicontinuous and quasi-convex on for each . Then (25) follows from a direct application of Sion’s minimax theorem Komiya (1988), so (17c) is proven.
6. Conclusions This Letter establishes that the auxiliary function method for proving a priori bounds on long-time averages in dynamical systems yields arbitrarily sharp bounds, so long as the dynamics arise from ODEs. The proof elucidates the role that auxiliary functions play in the search for optimal bounds: they are Lagrange multipliers enforcing the constraint of invariance for probability measures on phase space. We also have demonstrated that certain sets constructed from nearly optimal auxiliary functions can be used to locate all optimal and nearly optimal trajectories. How close an auxiliary function is to optimality determines the fraction of time nearly optimal trajectories are guaranteed to spend in these sets. We expect much can be learned about the shape of optimal trajectories and their invariant measures by the auxiliary function approach.
Many of these observations extend to infinite-dimensional dynamics that are governed by nonlinear partial differential equations (PDEs) of the form
Auxiliary functionals defined on a suitable function space yield a priori bounds on long-time averages just as in the finite-dimensional case. The “background method” used to bound mean quantities in fluid dynamics and other systems Doering and Constantin (1992) is an example of using quadratic Chernyshenko (2017). Whether or not nearly optimal functionals are always guaranteed to exist, and if they are ever quadratic for systems of interest, remains unclear. This emphasizes the need for a rigorous proof of duality between auxiliary functionals and extremal trajectories for general PDEs.
Acknowledgements We thank Lora Billings, Rich Kerswell, Edward Ott, Ralf Spatzier, Divakar Viswanath, and Lai-Sang Young for helpful discussions and encouragement. This work was supported by NSF Award DMS-1515161, Van Loo Postdoctoral Fellowships (IT, DG), and a Guggenheim Foundation Fellowship (CRD).
- Yang et al. (2000) T.-H. Yang, B. R. Hunt, and E. Ott, Phys. Rev. E 62, 1950 (2000).
- Chernyshenko et al. (2014) S. I. Chernyshenko, P. Goulart, D. Huang, and A. Papachristodoulou, Philos. Trans. R. Soc. A 372, 20130350 (2014).
- Fantuzzi et al. (2016) G. Fantuzzi, D. Goluskin, D. Huang, and S. I. Chernyshenko, SIAM J. Appl. Dyn. Syst. 15, 1962 (2016), arXiv:1512.05599 .
- Goluskin (2017) D. Goluskin, J. Nonlinear Sci. In press (2017), 10.1007/s00332-017-9421-2.
- Lorenz (1963) E. N. Lorenz, J. Atmos. Sci. 20, 130 (1963).
- (6) The and averages need not coincide on every trajectory, but their maxima over trajectories do.
- (7) Here denotes functions on admitting a continuously differentiable extension to a neighborhood of .
- Parrilo (2003) P. A. Parrilo, Math. Program. Ser. B 96, 293 (2003).
- Lasserre (2010) J. B. Lasserre, Moments, positive polynomials and their applications (Imperial College Press, 2010).
- Hilbert (1888) D. Hilbert, Math. Ann. 32, 342 (1888).
- Lorenz (1979) E. N. Lorenz, in Glob. Anal., edited by M. Grmela and J. E. Marsden (Springer, 1979) pp. 53–75.
- Viswanath (2003) D. Viswanath, Nonlinearity 16, 1035 (2003).
- Yamashita et al. (2010) M. Yamashita, K. Fujisawa, M. Fukuda, K. Nakata, and M. Nakata, Res. Rep. B-463, Tech. Rep. (Deptartment of Mathematical and Computing Science, Tokyo Institute of Technology, Tokyo, 2010).
- Nakata (2010) M. Nakata, in Proc. IEEE Int. Symp. Comput. Control Syst. Des. (2010) pp. 29–34.
- Löfberg (2004) J. Löfberg, in IEEE Int. Conf. Comput. Aided Control Syst. Des. (Taipei, Taiwan, 2004) pp. 284–289.
- Löfberg (2009) J. Löfberg, IEEE Trans. Automat. Contr. 54, 1007 (2009).
- Fantuzzi (2017) G. Fantuzzi, http://github.com/giofantuzzi/mpYALMIP (2017).
- Malkus (1972) W. V. R. Malkus, Mémoires la Société R. des Sci. Liège, Collect. IV 6, 125 (1972).
- Knobloch (1979) E. Knobloch, J. Stat. Phys. 20, 695 (1979).
- Walters (1982) P. Walters, An introduction to ergodic theory, Vol. 79 (Springer-Verlag, New York-Berlin, 1982) pp. ix+250.
- Phelps (2001) R. R. Phelps, Lectures on Choquet’s theorem, 2nd ed., Vol. 1757 (Springer-Verlag, Berlin, 2001) pp. viii+124.
- Ekeland and Témam (1999) I. Ekeland and R. Témam, Convex analysis and variational problems (SIAM, Philadelphia, PA, 1999).
- Jenkinson (2006) O. Jenkinson, Discret. Contin. Dyn. Syst. 15, 197 (2006).
- Lax (2002) P. D. Lax, Functional analysis (Wiley-Interscience [John Wiley & Sons], New York, 2002) pp. xx+580.
- Sion (1958) M. Sion, Pacific J. Math. 8, 171 (1958).
- Komiya (1988) H. Komiya, Kodai Math. J. 11, 5 (1988).
- Doering and Constantin (1992) C. R. Doering and P. Constantin, Phys. Rev. Lett. 69, 1648 (1992).
- Chernyshenko (2017) S. Chernyshenko, arXiv:1704.02475v2 (2017).