Long-time Average Cost Control of Polynomial Systems: A Sum-of-squares-based Small-feedback Approach

Footnote 1: Funding from EPSRC under the grant EP/J011126/1 and support in kind from Airbus Operation Ltd., ETH Zurich (Automatic Control Laboratory), University of Michigan (Department of Mathematics), and University of California, Santa Barbara (Department of Mechanical Engineering) are gratefully acknowledged.

Deqing Huang d.huang@imperial.ac.uk Sergei Chernyshenko s.chernyshenko@imperial.ac.uk Department of Aeronautics, Imperial College London, Prince Consort Road, London SW7 2AZ, United Kingdom
Abstract

The two main contributions of this paper are a proof of concept of a recently proposed idea in the area of long-time average cost control, and a new method of overcoming the well-known difficulty of non-convexity of the simultaneous optimization of a control law and an additional tunable function. A recently proposed method of obtaining rigorous bounds on the long-time average cost is first outlined for uncontrolled systems with polynomial right-hand sides. In this method the polynomial constraints are relaxed to sum-of-squares conditions and formulated as semi-definite programs. It was proposed to use the upper bound of the long-time average cost, rather than the time-average cost itself, as the objective function in controller design. In the present paper this suggestion is implemented for a particular system and is shown to give good results. Designing the optimal controller by this method requires optimizing simultaneously both the control law and a tunable function similar to a Lyapunov function. The new approach proposed and implemented in this paper for overcoming the inherent non-convexity of this optimization is based on a formal assumption that the amplitude of the control is small. By expanding the tunable function and the bound in the small parameter, the long-time average cost is reduced by minimizing the respective bound in each term of the series. The derivation of all the polynomial coefficients in the controller is given in terms of the solvability conditions of state-dependent linear and bilinear inequalities. The resulting sum-of-squares problems are solved in sequence, thus avoiding the non-convexity in optimization. The proposed approach is implemented for a simple model of oscillatory vortex shedding behind a cylinder.

keywords:
Sum of squares; Long-time average; Polynomial systems; Small feedback; Non-convexity


1 Introduction

Although global stabilization of dynamical systems is of importance in system theory and engineering Kh:02 (); An:02 (), it is sometimes difficult or impossible to synthesize a globally stabilizing controller for certain linear and nonlinear systems Va:02 (). The reasons include poor controllability of the system, e.g., systems that have uncontrollable linearizations Di:11 () or fewer degrees of control freedom than degrees of freedom to be controlled Gu:13 (); Gu:14 (); input/output constraints encountered in practice, e.g., an unstable linear time-invariant system cannot be globally stabilized in the presence of input saturation Bl:99 () or time delay Sun:11 (); Sun:13 (); and large disturbances Ki:06 (). Moreover, in many applications full stabilization, while possible, carries a high penalty due to the cost of the control and is therefore undesirable.

Instead, minimizing a long-time average of the cost functional might be more realistic. For instance, long-time-average cost analysis and control is often considered in irrigation, flood control, navigation, water supply, hydroelectric power, computer communication networks, and other applications Du:10 (); Bo:92 (). In addition, systems that include stochastic factors are often controlled in the sense of a long-time average. A summary of long-time-average cost problems for continuous-time Markov processes is given in Ro:83 (). In Me:00 (), the long-time-average control of a class of problems arising in the modeling of semi-active suspension systems was considered, where the cost includes a term based on the local time process of the diffusion. Notice that the controller design methods proposed in Ro:83 (); Me:00 () rely heavily on the stochastic properties of the dynamical systems.

In certain cases, as, for example, in turbulent fluid flows, calculating the time averages is a big challenge even in the uncontrolled case. As a result, developing control aimed at reducing the time-averaged cost for turbulent flows, for example by using the receding-horizon technique, leads to controllers too complicated for practical implementation Bewley:01 (). To overcome this complexity, it was proposed in Ph:14 () to use an upper bound for the long-time average cost instead of the long-time average cost itself in cases when such an upper bound is easier to calculate. The idea is based on the hope that a control reducing an upper bound for a quantity will also reduce the quantity itself. The method of Ph:14 () uses the sum-of-squares (SOS) decomposition of polynomials and semidefinite programming (SDP), and allows a trade-off between the quality of the bound and the complexity of its calculation.

The SOS methods apply to systems defined by a polynomial vector field. Such systems may describe a wide variety of dynamics Va:01 () or approximate a system defined by an analytical vector field Va:02 (). A polynomial system can therefore yield a reliable model of a dynamical system globally, or in larger regions of the state space than the linear approximation Va:03 (). Recent results on SOS decomposition have transformed the verification of non-negativity of polynomials into SDP, hence providing promising algorithmic procedures for stability analysis of polynomial systems. However, using SOS techniques for optimal control, as for example in Pr:02 (); Zh:07 (); Ma:10 (), is subject to a generic difficulty: while the problem of optimizing the candidate Lyapunov function certifying the stability of a closed-loop system for a given controller, and the problem of optimizing the controller for a given candidate Lyapunov function, are each reducible to an SDP and thus tractable, the problem of simultaneously optimizing both the controller and the Lyapunov function is non-convex. Iterative procedures were proposed for overcoming this difficulty Zh:07 (); Zh:09 (); Ng:11 ().

While the optimization of an upper bound with control proposed in Ph:14 () does not involve a Lyapunov function, it does involve a similar tunable function, and it shares the same difficulty of non-convexity. In the present work we propose a polynomial state-feedback controller design scheme for long-time average upper-bound control, where the controller takes the structure of an asymptotic series in a small-amplitude perturbation parameter. By fully utilizing the smallness of the perturbation parameter, the resulting SOS optimization problems are solved in sequence, thus avoiding the non-convexity in optimization. We apply the method to an illustrative example and demonstrate that it does allow one to reduce the long-time average cost even without fully stabilizing the system. Notice the significant conceptual difference between our approach and the studies of control by small perturbations, often referred to as tiny feedback, see for example tc:93 ().

The paper is organized as follows. Section 2 presents some preliminary introduction on SOS and its application in bound estimation of long-time average cost for uncontrolled systems. Section 3 gives the problem formulation. Bound optimization of the long-time average cost for controlled polynomial systems is considered in Section 4. An illustrative example of a cylinder wake flow is addressed in Section 5. Section 6 concludes the work.

2 Background

In this section, SOS of polynomials and a recently proposed method of obtaining rigorous bounds on the long-time average cost via SOS for uncontrolled polynomial systems are introduced.

2.1 SOS of polynomials

SOS techniques have been frequently used in the stability analysis and controller design for all kinds of systems, e.g., constrained ordinary differential equation systems An:02 (), hybrid systems An:05 (), time-delay systems An:04 (), and partial differential equation systems Pa:06 (); Yu:08 (); GC:11 (). These techniques help to overcome the common drawback of approaches based on Lyapunov functions: before Pr:02 (), there were no coherent and tractable computational methods for constructing Lyapunov functions.

A multivariate polynomial $f(x)$ is an SOS if there exist polynomials $f_1(x), \dots, f_m(x)$ such that

$$ f(x) = \sum_{i=1}^{m} f_i^2(x). $$

If $f(x)$ is an SOS then $f(x) \ge 0$ for all $x$. In the general multivariate case, however, $f(x) \ge 0$ does not necessarily imply that $f(x)$ is an SOS. While being stricter, the condition that $f(x)$ is an SOS is much more computationally tractable than non-negativity Par:00 (). At the same time, practical experience indicates that in many cases replacing non-negativity with the SOS property leads to satisfactory results.
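To make the Gram-matrix idea behind the SOS relaxation concrete, the following minimal sketch (assuming the cvxpy package and the SCS solver are available; the polynomial and monomial basis are chosen purely for illustration) checks whether $p(x) = x^4 + 2x^2 + 1$ admits an SOS certificate by searching for a positive semidefinite Gram matrix $Q$ with $p(x) = z(x)^\top Q\, z(x)$, $z(x) = [1, x, x^2]^\top$.

```python
import cvxpy as cp
import numpy as np

# Monomial basis z(x) = [1, x, x^2]; Q is the (unknown) Gram matrix.
Q = cp.Variable((3, 3), PSD=True)

# Match the coefficients of z(x)^T Q z(x) with p(x) = x^4 + 2 x^2 + 1.
constraints = [
    Q[0, 0] == 1,                  # constant term
    2 * Q[0, 1] == 0,              # x
    2 * Q[0, 2] + Q[1, 1] == 2,    # x^2
    2 * Q[1, 2] == 0,              # x^3
    Q[2, 2] == 1,                  # x^4
]

problem = cp.Problem(cp.Minimize(cp.trace(Q)), constraints)
problem.solve(solver=cp.SCS)

print(problem.status)        # 'optimal' means an SOS certificate exists
print(np.round(Q.value, 3))  # a PSD Gram matrix, here corresponding to (x^2 + 1)^2
```

Toolboxes such as SOSTOOLS Pra:04 () and YALMIP Lo:09 () automate exactly this coefficient matching and the choice of monomial basis for general multivariate problems.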

In the present paper we will utilize the existence of efficient numerical methods and software Pra:04 (); Lo:09 () for solving optimization problems of the following type: minimize the linear objective function

$$ \mathbf{w}^{\top}\mathbf{c} \qquad (1) $$

where $\mathbf{w}$ is the vector of weighting coefficients for the linear objective function, and $\mathbf{c}$ is a vector formed from the (unknown) coefficients of the polynomials $p_i(x)$ for $i = 1, \dots, \hat{N}$ and of the SOS polynomials $p_i(x)$ for $i = \hat{N}+1, \dots, N$, such that

$$ a_{0,j}(x) + \sum_{i=1}^{N} p_i(x)\, a_{i,j}(x) = 0, \qquad j = 1, \dots, \hat{J}, \qquad (2) $$
$$ a_{0,j}(x) + \sum_{i=1}^{N} p_i(x)\, a_{i,j}(x) \ \ \text{are SOS}, \qquad j = \hat{J}+1, \dots, J. \qquad (3) $$

In (2) and (3), the $a_{i,j}(x)$ are given scalar constant-coefficient polynomials.

The lemma below, which provides a sufficient condition for testing inclusions of sets defined by polynomials, is frequently used for feedback controller design in Section 4. It is a particular case of the Positivstellensatz Theorem Po:99 () and is a generalized S-procedure Ta:06 ().

Lemma 1

Consider two sets of points $x \in \mathbb{R}^n$,

where the functions defining the sets are scalar polynomials. The set containment holds if there exist a polynomial function and SOS polynomial functions such that
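The displayed conditions are not reproduced above. For orientation, one commonly used form of the generalized S-procedure, stated here as a sketch that may differ in detail from Lemma 1 of the original, is the following: given polynomials $p(x), q_1(x), \dots, q_m(x)$, if there exist SOS polynomials $s_1(x), \dots, s_m(x)$ such that

$$ p(x) - \sum_{i=1}^{m} s_i(x)\, q_i(x) \ \ \text{is SOS}, $$

then $\{x : q_i(x) \ge 0,\ i = 1, \dots, m\} \subseteq \{x : p(x) \ge 0\}$. The justification is immediate: for any $x$ in the first set, $p(x) \ge \sum_i s_i(x) q_i(x) \ge 0$.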

2.2 Bound estimation of long-time average cost for uncontrolled systems

For the convenience of the reader we outline here the method of obtaining bounds for long-time averages proposed in Ph:14 () and make some remarks on it. Consider a system

$$ \dot{x} = f(x), \qquad (4) $$

where $x \in \mathbb{R}^n$ and $f(x)$ is a vector of multivariate polynomials of the components of the state vector $x$. The long-time average of a function $\Phi(x)$ of the state is defined as

$$ \bar{\Phi} = \limsup_{T \to \infty} \frac{1}{T} \int_{0}^{T} \Phi(x(t))\, dt, $$

where $x(t)$ is the solution of (4).

Define a polynomial function $V(x)$ of the system state, of degree $d_V$, containing unknown decision variables as its coefficients. The time derivative of $V$ along the trajectories of system (4) is

$$ \dot{V} = \nabla V(x) \cdot f(x). $$

Consider the following quantity:

$$ S(x) = \nabla V(x) \cdot f(x) + \Phi(x) - C, $$

where $C$ is a constant. The following result is from Ph:14 ():

Lemma 2

For the system (4), assume that the state is bounded in a domain $\mathbb{D}$. Then, $S(x) \le 0$ for all $x \in \mathbb{D}$ implies $\bar{\Phi} \le C$.

Hence, an upper bound of $\bar{\Phi}$ can be obtained by minimizing $C$ over $V$ under the constraint $S(x) \le 0$, which can be formulated as an SOS optimization problem of the form:

$$ \min_{V(x)}\; C \qquad (5) $$
$$ \text{such that} \quad -S(x) = C - \nabla V(x)\cdot f(x) - \Phi(x) \ \ \text{is SOS}, \qquad (6) $$

which is a special case of (1). A better bound might be obtained by removing the requirement for $V$ to be a polynomial and replacing (6) with the requirement of non-negativity of $-S$. However, the resulting problem would be too difficult, since the classical algebraic-geometry problem of verifying positive-definiteness of a general multivariate polynomial is NP-hard An:02 (); An:05 ().

Notice that while $V(x)$ is similar to a Lyapunov function in a stability analysis, it is not required to be positive-definite. Notice also that a lower bound of any long-time average cost of the system (4) can be analyzed in a similar way.
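As a concrete illustration of the bound computation (5)-(6), consider the scalar toy system $\dot{x} = x - x^3$ with cost $\Phi(x) = x^2$ (an example introduced here for illustration, not taken from the original). With the ansatz $V(x) = a x^2$, the requirement that $-S(x) = C - (2a+1)x^2 + 2a x^4$ be SOS becomes a small SDP in $a$, $C$, and a Gram matrix; the sketch below (assuming cvxpy and SCS are available) recovers $C \approx 1$, which is tight because trajectories settle on the attractor $x = \pm 1$.

```python
import cvxpy as cp

# Toy system: xdot = x - x^3, cost Phi(x) = x^2, ansatz V(x) = a*x^2.
# S(x) = dV/dx * f(x) + Phi(x) - C = (2a+1) x^2 - 2a x^4 - C,
# and we require -S(x) to be SOS in the monomial basis z = [1, x, x^2].
a = cp.Variable()
C = cp.Variable()
Q = cp.Variable((3, 3), PSD=True)   # Gram matrix of -S

constraints = [
    Q[0, 0] == C,                          # constant term of -S
    2 * Q[0, 1] == 0,                      # x
    2 * Q[0, 2] + Q[1, 1] == -(2 * a + 1), # x^2
    2 * Q[1, 2] == 0,                      # x^3
    Q[2, 2] == 2 * a,                      # x^4
]

cp.Problem(cp.Minimize(C), constraints).solve(solver=cp.SCS)
print(C.value, a.value)   # expect C close to 1 (tight) and a close to 0.5
```

The optimum corresponds to $V(x) = x^2/2$ and $-S(x) = (x^2 - 1)^2$, which vanishes exactly on the attractor, so the computed bound is tight for this toy problem.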

Remark 1

For many systems the boundedness of the system state immediately follows from energy considerations. In general, if the system state is bounded, this can often be proven using the SOS approach. It suffices to check whether there exists a large but bounded global attractor, denoted by $\mathcal{D}$. As an example, let $\mathcal{D}$ be a ball centred at the origin whose radius is determined by a sufficiently large constant. Then, the global attraction property of the system trajectories to $\mathcal{D}$ may be expressed as

(7)

Introducing a tunable polynomial multiplier satisfying the non-negativity requirement of Lemma 1, (7) can be relaxed to

(8)
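Expressions (7) and (8) are not reproduced above. For orientation, one common way of writing such an attraction condition and its SOS relaxation, sketched here under the assumption that $\mathcal{D}$ is the ball $\{x : \|x\|^2 \le \beta\}$ (the exact form used in the original may differ), is to require

$$ \frac{d}{dt}\|x\|^{2} = 2\, x^{\top} f(x) \le 0 \qquad \text{whenever } \|x\|^{2} \ge \beta, $$

which, by Lemma 1, is implied by the existence of an SOS multiplier $\sigma(x)$ such that $-\,x^{\top} f(x) - \sigma(x)\big(\|x\|^{2} - \beta\big)$ is SOS.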

Minimization of an upper bound of the long-time average cost for systems that have an unbounded global attractor is usually meaningless, since the cost itself could be infinitely large.

3 Problem Formulation

Consider a polynomial system with a single input,

$$ \dot{x} = f(x) + g(x)\, u, \qquad (9) $$

where $f(x)$ and $g(x)$ are polynomial functions of the system state $x \in \mathbb{R}^n$. The approach of this paper can easily be extended to multiple-input systems. The control $u$, which is assumed to be a polynomial in the system state of maximum degree $d_u$, is designed to minimize the upper bound of an average cost of the form

$$ \bar{\Phi} = \limsup_{T \to \infty} \frac{1}{T} \int_{0}^{T} \Phi\big(x(t), u(t)\big)\, dt, \qquad (10) $$

where $x(t)$ is the closed-loop solution of the system (9) with the control $u$. The continuous function $\Phi(x,u)$ is a given non-negative polynomial cost in $x$ and $u$.

Similarly to (5)-(6), we consider the following optimization problem:

(11)
(12)

When it cannot be guaranteed that the closed-loop system state is bounded, SOS constraints (8) must be added to (12) to make our analysis rigorous.

Within the framework of SOS optimization, the main difficulty in solving (11)-(12) is the non-convexity of (12), caused by the control input and the decision function, both of which are tunable, entering (12) nonlinearly. Iterative methods Zh:07 (); Zh:09 (); Ng:11 () may help to overcome this issue indirectly in the following way: first fix one subset of the bilinear decision variables and solve the resulting linear inequalities in the other decision variables; in the next step, fix the other subset and repeat the procedure. For the particular long-time average cost control problem (11)-(12), the non-convexity will be resolved here by considering a type of so-called small-feedback controller. In this way, iterative updating of decision variables is avoided and replaced by solving a sequence of SOS optimization problems.
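To see where the non-convexity comes from, consider the closed-loop analogue of the quantity $S$ from Section 2 (a sketch, assuming the same construction with $f$ replaced by the closed-loop vector field):

$$ S(x) = \nabla V(x) \cdot \big( f(x) + g(x)\, u(x) \big) + \Phi\big(x, u(x)\big) - C. $$

The term $\nabla V \cdot g\, u$ is bilinear in the unknown coefficients of $V$ and of $u$, so the constraint (12) is not jointly convex in these decision variables, although it is convex in either set separately.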

4 Bound optimization of long-time average cost for controlled polynomial systems

In this section a small-feedback controller is designed to reduce the upper bound of the long-time average cost (10) for the controlled polynomial system (9). It is reasonable to hope that a controller reducing the upper bound for the time-averaged cost will also reduce the time-averaged cost itself Ph:14 ().

4.1 Basic formalism of the controller design

We will look for a controller in the form

$$ u = \sum_{n=1}^{\infty} \epsilon^{n} u_{n}(x), \qquad (13) $$

where $\epsilon$ is a parameter and the $u_{n}(x)$ are polynomial functions of the system state $x$. In other words, we seek a family of controllers parameterised by $\epsilon$ in the form of a Taylor series in $\epsilon$. Notice that the expansion starts at the first-order term, so that $\epsilon = 0$ gives the uncontrolled system. To resolve the non-convexity problem of SOS optimization, we expand $V$ and $C$ in $\epsilon$:

$$ V = \sum_{n=0}^{\infty} \epsilon^{n} V_{n}(x), \qquad (14) $$
$$ C = \sum_{n=0}^{\infty} \epsilon^{n} C_{n}, \qquad (15) $$

where $V_{n}$ and $C_{n}$ are the Taylor-series coefficients of the tunable function and the bound, respectively, at the $n$th order in $\epsilon$. Define

$$ S(x) = \nabla V(x) \cdot \big( f(x) + g(x)\, u \big) + \Phi(x, u) - C. \qquad (16) $$

Substituting (13), (14), and (15) into (16), we have

Noticing

it follows that

(17)

where

(18)

In (18), denotes the th partial derivative of with respect to at , and denotes the th partial derivative of with respect to at .
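For reference, a sketch of the first few expansion coefficients, written under the simplifying and purely illustrative assumption that the cost does not depend on the control, $\Phi = \Phi(x)$ (the cost used in the original may differ), reads

$$ S_0 = \nabla V_0 \cdot f + \Phi - C_0, $$
$$ S_1 = \nabla V_1 \cdot f + \nabla V_0 \cdot g\, u_1 - C_1, $$
$$ S_2 = \nabla V_2 \cdot f + \nabla V_1 \cdot g\, u_1 + \nabla V_0 \cdot g\, u_2 - C_2. $$

In particular, the tunable functions entering $S_n$ are always among those entering $S_{n+1}$, which is what makes a sequential, rather than simultaneous, optimization possible.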

Expression (17) becomes more clear when a specific cost function is considered. For instance, let . Then,

where

and the remainder term denotes all the terms of order equal to or greater than 3 in $\epsilon$.

It is clear that $S \le 0$ holds if $S_n \le 0$ for all $n$ simultaneously and the series (13)-(15) converge. Notice that $S_n$ includes the tunable functions $u_1, \dots, u_n$, $V_0, \dots, V_n$, and $C_0, \dots, C_n$. For any non-negative integers satisfying $m \le n$, the tunable variables in $S_m$ are always a subset of the tunable variables in $S_n$. Hence (11)-(12) can be solved as a sequence of convex optimization problems. When the inequality constraints are relaxed to SOS conditions, our idea can be summarized as follows.

  The sequential steps to solve (11)-(12): A-I
 

  • First minimize $C_0$ over $V_0$ under the constraint $S_0 \le 0$, or, more conservatively, under the corresponding SOS relaxation (the zeroth-order analogue of (5)-(6)).

    Denote the optimal $C_0$ by $C_0^{*}$ and the associated $V_0$ by $V_0^{*}$.

  • Now, let in , and then minimize over and under the constraint , or under the framework of SOS optimization,

    Using the generalized S-procedure given in Lemma 1 and the fact that

    (19)

    can be revised by incorporating one more tunable function :

    Denote the optimal by and the associated and by and , respectively.

  • Further let , , and in , and then minimize over and under the constraint . In a more tractable way, consider

    Similarly as in (s1), noticing (19) and

    the SDP problem can be revised by the generalized S-procedure to the following form:

    Denote the optimal by and the associated and by and , respectively.

    Notice that the tunable function here might differ from the corresponding tunable function in the previous step. Throughout this paper we will use the same notation for the tunable functions in the various instances of the S-procedure, to keep the notation simple.

  • The SOS-based controller design procedure is continued for higher-order terms.

 

Now, define the three series

$$ u_{\epsilon} = \sum_{n=1}^{\infty} \epsilon^{n} u_{n}^{*}(x), \qquad V_{\epsilon} = \sum_{n=0}^{\infty} \epsilon^{n} V_{n}^{*}(x), \qquad C_{\epsilon} = \sum_{n=0}^{\infty} \epsilon^{n} C_{n}^{*}. \qquad (20) $$

When all of them converge, the following statement will be true.

Theorem 1

By applying the state-feedback controller $u = u_{\epsilon}$ to the system (9), if the trajectories of the closed-loop system are bounded (in the context of long-time average cost controller design and analysis, it is actually enough to assume the boundedness of the global attractor of the system to ensure the existence of $\bar{\Phi}$), then $C_{\epsilon}$ is an upper bound of the long-time average cost $\bar{\Phi}$.

Proof. Using the algorithm A-I, we obtain

Then, it follows that

where $u_{\epsilon}$, $V_{\epsilon}$, and $C_{\epsilon}$ are given in (20). By virtue of the same analysis as in the proof of Lemma 2 (see Ph:14 ()), we can conclude that $\bar{\Phi} \le C_{\epsilon}$.
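In outline, the argument can be summarized as follows (a sketch consistent with the construction above, assuming each step of A-I certifies $S_n(x) \le 0$ for all $x$ and that $\epsilon \ge 0$): summing the certified inequalities term by term gives

$$ \sum_{n \ge 0} \epsilon^{n} S_{n}(x) \;=\; \nabla V_{\epsilon}(x) \cdot \big( f(x) + g(x)\, u_{\epsilon}(x) \big) + \Phi\big(x, u_{\epsilon}(x)\big) - C_{\epsilon} \;\le\; 0 \qquad \forall x, $$

so Lemma 2, applied to the closed-loop vector field, yields $\bar{\Phi} \le C_{\epsilon}$.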

Remark 2

After specifying the structure of the controller to be of the form (13), the non-convexity in solving the optimization problem (11)-(12) has been avoided by solving linear SDPs in sequence. During the process, all the involved decision variables are optimized sequentially, not iteratively as in other methods Zh:07 (); Zh:09 (); Ng:11 ().

Remark 3

The smallness of $\epsilon$ can be used to relax the constraints further. For instance, in the higher-order steps of A-I, in order to prove $S \le 0$ we require each $S_n$ to be non-positive with the aid of the constraints known from the previous steps, thus not using that $\epsilon$ is small. In fact, when $\epsilon$ is small, for $S$ to be negative the higher-order coefficients have to be negative only for those $x$ where the lower-order terms are small, and not for all $x$ as required in A-I. Meanwhile, checking the convergence of the series (20) would be challenging or even impractical. These points will be addressed in what follows.

4.2 Design of small-feedback controller

Next, the sequential design method A-I is revised to utilize the smallness of $\epsilon$.

  The revised sequential steps to solve (11)-(12): A-II

 

  • Same as in A-I, first solve the SOS optimization problem . Denote the optimal by and the associated by .

  • Let in , and then consider the following SDP problem:

    where the multiplier is any tunable polynomial function of $x$ of fixed degree. Denote the optimal values as before. Unlike in A-I, here the non-negativity requirement on this multiplier is not imposed. This can be understood as the non-negativity constraint being imposed, in effect, only for those $x$ at which the quantity known to be non-negative from the previous step vanishes.

  • Further let , , and in , and then consider

    where the multipliers are any tunable polynomial functions of fixed degrees; they need not be the same as in the previous step. Denote the resulting optimal values accordingly. Similarly to the previous step, here the non-negativity constraint is in effect imposed only where the quantities known to be non-negative from the previous steps vanish.

  • The revised SOS-based controller design procedure is continued for higher-order terms.

 

Since the constraints requiring the multipliers to be SOS imposed in A-I are removed in A-II, the coefficients obtained in A-II can be smaller than the coefficients obtained in A-I. This advantage comes at a price: even if all the relevant series converge for a particular value of $\epsilon$, the procedure A-II does not guarantee that the value given in (20) is an upper bound for the time-averaged cost of the closed-loop system with the controller $u_{\epsilon}$. We have now to consider (20) as asymptotic expansions rather than Taylor series. Accordingly, we have to truncate the series and hope that the resulting controller will work for sufficiently small $\epsilon$ (the series truncation here does not mean that the controller design and analysis are conducted in a non-rigorous way; the truncated controller is effective if it leads to a better, i.e. lower, bound of the long-time average cost). It is possible to prove that this is, indeed, the case.

For illustration, only the first-order truncation is considered.

Theorem 2

Consider the first-order small-feedback controller for the system (9),

$$ u = \epsilon\, u_{1}^{*}(x), \qquad (21) $$

where $\epsilon > 0$ is sufficiently small. Assume that the trajectories of the closed-loop system are bounded, and that $C_1^{*} < 0$. Then, $C_0^{*} + \epsilon C_1^{*}$ is an upper bound of the long-time average cost $\bar{\Phi}$. Clearly, $C_0^{*} + \epsilon C_1^{*} < C_0^{*}$.

Proof. Let $u = \epsilon\, u_1^{*}(x)$. By substituting it into the constraint function $S$ defined in (16), the remaining task is to seek a small $\epsilon$ such that

(22)

Notice that

(23)

where

and , being polynomial in , possess all the continuity properties implied by the proof. Let be the phase domain that interests us, where the closed-loop trajectories are all bounded. Then,

(24)

and is bounded for any and any finite (the latter following from the standard mean-value-theorem-based formula for the Lagrange remainder). By (23) and (24),

(25)

Meanwhile, consider the two inequality constraints obtained by solving and :

(26)

Define for a given constant . Clearly, as . Further define

(27)

By the second constraint in (26), . Therefore, by continuity and the fact , for any there exists a constant such that

(28)

In consequence, (23), the first constraint in (26), and (28) yield

(29)

for sufficiently small .

Next, we prove (22) for any . By the definition of the set , we have

(30)

Then, (25) and (30) yield

(31)

if is sufficiently small.

Together, (29) and (31) imply that (22) holds for all $x \in \mathbb{D}$. The proof is complete.

In practice, once the form of the controller has been specified as in (21), the upper bound and the corresponding tunable function can actually be obtained by solving the following optimization problem directly:

This problem can be further relaxed by incorporating the known constraints (26). If $\epsilon$ is set as one of the tunable variables, the SOS optimization problem becomes non-convex again, thus causing additional trouble in solving it. Alternatively, one can fix $\epsilon$ here, and investigate its effect on the upper bound by trial and error. We will follow this route in Section 5.
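The trial-and-error tuning of $\epsilon$ can be automated by simply simulating the closed-loop system for a few values of $\epsilon$ and computing the empirical time average of the cost. The sketch below does this for the same toy scalar example used earlier ($\dot{x} = x - x^3$, $\Phi = x^2 + u^2$, $u = -\epsilon x$; all of these choices are hypothetical illustrations, not the model of Section 5), assuming numpy and scipy are available.

```python
import numpy as np
from scipy.integrate import solve_ivp

def closed_loop(eps):
    # Toy dynamics xdot = f(x) + g(x) u with f = x - x^3, g = 1, u = -eps*x.
    return lambda t, x: x - x**3 - eps * x

def long_time_average_cost(eps, x0=2.0, t_final=500.0, t_transient=100.0):
    sol = solve_ivp(closed_loop(eps), (0.0, t_final), [x0],
                    dense_output=True, rtol=1e-8, atol=1e-10)
    t = np.linspace(t_transient, t_final, 20000)   # skip the initial transient
    x = sol.sol(t)[0]
    u = -eps * x
    phi = x**2 + u**2                              # running cost Phi(x, u)
    return float(np.mean(phi))                     # uniform grid: mean ~ time average

for eps in [0.0, 0.05, 0.1, 0.2]:
    print(eps, long_time_average_cost(eps))
```

For this toy example the closed-loop trajectories settle near $x = \pm\sqrt{1-\epsilon}$, so the computed average decreases for small $\epsilon$, which is the qualitative behaviour the small-feedback construction aims for.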

5 Illustrative example

As an illustrative example we consider a system proposed in Ki:05 () as a model for studying control of oscillatory vortex shedding behind a cylinder. The actuation was assumed to be achieved by a volume force applied in a compact support region downstream of the cylinder. The Karhunen-Loève (KL) decomposition No:03 () was used and the first two KL modes and an additional shift mode were selected. For the Reynolds number equal to 100 the resulting low-order Galerkin model of the cylinder flow with control was given as follows

(32)

where , and . More details on deriving the reduced-order model (32) are given in Ro:14 ().

The system (32) possesses a unique equilibrium when the control is zero, which is at the origin. Let , where . The proposed algorithms A-I and A-II were applied to (32), with the system state assumed to be available. In an experiment, the state could be estimated by designing a state observer using some sensed output measurement at a typical position Ro:14 ().

5.1 Performance of algorithm A-I

The SDP problem is solved first. It corresponds to the uncontrolled system. The minimal upper bound we could achieve was It was obtained with

Increasing the degree of $V$ cannot give a better bound because there exists a stable limit cycle in the phase space of (32) on which the long-time average of the cost equals the computed bound. Since the bound is attained on the limit cycle, the minimal upper bound achieved by the SOS optimization is tight in the sense that the difference between the bound and the actual long-time average cost is less than the prescribed computational precision.

Solving the SDP problem , where and are tunable functions, gave . Solving with being tuning functions, gave the same result: . In both cases, increasing the degrees of the tuning functions did not reduce the upper bound. The consequent SOS optimization problems, with