Long-time Average Cost Control of Polynomial Systems: A Sum-of-squares-based Small-feedback Approach

Funding from EPSRC under the grant EP/J011126/1 and support in kind from Airbus Operation Ltd., ETH Zurich (Automatic Control Laboratory), University of Michigan (Department of Mathematics), and University of California, Santa Barbara (Department of Mechanical Engineering) are gratefully acknowledged.
Abstract
The two main contributions of this paper are a proof of concept of a recent novel idea in the area of long-time average cost control, and a new method of overcoming the well-known difficulty of non-convexity of simultaneous optimization of a control law and an additional tunable function. A recently proposed method of obtaining rigorous bounds on the long-time average cost is first outlined for an uncontrolled system with polynomials of the system state on the right-hand side. In this method the polynomial constraints are relaxed to be sums of squares and formulated as semidefinite programs. It was proposed to use the upper bound on the long-time average cost as the objective function in controller design, instead of the time-average cost itself. In the present paper this suggestion is implemented for a particular system and is shown to give good results. Designing the optimal controller by this method requires optimizing simultaneously both the control law and a tunable function similar to a Lyapunov function. The new approach proposed and implemented in this paper for overcoming the inherent non-convexity of this optimization is based on a formal assumption that the amplitude of control is small. By expanding the tunable function and the bound in the small parameter, the long-time average cost is reduced by minimizing the respective bound in each term of the series. All the polynomial coefficients of the controller are derived in terms of the solvability conditions of state-dependent linear and bilinear inequalities. The resultant sum-of-squares problems are solved in sequence, thus avoiding the non-convexity in optimization. The proposed approach is applied to a simple model of oscillatory vortex shedding behind a cylinder.
keywords:
Sum of squares; Long-time average; Polynomial systems; Small feedback; Non-convexity
1 Introduction
Although global stabilization of dynamical systems is of importance in system theory and engineering Kh:02 (); An:02 (), it is sometimes difficult or impossible to synthesize a global stabilizing controller for certain linear and nonlinear systems Va:02 (). The reasons include poor controllability of the system, e.g., systems that have uncontrollable linearizations Di:11 () and systems that have fewer degrees of control freedom than degrees of freedom to be controlled Gu:13 (); Gu:14 (); input/output constraints in practice, e.g., an unstable linear time-invariant system cannot be globally stabilized in the presence of input saturation Bl:99 () or time delay Sun:11 (); Sun:13 (); and large disturbances Ki:06 (). Moreover, in many applications full stabilization, while possible, carries a high penalty due to the cost of the control, and is thus undesirable.
Instead, minimizing a long-time average of the cost functional might be more realistic. For instance, long-time-average cost analysis and control is often considered in irrigation, flood control, navigation, water supply, hydroelectric power, computer communication networks, and other applications Du:10 (); Bo:92 (). In addition, systems that include stochastic factors are often controlled in the sense of a long-time average. In Ro:83 (), a summary of long-time-average cost problems for continuous-time Markov processes is given. In Me:00 (), the long-time-average control of a class of problems that arise in the modeling of semi-active suspension systems was considered, where the cost includes a term based on the local time process of a diffusion. Notice that the controller design methods proposed in Ro:83 (); Me:00 () depend heavily on the stochastic properties of the dynamical systems.
In certain cases, as, for example, in turbulent flows of fluid, calculating the time averages is a big challenge even in the uncontrolled case. As a result, developing control aimed at reducing the time-averaged cost for turbulent flows, for example by using the receding-horizon technique, leads to controllers too complicated for practical implementation Bewley:01 (). To overcome this complexity, it was proposed in Ph:14 () to use an upper bound for the long-time average cost instead of the long-time average cost itself in cases when such an upper bound is easier to calculate. The idea is based on the hope that a control reducing an upper bound for a quantity will also reduce the quantity itself. The approach of Ph:14 () uses the sum-of-squares (SOS) decomposition of polynomials and semidefinite programming (SDP) and allows a trade-off between the quality of the bound and the complexity of its calculation.
SOS methods apply to systems defined by a polynomial vector field. Such systems may describe a wide variety of dynamics Va:01 () or approximate a system defined by an analytic vector field Va:02 (). A polynomial system can therefore yield a reliable model of a dynamical system globally or in larger regions of the state space than the linear approximation Va:03 (). Recent results on SOS decomposition have transformed the verification of non-negativity of polynomials into SDP, hence providing promising algorithmic procedures for stability analysis of polynomial systems. However, using SOS techniques for optimal control, as for example in Pr:02 (); Zh:07 (); Ma:10 (), is subject to a generic difficulty: while the problem of optimizing the candidate Lyapunov function certifying the stability of a closed-loop system for a given controller and the problem of optimizing the controller for a given candidate Lyapunov function are each reducible to an SDP, and thus tractable, the problem of simultaneously optimizing both the control and the Lyapunov function is non-convex. Iterative procedures were proposed for overcoming this difficulty Zh:07 (); Zh:09 (); Ng:11 ().
While the optimization of an upper bound under control proposed in Ph:14 () does not involve a Lyapunov function, it does involve a similar tunable function, and it shares the same difficulty of non-convexity. In the present work we propose a polynomial state-feedback controller design scheme for long-time average upper-bound control, where the controller takes the structure of an asymptotic series in a small-amplitude perturbation parameter. By fully utilizing the smallness of the perturbation parameter, the resultant SOS optimization problems are solved in sequence, thus avoiding the non-convexity in optimization. We apply the scheme to an illustrative example and demonstrate that it does allow one to reduce the long-time average cost even without fully stabilizing the system. Notice the significant conceptual difference between our approach and studies of control by small perturbations, often referred to as tiny feedback; see, for example, tc:93 ().
The paper is organized as follows. Section 2 presents preliminaries on SOS and its application to bound estimation of the long-time average cost for uncontrolled systems. Section 3 gives the problem formulation. Bound optimization of the long-time average cost for controlled polynomial systems is considered in Section 4. An illustrative example of a cylinder wake flow is addressed in Section 5. Section 6 concludes the work.
2 Background
In this section, sums of squares of polynomials and a recently proposed method of obtaining rigorous bounds on the long-time average cost via SOS for uncontrolled polynomial systems are introduced.
2.1 SOS of polynomials
SOS techniques have frequently been used in stability analysis and controller design for many kinds of systems, e.g., constrained ordinary differential equation systems An:02 (), hybrid systems An:05 (), time-delay systems An:04 (), and partial differential equation systems Pa:06 (); Yu:08 (); GC:11 (). These techniques help to overcome a common drawback of approaches based on Lyapunov functions: before Pr:02 (), there were no coherent and tractable computational methods for constructing Lyapunov functions.
A multivariate polynomial $p(\mathbf{x})$, $\mathbf{x}\in\mathbb{R}^{n}$, is a sum of squares (SOS) if there exist polynomials $q_{i}(\mathbf{x})$, $i=1,\dots,m$, such that
$$p(\mathbf{x})=\sum_{i=1}^{m}q_{i}^{2}(\mathbf{x}).$$
If $p(\mathbf{x})$ is an SOS then $p(\mathbf{x})\ge 0$ for all $\mathbf{x}$. In the general multivariate case, however, $p(\mathbf{x})\ge 0$ does not necessarily imply that $p(\mathbf{x})$ is an SOS. While being stricter, the condition that $p(\mathbf{x})$ is an SOS is much more computationally tractable than non-negativity Par:00 (). At the same time, practical experience indicates that in many cases replacing non-negativity with the SOS property leads to satisfactory results.
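As a minimal illustration (our own assumed example, not from the paper), the SOS property of a polynomial $p$ can be certified by exhibiting a positive-semidefinite Gram matrix $Q$ with $p(x)=z^{T}Qz$ for a vector of monomials $z$; SDP solvers search over such $Q$. The sketch below checks one hand-picked certificate.

```python
import numpy as np

# Assumed example: p(x) = x^4 + 4x^3 + 6x^2 + 4x + 1 = (x^2 + 2x + 1)^2.
# p is SOS iff p(x) = z^T Q z for z = [1, x, x^2] and some PSD matrix Q.
Q = np.outer([1.0, 2.0, 1.0], [1.0, 2.0, 1.0])  # rank-1 Gram matrix certificate

def p(x):
    return x**4 + 4*x**3 + 6*x**2 + 4*x + 1

# The certificate is positive semidefinite ...
assert np.linalg.eigvalsh(Q).min() >= -1e-9
# ... and z^T Q z reproduces p at sample points, so p is an SOS.
for x in (-2.0, -1.0, 0.0, 0.5, 3.0):
    z = np.array([1.0, x, x**2])
    assert abs(z @ Q @ z - p(x)) < 1e-9
print("p admits the SOS certificate (x^2 + 2x + 1)^2")
```

In practice the Gram matrix is not guessed by hand: the software referenced below searches for it by semidefinite programming.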
In the present paper we will utilize the existence of efficient numerical methods and software Pra:04 (); Lo:09 () for solving optimization problems of the following type: minimize the linear objective function
$$\mathbf{w}_{\mathbf{c}}^{T}\mathbf{w}, \qquad (1)$$
where $\mathbf{w}_{\mathbf{c}}$ is the vector of weighting coefficients for the linear objective function, and $\mathbf{w}$ is a vector formed from the (unknown) coefficients of the polynomials $p_{i}(\mathbf{x})$ for $i=1,\dots,\hat{N}$ and SOS polynomials $p_{i}(\mathbf{x})$ for $i=\hat{N}+1,\dots,N$, such that
$$a_{0,j}(\mathbf{x})+\sum_{i=1}^{N}p_{i}(\mathbf{x})\,a_{i,j}(\mathbf{x})=0, \qquad j=1,\dots,\hat{J}, \qquad (2)$$
$$a_{0,j}(\mathbf{x})+\sum_{i=1}^{N}p_{i}(\mathbf{x})\,a_{i,j}(\mathbf{x})\ \ \text{are SOS}, \qquad j=\hat{J}+1,\dots,J. \qquad (3)$$
In (2) and (3), the $a_{i,j}(\mathbf{x})$ are given scalar constant-coefficient polynomials.
The lemma below, which provides a sufficient condition to test inclusions of sets defined by polynomials, is frequently used for feedback controller design in Section 4. It is a particular case of the Positivstellensatz theorem Po:99 () and is a generalized S-procedure Ta:06 ().
Lemma 1
Consider two sets of $\mathbf{x}\in\mathbb{R}^{n}$,
$$\mathcal{S}_{1}\triangleq\{\mathbf{x}\ |\ g(\mathbf{x})=0,\ h_{1}(\mathbf{x})\ge 0,\dots,h_{m}(\mathbf{x})\ge 0\},\qquad \mathcal{S}_{2}\triangleq\{\mathbf{x}\ |\ f(\mathbf{x})\ge 0\},$$
where $g$, $h_{i}$, and $f$ are scalar polynomial functions. The set containment $\mathcal{S}_{1}\subseteq\mathcal{S}_{2}$ holds if there exist a polynomial function $t(\mathbf{x})$ and SOS polynomial functions $s_{1}(\mathbf{x}),\dots,s_{m}(\mathbf{x})$ such that
$$f(\mathbf{x})-t(\mathbf{x})g(\mathbf{x})-\sum_{i=1}^{m}s_{i}(\mathbf{x})h_{i}(\mathbf{x})\ \ \text{is SOS}.$$
2.2 Bound estimation of long-time average cost for uncontrolled systems
For the convenience of the reader, we outline here the method of obtaining bounds for long-time averages proposed in Ph:14 () and make some remarks on it. Consider the system
$$\dot{\mathbf{x}}=\mathbf{f}(\mathbf{x}), \qquad (4)$$
where $\mathbf{x}\in\mathbb{R}^{n}$ and $\mathbf{f}(\mathbf{x})$ is a vector of multivariate polynomials of the components of the state vector $\mathbf{x}$. The long-time average of a function of the state $\Phi(\mathbf{x})$ is defined as
$$\bar{\Phi}=\limsup_{T\to\infty}\frac{1}{T}\int_{0}^{T}\Phi(\mathbf{x}(t))\,dt,$$
where $\mathbf{x}(t)$ is the solution of (4).
Define a polynomial function of the system state, $V(\mathbf{x})$, of fixed degree and containing unknown decision variables as its coefficients. The time derivative of $V$ along the trajectories of system (4) is
$$\dot{V}=\nabla V\cdot\mathbf{f}(\mathbf{x}).$$
Consider the following quantity:
$$F(\mathbf{x})\triangleq\nabla V\cdot\mathbf{f}(\mathbf{x})+\Phi(\mathbf{x})-C,$$
where $C$ is a constant.
The following result is from Ph:14 ():
Lemma 2
For the system (4), assume that the state $\mathbf{x}(t)$ is bounded. Then, $F(\mathbf{x})\le 0\ \ \forall\,\mathbf{x}$ implies $\bar{\Phi}\le C$.
Hence, an upper bound of $\bar{\Phi}$ can be obtained by minimizing $C$ over $V$ under the constraint $F(\mathbf{x})\le 0\ \forall\,\mathbf{x}$, which can be formulated as an SOS optimization problem of the form
$$\min_{V}\ C \qquad (5)$$
$$\text{such that}\quad -\left(\nabla V\cdot\mathbf{f}(\mathbf{x})+\Phi(\mathbf{x})-C\right)\ \text{is SOS}, \qquad (6)$$
which is a special case of (1). A better bound might be obtained by removing the requirement for $V$ to be a polynomial and replacing (6) with a requirement of non-negativity. However, the resulting problem would be too difficult, since the classical algebraic-geometry problem of verifying positive-definiteness of a general multivariate polynomial is NP-hard An:02 (); An:05 ().
Notice that while $V(\mathbf{x})$ is similar to a Lyapunov function in a stability analysis, it is not required to be positive-definite. Notice also that a lower bound of any long-time average cost of the system (4) can be obtained in a similar way.
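To make Lemma 2 concrete, here is a toy one-dimensional instance (our own assumed example, not from the paper): for $\dot x = x - x^{3}$ with $\Phi = x^{2}$, the choice $V = x^{2}/2$ and $C=1$ gives $\dot V + \Phi - C = -(x^{2}-1)^{2}\le 0$, so $\bar{\Phi}\le 1$. The sketch below checks the inequality on a grid and confirms by simulation that the bound is attained on the attractor $x=\pm 1$.

```python
# Toy instance of Lemma 2 (assumed illustration, not from the paper):
# system  dx/dt = x - x^3,  cost  Phi(x) = x^2,  candidate  V(x) = x^2 / 2.
# Then dV/dt + Phi - C = x^2 - x^4 + x^2 - 1 = -(x^2 - 1)^2 <= 0 for C = 1,
# so Lemma 2 gives the bound  avg(Phi) <= 1.

def F(x, C=1.0):
    # F = dV/dt along trajectories + Phi - C, with dV/dt = x * (x - x^3)
    return (x - x**3) * x + x**2 - C

# The constraint F <= 0 holds for all x (checked here on a grid).
assert all(F(-3 + 0.01 * k) <= 1e-12 for k in range(601))

# Forward-Euler simulation: trajectories converge to x = +/-1, where Phi = 1,
# so the long-time average of Phi approaches the bound C = 1 (a tight bound).
x, dt, total, n = 0.3, 1e-3, 0.0, 200_000
for _ in range(n):
    x += dt * (x - x**3)
    total += x**2 * dt
avg = total / (n * dt)
assert avg <= 1.0 + 1e-6
print(f"time average of Phi ~ {avg:.4f}, bound C = 1")
```

Here the SOS relaxation loses nothing: $-(x^{2}-1)^{2}$ is minus a perfect square, so $-F$ is itself an SOS certificate.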
Remark 1
For many systems the boundedness of the system state immediately follows from energy considerations. In general, if the system state is bounded, this can often be proven using the SOS approach. It suffices to check whether there exists a large but bounded global attractor, denoted by $\mathcal{D}$. As an example, let $\mathcal{D}=\{\mathbf{x}\ |\ |\mathbf{x}|^{2}\le\beta\}$, where the constant $\beta$ is sufficiently large. Then, the global attraction property of the system in $\mathcal{D}$ may be expressed as
$$\nabla V_{a}\cdot\mathbf{f}(\mathbf{x})<0 \quad \forall\,\mathbf{x}\ \text{such that}\ |\mathbf{x}|^{2}\ge\beta, \qquad (7)$$
for some tunable polynomial $V_{a}(\mathbf{x})$. Introducing a tunable polynomial $s(\mathbf{x})$ satisfying an SOS constraint, by Lemma 1, (7) can be relaxed to
$$-\nabla V_{a}\cdot\mathbf{f}(\mathbf{x})-s(\mathbf{x})\left(|\mathbf{x}|^{2}-\beta\right)\ \ \text{is SOS}. \qquad (8)$$
Minimization of the upper bound of the long-time average cost for systems that have an unbounded global attractor is usually meaningless, since the cost itself could be infinitely large.
3 Problem Formulation
Consider a polynomial system with a single input,
$$\dot{\mathbf{x}}=\mathbf{f}(\mathbf{x})+\mathbf{g}(\mathbf{x})u, \qquad (9)$$
where $\mathbf{f}(\mathbf{x})$ and $\mathbf{g}(\mathbf{x})$ are polynomial functions of the system state $\mathbf{x}\in\mathbb{R}^{n}$. The approach of this paper can easily be extended to multiple-input systems. The control $u$, which is assumed to be a polynomial function of the system state with a fixed maximum degree, is designed to minimize the upper bound of an average cost of the form
$$\bar{\Phi}=\limsup_{T\to\infty}\frac{1}{T}\int_{0}^{T}\Phi(\mathbf{x}(t),u(t))\,dt, \qquad (10)$$
where $\mathbf{x}(t)$ is the closed-loop solution of the system (9) with the control $u$. The continuous function $\Phi(\mathbf{x},u)$ is a given non-negative polynomial cost in $\mathbf{x}$ and $u$.
Similarly to (5)-(6), we consider the following optimization problem:
$$\min_{u,V}\ C \qquad (11)$$
$$\text{such that}\quad -\left(\nabla V\cdot\left(\mathbf{f}(\mathbf{x})+\mathbf{g}(\mathbf{x})u\right)+\Phi(\mathbf{x},u)-C\right)\ \text{is SOS}. \qquad (12)$$
When it cannot be guaranteed that the closed-loop system state is bounded, SOS constraints of the form (8) must be added to (12) to make the analysis rigorous.

Within the framework of SOS optimization, the main problem in solving (11)-(12) is the non-convexity of (12), caused by the control input $u$ and the decision function $V$, both of which are tunable, entering (12) nonlinearly. Iterative methods Zh:07 (); Zh:09 (); Ng:11 () may help to overcome this issue indirectly in the following way: first fix one subset of the bilinear decision variables and solve the resulting linear inequalities in the other decision variables; in the next step, the other bilinear decision variables are fixed and the procedure is repeated. For the particular long-time average cost control problem (11)-(12), the non-convexity is resolved in what follows by considering a type of so-called small-feedback controller. In this way, iterative updating of decision variables is avoided, being replaced by solving a sequence of SOS optimization problems.
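The bilinearity can be seen in one line of symbolic algebra. The sketch below uses a toy one-dimensional plant of our own choosing (not the paper's model): with both the controller gain and the coefficient of $V$ unknown, their product appears in the constraint polynomial, which is what destroys joint convexity.

```python
import sympy as sp

# Toy illustration of why (11)-(12) is nonconvex (assumed 1D example, not the
# paper's system): plant dx/dt = x - x^3 + u, controller u = k*x, tunable
# V = a*x^2, with both k and a decision variables.
x, a, k, C = sp.symbols('x a k C')
u = k * x
V = a * x**2
F = sp.diff(V, x) * (x - x**3 + u) + x**2 - C  # dV/dt + Phi - C, Phi = x^2

# Collecting in x exposes the product a*k of two decision variables:
# the constraint F <= 0 is bilinear, hence not jointly convex in (a, k).
coeff_x2 = sp.expand(F).coeff(x, 2)
assert sp.simplify(coeff_x2 - (2*a + 2*a*k + 1)) == 0
print("coefficient of x^2:", coeff_x2)
```

Fixing either `a` or `k` makes the constraint linear in the remaining unknowns, which is exactly what the iterative methods cited above exploit.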
4 Bound optimization of long-time average cost for controlled polynomial systems
In this section a small-feedback controller is designed to reduce the upper bound of the long-time average cost (10) for the controlled polynomial system (9). It is reasonable to hope that a controller reducing the upper bound for the time-averaged cost will also reduce the time-averaged cost itself Ph:14 ().
4.1 Basic formalism of the controller design
We will look for a controller in the form
$$u=\sum_{k=1}^{\infty}\varepsilon^{k}u_{k}(\mathbf{x}), \qquad (13)$$
where $\varepsilon$ is a parameter and the $u_{k}(\mathbf{x})$ are polynomial functions of the system state $\mathbf{x}$. In other words, we seek a family of controllers parameterised by $\varepsilon$ in the form of a Taylor series in $\varepsilon$. Notice that the expansion starts at the first-order term, so that $\varepsilon=0$ gives the uncontrolled system. To resolve the non-convexity problem of SOS optimization, we expand $V$ and $C$ in $\varepsilon$:
$$V=\sum_{k=0}^{\infty}\varepsilon^{k}V_{k}(\mathbf{x}), \qquad (14)$$
$$C=\sum_{k=0}^{\infty}\varepsilon^{k}C_{k}, \qquad (15)$$
where $V_{k}$ and $C_{k}$ are the Taylor-series coefficients of the tunable function and the bound, respectively, in the $k$th-order term in $\varepsilon$. Define
$$F\triangleq\nabla V\cdot\left(\mathbf{f}(\mathbf{x})+\mathbf{g}(\mathbf{x})u\right)+\Phi(\mathbf{x},u)-C. \qquad (16)$$
Substituting (13), (14), and (15) into (16) and expanding $\Phi(\mathbf{x},u)$ in powers of $\varepsilon$, it follows that
$$F=\sum_{k=0}^{\infty}\varepsilon^{k}F_{k}, \qquad (17)$$
where
$$F_{k}=\nabla V_{k}\cdot\mathbf{f}+\sum_{j=1}^{k}\nabla V_{k-j}\cdot\mathbf{g}\,u_{j}+\Phi_{k}-C_{k}. \qquad (18)$$
In (18), $\Phi_{k}$ denotes the $k$th coefficient of the expansion of $\Phi(\mathbf{x},u)$ in $\varepsilon$, which is expressed through the partial derivatives of $\Phi$ with respect to $u$ evaluated at $u=0$.
Expression (17) becomes clearer when a specific cost function is considered. For instance, let $\Phi(\mathbf{x},u)=\Phi_{\mathbf{x}}(\mathbf{x})+u^{2}$. Then,
$$F=F_{0}+\varepsilon F_{1}+\varepsilon^{2}F_{2}+O(\varepsilon^{3}),$$
where
$$F_{0}=\nabla V_{0}\cdot\mathbf{f}+\Phi_{\mathbf{x}}-C_{0},$$
$$F_{1}=\nabla V_{1}\cdot\mathbf{f}+\nabla V_{0}\cdot\mathbf{g}\,u_{1}-C_{1},$$
$$F_{2}=\nabla V_{2}\cdot\mathbf{f}+\nabla V_{1}\cdot\mathbf{g}\,u_{1}+\nabla V_{0}\cdot\mathbf{g}\,u_{2}+u_{1}^{2}-C_{2},$$
and $O(\varepsilon^{3})$ denotes all the terms of order three and higher in $\varepsilon$.
It is clear that $F\le 0$ holds if $F_{k}\le 0$ for all $k\ge 0$ simultaneously and the series (13)-(15) converge. Notice that $F_{k}$ includes the tunable functions and constants $u_{1},\dots,u_{k}$, $V_{0},\dots,V_{k}$, and $C_{0},\dots,C_{k}$. For any nonnegative integers $m<k$, the tunable variables in $F_{m}$ are always a subset of the tunable variables in $F_{k}$. Hence (11)-(12) can be solved as a sequence of convex optimization problems. When the inequality constraints are relaxed to SOS conditions, our idea can be summarized as follows.
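The triangular structure of the expansion, where each order adds new unknowns while reusing the already-fixed lower-order ones, can be sketched symbolically. The dynamics below are an assumed toy example, not system (32), and only the first two orders are collected.

```python
import sympy as sp

# Toy illustration (assumed, not the paper's model) of the series bookkeeping:
# expand F = dV/dt + Phi - C in powers of eps when
#   u = eps*u1 + eps^2*u2,  V = V0 + eps*V1,  C = c0 + eps*c1 + eps^2*c2.
x, eps = sp.symbols('x eps')
u1, u2 = sp.Function('u1')(x), sp.Function('u2')(x)
V0, V1 = sp.Function('V0')(x), sp.Function('V1')(x)
c0, c1, c2 = sp.symbols('c0 c1 c2')

f = x - x**3                      # uncontrolled dynamics (toy choice)
u = eps*u1 + eps**2*u2
V = V0 + eps*V1
F = sp.diff(V, x)*(f + u) + x**2 - (c0 + eps*c1 + eps**2*c2)

Fe = sp.expand(F)
F0 = Fe.coeff(eps, 0)   # involves V0 and c0 only
F1 = Fe.coeff(eps, 1)   # adds u1, V1, c1; lower-order unknowns reappear

# F0 = V0'*f + x^2 - c0 ;  F1 = V0'*u1 + V1'*f - c1  (triangular structure)
assert sp.simplify(F0 - (sp.diff(V0, x)*f + x**2 - c0)) == 0
assert sp.simplify(F1 - (sp.diff(V0, x)*u1 + sp.diff(V1, x)*f - c1)) == 0
print("F1 =", sp.expand(F1))
```

Once $V_{0}$ and $C_{0}$ are fixed by the zeroth-order problem, $F_{1}$ is linear in the remaining unknowns $u_{1}$, $V_{1}$, $C_{1}$, which is why each step below is a convex SDP.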

First minimize $C_{0}$ over $V_{0}$ under the constraint $F_{0}\le 0$, or, more conservatively, under the constraint that $-F_{0}$ is SOS.
Denote the optimal $C_{0}$ by $C_{0}^{*}$ and the associated $V_{0}$ by $V_{0}^{*}$.

Now, let $V_{0}=V_{0}^{*}$ and $C_{0}=C_{0}^{*}$ in $F_{1}$, and then minimize $C_{1}$ over $u_{1}$ and $V_{1}$ under the constraint $F_{1}\le 0$, or, in the framework of SOS optimization, under the constraint that $-F_{1}$ is SOS (19).
Using the generalized S-procedure given in Lemma 1 and the fact that $-F_{0}^{*}\ge 0$,
(19) can be revised by incorporating one more tunable SOS function $s_{1}(\mathbf{x})$ into the constraint.
Denote the optimal $C_{1}$ by $C_{1}^{*}$ and the associated $u_{1}$ and $V_{1}$ by $u_{1}^{*}$ and $V_{1}^{*}$, respectively.

Further, let $V_{0}=V_{0}^{*}$, $C_{0}=C_{0}^{*}$, $u_{1}=u_{1}^{*}$, $V_{1}=V_{1}^{*}$, and $C_{1}=C_{1}^{*}$ in $F_{2}$, and then minimize $C_{2}$ over $u_{2}$ and $V_{2}$ under the constraint $F_{2}\le 0$ or, in a more tractable way, under the constraint that $-F_{2}$ is SOS.
Similarly to (s1), using (19) and the constraints already established at the lower orders,
the SDP problem can be revised by the generalized S-procedure, incorporating additional tunable SOS multiplier functions.
Denote the optimal $C_{2}$ by $C_{2}^{*}$ and the associated $u_{2}$ and $V_{2}$ by $u_{2}^{*}$ and $V_{2}^{*}$, respectively.
Notice that the multiplier $s_{1}$ here might differ from the tunable function $s_{1}$ in the previous step. Throughout this paper we use the same notation for tunable functions such as $s_{1}$ in various instances of the procedure, to keep the notation simple.

The SOS-based controller design procedure is continued for higher-order terms.
Now, define the three series
$$u^{*}=\sum_{k=1}^{\infty}\varepsilon^{k}u_{k}^{*}(\mathbf{x}),\qquad V^{*}=\sum_{k=0}^{\infty}\varepsilon^{k}V_{k}^{*}(\mathbf{x}),\qquad C^{*}=\sum_{k=0}^{\infty}\varepsilon^{k}C_{k}^{*}. \qquad (20)$$
When all of them converge, the following statement will be true.
Theorem 1
By applying the state-feedback controller $u=u^{*}$ to the system (9), if the trajectories of the closed-loop system are bounded (in the context of long-time average cost controller design and analysis, it is actually enough to assume the boundedness of the global attractor of the system to ensure the existence of $\bar{\Phi}$), then $C^{*}$ is an upper bound of the long-time average cost $\bar{\Phi}$.
Proof. Using algorithm AI, we obtain $F_{k}^{*}\le 0$ for all $k\ge 0$. Then, it follows that
$$\nabla V^{*}\cdot\left(\mathbf{f}+\mathbf{g}u^{*}\right)+\Phi-C^{*}=\sum_{k=0}^{\infty}\varepsilon^{k}F_{k}^{*}\le 0,$$
where $u^{*}$, $V^{*}$, and $C^{*}$ are given in (20). By the same analysis as in the proof of Lemma 2 (see Ph:14 ()), we conclude that $\bar{\Phi}\le C^{*}$.
Remark 2
After specifying the structure of the controller to be of the form (13), the non-convexity in solving the optimization problem (11)-(12) has been avoided by solving linear SDPs in sequence. In the process, all the decision variables involved are optimized sequentially, not iteratively as in other methods Zh:07 (); Zh:09 (); Ng:11 ().
Remark 3
The smallness of $\varepsilon$ can be used to relax the constraints further. For instance, in the first-order step above, in order to prove $F\le 0$ we prove $F_{1}\le 0$ with the aid of the known constraint $F_{0}^{*}\le 0$, thus not using that $\varepsilon$ is small. In fact, when $\varepsilon$ is small, for $F$ to be negative $F_{1}$ has to be negative only for those $\mathbf{x}$ where $F_{0}^{*}$ is small, and not for all $\mathbf{x}$ as required above. Meanwhile, checking the convergence of the series (20) would be challenging or even impractical. These points are addressed in what follows.
4.2 Design of small-feedback controller
Next, the sequential design method AI is revised to utilize the smallness of $\varepsilon$.

As in AI, first solve the zeroth-order SOS optimization problem. Denote the optimal $C_{0}$ by $C_{0}^{*}$ and the associated $V_{0}$ by $V_{0}^{*}$.

Let $V_{0}=V_{0}^{*}$ and $C_{0}=C_{0}^{*}$ in $F_{1}$, and then consider the following SDP problem: minimize $C_{1}$ over $u_{1}$ and $V_{1}$ subject to $-F_{1}-s_{1}F_{0}^{*}$ being SOS,
where $s_{1}(\mathbf{x})$ is any tunable polynomial function of $\mathbf{x}$ of fixed degree. Denote the optimal $C_{1}$ by $C_{1}^{*}$ and the associated $u_{1}$ and $V_{1}$ by $u_{1}^{*}$ and $V_{1}^{*}$, respectively. Unlike in AI, here the non-negativity requirement on $s_{1}$ is not imposed. This can be understood as the non-negativity constraint on $-F_{1}$ being imposed only for those $\mathbf{x}$ where $F_{0}^{*}(\mathbf{x})=0$.

Further, let $V_{0}=V_{0}^{*}$, $C_{0}=C_{0}^{*}$, $u_{1}=u_{1}^{*}$, $V_{1}=V_{1}^{*}$, and $C_{1}=C_{1}^{*}$ in $F_{2}$, and then consider minimizing $C_{2}$ over $u_{2}$ and $V_{2}$ subject to $-F_{2}-s_{1}F_{0}^{*}-s_{2}F_{1}^{*}$ being SOS,
where $s_{1}(\mathbf{x})$ and $s_{2}(\mathbf{x})$ are any tunable polynomial functions of fixed degrees. The function $s_{1}$ here need not be the same as in the previous step. Denote the optimal $C_{2}$ by $C_{2}^{*}$ and the associated $u_{2}$ and $V_{2}$ by $u_{2}^{*}$ and $V_{2}^{*}$, respectively. Similarly to the previous step, here the non-negativity constraint is in effect imposed only where $F_{0}^{*}=0$ and $F_{1}^{*}=0$.

The revised SOS-based controller design procedure is continued for higher-order terms.
Since the constraints of the multipliers being SOS imposed in AI are removed in AII, the coefficients $C_{k}^{*}$ obtained in AII can be smaller than those obtained in AI. This advantage comes at a price: even if all the relevant series converge for a particular value of $\varepsilon$, the procedure AII does not guarantee that the value $C^{*}$ given in (20) is an upper bound for the time-averaged cost of the closed-loop system with the controller $u^{*}$. We now have to consider (20) as asymptotic expansions rather than Taylor series. Accordingly, we have to truncate the series and hope that the resulting controller will work for sufficiently small $\varepsilon$ (it is worth noticing that the series truncation here does not mean that our controller design and analysis are conducted in a non-rigorous way: the truncated controller is effective if it leads to a better, i.e. lower, bound of the long-time average cost). It is possible to prove that this is, indeed, the case.
For illustration, only the first-order truncation is considered.
Theorem 2
Consider the first-order small-feedback controller for the system (9),
$$u=\varepsilon u_{1}^{*}(\mathbf{x}), \qquad (21)$$
where $\varepsilon>0$ is sufficiently small. Assume that the trajectories of the closed-loop system are bounded and that $C_{1}^{*}<0$. Then, $C_{0}^{*}+\varepsilon C_{1}^{*}$ is an upper bound of the long-time average cost $\bar{\Phi}$. Clearly, $C_{0}^{*}+\varepsilon C_{1}^{*}<C_{0}^{*}$.
Proof. Let $u=\varepsilon u_{1}^{*}$. By substituting it into the constraint function $F$ defined in (16), the remaining task is to seek a small $\varepsilon$ such that
(22) 
Notice that
(23) 
where
and , being polynomial in , possess all the continuity properties implied by the proof. Let the phase domain of interest be the region where the closed-loop trajectories are all bounded. Then,
(24) 
and is bounded for any and any finite (the latter following from the standard mean-value-theorem-based formula for the Lagrange remainder). By (23) and (24),
(25) 
Meanwhile, consider the two inequality constraints obtained by solving the zeroth- and first-order problems of AII:
(26) 
Define for a given constant . Clearly, as . Further define
(27) 
By the second constraint in (26), . Therefore, by continuity and the fact , for any there exists a constant such that
(28) 
In consequence, (23), the first constraint in (26), and (28) render to
(29)  
for sufficiently small .
Next, we prove (22) for any . By the definition of the set , we have
(30) 
(31) 
if is sufficiently small.
In practice, once the form of the controller has been specified in (21), the upper bound and the corresponding tunable function can actually be obtained by solving the following optimization problem directly:
This problem can be further relaxed by incorporating the known constraints (26). If $\varepsilon$ is set as one of the tunable variables, the SOS optimization problem becomes non-convex again, thus causing additional difficulty in solving it. Alternatively, one can fix $\varepsilon$, and investigate its effect on the upper bound of $\bar{\Phi}$ by trial and error. We will follow this route in Section 5.
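As a sanity check of the small-feedback idea on an assumed toy example (not the cylinder-wake model of Section 5), the sketch below simulates $\dot x = x - x^{3} + u$ with $u=-\varepsilon x$ and $\Phi=x^{2}$ (no control penalty, for simplicity), and confirms that a small $\varepsilon$ lowers the long-time average cost without stabilizing the origin.

```python
# Toy demonstration (assumed 1D example, not the paper's system) that a small
# feedback u = -eps*x lowers the long-time average of Phi = x^2 for
# dx/dt = x - x^3 + u: the attractor moves from x^2 = 1 to x^2 = 1 - eps.

def long_time_average(eps, x0=0.5, dt=1e-3, n=200_000):
    """Forward-Euler time average of x^2 under the closed-loop dynamics."""
    x, total = x0, 0.0
    for _ in range(n):
        x += dt * (x - x**3 - eps * x)
        total += x**2 * dt
    return total / (n * dt)

avg0 = long_time_average(0.0)   # uncontrolled: average near 1
avg1 = long_time_average(0.1)   # small feedback: average near 1 - eps = 0.9
assert avg1 < avg0              # the small feedback reduces the average cost
print(f"eps=0:   avg ~ {avg0:.3f}")
print(f"eps=0.1: avg ~ {avg1:.3f}")
```

Note that for any $\varepsilon<1$ the origin stays unstable, matching the theme of the paper: the average cost decreases without full stabilization.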
5 Illustrative example
As an illustrative example we consider a system proposed in Ki:05 () as a model for studying control of oscillatory vortex shedding behind a cylinder. The actuation was assumed to be achieved by a volume force applied in a compact support region downstream of the cylinder. The Karhunen-Loève (KL) decomposition No:03 () was used, and the first two KL modes and an additional shift mode were selected. For a Reynolds number equal to 100, the resulting low-order Galerkin model of the cylinder flow with control is
(32)
where the coefficients are constants. More details on deriving the reduced-order model (32) are given in Ro:14 ().
The system (32) possesses a unique equilibrium when $u=0$, which is at the origin. Let , where . The proposed algorithms AI and AII were applied to (32), with the system state assumed to be available. In an experiment, it could be estimated by designing a state observer using sensed output measurements at a typical position Ro:14 ().
5.1 Performance of algorithm AI
The zeroth-order SDP problem is solved first. It corresponds to the uncontrolled system. The minimal upper bound we could achieve was . It was obtained with
Increasing the degree of $V_{0}$ cannot give a better bound, because there exists a stable limit cycle in the phase space of (32), on which the cost attains the bound. Since the bound cannot be smaller than the average of the cost over the limit cycle, the minimal upper bound achieved by the SOS optimization is tight, in the sense that the difference between the bound and the long-time average cost on the limit cycle is less than the prescribed precision.
Solving the first-order SDP problem, where and are tunable functions, gave . Solving with being tunable functions gave the same result: . In both cases, increasing the degrees of the tunable functions did not reduce the upper bound. The subsequent SOS optimization problems, with