# An optimal control approach to the design of periodic orbits for mechanical systems with impacts^{1}^{1}footnotemark: 1

###### Abstract

In this paper we study the problem of designing periodic orbits for a special class of hybrid systems, namely mechanical systems with underactuated continuous dynamics and impulse events. We approach the problem by means of optimal control. Specifically, we design an optimal control based strategy that combines trajectory optimization, dynamics embedding, optimal control relaxation and root finding techniques. The proposed strategy allows us to design, in a numerically stable manner, trajectories that optimize a desired cost and satisfy boundary state constraints consistent with a periodic orbit. To show the effectiveness of the proposed strategy, we perform numerical computations on a compass biped model with torso.

###### keywords:

Nonlinear optimal control, Trajectory generation, Hybrid systems, Biped walking^{†}

^{†}journal: Nonlinear Analysis: Hybrid Systems

^{ecr}

^{ecr}footnotetext: This result is part of a project that has received funding from the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation programme (grant agreement No 638992 - OPT4SMART).

## 1 Introduction

Hybrid systems, involving both continuous and discrete dynamics, arise naturally in a number of engineering applications. In particular, many robotics tasks, such as legged locomotion, (multi-finger) manipulation and load transportation with aerial vehicles, can be modeled as mechanical systems with impacts, a particular class of hybrid systems. These classes of systems experience continuous dynamics until an interaction with the surrounding environment (i.e., an impact) occurs, thus causing a state discontinuity due to impulsive forces and (possibly) a switch between different dynamics.

In this paper we concentrate on a particular aspect that has important implications both for modeling and control of robots, namely trajectory design.

#### Motivations

The trajectory design task has gained a lot of attention both as a preliminary task for control and as a tool to explore and understand the capabilities of the system egerstedt2001optimal (),shaikh2003optimal () ( see betts1998survey () for an earlier reference). Optimization techniques allow us to design reference trajectories for robot controllers by minimizing a given cost function. Typical cost functions are (i) the distance from a desired state-input curve (which does not satisfy the dynamics) and (ii) the energy injected into the system. For example, for humanoid robot design, the distance from a desired (but unfeasible) human-like walking pattern (vasudevan2013persistent (), IL-OK-JHP:14 ()) is often considered. Additional challenges arise when the trajectory generation problem is addressed for (underactuated) mechanical systems with impacts. The impact events complicate the trajectory optimization problem since discontinuous changes in the state occur. For some systems, the underactuation and the instability of the continuous dynamics render the problem even more challenging. The optimal control theory offers powerfull tools to deal with trajectory generation problems for hybrid dynamical systems.

#### Literature review

The literature on optimal control of hybrid systems is quite vast, thus we report only two sets of contributions relevant for our work.

Fist, a general overview on recent advances for optimal control of hybrid systems is presented. A recent survey, focusing in particular on switched systems, is zhu2015optimal (). In baotic2006constrained () the authors propose optimal control algorithms for discrete-time linear hybrid systems which “combine a dynamic programming exploration strategy with multiparametric linear programming and basic polyhedral manipulation”. In shaikh2007hybrid () a set of necessary conditions is formulated and optimization algorithms are presented for optimal control of hybrid systems with continuous nonlinear dynamics and with autonomous and controlled switchings. The algorithm described in passenberg2013optimal () (for hybrid systems with partitioned state space and autonomous switching) is based on a version of the minimum principle for hybrid systems providing optimality conditions for intersections and corners of switching manifolds, and thus avoiding the combinatorial complexity of other algorithms (e.g., shaikh2007hybrid ()). Furthermore, the work in wardi2014switched () deals with optimal mode-scheduling via a gradient descent algorithm for the particular class of autonomous switched-mode hybrid dynamical systems. In riedinger2014lq () and corona2014stabilization () the design of switching laws for switched systems with linear dynamics, based on the optimization of a quadratic criterion, is addressed.

Second, we focus on trajectory optimization for the particular class of mechanical systems with impacts. Besides results regarding systems with impulsive controls, e.g. huang2012lq (), long2011optimal (), we focus on the trajectory design for systems controlled by inputs of the continuous dynamics. The majority of works adopt parametric optimization methods, as, e.g., DD-CC-YA:05 (), batz2010ball (), pchelkin2015algorithms (). This means that trajectories are approximated as, for example, classical, trigonometric or Bézier polynomials and the optimization is performed with polynomial coefficients as decision variables. In other works, e.g. liu2013biped (), the optimization problem is addressed via dynamics equation discretization and the optimal periodic trajectory is computed by means of an approximated cost function. Available software toolboxes are used to solve trajectory generation problems addressed in DD-CC-YA:05 (), batz2010ball (), pchelkin2015algorithms () and liu2013biped (). Recently, the optimization over jump times and/or mode sequence, as e.g., in nakanishi2013spatio () and posa2014direct (), has been considered. In nakanishi2013spatio () the instants of jump are optimized, but the sequence of continuous dynamics modes is fixed. In posa2014direct () the mode sequence is also optimized by sequential quadratic programming. Focusing in particular on the trajectory generation problem for legged robots (the major robotic application regarding mechanical systems with impacts), challenges arising when dealing with underactuated robots are addressed in sugianto2013motion () and erez2012trajectory () by considering an additional “virtual” input acting on the unactuated degrees of freedom. Then, an optimal approximated trajectory of the underactuated dynamics is computed by dynamics inversion. Furthermore, online trajectory optimization is addressed in tassa2012synthesis () through a method based on iterative LQG and in PLH-AS-LF-UM-SG:13 () trajectories are designed by using an auxiliary system of differential equations and then stabilizing the generated curves through a feedback on the target system.

#### Contributions

As main contribution of this paper, we develop an optimal control based strategy to design periodic orbits for a class of hybrid dynamical systems with impacts. First, we formulate an optimal control problem with ad-hoc boundary conditions. These ones are provided by studying how the initial conditions and the jump conditions are related in a periodic orbit with one jump per period. Second, instead of using available softwares, we develop an ad-hoc strategy based on the combination of trajectory functional optimization with three main tools: dynamics embedding, constraint relaxation and zero finding techniques. These tools enable us to (i) deal with the undeartuated nature of dynamics, (ii) consider highly non trivial constraints and (iii) avoid the tedious search for a suitable “initial (guess) trajectory” to initialize the optimization algorithm. Furthermore, optimal control problems involved in our strategy are solved by combining the penalty function approach bryson1975applied () with the Projection Operator Newton method JH:02 (). In contrast with many strategies reported in the literature, we do not resort to approximations such as considering discrete sets of motion primitives (thus ending up with the only optimization of their parameters), and/or discrete time. In fact we consider system states and inputs as optimization variables and a second-order approximation of the optimization problem is directly constructed in continuous time. In detail, the proposed strategy is based on the following steps. By adding a fictitious input, we embed the system into a completely controllable (fully actuated) one. On this system we set up, by constraint relaxation, an unconstrained optimal control problem to find a trajectory of the system (a curve satisfying the dynamics) that minimizes a weighted distance from a desired curve. The desired curve, together with the weights of the cost functional become important parameters in the designer’s hand to explore the system dynamics, that is different periodic orbits that the system can execute. We solve the optimal control problem to generate a trajectory of the fully actuated system with almost null fictitious input. Differently from sugianto2013motion () and erez2012trajectory (), we use this trajectory to initialize the last step of our strategy, where the optimal control problem is set up on the underactuated dynamics. A Newton update rule on the final penalty of the weighted cost functional is adopted in order to hit the desired final state. It relies on the solution of an optimal state transfer problem presented in JH:03 (), where the advantages of the method are highlighted with respect to the classic ones.

We provide a set of numerical computations showing the effectiveness of the proposed strategy. In particular, we generate a periodic gait for a three-link biped robot (compass model with torso). We perform two computations by choosing two different sets of weights. In the first scenario we try to generate a trajectory that is as close as possible to the guessed state curve. Vice-versa, in the second one, we compute a trajectory that minimizes the input effort (i.e., some sort of minimum-energy trajectory). It is worth noting that, although the three links model is relatively simple, its underactuation represents a significant challenge. Also, it is well known that for many applications, even such a reduced model of the biped dynamics is instrumental for analyzing and controlling the actual system, see, e.g., PLH-AS-LF-UM-SG:13 ().

#### Paper organization

The paper is organized as follows. In Section 2 we introduce the model of the particular class of hybrid system we study in this paper and we present the problem formulation for the generation of periodic orbits. In Section 3 we describe the proposed strategy and in Section 4 we provide numerical computations (on a biped walking model) showing its effectiveness.

## 2 Problem formulation for the generation of periodic orbits

In this section we provide the optimal control formulation for solving the problem of periodic orbit generation.

### 2.1 Hybrid system model: underactuated systems with impacts

In this subsection we introduce the particular class of hybrid systems studied in the paper. The hybrid model consists of a continuous dynamics phase and a discrete (impulsive) jump event, occurred when the system state reaches a jump set. The mechanical system dynamics is modeled by the ordinary differential equation

(1) |

where , , , , , are respectively the generalized coordinate vector, the mass matrix, the Coriolis vector, the gravity vector, and the input vector. The dynamics (1) can be written in the state space form as the continuous and control affine dynamics

(2) |

with state defined as and input . In particular, we assume that (i) the system is underactuated, i.e., we let , and (ii) functions and are (at least) . When the system state, evolving as modeled in (2), reaches a jump set , a discrete impulsive event occurs, causing a discontinuity in the system state evolution. The discrete event is modeled by the jump map , which we assume to be invertible, such that

(3) |

where and denote the left and right limits of system trajectories satisfying (2). Finally, the overall hybrid model is:

(4) |

### 2.2 Optimal control problem set-up

We aim at designing periodic trajectories, minimizing a weighted distance from a desired curve, whose period is fixed. We approach the task by formulating an optimal control problem over the continuous dynamics (2). We impose a boundary constraint , where is a final state that allows the system to jump exactly to the initial condition , i.e., . In particular, we choose the initial state and we compute the final state by inverting the jump map, i.e., .

Let a desired curve , satisfying , and thus , be given. Furthermore, let us assume that the desired curve does not reach the jump set in the open time interval . We are ready now to present the optimal control problem we aim to solve for the generation of periodic orbits,

(5) |

where is an absolutely continuous state trajectory, is a bounded (measurable) input, and are positive definite weighting matrices and for some vector and matrix we denote .

The time-horizon is a design parameter together with the desired curve, the initial (or final) conditions and the weighting matrices and . We stress the fact that the desired curve is not a trajectory of the system, i.e., even if it satisfies the boundary conditions, it does not satisfy the dynamics. Thus, the desired curves can be seen as “design tools” to parameterize actual trajectories of the system. Furthermore, by choosing the desired trajectory to satisfy the jump condition only at time , we expect to compute an optimal (state) trajectory satisfying the same property. Otherwise, this constraint can be (implicitly) enforced by properly choosing the weight matrices and .

###### Remark (Final state constraint and controllability).

It is worth noticing that for problem (5) to be feasible the state must belong to the reachable space of the system.

## 3 Optimal control based strategy for periodic orbit design

As highlighted in the literature review, several software toolboxes are available to solve nonlinear constrained optimal control problems. Nevertheless, optimal control problems are optimization problems in the infinite dimensional space of state-input curves subject to highly non trivial constraints as the dynamics and the fixed final state constraint. This means that the solution of an algorithm is highly influenced by the desired curve and by the “initial curve” (possibly a trajectory) that initializes the algorithm. For this reason, instead of attacking the problem directly by simply applying one of the solvers available in the literature, we develop a strategy based on suitable embedding and relaxation ideas that gives a systematic method to compute periodic trajectories.

We divide our strategy into three steps that will be presented in detail in the next subsections. Here we provide an informal idea. First, we add a fictitious input to the underactuated dynamics, thus getting a fully actuated system. Second, we consider an unconstrained relaxation of the optimal control problem (5) by substituting the final state constraint with a penalty in the cost function. We solve the relaxed problem by using also the fictitious input, but with a high penalty. Third and final, we get rid of the fictitious input and enforce the final state constraint exactly by means of a suitable iterative method (i.e., a zero finding Newton iteration on the final state penalty).

### 3.1 Dynamics embedding

The first part of the strategy is called dynamics embedding and is inspired by JH-AS-RF:04 (). It consists of embedding the underactuated system into a fully actuated one by adding a fictitious input , with . Thus, embedding the mechanical system (1) into a fully actuated one, we get

(6) |

where and is assumed to be invertible . Accordingly, the state space equations of the (embedding) fully-actuated dynamics are

(7) |

where is a vector field taking into account the action of the embedding input. The role of this additional input is to allow for the tracking of any (sufficiently regular) desired curve. This provides a useful tool to initialize the optimal control problem solved in the next part of the strategy.

### 3.2 Optimal control problem relaxation and continuation with respect to parameters

Next, we design an optimal control relaxation of problem (5). The relaxation involves two aspects. First, according to the dynamics embedding introduced in the previous step we consider the fully actuated version of the system. We add a penalty in the cost function for the embedding input. Second, we relax the final state constraint by adding a penalty in the cost functional. The relaxed problem is

(8) |

where and in are respectively the weights on the embedding input and the final state error. The target state is a strategy parameter that is set to at beginning of the strategy, but will be modified during the strategy evolution according to the root finding procedure described in the next step. The idea is to weight the embedding input and the final state error much more than the other inputs and the states. In order to avoid a conflict between the two objectives we propose a continuation procedure on the two penalties.

Problem (8) is an unconstrained optimal control problem that could be solved by means of several available tools in the literature. We use the PRojection Operator Newton method for Trajectory Optimization (PRONTO) introduced in JH:02 () and described in A. We adopt the PRONTO tool because it shows two main appealing features for our trajectory design task. First, it allows one to handle unstable dynamics in a numerically stable manner and with a “reasonable” computational effort. Second, it guarantees recursive feasibility during the algorithm evolution. That is, at each step a system trajectory is available.

We are now ready to present the second step of the strategy. First, given a desired state curve such that , we can easily compute input trajectories of the fully actuated system such that the desired state curve is a system trajectory. By inverting the fully actuated dynamics model (6), we compute , as

Note that, depends on the choice of but once is defined, the input is unique.

Thus, the second step of the strategy can be informally described as follows. We denote by a generic state-input curve for the fully actuated system. We set the desired curve to and let be the initial trajectory (of the fully actuated system) for the Projection Operator Newton method (the optimal control solver). Then, following an integral penalty function approach bryson1975applied (), we iteratively solve problem (8) increasing the weight at each step, by means of a suitable heuristic. When the norm of the embedding input is sufficiently small (i.e., when , where is a given tolerance) we stop the procedure, thus obtaining an approximated trajectory of the underactuated system.

A pseudo-code description of the strategy is given in the next table (Algorithm 1). We denote the projection operator (17) acting on the fully actuated system, so that is a trajectory of the fully-actuated system. For given and , we denote a routine such that , i.e., it takes as inputs an initial trajectory and a desired (state-control) curve , and computes an embedding optimal trajectory, which is denoted as , solving the nonlinear optimal control relaxation in (8).

###### Remark.

(Convergence) Provided that a solution to (8) with exists, the existence of the solution is also guaranteed, by continuity, when is in a neighbourhood of the origin (with suitable radius ). Thus, a solution to (8) with exists and the convergence to that solution follows since we are using an integral penalty function approach (see, e.g. bryson1975applied ()). Nevertheless, dealing with a non-convex problem, only a local convergence is ensured, provided that the algorithm is initialized with a trajectory lying inside the basin of attraction.

### 3.3 Enforcing the final state constraint

We are ready to present the third and final step of the strategy. This part relies on the idea, proposed in JH:03 (), of enforcing the final state constraint of problem (5) by combining two actions: (i) increase the penalty on the final state error, and (ii) vary the target state until the candidate optimal trajectory “practically” meets the final state constraint.

The following optimal control problem is the relaxation of problem (5) subject to the original system, i.e., without dynamics embedding

(9) |

Before presenting the theorem that establishes a connection between the original problem (5) and its relaxed version (9), let us state the second order sufficient condition for optimality satisfied by a local minimum of the original problem (5).

###### Definition 1.

(Second order sufficient condition JH:03 ()) The second order sufficient condition for a local minimum is

where is the control Hamiltonian and , , .

The following result is at the basis of the third step of the proposed strategy.

###### Theorem 1.

(Equivalence of constrained and relaxed minimizers JH:03 ()) Suppose that is a local minimum of (5) satisfying the second order sufficiency condition for optimality. Then for each , there is a target state such that is a local minimizer of problem (9) satisfying the corresponding second order sufficiency condition.

Thus, the third step of the strategy can be informally described as follows. Given an embedding optimal trajectory from the previous step, we obtain the initial trajectory of the original system, where is the projection operator acting on the original system. Furthermore, we reduce the embedding desired curve , in order to get the desired curve for the original system as . Then, adopting a penalty function approach, we iteratively solve problem (9) increasing the weight at each step. The use of a large can easily result in a poorly conditioned problem where the terminal cost overwhelms the remaining one. Thus, we start with a reasonably small and we gradually increase it by means of a suitable heuristic. Note that our penalty function approach generates an optimal trajectory which approximately satisfies the final state constraint. For this reason, when the final state of the temporary optimal trajectory is “sufficiently close” to the desired final state (i.e., , where is tolerance guaranteeing the root finding convergence), we apply a Newton method for root finding on to meet exactly the final state constraint. The update rule we use on is the one proposed in JH:03 (). According to the Implicit Function Theorem, (i) there exists a mapping such that , where is the solution to (9) with target state and (ii) the first Fréchet differential of at , i.e., , exists. Furthermore, provided that the linearization of about is controllable on , the mapping is invertible. Thus, a Newton method for root finding is applied on the final state constraint equation in order to find the value of such that the solution to (9) is equivalent to the solution to (5). In particular, at the iteration , we update the value of according to the following rule: . The final state constraint is met when , where is the desired tolerance on the final state error.

A pseudo code description of the third step is given in the following table (Algorithm 2). We use the following notation. For a given and , we denote PRONTO the Projection Operator Newton method for the original system, so that we have . Note that, as for Algorithm 1, convergence of Algorithm 2 directly follows since it is a properly formulated penalty function approach.

## 4 An illustrative example: three link planar biped robot

As an example, we apply the proposed technique to a planar biped walking model. The model adopted in this paper was introduced in JWG-GA-FP:01 ().

### 4.1 Biped model

The three rigid links biped with four lumped masses consists of a torso and two legs of equal length connected at the hip. The hybrid walking model consists of a continuous swing phase model and a discrete jump event model.

The swing phase model describes the motion of the swing leg to develop a walking step. Let denote the orientation of the stance leg, the swing leg and the torso, respectively. Furthermore, let be the torque applied between the stance leg and the torso while the torque acts between the swing leg and the torso. The swing phase model via Euler-Lagrange equations is given by

(10) |

where and , , and (defined in B) are respectively the mass matrix, the Coriolis vector, the gravity vector and the generalized torques vector. Defining the state and the input , we can write the dynamics (10) in state-space form (2).

The jump event model takes into account (i) the impulse force arising when the swing leg touches the ground and (ii) the switch of the leg being in contact with the ground. See JWG-GA-FP:01 () for more details. The jump event model is given by (3) where

(11) |

with reported in C and

The jump event occurs when the system state reaches the jump set

(12) |

where is set according to physical considerations.

### 4.2 Numerical computations

According to JWG-GA-FP:01 (), we consider the following model parameters: mass of the legs kg; mass of the hip kg; mass of the torso kg; length of the legs m, and the length of the torso m. We use as jump angle . We choose as time horizon and as initial condition .

The first step of the proposed strategy requires the definition of a fully actuated dynamics and the design of a desired curve. In order to obtain a fully actuated dynamics, we consider the dynamics (10) and add a (generalized) torque acting directly on the torso angle velocity . That is, the swing phase model becomes

(13) |

where ,

(14) |

and . To design a desired curve that satisfies constraints on the initial and final states, we choose as desired angles with spline curves so that the initial and final point and their relative slopes, i.e., , , and , can be assigned a priori. Then, the desired velocity and acceleration curves, respectively and , are obtained by symbolic time differentiation. According to the fully-actuated dynamic model (13), we compute

where , with and . Furthermore, as regards the update rule for the penalty parameters, at iteration , we set . The same update rule is adopted for . Numerical computations characterized by two different design objectives are presented in the following. We also invite the reader to watch the attached video showing optimal walking gaits obtained by means of the proposed strategy.

In the first computation, we choose and , . We find the optimal trajectory choosing diagonal Q and R matrices and penalizing the angles times the input, while the velocities times the input. That is, and . The optimal trajectory (blue) is depicted on the left column of Figure 1 and Figure 2 together with the desired curve (red), the embedding optimal trajectory (green) and temporary trajectories of the underactuated system (black). The optimization strategy shapes the desired curve in order to obtain a trajectory (a curve satisfying the dynamics), thus showing that in this case the desired curve is just a rough guess of the biped gait.

In the second computation, we look for a trajectory that minimizes the “energy” injected in the system. Indeed, we choose , , , and penalize the inputs times the angles and times the velocities. That is, we choose and . The optimal trajectory (blue) is depicted on the right column of Figure 1 and Figure 2 together with the desired curve (red), the embedding optimal trajectory (green) and temporary trajectories of the underactuated system (black). As expected, the desired curve and the optimal trajectory coincide at , while the embedding optimal trajectory has a nonzero final state error. This shows the effectiveness of the third step of the strategy that enforces the final state constraint. This optimal trajectory is different from the previous one, thus showing another possible gait. In order to minimize the injected energy, the swing leg first goes behind the stance one, it gains potential energy and then, it passes in front of the stance leg. This gait requires a less amount of torque, with respect to the first gait. Approaching the final time , the torque of the first gait (Figure (c)c) is greater than while it approaches zero in the second gait (Figure (d)d).

## 5 Conclusion

In this paper we have developed an optimal control based strategy to compute periodic orbits for underactuated mechanical systems with impacts. The strategy combines trajectory optimization with dynamics embedding, optimal control relaxation and root finding techniques. The proposed strategy provides a systematic and numerically robust methodology to design periodic orbits for a particular class of hybrid systems.

## References

## References

- (1) M. Egerstedt, C. F. Martin, Optimal trajectory planning and smoothing splines, Automatica 37 (7) (2001) 1057–1064.
- (2) M. S. Shaikh, P. E. Caines, On the optimal control of hybrid systems: Optimization of trajectories, switching times, and location schedules, in: Hybrid systems: Computation and control, Springer, 2003, pp. 466–481.
- (3) J. T. Betts, Survey of numerical methods for trajectory optimization, Journal of guidance, control, and dynamics 21 (2) (1998) 193–207.
- (4) R. Vasudevan, A. Ames, R. Bajcsy, Persistent homology for automatic determination of human-data based cost of bipedal walking, Nonlinear Analysis: Hybrid Systems 7 (1) (2013) 101–115.
- (5) I. Lim, O. Kwon, J. H. Park, Gait optimization of biped robots based on human motion analysis, Robotics and Autonomous Systems 62 (2) (2014) 229 – 240.
- (6) F. Zhu, P. J. Antsaklis, Optimal control of hybrid switched systems: A brief survey, Discrete Event Dynamic Systems 25 (3) (2015) 345–364.
- (7) M. Baotic, F. J. Christophersen, M. Morari, Constrained optimal control of hybrid systems with a linear performance index, IEEE Transactions on Automatic Control 51 (12) (2006) 1903–1919.
- (8) M. S. Shaikh, P. E. Caines, On the hybrid optimal control problem: theory and algorithms, IEEE Transactions on Automatic Control 52 (9) (2007) 1587–1603.
- (9) B. Passenberg, P. E. Caines, M. Leibold, O. Stursberg, M. Buss, Optimal control for hybrid systems with partitioned state space, IEEE Transactions on Automatic Control 58 (8) (2013) 2131–2136.
- (10) Y. Wardi, M. Egerstedt, M. Hale, Switched-mode systems: gradient-descent algorithms with armijo step sizes, Discrete Event Dynamic Systems 25 (4) (2015) 571–599.
- (11) P. Riedinger, J.-C. Vivalda, An LQ sub-optimal stabilizing feedback law for switched linear systems, in: International Conference on Hybrid systems: computation and control, 2014, pp. 21–30.
- (12) D. Corona, A. Giua, C. Seatzu, Stabilization of switched systems via optimal control, Nonlinear Analysis: Hybrid Systems 11 (2014) 1–10.
- (13) Y. Huang, Y. Zhang, H. Xia, T. Liang, D. Cheng, LQ-based optimization for linear impulsive control systems mixed with continuous-time controls and fixed-time impulses, in: IEEE Conference on Decision and Control, 2012, pp. 6144–6150.
- (14) A. W. Long, T. D. Murphey, K. M. Lynch, Optimal motion planning for a class of hybrid dynamical systems with impacts, in: IEEE International Conference on Robotics and Automation, 2011, pp. 4220–4226.
- (15) D. Djoudi, C. Chevallereau, Y. Aoustin, Optimal reference motions for walking of a biped robot, in: IEEE International Conference on Robotics and Automation, 2005, pp. 2002–2007.
- (16) G. Bätz, U. Mettin, A. Schmidts, M. Scheint, D. Wollherr, A. S. Shiriaev, Ball dribbling with an underactuated continuous-time control phase: Theory & experiments, in: IEEE/RSJ International Conference on Intelligent Robots and Systems, IEEE, 2010, pp. 2890–2895.
- (17) S. S. Pchelkin, A. S. Shiriaev, U. Mettin, L. B. Freidovich, L. V. Paramonov, S. V. Gusev, Algorithms for finding gaits of locomotive mechanisms: case studies for gorilla robot brachiation, Autonomous Robots (2015) 1–17.
- (18) C. Liu, C. G. Atkeson, J. Su, Biped walking control using a trajectory library, Robotica 31 (02) (2013) 311–322.
- (19) J. Nakanishi, A. Radulescu, S. Vijayakumar, Spatio-temporal optimization of multi-phase movements: Dealing with contacts and switching dynamics, in: IEEE/RSJ International Conference on Intelligent Robots and Systems, 2013, pp. 5100–5107.
- (20) M. Posa, C. Cantu, R. Tedrake, A direct method for trajectory optimization of rigid bodies through contact, The International Journal of Robotics Research 33 (1) (2014) 69–81.
- (21) H. Sugianto, D. Lau, C. Burvill, P. Lee, D. Oetomo, Motion planning for underactuated bipedal mechanisms with kinematic constraints, in: International Conference on Advanced Robotics, 2013, pp. 1–6.
- (22) T. Erez, E. Todorov, Trajectory optimization for domains with contacts using inverse dynamics, in: Intelligent Robots and Systems (IROS), 2012 IEEE/RSJ International Conference on, IEEE, 2012, pp. 4914–4919.
- (23) Y. Tassa, T. Erez, E. Todorov, Synthesis and stabilization of complex behaviors through online trajectory optimization, in: IEEE/RSJ International Conference on Intelligent Robots and Systems, 2012, pp. 4906–4913.
- (24) P. La Hera, A. Shiriaev, L. Freidovich, U. Mettin, S. Gusev, Stable walking gaits for a three-link planar biped robot with one actuator, IEEE Transactions on Robotics 29 (3) (2013) 589–601.
- (25) A. E. Bryson, Y.-C. Ho, Applied optimal control: optimization, estimation and control, CRC Press, 1975.
- (26) J. Hauser, A projection operator approach to the optimization of trajectory functionals, in: IFAC World Congress, 2002.
- (27) J. Hauser, On the computation of optimal state transfers with application to the control of quantum spin systems, in: American Control Conference, 2003, pp. 2169–2174.
- (28) J. Hauser, A. Saccon, R. Frezza, Aggressive motorcycle trajectories, in: IFAC Symposium on Nonlinear Control Systems, 2004.
- (29) J. Grizzle, G. Abba, F. Plestan, Asymptotically stable walking for biped robots: analysis via systems with impulse effects, IEEE Transactions on Automatic Control 46 (1) (2001) 51–64.
- (30) L. Cesari, Optimization â theory and applications: problems with ordinary differential equations, Springer-Verlag, 1983.
- (31) E. B. Lee, L. Markus, Foundations of Optimal Control Theory, John Wiley & Sons, 1989.
- (32) J. Hauser, A. Saccon, A barrier function method for the optimization of trajectory functionals with constraints, in: IEEE Conference on Decision and Control, 2006, pp. 864–869.

## Appendix A The Projection Operator approach for the optimization of trajectory functionals

The PRojection Operator based Newton method for Trajectory Optimization (PRONTO) JH:02 () is suitable for solving optimal control problems in the form

(15) |

where the initial condition is fixed and , and are (at least) in and . Sufficient conditions on , and (cesari1983optimization (), lee1989optimalcontrol ()) guarantee the existence of optimal trajectories. In order to deal with state-input constraints, a strategy combining PRONTO with a barrier function approach is proposed in hauser2006barrier (). The key idea of PRONTO is that a properly designed projection operator , mapping state-control curves into system trajectories (curves satisfying the dynamics), is used to convert the dynamically constrained optimization problem (15) into an essentially unconstrained one. Let be a bounded curve and let be a trajectory of the nonlinear feedback system

(16) |

where the initial condition is given in (15) and the feedback gain is designed, e.g., by solving a suitable linear quadratic optimal control problem, in order to guarantee (local) exponential stability of the trajectory . The feedback system (16) defines the nonlinear projection operator

(17) |

mapping the curve to the trajectory . Using the projection operator to locally parameterize the trajectory manifold, problem (15) is equivalent to the one of minimizing the unconstrained functional , where . Then, using an (infinite dimensional) Newton descent method, a local minimizer is computed. The strength of this approach is that the local minimizer is obtained as the limit of a sequence of trajectories, i.e., curves satisfying the dynamics. The feedback system (16), defining the projection operator, allows us to generate trajectories in a numerically stable manner. In other words, the choice of the feedback in (16) is convenient from a numerical point of view. Furthermore, note that (projected) trajectories satisfy , according to the definition of the projection operator.

A pseudo-code of the Projection Operator Newton method is shown in the table (Algorithm 3). Let be an initial trajectory. Minimization of the cost functional is accomplished iteratively. Given the current trajectory iterate , the search direction is obtained by solving a linear quadratic optimal control problem with cost , where and are respectively the first and second Fréchet differentials of the functional at . Then, the curve , where is a step size obtained through a standard backtracking line search, is projected, by means of the projection operator, in order to get a new trajectory .

###### Remark.

The algorithm has the structure of a standard Newton method for the minimization of an unconstrained function. The key points are the design of defining the projection operator and the computation of the derivatives of to “search for descent direction”. It is worth noting that these two steps involve the solution of suitable (well known) linear quadratic optimal control problems, JH:02 ().

## Appendix B Biped walking model: swing phase

The mass matrix, the Coriolis vector, the gravity vector and the generalized torques vector of equation (10) are respectively

(18) |

(19) |

(20) |

(21) |

where

(22) |

and the model parameters are: the mass of the legs , the mass of the hip , the mass of the torso , the length of the legs , and the length of the torso .

##
Appendix C Biped walking model: relation between velocities

just after and before the impact

The relation between velocities just after and before the impact is represented by . The terms of the matrix are reported below.