# Leveraging the Template and Anchor Framework for Safe, Online Robotic Gait Design

Jinsun Liu, Pengcheng Zhao, Zhenyu Gan, Matthew Johnson-Roberson, and Ram Vasudevan

Jinsun Liu is with the Robotics Institute, University of Michigan, Ann Arbor, MI 48109 (jinsunl@umich.edu). Pengcheng Zhao, Zhenyu Gan and Ram Vasudevan are with the Department of Mechanical Engineering, University of Michigan, Ann Arbor, MI 48109 ({pczhao,ganzheny,ramv}@umich.edu). Matthew Johnson-Roberson is with the Department of Naval Architecture and Marine Engineering, University of Michigan, Ann Arbor, MI 48109 (mattjr@umich.edu). This work is supported by the Ford Motor Company via the Ford-UM Alliance under award N022977 and by the Office of Naval Research under Award Number N00014-18-1-2575. These two authors contributed equally to this work.

###### Abstract

Online control design using a high-fidelity, full-order model for a bipedal robot can be challenging due to the size of the state space of the model. A commonly adopted solution to overcome this challenge is to approximate the full-order model (anchor) with a simplified, reduced-order model (template), while performing control synthesis. Unfortunately it is challenging to make formal guarantees about the safety of an anchor model using a controller designed in an online fashion using a template model. To address this problem, this paper proposes a method to generate safety-preserving controllers for anchor models by performing reachability analysis on template models while bounding the modeling error. This paper describes how this reachable set can be incorporated into a Model Predictive Control framework to select controllers that result in safe walking on the anchor model in an online fashion. The method is illustrated on a 5-link RABBIT model, and is shown to allow the robot to walk safely while utilizing controllers designed in an online fashion.

Bipeds, underactuated system, safety guarantee.

## I Introduction

Legged robots are well suited to locomotion on unstructured terrains. Unfortunately, designing controllers for legged systems to operate safely in such situations has proven challenging. To robustly traverse such environments, an ideal control synthesis technique for legged robotic systems should satisfy several requirements. First, since sensors perceive the world with a limited horizon, any algorithm for control synthesis should operate in real time. Second, since modeling contact can be challenging, any control synthesis technique should be able to accommodate model uncertainty. Third, since the most appropriate controller may be a function of the environment and given task, a control synthesis algorithm should optimize over as rich a family of control inputs at run-time as possible. Finally, since falling can be costly both in time and expense, a control synthesis technique should be able to guarantee the satisfactory behavior of any constructed controller. As illustrated in Fig. 1, this paper presents an optimization-based algorithm to design gaits for legged robotic systems while satisfying each of these requirements.

We begin by summarizing related work with an emphasis on techniques that are able to make guarantees on the safety of the designed controller. For instance, the Zero-Moment Point approach [vukobratovic1972stability] characterizes the stability of a legged robot with planar feet by defining the notion of the Zero-Moment Point and requiring that it remains within the robot’s base of support. Though this requirement can be used to design a controller that can avoid falling at run-time, the gaits designed by the ZMP approach are static and energetically expensive [kuo2007choosing, westervelt2007feedback, Section 10.8].

In contrast, the Hybrid Zero Dynamics approach, which relies upon feedback linearization to drive the actuated degrees of freedom of a robot towards a lower dimensional manifold, is able to synthesize a controller which generates gaits that are more dynamic. Though this approach can generate safety-preserving controllers for legged systems in real time in the presence of model uncertainty [ames2014rapidly, hsu2015control, nguyen2015optimal, nguyen2016exponential, nguyen2016dynamic], it is only able to prove that the gait associated with a synthesized control is locally stable. As a result, it is non-trivial to switch between multiple constructed controllers while preserving any safety guarantee. Recent work has extended the ability of the Hybrid Zero Dynamics approach beyond a single neighborhood of any synthesized gait [motahar2016composing, veer2018safe, ames2017first, smit2019walking]. These extensions either assume full actuation [ames2017first] or ignore the behavior of the legged system off the lower dimensional manifold [motahar2016composing, veer2018safe, smit2019walking].

Rather than designing controllers for legged systems, other techniques have focused on characterizing the limits of safe performance by using Sums-of-Squares (SOS) optimization [parrilo2000structured]. These approaches use semi-definite programming to identify the limits of safety in the state space of a system as well as associated controllers for hybrid systems [prajna, shia2014convex]. These safe sets can take the form of reachable sets [koolen2016balance, shia2014convex] or invariant sets in state space [wieber2002stability, prajna, posa2017balancing]. However, the representation of each of these sets in state space restricts the size of the problem that can be tackled by these approaches. As a result, these SOS-based approaches have been primarily applied to reduced models of walking robots, ranging from spring mass models [zhao2017optimal], to inverted pendulum models [koolen2016balance, tang2017invariant], to inverted pendulum models with an offset torso mass [posa2017balancing]. Unfortunately, the differences between these simple models and real robots make it challenging to extend the safety guarantees to more realistic real-world models.

This paper addresses the shortcomings of prior work by making the following four contributions. First, in Section III-A, we describe a set of outputs that are functions of the state of the robot, which can be used to determine whether a particular gait can be safely tracked by a legged system without falling. In particular, if a particular gait's outputs satisfy a set of inequality constraints that we define, then we show that the gait can be safely tracked by the legged system without falling. To design gaits over $N$ steps that do not fall over, one could begin by forward propagating these outputs via the robot's dynamics for $N$ steps. Unfortunately, performing this computation can be intractable due to the high dimensionality of the robot's dynamics. To address this challenge, our second contribution, in Section III-B, leverages the anchor and template framework to construct a simple model (template) whose outputs are sufficient to predict the behavior of the full model's (anchor's) outputs [full1999templates]. Third, in Section IV-A, we develop an offline method to compute a gait-parameterized forward reachable set that describes the evolution of the outputs of the simple model.

Similar to recently developed work on motion planning for ground and aerial vehicles [majumdar2017funnel, herbert2017fastrack, kousik2018bridging, kousik2019safe], one can then require that all possible outputs in the forward reachable set satisfy the set of inequality constraints that we define, which guarantee that the robot does not fall over during the $N$ steps. Unfortunately, this type of set inclusion constraint can be challenging to enforce at run-time. Finally, in Sections IV-B and V, we describe how to incorporate this set-inclusion constraint as a set of inequality constraints in a Model Predictive Control (MPC) framework that are sufficient to ensure $N$-step walking that does not fall over. Note, to simplify exposition, this paper focuses on an example implementation on a $14$-dimensional model of the robot RABBIT that is described in Section II. The remainder of this paper is organized as follows. Section VI demonstrates the performance of the proposed approach on a walking example and Section VII concludes the paper.

## II Preliminaries

This section introduces the notation, the dynamic model of the RABBIT robot, and a Simplified Biped Model (SBM) that is used throughout the remainder of this paper. The following notation is adopted in this manuscript. All sets are denoted using calligraphic capital letters. Let $\mathbb{R}$ denote the set of real numbers, and let $\mathbb{N}$ denote the collection of all non-negative integers. Given a set $\mathcal{K} \subset \mathbb{R}^n$ for some $n \in \mathbb{N}$, let $C^1(\mathcal{K})$ denote the set of all continuously differentiable functions from $\mathcal{K}$ to $\mathbb{R}$ and let $\lambda_{\mathcal{K}}$ denote the Lebesgue measure which is supported on $\mathcal{K}$.

### II-A RABBIT Model (Anchor)

This paper considers the walking motion of a planar 5-link model of RABBIT [chevallereau2003rabbit]. The walking motion of the RABBIT model consists of alternating phases of single stance (one leg in contact with the ground) and double stance (both legs in contact with the ground). While in single stance, the leg in contact with the ground is called the stance leg, and the non-stance leg is called the swing leg. The double stance phase is instantaneous. The configuration $q(t)$ of the robot at time $t$ consists of the Cartesian position of the robot hip, whose vertical component is $q_v(t)$; the torso angle relative to the upright direction; the hip angles relative to the stance and swing leg, respectively; and the two knee angles. The hip and knee joints are actuated, and the torso angle is an underactuated degree of freedom. Let $\theta$ denote the stance leg angle, and let $\phi$ denote the swing leg angle. We refer to the configuration when the robot hip is right above the stance foot, i.e. $\theta = \pi$, as mid-stance. We refer to the motion between the $i$-th and $(i+1)$-st swing leg foot touch down with the ground as the $i$-th step.

Using the method of Lagrange, we can obtain a continuous dynamic model of the robot during swing phase:

$$\dot{a}(t) = f(a(t), u(t)) \tag{1}$$

where the state $a(t)$, which collects the configuration $q(t)$ and the velocity $\dot{q}(t)$, lies in the tangent bundle of the configuration space, the input $u(t)$ belongs to a set that describes the permitted inputs to the system, and $t$ denotes time. We model the RABBIT as a hybrid system and describe the instantaneous change using the notation of a guard and a reset map. That is, given a configuration $q(t)$ at time $t$, consider the stance leg angle $\theta(q(t))$ and the vertical position of the swing foot relative to the stance foot. The guard is . Notice that the force of the ground contact imposes a holonomic constraint on the stance foot position, which enables one to obtain a reset map [westervelt2007feedback, Section 3.4.2]:

$$\dot{q}^{+}(t) = \Delta(\dot{q}^{-}(t)), \tag{2}$$

where $\Delta$ describes the relationship between the pre-impact and post-impact velocities. More details about the definition and derivation of this hybrid model can be found in [westervelt2007feedback, Section 3.4].

To simplify exposition, this paper optimizes at run-time over a family of reference gaits that are characterized by their average velocity and step length. These reference gaits are described by a vector of control parameters $P(i) \in \mathcal{P}$ for all $i$, whose components are the average horizontal velocity and the step length between the $i$-th and $(i+1)$-st mid-stance position. Note $\mathcal{P}$ is compact. These reference gaits are generated by solving a finite family of nonlinear optimization problems using FROST in which we incorporate , and periodicity as constraints, and minimize the average torque squared over the gait period [hereid2017frost]. Each of these problems yields a reference trajectory parameterized by the control parameters, and interpolation is applied over these generated gaits to generate a continuum of gaits. Given a control parameter, a control input into the RABBIT model is generated by tracking the corresponding reference trajectories using a classical PD controller.
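The tracking step can be sketched as follows. This is a minimal illustration that assumes joint-space PD feedback on the four actuated joints; the function name `pd_torque`, the gains, and the sample values are hypothetical and are not taken from the paper.

```python
import numpy as np

def pd_torque(q_ref, dq_ref, q, dq, kp, kd):
    """Classical PD law; every array has one entry per actuated joint."""
    q_ref, dq_ref, q, dq = map(np.asarray, (q_ref, dq_ref, q, dq))
    # Proportional term on the position error, derivative term on the velocity error.
    return kp * (q_ref - q) + kd * (dq_ref - dq)

# Hypothetical reference and measured values for two hips and two knees.
u = pd_torque(q_ref=[0.2, -0.1, 0.05, 0.0], dq_ref=[0.0, 0.0, 0.0, 0.0],
              q=[0.1, -0.1, 0.0, 0.0], dq=[0.5, 0.0, 0.0, 0.0],
              kp=100.0, kd=10.0)
```

In practice the reference values would come from the interpolated gait library evaluated at the current control parameter.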

Next, we define a solution to the hybrid model as a pair consisting of a hybrid time set, whose elements $I_i$ are intervals in $\mathbb{R}$, and a finite sequence of functions with each element satisfying the dynamics (1) over the corresponding interval $I_i$ [lygeros2012hybrid, Definitions 3.3, 3.4, 3.5]. Denote each $I_i := [\tau_i^+, \tau_{i+1}^-]$ for all $i$. Here $\tau_{i+1}$ corresponds to the time of the transition between the $i$-th and $(i+1)$-th step. We let $\tau_{i+1}^-$ correspond to the time just before the transition and $\tau_{i+1}^+$ correspond to the time just after the transition. Since transitions are assumed to be instantaneous, $\tau_{i+1}^- = \tau_{i+1}^+ = \tau_{i+1}$ if all values exist. When a transition never happens during the $i$-th step, we denote $\tau_{i+1}^- = +\infty$. Note when , and .

### II-B Simplified Biped Model (Template)

As we show in Section VI, performing online optimization with the full RABBIT model is intractable due to the size of its state space. In contrast, performing online optimization with the Simplified Biped Model (SBM) adopted from [wight2008introduction] is tractable. This model consists of a point-mass and two mass-less legs each with a constant length $l$. The configuration of the SBM at time $t$ is described by the stance leg angle, $\hat{\theta}(t)$, and the swing leg angle, $\hat{\phi}(t)$. The input into the model is the step length size and the guard is the set of configurations when . The swing leg swings immediately to a specified step length. During the swing phase, one can use the method of Lagrange to describe the evolution of the configuration as a function of the current configuration and the input. Subsequent to the instantaneous double stance phase, an impact with the ground happens with a coefficient of restitution of . We denote a hybrid execution of the SBM as a pair consisting of a hybrid time set with $\hat{I}_i := [\hat{\tau}_i^+, \hat{\tau}_{i+1}^-]$ and a finite sequence of solutions to the SBM's equations of motion.

## III Outputs to Describe Successful Walking

During online optimization, we want to optimize over the space of parameterized inputs while introducing a constraint to guarantee that the robot does not fall over. This section first formalizes what it means for the RABBIT model to walk successfully without falling over. Unfortunately, due to the high dimensionality of the RABBIT model, implementing this definition directly as a constraint during online optimization is intractable. To address this problem, Section III-A defines a set of outputs that are functions of the state of RABBIT and proves that the value of these outputs can determine whether RABBIT is able to walk successfully. Subsequently, in Section III-B we define a corresponding set of outputs that are functions of the state of the SBM and illustrate how their values can be used to determine whether RABBIT is able to walk successfully.

To define successful walking on RABBIT, we begin by defining the time during step $i$ at which mid-stance occurs (i.e., the largest time at which $\theta(q(t)) = \pi$ during $I_i$) as

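$$t_i^{MS} := \begin{cases} +\infty, & \text{if } \theta(q(t)) < \pi \ \ \forall t \in I_i, \\ -\infty, & \text{if } \theta(q(t)) > \pi \ \ \forall t \in I_i, \\ \max\{t \in I_i \mid \theta(q(t)) = \pi\}, & \text{otherwise.} \end{cases} \tag{3}$$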

Note if mid-stance is never reached during step $i$, then the mid-stance time is defined as plus or minus infinity depending upon whether the stance leg angle remains less than or greater than $\pi$ during step $i$, respectively. Using this definition, we formally define successful walking for the RABBIT model as:

###### Definition 1.

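The RABBIT model walks successfully in step $i$ if

1. mid-stance is reached, i.e., $t_i^{MS}$ is finite,

2. the hip remains above the ground for all $t \in I_i$, and

3. the swing leg makes contact with the ground, i.e., $\tau_{i+1}^- < +\infty$.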

To understand this definition, note that the first requirement ensures that mid-stance is reached, the second requirement ensures that the hip remains above the ground, and the final requirement ensures that the swing leg actually makes contact with the ground. Though satisfying this definition ensures that RABBIT takes a step, enforcing this condition directly during optimization can be cumbersome due to the high dimensionality of the RABBIT dynamics.
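The mid-stance time in (3) can be approximated from a sampled stance-angle trajectory. The sketch below is an assumption-laden illustration: `mid_stance_time` is a hypothetical helper, the trajectory is assumed to be sampled at increasing times within one step, and crossings of $\pi$ between samples are located by linear interpolation rather than by the exact dynamics.

```python
import math

def mid_stance_time(times, theta, tol=1e-9):
    """Sampled version of (3): last time in the step at which theta equals pi."""
    offsets = [th - math.pi for th in theta]
    if all(a < -tol for a in offsets):
        return math.inf       # stance angle stays below pi during the step
    if all(a > tol for a in offsets):
        return -math.inf      # stance angle stays above pi during the step
    t_ms = None
    for k, a in enumerate(offsets):
        if abs(a) <= tol:
            t_ms = times[k]   # a sample sits on pi
        elif k + 1 < len(times):
            b = offsets[k + 1]
            if abs(b) > tol and a * b < 0:
                # Sign change between samples: interpolate the crossing time.
                t_ms = times[k] + (times[k + 1] - times[k]) * a / (a - b)
    return t_ms

t_example = mid_stance_time([0.0, 0.1, 0.2, 0.3], [3.0, 3.1, 3.2, 3.3])
```

Because later crossings overwrite earlier ones, the function returns the largest crossing time, matching the `max` in (3).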

### III-A Outputs to Describe Successful RABBIT Walking

This subsection defines a set of discrete outputs that are functions of the state of the RABBIT model and illustrates how they can be used to predict failure. We begin by defining another time variable $t_i^0$:

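$$t_i^0 := \begin{cases} \tau_i^+, & \text{if } \dot{\theta}(q(t), \dot{q}(t)) < 0 \ \ \forall t \in I_i, \\ \tau_{i+1}^-, & \text{if } \dot{\theta}(q(t), \dot{q}(t)) > 0 \ \ \forall t \in I_i, \\ \max\{t \in I_i \mid \dot{\theta}(q(t), \dot{q}(t)) = 0\}, & \text{otherwise.} \end{cases} \tag{4}$$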

Note $t_i^0$ is defined to be the last time in $I_i$ when a sign change of $\dot{\theta}$ occurs; when a sign change does not occur, $t_i^0$ is defined as an endpoint of $I_i$ associated with the sign of $\dot{\theta}$.

We first define an output, $y_1$, that can be used to ensure that $t_i^{MS} \neq \pm\infty$:

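$$y_1(i) := \begin{cases} \dot{\theta}(q(t_i^{MS}), \dot{q}(t_i^{MS})), & \text{if } t_i^{MS} \neq \pm\infty, \\ -\sqrt{2g\,(l_{st}(t_i^0) - q_v(t_i^0))}\,/\,l_{st}(t_i^0), & \text{if } t_i^{MS} = +\infty, \\ 1, & \text{if } t_i^{MS} = -\infty, \end{cases} \tag{5}$$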

where $g$ is gravity and $l_{st}(t)$ is the stance leg length at time $t$. Note that $y_1(i)$ is the hip angular velocity when the mid-stance position is reached during the $i$-th step. When the mid-stance position is not reached, $y_1(i)$ represents the additional hip angular velocity needed to reach the mid-stance position. In particular, notice whenever .
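The three branches of (5) can be sketched as follows, assuming the caller has already found $t_i^{MS}$ and evaluated the relevant quantities at $t_i^{MS}$ or $t_i^0$. The helper `output_y1` and its argument names are hypothetical. The square root is read here as covering only the numerator, which is the reading consistent with an energy balance; that reading is an assumption.

```python
import math

def output_y1(t_ms, dtheta_at_ms=None, l_st=None, q_v=None, g=9.81):
    """Evaluate y1(i) from (5) given precomputed quantities for step i."""
    if t_ms == math.inf:
        # Mid-stance not reached: negative of the extra angular velocity needed.
        return -math.sqrt(2.0 * g * (l_st - q_v)) / l_st
    if t_ms == -math.inf:
        return 1.0
    # Mid-stance reached: hip angular velocity at that instant.
    return dtheta_at_ms

# Hypothetical values: unit stance leg length, hip at half that height.
y1_short = output_y1(math.inf, l_st=1.0, q_v=0.5)
y1_reached = output_y1(0.14, dtheta_at_ms=0.8)
```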

Next, we define an output $y_2$ that can be used to ensure that $\tau_{i+1}^- < +\infty$:

$$y_2(i) := \begin{cases} \phi(q(\tau_{i+1}^-)), & \text{if } \tau_{i+1}^- < +\infty, \\ 2\pi, & \text{otherwise.} \end{cases} \tag{6}$$

Note, $y_2(i)$ is the swing leg angle at touch-down at the end of the $i$-th step; if touch-down does not occur, $y_2(i)$ is defined as $2\pi$. Recall , so if , it then follows from (3) and (6) that and . Fig. 2 illustrates the behavior of and .

We now define our last two outputs that can be used to ensure that the hip stays above the ground:

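$$y_3(i) := \begin{cases} \inf\{\theta(q(t)) \mid t \in [t_i^{MS}, t_{i+1}^{MS}]\}, & \text{if } t_{i+1}^{MS}, t_i^{MS} \in \mathbb{R}, \\ -\infty, & \text{otherwise.} \end{cases} \tag{7}$$

$$y_4(i) := \begin{cases} \sup\{\theta(q(t)) \mid t \in [t_i^{MS}, t_{i+1}^{MS}]\}, & \text{if } t_{i+1}^{MS}, t_i^{MS} \in \mathbb{R}, \\ +\infty, & \text{otherwise.} \end{cases} \tag{8}$$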
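On sampled data, (7) and (8) reduce to a minimum and a maximum over the samples between two consecutive mid-stance times. The sketch below uses a hypothetical helper `outputs_y3_y4`; taking the extrema over samples only approximates the infimum and supremum of the continuous trajectory.

```python
import math

def outputs_y3_y4(times, theta, t_ms_i, t_ms_next):
    """Sampled versions of (7) and (8) between consecutive mid-stance times."""
    if not (math.isfinite(t_ms_i) and math.isfinite(t_ms_next)):
        return -math.inf, math.inf
    window = [th for t, th in zip(times, theta) if t_ms_i <= t <= t_ms_next]
    if not window:
        raise ValueError("no samples between the two mid-stance times")
    return min(window), max(window)

# Hypothetical stance-angle samples between two mid-stance instants.
y3, y4 = outputs_y3_y4([0.0, 0.1, 0.2, 0.3, 0.4],
                       [3.14, 3.4, 2.9, 3.0, 3.14], 0.0, 0.4)
```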

Finally, we let . Using these definitions, we can prove the following theorem that constructs a sufficient condition to ensure successful walking by RABBIT.

###### Theorem 2.

Suppose that the -th step can be successfully completed (i.e. and are finite, , and )). Suppose , and for each , then the robot walks successfully at the -th step for each .

###### Proof.

Notice and for each . By induction we have is finite . implies that . By using the definitions of and , one has that the robot walks successfully in the -th step based on Definition 1. ∎

### III-B Approximating Outputs Using the SBM

Finding an analytical expression describing the evolution of each of the outputs can be challenging. Instead we define corresponding outputs for SBM. Importantly, the dynamics of each of these corresponding outputs can be succinctly described.

As we did for the RABBIT model, consider the following set of definitions for the SBM:

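$$\hat{t}_i^{MS} := \begin{cases} +\infty, & \text{if } \hat{\theta}(t) < \pi \ \ \forall t \in \hat{I}_i, \\ -\infty, & \text{if } \hat{\theta}(t) > \pi \ \ \forall t \in \hat{I}_i, \\ \max\{t \in \hat{I}_i \mid \hat{\theta}(t) = \pi\}, & \text{otherwise.} \end{cases} \tag{9}$$

$$\hat{t}_i^0 := \begin{cases} \hat{\tau}_i^+, & \text{if } \dot{\hat{\theta}}(t) < 0 \ \ \forall t \in \hat{I}_i, \\ \hat{\tau}_{i+1}^-, & \text{if } \dot{\hat{\theta}}(t) > 0 \ \ \forall t \in \hat{I}_i, \\ \max\{t \in \hat{I}_i \mid \dot{\hat{\theta}}(t) = 0\}, & \text{otherwise.} \end{cases} \tag{10}$$

$$\hat{y}_1(i) := \begin{cases} \dot{\hat{\theta}}(\hat{t}_i^{MS}), & \text{if } \hat{t}_i^{MS} \neq \pm\infty, \\ -\sqrt{2g\,l\,(1 + \cos(\hat{\theta}(\hat{t}_i^0)))}\,/\,l, & \text{if } \hat{t}_i^{MS} = +\infty, \\ 1, & \text{if } \hat{t}_i^{MS} = -\infty, \end{cases} \tag{11}$$

$$\hat{y}_2(i) := \begin{cases} \hat{\phi}(\hat{\tau}_{i+1}^-), & \text{if } \hat{\tau}_{i+1}^- < +\infty, \\ 2\pi, & \text{otherwise.} \end{cases} \tag{12}$$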
 (13)
 (14)
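The $+\infty$ branch of (11) is consistent with a simple energy balance. The sketch below assumes that the hip height of the SBM is $-l\cos\hat{\theta}$ (so that $\hat{\theta} = \pi$ is the upright mid-stance configuration) and uses a point mass $m$ and an angular speed $\omega$, both symbols introduced only for this illustration. At $\hat{t}_i^0$ the hip angular velocity is zero, so the kinetic energy needed to lift the mass to the mid-stance height $l$ satisfies

$$\tfrac{1}{2} m \,(l\,\omega)^2 = m g \left(l + l\cos\hat{\theta}(\hat{t}_i^0)\right) \quad\Longrightarrow\quad \omega = \frac{\sqrt{2 g\, l\,(1 + \cos\hat{\theta}(\hat{t}_i^0))}}{l},$$

and $\hat{y}_1(i) = -\omega$ in that branch. The same argument with $l_{st}(t_i^0)$ in place of $l$ and $q_v(t_i^0)$ as the hip height gives the corresponding branch of (5).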

The discrete-time dynamics of each of these outputs of SBM can be described by the following difference equations:

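$$\begin{aligned} \hat{y}_1(i+1) &= f_{\hat{y}_1}(\hat{y}_1(i), P(i)) \\ \hat{y}_2(i) &= f_{\hat{y}_2}(P(i)) \\ \hat{y}_3(i) &= f_{\hat{y}_3}(\hat{y}_1(i), P(i)) \\ \hat{y}_4(i) &= f_{\hat{y}_4}(\hat{y}_1(i), P(i)) \end{aligned} \tag{15}$$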

for each , , and . Such functions $f_{\hat{y}_1}$, $f_{\hat{y}_2}$, $f_{\hat{y}_3}$, and $f_{\hat{y}_4}$ can be generated using elementary mechanics. A derivation can be found at: https://github.com/pczhao/TA_GaitDesign/blob/master/SBM_dynamics.pdf.

To describe the gap between the discrete output signals of RABBIT and those of the SBM, we make the following assumption:

###### Assumption 3.

For any sequence of control parameters $\{P(i)\}$ and corresponding sequences of outputs generated by the RABBIT dynamics and (15), respectively, there exist bounding functions $\underline{B}_1$, $\overline{B}_1$, $\overline{B}_2$, $\underline{B}_3$, and $\overline{B}_4$ satisfying

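$$\underline{B}_1(y_1(i), P(i)) \leq y_1(i+1) - \hat{y}_1(i+1) \leq \overline{B}_1(y_1(i), P(i)) \tag{16}$$

$$y_2(i) - \hat{y}_2(i) \leq \overline{B}_2(P(i-1), y_1(i), P(i)) \tag{17}$$

$$y_3(i) - \hat{y}_3(i) \geq \underline{B}_3(y_1(i), P(i)) \tag{18}$$

$$y_4(i) - \hat{y}_4(i) \leq \overline{B}_4(y_1(i), P(i)). \tag{19}$$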

In other words, if , then $\underline{B}_1$, $\overline{B}_1$, $\overline{B}_2$, $\underline{B}_3$, and $\overline{B}_4$ bound the maximum possible difference between the outputs of RABBIT and the outputs of the SBM. Though we do not describe how to construct these bounding functions in this paper due to space limitations, one could apply SOS optimization to generate them [smith2019]. To simplify further exposition, we define the following:

 (20)

for all . In particular, it follows from (16) that for any sequence of control parameters, , and corresponding sequences of outputs, generated by the RABBIT dynamics that for all .

## IV Enforcing N-Step Safe Walking

This section proposes an online MPC framework to design a controller for the RABBIT model that can ensure successful walking for $N$ steps. In fact, when $N = 1$, one can directly apply Theorem 2 and Assumption 3 to generate the following inequality constraints over the current output and the control parameters to guarantee walking successfully from the $i$-th to the $(i+1)$-th mid-stance:

 (21)

$$f_{\hat{y}_2}(P(i)) + \overline{B}_2(P(i-1), y_1(i), P(i)) \leq \pi, \tag{22}$$

 (23)

 (24)
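A one-step feasibility check of this kind can be sketched as follows. The sketch covers constraint (22) together with the $y_3$ and $y_4$ constraints in the form in which they appear in the MPC program (OL) of Section V; it does not cover (21). The function name and all callables passed in are hypothetical placeholders for the SBM output maps and the bounding functions of Assumption 3.

```python
import math

def one_step_constraints_ok(y1, P_prev, P, f2, f3, f4, B2_up, B3_lo, B4_up):
    """Check the swing-angle and hip-angle constraints for a single step."""
    swing_ok = f2(P) + B2_up(P_prev, y1, P) <= math.pi
    lower_ok = f3(y1, P) + B3_lo(y1, P) > math.pi / 2
    upper_ok = f4(y1, P) + B4_up(y1, P) < 3 * math.pi / 2
    return swing_ok and lower_ok and upper_ok

# Toy constant maps, chosen only to exercise the three inequalities.
toy = dict(f2=lambda P: 3.0, f3=lambda y, P: 2.0, f4=lambda y, P: 4.0,
           B3_lo=lambda y, P: -0.2, B4_up=lambda y, P: 0.3)
feasible = one_step_constraints_ok(0.5, None, None,
                                   B2_up=lambda Pp, y, P: 0.1, **toy)
infeasible = one_step_constraints_ok(0.5, None, None,
                                     B2_up=lambda Pp, y, P: 0.2, **toy)
```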

Unfortunately, to construct a similar set of constraints when $N > 1$, one has to either compute the outputs at each subsequent step, which can be computationally taxing, or one can apply (16) recursively to generate an outer approximation to $y_1$ at each subsequent step and then apply the remainder of Assumption 3 to generate an outer approximation to the remaining outputs at each of those steps. In the latter instance, one would need the entire set of possible values for the outputs to satisfy the bounds described in (21), (22), (23), and (24) to ensure $N$-step safe walking. This requires introducing a set inclusion constraint that can be cumbersome to enforce at run-time. To address these challenges, Section IV-A describes how to compute, in an offline fashion, an $N$-step Forward Reachable Set (FRS) that captures all possible outcomes for the outputs from a given initial state and set of control parameters for up to $N$ steps. Subsequently, Section IV-B illustrates how to impose the set inclusion constraints as inequality constraints.
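The recursive use of (16) can be illustrated with a simple interval propagation. This is only a sketch under strong assumptions: the SBM prediction for the next step is taken to start from the RABBIT output, and the maps $y \mapsto f_{\hat{y}_1}(y, P) + \underline{B}_1(y, P)$ and $y \mapsto f_{\hat{y}_1}(y, P) + \overline{B}_1(y, P)$ are assumed to be nondecreasing in $y$, so that interval endpoints map to interval endpoints. The function name and the toy dynamics are hypothetical.

```python
def propagate_y1_interval(lo, hi, P_seq, f1, B1_lo, B1_hi):
    """Propagate an interval containing y1 through a sequence of control parameters."""
    intervals = []
    for P in P_seq:
        # Both endpoints are updated from the previous interval simultaneously.
        lo, hi = f1(lo, P) + B1_lo(lo, P), f1(hi, P) + B1_hi(hi, P)
        intervals.append((lo, hi))
    return intervals

# Toy affine dynamics with constant error bounds of +/- 0.1.
boxes = propagate_y1_interval(
    1.0, 1.0, [0.5, 0.5],
    f1=lambda y, P: 0.5 * y + P,
    B1_lo=lambda y, P: -0.1,
    B1_hi=lambda y, P: 0.1,
)
```

The intervals widen with every step, which is one reason the paper replaces this online recursion with a reachable set computed offline.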

### IV-A Forward Reachable Set

Letting $\mathcal{Y}_1$ be compact, we define the $N$-step FRS of the output:

###### Definition 4.

The $N$-step FRS of the output beginning from $(y_1(i), P(i)) \in \mathcal{Y}_1 \times \mathcal{P}$ for $i \in \mathbb{N}$ and for $N \in \mathbb{N}$ is defined as

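$$W_N(y_1(i), P(i)) := \bigcup_{n=i+1}^{i+N} \Big\{ y_1(n) \in \mathcal{Y}_1 \ \Big|\ \exists P(i+1), \ldots, P(n-1) \in \mathcal{P} \text{ such that } \forall j \in \{i, \ldots, n-1\},\ y_1(j+1) \text{ is generated by the RABBIT dynamics from } y_1(j) \text{ under } P(j) \Big\} \tag{25}$$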

In other words, given a fixed output $y_1(i)$ and the current control parameter $P(i)$, the FRS captures all the outputs that can be reached within $N$ steps, provided that all subsequent control parameters are contained in a set $\mathcal{P}$. The following result follows from the previous definition:

###### Lemma 5.

$$W_M(y_1(i), P(i)) \subseteq W_N(y_1(i), P(i)) \quad \forall\, 1 \leq M \leq N \tag{26}$$
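Definition 4 and Lemma 5 can be illustrated on a toy system with finitely many outputs and control parameters. In the sketch below the successor relation is a hypothetical stand-in for the RABBIT dynamics, and `frs` simply enumerates every output reachable within $N$ steps; it is not the method used in the paper, which outer-approximates the FRS with an optimization program.

```python
def frs(y0, P0, N, P_set, successors):
    """Brute-force N-step reachable set from (y0, P0) for a finite toy system."""
    layer = set(successors(y0, P0))   # outputs reachable in exactly one step
    reach = set(layer)
    for _ in range(N - 1):
        # Later steps may use any control parameter in P_set.
        layer = {y_next for y in layer for P in P_set
                 for y_next in successors(y, P)}
        reach |= layer
    return reach

def toy_successors(y, P):
    """Hypothetical uncertain dynamics: y + P or y + P + 1, capped at 5."""
    return {v for v in (y + P, y + P + 1) if v <= 5}

W1 = frs(0, 1, 1, {0, 1}, toy_successors)
W2 = frs(0, 1, 2, {0, 1}, toy_successors)
```

Here `W1` is contained in `W2`, which is the nesting property stated in (26).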

To compute an outer approximation of the FRS, one can solve the following infinite-dimensional linear problem over the space of functions:

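$$\begin{aligned} \inf_{w_N, v_1, \cdots, v_N} \quad & \int_{\mathcal{Y}_1 \times \mathcal{P} \times \mathcal{Y}_1} w_N(x_1, x_2, x_3) \, d\lambda_{\mathcal{Y}_1 \times \mathcal{P} \times \mathcal{Y}_1} \\ \text{s.t.} \quad & v_1(x_1, x_2, x_3) \geq 0, && \forall x_3 \in B(x_1, x_2), \ \forall (x_1, x_2) \in \mathcal{Y}_1 \times \mathcal{P}, \\ & v_{\zeta+1}(x_1, x_2, x_4) \geq v_\zeta(x_1, x_2, x_3), && \forall \zeta \in \{1, 2, \cdots, N-1\}, \ \forall x_4 \in B(x_3, x_5), \ \forall (x_1, x_2, x_5) \in \mathcal{Y}_1 \times \mathcal{P} \times \mathcal{P}, \\ & w_N(x_1, x_2, x_3) \geq 0, && \forall (x_1, x_2, x_3) \in \mathcal{Y}_1 \times \mathcal{P} \times \mathcal{Y}_1, \\ & w_N(x_1, x_2, x_3) \geq v_\zeta(x_1, x_2, x_3) + 1, && \forall \zeta = 1, 2, \cdots, N, \ \forall (x_1, x_2, x_3) \in \mathcal{Y}_1 \times \mathcal{P} \times \mathcal{Y}_1 \end{aligned} \tag{FRSopt}$$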

where the sets $\mathcal{Y}_1$ and $\mathcal{P}$ are given, and the infimum is taken over an $(N+1)$-tuple of continuous functions $(w_N, v_1, \cdots, v_N)$. Note that only the SBM's dynamics appear in this program via $B$.

Next, we prove that the FRS is contained in the 1-superlevel set of all feasible $w_N$'s in (FRSopt):

###### Lemma 6.

Let $(w_N, v_1, \cdots, v_N)$ be feasible functions to (FRSopt), then for all $(y_1(i), P(i)) \in \mathcal{Y}_1 \times \mathcal{P}$

$$W_N(y_1(i), P(i)) \subseteq \{x_3 \in \mathcal{Y}_1 \mid w_N(y_1(i), P(i), x_3) \geq 1\}. \tag{27}$$

###### Proof.

Let be feasible functions to (FRSopt). Substitute an arbitrary and into and , respectively. Suppose , then there exists a natural number and a sequence of control parameters , such that for all and .

We prove the result by induction. Let . It then follows from the first constraint of (FRSopt) that . Now, suppose for some . In the second constraint of (FRSopt), let , , and , then . By induction, we know . Using the fourth constraint of (FRSopt), let , and we get . Therefore . ∎

Though we do not describe it here due to space restrictions, a feasible polynomial solution to (FRSopt) can be computed offline by making compact approximation of and applying Sums-of-Squares programming [zhao2017control, mohan2017synthesizing].

### IV-B Set Inclusion

To ensure safe walking through $N$ steps beginning at step $i$, we require several set inclusions to be satisfied during online optimization. First, we require that , which ensures that (21) is satisfied. Since we cannot compute $W_N$ exactly, we instead can require that the $1$-superlevel set of $w_N$ is a subset of ; however, this set inclusion is difficult to enforce using MPC. Instead we utilize the following theorem, which follows as a result of the $\mathcal{S}$-procedure technique described in Section 2.6.3 of [boyd1994linear] and Lemma 6:

###### Theorem 7.

Let $(w_N, v_1, \cdots, v_N)$ be feasible functions to (FRSopt) and $W_N$ be as in Definition 4. Let $s_1, s_2$ be functions that are non-negative everywhere. Suppose $\ell$ satisfies the following inequality

$$s_1(x_1, x_2, x_3) \cdot x_3 - \ell(x_1, x_2) - s_2(x_1, x_2, x_3) \cdot (w_N(x_1, x_2, x_3) - 1) \geq 0 \tag{28}$$

for every . Then for any and , if , then .

Given a feasible solution to (FRSopt), one can construct polynomial functions and offline that satisfy Theorem 7 using Sums-of-Squares programming [zhao2017control, mohan2017synthesizing].
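The way inequality (28) encodes a set inclusion can be seen by rearranging it. What follows is only a sketch of the standard S-procedure argument, reading (28) with a minus sign in front of the $s_2$ term and assuming in the last step that $s_1$ is strictly positive at the point considered. Rearranging gives

$$s_1(x_1, x_2, x_3) \cdot x_3 \ \geq\ \ell(x_1, x_2) + s_2(x_1, x_2, x_3) \cdot (w_N(x_1, x_2, x_3) - 1).$$

If $\ell(x_1, x_2) \geq 0$ and $x_3$ lies in the $1$-superlevel set of $w_N$, so that $w_N(x_1, x_2, x_3) - 1 \geq 0$, then the right-hand side is non-negative because $s_2 \geq 0$. Hence $s_1 \cdot x_3 \geq 0$, and therefore $x_3 \geq 0$ wherever $s_1 > 0$. By Lemma 6 the $1$-superlevel set contains $W_N$, so a single scalar test on $\ell$ at run-time certifies a property of every output in the FRS.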

Similarly, we can utilize the following theorem to construct polynomial functions offline that allow us to verify whether safe $N$-step walking is feasible.

###### Theorem 8.

For each , suppose

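1. $s^{y_2}_{\zeta,1}, s^{y_2}_{\zeta,2}$ are functions that are non-negative everywhere and there exist functions $\ell^{y_2}_{\zeta}$ that satisfy the following inequality

$$s^{y_2}_{\zeta,1}(x_1, x_2, x_3) \cdot \left(\pi - f_{\hat{y}_2}(x_3) - \overline{B}_2(x_1, x_2, x_3)\right) - s^{y_2}_{\zeta,2}(x_1, x_2, x_3) \cdot x_2 - \ell^{y_2}_{\zeta}(x_1, x_3) \geq 0, \tag{29}$$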

for every . Then for each and , if and , then .

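2. $s^{y_3}_{\zeta,1}, s^{y_3}_{\zeta,2}$ are functions that are non-negative everywhere, $\epsilon$ is a small positive number, and there exist functions $\ell^{y_3}_{\zeta}$ that satisfy the following inequality

$$s^{y_3}_{\zeta,1}(x_1, x_2) \cdot \left(f_{\hat{y}_3}(x_1, x_2) + \underline{B}_3(x_1, x_2) - \pi/2 - \epsilon\right) - s^{y_3}_{\zeta,2}(x_1, x_2) \cdot x_1 - \ell^{y_3}_{\zeta}(x_2) \geq 0 \tag{30}$$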

for every . Then for each and , if and , then .

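3. $s^{y_4}_{\zeta,1}, s^{y_4}_{\zeta,2}$ are functions that are non-negative everywhere, $\epsilon$ is a small positive number, and there exist functions $\ell^{y_4}_{\zeta}$ that satisfy the following inequality

$$s^{y_4}_{\zeta,1}(x_1, x_2) \cdot \left(3\pi/2 - \epsilon - f_{\hat{y}_4}(x_1, x_2) - \overline{B}_4(x_1, x_2)\right) - s^{y_4}_{\zeta,2}(x_1, x_2) \cdot x_1 - \ell^{y_4}_{\zeta}(x_2) \geq 0 \tag{31}$$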

for every . Then for each and , if and , then .

One can construct polynomial functions and offline that satisfy Theorem 8 using Sums-of-Squares programming [zhao2017control, mohan2017synthesizing]. As we describe next, these functions allow us to represent the set inclusion conditions as inequality constraints that are amenable to online optimization.

## V Model Predictive Control Problem

We use an MPC framework to select a gait parameter for RABBIT by solving the following nonlinear program:

 minP(i)⋮P(i+N−1) r(y(i),P(i),P(i+1),⋯,P(i+N−1))(OL) s.t. ℓ(y1(i),P(i))≥0, f^y2(P(i))+¯¯¯¯B2(P(i−1),y1(i),P(i))≤π, f^y3(y1(i),P(i))+B––3(y1(i),P(i))>π/2, f^y4(y1(i),P(i))+¯¯¯¯B4(y1(i),P(i))<3π/2, ℓy2ζ(P(i+ζ−1),P(i+ζ))≥0,