Rolling in the Deep – Hybrid Locomotion for Wheeled-Legged
Robots using Online Trajectory Optimization
Wheeled-legged robots have the potential for highly agile and versatile locomotion. The combination of legs and wheels might be a solution for any real-world application requiring rapid, and long-distance mobility skills on challenging terrain. In this paper, we present an online trajectory optimization framework for wheeled quadrupedal robots capable of executing hybrid walking-driving locomotion strategies. By breaking down the optimization problem into a wheel and base trajectory planning, locomotion planning for high dimensional wheeled-legged robots becomes more tractable, can be solved in real-time on-board in a model predictive control fashion, and becomes robust against unpredicted disturbances. The reference motions are tracked by a hierarchical whole-body controller that sends torque commands to the robot. Our approach is verified on a quadrupedal robot that is fully torque-controlled, including the non-steerable wheels attached to its legs. The robot performs hybrid locomotion with different gait sequences on flat and rough terrain. In addition, we validated the robotic platform at the Defense Advanced Research Projects Agency (DARPA) Subterranean Challenge, where the robot rapidly maps, navigates, and explores dynamic underground environments.
Legged robots offer the possibility of negotiating challenging environments and thus, are versatile platforms for various types of terrains [bellicoso2018jfr]. In research and industry, there is an emphasis on replicating nature to improve the hardware design and algorithmic approach of robotic systems [eckert2015comparing, nyakatura2019reverse]. Even with extensive research, matching the locomotion skills of conventional legged robots to their natural counterparts remains elusive. In contrast, wheels offer a chance to extend some capabilities, particularly speed, of these legged robotic systems beyond those of their natural counterparts, which can be crucial for any task requiring rapid and long-distance mobility skills in challenging environments. With this motivation, the central contribution of this work involves locomotion planning on a wheeled-legged robot to perform dynamic hybrid walking-driving motions on various terrains, as shown in Fig. 1.
I-a Related Work
Finding control policies for performing walking motions in an articulated mobile robot is an involved task because of the system’s many degrees of freedom (DOF) and its nonlinear dynamics. This demands substantial computational power and introduces the challenge of overcoming local minima, making on-the-fly computations hard. Therefore, the online generation of optimal solutions for dynamic motions has been an active research area for conventional legged robots. Methods like trajectory optimization (TO) and model predictive control (MPC) are pervasive and recommended in the literature for aiding robots to be reactive against external disturbances and modeling errors.
In the literature concerning wheeled-legged robots, hybrid walking-driving motions are scarce. The focus is mostly on statically-stable driving motions where the legs are used for active suspension alone [reid2016actively, giordano2009kinematic, cordes2014active, giftthaler2017efficient, suzumura2014real, grand2010motion]. These applications do not show any instance of wheel lift-offs, and hence sophisticated motion planning for the wheels is unnecessary and hence usually skipped.
Agile motions over steps and stairs were demonstrated for the first time in our previous work [bjelonic2019keep] where the motion trajectories were tracked by a hierarchical whole-body controller (WBC) that included the rolling conditions associated with the wheels. The robot was able to execute walking and driving motions, but not simultaneously due to the missing wheel trajectory generation. The work in [viragh2019trajectory] extends the approach by computing base and wheel trajectories in a single optimization framework. This approach, however, decreased the update rate to 50 Hz, and no hybrid walking-driving motions were shown on the real robot.
CENTAURO, a wheeled-legged quadruped with a humanoid upper-body, performs a walking gait with automatic footstep placement using a linear MPC framework [laurenzi2018quadrupedal]. The authors, however, only perform walking maneuvers without making use of the wheels. Among the robots that employ hybrid walking-driving motions, Jet Propulsion Laboratory’s (JPL) Robosimian uses a TO framework [bellegarda2019trajectory], but for passive wheels and results are only shown in a simulation. Skaterbots [geilinger2018skaterbots] provide a generalized approach to motion planning by solving a nonlinear programming (NLP) problem. This approach, however, is impractical to update online in a receding horizon fashion, i.e., in a MPC fashion, due to excessive computational demand.
Given the state of the art, we notice a research gap in trajectory generation methods for hybrid walking-driving motions on legged robots with actuated wheels, which can be both reliable on various terrains and be used on-the-fly. Fortunately, research in traditional legged locomotion offers solutions to bridge this gap. The quadrupedal robot ANYmal (without wheels) performs highly dynamic motions using MPC [bellicoso2017dynamic, grandia2019feedback] and TO [winkler2018gait, carius2019trajectory] approaches. Impressive results are shown by MIT Cheetah, which performs blind locomotion over stairs [di2018dynamic] and jumps onto a desk with the height of 0.76 m [nguyen2019optimized]. The quadrupedal robot HyQ shows an online, dynamic foothold adaptation strategy based on visual feedback [magana2019fast]. Therefore, we conjecture that extending these approaches to wheeled-legged systems can aid in producing robust motions.
We present an online TO framework for wheeled-legged robots capable of running in a MPC fashion by breaking the problem down into separate wheel and base TOs. For dynamically-consistent motions, our wheel TO takes the rolling constraints of the wheels into account, while our base TO accounts for the robot’s balance during locomotion using the idea of the zero-moment point (ZMP) [vukobratovic2004zero]. A hierarchical WBC [bjelonic2019keep] tracks these motions by computing torque commands for all joints. Our hybrid locomotion framework extends the capabilities of wheeled-legged robots in the following ways:
Our framework is versatile over a wide variety of gaits, such as, pure driving, statically stable gaits, dynamically stable gaits, and gaits with full-flight phases.
We generate wheel and base trajectories for hybrid walking-driving motions in the order of milliseconds. Thanks to these fast update rates, the resulting motions are robust against unpredicted disturbances, making real-world deployment of the robot feasible. Likewise, we demonstrate the performance of our system at the DARPA Subterranean Challenge, where the robot autonomously maps, navigates, and searches dynamic underground environments.
Ii Motion Planning
The whole-body motion planner is based on a task synergy approach [farshidian2017planning] which decomposes the optimization problem into wheel and base TOs. By breaking down the problem into these two tasks, we hypothesize that the problem of locomotion planning for high-dimensional (wheeled-)legged robots becomes more tractable. The optimization can be solved in real-time in a MPC fashion. With high update rates, the locomotion is able to cope with unforeseen disturbances.
The main idea behind our approach is visualized in Fig. 2. Given a fixed gait pattern and the reference velocities 111The reference velocities are generated from an external source, e.g., an operator device, or a navigation planner as used in the DARPA Subterranean Challenge. with respect to (w.r.t.) the robot’s base frame as shown in Fig. 3, i.e., the linear velocity vector of its center of mass (COM) and the angular velocity vector , desired motion plans are generated in two steps, where the wheel TO is followed by a base TO which satisfies the ZMP [vukobratovic2004zero] stability criterion. The latter simplifies the system dynamics for motion planning of the COM to enable real-time computations on-board. Finally, a controller tracks these motion plans by generating torque commands which are sent to the robot’s motor drives. Due to this decomposition of the locomotion problem, the wheel TO, the base TO, and the tracking controller can run in parallel.
The following two sections discuss the main contribution of our work and show how the locomotion of the independent wheel and base TOs are synchronized to generate feasible motion plans.
Iii Wheel Trajectory Optimization
We formulate the task of finding the wheel trajectories, i.e., the , and trajectories w.r.t. a wheel coordinate frame as illustrated in Fig. 3, as a quadratic programming (QP) problem given by
where is the vector of optimization variables. The quadratic objective is minimized while respecting the linear equality and inequality constraints. In the following, the parameterization of the optimization variable is presented, and we introduce each of the objectives, equality constraints and inequality constraints which form the optimization problem.
Iii-a Parameterization of Optimization Variables
We describe the wheel trajectories as a sequence of connected splines. In our implementation, one spline is allocated for segments where the wheel is in contact with the ground, and two splines are used for describing the trajectory of the wheels in air. Therefore, the total number of splines for one gait sequence is (see Fig. 3). These two types of trajectory segments, i.e., corresponding to leg in air and in contact, are defined by different parameterizations as described next.
Iii-A1 Wheel segments in air
We parameterize each coordinate of the wheel trajectory in air as quintic splines. Thus, the position vector at spline segment is described by
where and contains the polynomial coefficients. Here, describes the time interval of spline with a duration of , where is the sum of all the previous splines’ durations (see example of the fourth spline in Fig. 3). We seek to optimize the polynomial coefficients for all coordinates of spline segment and hence contain them in the vector .
Iii-A2 Wheel segments in contact
As shown in our previous work [viragh2019trajectory], we employ a different parameterization for wheel segments in contact, such that they inherently capture the velocity constraints corresponding to no-lateral-slip of the wheel. For this purpose, we represent the wheel’s velocity in the coordinate, i.e., the rolling direction, as a quadratic polynomial, while the velocities of the remaining directions are set to zero. Thus, the velocity vector of the -th spline is
and the position vector is obtained by integrating w.r.t. and adding the initial position and of the trajectory as
where the rotation matrix describes the change in the wheel’s orientation caused by the reference yaw rate, i.e., the vector . By assuming a constant reference yaw rate over the optimization horizon, the integration is solved analytically, giving a linear expression w.r.t. the coefficients . Thus, the velocity and acceleration trajectories of spline are described by and , respectively.
Iii-B Formulation of Trajectory Optimization
To achieve robust locomotion, we deploy an online TO which is executed in a MPC fashion, i.e., the optimization is continuously re-evaluated providing a motion over a time horizon of seconds, where can be chosen as the stride duration of the locomotion gait.
The complete TO of the wheel trajectories is formulated as a QP problem as follows,
|if leg in contact:|
|if leg in air:|
where each element is described in more detail in the following sections.
The following sections describe the cost terms introduced in (5) in detail.
Iii-C1 Acceleration minimization
The acceleration of the entire wheel trajectory is minimized to generate smooth motions. The cost term for a wheel in air over the time duration of spline is given by
where is the hessian matrix, and is the corresponding weight matrix. Here, the linear term of (1) is null, i.e., . Similar, for a spline segment in contact, the hessian matrix, , is obtained by squaring and integrating the acceleration of the wheel trajectory over the time duration . Here, the time matrix , and hence, is dependent on the reference yaw rate as discussed in (4).
Iii-C2 Minimize deviations from previous solution
For a TO with high update rate, large deviations between successive solutions can produce quivering motions. To avoid this, we add a cost term that penalizes deviations of kinematic states between consecutive solutions. We penalize the position deviations between the optimization variables from the current solution and the previous solution as
where is the position vector of the wheel from the previous solution shifted by the elapsed time since computing the last solution, and is the corresponding weight matrix. This cost is penalized over the time horizon with sampling points, where is the time at time step and . Objectives for minimizing velocity and acceleration deviations are added in a similar formulation.
Iii-C3 Track reference velocity of wheels in contact
As shown in (3), the velocity along the rolling direction of the wheel trajectory is described by a quadratic polynomial which inherently satisfies the no-slip constraint. To track the reference velocity , we minimize the norm which gives
Iii-C4 Minimize deviations from default wheel positions
When a wheel is in contact, differences in heading velocities of the wheels and the base can lead to configurations where the corresponding leg can get extended in the forward or backward direction. To guide the optimizer towards solutions within a desired leg configuration, we minimize the distance of the wheel from a default position along the rolling direction as
where is the corresponding weight, and the sampling over the -th contact segment’s time duration is the same as shown in the paragraph below (7).
Iii-C5 Foothold projection
The placement of the wheel after a swing phase is crucial for hybrid locomotion (and for legged locomotion in general) because it contributes to maintaining balance and reacting to external disturbances. As shown in (5), the cost term to guide the foothold placement is given by , where is the weight matrix, and is the touchdown time of spline segment in air, i.e., at the end of the spline in air representing the second half of the swing phase (see Fig. 3). The subscript indicates that only footholds on the terrain plane are considered, i.e., the component is given by the height of the terrain estimation.
The position vector guides the locomotion depending on the reference velocity given by the linear velocity vector and the angular velocity vector as
where is a specified default wheel position similar to (9), and is the position vector from the robot’s COM to the projection of the measured wheel position onto the terrain plane.
Decoupling the locomotion problem into wheel and base TOs requires an additional heuristic to maintain balance. Balancing is achieved by adding a feedback term to the foothold obtained from reference velocities, through an inverted pendulum model [gehring2016practice, raibert1986legged] given by
where and are the reference and the measured velocity between the associated hip and base frame, respectively. Here, is the height of the hip above the ground, represents the gravitational acceleration, and is the gain for balancing.
Iii-C6 Swing height
with , and is the time at maximum swing height of spline segment in air, i.e., at the end of the spline in air representing the first half of the swing phase (see Fig. 3).
Similarly, we set the and coordinates of the swing trajectory at maximum swing height to match the midpoint of lift-off and touch-down position.
Iii-D Equality Constraints
The following sections describe the equality constraints introduced in (5) in more detail.
Iii-D1 Initial states
To achieve a reactive behaviour, every optimization is initialized with the current state of the robot. As discussed in (4), the initial position of the wheel segments in contact are set as equality constraints given by
where the initial values and are the measured positions of the wheel.
If the optimization problem begins with a wheel trajectory in air, we set the initial position, velocity, and acceleration to the measured state of the wheels, i.e., , , and .
Iii-D2 Spline continuity
We constrain the position, velocity and acceleration at the junction of the two wheel trajectory segments and in air as
Junction constraints between air and contact phases are only formulated on position and velocity level. Here, the acceleration is not constrained so that the optimizer accepts abrupt changes in accelerations, allowing lift-off and touch-down events.
Iii-E Inequality Constraints
The following section describes the inequality constraint introduced in (5) in more detail.
Iii-E1 Avoid kinematic limits
To avoid over-extension of the legs, we keep the wheel trajectories in a kinematic feasible space which is approximated by a rectangular cuboid centered around the default positions as in (9). As introduced in (5), the kinematic limits , , and are enforced over the full time horizon as , , , with a fixed sampling time similar to (7).
Iv Base Trajectory Optimization
The online TO of the base motion relies on a ZMP [vukobratovic2004zero]-based optimization which continuously updates reference trajectories for the free-floating base. Here, we extend the approach shown in our previous work [bjelonic2019keep] which originates from the motion planning problem of traditional legged robots [bellicoso2017dynamic]. Given the wheel TO in (5), we are now able to generalize the idea of the ZMP to wheeled-legged systems taking into account the moving contact points when in contact.
As shown in Figure 2, the motion planner of the free-floating base is described by a nonlinear optimization problem, which minimizes a nonlinear cost function subjected to nonlinear equality and inequality constraints . Here, the vector of optimization variables is composed of the position of the COM and the yaw-pitch-roll Euler angles of the base .
Iv-a Parameterization of Optimization Variables
The trajectories for each DOF of the free-floating base is represented as a sequence of quintic splines, which allows to set position, velocity and acceleration constraints. Thus, the parameterization is formulated similar to the definition of the wheel trajectories in air given in Section III-A1.
Iv-B Formulation of Trajectory Optimization
The online TO of the base has a similar setup as the TO described in (5). Cost terms are added to maintain smooth motions, and to track the reference velocity. The equality constraints initialize the variables with the current measured state of the base, and add junction constraints between consecutive splines. For balancing, we add a ZMP inequality constraint, which is described in more detail in the next section, since this is the only part of the base optimization problem which is affected by the computed wheel trajectories in Section III. A complete list of each objective and constraint can be obtained in [bjelonic2019keep].
Iv-C Generalization of ZMP Inequality Constraint
The ZMP position is constrained to lie inside the support polygon222A support polygon is defined by the convex hull of the expected wheels’ contact locations. throughout the optimization horizon to ensure dynamic stability of the robot [bellicoso2017dynamic]. This nonlinear inequality constraint is given by
where [sardain2004forces] and is the the terrain normal. The gravito-inertial wrench [caron2017zmp] is given by and , where is the mass of the robot, is the angular momentum of the COM, and is the gravity vector. In contrast to [bellicoso2017dynamic, bjelonic2019keep], the line coefficients that describe an edge of a support polygon depend on time , since the contact points of wheeled-legged robots continue to move even when a leg is in contact, unlike conventional legged robots. The ZMP inequality constraint is sampled over the time horizon with a fixed sampling time .
V Experimental Results and Discussion
To validate the performance of our hybrid locomotion framework, this section reports experiments and real-world applications conducted on ANYmal equipped with non-steerable, torque-controlled wheels (see Fig. 1). A video333Available at https://youtu.be/ukY0vyM-yfY showing the results accompanies this paper.
The wheel TO, base TO, tracking controller, and state estimator are running on a single PC (Intel i7-7500U, 2.7 GHz, dual-core 64-bit). All computation regarding the autonomy, i.e., perception, mapping, localization, path planning, path following, and object detection, is carried out by three different PCs. The robot is fully self-contained in terms of computation and perception.
As shown in our previous work [bjelonic2019keep], the computed trajectories in Section III and Section IV are tracked by a hierarchical WBC which generates torque commands for each actuator by accounting for the full rigid body dynamics and physical constraints, i.e., non-holonomic rolling constraint, friction cone, and torque limits. The WBC runs together with state estimation [bloesch2018two] in a 400 Hz loop. Similar to [bloesch2013stateconsistent], we fuse the inertial measurement unit (IMU) reading and the kinematic measurements from each actuator to acquire the robot’s state. Moreover, the frame in Fig. 3 requires an estimate of the terrain normal. In this work, the robot is locally modeling the terrain as a three-dimensional plane, which is estimated by fitting a plane through the most recent contact locations [bjelonic2019keep].
We model and compute the kinematics and dynamics of the robot based on the open-source Rigid Body Dynamics Library (RBDL) [rbdl] which uses the algorithms described in [featherstone2014rigid]. The nonlinear optimization problem in Section IV is solved with a custom sequential quadratic programming (SQP) algorithm, which solves the problem by iterating through a sequence of QP problems. Each QP problem including the optimization problem in Section III is solved using QuadProg++ [quadProg], which internally implements the Goldfarb-Idnani active-set method [goldfarb1983numerically]. To maintain a positive definite Hessian in (1) and to ensure the convexity of the resulting QP problem, a regularizer is added to its diagonal elements, e.g., as in [bellicoso2017dynamic].
V-B Solver Time of Different Contact Scheduler and Gait Switching
As shown in Table I, the wheel and base optimizations are solved in the order of milliseconds, and a great variety of gaits from driving, i.e., all legs in contact, up to gaits with full-flight phases are possible. Besides, the accompanying video shows manual gait switches between driving and hybrid walking-driving gaits, which can be useful for future works regarding automatic gait switches.
|Gait||/ (s)||Wheel TO / (ms)||Base TO / (ms)|
|Hybrid running trot||0.64||0.58||5.77|
V-C Rough Terrain Negotiation
The robot is capable of blind locomotion in a high variety of unstructured terrains, e.g., inclines, steps, gravel, mud, and puddles. Fig. 1 and the accompanied video shows the performance of the robot in these kinds of environments.
V-D High Speed and Cost of Transport
On flat terrain, the robot achieves a mechanical cost of transport (COT) [bjelonic2018skating] of 0.2 while hybrid trotting at the speed of 2 m/s and the mechanical power consumption is 156 W. The COT is by a factor of two higher than a pure driving gait at the same speed. A comparison to traditional walking and skating with passive wheels [bjelonic2018skating] shows that the COT is lower by 42 % w.r.t. the traditional trotting gait and by 9 % w.r.t. skating motions.
V-E DARPA Subterranean Challenge: Tunnel Circuit
The first DARPA Subterranean Challenge, the Tunnel Circuit, was held close to Pittsburgh in the NIOSH mine. The main objective was to autonomously search, detect, and provide spatially referenced locations of artifacts inside the underground mine. The wheeled version of ANYmal participated in two runs as part of the CERBERUS team [cerberus] alongside flying and other mobile platforms. Moreover, the wheeled quadrupedal robot was deployed next to the traditional version of ANYmal without wheels.
As depicted in the lower images of Fig. 1, the terrain consisted of hilly, bumpy, and muddy terrain and in some parts of the mine, the robot needed to cross puddles. Throughout both runs, the robot traversed the terrain with a hybrid trot. In the first run, the wheeled version of ANYmal managed to traverse 70 m without major issues, and the robot successfully reported the correct location of one artifact. In the end, however, one of the wheels started slipping on the muddy terrain before the fall. As can be seen in the accompanying video, the robot manages to balance after the first slip because of our implementation of the inverted pendulum model in (11). The mechanical design was improved after the first run by adding a chain around the wheels to increase the friction coefficient while traversing the mud (see the right middle image of Fig. 1). Fig. 4 and 5 show the measured and desired trajectories of the COM and wheels for a few meters of the whole run. Here, it can be seen that the robot executes a hybrid trotting gait since during ground contact, the wheel moves along its rolling direction. The robot traveled for more than 100 m before it fell due to a hardware issue on one of the motors.
Due to the time limitation of the challenge, the speed of mobile platforms becomes an important factor. Most of the wheeled platforms shown from the other competing teams were faster than our traditional legged robot by a factor of two or more. The upcoming Urban Circuit of the Subterranean Challenge includes stairs and other challenging obstacles. Therefore, we believe, only a wheeled-legged robot is capable of combining speed and versatility. At the Tunnel Circuit, the wheeled version of ANYmal traversed with an average speed of 0.5 m/s which was more than double the average speed of the traditional legged system. Our chosen speed was limited by the update frequency of our mapping approach or otherwise could have traversed the entire terrain with much higher speeds without any loss in agility. On the whole, the performance validation for real-world applications is satisfying, and a direct comparison with the traditional ANYmal reveals the advantages of wheeled-legged robots.
This work presents an online TO for generating hybrid walking-driving motions on a wheeled quadrupedal robot. The optimization problem is broken down into wheel and base trajectory generation. The independent wheel and base TOs are synchronized to generate feasible motions by time sampling the prior generated wheel trajectories which form the support polygons of the ZMP inequality constraint of the base TO. The presented algorithm makes the locomotion planning for high dimensional wheeled-legged robots more tractable, enables us to solve the problem in real-time on-board in a MPC fashion, and increases the robustness in the robot’s locomotion against unforeseen disturbances.
To the best of our knowledge, this is the first time that a hybrid walking-driving robot is deployed for real-world missions at one of the biggest robotics competition. To improve the reliability of our newly developed platform, the hardware maturity needs to be increased, e.g., improve the grip of the wheels. Additionally, an algorithmic approach with a higher-level intelligence to judge when to switch between a pure driving gait and a hybrid walking-driving gait can be beneficial. The robot uses, throughout the whole mission, a hybrid trotting gait and the switching to a pure driving gait could increase the speed and robustness of the locomotion.
The authors would like to thank the CERBERUS team consisting of University of Nevada, Reno, ETH Zurich, Sierra Nevada Corporation, University of California, Berkeley, and Flyability for helping to integrate the robot’s autonomy. Special thanks in this regards go to Eris Sako, Markus Stäuble, Giorgio Valsecchi, Marco Tranzatto, Fabian Tresoldi, Fabian Jenelten, Russell Buchanan, Takahiro Miki, Lorenz Wellhausen, Jan Carius, and Samuel Zimmermann from the Robotic Systems Lab of ETH Zürich.