# Describing Physics For Physical Reasoning: Force-based Sequential Manipulation Planning

## Abstract

Physical reasoning is a core aspect of intelligence in animals and humans. A central question is what model should be used as a basis for reasoning. Existing work considered models ranging from intuitive physics and physical simulators to contact dynamics models used in robotic manipulation and locomotion. In this work we propose path descriptions of physics which directly allow us to leverage optimization methods to solve planning problems, using multi-physics descriptions that enable the solver to mix various levels of abstraction and simplifications for different objects and phases of the solution. We demonstrate the approach on various robot manipulation planning problems, such as grasping a stick in order to push or lift another object to a target, shifting and grasping a book from a shelf, and throwing an object to bounce towards a target.

## I Introduction

Reasoning is an essential form of generalization in AI systems, implying decision making competences in situations that are not part of the training data. Understanding the structure of reasoning problems in the real world therefore yields important insights into what kind of structure we might want to impose on learning systems for strong in-built generalization.

In this work we aim to contribute towards general-purpose physical reasoning, by which we mean performing inference over unknowns or controls given a model of physics, and constraints or objectives on (future) configurations. We believe that a core question is how to model physics for the purpose of physical reasoning. In other words, perhaps the pressing challenge is not to develop algorithmic solvers for any kind of forward model of physics, or any kind of physical simulator. Instead, the challenge is to formulate specific models and abstractions of physics that are appropriate for physical inference. We particularly touch on the issues of multi-physics descriptions of physics and exposing a logic of physical interaction.

What are appropriate models of physics for enabling physical reasoning? Physics is usually described as a differential equation and simulation as numerical forward integration. In such forward descriptions of physics, reasoning becomes a problem of inverting physics, inferring decisions that lead to desired future configurations.

However, in the case of contacts, the forward model is often itself described as a mathematical program, e.g. via a linear complementarity problem [2, 8] or Gauss' principle [27]. This is no issue for forward integration in simulators, and can also be made (piece-wise!) differentiable based on classical sensitivity analysis of NLP solutions [9, 18, 14]. But when translated to path constraints, this implies an equality constraint that itself contains a local mathematical program, which is known as bi-level optimization and makes long-term physical reasoning hard. Posa et al. [23] thoroughly discussed the benefits of direct path optimization over bi-level optimization, and we fully follow their argument.

Is there just one correct model of physics for reasoning? The sciences describe physics on many levels of simplification: quantum field dynamics, fluid dynamics, rigid body Newtonian physics, quasi-static physics, toy-like physics as in some games, and intuitive conceptions of physics we find in humans [25]. It would be too limiting for physical reasoning to make use of only one particular abstraction of physics, or one particular physical simulator. Complex reasoning requires the reasoning process to deliberately apply different levels of simplification for different aspects of inference. One approach to enable this is to describe physics as if it switched laws: as if physics itself decided that certain objects sometimes follow Newtonian laws, while others follow blocks-world pick-and-place laws, and yet others follow quasi-static equations. We use the term multi-physics to refer to this approach, a term only loosely borrowed from numerical simulation science and engineering. In our sense, a multi-physics description of physics enables physical reasoning that leverages different levels of simplification.

The contributions of this work are as follows:

1) We propose (multi-physics) models of physical interaction for general physical reasoning, which include and integrate general force based interactions, quasi-static dynamics, and pick-and-place type interaction modes.

2) We integrate these descriptions in a path optimization framework that takes a skeleton (a logic specification of the sequenced interactions) as input and tries to solve for a physically feasible and optimal path [31, 30, 29, 28]. Correct paths mix interaction modes of different abstraction levels for different object pairs in different time intervals, and we optimize full manipulation sequences across such switches in description.

3) We demonstrate the approach on sequential dynamic and quasi-static manipulation problems, such as pushing with a stick, lifting a ring with a stick, toppling over a box, and sliding a book from a shelf before grasping. The breadth of tasks highlights the generality of the formulation.

## II Related Work

A large body of work considers robotic control of a single manipulation phase, such as pushing an object to a desired pose [20, 4]. The focus of this existing work is controller synthesis, which is beyond the scope of this work. Our work extends such approaches to include sequential manipulation across several interactions and using multi-physics descriptions for different phases.

Path descriptions of physics for trajectory optimization through contact interactions have previously been considered for footstep planning, dynamic locomotion, and manipulation [6, 23], where only one fundamental type of interaction (complementary contacts) is modeled. Related to this, [21] demonstrate impressive sequential manipulation plans based on contact-invariant optimization. Note that the particular model used in [21] does not model sliding interactions and stable grasps. We review the approaches in Sec. III. Our work extends this to more general manipulation scenarios leveraging a multi-physics description and using a logical skeleton representation for categorical decisions. Note that [21] also discusses the idea of using contact activation schemes as a scripting language for a user, which is related to our skeleton representation but would not allow mixing multi-physics descriptions.

A core question in describing force based interactions in path optimization is how precisely decision variables are introduced to represent wrench exchange. Fazeli et al. [8] considered general transmission of wrenches through contact patches or multiple contacts. However, in a path optimization setting, the number of contact points depends on the current geometry and is variable throughout optimization. Xie et al. [34] introduced the concept of the equivalent contact point, which subsumes wrenches exchanged via line or surface contacts into a single contact point. We largely adopt this idea and detail the approach below.

Most existing Task and Motion Planning (TAMP) approaches build on a discretization of the configuration space or action parameter space [19, 17, 16] and/or sample-based planners [26, 5, 1]. Our own previous work [28] proposed Logic-Geometric Programming (LGP), an optimization-based approach that can also plan tool-use and dynamic interactions using simplified Newtonian equations and impulse exchange. However, the particular physics descriptions used in this previous LGP formulation were not general enough to enable broader physical reasoning. In particular, our previous work was missing force-based contact models, proper Newton-Euler equations, and quasi-static variants. The present paper proposes exactly such extensions and thereby aims to consider more generally what are appropriate building blocks in a multi-physics approach to sequential manipulation planning. We focus on the modeling questions and our experiments solve manipulation problems for a given skeleton. Therefore, what is proposed here is only a component of a complete physical TAMP solver that also searches over skeletons, such as our Multi-Bound Tree Search [29].

One of our demonstration scenarios is inspired by [13], which leverages machine learning methods to enable real-time MPC control through planar manipulation interactions. Fig. 1 in [13] describes a scenario where a book is pulled from a shelf to enable a subsequent stable grasp, which exploits different contact modalities and motivates our multi-physics description. We consider this scenario in Section VI-B4.

Tool-use in animals and humans was described, e.g., in [15, 33, 22]. With our work we aim to provide computational methods to enable such reasoning. General physical reasoning in animals and humans is studied and discussed in [12, 32, 3, 24]. Particularly interesting is the discussion of simplistic and intuitive models of physics [25] that one might consider as being “incorrect”, but which are effective heuristics for real-world reasoning and decision making. This motivated our approach of enabling multi-physics descriptions within a coherent planning framework.

## III Background on Complementarity Formulations and Equivalent Contact Points

A very general formulation of contacts is based on complementarity. To have a concrete reference, we briefly review the formulation of [23] for 2D contacts, which is then generalized to 3D. In the 2D case, a (potential) contact introduces decision variables $(\lambda_x^+, \lambda_x^-, \lambda_z)$ that define the force $f = (\lambda_x^+ - \lambda_x^-, \lambda_z)$, and a slack variable $\gamma$ for the slip velocity. Further, we assume we can differentiably evaluate the distance $\phi(q)$ for the configuration $q$, and the geometric slip velocity $\psi(q, \dot q)$ for $(q, \dot q)$. The complementarity formulation imposes constraints

$$\begin{aligned}
\phi(q) &\ge 0 && \text{(no collisions (8))} \\
\lambda_z, \lambda_x^+, \lambda_x^-, \gamma &\ge 0 && \text{(positive forces (9))} \\
\phi(q)\, \lambda_z &= 0 && \text{(force complementarity (13))} \\
\mu \lambda_z - \lambda_x^+ - \lambda_x^- &\ge 0 && \text{(friction cone)} \\
(\mu \lambda_z - \lambda_x^+ - \lambda_x^-)\, \gamma &= 0 && \text{(slip complementarity (14))} \\
\gamma + \psi(q, \dot q) \ge 0, \quad \gamma - \psi(q, \dot q) &\ge 0 && \text{(true slip is within slack (11,12))}
\end{aligned}$$

where $\mu$ is the friction coefficient and equation numbers refer to the original paper [23]. The force complementarity states that forces may be non-zero only at contact. Posa et al. [23] apply the same principle also to model the friction cone, where $\gamma$ may be non-zero only when the force is at the cone equality. Further equations tie the slip slack to the true slip depending on a left/right force. For their 3D extension, the parametrization $(\lambda_x^+, \lambda_x^-)$ is replaced by a polyhedral approximation of the friction cone, where the force must be a convex linear combination of the spanning vectors of the polyhedron.
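To make the complementarity conditions of this section concrete, the following is a minimal sketch that evaluates them for one 2D contact. The function and argument names (`phi` for the distance, `psi` for the slip velocity, `lam_*` for the force variables, `gamma` for the slip slack) and the tolerance are illustrative assumptions, not part of [23].

```python
def contact_conditions_2d(phi, psi, lam_z, lam_xp, lam_xm, gamma, mu, tol=1e-9):
    """Check the 2D complementarity conditions for a single contact.

    All names are hypothetical: phi = signed distance, psi = slip velocity,
    lam_z = normal force, lam_xp / lam_xm = positive / negative tangential
    force parts, gamma = slip slack, mu = friction coefficient.
    """
    cone = mu * lam_z - lam_xp - lam_xm  # margin to the friction cone boundary
    return {
        "no_collision": phi >= -tol,
        "positive_forces": min(lam_z, lam_xp, lam_xm, gamma) >= -tol,
        "force_complementarity": abs(phi * lam_z) <= tol,
        "friction_cone": cone >= -tol,
        "slip_complementarity": abs(cone * gamma) <= tol,
        "slip_within_slack": gamma + psi >= -tol and gamma - psi >= -tol,
    }


def is_feasible(**kwargs):
    """True if every condition holds."""
    return all(contact_conditions_2d(**kwargs).values())
```

A sticking contact (zero distance, zero slip, force strictly inside the cone) and a sliding contact (force on the cone boundary, slack equal to the slip speed) both pass, whereas a non-zero force at positive distance violates force complementarity.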

The above formulation assumes point contacts. To handle more general contact situations, [34] describe the equivalent contact point as a model for the effective wrench exchange of point, line, or surface contacts. With $\lambda$ the forces and $\phi$ the signed distance between two shapes, they start with generic complementarity

$$0 \le \lambda \perp \phi \ge 0 \qquad \text{(complementarity (8))}$$

where equation numbers refer to the original paper [34], and $\perp$ means that at least one of the inequalities holds with equality. In their equations (9-13) this is then reformulated by introducing the effective contact point as two 3D positions, one on each shape surface, and constraining them to be inside the convex shape polytopes with explicit inequalities for each shape face. Analytic solutions for a single simulation step are then provided for the surface and line contact cases. We adopt the idea of effective contact points, but change the formulation to introduce only a single 3D decision variable for the point-of-attack, formulate constraints based on a generic signed distance function rather than an explicit convex polytope description of shapes, and embed the formulation in path optimization rather than building on analytic solutions for one-step simulation.

As a side note, Contact Invariant Optimization [21] equally has an implicit complementarity mechanism, where a real-valued decision variable $c$ indicates contact, and a cost term

$$L_{\text{CI}} = c\, \big( \|e\|^2 + \|\dot e\|^2 \big)$$

represents complementarity between contact ($c$) and distance and slip ($e, \dot e$), where $e$ is the pose difference between predefined contact frames. That is, if $L_{\text{CI}}$ was minimized to zero, any contact has zero distance and slip. In that sense, this is a soft approximation of a complementarity constraint.

## IV Trajectory Optimization Framework

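This section presents the path optimization framework as an extension of the LGP formulation from [28] to multi-physics constraints. The main idea is a path optimization problem in which the path consists of phases or kinematic modes. A discrete variable $s_k$ defines the constraints and costs of this optimization problem in phase $k$ of the motion. We call a sequence $s_{1:K}$ of discrete variables a skeleton. The configuration space $X$ includes the $n$-dimensional generalized coordinate of robot links ($\mathbb{R}^n$) and the poses of $m$ rigid objects ($SE(3)^m$), as in the original formulation [28]. However, in this work a configuration additionally includes wrench interactions for each of $n_c$ possible contact pairs ($\mathbb{R}^{6 n_c}$) as well as a single scalar $\tau \in \mathbb{R}$ with the following semantics: The continuous path $x(t)$ in the configuration space is indexed by $t \in [0, KT]$, which corresponds to a virtual time, hence the duration of the path in virtual time is fixed to $KT$. To make this consistent with physics (see Sec. V-G), each $x(t)$ has a real-time component $\tau(t)$.

Given a skeleton $s_{1:K}$, we solve the path problem

$$\begin{aligned}
\min_{x} \quad & \int_0^{KT} c(\bar x(t), s_{k(t)})\, dt && \text{(1a)} \\
\text{s.t.} \quad & x(0) = x_0 && \text{(1b)} \\
& \forall t \in [0, KT]:\ h_{\text{path}}(\bar x(t), s_{k(t)}) = 0,\ \ g_{\text{path}}(\bar x(t), s_{k(t)}) \le 0 && \text{(1c)} \\
& \forall k:\ h_{\text{switch}}(\hat x(t_k), s_{k-1}, s_k) = 0,\ \ g_{\text{switch}}(\hat x(t_k), s_{k-1}, s_k) \le 0 && \text{(1d)}
\end{aligned}$$

Here, $(h, g)_{\text{path}}$ define differentiable path constraints for a given mode $s_{k(t)}$, with $k(t)$ the phase index at time $t$, which depend on $\bar x(t) = (x(t), \dot x(t), \ddot x(t))$, and $(h, g)_{\text{switch}}$ define differentiable switch constraints between modes $s_{k-1}$ and $s_k$, which depend on $\hat x = (x, \dot x, \dot x')$, where $\dot x$ is the velocity before, and $\dot x'$ is the velocity after a switch (e.g. impulse exchange) [28].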

To formulate physics laws in terms of real time, the real-time velocities and accelerations can be determined by the chain rule. E.g., with $\tau(t)$ the real time at virtual time $t$,

$$\frac{dx}{d\tau} = \frac{\dot x}{\dot \tau} \tag{2}$$

Our path solver, KOMO [31], parameterizes each time slice with only the minimal set of degrees-of-freedom (dofs). For instance, actual optimization variables for force exchange are only introduced when the skeleton introduces the existence of a force interaction. Therefore, the path solver deals with varying dofs in each mode and at each switch, which depend on the skeleton.

The following section details the specific dofs, inequality and equality constraints that are introduced into this path optimization formulation by force-based interaction modes.

## V Contact Models, Path Constraints, and Quasi-Static Motion

In this section we first formulate specific forced contact models, more specifically, an efficient parametrization of contact interactions that introduces the point-of-attack (POA) as an auxiliary 3D decision variable. We then describe the Newton-Euler constraint we employ to describe dynamic object motion and its quasi-static variant.

### V-A Contact interaction modes

In our approach the skeleton [28] decides between which objects and in which phase contact interactions are accounted for. When a contact interaction between a pair of objects is created, this has several effects on the resulting path problem: (1) a 6D decision variable (wrench, or force and POA) is introduced for each time step, (2) constraints are added to the path problem that describe physically correct forces and POA in consistency with the geometric configuration, and (3) if one of the objects is in either quasi-static or dynamic motion mode, the effective wrench of the contact enters its quasi-static or dynamic motion constraint.

We provide several options to impose contacts during the manipulation. Specifically, we allow for a forced contact (requiring zero-distance throughout the interval), an instantaneous contact (active at one time slice only, realizing elastic impulse exchange), and the standard complementarity (enforcing complementarity throughout the interval). Each of these three contact modes comes in two versions, one allowing for slip, the other enforcing stick. We describe the details in the following.

### V-B Wrench as Force & Point-of-Attack

For each force interaction we introduce a 6D decision variable in the path optimization problem to represent the total wrench exchange. However, instead of representing a wrench directly as linear force $f$ and torque $\tau$, we equivalently represent it as $(f, p)$, where $p$ is the 3D point-of-attack (POA) in world coordinates (or zero-momentum point), with $\tau = p \times f$. This formulation follows the idea of the equivalent contact point [34] and resolves several issues that arise when deciding on force interaction during optimization.

The typical “causality” of force exchange, e.g. in a standard forward simulator, is that the current geometric configuration first determines contact points (or patches or multiple points), and then forces are computed that attack at these given points. However, in this approach the contact point(s) can be rather unstable for flat-on-flat interactions, cause jittering or bouncing, and raise convergence issues for path optimization. By instead introducing the POA as a decision variable in our mathematical program, the optimizer directly decides on where the actual force exchange occurs. Additional constraints then need to ensure that this POA is also geometrically feasible. As detailed below we constrain the POA to lie on both object surfaces, allowing the optimizer to find the single point which keeps everything consistent. This elegantly handles transitions between contact configurations, e.g., from 3-to-3 initially to 3-to-2 or 2-to-1 when one object slides over the edge of another.

Note that by constraining the POA to be on the surfaces, and only allowing a linear force there, our current implementation excludes the possibility of a torque around the normal (e.g., from patch friction) [8] or wrenches via adhesion.
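As a small illustration of the force-and-POA representation, the sketch below converts a list of (force, POA) pairs into the total wrench about an object center. The helper names and the choice of taking torques about the object center are assumptions made for this example only.

```python
def cross(a, b):
    """Cross product of two 3D vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def wrench_about_center(force, poa, center):
    """Force and torque about `center` from a linear force acting at `poa`."""
    arm = tuple(p - c for p, c in zip(poa, center))
    return force, cross(arm, force)


def total_wrench(contacts, center):
    """Sum the wrenches of several (force, poa) contacts about `center`."""
    f_tot, t_tot = [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]
    for force, poa in contacts:
        f, t = wrench_about_center(force, poa, center)
        for i in range(3):
            f_tot[i] += f[i]
            t_tot[i] += t[i]
    return tuple(f_tot), tuple(t_tot)
```

Because only a linear force acts at the POA, a force whose line of action passes through the center produces zero torque, and no torque about the contact normal can be represented, in line with the restriction noted above.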

### V-C Forced and Complementary Contacts

We distinguish a forced contact and a complementary contact. With $f$ the force and $p$ the POA of the interaction, when the skeleton imposes a forced contact, we add the constraints

$$\begin{aligned}
d_1(p) &= 0 && \text{(POA is on object 1)} && (3) \\
d_2(p) &= 0 && \text{(POA is on object 2)} && (4) \\
d_{12} &= 0 && && (5)
\end{aligned}$$

where $d_1(p)$ is the (signed) distance or penetration of $p$ to the convex mesh of object 1 (and $d_2(p)$ analogously for object 2); and $d_{12}$ is the signed distance or penetration between the two convex meshes. Both are evaluated with either Gilbert–Johnson–Keerthi (GJK) for non-penetrating objects, or Minkowski Portal Refinement (MPR) for penetrating objects.

When the skeleton imposes a complementary contact, we only add the 7D constraint

$$\begin{aligned}
f\, d_1(p) = 0, \quad f\, d_2(p) &= 0 && \text{(force complementarity)} && (6) \\
d_{12} &\ge 0 && && (7)
\end{aligned}$$

Note that complementarity is imposed with both(!) POA distances, not via the numerically less stable geometric distance $d_{12}$. The POA is directly a decision variable, with trivial Jacobian, whereas $d_{12}$ is only piece-wise differentiable. But the combined constraints do eventually imply complementarity w.r.t. object touch.
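The following sketch evaluates POA-based complementarity for a toy pair of shapes, a sphere resting on a horizontal table. The closed-form distance functions stand in for the GJK/MPR queries mentioned above; all function names, the shapes, and the residual layout (six equalities plus one inequality) are assumptions chosen for illustration.

```python
import math


def dist_to_sphere(point, center, radius):
    """Signed distance of a point to a sphere surface (negative inside)."""
    return math.dist(point, center) - radius


def dist_to_table(point, height=0.0):
    """Signed distance of a point to the table top at z = height."""
    return point[2] - height


def complementary_contact_residuals(force, poa, center, radius, height=0.0):
    """Return (equalities, inequality) for a sphere/table complementary contact.

    equalities: force times each POA distance (6 numbers, should all be 0).
    inequality: sphere-table distance (should be >= 0, i.e. no penetration).
    """
    d1 = dist_to_sphere(poa, center, radius)
    d2 = dist_to_table(poa, height)
    d12 = (center[2] - radius) - height
    equalities = [fi * d1 for fi in force] + [fi * d2 for fi in force]
    return equalities, d12
```

With the sphere touching the table and the POA at the touching point, any force satisfies the equalities. If the sphere is lifted, no single point can lie on both surfaces, so the equalities can only hold with zero force.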

### V-D Positivity, Slip and Elasticity

We always constrain the force $f$ to be positive,

$$f^\top n \ge 0 \qquad \text{(force is positive)} \tag{8}$$

where $n$ is the normal of the pair’s distance or penetration vector, which we retrieve differentiably from the witness simplices.

To model stick as well as elastic bounce we impose velocity constraints on the actual object surface points that relate to the POA $p$. Let $V_1 = v_1 + \omega_1 \times (p - c_1)$ be the object-associated POA velocity, where $(v_1, \omega_1)$ are the linear and angular velocities of object 1, and $c_1$ its center. Let $V = V_1 - V_2$ be the relative POA velocity between the interacting objects 1 and 2. For a non-slip contact we impose the equality constraint

$$(I - n n^\top)\, V = 0 \qquad \text{(zero tangential surface velocities)} \tag{9}$$

and the inequality constraint

$$\|(I - n n^\top) f\|^2 \le \mu^2 (n^\top f)^2 \qquad \text{(quadratic friction cone)} \tag{10}$$

where $\mu$ is the coefficient of friction in Coulomb’s friction model. In contrast, for sliding contacts, the force must be on the edge of the friction cone and its tangential component needs to align with the negative relative POA velocity, both of which are ensured by

$$(I - n n^\top) f = -\mu\, (n^\top f)\, \frac{V}{\|V\|} \tag{11}$$

for $\|V\| > 0$. Note that in our experiments we only consider very large (stick) or low (slip) friction. For friction-less contact this implies a normal force

$$(I - n n^\top) f = 0 \tag{12}$$

Finally, let $V'$ be the relative POA velocity one time step later, e.g., after an instantaneous bounce. For an instantaneous bounce with elasticity coefficient $\epsilon$ we add the constraint

$$n^\top V' = -\epsilon\, n^\top V \qquad \text{(normal velocity reflection)} \tag{13}$$

In summary, using these equations we can impose forced, instantaneous, and complementary contacts, each with slip or stick.
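The stick and bounce conditions of this section can be sketched as plain checks on a force, a unit contact normal, and a relative POA velocity. The function names, the tolerance-free comparisons, and the sign convention (a pushing force has a non-negative normal component) are assumptions of this example.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def tangential(u, n):
    """Component of u orthogonal to the unit normal n."""
    un = dot(u, n)
    return tuple(x - un * y for x, y in zip(u, n))


def in_friction_cone(force, n, mu):
    """Quadratic Coulomb cone: |tangential force|^2 <= mu^2 * (normal force)^2."""
    fn = dot(force, n)
    ft = tangential(force, n)
    return fn >= 0.0 and dot(ft, ft) <= (mu * fn) ** 2


def reflected_normal_velocity(v_in, n, eps):
    """Normal velocity required after a bounce with elasticity eps."""
    return -eps * dot(v_in, n)
```

For a stick contact one would additionally require `tangential(V, n)` to vanish for the relative POA velocity `V`; for a frictionless contact, `tangential(force, n)` must vanish.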

### V-E Regularization costs

While we impose hard constraints to ensure physical correctness, we can additionally add soft penalties to favor smooth interactions. This can be interpreted as a prior on which kinds of robot manipulations we favor, for instance those where the POA does not jump around excessively. Adding such regularizations has a positive effect on the convergence behavior of the solver. Specifically, with $p$ the POA and $f$ the force, we add cost terms

$$\begin{aligned}
& \|\ddot p\|^2 && \text{(POA acceleration penalization)} && (14) \\
& \|\ddot f\|^2 && \text{(force acceleration penalization)} && (15) \\
& \|f\|^2 && \text{(force penalization)} && (16)
\end{aligned}$$

(17)
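As an illustration of the labeled regularization terms, the sketch below sums squared second differences of the POA and force sequences plus squared force magnitudes over a discretized path. The unit weights, the omission of time-step scaling, and the function names are assumptions; the weights used in the experiments are not stated in the text.

```python
def second_difference(seq, k):
    """Discrete second difference of a sequence of 3D tuples at index k."""
    return [a - 2.0 * b + c for a, b, c in zip(seq[k - 1], seq[k], seq[k + 1])]


def sq_norm(v):
    return sum(x * x for x in v)


def regularization_cost(poas, forces, w_poa=1.0, w_facc=1.0, w_force=1.0):
    """Sum of POA-acceleration, force-acceleration and force penalties."""
    cost = 0.0
    for k in range(1, len(poas) - 1):
        cost += w_poa * sq_norm(second_difference(poas, k))
        cost += w_facc * sq_norm(second_difference(forces, k))
    cost += w_force * sum(sq_norm(f) for f in forces)
    return cost
```

A POA that moves at constant velocity under a constant force only incurs the force penalty, whereas a POA that jumps back and forth is penalized through its second difference.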

### V-F Dynamic and Quasi-Static Motion

For every object that is in dynamic or quasi-static motion mode we collect the total 6D wrench on its center that arises from all contacts with their forces and POAs. From the current path we compute the object’s linear and angular velocities and accelerations. Given the gravity vector, the inertia matrix, and a potential friction term, we have the Newton-Euler equation, neglecting the term due to off-diagonal inertias,

(18)

For free-flight objects we assume zero friction.
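For orientation, the textbook Newton-Euler equations for a rigid body driven by contact forces $f_i$ at POAs $p_i$ can be sketched as follows. This is the standard form under assumed notation (mass $m$, center $c$, linear velocity $v$, angular velocity $\omega$, inertia matrix $I$, gravity $g$), without a friction term; it is not necessarily identical to equation (18), which includes a friction term and neglects one inertia-related term.

```latex
\begin{aligned}
m\,(\dot v - g) &= \sum_i f_i , \\
I\,\dot\omega + \omega \times I\,\omega &= \sum_i (p_i - c) \times f_i .
\end{aligned}
```

In a path optimization setting both lines act as equality constraints per time slice, with $\dot v$ and $\dot\omega$ obtained by finite differences along the path.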

For quasi-static sliding, we assume some friction such that the inertia forces can be ignored. In particular, when an object is pushed by a manipulator on a table, this quasi-static model constrains the object’s motion to be only along the surface of the table. In the remaining degrees of freedom, the wrench applied by the manipulator exactly cancels out the friction, and is related to the velocity as

(19)

with a convex function. Note that, in this case, we need to consider the scaled wrench as the optimization variable instead of the wrench itself [35, 11]. In the experiments, we assumed the pressure to be uniformly distributed, and thus used a simple quadratic representation of this function.

### V-G Time stepping optimization and impulse vs. force exchange

Finally, for the case of truly dynamic sequences, e.g. where a ball is bouncing on a table, it is physics that decides on the true time between two interactions. However, the skeleton imposes interactions at fixed steps. To resolve this conflict we introduce a scalar decision variable $\tau$ for every discretization step of the path, which represents the real-time step in seconds. The optimizer can thereby find a $\tau$ that scales a fixed path section to a correct physical time interval. We impose positive time evolution ($\tau > 0$), choose piece-wise constant time stepping within phases, and introduce a regularization to favor time steps close to the default. All velocities mentioned above are differentiably evaluated by finite differencing along the path and dividing by $\tau$. The acceleration terms in the Newton-Euler equation are actually multiplied by $\tau$, which semantically makes it an impulse exchange equation, and the decision variables associated with contacts actually represent impulses.
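The finite-difference evaluation with a per-step real-time variable can be sketched for a scalar path as follows. The function name and the scalar setting are assumptions; the velocity change per step equals the acceleration multiplied by the time step, which is the impulse-style quantity (per unit mass) referred to above.

```python
def real_time_derivatives(xs, taus):
    """Velocities and per-step velocity changes of a discretized scalar path.

    xs:   K+1 path values at consecutive time slices.
    taus: K real-time step lengths in seconds (one per step, all > 0).
    """
    if len(xs) != len(taus) + 1:
        raise ValueError("need exactly one time step per path segment")
    if any(t <= 0.0 for t in taus):
        raise ValueError("time steps must be positive")
    velocities = [(xs[k + 1] - xs[k]) / taus[k] for k in range(len(taus))]
    # acceleration * tau, i.e. the velocity change across a slice
    velocity_changes = [velocities[k + 1] - velocities[k]
                        for k in range(len(velocities) - 1)]
    return velocities, velocity_changes
```

The same path values yield different real-time velocities when the time steps change, which is what lets the optimizer stretch or compress a fixed number of slices to match a physically determined duration.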

### V-H Integration with Existing Solver and Stable Interaction Models

The models described above were integrated in the existing solver described in [28]. This means that the above interaction modes can be mixed with modes for stable grasping and placement of objects. We also adopt the switch constraints, which enforce zero object accelerations at the switch, except for instantaneous contacts. As stated previously, in this work we focus on the path problem for a given skeleton, which is addressed using an Augmented Lagrangian method [31]. The skeleton is represented as a list of first order literals.

## VI Experiments

Please see the accompanying video.

### VI-A Passive Tests (Pure Simulation as Path Optimization)

We first report on tests where there is no robot or actuator, and path optimization is merely used to compute a physically feasible path. This tests whether our descriptions of physical interactions are also appropriate for ordinary physical simulation using path optimization.

#### Ball bouncing

We drop a ball onto a table (Fig. 1(a)), letting it bounce 4 times with elasticity coefficient (where the outgoing velocity is constrained to be 90% of the incoming velocity). The accompanying video shows the simple behavior. The optimizer robustly converges within about to standard precision, and within about second to ultimate floating point precision in the constraints.

An insight we gain from this is that, in order to compute correct physical bounces, it is essential to include co-optimization of the time stepping . Fig. 1(d) shows for time steps , for . We see that the optimizer found different time steppings during each bounce interval. This is essential as the duration of the bounce is determined by physics and must be aligned with the imposed bounce schedule. This also means that those configurations at which bouncing contact is imposed are optimized to be at exactly the real times where the ball hits the table, allowing all constraints to be fulfilled to arbitrary precision even though we choose a very coarse time discretization. This addresses the typical issue in physical simulations of choosing efficient but imprecise fixed time steps versus adaptive stepsizes. Disabling stepping optimization makes our approach fail to find a correct solution for this simple bouncing problem.
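To see why each bounce interval needs its own time step, consider an idealized point ball dropped from rest. The sketch below computes the real-time step that a fixed number of path slices per phase would need in order to span exactly one flight; the function name, the point-mass idealization, and the parameter values are assumptions and do not reproduce the experiment.

```python
import math


def bounce_time_steps(drop_height, eps, n_bounces, steps_per_phase, g=9.81):
    """Real-time step per phase so that each phase ends exactly at a bounce."""
    speed = math.sqrt(2.0 * g * drop_height)  # speed at the first impact
    durations = [speed / g]                   # free fall until the first bounce
    for _ in range(n_bounces - 1):
        speed *= eps                          # speed right after a bounce
        durations.append(2.0 * speed / g)     # flight up and back down
    return [d / steps_per_phase for d in durations]
```

With an elasticity coefficient below one the flight times shrink geometrically, so a single fixed step length cannot place every scheduled bounce at a true impact time.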

#### Slide-falling and tumbling block

We present two passive examples that highlight the POA mechanism, a box sliding from a tilted table (solver time , Fig. 1(b)) and a box tumbling with sticky contact on a tilted table (solver time , Fig. 1(c)). In both cases we used general complementary contacts, and it was essential to allow the optimizer to find a suitable POA. Without the POA decision variable (when inserting the central witness point of the current configuration between the current shapes, computed with GJK or MPR, instead of the POA in all constraints), the solver was unable to find solutions in either scenario. Using the POA, the optimizer finds (mostly smooth) POA movements on the contact surfaces.

### VI-B Physical Manipulations

In the remainder we consider sequential robot manipulation scenarios. In all cases the manipulator model is a Franka Emika Panda. However, we abstracted the gripper’s fingers as spheres. In the context of stable grasping, this is motivated by our experience that modeling the actual gripper closing with the actual finger geometries is hardly indicative of grasp success in real-world execution. Instead, ensuring a central opposing positioning to normal surfaces is simpler and transfers well. Abstracting the two fingers as spheres allows us to constrain that the nearest distance vector from the left finger-sphere to the object should exactly oppose the nearest distance vector from the right finger-sphere to the object. To this end we constrain the sum of both vectors to be zero, which describes a central opposition and has nice gradients to pull the gripper towards an opposing grasp.
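The central-opposition constraint can be illustrated with a deliberately simple geometry in which the grasped object is also a sphere, so that nearest-distance vectors have a closed form. The spherical object, the function names, and the numbers in the usage note are assumptions for illustration only.

```python
import math


def nearest_vector(finger_center, finger_radius, obj_center, obj_radius):
    """Nearest-distance vector from a finger sphere to a spherical object."""
    diff = [o - f for f, o in zip(finger_center, obj_center)]
    dist = math.sqrt(sum(d * d for d in diff))
    gap = dist - finger_radius - obj_radius  # surface-to-surface distance
    return [gap * d / dist for d in diff]


def opposition_residual(finger1, finger2, finger_radius, obj_center, obj_radius):
    """Sum of both nearest-distance vectors; zero for a central opposing grasp."""
    a = nearest_vector(finger1, finger_radius, obj_center, obj_radius)
    b = nearest_vector(finger2, finger_radius, obj_center, obj_radius)
    return [x + y for x, y in zip(a, b)]
```

Fingers placed symmetrically on opposite sides of the object give a zero residual, while fingers at a right angle leave a residual that points the pair towards an opposing placement.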

#### Quasi-static pushing with a picked stick

In this scenario (Fig. 2(a)) the robot picks up a stick in order to push the box to the green target pose. The box motion is modeled as quasi-static table sliding. The pre-defined skeleton is

(oppose finger1 finger2 stick) (stable gripper stick) (quasiStaticOn table box) (contact_slide stick box) (poseEq box target)

where each line corresponds to one phase step, the predicates stable and quasiStaticOn describe our mode switches, contact_slide the creation of a forced sliding contact, and oppose and poseEq are geometric constraints.
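Since a skeleton is a sequence of parenthesized literals, it can be read into a simple data structure as sketched below. The parser and its output format (a list of tuples with the predicate first) are assumptions for illustration and not the representation used by the solver.

```python
import re


def parse_skeleton(text):
    """Parse '(pred arg1 arg2) (pred arg1)' into a list of tuples of strings."""
    return [tuple(body.split()) for body in re.findall(r"\(([^()]*)\)", text)]


def predicates(skeleton):
    """Predicate names in order of appearance."""
    return [literal[0] for literal in skeleton if literal]
```

Applied to the pushing skeleton above, this yields five literals whose predicates are `oppose`, `stable`, `quasiStaticOn`, `contact_slide`, and `poseEq`.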

The solver finds (in ) a rather involved pushing maneuver where the POA between the stick and the box is controlled to places that allow pushing the box into different directions. The video displays several additional pushing sequences, some with a free floating gripper, to show the variety of solutions found by the solvers. This scenario and the following two are cases where the solver benefits from mixing physics descriptions of varying abstraction: the stable grasp abstraction for the interaction with the stick, and force-based modeling for the interaction between box and stick, and quasi-static dynamics for the box. Our last experiment will investigate the gained efficiency of a stable grasp abstraction vs. a force-based grasp.

#### Dynamic ball throwing and bouncing to a target

This scenario (Fig. 2(b)) is an extension of the passive bouncing test discussed above. A robot picks up a ball to throw it onto the floor so that it bounces back against a wall, and then bounces to a given target. This highlights the ability to implicitly propagate back target constraints through force-based contacts to yield a correct throwing strategy. The pre-defined skeleton is

(oppose finger1 finger2 ball) (stable gripper ball) (dynamic ball) (bounce ball table) (bounce ball wall) (touch target ball)

which states that the first phase ends with grasping the ball, the ball becomes free and dynamic (Newton-Euler equations) after the second phase, the ball bounces with the table after the third, with the wall after the fourth, and touches (zero distance) the green target after the fifth.

The solver finds a solution (in ) where the robot, after picking up the ball, nicely accelerates and releases the ball to bounce to the target, as in 3D billiards. The velocity of the full sequence has to be rather fast as the free ball flight is governed by physics. Therefore, the control costs of this path are highly significant in this optimization problem. As seen in the video, the found solution varies drastically depending on the scaling of control costs.

#### Using a stick to lift a weight by a ring and place it on a target

This scenario (Fig. 2(c)) aims to highlight the ability to create the needed contact points to achieve long term targets. A robot grasps a stick in order to insert it into a ring at the top of a weight. Thereby it can lift it and transport it onto a given target. The given skeleton is

(oppose finger1 finger2 stick) (stable gripper stick) (dynamic box) (contact_slide stick ring) (stableOn target box) (above target box)

where dynamic switches the box to free flying mode (Newton-Euler equations with contact forces as input), while stableOn then switches the box to reside stably on the target. above geometrically constrains the box center of mass to be within the target support.

The solver finds (in ) a solution, where the robot finds the right spot for the stick to touch the ring so as to lift it. The box swings slightly during the dynamic transport to the target. The transition from its initial resting on the table to the dynamic phase is not perfectly smooth, which would require more careful regularization of accelerations at mode switches.

#### First pushing then grasping a book from a shelf

This scenario (Fig. 2(d)) is inspired by Fig. 1 in [13] and considers a book on a shelf that is initially too close to a wall to be grasped. It therefore first has to be pushed forward so that it can be grasped. The pre-defined skeleton is

(contact finger1 book) (quasiStaticOn shelf book) (poseEq book subTarget) (oppose finger1 finger2 book) (stable gripper book) (poseEq book target)

Note that in this skeleton we predefined an intermediate target pose for the book, the first green pose seen in the video, as discussed in detail below.

Given this skeleton, the solver finds (in ) a solution where the robot places the finger nicely to the right of the book to push it to the sub-target, then in a minimal motion transitions to the opposing grasp to lift the book and carry it to the final target.

As a negative result, if we remove the sub-target (2nd line) from the skeleton, the solver fails to find a feasible solution and typically converges to an infeasible solution that cheats when picking up the object, squeezing the finger between wall and book in a penetrating and book-jumping manner (see video). We considered extensively how we could fix this deficit of our method. However, we concluded that without cheating by redesigning the scenario to become less symmetric and have an intrinsic bias towards the first book slide, there is no way for our approach to solve this problem without introducing the sub-goal or some similar bias. The path optimization process has no implicit gradient towards paths that have consistent book slides in one or another direction. A random initialization is too unsystematic to pit optimization towards such slides. Instead, due to symmetry, path optimization is most likely to converge to the local optimum that corresponds to the shown infeasible solution.

We believe this scenario is highly insightful. Local optima are a fundamental issue for optimization and a source of complexity for planning. The scenario shows that stronger biases would have to pre-exist, perhaps having been learned, to solve complex manipulation problems.
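The missing-gradient argument can be illustrated with a one-dimensional toy problem. This is an analogy only, not the path optimization problem solved in our experiments: the double-well objective, the `bias` term standing in for a sub-goal, and the function names `grad` and `descend` are all assumptions made for this sketch. A symmetric start at x = 0 is a stationary point with no gradient towards either well, whereas a small bias term breaks the symmetry and lets plain gradient descent reach the desired minimum at x = 1.

```python
def grad(x: float, bias: float) -> float:
    """Gradient of the toy objective f(x) = (x**2 - 1)**2 + bias * (x - 1)**2."""
    return 4.0 * x * (x**2 - 1.0) + 2.0 * bias * (x - 1.0)


def descend(x0: float, bias: float, step: float = 0.05, iters: int = 500) -> float:
    """Plain gradient descent with a fixed step size."""
    x = x0
    for _ in range(iters):
        x -= step * grad(x, bias)
    return x


# Symmetric start, no bias: the gradient vanishes at x = 0, so the iterate never moves.
x_symmetric = descend(0.0, bias=0.0)

# Same start with a small bias towards x = 1 (playing the role of a sub-goal).
x_biased = descend(0.0, bias=0.5)

print(x_symmetric, x_biased)
```

In this toy the bias plays the role that the sub-target plays in the book scenario: it does not change where the desired solution lies, it only supplies a gradient that the symmetric problem lacks.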

#### Force-based vs. stable grasping

In the previous experiments we imposed stable grasps. We can also solve for force-based grasping. In the last scenario (Fig. 2(e)) the robot only needs to lift the stick to a target pose. For the pre-defined skeleton with force-based contacts

(oppose fing1 fing2 stick) (contactStick fing1 stick) .. (contactStick finger2 stick) (dynamic stick) (poseEq stick target)

it takes the solver to find a lift. However, for the skeleton with stable grasp

(oppose finger1 finger2 stick) (stable gripper stick) (poseEq stick target)

the solver requires only .

## VII Conclusions

In this work we propose concrete models for physical reasoning and robot manipulation which allow the solver to mix different abstractions for different objects and phases of the solution, and integrate this in a path optimization framework to solve sequential physical manipulation problems over a wide range of scenarios. We call this a multi-physics model for reasoning. Our solver is based on a path description of physics that directly allows us to leverage constrained optimization methods.

A limitation of the approach is the still significant computation time needed to solve complex sequential physical interaction scenarios ( in our examples). This makes a naive integration into the full symbolic search of LGP unattractive. A promising alternative is to use our solver to generate large-scale data to learn a heuristic that can drastically accelerate search over potential interaction skeletons [7]. Further, this work only considers the problem of reasoning about possible manipulation sequences, not controller synthesis for a robust execution of such plans. Translating the framework to stochastic optimal control is still subject to research [10].

## Acknowledgement

We would like to thank Alberto Rodriguez for inspiring us to work on some of the problems. This work was funded by the Baden-Württemberg Stiftung in the scope of the NEUROROBOTICS project DeepControl. M.T. is grateful for the Max Planck Fellowship at the MPI for Intelligent Systems.

### Footnotes

- Notably, an exception from forward descriptions of physics is the principle of least action, which states that every physical path to a given(!) end configuration minimizes the action integral, . Interestingly, this characterizes physics not as a local forward law, but by a global statement about what are correct physical paths (which implies the local differential law.) However, as the action principle requires knowledge of the end configuration, it remains unclear how to utilize it for physical reasoning. – In acknowledgment of a discussion with Scott Kuindersma.
- https://youtu.be/YxKuVit_23E
- The core is already available at https://github.com/MarcToussaint/rai.
- In fact, a physical simulator could be implemented using MPC based on our formulation, repeating path optimization of a short receding horizon. This would enable features such as co-optimizing the time stepping for the sake of simulation precision, or equality-constraining the simulation additionally on precise long-term energy conservation. We haven’t explored further in this direction.

### References

- (2010) Interleaving symbolic and geometric reasoning for a robotic assistant. In ICAPS Workshop on Combining Action and Motion Planning. Cited by: §II.
- (1997) Formulating dynamic multi-rigid-body contact problems with friction as solvable linear complementarity problems. Nonlinear Dynamics 14 (3), pp. 231–247. Cited by: §I.
- (2013) Simulation as an engine of physical scene understanding. Proceedings of the National Academy of Sciences 110 (45), pp. 18327–18332. Cited by: §II.
- (2018) A Data-Efficient Approach to Precise and Controlled Pushing. In Conference on Robot Learning, pp. 336–345. Cited by: §II.
- (2013) Towards combining HTN planning and geometric task planning. arXiv preprint arXiv:1307.1482. Cited by: §II.
- (2014) Footstep planning on uneven terrain with mixed-integer convex optimization. In Humanoid Robots (Humanoids), 2014 14th IEEE-RAS International Conference On, pp. 279–286. Cited by: §II.
- (2020) Deep visual heuristics: learning feasibility of mixed-integer programs for manipulation planning. In International Conference on Robotics and Automation (ICRA). Cited by: §VII.
- (2017-12) Parameter and contact force estimation of planar rigid-bodies undergoing frictional contact. The International Journal of Robotics Research 36 (13-14), pp. 1437–1454 (en). External Links: ISSN 0278-3649, Document Cited by: §I, §II, §V-B.
- (1985) Sensitivity analysis in nonlinear programming under second order assumptions. In Systems and Optimization, pp. 74–97. Cited by: §I.
- (2020) Probabilistic framework for constrained manipulations and task and motion planning under uncertainty. In International Conference on Robotics and Automation (ICRA). Cited by: §VII.
- (2018) A quasi-static model and simulation approach for pushing, grasping, and jamming. In Workshop on the Algorithmic Foundations of Robotics. Cited by: §V-F.
- (2012) Physical reasoning in complex scenes is sensitive to mass. PhD Thesis, Massachusetts Institute of Technology. Cited by: §II.
- (2018-05) Reactive Planar Manipulation with Convex Hybrid MPC. In 2018 IEEE International Conference on Robotics and Automation (ICRA), pp. 247–253. External Links: Document Cited by: §II, §VI-B4.
- (2010) Solution sensitivity for Karush–Kuhn–Tucker systems with non-unique Lagrange multipliers. Optimization 59 (5), pp. 747–775. Cited by: §I.
- (1917) Intelligenzprüfungen an Menschenaffen. Springer, Berlin (3rd edition, 1973). Note: English version: Wolfgang Köhler (1925): The Mentality of Apes. Harcourt & Brace, New York. Cited by: §II.
- (2014) Efficiently combining task and motion planning using geometric constraints. The International Journal of Robotics Research. External Links: Document Cited by: §II.
- (2012) Constraint propagation on interval bounds for dealing with geometric backtracking. In Intelligent Robots and Systems (IROS), 2012 IEEE/RSJ International Conference On, pp. 957–964. Cited by: §II.
- (1995-09) Sensitivity of Solutions in Nonlinear Programming Problems with Nonunique Multipliers. In Recent Advances in Nonsmooth Optimization, pp. 215–223. External Links: ISBN 978-981-02-2265-9, Document Cited by: §I.
- (2014) A constraint-based method for solving sequential manipulation planning problems. In Intelligent Robots and Systems (IROS 2014), 2014 IEEE/RSJ International Conference On, pp. 3684–3691. Cited by: §II.
- (1996) Stable pushing: Mechanics, controllability, and planning. The International Journal of Robotics Research 15 (6), pp. 533–556. Cited by: §II.
- (2012) Contact-invariant optimization for hand manipulation. In Proceedings of the ACM SIGGRAPH/Eurographics Symposium on Computer Animation, pp. 137–144. Cited by: §II, §III.
- (2016) Tool use and affordance: Manipulation-based versus reasoning-based approaches. Psychological Review 123 (5), pp. 534. Cited by: §II.
- (2013) Direct trajectory optimization of rigid body dynamical systems through contact. In Algorithmic Foundations of Robotics X, pp. 527–542. Cited by: §I, §I, §II, §III.
- (2014) Space or physics? Children use physical reasoning to solve the trap problem from 2.5 years of age. Developmental Psychology 50 (7), pp. 1951–1962 (en). External Links: ISSN 1939-0599, 0012-1649, Document Cited by: §II.
- (2018) Different physical intuitions exist between tasks, not domains. Computational Brain & Behavior 1 (2), pp. 101–118. Cited by: §I, §II.
- (2014) Combined task and motion planning through an extensible planner-independent interface layer. In Robotics and Automation (ICRA), 2014 IEEE International Conference On, pp. 639–646. Cited by: §II.
- (2012) MuJoCo: A physics engine for model-based control. In Intelligent Robots and Systems (IROS), 2012 IEEE/RSJ International Conference On, pp. 5026–5033. Cited by: §I.
- (2018) Differentiable physics and stable modes for tool-use and manipulation planning. In Proc. of Robotics: Science and Systems (R:SS 2018). Note: Best Paper Award. Cited by: §I, §II, §IV, §IV, §V-A, §V-H.
- (2017) Multi-bound tree search for logic-geometric programming in cooperative manipulation domains. In Proc. of the IEEE Int. Conf. on Robotics and Automation (ICRA 2017). Cited by: §I, §II.
- (2015) Logic-geometric programming: an optimization-based approach to combined task and motion planning. In Proc. of the Int. Joint Conf. on Artificial Intelligence (IJCAI 2015). Cited by: §I.
- (2017) A tutorial on Newton methods for constrained trajectory optimization and relations to SLAM, Gaussian Process smoothing, optimal control, and probabilistic inference. In Geometric and Numerical Foundations of Movements, J. Laumond (Ed.). Cited by: §I, §IV, §V-H.
- (2013-04) Social and Physical Reasoning in Human-reared Chimpanzees Preliminary Studies. Cited by: §II.
- (2009-08) Cognitive Processes Associated with Sequential Tool Use in New Caledonian Crows. PLOS ONE 4 (8), pp. e6471 (en). External Links: ISSN 1932-6203, Document Cited by: §II.
- (2016) Rigid body dynamic simulation with line and surface contact. In 2016 IEEE International Conference on Simulation, Modeling, and Programming for Autonomous Robots (SIMPAR), pp. 9–15. Cited by: §I, §II, §III, §V-B.
- (2018) A convex polynomial model for planar sliding mechanics: theory, application, and experimental validation. The International Journal of Robotics Research 37 (2-3), pp. 249–265. Cited by: §V-F.