ChainQueen: A Real-Time Differentiable Physical Simulatorfor Soft Robotics

ChainQueen: A Real-Time Differentiable Physical Simulator
for Soft Robotics

Yuanming Hu, Jiancheng Liu, Andrew Spielberg,
Joshua B. Tenenbaum, William T. Freeman, Jiajun Wu, Daniela Rus, Wojciech Matusik
Y. Hu, A. Spielberg, J. B. Tenenbaum, W. T. Freeman, J. Wu, D. Rus, and W. Matusik are with Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, MA, USAJ. Liu is with Institute for Interdisciplinary Information Science, Tsinghua University, Beijing, China Equally contributed.
Abstract

Physical simulators have been widely used in robot planning and control. Among them, differentiable simulators are particularly favored, as they can be incorporated into gradient-based optimization algorithms that are efficient in solving inverse problems such as optimal control and motion planning. Simulating deformable objects is, however, more challenging compared to rigid body dynamics. The underlying physical laws of deformable objects are more complex, and the resulting systems have orders of magnitude more degrees of freedom and therefore they are significantly more computationally expensive to simulate. Computing gradients with respect to physical design or controller parameters is typically even more computationally challenging. In this paper, we propose a real-time, differentiable hybrid Lagrangian-Eulerian physical simulator for deformable objects, ChainQueen, based on the Moving Least Squares Material Point Method (MLS-MPM). MLS-MPM can simulate deformable objects including contact and can be seamlessly incorporated into inference, control and co-design systems. We demonstrate that our simulator achieves high precision in both forward simulation and backward gradient computation. We have successfully employed it in a diverse set of control tasks for soft robots, including problems with nearly 3,000 decision variables.

I Introduction

Robot planning and control algorithms often rely on physical simulators for prediction and optimization [1, 2]. In particular, differentiable physical simulators enable the use of gradient-based optimizers, significantly improving control efficiency and precision. Motivated by this, there has been extensive research on differentiable rigid body simulators, using approximate [3, 4] and exact [5, 6, 7] methods.

Significant challenges remain for deformable objects. First, simulating the motion of deformable objects is slow, because they have much higher degrees of freedom (DoFs). Second, contact detection and resolution is challenging for deformable objects, due to their changing geometries and potential self-collisions. Third, closed-form and efficient computation of gradients is challenging in the presence of contact. As a consequence, current simulation methods for soft objects cannot be effectively used for solving inverse problems such as optimal control and motion planning.

In this paper, we introduce a real-time, differentiable physical simulator for deformable objects, building upon the Moving Least Squares Material Point Method (MLS-MPM) [8]. We name our simulator ChainQueen ***Or {CJK}UTF8gbsn 乾坤 , literally “everything between the sky and the earth.”. The Material Point Method (MPM) is a hybrid Lagrangian-Eulerian method that uses both particles and grid nodes for simulation [9]. MLS-MPM accelerates and simplifies traditional MPM using a moving least squares force discretization. In ChainQueen, we introduce the first fully differentiable MLS-MPM simulator with respect to both state and model parameters, with both forward simulation and back-propagation running efficiently on GPUs. We demonstrate the ability to efficiently calculate gradients with respect to the entire simulation. This enables many novel applications for soft robotics including optimization-based closed-loop controller design, trajectory optimization, and co-design of robot geometry, materials, and control.

As a particle-grid-based hybrid simulator, MPM simulates objects of various states, such as liquid (e.g., water), granular materials (e.g., sand), and elastoplastic materials (e.g., snow and human tissue). ChainQueen focuses on elastic materials for soft robotics. It is fully differentiable and faster than the current state-of-the-art. Numerical and experimental validation suggest that ChainQueen achieves high precision in both forward simulation and backward gradient computation.

ChainQueen’s differentiability allows it to support gradient-based optimization for control and system identification. By performing gradient descent on controller parameters, our simulator is capable of solving these inverse problems on a diverse set of complex tasks, such as optimizing a 3D soft walker controller given an objective. Similarly, gradient descent on physical design parameters, enables inference of physical properties (e.g. mass, density and Young’s modulus) of objects and optimizing design for a desired task.

In addition to benchmarking ChainQueen’s performance and demonstrating its capabilities on a diverse set of inverse problems, we have interfaced our simulator with high-level python scripts to make ChainQueen user-friendly. Users at all levels will be able to develop their own soft robotics systems using our simulator, without the need to understand its low-level details. We will open-source our code and data and we hope they can benefit the robotics community.

Ii Related Work

Ii-a Material Point Method

The material point method has been extensively developed from both a solid mechanics [9] and computer graphics [10] perspective. As a hybrid Eulerian-Langrangian method, MPM has demonstrated its versatility in simulating snow [11, 12], sand [13, 14], non-Newtonion fluids [15], cloth [16, 17], solid-fluid coupling [18, 19], rigid body coupling, and cutting [8]. [20] also proposed an adaptive MPM scheme to concentrate computation resources in the regions of interest.

There are many benefits of using MPM for soft robotics. First, MPM is a well-founded and physically-accurate discretization method and can be derived through the weak form of conservation laws. Such a physically-based approach makes it easier to match simulation with real-world experiments. Second, MPM is friendly to parallelization on modern hardware architectures. Closely related to our work is a high-performance GPU implementation [21] by Gao et al., from which we borrow many useful optimization practices. Though efficient when solving forward simulation, their simulator is not differentiable, making it inefficient for inverse problems in robotics and learning. Third, MPM naturally handles large deformation and (self-)collision, which are common in soft robotics, but often not modeled in, e.g., mesh-based approaches due to computational expense. Finally, the continuum dynamics (including soft object collision) are governed by the smooth (and differentiable) potential energy, making the whole system differentiable.

Our simulator, ChainQueen, is fully differentiable and the first simulator that applies MPM to soft robotics.

Ii-B Differentiable Simulation and Control

Recently, there has been an increasing interest in building differentiable simulators for planning and control. For rigid bodies, [22], [3] and [4] proposed to approximate object interaction with neural nets; later, [23] explored their usage in control. Approximate analytic differentiable rigid body simulators have also been proposed [5, 24]. Such systems have been deployed for manipulation and planning  [25].

Differentiable simulators for deformable objects have been less studied. Recently, [26] proposed SPNets for differentiable simulation of position-based fluids [27]. The particle interactions are coded as neural network operations and differentiability is achieved via automatic differentiation in PyTorch. A hierarchical particle-based object representation using neural networks is also proposed in [4]. Instead of approximating physics using neural networks, ChainQueen differentiates MLS-MPM, a well physically founded discretization scheme derived from continuum mechanics. In summary, our simulator can be used for a more diverse set of objects; it is more physically plausible, and runs faster.

Fig. 1: One time step of MLS-MPM. Top arrows are for forward simulation and bottom ones are for back propagation. A controller is embedded in the P2G process to generate actuation given particle configurations.
Symbol Type Affiliation Meaning
scalar time step size
scalar grid cell size
vector particle position
scalar particle initial volume
vector particle velocity
matrix particle affine velocity field  [28]
matrix particle PK1 stress ()
matrix particle actuation Cauchy stress
matrix particle actuation stress (material space)
matrix particle deformation gradient
vector node position
scalar node mass
vector node velocity
vector node momentum, i.e.
scalar quadratic B-spline function
TABLE I: List of notations for MLS-MPM.

Iii Forward simulation and back-propagation

We use the moving least squares material point method (MLS-MPM) [8] to discretize continuum mechanics, which is governed by the following two equations:

(1)
(2)

We briefly cover the basics of MLS-MPM and readers are referred to [10] and [8] for a comprehensive introduction of MPM and MLS-MPM, respectively. The material point method is a hybrid Eulerian-Lagrangian method, where both particles and grid nodes are used. Simulation state information is transferred back-and-forth between these two representations. We summarize the notations we use in this paper in Table  IV. Subscripts are used to denote particle () and grid nodes (), while superscripts (, ) are used to distinguish quantities in different time steps. The MLS-MPM simulation cycle has three steps:

  1. Particle-to-grid transfer (P2G). Particles transfer mass , momentum , and stress-contributed impulse to their neighbouring grid nodes, using the Affine Particle-in-Cell method (APIC) [28] and moving least squares force discretization [8], weighted by a compact B-spline kernel :

    (3)
    (4)
    (5)
  2. Grid operations. Grid momentum is normalized into grid velocity after division by grid mass:

    (6)

    Note that neighbouring particles interact with each other through their shared grid nodes, and collisions are handled automatically. Here we omit boundary conditions and gravity for simplicity.

  3. Grid-to-particle transfer (G2P). Particles gather updated velocity , local velocity field gradients and position . The constitutive model properties (e.g. deformation gradients ) are updated.

    (7)
    (8)
    (9)
    (10)

For soft robotics, we additionally introduce an actuation model. Inspired by actuators such as [29], we designed an actuation model that expands or stretches particle via an additional Cauchy stress , with – the stress in the material space. This framework supports the use of other differentiable actuation models including pneumatic, hydraulic, and cable-driven actuators. Fig. 1 illustrates forward simulation and back propagation.

MLS-MPM is naturally differentiable. Though the forward direction has been extensively used in computer graphics, the backward direction (differentiation or back-propagation) is largely unexplored.

Based on the gradients we have derived analytically, we have designed a high-performance implementation that resembles the traditional forward MPM cycle: backward P2G (scatter particle gradients to grid), grid operations, and backward G2P (gather grid gradients to particles). Please see the supplemental document for the gradient derivations. Gradients of state at the end of a time step with respect to states at the starting of the time step can be computed using the chain rule. With the single-step gradients computed, applying the chain rule at a higher level from the final state all-the-way to the initial state yields gradients of the final state with respect to the initial state, as well as to the controller parameters that are used in each state. We cache all the simulation states in memory, using a “memo” object. Though the underlying differentiation is complicated, we have designed a simple high-level TensorFlow interface on which end-users can build their applications (Fig. 2).

Our high-performance implementationBased the Taichi [30] open source computer graphics library. takes advantage of the computational power of modern GPUs through CUDA. We also implemented a reference implementation in TensorFlow. Note that programming physical simulation as a “computation graph” using high-level frameworks such as TensorFlow is less inefficient. In fact, when all the overheads are gone, our optimized CUDA solver is faster than the TensorFlow reference version. This is because TensorFlow is optimized towards deep learning applications where data granularity is much larger and memory access pattern is much more regular than physical simulation, and limited CPU-GPU bandwidth. In contrast, our CUDA implementation is tailored for MLS-MPM and explicitly optimized for parallelism and locality, thus delivering high-performance.

Fig. 2: Left: A “memo” object consists all information of a single simulation execution, including all the time step state information (position, velocity, deformation gradients etc.), and parameters for the initial state , policy parameter . Right: Code samples to get the symbolic differentiation (top) and memo, evaluate gradients out of the memo and symbolic differentiation, and finally use them for gradient descent (bottom).

Iv Evaluation

Approach Impl. # Particles Time per Frame
Flex (3D) CUDA 8,024 3.5 ms (286 FPS)
Ours (3D, F) CUDA 8,000 0.392 ms (2,551 FPS)
Ours (3D, B) CUDA 8,000 0.406 ms (2,463 FPS)
Flex (3D) CUDA 61,238 6 ms (167 FPS)
Ours (3D, F) CUDA 64,000 1.594 ms (628 FPS)
Ours (3D, B) CUDA 64,000 1.774 ms (563 FPS)
Ours (3D, F) CUDA 512,000 10.501 ms (92 FPS)
Ours (3D, B) CUDA 512,000 11.594 ms (86 FPS)
Ours (2D, F) TF 6,400 13.2 ms (76 FPS)
Ours (2D, B) TF 6,400 35.7 ms (28 FPS)
Ours (2D, F) CUDA 6,400 0.10 ms (10,000 FPS)
Ours (2D, B) CUDA 6,400 0.14 ms (7,162 FPS)

TABLE II: Performance comparisons on a NVIDIA GTX 1080 Ti GPU. F stands for forward simulation and B stands for backward differentiation. TF indicates the TensorFlow implementation. When benchmarking our simulator with CUDA we use the C++ instead of python interface to avoid the extra overhead due to the TensorFlow runtime library.

In this section, we conduct a comprehensive study of the efficiency and accuracy of our system, in both 2D and 3D.

Iv-a Efficiency

Instead of using complex geometries, a simple falling cube is used for performance benchmarking, to ensure easy analysis and reproducibility. We benchmark the performance of our CUDA simulator against NVIDIA Flex [31], a popular PBD physical simulator capable of simulating deformable objects. Note that both PBD and MLS-MPM needs substepping iterations to ensure high stiffness. To ensure fair comparison, we set a Young’s modulus, Poisson’s ration and density so that visually ChainQueen gives similar results to Flex. We used two steps per frame and four iterations per step in Flex. Note that setting exactly the same parameters is not possible since in PBD there is no explicitly defined physical quantity such as Young’s modulus.

We summarize the quantitative performance in Table II. Our CUDA simulator provides higher speed than Flex, when the number of particles are the same. It is also worth noting that the TensorFlow implementation is much slower, due to excessive runtime overheads.

Iv-B Accuracy

We design five test cases to evaluate the accuracy of both forward simulation and backward gradient evaluation:

  1. A1 (analytic, 3D, float32 precision): final position w.r.t. initial velocity (with collision). This case tests conservation of momentum, gradient accuracy and stability of back-propagation.

  2. A2 (analytic, 3D, float32 precision): same as A1 but with one collision to a friction-less wall.

  3. B (numeric, 2D, float64 precision): colliding billiards. This case tests gradient accuracy and stability in more complex cases where analytic solutions do not exist. We used float64 precision for accurate finite difference results.

  4. C (numeric, 2D, float64 precision): finger controller. This case tests gradient accuracy of controller parameters, which are used repeatedly in the whole simulation process.

  5. D1 (experimental, pneumatic actuator, actuation) In order to evaluate our simulator’s real-world accuracy, we compared the deformation of a physical actuator to a virtual one. The physical actuator has four pneumatic chambers which can be inflated with an external pump, arranged in a cross-shape. Inflating the individual chambers bends the actuator away from that chamber. The actuator was cast using Smooth-On Dragon Skin 30.

  6. D2 (experimental, pneumatic actuator, bouncing) In a second test, we dropped the same actuator from a 15 cm height, and compared its dynamic motion to a simulation.

Fig. 3: Experiments on the pneumatic leg. Row (A, B) Footage and simulator results of a bouncing experiment with the leg dropping at 15 cm. Row (C, D) Actuation test.

In 3D analytic test cases, where gradients w.r.t. initial velocity can be directly evaluated as in Table III. For the experimental comparisons, the results are shown in Fig. 3. In addition to our simulator’s high performance and accuracy, it is worth noting that that the gradients remain stable in the long term, within up to time steps.

Case 1 steps 10 steps 100 steps 1000 steps
A1
A2 - - -
B - -
C
TABLE III: Relative error in simulation and gradient precision. Empty values are because of too short time for collision to happen.

V Inference, Control and Co-design

The most attractive feature of our simulator is the existence of quickly computable gradients, which allows the use of much more efficient gradient-based optimization algorithms. In this section, we show the effectiveness of our differentiable simulator on gradient-based optimization tasks, including physical inference, control for soft robotics, and co-design of robotic arms.

V-a Physical Parameter Inference

ChainQueen can be used to infer physical system properties given its observed motion, e.g. perform gradient descent to infer the relative densities of two colliding elastic balls (see figure above, ball A moving to the right hitting ball B, and ball B arrives the destination C). Gradient-based optimization infers that relative density of ball A is , which constitutes to the correct momentum to push B to C. Such capability makes it useful for real-world robotic tasks such as system identification.

V-B Control

We can optimize regression-based controllers for soft robots and efficiently discover stable gaits. The controller takes as input the state vector , which includes target position, the center of mass position, and velocity of each composed soft component. In our examples, the actuation vector for up to actuators is generated by the controller in each time step. During optimization, we perform gradient descent on variables and , where is the actuation-generating controller.

We have designed a series of experiments including the 2D biped runner (Fig. 4) and robotic finger, and 3D quadrupedal runner (Fig. 6), crawler and robotic arm. Gradient-based optimizers successfully compute desired controllers within only tens or hundreds of iterations. Visual results are included in the supplemental video.

Fig. 4: A soft 2D walker with controller optimized using gradient descent, aiming to achieve a maximum distance after simulation steps. The walker has four actuators (left, marked by letter ‘A’s) with each capable of stretching or compressing in the vertical direction. The full walking animation (middle and right) is available in the video.
(a) Actuation config
(b) Resting pose
(c) Final pose I
(d) Final pose II
(e) Final pose III
Fig. 5: Final poses of the arm swing task. Lighter colors refer to stiffer regions. (c) Final pose of the fixed-stiffness 300% initial Young’s modulus arm. (d) Final pose of the fixed-stiffness 300% initial Young’s modulus arm. (e) Final pose of the co-optimized arm. Actuation cost is 95.5% that of the fixed 100% initial Young’s modulus arm and converges. Only the co-optimized arm is able to fully reach its target. The final optimized spatially varying stiffness of the arm has lower stiffness on the outside of the bend, and higher stiffness inside, promoting more bend to the left. Qualitatively, this is similar in effect to the pleating on soft robot fingers.
Fig. 6: A 3D quadrupedal runner. Please see the supplemental video for more results.

To emphasize the merits of gradient-based approaches, we compare our control method with proximal policy optimization (PPO) [32], a state-of-the-art reinforcement learning algorithm. PPO is an actor-critic method which relies on sampled gradients of a reward function in order to optimize a policy. This sampling-based approach is model-free; it relies on gradients of the rewards with respect to controller parameters, but not with respect to the physical model for updates. For our comparison, we use velocity projected onto the direction toward the goal as the reward. §§§Note that this is functionally extremely similar to a distance loss; the cumulative reward , where is the initial distance and and represent world coordinates of the robot at time and of the goal, respectively. As velocity toward the goal increases, final distance to the goal decreases. We use a simplified single link version (with only two adjacent actuators) of Fig. 5 and the 2D runner Fig. 4 as a benchmark. Quantitative results for the finger are shown in Fig. 7. We performed a similar comparison on the 2D walker, the controller optimized by ChainQueen for the 2D walker starts functioning well within minutes; by comparison the policy recovered by PPO still chose nearly-random actions after over hours of training; demonstrating that for certain soft locomotion tasks our gradient-based method can be more efficient than model-free approaches.

Fig. 7: Gradient-free optimization using PPO and gradient-descent based on ChainQueen, on the 2D finger task. Thanks to the accurate gradient information, even the most vanilla optimizer can beat state-of-the-art reinforcement learning algorithms by one order of magnitude regarding optimization speed. (Left) single, fixed target. (Middle) random targets. (Right) random targets, larger range. Curves are smoothed over 10, 100 and 100 iterations respectively. The -axis is simulation iterations and -axis the loss.

V-C Co-design

Our simulator is capable of not only providing gradients with respect to dynamics and controller parameters, but also with respect to structural design parameters, enabling co-design of soft robots. To demonstrate this, we designed a multi-link robot arm (two links, two joints each with two side-by-side actuators; all parts deformable). Similar to shooting method trajectory optimization, actuation for each time step is solved for, along with the time-invariant Young’s modulus of the system for each particle. In our task, we optimized the end-effector of the arm to reach a goal ball with final arm velocity, and minimized for actuation cost , where is the actuation vector at timestep , and is the total number of timesteps. This is a dynamic task and the target pose cannot be reached in a static equilibrium. NLOPT’s sequential least squares programming algorithm was used for optimization [33]. We compared our co-design solution to fixed designs. The designed stiffness distribution is shown in Fig. 5, along with controls. The convergence for the different tasks can be seen in Fig. 8. As can be seen, only the co-design arm fully converges to the target goal, and with lower actuation cost. Actuation for each chamber was clamped, and rnges of 30% to 400% of a dimensionless initial Young’s modulus were allowed and chosen large enough such as to require a swing instead of a simple bend.

Fig. 8: Convergence of the arm reaching task for co-design vs. fixed arm designs. The fixed designs can make progress but not complete the task, while with co-design, the task can be completed and the actuation cost is lower. Constraint violation is the norm of two constraints: distance of end-effector to goal and mean squared velocity of the particles.

Vi Discussion

We have presented ChainQueen, a differentiable simulator for soft robotics, and demonstrated how it can be deployed for inference, control, and co-design. ChainQueen has the potential to accelerate the development of soft robots. We have also developed a high-performance GPU implementation for ChainQueen, which we plan to open source.

One interesting future direction is to couple our soft object simulation with rigid body simulation, as done in [8]. As derived in [34], the limit for explicit time integration is , where is a constant close to one, is the density, and is the Young’s modulus. That means for very stiff materials (e.g., rigid bodies), only a very restrictive can be used. However, a rigid body simulator should probably be employed in the realm of nearly-rigid objects and coupled with our deformable body simulator. Combining our simulator with existing rigid-body simulators using Compatible Particle-in-Cell [8] can be an interesting direction.

Acknowledgments

We would like to thank Chenfanfu Jiang, Ming Gao and Kui Wu for the insightful discussions.

Supplemental Document

In this document, we discuss the detailed steps for backward gradient computation in ChainQueen, i.e. the differentiable Moving Least Squares Material Point Method (MLS-MPM) [8]. Again, we summarize the notations in Table IV. We assume fixed particle mass , volume , hyperelastic constitutive model (with potential energy or Young’s modulus and Poisson’s ratio ) for simplicity.

Symbol Type Affiliation Meaning
scalar time step size
scalar grid cell size
vector particle position
scalar particle initial volume
vector particle velocity
matrix particle affine velocity field  [28]
matrix particle PK1 stress ()
matrix particle actuation Cauchy stress
matrix particle actuation stress (material space)
matrix particle deformation gradient
vector node position
scalar node mass
vector node velocity
vector node momentum, i.e.
scalar quadratic B-spline function
TABLE IV: List of notations for MLS-MPM.

Vii Variable dependencies

The MLS-MPM time stepping is defined as follows:

(11)
(12)
(13)
(14)
(15)
(16)
(17)
(18)

The forward variable dependency is as follows:

(20)
(21)
(22)
(23)
(24)
(25)
(26)
(27)

During back-propagation, we have the following reversed variable dependency:

(29)
(30)
(31)
(32)
(33)
(34)
(35)
(36)
(37)
(38)
(39)

We reverse swap two sides of the equations for easier differentiation derivation:

(41)
(42)
(43)
(44)
(45)
(46)
(47)
(48)
(49)
(50)
(51)

In the following sections, we derive detailed gradient relationships, in the order of actual gradient computation. The frictional boundary condition gradients are postponed to the end since it is less central, though during computation it belongs to grid operations. Back-propagation in ChainQueen is essentially a reversed process of forward simulation. The computation has three steps, backward particle to grid (P2G), backward grid operations, and backward grid to particle (G2P).

Viii Backward Particle to Grid (P2G)

(A, P2G) For , we have

(53)
(54)
(55)

(B, P2G) For , we have

(56)
(57)
(58)

Note, the above two gradients should also include the contributions of and respectively, with being the next time step.

(C, P2G) For , we have

(59)
(60)
(61)
(62)

Ix Backward Grid Operations

(D, grid) For , we have

(63)
(64)
(65)

(E, grid) For , we have

(66)
(67)
(68)
(69)

X Backward Grid to Particle (G2P)

(F, G2P) For , we have

(70)
(71)
(72)

(G, G2P) For , we have

(73)
(74)
(75)

(H, G2P) For , we have

(76)
(77)
(78)
(79)
(81)

(I, G2P) For , we have

(82)
(83)
(84)

(J, G2P) For , we have

(85)
(86)
(87)
(88)
(89)
(90)
(91)
(92)
(97)

(K, G2P) For , we have

(99)
(100)
(101)

Xi Friction Projection Gradients

When there are boundary conditions:

(L, grid) For , we have

(102)
(103)
(104)
(105)
(106)
(107)
(108)
(109)
(110)
(111)