ChainQueen: A RealTime Differentiable Physical Simulator
for Soft Robotics
Abstract
Physical simulators have been widely used in robot planning and control. Among them, differentiable simulators are particularly favored, as they can be incorporated into gradientbased optimization algorithms that are efficient in solving inverse problems such as optimal control and motion planning. Simulating deformable objects is, however, more challenging compared to rigid body dynamics. The underlying physical laws of deformable objects are more complex, and the resulting systems have orders of magnitude more degrees of freedom and therefore they are significantly more computationally expensive to simulate. Computing gradients with respect to physical design or controller parameters is typically even more computationally challenging. In this paper, we propose a realtime, differentiable hybrid LagrangianEulerian physical simulator for deformable objects, ChainQueen, based on the Moving Least Squares Material Point Method (MLSMPM). MLSMPM can simulate deformable objects including contact and can be seamlessly incorporated into inference, control and codesign systems. We demonstrate that our simulator achieves high precision in both forward simulation and backward gradient computation. We have successfully employed it in a diverse set of control tasks for soft robots, including problems with nearly 3,000 decision variables.
I Introduction
Robot planning and control algorithms often rely on physical simulators for prediction and optimization [1, 2]. In particular, differentiable physical simulators enable the use of gradientbased optimizers, significantly improving control efficiency and precision. Motivated by this, there has been extensive research on differentiable rigid body simulators, using approximate [3, 4] and exact [5, 6, 7] methods.
Significant challenges remain for deformable objects. First, simulating the motion of deformable objects is slow, because they have much higher degrees of freedom (DoFs). Second, contact detection and resolution is challenging for deformable objects, due to their changing geometries and potential selfcollisions. Third, closedform and efficient computation of gradients is challenging in the presence of contact. As a consequence, current simulation methods for soft objects cannot be effectively used for solving inverse problems such as optimal control and motion planning.
In this paper, we introduce a realtime, differentiable physical simulator for deformable objects, building upon the Moving Least Squares Material Point Method (MLSMPM) [8]. We name our simulator ChainQueen ^{*}^{*}*Or {CJK}UTF8gbsn 乾坤 , literally “everything between the sky and the earth.”. The Material Point Method (MPM) is a hybrid LagrangianEulerian method that uses both particles and grid nodes for simulation [9]. MLSMPM accelerates and simplifies traditional MPM using a moving least squares force discretization. In ChainQueen, we introduce the first fully differentiable MLSMPM simulator with respect to both state and model parameters, with both forward simulation and backpropagation running efficiently on GPUs. We demonstrate the ability to efficiently calculate gradients with respect to the entire simulation. This enables many novel applications for soft robotics including optimizationbased closedloop controller design, trajectory optimization, and codesign of robot geometry, materials, and control.
As a particlegridbased hybrid simulator, MPM simulates objects of various states, such as liquid (e.g., water), granular materials (e.g., sand), and elastoplastic materials (e.g., snow and human tissue). ChainQueen focuses on elastic materials for soft robotics. It is fully differentiable and faster than the current stateoftheart. Numerical and experimental validation suggest that ChainQueen achieves high precision in both forward simulation and backward gradient computation.
ChainQueen’s differentiability allows it to support gradientbased optimization for control and system identification. By performing gradient descent on controller parameters, our simulator is capable of solving these inverse problems on a diverse set of complex tasks, such as optimizing a 3D soft walker controller given an objective. Similarly, gradient descent on physical design parameters, enables inference of physical properties (e.g. mass, density and Young’s modulus) of objects and optimizing design for a desired task.
In addition to benchmarking ChainQueen’s performance and demonstrating its capabilities on a diverse set of inverse problems, we have interfaced our simulator with highlevel python scripts to make ChainQueen userfriendly. Users at all levels will be able to develop their own soft robotics systems using our simulator, without the need to understand its lowlevel details. We will opensource our code and data and we hope they can benefit the robotics community.
Ii Related Work
Iia Material Point Method
The material point method has been extensively developed from both a solid mechanics [9] and computer graphics [10] perspective. As a hybrid EulerianLangrangian method, MPM has demonstrated its versatility in simulating snow [11, 12], sand [13, 14], nonNewtonion fluids [15], cloth [16, 17], solidfluid coupling [18, 19], rigid body coupling, and cutting [8]. [20] also proposed an adaptive MPM scheme to concentrate computation resources in the regions of interest.
There are many benefits of using MPM for soft robotics. First, MPM is a wellfounded and physicallyaccurate discretization method and can be derived through the weak form of conservation laws. Such a physicallybased approach makes it easier to match simulation with realworld experiments. Second, MPM is friendly to parallelization on modern hardware architectures. Closely related to our work is a highperformance GPU implementation [21] by Gao et al., from which we borrow many useful optimization practices. Though efficient when solving forward simulation, their simulator is not differentiable, making it inefficient for inverse problems in robotics and learning. Third, MPM naturally handles large deformation and (self)collision, which are common in soft robotics, but often not modeled in, e.g., meshbased approaches due to computational expense. Finally, the continuum dynamics (including soft object collision) are governed by the smooth (and differentiable) potential energy, making the whole system differentiable.
Our simulator, ChainQueen, is fully differentiable and the first simulator that applies MPM to soft robotics.
IiB Differentiable Simulation and Control
Recently, there has been an increasing interest in building differentiable simulators for planning and control. For rigid bodies, [22], [3] and [4] proposed to approximate object interaction with neural nets; later, [23] explored their usage in control. Approximate analytic differentiable rigid body simulators have also been proposed [5, 24]. Such systems have been deployed for manipulation and planning [25].
Differentiable simulators for deformable objects have been less studied. Recently, [26] proposed SPNets for differentiable simulation of positionbased fluids [27]. The particle interactions are coded as neural network operations and differentiability is achieved via automatic differentiation in PyTorch. A hierarchical particlebased object representation using neural networks is also proposed in [4]. Instead of approximating physics using neural networks, ChainQueen differentiates MLSMPM, a well physically founded discretization scheme derived from continuum mechanics. In summary, our simulator can be used for a more diverse set of objects; it is more physically plausible, and runs faster.
Symbol  Type  Affiliation  Meaning 
scalar  time step size  
scalar  grid cell size  
vector  particle  position  
scalar  particle  initial volume  
vector  particle  velocity  
matrix  particle  affine velocity field [28]  
matrix  particle  PK1 stress ()  
matrix  particle  actuation Cauchy stress  
matrix  particle  actuation stress (material space)  
matrix  particle  deformation gradient  
vector  node  position  
scalar  node  mass  
vector  node  velocity  
vector  node  momentum, i.e.  
scalar  quadratic Bspline function 
Iii Forward simulation and backpropagation
We use the moving least squares material point method (MLSMPM) [8] to discretize continuum mechanics, which is governed by the following two equations:
(1)  
(2) 
We briefly cover the basics of MLSMPM and readers are referred to [10] and [8] for a comprehensive introduction of MPM and MLSMPM, respectively. The material point method is a hybrid EulerianLagrangian method, where both particles and grid nodes are used. Simulation state information is transferred backandforth between these two representations. We summarize the notations we use in this paper in Table IV. Subscripts are used to denote particle () and grid nodes (), while superscripts (, ) are used to distinguish quantities in different time steps. The MLSMPM simulation cycle has three steps:

Grid operations. Grid momentum is normalized into grid velocity after division by grid mass:
(6) Note that neighbouring particles interact with each other through their shared grid nodes, and collisions are handled automatically. Here we omit boundary conditions and gravity for simplicity.

Gridtoparticle transfer (G2P). Particles gather updated velocity , local velocity field gradients and position . The constitutive model properties (e.g. deformation gradients ) are updated.
(7) (8) (9) (10)
For soft robotics, we additionally introduce an actuation model. Inspired by actuators such as [29], we designed an actuation model that expands or stretches particle via an additional Cauchy stress , with – the stress in the material space. This framework supports the use of other differentiable actuation models including pneumatic, hydraulic, and cabledriven actuators. Fig. 1 illustrates forward simulation and back propagation.
MLSMPM is naturally differentiable. Though the forward direction has been extensively used in computer graphics, the backward direction (differentiation or backpropagation) is largely unexplored.
Based on the gradients we have derived analytically, we have designed a highperformance implementation that resembles the traditional forward MPM cycle: backward P2G (scatter particle gradients to grid), grid operations, and backward G2P (gather grid gradients to particles). ^{†}^{†}†Please see the supplemental document for the gradient derivations. Gradients of state at the end of a time step with respect to states at the starting of the time step can be computed using the chain rule. With the singlestep gradients computed, applying the chain rule at a higher level from the final state alltheway to the initial state yields gradients of the final state with respect to the initial state, as well as to the controller parameters that are used in each state. We cache all the simulation states in memory, using a “memo” object. Though the underlying differentiation is complicated, we have designed a simple highlevel TensorFlow interface on which endusers can build their applications (Fig. 2).
Our highperformance implementation^{‡}^{‡}‡Based the Taichi [30] open source computer graphics library. takes advantage of the computational power of modern GPUs through CUDA. We also implemented a reference implementation in TensorFlow. Note that programming physical simulation as a “computation graph” using highlevel frameworks such as TensorFlow is less inefficient. In fact, when all the overheads are gone, our optimized CUDA solver is faster than the TensorFlow reference version. This is because TensorFlow is optimized towards deep learning applications where data granularity is much larger and memory access pattern is much more regular than physical simulation, and limited CPUGPU bandwidth. In contrast, our CUDA implementation is tailored for MLSMPM and explicitly optimized for parallelism and locality, thus delivering highperformance.
Iv Evaluation
Approach  Impl.  # Particles  Time per Frame 
Flex (3D)  CUDA  8,024  3.5 ms (286 FPS) 
Ours (3D, F)  CUDA  8,000  0.392 ms (2,551 FPS) 
Ours (3D, B)  CUDA  8,000  0.406 ms (2,463 FPS) 
Flex (3D)  CUDA  61,238  6 ms (167 FPS) 
Ours (3D, F)  CUDA  64,000  1.594 ms (628 FPS) 
Ours (3D, B)  CUDA  64,000  1.774 ms (563 FPS) 
Ours (3D, F)  CUDA  512,000  10.501 ms (92 FPS) 
Ours (3D, B)  CUDA  512,000  11.594 ms (86 FPS) 
Ours (2D, F)  TF  6,400  13.2 ms (76 FPS) 
Ours (2D, B)  TF  6,400  35.7 ms (28 FPS) 
Ours (2D, F)  CUDA  6,400  0.10 ms (10,000 FPS) 
Ours (2D, B)  CUDA  6,400  0.14 ms (7,162 FPS) 
In this section, we conduct a comprehensive study of the efficiency and accuracy of our system, in both 2D and 3D.
Iva Efficiency
Instead of using complex geometries, a simple falling cube is used for performance benchmarking, to ensure easy analysis and reproducibility. We benchmark the performance of our CUDA simulator against NVIDIA Flex [31], a popular PBD physical simulator capable of simulating deformable objects. Note that both PBD and MLSMPM needs substepping iterations to ensure high stiffness. To ensure fair comparison, we set a Young’s modulus, Poisson’s ration and density so that visually ChainQueen gives similar results to Flex. We used two steps per frame and four iterations per step in Flex. Note that setting exactly the same parameters is not possible since in PBD there is no explicitly defined physical quantity such as Young’s modulus.
We summarize the quantitative performance in Table II. Our CUDA simulator provides higher speed than Flex, when the number of particles are the same. It is also worth noting that the TensorFlow implementation is much slower, due to excessive runtime overheads.
IvB Accuracy
We design five test cases to evaluate the accuracy of both forward simulation and backward gradient evaluation:

A1 (analytic, 3D, float32 precision): final position w.r.t. initial velocity (with collision). This case tests conservation of momentum, gradient accuracy and stability of backpropagation.

A2 (analytic, 3D, float32 precision): same as A1 but with one collision to a frictionless wall.

B (numeric, 2D, float64 precision): colliding billiards. This case tests gradient accuracy and stability in more complex cases where analytic solutions do not exist. We used float64 precision for accurate finite difference results.

C (numeric, 2D, float64 precision): finger controller. This case tests gradient accuracy of controller parameters, which are used repeatedly in the whole simulation process.

D1 (experimental, pneumatic actuator, actuation) In order to evaluate our simulator’s realworld accuracy, we compared the deformation of a physical actuator to a virtual one. The physical actuator has four pneumatic chambers which can be inflated with an external pump, arranged in a crossshape. Inflating the individual chambers bends the actuator away from that chamber. The actuator was cast using SmoothOn Dragon Skin 30.

D2 (experimental, pneumatic actuator, bouncing) In a second test, we dropped the same actuator from a 15 cm height, and compared its dynamic motion to a simulation.
In 3D analytic test cases, where gradients w.r.t. initial velocity can be directly evaluated as in Table III. For the experimental comparisons, the results are shown in Fig. 3. In addition to our simulator’s high performance and accuracy, it is worth noting that that the gradients remain stable in the long term, within up to time steps.
Case  1 steps  10 steps  100 steps  1000 steps 
A1  
A2        
B      
C 
V Inference, Control and Codesign
The most attractive feature of our simulator is the existence of quickly computable gradients, which allows the use of much more efficient gradientbased optimization algorithms. In this section, we show the effectiveness of our differentiable simulator on gradientbased optimization tasks, including physical inference, control for soft robotics, and codesign of robotic arms.
Va Physical Parameter Inference
ChainQueen can be used to infer physical system properties given its observed motion, e.g. perform gradient descent to infer the relative densities of two colliding elastic balls (see figure above, ball A moving to the right hitting ball B, and ball B arrives the destination C). Gradientbased optimization infers that relative density of ball A is , which constitutes to the correct momentum to push B to C. Such capability makes it useful for realworld robotic tasks such as system identification.
VB Control
We can optimize regressionbased controllers for soft robots and efficiently discover stable gaits. The controller takes as input the state vector , which includes target position, the center of mass position, and velocity of each composed soft component. In our examples, the actuation vector for up to actuators is generated by the controller in each time step. During optimization, we perform gradient descent on variables and , where is the actuationgenerating controller.
We have designed a series of experiments including the 2D biped runner (Fig. 4) and robotic finger, and 3D quadrupedal runner (Fig. 6), crawler and robotic arm. Gradientbased optimizers successfully compute desired controllers within only tens or hundreds of iterations. Visual results are included in the supplemental video.
To emphasize the merits of gradientbased approaches, we compare our control method with proximal policy optimization (PPO) [32], a stateoftheart reinforcement learning algorithm. PPO is an actorcritic method which relies on sampled gradients of a reward function in order to optimize a policy. This samplingbased approach is modelfree; it relies on gradients of the rewards with respect to controller parameters, but not with respect to the physical model for updates. For our comparison, we use velocity projected onto the direction toward the goal as the reward. ^{§}^{§}§Note that this is functionally extremely similar to a distance loss; the cumulative reward , where is the initial distance and and represent world coordinates of the robot at time and of the goal, respectively. As velocity toward the goal increases, final distance to the goal decreases. We use a simplified single link version (with only two adjacent actuators) of Fig. 5 and the 2D runner Fig. 4 as a benchmark. Quantitative results for the finger are shown in Fig. 7. We performed a similar comparison on the 2D walker, the controller optimized by ChainQueen for the 2D walker starts functioning well within minutes; by comparison the policy recovered by PPO still chose nearlyrandom actions after over hours of training; demonstrating that for certain soft locomotion tasks our gradientbased method can be more efficient than modelfree approaches.
VC Codesign
Our simulator is capable of not only providing gradients with respect to dynamics and controller parameters, but also with respect to structural design parameters, enabling codesign of soft robots. To demonstrate this, we designed a multilink robot arm (two links, two joints each with two sidebyside actuators; all parts deformable). Similar to shooting method trajectory optimization, actuation for each time step is solved for, along with the timeinvariant Young’s modulus of the system for each particle. In our task, we optimized the endeffector of the arm to reach a goal ball with final arm velocity, and minimized for actuation cost , where is the actuation vector at timestep , and is the total number of timesteps. This is a dynamic task and the target pose cannot be reached in a static equilibrium. NLOPT’s sequential least squares programming algorithm was used for optimization [33]. We compared our codesign solution to fixed designs. The designed stiffness distribution is shown in Fig. 5, along with controls. The convergence for the different tasks can be seen in Fig. 8. As can be seen, only the codesign arm fully converges to the target goal, and with lower actuation cost. Actuation for each chamber was clamped, and rnges of 30% to 400% of a dimensionless initial Young’s modulus were allowed and chosen large enough such as to require a swing instead of a simple bend.
Vi Discussion
We have presented ChainQueen, a differentiable simulator for soft robotics, and demonstrated how it can be deployed for inference, control, and codesign. ChainQueen has the potential to accelerate the development of soft robots. We have also developed a highperformance GPU implementation for ChainQueen, which we plan to open source.
One interesting future direction is to couple our soft object simulation with rigid body simulation, as done in [8]. As derived in [34], the limit for explicit time integration is , where is a constant close to one, is the density, and is the Young’s modulus. That means for very stiff materials (e.g., rigid bodies), only a very restrictive can be used. However, a rigid body simulator should probably be employed in the realm of nearlyrigid objects and coupled with our deformable body simulator. Combining our simulator with existing rigidbody simulators using Compatible ParticleinCell [8] can be an interesting direction.
Acknowledgments
We would like to thank Chenfanfu Jiang, Ming Gao and Kui Wu for the insightful discussions.
Supplemental Document
In this document, we discuss the detailed steps for backward gradient computation in ChainQueen, i.e. the differentiable Moving Least Squares Material Point Method (MLSMPM) [8]. Again, we summarize the notations in Table IV. We assume fixed particle mass , volume , hyperelastic constitutive model (with potential energy or Young’s modulus and Poisson’s ratio ) for simplicity.
Symbol  Type  Affiliation  Meaning 
scalar  time step size  
scalar  grid cell size  
vector  particle  position  
scalar  particle  initial volume  
vector  particle  velocity  
matrix  particle  affine velocity field [28]  
matrix  particle  PK1 stress ()  
matrix  particle  actuation Cauchy stress  
matrix  particle  actuation stress (material space)  
matrix  particle  deformation gradient  
vector  node  position  
scalar  node  mass  
vector  node  velocity  
vector  node  momentum, i.e.  
scalar  quadratic Bspline function 
Vii Variable dependencies
The MLSMPM time stepping is defined as follows:
(11)  
(12)  
(13)  
(14)  
(15)  
(16)  
(17)  
(18) 
The forward variable dependency is as follows:
(20)  
(21)  
(22)  
(23)  
(24)  
(25)  
(26)  
(27) 
During backpropagation, we have the following reversed variable dependency:
(29)  
(30)  
(31)  
(32)  
(33)  
(34)  
(35)  
(36)  
(37)  
(38)  
(39) 
We reverse swap two sides of the equations for easier differentiation derivation:
(41)  
(42)  
(43)  
(44)  
(45)  
(46)  
(47)  
(48)  
(49)  
(50)  
(51) 
In the following sections, we derive detailed gradient relationships, in the order of actual gradient computation. The frictional boundary condition gradients are postponed to the end since it is less central, though during computation it belongs to grid operations. Backpropagation in ChainQueen is essentially a reversed process of forward simulation. The computation has three steps, backward particle to grid (P2G), backward grid operations, and backward grid to particle (G2P).
Viii Backward Particle to Grid (P2G)
(A, P2G) For , we have
(53)  
(54)  
(55) 
(B, P2G) For , we have
(56)  
(57)  
(58) 
Note, the above two gradients should also include the contributions of and respectively, with being the next time step.
(C, P2G) For , we have
(59)  
(60)  
(61)  
(62) 
Ix Backward Grid Operations
(D, grid) For , we have
(63)  
(64)  
(65) 
(E, grid) For , we have
(66)  
(67)  
(68)  
(69) 
X Backward Grid to Particle (G2P)
(F, G2P) For , we have
(70)  
(71)  
(72) 
(G, G2P) For , we have
(73)  
(74)  
(75) 
(H, G2P) For , we have
(76)  
(77)  
(78)  
(79)  
(81)  
(I, G2P) For , we have
(82)  
(83)  
(84) 
(J, G2P) For , we have
(85)  
(86)  
(87)  
(88)  
(89)  
(90)  
(91)  
(92)  
(97)  
(K, G2P) For , we have
(99)  
(100)  
(101) 
Xi Friction Projection Gradients
When there are boundary conditions:
(L, grid) For , we have
(102)  
(103)  
(104)  
(105)  
(106)  
(107)  
(108)  
(109)  
(110)  
(111)  