Fast Motion Planning for HighDOF Robot Systems Using Hierarchical System Identification
Abstract
We present an efficient algorithm for motion planning and control of a robot system with a high number of degreesoffreedom. These include highDOF soft robots or an articulated robot interacting with a deformable environment. Our approach takes into account dynamics constraints and present a novel technique to accelerate the forward dynamic computation using a datadriven method. We precompute the forward dynamic function of the robot system on a hierarchical adaptive grid. Furthermore, we exploit the properties of underactuated robot systems and perform these computations for a few DOFs. We provide error bounds for our approximate forward dynamics computation and use our approach for optimizationbased motion planning and reinforcementlearningbased feedback control. Our formulation is used for motion planning of two high DOF robot systems: a highDOF lineactuated elastic robot arm and an underwater swimming robot operating in water. As compared to prior techniques based on exact dynamic function computation, we observe one to two orders of magnitude improvement in performance.
I Introduction
HighDOF robot systems are increasingly used for different applications. These include soft robots with deformable joints [1, 2], which have a highdimensional configuration space. Other scenarios correspond to articulated robots interacting with highly deformable objects like cloth [3, 4] or deformable environments like fluids [5, 6]. In these case, the number of degreeoffreedom (DOF , , is typically more than . Modeling forward dynamics of these robot systems are computationally cost. For example, an elastically soft robot can be modeled using the finite element method (FEM) [7] by discretizing the robot into thousands of points, where each evaluation of the forward dynamic function involves inverting a large sparse matrix with a complexity of . As another example, an articulated robot swimming in water can be modeled using the boundary element method (BEM) [5] by discretizing the fluid potential using thousands of points on the robot surface, where each evaluation of the forward dynamic function involves inverting a large dense matrix with a complexity of .
The high computational cost of forward dynamics poses a major challenge for dynamicconstrained motion planning and feedback control algorithms. To find a feasible motion plan or optimize a feedback controller, these algorithms typically evaluate the forward dynamic function hundreds of times per iteration. For example, samplingbased planner [8] determines the feasibility of a sample using a forward dynamic simulator. Optimizationbased planner [9] requires the Jacobian of the forward dynamic simulator to improve the motion plan during each iteration. Finally, reinforcement learning algorithm [10] requires a large number forward dynamic evaluates to compute the policy gradient and improve a feedback controller.
Several methods have been proposed to reduce the number of forward dynamic evaluations required by the motion planning and control algorithms. For samplingbased planners, the number of samples can be reduced by learning a prior sampling distribution centered around highly successful regions [11]. For optimizationbased planners, the number of gradient evaluations can be reduced by using highorder convergent optimizers [12]. Finally, samplingefficient algorithms [13] have been proposed to optimize feedback controllers. However, the number of forward dynamic evaluations is still on the level of thousands [12] or even millions [13], which is impractical for highDOF robot systems.
Another method for improving the sampling efficiency is system identification [14, 15]. This method approximates the exact forward dynamic model with a surrogate model. A good surrogate model should well approximate the exact model while being computationally efficient [16]. These methods are mostly learningbased and require a training dataset. However, it is unclear whether the learned surrogate dynamic model is accurate enough for a given manipulation task. Indeed, [17] noticed that the dataset usually cannot cover the subset of configuration space required to accomplish the motion planning or control task.
Main Results: In this paper, we present a new method of system identification for highDOF robot systems. Our key observation is that, although the configuration space is highdimensional, these robot systems are highly underactuated with only a few controlled DOFs. The number of controlled DOFs typically corresponds to the number of actuators in the system and is minimized for lower cost [18, 19]. As a result, the state of the remaining uncontrollable DOFs can be formulated as a function of the few controlled DOFs, leading to a function , where is the space of the controlled DOFs. Since is lowdimensional, sampling in does not suffer from acurseofdimensionality. Therefore, our method accelerates the evaluations of by precomputing and storing in a hierarchical grid. The hierarchical grid is a highdimensional extension of the octree in 3D, where each parent node has children. This data structure has two desirable features. First, the error due to our approximate forward dynamic function can be bounded. Second, we construct the grid in an ondemand manner, where new sample points are inserted into the grid only when a motion planner requires the point to be sampled. As a result, the sampled dataset covers exactly the part of configuration space required by the given motion planning task and the construction of the hierarchical grid is efficient.
We evaluate the performance of our method on two benchmarks: a dimensional lineactuated elastic robot arm and a dimensional underwater swimming robot. Our formulation uses the approximate forward dynamic function but we still need evaluating the exact function for constructing the hierarchical grid. However, to compute a motion plan, our method reduces the number of exact function evaluations by one to two orders of magnitude. An optimizationbased motion plan can be computed within hours on a desktop machine for a DOF to DOF robot system.
Ii Related Work
We review related work in highDOF robot systems, dynamicconstrained motion planning and control algorithms, and system identification.
HighDOF Dynamic Robot systems have arisen for two reasons. The first reason is the use of soft robots [20]. A prominent method for numerically modeling these soft robots is the finite element method (FEM) [21]. FEM represents a soft robot using a general mesh with thousands of vertices or DOFs. The second reason is interactions between a lowDOF robots and highDOF passive objects, such as when a swimming robot interacts with fluid [5]. To numerically model robotfluid interaction, prior work [5] represents the state of the fluid using the boundary element method (BEM). BEM represents the fluid state using a surface mesh that has hundreds of DOFs in 2D workspaces and tens of thousands of DOFs in 3D workspaces. Another example is a robot arm manipulating a piece of cloth [3, 4, 22], where the state of cloth is also discretized using FEM in [22]; the cloth is also represented using a mesh with thousands of DOFs as well. Both FEM and BEM induce a forward dynamic function , the evaluation of which involves matrix factorization and inversion, where the matrix is of size . As a result, the complexity of evaluating is using FEM [23] and using BEM [24].
DynamicConstrained Motion Planning algorithms are divided into optimizationbased and samplingbased methods. Optimizationbased methods are used to find locally optimal motion plans [12, 9] by minimizing a set of statedependent or controldependent objective functions under the dynamic constraints. Such optimization is performed iteratively, where each iteration involves evaluating the forward dynamic function and its differentials. On the other hand, samplingbased methods, such as [8, 25], seek globally feasible or optimal motion plans. These methods repeatedly propose tentative motion plans and evaluate the proposed motion plans by calling the forward dynamic function . Feedback control algorithms also involve a large number of calls to . Differential dynamic programming [26] relies on to provide state and control differentials. These differentials are used to optimize a trajectory over a short horizon. Finally, reinforcement learning algorithms [10] optimize the feedback controller parameters by repeatedly computing the policy gradient, which requires a large number of samples or calls to the function . Therefore, our method can benefit all these methods by accelerating .
System Identification has been widely used to approximate the forward dynamic function when the evaluation of and its differentials is costly. Most system identification methods are datadriven and approximate the system dynamics using nonparametric models such as the Gaussian mixture model [27], Gaussian process [14, 28], neural networks [29], and nearestneighbor [30]. Our method based on the hierarchical grid is also nonparametric. Most prior methods are standalone, meaning that training data must be collected before using the identified system for motion planning. Recently, system identification has been combined with reinforcement learning [31, 32] for more efficient datasampling for lowDOF dynamic systems. However, these methods do not guarantee the accuracy of the identified system. In contrast, our method not only couples with the motion planning algorithm, but also scales to highDOF dynamic systems and provide guaranteed accuracy.
Iii Problem Formulation
In this section, we introduce the formulation of highDOF robot systems and their forward dynamic functions. Then, we formulate the problem of dynamicconstrained motion planning for highDOF robots.
Iiia HighDOF Robot System Dynamics
A highDOF robot can be formulated as a dynamic system, the configuration space of which is denoted as . Each uniquely determines the kinematic state of the robot and the highDOF environment it is interacting with. To determine the dynamic state of the robot, we need and its time derivative . Given the dynamic state of the robot, its time evolution is governed by the forward dynamic function:
where the subscript denotes the timestep index, is the kinematic state at time instance , and is the timestep size. Finally, we denote as the control input to the dynamic system (e.g., the joint torques for an articulated robot). In this work, we assume that the robot system is highly underactuated so that . This assumption holds because the number of actuators in a robot is kept small for lower costs. For example, [2] proposed a soft robot octopus where each limb is controlled by only two air pumps. The forward dynamic function is a result of discretizing the EulerLagrangian equation governing the dynamics of the robot. In this work, we consider two robot systems: an elastically soft robot arm and an articulated robot swimming in water.
IiiB Elastically Soft Robot
According to [21, 7, 33], the elastically soft robot is governed by the following PDE:
(1) 
where is the internal and external forces, is the mass matrix, and is the control force. This system is discretized by representing the soft robot as a tetrahedra mesh with representing the vertex positions, as illustrated in Figure 1. Then the governing PDE (Equation 1) is discretized using an implicitEuler time integrator as follows:
(2) 
This completes the definition of for our soft robot arm. This function is costly to evaluate because solving for involves inverting a large sparse matrix resulting from FEM discretization.
IiiC Underwater Swimming Robot System
Our second example, the articulated robot swimmer, has a lowdimensional configuration space by itself. The configuration consists of joint parameters. But the robot is interacting with a fluid so that the combined fluid/robot configuration space is highdimensional. According to [6, 5], the fluid’s state can be simplified as a potential flow represented by the potential defined on the robot surface. This is discretized by sampling it on each of the vertices of the robot’s surface mesh, as shown in Figure 2. The kinematic state of the coupled system is . However, can be computed from and using the BEM method, denoted as . The governing dynamic equation in this case is:
(3)  
where is the generalized mass matrix, is the centrifugal and Coriolis force, and is the Jacobian matrix. Finally, the term on the second line of Equation 3 is included to account for the fluid pressure forces, where the integral is over the surface of the robot and is the outward surface normal. Time discretization of Equation 3 is done using an explicitEuler integrator, as follows:
(4)  
This completes the definition of for our underwater swimming robot. This function is costly to evaluate because computing involves inverting a large dense matrix resulting from the BEM discretization.
IiiD DynamicConstrained Motion Planning and Control
In this paper, we focus on the specific problem of dynamicconstrained motion planning and feedback control. In the case of motion planning, we are given a reward function and our goal is to find a series of control commands that maximize the cumulative reward over a trajectory: , where is the planning horizon. This maximization is done under dynamic constraints, i.e. must hold for every timestep:
(5)  
In the case of feedback control, our goal is still to find the control commands but the commands are generated by a feedback controller , where is the optimizable parameters of :
(6)  
In both formulations, must be evaluated tens of thousands of times to find the motion plan or controller parameters. In the next section, we propose a method to accelerate the evaluation of .
Iv Hierarchical System Identification
Our method is based on the observation that highDOF robot systems are highly underactuated. As a result, we can identify a novel function that maps from the lowdimensional control input to the highdimensional kinematic state . This function encodes all the computationally costly steps in the evaluation of and is the function that will be approximated using system identification. We first show how to identify this function for our two examples. Then we show how to construction our hierarchical grid.
Iva Function for an Elastically Soft Robot
For an elastically soft robot, we use a novel method to define a function that encodes all the computationally costly substeps of the forward dynamic function. We first consider a quasistatic procedure, in which all the dynamic behaviors are discarded and only the kinematic behaviors are kept. In this case, Equation 2 becomes:
(7) 
Equation 7 defines our function implicitly. We can also compute explicitly using Newton’s method. This computation is costly due to the inversion of a large sparse matrix .
Given that defines the quasistatic function, we can also recover the dynamic function. We assume that function serves as a shape embedding function such that for each there exists a latent parameter and . Note that is not the control input, but a latent space parameter without physical meaning. This relationship can be plugged into Equation 1 to derive a projected dynamic system in the space of the control input as:
(8)  
where the left multiplication by is due to Galerkin projection (see [34] for more details). To time integrate Equation 8, we first compute from and then recover as . Computing is very efficient as Equation 7 is a lowdimensional dynamic system. In summary, the computational bottleneck of lies in the computation of , which is a mapping from the lowdimensional variables to the highdimensional variable .
IvB Function for an Underwater Swimming Robot
We define our novel function for the underwater swimming robot in this section. The kinematic state is lowdimensional and the fluid potential is high dimensional. We interpret this case as an underactuation because the highdimensional fluid state changes due to the lowdimensional state of the articulated robot. The fluid potential is determined by the boundary condition that fluids and articulated robots should have the same normal velocities at every boundary point:
(9) 
where is a linear operator that finds ’s directional derivative along the normal direction at the th surface sample (see Figure 2), which corresponds to the fluid’s normal velocity. The righthand side corresponds to the robot’s normal velocity. Finally, we compute as:
where we assemble all the equations on all the surface samples from Equation 9. Since there are a lot of surface sample points, is a large dense matrix and inverting it is computationally cost. We therefore define
(10) 
which encodes the computationally costly part of the forward dynamic function . Here we slightly abuse the notation in that the range of is not but . However, our method is still valid in this definition because we only need to be a mapping from a lowdimensional space to a highdimensional one.
IvC Constructing the Hierarchical Grid
At this point, the evaluation of forward dynamic function requires the timeconsuming evaluation of function . Moreover, certain motion planning algorithms require to solve Equation 5 or Equation 6. In this section, we build an approximation for function that can be efficiently evaluated.
We accelerate using a hierarchical gridbased structure, as shown in Figure 3 (a). Since the domain of is lowdimensional, this choice does not suffer from acurseofdimensionality. To evaluate using a grid with a grid size of , we first identify the grid that contains . This grid node has corner points, , with coordinates:
For every corner point , we precompute and . Then, we can approximate using a multivariate cubic spline interpolation [35]. A main benefit of a gridbased structure is that we can improve the approximation accuracy by refining the grid and halving the grid size to . After repeated refinement, a hierarchy of grids is constructed.
A major difficulty lies in the construction of the grid hierarchy. Evaluating on every grid point is infeasible, but we do not know which grid points will be required before solving Equation 5. We therefore choose to build the grid on demand. When the motion planner requires the evaluation of and , the evaluation of is also required. We then check each of the corner points, . When and have not been computed, we invoke a costly algorithm (Equation 10) and store the results in our database. After all the corner points have been evaluated, we perform the multivariate spline interpolation.
The ondemand scheme only constructs the grid at a fixed resolution or grid size. Our method allows the user to define a threshold and continually refines the grid for times until . Therefore, for each evaluation of and , we need to determine which resolution we would like to use. Almost all motion planning [12] and control [10] algorithms start from an initial motion plan or control parameter and updates it iteratively until convergence. We also want to use coarser grids when the algorithm is far from convergence and finer grids when it is close to convergence. However, measuring the convergence of an algorithm is difficult and we do not have a unified solution for different algorithms. As a result, we choose to interleave motion planning or control algorithms with grid refinement. Specifically, we execute the motion planning or control algorithms times. During the th execution of the algorithm, we use the result of the th execution as the initial guess and use a grid resolution of . Note that the only difference between the th execution and th execution is that the accuracy of approximation for is improved. Therefore, the th execution will only disturb the solution slightly. This property will confine the solution space covered by the th execution, and limit the number of new evaluations on the fine grid, as shown in Figure 3 (b).
V Experiments
We have evaluated our method on the 3D versions of the two examples covered in Section III. The computational cost of each substep of our algorithm is summarized in Table I.
The 3D soft robot arm is controlled by four lines attached to four corners of the arm so that the control signal is 4dimensional, , and each evaluation of requires grid corner points. To simulate its dynamic behavior, the soft arm is discretized using a tetrahedra mesh with vertices so that has dimensions. To set up the hierarchical grid, we use an initial grid size of and , so we will execute planning algorithm for times. In this example, we simulate a laser cutter attached to the top of the soft arm and the goal of our motion planning is to have the laser cut out a circle on the metal surface, as shown in Figure 4 (a). Our motion planner is optimizationbased [12], which solves Equation 5. We discretize a trajectory with timesteps so that if system identification is not used, i.e. evaluating exactly each time, then evaluations of and are needed. To measure the rate of acceleration achieved by our method, we plot the number of exact evaluations on grid corner points against the number of iterations of trajectory optimization with and without hierarchical system identification in Figure 6. Our method requires times fewer evaluations and the total computational time is times faster. The total number of evaluations of function with system identification is and that without system identification is .
For the 3D underwater robot swimmer, the robot has hinge joints, so is dimensional and grid corner points are required to evaluate . The fluid potential is discretized on the robot surface with vertices, so of the robot system has dimensions. To set up the hierarchical grid, we use an initial grid size of and , so we will execute planning algorithm for times. Our goal is to have the robot move forward like a fish, as shown in Figure 4 (b). We use two algorithms to plan the motions for this robot. The first algorithm is optimizationbased planner [12], which solves Equation 5. The resulting plot of the number of exact evaluations on grid corner points is shown in Figure 5 (a). Our method requires times fewer evaluations and the estimated total computational time is times faster. Our second algorithm is reinforcement learning [36], which solves Equation 6 and optimizes a feedback swimming controller. This algorithm is also iterative and, in each iteration, [36] calls times. The resulting plot of the number of exact function evaluations during reinforcement learning with and without hierarchical system identification is given in Figure 5 (b). Our method requires times fewer evaluations and the total computational time is times faster.
Example  Cost (s)  Cost (s)  Cost (s)  Iter Cost+HSI (s)  Iter CostHSI (s)  

1.5  1.51  0.01  5.5  305  

0.9  0.902  0.02  3.1  183  

0.9  0.902  0.02  42  16424 
Va Comparison
Several prior work solve similar problems to those in our work. To control an elastically soft robot arm, [37] evaluates and its differentials using finite difference in the space of control signals, . However, this method does not take dynamics into consideration and takes minutes to compute each motion plan in 2D workspaces. Other methods [38] only consider soft robots with a very coarse discretization and do not scale to highdimensional cases. To control an underwater swimming robot, [39] achieves realtime performance in evaluating the forward dynamic function, but they used a simplified fluid drag model, while we use the more accurate potential flow model [5] for the fluid. Finally, the key difference between our method and previous system identification methods such as [27, 14, 28, 29, 30] is that we do not identify the entire forward dynamic function . Instead, we choose to identify a novel function from that encodes the computationally costly part of and does not suffer from acurseofdimensionality.
Vi Conclusion and Limitations
We present a hierarchical, gridbased data structure for performing system identification for highDOF soft robots. Our key observation is that these robots are highly underactuated. Therefore, the most computationally costly step in their forward dynamic function, , can be encoded in another function, , the domain of which is lowdimensional and the range of which is highdimensional. As a result, we can precompute on a grid without suffering from acurseofdimensionality. We build the hierarchical grid until a userspecified accuracy threshold is satisfied. The construction is performed in an ondemand manner and the entire hierarchy construction is interleaved with the motion planning or control algorithms. These techniques effectively reduce the number of grid corner points to be evaluated and thus reduce the total running time.
A major limitation of the current method is that the function cannot always be identified and there is no general method for identifying it. Moreover, our method is only effective when is very lowdimensional to avoid acurseofdimensionality. Another major issue is that we cannot guarantee that function is a onetoone mapping. Indeed, a single control input can lead to multiple quasistatic poses for a soft robot arm. Therefore, a major direction of future research is to extend our gridbased structure to handle functions with special properties such as onetomany function mappings and discontinuous functions. Finally, to further reduce the number of grid corner points to be evaluated, we are interested in using a spatially varying grid resolution, in which higher grid resolutions are used in regions where function changes rapidly.
References
 [1] J. Fras, M. Macias, Y. Noh, and K. Althoefer, “Fluidical bending actuator designed for soft octopus robot tentacle,” in 2018 IEEE International Conference on Soft Robotics (RoboSoft). IEEE, 2018, pp. 253–257.
 [2] J. Fras, Y. Noh, M. Maciaś, H. Wurdemann, and K. Althoefer, “Bioinspired octopus robot based on novel soft fluidic actuator.” IEEE, 2018.
 [3] Z. M. Erickson, H. M. Clever, G. Turk, C. K. Liu, and C. C. Kemp, “Deep haptic model predictive control for robotassisted dressing,” CoRR, vol. abs/1709.09735, 2017.
 [4] A. Clegg, W. Yu, Z. M. Erickson, C. K. Liu, and G. Turk, “Learning to navigate cloth using haptics,” 2017 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp. 2799–2805, 2017.
 [5] E. Kanso, J. E. Marsden, C. W. Rowley, and J. B. MelliHuber, “Locomotion of articulated bodies in a perfect fluid,” Journal of Nonlinear Science, vol. 15, no. 4, pp. 255–289, Aug 2005. [Online]. Available: https://doi.org/10.1007/s0033200406509
 [6] A. Munnier and B. Pinçon, “Locomotion of articulated bodies in an ideal fluid: 2d model with buoyancy, circulation and collisions,” Mathematical Models and Methods in Applied Sciences, vol. 20, no. 10, pp. 1899–1940, 2010. [Online]. Available: https://hal.archivesouvertes.fr/hal00394744
 [7] C. Duriez, “Control of elastic soft robots based on realtime finite element method,” in 2013 IEEE International Conference on Robotics and Automation, May 2013, pp. 3982–3987.
 [8] S. M. LaValle, “Rapidlyexploring random trees: A new tool for path planning,” 1998.
 [9] J. T. Betts, “Survey of numerical methods for trajectory optimization,” Journal of guidance, control, and dynamics, vol. 21, no. 2, pp. 193–207, 1998.
 [10] R. J. Williams, “Simple statistical gradientfollowing algorithms for connectionist reinforcement learning,” Machine Learning, vol. 8, no. 3, pp. 229–256, May 1992. [Online]. Available: https://doi.org/10.1007/BF00992696
 [11] J. Pan and D. Manocha, “Fast probabilistic collision checking for samplingbased motion planning using localitysensitive hashing,” The International Journal of Robotics Research, vol. 35, no. 12, pp. 1477–1496, 2016. [Online]. Available: https://doi.org/10.1177/0278364916640908
 [12] J. Schulman, Y. Duan, J. Ho, A. Lee, I. Awwal, H. Bradlow, J. Pan, S. Patil, K. Goldberg, and P. Abbeel, “Motion planning with sequential convex optimization and convex collision checking,” The International Journal of Robotics Research, vol. 33, no. 9, pp. 1251–1270, 2014. [Online]. Available: https://doi.org/10.1177/0278364914528132
 [13] V. Mnih, A. P. Badia, M. Mirza, A. Graves, T. Lillicrap, T. Harley, D. Silver, and K. Kavukcuoglu, “Asynchronous methods for deep reinforcement learning,” in International conference on machine learning, 2016, pp. 1928–1937.
 [14] C. Williams, S. Klanke, S. Vijayakumar, and K. M. Chai, “Multitask gaussian process learning of robot inverse dynamics,” in Advances in Neural Information Processing Systems 21, D. Koller, D. Schuurmans, Y. Bengio, and L. Bottou, Eds. Curran Associates, Inc., 2009, pp. 265–272. [Online]. Available: http://papers.nips.cc/paper/3385multitaskgaussianprocesslearningofrobotinversedynamics.pdf
 [15] S. Genc, “Parametric system identification using deep convolutional neural networks,” in 2017 International Joint Conference on Neural Networks (IJCNN), May 2017, pp. 2112–2119.
 [16] K. Ã strÃ¶m and P. Eykhoff, “System identificationâa survey,” Automatica, vol. 7, no. 2, pp. 123 – 162, 1971. [Online]. Available: http://www.sciencedirect.com/science/article/pii/0005109871900598
 [17] S. Ross and J. A. Bagnell, “Agnostic system identification for modelbased reinforcement learning,” in Proceedings of the 29th International Coference on International Conference on Machine Learning, ser. ICML’12. USA: Omnipress, 2012, pp. 1905–1912. [Online]. Available: http://dl.acm.org/citation.cfm?id=3042573.3042816
 [18] M. Skouras, B. Thomaszewski, S. Coros, B. Bickel, and M. Gross, “Computational design of actuated deformable characters,” ACM Trans. Graph., vol. 32, no. 4, pp. 82:1–82:10, July 2013. [Online]. Available: http://doi.acm.org/10.1145/2461912.2461979
 [19] X. Xiao, E. Cappo, W. Zhen, J. Dai, K. Sun, C. Gong, M. J. Travers, and H. Choset, “Locomotive reduction for snake robots,” in Robotics and Automation (ICRA), 2015 IEEE International Conference on. IEEE, 2015, pp. 3735–3740.
 [20] D. Rus and M. T. & Tolley, “Design, fabrication and control of soft robots.” Nature, vol. 521, pp. 467–475, 2015.
 [21] Y.c. Fung, P. Tong, and X. Chen, Classical and computational solid mechanics. World Scientific Publishing Company, 2017, vol. 2.
 [22] B. Jia, Z. Hu, J. Pan, and D. Manocha, “Manipulating highly deformable materials using a visual feedback dictionary,” in ICRA, 2018.
 [23] A. George and E. Ng, “On the complexity of sparse $qr$ and $lu$ factorization of finiteelement matrices,” SIAM Journal on Scientific and Statistical Computing, vol. 9, no. 5, pp. 849–861, 1988.
 [24] P. A. P. and N. J. A. L., “A spectral multipole method for efficient solution of largescale boundary element models in elastostatics,” International Journal for Numerical Methods in Engineering, vol. 38, no. 23, pp. 4009–4034.
 [25] D. J. Webb and J. van den Berg, “Kinodynamic rrt*: Optimal motion planning for systems with linear differential constraints,” CoRR, vol. abs/1205.5088, 2012.
 [26] Y. Tassa, T. Erez, and E. Todorov, “Synthesis and stabilization of complex behaviors through online trajectory optimization,” in 2012 IEEE/RSJ International Conference on Intelligent Robots and Systems, Oct 2012, pp. 4906–4913.
 [27] G. Biagetti, P. Crippa, A. Curzi, and C. Turchetti, “Unsupervised identification of nonstationary dynamical systems using a gaussian mixture model based on em clustering of soms,” in Proceedings of 2010 IEEE International Symposium on Circuits and Systems, May 2010, pp. 3509–3512.
 [28] D. NguyenTuong, M. Seeger, and J. Peters, “Model learning with local gaussian process regression,” vol. 23, pp. 2015–2034, 10 2009.
 [29] S. R. Chu, R. Shoureshi, and M. Tenorio, “Neural networks for system identification,” IEEE Control Systems Magazine, vol. 10, no. 3, pp. 31–35, April 1990.
 [30] W. Greblicki and M. Pawlak, “Hammerstein system identification with the nearest neighbor algorithm,” IEEE Transactions on Information Theory, vol. 63, no. 8, pp. 4746–4757, Aug 2017.
 [31] W. Yu, J. Tan, C. K. Liu, and G. Turk, “Preparing for the unknown: Learning a universal policy with online system identification,” arXiv preprint arXiv:1702.02453, 2017.
 [32] S. Levine and P. Abbeel, “Learning neural network policies with guided policy search under unknown dynamics,” in Advances in Neural Information Processing Systems, 2014, pp. 1071–1079.
 [33] F. Largilliere, V. Verona, E. Coevoet, M. SanzLopez, J. Dequidt, and C. Duriez, “Realtime Control of SoftRobots using Asynchronous Finite Element Modeling,” in ICRA 2015, SEATTLE, United States, May 2015, p. 6. [Online]. Available: https://hal.inria.fr/hal01163760
 [34] K. Carlberg, C. BouMosleh, and C. Farhat, “Efficient nonlinear model reduction via a leastsquares petrovâgalerkin projection and compressive tensor approximations,” International Journal for Numerical Methods in Engineering, vol. 86, no. 2, pp. 155–181. [Online]. Available: https://onlinelibrary.wiley.com/doi/abs/10.1002/nme.3050
 [35] C. C. Lalescu, “Two hierarchies of spline interpolations. practical algorithms for multivariate higher order splines,” arXiv preprint arXiv:0905.3564, 2009.
 [36] J. Schulman, F. Wolski, P. Dhariwal, A. Radford, and O. Klimov, “Proximal policy optimization algorithms,” arXiv preprint arXiv:1707.06347, 2017.
 [37] G. Fang, C.D. Matte, T.H. Kwok, and C. C. Wang, “Geometrybased direct simulation for multimaterial soft robots,” in ICRA, 2018.
 [38] R. Gayle, P. Segars, M. C. Lin, and D. Manocha, “Path planning for deformable robots in complex environments,” in In Robotics: Systems and Science, 2005.
 [39] E. Todorov, T. Erez, and Y. Tassa, “Mujoco: A physics engine for modelbased control,” in Intelligent Robots and Systems (IROS), 2012 IEEE/RSJ International Conference on. IEEE, 2012, pp. 5026–5033.