A memory of motion for visual predictive control tasks


This paper addresses the problem of efficiently achieving visual predictive control tasks. To this end, a memory of motion, containing a set of trajectories built off-line, is used for leveraging precomputation and dealing with difficult visual tasks. Standard regression techniques, such as k-nearest neighbors and Gaussian process regression, are used to query the memory and provide on-line a warm-start and a way point to the control optimization process. The proposed technique allows the control scheme to achieve high performance and, at the same time, keep the computational time limited. Simulation and experimental results, carried out with a 7-axis manipulator, show the effectiveness of the approach.

©2020 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.

I Introduction

Image-based visual servoing (VS) is a well-established technique to control robots using visual information [6] [7]. Its classic formulation consists of the simple control law $v = -\lambda \hat{L}^{+} e$, where $v$ is the velocity of the camera, $\lambda$ is the control gain and $\hat{L}^{+}$ is the pseudo-inverse of the image Jacobian (or interaction matrix) $L$; the hat symbol denotes an approximation. This control law ensures an exponential convergence to zero of the visual error $e = s - s^{*}$, i.e., the difference between the measured and desired visual features ($s$ and $s^{*}$, respectively). Although the VS control law is easy to implement and fast to execute, it has some limitations. For large values of the error, the behavior can be unstable, and for some configurations the Jacobian can become singular, causing dangerous commands [8]. Being purely reactive, VS does not perform any sort of anticipatory behavior that would improve the tracking performance. Furthermore, it cannot easily include (visual or Cartesian) constraints, which are very useful in real-life robotic experiments.
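As an illustration of the classic control law recalled above, the following minimal sketch computes the camera velocity for point features. It assumes normalized image coordinates, a single known depth shared by all points, and the standard interaction matrix of an image point; the function names, the gain value and the example features are hypothetical and not taken from the paper.

```python
import numpy as np

def point_interaction_matrix(x, y, Z):
    """Standard 2x6 interaction matrix of a normalized image point at depth Z."""
    return np.array([
        [-1.0 / Z, 0.0, x / Z, x * y, -(1.0 + x**2), y],
        [0.0, -1.0 / Z, y / Z, 1.0 + y**2, -x * y, -x],
    ])

def ibvs_velocity(s, s_star, Z, gain=0.5):
    """Camera velocity: minus the gain times pinv(L_hat) times the visual error."""
    points = s.reshape(-1, 2)
    # Stack one 2x6 block per point feature (depth Z is an assumed approximation).
    L_hat = np.vstack([point_interaction_matrix(x, y, Z) for x, y in points])
    return -gain * np.linalg.pinv(L_hat) @ (s - s_star)

# Hypothetical example: four points, shifted from their desired positions.
s_star = np.array([-0.1, -0.1, 0.1, -0.1, 0.1, 0.1, -0.1, 0.1])
s = s_star + 0.05
v = ibvs_velocity(s, s_star, Z=1.0)  # 6-vector: linear and angular velocity
```

The sketch is purely reactive, as discussed in the text: it contains no preview of the future motion and no mechanism to enforce constraints.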

Planning techniques [18] can be employed to compute trajectories that achieve the desired visual task while respecting constraints. Alternatively, VS can be formulated as an optimization problem, which makes it easy to include constraints. In [1], VS is written as a quadratic program (QP) so that it can account for the constrained whole-body motion of humanoid robots. Similarly, a virtual VS written as a QP is proposed in [21] to achieve manipulation tasks. Visual planning and control can be solved together by using a model to predict the feature motion and the corresponding commands over a preview window [27]. Indeed, the model predictive control (MPC) technique can be applied to VS, yielding the so-called visual predictive control (VPC) framework [2] [3]. The main drawback of VPC is its computation time. The flatness property [23] [4] can be used to reduce the problem complexity, but it is not applicable to all kinds of dynamics.

In this work, we propose to use a dataset of pre-processed solutions to improve the performance of VPC (recalled in Sect. II). To this end, an initialization and a way point are inferred on-line from the dataset. Section III reviews the literature on methods used to exploit stored data; the proposed approach is detailed in Sect. IV. Simulations and experiments showing the effectiveness of the approach are presented in Sect. V. Section VI concludes the paper and discusses future work.

II Background

The VPC paradigm [2] [3] aims at solving planning and control simultaneously. To this end, it computes a control sequence over a preview window by solving the optimization


where the cost function is defined as


and the optimization variable consists in the sequence of control actions to take along the preview window


In (2) and (3), is the number of iterations defining the size of the preview window, while is the control horizon defined such that from to the control is constant and equal to ; and are two matrices used to weight the error and penalize the control effort, respectively. In the preview window, i.e. for , the problem is subject to


with the difference between the measured and the first previewed feature, constant over the preview window. is the sampling time. Constraints on the optimization variable


account for actuation limits, the ones on the visual features


achieve visibility constraints: (6) forces the features to stay in an area, e.g., to prevent them from leaving the image plane, while (7) makes it possible to avoid occlusions or spots on the lens. Together, (5)-(7) compose the set of non-linear constraints in (1).

Following the MPC rationale, at each iteration , VPC measures the visual features , predicts the motion over the preview window using the model in (4), minimizes the cost function (2) and finally computes the commands . Only the first control of this sequence is applied to the real system which moves, providing a new set of features. Then, the loop starts again. To achieve a satisfactory behavior, the control is usually kept constant over the preview window (), while is tuned as a trade-off between a long (better tracking performance) and a short preview window (lower computational cost). More constraints (e.g., on camera position) can be added. In (4) a local model of the visual features is used, but a global model of the camera motion can also be considered. More details can be found in [2] [3].
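The receding-horizon step described above can be sketched with SciPy's SLSQP solver. This is a simplified illustration under stated assumptions, not the paper's implementation: the interaction matrix is frozen over the preview window, the correction term of the prediction model and the visibility constraints are omitted, and all numerical values (sampling time, window length, weights, velocity bounds) as well as the function names are hypothetical.

```python
import numpy as np
from scipy.optimize import minimize

def point_interaction_matrix(x, y, Z):
    """Standard 2x6 interaction matrix of a normalized image point at depth Z."""
    return np.array([
        [-1.0 / Z, 0.0, x / Z, x * y, -(1.0 + x**2), y],
        [0.0, -1.0 / Z, y / Z, 1.0 + y**2, -x * y, -x],
    ])

def stacked_interaction_matrix(s, Z):
    return np.vstack([point_interaction_matrix(x, y, Z) for x, y in s.reshape(-1, 2)])

def vpc_cost(u, s0, s_target, L, Ts, Np, Q, R):
    """Weighted feature error plus control effort, summed over the preview window."""
    s, cost = s0.copy(), 0.0
    for _ in range(Np):
        s = s + Ts * (L @ u)       # local feature model with a constant command
        e = s - s_target
        cost += e @ Q @ e + u @ R @ u
    return cost

def vpc_step(s0, s_target, u_init, Z, Ts=1 / 30, Np=5, v_max=0.2, w_max=0.3):
    """One VPC iteration: optimize the (constant) command from the warm-start u_init."""
    L = stacked_interaction_matrix(s0, Z)      # frozen over the window (assumption)
    Q, R = np.eye(s0.size), 1e-3 * np.eye(6)   # hypothetical weights
    bounds = [(-v_max, v_max)] * 3 + [(-w_max, w_max)] * 3  # actuation limits
    res = minimize(vpc_cost, u_init, args=(s0, s_target, L, Ts, Np, Q, R),
                   method="SLSQP", bounds=bounds,
                   options={"maxiter": 100, "ftol": 1e-9})
    return res.x, res.success

s_star = np.array([-0.1, -0.1, 0.1, -0.1, 0.1, 0.1, -0.1, 0.1])
s0 = s_star + 0.05
u, ok = vpc_step(s0, s_star, u_init=np.zeros(6), Z=1.0)
```

In a closed loop, only `u` would be sent to the camera before measuring the features again and repeating the step; the argument `u_init` is where a warm-start enters, and `s_target` is where a way point could replace the final target.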

Solving (1) with the constraints (5)-(7) is a non-convex optimization problem. As such, the solution depends on the solver initialization. If the initialization is far from the global optimum, convergence can be slow, or the solver can get stuck in a local minimum and provide unsatisfactory results. Thus, it is important to provide the solver with a warm-start, i.e., an initial command sequence already close to the optimal solution. To avoid the constraints, the warm-start can guide the motion away from the target momentarily. However, providing only warm-starts may not be sufficient. In fact, a solver with a short time horizon might consider the warm-start to be sub-optimal and modify it to move towards the goal and, as a consequence, get stuck in a local optimum at the constraint boundary. One possible solution is to consider a long preview window and set the cost only at the end of the horizon, but this is computationally expensive. A better idea is to adjust the cost function with a proper way point, used as a sub-target to follow.

We propose to use a memory of motion, i.e., a dataset of precomputed trajectories, to infer both a warm-start and a way point during the on-line VPC execution. In this way, we leverage precomputation to shorten the VPC preview window while maintaining high performance.

III State-of-the-Art

Leveraging information stored in a memory to control or plan robotic motions has been the subject of lively research. In [30], a library of trajectories is queried by k-nearest neighbor (k-NN) to infer the control action to take during the experiment. A similar method [15] selects from the library a control which is then refined by differential dynamic programming. As an alternative to planning from scratch, the framework in [5] starts the planner from a trajectory learned from experience. In [9], Gaussian process regression (GPR) is used to adapt the motion, stored as dynamic motion primitives, to the actual situation perceived by the robot. The line of works [28, 11] considers a robot motion database built from human demonstrations. This gives the controller a guess of the motion to make, possibly modified by the presence of obstacles. Demonstrations and optimization techniques are used in [29] to handle constraints in a visual planner.

To improve the convergence of planning or control frameworks written as optimization problems, the memory can be used to provide the solvers with a warm-start. In [16], a memory is iteratively built, expanding a probabilistic road map (PRM) using a local planner. A neural network (NN) is trained, in parallel, with the current trajectories stored in the PRM and used to give the local planner a warm-start to better connect the map. The final NN is then used to infer the warm-start for the on-line controller. In the context of a trajectory optimizer, the initialization is computed by applying k-NN and locally weighted regression to a set of pre-optimized trajectories [13]. In [17], k-NN infers the warm-starts for a planner from a memory of motion. The same kind of problem is addressed in [26] with different techniques, i.e., k-NN, GPR and Bayesian Gaussian mixture regression, which also make it possible to cope with multi-modal solutions.

Other approaches reshape the cost function to guide the solver towards an optimal solution. For example, the interior point method [24] solves an inequality-constrained problem by adding a logarithmic barrier function to the cost. In this way, the search for the solution starts from the inner region of the feasible space and then moves to the boundary region. In humanoid motion planning [19], heuristic sub-goals are introduced in the early stage of the optimization based on the zero-moment point stability criterion. In [31], to avoid discontinuity, the contact dynamics are smoothed such that virtual contact forces can exist at a distance. In reinforcement learning, it is common to modify a sparse reward function, which is difficult to optimize, by providing intermediate rewards as way points [22].

To build our framework and successfully achieve VPC tasks, we took inspiration from the different approaches existing in the literature. In particular, we decided to exploit the information contained in a memory of motion to infer: (i) a warm-start to properly initialize our optimization solver; and (ii) a way point to be used in the cost in lieu of the final target.

IV The Proposed Approach

As recalled in Sect. II, VPC computes a control sequence by solving a minimization problem. To efficiently find an optimal solution, the process has to converge fast and avoid local minima. Thus, it is important to initialize the solver with a warm-start , and reshape the cost function using a way point in place of the target . This section explains how to infer the warm-start and way point from a memory.

The memory of motion is a dataset , of samples. Each feature describes a particular visual configuration and is composed of a set of visual features, the area and the orientation of the visual pattern (see footnote 1)


where , is the dimension of the visual feedback . We consider and along with in (8) to make the samples distinguishable, not only in terms of the visual appearance but also w.r.t. the corresponding camera poses. The output variable contains the proper control action to take and the way point to follow as a function of . Since the control is constant in the preview window (, see Sect. II), it is enough to store the single command


where , with the actuated degrees of freedom of the camera. All the samples are collected in the matrices


The whole process of computing the warm-start and the way point consists of building the memory off-line and querying it on-line.
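To make the structure of one memory sample concrete, the sketch below assembles an input vector (visual features, area and orientation of the visual pattern) and an output vector (single command and way point) for point features. The area uses the shoelace formula for the polygon whose vertices are the features. How the orientation is defined is not recoverable from the text, so the angle of the first polygon edge is used here purely as an assumption; all function names are hypothetical.

```python
import numpy as np

def polygon_area(points):
    """Shoelace formula for the polygon whose vertices are the point features."""
    x, y = points[:, 0], points[:, 1]
    return 0.5 * abs(np.dot(x, np.roll(y, -1)) - np.dot(y, np.roll(x, -1)))

def pattern_angle(points):
    """Orientation of the pattern, assumed here to be the angle of its first edge."""
    d = points[1] - points[0]
    return np.arctan2(d[1], d[0])

def memory_input(s):
    """Input sample: visual features followed by area and orientation."""
    points = s.reshape(-1, 2)
    return np.concatenate([s, [polygon_area(points), pattern_angle(points)]])

def memory_output(u, way_point):
    """Output sample: the single stored command followed by the way point."""
    return np.concatenate([u, way_point])

s = np.array([0.0, 0.0, 1.0, 0.0, 1.0, 1.0, 0.0, 1.0])  # hypothetical unit square
x_sample = memory_input(s)                               # length 8 + 2
y_sample = memory_output(np.zeros(6), s)                 # length 6 + 8
```

Stacking such vectors row by row gives the two matrices that collect all the samples of the memory.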

  Input: Output: ,
  while  do
     , success  False
     while   do
     end while
     if success is True then
         for  do
            Store and in and
         end for
     end if
  end while
Fig. 1: Algorithm generating the trajectories for the memory of motion.

IV-A Building the memory of motion

The memory of motion is built by running VPC off-line for different sets of initial visual features. The aim is to compute successful trajectories that achieve the visual task. To this end, the same solver used for the on-line executions is used to build the memory. However, since the aim is to build ‘high-quality’ samples and there is no strict constraint on the execution time (the memory is built off-line), the solver is set up with tight thresholds on the solution optimality, a high maximum number of iterations, and a large VPC preview window.

The process building the memory of motion is presented in the algorithm of Fig. 1. For random initial conditions , if the VPC solver succeeds in finding a feasible solution (no constraint is violated) and the task is achieved ( converge to in the given time), then all the visual features from to are saved ( is the length of the trajectory). Thus, , the following actions are executed:

  • the area and angle of the corresponding visual pattern are computed;

  • the way point is computed as the visual features at samples ahead (); if , ;

  • the corresponding solution is selected.
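The per-sample bookkeeping listed above can be sketched as follows: for each step of a successful trajectory, the way point is the feature vector a fixed number of samples ahead, clamped to the last sample near the end of the trajectory. This is a minimal sketch assuming the trajectory is stored as arrays; the function names are hypothetical, and the area and orientation terms of the input vector are left out for brevity.

```python
import numpy as np

def way_point_indices(length, n_ahead):
    """Index of the way point for each step: n_ahead samples later, clamped to the end."""
    return np.minimum(np.arange(length) + n_ahead, length - 1)

def trajectory_to_samples(features, commands, n_ahead):
    """Turn one successful trajectory into memory inputs and outputs.

    features: (T, n_s) visual features along the trajectory
    commands: (T, n_u) solver solution (single command) at each step
    """
    idx = way_point_indices(len(features), n_ahead)
    way_points = features[idx]
    return features, np.hstack([commands, way_points])

# Hypothetical trajectory with 5 steps, 8 feature coordinates and 6 velocity components.
features = np.arange(40, dtype=float).reshape(5, 8)
commands = np.zeros((5, 6))
X_traj, Y_traj = trajectory_to_samples(features, commands, n_ahead=2)
```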

With this information, the vectors and are obtained and finally stored in and . The initial value of the visual features is generated randomly at the start of the memory building, while at a later stage it is biased toward the distributions corresponding to the set of unsuccessful initial conditions (estimated by a Gaussian mixture model), so that the solver attempts to solve the difficult cases once the database contains a sufficient number of samples. The algorithm uses the function ’Find Solution’, which tries to find an optimal solution by employing the strategies detailed in the algorithm of Fig. 2. It implements an iterative mechanism by which the memory building process benefits from the current status of the memory itself. Indeed, if there are enough trajectories in the memory, and the features are close to the constraints (in which case the function ’Is_Close’ returns True), the solver is provided with a warm-start and a way point inferred by a k-NN algorithm (details in Sect. IV-B). Otherwise, the algorithm tries to solve the VPC using the previous solution as warm-start. If the solver does not manage to find a successful solution, two recovery strategies are executed in turn: the solver is warm-started with (i) one of 12 pre-defined camera velocity directions; or (ii) one of 10 random camera velocity directions. In the presented algorithms, ’∧’ and ’¬’ denote the AND and NOT logic operators, respectively. Once the memory is built, it is ready to be queried on-line.

  Input: , , , , Output: , success
  if Is_Close() is True  then
  end if
  while success is False  do
  end while
  while success is False  do
  end while
Fig. 2: Algorithm trying to find a feasible VPC solution.
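The fallback logic described for 'Find Solution' can be sketched as a chain of warm-start attempts. This is one possible reading of the textual description, not a transcription of the algorithm in Fig. 2: `solve` is a hypothetical wrapper around the VPC solver that returns a solution and a success flag, the 12 pre-defined directions are assumed to be the positive and negative unit directions along the six velocity axes, and the scale of the warm-start velocities is arbitrary.

```python
import numpy as np

def find_solution(solve, u_prev, target, memory_guess=None,
                  n_random=10, v_scale=0.1, rng=None):
    """Try warm-starts in order until `solve` reports success.

    solve(u_init, target) -> (u, success): hypothetical VPC solver wrapper.
    memory_guess: optional (u_init, way_point) pair inferred from the memory,
                  passed by the caller when the features are close to a constraint.
    """
    rng = np.random.default_rng() if rng is None else rng
    if memory_guess is not None:
        u, ok = solve(*memory_guess)      # warm-start and way point from the memory
    else:
        u, ok = solve(u_prev, target)     # previous solution as warm-start
    if ok:
        return u, True
    # Recovery (i): 12 pre-defined directions, assumed to be +/- each velocity axis.
    predefined = v_scale * np.vstack([np.eye(6), -np.eye(6)])
    # Recovery (ii): random directions, normalized and scaled.
    random_dirs = rng.normal(size=(n_random, 6))
    random_dirs = v_scale * random_dirs / np.linalg.norm(random_dirs, axis=1, keepdims=True)
    for u_init in np.vstack([predefined, random_dirs]):
        u, ok = solve(u_init, target)
        if ok:
            return u, True
    return u, False
```

Keeping the solver behind a callable makes the same chain usable both when building the memory and when testing the logic with a stub solver.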

IV-B Querying the memory of motion

The aim of querying the memory of motion is to infer from the dataset a proper initial guess and way point for the on-line VPC solver, given the current configuration of the visual features. This means that we need to learn the map from so that an estimate can be computed for a novel feature . The map is learned using standard regression techniques, i.e., k-NN and GPR, as also proposed in [26]. In what follows, we describe the adaptation required for the VPC application.

The k-NN algorithm is a simple non-parametric method selecting the closest samples in the dataset , given a new feature . The distance between samples is computed as the Euclidean norm of their difference. The corresponding closest outputs are thus averaged to provide the estimated output
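A minimal NumPy sketch of this k-NN query is given below: Euclidean distances to all stored inputs, selection of the closest ones, and averaging of their outputs. The function name, the default number of neighbors and the toy data are hypothetical.

```python
import numpy as np

def knn_query(X, Y, x_new, k=1):
    """Average the outputs of the k stored samples closest to x_new."""
    distances = np.linalg.norm(X - x_new, axis=1)  # Euclidean norm per sample
    nearest = np.argsort(distances)[:k]
    return Y[nearest].mean(axis=0)

# Toy memory: three samples with 2-D inputs and 1-D outputs.
X = np.array([[0.0, 0.0], [1.0, 0.0], [5.0, 5.0]])
Y = np.array([[1.0], [3.0], [10.0]])
y_one = knn_query(X, Y, np.array([0.1, 0.0]), k=1)
y_two = knn_query(X, Y, np.array([0.1, 0.0]), k=2)
```

With a single neighbor, the query simply returns a stored sample as it is; with more neighbors, the outputs are blended by plain averaging.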


In the case of GPR [25], the inference is computed by


where can be computed off-line, so that only a vector sum and a matrix multiplication, both fast to compute, are left for the on-line estimation; is the identity matrix (see footnote 2). In (12), is the covariance matrix, which is built from the kernel function. A popular choice, also used in this work, is the radial basis function . The hyperparameters , and are computed by maximizing the marginal log-likelihood. Finally, is the mean function acting as an offset in the estimation process. We consider the constant vector that suggests zero velocity as warm-start and the final target as way point when is in an area not sampled by the memory. GPR is known to be effective with small datasets and is fast to compute. These characteristics fit our task very well, since the memory is built with trajectories lying on the image plane (which is a limited area) and has to be queried fast to be compatible with the on-line control requirements.
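The split between off-line and on-line computation in the GPR mean prediction can be sketched in plain NumPy. This is a simplified illustration, not the GPy-based implementation used in the paper: the kernel hyperparameters and the noise variance are fixed by hand instead of being fitted on the marginal log-likelihood, and the class and parameter names are hypothetical.

```python
import numpy as np

def rbf_kernel(A, B, variance=1.0, lengthscale=1.0):
    """Radial basis function kernel between the rows of A and the rows of B."""
    d2 = np.sum(A**2, axis=1)[:, None] + np.sum(B**2, axis=1)[None, :] - 2.0 * A @ B.T
    return variance * np.exp(-0.5 * np.maximum(d2, 0.0) / lengthscale**2)

class GPRMemory:
    def __init__(self, X, Y, mean, noise=1e-4, variance=1.0, lengthscale=1.0):
        self.X, self.mean = X, mean
        self.kernel_args = dict(variance=variance, lengthscale=lengthscale)
        K = rbf_kernel(X, X, **self.kernel_args) + noise * np.eye(len(X))
        # Off-line part: solve (K + noise * I) alpha = Y - mean once.
        self.alpha = np.linalg.solve(K, Y - mean)

    def query(self, x_new):
        # On-line part: one kernel row, one matrix product, one vector sum.
        k_star = rbf_kernel(np.atleast_2d(x_new), self.X, **self.kernel_args)
        return self.mean + (k_star @ self.alpha)[0]

# Toy memory with 1-D inputs and 2-D outputs; the constant mean is the fallback.
X = np.array([[0.0], [1.0], [2.0]])
Y = np.array([[0.0, 1.0], [1.0, 0.0], [2.0, -1.0]])
mean = np.array([0.5, 0.5])
gpr = GPRMemory(X, Y, mean)
y_near = gpr.query(np.array([1.0]))    # close to the stored output of that sample
y_far = gpr.query(np.array([100.0]))   # falls back to the constant mean
```

The query far from all samples illustrates the role of the mean function: where the memory has no data, the kernel vanishes and the estimate reverts to the constant vector.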

Finally, recalling that the control is constant in the preview window, the warm-start is built from the first entries of :


where ’⊗’ is the Kronecker product. The way point, instead, is obtained from the remaining elements of :


Note that in the absence of constraints, the solution found at the previous iteration is already a good warm-start for the solver and there is no need to reshape the cost with a way point. Thus, the memory-based strategy is activated only when the visual features are “close” to one of the visibility constraints, i.e., when the distance between any feature and the border of the constraints is lower than a given threshold.
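The two steps above, unpacking the estimate into warm-start and way point and activating the memory only near a constraint, can be sketched as follows. The sketch assumes that the estimate stores the command first and the way point after it, that the warm-start repeats the command over the preview window through a Kronecker product with a vector of ones, and that the visibility constraint is an axis-aligned rectangle in pixel coordinates; the actual constraint shapes and the names used here are assumptions.

```python
import numpy as np

def split_estimate(y_hat, n_u, n_p):
    """Split the regression output into a warm-start sequence and a way point."""
    u_hat, way_point = y_hat[:n_u], y_hat[n_u:]
    warm_start = np.kron(np.ones(n_p), u_hat)  # same command at every preview step
    return warm_start, way_point

def is_close(s, box, threshold):
    """True if any point feature is within `threshold` of the rectangular area.

    box = (xmin, ymin, xmax, ymax) is a hypothetical occlusion area in pixels.
    """
    points = s.reshape(-1, 2)
    xmin, ymin, xmax, ymax = box
    dx = np.maximum(np.maximum(xmin - points[:, 0], points[:, 0] - xmax), 0.0)
    dy = np.maximum(np.maximum(ymin - points[:, 1], points[:, 1] - ymax), 0.0)
    return bool(np.any(np.hypot(dx, dy) < threshold))

warm_start, way_point = split_estimate(np.arange(14.0), n_u=6, n_p=3)
near = is_close(np.array([90.0, 150.0, 10.0, 10.0]), (100, 100, 200, 200), threshold=20.0)
far = is_close(np.array([10.0, 10.0, 300.0, 300.0]), (100, 100, 200, 200), threshold=20.0)
```

When `is_close` returns False, the previous solution and the final target would be passed to the solver unchanged.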

V Results

In this section, we present the results obtained with the proposed framework to efficiently achieve VPC tasks.

As visual features , we considered four points (). The visual task consisted in making them match four corresponding desired points . The image Jacobian in (4) was approximated using the depth of the points at the target, known in advance. The approach was implemented in Python. As optimization solver, we used the SLSQP method available in the open-source library SciPy [14]. Actuation and visibility constraints were implemented as bounds and non-linear inequality constraints, respectively. To be implemented, the OR logic operation in (7) was converted into an AND with a -norm formulation [12]. We chose for our k-NN, which is thus mainly used to select samples as they are in the memory; we used the GPy library [10] as GPR implementation. As explained in Sect. II, VPC was set up with and  s (since  Hz is the camera nominal framerate).
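One way to read the norm-based conversion mentioned above is the following: staying outside a rectangle is a disjunction (left of it OR right of it OR above OR below), while a solver such as SLSQP imposes all its inequality constraints jointly. A weighted p-norm collapses the disjunction into a single smooth inequality. The sketch below illustrates this idea under assumptions; the exact formulation and the norm order used in the paper are not recoverable from the text, and the function name, the value of `p` and the rectangle are hypothetical.

```python
import numpy as np

def outside_box_margin(point, center, half_size, p=8):
    """Smooth margin: positive outside a superellipse inscribed in the rectangle.

    The weighted p-norm of the centered point equals 1 on the superellipse,
    which approaches the rectangle boundary as p grows.
    """
    z = np.abs((np.asarray(point, dtype=float) - center) / half_size)
    return np.sum(z**p) ** (1.0 / p) - 1.0

center = np.array([320.0, 240.0])      # hypothetical occlusion center (pixels)
half_size = np.array([50.0, 30.0])     # hypothetical half width and half height

# SciPy convention: an "ineq" constraint is satisfied when fun(x) >= 0.
# Here the first two optimization variables are taken to be a point in the image.
constraint = {
    "type": "ineq",
    "fun": lambda q: outside_box_margin(q[:2], center, half_size),
}

inside = outside_box_margin(center, center, half_size)
outside = outside_box_margin(center + np.array([100.0, 0.0]), center, half_size)
```

Because the superellipse lies inside the rectangle, the approximation slightly under-covers the corners; enlarging the rectangle by a small margin is one way to stay conservative.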

V-A Simulations

For the simulations, we considered a hand-held camera free to move in the Cartesian space (), with an image resolution of  pixels. As visibility constraints, we considered four convex and concave areas on the image (- plane) simulating occlusions and spots on the lens. As actuation constraints, we limited the linear and angular velocity components of the camera to  m/s and  rad/s. We set (decreasing to towards the convergence) and with .

Fig. 3: Simulations: visual features trajectories stored in the memory. For visualization purposes, the samples are plotted with different shades; each color corresponds to the motion of one single feature.

The memory of motion was generated following the procedure of Sect. IV-A. In particular, the solver was set up with an optimality precision of and maximum iterations. VPC was set with . The choice of these parameters was driven by the need to store ‘high-quality’ samples, at the cost of a high computational time that we were willing to pay since the memory is built off-line. We generated trajectories, for a total of samples. Fig. 3 shows the visual feature trajectories stored in the memory. The visibility constraints are depicted as shaded areas, while the targets are the red circles. For the on-line executions, we relaxed the solver parameters with as optimality precision and maximum iterations. This set-up, along with a smaller , allowed faster computations. Thanks to the memory-based strategies presented in Sect. IV, the performance is not degraded but actually improved.

Strategy                                     Success rate (%)   Avg. solver time (s)   Avg. normalized cost
Previous-iteration (short preview window)    80                 0.085                  49.3
Previous-iteration (long preview window)     83                 0.550                  52.9
k-NN-based                                   92                 0.074                  19.6
GPR-based                                    93                 0.080                  16.4
TABLE I: Simulations: statistics comparing different VPC strategies.
Fig. 4: Simulations: comparison between prev.-it., k-NN and GPR-based strategies in terms of features motion (top) and velocity command (bottom). Panels: (a) prev.-it., (b) k-NN-based and (c) GPR-based visual feature paths; (d) prev.-it., (e) k-NN-based and (f) GPR-based velocity commands.

The approach was first evaluated with a statistical analysis, comparing: VPC warm-started with the previous-iteration solution (for brevity denoted “prev.-it.”) (i) using and (ii) using ; using warm-start and way point provided (iii) by k-NN and (iv) by GPR, both with . The memory-based strategies were activated at  pixels from the occlusions and we set . For GPR, data were sub-sampled by a factor . The comparison is performed w.r.t. the success rate , the average of the solver convergence time and the average of the cost divided by for all (successful and unsuccessful) trajectories. Each execution is considered successful if no constraint is violated (with a tolerance of  pixels) and the visual task is achieved ( converges to in the given time of  s). Each strategy was tested using the same random initial configurations. The results, run on a laptop with an i7- GHz 4-cores and  GiB RAM, are reported in Table I. The prev.-it. strategy with the shorter preview window achieved a success rate of 80% (note that among the test samples, many had an easy task execution). In order to improve , for the considered scenario, we had to increase the preview window to , but this also increased the computation time. The proposed memory-based strategies allowed us to keep the preview window short, so that both and have low values, and to increase at the same time. This is due to the warm-start and the way point, which help the execution of the task.

The main reason for the failures of the prev.-it. strategy is that the solution gets stuck at the visual occlusions. The memory-based strategies reduce the occurrence of these situations. As an example, Fig. 4 presents the plots related to a single task execution, where the big blue dot is the initial value of the features, the smaller blue dots are the VPC solutions at each iteration, and the red circles are the targets. The prev.-it. strategy stops at an occlusion border (see Fig. 4(a)) as an effect of conflicting gradients that produce zero velocity commands (Fig. 4(d)). Instead, the memory-based approaches manage to overcome the occlusion, as shown in Figs. 4(b) and 4(c). In particular, the GPR solution, thanks to its interpolation capabilities, produces a smoother behavior than our k-NN implementation (cf. Figs. 4(b) and 4(e) with Figs. 4(c) and 4(f)).

V-B Robot experiments

For the experiments, we used the 7 degrees-of-freedom robot arm Panda by Franka Emika, with an Intel RealSense RGB-D sensor mounted at the end-effector. The sensor, used as a monocular camera, outputs images with a resolution of  pixels at a nominal framerate of  Hz. The image processing, used to detect the point features, was implemented using the open-source library OpenCV [20]. A calibration procedure computed the intrinsic camera parameters and the camera–end-effector displacement. The camera velocity commands, computed by VPC, were transformed into the robot frame and sent to the robot Cartesian controller. As a task, the robot had to place an object inside a box where we placed four known markers. Without knowing the box pose, VPC was used to drive the robot over the box and, after convergence, release the object. On the image, we considered two constraints: one to take into account the occlusion caused by the object grasped by the robot, and one to emulate a spot in the center of the lens as a blurred area. VPC was set with , (decreasing it to approaching the convergence), while the command bounds were set to  m/s and  rad/s.
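The text does not detail how the camera velocity is transformed into the robot frame. A common choice, shown here as an assumption, is the standard velocity twist transformation built from the calibrated rotation and translation between the camera frame and the frame in which the command is expressed. Function names and the example displacement are hypothetical.

```python
import numpy as np

def skew(t):
    """Skew-symmetric matrix such that skew(t) @ w equals the cross product t x w."""
    return np.array([
        [0.0, -t[2], t[1]],
        [t[2], 0.0, -t[0]],
        [-t[1], t[0], 0.0],
    ])

def twist_transform(R, t):
    """6x6 matrix mapping a camera-frame twist (v, w) to the frame where (R, t) is the camera pose."""
    V = np.zeros((6, 6))
    V[:3, :3] = R
    V[:3, 3:] = skew(t) @ R   # lever-arm effect of the angular velocity
    V[3:, 3:] = R
    return V

# Hypothetical calibration: camera 1 m along x of the end-effector, same orientation.
R_ec, t_ec = np.eye(3), np.array([1.0, 0.0, 0.0])
v_camera = np.array([0.0, 0.0, 0.0, 0.0, 0.0, 1.0])    # pure rotation about the optical axis
v_end_effector = twist_transform(R_ec, t_ec) @ v_camera
```

A pure rotation of the camera thus produces a linear velocity component at the end-effector origin, which is why the displacement from the calibration procedure is needed.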

The memory ( trajectories for a total of samples) was built with , solver optimality tolerance of and maximum iterations. The iterative building and the adaptive sampling were not used. To be conservative, the spot considered in the memory was bigger than the one in the experiments. Given the simulation results, we decided to use the GPR-based strategy. Data were subsampled by a factor , with . The trigger signal to query the memory was activated at  pixels from the occlusions.

For the on-line experiments, we set , the solver was given maximum iterations and as optimality tolerance. With this setting, and for some initial robot-box configurations, the previous-iteration strategy was unable to achieve the task, as shown in the snapshots of Fig. 5. While moving the visual features (blue dots, see Fig. 5(a)) towards the target (red circle), the features met the blurred spot (Fig. 5(b)), causing the loss of a feature and the consequent failure of the task (Fig. 5(c)). The same experiment was carried out with the GPR-based approach, see Fig. 6, where both the robot and the camera view are shown. Starting from the same initial condition (Fig. 6(a)), in the proximity of the constraint (Fig. 6(b)), the memory provides a proper way point (depicted as red crosses on the image plane) and warm-start, which allow the desired task to be achieved successfully (Fig. 6(c)). Fig. 7 shows the velocity commands sent to the robot during the execution. The experiments are shown in the accompanying video.

Fig. 5: Robot experiment: the prev.-it. strategy fails to avoid the occlusion.

VI Conclusion and Future Work

In this paper, we addressed the problem of efficiently achieving visual predictive control tasks. Using a memory of motion, we could exploit previous solutions to better fulfill on-line tasks. Furthermore, leveraging the pre-computation contained in the memory, we could set a short VPC preview window without degrading the results. The performance of the algorithm relies on the pre-computed dataset; we plan to improve the quality of the memory using a global optimizer or a planner. Furthermore, a more sophisticated active learning paradigm could be employed to build a minimal memory, containing fewer but more informative samples. In the presented work, the memory is queried using k-NN and GPR. As shown with both simulations and experiments, these methods were able to outperform the standard VPC scheme. However, we believe that the performance could be further improved by considering other kinds of regressors that can cope with multimodality, as done in [26]. The presented results show that the use of a memory of motion also helps to keep the computation time limited. However, further effort will be made to ensure full real-time performance. Finally, further developments will be devoted to including the proposed scheme within the optimization framework of more complex systems such as humanoids.

Fig. 6: Robot experiment using the memory of motion: VPC is able to avoid the occlusion and achieve the desired task.
Fig. 7: Robot experiment using the memory of motion: velocity commands.


  1. For example, if point features are used, the visual pattern is the polygon having the visual features as vertices.
  2. Hereafter, , and refer to the identity, all-ones and null matrix. When not explicitly marked, the dimensions are inferred from the context.


  1. D. J. Agravante, G. Claudio, F. Spindler and F. Chaumette (2017) Visual servoing in an optimization framework for the whole-body control of humanoid robots. IEEE Robot. and Autom. Lett. 2 (2), pp. 608–615.
  2. G. Allibert, E. Courtial and F. Chaumette (2010) Predictive control for constrained image-based visual servoing. IEEE Trans. Robot. 26 (5), pp. 933–939.
  3. G. Allibert, E. Courtial and F. Chaumette (2010) Visual servoing via nonlinear predictive control. In Visual Servoing via Advanced Numerical Methods, G. Chesi and K. Hashimoto (Eds.), pp. 375–393.
  4. G. Allibert, E. Courtial and Y. Touré (2008) Real-time visual predictive controller for image-based trajectory tracking of a mobile robot. IFAC Proceedings Volumes 41 (2), pp. 11244–11249.
  5. D. Berenson, P. Abbeel and K. Goldberg (2012) A robot path planning framework that learns from experience. In IEEE Int. Conf. on Robotics and Automation, pp. 3671–3678.
  6. F. Chaumette and S. Hutchinson (2006) Visual servo control, Part I: basic approaches. IEEE Robot. Autom. Mag. 13 (4), pp. 82–90.
  7. F. Chaumette and S. Hutchinson (2007) Visual servo control, Part II: advanced approaches. IEEE Robot. Autom. Mag. 14 (1), pp. 109–118.
  8. F. Chaumette (1998) Potential problems of stability and convergence in image-based and position-based visual servoing. In The Confluence of Vision and Control, pp. 66–78.
  9. D. Forte, A. Gams, J. Morimoto and A. Ude (2012) On-line motion synthesis and adaptation using a trajectory database. Robotics and Autonomous Systems 60 (10), pp. 1327–1339.
  10. GPy: a Gaussian process framework in Python.
  11. L. Huber, A. Billard and J. Slotine (2019) Avoidance of convex and concave obstacles with convergence ensured through contraction. IEEE Robot. and Autom. Lett. 4 (2), pp. 1462–1469.
  12. N. P. Hyun, P. A. Vela and E. I. Verriest (2017) A new framework for optimal path planning of rectangular robots using a weighted norm. IEEE Robot. and Autom. Lett. 2 (3), pp. 1460–1465.
  13. N. Jetchev and M. Toussaint (2009) Trajectory prediction: learning to map situations to robot trajectories. In Int. Conf. on Machine Learning, pp. 449–456.
  14. E. Jones, T. Oliphant and P. Peterson. SciPy: open source scientific tools for Python.
  15. C. Liu and C. G. Atkeson (2009) Standing balance control using a trajectory library. In IEEE/RSJ Int. Conf. on Intelligent Robots and Systems, pp. 3031–3036.
  16. N. Mansard, A. Del Prete, M. Geisert, S. Tonneau and O. Stasse (2018) Using a memory of motion to efficiently warm-start a nonlinear predictive controller. In IEEE Int. Conf. on Robotics and Automation, pp. 2986–2993.
  17. W. Merkt, V. Ivan and S. Vijayakumar (2018) Leveraging precomputation with problem encoding for warm-starting trajectory optimization in complex environments. In IEEE/RSJ Int. Conf. on Intelligent Robots and Systems, pp. 5877–5884.
  18. Y. Mezouar and F. Chaumette (2002) Path planning for robust image-based control. IEEE Trans. Robot. Autom. 18 (4), pp. 534–549.
  19. I. Mordatch, E. Todorov and Z. Popović (2012) Discovery of complex behaviors through contact-invariant optimization. ACM Trans. Graph. 31 (4), pp. 1–8.
  20. Open source computer vision library.
  21. A. Paolillo, K. Chappellet, A. Bolotnikova and A. Kheddar (2018) Interlinked visual tracking and robotic manipulation of articulated objects. IEEE Robot. and Autom. Lett. 3 (4), pp. 2746–2753.
  22. X. B. Peng, P. Abbeel, S. Levine and M. van de Panne (2018) DeepMimic: example-guided deep reinforcement learning of physics-based character skills. ACM Trans. Graph. 37 (4), pp. 1–14.
  23. B. Penin, P. R. Giordano and F. Chaumette (2018) Vision-based reactive planning for aggressive target tracking while avoiding collisions and occlusions. IEEE Robot. and Autom. Lett. 3 (4), pp. 3725–3732.
  24. I. Pólik and T. Terlaky (2010) Interior point methods for nonlinear optimization. In Nonlinear Optimization, pp. 215–276.
  25. C. E. Rasmussen and C. K. I. Williams (2006) Gaussian processes for machine learning. MIT Press, Cambridge, MA, USA.
  26. T. Santoso Lembono, A. Paolillo, E. Pignat and S. Calinon (2020) Memory of motion for warm-starting trajectory optimization. IEEE Robot. and Autom. Lett. 5 (2), pp. 2594–2601.
  27. M. Sauvée, P. Poignet, E. Dombre and E. Courtial (2006) Image based visual servoing through nonlinear model predictive control. In IEEE Conf. on Decision and Control, pp. 1776–1781.
  28. M. Saveriano and D. Lee (2014) Distance based dynamical system modulation for reactive avoidance of moving obstacles. In IEEE Int. Conf. on Robotics and Automation, pp. 5618–5623.
  29. T. Shen, S. Radmard, A. Chan, E. A. Croft and G. Chesi (2018) Optimized vision-based robot motion planning from multiple demonstrations. Autonomous Robots 42 (6), pp. 1117–1132.
  30. M. Stolle and C. G. Atkeson (2006) Policies based on trajectory libraries. In IEEE Int. Conf. on Robotics and Automation, pp. 3344–3349.
  31. E. Todorov (2011) A convex, smooth and invertible contact model for trajectory optimization. In IEEE Int. Conf. on Robotics and Automation, pp. 1071–1076.