Existence of local minima of a minimal 2D pose-graph SLAM problem

Felix H. Kong, Jiaheng Zhao, Liang Zhao, and Shoudong Huang

The authors are with the Centre for Autonomous Systems, Faculty of Engineering and Information Technology, University of Technology Sydney (UTS), Sydney, Australia. Email: {felix.kong, jiaheng.zhao, liang.zhao, shoudong.huang}@uts.edu.au

Abstract

In this paper, we show that for a minimal pose-graph problem, even in the ideal case of perfect measurements and spherical covariance, using the so-called “wrap function” when comparing angles results in multiple suboptimal local minima. We numerically estimate regions of attraction to these local minima for some numerical examples, and give evidence to show that they are of nonzero measure. In contrast, under the same assumptions, we show that the chordal distance representation of angle error has a unique minimum up to periodicity. For chordal cost, we also search for initial conditions that fail to converge to the global minimum, and find that this occurs with far fewer points than with geodesic cost.

Keywords — Pose-graph SLAM, convergence analysis


I Introduction

Simultaneous Localization and Mapping (SLAM) is concerned with simultaneously estimating the pose of a robot (localization) and building a map of its surroundings (mapping). This capability has been useful in many areas, such as unmanned aerial vehicles [1, 2], autonomous ground vehicles [3], [4], and a plethora of other applications [5].

Currently, the “modern” approach to SLAM is to represent the robot’s trajectory as a graph: that is, to represent its poses as nodes, and measurements from those poses as edges. Then, given this graph, typically a weighted least-squares optimization problem is solved to estimate the most likely robot poses given the robot’s measurements [5]. However, even in 2D SLAM, the optimization problem is usually nonlinear and nonconvex, resulting in the possibility for iterative solvers to converge to a local instead of global minimum. Although modern solvers appear to achieve a global minimum much of the time, it is as of yet unclear under what conditions local minima exist, and how many there are, even for very small problems.

It is known that the cost on error in orientations of poses is a major contributor to the nonlinearity of the problem [6, 7], so choosing a particular representation for orientation error can affect the existence, number, and nature of minima for the pose-graph optimization problem. In 2D SLAM, one common method to evaluate orientation error is to directly subtract the two (scalar) angles, then “wrap” this difference onto a fixed interval of length 2π, resulting in the “geodesic distance” between two orientations. Open-source SLAM implementations such as Google’s Cartographer [8], as well as other popular software such as MATLAB’s Robotics Toolbox, use geodesic distance via the “wrap” function in their cost functions. Use of geodesic distance is intuitive and widely known; however, it has been linked empirically to convergence to local minima [9].
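The subtract-then-wrap computation described above can be sketched in a few lines of Python. This is a minimal illustration rather than code taken from Cartographer or MATLAB: the names `wrap` and `geodesic_distance` are hypothetical, and the half-open interval [-pi, pi) is an assumed convention (implementations differ on which endpoint is included).

```python
import math

def wrap(angle):
    """Return the angle equivalent to `angle` on [-pi, pi) (assumed convention)."""
    return (angle + math.pi) % (2.0 * math.pi) - math.pi

def geodesic_distance(theta_a, theta_b):
    """Geodesic distance between two planar orientations, in radians."""
    return abs(wrap(theta_a - theta_b))

# Orientations of 3.0 rad and -3.0 rad differ by 6.0 rad before wrapping,
# but are only 2*pi - 6.0 rad apart on the circle.
d = geodesic_distance(3.0, -3.0)
```

The wrapped difference jumps where the raw difference crosses an odd multiple of pi, so the squared wrapped error has a kink there. This is the source of the nonsmoothness discussed in the rest of the paper.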

Another method to evaluate orientation error is to use the “chordal distance” (e.g. [10, 11]), which is calculated using the Frobenius norm of the difference in rotation matrices. The use of the chordal distance in the cost function of a pose-graph optimization problem has been investigated previously. By using chordal cost and reformulating the problem as an equality-constrained optimization problem, global optimality can be certified for a pose-graph problem by solving a semidefinite program [11, 12]. Built upon this work, methods to speed up the computation of the global optimality certificate are proposed in [13]. Solvers have also been developed that yield certifiably globally optimal solutions [14, 15, 16].
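For comparison, the chordal distance between two planar orientations can be computed directly from its definition. The sketch below is illustrative only; the function names are hypothetical, and the closed form it checks against follows from the identity ||R(a) - R(b)||_F^2 = 4(1 - cos(a - b)) for 2D rotation matrices.

```python
import math

def rotation(theta):
    """2x2 rotation matrix for a planar angle, as nested lists."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]

def chordal_distance(theta_a, theta_b):
    """Frobenius norm of R(theta_a) - R(theta_b)."""
    ra, rb = rotation(theta_a), rotation(theta_b)
    return math.sqrt(sum((ra[i][j] - rb[i][j]) ** 2
                         for i in range(2) for j in range(2)))

def chordal_closed_form(theta_a, theta_b):
    """Equivalent expression 2*sqrt(2)*|sin((theta_a - theta_b) / 2)|."""
    return 2.0 * math.sqrt(2.0) * abs(math.sin(0.5 * (theta_a - theta_b)))

d_matrix = chordal_distance(0.4, 2.9)
d_closed = chordal_closed_form(0.4, 2.9)
```

No wrapping is needed here: the rotation matrices are already periodic in the angle, and the squared chordal distance is a smooth function of the angle difference.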

(a) Regions , marked by red lines.
(b) Poses at local minimum.
Fig. 3: (a) This heatmap motivates why there may be local minima when using geodesic distance. This is a plot of one period of , the angular part of the cost function using geodesic distance. Clearly, local minima exist in and , which suggests that there may be local minima in the full cost function. See Section III for details. (b) Poses corresponding to a local minimum of a minimal 2D pose-graph problem using geodesic cost. Note the large orientation error of pose 2. See Section V for more detail about this example problem.

While practical SLAM problems are much larger, analyzing a small problem allows clear conclusions to be made, which can inform insights into larger problems. Several papers have investigated “minimal” SLAM problems in an attempt to show the fundamental structure and limitations of different formulations of SLAM. In the formulation in [7], the authors concluded that for noise in some bounded interval, there is a unique global minimum, and no local minima. However, the authors assumed that the angle differences are always within . This allowed the angular terms to be treated as linear, which facilitates analysis, but cannot be assumed in general. In another paper [9], the authors compare the use of geodesic and chordal distance in a feature-based SLAM problem with two robot poses and a single landmark, which is considered to have no orientation. For that problem, using chordal cost, the authors concluded that a unique global minimum exists, regardless of noise.

This paper compares the influence of using geodesic or chordal distance on the convergence properties of a minimal pose-graph problem. In this paper, we study a planar pose-graph problem with three poses and three measurements. Our contributions are:

  • We prove that even in the case of perfect measurements with spherical covariance, if geodesic distance is used, multiple non-global local minima exist (see Figure 3(b) for an example of a local minimum). This is due to local minima introduced by geodesic distance (see Figure 3(a)). This clarifies the work in [7]; in particular, it answers the question of what happens when angle differences outside of are considered.

  • We numerically estimate the regions of attraction to the local minima for some examples with varying noise magnitude, and show that they are of nonzero measure. Conservative regions of attraction to the global minimum have been investigated for higher-dimensional problems using Gauss-Newton [17]; in this paper, due to the small size of the problem, we can explicitly compute the size of the region of attraction to the global (and local) minima, with little conservatism.

  • We build upon [9], asking the question: “Does a unique global minimum for chordal cost with any noise magnitude exist for the 3-pose case too?” By adding orientation information to the landmark, that problem becomes the 3-pose problem considered in this paper. We prove that for the noise-free case, a unique global minimum exists, but provide a counterexample to show that uniqueness of the global minimum does not hold for arbitrary noise in the 3-pose case.

  • Finally, we search for points that failed to converge to the global minimum in the case of chordal cost; across all three example problems, only four singleton points are found. This is a significant reduction in area compared to the regions of attraction to local minima in the geodesic cost case.

The paper is structured as follows: we first define the minimal SLAM problem using geodesic and chordal cost in Section II, and rewrite them in more convenient formulations. Then, we analyze the number and nature of minima for geodesic and chordal cost in Sections III and IV, respectively. In Section V, we analyze a few examples and compute their regions of attraction to local minima when using geodesic cost. Finally, we conclude with Section VI.

II Two formulations of a 3-pose planar pose-graph SLAM problem

II-A Notation and conventions

In this paper, we use the semicolon to mean vertical vector concatenation. For an angle , let be its corresponding rotation matrix.

II-B The 3-pose problem

In this paper we consider a 2D pose-graph problem with three poses and three measurements. Let each of the poses have a position , and an orientation for ; for short we write . Let the vector and the vector . In pose-graph SLAM, one pose needs to be fixed for the problem to be observable; we will treat and as fixed, so we exclude them from and .

Suppose at each pose the robot has taken some measurements from pose to pose : a relative position , and a relative rotation ; as before, we use the shorthand . We assume the most ideal case, that each measurement in and has variance ; hence let , where is the (square) identity matrix whose size is determined by context. Hence is a spherical covariance matrix [18]. For simplicity, we have assumed that and have the same variance. However, the analysis in this paper holds if the position and orientations have different variances.

In this paper’s formulation of the 3-pose problem, we assume there are three measurements, resulting in three relative positions and three relative rotations . Throughout the paper, we assume that none of the measurements are zero, and that none of the poses are equal to another.

With these definitions, a pose-graph optimization problem can be set up to find robot poses that best satisfy these measurements, according to some cost function. One formulation of the cost function is what we will call the “geodesic cost” :

(1)

where is the Mahalanobis distance with respect to covariance , is the set , and returns the angle equivalent to on the interval .

The other cost function we consider in this paper will be called “chordal cost” :

(2)

where is the Frobenius norm of a matrix. The factor of is introduced so and have the same linearization at the origin, cf. [12, Remark 1]. The two cost functions and are different ways of quantifying the same qualitative idea: they evaluate how well given poses “match” the measurements. Then, by minimizing either

(3)
(4)

can be found that explain the measurements well.
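The effect of the scaling factor on the chordal term can be checked numerically. The sketch below assumes that the angular term of the chordal cost for one measurement has the form 0.5 * ||R(theta_i) R(theta_ij) - R(theta_j)||_F^2 with unit variance, and that wrap() maps onto [-pi, pi); both are assumptions of this illustration, since the exact expressions of (1) and (2) are not reproduced here. All function names are hypothetical.

```python
import math

def wrap(angle):
    """Angle equivalent to `angle` on [-pi, pi) (assumed convention)."""
    return (angle + math.pi) % (2.0 * math.pi) - math.pi

def rotation(theta):
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]

def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def geodesic_term(theta_i, theta_j, theta_ij):
    """Squared wrapped angular residual of one measurement."""
    return wrap(theta_j - theta_i - theta_ij) ** 2

def chordal_term(theta_i, theta_j, theta_ij):
    """Half the squared Frobenius norm of R(theta_i) R(theta_ij) - R(theta_j)."""
    predicted = matmul(rotation(theta_i), rotation(theta_ij))
    actual = rotation(theta_j)
    sq = sum((predicted[r][c] - actual[r][c]) ** 2
             for r in range(2) for c in range(2))
    return 0.5 * sq

# Residual of 1e-3 rad: the two terms should be nearly identical.
small = (geodesic_term(0.2, 0.7 + 1e-3, 0.5), chordal_term(0.2, 0.7 + 1e-3, 0.5))
# Residual of 3.0 rad: the two terms differ substantially.
large = (geodesic_term(0.2, 0.7 + 3.0, 0.5), chordal_term(0.2, 0.7 + 3.0, 0.5))
```

Under these assumptions the chordal term equals 2(1 - cos(r)) for an angular residual r, which agrees with r^2 to second order near zero but is bounded by 4, whereas the geodesic term can approach pi^2.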

Notice that and share the same , and differ only in and . For our particular problem, simplifies to:

(5)

since and does not depend on .

II-C Dimensionality reduction via Schur complement

The decision space for the optimization problems minimizing and is . In this subsection we reduce it to two dimensions by noticing that the problem of minimizing can be solved in closed-form given any . The following lemma will aid us in this [18]:

Lemma 1

The linear least-squares problem of minimizing has a unique solution if is full column rank, and :

(6)
(7)

where .
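As a concrete illustration of the kind of closed-form solution Lemma 1 provides, the sketch below solves a small unweighted least-squares problem through the normal equations. It assumes the standard result x* = (A^T A)^{-1} A^T b with minimum cost b^T b - b^T A x*; the lemma's exact weighted statement is not reproduced here, and the matrix A, vector b, and function name are hypothetical.

```python
def solve_least_squares(A, b):
    """Minimize ||A x - b||^2 for a full-column-rank A with two columns."""
    # Normal equations: (A^T A) x = A^T b.
    m00 = sum(row[0] * row[0] for row in A)
    m01 = sum(row[0] * row[1] for row in A)
    m11 = sum(row[1] * row[1] for row in A)
    v0 = sum(row[0] * bi for row, bi in zip(A, b))
    v1 = sum(row[1] * bi for row, bi in zip(A, b))
    det = m00 * m11 - m01 * m01  # nonzero exactly when A has full column rank
    x = ((m11 * v0 - m01 * v1) / det, (m00 * v1 - m01 * v0) / det)
    # Minimum cost evaluated in closed form: b^T b - b^T A x.
    cost = sum(bi * bi for bi in b) - (v0 * x[0] + v1 * x[1])
    return x, cost

A = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
b = [1.0, 2.0, 4.0]
x_opt, min_cost = solve_least_squares(A, b)

# Cross-check: evaluate the residual cost directly at the computed optimum.
residual_cost = sum((row[0] * x_opt[0] + row[1] * x_opt[1] - bi) ** 2
                    for row, bi in zip(A, b))
```

The closed-form cost and the directly evaluated residual cost should coincide, which is what allows the position variables to be eliminated and the cost to be expressed in the orientations alone.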

Now, we rewrite for use with Lemma 1:

(8)
(9)

where , is the matrix that is , and

(10)

The matrix is . Hence by Lemma 1,

(11)
(12)
(13)

evaluates to [18, Theorem 1]:

(14)

where the constants , , and are determined by the measurement and covariance data only. Notice that , since the measurements are assumed to be nonzero.

Then, if we let

(15)
(16)

then instead of solving (3) and (4), we can instead solve the two-dimensional problems:

(17)
(18)

We will use and interchangeably, and similarly with and . The following lemma tells us what minima in and imply about minima in and , which will be used in the proofs of Theorems 1 and 2.

Lemma 2

Consider the problem of minimizing a function of two variables with . Suppose also that there exists a function such that, for fixed ,

(19)

Then, if

(20)

then

(21)

If additionally is known to have a unique minimum, then is its unique minimum.

Proof.

For any , by (20) and (19),

(22)

and hence (21). Uniqueness of the minimum on yields uniqueness of .
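The idea behind Lemma 2 can be illustrated with a toy function that is quadratic in one variable and nonconvex in the other. Everything in the sketch below (the function f, the closed-form minimizer x_star, and the search grid) is hypothetical and chosen only for illustration; it compares a brute-force search over both variables with a search over the reduced one-variable cost.

```python
def f(x, y):
    # Quadratic in x for fixed y, nonconvex in y.
    return (x - y) ** 2 + (y * y - 1.0) ** 2 + 0.5 * y

def x_star(y):
    # Closed-form minimizer of f(., y) for fixed y.
    return y

def g(y):
    # Reduced cost after eliminating x.
    return f(x_star(y), y)

grid = [-2.0 + 0.01 * k for k in range(401)]

# Minimize the reduced 1D cost, then the full 2D cost, on the same grid.
y_best = min(grid, key=g)
xy_best = min(((x, y) for x in grid for y in grid), key=lambda p: f(*p))
```

On this grid the two searches agree: the minimizer of the full problem is the pair formed by the minimizer of the reduced cost and its associated closed-form value, which is the structure the lemma exploits.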

III Analyzing local minima of “geodesic cost”

In this section, we consider the pose-graph optimization problem (17). We further reduce it to a set of one-dimensional optimization problems, and use these 1D optimization problems to analyze the local and global minima of (17) and the original problem (3).

III-A Representing (17) as three 1D optimization problems

Let be the square with and . Because wrap() is -periodic in and , it suffices to consider only when analyzing . Figure 7(a) shows a surface plot of .

(a) Surface plot of on for .
(b) Surface plot of on .
(c) 1D problem costs .
Fig. 7: Plots from the noise-free 3-pose example problem in Section V using geodesic cost. (a) Notice that on each square , there are three minima, one on each region . (b) Suboptimal local minima are marked by pink x’s; we show their existence in Theorem 1. The global minimum in is marked by a black x. (c) 1D optimal costs plotted on their domain of definition in . Notice minima exist for with cost approximately equal to 20, which are marked with black ‘x’s.

On the region , the function is:

(23)

To keep notation compact, we use the shorthand

(24)

Then, on , can be replaced by:

(25)

Hence we can rewrite this as for on some appropriate regions of . This results in a natural subdivision of into three regions (see Figure 3(a)): for , , which corresponds to the (open) lower right triangle; for , , which corresponds to the (open) upper left triangle; and for , the middle region (also open). Notice that we have included in any points on the non-differentiable boundary where .

Then, for each , for , can be rewritten:

(26)

For each , it can be seen that (26) is a least-squares cost function in . That is, for any given and , we can find the optimal that minimizes by again using Lemma 1. This reduces the 2D optimization problem of minimizing to minimizing a 1D problem. We rewrite as:

(27)

where the column vectors , and . The matrix is equal to . Hence the angular component of the cost function can be written:

(28)

where , and

(29)

Hence can also be re-written as a set of one-dimensional cost functions:

(30)

For each , this is obviously smooth, and has derivatives:

(31)
(32)

Hence we have reduced the dimension of the optimization problem from 2D in to a set of three 1D optimization problems in . Figure 7(c) shows the one-dimensional for each region for the example problem in Section V.

We also define the 2- and 6-dimensional cost functions for each . In place of , we consider three corresponding 2D problems:

(33)

and in place of ,

(34)

Remark 1

Note however that even if is a global minimum of , this does not necessarily imply it is a global minimum of . This is because only on ; may well be less than the global minimum of outside of .

III-B Main result: Existence of multiple local minima of

Now that we have represented as a triplet of 1D problems , we use them to analyze and .

In this section, we assume that the measurements are “perfect”; that is,

(35)

When the measurements are perfect, , the global minimum on , should match measurements exactly: . This can be seen by checking that . We are more interested in proving the existence of suboptimal local minima.

We claim that even in the ideal case of spherical covariance and perfect measurements, there are multiple suboptimal local minima of and . The proofs contain only elementary linear algebra and vector calculus, and have been relegated to the appendix.

Lemma 3

Assume (35) holds. Then, there are no global minima of in .

Theorem 1

Assume that the measurements are perfect, i.e. (35). Then, has at least two suboptimal local minima on , one in , and the other in . Each of these corresponds to a (suboptimal) local minimum of .

However, in practice, (35) does not hold, and there is usually some inconsistency in the measurements:

(36)

It is easy to show that the boundaries of the regions vary with ; see Figure 16(a) for some examples. In the event that is “large”, the number of minima on may change.

Remark 2

For “large enough” measurement mismatch , there may not exist minima on the open set . In the proof of Theorem 1, suppose and fixed and , i.e. we are in case “b” in the proof. Then, Theorem 1 relies on finding a minimum in the interval , where . However, with , the interval of on which is defined shrinks. If it shrinks enough so that the minimum of found through Theorem 1 is not actually in , there will not be a minimum in . The same logic applies to .

In conclusion, the use of geodesic distance in the cost function results in a nonsmooth cost function that has multiple suboptimal local minima, even in the case of perfect measurements.

IV Analyzing minima of chordal cost

In this section we investigate the minima of optimization problems (18) and (4). In contrast to the previous section, we show that if measurements are perfect, the use of chordal distance yields no suboptimal local minima and a unique global minimum.

Expanding and simplifying from (2):

(37)

Figure (a)a shows for the example problem considered in Figure (b)b. Hence (18) can be rewritten:

(38)

Figure 10(b) shows for our example. We will also make use of , the Jacobian of :

(39)

and the Hessian

(40)

IV-A Main result: Unique existence of global minimum of

The main claims of this section are the following theorems. Again, the proofs are elementary and have been relegated to the appendix.

Theorem 2

Assume that the measurements are perfect, i.e. (35) holds. Then, is the unique minimum on .

However, for the case of imperfect measurements, this is no longer true. In the worst case, for , we have two distinct global minima on :

Theorem 3

If , multiple distinct global minima of exist on .

Hence it cannot be true that a unique global minimum exists for arbitrary noise. This is a significant difference between the 3-pose problem and the “one-step” problem in [9], which had a unique minimum for any noise magnitude.

(a) Cost for chordal distance .
(b) Chordal cost on .
Fig. 10: Plots from the noise-free 3-pose example problem in Section IV using chordal cost. (a) Compared to in Figure 3(a), is much better behaved. (b) Compared to Figure 7(b), is smooth and has a unique minimum on . Maxima are marked with pink ‘x’s; the unique global minimum is marked with a black ‘x’.
(a)
(b)
Fig. 13: Example poses corresponding to local minima of for noisy, imperfect measurements: (left), (right).
(a) Grid points converging to local minima of .
(b) Grid points converging to local minima of .
Fig. 16: (a) This figure shows a numerical estimate of the regions of attraction to the local minima in . The local minima are marked with red ‘x’, ‘o’, and ‘+’ for different noise levels . For each , the colored areas show initial conditions (ICs) which failed to converge to the global minimum; all of them converged instead to the corresponding local minima. The diagonal lines represent the boundaries of for each . (b) The same test for shows far fewer points that failed to converge to the global minimum. These points encountered numerical issues, with fminunc terminating with large gradients and Hessians with large condition numbers.

V Examples and Discussion

In this section we consider several numerical examples, firstly to illustrate the results of Theorem 1 in the case of perfect measurements, and secondly to give more intuition about the noisy case, which is far more common in practice.

Theorem 1 applies to any planar 3-pose problem with perfect measurements, and not just a single, contrived example. To emphasize this, we consider three problems with three different ground truths. To investigate the effect of noise, we have applied three levels of noise, one to each example problem: . Noise of was added to each orientation measurement; no noise was added to position measurements. Figure 13 shows poses corresponding to the local minima of the two noisy 3-pose problems.

Problem   Ground truth (rad)   Noise (rad)   % of initial conditions converging to local min
1                              0             0.2%
2                              0.1           0.4%
3                                            19.8%

TABLE I: Percentage of sampled points converging to a local minimum

Figure 16 shows plots of for all three example problems superimposed on one another. Even though the three problems have different ground truth poses and different , we are still able to compare the effect of noise through them. Since the positions can be obtained in closed form for any given (cf. Section II-C), we need only concern ourselves with . Then, the center of each is ; we move the center of each to the origin. Hence we are considering the problem of deviation from the measurements (e.g. the x-axis is , not itself). This allows the effect of to be considered in isolation, even in example problems with different ground truth poses.

Figure 16(a) shows an approximation of the region of attraction for each problem. For each problem, a uniform grid of was constructed. Each grid point was used as the initial condition for MATLAB’s fminunc solver. If the global minimum was reached, the grid point was omitted from the plot; otherwise, it was plotted.

Figure 16(a) yields several interesting conclusions. For and , the grid points in always converged to the global minimum. This is consistent with the experience of many users that although wrap() is used in the cost function, good results are obtained. A common method for initializing for consecutive poses is to use odometry; if odometry measurements are reasonably accurate, the initial conditions are likely to be in , i.e. the linear region of wrap(). This is also consistent with the conclusions in [7], namely that in the noise-free case, a unique minimum that is globally optimal exists if wrap() is assumed to be the identity.

Additionally, as increases, there seem to be fewer points in the top left triangular region that converge to local minima. According to Remark 2, for , the local minimum in disappears. However, while the number of local minima decreases, the region of attraction of the remaining local minimum grows, as shown by the shaded area in the bottom right. As becomes unrealistically large at , almost 20% of the sampled points on converged to a local minimum, even including some points from (see the green area above the lower dot-dashed line). In contrast, the case of perfect measurements () has 0.2% of points converging to a local minimum, and 0.4% for . While we have only shown that a finite number of sampled points converge to a local minimum, it seems reasonable due to smoothness that all points on some continuous area “in between” these sampled points will also converge to the local minimum (obviously, this depends on the choice of solver).

Figure 16(b) shows the same investigation applied to chordal cost . Clearly, far fewer points fail to converge to the global minimum. Even though Theorem 2 guarantees a unique global minimum on , there were several singleton initial conditions across the that failed to converge to the global minimum (Figure 16(b)). The result of fminunc for each of these initial conditions had a large gradient (around 20), and had Hessians with condition number on the order of , suggesting numerical issues. While some issues pertaining to numerical solvers persist, it is clear from Figure 16 that has considerable advantages over when it comes to convergence.
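The grid-based procedure used in this section can be sketched on a simplified toy problem. The code below is not the experiment reported above: it uses only the angular part of the cost (positions are ignored), replaces fminunc with fixed-step gradient descent, and assumes noise-free measurements on the edge set {(1,2), (2,3), (1,3)} with wrap() mapping onto [-pi, pi). The measurement values, step size, grid size, and all function names are hypothetical, so the resulting fractions are not comparable with Table I.

```python
import math

def wrap(x):
    """Angle equivalent to x on [-pi, pi) (assumed convention)."""
    return (x + math.pi) % (2.0 * math.pi) - math.pi

# Hypothetical noise-free angular measurements; theta_1 is fixed at 0.
# The edge set {(1,2), (2,3), (1,3)} is an assumption of this sketch.
T12, T23 = 0.3, 0.5
T13 = wrap(T12 + T23)

def residuals(t2, t3):
    return (wrap(t2 - T12), wrap(t3 - t2 - T23), wrap(t3 - T13))

def geodesic(t2, t3):
    """Angular geodesic cost and its gradient (valid away from the kinks)."""
    r1, r2, r3 = residuals(t2, t3)
    return r1 * r1 + r2 * r2 + r3 * r3, (2 * (r1 - r2), 2 * (r2 + r3))

def chordal(t2, t3):
    """Angular chordal cost, using 0.5*||R(a) - R(b)||_F^2 = 2*(1 - cos(a - b))."""
    r = residuals(t2, t3)
    cost = sum(2.0 * (1.0 - math.cos(d)) for d in r)
    s1, s2, s3 = (2.0 * math.sin(d) for d in r)
    return cost, (s1 - s2, s2 + s3)

def descend(cost_fn, t2, t3, step=0.1, iters=400):
    """Fixed-step gradient descent; returns the final cost."""
    for _ in range(iters):
        _, (g2, g3) = cost_fn(t2, t3)
        t2, t3 = t2 - step * g2, t3 - step * g3
    return cost_fn(t2, t3)[0]

def failure_fraction(cost_fn, n=20, tol=1e-6):
    """Fraction of cell-centred grid points that do not reach zero cost."""
    h = 2.0 * math.pi / n
    starts = [-math.pi + (k + 0.5) * h for k in range(n)]
    failures = sum(1 for t2 in starts for t3 in starts
                   if descend(cost_fn, t2, t3) > tol)
    return failures / (n * n)

geodesic_fail = failure_fraction(geodesic)
chordal_fail = failure_fraction(chordal)
```

In this toy setting the geodesic angular cost is piecewise quadratic, and descent started in either triangular region is expected to stay there and settle at a nonzero cost, whereas the chordal angular cost is smooth with a single minimum per period. The sketch illustrates the classification procedure only; it says nothing about the size of the regions of attraction for the full cost.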

VI Conclusion

In this paper, we have shown that for a minimal pose-graph problem, even in the case of ideal measurements, use of geodesic distance in the cost function results in multiple suboptimal local minima. For several numerical examples, we give evidence that the regions of attraction to these local minima are of nonzero measure, and show that these regions of attraction increase in size as noise increases.

In contrast, we prove that for perfect measurements, use of the chordal cost instead of geodesic cost yields a unique global minimum, up to periodicity. For unrealistically large noise, however, we show that multiple minima exist, even for chordal cost. For both perfect and noisy measurements, the region of attraction to the global minimum is shown to be the whole domain , except for one or two isolated points due to numerical issues.

While we cannot claim our results apply directly to larger problems, the existence of these regions of attraction due to geodesic cost for this ideal, minimal problem suggests that similar regions may exist for larger problems. A useful future direction of research is in finding out how this scales. Extending to the 3D case, or the -pose 2D case would be a valuable addition to our understanding of the fundamental nature of pose-graph SLAM problems.

Appendix

In this appendix we give sketches of proofs to save space. The full proofs are available in the arXiv version of this paper.

Proof of Lemma 3.

We first consider the case , aiming to show that for every , . Consider the difference in the 1D optimal costs on and :

(41)

since measurements are perfect. For , this is negative; therefore and hence no global minima exist in . The proof for is very similar and therefore has been omitted.

The following propositions will help us in Theorem 1.

Proposition 1

.

Proof.

This can be shown through somewhat arduous but simple plane geometry. We know that . We aim to show that also equals this expression, so that on .

Now, can be directly evaluated:

(42)

where is the identity matrix. Recall also that (cf. (9)). Then, let and be position measurements in the - and -directions between poses and , in the frame of pose . Finally, letting and , we can directly evaluate and :

(43)
(44)

Now we turn our attention to , and show that it can also be written . Figure 17 shows a diagram of the geometry used to write in this way. From Figure 17, it is evident that

by (43) and (44), and the fact that multiplying both of them by does not change the value of the two-argument arctangent.

Fig. 17: Drawing of geometry for the proof of Proposition 1.

Proposition 2

On , .

Proof.

On , via Proposition 1, . Also, by direct calculation,

(45)

Hence for , .

Proof of Theorem 1.

Consider first , which corresponds to the top left triangle, where . Then, by (35),

(46)
(47)

via (45). We aim to show that on an interval where . Now, , if and only if

(48)

We consider two cases: case “a”, where , and case “b”, where . Since we are not considering the non-differentiable boundaries of , we ignore the case where .

In case “a”, (48) is always true, so for all , and any critical point will be a minimum. We cannot solve (46) directly, but we can show a solution exists. By substitution, it can be shown that , and that . Since is continuous, by the intermediate value theorem, for some .

For case “b”, on . As in case “a”, we still use the intermediate value theorem, but instead evaluate , which equals:

(49)

which is strictly negative if . Hence there exists a root of on , where , which implies a minimum of on .

It is trivial to check that for , satisfies the inequality constraints at the boundaries of ; hence the minimum is in . Since minimizes , by Lemma 2, is a global minimum of . However, while is a global minimum of , by Remark 1 and Lemma 3, it is merely a local minimum of . Again by Lemma 2, the minimum of and the existence of implies that is a global minimum of , and by the same logic as above, a local minimum of .

The proof for uses the same steps and is omitted.

Proof of Theorem 2.

The critical points of are such that . This implies , the second element , is zero, which yields:

(50)

which makes for all . Then, substituting this into , and letting ,

Solving yields: