MAGSAC: marginalizing sample consensus

Abstract

A method called σ-consensus is proposed to eliminate the need for a user-defined inlier-outlier threshold in RANSAC. Instead of estimating the noise σ, it is marginalized over a range of noise scales using a Bayesian estimator, i.e. the optimized model is obtained as the weighted average using the posterior probabilities as weights. Applying σ-consensus, two methods are proposed: (i) a post-processing step which always improved the model quality on a wide range of vision problems without noticeable deterioration in processing time, i.e. at most 1-2 milliseconds; and (ii) a locally optimized RANSAC, called LO-MAGSAC, which includes σ-consensus in the local optimization of LO-RANSAC. The method is superior to the state-of-the-art in terms of geometric accuracy on publicly available real-world datasets for epipolar geometry (F and E), homography and affine transformation estimation.

Keywords:
RANSAC, robust estimation, Bayesian estimation

1 Introduction

The RANSAC (RANdom SAmple Consensus) algorithm proposed by Fischler and Bolles [1] in 1981 has become the most widely used robust estimator in computer vision. RANSAC and its variants have been successfully applied to a wide range of vision tasks, e.g. motion segmentation [2], short baseline stereo [2, 3], wide baseline stereo matching [4, 5, 6], detection of geometric primitives [7], image mosaicing [8], and to perform [9] or initialize multi-model fitting [10, 11]. In short, the RANSAC approach repeatedly selects random subsets of the input point set to which it fits a model, e.g. a plane to three 3D points or a homography to four 2D point correspondences. Next, the quality of the estimated model is measured, for instance by the size of its support, i.e. the number of inliers. Finally, the model with the highest quality, polished e.g. by a least squares fit on its inliers, is returned.

Since the publication of RANSAC, a number of modifications have been proposed. NAPSAC [12], PROSAC [13] and EVSAC [14] modify the sampling strategy to increase the probability of selecting an all-inlier sample early. NAPSAC assumes the inliers to be spatially coherent, PROSAC exploits an a priori predicted inlier probability of the points and EVSAC estimates a confidence in each point. MLESAC [15] estimates the model quality by a maximum likelihood process which, albeit under certain assumptions about the inlier and outlier distributions, retains all its beneficial properties. In practice, MLESAC results are often superior to the inlier counting of plain RANSAC and they are less sensitive to the user-defined inlier-outlier threshold. In MSAC [16], the robust estimation is formulated as a process that estimates both the parameters of the data distribution and the quality of the model in terms of a maximum a posteriori estimate.

One of the highly attractive properties of RANSAC is its small number of control parameters. The termination is controlled by a manually set confidence value μ, and the sampling stops as soon as the probability of finding a model with higher support falls below 1 − μ.1 The setting of μ is not problematic, the typical values being 0.95 or 0.99 depending on the required confidence in the solution. The second, and most critical, parameter is the inlier noise scale σ that determines the inlier-outlier threshold, which strongly influences the outcome of the procedure. In standard RANSAC and its variants, σ must be provided by the user, which limits fully automatic out-of-the-box use and requires the user to acquire knowledge about the problem at hand. To reduce the dependency on this threshold, MINPRAN [17] assumes the outliers to be uniformly distributed and finds the model with inliers least likely to have occurred randomly. Moisan et al. [18] proposed a contrario RANSAC to optimize each model by selecting the most likely noise scale. Neither of these methods is commonly used.
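To make the role of the confidence concrete, the standard adaptive termination criterion can be computed as in the sketch below. This is our illustrative example, not code from the paper; the function name and the handling of degenerate inputs are our own choices.

    import math

    def required_iterations(confidence: float, inlier_ratio: float, sample_size: int) -> int:
        # Smallest k such that the probability of having drawn at least one
        # all-inlier minimal sample reaches the required confidence.
        if inlier_ratio <= 0.0:
            return 2**31 - 1                      # nothing known yet; keep sampling
        p_good = inlier_ratio ** sample_size      # P(a single sample is all-inlier)
        if p_good >= 1.0:
            return 1
        return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - p_good))

    # E.g. homography fitting (m = 4) with 50% inliers at 0.99 confidence: 72 samples.
    print(required_iterations(0.99, 0.5, 4))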

As the major contribution of this paper, we propose an approach, σ-consensus, that eliminates the need for σ, the noise scale parameter, while keeping the processing time acceptable. Instead of σ, only a range within which σ lies is required. The range can be fairly wide, say one order of magnitude. The final outcome is obtained by marginalizing over σ, using Bayesian estimation and the data likelihood given the model and σ. Besides finessing the need for a precise scale parameter, the novel method, called MAGSAC, is more precise than previously published RANSACs.

As a second contribution, we replace the maximization in RANSAC with Bayesian estimation, which takes the form of robust model averaging weighted by the data likelihood for each so-far-the-best model. We build on the observation of Rais et al. [19] who showed that the accuracy of RANSAC, which depends on the single sample where the model attained the maximum score, can be improved by averaging other all-inlier models generated during the RANSAC procedure. Their Random Sample Aggregated Consensus (RANSAAC) algorithm [19] takes the weighted mean (or median) of all the generated models, selecting a power of the inlier counts as weights. We replace this ad-hoc weighting by the a posteriori likelihood, justified under Bayesian estimation, and propose an approach for averaging over σ.

They, however, identified a problem: a significant number of models has to be generated and averaged in case of a low inlier ratio (e.g. for homography fitting) to achieve stable results – which is prohibitive in practice. Dealing with this issue, they applied the locally optimized RANSAC proposed by Chum et al. [20], who observed that RANSAC requires more samples in practice than theory predicts, since not all all-inlier samples are “good”, i.e. lead to a model accurate enough for distinguishing all inliers. In [19], LO-RANSAC is used to select the models worth including in the averaging: the locally optimized ones.

As a third contribution, we propose a new local optimization, called LO-MAGSAC, which polishes each so-far-the-best model over a range of noise scales σ. Since the number of LO runs is close to the logarithm of the number of verifications, the procedure is often faster than plain RANSAC due to the early termination. More importantly, the obtained results were superior to those of the state-of-the-art RANSAC variants on a wide range of vision problems and datasets. We however acknowledge that being often faster than plain RANSAC does not mean that the time demand has not been increased by the optimization of σ. Thus we also propose a post-processing step applying σ-consensus to the final so-far-the-best model without noticeable deterioration in processing time, i.e. a maximum of 1-2 milliseconds. In our experiments, the method always improved the input model (coming from RANSAC, MSAC or LO-RANSAC) on a wide range of problems. Thus we see no reason for not applying it after the robust estimation has finished.

2 Bayesian optimization

2.1 Notation

In this paper, the input point set is denoted as P = {p_1, ..., p_n} ⊆ R^k, where k is the dimension of the points, e.g. k = 2 for 2D points. The inlier set is I ⊆ P. The model to fit is represented by its parameter vector θ ∈ Θ ⊆ R^d, where d is the dimension of the model, e.g. d = 2 for 2D lines (angle and offset), and Θ is the manifold, e.g. of all possible 2D lines. The fitting function F : P* → Θ calculates the model parameters from the data points, where P* = 2^P is the power set of P and m is the minimum point number for fitting a model, e.g. m = 2 for 2D lines. Note that F is a combined function applying different estimators on the basis of the input set, for instance, a minimal method if the sample size equals m and least-squares fitting otherwise. Function D : Θ × R^k → R is the point-to-model residual function. Function I : Θ × R × P* → P* selects the inlier set given model θ and threshold σ; for example, if the original RANSAC approach is considered, I(θ, σ, P) = {p ∈ P : D(θ, p) < σ}, whilst for the truncated quadratic distance, I(θ, σ, P) = {p ∈ P : D²(θ, p) < σ²}. The quality function is Q : Θ × R × P* → R, where higher quality is interpreted as a better model. For RANSAC, Q(θ, σ, P) = |I(θ, σ, P)| and for MSAC it is Q(θ, σ, P) = Σ_i (1 − D²(θ, p_i)/σ²), where p_i is the ith inlier.

Notation
P – set of data points        σ – noise standard deviation
θ – model parameters          D – residual function
I – inlier selector function  Q – model quality function
F – fitting function          m – minimal sample size

2.2 σ-consensus

Suppose that we are given a model θ estimated from a minimal sample and a point set P. The objective is to estimate the model implied by the unknown noise scale σ, given θ and P, minimizing the Bayesian quadratic loss. The problem is formulated as follows:

θ* = argmin_{θ'} ∫ ‖θ' − θ_σ‖² p(σ | θ, P) dσ,    (1)

where θ_σ = F(I(θ, σ, P)) is the model implied by the inlier set selected using σ around the input model θ. It can be solved using the joint probability rule:

p(σ | θ, P) = p(θ, P | σ) p(σ) / p(θ, P),    (2)

where p(θ, P) is the likelihood of the model. Substituting Eq. 2 into Eq. 1 leads to

θ* = argmin_{θ'} (1 / p(θ, P)) ∫ ‖θ' − θ_σ‖² p(θ, P | σ) p(σ) dσ.

Since p(θ, P) is a non-negative constant, it is enough to minimize the integral as follows:

θ* = argmin_{θ'} ∫ ‖θ' − θ_σ‖² p(θ, P | σ) p(σ) dσ.    (3)

The solution is obtained by simply differentiating Eq. 3 as follows:

∂/∂θ' ∫ ‖θ' − θ_σ‖² p(θ, P | σ) p(σ) dσ = 2 ∫ (θ' − θ_σ) p(θ, P | σ) p(σ) dσ = 0.

After rearranging the equation,

θ* = ∫ θ_σ p(θ, P | σ) p(σ) dσ / ∫ p(θ, P | σ) p(σ) dσ.

Note that ∫ p(θ, P | σ) p(σ) dσ = p(θ, P).

To discretize the problem, let us notice that, due to having a finite point set (|P| = n), the set of σs leading to models with different parameters cannot be infinite. The set of possible models given P and θ is as follows:

Θ_σ = {θ_σ : σ ∈ [σ_min, σ_max]} = {θ_{σ_1}, ..., θ_{σ_K}}.

Due to I(θ, σ_a, P) ⊆ I(θ, σ_b, P) for σ_a ≤ σ_b, it holds that K ≤ n. Therefore the integral can be replaced by a summation as follows:

θ* ≈ Σ_{i=1}^{K} θ_{σ_i} p(θ, P | σ_i) p(σ_i) / Σ_{i=1}^{K} p(θ, P | σ_i) p(σ_i).    (4)

Consequently, the model minimizing the loss is the weighted mean of the models which a finite set of σs imply. The weights are the posterior probabilities of those σs.
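The weighted mean of Eq. 4 is straightforward to implement once the fitting, inlier-selection and likelihood functions of Section 2.1 are available. The sketch below is ours: the callbacks fit, select_inliers and log_likelihood are hypothetical stand-ins for F, I and p(θ, P | σ), a flat prior p(σ) is assumed, and the log-domain normalization is an implementation choice for numerical stability, not part of the derivation.

    import numpy as np

    def marginalize_over_sigma(theta, points, sigmas, fit, select_inliers, log_likelihood):
        # Eq. 4: weighted mean of the models implied by a finite set of sigmas,
        # with weights proportional to p(theta, P | sigma) p(sigma).
        models, log_w = [], []
        for sigma in sigmas:
            theta_sigma = fit(select_inliers(theta, sigma, points))
            models.append(np.asarray(theta_sigma))
            log_w.append(log_likelihood(theta_sigma, points, sigma))
        log_w = np.asarray(log_w)
        w = np.exp(log_w - log_w.max())   # shift by the max to avoid underflow
        w /= w.sum()                      # normalized posterior weights
        return sum(wi * mi for wi, mi in zip(w, models))

Note that averaging parameter vectors presupposes a consistent parameterization of the models, e.g. homographies normalized to a common scale, as in [19].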

2.3 Posterior probabilities

To calculate the Bayesian posterior probability of the model θ_σ implied by σ, given θ and P,

p(σ | θ, P) = p(θ, P | σ) p(σ) / p(θ, P),    (5)

required for the averaging, we need to model the input data as a mixture of probabilistic distributions. Note that p(θ, P | σ) is the likelihood of the points given σ, and p(σ) encapsulates the prior knowledge about σ, and thus about θ_σ as well. It can be easily seen that p(θ, P) can be omitted from Eq. 4 due to the division and the fact that it does not depend on σ. Let us start from the original RANSAC scheme where a fitness score is assigned to every data point interpreting it as a member of the outlier (O) or inlier (I) classes. In that approach, both classes are supposed to have uniform distributions and thus, the input point set is described by a mixture of those. The error distribution given the ground truth model is as follows:

p(D(θ, p) | σ) = 1/σ if p ∈ I,    p(D(θ, p) | σ) = 1/l if p ∈ O,

where σ is the standard deviation of the inlier residuals (p ∈ I) and l is a boundary for the outliers, for instance, the image diagonal size for 2D points. The implied likelihood of (θ, P) given σ is as follows:

p(θ, P | σ) = Π_{i=1}^{n} (1/σ)^{δ_i} (1/l)^{1 − δ_i},

where δ_i ∈ {0, 1} is an indicator variable – determined by function I – which is equal to 1 if p_i is an inlier, and 0 otherwise.

Figure 1: Likelihood (from Eq. 7; mean of the runs; homography fitting) of the σs given the MAGSAC output on outlier ratios 0.2, 0.5 and 0.8. The ground truth σ is shown by a black line.

Leaving the RANSAC approach and assuming that the inliers have a normal distribution (MSAC or MLESAC scheme), similarly as in [16], the residuals are described by the following mixed distribution:

p(D(θ, p) | σ) = γ (1/(σ√(2π))) exp(−D²(θ, p)/(2σ²)) + (1 − γ)/l,

where γ is the prior probability of a point being an inlier. Therefore p(θ, P | σ) is written as follows:

p(θ, P | σ) = Π_{i=1}^{n} [ γ (1/(σ√(2π))) exp(−D²(θ, p_i)/(2σ²)) + (1 − γ)/l ].    (6)

The full derivation to get Eq. 6 is in [16]. Due to the indicator variable, Eq. 6 is written as

p(θ, P | σ) = Π_{i=1}^{n} ( (1/(σ√(2π))) exp(−D²(θ, p_i)/(2σ²)) )^{δ_i} (1/l)^{1 − δ_i}.    (7)

Note that the prior probability p(σ), without any prior knowledge about the desired model or the actual noise scale, can be chosen to be a flat one, or it can be updated step-by-step inside the RANSAC procedure. Also note that, to avoid the extremely small weights to which the product of many small numbers leads, it is beneficial to use the logarithm of p(θ, P | σ) as weights. It is as follows:

ln p(θ, P | σ) = Σ_{i=1}^{n} [ δ_i (−ln(σ√(2π)) − D²(θ, p_i)/(2σ²)) − (1 − δ_i) ln l ].
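As a concrete illustration, the log-likelihood of Eq. 7 can be evaluated as below. The sketch is ours; the rule r ≤ σ used to set the indicator δ_i is a simplifying assumption, as the paper determines the inliers with the selector function I.

    import numpy as np

    def log_likelihood(residuals, sigma, l):
        # ln p(theta, P | sigma) from Eq. 7: Gaussian density for the inliers,
        # a uniform 1/l density for the outliers (l is e.g. the image diagonal).
        r = np.asarray(residuals, dtype=float)
        inlier = r <= sigma                      # indicator delta_i (assumed rule)
        ll_inl = -np.log(sigma * np.sqrt(2.0 * np.pi)) - r[inlier] ** 2 / (2.0 * sigma ** 2)
        ll_out = -np.log(l) * np.count_nonzero(~inlier)
        return ll_inl.sum() + ll_out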

Example distributions estimated by the proposed method are shown in Fig. 1 for synthetic homography fitting, discussed later in the experimental section.

3 Algorithms using σ-consensus

In this section, we propose two algorithms using σ-consensus: LO-MAGSAC and a post-processing approach. LO-MAGSAC is a locally optimized RANSAC aiming to minimize the quadratic loss whenever a new so-far-the-best model is found. Even though it is often faster than plain RANSAC, there are RANSAC variants promising a faster procedure. Thus the reason for the post-processing step is to improve the output without noticeably increasing the processing time, by applying σ-consensus only once: to the final so-far-the-best model, i.e. the output of the robust estimation. In the proposed algorithms, we define the quality function as the truncated quadratic one,

Q(θ, σ, P) = Σ_{p ∈ I(θ, σ, P)} (1 − D²(θ, p)/σ²).

Function I, i.e. the one returning the inlier set, is I(θ, σ, P) = {p ∈ P : D(θ, p) < σ}.

3.1 Post-processing by σ-consensus

In order to use the previously described optimization without high computational overhead, we propose to apply it as a post-processing step refining only the RANSAC output. The proposed algorithm is summarized in Alg. 1.

Assuming that the given σs are ordered descendingly, i.e. σ_i > σ_{i+1}, it is a straightforward solution for speeding up the process to get the inliers implied by the highest σ (σ_1; line 7) first. Then the next steps just need to update the inlier set by removing the points farther than the current σ_i (line 9) instead of checking all elements in P. For each σ_i, a model is fitted to the implied inlier set (line 10) and its quality is computed (line 11). To achieve the maximum accuracy, the estimated model is also compared to the current so-far-the-best model and stored if it has higher quality. The averaged model is then updated (line 14) by adding the new model with its quality as a weight. We use a running average to be able to keep the results of intermediate steps as well if they lead to better quality (lines 15-17) than the previous best.

It can be easily seen that the time complexity of the algorithm is not significant for a small number of σs, K. Only the first iteration requires checking all the points; the remaining ones work with the shrinking inlier set, I_{i+1} ⊆ I_i. The time complexity of function F, considering that it is an SVD decomposition, is O(n d²), where d is the dimension of the unknowns; e.g. for line fitting, it is O(n). The worst case is when all points are at zero distance from the model, i.e. |I_i| = n for every i. Then F has to run on all data points K times, thus leading to O(K n d²) complexity. The complexity of computing the score K times is O(K n) and that of the running average is O(K d). Therefore the overall worst-case complexity is O(K n d²), which is linear in the number of points. However, the solution can straightforwardly be parallelized by computing the models and scores for the K scales concurrently; the complexity then becomes O(n d²).

1: Input: P – data points; Σ = {σ_1 > ... > σ_K} – set of σs; θ – model parameters; q – quality
2: Output: θ* – optimal model parameters; q* – optimal quality
3: θ*, q* := θ, q
4: I := ∅; θ_a := 0; w := 0 ▷ Inlier set, aggregated model, sum of weights
5: for i = 1 → K do
6:   if i = 1 then
7:     I := I(θ, σ_1, P) ▷ Get the inliers.
8:   else
9:     I := {p ∈ I : D(θ, p) < σ_i} ▷ Remove too-far inliers.
10:  θ_i := F(I) ▷ Fit model to the inliers.
11:  q_i := Q(θ_i, σ_i, P) ▷ Compute the quality of the model.
12:  if q_i > q* then
13:    θ*, q* := θ_i, q_i
14:  w := w + q_i; θ_a := θ_a + (q_i / w)(θ_i − θ_a) ▷ Running average.
15:  q_a := Q(θ_a, σ_i, P) ▷ Compute the score of the aggregated model.
16:  if q_a > q* then
17:    θ*, q* := θ_a, q_a
Algorithm 1: σ-consensus.
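In code, Alg. 1 can be sketched as follows. This is our illustrative translation, not the authors' implementation: fit, residual and quality are hypothetical callbacks matching F, D and Q of Section 2.1, sigmas is assumed sorted descendingly, and the model vectors are assumed to support arithmetic (e.g. numpy arrays) so that the running weighted average is well defined.

    def sigma_consensus(points, sigmas, theta, quality, fit, residual):
        # Post-processing by sigma-consensus (Alg. 1 sketch). `sigmas` must be
        # sorted descendingly so the inlier set only shrinks between iterations.
        best_theta, best_q = theta, quality(theta, sigmas[0], points)
        inliers = list(points)
        aggregated, weight_sum = None, 0.0
        for sigma in sigmas:
            inliers = [p for p in inliers if residual(theta, p) < sigma]
            theta_i = fit(inliers)                     # model implied by this sigma
            q_i = quality(theta_i, sigma, points)
            if q_i > best_q:                           # keep the best single model
                best_theta, best_q = theta_i, q_i
            weight_sum += q_i                          # quality acts as the weight
            aggregated = theta_i if aggregated is None else \
                aggregated + (q_i / weight_sum) * (theta_i - aggregated)
            q_a = quality(aggregated, sigma, points)   # score the aggregated model
            if q_a > best_q:                           # keep intermediate averages
                best_theta, best_q = aggregated, q_a
        return best_theta, best_q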

3.2 LO-MAGSAC: Locally optimized marginalizing sample consensus

We propose a locally optimized RANSAC applying the proposed σ-consensus inside the LO procedure. The algorithm is summarized in Alg. 2. First, it selects the inliers of the model to be optimized (line 4). Next, a RANSAC-like procedure is applied to the inlier set, selecting larger-than-minimal samples in each iteration (the sample size proposed in [21]), where m is the size of a minimal sample. Using guided sampling like PROSAC [13] instead of the uniform one is a straightforward choice. For PROSAC, the points have to be ordered by their feature scores; thus it requires no additional computation to use PROSAC in the embedded RANSAC, since the inlier selector function does not change the point ordering. Note that it could also be a justifiable approach to use the inlier probabilities of the points w.r.t. the given model; however, this would introduce bias into the estimation. A model is then estimated from the selected sample (line 7), its score is computed (line 8) and σ-consensus is performed (line 11).

The time complexity of F in this case, considering that it is still an SVD decomposition, is O(n d²), where d is the number of unknowns. The score computation has O(n) complexity in general, but it parallelizes trivially over the points. As we discussed in the previous section, the time demand of the optimization step in the worst case is O(K n d²). Thus the overall complexity of the local optimization step is O(r K n d²), where r is the number of inner iterations; that of the parallel approach is lower accordingly. Since r, K and d are usually small numbers and do not depend on n, the time demand is linear in the number of points.

1: Input: P – data points; σ – threshold; Σ – set of σs; θ – model parameters; q – quality; r – iteration number; m – size of a minimal sample
2: Output: θ* – optimal model parameters; q* – optimal quality
3: θ*, q* := θ, q
4: I := I(θ, σ, P) ▷ Inlier set.
5: for i = 1 → r do
6:   S := sample(I) ▷ Select a larger-than-minimal random sample from I; the size was proposed in [21].
7:   θ_i := F(S) ▷ Fit model to the selected inliers.
8:   q_i := Q(θ_i, σ, P) ▷ Compute the score of the temporal model.
9:   if q_i > q* then
10:    θ*, q* := θ_i, q_i
11:  θ_σ, q_σ := σ-consensus(P, Σ, θ_i, q_i) ▷ Alg. 1
12:  if q_σ > q* then
13:    θ*, q* := θ_σ, q_σ
Algorithm 2: LO-MAGSAC. Local optimization with σ-consensus.
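A compact translation of Alg. 2, reusing the sigma_consensus sketch above, could look as follows. Again, all callbacks and the inner sample size are our assumptions; for brevity, this sketch applies σ-consensus only to hypotheses that improve on the so-far-the-best model.

    import random

    def lo_magsac(points, theta, q, quality, fit, residual, sigmas,
                  threshold, iterations, sample_size):
        # Local optimization with sigma-consensus (Alg. 2 sketch): an inner
        # RANSAC on the inliers with larger-than-minimal samples, where every
        # new so-far-the-best model is polished by Alg. 1.
        best_theta, best_q = theta, q
        inliers = [p for p in points if residual(theta, p) < threshold]
        for _ in range(iterations):
            sample = random.sample(inliers, min(sample_size, len(inliers)))
            theta_i = fit(sample)
            q_i = quality(theta_i, threshold, points)
            if q_i > best_q:
                best_theta, best_q = theta_i, q_i
                # marginalize the new best model over the noise scales (Alg. 1)
                theta_s, q_s = sigma_consensus(points, sigmas, theta_i,
                                               quality, fit, residual)
                if q_s > best_q:
                    best_theta, best_q = theta_s, q_s
        return best_theta, best_q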

4 Experimental Results

In order to evaluate the effect of the proposed post-processing step, we tested several approaches with and without it. The compared algorithms are: RANSAC, MSAC, LO-RANSAC, and LO-MSAC. Being the first method which averages the estimated models, we also included the results of LO-RANSAAC [19]. PROSAC sampling, the same random seed and the same algorithmic components were used for all methods, and they all performed a final least-squares fit on the obtained inlier set. Thus the difference between RANSAC and MSAC, and between LO-RANSAC and LO-MSAC, is solely the scoring (i.e. quality) function. Moreover, the methods with the LO prefix, except LO-MAGSAC, run the original local optimization step proposed by Chum et al. [20] with an inner RANSAC applied to the inliers. The parameters were set as follows: the inlier-outlier threshold of the RANSAC loop was the value proposed in [21], which also suited our setup; the number of inner RANSAC iterations, the required confidence, and the minimum number of iterations before the first LO step and before termination were fixed and identical for all methods. The reported error values are root mean square (RMS) errors. For the optimization, we used a set Σ of noise scales that led to accurate results with negligible time demand.

4.1 Synthesized Tests

For testing the proposed method in a fully controlled environment, two cameras were generated by their projection matrices P_1 and P_2. The first camera was located in the origin and its image plane was parallel to the XY plane. The position of the second camera was at a random point inside a unit sphere around the first one, thus ‖C_2‖ < 1. Its orientation was determined by three random rotations around the principal directions, i.e. R = R_X(α) R_Y(β) R_Z(γ), where α, β and γ are random angles within fixed bounds. Both cameras had a common intrinsic camera matrix with a fixed focal length and principal point. A 3D plane was generated with random tangent directions and origin. It was sampled at random locations, thus generating three-dimensional points at most one unit far from the plane origin. These points were projected into the cameras. All of the random parameters were selected using a uniform distribution. Zero-mean Gaussian noise with standard deviation σ was added to the projected point coordinates. Finally, outliers, i.e. uniformly distributed random point correspondences, were added. In total, n points were generated.
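A sketch of such a scene generator is given below. All concrete constants (focal length, image size, rotation bounds) are placeholders of our choosing, since the paper's exact values are not recoverable from this text.

    import numpy as np

    def random_rotation(max_angle):
        # Compose three random rotations around the principal axes.
        ax, ay, az = np.random.uniform(-max_angle, max_angle, 3)
        rx = np.array([[1, 0, 0],
                       [0, np.cos(ax), -np.sin(ax)],
                       [0, np.sin(ax), np.cos(ax)]])
        ry = np.array([[np.cos(ay), 0, np.sin(ay)],
                       [0, 1, 0],
                       [-np.sin(ay), 0, np.cos(ay)]])
        rz = np.array([[np.cos(az), -np.sin(az), 0],
                       [np.sin(az), np.cos(az), 0],
                       [0, 0, 1]])
        return rz @ ry @ rx

    def synthetic_scene(n_inliers, n_outliers, noise_sigma,
                        f=600.0, img=600.0, max_angle=np.pi / 8):
        K = np.array([[f, 0, img / 2], [0, f, img / 2], [0, 0, 1]])
        P1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])        # camera at origin
        C2 = np.random.uniform(-1, 1, 3)                         # random position,
        C2 /= max(1.0, np.linalg.norm(C2))                       # clipped to the unit ball
        R = random_rotation(max_angle)
        P2 = K @ np.hstack([R, (-R @ C2).reshape(3, 1)])
        o = np.random.uniform(-1, 1, 3) + np.array([0, 0, 5.0])  # plane kept in front
        t = np.linalg.qr(np.random.randn(3, 3))[0][:, :2]        # orthonormal tangents
        u = np.random.uniform(-1, 1, (n_inliers, 2))
        X = o + u @ t.T                                          # 3D points on the plane
        Xh = np.hstack([X, np.ones((n_inliers, 1))])
        x1 = (P1 @ Xh.T).T; x1 = x1[:, :2] / x1[:, 2:]
        x2 = (P2 @ Xh.T).T; x2 = x2[:, :2] / x2[:, 2:]
        inl = np.hstack([x1, x2]) + np.random.normal(0, noise_sigma, (n_inliers, 4))
        outl = np.random.uniform(0, img, (n_outliers, 4))        # random correspondences
        return np.vstack([inl, outl])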

The mean results of the repeated runs are reported in Fig. 11. The competitor algorithms are: RANSAC (RSC), MSAC (MSC), LO-RANSAC (LO-RSC), LO-MSAC (LO-MSC) and LO-MAGSAC. Suffix "+ σ" means that σ-consensus was applied to the obtained model. The first row (a-c) reports the geometric accuracy (in pixels) as a function of the noise σ, obtained using different outlier ratios (a – 0.2, b – 0.5, c – 0.8). For instance, an outlier ratio of 0.8 means that 80% of the correspondences are outliers. A fixed, and same, number of samples was used for all methods in these tests, calculated from the ground truth inlier ratio requiring a fixed confidence. By looking at the differences between methods with and without the proposed post-processing step ("+ σ"), it can be seen that it always improved the results in these tests; e.g. the geometric error of RSC is higher than that of RSC + σ for every noise σ. LO-MAGSAC led to the estimates with the lowest errors for the higher outlier ratios. It was the second best for the lowest outlier ratio, where the first one was LO-MSAC + σ. For the second row (plots d-f), not a fixed number of samples but an iteratively updated one was used, requiring the same confidence. The geometric errors are plotted as a function of the noise σ. Even though the differences become slightly smaller, the same trend can be observed as for the previous row. This change in the differences can easily be understood by noticing that the proposed local optimization more often led to earlier termination than the others – see plots (g) and (h). For (g-h), the required sample number is shown as the function of the noise scale (g) or the outlier ratio (h). Since methods with suffix "+ σ" use a post-processing step, the number of generated samples does not change; thus they are not shown in these plots. It can be seen that LO-MAGSAC needed the least number of samples for termination in most of the cases. The last plot (i) shows the sensitivity of σ-consensus to the used range of σs. The results were averaged over outlier ratios 0.2, 0.3, ..., 0.8. For this test, LO-MAGSAC was run with Σ sets of different sizes, each containing σs evenly distributed over a fixed interval. It can be seen that the results get better as |Σ|, i.e. the number of σs in the interval, increases; after a certain point, the results have fairly similar quality.

Fig. 1 shows the normalized likelihood of each σ given θ, i.e. the probabilities of the most and least likely σs (Eq. 7) are 1 and 0, respectively. The results are the mean of the runs for each outlier ratio (0.2, 0.5 and 0.8). The ground truth σs are shown by black vertical lines. It can be seen that, using the proposed approach, the peaks are close to the ground truth even for a high outlier ratio.


Figure 2: 0.2 outlier ratio, fixed iteration number.
Figure 3: 0.5 outlier ratio, fixed iteration number.
Figure 4: 0.8 outlier ratio, fixed iteration number.
Figure 5: 0.2 outlier ratio, fixed confidence.
Figure 6: 0.5 outlier ratio, fixed confidence.
Figure 7: 0.8 outlier ratio, fixed confidence.
Figure 8: Sample number as the function of the noise σ.
Figure 9: Sample number as the function of the outlier ratio.
Figure 10: Sensitivity to the choice of set Σ.
Figure 11: Synthetic homography fitting. The competitor algorithms are: RANSAC (RSC), MSAC (MSC), LO-RANSAC (LO-RSC), LO-MSAC (LO-MSC) and LO-MAGSAC. Suffix "+ σ" means that the proposed post-processing step was applied to the returned model. The first row (a-c) reports the geometric accuracy (in pixels) as a function of the noise σ, obtained with different outlier ratios and a fixed, and same, iteration number for all methods, calculated from the ground truth inlier ratio with a fixed confidence. The second row (d-f) shows the errors with an iteratively updated iteration number requiring the same confidence. Plots (g-h) report the required sample number as a function of the noise σ and the outlier ratio; since "+ σ" does not change the iteration number, these methods were removed. Plot (i) shows the accuracy implied by different Σ sets as the function of the ground truth noise σ, averaged over outlier ratios 0.2, 0.3, ..., 0.8; |Σ| denotes the number of used σs in the tested interval.

4.2 Real World Tests


Figure 12: Homography; homogr dataset
Figure 13: Homography; EVD dataset
Figure 14: Fundamental matrix; kusvod2 dataset
Figure 15: Fundamental matrix; AdelaideRMF dataset
Figure 16: Essential matrix; Strecha dataset
Figure 17: Affine transformation; SZTAKI dataset
Figure 18: Example results of LO-MAGSAC on each dataset and problem.

Estimation of Fundamental Matrix. To evaluate the performance on fundamental matrix estimation we used the kusvod2 (24 pairs), Multi-H (5 pairs), and AdelaideRMF (19 pairs) datasets (see footnotes 2-4). Kusvod2 consists of 24 image pairs of different sizes with point correspondences and fundamental matrices estimated from manually selected inliers. AdelaideRMF and Multi-H consist of a total of 24 image pairs with point correspondences, each assigned manually to a homography or to the outlier class. For them, all points assigned to a homography were considered as inliers and the others as outliers. In total, the proposed technique was tested on 48 image pairs from three publicly available datasets for F estimation. All methods applied the seven-point method [22] as a minimal method for estimating F; thus they drew minimal sets of size seven in each RANSAC iteration. For the final least-squares fitting, the normalized eight-point algorithm [23] was run on the obtained inlier set. Note that all fundamental matrices were discarded for which the oriented epipolar constraint [24] did not hold.

The first three blocks of Table 1, each consisting of four rows, report the quality of the estimation on each dataset as the average of 1000 runs on every image pair. The first two columns show the names of the tests and the investigated properties: (1-2) values e and e_med are the mean and median RMS geometric errors, in pixels, of the obtained model w.r.t. the manually annotated inliers. For fundamental matrices and homographies, the error is defined as the average Sampson distance and the re-projection error, respectively. For essential matrices, it is the mean Sampson distance of the implied F and the correspondences. (3) Value t is the mean processing time in milliseconds. (4) Value s is the mean number of samples, i.e. RANSAC iterations, which had to be drawn till termination. Note that, as expected, the methods applied with or without the proposed post-processing step do not differ in this respect.

It can be clearly seen that for F estimation the proposed post-processing step improved the results in nearly all of the tests, with a negligible increase, on the order of milliseconds, in the processing time. The errors were noticeably reduced compared with the methods without σ-consensus. Applying LO-MAGSAC led to results superior to those of the competitor algorithms in all test cases in terms of geometric accuracy. Comparing its processing time to that of the fastest method (LO-MSAC), LO-MAGSAC was a few times slower; however, due to the early termination, its time remained comparable to that of plain RANSAC.

Estimation of Homography. In order to test homography estimation we downloaded the homogr (16 pairs) and EVD (15 pairs) datasets (see footnotes 5 and 6). Each consists of image pairs of different sizes with point correspondences and inliers (correctly matched point pairs) selected manually. The homogr dataset consists of mostly short baseline stereo images, whilst the pairs of EVD undergo an extreme view change, i.e. wide baseline or extreme zoom. All algorithms applied the normalized four-point algorithm [22] for homography estimation both in the model generation and local optimization steps; therefore, each minimal sample consists of four correspondences. The 4th and 5th blocks of Table 1 show the mean results computed using all the image pairs of each dataset. Similarly as for F estimation, the proposed post-processing step always improved the results. For the homogr dataset, the most accurate results were obtained by LO-MAGSAC. For EVD, LO-RSC + σ led to the best solutions.

Estimation of Essential Matrix. To estimate essential matrices, we used the strecha dataset [25] consisting of image sequences of buildings; the ground truth projection matrices are provided. The methods were applied to all possible image pairs in each sequence. The SIFT detector [26] was used to obtain correspondences. For each image pair, a reference point set with ground truth inliers was obtained by calculating the fundamental matrix from the projection matrices [22]. Correspondences were considered inliers if their symmetric epipolar distance was smaller than a fixed threshold, and all image pairs with too few inliers found were discarded. In total, 467 image pairs were used in the evaluation. The results are reported in the 6th block of Table 1. The trend is similar to the previous cases; the only difference is that the post-processing step did not improve plain RANSAC and MSAC. The most accurate essential matrices were obtained by LO-MAGSAC.

Estimation of Affine Transformation. The SZTAKI Earth Observation dataset (footnote 7) [27] (52 image pairs) was used to test the estimation of affine transformations. It contains images of busy road scenes taken from a balloon. Due to the altitude of the balloon, the image pair relation is well approximated by an affine transformation. Point correspondences were detected by the SIFT detector. For ground truth, inliers were selected manually for every image pair; point pairs with a distance from the ground truth affine transformation lower than a fixed threshold were defined as inliers. The results are shown in the 7th block of Table 1. The reported geometric error is ‖A p_1 − p_2‖_2, where A is the estimated affine transformation and p_i is the point in the ith image (i ∈ {1, 2}). It can be seen that the methods obtained fairly similar results.

Methods: RSC | RSC+σ | MSC | MSC+σ | LO-RSC | LO-RSC+σ | LO-MSC | LO-MSC+σ | LO-RSAAC | LO-MAGSAC

kusvod2 (F, 24 pairs)
 e:     0.73 | 0.63 | 0.76 | 0.67 | 0.60 | 0.57 | 0.62 | 0.60 | 1.01 | 0.57
 e_med: 0.69 | 0.53 | 0.71 | 0.57 | 0.45 | 0.37 | 0.44 | 0.39 | 0.59 | 0.30
 t:     38   | 39   | 16   | 17   | 22   | 22   | 14   | 14   | 14   | 57
 s:     697  | 697  | 304  | 304  | 335  | 335  | 177  | 177  | 177  | 181

Adelaide (F, 19 pairs)
 e:     0.53 | 0.52 | 0.52 | 0.49 | 0.32 | 0.32 | 0.32 | 0.32 | 0.33 | 0.28
 e_med: 0.37 | 0.37 | 0.32 | 0.27 | 0.24 | 0.24 | 0.24 | 0.22 | 0.26 | 0.22
 t:     418  | 418  | 371  | 372  | 236  | 237  | 215  | 216  | 216  | 332
 s:     3327 | 3327 | 2752 | 2752 | 1635 | 1635 | 1471 | 1471 | 1471 | 1651

Multi-H (F, 4 pairs)
 e:     0.70 | 0.67 | 0.74 | 0.59 | 0.57 | 0.56 | 0.59 | 0.55 | 0.58 | 0.54
 e_med: 0.50 | 0.47 | 0.56 | 0.40 | 0.38 | 0.36 | 0.37 | 0.33 | 0.30 | 0.32
 t:     230  | 232  | 92   | 94   | 102  | 104  | 62   | 65   | 64   | 315
 s:     1987 | 1987 | 908  | 908  | 580  | 580  | 327  | 327  | 327  | 585

homogr (H, 16 pairs)
 e:     3.71 | 3.18 | 2.87 | 2.67 | 3.58 | 3.15 | 2.78 | 2.67 | 2.95 | 2.49
 e_med: 3.43 | 2.89 | 2.63 | 2.55 | 3.17 | 2.85 | 2.48 | 2.47 | 2.50 | 1.83
 t:     34   | 36   | 28   | 30   | 44   | 45   | 37   | 38   | 42   | 169
 s:     931  | 931  | 763  | 763  | 831  | 831  | 644  | 644  | 644  | 456

EVD (H, 15 pairs)
 e:     4.60 | 4.25 | 5.42 | 5.09 | 4.31 | 4.01 | 4.82 | 4.70 | 4.55 | 4.20
 e_med: 2.96 | 2.89 | 2.58 | 2.43 | 2.79 | 2.76 | 2.44 | 2.34 | 2.51 | 2.32
 t:     660  | 664  | 658  | 666  | 694  | 694  | 701  | 706  | 717  | 772
 s:     8415 | 8415 | 6307 | 6307 | 5610 | 5610 | 4451 | 4451 | 4451 | 4798

strecha (E, 467 pairs)
 e:     5.94 | 5.94 | 5.76 | 5.76 | 9.04 | 6.70 | 8.69 | 6.68 | 6.62 | 5.63
 e_med: 0.86 | 0.86 | 0.95 | 0.95 | 2.51 | 0.87 | 2.53 | 0.91 | 1.26 | 0.76
 t:     3216 | 3216 | 2753 | 2760 | 1559 | 1560 | 1457 | 1461 | 1460 | 2398
 s:     4012 | 4012 | 3359 | 3359 | 1826 | 1826 | 1691 | 1691 | 1691 | 2183

SZTAKI (A, 52 pairs)
 e:     0.48 | 0.47 | 0.42 | 0.41 | 0.48 | 0.47 | 0.42 | 0.41 | 0.42 | 0.41
 e_med: 0.36 | 0.36 | 0.36 | 0.36 | 0.36 | 0.36 | 0.36 | 0.36 | 0.36 | 0.36
 t:     1    | 1    | 1    | 1    | 1    | 2    | 1    | 2    | 1    | 9
 s:     23   | 23   | 23   | 23   | 23   | 22   | 22   | 22   | 22   | 22

all
 e:     2.43 | 2.24 | 2.36 | 2.24 | 2.70 | 2.25 | 2.61 | 2.28 | 2.35 | 2.02
 e_med: 1.31 | 1.20 | 1.16 | 1.08 | 1.41 | 1.12 | 1.27 | 1.00 | 1.11 | 0.87
Table 1: Accuracy of robust estimators on two-view geometric estimation. Fundamental matrix estimation (F) on the kusvod2 (24 pairs), AdelaideRMF (19 pairs) and Multi-H (4 pairs) datasets, homography estimation (H) on the homogr (16 pairs) and EVD (15 pairs) datasets, essential matrix estimation (E) on the strecha dataset (467 pairs), and affine transformation estimation (A) on the SZTAKI Earth Observation benchmark (52 pairs). In total, the testing included 597 image pairs. The dataset, the problem and the number of image pairs are shown in the block headers. The columns show the average results (1000 runs on each image pair) of the competitor methods. Columns with "+σ" show the results when the proposed σ-consensus was applied to the output of the method on their left. The mean and median geometric errors (e and e_med; in pixels) of the estimated model w.r.t. the manually selected inliers are written in the first and second rows of each block; the mean processing time (t, in milliseconds) and the required number of samples (s) are written in the third and fourth rows. The geometric error is the RMS Sampson distance for F and E, and the RMS re-projection error for H and A, calculated using the ground truth inlier set.

5 Conclusion

A robust approach, called σ-consensus, was proposed for eliminating the need for a user-defined threshold by marginalizing over a range of noise scales using a Bayesian estimator. Also, it was shown that, having a finite set of data points, the scale range can be replaced by a finite set of σs without loss of generality. The optimized model, not depending on a threshold parameter, is obtained as the weighted average using the posterior probabilities as weights. Applying σ-consensus, we proposed two methods. The first is a locally optimized RANSAC, called LO-MAGSAC, which includes σ-consensus in the local optimization of LO-RANSAC. The method is superior to the state-of-the-art in terms of geometric accuracy on publicly available real-world datasets for epipolar geometry (both F and E), homography and affine transformation estimation; it is often faster than plain RANSAC due to the early termination, but sometimes significantly slower. We therefore also proposed a post-processing step applying σ-consensus only once: to the final so-far-the-best model. This step always improved the model quality on a wide range of vision problems without noticeable deterioration in processing time, i.e. at most 1-2 milliseconds. We see no reason for not applying it after the robust estimation has finished.

Footnotes

  1. Note that the probabilistic interpretation of the termination criterion holds only for the standard cost function.
  2. http://cmp.felk.cvut.cz/data/geometry2view/
  3. http://web.eee.sztaki.hu/~dbarath/
  4. cs.adelaide.edu.au/~hwong/doku.php?id=data
  5. http://cmp.felk.cvut.cz/data/geometry2view/
  6. http://cmp.felk.cvut.cz/wbs/
  7. http://mplab.sztaki.hu/remotesensing

References

  1. Fischler, M.A., Bolles, R.C.: Random sample consensus: a paradigm for model fitting with applications to image analysis and automated cartography. Communications of the ACM (1981)
  2. Torr, P.H.S., Murray, D.W.: Outlier detection and motion segmentation. In: Optical Tools for Manufacturing and Advanced Automation, International Society for Optics and Photonics (1993)
  3. Torr, P.H.S., Zisserman, A., Maybank, S.J.: Robust detection of degenerate configurations while estimating the fundamental matrix. Computer Vision and Image Understanding (1998)
  4. Pritchett, P., Zisserman, A.: Wide baseline stereo matching. In: International Conference on Computer Vision, IEEE (1998)
  5. Matas, J., Chum, O., Urban, M., Pajdla, T.: Robust wide-baseline stereo from maximally stable extremal regions. Image and Vision Computing (2004)
  6. Mishkin, D., Matas, J., Perdoch, M.: MODS: Fast and robust method for two-view matching. Computer Vision and Image Understanding (2015)
  7. Sminchisescu, C., Metaxas, D., Dickinson, S.: Incremental model-based estimation using geometric constraints. Pattern Analysis and Machine Intelligence (2005)
  8. Ghosh, D., Kaabouch, N.: A survey on image mosaicking techniques. Journal of Visual Communication and Image Representation (2016)
  9. Zuliani, M., Kenney, C.S., Manjunath, B.S.: The multiransac algorithm and its application to detect planar homographies. In: International Conference on Image Processing, IEEE (2005)
  10. Isack, H., Boykov, Y.: Energy-based geometric multi-model fitting. International Journal of Computer Vision (2012)
  11. Pham, T.T., Chin, T.J., Schindler, K., Suter, D.: Interacting geometric priors for robust multimodel fitting. Transactions on Image Processing (2014)
  12. Myatt, D.R., Torr, P.H.S., Nasuto, S.J., Bishop, J.M., Craddock, R.: NAPSAC: High noise, high dimensional robust estimation – it’s in the bag. In: British Machine Vision Conference (2002)
  13. Chum, O., Matas, J.: Matching with PROSAC-progressive sample consensus. In: Computer Vision and Pattern Recognition, IEEE (2005)
  14. Fragoso, V., Sen, P., Rodriguez, S., Turk, M.: EVSAC: accelerating hypotheses generation by modeling matching scores with extreme value theory. In: International Conference on Computer Vision. (2013)
  15. Torr, P.H.S., Zisserman, A.: MLESAC: A new robust estimator with application to estimating image geometry. Computer Vision and Image Understanding (2000)
  16. Torr, P.H.S.: Bayesian model estimation and selection for epipolar geometry and generic manifold fitting. International Journal of Computer Vision 50(1) (2002) 35–61
  17. Stewart, C.V.: Minpran: A new robust estimator for computer vision. IEEE Transactions on Pattern Analysis and Machine Intelligence 17(10) (1995) 925–938
  18. Moisan, L., Moulon, P., Monasse, P.: Automatic homographic registration of a pair of images, with a contrario elimination of outliers. Image Processing On Line 2 (2012) 56–73
  19. Rais, M., Facciolo, G., Meinhardt-Llopis, E., Morel, J., Buades, A., Coll, B.: Accurate motion estimation through random sample aggregated consensus. CoRR abs/1701.05268 (2017)
  20. Chum, O., Matas, J., Kittler, J.: Locally optimized RANSAC. In: Joint Pattern Recognition Symposium, Springer (2003)
  21. Lebeda, K., Matas, J., Chum, O.: Fixing the locally optimized RANSAC. In: British Machine Vision Conference, Citeseer (2012)
  22. Hartley, R., Zisserman, A.: Multiple view geometry in computer vision. Cambridge university press (2003)
  23. Hartley, R.I.: In defense of the eight-point algorithm. Transactions on Pattern Analysis and Machine Intelligence (1997)
  24. Chum, O., Werner, T., Matas, J.: Epipolar geometry estimation via RANSAC benefits from the oriented epipolar constraint. In: International Conference on Pattern Recognition. (2004)
  25. Strecha, C., Fransens, R., Van Gool, L.: Wide-baseline stereo from multiple views: a probabilistic account. In: Conference on Computer Vision and Pattern Recognition, IEEE (2004)
  26. Lowe, D.G.: Object recognition from local scale-invariant features. In: International Conference on Computer vision, IEEE (1999)
  27. Benedek, C., Szirányi, T.: Change detection in optical aerial images by a multilayer conditional mixed markov model. Transactions on Geoscience and Remote Sensing (2009)