An Efficient Algebraic Solution to the Perspective-Three-Point Problem
Abstract
In this work, we present an algebraic solution to the classical perspective-3-point (P3P) problem for determining the position and attitude of a camera from observations of three known reference points. In contrast to previous approaches, we first directly determine the camera’s attitude by employing the corresponding geometric constraints to formulate a system of trigonometric equations. This is then efficiently solved, following an algebraic approach, to determine the unknown rotation matrix and subsequently the camera’s position. As compared to recent alternatives, our method avoids computing unnecessary (and potentially numerically unstable) intermediate results, and thus achieves higher numerical accuracy and robustness at a lower computational cost. These benefits are validated through extensive Monte-Carlo simulations for both nominal and close-to-singular geometric configurations.
Tong Ke 
University of Minnesota 
Minneapolis, MN 55455 
kexxx069@cs.umn.edu 
Stergios Roumeliotis 
University of Minnesota 
Minneapolis, MN 55455 
stergios@cs.umn.edu 
The Perspective-n-Point (PnP) is the problem of determining the 3D position and orientation (pose) of a camera from observations of known point features. The PnP is typically formulated and solved linearly by employing lifting (e.g., [?]), or as a nonlinear least-squares problem minimized iteratively (e.g., [?]) or directly (e.g., [?]). The minimal case of the PnP (for n = 3) is often used in practice, in conjunction with RANSAC, for removing outliers [?].
The first solution to the P3P problem was given by Grunert [?] in 1841. Since then, several methods have been introduced, some of which [?, ?, ?, ?, ?, ?] were reviewed and compared, in terms of numerical accuracy, by Haralick et al. [?]. Common to these algorithms is that they employ the law of cosines to formulate a system of three quadratic equations in the features’ distances from the camera. They differ, however, in the elimination process followed for arriving at a univariate polynomial. Later on, Quan and Lan [?] and more recently Gao et al. [?] employed the same formulation but instead used the Sylvester resultant [?] and Wu-Ritt’s zero-decomposition method [?], respectively, to solve the resulting system of equations, and, in the case of [?], to determine the number of real solutions. Regardless of the approach followed, once the features’ distances have been computed, finding the camera’s orientation, expressed as a unit quaternion [?] or a rotation matrix [?], often requires computing the eigenvectors of a matrix (e.g., [?]) or performing singular value decomposition (SVD) of a matrix (e.g., [?]), respectively, both of which are time-consuming. Furthermore, numerical error propagation from the computed distances to the rotation matrix significantly reduces the accuracy of the computed pose estimates.
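For concreteness, the law-of-cosines formulation shared by these classical methods can be written as below. The symbols $d_i$ (distance of feature $i$ from the camera), $\mathbf{b}_i$ (unit bearing vector) and $\mathbf{p}_i$ (feature position) are notation assumed here for illustration.

```latex
d_i^2 + d_j^2 - 2\, d_i d_j\, \mathbf{b}_i^T \mathbf{b}_j
  = \|\mathbf{p}_i - \mathbf{p}_j\|^2,
\qquad (i,j) \in \{(1,2),\,(1,3),\,(2,3)\}
```

These are three quadratic equations in the three unknown distances; the methods reviewed above differ in how they reduce this system to a single univariate polynomial.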
To the best of our knowledge, the first method that does not employ the law of cosines in its P3P problem formulation is that of Kneip et al. [?], and later on that of Masselli and Zell [?]. (Nistér and Stewénius [?] also follow a geometric approach for solving the generalized P3P, resulting in an octic univariate polynomial whose odd monomials vanish in the case of the central P3P.) Specifically, [?] and [?] follow a geometric approach for avoiding computing the features’ distances and instead directly solve for the camera’s pose. In both cases, however, several intermediate terms (e.g., tangents and cotangents of certain angles) need to be computed, which negatively affect the speed and numerical precision of the resulting algorithms.
Similar to [?] and [?], our proposed approach does not require first computing the features’ distances. In contrast, however, in our derivation we first eliminate the camera’s position and the features’ distances to obtain a system of three equations involving only the camera’s orientation. Then, we follow an algebraic process for successively eliminating two of the three unknown rotational degrees of freedom and arriving at a quartic polynomial. Our algorithm (summarized in Alg. 1) requires fewer operations and involves simpler and numerically more stable expressions, as compared to either [?] or [?], and thus performs better in terms of efficiency, accuracy, and robustness. Specifically, the main advantages of our approach are:

Our algorithm’s implementation takes about 40% of the time required by the current state of the art [?]. (Although Masselli and Zell [?] claim that their algorithm runs faster than Kneip et al.’s [?], our results in Section 3 show the opposite to be true, by a small margin. The reason we arrive at a different conclusion is that our simulation randomly generates a new geometric configuration for each run, while Masselli and Zell employ only one configuration during their entire simulation, in which they save time due to caching.)

Our method achieves better accuracy than [?, ?] under nominal conditions. Moreover, we are able to further improve the numerical precision by applying root polishing to the solutions of the quartic polynomial while remaining faster than [?, ?].

Our algorithm is more robust than [?, ?] when considering close-to-singular configurations (the three points are almost collinear or very close to each other).
The remainder of this paper is structured as follows. Section 2 presents the definition of the P3P problem, as well as our derivations for estimating first the orientation and then the position of the camera. In Section 3, we assess the performance of our approach against [?] and [?] in simulation for both nominal and singular configurations. Finally, we conclude our work in Section 4.
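Given the positions, ${}^G\mathbf{p}_i$, of three known features $f_i$, $i = 1, 2, 3$, with respect to a reference frame $\{G\}$, and the corresponding unit-vector bearing measurements, ${}^C\mathbf{b}_i$, $i = 1, 2, 3$, our objective is to estimate the position, ${}^G\mathbf{p}_C$, and orientation, i.e., the rotation matrix ${}^G_C\mathbf{C}$, of the camera $\{C\}$.
From the geometry of the problem (see Fig. 1), we have (for $i = 1, 2, 3$):
(1)  ${}^G\mathbf{p}_i = {}^G\mathbf{p}_C + d_i\, {}^G_C\mathbf{C}\, {}^C\mathbf{b}_i$
where $d_i \triangleq \|{}^G\mathbf{p}_C - {}^G\mathbf{p}_i\|$ is the distance between the camera and the feature $f_i$.
In order to eliminate the unknown camera position, ${}^G\mathbf{p}_C$, and feature distances, $d_i$, $i = 1, 2, 3$, we subtract pairwise the three equations corresponding to (1) for $i$ and $j$, and project them on the vector ${}^G_C\mathbf{C}\,({}^C\mathbf{b}_i \times {}^C\mathbf{b}_j)$ to yield the following system of 3 equations in the unknown rotation ${}^G_C\mathbf{C}$:
(2)  $({}^G\mathbf{p}_1 - {}^G\mathbf{p}_2)^T\, {}^G_C\mathbf{C}\, ({}^C\mathbf{b}_1 \times {}^C\mathbf{b}_2) = 0$
(3)  $({}^G\mathbf{p}_1 - {}^G\mathbf{p}_3)^T\, {}^G_C\mathbf{C}\, ({}^C\mathbf{b}_1 \times {}^C\mathbf{b}_3) = 0$
(4)  $({}^G\mathbf{p}_2 - {}^G\mathbf{p}_3)^T\, {}^G_C\mathbf{C}\, ({}^C\mathbf{b}_2 \times {}^C\mathbf{b}_3) = 0$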
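The orientation-only constraints obtained by this elimination can be checked numerically. The sketch below is illustrative rather than the authors' code: it assumes the measurement model "feature position = camera position + distance × rotation × bearing", builds a hypothetical ground-truth pose and three hypothetical features, and evaluates the three pairwise residuals, which should vanish up to round-off.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def sub(a, b):
    return [x - y for x, y in zip(a, b)]

def transpose(M):
    return [list(col) for col in zip(*M)]

def matvec(M, v):
    return [dot(row, v) for row in M]

def matmul(A, B):
    Bt = transpose(B)
    return [[dot(r, c) for c in Bt] for r in A]

def rot_x(a):
    c, s = math.cos(a), math.sin(a)
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]

def rot_z(a):
    c, s = math.cos(a), math.sin(a)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

# Hypothetical ground truth (values chosen only for illustration).
C_true = matmul(rot_z(0.4), rot_x(-0.7))   # camera-to-reference rotation
p_C = [0.3, -0.2, 1.5]                     # camera position
features = [[1.0, 0.0, 5.0], [-1.0, 0.5, 6.0], [0.2, -1.0, 4.0]]

# Noise-free bearings: b_i = C^T (p_i - p_C) / d_i
bearings = []
for p in features:
    ray = sub(p, p_C)
    d = math.sqrt(dot(ray, ray))
    bearings.append(matvec(transpose(C_true), [x / d for x in ray]))

# Residuals (p_i - p_j)^T C (b_i x b_j) for the three feature pairs.
pairs = [(0, 1), (0, 2), (1, 2)]
residuals = [
    dot(sub(features[i], features[j]),
        matvec(C_true, cross(bearings[i], bearings[j])))
    for i, j in pairs
]
print(residuals)
```

Neither the camera position nor the distances appear in the residual expressions, which is what allows the orientation to be solved for first.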
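Next, and in order to compute one of the 3 unknown degrees of rotational freedom, we introduce the following factorization of ${}^G_C\mathbf{C}$:
(5)  ${}^G_C\mathbf{C} = \mathbf{C}(\mathbf{k}_1, \theta_1)\, \mathbf{C}(\mathbf{k}_2, \theta_2)\, \mathbf{C}(\mathbf{k}_3, \theta_3)$
where $\mathbf{C}(\mathbf{k}, \theta)$ denotes the rotation matrix describing the rotation about the unit vector, $\mathbf{k}$, by an angle $\theta$ (note that in the ensuing derivations, all rotation angles are defined using the left-hand rule), and, in terms of the feature positions ${}^G\mathbf{p}_i$ and the bearing measurements ${}^C\mathbf{b}_i$,
(6)  $\mathbf{k}_1 \triangleq \dfrac{{}^G\mathbf{p}_1 - {}^G\mathbf{p}_2}{\|{}^G\mathbf{p}_1 - {}^G\mathbf{p}_2\|}, \quad \mathbf{k}_3 \triangleq \dfrac{{}^C\mathbf{b}_1 \times {}^C\mathbf{b}_2}{\|{}^C\mathbf{b}_1 \times {}^C\mathbf{b}_2\|}, \quad \mathbf{k}_2 \triangleq \dfrac{\mathbf{k}_1 \times \mathbf{k}_3}{\|\mathbf{k}_1 \times \mathbf{k}_3\|}$
Substituting (5) in (2) yields a scalar equation in the unknown $\theta_2$:
(7)  $\mathbf{k}_1^T\, \mathbf{C}(\mathbf{k}_1, \theta_1)\, \mathbf{C}(\mathbf{k}_2, \theta_2)\, \mathbf{C}(\mathbf{k}_3, \theta_3)\, \mathbf{k}_3 = 0 \;\Rightarrow\; \mathbf{k}_1^T\, \mathbf{C}(\mathbf{k}_2, \theta_2)\, \mathbf{k}_3 = 0$
which we solve by employing Rodrigues’ rotation formula [?]:
(8)  $\mathbf{C}(\mathbf{k}, \theta) = \cos\theta\, \mathbf{I} - \sin\theta\, \lfloor\mathbf{k}\rfloor + (1 - \cos\theta)\, \mathbf{k}\mathbf{k}^T$
where $\lfloor\mathbf{k}\rfloor$ denotes the skew-symmetric matrix corresponding to $\mathbf{k}$ such that $\lfloor\mathbf{k}\rfloor \mathbf{a} = \mathbf{k} \times \mathbf{a}$, $\forall\, \mathbf{k}, \mathbf{a} \in \mathbb{R}^3$ (note also that if $\mathbf{k}$ is a unit vector, then $\lfloor\mathbf{k}\rfloor^2 = \mathbf{k}\mathbf{k}^T - \mathbf{I}$, while for two vectors $\mathbf{a}$, $\mathbf{b}$, $\lfloor\mathbf{a}\rfloor \lfloor\mathbf{b}\rfloor = \mathbf{b}\mathbf{a}^T - (\mathbf{a}^T\mathbf{b})\, \mathbf{I}$; lastly, it is easy to show that $\lfloor \lfloor\mathbf{a}\rfloor \mathbf{b} \rfloor = \mathbf{b}\mathbf{a}^T - \mathbf{a}\mathbf{b}^T$), to get
(9)  $\theta_2 = \arccos(\mathbf{k}_1^T \mathbf{k}_3) \pm \dfrac{\pi}{2}$
Note that we only need to consider one of these two solutions [in our case, we select $\theta_2 = \arccos(\mathbf{k}_1^T \mathbf{k}_3) - \pi/2$; see Fig. 2], since the other one will result in the same ${}^G_C\mathbf{C}$ (see the Appendix for a formal proof).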
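The closed-form angle obtained in this step can be sanity-checked with a few lines of code. This is a minimal sketch under stated assumptions, not the authors' implementation: the three rotation axes are taken as k1 ∝ p1 − p2, k3 ∝ b1 × b2 and k2 ∝ k1 × k3, Rodrigues' formula is used in its left-hand-rule form (cos θ I − sin θ ⌊k⌋ + (1 − cos θ) k kᵀ), and the feature positions and bearings are hypothetical values.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def sub(a, b):
    return [x - y for x, y in zip(a, b)]

def unit(v):
    n = math.sqrt(dot(v, v))
    return [x / n for x in v]

def rotate_lh(k, theta, v):
    """Apply C(k, theta) to v using the left-hand-rule Rodrigues formula."""
    c, s = math.cos(theta), math.sin(theta)
    kxv = cross(k, v)
    kv = dot(k, v)
    return [c * v[i] - s * kxv[i] + (1.0 - c) * kv * k[i] for i in range(3)]

# Hypothetical inputs: two feature positions and two unit bearing vectors.
p1, p2 = [1.0, 0.0, 5.0], [-1.0, 0.5, 6.0]
b1, b2 = unit([0.2, 0.1, 1.0]), unit([-0.3, 0.2, 1.0])

# Assumed axis definitions (see the lead-in).
k1 = unit(sub(p1, p2))
k3 = unit(cross(b1, b2))
k2 = unit(cross(k1, k3))

# theta_2 = arccos(k1 . k3) -/+ pi/2; the argument is clamped against round-off.
phi = math.acos(max(-1.0, min(1.0, dot(k1, k3))))
theta2_candidates = [phi - math.pi / 2.0, phi + math.pi / 2.0]

# Both candidates should satisfy k1^T C(k2, theta_2) k3 = 0.
residuals = [dot(k1, rotate_lh(k2, t, k3)) for t in theta2_candidates]
print(theta2_candidates, residuals)
```

Because k2 is orthogonal to both k1 and k3, the constraint collapses to cos(θ2 − φ) = 0 with φ the angle between k1 and k3, which is why both candidates give a zero residual.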
In what follows, we describe the process for eliminating from (3) and (4), and eventually arriving at a quartic polynomial involving a trigonometric function of .
To do so, we once again substitute in (3) and (4) the factorization of defined in (5) to get (for ):
(10) 
where
(11) 
and employ the following property of rotation matrices
to rewrite (10) in a simpler form as
(12) 
where
(13) 
The last equality in (13) is geometrically depicted in the accompanying figure and algebraically derived in the Appendix. Analogously, it is straightforward to show that
Next, by employing Rodrigues’ rotation formula [see (8)], for expressing the product of a rotation matrix and a vector as a linear function of the unknown , i.e.,
(14) 
in (12) yields (for ):
(15) 
Expanding (15) and rearranging terms yields (for )
(16) 
Notice that the term appears three times in (16), and
(17) 
This motivates us to rewrite (12) as (for ):
(18) 
where
(19) 
To simplify the equation analogous to (16) that will result from (18) [instead of (12)], we seek to find a (not necessarily unique) such that , and hence, [see (17)], i.e.,
(20)  
(21) 
where
(22) 
and thus [from (19) using (?)]
(23) 
Now, we can expand (18) using (14) to get an equation analogous to (16):
(24) 
Substituting [see (17)] in (24) and renaming terms, yields (for ):
(25) 
where (the simplified expressions for the following terms, shown after the second equality, require lengthy algebraic derivations, which we omit due to space limitations)
For , (25) results in the following system:
(26) 
Note that since , we can further simplify (26) by introducing , where
(27) 
Replacing by in (26), we have
(28) 
where
(29)  
(30)  
(31)  
(32)  
(33)  
(34)  
(35)  
(36) 
From (28), we have
(37) 
Computing the norm of both sides of (37) results in
which is a 4th-order polynomial in that can be compactly written as:
(38) 
with
(39)  
(40)  
(41)  
(42)  
(43)  
(44)  
(45)  
(46)  
(47)  
(48)  
(49)  
(50) 
We compute the roots of (38) in closed form to find . Similarly to [?] and [?], we employ Ferrari’s method [?] to obtain the resolvent cubic of (38), which is subsequently solved by Cardano’s formula [?]. Once the (up to) four real solutions of (38) have been determined, an optional step is to apply root polishing following Newton’s method, which improves accuracy for a minimal increase in the processing cost (see Section 3). Regardless, for each solution of , we will have two possible solutions for , i.e.,
(51) 
which, in general, will result in two different solutions for . Note though that only one of them is valid if we use the fact that (see the Appendix).
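The optional root-polishing step mentioned above can be sketched as follows. This is an illustrative implementation of Newton's method on a generic quartic, not the authors' code; the function names and the example polynomial (built from hypothetical roots rather than from the coefficients of (38)) are assumptions made for the example.

```python
def poly_from_roots(roots):
    """Coefficients (highest power first) of the monic polynomial with these roots."""
    coeffs = [1.0]
    for r in roots:
        # Multiply the current polynomial by (x - r).
        coeffs = [a - r * b for a, b in zip(coeffs + [0.0], [0.0] + coeffs)]
    return coeffs

def newton_polish(coeffs, x, iterations=2):
    """Refine an approximate root x with a fixed number of Newton iterations."""
    for _ in range(iterations):
        value, slope = 0.0, 0.0
        for a in coeffs:
            # Horner's scheme, evaluating p(x) and p'(x) in one pass.
            slope = slope * x + value
            value = value * x + a
        if slope == 0.0:
            break  # stationary point: no Newton update is possible
        x -= value / slope
    return x

# Hypothetical quartic with a known root at 0.5, standing in for (38).
quartic = poly_from_roots([0.5, -0.25, 0.9, -0.7])
rough_root = 0.5003                      # e.g. a closed-form root with round-off error
polished_root = newton_polish(quartic, rough_root)
print(rough_root, polished_root)
```

Each iteration costs only one polynomial and one derivative evaluation, which is consistent with the small overhead reported for root polishing in the experiments.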
Next, for each pair of , we compute and from (37), which can be written as
(52) 
Lastly, instead of first computing from (19) and from (27) to find using (5), we hereafter describe a faster method for recovering . Specifically, from (5), (12) and (18), we have
(53) 
Since is perpendicular to , we can construct a rotation matrix such that
and hence
(54) 
where
Substituting (54) in (53), we have
(55) 
where
The advantages of (55) are: (i) The matrix product can be computed analytically; (ii) are invariant to the (up to) four possible solutions and thus, we only need to construct them once.
Substituting in (1) the expression for from (62) and rearranging terms, yields
(56) 
Note that we only use (1) for to compute from . Alternatively, we could find the position using a least-squares approach based on (1) for (see the Appendix), if we care more for accuracy than speed. Lastly, the proposed P3P solution is summarized in Alg. 1.
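One way to realize the least-squares alternative mentioned above is sketched below. It is an assumption-laden illustration, not necessarily the formulation in the paper's appendix: given a recovered rotation C, each feature position p_i and unit bearing b_i constrain the camera position to lie on the line through p_i with direction C b_i, and the position minimizing the summed squared distances to the three lines solves a 3×3 linear system. All names and numerical values are hypothetical.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def det3(A):
    return (A[0][0] * (A[1][1] * A[2][2] - A[1][2] * A[2][1])
            - A[0][1] * (A[1][0] * A[2][2] - A[1][2] * A[2][0])
            + A[0][2] * (A[1][0] * A[2][1] - A[1][1] * A[2][0]))

def solve3(M, rhs):
    """Solve the 3x3 system M x = rhs with Cramer's rule."""
    D = det3(M)
    sol = []
    for j in range(3):
        Mj = [[rhs[i] if c == j else M[i][c] for c in range(3)] for i in range(3)]
        sol.append(det3(Mj) / D)
    return sol

def recover_position(C, features, bearings):
    """Least-squares camera position from sum_i P_i p_C = sum_i P_i p_i,
    where P_i = I - v_i v_i^T and v_i = C b_i is the ray direction."""
    M = [[0.0] * 3 for _ in range(3)]
    rhs = [0.0] * 3
    for p, b in zip(features, bearings):
        v = [dot(row, b) for row in C]
        P = [[(1.0 if r == c else 0.0) - v[r] * v[c] for c in range(3)]
             for r in range(3)]
        Pp = [dot(row, p) for row in P]
        for r in range(3):
            rhs[r] += Pp[r]
            for c in range(3):
                M[r][c] += P[r][c]
    return solve3(M, rhs)

# Hypothetical noise-free test case.
a = 0.4
C_true = [[math.cos(a), -math.sin(a), 0.0],
          [math.sin(a), math.cos(a), 0.0],
          [0.0, 0.0, 1.0]]
p_C_true = [0.3, -0.2, 1.5]
features = [[1.0, 0.0, 5.0], [-1.0, 0.5, 6.0], [0.2, -1.0, 4.0]]
bearings = []
for p in features:
    ray = [x - y for x, y in zip(p, p_C_true)]
    d = math.sqrt(dot(ray, ray))
    unit_ray = [x / d for x in ray]
    # b_i = C^T (p_i - p_C) / d_i
    bearings.append([sum(C_true[r][c] * unit_ray[r] for r in range(3))
                     for c in range(3)])

p_C_est = recover_position(C_true, features, bearings)
print(p_C_est)
```

Using all three features averages out errors in the recovered rotation, at the cost of a small linear solve, which matches the accuracy-versus-speed trade-off noted in the text.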
Our algorithm is implemented in C++ (our code is submitted along with the paper as supplemental material) using the same linear algebra library, TooN [?], as [?]. We employ simulation data to test our code and compare it to the solutions of [?] and [?]. For each single P3P problem, we randomly generate three 3D landmarks, which are uniformly distributed in a cuboid centered around the origin. The position of the camera is , and its orientation is .
We generate simulation data without adding any noise or rounding error to the bearing measurements, and run all three algorithms on 50,000 randomly generated configurations to assess their numerical accuracy. Note that the position error is computed as the norm of the difference between the estimate and the ground truth of the camera position. As for the orientation error, we compute the rotation matrix that transforms the estimated orientation to the true one, convert it to the equivalent axis-angle representation, and use the absolute value of the angle as the error. Since there are multiple solutions to a P3P problem, we compute the errors for all of them and pick the smallest one (i.e., the root closest to the true solution).
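The error metrics described above can be sketched as follows. The function names are hypothetical, and the rule used to pick the "closest" candidate (smallest position error, with orientation error as a tie-breaker) is an assumption, since the text does not spell out how the two errors are combined.

```python
import math

def orientation_error(C_est, C_true):
    """Angle of the rotation taking the estimated orientation to the true one."""
    # trace(C_true * C_est^T) equals the sum of element-wise products.
    tr = sum(C_true[i][j] * C_est[i][j] for i in range(3) for j in range(3))
    cos_angle = max(-1.0, min(1.0, (tr - 1.0) / 2.0))  # clamp against round-off
    return abs(math.acos(cos_angle))

def position_error(p_est, p_true):
    """Euclidean norm of the difference between estimate and ground truth."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p_est, p_true)))

def best_errors(candidates, C_true, p_true):
    """Return (position error, orientation error) of the closest candidate pose.

    candidates is a list of (C_est, p_est) pairs, one per P3P solution.
    """
    scored = [(position_error(p, p_true), orientation_error(C, C_true))
              for C, p in candidates]
    return min(scored)

def rot_z(a):
    c, s = math.cos(a), math.sin(a)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

# Hypothetical example: two candidate poses, identity orientation as ground truth.
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
candidates = [(rot_z(0.1), [0.0, 0.0, 1.001]),
              (rot_z(1.0), [0.5, 0.0, 1.0])]
pos_err, ori_err = best_errors(candidates, identity, [0.0, 0.0, 1.0])
print(pos_err, ori_err)
```

The trace identity avoids forming the relative rotation matrix explicitly; the clamp guards against arguments slightly outside [-1, 1] when the estimate is nearly exact.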
The distributions and the means of the position and orientation errors are depicted in the accompanying figures and in Table 1. As evident, we get similar results to those presented in [?] for Kneip et al.’s [?] and Masselli and Zell’s methods [?], while our approach outperforms both of them by two orders of magnitude in terms of accuracy. This can be attributed to the fact that our algorithm requires fewer operations and thus exhibits lower numerical-error propagation.
Furthermore, and as shown in the results of Table 1, we can further improve the numerical precision by applying root polishing. Typically, two iterations of Newton’s method [?] lead to significantly better results, especially for the orientation, while taking only 0.01 s per iteration, or about 4% of the total processing time.
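Table 1. Nominal case: mean position and orientation errors.
Method | Position error | Orientation error
Kneip’s method | 1.18E-05 | 1.02E-05
Masselli’s method | 1.84E-08 | 4.89E-10
Proposed method | 1.66E-10 | 5.30E-12
Proposed method + root polishing | 5.07E-11 | 1.53E-13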
We use a test program that solves 100,000 randomly generated P3P problems and calculates the total execution time to evaluate the computational cost of the three algorithms considered. We run it on a 2.0 GHz 4-core laptop, and the results show that our code takes 0.54 s on average (0.52 s without root polishing), while [?] and [?] take 1.3 s and 1.5 s, respectively. This corresponds to a 2.5× speed-up (or 40% of the time of [?]). Note also that, in contrast to what is reported in [?], Masselli’s method is actually slower than Kneip’s. As mentioned earlier, Masselli’s results in [?] are based on 1,000 runs of the same features’ configuration, and take advantage of data caching to outperform Kneip.
There are two typical singular cases that lead to infinitely many solutions in the P3P problem:

Singular case 1: The 3 landmarks are collinear.

Singular case 2: Any two of the 3 bearing measurements coincide.
In practice, it is almost impossible for these conditions to hold exactly, but we may still have numerical issues when the geometric configuration is close to these cases. To test the robustness of the three algorithms considered, we generate simulation data corresponding to small perturbations (uniformly distributed within ) of the features’ positions when in singular configurations. The errors are defined as in the preceding accuracy evaluation, and we report their medians to assess the robustness of the three methods. For fairness, we do not apply root polishing to our code here. According to the results shown in the accompanying figures and in Tables 2 and 3, our method achieves the best accuracy in these two close-to-singular cases. The reason is that we do not compute any quantities that may suffer from numerical issues, such as cotangent and tangent in [?] and [?], respectively.
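Table 2. Singular case 1: median position and orientation errors.
Method | Position error | Orientation error
Kneip’s method | 1.42E-14 | 1.34E-14
Masselli’s method | 7.13E-15 | 6.15E-15
Proposed method | 5.16E-15 | 3.73E-15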
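Table 3. Singular case 2: median position and orientation errors.
Method | Position error | Orientation error
Kneip’s method | 8.10E-14 | 8.85E-14
Masselli’s method | 7.24E-14 | 6.07E-14
Proposed method | 6.73E-14 | 1.75E-14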
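The close-to-singular test configurations described above could be generated along the following lines for singular case 1 (collinear landmarks). This is a hedged sketch: the base configuration, the perturbation bound `eps`, the seed, and the collinearity measure are all hypothetical choices, since the source does not state the values used.

```python
import math
import random

# Exactly collinear landmarks (hypothetical base configuration).
COLLINEAR_BASE = [[-1.0, 0.0, 5.0], [0.0, 0.0, 5.0], [1.0, 0.0, 5.0]]

def collinearity(p1, p2, p3):
    """Norm of (p2 - p1) x (p3 - p1); zero exactly when the points are collinear."""
    a = [x - y for x, y in zip(p2, p1)]
    b = [x - y for x, y in zip(p3, p1)]
    c = [a[1] * b[2] - a[2] * b[1],
         a[2] * b[0] - a[0] * b[2],
         a[0] * b[1] - a[1] * b[0]]
    return math.sqrt(sum(x * x for x in c))

def near_collinear_features(eps, rng):
    """Perturb each coordinate of the collinear base by a uniform sample in [-eps, eps]."""
    return [[x + rng.uniform(-eps, eps) for x in p] for p in COLLINEAR_BASE]

rng = random.Random(0)   # fixed seed so the sketch is reproducible
eps = 1e-6               # hypothetical perturbation bound
features = near_collinear_features(eps, rng)
measure = collinearity(*features)
print(features, measure)
```

A P3P solver under test would then be run on bearings computed from such features, with the cross-product norm indicating how close each configuration is to the exact singularity.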
In this paper, we have introduced an algebraic approach for computing the solutions of the P3P problem in closed form. Similarly to [?] and [?], our algorithm does not solve for the distances first, and hence reduces numerical-error propagation. In contrast to them, however, it does not involve numerically unstable functions (e.g., tangent or cotangent) and has simpler expressions than the two recent alternative methods [?, ?], and thus it outperforms them in terms of speed, accuracy, and robustness to close-to-singular cases.
As part of our ongoing work, we are currently extending our approach to also address the case of the generalized (non-central camera) P3P [?].
First, note that is a unit vector since is perpendicular to . Also, from (13) and (?) we have
(57) 
Then, we can prove by showing that their inner product is equal to 1: