Complete Closed-Form and Accurate Solution to Pose Estimation From 3D Correspondences

Computing the pose from 3D data acquired in two different frames is of high importance for several robotic tasks like odometry, SLAM and place recognition. The pose is generally obtained by solving a least-squares problem given points-to-points, points-to-planes or points to lines correspondences. The non-linear least-squares problem can be solved by iterative optimization or, more efficiently, in closed-form by using solvers of polynomial systems. In this letter, a complete and accurate closed-form solution for a weighted least-squares problem is proposed. Adding weights for each correspondence allow to increase robustness to outliers. Contrary to existing methods, the proposed approach is complete since it is able to solve the problem in any non-degenerate case and it is accurate since it is guaranteed to find the global optimal estimate of the weighted least-squares problem. Simulations and experiments on real data demonstrate the superior accuracy and robustness of the proposed algorithm compared to previous approaches.


I. INTRODUCTION
T HE use of 3D sensors (lidars [1], [2], stereo cameras [3], [4], RGB-D cameras [5], [6],...) has become mainstream in autonomous robotics applications for their ability to accurately localize the robot and measure the 3D structure of the robot's environment. These sensors are able to produce a set of 3D points at each acquisition. Therefore, the registration of two sets of 3D points is of high importance for several tasks like odometry [7], SLAM [8], place recognition [9]. The problem is generally set as the solution of a non-linear optimization problem (generally a least-squares) after matching points to points [10], [11], [12], points to planes [13], [14] or points to lines [15], [16].
A unified solution for the points-to-points, points-to-line and points-to-plane registration problem was firstly proposed in [17]. The authors proposed to use an iterative branch and bound optimization solver [18] that is able to find global optimum of non-convex problems. Such an algorithm is quite slow and usually reach a suboptimal solution i.e. a solution that is close to the global minimum within a given precision. To overcome this problem and speed up the computation time [19] and [20] use a Lagrangian dual relaxation solver. These methods can find the global minimum but are iterative [21] and therefore time consuming. Much faster algorithm can be obtained by using solvers   [22], [23], [24] that solve the problem in "closed-form" since they make use of simple linear algebra operations only. Most solvers eliminate the translation from the equations since it appears linearly (quadratically in the least-squares problem), obtaining a simpler optimization problem depending from the rotation only (3 d.o.f.). The non-linearities comes from the rotation matrix since it must be an orthonormal matrix. Therefore, the parametrisation of the rotation matrix is crucial to transform the least-square optimisation into a polynomial problem. The most common rotation parametrisation is obtained by using unit quaternions [10], [23]. This representation is complete but not minimal: all possible rotations can be represented using 4 variables. Another possible representation is the Cayley-Gibbs-Rodrigues (C-G-R) parameters [22], [24]. This representation is minimal but not complete: only 3 variables are necessary but it is impossible to represent rotations with an angle of π around an arbitrary axis. Even if minimal, the C-G-R parametrisation leads to a rational representation of the rotation. Therefore, authors in [24] were finally forced to introduce 4 intermediate variables to transform the rational problem into a polynomial problem. One of the intermediate variables depends on the three others but it was considered as a free variable (therefore ignoring the constraint). Experimental results in [24] shows better accuracy than in [23] but using an additional Newton refinement of the solver solution.
In order to understand which is the best combination between solvers and rotation parametrisations in order to have the best accuracy a different approach is proposed in this paper. A quaternion parametrisation is used like in [10], [23] but, contrarily to [23] the unit quaternion constraint is properly imposed through Lagrangian optimization. Moreover, instead of the Gröebner-basis solver, a solver based on the u-resultant MacAulay method [25] is proposed. The method is improved in this paper in order to avoid degenerate configurations that may occur with the u-resultant. The u-resultant MacAulay method was successfully used by [22] to solve the perspective-point pose determination problem (Pnp) but jointly with a C-G-R parametrisation of the rotation. Therefore, the key contribution of this paper is the combination of a robust u-resultant solver with the quaternion parametrisation of the rotation, that was not considered in previous works as illustrated in Table I. II. THEORETICAL BACKGROUND Let m c be a 3D point in a current frame F c . The point belongs to a line in the same frame with origin m (any point on the line) and unitary direction vector d if and only if: And it belongs to a plane in the same frame with origin m (any point on the plane) and with unitary normal vector n if and only if: We consider now the same point but in a different reference frame F r . Let c T r ∈ SE(3) the homogeneous transformation matrix between the two frames: where R ∈ SO(3) is the 3 × 3 rotation matrix and t ∈ R 3 is the 3 × 1 translation vector. The current point m c is obtained transforming the reference point m r as follows: Plugging (3) into (1) we get: And plugging (3) into (2) we get: Stacking the entries r ij of the rotation matrix R, rows by rows, into a 9 × 1 vector: where m is the following 3 × 9 matrix: In order to get a similar notation for all cases (points, lines and planes), (3), (4) and (5) can be rewritten as follows: For point-to-point correspondences the point m is set to the current points m = m c in (7). For point-to-line correspondences m is an arbitrary point on the current line, and from (4) we get: For point-to-plane correspondences m is an arbitrary point on the current plane, and from (5) we get: Our objective is to estimate the pose (rotation vector r and translation vector t) given several measured points and corresponding planes. Note that even if the equations are linear in the vector r, the problem is non-linear.

A. Rotation Parametrisation
The rotation matrix R ∈ SO(3) is a 3 × 3 matrix but it has 3 degrees of freedom only since it is orthonormal, R R = I, that means 6 constraints on the entries of the rotation: Instead to represent the rotation by 9 entries subject to 6 constraints we can use more compact representations.
1) Cayley-Gibbs-Rodrigues Parametrisation: We can use the the Cayley-Gibbs vector parametrisation for R : where g = u tan θ 2 . This parametrisation leads to a rational form and is not complete since it cannot represent rotations when θ = π (indeed tan θ 2 → ∞ when |θ| → π). 2) Quaternion Parametrisation: Another rotation parametrisation is the quaternion q = [q r , q i ] = [w; x; y; z] for which we have: We can stack all the rows of the rotation matrix in a (9 × 1) vector r(q) = [R] ∨ that is quadratic in the 4 unknowns x, y, z, w. This non minimal parametrisation has a 2-fold symmetry since r(q) = r(−q).

B. The Weighted Least Squares Problem
Considering n m point to point correspondences, n l point to line correspondences and n p point to plane correspondences the weighted least squares optimisation problem can be written as follows: where w m , w l and w p are weights. The squared distance point to point is given by: the squared distance point to line is given by: and the squared distance point to plane is given by: Introducing a (3 × 3) symmetric weighting matrix W we can define all possible squared distances with an unique equation: where the choice of the entries of the semi-definite positive weighting matrix W determines a specific distance : Therefore, the optimisation problem mixing points, lines and plane can be generally written as follows: This optimisation problem can be solved iteratively starting from an initial guess of the global optimum, or in a closed form as proposed in the next section. The weights can be used to implement robust M-estimators techniques.

III. LEAST SQUARE CLOSED-FORM SOLUTION
The first step to solve the problem (17) in closed-form is to eliminate the translation vector that appears quadratically in the cost function and then solve a new equivalent non-linear problem that depends on the rotation only [17], [23], [24]. Again, even if the new equivalent problem is non-linear it can be solved by finding the real roots of polynomial equations.

A. Separating Rotation and Translation
The least squares problem in (17) can be written as a quadratic function of t: Therefore, the optimal solution for the translation can be easily computed: That is: where t is a linear function in r. The translation can be obtained as a linear function of the rotation vector r: where the matrix A t and vector b t are: Plugging back the optimal translation into the cost function we have now to solve the following optimisation problem: where: we can set the problem in the following canonical form: where A r is the following 9 × 9 matrix: and c r is the following scalar:

B. Special Case of Point-to-Point Correspondences
As noticed by [23] when point-to-point correspondences are used exclusively their in-homogeneous solver GAPS had difficulties. Indeed, for a point-to-point correspondence the matrix A r has a very special block-diagonal structure: Therefore the total degree of the system of equations point-topoint correspondence drops from 4 to 2.
This explains the difficulties of using the same solver when point-to-point correspondences are used exclusively. In this case, the problem can be rewritten as follows: The cost function is quadratic in the quaternion q, B being a symmetric (4 × 4) matrix. Therefore the solution of the problem is straightforward and it is given to the unitary eigenvalue of the symmetric matrix B corresponding to the smallest eigenvalue [26]. This optimal result can also be obtained by the method proposed in the following section.

C. Closed-Form Solutions for the Rotation
Using the quaternion parametrisation, the vector r can be written as a quadratic function of 4 unknowns q = [w; x; y; z]. We can impose the constraint q q = 1 using the Lagrange multiplier method [21]: The stationary points of the Lagrangian are found by solving the following polynomial equations: where g (q) = [g w (q) g x (q) g y (q) g z (q)] is the following (1 × 4) vector: that is the multiplication of the following (1 × 9) vector: and the following (9 × 4) Jacobian matrix: Note that one could compute λ as follows: therefore we could back-substitute λ in the optimization problem: This would lead to a system of 5th degree in 4 unknowns. Another way to solve the problem is to consider (29): and eliminate λ with the wedge product (or exterior product): which is a 4 × 4 skew symmetric matrix: therefore obtaining the following 6 equations (of degree 4 in the general case and of degree 2 when considering only point-topoint correspondences), jointly with the quaternion constraint of degree 2: The Bezout theorem states that the system has at most 128 solutions in the general case and at most 16 solutions when considering only point-to-point correspondences. We use the u-resultant method proposed by [25]. This method has already been used to solve a different problem (the perspective-n-point camera pose estimation) by [22]. The u-resultant approach consist in adding an auxiliary linear equation in the unknown q: where the coefficients u = [u 0 , u 1 , u 2 , u 3 , u 4 ] can be any. Note that, in general, f 0 (q) will not be zero at the roots of the system of polynomial equations (30)(31)(32)(33)(34)(35)(36). The second step is to select n = 4 independent equations of degree d i from (30)(31)(32)(33)(34)(35)(36) and to define the total degree of the system of equations, including the auxiliary polynomial as d = 1 + n i=1 d i − n (in our case d = 11). We multiply the equations f j will all possible monomial m γ = w γ 1 x γ 2 y γ 3 z γ 4 with degree γ = In order to reduce the computation time and increase the accuracy three main improvements over the original algorithm [25] are proposed in this paper. First of all, to solve the possible problem of rank deficiency of the D matrix [27] when building the Macaulay matrix all possible equations (30-36) are considered when selecting polynomials to form the resultant. Secondly, notice that it is possible to reduce the number of monomials by using the equation with the lowest degree. In our case, we substitute w 2 = 1 − x 2 − y 2 − z 2 in the equations. Therefore, only monomial linear in w will be present in the equations. The size of matrix M is then reduced to 650 × 650, and size(D) = (522 × 522). The third simplification is obtained by noticing that M matrix is generally very sparse (in our case 95% of the matrix entries are equal to 0) and that the matrix may be reordered in such a way that it can be written as a block diagonal matrix: where in our case size(D 1 ) = (222 × 222) and size(D 2 ) = (300 × 300). Finally, the solutions are obtained computing the eigenvectors of the following 128 × 128 multiplication matrix that is much faster to compute: Obviously we select only the real solutions and discard complexconjugate solutions. Note also that the possible solutions can be reduced to 64 due to the two-fold symmetry of quaternions [28], [29], [30].

IV. EXPERIMENTS
In this section, we compare our algorithm with the state-ofthe-art iterative algorithms (Olsson's algorithm using Branch and Bound [18] and Briales's algorithm [20] using Lagrangian duality) and with the state-of-the-art closed-form algorithms (Wientapper's algorithm [23] and Zhou's algorithm [24]). In the experiments with closed-form solvers, only the closed-form solution of the least square problem is considered. Therefore, in order to have a fair comparison between the algorithms, the Newton-Raphson iterative refinement, starting from the best pose found by the closed-form optimization, is not performed as in [24]. Another solution would have been to perform this iterative refinement step after any of the algorithm but this may have hidden the true performance of the solvers. We present experiments both on simulated and real data.

A. Experiments With Simulated Data
The simulated data are generated similarly to [20] and [24]. Specifically, 3D points are determined by randomly sampling a sphere of radius 10 m. Unit directions for lines and normal for planes are generated randomly. Random rotations are generated by uniformly sampling the Euler angles ϕ, θ and ψ (ϕ, ψ ∈ [0 • ; 360 • ] and θ ∈ [0 • ; 180 • ]). The translation displacements are uniformly distributed within [10 m; 10m]. For n p point-to-plane, n l point-to-line, and n m point-to-point correspondences, the number of correspondences is calculated as N = n p + 2n l + 3n m . Given an N, a combination of point, line and point is randomly generated whose effective number is N. The estimated rotation R and translation t are compared to the ground truth rotation R and translation t. The rotation error is the absolute value of the rotation angle computed as δr = ||logm( R R )|| F , where logm is the logarithm of a rotation matrix [31] that gives the skew-symmetric matrix [uθ] × such that R R = expm([uθ] × ) and || • || F is the Frobenius norm that gives the magnitude of the rotation angle. The translation error is δt = || t − t||. For all experiments the average rotation error μ(δr) and the average translation error μ(δt) are computed for 1000 trials.

1) Fixed Number of Correspondences, Increasing Noise:
In the first set of simulations the number of correspondences is fixed while the noise standard deviation on the generated 3D data increases from 0 to 0.2 m. Fig. 1 shows the results of a simulation where the rotation angle is fixed to θ = 180 • and the rotation axis and the translation are random. The number of correspondences is fixed to the minimum N = 6. For more clarity, we separate the comparison of the proposed method with closed-form algorithm from the comparison with iterative algorithms that have much more computational complexity. The iterative algorithms show poor results when considering the minimum number of correspondences where multiple global minima may exist. As expected, the method proposed in [24] is highly unstable since the C-G-R parametrisation of the rotation cannot handle the case when the rotation axis is fixed to 180 • . The proposed method provide a better accuracy than [23] when the noise level increases.    2 shows the results of a simulation where the rotation and the translation are random. The number of correspondences is again fixed to N = 6. For iterative algorithms we obtain similar results to the previous simulation. Note that when the number of correspondences is minimal there may be several possible global minima. The closed-form methods provide all the possible solutions while iterative methods may fail to find them all. Since the rotation angle is random the method proposed by Zhou performs better than in the previous simulation but it clearly has problems to handle the minimal case. Wientapper's algorithm handles correctly this minimal case, the proposed method even better. Fig. 3 shows the results of a simulation where the rotation angle is fixed to θ = 180 • and the rotation axis and the translation are random. The number of correspondences is fixed to N = 7. Contrarily to the simulation with the minimal number of correspondences, iterative methods work well and are almost equivalent. The proposed method provides better results while having a much lower computational complexity. Surprisingly, the method proposed by Zhou gives better results even if the C-G-R parametrisation of the rotation may not handle this case. Noise and numerical errors make that the estimated error is never exactly 180 • that would cause a division by zero in the algorithm. Fig. 4 shows the results of a simulation where the rotation and the translation are random. The number of correspondences is fixed to N = 7. The proposed approach, provides more accurate results than iterative and closed-form methods for all noise levels.
2) Fixed Noise, Increasing Number of Correspondences: In the second set of simulations the noise standard deviation is fixed to 0.2 m while the number of correspondences increases from 6 to 14. Fig. 5 shows the results when the rotation angle is fixed to θ = 180 • around a random axis.
In this case the C-G-R rotation parametrisation provide the worst results, especially when 6 to 8 correspondences are considered. Note that the 180 • degeneracy can be avoided by randomly pre-rotating the points [23]. However, the requirement to run the algorithm multiple times increases the computation time and the number of solutions. Iterative algorithms also provide poor results for few correspondences confirming previous experiments. The proposed algorithm provide the best results over all the algorithms even when only few correspondences are used. Fig. 6 shows the results when the rotation is random. The results are very similar to the previous experiment. In particular,  there is no improvement for Zhou's algorithm suggesting again that high level of noise in the data prevents the C-G-R rotation parametrisation from failing. It must be noticed that the difference from the results presented in [24] is certainly due to the fact that the Newton-Raphson refinement is not performed here.

3) Special Case With Points-to-Points Correspondences:
Simulations with points-to-points correspondences only confirmed that Wientapper's algorithm is not able to handle this special case. In Fig. 7 the results corresponding to this method  are not displayed since the Matlab function provided by the authors does not give any results. Indeed, the polynomials equations goes from degree 4 to degree 2 (see section III-B). Note that Zhou's algorithm does not suffer from this problem and provide good results. On the contrary, Olsson's algorithm provided poor results. The proposed approach applied to this case provided the best results similarly to Briales's algorithm.

4) Comparison of Computation Time:
In this experiment, we compare the computational time of closed-form algorithms. The number of correspondences vary from 100 to 3000. Fig. 8 illustrates the results averaging the running time over 100 trials. In the current Matlab implementation the proposed approach is slower than the others closed-form methods below 1000 correspondences and faster above (e.g. around 12 milliseconds for 1000 correspondences).

B. Experiments With Real Data
Similarly to previous work [24], the KITTI dataset [32] is used for experimental results. The current 3D points set (lidar scan k+1) is segmented to extract lines and planes and matched with the closest 3D points of the reference set (lidar scan k). There are more than 50,000 correspondences in each frame, with the majority being the point-to-plane correspondences. For simplicity, only points-to-planes correspondences are considered since they represent more than 99% of the found correspondences (i.e. points-to-points and points-to-lines correspondences are only 1% of the found correspondences). The planes are extracted from the point cloud by performing least squares fitting on k-neighbors (k = 8) around each point. If the least square error is below a threshold the points are considered to be on the same plane.
To find the point-to-plane correspondences, a kd-tree search is performed to find the current point closest (minimal 3D distance) to a given reference point. If the current point belong to a plane, the reference point and the the current plane are set as corresponding.
After finding the closest points-to-planes correspondences, the optimal pose between scans k+1 and k is estimated and used to register the reference 3D points into the current frame. A robust Tukey M-estimator [33] is used to compute the weights of the weighted least-square problem in equation (13). This ICP process is repeated at most for 10 iterations or stopped before if the computed incremental translation displacement is less than 1 mm and the computed incremental rotation displacement is less than 0.1 deg. Finally, the optimal pose is compared to the ground truth and a translation error δt (in meters) and a rotation error δR (in degrees) are computed for each frame. Table II show the mean and standard deviation of translation and rotation errors obtained on sequences 03, 04, and 07 of the KITTI dataset like in [24]. Iterative algorithms are not considered here since they have a much higher computational cost as pointed out in [24]. Finally, the proposed approach is more accurate and robust than state-of-the-art closed-form approaches. This letter presents a complete and accurate solution for pose estimation from points-to-points, points-to-line and points-to plane correspondences. The proposed approach is also able to provide a solution to the weighted least squares problem even when considering a minimum number of correspondences and for any rotation. Simulated and experimental results show that the proposed algorithm outperforms state-of-the-art algorithms at the price of a higher computational complexity in the current Matlab implementation. Future research directions would be (i) to study if the superior accuracy comes from the choice of solver (Macaulay u-resultant instead of Gröebner basis) or from the correct handling of the quaternion constraint with the Lagrange multiplier, and (ii) to study how to reduce the computational complexity of the solver.

ACKNOWLEDGMENT
The author thanks Dr. Lipu Zhou and Dr. Folker Wientapper for providing their Matlab source code for the experimental comparison with their previous works.