Gauge Freedom and Coarse Initial Alignment

3. Constrained Bundle Adjustment for Non-Rigid Registration 37


Figure 3.2.: Optimization on a manifold. The global parameters, $a^{(i)}$, at iteration $i$ and the feasible set (manifold), $\mathcal{M}^a$, induced by the constraints are shown in blue. The local parameter increments $\Delta\boldsymbol{\alpha}$ are computed as elements of the tangent space, $T_a\mathcal{M}^a$ (red), represented by a subspace basis, $V = [v_1, v_2]$. The constrained parameter values for the next iteration, $a^{(i+1)}$, are updated on geodesics of the manifold along the direction of $\Delta\boldsymbol{\alpha}$.

In practical use-cases (described later in Chaps. 4 and 5), the constraints imposed on the parameters can also combine several of the four types given above. For example, the knowledge that a marker must lie on a planar surface of the virtual model may simultaneously constrain the translation (plane constraint) and the rotation, i.e. the rotation axis is always parallel to the plane normal.
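As a minimal sketch of such a combined constraint, the following Python snippet updates a marker pose using only three local parameters: two in-plane translation components and one rotation angle about the plane normal. The function names (`plane_basis`, `rotation_about_axis`, `update_marker_on_plane`), the parameter layout and the example values are illustrative assumptions, not the implementation evaluated in this thesis.

```python
import numpy as np

def plane_basis(n):
    """Return two orthonormal vectors v1, v2 spanning the plane with unit normal n."""
    helper = np.array([1.0, 0.0, 0.0]) if abs(n[0]) < 0.9 else np.array([0.0, 1.0, 0.0])
    v1 = np.cross(n, helper)
    v1 /= np.linalg.norm(v1)
    v2 = np.cross(n, v1)  # unit length, because n and v1 are orthonormal
    return v1, v2

def rotation_about_axis(n, theta):
    """Rodrigues formula: rotation by the angle theta about the unit axis n."""
    K = np.array([[0.0, -n[2], n[1]],
                  [n[2], 0.0, -n[0]],
                  [-n[1], n[0], 0.0]])
    return np.eye(3) + np.sin(theta) * K + (1.0 - np.cos(theta)) * (K @ K)

def update_marker_on_plane(R, t, n, delta):
    """Apply a local increment delta = (d1, d2, dtheta) that keeps the marker on its plane."""
    v1, v2 = plane_basis(n)
    t_new = t + delta[0] * v1 + delta[1] * v2      # translation stays in the plane
    R_new = rotation_about_axis(n, delta[2]) @ R   # rotation axis parallel to n
    return R_new, t_new

# Hypothetical example: marker lying on the plane with unit normal n through t0.
n = np.array([0.0, 0.6, 0.8])
v1, v2 = plane_basis(n)
R0 = np.column_stack([v1, v2, n])  # marker z-axis aligned with the plane normal
t0 = np.array([1.0, 2.0, 3.0])
R1, t1 = update_marker_on_plane(R0, t0, n, delta=np.array([0.3, -0.1, 0.5]))
```

In this special case the straight-line update in the plane already coincides with the geodesic of Figure 3.2, and the rotation about a fixed axis remains a valid rotation, so no extra projection back onto the feasible set is required.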

We have implemented this BA variant and evaluated it in the use-cases of the chapters that follow. Because Lie algebras have gained considerable popularity for parameterizing rotations and other groups, several existing BA libraries, which appeared in parallel to ours, also provide mechanisms to represent such an OOM approach. We will formulate general requirements and analyze to what extent they are fulfilled in these implementations.

3.4. Gauge Freedom and Coarse Initial Alignment

Iterative minimization approaches usually require an initial estimate of the parameters that is close enough to the global minimum. This is especially important for large minimization problems such as the bundle adjustment problems considered here, since the large number of nonlinear parameters also results in a large number of local minima.

In Chapters 4 and 5 we will describe how an initial reconstruction is obtained for marker-based and feature-based SLAM, respectively. In both cases, the initial parameter estimates are obtained without assuming any prior information on the scene. In particular, no assumption is made at the beginning that user knowledge for the registration is available, either because the user provides this knowledge only later, at the end of a preparative stage (feature-based), or because it is necessary to wait until observations for a minimal number of parameters with associated constraints are available (marker-based).

This lack of prior knowledge leaves the choice of the coordinate system arbitrary. However, since the numerical estimation of points or other parameters in space requires coordinates for their representation, one has to deliberately choose one specific coordinate system to work with. For instance, in feature-based SLAM the location of the first camera is usually chosen as the origin, but any other choice would be equally valid for the unconstrained minimization problem. This arbitrariness in reconstruction problems has long been known to researchers in both photogrammetry [Der94] and computer vision [TMHF00], and is usually termed gauge freedom or datum problem.

A consequence of the gauge freedom is that the explicit constraints given by the user cannot be included in the minimization algorithm directly. They are described in the coordinate system of the virtual model, which may differ considerably from the one temporarily used by the SLAM algorithm for the initial reconstruction. Even if the parameters are initially estimated with high accuracy in the unconstrained case, their optimal values in the constrained problem may be completely different because the coordinate systems do not match.

Thus, a direct inclusion may result in convergence to a wrong local minimum or in a complete failure of the minimization algorithm.

It is therefore necessary to reconcile both coordinate systems first. To this end, a similarity (or Euclidean) transformation, $S_{W\to V}$, is computed that connects both coordinate systems. With this similarity transformation, all other parameters are also transformed into the coordinate system of the virtual model. For the estimation we use correspondences established from a subset of parameters for which constraints exist. Specifically, we use information on the linear components of the structure parameters $b_n$, i.e. scene points $x_n$ (features) or translation vectors $t_n$ (markers). We assume that the constraints for these parameter subsets may be specified completely as target or anchor points, or partially in the form of affine subspaces such as lines or planes in Euclidean space.

Then, for the estimation of the similarity transform, the closed-form algorithms of Chap. 2 can be used. Their application is use-case dependent and will be presented in more detail in Chapters 4 and 5.
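For the case of complete anchor-point constraints, one well-known closed-form solution is the SVD-based least-squares alignment in the style of Umeyama. The sketch below is an assumption about what such an estimator may look like. It is not necessarily one of the algorithms of Chap. 2, it does not cover partial constraints such as lines or planes, and the function name `estimate_similarity` as well as the synthetic test data are hypothetical.

```python
import numpy as np

def estimate_similarity(X_w, X_v):
    """Least-squares (s, R, t) such that X_v ~ s * R @ X_w + t.

    X_w, X_v: arrays of shape (N, 3) with corresponding points in W and V.
    """
    mu_w, mu_v = X_w.mean(axis=0), X_v.mean(axis=0)
    Xw, Xv = X_w - mu_w, X_v - mu_v
    n = len(X_w)
    Sigma = Xv.T @ Xw / n                 # 3x3 cross-covariance (target x source)
    U, D, Vt = np.linalg.svd(Sigma)
    S = np.eye(3)
    if np.linalg.det(U) * np.linalg.det(Vt) < 0:
        S[2, 2] = -1.0                    # avoid a reflection
    R = U @ S @ Vt
    var_w = (Xw ** 2).sum() / n           # variance of the source points
    s = np.trace(np.diag(D) @ S) / var_w
    t = mu_v - s * R @ mu_w
    return s, R, t

# Synthetic check with a known transformation (values chosen arbitrarily).
rng = np.random.default_rng(0)
Q, _ = np.linalg.qr(rng.standard_normal((3, 3)))
R_true = Q * np.sign(np.linalg.det(Q))    # force det(R_true) = +1
s_true, t_true = 1.7, np.array([0.5, -1.0, 2.0])
X_w = rng.standard_normal((6, 3))
X_v = s_true * X_w @ R_true.T + t_true
s_est, R_est, t_est = estimate_similarity(X_w, X_v)
```

At least three non-collinear correspondences are needed for the rotation to be determined uniquely; with noisy data the result is only a coarse alignment that the subsequent constrained minimization is meant to refine.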

For the moment we will assume that a similarity transform, $S_{W\to V}$, has been computed, which connects both coordinate systems, $W$ and $V$. In order to correctly apply this transformation to the optimization variables, $a_m$, $b_n$, it is important to recognize that some parameters represent points in $W$ (such as the scene points $x_{(W)}$ in SfM-BA), whereas others represent transformation parameters between $W$ and some other local coordinate system ($C_m$ or $L_n$). The subscript notation introduced at the beginning of this thesis in Section Mathematical Notation becomes a useful tool at this point, as it helps to identify all involved coordinate systems and reveals the relations between them. The transformation of the motion and structure parameters of Table 3.1 can be summarized as follows.

• Scene points of SfM-BA, $x_n$ (structure parameters). These represent the simplest case. In order to transform them into the coordinate system of the virtual model $V$, the similarity transformation can be applied directly to the points:

$$\begin{aligned} x_{n,(V)} &= S_{W\to V} \cdot x_{n,(W)} \\ &= s_{W\to V} \cdot R_{W\to V} \cdot x_{n,(W)} + t_{W\to V}. \end{aligned} \tag{3.10}$$

• Marker positions and orientations in marker BA, $S_n$ or $E_n$ (structure parameters). As we will discuss later, the geometry of a marker is usually represented by four 3D points corresponding to the outer corners of its binary pattern. However, these corner points are not optimized directly or individually. Rather, the marker is modeled as a rigid body, where the coordinates of the points $x_{l_n,(L_n)}$ are fixed and defined in a local coordinate system $L_n$, with each marker having its own origin, e.g. at the center of the pattern.

The location and orientation of the markers are represented by similarity transformations $S_{n,(L_n\to W)}$ (alternatively Euclidean), which connect each local coordinate system with $W$. The transformation of these parameters to the coordinate system of the virtual model $V$ amounts to a concatenation with $S_{W\to V}$, i.e.:

$$S_{n,(L_n\to V)} = S_{W\to V} \cdot S_{n,(L_n\to W)}$$

$$\Leftrightarrow \begin{cases} R_{n,(L_n\to V)} = R_{W\to V} \cdot R_{n,(L_n\to W)} \\ t_{n,(L_n\to V)} = s_{W\to V} \cdot R_{W\to V} \cdot t_{n,(L_n\to W)} + t_{W\to V} \\ s_{n,(L_n\to V)} = s_{W\to V} \cdot s_{n,(L_n\to W)} \end{cases} \tag{3.11}$$

• Camera extrinsics, $E_m$ (motion parameters). The extrinsic parameters for all frames $m$ are given by Euclidean transformations, $E_{W\to C_m}$, from the real world model coordinate system $W$ to the local coordinate system $C_m$ of each camera. After transformation with $S_{W\to V}$ the extrinsics should relate points in $V$ to points in $C_m$, which means that the transformation of the extrinsic parameters must be as follows:

$$S_{m,(V\to C_m)} = E_{m,(W\to C_m)} \cdot S_{V\to W} = E_{m,(W\to C_m)} \cdot S_{W\to V}^{-1}$$

$$\Leftrightarrow \begin{cases} R_{m,(V\to C_m)} = R_{m,(W\to C_m)} \cdot R_{W\to V}^{T} \\ t_{m,(V\to C_m)} = -s_{W\to V}^{-1} \cdot R_{m,(W\to C_m)} \cdot R_{W\to V}^{T} \cdot t_{W\to V} + t_{m,(W\to C_m)} \\ s_{m,(V\to C_m)} = s_{W\to V}^{-1} \end{cases} \tag{3.12}$$

Since we are working with pinhole models for the cameras, image points remain invariant to global scale changes of their associated 3D points $x$ in camera coordinates $C_m$. This is because of the perspective division, $P(x) = [x_1/x_3,\ x_2/x_3]^T$, as part of the image generation process, which has the property that $P(sx) = P(x)$ for any $s \in \mathbb{R} \setminus \{0\}$. Therefore it is convenient to re-express the resulting camera extrinsics again as Euclidean transformations with unit scale, which can be achieved by multiplication of the parameters with $s_{W\to V}$:

$$E_{m,(V\to C_m)}: \begin{cases} R_{m,(V\to C_m)} = R_{m,(W\to C_m)} \cdot R_{W\to V}^{T} \\ t_{m,(V\to C_m)} = -R_{m,(W\to C_m)} \cdot R_{W\to V}^{T} \cdot t_{W\to V} + s_{W\to V} \cdot t_{m,(W\to C_m)} \end{cases} \tag{3.13}$$

Based on these relations it is easy to recognize why the gauge freedom is always present in unconstrained bundle adjustment problems. If both the motion and the structure parameters are transformed simultaneously as in Eqs. 3.10 to 3.13 using any similarity transformation, the error of the minimization problem remains invariant to this coordinate system change. This can be seen by writing the coordinates of a scene point in camera coordinates after the transformation:

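$$\begin{aligned} x_{n,(C_m)} &= E_{m,(V\to C_m)} \cdot x_{n,(V)} \\ &= s_{W\to V} \cdot E_{m,(W\to C_m)} \cdot S_{W\to V}^{-1} \cdot S_{W\to V} \cdot x_{n,(W)} \\ &= s_{W\to V} \cdot E_{m,(W\to C_m)} \cdot x_{n,(W)}. \end{aligned} \tag{3.14}$$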

As can be seen, the coordinates of any point in camera coordinates remain the same after the transformation, apart from the scaling factor $s_{W\to V}$. Since the scaling factor has no influence on the final image points after perspective division, the effect of the transformation on the structure and on the motion parameters cancels out completely.
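The relations of Eqs. 3.10, 3.11, 3.13 and 3.14 can be checked numerically. The following sketch uses arbitrary random values for the similarity transformation, one scene point, one camera and one marker; all variable names and numbers are assumptions made for illustration only.

```python
import numpy as np

def random_rotation(rng):
    """Random proper rotation matrix from a QR decomposition."""
    Q, _ = np.linalg.qr(rng.standard_normal((3, 3)))
    return Q * np.sign(np.linalg.det(Q))  # force det = +1

def project(x):
    """Perspective division P(x) = [x1/x3, x2/x3]^T."""
    return x[:2] / x[2]

rng = np.random.default_rng(1)

# Arbitrary similarity transformation S_{W->V} = (s, R, t).
s_wv, R_wv, t_wv = 2.5, random_rotation(rng), rng.standard_normal(3)

# One scene point and one camera in W; the point lies in front of the camera.
x_w = rng.uniform(-1.0, 1.0, 3)
R_m, t_m = random_rotation(rng), np.array([0.0, 0.0, 5.0])
x_c = R_m @ x_w + t_m

# Eq. 3.10: transform the scene point into V.
x_v = s_wv * R_wv @ x_w + t_wv

# Eq. 3.13: transform the extrinsics, re-expressed with unit scale.
R_mv = R_m @ R_wv.T
t_mv = -R_m @ R_wv.T @ t_wv + s_wv * t_m

# Eq. 3.14: the point in camera coordinates after the transformation.
x_c_new = R_mv @ x_v + t_mv

# Eq. 3.11: transform a marker pose S_{n,(L_n->W)} = (s_n, R_n, t_n) into V.
s_n, R_n, t_n = 0.8, random_rotation(rng), rng.standard_normal(3)
s_nv, R_nv, t_nv = s_wv * s_n, R_wv @ R_n, s_wv * R_wv @ t_n + t_wv
corner_l = np.array([0.5, 0.5, 0.0])  # hypothetical marker corner in L_n
corner_v_direct = s_nv * R_nv @ corner_l + t_nv
corner_v_chained = s_wv * R_wv @ (s_n * R_n @ corner_l + t_n) + t_wv

print(np.allclose(x_c_new, s_wv * x_c))                # scaled camera coordinates
print(np.allclose(project(x_c_new), project(x_c)))     # identical image point
print(np.allclose(corner_v_direct, corner_v_chained))  # concatenation is consistent
```

The first comparison reflects the remaining scale factor in Eq. 3.14, the second shows that the projected image point, and hence the re-projection error, is unaffected.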

At this point we would like to point out another property of the gauge freedom and the explicit constraints: their impact on possible choices for the actual minimization algorithm. As we showed above, the choice of the coordinate system does not affect the direction in which scene points are located with respect to the cameras, and thus the final re-projection error of the unconstrained minimization problem is invariant to the reference coordinate system. This means that the minimum of the objective function is not a single point, but rather a seven-dimensional manifold⁴ that extends through the parameter space. As a consequence, the Jacobian matrix, $[J_a, J_b]$, of line 6 in Listing 3.3.1 is degenerate, and so is the approximated Hessian. The rank of these matrices is always seven less than the number of minimization parameters, i.e. the number of columns of $[J_a, J_b]$, and therefore the Hessian or its Gauss-Newton approximation, $[J_a, J_b]^T [J_a, J_b]$, is not directly invertible. This property is completely independent of the size of the bundle adjustment problem and the number of measurements. This circumstance is rarely addressed in the literature, but it is the actual reason why a damped minimization algorithm such as Levenberg-Marquardt must be used in the unconstrained case. The damping term, $\mu I$, acts as a regularizer and ensures that these matrices become invertible again.
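The rank deficiency can be illustrated on a small synthetic problem. The sketch below assumes three cameras parameterized by a rotation vector and a translation (six parameters each) and eight scene points, and it approximates $[J_a, J_b]$ by central finite differences. The problem size, the parameterization, the thresholds and all function names are assumptions chosen for illustration; they do not reproduce Listing 3.3.1.

```python
import numpy as np

def rodrigues(w):
    """Rotation matrix from a rotation vector w (axis times angle)."""
    theta = np.linalg.norm(w)
    K = np.array([[0.0, -w[2], w[1]],
                  [w[2], 0.0, -w[0]],
                  [-w[1], w[0], 0.0]])
    if theta < 1e-12:
        return np.eye(3) + K
    return (np.eye(3) + np.sin(theta) / theta * K
            + (1.0 - np.cos(theta)) / theta**2 * (K @ K))

def reproject(params, n_cams, n_pts):
    """Stack all image points P(R_m x_n + t_m) for the parameter vector [a, b]."""
    cams = params[:6 * n_cams].reshape(n_cams, 6)
    pts = params[6 * n_cams:].reshape(n_pts, 3)
    out = []
    for cam in cams:
        x_c = pts @ rodrigues(cam[:3]).T + cam[3:]
        out.append((x_c[:, :2] / x_c[:, 2:3]).ravel())
    return np.concatenate(out)

def numeric_jacobian(f, p, h=1e-6):
    """Central finite-difference Jacobian of f at p."""
    J = np.empty((f(p).size, p.size))
    for k in range(p.size):
        dp = np.zeros_like(p)
        dp[k] = h
        J[:, k] = (f(p + dp) - f(p - dp)) / (2.0 * h)
    return J

rng = np.random.default_rng(0)
n_cams, n_pts = 3, 8
rot = 0.1 * rng.standard_normal((n_cams, 3))
trans = rng.uniform(-0.5, 0.5, (n_cams, 3)) + np.array([0.0, 0.0, 5.0])
pts = rng.uniform(-1.0, 1.0, (n_pts, 3))
params = np.concatenate([np.hstack([rot, trans]).ravel(), pts.ravel()])

# The observations are constants, so the residual Jacobian equals this Jacobian.
J = numeric_jacobian(lambda p: reproject(p, n_cams, n_pts), params)
sv = np.linalg.svd(J, compute_uv=False)
nullity = int(np.sum(sv < 1e-6 * sv[0]))  # numerically vanishing singular values

H = J.T @ J                               # Gauss-Newton approximation of the Hessian
mu = 1e-3
H_damped = H + mu * np.eye(H.shape[0])    # Levenberg-Marquardt style damping

print("parameters:", params.size, "null-space dimension:", nullity)
print("cond(H_damped):", np.linalg.cond(H_damped))
```

For a generic configuration the null-space dimension is expected to be seven, matching the degrees of freedom of a similarity transformation, while the damped matrix has a finite condition number and can be inverted.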

If external constraints are added, the situation changes. If a similarity transformation can be computed unambiguously from the given constraints, the coordinate system (gauge) is in fact fixed. The matrices then become invertible again and the damping mechanism is no longer needed. As a result, undamped algorithms such as Gauss-Newton can be used, which in general converge faster than damped variants.

From the document Optimal Spatial Registration of SLAM for Augmented Reality (pages 69-72)