Ambiguities in structure from motion - Shape Ambiguities in Structure from Motion

Because structure from motion attempts to recover both the structure of the world and the camera motion without any external (prior) knowledge, it is subject to certain ambiguities. The most fundamental (but most innocuous) of these is the coordinate frame (also known as pose, or Euclidean) ambiguity, i.e., we can move the origin of the coordinate system to an arbitrary place and pose and still obtain an equally valid solution.

The next most common ambiguity is the scale ambiguity (for a perspective camera) or the depth ambiguity (for an orthographic camera). This ambiguity can be removed with a small amount of additional knowledge, e.g., the absolute distance between camera positions.

A third ambiguity, and the one we focus on in this technical report, is the bas-relief ambiguity.

In its pure form, this ambiguity occurs for a two-frame problem with an orthographic camera, and is a confusion between the relative depth of the object and the amount of object rotation. In this technical report, we focus on the weak form of this ambiguity, i.e., the very large bas-relief uncertainty which occurs with imperfect measurements even when we use more than two frames and/or perspective cameras. A central result of this technical report is that the bas-relief ambiguity captures the largest uncertainties arising in structure from motion. However, when examined in detail, it appears that a larger class of deformations (i.e., projective) more fully characterizes the errors which occur in structure from motion.

To characterize these ambiguities, we will use eigenvector analysis of the information matrix, as explained in Section 3.2. Absolute ambiguities will show up as zero eigenvalues (unless we add additional constraints or knowledge to remove them), whereas weak ambiguities will show up as small eigenvalues.

4 A two parameter example

To develop an intuitive understanding of the basic bas-relief ambiguity, we start with a simple two-parameter example. Assume that we have an orthographic scanline camera which measures the $x$ component of 2-D points $(x, z)$. Furthermore, assume that we already know the shape up to a scale factor in depth,

$$\mathbf{p}_i = (x_i, a z_i),$$

and that the rotation angles are uniform,

$$\theta_j = j\theta.$$

The projection equation is then

$$u_{ij} = \cos(j\theta)\, x_i - a \sin(j\theta)\, z_i.$$

What happens when we try to estimate the scale factor $a$ and the angle $\theta$ from a set of noisy measurements $\{u_{ij}\}$? First, let's examine the very simplest case, which is a single point, say at $(x, z) = (1, 1)$. Each new image gives us a constraint of the form

$$\cos(j\theta) - a \sin(j\theta) = u_j \qquad (15)$$

[...] to noise. This can also be seen in the contour plot of the energy surface (Figure 2c), which can be computed by summing the squared residuals of the constraints in (15).

Figure 2: Constraint lines and energy surface for simple two-parameter example: (a) noise-free, (b) noise = 0.01, (c) contour plot. The $x$-axis is the angle $\theta$ and the $y$-axis is the scale factor $a$.
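The elongated valley in the energy surface can be reproduced numerically. The following is a minimal sketch in Python, assuming the single-point model $u_j = \cos(j\theta) - a\sin(j\theta)$ at $(x, z) = (1, 1)$ and a sum-of-squares energy. The function name `energy`, the frame indices, and the true values of $\theta$ and $a$ are illustrative assumptions, not values taken from the report.

```python
import numpy as np

def energy(theta, a, u, js, x=1.0, z=1.0):
    """Sum of squared residuals for u_j = cos(j*theta)*x - a*sin(j*theta)*z."""
    pred = np.cos(js * theta) * x - a * np.sin(js * theta) * z
    return float(np.sum((u - pred) ** 2))

# Hypothetical setup: true angle step, true depth scale, frame indices.
theta_true, a_true = 0.05, 1.0
js = np.arange(-3, 4)
u = np.cos(js * theta_true) - a_true * np.sin(js * theta_true)  # noise-free data

# Move along the hyperbola a*theta = const (the bas-relief direction) ...
e_along = energy(theta_true / 1.2, a_true * 1.2, u, js)
# ... versus changing the scale a alone, which leaves the hyperbola.
e_across = energy(theta_true, a_true * 1.2, u, js)

print(f"energy along a*theta = const: {e_along:.2e}")
print(f"energy across it:             {e_across:.2e}")
```

Under these assumptions the energy rises far more slowly along the curve $a\theta = \text{const}$ than across it, which is why a small amount of measurement noise can shift the minimum a long way along the valley.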

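To characterize the shape of the error surface near its minimum, we compute the information matrix $\mathbf{A}$. The Jacobian for $(a, \theta)$ is straightforward,

$$\mathbf{H}_{ij} = \left[ \frac{\partial u_{ij}}{\partial a} \;\; \frac{\partial u_{ij}}{\partial \theta} \right]^T = \left[ -\sin(j\theta)\, z_i \;\; -j\left(\sin(j\theta)\, x_i + a \cos(j\theta)\, z_i\right) \right]^T \approx \left[ -j\theta z_i \;\; -j\left(j\theta x_i + a z_i\right) \right]^T,$$

where the last form holds for small angles. The (information) matrix is then

$$\mathbf{A} \approx J_2 Z \begin{bmatrix} \theta^2 & a\theta \\ a\theta & a^2 \end{bmatrix} + \theta^2 J_4 X \begin{bmatrix} 0 & 0 \\ 0 & 1 \end{bmatrix},$$

with $J_k = \sum_j j^k$, $X = \sum_i x_i^2$, and $Z = \sum_i z_i^2$ (assuming that the frame indices are symmetric about zero, so that $\sum_j j = 0$ and the odd-order sums vanish). Assuming that $\theta^2 \ll a^2$, we can compute (Appendix A) the approximate eigenvalues of $\mathbf{A}$ as

$$\lambda_{\min} \approx \theta^4 J_4 X / a^2 \quad \text{and} \quad \lambda_{\max} \approx J_2 Z a^2. \qquad (18)$$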
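The approximate eigenvalues in (18) can be checked against a numerically assembled information matrix. The sketch below is one way to do this in Python, assuming the projection model $u_{ij} = \cos(j\theta)\,x_i - a\sin(j\theta)\,z_i$, unit-variance noise, frame indices symmetric about zero, and the definitions $J_k = \sum_j j^k$, $X = \sum_i x_i^2$, $Z = \sum_i z_i^2$. The point coordinates, the values of $\theta$ and $a$, and the name `information_matrix` are illustrative assumptions.

```python
import numpy as np

def information_matrix(theta, a, x, z, js):
    """A = sum_ij H_ij H_ij^T for u_ij = cos(j*theta)*x_i - a*sin(j*theta)*z_i."""
    A = np.zeros((2, 2))
    for j in js:
        c, s = np.cos(j * theta), np.sin(j * theta)
        for xi, zi in zip(x, z):
            # Exact partial derivatives with respect to a and theta.
            H = np.array([-s * zi, -j * (s * xi + a * c * zi)])
            A += np.outer(H, H)
    return A

# Hypothetical configuration: small angle step, frames symmetric about zero.
theta, a = 0.01, 1.0
js = np.arange(-3, 4)
x = np.array([-1.0, 0.0, 1.0, -1.0, 1.0])
z = np.array([1.0, -1.0, 0.5, -0.5, 0.0])

A = information_matrix(theta, a, x, z, js)
lam_min, lam_max = np.linalg.eigvalsh(A)  # eigenvalues in ascending order

J2, J4 = np.sum(js ** 2), np.sum(js ** 4)
X, Z = np.sum(x ** 2), np.sum(z ** 2)
lam_min_pred = theta ** 4 * J4 * X / a ** 2
lam_max_pred = J2 * Z * a ** 2

print(f"lambda_min: exact {lam_min:.3e}, predicted {lam_min_pred:.3e}")
print(f"lambda_max: exact {lam_max:.3e}, predicted {lam_max_pred:.3e}")
```

For this configuration the two eigenvalues differ by roughly seven orders of magnitude, which is the weak bas-relief ambiguity expressed as a near-singular information matrix.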

The eigenvalues of the information matrix describe an "elliptic" approximation to the error surface (and hence posterior probability distribution), which matches the true "banana shaped" surface near the optimal solution but not far away from it. To determine whether the additional nonlinearities in the reconstruction process result in lower or higher overall uncertainties than those predicted by the information matrix, we would have to resort to numerical simulations. In practice, we expect these secondary effects to be much smaller than the large variations in eigenvalues which explain most of the uncertainties (ambiguities) associated with structure from motion.

5 Orthography: single scanline

Let us now turn to a true structure from motion problem where both the structure and motion are unknown. For simplicity, we analyze the orthographic scanline camera first, where the unknowns are the 2-D point positions $\mathbf{p}_i = (x_i, z_i)$ and the rotation angles $\theta_j$.³ The imaging equations are

$$u_{ij} = c_j x_i - s_j z_i,$$

where $c_j = \cos\theta_j$ and $s_j = \sin\theta_j$.

³ We do not estimate the horizontal translation since it can be determined from the motion of the centroid of the image points [TK92b].

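The Jacobian for the 1-D orthographic camera is

$$\mathbf{H}_{ij} = \left[ \frac{\partial u_{ij}}{\partial x_i} \;\; \frac{\partial u_{ij}}{\partial z_i} \;\; \frac{\partial u_{ij}}{\partial \theta_j} \right]^T = \left[ c_j \;\; -s_j \;\; -(c_j z_i + s_j x_i) \right]^T \qquad (20)$$

and the entries in the information matrix are

$$\mathbf{A}_{\mathbf{p}_i} = \sum_j \begin{bmatrix} c_j^2 & -c_j s_j \\ -c_j s_j & s_j^2 \end{bmatrix}, \qquad \mathbf{A}_{\mathbf{p}_i \mathbf{m}_j} = -(c_j z_i + s_j x_i) \begin{bmatrix} c_j \\ -s_j \end{bmatrix}, \qquad \mathbf{A}_{\mathbf{m}_j} = \sum_i (c_j z_i + s_j x_i)^2.$$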

Before analyzing the complete information matrix, let us look at the two subblocks $\mathbf{A}_{\mathbf{p}_i}$ and $\mathbf{A}_{\mathbf{m}_j}$. If we know the motion, the structure uncertainty is determined by $\mathbf{A}_{\mathbf{p}_i}$ and is simply the triangulation error, i.e., the covariance $\mathbf{A}_{\mathbf{p}_i}^{-1}$ (note that for small rotations, $\sigma_x^2$ is generally much smaller than $\sigma_z^2$). If we know the structure, the motion uncertainty is determined by $\mathbf{A}_{\mathbf{m}_j}$ and is inversely proportional to the variance in depth along the viewing direction $(s_j, c_j)$.
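As a rough numerical illustration of these two subblocks, the sketch below evaluates the structure block for one point under known motion and the motion block for one frame under known structure. It assumes the imaging model $u_{ij} = \cos\theta_j\, x_i - \sin\theta_j\, z_i$ with unit-variance noise. The angles and point coordinates are hypothetical values chosen only for illustration.

```python
import numpy as np

# Hypothetical small, known rotation angles (radians).
thetas = np.array([-0.10, -0.05, 0.0, 0.05, 0.10])
c, s = np.cos(thetas), np.sin(thetas)

# Structure block for one point: sum over frames of [c_j, -s_j]^T [c_j, -s_j].
A_p = np.array([[np.sum(c * c), -np.sum(c * s)],
                [-np.sum(c * s), np.sum(s * s)]])
cov_p = np.linalg.inv(A_p)               # covariance for unit-variance noise
var_x, var_z = cov_p[0, 0], cov_p[1, 1]

# Motion block for the last frame, given known (hypothetical) structure:
# the sum of squared depths along the viewing direction (s_j, c_j).
x = np.array([-1.0, -0.5, 0.0, 0.5, 1.0])
z = np.array([0.3, -0.8, 0.5, 0.9, -0.9])
A_m = np.sum((c[-1] * z + s[-1] * x) ** 2)
var_theta = 1.0 / A_m

print(f"var_x = {var_x:.3f}, var_z = {var_z:.3f}, var_theta = {var_theta:.3f}")
```

With rotations of at most 0.1 rad, the depth variance in this example is about two orders of magnitude larger than the lateral variance, consistent with the remark on small rotations.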

What about ambiguities in the solution? Under orthography, the traditional scale ambiguity does not exist. However, translations along the optical axis cannot be estimated, and an overall pose (coordinate frame) ambiguity still exists. Unless we add some additional constraints, we can always rotate the coordinate system by an angle $\Delta\theta$ and add the same amount to the $\{\theta_j\}$. This manifests itself as the null (zero eigenvalue) eigenvector

$$\mathbf{e}_0 = \left[ z_0 \;\; -x_0 \;\; \cdots \;\; z_N \;\; -x_N \;\; 1 \;\; \cdots \;\; 1 \right]^T.$$
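The pose ambiguity can be confirmed numerically by assembling the full information matrix and inspecting its spectrum. The following Python sketch assumes the imaging model $u_{ij} = \cos\theta_j\, x_i - \sin\theta_j\, z_i$, unit-variance noise, and the parameter ordering $[x_0, z_0, \ldots, x_N, z_N, \theta_0, \ldots]$. The helper name `scanline_jacobian`, the points, and the angles are assumptions made for this example. Under this sign convention, adding $\delta$ to every angle is compensated by the point update $(\delta x_i, \delta z_i) = (z_i, -x_i)\,\delta$.

```python
import numpy as np

def scanline_jacobian(x, z, thetas):
    """Stack the rows H_ij^T of u_ij = cos(theta_j)*x_i - sin(theta_j)*z_i.

    Parameter order: [x_0, z_0, ..., x_{n-1}, z_{n-1}, theta_0, ..., theta_{m-1}].
    """
    n, m = len(x), len(thetas)
    J = np.zeros((n * m, 2 * n + m))
    for j, th in enumerate(thetas):
        c, s = np.cos(th), np.sin(th)
        for i in range(n):
            row = j * n + i
            J[row, 2 * i] = c                               # du/dx_i
            J[row, 2 * i + 1] = -s                          # du/dz_i
            J[row, 2 * n + j] = -(c * z[i] + s * x[i])      # du/dtheta_j
    return J

# Hypothetical structure and motion (four frames, five points).
x = np.array([-1.0, -0.5, 0.0, 0.5, 1.0])
z = np.array([0.3, -0.8, 0.5, 0.9, -0.9])
thetas = np.array([-0.2, 0.0, 0.1, 0.3])

J = scanline_jacobian(x, z, thetas)
A = J.T @ J                                                 # information matrix

# Pose (gauge) direction: rotate every point and shift every angle equally.
e0 = np.concatenate([np.column_stack([z, -x]).ravel(), np.ones(len(thetas))])
residual = np.linalg.norm(A @ e0)
lam = np.linalg.eigvalsh(A)                                 # ascending order

print(f"|A e0| = {residual:.2e}")
print(f"two smallest eigenvalues: {lam[0]:.2e}, {lam[1]:.2e}")
```

With more than two frames at distinct angles, only this single eigenvalue vanishes in the example; the next one is small but nonzero, which corresponds to the weak form of the bas-relief ambiguity.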

5.1 Two frames: the bas-relief ambiguity

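Let us say we only have two frames, and we have fixed $\theta_0 = 0$, $c_0 = 1$, $s_0 = 0$, $\theta_1 = \theta$, $c_1 = c$, $s_1 = s$ (Figure 3). Then

$$\mathbf{A}_{\mathbf{p}_i} = \begin{bmatrix} 1 + c^2 & -cs \\ -cs & s^2 \end{bmatrix}, \qquad \mathbf{A}_{\mathbf{p}_i \mathbf{m}} = -(c z_i + s x_i) \begin{bmatrix} c \\ -s \end{bmatrix}, \qquad \mathbf{A}_{\mathbf{m}} = \sum_i (c z_i + s x_i)^2.$$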

Figure 3: Orthographic projection, two frames: (a) stationary first frame, (b) antisymmetric cameras. The solid lines indicate the viewing rays, while the thin lines indicate the optical axes and image planes. The diagonal dashed lines are the displaced viewing rays, while the ellipses indicate the positional uncertainty in the reconstruction due to uncertainty in motion (indicated as $\delta\theta$).

The bas-relief ambiguity manifests itself as a null eigenvector

$$\mathbf{e}_0 = \left[ 0 \;\; c z_0 + s x_0 \;\; \cdots \;\; 0 \;\; c z_N + s x_N \;\; -s \right]^T,$$

as can be verified by inspection. This is as we expected, i.e., the primary uncertainty in the structure is entirely in the depth ($z$) direction, and is essentially a scale uncertainty (approximately proportional to $z$). Note, however, that this uncertainty is strictly proportional to $c z + s x$ rather than $z$, as can be seen by inspecting Figure 3a.
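The verification "by inspection" can also be done numerically. The sketch below, in Python, assumes two frames with $\theta_0 = 0$ held fixed and $\theta_1 = \theta$ unknown, the imaging model $u_{ij} = \cos\theta_j\, x_i - \sin\theta_j\, z_i$, unit-variance noise, and the parameter ordering $[x_0, z_0, \ldots, x_N, z_N, \theta]$. The point coordinates and the value of $\theta$ are hypothetical.

```python
import numpy as np

theta = 0.1                                   # hypothetical inter-frame rotation
c, s = np.cos(theta), np.sin(theta)
x = np.array([-1.0, -0.5, 0.0, 0.5, 1.0])     # hypothetical points
z = np.array([0.3, -0.8, 0.5, 0.9, -0.9])
n = len(x)

# Jacobian with parameter order [x_0, z_0, ..., x_{n-1}, z_{n-1}, theta].
J = np.zeros((2 * n, 2 * n + 1))
for i in range(n):
    J[i, 2 * i] = 1.0                         # frame 0: u_i0 = x_i
    J[n + i, 2 * i] = c                       # frame 1: du/dx_i
    J[n + i, 2 * i + 1] = -s                  # frame 1: du/dz_i
    J[n + i, 2 * n] = -(c * z[i] + s * x[i])  # frame 1: du/dtheta
A = J.T @ J

# Candidate null vector: depth change c*z_i + s*x_i, angle change -s.
e0 = np.zeros(2 * n + 1)
e0[1:2 * n:2] = c * z + s * x
e0[-1] = -s

residual = np.linalg.norm(A @ e0)
lam = np.linalg.eigvalsh(A)                   # ascending order
print(f"|A e0| = {residual:.2e}, smallest eigenvalues: {lam[0]:.2e}, {lam[1]:.2e}")
```

Because $\theta_0$ is fixed, the pose ambiguity is already removed in this setup, so the single zero eigenvalue found here is attributable to the bas-relief ambiguity alone.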

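An alternative parameterization of the two-frame problem is to set $\theta_0 = -\theta_1$ (Figure 3b), in which case we have

$$\mathbf{A}_{\mathbf{p}_i} = \begin{bmatrix} 2c^2 & 0 \\ 0 & 2s^2 \end{bmatrix} \qquad (27)$$

$$\mathbf{A}_{\mathbf{p}_i \mathbf{m}} = \begin{bmatrix} -2cs\, x_i \\ 2cs\, z_i \end{bmatrix} \qquad (28)$$

$$\mathbf{A}_{\mathbf{m}} = \left[ 2c^2 Z + 2s^2 X \right]. \qquad (29)$$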
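Equations (27) to (29) can be checked by assembling the information matrix for the antisymmetric configuration directly. The Python sketch below assumes frames at $-\theta$ and $+\theta$ sharing a single motion parameter $\theta$, the imaging model $u_{ij} = \cos\theta_j\, x_i - \sin\theta_j\, z_i$, unit-variance noise, and $X = \sum_i x_i^2$, $Z = \sum_i z_i^2$. The numerical values are hypothetical.

```python
import numpy as np

theta = 0.1                                   # hypothetical half-angle
c, s = np.cos(theta), np.sin(theta)
x = np.array([-1.0, -0.5, 0.0, 0.5, 1.0])     # hypothetical points
z = np.array([0.3, -0.8, 0.5, 0.9, -0.9])
n = len(x)

# Jacobian with parameter order [x_0, z_0, ..., x_{n-1}, z_{n-1}, theta].
J = np.zeros((2 * n, 2 * n + 1))
for i in range(n):
    # Frame 0 at angle -theta: u_i0 = c*x_i + s*z_i.
    J[i, 2 * i], J[i, 2 * i + 1] = c, s
    J[i, 2 * n] = -s * x[i] + c * z[i]
    # Frame 1 at angle +theta: u_i1 = c*x_i - s*z_i.
    J[n + i, 2 * i], J[n + i, 2 * i + 1] = c, -s
    J[n + i, 2 * n] = -(s * x[i] + c * z[i])
A = J.T @ J

# Blocks for the first point and for the motion parameter.
A_p0 = A[0:2, 0:2]
A_p0m = A[0:2, 2 * n]
A_m = A[2 * n, 2 * n]

X, Z = np.sum(x ** 2), np.sum(z ** 2)
print(A_p0, A_p0m, A_m, sep="\n")
```

In this parameterization the structure block is diagonal, so the lateral and depth components of each point decouple and are linked only through the motion parameter.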

5.2 More than two frames, equi-angular motion constraint
