
If µt and µ0 are the mean intensity values of the current and the initial patch, and if σt and σ0 are their standard deviations respectively, then the contrast λ and brightness δ can be predicted by

$$\lambda = \frac{\sigma_t}{\sigma_0} \tag{7.28}$$

$$\delta = \mu_t - \frac{\sigma_t}{\sigma_0}\,\mu_0 \tag{7.29}$$

The predicted illumination parameters λ and δ describe the illumination correction of a patch extracted from the current image, such that T(x) = λ I(g(x; p)) + δ holds.
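As a minimal NumPy sketch (function and variable names are illustrative, not from the implementation), the prediction of Eqs. (7.28) and (7.29) and a quick consistency check might look as follows:

```python
import numpy as np

def predict_illumination(patch_t, patch_0):
    """Predict contrast lambda and brightness delta from the mean and
    standard deviation of the current and the initial patch,
    following Eqs. (7.28) and (7.29)."""
    mu_t, sigma_t = patch_t.mean(), patch_t.std()
    mu_0, sigma_0 = patch_0.mean(), patch_0.std()
    lam = sigma_t / sigma_0          # Eq. (7.28)
    delta = mu_t - lam * mu_0        # Eq. (7.29)
    return lam, delta

# Consistency check: the predicted parameters relate the statistics of
# the initial patch to those of the current patch exactly.
rng = np.random.default_rng(0)
patch_0 = rng.uniform(0.0, 255.0, (11, 11))   # initial template patch
patch_t = 1.3 * patch_0 + 12.0                # same patch under new lighting
lam, delta = predict_illumination(patch_t, patch_0)
print(lam, delta)                             # -> approximately 1.3, 12.0
```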

7.5. Experimental Evaluation


Figure 7.1.: Illustration of the feature tracking and reconstruction process. All images represent the state of the features at the same camera frame. In (a) the results of the 2D feature tracking step are shown. In (b) the reference object and the 3D covariances of the reconstructed feature points are shown. In (c) the 3D covariances are projected into the image plane: features with a high absolute value of the covariance are colored red, features with a small uncertainty are colored green. In (d) the original image is augmented with the line model of the reference object and a virtual character standing on the table.


Figure 7.2.: Tracking results demonstrating the ability to handle occlusion and even the complete removal of the reference object from the scene.

Image (c) of Figure 7.1 illustrates the same uncertainty regions of the feature points as a projection into the image plane. The absolute value of the uncertainty is color coded such that the color shifts from red to green as the precision increases.
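The red-to-green coding can be reproduced with a simple linear mapping; the normalization bounds in this sketch are illustrative assumptions, not values from the thesis:

```python
import numpy as np

def uncertainty_color(u, u_min, u_max):
    """Map an uncertainty magnitude u (e.g. the norm of a feature's
    covariance) to an RGB triple between green (precise) and red
    (uncertain)."""
    t = np.clip((u - u_min) / (u_max - u_min), 0.0, 1.0)
    return (t, 1.0 - t, 0.0)  # t = 0 -> pure green, t = 1 -> pure red
```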

Finally, in (d) the original image is overlaid with the line model used to initialize the tracking and with an additional virtual character standing on the table. The purpose of this view is to evaluate whether the camera pose is estimated properly and whether virtual objects remain correctly placed in the scene.

Figure 7.2 shows the tracking results of another sequence of the same scenario with an image size of 320×240 pixels. This time the initialization object is occluded and then removed from the scene entirely. Since enough other features have been triangulated and refined successfully, tracking can be maintained. The line model used for augmentation remains fixed at the same position in the real world.

In another sequence an industrial control unit is used as the reference object; Figure 7.3 shows some frames of this sequence. Again the tracking is initialized with a line model, which is generated from a given VRML model. After the initialization, feature points are extracted in the whole image, but only those points located on the known geometry can be used immediately for the camera tracking. The other features are triangulated and refined while the camera moves through the scene. Again it can be observed that, due to the refinement process, the uncertainties of the reconstructed feature points shrink during tracking. When a person moves into the scene, some features are occluded and the 2D tracking step for these features fails. In the first column of Figure 7.3



Figure 7.3.: Tracking results demonstrating the robustness against occlusion. In (a) the KLT features are shown, in (b) the covariances of the reconstructed features, and in (c) the features with their 2D uncertainties.

all those occluded features are colored red. Since enough valid 2D features are available in every frame, the pose can be estimated successfully despite the occlusion.
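The initial triangulation of the unknown features described above can be sketched with a standard two-view DLT; the thesis refines these points and their covariances recursively, so the one-shot solution below only illustrates the initialization step (the projection matrices and pixel positions are assumed inputs):

```python
import numpy as np

def triangulate_dlt(P1, P2, x1, x2):
    """Linear two-view triangulation. P1 and P2 are the 3x4 camera
    projection matrices of two frames; x1 and x2 are the tracked 2D
    positions of the same feature in pixels."""
    A = np.stack([
        x1[0] * P1[2] - P1[0],
        x1[1] * P1[2] - P1[1],
        x2[0] * P2[2] - P2[0],
        x2[1] * P2[2] - P2[1],
    ])
    # The homogeneous 3D point is the right singular vector belonging
    # to the smallest singular value of A.
    _, _, Vt = np.linalg.svd(A)
    X = Vt[-1]
    return X[:3] / X[3]  # dehomogenize
```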

7.5.2. Runtime Analysis

We analyzed the processing time needed for tracking and reconstruction on a Pentium IV with 3 GHz. The results are shown in Table 7.1. The time for processing one frame depends strongly on the number of features in the current field of view. In this sequence, 28.5 features on average were used for tracking in every frame, and a total of 38.81 milliseconds was measured for 2D feature registration, pose estimation, and reconstruction, i.e. the sum of the two average timings in Table 7.1 (28.84 ms + 9.97 ms). The feature map contained 38.4 features on average.

       2D registration + pose estimation   reconstruction
Avg.   28.84 ms                            9.97 ms
Min.   10.55 ms                            4.15 ms
Max.   67.63 ms                            32.86 ms

Table 7.1.: Average, minimum, and maximum time in milliseconds needed for processing one frame of the truck sequence.




Additional processing time is needed for the acquisition of the images and for the rendering of the virtual objects. Altogether, the system can handle frame rates of up to 20 Hz when a reasonable number of features is used for tracking.

7.5.3. Surface Normal Reconstruction

For the evaluation of the surface normal reconstruction, a sequence of a desktop scene is used. The camera tracking is initialized with a reference image and a randomized trees keypoint classification. The tracking is tested both with updating the template and with keeping the template as it was captured at its first appearance. With both methods the camera can be tracked throughout the whole sequence.
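The difference between the two template policies can be stated compactly; `extract_patch` and `register_patch` stand in for the patch extraction and 2D registration steps and are assumptions of this sketch, not the thesis implementation:

```python
def track_feature(frames, init_pos, extract_patch, register_patch,
                  update_template=False):
    """Track one feature through a sequence, either keeping the
    template from its first appearance (update_template=False) or
    replacing it with the freshly registered patch in every frame."""
    template = extract_patch(frames[0], init_pos)   # first appearance
    pos = init_pos
    for frame in frames[1:]:
        pos = register_patch(frame, template, pos)  # 2D registration step
        if update_template:
            # Adapts to appearance changes, but small registration
            # errors can accumulate as drift over time.
            template = extract_patch(frame, pos)
    return pos
```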

To analyze the quality of the surface normal reconstruction, the 3D patches are rendered together with their normals. In Figure 7.4 a frame of the image sequence and the reconstructed planar patches together with their surface normal directions are shown.

Figure 7.4.: In (a) a frame of the image sequence is shown together with the tracked template patches. In (b) the reconstructed 3D patches and their surface normal directions are drawn. The blue rectangle illustrates the corresponding plane of a feature.

At the beginning, when a feature is first observed, the normal vector points towards the camera position; as the camera is moved around the scene, the estimated normal converges towards the true surface normal direction of the feature. At points where only a poor alignment of a template patch is possible, e.g. on edges or object borders, the reconstructed normal cannot be estimated correctly.
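A small sketch of the initialization described above (names are illustrative): the normal starts pointing from the feature towards the camera center of its first observation, and the angular error to a reference normal can be used to monitor the convergence.

```python
import numpy as np

def init_normal(point_3d, camera_center):
    """Initial guess: the patch normal points from the 3D feature
    towards the camera position of its first observation."""
    n = camera_center - point_3d
    return n / np.linalg.norm(n)

def normal_error_deg(n_est, n_true):
    """Angle in degrees between the estimated and a reference normal."""
    c = np.clip(float(np.dot(n_est, n_true)), -1.0, 1.0)
    return float(np.degrees(np.arccos(c)))
```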