Summary - Automated visual inspection of assemblies from monocular images

4 Assembly Inspection

4.4 Summary

of conventional kernel particle filtering. The first extension was motivated by the finding that density plateaus and narrow peaks can impair the efficiency of mean shift based gradient ascent. The proposed extension manipulates the weighting functions in a way that induces a coarse-to-fine behavior in the mean shift iterations. The second extension aims at selecting bandwidth parameters such that the bias and variance of the kernel density estimates are minimized for a given number of particles. To achieve this minimization, the theoretically most promising adaptive kernel method was employed, together with the variable bandwidth mean shift for particle sets. The third extension aims at improving the scalability of kernel particle filtering. It was shown that the existing approach does not scale to the state space of multi-part assemblies because the underlying kernel density estimation process suffers from the curse of dimensionality. This problem was solved by contributing a state space decomposition scheme that divides the state space into subspaces of computationally tractable dimensionality. The theoretical limits and advantages of this heuristic were covered in detail.

The three extensions were finally combined within the extended kernel particle filtering (EKPF) algorithm. The algorithm was explained in detail and shown to exhibit moderate memory consumption. Furthermore, it was found to exhibit a worst-case time complexity that is quadratic in the number of particles, and linear in the number of mean shift iterations, assembly model features, and oriented bounding boxes. To our knowledge, this algorithm is the first particle filter that facilitates an accurate and precise localization of multi-part assemblies from monocular images. It has been reported in [SS06], which was published and presented at the DAGM 2006 Conference in Berlin². Its performance aspects are evaluated in the following chapter.
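The quadratic dependence on the number of particles can be made plausible with a generic mean shift step. The following is a minimal sketch, not the EKPF itself: it assumes one-dimensional states, a Gaussian kernel with a fixed bandwidth, and hypothetical names (`mean_shift_step`, `kernel_evaluations`). Each of the N particles is shifted towards a kernel-weighted mean over all N particles, so one iteration needs N × N kernel evaluations.

```python
import math

def mean_shift_step(particles, weights, bandwidth):
    """One Gaussian-kernel mean shift step for every particle (1-D states).

    Illustrative only: fixed bandwidth, positive weights assumed.
    """
    shifted = []
    kernel_evaluations = 0
    for x in particles:                        # N positions to be shifted ...
        numerator = 0.0
        denominator = 0.0
        for y, w in zip(particles, weights):   # ... each sums over all N particles
            k = w * math.exp(-0.5 * ((x - y) / bandwidth) ** 2)
            numerator += k * y
            denominator += k
            kernel_evaluations += 1
        shifted.append(numerator / denominator)
    return shifted, kernel_evaluations

# Hypothetical particle set with uniform weights.
particles = [0.0, 0.2, 0.9, 1.0, 1.1]
weights = [1.0] * len(particles)
shifted, n_eval = mean_shift_step(particles, weights, bandwidth=0.3)
print(n_eval)  # 25, i.e. 5 * 5 kernel evaluations for 5 particles
```

Repeating this step for a fixed number of iterations multiplies the cost by that number, which matches the linear dependence on the mean shift iterations stated above.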

Finally, the third section of this chapter discussed the classification of part completeness and pose integrity, providing illustrative examples for both problems. As a conceptual solution to the problem of classifying part completeness, a Bayes classifier was proposed. It was shown how such an approach would allow the image cues from the pose localization module to be reused. Furthermore, NN classifiers and decision trees were identified as eligible techniques for the task of pose integrity classification.

² The paper presents central parts of the whole inspection system. Due to space limitations, it covers only the variable bandwidth extension of the EKPF algorithm. Furthermore, the paper discusses the processing of single images only.

5 Evaluation

Documenting the measurement accuracy and precision of a pose localization system such as the EKPF is difficult because many parameters influence the pose estimation process.

For example, the imaging process might capture objects from different distances or with varying zoom settings. The resulting images are of different scales. With a small image scale, each pixel represents a small area of the object space, so objects appear large within the image. The larger the image scale grows, the smaller the respective objects appear under projection onto the image plane. Clearly, the EKPF will perform better for objects that appear large within an image than for objects that appear small. Further influences on the localization accuracy and precision arise, e.g., from the perspective under which an object is perceived, lighting, clutter, the inspected objects, and the employed models.
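The relation between distance, zoom setting, and image scale can be illustrated with a small calculation. This is a sketch under stated assumptions: a pinhole camera whose distance to the object is much larger than the focal length, an object surface parallel to the image plane, and purely hypothetical values for distance, pixel pitch, focal length, and object size. The function names are illustrative as well.

```python
def image_scale_mm_per_px(distance_mm, focal_length_mm, pixel_pitch_mm):
    """Object-space length covered by one pixel (pinhole approximation)."""
    return distance_mm * pixel_pitch_mm / focal_length_mm

def apparent_size_px(object_size_mm, scale_mm_per_px):
    """Extent of a fronto-parallel object in the image, in pixels."""
    return object_size_mm / scale_mm_per_px

# Hypothetical values, chosen only for illustration.
distance_mm = 600.0
pixel_pitch_mm = 0.005
object_size_mm = 30.0

for focal_length_mm in (30.0, 10.0):
    scale = image_scale_mm_per_px(distance_mm, focal_length_mm, pixel_pitch_mm)
    size = apparent_size_px(object_size_mm, scale)
    print(f"f = {focal_length_mm} mm: {scale:.2f} mm/px, object spans {size:.0f} px")
```

With these assumed numbers, the shorter focal length triples the image scale, and the same object covers only a third of the pixels along each axis.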

In order to illustrate how well the EKPF can localize assemblies, four different experimental investigations were conducted that document the system performance under varying conditions. Table 5.1 presents an overview that briefly describes the key issues.

Each experimental investigation involves the pose estimation of an individual object, with recovered DOF ranging from 5 up to 29. The localized objects are chosen from two application domains, namely a real industrial inspection scenario for experimental investigation 2, and assemblies built from the wooden building blocks provided by the baufix® construction set for experimental investigations 1, 3, and 4. The former domain allows the achieved localization performance to be compared with that of an existing inspection system. The parts of the baufix® domain have the advantage of being widely available and standardized. Concerning the pose estimation task, they are very challenging because they are uniformly colored and thus provide no texture that could be exploited as an image cue. Furthermore, the colored surfaces yield strong specular reflections and the edges

Table 5.1: Overview of the experimental investigations

Exp. No.  Assembly      Recovered DOF  Key issues
1         Screw-Cube    6              Varying perspective and image scale
2         Oil Cap       5              Industrial application, EKPF extensions
3         Toy Airplane  28             Multi-part assembly, no clutter
4         Toy Axle      29             Multi-part assembly, clutter, model optimization

Figure 5.1: A screw-cube assembly under two different image scales and perspectives. a) The assembly is perceived under a 60° elevation from the yz-plane of the depicted coordinate system that is attached to the cube. The image scale is 0.1 mm per pixel. b) The same assembly, perceived under an elevation of 0° and an image scale of 0.3 mm per pixel.

of all parts are rounded, both of which degrade the quality of the resulting contour edges. With respect to the models, the true shape dimensions of the real wooden parts deviate by up to 3% relative to the largest model extent.

5.1 Experimental Investigation 1

The first experimental investigation examines the effects of the first two influencing factors stated above, namely image scale and perspective. In order to keep the effects of other influencing factors as small as possible, a setting of low complexity is used. It is described and discussed in the following.

5.1.1 Methodology and Data Sets

The inspected assembly is illustrated in Fig. 5.1. It simply consists of a wooden screw that is screwed into a wooden cube. The picture also shows that the cube is held fixed. Accordingly, the pose localization only needs to recover the pose parameters of the screw, which is done relative to the cube. Among the recovered pose parameters, only the z-axis rotation and translation are considered in the following because only these two parameters could be reliably recorded as ground truth. The ground truth was recorded manually by means of a goniometer and vernier calipers. The accuracy of such measurements is expected to be better than 1° and 1 mm.

With this assembly, a total of 1000 image measurements were recorded in the following way. First, the cube was positioned at an elevation of 0° and a distance of 60 cm from a statically placed camera. The camera’s zoom lens was then adjusted to yield images with a scale of 0.1 mm per pixel. Afterwards, the camera was calibrated and 125 images were captured that show the assembly with five different screw positions. The true screw translations and rotations w.r.t. the z-axis were recorded manually for each individual screw position. This procedure was repeated for elevations of 30°, 60°, and 90°. Furthermore, the assembly was recorded under the same four elevation angles, but with an image scale of 0.3 mm per pixel and a recalibrated camera. The scene was illuminated with two 110 W cold light lamps that were statically placed to the left and right of the camera, in addition to the overhead neon lights of the lab.
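The total of 1000 images follows directly from the acquisition scheme: two image scales, four elevation angles, and 125 images per combination. The short sketch below only restates this arithmetic; the variable names are illustrative.

```python
from itertools import product

image_scales_mm_per_px = [0.1, 0.3]
elevations_deg = [0, 30, 60, 90]
images_per_setting = 125

# Every combination of image scale and camera elevation is one recording setting.
settings = list(product(image_scales_mm_per_px, elevations_deg))
total_images = len(settings) * images_per_setting
print(len(settings), total_images)  # 8 1000
```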

In order to evaluate the data, the retrieved pose information was separated into 8 sets. These corresponded to the image measurements that were recorded under the two different image scales and four camera elevation angles. Each set consisted of 125 retrieved screw poses. For each set, the deviations of the recovered pose parameters from the manually measured ground truth were determined. Based on these deviations from the ground truth, the mean pose estimation errors and standard deviations were calculated.

In the following, the mean error w.r.t. pose parameter deviations from the ground truth is used to document the absolute system accuracy, while the standard deviation is used to characterize the absolute precision.
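The two statistics can be computed per set as sketched below. The sketch assumes signed deviations (estimate minus ground truth) and the sample standard deviation; the text does not state which estimator was used, so both choices are assumptions. The function name and the example values are hypothetical.

```python
from statistics import mean, stdev

def accuracy_and_precision(estimates, ground_truth):
    """Return (mean error, standard deviation) of the signed deviations.

    Assumption: sample standard deviation (n - 1 in the denominator).
    """
    errors = [e - g for e, g in zip(estimates, ground_truth)]
    return mean(errors), stdev(errors)

# Hypothetical z-axis translations in mm for a single screw position.
estimates = [10.4, 9.8, 10.1, 9.7]
ground_truth = [10.0, 10.0, 10.0, 10.0]

mean_error, std_dev = accuracy_and_precision(estimates, ground_truth)
print(f"mean error: {mean_error:.2f} mm, standard deviation: {std_dev:.2f} mm")
```

A mean error close to zero combined with a noticeable standard deviation, as in this made-up example, indicates estimates that scatter around the true value without a systematic offset.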

5.1.2 Results

Figure 5.2 illustrates the results of the first experimental investigation for the measurements with a small image scale of 0.1 mm per pixel. It can be seen from Fig. 5.2(a) that the screw rotation is measured most accurately and precisely under a camera elevation angle of 0°. In this case, the hexagonal screw head is perceived straight from above, such as in Fig. 5.1(b). Given this perspective, the mean error of the screw rotation is smaller than 0.1°, with a standard deviation of 1.9°. For increasing camera elevation angles, the localization performance degrades quickly, so that at an elevation of 90° no meaningful determination of the rotation parameter is feasible. The reason for this finding is that the screw shape exhibits strong rotational symmetries. The higher the elevation angle, the smaller the shape changes that result from a screw rotation around the z-axis.

Figure 5.2(b) shows that, with regard to the screw translation, the measurement accuracy and precision develop quite differently. Here, a camera perspective associated with a 0° elevation yields the highest mean error of 4.3 mm and a standard deviation of 4.7 mm,

Figure 5.2: The mean pose error and standard deviation at an image scale of 0.1 mm per pixel. a) Recovering the screw rotation around the z-axis, under four different camera elevation angles. b) Recovering the screw translation along the z-axis.

Figure 5.3: The mean pose error and standard deviation at an image scale of 0.3 mm per pixel. a) Recovering the screw rotation around the z-axis, under four different camera elevation angles. b) Recovering the screw translation along the z-axis.

while the smallest values are achieved from a side-look position at 90° elevation (0.2 mm mean error and standard deviation). The reason for this finding is that, from a side-look, a small translation of the screw along the z-axis yields a large change of the screw position within the image plane. In contrast, the only changes that a z-axis translation induces in an image perceived from 0° elevation result from depth changes, which are comparatively small under the given setup.
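The difference between the two viewpoints can be estimated with a rough calculation. The sketch below assumes a pinhole camera, the 60 cm camera distance and 0.1 mm per pixel image scale of the setup, a translation of 1 mm towards the camera, and a hypothetical screw feature located 10 mm off the optical axis. The function names are illustrative.

```python
def side_view_shift_px(dz_mm, scale_mm_per_px):
    """Image shift when the translation is parallel to the image plane."""
    return dz_mm / scale_mm_per_px

def top_view_shift_px(dz_mm, offset_mm, distance_mm, scale_mm_per_px):
    """Image shift of an off-axis feature when the translation is a pure
    depth change towards the camera (pinhole model)."""
    offset_px = offset_mm / scale_mm_per_px
    return offset_px * dz_mm / (distance_mm - dz_mm)

dz_mm = 1.0            # z-axis translation of the screw
scale = 0.1            # mm per pixel, as in the small-scale recordings
distance_mm = 600.0    # camera distance of 60 cm
offset_mm = 10.0       # hypothetical lateral offset of a screw feature

side = side_view_shift_px(dz_mm, scale)
top = top_view_shift_px(dz_mm, offset_mm, distance_mm, scale)
print(f"side view: {side:.1f} px, top view: {top:.2f} px")
```

Under these assumptions, the same translation moves the screw by about 10 pixels in the side view but changes the top view by well under one pixel, which is consistent with the larger translation errors observed at 0° elevation.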
