
7.4 The Calculation of the Homology

7.4.4 Results

Each of the 16 algorithms for the calculation of the axis and 3 algorithms for the calculation of the vertex were run on a total of 49 images² of 6 SORs (see Fig. 7.6) which have previously appeared in publications about the recognition of SORs [3–5].

This resulted in 48 different values for the harmonic homology in each image and 48×49 = 2352 different harmonic homologies overall. For each homology I also calculated the residual as described in Sec. 7.4.3 and used this to determine the relative goodness of fit for each approach.

² The relevant contours and bitangents were selected by hand, so as not to confound the comparison with additional issues.
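The following sketch (Python/NumPy, not from the thesis) illustrates the structure of this experiment: every axis algorithm is paired with every vertex algorithm, the resulting harmonic homology is built from the estimated axis and vertex via the standard construction H = I − 2·v·aᵀ/(aᵀv), and each homology is scored with a residual function. All names, interfaces, and the estimator and residual callables are hypothetical placeholders for the methods described earlier in the thesis.

```python
import numpy as np

def harmonic_homology(axis, vertex):
    """Harmonic homology H = I - 2 * v a^T / (a^T v) for an axis a (a line,
    as a homogeneous 3-vector) and a vertex v (a point, as a homogeneous
    3-vector).  H is an involution which fixes the axis pointwise and fixes
    the vertex."""
    a = np.asarray(axis, dtype=float)
    v = np.asarray(vertex, dtype=float)
    return np.eye(3) - 2.0 * np.outer(v, a) / (a @ v)

def evaluate_all(images, axis_algorithms, vertex_algorithms, residual):
    """Residual of every (axis algorithm, vertex algorithm) combination on
    every image.  `axis_algorithms` map an image to a homogeneous line,
    `vertex_algorithms` map an image to a homogeneous point, and `residual`
    scores a homology against the image's contours (all placeholders)."""
    n_combos = len(axis_algorithms) * len(vertex_algorithms)
    res = np.empty((n_combos, len(images)))
    for i, img in enumerate(images):
        for a, est_axis in enumerate(axis_algorithms):
            for v, est_vertex in enumerate(vertex_algorithms):
                H = harmonic_homology(est_axis(img), est_vertex(img))
                res[a * len(vertex_algorithms) + v, i] = residual(H, img)
    return res  # shape (48, 49) for 16 axis and 3 vertex algorithms
```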


Figure 7.6: 45 of the 49 images of SORs used.


Figure 7.7: Range of residuals. For each algorithm the minimum, median and maximum residual are plotted (note the discontinuity along the ordinate); one panel per vertex model (V=0, V=2, V=14), axis models 0–15 along the abscissa (grouped by the features used: intra, inter, intra + crosspoints, inter + crosspoints), error (min/median/max) along the ordinate, with the legend distinguishing implicit/explicit error-model and homogeneous/Euclidean variants. The table shows the numerical values for the second best vertex model:

Axis model   0    4    8    12   1    5    9    13   2    6    10   14   3    7    11   15
max          18.  20.  25.  24.  41.  37.  15.  32.  13.  13.  15.  15.  123  20.  5.0  5.0
med          2.8  2.6  2.4  2.4  3.2  2.7  1.4  1.4  2.2  2.2  2.0  1.8  3.9  2.1  1.1  1.2
min          .40  .46  .41  .41  .40  .44  .32  .36  .49  .39  .32  .32  .41  .41  .34  .35

When comparing these different algorithms it is important to remember that any algorithm could perform best for one particular set of features due to statistical fluctuations (and we will see in Sec. 8.1 that the particular shapes of some of the objects do indeed skew the outcome). To alleviate these effects of random fluctuations, I use different measures of fitness to assess the quality of the algorithms. These measures are either based on the actual residual calculated (in Sec. 7.4.4.1), or on an algorithm's relative performance compared to all other algorithms, its ranking (in Sec. 7.4.4.2).

7.4.4.1 Absolute Performance

Figure 7.7 shows the range (minimum, median, and maximum) of residuals encountered for each of the 48 combinations of axis and vertex models, ordered by axis model firstly and features used secondly. In the following I will mostly be interested in the maximum residual calculated, as some algorithms can clearly produce unacceptably wrong results, which of course need to be avoided if the number of false negatives is to be kept small; only then will I consider the median residual, which gives information about the algorithms' average performance. The minimum residual is of little interest to us since, given enough trials, it will always be of the order of ε ≈ √π.

Figure 7.8: Cases of maximum residual for three axis models (using the best vertex model); the graphs show the left contour mapped onto the right one. Panels: axis model 3, vertex model 2, r = 122.763; axis model 9, vertex model 2, r = 14.5296; axis model 15, vertex model 14, r = 5.03427.
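For concreteness, the quantities plotted in Fig. 7.7 are simply per-algorithm order statistics of the residuals; a minimal sketch, assuming the hypothetical (algorithms × images) residual matrix from the earlier sketch:

```python
import numpy as np

def residual_range(res):
    """Per-algorithm minimum, median and maximum residual over all images,
    i.e. the three values plotted for each algorithm in Fig. 7.7.
    `res` has one row per algorithm and one column per image."""
    return res.min(axis=1), np.median(res, axis=1), res.max(axis=1)

# The criterion emphasised in the text, the smallest worst-case residual:
# best_worst_case = int(np.argmin(res.max(axis=1)))
```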

I have already mentioned that the actual algorithm used for the calculation of the vertex is of little importance for our comparison (assuming SORs), and this is borne out by Fig. 7.7, where results look similar for all three vertex models. I will therefore discuss algorithms by axis model in the following.

Maybe the most interesting result, when studying Fig. 7.7, and at first glance contrary to this thesis' line of argument, is that a more complicated error-model will not necessarily improve results; using more features, on the other hand, can in fact considerably decrease performance. The former can easily be seen in the case where only intra-pair features are used (first and third block in Figure 7.7). These features are reasonably reliable, so that good results can be obtained even without an explicit error-model, while the error-model used is obviously not completely accurate, at least in some cases: we see from the table in Fig. 7.7 that an explicit error-model did in all cases reduce the median (and in most cases also the minimum) error, just not the maximum error. However, even the maximum error decreases once we also use the less reliable interpair features together with an explicit error-model (ignoring models 9 and 13 which, according to theory, should have performed similarly, but surprisingly did not).

The latter, that more features can give worse results, can be seen when we compare the algorithms using intra-pair bitangent intersections, 0, 4, 8, and 12 (the first block), with the ones using interpair intersections, 1, 5, 9, and 13 (the second block): the maximum and median error actually increase for the algorithms which do not use an explicit error-model, although many more features are used (2N(N−1) versus N features). The most striking example is provided by Algorithm 3, which uses all available feature points and simple orthogonal regression in the image plane. This is the most commonly used model for the calculation of a line through several points, and the one which performed so well when fitting a line to edgels in Section 4.3; but clearly the solutions found by this approach cannot be relied upon at all in applications where the underlying assumption of i.i.d. measurements is not valid.
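Plain orthogonal regression (total least squares) is the line fit referred to here. A minimal sketch of that fit, which implicitly assumes identical, isotropic noise on every point, might look as follows; names and interface are mine, not the thesis'.

```python
import numpy as np

def fit_line_orthogonal(points):
    """Plain orthogonal regression (total least squares): the line minimising
    the sum of squared perpendicular distances to the given 2-D points.
    Returns the line as a homogeneous 3-vector (a, b, c), a*x + b*y + c = 0,
    with unit normal (a, b)."""
    pts = np.asarray(points, dtype=float)
    centroid = pts.mean(axis=0)
    # The normal of the best-fit line is the right singular vector belonging
    # to the smallest singular value of the centred point matrix.
    _, _, vt = np.linalg.svd(pts - centroid)
    normal = vt[-1]
    return np.array([normal[0], normal[1], -normal @ centroid])
```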

Even the median residual for this method is higher than that of any other model, and the maximum error can be absolutely intolerable. For the algorithms with an explicit error-model, on the other hand, the median error decreases drastically, by about 70%. Figure 7.8 illustrates how the maximum errors from Algorithms 3, 9 and 15 affect the result, to give an idea of how good or bad the relative errors are.
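In practice, an explicit error-model means that unequally reliable features are weighted accordingly rather than trusted equally. The thesis propagates full covariances for each feature point; as a simplified stand-in, the sketch below only assumes an isotropic standard deviation per point. It is an illustration of the principle, not the thesis' actual algorithm, and all names are hypothetical.

```python
import numpy as np

def fit_line_weighted(points, sigmas):
    """Orthogonal line fit minimising the weighted sum of squared
    perpendicular distances, weight 1/sigma_i^2 per point.  Unreliable
    features therefore contribute less to the estimate instead of being
    trusted equally (the i.i.d. assumption of the plain fit above)."""
    pts = np.asarray(points, dtype=float)
    w = 1.0 / np.asarray(sigmas, dtype=float) ** 2
    centroid = (w[:, None] * pts).sum(axis=0) / w.sum()
    d = pts - centroid
    scatter = (w[:, None] * d).T @ d               # weighted 2x2 scatter matrix
    # Normal of the best line = eigenvector of the smallest eigenvalue
    # (np.linalg.eigh returns eigenvalues in ascending order).
    _, eigvecs = np.linalg.eigh(scatter)
    normal = eigvecs[:, 0]
    return np.array([normal[0], normal[1], -normal @ centroid])
```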

The third thing to be learned from Fig. 7.7 is that the additional use of crosspoints will, as a rule, improve the results calculated. This is particularly true for algorithms which use an explicit error-model.

To sum up: as expected, a high number of features is indeed preferable, but only if used together with an explicit error-model; without such a model the emphasis should be put on accurate rather than numerous features. This is in direct conflict with the assumption underlying many algorithms that more features are always better. And although even the algorithm with the lowest maximum residual (axis model 15, vertex model 14, the most refined model using the most features) will produce noticeable errors for some input constellations, we can see from Fig. 7.8, right, that its results are even in the worst case much more usable than those of some of the other algorithms (Fig. 7.8, left and middle). It should also be noted that this particular object is actually not quite symmetric, although in this case 8 out of the 48 algorithms tested performed better. Ranking the relative performance of all algorithms is indeed another possibility for determining fitness, and will be done in the next section.

7.4.4.2 Relative Performance

Although any algorithm might return the smallest residual for one particular outline, we would nonetheless expect that the better an algorithm is suited to the task, the more often it should show up among the best N algorithms; conversely, the more often it is placed among the worst N algorithms, the more unsuitable we would deem this algorithm. Table 7.1 lists, for each algorithm, how often it was observed among the best 3 algorithms. From this table it seems as if performance is mostly a matter of the features used: all algorithms from the first block (intra-pair intersections only) perform considerably worse than any of the algorithms from the last block, which use the maximum number of features, and algorithms using an intermediate number of features perform somewhere in between. The picture becomes somewhat clearer if we also consider the 3 worst algorithms, shown in Table 7.2. Algorithms 0–7, which use no explicit error-model, account for 84% of the worst 3 algorithms.
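Tallies of this kind can be computed directly from the residual matrix by ranking the algorithms within each image. A possible sketch, again against the hypothetical residual matrix introduced earlier (the exact grouping used in the thesis' tables may differ):

```python
import numpy as np

def best_worst_counts(res, k=3):
    """For each algorithm, count over how many images it was among the k best
    and among the k worst algorithms (by residual), as in Tables 7.1 and 7.2.
    `res` has one row per algorithm and one column per image."""
    ranks = res.argsort(axis=0).argsort(axis=0)    # 0 = best within each image
    n_algorithms = res.shape[0]
    among_best = (ranks < k).sum(axis=1)
    among_worst = (ranks >= n_algorithms - k).sum(axis=1)
    return among_best, among_worst
```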

Table 7.1: How often out of 49 runs each algorithm was among the best 3 (per axis model)

Table 7.2: How often out of 49 runs each algorithm was among the worst 3 (per axis model)

Figure 7.9: Histograms of rank for axis models 6, 7, 11, and 15

Figure 7.10: The accuracy with which bitangent points can be located depends on the contour's curvature in that region. This has little influence on bitangent-intersections, but can considerably influence the position of crosspoints and interpair intersections.

The usefulness of an explicit error-model becomes even more apparent if we look at a histogram of the ranks achieved with Algorithms 6, 7, 11, and 15 (which, according to Tables 7.1 and 7.2, all performed similarly, while in theory 11 and 15 should exhibit superior performance). Figure 7.9 shows a clear difference between the algorithms which do not use an explicit error-model (6 and 7, top row) on the one hand and the ones which do (11 and 15, bottom row) on the other. The former (as do most other models) show a nearly uniform distribution, which means that they are about as likely to be among the N best as among the N worst algorithms, while the latter's distribution looks somewhat like a Poisson distribution, with good ranks much more likely than bad ones. This shows that the overall likelihood of an acceptable result is much higher for axis models which use as many features as possible together with an explicit error-model.
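The rank histograms of Fig. 7.9 follow from the same per-image ranking; a brief sketch of how one such histogram could be computed for a single algorithm, once more assuming the hypothetical residual matrix from above:

```python
import numpy as np

def rank_histogram(res, algorithm):
    """Histogram of the ranks (1 = best ... n = worst) achieved by one
    algorithm over all images, as plotted in Fig. 7.9.  A roughly uniform
    histogram means the algorithm is about as likely to be among the best
    as among the worst."""
    ranks = res.argsort(axis=0).argsort(axis=0)[algorithm] + 1
    return np.bincount(ranks, minlength=res.shape[0] + 1)[1:]
```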

I believe this to be strong evidence that the use of an explicit error-model, at least when used together with many features of varying quality, can considerably improve an algorithm’s performance.
