
4.6 Comparing Stochastic Entities

4.6.4 Crossratios

The comparison of cross-ratios is straightforward; one can simply use (4.56)– (4.59) with n = 1 so that the covariance matrices become simple variances. We can then calculate the most likely true crossratio as

$$\mathrm{cr} = \frac{\mathrm{cr}_1\,\sigma^2_{\mathrm{cr}_2} + \mathrm{cr}_2\,\sigma^2_{\mathrm{cr}_1}}{\sigma^2_{\mathrm{cr}_1} + \sigma^2_{\mathrm{cr}_2}} \qquad (4.67)$$

which can be calculated with accuracy (variance)

$$\sigma^2 = \frac{\sigma^2_{\mathrm{cr}_1}\,\sigma^2_{\mathrm{cr}_2}}{\sigma^2_{\mathrm{cr}_1} + \sigma^2_{\mathrm{cr}_2}}. \qquad (4.68)$$

The hypothesis that the two measurements $\mathrm{cr}_1$ and $\mathrm{cr}_2$ are observations of the same entity $\mathrm{cr}$ can then be tested using

$$R = \frac{(\mathrm{cr}_1 - \mathrm{cr}_2)^2}{\sigma^2_{\mathrm{cr}_1} + \sigma^2_{\mathrm{cr}_2}} \stackrel{!}{<} \chi^2_{p,1}. \qquad (4.69)$$
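The three relations above translate directly into a few lines of code. The following is a minimal sketch, not the thesis implementation; the function name, the use of scipy.stats.chi2 for the quantile, and the default significance level are my own assumptions.

```python
# Minimal sketch (not the thesis code) of equations (4.67)-(4.69): fuse two
# cross-ratio measurements and test whether they observe the same entity.
# scipy's chi-square quantile stands in for the tabulated value chi^2_{p,1};
# the function name and the alpha default are my own choices.
from scipy.stats import chi2

def fuse_cross_ratios(cr1, var1, cr2, var2, alpha=0.05):
    """Return (cr, var, same_entity) for two cross-ratio measurements."""
    cr = (cr1 * var2 + cr2 * var1) / (var1 + var2)   # (4.67) weighted mean
    var = (var1 * var2) / (var1 + var2)              # (4.68) fused variance
    R = (cr1 - cr2) ** 2 / (var1 + var2)             # (4.69) test statistic
    same_entity = R < chi2.ppf(1.0 - alpha, df=1)    # one degree of freedom
    return cr, var, same_entity

# Example: two noisy observations that turn out to be compatible
print(fuse_cross_ratios(1.32, 0.004, 1.29, 0.006))
```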

Chapter 5

Detecting Repeated Parallel Structure

. . . Stripe for Stripe.

The Bible: Hebrew Exodus 21:23


Figure 5.1: Examples of repeated parallel structure in images.

5.1 Introduction

This chapter, as well as Chapters 6 and 7, demonstrates the application of the theories discussed in Chapter 4 to a real-world example.

The algorithm described here deals with the detection of repetitive structures consisting of parallel line segments with a given crossratio. Some of the more obvious examples of this are railway-sleepers, fences, and windows (particularly in big office-buildings). Figure 5.1 shows some examples. The structure used throughout most of this chapter is that of a pedestrian or zebra crossing. This was originally implemented as part of the project MOVIS¹ — Mobile Optoelectronic Visual Interpretative System for the Blind and Visually Impaired — which took place from 1995 to 1997 [1, 6]. Within this project, a first prototype of a portable device for blind and visually impaired persons was created which was able to recognise a small number of useful objects and signs customarily found in street scenes. This prototype consisted of a spectacle-like device connected to an (at the time stationary) computer doing the image processing. Figure 5.17 on Page 130 shows images of the actual device used. The theory is, however, independent of this particular application and equally applicable to any other repeated structure of parallel line segments.

Detecting zebra crossings may sound like an easy task. After all, they’re big, and they’re designed to be fairly obvious. However, it isn’t. Reasons include:

The general amount of occlusion connected with street scenes, namely fellow pedestrians who get in the way, lamp and sign posts, cars, and basically everything that moves. Moreover, zebra crossings are particularly prone to occlusion, since they are designed for people to walk on and for cars to drive across.

Zebra crossings are often in bad repair: patches are missing, and they have spots or holes.

¹MOVIS was funded by the BMBF, the German Ministry for Education and Research.


Figure 5.2: Different views of a zebra crossing as seen from a car (left) and a pedestrian (middle and right).

Due to varying viewing geometries the width of stripes in an image may vary from dozens of pixels to only 2 to 4 pixels — even within one stripe.

The recognition of repeated parallel structures under perspectivity has traditionally been dealt with in the context of texture analysis. In [84] an algorithm for the recognition of arbitrary repeated structures is presented. However, this approach requires a minimum amount of texture within each element of the structure and is therefore unsuitable, or at best problematic, for structures with little or no texture such as the ones considered here. Only after the work described here was first published [6] did a small number of papers appear based on this work [134, 135, 137].

The work specific to zebra crossings, on the other hand, has nearly exclusively assumed an autonomous vehicle's (car's) point of view [85, 108]. This way, the camera's orientation relative to the ground can be assumed known. Also, the street's left and right boundaries are generally well known, and these can be used to identify the road's (virtual) vanishing point [90], through which all lines bounding a zebra crossing have to pass [108]. Finally, from the viewpoint of a car a zebra crossing is always encountered head on, which means that all stripes will have approximately the same width on any row of the image, and that the zebra crossing will at most be occluded by objects directly on the road. Figure 5.2 (left) shows an example of a zebra crossing as seen from a car.

None of the constraints mentioned above apply when dealing with a camera carried by a pedestrian, as within MOVIS. Here, the camera's orientation relative to the ground is at best only approximately known (e. g. from motion sensors affixed to the camera), and no other constraints exist. Also, the zebra crossing will often be heavily occluded, fracturing the individual stripes into several "stubs". Figure 5.2 (middle, right) shows two examples of a zebra crossing as seen from a pedestrian's point of view. This means that it is generally necessary to group several separate patches into one zebra crossing. It is my experience that this is best achieved using a line-based approach, which will be described in Section 5.3. Error propagation plays a particularly important role here since a zebra crossing's size and quality in the image can vary considerably from image to image — so much in fact that a first prototype based on static thresholds never worked on more than two images at the same time, while the approach presented here has proven its stability on literally thousands of images. Work on the recognition of pedestrian crossings from the viewpoint of a pedestrian only appeared after the original publication of this work in [6], and building on it [135].
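To make the contrast with static thresholds concrete, here is a hypothetical sketch of such a variance-based acceptance test; it is not the algorithm of Section 5.3, and all names are my own. A measured cross-ratio is accepted only if it is statistically consistent with the model value, given the variance delivered by error propagation.

```python
# Hypothetical illustration, not the thesis implementation: gate a stripe
# group on statistical consistency with the model cross-ratio instead of a
# static tolerance. var_cr is the variance of the measured cross-ratio as
# obtained by error propagation; function and parameter names are my own.
from scipy.stats import chi2

def consistent_with_model(cr_measured, var_cr, cr_model, alpha=0.05):
    """Chi-square gate with one degree of freedom, in the spirit of (4.69)."""
    R = (cr_measured - cr_model) ** 2 / var_cr
    return R < chi2.ppf(1.0 - alpha, df=1)

# A fixed tolerance |cr_measured - cr_model| < eps would need one eps for
# large, sharp crossings and another for small, blurred ones; the gate above
# adapts automatically because var_cr grows as the stripes shrink or blur.
```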

The remainder of this chapter is organised as follows: Section 5.2 describes the underlying model used to group and recognise repeated parallel structures. The example of a zebra crossing used is easily modified for other structures, possibly simply replacing "horizontal" with "vertical" where appropriate. Section 5.3 describes the actual process of grouping and recognition based on the theory and principles discussed in Chapter 4. This makes use of my new formulation for the calculation of the cross-ratio described in Section 4.5.2. In addition I present a new method for the transformation of lines into an only partly specified canonical frame, i. e. one where only some structural information is given, in Section 5.3.3. To my knowledge this was also the first application where the horizon was calculated from image structure alone (now a staple of projective geometry). A heuristic, but in my experience rather efficient method for merging hypotheses in the presence of unquantified errors in the object's geometry is given in Section 5.3.4. Although all sections discuss the relative merits of different camera models, I have found that it is the assumption of a quasi-calibrated ("sensible") camera which allowed me to implement an algorithm that is both fast and robust. Based on this camera model, Section 5.4 describes a simple but at the time of implementation new approach used for verification, which stands in the tradition of [97] and could well be seen as the forerunner of algorithms such as [33, 34, 87]. Finally, Section 5.5 presents some examples of successfully recognised zebra crossings and discusses the results.
