
(Figure panels omitted; the axes show the hue value and the saturation value of the structure.)

Figure 3.11: Distribution of the three domains along the eigenvectors with the two largest eigenvalues. The eigenvectors are computed on the combined data set based on different image features. Green stands for artexplosion photos, red marks myMondrian images and blue represents pictures of the shark webcam.

Sequential Data Organisation and Categorisation by 1dSOMs

An ordinary approach to finding a desired picture in an unstructured collection is to look at the images one by one. Humans are familiar with this procedure; nevertheless, it is a time-consuming and often tedious way to retrieve a required image. An appropriate sequential ordering can therefore support the search in image collections, and an automated approach to compute such an ordering is preferable. In general, the sequential ordering of images is a desirable processing step for image retrieval tasks.

The pictures of an arbitrary mixed set usually have no specific order. However, it can be assumed that similar images, or consecutive pictures of a sequence, are neighbours in the data space, so that such an ordering lies on or near a low-dimensional manifold. Self-Organising Maps (SOMs) [Kohonen, 1997], also known as Kohonen maps, are artificial neural networks (ANNs) well suited to performing an automated ordering of images. They preserve topological relationships and thereby represent the similarity of consecutive images. Furthermore, they adapt unsupervised, so the training step does not require a labelled data set, and a similarity graph of the input data can be computed.

In this chapter a SOM-based approach to perform a one-dimensional alignment of images is presented. The theoretical background is introduced, and several experiments are carried out in order to evaluate the approach.

4.1 Self-Organising Maps

Classical SOMs are two-dimensional, spatially interacting networks, often used for adaptation and regression. The goal usually is a low-dimensional data representation that maintains similarity relationships corresponding to the topological relations. Whereas PCA is a linear approach, the SOM can be considered a non-linear extension: by mapping the statistical relationships between high-dimensional data onto a low-dimensional display while preserving topological and metric relationships, visualisation and abstraction of high-dimensional data are performed. Each node (often called a neuron) a of a two-dimensional grid A is associated with a reference vector w_a ∈ IR^D, where D is the dimension of the input data. In the training process the reference vectors w_a are adapted to the input data. The SOM update rule for the weight vector of the unit a is:

w_a(t+1) = w_a(t) + h_{a*a}(t) (x(t) − w_a(t))   (4.1)

where t denotes time, x(t) ∈ IR^D is the input vector randomly drawn from the data set at time t, and h_{a*a}(t) is the neighbourhood function around the winner neuron a*, commonly chosen as a Gaussian:

h_{a*a}(t) = ε(t) exp( −‖ρ(a) − ρ(a*)‖² / (2σ(t)²) )   (4.2)

where ρ(a) is the location of unit a on the map grid and σ(t) is the neighbourhood radius. Usually the radius is large at first and is decreased, e.g. linearly, to one during the training. ε(t) is a decreasing learning rate.

For visualisation purposes, the input x is assigned to the neuron a* whose reference vector is most similar to the input vector. The best-match node a* is given according to:

a* = arg min_{a ∈ A} d(x, w_a),  a = 1, ..., Nn   (4.3)

where d(·) is a distance function, usually the Euclidean distance. Using the learning algorithm outlined above, the global ordering of the reference vectors is reached in a finite number of steps. Based on the mapping of the input data to the SOM nodes, the data is ordered as well.
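The update rule (4.1) and the best-match assignment (4.3) can be sketched in a few lines. The grid size, the toy data, the decay schedules, and the Gaussian neighbourhood below are illustrative assumptions for this sketch, not values taken from the text.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes: a 10x10 grid of reference vectors for 3-dimensional inputs.
grid_w, grid_h, D = 10, 10, 3
W = rng.random((grid_w * grid_h, D))  # reference vectors w_a
# rho(a): location of each unit on the map grid
pos = np.array([(i, j) for i in range(grid_w) for j in range(grid_h)], float)

def best_match(x, W):
    """Winner neuron a*: the node whose reference vector is closest to x (eq. 4.3)."""
    return int(np.argmin(np.linalg.norm(W - x, axis=1)))

def train(X, W, pos, steps=5000, eps0=0.5, sigma0=5.0):
    for t in range(steps):
        frac = t / steps
        eps = eps0 * (1.0 - frac)                # decreasing learning rate eps(t)
        sigma = max(sigma0 * (1.0 - frac), 1.0)  # shrinking neighbourhood radius
        x = X[rng.integers(len(X))]              # input x(t), randomly drawn
        a_star = best_match(x, W)
        # Gaussian neighbourhood around the winner, measured on the grid
        d2 = np.sum((pos - pos[a_star]) ** 2, axis=1)
        h = eps * np.exp(-d2 / (2.0 * sigma ** 2))
        W += h[:, None] * (x - W)                # update rule (4.1)
    return W

X = rng.random((500, D))  # toy data in [0, 1]^3
W = train(X, W, pos)
```

After training, `best_match` maps any input to a grid node, so nearby inputs land on nearby nodes.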

Since the first works on Self-Organising Maps in the 1980s [Kohonen, 1982] [Ritter and Schulten, 1986] [Ritter, 1991], the SOM has become a widespread field of research, as the bibliographies [Kaski et al., 1998] and [Oja et al., 2002] show. Applications are investigated, as well as improvements of the algorithm and modifications of the approach:

To overcome the discrete character of common SOMs, the Parametric SOM (PSOM) was presented [Walter and Ritter, 1996]. The discrete grid positions of the SOM nodes are generalised to a continuous manifold. The approach was enhanced for noisy and incomplete data [Klanke and Ritter, 2005]. In [Saalbach et al., 2002] the PSOM is used for classification and pose estimation of the objects in a COIL (Columbia Object Image Library [Nene et al., 1996]) image set.

A similar approach is the Continuous SOM (C-SOM) [Aupetit et al., 1999], [Campos and Carpenter, 2000]. Basically, an interpolation step is added after the training of a common SOM grid. C-SOMs are used to perform continuous function approximations.

Usually the grids of SOMs are based on Euclidean space. However, the embedding of complex high-dimensional and hierarchical structures in this space is limited by the restricted size of the neighbourhood of a point. Since the hyperbolic space is better qualified to represent high-dimensional neighbourhoods, Hyperbolic Self-Organising Maps (HSOMs) were introduced in [Ritter, 1999]. They are used for browsing in text databases in [Ontrup and Ritter, 2001a] and [Ontrup and Ritter, 2001b].

Apart from visualisation purposes, SOM-based approaches are used for feature detection. In [Kohonen et al., 1997], as well as in [de Ridder et al., 2000] and [de Ridder et al., 2001], unsupervised procedures to compute adaptive subspaces are presented and utilised for feature detection.

As presented in section 2.2.1, Self-Organising Maps can be arranged hierarchically. The resulting approach is called the Tree-Structured SOM (TS-SOM) and has been successfully used in text [Kohonen et al., 2000] and image retrieval [Laaksonen et al., 2000].

Based on the mapping to the SOM grid, classification tasks can be performed. For example, in [Kämpfe et al., 2001] a SOM is used to classify image patches.

As shown in [Kohonen, 1997], for an appropriate number of learning steps t (t → ∞) the weights w_a of a one-dimensional SOM become ordered ascendingly or descendingly. Since each weight is associated with a data vector or image, such a 1dSOM is interesting regarding picture alignment tasks.
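This ordering behaviour can be illustrated with a small numerical sketch for scalar data. The chain length, schedules, and Gaussian neighbourhood are illustrative assumptions; after training, the scalar weights along the chain typically end up monotonically increasing or decreasing.

```python
import numpy as np

rng = np.random.default_rng(2)
Nn = 10
w = rng.random(Nn)  # a 1d chain of scalar weights, randomly initialised

for t in range(20000):
    frac = t / 20000
    eps = 0.5 * (1.0 - frac) + 0.01          # decreasing learning rate
    sigma = max(3.0 * (1.0 - frac), 0.5)     # shrinking neighbourhood radius
    x = rng.random()                         # scalar input, uniform on [0, 1]
    a_star = np.argmin(np.abs(w - x))        # best-match node on the chain
    d2 = (np.arange(Nn) - a_star) ** 2       # squared distances along the chain
    w += eps * np.exp(-d2 / (2.0 * sigma ** 2)) * (x - w)

diffs = np.diff(w)
ordered = bool(np.all(diffs > 0) or np.all(diffs < 0))
```

With the node index plotted against the weight value, the trained chain traces an (ascending or descending) staircase through the data range.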

1dSOM

A 1dSOM is a chain A of Nn consecutive nodes. The associated reference vectors w_a, a = 1, ..., Nn, approximate the image space. The learning algorithm is the same as for the standard two-dimensional SOM above. For each image x ∈ X, the best-match node a* determines the position ρ(x) along the sequence. The direction of the changes based on the sequential structure cannot be detected by the 1dSOM; regarding pictures, the reverse order is as good as the forward one.

The usage of a 1dSOM to align data in a sequence resembles the travelling salesman problem, for which different approaches have been investigated. Self-Organising Maps and elastic map models are identified as suitable to solve such combinatorial optimisation tasks [Smith, 1999]. For example, in [Bacao et al., 2005] this is used for path finding in marine patrol situations, where the optimal route to inspect critical or interesting points in the sea is desired. This resembles the approximation of a data distribution by a trajectory, since in many situations a variety of connections between the interesting points is possible.

Another important aspect of SOMs is the assignment of several data samples to one node. This can be problematic for image alignment, since the pictures matching the same node still form an unordered set. An elementary solution to this problem is an oversized 1dSOM with at least as many nodes as pictures in the sequence, Nn ≥ N. Thus, the desired mapping of a single image to each node becomes possible.
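An oversized 1dSOM for image alignment can be sketched as follows. The feature extraction is stubbed out with random vectors, and the sizes, schedules, and initialisation are illustrative assumptions; note that oversizing makes one image per node likely but does not strictly guarantee it.

```python
import numpy as np

rng = np.random.default_rng(1)

N, D = 20, 8            # N images with D-dimensional feature vectors (toy values)
Nn = 2 * N              # oversized chain: Nn >= N nodes
X = rng.random((N, D))  # stand-in for the image feature vectors
# Initialise the chain near the data to speed up convergence
W = X[rng.integers(N, size=Nn)] + 0.01 * rng.standard_normal((Nn, D))

for t in range(4000):
    frac = t / 4000
    eps = 0.5 * (1.0 - frac)                     # decreasing learning rate
    sigma = max(Nn / 4 * (1.0 - frac), 1.0)      # shrinking neighbourhood radius
    x = X[rng.integers(N)]                       # random image feature vector
    a_star = np.argmin(np.linalg.norm(W - x, axis=1))
    d2 = (np.arange(Nn) - a_star) ** 2           # distances along the 1d chain
    W += (eps * np.exp(-d2 / (2.0 * sigma ** 2)))[:, None] * (x - W)

# Position rho(x): each image is mapped to its best-match node along the chain;
# sorting the images by node index yields the sequential ordering.
positions = [int(np.argmin(np.linalg.norm(W - x, axis=1))) for x in X]
order = np.argsort(positions)
```

`order` is then a permutation of the image indices, placing images with similar feature vectors next to each other along the chain.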

Concerning the task of ordering one image sequence by a single 1dSOM, this is justifiable.