• Keine Ergebnisse gefunden

Scale space and difference of Gaussians

3.2 Detection of particles with SIFT

3.2.1 Scale space and difference of Gaussians

The main idea of SIFT is to extend the image by another dimension, the scale. We will see that extrema in scale space correspond to interesting features in the image and that the ability to detect them is independent of their size. The method can be applied in the same way to 2D and 3D images.

Figure 3.4: Overview of the particle detection algorithm using SIFT as implemented here.

Figure 3.5: An image is convolved with the Gaussians as described in Equation 3.6 until one octave is fully computed (here $n = 2$, $s = -1, 0, 1, 2, 3$). Then the image is down-sampled by a factor of 2 in each dimension for the computation of the next octave. The DoGs in the right stack are computed by subtracting adjacent Gaussian-filtered images from the left stack. Figure copied from [64].

Following Lowe [64], the first step is to create the scale space by convolving the raw image $I(x, y, z)$ with a series of Gaussian filters $g$ of increasing width $k\sigma$ (in Fourier space these are low-pass filters whose cutoff frequency decreases as the width grows):

$$
\begin{aligned}
G_s(x, y, z) &= I(x, y, z) \star g(x, y, z, k\sigma), \qquad \sigma > 0, \quad k = 2^{s/n}, \quad s = -1, 0, 1, 2, \dots \\
g(x, y, z, k\sigma) &= \frac{1}{2\pi (k\sigma)^2} \, e^{-[x^2 + y^2 + z^2] / [2 (k\sigma)^2]}
\end{aligned}
\tag{3.6}
$$

The width $k\sigma$ of the Gaussian filter increases exponentially with $s$. An octave is subdivided into $n$ intervals: the width doubles from $k\sigma = \sigma$ at $s = 0$ to $k\sigma = 2\sigma$ at $s = n$ (similar to an audio frequency, which doubles when a note is played one octave higher).
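The exponential spacing of the filter widths in Equation 3.6 can be made concrete with a short sketch. The function name `filter_widths` is hypothetical, the upper bound $s = n + 1$ is the per-octave range used later in this section, and the default values are the ones quoted from [64].

```python
def filter_widths(sigma=1.6, n=3):
    """Map each scale index s = -1, ..., n + 1 to the filter width k * sigma."""
    # k = 2**(s / n), so the width doubles every n steps (one octave)
    return {s: sigma * 2.0 ** (s / n) for s in range(-1, n + 2)}


widths = filter_widths(sigma=1.6, n=3)
for s, width in widths.items():
    print(f"s = {s:2d}   k*sigma = {width:.3f}")
```

The entries for $s = 0$ and $s = n$ differ by exactly a factor of 2, which is the octave property stated above.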

The choice of $n$ allows one to adjust the sensitivity of the detection and the accuracy of the size determination of the features. However, a division into too many intervals leads to more false detections that have to be excluded later. Lowe [64] examined the dependence on $n$ by testing the method on real images, checking whether the same features were detected after a random rotation and a random rescaling of an image. A choice of $n = 3$ was the most reliable, with 90 % of the features being repeatedly detected in the images. For $n = 4$ and $n = 6$, a surplus of about 10 % and 20 % in the number of features was found, and the detection repeatability was still at 85 %. Note that photographs were used (landscapes, human faces, industrial images, ...). For particle images of high quality the repeatability is close to 100 %. Leocmach and Tanaka [65] followed Lowe [64] and chose $n = 3$ scales per octave. In this work this was also found to be a good choice for 3D data. For 2D snapshots the performance with $n = 4$ seemed to be better, with about 5 % more detected particles in a 2D slice of a 3D sample.

The other parameter, $\sigma$, determines the width of the Gaussian filter used for $s = 0$. A reasonable value lies between 1.0 and 2.0: smaller values will not suppress noise, while bigger values will smear out all the small particles. Lowe [64] and Leocmach and Tanaka [65] chose $\sigma = 1.6$. In this work a value of $\sigma = 1.4$ was chosen because it allowed for a more reliable detection of very small particles.

The second step in building up the scale space is to compute the differences of consecutive filtered images, the so-called difference of Gaussians (DoG):

𝐷𝑠 =πΊπ‘ βˆ’πΊπ‘ βˆ’1, 𝑠= 0,1,2,… (3.7)

Figure 3.6: Applying difference of Gaussians (DoG) filters to a simulated noisy image using Equations 3.6 and 3.7 with $n = 4$. Filter widths are denoted by the octave number $o$ and the filter number $s$. Minima of particles with different sizes occur for the appropriate set of $o$, $s$ and have similar depths.

Now we have a series of band-pass filtered images, and in principle each one is similar to the single image used in the method of Crocker and Grier [35] (see 3.3). Lowe [64] argues that the DoG is an ideal filter for feature detection: since it is a close approximation to the scale-normalized Laplacian of Gaussian, $\sigma^2 \nabla^2 G$, the values of local extrema of $D_s(x, y, z)$ are independent of the feature size, i.e. small features and big features lead to the same absolute extremal values in DoG space (see Figure 3.6). Furthermore, highly optimized codes are available for Gaussian filtering, which allow for an efficient and fast computation.
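The size independence of the extremal values can be checked in an idealised setting. The sketch below assumes that a particle is a unit-height Gaussian blob of width $t$ and that the convolution is continuous, so that the centre value after blurring with a normalised Gaussian of width $w$ is $(t^2 / (t^2 + w^2))^{d/2}$ in $d$ dimensions. It also ignores octaves and down-sampling and simply lets $s$ run on. Both function names are hypothetical.

```python
def blob_centre_response(t, w, dim=2):
    """Centre value of a unit-height Gaussian blob of width t after blurring with width w."""
    return (t**2 / (t**2 + w**2)) ** (dim / 2)


def deepest_dog_minimum(t, sigma=1.4, n=4, s_max=16, dim=2):
    """Return (s, depth) of the most negative DoG centre value D_s = G_s - G_{s-1}."""
    best_s, best_depth = None, 0.0
    for s in range(0, s_max + 1):
        wide = sigma * 2.0 ** (s / n)
        narrow = sigma * 2.0 ** ((s - 1) / n)
        depth = blob_centre_response(t, wide, dim) - blob_centre_response(t, narrow, dim)
        if depth < best_depth:
            best_s, best_depth = s, depth
    return best_s, best_depth


for t in (2.0, 4.0, 8.0):
    s, depth = deepest_dog_minimum(t)
    print(f"blob width {t}: deepest minimum at s = {s}, depth = {depth:.4f}")
```

In this model the DoG centre value depends only on the ratio of filter width to blob width. Doubling the blob size therefore shifts the deepest minimum by exactly $n$ scale steps and leaves its depth unchanged.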

After completing the calculation of all $n + 3$ Gaussian-filtered images ($s = -1, 0, 1, \dots, n + 1$) and all $n + 2$ DoGs ($s = 0, 1, \dots, n + 1$) for one octave, the image is resampled by computing the mean of two adjacent pixels/voxels in every direction (see Figure 3.5). In 2D four pixels are merged into one pixel; in 3D eight voxels are combined into one voxel. This decreases the number of values per image by a factor of 4 or 8, respectively, and greatly reduces RAM and CPU usage. While the smallest features are found in the first octave of the DoG space, larger features are found in the higher octaves. The accuracy of the location determination in the higher octaves is not affected by this coarsening procedure (at least relative to the feature's size). One has to keep in mind that mean values are used in the down-sampling. Therefore, all voxels contribute to the calculation of the DoGs with their full intensity resolution, also in the higher octaves.
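The block averaging used for the down-sampling can be sketched with NumPy as follows. The function name `downsample_mean` is hypothetical, and cropping odd-sized axes by one element is an assumption of this example, not something stated in the text.

```python
import numpy as np


def downsample_mean(image):
    """Halve every axis by averaging 2 x 2 (2D) or 2 x 2 x 2 (3D) blocks."""
    image = np.asarray(image, dtype=float)
    # Assumption: drop the last row/column/slice if an axis has odd length
    even = tuple(slice(0, 2 * (size // 2)) for size in image.shape)
    cropped = image[even]
    # Split every axis of length 2m into (m, 2) and average over the 2s
    new_shape = []
    for size in cropped.shape:
        new_shape.extend((size // 2, 2))
    blocks = cropped.reshape(new_shape)
    return blocks.mean(axis=tuple(range(1, 2 * image.ndim, 2)))


image_2d = np.arange(16).reshape(4, 4)
small_2d = downsample_mean(image_2d)      # four pixels merged into one
small_3d = downsample_mean(np.ones((4, 4, 4)))  # eight voxels merged into one
print(small_2d.shape, small_3d.shape)
```

Because each output value is a mean, the result is a floating-point array, so no intensity information is lost to rounding when the next octave is computed.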

Figure 3.6 shows examples of DoG-filtered particle images, starting at a small filter size ($o = 1$, $s = 1$) in the first octave up to the biggest filter size ($o = 2$, $s = 5$) in the second octave. It is evident that down-sampling does not change the image: values for $o = 1$, $s = 7$ in octave 1 are the same as for $o = 2$, $s = 3$ in octave 2. While small particles have their minimum at low $o$ and $s$ values, big particles show their deepest minimum at higher values, always according to the width of the Gaussians used in the filter. It is remarkable that the minimal values obtained for different particle sizes are always at the same level. This confirms that the DoG space is indeed scale invariant, which is also the reason for the name scale-invariant feature transform (SIFT).
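The equivalence of ($o = 1$, $s = 7$) and ($o = 2$, $s = 3$) follows from simple arithmetic if the filter width is expressed in pixels of the original image. The sketch below assumes that octaves are numbered from 1 and that each down-sampling step doubles the pixel size. The function name `effective_width` is hypothetical, and $n = 4$ is taken from the caption of Figure 3.6.

```python
def effective_width(octave, s, sigma=1.4, n=4):
    """Filter width in pixels of the original image; octaves are numbered from 1."""
    pixel_size = 2 ** (octave - 1)  # each down-sampling step doubles the pixel size
    return pixel_size * sigma * 2.0 ** (s / n)


print(effective_width(1, 7))
print(effective_width(2, 3))
```

Both calls give $\sigma \cdot 2^{7/4}$, since $2 \cdot 2^{3/4} = 2^{7/4}$.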


Figure 3.7: In 2D, a local minimum is detected by comparing a pixel (black X) to all 26 neighbours in DoG space (turquoise circles): 8 in the same image, 9 in the scale above and 9 in the scale below. Figure copied from [64].
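The 26-neighbour comparison of Figure 3.7 can be sketched for a stack of 2D DoG images as follows. The function name `local_minima`, the array layout (scale, row, column) and the use of a strict inequality are assumptions of this example; points on the border of the stack are skipped because they lack a full neighbourhood.

```python
from itertools import product

import numpy as np


def local_minima(dog_stack):
    """Return (scale, row, col) indices that are smaller than all 26 neighbours."""
    d = np.asarray(dog_stack, dtype=float)
    centre = d[1:-1, 1:-1, 1:-1]
    is_min = np.ones(centre.shape, dtype=bool)
    for ds, dy, dx in product((-1, 0, 1), repeat=3):
        if (ds, dy, dx) == (0, 0, 0):
            continue  # do not compare a point with itself
        # View of the stack shifted by one step in scale and/or space
        neighbour = d[1 + ds : d.shape[0] - 1 + ds,
                      1 + dy : d.shape[1] - 1 + dy,
                      1 + dx : d.shape[2] - 1 + dx]
        is_min &= centre < neighbour
    # Shift the indices back to the coordinates of the full stack
    return [tuple(int(i) + 1 for i in idx) for idx in np.argwhere(is_min)]


stack = np.zeros((3, 5, 5))
stack[1, 2, 2] = -1.0  # one artificial minimum in the middle scale
minima = local_minima(stack)
print(minima)
```

For 3D images the same idea applies with one more spatial axis, which gives 80 neighbours (3^4 - 1) instead of 26.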