Multimedia Databases
Wolf-Tilo Balke Younès Ghammad
Institut für Informationssysteme Technische Universität Braunschweig http://www.ifis.cs.tu-bs.de
• Texture-Based Image Retrieval
– Low-Level Features
• Tamura Measure, Random Field Model
– High-Level Features
• Fourier Transform, Wavelets
Multimedia Databases– Wolf-Tilo Balke – Institut für Informationssysteme – TU Braunschweig 2
Previous Lecture
4 Multiresolution Analysis and Shape-based Features
4.1 Multiresolution Analysis
4.2 Shape-based Features
- Thresholding
- Edge detection
- Morphological Operators
4 Shape-based Features
• For images with many pixels (high resolution), wavelet transforms lead to high-dimensional equation systems
• Calculation of long feature vectors by solving linear equations?
– Far too expensive!
4.1 Multiresolution Analysis
• The wavelet transform of an image can be computed using fast wavelet transform algorithms in linear time
– It can be calculated by the repetition of two steps:
•Converting the image into a representation with reduced resolution (pixel count)
•Storing the image information lost by this transformation (which provides the wavelet coefficients)
• The underlying technology is called Multiresolution Analysis
4.1 Multiresolution Analysis
• Idea:
– Consider the image in different resolutions
– The image signal is composed of "raster" parts and detail parts
– Therefore: representation of the image by blocks of detailed information from which the image can be restored in stages
– Example:
4.1 Multiresolution Analysis
[Figure: example image shown with more and more detail]
• Forming the average in different resolutions
• Summarize blocks of pixels by using their average as one pixel
• "Averaging and Downsampling"
4.1 Image Resolution
• Basic idea
– V_k: pixel raster of the original image
– V_{k-1}: raster of a lower resolution, therefore it has fewer pixels than V_k
– The process continues down to V_0, which consists of only one pixel
– For each V_i it still has to be defined how the intensities of the pixels in the coarser raster V_{i-1} are obtained from the intensities of the pixels in V_i
4.1 Multiresolution Analysis
• Usually
– The intensity of a pixel in the grid V_{i-1} is the mean of a set of corresponding pixels in the grid V_i
– V_0 then has as its intensity the average intensity of the original image V_k
– V_{i-1} is calculated from V_i by halving the number of pixels in width or height
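The averaging scheme can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the image is a list of rows, both side lengths are powers of two, and the helper names `halve` and `pyramid` are made up for this example.

```python
def halve(raster, axis):
    """Average pairs of neighbouring pixels along one axis (0 = height, 1 = width)."""
    if axis == 1:
        return [[(row[j] + row[j + 1]) / 2 for j in range(0, len(row), 2)]
                for row in raster]
    return [[(raster[i][j] + raster[i + 1][j]) / 2 for j in range(len(raster[0]))]
            for i in range(0, len(raster), 2)]


def pyramid(image):
    """Return [V_k, V_{k-1}, ..., V_0]; each step halves the width or the height."""
    levels = [image]
    current = image
    while len(current) > 1 or len(current[0]) > 1:
        # Halve the width while it is at least as large as the height.
        if len(current[0]) >= len(current) and len(current[0]) > 1:
            current = halve(current, axis=1)
        else:
            current = halve(current, axis=0)
        levels.append(current)
    return levels


# Hypothetical 4 x 2 test image (width 4, height 2).
image = [[8, 6, 2, 4],
         [4, 2, 6, 8]]
levels = pyramid(image)
for raster in levels:
    print(raster)
```

The last raster printed is V_0; its single value equals the mean intensity of the whole input image.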
4.1 Multiresolution Analysis
4.1 Example
[Figure: example rasters with the size labels 300 200, 150 200, 150 100, 75 100, 75 50, …]
• For each pixel (x, y) in the original image, there is in each raster V_i a pixel p_i(x, y) derived from (x, y) through repeated averaging
– Let f_i(x, y) be the intensity of the pixel p_i(x, y) in raster V_i
– For each pixel (x, y) of the original image and each i we have:
f_i(x, y) = f_{i-1}(x, y) + d_{i-1}(x, y)
– By using the detail information d_i(x, y) we can reconstruct the intensity f(x, y) of the pixel (x, y) in the original image:
f(x, y) = f_k(x, y) = f_0(x, y) + \sum_{i=0}^{k-1} d_i(x, y)
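A small one-dimensional sketch may make the role of the detail information concrete. It assumes a signal whose length is a power of two, uses pairwise averaging for the coarser rasters, and defines d_i as the difference between two neighbouring resolutions; the function names are hypothetical.

```python
def averages(signal):
    """Return [V_0, ..., V_k] for a 1-D signal whose length is a power of two."""
    levels = [list(signal)]
    while len(levels[0]) > 1:
        fine = levels[0]
        coarse = [(fine[j] + fine[j + 1]) / 2 for j in range(0, len(fine), 2)]
        levels.insert(0, coarse)
    return levels


def f(levels, i, x):
    """Intensity f_i(x) of the pixel in V_i that covers original position x."""
    k = len(levels) - 1
    return levels[i][x >> (k - i)]


def d(levels, i, x):
    """Detail d_i(x) = f_{i+1}(x) - f_i(x), lost when going from V_{i+1} to V_i."""
    return f(levels, i + 1, x) - f(levels, i, x)


signal = [9, 7, 3, 5]
levels = averages(signal)
k = len(levels) - 1

# Reconstruction: start from the overall average f_0 and add all details.
reconstructed = [f(levels, 0, x) + sum(d(levels, i, x) for i in range(k))
                 for x in range(len(signal))]
print(levels)
print(reconstructed)
```

Adding the details level by level restores the signal in stages, which is the property multiresolution analysis relies on.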
4.1 Multiresolution Analysis
• Details are often described by the differences of averages (“Averaging and Differencing”):
4.1 Multiresolution Analysis
Differences:
Advantage: In images, usually neighbor pixels are …
• These two steps correspond to the application of filters in signal processing:
– High-pass filter: passes only signal components with high frequency (= baby wavelets of higher order)
– Low-pass filter: passes only signal components with low frequency (= baby wavelets of low order)
4.1 Example
[Figure: spectrum X(ω) and the filtered spectra X_LP(ω) (low-pass) and X_HP(ω) (high-pass), each plotted over ω]
• The high-pass filter extracts the image details, the low-pass filter the averages
• Applying both filters horizontally and vertically gives four possible combinations, each reducing the image size by half in both directions:
– HH, HL, LH, LL (sub-bands)
• Save the results of the high-pass filter for the subsequent reconstruction of the image
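One decomposition step can be sketched with the simplest filter pair, averaging (low pass) and half differences (high pass), i.e. an unnormalized Haar step. The naming is an assumption of this sketch: the first letter refers to the horizontal filter and the second to the vertical one, and conventions for LH and HL differ between texts.

```python
def split_rows(img):
    """Low pass (average) and high pass (half difference) along each row."""
    low = [[(r[j] + r[j + 1]) / 2 for j in range(0, len(r), 2)] for r in img]
    high = [[(r[j] - r[j + 1]) / 2 for j in range(0, len(r), 2)] for r in img]
    return low, high


def transpose(img):
    return [list(col) for col in zip(*img)]


def split_cols(img):
    """Same filter pair applied along each column."""
    low, high = split_rows(transpose(img))
    return transpose(low), transpose(high)


def haar_step(img):
    """One filtering and downsampling step; returns the sub-bands LL, LH, HL, HH."""
    low, high = split_rows(img)      # horizontal filtering
    ll, lh = split_cols(low)         # vertical filtering of the low-pass part
    hl, hh = split_cols(high)        # vertical filtering of the high-pass part
    return ll, lh, hl, hh


# Hypothetical 4 x 4 test image (even side lengths are assumed).
img = [[4, 2, 6, 8],
       [2, 0, 6, 8],
       [1, 1, 5, 5],
       [1, 1, 5, 5]]
ll, lh, hl, hh = haar_step(img)
coefficients = sum(len(row) for band in (ll, lh, hl, hh) for row in band)
print(ll, coefficients)
```

The four sub-bands together hold exactly as many values as the input image, and LL is the half-resolution average image that the next step would decompose further.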
4.1 Multiresolution Analysis
4.1 Multiresolution Analysis
Filtering and Downsampling
• Various resolutions
4.1 Multiresolution Analysis
[Figure: sub-band layout, LL LH in the top row and HL HH in the bottom row]
The total number of pixels in each step is the same, i.e. no loss of information!
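• Feature array
– Store the mean (expected value) and standard deviation of the wavelet coefficients of each sub-band at each resolution
– E.g., a three-stage decomposition gives a 20-dimensional feature array (3 stages × 3 detail sub-bands + 1 remaining LL band = 10 sub-bands, with 2 values each)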
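As a rough illustration of where the 20 dimensions come from, the following sketch runs three averaging/differencing steps and collects mean and standard deviation per sub-band. It assumes an unnormalized Haar step, an 8 x 8 input, and the population standard deviation; `wavelet_features` is a hypothetical name, not a library function.

```python
from statistics import mean, pstdev


def haar_step(img):
    """One averaging/differencing step; returns sub-bands LL, LH, HL, HH."""
    def rows(m):
        lo = [[(r[j] + r[j + 1]) / 2 for j in range(0, len(r), 2)] for r in m]
        hi = [[(r[j] - r[j + 1]) / 2 for j in range(0, len(r), 2)] for r in m]
        return lo, hi

    def cols(m):
        lo, hi = rows([list(c) for c in zip(*m)])
        return [list(c) for c in zip(*lo)], [list(c) for c in zip(*hi)]

    lo, hi = rows(img)
    ll, lh = cols(lo)
    hl, hh = cols(hi)
    return ll, lh, hl, hh


def wavelet_features(img, stages=3):
    """Mean and standard deviation of every sub-band, final LL band included."""
    features = []
    ll = img
    for _ in range(stages):
        ll, lh, hl, hh = haar_step(ll)
        for band in (lh, hl, hh):
            values = [v for row in band for v in row]
            features += [mean(values), pstdev(values)]
    values = [v for row in ll for v in row]
    features += [mean(values), pstdev(values)]
    return features


# Hypothetical 8 x 8 test image: a diagonal intensity ramp.
image = [[x + y for x in range(8)] for y in range(8)]
features = wavelet_features(image)
print(len(features))
```

Two images can then be compared by a distance between their feature arrays instead of comparing all wavelet coefficients.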
4.1 Multiresolution Analysis
• Shape-based retrieval
– Occurring shapes contribute significantly to the similarity of images
• In contrast to the purely visual impression made by colors or textures, shapes often carry deeper semantic information
• What a displayed item is often does not depend on its color, whereas identical items usually have an identical shape
4.2 Shape-based Features
4.2 Example: Chair
• Combining simple shape features (round, elliptical, triangular, square, trapezoid, ...) with other features (color, texture, etc.) improves retrieval
– "Round object in a red-orange image" may be a search for a sunset ...
4.2 Basic Idea
• Even more complicated: "Find all coats of arms containing crosses"
4.2 Basic Idea
• Fundamental problems
– How to recognize the shape of things in images?
– Is a semantic mapping always possible?
– How do we describe shapes with features?
– Which shapes are similar and how do you compare different shapes?
4.2 Shape-based Retrieval
• Shape segmentation is a fundamental problem
– Which shapes are displayed in the image?
– All of them?
– All important ones?
– Only the foreground motif?
4.2 Segmentation
• What represents shape and what does not?
• Is the shape homogeneous in colors or textures?
4.2 Segmentation
• Are all parts of the shape visible?
• Is the sun round?
• Segmentation?
4.2 Segmentation
• Can the segmentation be done automatically?
– At least semi-automatically?
• Not in early versions of multimedia retrieval!
– E.g.: IBM's QBIC Image Classifier
4.2 Automatic Segmentation
4.2 IBM's QBIC Prototype
• Manual or semi-automatic, using flood fill ("seeded region growing")
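A generic flood fill can be sketched as follows. This is not QBIC's implementation, only an illustration of seeded region growing; the 4-neighbourhood, the `tolerance` parameter and the function name are assumptions of this sketch.

```python
from collections import deque


def flood_fill(image, seed, tolerance=0):
    """Return the set of pixels connected to `seed` with a similar gray value."""
    height, width = len(image), len(image[0])
    sy, sx = seed
    target = image[sy][sx]
    region = {(sy, sx)}
    queue = deque([(sy, sx)])
    while queue:
        y, x = queue.popleft()
        # Visit the four direct neighbours of the current pixel.
        for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
            if (0 <= ny < height and 0 <= nx < width
                    and (ny, nx) not in region
                    and abs(image[ny][nx] - target) <= tolerance):
                region.add((ny, nx))
                queue.append((ny, nx))
    return region


# Hypothetical gray value image; the user clicks on the top-left pixel.
image = [[10, 10, 80, 80],
         [10, 12, 80, 10],
         [90, 11, 80, 10]]
region = flood_fill(image, seed=(0, 0), tolerance=2)
print(sorted(region))
```

Pixels with a matching gray value that are not connected to the seed, such as the two dark pixels on the right, stay outside the region, which illustrates why only uniformly colored, connected surfaces are segmented this way.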
4.2 Segmentation in QBIC
[Figure: input image, masked, marked shape, auto-unmask]
• Problem
– Only segmentation of monochrome surfaces
– 'End' forms
4.2 Segmentation in QBIC
[Figure: input image, masked, marked shape, auto-unmask]
• Many research projects in multimedia retrieval have been working on the topic (e.g., Blobworld, Photobook)
– There are solutions, however mostly for special cases
4.2 Automatic Segmentation
• Due to segmentation problems, shape features were removed from all commercial databases
– IBM's QBIC → DB2 Image Extender (set)
– Virage Retrieval Engine → Oracle Multimedia → Oracle Intermedia
– Excalibur Technologies → Informix Image Foundation DataBlade
4.2 Automatic Segmentation
• In principle, a shape is defined by its outer perimeter
4.2 Automatic Segmentation
• How do you get this outline?
– Segmentation of areas with the same brightness, color and/or texture
– Edge detection (differences in brightness, gradient, watersheds, etc.)
– Filling in the spaces with morphological operators (dilation and erosion)
– Segmentation of the outline as closed curve (polygon, splines, ...)
– And a large number of other procedures ...
4.2 Automatic Segmentation
• Usually applied to gray value images
• Idea:
– Important objects can be differentiated from the background because of their different brightness range
– A certain threshold separates the regions
4.2 Thresholding
• Supposition:
– Thematically related areas have similar gray values
– They can be clearly separated from the background
4.2 Thresholding
• Fixed threshold
– A fixed threshold is applied to each image
– Sufficient, for example, in the case of binary images
• Flexible threshold
– Depends on the gray value histogram
– New threshold for each image
– Often, the histogram is first smoothed, but without moving the peaks
4.2 Thresholding
• ISODATA algorithm (Ridler and Calvard, 1978)
– Divide the gray value histogram into two parts
– Calculate the expected values of the gray values in the left and right part
– Compute a new threshold as the average of the two expected values
– Iteratively compute the new expected values and a new threshold (until the threshold no longer changes significantly)
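The iteration described above can be sketched as follows. The histogram is assumed to be a non-empty list of counts indexed by gray value; the starting point (the overall mean) and the stopping tolerance `epsilon` are choices of this sketch, not prescribed by the slides.

```python
def isodata_threshold(histogram, epsilon=0.5):
    """Iterative threshold selection on a gray value histogram (list of counts)."""
    total = sum(histogram)
    # Initial split point: the overall mean gray value.
    threshold = sum(g * n for g, n in enumerate(histogram)) / total
    while True:
        left = [(g, n) for g, n in enumerate(histogram) if g <= threshold]
        right = [(g, n) for g, n in enumerate(histogram) if g > threshold]
        n_left = sum(n for _, n in left)
        n_right = sum(n for _, n in right)
        if n_left == 0 or n_right == 0:
            return threshold          # degenerate histogram, nothing to split
        mean_left = sum(g * n for g, n in left) / n_left
        mean_right = sum(g * n for g, n in right) / n_right
        new_threshold = (mean_left + mean_right) / 2
        if abs(new_threshold - threshold) < epsilon:
            return new_threshold
        threshold = new_threshold


# Hypothetical bimodal histogram with 8 gray levels (dark object, bright background).
histogram = [0, 5, 10, 5, 0, 0, 4, 8]
threshold = isodata_threshold(histogram)
print(threshold)
```

For this histogram the threshold settles between the two modes, halfway between the mean of the dark part and the mean of the bright part.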
4.2 Thresholding
• Triangle algorithm (Zack and others, 1977)
– Connect the highest peak in the histogram with the highest brightness value
– Find the histogram position with the maximum distance to the connecting line
– The threshold is this position, shifted by some constant value
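A simplified sketch of the triangle construction is given below. It assumes that the peak lies on the dark side of the histogram, takes the last non-empty bin as the "highest brightness value", and does not normalize the two axes as practical implementations often do; `offset` stands for the constant shift.

```python
from math import hypot


def triangle_threshold(histogram, offset=0):
    """Gray value with maximum distance to the line from the peak to the bright end."""
    peak = max(range(len(histogram)), key=lambda g: histogram[g])
    end = max(g for g, n in enumerate(histogram) if n > 0)
    if end == peak:
        return peak + offset          # no bright tail to analyse
    x1, y1, x2, y2 = peak, histogram[peak], end, histogram[end]
    length = hypot(x2 - x1, y2 - y1)

    def distance(g):
        # Perpendicular distance of the point (g, histogram[g]) to the line.
        return abs((y2 - y1) * g - (x2 - x1) * histogram[g]
                   + x2 * y1 - y2 * x1) / length

    best = max(range(peak, end + 1), key=distance)
    return best + offset


# Hypothetical histogram: one strong dark peak and a long bright tail.
histogram = [2, 20, 6, 2, 1, 1, 1, 1]
print(triangle_threshold(histogram))
print(triangle_threshold(histogram, offset=1))
```

The method is mainly useful when the object pixels form only a weak tail next to a dominant background peak, where a two-mode method such as ISODATA has little to work with.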
4.2 Thresholding
• Medical segmentation
4.2 Application example
• There are also area-based algorithms, which evaluate thresholds of individual image areas to segment an image
– Applicability depends strongly on each image collection
4.2 Thresholding
• Area based algorithms, especially for color images
4.2 Thresholding
Segmentation with Edge Flow (Ma and Manjunath, 1997)
• Advantage
– Very simple procedure
• Disadvantage
– Determination of the "right" thresholds
– Supposition: strong color or gray value change between foreground object and background
– Problem: decomposition of complex objects
4.2 Thresholding
• Strong color change between the foreground object and background?
4.2 Thresholding
• Edge detection finds not the area of the foreground objects, but the borders of such areas
• The goal is a closed curve around an image object
• Usually, maxima of the first derivative or zero crossings of the second derivative of the brightness function are considered
• Gradient and Laplace operator
4.2 Edge Detection
• How to determine the gradient at (x, y)?
• Problem: gradients require differentiable (continuous) functions; we only have discrete supporting points
• Two common solutions:
– (1) Estimate a differentiable function from the available supporting points and use it (e.g., via Fourier transformation)
– (2) Estimate the course of this function for each pixel from its immediate neighborhood (e.g., Sobel filter); often much faster than (1)
4.2 Edge Detection
• Gradient-based method
– Calculate the magnitude of the gradient at each point (e.g., Sobel filter)
– Edges correspond to high gradient magnitudes
– Then use a threshold algorithm to separate the edges from the regions
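The gradient-based method can be sketched with the standard 3 x 3 Sobel masks. The treatment of border pixels (left at 0) and the fixed threshold value are simplifying assumptions of this example.

```python
from math import hypot

SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]    # responds to horizontal changes
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]    # responds to vertical changes


def sobel_magnitude(image):
    """Gradient magnitude at every inner pixel; border pixels are left at 0."""
    height, width = len(image), len(image[0])
    magnitude = [[0.0] * width for _ in range(height)]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            gx = sum(SOBEL_X[j][i] * image[y + j - 1][x + i - 1]
                     for j in range(3) for i in range(3))
            gy = sum(SOBEL_Y[j][i] * image[y + j - 1][x + i - 1]
                     for j in range(3) for i in range(3))
            magnitude[y][x] = hypot(gx, gy)
    return magnitude


def edge_map(image, threshold):
    """Binary edge image: 1 where the gradient magnitude reaches the threshold."""
    return [[1 if m >= threshold else 0 for m in row]
            for row in sobel_magnitude(image)]


# Hypothetical image with a vertical step edge between columns 1 and 2.
image = [[0, 0, 100, 100]] * 4
edges = edge_map(image, threshold=200)
for row in edges:
    print(row)
```

Both columns next to the step are marked, which shows how gradient thresholding tends to produce edges that are more than one pixel wide.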
4.2 Edge Detection
• Advantage
– Simple filter
• Disadvantage
– Very susceptible to noise (one option is to perform noise reduction before applying the Sobel filter)
– Blurred or merging contours
4.2 Edge Detection
• Zero crossing of second derivative (“Laplacian Zero-Crossing”)
• Is particularly used in "noisy" images with blurred edges
• The behavior of the gradient is studied starting from an ideal edge
4.2 Edge Detection
• Idea: the zero crossing of the second derivative marks the maximum of the gradient
4.2 Edge Detection
• Unlike in the gradient procedure, not every point with a sufficiently high gradient value is assigned to the edge, but only the points on the zero crossing
• Applying a smoothing filter (normally a Gaussian filter) before calculating the derivative reduces the susceptibility to noise
4.2 Edge Detection
• Important: only real zero crossings (sign changes) count, not mere zero points
• Mark all pixels with zero crossings and multiply them by the "strength" of the edge (e.g., magnitude of the gradient)
• Again, thresholding can then be applied to perform the segmentation
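A one-dimensional sketch shows the difference between a real zero crossing and a zero point. It assumes the signal has already been smoothed, uses the discrete second difference as the Laplacian, and takes the absolute first difference as edge strength; the function names are hypothetical.

```python
def laplacian_1d(signal):
    """Discrete second derivative; the two border samples are set to 0."""
    second = [0.0] * len(signal)
    for x in range(1, len(signal) - 1):
        second[x] = signal[x - 1] - 2 * signal[x] + signal[x + 1]
    return second


def zero_crossings(signal, min_strength=0.0):
    """Map each sign change of the second derivative to its edge strength."""
    second = laplacian_1d(signal)
    edges = {}
    for x in range(1, len(signal) - 2):
        # A strict sign change is required; flat regions give products of 0.
        if second[x] * second[x + 1] < 0:
            strength = abs(signal[x + 1] - signal[x])
            if strength >= min_strength:
                edges[x] = strength
    return edges


# Blurred step edge followed by a plateau, where the second derivative is 0.
signal = [0, 0, 1, 4, 9, 12, 13, 13, 13, 13]
print(zero_crossings(signal, min_strength=2))
```

The plateau produces zero values of the second derivative but no sign change, so only the position of the steepest ascent (between samples 3 and 4) is reported as an edge.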
4.2 Edge Detection
• Example:
4.2 Edge Detection
• Comparison between gradient procedure and zero crossing technique:
4.2 Edge Detection
[Figure: original, gradient, zero crossing]
4.2 Edge Detection
• Sobel and zero-crossing filters in Matlab
– Transform the image to gray scale values
– sobel = edge(img, 'sobel')
– zeroc = edge(img, 'zerocross')
• Watershed transformation
• Supposition: surfaces are defined by minimal gray values and their zone of influence
• Idea: "flood" the surface starting from the minimum gray values, in such a way that different surfaces do not merge
• Gray values can be seen as topographical surfaces or "Mountains"
4.2 Watersheds
• Example: Flood regions based on the minimum gray values
4.2 Watersheds
• For image segmentation: watershed transformation of the gradient
4.2 Watersheds
[Figure: original, gradient, watersheds, segmentation]
• Advantage
– Enclosed and correct bordering
• Disadvantage
– Difficult to implement efficiently
– Over-segmentation
4.2 Watersheds
• Supposition
– Regions are bordered by a predominantly closed curve ("salient boundary")
• Method
– Starting from an initial curve ("snake"), iterate towards the best possible separation
– Minimize the energy of the snake curve
•Internal energy: curvature and continuity
•External energy: image energy (gradient)
4.2 Active Contour
4.2 Example
• Advantage
– Also fits "fuzzy" edges
• Disadvantage
– Complexity of the curve increases with the accuracy of contour
– Where does the initial snake curve come from?
4.2 Active Contour
• Problem:
– Noise can make shape recognition difficult
• Goal:
– Make the contours of surfaces easily recognizable and easy to describe
• Solution:
– Apply morphological operators as a preprocessing step
4.2 Morphological Operators
• Morphological operators are binary neighborhood operations for changing the surfaces
– Pixels are removed from or added to the object edges by such operations
– These operations are controlled by an operator mask (the "structure element")
4.2 Morphological Operators
• Basic operators
– Dilation: "inflating", adding pixels to the area
– Erosion: "shrinking", removing pixels from the area
• Typical structural elements are symmetric areas around a pixel
4.2 Morphological Operators
• Dilation
– The structural element is applied to all pixels of the source image
– The structural element defines a neighborhood around each pixel
– In the dilated image, the black pixels are exactly the pixels which had a black pixel anywhere in their neighborhood in the original image
– Effects:
→ enlarging areas
→ connecting objects separated by a small distance
4.2 Morphological Operators
• Example of a dilation with various structural elements
4.2 Morphological Operators
[Figure legend: original pixel; new pixel after dilation]
• Erosion
– The structural element is again applied to every pixel of the source image
– The structural element again defines neighborhoods
– In the resulting image, the white pixels are exactly the pixels which had a white pixel in their neighborhood
– Effects:
→ small spots disappear
→ breaking up areas with small connections
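Both basic operators can be sketched on a binary image stored as a list of rows. In this sketch 1 stands for a black object pixel and 0 for white background, the structural element is a list of (dy, dx) offsets, and pixels outside the image are treated as background; these are conventions of the example only.

```python
CROSS = [(0, 0), (-1, 0), (1, 0), (0, -1), (0, 1)]   # centre plus 4-neighbourhood


def dilate(image, element=CROSS):
    """A pixel becomes 1 if any pixel in its neighbourhood is 1."""
    h, w = len(image), len(image[0])
    return [[int(any(0 <= y + dy < h and 0 <= x + dx < w
                     and image[y + dy][x + dx] == 1
                     for dy, dx in element))
             for x in range(w)] for y in range(h)]


def erode(image, element=CROSS):
    """A pixel stays 1 only if every pixel in its neighbourhood is 1."""
    h, w = len(image), len(image[0])
    return [[int(all(0 <= y + dy < h and 0 <= x + dx < w
                     and image[y + dy][x + dx] == 1
                     for dy, dx in element))
             for x in range(w)] for y in range(h)]


# Hypothetical test image: a 3 x 3 square in a 5 x 5 frame.
image = [[0, 0, 0, 0, 0],
         [0, 1, 1, 1, 0],
         [0, 1, 1, 1, 0],
         [0, 1, 1, 1, 0],
         [0, 0, 0, 0, 0]]
for row in dilate(image):
    print(row)
for row in erode(image):
    print(row)
```

With the cross-shaped element the square grows by one pixel along each side under dilation and shrinks to its centre pixel under erosion; a different structural element would change both results.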
4.2 Morphological Operators
• Example
4.2 Morphological Operators
[Figure: original, dilation, erosion]
• Opening: erosion followed by dilation
– Elimination of thin and small objects
– Breaking up thinly connected areas
– Smoothing of edges
• Closing: dilation followed by erosion
– Small holes are filled
– Joining close objects
– Smoothing of edges
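Opening and closing are compositions of the two basic operators, which the following self-contained sketch illustrates. It assumes binary images with 1 for object pixels, a 3 x 3 square structural element, and background outside the image; the helper `_morph` is a hypothetical name.

```python
SQUARE = [(dy, dx) for dy in (-1, 0, 1) for dx in (-1, 0, 1)]


def _morph(image, element, combine):
    """Apply `any` (dilation) or `all` (erosion) over each pixel's neighbourhood."""
    h, w = len(image), len(image[0])
    return [[int(combine(0 <= y + dy < h and 0 <= x + dx < w
                         and image[y + dy][x + dx] == 1
                         for dy, dx in element))
             for x in range(w)] for y in range(h)]


def dilate(image, element=SQUARE):
    return _morph(image, element, any)


def erode(image, element=SQUARE):
    return _morph(image, element, all)


def opening(image, element=SQUARE):
    return dilate(erode(image, element), element)    # erosion, then dilation


def closing(image, element=SQUARE):
    return erode(dilate(image, element), element)    # dilation, then erosion


# Opening removes the isolated pixel at (2, 5) but keeps the 3 x 3 block.
noisy = [[0, 0, 0, 0, 0, 0, 0],
         [0, 1, 1, 1, 0, 0, 0],
         [0, 1, 1, 1, 0, 1, 0],
         [0, 1, 1, 1, 0, 0, 0],
         [0, 0, 0, 0, 0, 0, 0]]
opened = opening(noisy)

# Closing fills the one-pixel hole in the middle of a 3 x 3 block.
holed = [[1 if 2 <= y <= 4 and 2 <= x <= 4 else 0 for x in range(7)]
         for y in range(7)]
holed[3][3] = 0
closed = closing(holed)
print(opened[2], closed[3])
```

In both cases the large block keeps its original size, since the second operator undoes the growth or shrinkage caused by the first.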
4.2 Morphological Operators
• Example
4.2 Morphological Operators
[Figure: original, opening, closing]
• Advantages
– Using morphological operators for image processing makes it easier to obtain good shapes
• Disadvantages
– Gray values of the areas must be uniform
– Precise control is relatively difficult
4.2 Morphological Operators
• Multiresolution Analysis
• Shape-based Features
- Thresholding
- Edge detection
- Morphological Operators
This Lecture
• Query by Visual Examples
• Shape-based Features
– Chain Codes
– Fourier Descriptors
– Moment Invariants