Multimedia Databases
Wolf-Tilo Balke
Institut für Informationssysteme
Technische Universität Braunschweig http://www.ifis.cs.tu-bs.de
• Texture-Based Image Retrieval
– Low Level Features
• Tamura Measure, Random Field Model
– High-Level Features
• Fourier-Transform, Wavelets
Multimedia Databases– Wolf-Tilo Balke – Institut für Informationssysteme – TU Braunschweig 2
Previous Lecture
4 Multiresolution Analysis and Shape-based Features
4.1 Multiresolution Analysis
4.2 Shape-based Features
– Thresholding
– Edge detection
– Morphological Operators
4 Shape-based Features
• For images with many pixels (high resolution), wavelet transforms lead to high-dimensional systems of equations
• Calculation of long feature vectors by solving linear equations?
– Far too expensive!
4.1 Multiresolution Analysis
• The wavelet transform of an image can be computed using fast wavelet transform algorithms in linear time
– It can be calculated by the repetition of two steps:
• Converting the image into a representation with reduced resolution (pixel count)
• Storing the image information lost by this transformation (which provides the wavelet coefficients)
• The underlying technology is called Multiresolution Analysis
4.1 Multiresolution Analysis
• Idea:
– Consider the image in different resolutions
– The image signal is composed of coarse "raster" parts and detail parts
– Therefore: represent the image by blocks of detail information from which the image can be restored in stages
– Example:
4.1 Multiresolution Analysis
[Figure: the same image at successively finer resolutions, each step adding more detail]
• Forming the average in different resolutions
• Summarize blocks of pixels by using their average as one pixel
• "Averaging and Downsampling"
4.1 Image Resolution
• Basic idea
– V_k: pixel raster of the original image
– V_{k-1}: raster of a lower resolution, which therefore has fewer pixels than V_k
– The process continues down to V_0, which consists of only one pixel
– It still has to be defined, for each i, how the pixel intensities of the coarser raster V_{i-1} are obtained from the pixel intensities of V_i
4.1 Multiresolution Analysis
• Usually
– The intensity of a pixel in the raster V_{i-1} is the mean of a set of corresponding pixels in the raster V_i
– V_0 then has as its intensity the average intensity of the original image V_k
– V_{i-1} is calculated from V_i by halving the number of pixels in width or height
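The averaging scheme can be sketched as follows. This is only an illustration, assuming a gray value image stored as a list of rows whose width and height are powers of two. The function names and the rule "halve the longer side first" are assumptions made for this example, not part of the lecture.

```python
def halve_width(img):
    """Average horizontally adjacent pixel pairs (halves the width)."""
    return [[(row[2 * j] + row[2 * j + 1]) / 2 for j in range(len(row) // 2)]
            for row in img]


def halve_height(img):
    """Average vertically adjacent pixel pairs (halves the height)."""
    return [[(img[2 * i][j] + img[2 * i + 1][j]) / 2 for j in range(len(img[0]))]
            for i in range(len(img) // 2)]


def build_pyramid(img):
    """Return the rasters from the original image down to a single pixel."""
    levels = [img]
    while len(img) > 1 or len(img[0]) > 1:
        # Assumption: halve the longer side first (width on ties).
        if len(img[0]) >= len(img):
            img = halve_width(img)
        else:
            img = halve_height(img)
        levels.append(img)
    return levels


image = [[8, 8, 0, 0],
         [8, 8, 0, 0],
         [4, 4, 4, 4],
         [4, 4, 4, 4]]
pyramid = build_pyramid(image)
# The last raster is a single pixel holding the mean intensity of the image.
print(pyramid[-1])
```

Because every step averages blocks of equal size, the single pixel of the coarsest raster equals the mean of the whole image.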
4.1 Multiresolution Analysis
4.1 Example
[Figure: resolution pyramid with raster sizes 300 × 200, 150 × 200, 150 × 100, 75 × 100, 75 × 50, …]
• For each pixel (x, y) in the original image, each raster V_i contains a pixel p_i(x, y) that is derived from (x, y) through repeated averaging
– Let f_i(x, y) be the intensity of the pixel p_i(x, y) in raster V_i
– For each pixel (x, y) of the original image and each i > 0 we have:
f_i(x, y) = f_{i-1}(x, y) + d_{i-1}(x, y)
– By using the detail information d_i(x, y) we can reconstruct the intensity f_k(x, y) of the pixel (x, y) in the original image:
f_k(x, y) = f_0(x, y) + d_0(x, y) + d_1(x, y) + ... + d_{k-1}(x, y)
4.1 Multiresolution Analysis
• Details are often described by the differences of averages (“Averaging and Differencing”):
4.1 Multiresolution Analysis
Differences:
Advantage: neighboring pixels in images are usually similar, so the differences are often 0. Only strong intensity differences have to be kept in the compressed image.
• These two steps correspond to the application of filters in signal processing:
– High-pass filter: only passes signal components with high frequency (= baby wavelets of higher order)
– Low-pass filter: only passes signal components with low frequency (= baby wavelets of low order)
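A minimal one-dimensional sketch of averaging and differencing, assuming a signal whose length is a power of two. The pairwise average plays the role of the low-pass filter and the pairwise half-difference the role of the high-pass filter. The exact scaling of the differences is an assumption here (conventions vary), and all function names are illustrative.

```python
def analyze(signal):
    """One step: pairwise averages (low pass) and half-differences (high pass)."""
    averages = [(signal[i] + signal[i + 1]) / 2 for i in range(0, len(signal), 2)]
    details = [(signal[i] - signal[i + 1]) / 2 for i in range(0, len(signal), 2)]
    return averages, details


def synthesize(averages, details):
    """Invert one step: each (average, detail) pair restores two samples."""
    signal = []
    for a, d in zip(averages, details):
        signal += [a + d, a - d]
    return signal


def decompose(signal):
    """Repeat the analysis step on the averages until one value is left."""
    all_details = []
    while len(signal) > 1:
        signal, details = analyze(signal)
        all_details.append(details)  # stored from fine to coarse
    return signal[0], all_details


def reconstruct(average, all_details):
    """Restore the signal in stages, from the coarsest details to the finest."""
    signal = [average]
    for details in reversed(all_details):
        signal = synthesize(signal, details)
    return signal


overall_average, detail_levels = decompose([9, 7, 3, 5])
print(overall_average, detail_levels)
print(reconstruct(overall_average, detail_levels))
```

The overall average together with all stored details has exactly as many values as the input signal, and the reconstruction recovers the signal without loss.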
4.1 Example
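[Figure: spectrum X(ω) of a signal, its low-pass filtered version XLP(ω) and its high-pass filtered version XHP(ω), each plotted over ω]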
• High-pass filters extract the image details, low-pass filters the averages
• There are four possible combinations of the two filters that reduce the image size by half both vertically and horizontally:
– HH, HL, LH, LL (sub-band)
• Save the results of the high pass filter for the subsequent reconstruction of the image
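One decomposition step into the four sub-bands can be sketched as below, again with simple averaging and differencing as the two filters. The sketch assumes even width and height. The naming convention (first letter = horizontal filter, second letter = vertical filter) is an assumption, since the literature is not uniform on this point.

```python
def split_rows(img):
    """Filter along each row: pairwise averages (low) and half-differences (high)."""
    low = [[(r[2 * j] + r[2 * j + 1]) / 2 for j in range(len(r) // 2)] for r in img]
    high = [[(r[2 * j] - r[2 * j + 1]) / 2 for j in range(len(r) // 2)] for r in img]
    return low, high


def transpose(matrix):
    return [list(column) for column in zip(*matrix)]


def split_cols(img):
    """Filter along each column by reusing the row filter on the transpose."""
    low, high = split_rows(transpose(img))
    return transpose(low), transpose(high)


def haar_subbands(img):
    """Return (LL, LH, HL, HH); each has half the width and half the height."""
    low, high = split_rows(img)   # horizontal filtering
    ll, lh = split_cols(low)      # vertical filtering of the low-pass part
    hl, hh = split_cols(high)     # vertical filtering of the high-pass part
    return ll, lh, hl, hh


image = [[4, 2, 6, 6],
         [2, 0, 6, 6],
         [1, 1, 3, 5],
         [1, 1, 5, 7]]
bands = haar_subbands(image)
# Four sub-bands of a quarter of the size each: the coefficient count is unchanged.
print(sum(len(band) * len(band[0]) for band in bands))
```

Only the LL band is decomposed further in the next step. The three high-pass bands are stored as wavelet coefficients.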
4.1 Multiresolution Analysis
4.1 Multiresolution Analysis
Filtering and Downsampling
• Various resolutions
4.1 Multiresolution Analysis
LL LH
HL HH
The total number of pixels in each step is the same, i.e. no loss of information!
• Feature array
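– Save the expected value and standard deviation of the wavelet coefficients in each sub-band at each resolution
– E.g., a three-stage decomposition yields 10 sub-bands (three detail sub-bands per stage plus the final LL band) and thus a 20-dimensional feature array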
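A sketch of how such a feature array could be computed, assuming averaging/differencing filters and an image whose side lengths are divisible by 2 for every stage. The function names, the sub-band naming and the order of the entries are illustrative assumptions.

```python
from statistics import mean, pstdev


def haar_step(img):
    """Split an image into the sub-bands LL, LH, HL, HH (each a quarter of the size)."""
    h, w = len(img) // 2, len(img[0]) // 2
    ll = [[0.0] * w for _ in range(h)]
    lh = [[0.0] * w for _ in range(h)]
    hl = [[0.0] * w for _ in range(h)]
    hh = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            a, b = img[2 * i][2 * j], img[2 * i][2 * j + 1]
            c, d = img[2 * i + 1][2 * j], img[2 * i + 1][2 * j + 1]
            ll[i][j] = (a + b + c + d) / 4   # average of the 2x2 block
            lh[i][j] = (a + b - c - d) / 4   # vertical detail
            hl[i][j] = (a - b + c - d) / 4   # horizontal detail
            hh[i][j] = (a - b - c + d) / 4   # diagonal detail
    return ll, lh, hl, hh


def wavelet_features(img, stages=3):
    """Mean and standard deviation of every sub-band: 2 * (3 * stages + 1) values."""
    features = []
    current = img
    for _ in range(stages):
        current, lh, hl, hh = haar_step(current)
        for band in (lh, hl, hh):
            coefficients = [value for row in band for value in row]
            features += [mean(coefficients), pstdev(coefficients)]
    coefficients = [value for row in current for value in row]  # final LL band
    features += [mean(coefficients), pstdev(coefficients)]
    return features


image = [[x + y for x in range(8)] for y in range(8)]
features = wavelet_features(image, stages=3)
print(len(features))
```

Two images can then be compared by a distance between their feature arrays instead of comparing all wavelet coefficients.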
4.1 Multiresolution Analysis
• Shape-based retrieval
– Occurring shapes contribute significantly to the similarity of images
• In contrast to the purely visual impression made by colors or textures, shapes often carry deeper semantic information
• The identity of a displayed item is often independent of its color, whereas identical items usually have an identical shape
4.2 Shape-based Features
4.2 Example: Chair
• Combining simple shape features (round, elliptical, triangular, square, trapezoid, ...) with other features (color, texture, etc.) improves retrieval
– "Round object in a red-orange image" may be a search for a sunset ...
4.2 Basic Idea
• Even more complicated: "Find all coats of arms containing crosses"
4.2 Basic Idea
• Fundamental problems
– How to recognize the shape of things in images?
– Is a semantic mapping always possible?
– How do we describe shapes with features?
– Which shapes are similar and how do you compare different shapes?
4.2 Shape-based Retrieval
• Shape segmentation is a fundamental problem
– Which shapes are displayed in the image?
– All of them?
– All important ones?
– Only the foreground motif?
4.2 Segmentation
• What represents shape and what does not?
• Is the shape homogeneous in colors or textures?
4.2 Segmentation
• Are all parts of the shape visible?
• Is the sun round?
• Segmentation?
4.2 Segmentation
• Can the segmentation be done automatically?
– At least semi-automatically?
• Not in early versions of multimedia retrieval!
– E.g.: IBM's QBIC Image Classifier
4.2 Automatic Segmentation
4.2 IBM's QBIC Prototype
• Manually or semi-automatically with flood fill ("seeded region growing")
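A generic sketch of flood fill / seeded region growing on a gray value image. This is not QBIC's actual implementation: the 4-neighborhood, the comparison against the seed value and the `tolerance` parameter are assumptions chosen for illustration.

```python
from collections import deque


def flood_fill(img, seed, tolerance=0):
    """Return the set of (row, column) positions connected to the seed whose
    gray value differs from the seed's gray value by at most `tolerance`."""
    height, width = len(img), len(img[0])
    reference = img[seed[0]][seed[1]]
    region = {seed}
    queue = deque([seed])
    while queue:
        y, x = queue.popleft()
        for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):  # 4-neighborhood
            ny, nx = y + dy, x + dx
            if (0 <= ny < height and 0 <= nx < width
                    and (ny, nx) not in region
                    and abs(img[ny][nx] - reference) <= tolerance):
                region.add((ny, nx))
                queue.append((ny, nx))
    return region


image = [[1, 1, 9],
         [1, 9, 9],
         [1, 1, 1]]
dark_region = flood_fill(image, seed=(0, 0))
print(sorted(dark_region))
```

With a tolerance of 0 the region grows only over pixels with exactly the seed's gray value, which illustrates why this approach works best on monochrome surfaces.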
4.2 Segmentation in QBIC
[Figure: input image, masked image, marked shape, auto-unmask]
• Problem
– Only segmentation of monochrome surfaces
– 'End' forms
4.2 Segmentation in QBIC
[Figure: input image, masked image, marked shape, auto-unmask]
• Many research projects in multimedia retrieval have been working on the topic (e.g., Blobworld, Photobook)
– There are solutions, however mostly for special cases
4.2 Automatic Segmentation
• Due to segmentation problems, shape features were removed from all commercial databases
– IBM's QBIC → DB2 Image Extender (set)
– Virage Retrieval Engine → Oracle Multimedia → Oracle Intermedia
– Excalibur Technologies → Informix Image Foundation DataBlade
4.2 Automatic Segmentation
• In principle, a shape is defined by its outer perimeter
4.2 Automatic Segmentation
• How do you get this outline?
– Segmentation of areas with the same brightness, color and/or texture
– Edge detection (differences in brightness, gradient, watersheds, etc.)
– Filling in the spaces with morphological operators (dilation and erosion)
– Segmentation of the outline as closed curve (polygon, splines, ...)
– And a large number of other procedures ...
4.2 Automatic Segmentation
• Usually applied for gray value images
• Idea:
– Important objects can be differentiated from the background because of their different brightness range
– A certain threshold separates the regions
4.2 Thresholding
• Assumption:
– Thematically related areas have similar gray values
– They can be clearly separated from the background
4.2 Thresholding
• Fixed threshold
– A fixed threshold is applied to each image
– Sufficient, for example, in the case of binary images
• Flexible threshold
– Depends on the gray value histogram
– New threshold for each image
– Often, the histogram is smoothed first, but without moving the peaks
4.2 Thresholding
• ISODATA algorithm (Ridler and Calvard, 1978)
– Divide the gray value histogram into two parts
– Calculate the expected values of the gray values in the left and right part
– Compute a new threshold as the average of the two expected values
– Iteratively compute the new expected values and a new threshold (until the threshold no longer changes significantly)
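The iteration described above can be sketched as follows. The sketch works directly on a list of gray values instead of a histogram. The starting threshold (middle of the occupied gray value range) and the stopping tolerance `eps` are assumptions, since the slide does not fix them.

```python
def isodata_threshold(pixels, eps=0.5):
    """Iterate the threshold until it changes by less than eps."""
    # Assumption: start in the middle of the occupied gray value range.
    threshold = (min(pixels) + max(pixels)) / 2
    while True:
        left = [p for p in pixels if p <= threshold]
        right = [p for p in pixels if p > threshold]
        if not left or not right:  # all pixels on one side: nothing to separate
            return threshold
        # New threshold: average of the two expected values.
        new_threshold = (sum(left) / len(left) + sum(right) / len(right)) / 2
        if abs(new_threshold - threshold) < eps:
            return new_threshold
        threshold = new_threshold


pixels = [10, 12, 14, 200, 202, 204]
t = isodata_threshold(pixels)
binary = [p > t for p in pixels]  # True = bright object, False = dark background
print(t, binary)
```

Applying the resulting threshold to every pixel gives the binary image used for segmentation.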
4.2 Thresholding
• Triangle algorithm (Zack et al., 1977)
– Connect the highest peak in the histogram with the highest brightness value
– Maximize the distance between the histogram and the connecting line
– The threshold is the gray value found this way, shifted by some constant value
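A sketch of the triangle construction on a histogram given as a list of counts per gray value. It assumes that the histogram peak lies on the dark side and that the line ends at the brightest non-empty bin. The `offset` parameter stands for the constant shift and is a hypothetical name.

```python
import math


def triangle_threshold(hist, offset=0):
    """Return the bin with the largest distance to the peak-to-end line, plus offset."""
    peak = max(range(len(hist)), key=lambda i: hist[i])
    end = max(i for i, count in enumerate(hist) if count > 0)
    if end <= peak:  # degenerate case: no bins to the right of the peak
        return peak + offset
    x1, y1, x2, y2 = peak, hist[peak], end, hist[end]
    length = math.hypot(x2 - x1, y2 - y1)

    def distance(i):
        # Perpendicular distance of the point (i, hist[i]) from the connecting line.
        return abs((y2 - y1) * i - (x2 - x1) * hist[i] + x2 * y1 - y2 * x1) / length

    best = max(range(peak, end + 1), key=distance)
    return best + offset


histogram = [0, 10, 2, 1, 1, 1, 1, 1]
print(triangle_threshold(histogram))
```

The method is intended for histograms with one dominant peak and a long tail, where no clear valley separates object and background.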
4.2 Thresholding
• Medical segmentation
4.2 Application example
• There are also area-based algorithms, which evaluate thresholds of individual image areas to segment an image
– Applicability depends strongly on each image collection
4.2 Thresholding
• Area based algorithms, especially for color images
4.2 Thresholding
Segmentation with Edge Flow (Ma and Manjunath, 1997)
• Advantage
– Very simple procedure
• Disadvantage
– Determination of the "right" thresholds
– Assumption: strong color or gray value change between foreground object and background
– Problem: decomposition of complex objects
4.2 Thresholding
• Strong color change between the foreground object and background?
4.2 Thresholding
• Edge detection does not detect the areas of the foreground objects, but the borders of such areas
• The goal is a closed curve around an image object
• Usually, maxima of the first derivative or zero crossings of the second derivative of the brightness function are considered
• Gradient and Laplace operator
4.2 Edge Detection
• How to determine the gradient at (x, y)?
• Problem: gradients require differentiable (continuous) functions; we only have discrete sample points
• Two common solutions:
– (1) Estimate a differentiable function from the available sample points and use it (e.g., via Fourier transformation)
– (2) Estimate the course of this function for each pixel from its immediate neighborhood (e.g., Sobel filter); often much faster than (1)
4.2 Edge Detection
• Gradient-based method
– Calculate the magnitude of the gradient at each point (e.g., Sobel filter)
– Edges are indicated by a high gradient magnitude
– Then use a thresholding algorithm to separate the edges from the regions
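A sketch of the gradient-based method with the standard 3 × 3 Sobel masks on a gray value image stored as a list of rows. Leaving the border pixels at 0 and using a fixed, hand-picked threshold are simplifying assumptions.

```python
import math


def sobel_magnitude(img):
    """Gradient magnitude per pixel; border pixels are left at 0."""
    height, width = len(img), len(img[0])
    magnitude = [[0.0] * width for _ in range(height)]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            # Horizontal derivative: right column minus left column (weights 1, 2, 1).
            gx = ((img[y - 1][x + 1] + 2 * img[y][x + 1] + img[y + 1][x + 1])
                  - (img[y - 1][x - 1] + 2 * img[y][x - 1] + img[y + 1][x - 1]))
            # Vertical derivative: bottom row minus top row (weights 1, 2, 1).
            gy = ((img[y + 1][x - 1] + 2 * img[y + 1][x] + img[y + 1][x + 1])
                  - (img[y - 1][x - 1] + 2 * img[y - 1][x] + img[y - 1][x + 1]))
            magnitude[y][x] = math.hypot(gx, gy)
    return magnitude


def edge_map(img, threshold):
    """Mark every pixel whose gradient magnitude exceeds the threshold."""
    return [[m > threshold for m in row] for row in sobel_magnitude(img)]


# Dark left half, bright right half: a vertical edge between columns 2 and 3.
image = [[0, 0, 0, 100, 100, 100] for _ in range(5)]
edges = edge_map(image, threshold=200)
print(edges[2])
```

Even for this ideal step edge the detected edge is two pixels wide, because both pixels next to the step see the full intensity difference.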
4.2 Edge Detection
• Advantage
– Simple filter
• Disadvantage
– Very susceptible to noise (one possibility would be performing noise reduction before applying the Sobel filter)
– Blurred or merging contours
4.2 Edge Detection
• Zero crossing of second derivative (“Laplacian Zero-Crossing”)
• Is particularly used in "noisy" images with blurred edges
• The behavior of the gradient is studied starting from an ideal edge
4.2 Edge Detection
• Idea: the zero crossing of the second derivative marks the maximum of the gradient
4.2 Edge Detection
• Unlike in the gradient method, not every point with a sufficiently high gradient value is assigned to the edge, but only the points at a zero crossing
• Applying a smoothing filter (normally a Gaussian filter) before calculating the derivative reduces the susceptibility to noise
4.2 Edge Detection
• Important: Only real zero crossings, not zero points
• Mark all pixels with zero crossings and multiply them by the "strength" of the edge (e.g., magnitude of the gradient)
• Again, thresholding can then be applied to perform the segmentation
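The zero-crossing idea can be illustrated on a one-dimensional brightness profile across a blurred edge. Working in 1D, omitting the Gaussian smoothing and using simple finite differences are simplifications assumed for this sketch. The function and parameter names are illustrative.

```python
def zero_crossings(profile, min_strength=0):
    """Return (position, strength) for every real zero crossing of the
    discrete second derivative of a 1D brightness profile."""
    n = len(profile)
    second = [0] * n
    for i in range(1, n - 1):
        second[i] = profile[i - 1] - 2 * profile[i] + profile[i + 1]
    crossings = []
    for i in range(n - 1):
        # A real crossing needs a sign change; values that are merely 0
        # (e.g., in flat regions) give a product of 0 and are skipped.
        # Simplification: a crossing that passes through an exact 0 sample is missed.
        if second[i] * second[i + 1] < 0:
            strength = abs(profile[i + 1] - profile[i])  # gradient magnitude
            if strength >= min_strength:
                crossings.append((i, strength))
    return crossings


profile = [0, 0, 1, 4, 9, 10, 10]  # blurred edge from dark to bright
print(zero_crossings(profile))
```

The crossing is reported between positions 3 and 4, exactly where the first difference of the profile is largest.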
4.2 Edge Detection
• Example:
4.2 Edge Detection
• Comparison between gradient procedure and zero crossing technique:
4.2 Edge Detection
[Figure: original image, result of the gradient method, result of the zero crossing method]
• Sobel and zero-crossing filters in Matlab
– Transform the image to gray scale values
– sobel = edge(img, 'sobel')
– zeroc = edge(img, 'zerocross')
4.2 Edge Detection
• Watershed transformation
• Assumption: surfaces are defined by minimal gray values and their zone of influence
• Idea: "flood" each surface starting from its minimum gray value, in such a way that different surfaces do not connect
• Gray values can be seen as topographical surfaces or "Mountains"
4.2 Watersheds
• Example: Flood regions based on the minimum gray values
4.2 Watersheds
• For image segmentation: watershed transformation of the gradient
4.2 Watersheds
[Figure: original image, gradient, watersheds, segmentation]
• Advantage
– Closed and correct borders
• Disadvantage
– Difficult to implement efficiently
– Over-segmentation
4.2 Watersheds
• Assumption
– Regions are bordered by a predominantly closed curve ("salient boundary")
• Method
– Starting from an initial curve ("snake"), iterate towards the best possible separation
– Minimize the energy of the snake curve
• Internal energy: curvature and continuity
• External energy: image energy (gradient)
4.2 Active Contour
4.2 Example
• Advantage
– Also fits "fuzzy" edges
• Disadvantage
– Complexity of the curve increases with the accuracy of contour
– Where does the initial snake curve come from?
4.2 Active Contour
• Problem:
– Noise can make shape recognition difficult
• Goal:
– Make the contours of surfaces easily recognizable and easy to describe
• Solution:
– Apply morphological operators as a preprocessing step
4.2 Morphological Operators
• Morphological operators are neighborhood operations on binary images that change the shape of areas
– Pixels are removed from or added to the object edges by such operations
– These operations are controlled by an operator mask (the "structuring element")
4.2 Morphological Operators
• Basic operators
– Dilation – “inflating”, adding pixels to the area
– Erosion – “shrinking”, removing pixels from the area
• Typical structuring elements are symmetric areas around a pixel
4.2 Morphological Operators
• Dilation
– The structuring element is applied to every pixel of the source image
– The structuring element defines a neighborhood around each pixel
– In the dilated image, the black pixels are exactly those pixels that had a black pixel anywhere in their neighborhood in the original image
– Effects:
→ enlarging areas
→ connecting objects that are a small distance apart
4.2 Morphological Operators
• Example of a dilation with various structuring elements
4.2 Morphological Operators
[Figure legend: original pixel; new pixel after dilation]
• Erosion
– The structuring element is again applied to every pixel of the source image
– The structuring element again defines neighborhoods
– In the resulting image, the white pixels are exactly those pixels that had a white pixel in their neighborhood in the original image
– Effects:
→ small spots disappear
→ breaking up areas with small connections
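Both basic operators can be sketched on a binary image stored as a list of rows, with 1 for a black (object) pixel and 0 for a white (background) pixel. The 3 × 3 square element and the decision to ignore neighbors outside the image are assumptions made for this sketch.

```python
# Structuring element as a list of (row offset, column offset) pairs.
SQUARE_3X3 = [(dy, dx) for dy in (-1, 0, 1) for dx in (-1, 0, 1)]


def neighborhood(img, y, x, element):
    """Values of all neighbors of (y, x) that lie inside the image."""
    height, width = len(img), len(img[0])
    return [img[y + dy][x + dx] for dy, dx in element
            if 0 <= y + dy < height and 0 <= x + dx < width]


def dilate(img, element=SQUARE_3X3):
    """A pixel becomes black if any pixel in its neighborhood is black."""
    return [[1 if any(neighborhood(img, y, x, element)) else 0
             for x in range(len(img[0]))] for y in range(len(img))]


def erode(img, element=SQUARE_3X3):
    """A pixel stays black only if all pixels in its neighborhood are black."""
    return [[1 if all(neighborhood(img, y, x, element)) else 0
             for x in range(len(img[0]))] for y in range(len(img))]


dot = [[1 if (y, x) == (2, 2) else 0 for x in range(5)] for y in range(5)]
for row in dilate(dot):
    print(row)  # the single pixel grows into a 3 x 3 square
```

Eroding the same image removes the isolated pixel completely, which is the "small spots disappear" effect listed above.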
4.2 Morphological Operators
• Example
4.2 Morphological Operators
[Figure: original, dilation, erosion]
• Opening: erosion followed by dilation
– Elimination of thin and small objects
– Breaking up thinly connected areas
– Smoothing of edges
• Closing: dilation followed by erosion
– Small holes are filled
– Joining close objects
– Smoothing of edges
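Opening and closing are compositions of the two basic operators, as the following sketch shows. As before, 1 denotes an object pixel. The 3 × 3 square element and the handling of the image border (neighbors outside the image are ignored) are assumptions.

```python
SQUARE_3X3 = [(dy, dx) for dy in (-1, 0, 1) for dx in (-1, 0, 1)]


def _apply(img, element, combine):
    """Combine the in-image neighbors of every pixel with any() or all()."""
    height, width = len(img), len(img[0])
    return [[int(combine(img[y + dy][x + dx] for dy, dx in element
                         if 0 <= y + dy < height and 0 <= x + dx < width))
             for x in range(width)] for y in range(height)]


def dilate(img, element=SQUARE_3X3):
    return _apply(img, element, any)


def erode(img, element=SQUARE_3X3):
    return _apply(img, element, all)


def opening(img, element=SQUARE_3X3):
    """Erosion followed by dilation: removes small objects."""
    return dilate(erode(img, element), element)


def closing(img, element=SQUARE_3X3):
    """Dilation followed by erosion: fills small holes."""
    return erode(dilate(img, element), element)


# A 3 x 3 blob plus one isolated noise pixel.
blob = [[1 if 1 <= y <= 3 and 1 <= x <= 3 else 0 for x in range(7)] for y in range(7)]
noisy = [row[:] for row in blob]
noisy[5][5] = 1

# A 3 x 3 blob with a one-pixel hole in the middle.
ring = [[1 if 2 <= y <= 4 and 2 <= x <= 4 else 0 for x in range(7)] for y in range(7)]
ring[3][3] = 0

print(opening(noisy) == blob)   # the noise pixel is removed, the blob survives
print(closing(ring)[3][3])      # the hole is filled
```

Opening removes the noise pixel without shrinking the blob, and closing fills the hole without enlarging the blob.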
4.2 Morphological Operators
• Example
4.2 Morphological Operators
[Figure: original, opening, closing]
• Advantages
– Using morphological operators for image processing makes it easier to obtain good shapes
• Disadvantages
– Gray values of the areas must be uniform
– Precise control is relatively difficult
4.2 Morphological Operators
• Multiresolution Analysis
• Shape-based Features
– Thresholding
– Edge detection
– Morphological Operators
This Lecture
• Query by Visual Examples
• Shape-based Features
– Chain Codes
– Fourier Descriptors
– Moment Invariants