Multimedia Databases

(1)


Wolf-Tilo Balke Janus Wawrzinek

Institut für Informationssysteme

Technische Universität Braunschweig

(2)

• Color spaces
– RGB, CMYK, HSV
• Extracting color features
– Average color, color histogram, quantization
• Matching
– Comparison of histograms, Minkowski distance, quadratic distance, Mahalanobis distance
– Color layout

• Today: Textures

Previous Lecture

…

(3)

3 Texture-Based Image Retrieval

3.1 Textures, basics

3.2 Low-Level Features

- Tamura Measure

- Probabilistic: Random Field Model

3.3 High-Level Features

- Fourier-Transform

3 Texture-Based Image Retrieval

(4)

• Textures describe the nature of typical, recurrent patterns in pictures

• Important for the description of images

– Type of representation (raster image, etc.)
– Image objects
  • Natural things: grass, gravel, etc.
  • Artificial things: stone walls, wallpaper, etc.

3.1 Texture Analysis

(5)

• Various ordered and random textures

3.1 Example

(6)

• Texture segmentation
– Find areas of the image (decomposition) with homogeneous textures
• Texture classification
– Describe and label homogeneous textures in image regions
• (Texture synthesis)
– Create textures for increased realism in images (texture mapping, etc.)

3.1 Texture Research

(7)

• Find image regions with a certain texture

• Here: wine grapes

• Scene decomposition

3.1 Texture Segmentation

(8)

• Color and texture are related

• Texture decomposition often provides no meaningful (semantically related) areas

3.1 Texture Segmentation

(9)

• Label the (segmented) regions with their predominant texture
– Medical images: tomography, etc.
– Satellite images: ice, water, etc.
– ...
• Describe the corresponding texture with appropriate features that are suitable for comparisons in similarity search queries

3.1 Texture Classification

(10)

• Classification can be semantic
– Textures represent objects in the real world
– Strongly dependent on the application
• Or based on purely descriptive characteristics
– Usually has no direct significance for people
– Ensures comparability between different image collections
– Query-by-Example

3.1 Texture Classification

(11)

• Example: Satellite image (semantically)

3.1 Texture Classification

Sand

Water

(12)

• How can textures be described for similarity measures?
• Low-level features
– Basic building blocks (e.g., Julesz' textons), Tamura measure, etc.
• High-level features
– Gabor filters, Fourier transform, etc.

3.1 Texture Features

(13)

• How do people distinguish textures?

3.1 Texture Features

(14)

• (Rao / Lohse, 1993) give three main criteria:
– Repetition
– Orientation
– Complexity

• Is this measurable?

3.1 Texture Features

(15)

• 1960s and 1970s: grey-level analysis
– Grey-value histograms provide information on pixel intensity
– Allows comparison using expected value, standard deviation, etc.
– Similar patterns produce similar distributions of grey values

3.2 Low-Level Texture Features

(16)

• First-order moments do not take the position of the pixels into account
• Periodicity is therefore hard to detect

3.2 Low-Level Texture Features

(17)

• Solution: grey-level co-occurrence
– The pixel at position s has intensity q: I(s) = q
– (Julesz, 1961): Calculate the empirical probability distribution of an intensity change by the value m under a pixel shift of d pixels to the right
– (Julesz, 1975): Generalization to shifts in any direction. As a two-dimensional distribution function (one for every d), use the grey-level co-occurrence matrix

3.2 Low-Level Texture Features

(18)

• Grey-level co-occurrence matrix:
– Consider all pixel pairs $(x_1, y_1), (x_2, y_2)$ with Euclidean distance d
– Point $(x_1, y_1)$ has grey value i, point $(x_2, y_2)$ has grey value j, with $i, j \in \{1, \dots, N\}$
– We can now define $C_d = [c_d(i, j)]$ as the grey-level co-occurrence matrix, where $c_d(i, j)$ is the number of pixel pairs that have distance d and intensities i and j, respectively
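The counting step can be illustrated with a minimal Python sketch. It makes two simplifying assumptions that are not on the slide: a single fixed shift (dy, dx) stands in for "all pairs at distance d", and grey values are indexed 0, …, N−1 instead of 1, …, N. The function name `cooccurrence_matrix` is hypothetical.

```python
def cooccurrence_matrix(image, shift, levels):
    """Count pixel pairs (s, s + shift) whose grey values are (i, j).

    Assumption: grey values are integers in 0..levels-1 and `shift`
    is a (dy, dx) offset standing in for one direction at distance d.
    """
    dy, dx = shift
    height, width = len(image), len(image[0])
    counts = [[0] * levels for _ in range(levels)]
    for y in range(height):
        for x in range(width):
            y2, x2 = y + dy, x + dx
            # Only count pairs whose second pixel lies inside the image.
            if 0 <= y2 < height and 0 <= x2 < width:
                counts[image[y][x]][image[y2][x2]] += 1
    return counts


# Vertical stripes: values alternate along x and stay constant along y.
stripes = [[0, 1, 0, 1],
           [0, 1, 0, 1],
           [0, 1, 0, 1]]
c_right = cooccurrence_matrix(stripes, (0, 1), levels=2)
c_down = cooccurrence_matrix(stripes, (1, 0), levels=2)
```

For the striped image, the shift to the right only produces pairs with different grey values, while the shift downwards only produces pairs with equal grey values. This is the kind of positional information a plain histogram cannot capture.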

3.2 Low-Level Texture Features

(19)

• Expensive to calculate: one (N × N) matrix for each distance d
• Many measures have been derived from grey-level co-occurrence matrices
• Conjecture of Julesz (Julesz and others, 1973): people cannot distinguish textures that have identical grey-level co-occurrence matrices
• Perception psychology: unfortunately wrong, but useful as a rule of thumb! (Julesz, 1981)

3.2 Low-Level Texture Features

(20)

• The Tamura measure (Tamura and others, 1978)
• Image textures are evaluated along six different dimensions
– Granularity (coarseness): gravel vs. sand
– Contrast: clear-cut shapes, shadows
– Directionality: predominant directions
– Line-likeness
– Regularity
– Roughness
• The last three properties are rarely used and appear to be correlated with the others

3.2 The Tamura Measure

(21)

• Granularity (coarseness)
– Image resolution: e.g., aerial photographs from different heights

3.2 Granularity

(22)

• Examine the neighborhood of each pixel for brightness changes
– Lay a window of size $2^i \times 2^i$ over each pixel (e.g., 1 × 1 to 32 × 32 in IBM's QBIC)
– Determine for each i and each pixel the average grey level in the corresponding window

3.2 Granularity Extraction

(23)

• Compute $\delta_i = \max(\delta_i^h, \delta_i^v)$ for each pixel
– $\delta_i^h$ is the difference between the mean grey levels of the left and right horizontally adjacent windows (of size $2^i \times 2^i$)
– $\delta_i^v$ is defined analogously for the vertically adjacent windows
• Determine for each pixel the maximum window size $2^j \times 2^j$ whose $\delta_j$ attains the maximum difference (or lies within a certain tolerance of the maximum of the $\delta_i$)

3.2 Granularity Extraction

(24)

• The granularity of the entire image is the mean of the maximum window sizes of all pixels
• Instead of the mean, a histogram that records the number of pixels for each window size can be used
• This allows for better comparison between images with different granularities
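The extraction steps can be sketched in Python as follows. This is one possible reading of the slides, not a reference implementation: windows that would leave the image contribute a difference of 0, ties go to the larger window, and the two adjacent windows are taken to meet at the pixel under consideration. The name `tamura_coarseness` and the parameter `max_exponent` are hypothetical.

```python
def tamura_coarseness(image, max_exponent=2):
    """Return (mean best window size, histogram of best window sizes)."""
    height, width = len(image), len(image[0])
    # Summed-area table: sat[y][x] holds the sum of image[0:y][0:x].
    sat = [[0.0] * (width + 1) for _ in range(height + 1)]
    for y in range(height):
        for x in range(width):
            sat[y + 1][x + 1] = (image[y][x] + sat[y][x + 1]
                                 + sat[y + 1][x] - sat[y][x])

    def block_mean(x0, y0, size):
        """Mean of the size x size block with top-left corner (x0, y0)."""
        x1, y1 = x0 + size, y0 + size
        if x0 < 0 or y0 < 0 or x1 > width or y1 > height:
            return None  # block leaves the image
        total = sat[y1][x1] - sat[y0][x1] - sat[y1][x0] + sat[y0][x0]
        return total / (size * size)

    def difference(first, second):
        """Absolute difference of two block means, 0 if a block is missing."""
        if first is None or second is None:
            return 0.0
        return abs(first - second)

    histogram = {2 ** i: 0 for i in range(max_exponent + 1)}
    for y in range(height):
        for x in range(width):
            best_size, best_delta = 1, -1.0
            for i in range(max_exponent + 1):
                size = 2 ** i
                half = size // 2
                # Left/right and upper/lower adjacent windows of equal size.
                delta_h = difference(block_mean(x - size, y - half, size),
                                     block_mean(x, y - half, size))
                delta_v = difference(block_mean(x - half, y - size, size),
                                     block_mean(x - half, y, size))
                delta = max(delta_h, delta_v)
                if delta >= best_delta:  # ties go to the larger window
                    best_size, best_delta = size, delta
            histogram[best_size] += 1
    mean_size = sum(s * n for s, n in histogram.items()) / (width * height)
    return mean_size, histogram


# Stripes of width 1 (fine) and width 4 (coarse) on a 16 x 16 image.
fine = [[255 * (x % 2) for x in range(16)] for _ in range(16)]
coarse = [[255 * ((x // 4) % 2) for x in range(16)] for _ in range(16)]
fine_score, fine_hist = tamura_coarseness(fine)
coarse_score, coarse_hist = tamura_coarseness(coarse)
```

In this sketch the finely striped image receives a smaller mean window size than the coarsely striped one. The returned histogram corresponds to the alternative representation mentioned above.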

3.2 Granularity Extraction

(25)

• Problem: image sections whose granularity needs to be determined may be too small to calculate meaningful averages in large operator windows, so small image sections would always have small granularity
• Remedy: estimate the maximum $\delta_i$ from the smaller values (Equitz / Niblack, 1994)

3.2 Granularity Extraction

(26)

• Contrast evaluates the clarity of an image

• Sharpness of the color transitions

• Exposure, shadows

3.2 Contrast

Low Contrast High Contrast

(27)

• Extraction of the contrast values
– Consider higher moments of the grey-level histogram
– The contrast is $F_{con} = \sigma / (\alpha_4)^{1/4}$ with $\alpha_4 = \mu_4 / \sigma^4$
– Here $\sigma$ is the standard deviation of the grey-level distribution, $\alpha_4$ is the kurtosis and $\mu_4$ is the fourth central moment
– Unimodal and bimodal distributions can be distinguished by means of the kurtosis
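As a small illustration, the contrast can be computed directly from the pixel values. The sketch below assumes the common form of the Tamura contrast, σ divided by the fourth root of the kurtosis α₄ = μ₄ / σ⁴, and returns 0 for a perfectly flat image. The function name `tamura_contrast` is hypothetical.

```python
import math


def tamura_contrast(image):
    """Contrast as sigma / kurtosis^(1/4), with kurtosis = mu4 / sigma^4."""
    pixels = [value for row in image for value in row]
    n = len(pixels)
    mean = sum(pixels) / n
    variance = sum((v - mean) ** 2 for v in pixels) / n
    if variance == 0:
        return 0.0  # flat image: no contrast (assumed convention)
    mu4 = sum((v - mean) ** 4 for v in pixels) / n  # fourth central moment
    kurtosis = mu4 / variance ** 2
    return math.sqrt(variance) / kurtosis ** 0.25


high = tamura_contrast([[0, 255], [0, 255]])
low = tamura_contrast([[100, 110], [100, 110]])
flat = tamura_contrast([[7, 7], [7, 7]])
```

Both test images are bimodal with kurtosis 1, so the contrast reduces to the standard deviation: 127.5 for the black-and-white image and 5 for the two nearby grey levels.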

3.2 Contrast

(28)

• Directionality

• Captures the predominant directions of elements in the image

3.2 Directionality

Highly directional Weak directional

(29)

• Directionality
– Determine the strength (magnitude) and direction (angle) of the gradient at each pixel, e.g., with the Sobel edge detector (IBM's QBIC: 16 directions)

3.2 Directionality Extraction

(30)

• Create a histogram that records, for each angle, the number of pixels with gradients above a certain threshold
• A dominant direction in the image is represented by a peak in the histogram
• If the measure should be rotation invariant, do not use the angles themselves, but the number and amplitude of such peaks to calculate the average directionality
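The histogram construction can be sketched in Python with a Sobel operator. The choices below are assumptions for illustration only: 16 bins over 180 degrees, a fixed magnitude threshold of 100, and border pixels being skipped. The function name `direction_histogram` is hypothetical.

```python
import math


def direction_histogram(image, bins=16, threshold=100.0):
    """Histogram of gradient directions for pixels with strong gradients."""
    height, width = len(image), len(image[0])
    histogram = [0] * bins
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            # Sobel responses in x and y direction.
            gx = (image[y - 1][x + 1] + 2 * image[y][x + 1] + image[y + 1][x + 1]
                  - image[y - 1][x - 1] - 2 * image[y][x - 1] - image[y + 1][x - 1])
            gy = (image[y + 1][x - 1] + 2 * image[y + 1][x] + image[y + 1][x + 1]
                  - image[y - 1][x - 1] - 2 * image[y - 1][x] - image[y - 1][x + 1])
            if math.hypot(gx, gy) > threshold:
                # Direction modulo 180 degrees: opposite gradients share a bin.
                angle = math.atan2(gy, gx) % math.pi
                histogram[int(angle / math.pi * bins) % bins] += 1
    return histogram


# Stripes of width 2, once varying along x and once varying along y.
vertical = [[255 * ((x // 2) % 2) for x in range(8)] for _ in range(8)]
horizontal = [[255 * ((y // 2) % 2) for _ in range(8)] for y in range(8)]
hist_vertical = direction_histogram(vertical)
hist_horizontal = direction_histogram(horizontal)
```

Each striped image yields a single peak: bin 0 for gradients along x and bin 8 (90 degrees) for gradients along y. A rotation-invariant measure would compare the number and height of such peaks rather than their positions.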

3.2 Directionality Extraction

(31)

• The first three Tamura features are largely uncorrelated, so the similarity between two textures x and y can be implemented as their distance in the 3D feature space (granularity, contrast, directionality).
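A minimal Python sketch of such a distance, assuming a plain Euclidean metric and features that were normalized to comparable ranges beforehand (both are assumptions, a weighted distance would work the same way). The feature values are made up for illustration.

```python
import math


def tamura_distance(x, y):
    """Euclidean distance between two (granularity, contrast, directionality) vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


# Hypothetical, already normalized feature vectors of two textures.
texture_x = (0.2, 0.7, 0.1)
texture_y = (0.5, 0.3, 0.1)
distance = tamura_distance(texture_x, texture_y)
```

Without normalization, the feature with the largest numeric range would dominate the distance.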

3.2 Matching using the Tamura-measure


(32)

• Pattern (fur) on ‘coats of arms’ images

3.2 Example

(33)

• Random-field models
– Observation: textures are repeated periodically
– Generating textures (synthesis) requires stochastic models
  • Ability to predict the brightness of a pixel in an image sample
  • Probability that a pixel has a certain brightness value
– A good model creates different, but still very similar textures

3.2 Stochastic models

(34)

• The same trick can be used for the texture description and matching

• Which model (which parameters) generates the given textures best?
• Assuming one model has created all the textures in the collection, the parameters of the model serve as comparable features for each image

3.2 Random-Field Models

(35)

• Textures generated by a model x:
– Given a pixel and its surroundings: what is the expected intensity value?
– Obviously different for the following images, so model x has different parameters for each of them

3.2 Example

striated irregular

(36)

• An image is described by a matrix F

(values in the matrix correspond to pixel intensities)

• Model:

– Matrix F is a random variable
– The class of the distribution of F is known, but the parameters of the distribution are not
• Question:
– We have an image and assume that it is a realization of the model. What are the parameters of the corresponding distribution of F? (Maximum likelihood estimation)

• Idea: describe the observed image by the estimated parameters

• F is called a random field

3.2 Approach

(37)

• What does the expected intensity of a pixel depend on?
• For "sufficiently regular" textures the following locality statement holds:

3.2 Locality Property of Textures

"If the neighbors to the left and right are white and the neighbors above and below are black, then the pixel covered by the red square is, with high probability, also white"

(38)

• We can usually assume that the value of a pixel s does not depend on the values of all pixels in the image, but only on the pixels in the neighborhood of s
– This is called the Markov property
• Simplification: the neighborhood $N_s$ is defined by a set N of shifts:
– $N_s = \{s + t \mid t \in N\}$
• Generalization (8-neighborhood):
N = {(0, 1), (1, 0), (0, –1), (–1, 0), (1, 1), (1, –1), (–1, –1), (–1, 1)}
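The definition of $N_s$ translates directly into code. The sketch below interprets each shift as a (row, column) offset and drops neighbors that would fall outside the image. Both choices are assumptions, since the slides do not say how border pixels are handled.

```python
# Shifts of the 8-neighborhood, read as (row, column) offsets.
SHIFTS = [(0, 1), (1, 0), (0, -1), (-1, 0), (1, 1), (1, -1), (-1, -1), (-1, 1)]


def neighborhood(s, shifts, height, width):
    """Return N_s = {s + t | t in shifts}, restricted to pixels inside the image."""
    row, col = s
    return [(row + dr, col + dc) for dr, dc in shifts
            if 0 <= row + dr < height and 0 <= col + dc < width]


inner = neighborhood((2, 2), SHIFTS, height=5, width=5)
corner = neighborhood((0, 0), SHIFTS, height=5, width=5)
```

An inner pixel has all eight neighbors, while the corner pixel keeps only three of them.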

3.2 Markov Property

(39)

• A model must now be defined that best reproduces the observed distribution
– There are many classes of texture models
– We must commit to a common model for each collection and then calculate the best parameters for each image of the collection

3.2 Texture Model

(40)

• A popular class of models for texture description is the simultaneous autoregressive model (SAR):
$F(s) = \sum_{t \in N} \theta(t) \cdot F(s + t) + \beta \cdot W(s)$
• F(s) is the value of pixel s
• W(s) is a random variable (white noise with mean 0 and variance 1)
• θ(t) and β are characteristic parameters and are used as features for matching

3.2 Texture Model

(41)

• Problem:
– The best neighborhood size for a pixel differs with the periodicity of the texture
– Unfortunately, choosing it is not trivial
• Multi-resolution simultaneous autoregressive model (Mao and Jain, 1992)

3.2 Texture Model

(42)

• Random-field models provide a good low-dimensional description of textures
• Assumptions:
– The Markov condition holds, so the state of a pixel is described with sufficient accuracy by its neighborhood
– The size of the neighborhood has been well chosen for the collection

3.2 Overview

(43)

• Transform domain features
– With low-level features, one chooses descriptors for certain aspects such as coarseness or contrast and embeds them in a vector space
– High-level features describe the complete picture in a different domain (no loss of information)
– To this end, the image is interpreted as a signal and transformed mathematically

3.3 Transform Domain Features

(44)

• A transform is the conversion into a different representation
• Transforms are reversible and information preserving
• E.g.: a straight line can be described by ...
– Two points
– A point and the gradient

3.3 Transforms

(45)

• For images: Fourier transform

• Idea: gain information by transforming to another representation ("see other things")

• Goal: each new data item should contain information regarding the entire image

3.3 Transforms

(46)

• Algebra: for any set of n points $(x_0, y_0), \dots, (x_{n-1}, y_{n-1})$ in $\mathbb{R}^2$ with pairwise distinct $x_i$ there is a polynomial of degree at most n−1 that passes through all of these points
• For example, a set of points can be represented as ...
– coordinates $(x_0, y_0), \dots, (x_{n-1}, y_{n-1})$, or
– a polynomial given by $x_0, x_1, \dots, x_{n-1}$ and $a_0, a_1, \dots, a_{n-1}$ with $y_i = \sum_{j=0}^{n-1} a_j x_i^j$

3.3 Example: Polynomial interpolation

(47)

• An image is a discrete function which assigns each pixel ( x , y ) with an intensity I(x, y)

– In the case of color images, an intensity is assigned to each color channel (RGB, HSV, ...)

– Therefore, each row of an image can be interpreted as a sequence of real numbers

– As seen, these rows can be transformed into the polynomial-coefficient representation
– Unfortunately, this representation is not suitable for texture description
• Textures exhibit periodic grey-value variation, but polynomials are not periodic

3.3 Image as a Signal

(48)

• Jean Baptiste Joseph Fourier (1768-1830):
any periodic function or periodic signal can be decomposed into a (possibly infinite) sum of simple oscillating functions, namely sines and cosines.
• Well suited to periodic patterns such as textures

3.3 Discrete Fourier Transform

(49)

• One-dimensional signal in the time domain
– Signal as a series of real numbers $y_0, \dots, y_{n-1}$
– According to Fourier, this signal can be decomposed into a series of sine functions

3.3 DFT

[Figure: signal values $y_0, \dots, y_i, \dots, y_{n-1}$ plotted over time]

(50)

• One-dimensional signal
– Start with a sinusoidal function with the lowest frequency
– Add more sines with higher frequencies

3.3 DFT

…

(51)

• Sum of Fourier series:

• The lower frequencies contain most of the information

3.3 DFT

[Figure: the signal in the time domain (value $f_i$ over time) and in the frequency domain (amplitude $c_j$ over frequency)]

(52)

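• More formally, Fourier's result says:
"Every sequence $y_0, y_1, \dots, y_{n-1}$ of real numbers can be transformed into a sequence of coefficients $a_0, a_1, \dots, a_{\lfloor n/2 \rfloor}, b_0, b_1, \dots, b_{\lfloor n/2 \rfloor}$ with
$y_k = \sum_{j=0}^{\lfloor n/2 \rfloor} \left( a_j \cos\frac{2\pi j k}{n} + b_j \sin\frac{2\pi j k}{n} \right)$
for k = 0, …, n−1"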

3.3 DFT

(53)

• $a_0, \dots, a_{\lfloor n/2 \rfloor}, b_0, \dots, b_{\lfloor n/2 \rfloor}$ are the amplitudes of the cosine and sine waves
– They are calculated by projecting the signal onto the corresponding cosine or sine wave

– $a_k = \sum_{j=0}^{n/2} f(j) \cdot \cos(k \cdot j)$
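The projection can be tried out with a short Python sketch. The normalization is an assumption (the standard real-valued DFT convention with angular frequency 2πj/n, factor 1/n for the constant and Nyquist terms and 2/n otherwise), chosen so that summing the weighted cosines and sines returns the original signal exactly. The function names are hypothetical.

```python
import math


def real_dft(y):
    """Cosine amplitudes a_j and sine amplitudes b_j for j = 0..floor(n/2)."""
    n = len(y)
    a, b = [], []
    for j in range(n // 2 + 1):
        # The constant term and, for even n, the Nyquist term appear only once.
        once = j == 0 or (n % 2 == 0 and j == n // 2)
        scale = (1.0 if once else 2.0) / n
        a.append(scale * sum(y[k] * math.cos(2 * math.pi * j * k / n)
                             for k in range(n)))
        b.append(scale * sum(y[k] * math.sin(2 * math.pi * j * k / n)
                             for k in range(n)))
    return a, b


def synthesize(a, b, n):
    """Rebuild the signal as a sum of the weighted cosine and sine waves."""
    return [sum(a[j] * math.cos(2 * math.pi * j * k / n)
                + b[j] * math.sin(2 * math.pi * j * k / n)
                for j in range(len(a)))
            for k in range(n)]


signal = [1, 0, -3, 2, 1, 0, 1, 2]
a, b = real_dft(signal)
rebuilt = synthesize(a, b, len(signal))
```

The coefficient `a[0]` is the mean of the signal, and `rebuilt` matches `signal` up to floating-point error, which illustrates that the transform preserves information.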

3.3 DFT

(54)

• The discrete Fourier transform can also be generalized to two-dimensional data like images
• Real space: pixel grey values as $y_0, y_1, \dots, y_{n-1}$
• Frequency space: representation as $a_0, a_1, \dots, a_{\lfloor n/2 \rfloor}, b_0, b_1, \dots, b_{\lfloor n/2 \rfloor}$

3.3 DFT

(55)

• For each i=0, ..., w-1 and each j=0, ..., h-1 there is an oscillation of the form shown below.

Parameters i and j indicate the direction and wavelength of the oscillation

• w: width of the image

• h: height of the image

• f(x, y): intensity of pixel (x, y)

3.3 2D DFT (formal)

(56)

• Coefficients A(i, j) and B(i, j) indicate the

amplitude of the corresponding cosine and sine waves of the (i, j) parameter pair

• They can be calculated as follows:

3.3 2D DFT (formal)

(57)

• Comparing images in the frequency domain
– A single image is used instead of the two matrices A and B
• It is determined as follows:
– The value at position (i, j) is the length of the vector (A(i, j), B(i, j))
– This length indicates how strongly the corresponding frequency is contained in the signal
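A direct (slow) Python sketch of this frequency image. The phase 2π(i·x/w + j·y/h) and the division by w·h are assumed conventions, not taken from the slides, and the function name `dft2_magnitude` is hypothetical.

```python
import math


def dft2_magnitude(f):
    """Length of (A(i, j), B(i, j)) for every frequency pair, by direct summation."""
    h, w = len(f), len(f[0])
    spectrum = [[0.0] * w for _ in range(h)]
    for j in range(h):
        for i in range(w):
            a = b = 0.0
            for y in range(h):
                for x in range(w):
                    phase = 2 * math.pi * (i * x / w + j * y / h)
                    a += f[y][x] * math.cos(phase)  # cosine amplitude A(i, j)
                    b += f[y][x] * math.sin(phase)  # sine amplitude B(i, j)
            spectrum[j][i] = math.hypot(a, b) / (w * h)
    return spectrum


# A horizontal cosine wave with two periods across an 8 x 8 image.
wave = [[math.cos(2 * math.pi * 2 * x / 8) for x in range(8)] for _ in range(8)]
spectrum = dft2_magnitude(wave)
```

For the cosine wave, the spectrum is zero except at the horizontal frequency i = 2 and its mirror image i = 6, which shows the symmetry of the frequency image.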

3.3 2D DFT (formal)

(58)

• Properties
– Centered on the fundamental frequency
– Symmetric about the origin
– Harmonics
– Main oscillation
– Amplitude (strength)
– Size of the period

3.3 Frequency Space

(59)

• Example: transformation from real space to frequency space

3.3 Frequency Space

(60)

• Horizontal frequencies are plotted horizontally in the frequency image

• Vertical frequencies are plotted vertically

3.3 Frequency Space

(61)

• Computation is usually done with Fast Fourier Transform (FFT) algorithms
– An efficient class of algorithms for computing the DFT
– The naive DFT has complexity O(N²)
– FFT implementations

3.3 FFT

(62)

• Cooley-Tukey algorithm
– Based on the divide-and-conquer paradigm
– Reduces the complexity of calculating the DFT to O(N · log₂ N)

• FFT in Matlab

3.3 FFT

(63)

• Computing the FFT in Matlab is easy
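As a language-neutral illustration of the divide-and-conquer idea, a radix-2 Cooley-Tukey FFT can be sketched in a few lines of Python. This is a teaching sketch under the assumption that the signal length is a power of two. It uses the complex exponential form of the DFT and is checked against a naive O(N²) implementation.

```python
import cmath


def fft(signal):
    """Recursive radix-2 Cooley-Tukey FFT (length must be a power of two)."""
    n = len(signal)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    if n == 1:
        return [complex(signal[0])]
    even = fft(signal[0::2])  # DFT of the even-indexed samples
    odd = fft(signal[1::2])   # DFT of the odd-indexed samples
    result = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        result[k] = even[k] + twiddle
        result[k + n // 2] = even[k] - twiddle
    return result


def naive_dft(signal):
    """Direct O(N^2) evaluation of the same transform, for comparison."""
    n = len(signal)
    return [sum(signal[t] * cmath.exp(-2j * cmath.pi * k * t / n)
                for t in range(n))
            for k in range(n)]


samples = [1, 0, -3, 2, 1, 0, 1, 2]
fast = fft(samples)
slow = naive_dft(samples)
```

Both functions return the same coefficients up to rounding. The recursion halves the problem at every level, which is where the log₂ N factor comes from.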

3.3 FFT

(64)

• Typical transforms used in texture analysis are:
– Discrete Fourier Transform (DFT)
– Discrete Cosine Transform (DCT)
– Wavelet Transform (WT)

3.3 Transform Domain Features

(65)

• Analogous to the DFT, but uses only cosine functions
• E.g., used in the encoding of JPEG images for compression purposes
• For both the DFT and the DCT, the power spectrum, i.e., the coefficients, is used for the comparison

3.3 Discrete Cosine Transform

(66)

• Approximation of the intensity function through a different class of functions
– The Fourier transform and its spectrum show which frequencies are contained, but not when (or where in the image) they occur
– Wavelet transforms try to build the function from a local basis function (mother wavelet) at different resolutions and shifts
– Wavelets are thus local in frequency (through scaling) and in time (through shifts)

3.3 Wavelet Transform

(67)

• The function classes are locally integrable functions with integral = 0

3.3 Wavelet Transform

(68)

• Given a wavelet Ψ(x), we can generate a basis B through appropriate shifting and scaling
• Usually one considers special values for the wavelet basis: $a = 2^{-j}$ and $b = k \cdot 2^{-j}$ for integers j and k

3.3 Wavelet Transform

(69)

• The simplest example: the Haar wavelet
– 1 on [0, ½) and –1 on [½, 1)
– The corresponding functions form an orthogonal basis of $L^2(\mathbb{R})$ (square-integrable functions)
– Orthonormal after scaling by the factor $2^{j/2}$

3.3 Wavelet Transform

Ψ_{0, 0}(x)

(mother wavelet)

(70)

• Baby wavelets: $\Psi_{j,k}(x)$

3.3 Wavelet Transform

[Figure: $\Psi_{1,0}(x)$, $\Psi_{1,1}(x)$ and $\Psi_{2,0}(x)$, $\Psi_{2,1}(x)$, $\Psi_{2,2}(x)$, $\Psi_{2,3}(x)$]

• Scaled by a factor of $2^j$, shifted by $k \cdot 2^{-j}$
• The smaller the scale, the more shifts (exponential)

(71)

• The basis can also be represented with the help of a scaling function $\varphi_{0,0}$
• For Haar wavelets, the scaling function is the characteristic function of the interval [0, 1)
• Each data set of cardinality $2^n$, $y = \{y_0, \dots, y_{2^n - 1}\}$, can then be represented on [0, 1) by a piecewise constant function (a step function)

3.3 Wavelet Transform

(72)

• Since step functions are finite, they can be

expressed through the scaling function and Haar wavelet

3.3 Wavelet Transform

(73)

• Example: Describe the step function given by y = (1, 0, -3, 2, 1, 0, 1, 2)

3.3 Wavelet Transform

• Resolution: e.g., j = 0, 1, 2
• Basis with orthonormalization factors $2^{j/2} = \{1, 2^{1/2}, 2\}$ for $\{d_{0,k}, d_{1,k}, d_{2,k}\}$

(74)

• The solution of the system delivers the coefficients for each wavelet
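For the example y = (1, 0, -3, 2, 1, 0, 1, 2), the coefficients can also be computed in Python. The sketch below does not solve the linear system. It assumes the orthonormal Haar basis $\Psi_{j,k}(x) = 2^{j/2} \Psi(2^j x - k)$ on [0, 1) and obtains each coefficient as an inner product with the step function, which gives the same result for an orthonormal basis. The function names and the dictionary layout are hypothetical.

```python
def haar_decompose(y):
    """Return (c, d): scaling coefficient c and wavelet coefficients d[(j, k)]."""
    n = len(y)            # assumed to be a power of two
    width = 1.0 / n       # length of one step of the step function
    c = sum(y) * width    # inner product with the scaling function
    d = {}
    j = 0
    while 2 ** j < n:
        pieces = n // 2 ** j  # number of steps under the support of psi_{j,k}
        half = pieces // 2
        for k in range(2 ** j):
            start = k * pieces
            plus = sum(y[start:start + half])             # where psi is +1
            minus = sum(y[start + half:start + pieces])   # where psi is -1
            d[(j, k)] = 2 ** (j / 2) * (plus - minus) * width
        j += 1
    return c, d


def haar_reconstruct(c, d, n):
    """Evaluate c * phi + sum of d[j, k] * psi_{j,k} at the midpoint of each step."""
    values = []
    for i in range(n):
        x = (i + 0.5) / n
        total = c
        for (j, k), coefficient in d.items():
            t = 2 ** j * x - k
            if 0 <= t < 0.5:
                total += coefficient * 2 ** (j / 2)
            elif 0.5 <= t < 1:
                total -= coefficient * 2 ** (j / 2)
        values.append(total)
    return values


y = [1, 0, -3, 2, 1, 0, 1, 2]
c, d = haar_decompose(y)
rebuilt = haar_reconstruct(c, d, len(y))
```

Here `c` is the mean of the data (0.5), `d[(0, 0)]` compares the two halves of the signal (-0.5), and `rebuilt` reproduces y, which corresponds to the test step of the example.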

3.3 Wavelet Transform

Scaling function

Mother Wavelet

Baby-Wavelets (j= 1)

Baby-Wavelets (j= 2)

(75)

• Solution

• Obtained function

• Test

3.3 Wavelet Transform

(76)

• Texture-Based Image Retrieval
– Low-level features
  • Tamura measure, random-field model
– High-level features
  • Fourier transform, wavelets

3. Summary

(77)

• Texture Analysis

–