(1)

Multimedia Databases

Wolf-Tilo Balke Silviu Homoceanu

Institut für Informationssysteme

Technische Universität Braunschweig

http://www.ifis.cs.tu-bs.de

(2)

3 Using Texture for Image Retrieval

3.1 Textures, Basics
3.2 Low-Level Features
3.3 High-Level Features

(3)

3.1 Texture Analysis

• Textures describe the nature of typical, recurrent patterns in pictures
• Important for the description of images
  – Type of representation (raster image, etc.)
  – Image objects
    • Natural things: grass, gravel, etc.
    • Artificial things: stone walls, wallpaper, etc.

(4)

3.1 Example

• Various ordered and random textures

(5)

3.1 Texture Research

• Texture segmentation
  – Find areas of the image (decomposition) with homogeneous textures
• Texture classification
  – Describe and denote homogeneous textures in image regions
• Texture synthesis
  – Create textures for increased realism in images (texture mapping, etc.)

(6)

3.1 Texture Segmentation

• Find image regions with a certain texture
  – Here: wine grapes
  – Used for scene decomposition

(7)

3.1 Texture Segmentation

• Colors and texture are often related
  – Texture decomposition alone often provides no meaningful (semantically related) areas

(8)

3.1 Texture Classification

• Extract the (segmented) regions with a certain predominant texture
  – Medical images: tomography, etc.
  – Satellite images: ice, water, etc.
  – ...
• Describe the corresponding texture with appropriate features, suitable for comparisons in similarity search queries

(9)

3.1 Texture Classification

• Classification can be semantic
  – Textures represent objects in the real world
  – Strongly dependent on the application
• Or based on purely descriptive characteristics
  – Usually has no direct significance for people
  – Ensures comparability between different image collections
  – Query-by-Example

(10)

3.1 Texture Classification

• Example: satellite image (semantic classification)
  [Image regions labeled: sand, water]

(11)

3.1 Texture Features

• How to describe textures for similarity measures?
  – Low-level features use basic building blocks (e.g., Julesz' textons), the Tamura measure, etc.
  – High-level features use Gabor filters, Fourier transforms, etc.

(12)

3.1 Texture Features

• How do people distinguish textures?

(13)

3.1 Low-Level Texture Features

• (Rao / Lohse, 1993) give three main criteria:
  – Repetition
  – Orientation
  – Complexity
• Is this measurable?
  – Grey-level features
  – The Tamura measure
  – Random field models

(14)

3.2 Grey Level Features

• In the 1960s and 1970s, investigating texture meant mainly grey-level analysis
  – Grey value histograms provide information on pixel intensity
  – Allows comparison using statistical measures like expected value, standard deviation, etc.
  – Idea: similar patterns produce similar distributions of grey values
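A minimal sketch of such first-order statistics (illustrative, not from the original slides), assuming an 8-bit greyscale image given as a NumPy array:

    import numpy as np

    def grey_histogram_features(img):
        """First-order statistics of the grey-value histogram."""
        hist, _ = np.histogram(img, bins=256, range=(0, 256))
        p = hist / hist.sum()              # empirical grey-value distribution
        levels = np.arange(256)
        mean = (levels * p).sum()          # expected value
        std = np.sqrt((((levels - mean) ** 2) * p).sum())
        return mean, std

Two images with similar textures should then yield similar (mean, std) pairs, matching the idea above.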

(15)

3.2 Grey Level Features

• Moments of the first order do not consider the position of the pixels
  – Periodicity is poorly detectable

(16)

3.2 Grey Level Features

• Solution: grey-level co-occurrence
  – Pixel at position s has intensity q: I(s) = q
  – (Julesz, 1961): calculate the empirical probability distribution for an intensity change by the value m at a pixel shift of d pixels to the right:
    P_d(m) = P[ I(s + (d, 0)) − I(s) = m ]
• (Julesz, 1975): generalization to shifts in any direction
  – As a two-dimensional distribution function (for every d), use the grey-level co-occurrence matrix

(17)

3.2 Grey Level Features

• Grey-level co-occurrence matrix
  – Consider all pixel pairs (x₁, y₁), (x₂, y₂) with Euclidean distance d, and assume point (x₁, y₁) has grey value i and point (x₂, y₂) has grey value j, for i, j ∈ {1, …, N}
  – Define C_d = [c_d(i, j)] as the grey-level co-occurrence matrix, where c_d(i, j) is the number of pixel pairs which have distance d and intensities i and j, respectively
  – Problem: rather complicated to calculate all (N × N) matrices for different distances d
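A sketch of the counting step for a single displacement vector (the any-direction variant sums such counts over all offsets at distance d); it assumes integer grey values in {0, …, levels−1} and non-negative offsets:

    import numpy as np

    def glcm(img, dx=1, dy=0, levels=256):
        """Count pixel pairs (s, s + (dy, dx)) with intensities (i, j)."""
        C = np.zeros((levels, levels), dtype=np.int64)
        h, w = img.shape
        a = img[:h - dy, :w - dx]                  # reference pixels
        b = img[dy:, dx:]                          # shifted partners
        np.add.at(C, (a.ravel(), b.ravel()), 1)    # C[i, j] += 1 per pair
        return C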

(18)

3.2 Grey Level Features

• Many measures for texture recognition were derived from these grey-level co-occurrence matrices
  – Thesis of Julesz (Julesz and others, 1973): people cannot distinguish textures if they have identical grey-level co-occurrence matrices
  – Perception psychology: nope… but still useful as a rule of thumb! (Julesz, 1981)
  [Photo: Béla Julesz]

(19)

3.2 The Tamura Measure

• In the Tamura measure (Tamura and others, 1978), image textures are evaluated along six different dimensions
  – Granularity (coarseness): gravel vs. sand
  – Contrast: clear-cut shapes, shadows
  – Directionality: predominant directions
  – Line-likeness
  – Regularity
  – Roughness
• The last three properties are rarely used and appear to be correlated to the others

(20)

3.2 Granularity

• Granularity (coarseness)
  – Image resolution: e.g., aerial photographs from different heights

(21)

3.2 Granularity Extraction

• Examine the neighborhood of each pixel for brightness changes
  – Lay over each pixel a window of size 2^i × 2^i (e.g., 1 × 1 to 32 × 32 in IBM's QBIC)
  – Determine for each i and each pixel the average grey level in the corresponding window

(22)

3.2 Granularity Extraction

• Compute δ_i = max(δ_i^h, δ_i^v) for each pixel
  – δ_i^h is the difference of the means of grey levels belonging to the left and right horizontally adjacent windows (of size 2^i × 2^i)
  – δ_i^v analogous, between the vertically adjacent windows
• Determine for each pixel the maximum window size 2^j × 2^j whose δ_j has the maximum difference (or which lies within a certain tolerance from the maximum of δ_i)

(23)

3.2 Granularity Extraction

• The granularity of the entire image is the mean of the maximum window sizes of all pixels
• Alternatively, a histogram which maps the number of pixels corresponding to each window size can be used
  – This allows for better comparison between images containing different granularities
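A minimal sketch of this scheme (illustrative; it assumes a greyscale NumPy array, uses wrap-around borders via np.roll for brevity, and compares windows whose centers lie 2^(i−1) pixels to either side of the pixel):

    import numpy as np
    from scipy.ndimage import uniform_filter

    def coarseness(img, max_i=5):
        """Tamura coarseness: mean best window size 2^i over all pixels."""
        g = img.astype(np.float64)
        deltas = []
        for i in range(1, max_i + 1):
            k = 2 ** i
            avg = uniform_filter(g, size=k)        # mean over 2^i x 2^i windows
            s = k // 2                             # adjacent window centers
            dh = np.abs(np.roll(avg, -s, axis=1) - np.roll(avg, s, axis=1))
            dv = np.abs(np.roll(avg, -s, axis=0) - np.roll(avg, s, axis=0))
            deltas.append(np.maximum(dh, dv))      # delta_i per pixel
        best = np.argmax(np.stack(deltas), axis=0) + 1   # best scale i per pixel
        return float(np.mean(2.0 ** best))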

(24)

3.2 Granularity Extraction

• Problem: image selections whose granularity needs to be determined may be too small to calculate meaningful averages in large operator windows
  – Small image sections would therefore always have small granularity
  – Solution: estimation of a maximum δ_i from the smaller values (Equitz / Niblack, 1994)

(25)

3.2 Contrast

• Contrast evaluates the clarity of an image
  – Sharpness of the color transitions
  – Exposure, shadows
  [Example images: low contrast vs. high contrast]

(26)

3.2 Contrast Extraction

• Extraction of the contrast values
  – Consider higher moments of the distribution of the grey-level histogram
  – The contrast is F_con = σ / (α₄)^(1/4), where σ is the standard deviation of the grey values, α₄ = µ₄ / σ⁴ is the kurtosis, and µ₄ is the fourth central moment
  – Uni- and bi-modal distributions can be differentiated through the use of the kurtosis
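The formula translates directly into code (a sketch; assumes a greyscale NumPy array):

    import numpy as np

    def contrast(img):
        """Tamura contrast: sigma / kurtosis^(1/4)."""
        g = img.astype(np.float64).ravel()
        sigma = g.std()
        mu4 = np.mean((g - g.mean()) ** 4)   # fourth central moment
        alpha4 = mu4 / sigma ** 4            # kurtosis
        return sigma / alpha4 ** 0.25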

(27)

3.2 Directionality

• Directionality
  – Senses predominant directions of elements in the image
  [Example images: highly directional vs. weakly directional]

(28)

3.2 Directionality Extraction

• As a measure, determine the magnitude and direction (angle) of the gradient in each pixel, e.g., with a Sobel edge detector
  – IBM's QBIC uses 16 different directions

(29)

3.2 Directionality Extraction

• Create histograms where each angle is assigned the number of pixels with gradients above a certain threshold
  – A dominant direction in the image is represented by a peak in the histogram
  – If the measure has to be rotation invariant, do not use the location (angle) but the number and amplitude of such peaks for the calculation of the average directionality D
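A sketch of the histogram construction (the Sobel operator is named on the previous slide and the 16 bins follow QBIC; the gradient threshold is an assumed, illustrative free parameter):

    import numpy as np
    from scipy.ndimage import sobel

    def direction_histogram(img, bins=16, threshold=16.0):
        """Histogram of gradient directions over sufficiently strong edges."""
        g = img.astype(np.float64)
        gx = sobel(g, axis=1)            # horizontal derivative
        gy = sobel(g, axis=0)            # vertical derivative
        mag = np.hypot(gx, gy)           # gradient magnitude
        ang = np.arctan2(gy, gx)         # gradient angle in (-pi, pi]
        strong = mag > threshold         # ignore weak gradients
        hist, _ = np.histogram(ang[strong], bins=bins, range=(-np.pi, np.pi))
        return hist / max(hist.sum(), 1) # peaks mark dominant directions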

(30)

3.2 Tamura-Measure Matching

• No correlations were found between the first three Tamura features
  – Similarity can be implemented as a distance measurement in the three-dimensional (granularity, contrast, directionality) feature space
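For instance, as a (possibly weighted) Euclidean distance over the three feature values; the weights here are purely illustrative:

    import numpy as np

    def tamura_distance(f1, f2, weights=(1.0, 1.0, 1.0)):
        """Distance of two (granularity, contrast, directionality) triples."""
        w = np.asarray(weights, dtype=np.float64)
        a = np.asarray(f1, dtype=np.float64)
        b = np.asarray(f2, dtype=np.float64)
        return float(np.sqrt(np.sum(w * (a - b) ** 2)))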

(31)

3.2 Example

• Pattern (fur) on heraldic images

(32)

3.2 Using Stochastic Models

• Random-field models
  – Observation: textures are periodically repeated
  – Generating textures requires stochastic models
    • Provides the ability to predict the brightness of a pixel in some image sample
    • Includes the probability that a pixel has a certain brightness value
  – A fixed model creates different, but still very similar, textures

(33)

3.2 Random-Field Models

• The same trick (reversed) can be used for texture description and matching, too
  – Which model (parameters) generates the presented textures best?
  – Assuming a fixed model has created all the textures in the collection, the parameters of the model serve as descriptive features for each image

(34)

3.2 Example

• Assume the same model X has generated all textures in images of a collection
• Given some pixel and its surroundings: what is the expected intensity value?
  – Obviously different for different images; therefore, model X has different parameters for each image
  [Example textures: shaded vs. irregular]

(35)

3.2 Random Field Approach

• Each image can be seen as an observation
  – Described by a matrix F, where values in the matrix correspond to pixel intensities
• Probabilistic model
  – Matrix F is a random variable (called a random field)
  – The basic distribution of class F is known, but the specific parameters of the distribution are not
  – Question: we have an image and assume that it is a realization of F. What are the parameters of the corresponding distribution of F?
• Idea: perform a maximum likelihood estimation and describe each image by the estimated parameters

(36)

3.2 Exploiting Locality

• What does the expected intensity of a pixel depend on?
  – For "sufficiently regular" textures, the following locality statement is valid: "If the neighbors to the left and right are white and the up and down neighbors are black, then the pixel under the red square is with a high probability also white"

(37)

3.2 Markov Property

• We can usually assume that the value of a pixel s does not depend on the values of all pixels in the image, but on the pixels in the neighborhood of s
  – This is called the Markov property
• Formalization
  – Let F(r) be the (random) value of pixel r and N_s the set of pixels in the neighborhood of pixel s
  – For each pixel s (and grey values k and k_r for all r ≠ s):
    P[F(s) = k | for all r ≠ s: pixel r has value k_r]
    = P[F(s) = k | for all r ∈ N_s: pixel r has value k_r]
  – Thus, the probabilities of all values of pixel s depend only on its neighborhood N_s

(38)

3.2 Choosing Texture Models

• Now, a model must be defined which best reproduces the observed distribution
  – There are many classes of texture models
• We must fix a common model for each collection and then calculate the best parameters for each image of the collection
  – Simplification: the neighborhood N_s is defined by a set N of shifts: N_s = {s + t | t ∈ N}
  – Generalization: N := {(0, 1), (1, 0), (0, –1), (–1, 0), (1, 1), (1, –1), (–1, –1), (–1, 1)}

(39)

3.2 Choosing Texture Models

• A popular class of models for texture description is the Simultaneous AutoRegressive model (SAR):
  F(s) = Σ_{t ∈ N} θ(t) · F(s + t) + β · W(s)
  – F(s) is the intensity value of pixel s
  – W(s) is a special random variable reflecting white noise with mean 0 and variance 1
  – θ(t) and β are characteristic parameters and are used as features for later matching
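A least-squares sketch of fitting these parameters per image (equivalent to maximum likelihood under the Gaussian noise assumption; assumes a greyscale NumPy array, zero-mean intensities, and the 8-shift neighborhood N from the previous slide):

    import numpy as np

    EIGHT_SHIFTS = ((0, 1), (1, 0), (0, -1), (-1, 0),
                    (1, 1), (1, -1), (-1, -1), (-1, 1))

    def fit_sar(img, shifts=EIGHT_SHIFTS):
        """Estimate theta(t), beta of F(s) = sum_t theta(t)F(s+t) + beta W(s)."""
        g = img.astype(np.float64)
        g -= g.mean()                          # zero-mean intensities
        h, w = g.shape
        center = g[1:-1, 1:-1].ravel()         # pixels with a full neighborhood
        X = np.stack([g[1 + dy:h - 1 + dy, 1 + dx:w - 1 + dx].ravel()
                      for dy, dx in shifts], axis=1)
        theta, *_ = np.linalg.lstsq(X, center, rcond=None)
        beta = (center - X @ theta).std()      # scale of the white noise
        return theta, beta                     # feature vector for matching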

(40)

3.2 Choosing Texture Models

• Still, there is a problem: the best size of the neighborhood of a pixel is different for different periodicities of textures
  – Unfortunately, the solution is not trivial
  – One possibility are multi-resolution simultaneous autoregressive models (Mao and Jain, 1992)

(41)

3.2 Random Field Models

• Random-field models provide a good low-dimensional description of textures
• Assumptions
  – The Markov condition is valid, i.e., the intensity of each pixel is described with sufficient accuracy by its neighborhood
  – The size of the neighborhood has been well chosen for the collection

(42)

3.3 Transform Domain Features

• Transform domain features
  – In the case of low-level features, descriptors are chosen for certain aspects such as the coarseness or contrast
  – High-level features describe the complete picture in a different domain (no loss of information)
  – Basically, the image is interpreted as a signal and transformed mathematically

(43)

3.3 Transform Domain Features

• Well known for images: the Fourier transform
• Idea: by transforming to another representation, gain information ("see other things")
  [Illustration: local (image) space vs. frequency space]

(44)

3.3 Transform Domain Features

• Typical features used in texture analysis are
  – Discrete Fourier Transform (DFT)
  – Discrete Cosine Transform (DCT)
  – Wavelet Transform (WT)

(45)

3.3 Transforms

• A transform is the conversion of a mathematical object into a different representation
  – Transforms are reversible and information preserving
• E.g., a straight line can be described by...
  – Two arbitrary points
  – A point and the gradient
• Both give the same information

(46)

3.3 Example: Polynomial Interpolation

• A more general result from algebra:
  – For any set of n points (x_0, y_0), …, (x_{n-1}, y_{n-1}) in ℝ² with distinct x-values, there is exactly one polynomial of degree at most n−1 which passes through all of these points
• This polynomial can thus be represented as...
  – The set of points (x_0, y_0), …, (x_{n-1}, y_{n-1})
  – The equation of the polynomial: y_k = a_0 + a_1·x_k + … + a_{n-1}·x_k^{n-1} for all k = 0, …, n−1, with suitable coefficients a_0, a_1, …, a_{n-1}

(47)

3.3 Example: Polynomial Interpolation

• Special case: x_0 := 0, x_1 := 1, …, x_{n-1} := n−1
  – That means an equidistant sampling over the x-axis
  – Exactly the case when reading intensity values of some image row by row
• Then, any sequence of real numbers y_0, y_1, …, y_{n-1} can be transformed into a sequence of coefficients a_0, a_1, …, a_{n-1} where
  y_k = a_0 + a_1·k + a_2·k² + … + a_{n-1}·k^{n-1}
  is valid for all k = 0, …, n−1

(48)

3.3 Images as Signals

• Idea: an image is a discrete function which assigns each pixel (x, y) an intensity I(x, y)
  – In the case of color images, an intensity is assigned to each color channel (RGB, HSV, ...)
  – Therefore, each row of an image can be interpreted as a sequence of real numbers
• As seen, these rows can be transformed into the polynomial coefficient representation
  – This representation is not suitable for texture description: although textures exhibit periodic grey value variation, polynomials are not periodic

(49)

3.3 Discrete Fourier Transform

• Solution by Jean-Baptiste Joseph Fourier (1768-1830):
  "Every sequence y_0, y_1, …, y_{n-1} of real numbers can be transformed into a sequence of coefficients a_0, a_1, …, a_{n/2}, b_0, b_1, …, b_{n/2} with
  y_k = Σ_{i=0}^{n/2} ( a_i·cos(2π·i·k/n) + b_i·sin(2π·i·k/n) )   for k = 0, …, n−1"
  – This sequence can also be described by the overlap of harmonic oscillations
  – The coefficients are typical for periodic patterns
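The coefficients can be obtained with any DFT routine; a sketch mapping NumPy's complex FFT output to the a_i, b_i above (the sign and scaling bookkeeping follows NumPy's conventions; the input sequence is arbitrary):

    import numpy as np

    y = np.array([1.0, 0.0, -3.0, 2.0, 1.0, 0.0, 1.0, 2.0])  # any real sequence
    n = len(y)
    c = np.fft.rfft(y)        # complex coefficients for frequencies 0..n/2
    a = 2 * c.real / n        # cosine amplitudes a_0 .. a_{n/2}
    b = -2 * c.imag / n       # sine amplitudes   b_0 .. b_{n/2}
    a[0] /= 2                 # DC term is not doubled
    a[-1] /= 2                # Nyquist term (n even) is not doubled

    # check: y_k = sum_i a_i cos(2 pi i k / n) + b_i sin(2 pi i k / n)
    k = np.arange(n)
    i = np.arange(n // 2 + 1)[:, None]
    y_rec = (a[:, None] * np.cos(2 * np.pi * i * k / n)
             + b[:, None] * np.sin(2 * np.pi * i * k / n)).sum(axis=0)
    assert np.allclose(y, y_rec)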

(50)

3.3 DFT

• Real space: representation as y_0, y_1, …, y_{n-1}
• Frequency space: representation as a_0, a_1, …, a_{n/2}, b_0, b_1, …, b_{n/2}
• The discrete Fourier transform can also be generalized to two-dimensional data in real space (e.g., pixel coordinates)
  – We then have sine and cosine waves in the frequency domain, each with a direction

(51)

3.3 Two-Dimensional DFT (formal)

• For each i = 0, …, w−1 and each j = 0, …, h−1 there is an oscillation of the form cos(2π(i·x/w + j·y/h)) resp. sin(2π(i·x/w + j·y/h)); the parameters i and j indicate the direction and wavelength of the oscillation
  – w: width of the image
  – h: height of the image
  – f(x, y): intensity of pixel (x, y)

(52)

3.3 Two-Dimensional DFT (formal)

• Coefficients A(i, j) and B(i, j) indicate the amplitude of the corresponding cosine and sine waves of the (i, j) parameter pair
• They can be calculated (up to normalization) as:
  A(i, j) = Σ_{x=0}^{w−1} Σ_{y=0}^{h−1} f(x, y) · cos(2π(i·x/w + j·y/h))
  B(i, j) = Σ_{x=0}^{w−1} Σ_{y=0}^{h−1} f(x, y) · sin(2π(i·x/w + j·y/h))

(53)

3.3 Two-Dimensional DFT

• For a graphic illustration of the amplitudes, a picture can be used instead of the two matrices A and B
  – The frequency space picture is derived as follows: the value at position (i, j) represents the length of the vector (A(i, j), B(i, j))
• This length defines the Fourier spectrum
  – i.e., how much of which frequency is contained in the signal

(54)

3.3 Frequency Space

• Horizontal frequencies are plotted horizontally in the frequency image
  – Vertical frequencies are plotted vertically

(55)

3.3 Frequency Space

• High/low periodicity in the real world is also reflected in frequency space

(56)

3.3 Frequency Space

• Properties
  – Symmetric towards the origin
  – Harmonics
  – Main oscillation
  – Brightness reflects amplitude (strength)
  – Size of periodicity

(57)

3.3 FFT

• How to compute DFTs?
  – The Fast Fourier Transform (FFT) is an efficient algorithm class for computing the DFT
  – Direct evaluation of the DFT has complexity O(N²)
  – FFT implementations: Cooley-Tukey algorithm, Prime Factor FFT, Bruun's FFT, …

(58)

3.3 FFT

• Cooley-Tukey algorithm
  – Based on the divide-and-conquer paradigm
  – Recursively expresses N = N₁ · N₂
  – Reduces the complexity of calculating the DFT to O(N · log N)

(59)

3.3 FFT

• FFT in Matlab is easy to compute
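The Matlab listing on the original slide is not reproduced here; an equivalent minimal sketch in Python/NumPy, computing the centered, log-scaled Fourier spectrum of a greyscale image:

    import numpy as np

    def fourier_spectrum(img):
        """Log magnitude spectrum, centered (Matlab: fftshift(abs(fft2(img))))."""
        F = np.fft.fft2(img.astype(np.float64))   # 2D FFT (Cooley-Tukey based)
        F = np.fft.fftshift(F)                    # zero frequency to the center
        return np.log1p(np.abs(F))                # compress range for display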

(60)

3.3 Examples

• Remove noise by masking pixels
• Lossy image compression
  [Example images © John M. Brayer, University of New Mexico]

(61)

3.3 Discrete Cosine Transform

• The Discrete Cosine Transform (DCT) works analogously to the DFT, only using cosine functions
  – E.g., used in the encoding of JPEG images for compression purposes
• In the case of DFT and DCT, the power spectrum (i.e., the coefficients) is used for comparisons
• Little problem: the spectrum produced by the Fourier transform shows all contained frequencies, but not when (or where in the image) they occur

(62)

3.3 Wavelet Transform

• Wavelet transforms approximate the intensity function through a different class of functions
  – Approximation of the intensity function using a local base function (mother wavelet) in different resolutions and shifts
  – Wavelets are thus local in frequency (through scaling) and time (through shifts)
  – This solves the locality problem of DFT/DCT

(63)

3.3 Wavelet Transform

• The function classes are locally integrable functions with integral 0

(64)

3.3 Wavelet Transform

• Having some wavelet Ψ(x), we can generate a base B through appropriate shifting and scaling: Ψ_{a,b}(x) = Ψ((x − b) / a)
  – Usually special values for the wavelet basis are considered: a = 2^(−j) and b = k·2^(−j) for integers j and k, giving Ψ_{j,k}(x) = Ψ(2^j·x − k)
  – These values are called "critical sampling"

(65)

3.3 Wavelet Transform

• The simplest example: the Haar wavelet
  – Definition: 1 on [0, ½) and −1 on [½, 1) (0 elsewhere)
  – The corresponding functions form an orthogonal basis in L²(ℝ) (all quadratically integrable functions)
  – Can be made orthonormal by a factor of 2^(j/2)
  [Graph of the mother wavelet Ψ_{0,0}(x)]

(66)

3.3 Wavelet Transform

• Baby wavelets Ψ_{j,k}(x): scaled by a factor of 2^j, shifted by k·2^(−j)
  [Graphs: Ψ_{1,0}(x), Ψ_{1,1}(x); Ψ_{2,0}(x), Ψ_{2,1}(x), Ψ_{2,2}(x), Ψ_{2,3}(x)]
  – The smaller the scale, the more shifts (exponentially)

(67)

3.3 Wavelet Transform

• The base can also be represented using a scaling function φ_{0,0}
• For Haar wavelets, the scaling function is the characteristic function of the interval [0, 1)
  – Each data set of cardinality 2^n, y = {y_0, …, y_{2^n − 1}}, can then be represented on [0, 1) by a piecewise constant function

(68)

3.3 Wavelet Transform

• Our intensity values for image rows are basically discrete step functions
  – Since step functions are finite, they can be expressed through the scaling function and Haar wavelets

(69)

3.3 Wavelet Transform

• Example: describe the step function given by y = (1, 0, −3, 2, 1, 0, 1, 2)
  – Resolutions j = 0, 1, 2
  – Base with orthonormalization factor 2^(0.5·j) ∈ {1, 2^(1/2), 2} for the coefficients {d_{0,k}, d_{1,k}, d_{2,k}}
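A sketch of this decomposition via the standard Haar averaging/differencing cascade (unnormalized coefficients; multiply level j by 2^(0.5·j) for the orthonormalized base above):

    import numpy as np

    def haar_decompose(y):
        """Haar decomposition of a length-2^n sequence:
        returns (overall average, detail coefficients per level)."""
        y = np.asarray(y, dtype=np.float64)
        details = []
        while len(y) > 1:
            avg = (y[0::2] + y[1::2]) / 2   # scaling (low-pass) part
            diff = (y[0::2] - y[1::2]) / 2  # wavelet (high-pass) part
            details.append(diff)            # finest level first
            y = avg
        return y[0], details[::-1]          # coarsest level first

    mean, d = haar_decompose([1, 0, -3, 2, 1, 0, 1, 2])
    # mean = 0.5; d[0] = [-0.5], d[1] = [0.5, -0.5], d[2] = [0.5, -2.5, 0.5, -0.5]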

(70)

3.3 Wavelet Transform

[Graphs: scaling function, mother wavelet, baby wavelets (j = 1), baby wavelets (j = 2)]

• The solution of the equation system determines the coefficients for each wavelet

(71)

3.3 Wavelet Transform

• Solution
• Obtained function
• Test
  [Formulas shown on the original slide]

(72)

Next Lecture

• Texture Analysis
  – Multi-Resolution Analysis
• Retrieval with Shape-based Features
