Digital Signal Processing and Pattern Recognition
Probability Theory
Objective
Fewer classification errors
Better models that depend not only on the mean values of the reference vectors but also on their variances.
Example
Classification of animals as fish or bird
Only one feature: Weight.
Create a model for each class:
Collect some reference birds and reference fish.
Compute average weight in both classes.
Model of a class consists of a single number.
Example
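[Figure: weight axis with the ranges of class fish (plankton to whale) and class bird (hummingbird to albatross), the class means m_fish and m_bird, and a test value x]
x is classified as bird although it is a fish.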
What is the reason for this error?
Example
[Figure: weight axis with the ranges of class fish (plankton to whale) and class bird (hummingbird to albatross), the class means m_fish and m_bird, and a test value x]
In order to classify something as bird, it has to have a much smaller distance to the mean of bird than to the mean of fish.
Weight has a small variance in class bird but a high variance in class fish.
A large deviation from the mean weight is much more likely in class fish than in class bird.
We need a better distance measure which takes this into account!
Example
Classification of vowels and consonants by their signal energy
[Figure: signal energy axis with the ranges of class vowel and class consonant, the class means m_vowel and m_cons, and a test value x]
x is classified as consonant although it is a vowel.
Classes do not overlap, perfect classification should be possible.
Reason: High variance of energy in class vowel, low variance in class consonant.
Difference to mean value is not a sufficient distance measure for classification!
Example
Two-dimensional feature vectors
[Figure: scatter plot of Class A and Class B; axes: Feature 1, Feature 2]
A very mean example
[Figure: scatter plot of Class A and Class B]
Reason for misclassification:
Each feature has the same variance in both classes, but feature 1 has a higher variance than feature 2!
Distance in feature component 2 should be weighted higher than distance in component 1!
Example
Two-dimensional feature vectors: components correlated
[Figure: scatter plot of Class A and Class B; axes: Grade in Math, Grade in Physics]
Example
Two-dimensional feature vectors: class-wise different correlation
[Figure: scatter plot of the classes Universalists and Specialists; axes: Grade in Science, Grade in Language]
Contents
Mahalanobis Distance
Improved Distance Measure based on Normal Distribution Assumption
Random Variables, Probability Density
Random Vectors, Joint Density Function
Application of Probability Theory to Training and Classification
Example
Simple special case:
Length of vector sequence 1
Vector dimension 1
Distance measure for numbers
Reference pattern Class A:
Reference pattern Class B:
Test pattern
Example: More than one reference pattern per class
Reference patterns Class A:
Reference patterns Class B:
Test pattern
Example: More than one reference pattern per class
Reference patterns Class A:
Reference patterns Class B:
Test pattern
[Figure: classification results for the test patterns (labels: Class A, Class B, Class A, ?)]
Distance measure: Absolute or quadratic distance to the class mean
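The classification rule named on this slide (distance to the class mean) can be sketched as follows. This is a minimal illustration for scalar features; the function names and the reference values are made up for the example and are not taken from the slides.

```python
def class_mean(samples):
    """Average of a list of numbers."""
    return sum(samples) / len(samples)


def classify_by_mean(x, references, squared=True):
    """Return the label of the class whose mean is closest to x.

    references maps a class label to its list of reference values.
    """
    means = {label: class_mean(samples) for label, samples in references.items()}

    def distance(label):
        diff = x - means[label]
        return diff * diff if squared else abs(diff)

    return min(means, key=distance)


# Hypothetical reference patterns (illustrative values only).
references = {"A": [38.0, 40.0, 42.0], "B": [60.0, 100.0, 140.0]}
print(classify_by_mean(45.0, references))
print(classify_by_mean(62.0, references))
```

For one-dimensional features the absolute and the quadratic distance always pick the same class, because squaring preserves the ordering of non-negative numbers.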
Empirical mean value, empirical variance (sample mean, sample variance)
Random sample of reference patterns of a class:
Sample Mean
Average over all samples
Sample Variance
Measure for the scattering: Average squared deviation from the sample mean
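The slide formulas did not survive extraction, so the following is only a sketch of the two quantities as described in words. It assumes the variance divides by the number of samples N ("average squared deviation"); the slides may equally have used the N-1 normalization. Function names and data are illustrative.

```python
def sample_mean(samples):
    """Average over all samples."""
    return sum(samples) / len(samples)


def sample_variance(samples):
    """Average squared deviation from the sample mean (divides by N, an assumption)."""
    mean = sample_mean(samples)
    return sum((x - mean) ** 2 for x in samples) / len(samples)


# Hypothetical reference patterns for two classes.
class_a = [38.0, 40.0, 42.0]
class_b = [60.0, 100.0, 140.0]
print(sample_mean(class_a), sample_variance(class_a))
print(sample_mean(class_b), sample_variance(class_b))
```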
Class A: Class B:
Sample variance in class B is much higher than in class A!
Mahalanobis Distance
Distance measure so far: Squared distance to the mean
"Improved" distance measure, which takes the variance into account
Mahalanobis distance:
Motivation: "normalized" distance measure
Distance of x to the mean relative to the average distance to the mean in the class
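The formula itself is missing from the extracted slide. A common one-dimensional form that matches the motivation above is the squared distance to the class mean divided by the class variance; the sketch below assumes that form. The class parameters are hypothetical.

```python
def mahalanobis_1d(x, mean, variance):
    """Squared distance to the mean, normalized by the class variance."""
    return (x - mean) ** 2 / variance


def classify_mahalanobis(x, classes):
    """classes maps a label to (mean, variance); pick the smallest distance."""
    return min(classes, key=lambda label: mahalanobis_1d(x, *classes[label]))


# Hypothetical class models: (mean, variance).
classes = {"A": (40.0, 25.0), "B": (100.0, 900.0)}
for x in (35.0, 62.0):
    print(x, classify_mahalanobis(x, classes))
```

With these made-up parameters, 62 is closer to the mean of A in absolute terms, yet it is assigned to B because B's variance is much larger.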
Classification with Mahalanobis distance
Class A: Class B:
[Table: test values with their distance to class A, their distance to class B, and the resulting decision (Class B, Class B ???, Class B)]
"Misclassification" with Mahalanobis distance!
Classes with a very high variance are preferred too much!
Probability Density and Random Variables
Probability density
[Figure: probability density curves of Class A and Class B]
35 should be classified as A.
62 should be classified as B, although 62 is closer to the mean of class A!
Example: Density function of the normal distribution
The density function depends on only two parameters, mean and variance, which can be estimated empirically from samples.
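As an illustration, the normal density with mean m and variance v is f(x) = exp(-(x - m)^2 / (2v)) / sqrt(2 pi v). A minimal sketch in plain Python (the function name is chosen for this example):

```python
import math


def normal_pdf(x, mean, variance):
    """Density of the normal distribution with the given mean and variance."""
    exponent = -((x - mean) ** 2) / (2.0 * variance)
    return math.exp(exponent) / math.sqrt(2.0 * math.pi * variance)


# The density is highest at the mean and symmetric around it.
print(normal_pdf(0.0, 0.0, 1.0))
print(normal_pdf(1.0, 0.0, 1.0), normal_pdf(-1.0, 0.0, 1.0))
```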
Classification result with normal distribution assumption
Class A: Class B:
[Figure: classification results under the normal distribution assumption (labels: Class B, Class A, Class B, Class ...)]
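To see how the density-based decision differs from the Mahalanobis distance, note that the negative logarithm of a normal density equals half the Mahalanobis distance plus half of log(2 pi v). The extra log-variance term penalizes classes with a large variance. The sketch below compares both rules; the class parameters are hypothetical and equal class priors are assumed.

```python
import math


def mahalanobis_1d(x, mean, variance):
    """Squared distance to the mean divided by the variance."""
    return (x - mean) ** 2 / variance


def neg_log_normal(x, mean, variance):
    """Negative log of the normal density: Mahalanobis term plus log-variance term."""
    return 0.5 * (mahalanobis_1d(x, mean, variance) + math.log(2.0 * math.pi * variance))


def classify(x, classes, score):
    """Pick the class with the smallest score; classes maps label -> (mean, variance)."""
    return min(classes, key=lambda label: score(x, *classes[label]))


# Hypothetical class models: (mean, variance).
classes = {"A": (40.0, 25.0), "B": (100.0, 900.0)}
for x in (35.0, 50.0, 62.0):
    print(x, classify(x, classes, mahalanobis_1d), classify(x, classes, neg_log_normal))
```

With these made-up parameters both rules agree on 35 (class A) and 62 (class B), but 50 is assigned to B by the Mahalanobis distance and to A by the density-based rule, which illustrates how the plain Mahalanobis distance favors the high-variance class.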
Contour lines of f(x,y) if X and Y are independent
Contour lines of f(x,y) if X and Y are dependent
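The two contour plots can be reproduced numerically under an assumption the slides do not state explicitly: that f(x,y) is a bivariate normal density and that the dependence is a linear correlation rho. For independent X and Y the joint density factorizes into the product of the two marginal densities. Names and parameter values below are illustrative.

```python
import math


def normal_pdf(x, mean, std):
    """One-dimensional normal density, parameterized by standard deviation."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def bivariate_normal_pdf(x, y, mean_x, mean_y, std_x, std_y, rho):
    """Bivariate normal density with correlation coefficient rho (|rho| < 1)."""
    zx = (x - mean_x) / std_x
    zy = (y - mean_y) / std_y
    one_minus_rho2 = 1.0 - rho * rho
    exponent = -(zx * zx - 2.0 * rho * zx * zy + zy * zy) / (2.0 * one_minus_rho2)
    norm = 2.0 * math.pi * std_x * std_y * math.sqrt(one_minus_rho2)
    return math.exp(exponent) / norm


# Independent case (rho = 0): joint density equals the product of the marginals.
joint = bivariate_normal_pdf(1.0, -0.5, 0.0, 0.0, 1.0, 2.0, 0.0)
product = normal_pdf(1.0, 0.0, 1.0) * normal_pdf(-0.5, 0.0, 2.0)
print(joint, product)

# Dependent case (rho = 0.8): points along the diagonal are more likely.
print(bivariate_normal_pdf(1.0, 1.0, 0.0, 0.0, 1.0, 1.0, 0.8))
print(bivariate_normal_pdf(1.0, -1.0, 0.0, 0.0, 1.0, 1.0, 0.8))
```

In the independent case the contour lines are axis-aligned ellipses; with positive correlation they are tilted along the diagonal, which is why the point (1, 1) gets a higher density than (1, -1).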
Empirical mean vector, empirical variance vector
Sample of reference vectors from a class:
Empirical mean, empirical variance for each component
Example
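The slide's own example values are not recoverable, so here is a small sketch with made-up vectors. It computes the mean and the variance separately for each component, again assuming the variance divides by the number of samples.

```python
def mean_vector(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(component) / n for component in zip(*vectors)]


def variance_vector(vectors):
    """Component-wise average squared deviation from the mean vector."""
    n = len(vectors)
    means = mean_vector(vectors)
    return [
        sum((value - mean) ** 2 for value in component) / n
        for component, mean in zip(zip(*vectors), means)
    ]


# Hypothetical sample of two-dimensional reference vectors.
vectors = [[1.0, 10.0], [3.0, 10.0], [5.0, 40.0]]
print(mean_vector(vectors))
print(variance_vector(vectors))
```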
Application for Viterbi Training and Classification
Objective: Improved model for the classification of feature vector sequences
Example
Reference patterns of some class (given)
Model for the class (wanted)
[Figure labels: length 6, length 7, length 3]
Linear Segmentation (nothing new)
[Figure: reference sequences divided linearly among the model states]
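One simple way to realize a linear segmentation is to spread the frames of a sequence as evenly as possible over the model states. The sketch below is an assumption about the exact rounding rule, which the slides do not specify. The values 6, 7 and 3 echo the length labels of the earlier example; reading 3 as the number of model states is likewise an assumption.

```python
def linear_segmentation(num_frames, num_states):
    """Assign each frame index a state index, spreading frames evenly over states."""
    if num_frames < num_states:
        raise ValueError("sequence is shorter than the number of model states")
    return [t * num_states // num_frames for t in range(num_frames)]


print(linear_segmentation(6, 3))
print(linear_segmentation(7, 3))
```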
Initial estimation of the model states
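A possible reading of this step: each model state is described by the empirical mean and variance of all frames that the linear segmentation assigns to it. The sketch uses scalar features for brevity; the variance floor `min_variance` is an addition of this example to avoid zero variances, and all data are made up.

```python
def linear_segmentation(num_frames, num_states):
    """Spread the frames of one sequence evenly over the model states."""
    return [t * num_states // num_frames for t in range(num_frames)]


def estimate_states(sequences, segmentations, num_states, min_variance=1e-3):
    """Return a list of (mean, variance), one entry per model state."""
    frames = [[] for _ in range(num_states)]
    for sequence, segmentation in zip(sequences, segmentations):
        for value, state in zip(sequence, segmentation):
            frames[state].append(value)
    model = []
    for assigned in frames:
        mean = sum(assigned) / len(assigned)
        variance = sum((v - mean) ** 2 for v in assigned) / len(assigned)
        model.append((mean, max(variance, min_variance)))  # floor is an assumption
    return model


# Hypothetical reference sequences of length 6 and 7, model with 3 states.
sequences = [
    [1.0, 2.0, 5.0, 6.0, 9.0, 10.0],
    [1.0, 2.0, 3.0, 5.0, 6.0, 9.0, 10.0],
]
segmentations = [linear_segmentation(len(s), 3) for s in sequences]
model = estimate_states(sequences, segmentations, 3)
print(model)
```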
Matching of the reference sequences with the model using Viterbi Algorithm (use normal distribution based distance measure!)
[Figure: reference sequences aligned with the model states by the Viterbi algorithm]
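The matching step can be sketched as a dynamic-programming alignment. The version below makes several assumptions that the slides do not spell out: scalar features, a left-to-right model in which each frame either stays in the current state or advances to the next one, no transition costs, and the negative log of the normal density as the local distance. The model parameters are hypothetical.

```python
import math


def neg_log_normal(x, mean, variance):
    """Local distance: negative log of the normal density."""
    return 0.5 * ((x - mean) ** 2 / variance + math.log(2.0 * math.pi * variance))


def viterbi_align(sequence, model):
    """Align a sequence with a left-to-right model given as a list of (mean, variance).

    Returns (total_cost, path), where path holds one state index per frame.
    The path starts in the first state and ends in the last state.
    """
    num_frames, num_states = len(sequence), len(model)
    if num_frames < num_states:
        raise ValueError("sequence is shorter than the number of model states")
    cost = [[math.inf] * num_states for _ in range(num_frames)]
    back = [[0] * num_states for _ in range(num_frames)]
    cost[0][0] = neg_log_normal(sequence[0], *model[0])
    for t in range(1, num_frames):
        for s in range(num_states):
            # Predecessor: stay in state s, or advance from state s - 1.
            prev = s
            if s > 0 and cost[t - 1][s - 1] < cost[t - 1][s]:
                prev = s - 1
            back[t][s] = prev
            if cost[t - 1][prev] < math.inf:
                cost[t][s] = cost[t - 1][prev] + neg_log_normal(sequence[t], *model[s])
    # Trace the best path back from the last state at the last frame.
    path = [num_states - 1]
    for t in range(num_frames - 1, 0, -1):
        path.append(back[t][path[-1]])
    path.reverse()
    return cost[-1][-1], path


# Hypothetical three-state model and test sequence.
model = [(1.8, 0.56), (5.5, 0.25), (9.5, 0.25)]
total_cost, path = viterbi_align([1.0, 2.0, 2.5, 5.0, 9.0], model)
print(path)
```

The returned path is the new segmentation of the sequence; the total cost can serve as the distance between the sequence and the class model during classification.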
Reestimation of the model states
Iterate:
Matching with new model (Viterbi Algorithm)
Reestimate model using new segmentation
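Putting the steps together, the iteration can be sketched as a compact, self-contained loop. The assumptions are the same as in any simplified version of this procedure: scalar features, a left-to-right model without transition costs, the negative log normal density as local distance, a variance floor, and a stopping rule (segmentation unchanged or an iteration limit) chosen for this example. The data are made up.

```python
import math


def neg_log_normal(x, mean, var):
    return 0.5 * ((x - mean) ** 2 / var + math.log(2.0 * math.pi * var))


def linear_segmentation(num_frames, num_states):
    return [t * num_states // num_frames for t in range(num_frames)]


def estimate_states(sequences, segmentations, num_states, min_var=1e-3):
    """Mean and variance of the frames assigned to each state."""
    model = []
    for s in range(num_states):
        frames = [x for seq, seg in zip(sequences, segmentations)
                  for x, state in zip(seq, seg) if state == s]
        mean = sum(frames) / len(frames)
        var = sum((x - mean) ** 2 for x in frames) / len(frames)
        model.append((mean, max(var, min_var)))  # variance floor is an assumption
    return model


def viterbi_align(sequence, model):
    """Best left-to-right alignment; returns (cost, state index per frame)."""
    T, S = len(sequence), len(model)
    cost = [[math.inf] * S for _ in range(T)]
    back = [[0] * S for _ in range(T)]
    cost[0][0] = neg_log_normal(sequence[0], *model[0])
    for t in range(1, T):
        for s in range(S):
            prev = s - 1 if s > 0 and cost[t - 1][s - 1] < cost[t - 1][s] else s
            back[t][s] = prev
            if cost[t - 1][prev] < math.inf:
                cost[t][s] = cost[t - 1][prev] + neg_log_normal(sequence[t], *model[s])
    path = [S - 1]
    for t in range(T - 1, 0, -1):
        path.append(back[t][path[-1]])
    return cost[T - 1][S - 1], path[::-1]


def viterbi_training(sequences, num_states, max_iterations=20):
    """Linear segmentation, initial estimate, then align and reestimate until stable."""
    segs = [linear_segmentation(len(seq), num_states) for seq in sequences]
    model = estimate_states(sequences, segs, num_states)
    for _ in range(max_iterations):
        new_segs = [viterbi_align(seq, model)[1] for seq in sequences]
        if new_segs == segs:
            break
        segs = new_segs
        model = estimate_states(sequences, segs, num_states)
    return model, segs


# Hypothetical reference sequences of length 6 and 7; every sequence must have
# at least num_states frames.
sequences = [
    [1.0, 1.2, 0.9, 1.1, 5.0, 9.0],
    [1.0, 5.0, 5.2, 4.8, 5.1, 9.0, 9.1],
]
model, segs = viterbi_training(sequences, 3)
print(segs)
print(model)
```

In this toy run the linear segmentation initially mixes low and middle values in the same states; after one realignment the segment boundaries move to the jumps in the data and the segmentation no longer changes.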