- 1 - Digital Signal Processing and Pattern Recognition
Speech recognition yesterday
… and tomorrow
- 2 - Digital Signal Processing and Pattern Recognition
Objective
Classification of signals, pattern recognition
e.g. speech, gestures, handwritten text, ECG, traffic situations, …
Sensor signal
Preprocessing: computation of feature vectors
Classification: comparison with reference patterns
Word 1, Word 2, …, Word n
- 3 - Digital Signal Processing and Pattern Recognition
Contents
Distance Measure for Vector Sequences with Different Lengths
Dynamic programming, dynamic time warping
Computation of a Reference Pattern from Several Examples
Mean value, variance, Mahalanobis distance
Random Variables
Modeling of uncertainty, normal distribution
Hidden Markov Models, Learning from Examples
Viterbi Training, Expectation Maximization (EM), k-means Algorithm, mixture distributions
Statistical Dependency, Covariance
Reduction of classification errors, n-dimensional normal distribution
- 4 - Digital Signal Processing and Pattern Recognition
Contents
Maximum Likelihood Estimation
Theoretical foundation of statistical learning algorithms
(optional) Decorrelation, Principal Component Analysis
Simplification of the classification problem through preprocessing
- 5 - Digital Signal Processing and Pattern Recognition
Course Material
Slides used in the lecture
Exercises
Data for exercises
Literature hints
Available on Ilias or
http://mitarbeiter.hs-heilbronn.de/~vstahl
- 6 - Digital Signal Processing and Pattern Recognition
Grading
Programming project synchronous to the lectures
Development of a reliable classifier
Successive improvement, homework
Programming language: C, Java, Python or Matlab
Teamwork (max. 3 students)
Deadline: First lecture after Christmas
- 7 - Digital Signal Processing and Pattern Recognition
Literature
Elementare Einführung in die Wahrscheinlichkeitsrechnung
Karl Bosch (E-Book, 519.Bosch)
Elementare Einführung in die angewandte Statistik
Karl Bosch (519.Bosch)
Mustererkennung mit Markov-Modellen – Theorie, Praxis, Anwendungsgebiete
Gernot A. Fink (621.841 Fin)
Pattern Classification and Scene Analysis
Richard O. Duda, Peter E. Hart (519.Dud)
Mathematical Methods and Algorithms for Signal Processing
Todd K. Moon, Wynn C. Stirling
The Elements of Statistical Learning
Trevor Hastie, Robert Tibshirani, Jerome Friedman (519.Has)
- 8 - Digital Signal Processing and Pattern Recognition
1957 „Sputnik shock“
Foundation of ARPA in 1958 (Advanced Research Projects Agency)
under the administration of the US Department of Defense
DARPA (1972-1993), ARPA (1993-1996), DARPA (1996-…)
Current budget: 3.5 billion $
- 9 - Digital Signal Processing and Pattern Recognition
1969 Arpanet
Further projects: stealth technology, GPS, …
Grand Challenge 2004, Mojave Desert, Nevada, 241 km.
Grand Challenge 2005
Urban Challenge 2007
- 13 - Digital Signal Processing and Pattern Recognition
September 2014: International Motor Show (IAA) Frankfurt. Daimler presents an autonomously driving truck.
Start of production (SOP) planned for 2025.
- 14 - Digital Signal Processing and Pattern Recognition
“Google’s Next Phase in Driverless Cars: No Steering Wheel or Brake Pedals”
www.nytimes.com May 27, 2014
- 15 - Digital Signal Processing and Pattern Recognition
„Keine Folgen für Tesla nach tödlichem Unfall“ ("No consequences for Tesla after fatal accident")
FAZ, 20 January 2017
- 16 - Digital Signal Processing and Pattern Recognition
Daimler AG 2017 prototype
- 17 - Digital Signal Processing and Pattern Recognition
Pattern Recognition
Example: Character recognition
A
- 18 - Digital Signal Processing and Pattern Recognition
Recognition means Classification
Given:
Signal (or features derived from it): Pattern
Finite number of classes
Wanted:
Class to which the pattern „belongs“.
Examples:
Character recognition
Visual quality control (ok, reject)
Speech recognition (phonemes, words)
Medical diagnostics (X-ray images, ECG, …)
…
- 19 - Digital Signal Processing and Pattern Recognition
Classification means Comparison with Reference Patterns
Given:
Pattern to be classified (test pattern)
A reference pattern for each class
Wanted:
The class whose reference pattern is most „similar“ to the given pattern
Problem: „Suitable“ similarity/distance measure between patterns?
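The classification rule on this slide can be sketched in a few lines of Python. This is only an illustration: the function name classify, the dictionary layout of the references and the toy distance (absolute difference of scalars) are assumptions made for the example. The choice of a suitable distance measure is exactly the open problem named above.

```python
def classify(test_pattern, references, distance):
    """Return the label of the class whose reference pattern is closest.

    references: dict mapping class label -> reference pattern (assumed layout)
    distance:   function (pattern, pattern) -> non-negative number
    """
    return min(references,
               key=lambda label: distance(test_pattern, references[label]))


# Toy example: scalar "patterns" and absolute difference as distance.
references = {"A": 1.0, "B": 5.0}
print(classify(1.8, references, lambda x, y: abs(x - y)))  # prints A
```

Any distance function with the same call signature can be plugged in without changing the classifier.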
- 20 - Digital Signal Processing and Pattern Recognition
Example: character recognition
Reference pattern class A: A
Reference pattern class B: B
Test pattern: A
Similarity / distance measure?
- 21 - Digital Signal Processing and Pattern Recognition
Preprocessing
Computation of distinctive features of a pattern: feature vector.
A time signal (speech, gestures, ECG, …) is mapped to a sequence of feature vectors.
Preprocessing is application dependent!
Different features for speech (formant frequencies), gestures, handwritten text, images, ...
- 22 - Digital Signal Processing and Pattern Recognition
Distance measure for sequences of feature vectors
Reference pattern class A
Reference pattern class B
Test pattern
- 23 - Digital Signal Processing and Pattern Recognition
Euclidean Distance
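The slide text contains only the title, so the following is a sketch of the standard definition rather than the lecture's own formula: the Euclidean distance of two feature vectors x and y of equal dimension is the square root of the sum of squared component differences. The second function, sequence_distance, is an assumed straightforward extension to two sequences of the same length (sum of the frame-wise distances); it does not work for sequences of different lengths.

```python
import math


def euclidean(x, y):
    """Euclidean distance between two feature vectors of equal dimension."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same dimension")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def sequence_distance(test, ref):
    """Sum of frame-wise Euclidean distances (equal-length sequences only)."""
    if len(test) != len(ref):
        raise ValueError("sequences must have the same length")
    return sum(euclidean(t, r) for t, r in zip(test, ref))


print(euclidean([0.0, 0.0], [3.0, 4.0]))  # prints 5.0
```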
- 24 - Digital Signal Processing and Pattern Recognition
Reference pattern class A
Reference pattern class B
Test pattern
Distance measure for sequences with different lengths?
- 25 - Digital Signal Processing and Pattern Recognition
Matching
[Figure: matching between the test sequence and the reference sequence]
- 26 - Digital Signal Processing and Pattern Recognition
Summary
Classification of a signal means comparison with reference signals.
Comparison is not done on the level of pixels or samples but on features (or feature vector sequences) derived from the signals.
Feature extraction is application dependent: different features e.g. for speech / image / letter recognition.
Features for speech recognition are usually derived from Fourier coefficients of short signal sections (e.g. 32 ms).
A meaningful distance measure is needed for feature vector sequences with different lengths.
Task: Match test sequence to reference sequence!
- 27 - Digital Signal Processing and Pattern Recognition
Interpretation as a path in a search grid
[Figure: search grid with the axes Test and Reference; a matching corresponds to a path through the grid]
- 28 - Digital Signal Processing and Pattern Recognition
Restrictions for matching sequences
Each test vector has to be mapped to exactly one reference vector (the mapping is a function!)
Temporal order has to be maintained (the function is monotonically increasing): a later test vector cannot be mapped to an earlier reference vector.
Example: "three hundred two" versus "two hundred three"
[Figures: forbidden matchings]
- 29 - Digital Signal Processing and Pattern Recognition
Restrictions for matching sequences
First and last vectors of the sequences have to be mapped to each other
At most one reference vector may be skipped at a time
[Figures: forbidden matchings, illustrated with the words "three hundred two"]
- 30 - Digital Signal Processing and Pattern Recognition
Wanted: functional path through the grid with minimal cost
from bottom left to top right
monotonically increasing
maximum slope 2 (skip one reference vector)
Reference
Test
- 31 - Digital Signal Processing and Pattern Recognition
Conclusion: Only 3 kinds of transitions in the grid allowed
Next transition
Loop transition
Skip transition
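The three transitions can be written as a recurrence for the accumulated cost. The notation is an assumption made for this sketch: t_1 … t_T are the test vectors, r_1 … r_R the reference vectors, d is the local distance between two vectors, and D(i, j) is the minimal cost of a path that ends by mapping t_i to r_j. Terms with a reference index smaller than 1 are left out of the minimum.

```latex
\begin{align*}
D(1,1) &= d(t_1, r_1), \qquad D(1,j) = \infty \quad \text{for } j > 1 \\
D(i,j) &= d(t_i, r_j) + \min\bigl\{
    \underbrace{D(i-1,\,j)}_{\text{loop}},\;
    \underbrace{D(i-1,\,j-1)}_{\text{next}},\;
    \underbrace{D(i-1,\,j-2)}_{\text{skip}}
\bigr\} \quad \text{for } i > 1 \\
\text{distance} &= D(T,R)
\end{align*}
```

The boundary values enforce that the first vectors are mapped to each other, and reading the result at D(T, R) enforces the same for the last vectors.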
- 32 - Digital Signal Processing and Pattern Recognition
Computation of the optimal path
Viterbi Algorithm:
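The details of the algorithm were shown graphically and are not part of the slide text, so the following is a minimal dynamic-programming sketch under stated assumptions, not necessarily the lecture's exact formulation. It assumes the Euclidean distance as local cost and the three transitions loop, next and skip. The names viterbi_match, cost and back are chosen for this example. The function returns both the total distance and the optimal path, where path[i] is the index of the reference vector that test vector i is mapped to.

```python
import math


def euclidean(x, y):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def viterbi_match(test, ref, dist=euclidean):
    """Match a test sequence to a reference sequence.

    Returns (total_cost, path); path[i] is the reference index of test[i].
    """
    T, R = len(test), len(ref)
    cost = [[math.inf] * R for _ in range(T)]   # accumulated cost
    back = [[-1] * R for _ in range(T)]         # best predecessor index
    cost[0][0] = dist(test[0], ref[0])          # first vectors are matched
    for i in range(1, T):
        for j in range(R):
            # Allowed predecessors: loop (j), next (j - 1), skip (j - 2).
            best_prev, best_cost = -1, math.inf
            for k in (j, j - 1, j - 2):
                if k >= 0 and cost[i - 1][k] < best_cost:
                    best_prev, best_cost = k, cost[i - 1][k]
            if best_prev >= 0:
                cost[i][j] = best_cost + dist(test[i], ref[j])
                back[i][j] = best_prev
    if math.isinf(cost[T - 1][R - 1]):
        raise ValueError("no admissible path: test sequence too short")
    # Backtracking from the top right corner of the grid.
    path = [R - 1]
    for i in range(T - 1, 0, -1):
        path.append(back[i][path[-1]])
    path.reverse()
    return cost[T - 1][R - 1], path


total, path = viterbi_match([[0.0], [0.0], [1.0], [2.0]],
                            [[0.0], [1.0], [2.0]])
print(total, path)  # prints 0.0 [0, 0, 1, 2]
```

Because at most one reference vector may be skipped per step, a test sequence that is much shorter than the reference cannot reach the top right corner; the sketch reports this case as an error instead of returning an infinite distance.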
- 40 - Digital Signal Processing and Pattern Recognition
Milestone for your project:
Implement the Viterbi matching algorithm and test it.
Compute the distance between feature vector sequences and the optimal path (will be needed in the next step).
Do some speech recognition experiments using the feature extraction program on my web page
(long/short words, different speakers for test/reference signal, noise, …)
- 41 - Digital Signal Processing and Pattern Recognition
Improvement:
More than one reference sequence for a class
Problem:
Matching with each reference sequence is time consuming
Solution:
Compute an „average“ over all sequences of one class
Example:
[Figure: references for the class and the mean value for the class, called the „model“; its vectors are the „states“]
- 42 - Digital Signal Processing and Pattern Recognition
Problem:
Reference vector sequences of a class may have different lengths
Solution:
Match all reference vector sequences of a class to a common model with fixed length
[Figure: two reference sequences from the same class]
- 43 - Digital Signal Processing and Pattern Recognition
Viterbi Training of a Model
Outline:
Initial estimation:
Determine the length of the model
Match each reference sequence to the model length (assume linear time distortion)
Initial estimation of the model vector sequence by averaging
Iterative improvement:
Match each reference sequence to the current model (dynamic time warping, Viterbi matching as above)
Recalculate the model by averaging
- 44 - Digital Signal Processing and Pattern Recognition
Choice of the model length
e.g. ½ × median of the lengths of the reference sequences
[Figures: model is too long! / model length is ok.]
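The rule of thumb above can be sketched as follows. The function name model_length, rounding down, and the lower bound of one state are assumptions for this example; the slide only gives "½ median" as one possible choice.

```python
import statistics


def model_length(reference_lengths):
    """Half the median length of the reference sequences, at least 1."""
    return max(1, int(statistics.median(reference_lengths) / 2))


print(model_length([6, 7]))  # prints 3
```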
- 45 - Digital Signal Processing and Pattern Recognition
Linear Segmentation
Linear mapping of a reference vector sequence to the model states.
(Values of the feature vectors are irrelevant in this step.)
Model
Reference sequence
- 46 - Digital Signal Processing and Pattern Recognition
Example
Two reference sequences of a class (given): length 6 and length 7
Model for the class (wanted): length 3
- 47 - Digital Signal Processing and Pattern Recognition
Linear Segmentation
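A minimal sketch of a linear segmentation, applied to sequence lengths 6 and 7 and a model of length 3. The function name and the exact rounding rule (integer division) are assumptions; any rule that spreads the vectors evenly over the states in temporal order would serve the same purpose. Only the lengths are used, the values of the feature vectors play no role in this step.

```python
def linear_segmentation(seq_length, num_states):
    """Map each position of a sequence linearly to a model state index."""
    return [i * num_states // seq_length for i in range(seq_length)]


print(linear_segmentation(6, 3))  # prints [0, 0, 1, 1, 2, 2]
print(linear_segmentation(7, 3))  # prints [0, 0, 0, 1, 1, 2, 2]
```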
- 48 - Digital Signal Processing and Pattern Recognition
Initial model
Initial estimation of the model vectors:
Model vector = mean value of all reference vectors which have been mapped to the model state
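The averaging step can be sketched as below. The function name estimate_model and the data layout are assumptions: sequences is a list of reference sequences (each a list of feature vectors) and alignments holds, for every sequence, the state index each of its vectors has been mapped to.

```python
def estimate_model(sequences, alignments, num_states):
    """Model vector of a state = mean of all vectors mapped to that state."""
    dim = len(sequences[0][0])
    sums = [[0.0] * dim for _ in range(num_states)]
    counts = [0] * num_states
    for seq, alignment in zip(sequences, alignments):
        for vec, state in zip(seq, alignment):
            counts[state] += 1
            for d in range(dim):
                sums[state][d] += vec[d]
    if 0 in counts:
        raise ValueError("a state received no vectors")
    return [[s / counts[k] for s in sums[k]] for k in range(num_states)]


sequences = [[[0.0], [2.0], [4.0]], [[1.0], [5.0]]]
alignments = [[0, 0, 1], [0, 1]]
print(estimate_model(sequences, alignments, 2))  # prints [[1.0], [4.5]]
```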
- 49 - Digital Signal Processing and Pattern Recognition
Match the reference sequences with the model using the Viterbi algorithm (segmentation)
- 50 - Digital Signal Processing and Pattern Recognition
Reestimate the model vectors:
Model vector = mean value of all reference vectors which have been mapped to the state.
Iterate:
Recompute the segmentation with the new model
Reestimate the model using the new segmentation
Expectation Maximization Principle (EM algorithm)
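The whole training loop can be sketched in one self-contained block. Several details are assumptions made for this example and are not taken from the slides: Euclidean local distance, linear segmentation by integer division, stopping as soon as the total matching cost no longer decreases (or after max_iter rounds), keeping the previous vector for a state that is skipped by all sequences, and reference sequences that are at least as long as the model.

```python
import math


def dist(x, y):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def align(seq, model):
    """Viterbi matching of seq to the model (loop, next, skip transitions)."""
    T, S = len(seq), len(model)
    cost = [[math.inf] * S for _ in range(T)]
    back = [[0] * S for _ in range(T)]
    cost[0][0] = dist(seq[0], model[0])
    for i in range(1, T):
        for j in range(S):
            prev = min((k for k in (j, j - 1, j - 2) if k >= 0),
                       key=lambda k: cost[i - 1][k])
            cost[i][j] = cost[i - 1][prev] + dist(seq[i], model[j])
            back[i][j] = prev
    if math.isinf(cost[T - 1][S - 1]):
        raise ValueError("sequence too short for this model")
    path = [S - 1]
    for i in range(T - 1, 0, -1):
        path.append(back[i][path[-1]])
    return cost[T - 1][S - 1], path[::-1]


def mean_per_state(seqs, paths, old_model):
    """Reestimate each model vector as the mean of the vectors mapped to it."""
    model = []
    for s in range(len(old_model)):
        vecs = [v for seq, p in zip(seqs, paths)
                for v, state in zip(seq, p) if state == s]
        if vecs:
            model.append([sum(c) / len(vecs) for c in zip(*vecs)])
        else:
            model.append(old_model[s])  # skipped state keeps its old vector
    return model


def viterbi_training(seqs, num_states, max_iter=20):
    # Initial estimation: linear segmentation, then averaging.
    paths = [[i * num_states // len(s) for i in range(len(s))] for s in seqs]
    dim = len(seqs[0][0])
    model = mean_per_state(seqs, paths, [[0.0] * dim] * num_states)
    previous_total = math.inf
    for _ in range(max_iter):
        results = [align(s, model) for s in seqs]     # new segmentation
        total = sum(c for c, _ in results)
        if total >= previous_total:                   # no improvement: stop
            break
        previous_total = total
        model = mean_per_state(seqs, [p for _, p in results], model)
    return model


seqs = [[[0.0], [0.0], [1.0], [1.0], [2.0], [2.0]],
        [[0.0], [1.0], [1.0], [1.0], [2.0], [2.0], [2.0]]]
print(viterbi_training(seqs, 3))  # prints [[0.0], [1.0], [2.0]]
```

In this toy example the linear segmentation puts two vectors of the second sequence into the wrong state; one round of Viterbi matching corrects the segmentation and the model settles on the three values that actually occur.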
- 52 - Digital Signal Processing and Pattern Recognition
Milestone for your project:
Implement Viterbi training and test it.
Run recognition experiments with a varying number of reference recordings for each word.
Make two sets of recordings: one for training the models and one for evaluating the recognition error rates.
Automate this process.
Make two models of the same word spoken by different speakers.
Instead of speech recognition you can do speaker recognition that way.
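One way to automate the evaluation is a small function that classifies every recording of the evaluation set and counts the errors. This is only a sketch: the name error_rate, the data layout (a list of (label, sequence) pairs and a dictionary of models) and the toy distance in the usage example are assumptions. In the project, the distance would be the Viterbi matching cost between a feature vector sequence and a model.

```python
def error_rate(test_set, models, distance):
    """Fraction of test recordings that are assigned to the wrong class.

    test_set: list of (true_label, pattern) pairs
    models:   dict mapping label -> trained model
    distance: function (pattern, model) -> non-negative number
    """
    errors = 0
    for true_label, pattern in test_set:
        predicted = min(models, key=lambda m: distance(pattern, models[m]))
        if predicted != true_label:
            errors += 1
    return errors / len(test_set)


# Toy usage with scalar "patterns" and absolute difference as distance.
models = {"one": 1.0, "two": 2.0}
test_set = [("one", 1.1), ("two", 1.9), ("two", 1.2)]
rate = error_rate(test_set, models, lambda x, m: abs(x - m))
print(round(rate, 3))  # prints 0.333
```

For speaker recognition the same function can be used unchanged, with one model per speaker instead of one model per word.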
- 53 - Digital Signal Processing and Pattern Recognition
Example