Machine Learning Techniques for Multimedia
Case Studies on Organization and Retrieval
Edited by
Matthieu Cord, Pádraig Cunningham
1st edition 2008. Book. xvi, 289 pp. Hardcover. ISBN 978 3 540 75170 0
Format (W x L): 15.5 x 23.5 cm. Weight: 625 g
Part I Introduction to Learning Principles for Multimedia Data
1 Introduction to Bayesian Methods and Decision Theory . . . . 3
Simon P. Wilson, Rozenn Dahyot, and Pádraig Cunningham
1.1 Introduction . . . 3
1.2 Uncertainty and Probability . . . 4
1.2.1 Quantifying Uncertainty . . . 4
1.2.2 The Laws of Probability . . . 5
1.2.3 Interpreting Probability . . . 6
1.2.4 The Partition Law and Bayes’ Law . . . 7
1.3 Probability Models, Parameters and Likelihoods . . . 8
1.4 Bayesian Statistical Learning . . . 9
1.5 Implementing Bayesian Statistical Learning Methods . . . 10
1.5.1 Direct Simulation Methods . . . 11
1.5.2 Markov Chain Monte Carlo . . . 12
1.5.3 Monte Carlo Integration . . . 13
1.5.4 Optimization Methods . . . 14
1.6 Decision Theory . . . 15
1.6.1 Utility and Choosing the Optimal Decision . . . 16
1.6.2 Where Is the Utility? . . . 17
1.7 Naive Bayes . . . 17
1.8 Further Reading . . . 18
References . . . 19
2 Supervised Learning . . . . 21
Pádraig Cunningham, Matthieu Cord, and Sarah Jane Delany
2.1 Introduction . . . 21
2.2 Introduction to Statistical Learning . . . 22
2.2.1 Risk Minimization . . . 22
2.2.2 Empirical Risk Minimization . . . 23
2.2.3 Risk Bounds . . . 24
2.3 Support Vector Machines and Kernels . . . 26
2.3.1 Linear Classification: SVM Principle . . . 26
2.3.2 Soft Margin . . . 27
2.3.3 Kernel-Based Classification . . . 28
2.4 Nearest Neighbour Classification . . . 29
2.4.1 Similarity and Distance Metrics . . . 31
2.4.2 Other Distance Metrics for Multimedia Data . . . 32
2.4.3 Computational Complexity . . . 35
2.4.4 Instance Selection and Noise Reduction . . . 36
2.4.5 k-NN: Advantages and Disadvantages . . . 39
2.5 Ensemble Techniques . . . 40
2.5.1 Introduction . . . 40
2.5.2 Bias–Variance Analysis of Error . . . 41
2.5.3 Bagging . . . 41
2.5.4 Random Forests . . . 44
2.5.5 Boosting . . . 45
2.6 Summary . . . 46
References . . . 47
3 Unsupervised Learning and Clustering . . . . 51
Derek Greene, Pádraig Cunningham, and Rudolf Mayer
3.1 Introduction . . . 51
3.2 Basic Clustering Techniques . . . 52
3.2.1 k-Means Clustering . . . 52
3.2.2 Fuzzy Clustering . . . 53
3.2.3 Hierarchical Clustering . . . 54
3.3 Modern Clustering Techniques . . . 58
3.3.1 Kernel Clustering . . . 58
3.3.2 Spectral Clustering . . . 60
3.4 Self-organizing Maps . . . 65
3.4.1 SOM Architecture . . . 66
3.4.2 SOM Algorithm . . . 66
3.4.3 Self-organizing Map and Clustering . . . 69
3.4.4 Variations of the Self-organizing Map . . . 70
3.5 Cluster Validation . . . 73
3.5.1 Internal Validation . . . 75
3.5.2 External Validation . . . 79
3.5.3 Stability-Based Techniques . . . 84
3.6 Summary . . . 87
References . . . 87
4 Dimension Reduction . . . . 91
Pádraig Cunningham
4.1 Introduction . . . 91
4.2 Feature Transformation . . . 93
4.2.1 Principal Component Analysis . . . 94
4.2.2 Linear Discriminant Analysis . . . 97
4.3 Feature Selection . . . 99
4.3.1 Feature Selection in Supervised Learning . . . 99
4.3.2 Unsupervised Feature Selection . . . 104
4.4 Conclusions . . . 110
References . . . 110
Part II Multimedia Applications
5 Online Content-Based Image Retrieval Using Active Learning . . . . 115
Matthieu Cord and Philippe-Henri Gosselin
5.1 Introduction . . . 115
5.2 Database Representation: Features and Similarity . . . 117
5.2.1 Visual Features . . . 117
5.2.2 Signature Based on Visual Pattern Dictionary . . . 117
5.2.3 Similarity . . . 118
5.2.4 Kernel Framework . . . 119
5.2.5 Experiments . . . 120
5.3 Classification Framework for Image Collection . . . 121
5.3.1 Classification Methods for CBIR . . . 122
5.3.2 Query Updating Scheme . . . 123
5.3.3 Experiments . . . 123
5.4 Active Learning for CBIR . . . 124
5.4.1 Notations for Selective Sampling Optimization . . . 125
5.4.2 Active Learning Methods . . . 125
5.5 Further Insights on Active Learning for CBIR . . . 127
5.5.1 Active Boundary Correction . . . 128
5.5.2 MAP vs Classification Error . . . 130
5.5.3 Batch Selection . . . 130
5.5.4 Experiments . . . 132
5.6 CBIR Interface: Result Display and Interaction . . . 132
References . . . 136
6 Conservative Learning for Object Detectors . . . . 139
Peter M. Roth and Horst Bischof
6.1 Introduction . . . 140
6.2 Online Conservative Learning . . . 143
6.2.1 Motion Detection . . . 143
6.2.2 Reconstructive Model . . . 144
6.2.3 Online AdaBoost for Feature Selection . . . 146
6.2.4 Conservative Update Rules . . . 148
6.3 Experimental Results . . . 149
6.3.1 Description of Experiments . . . 149
6.3.2 CoffeeCam . . . 151
6.3.3 Switch to Caviar . . . 153
6.3.4 Further Detection Results . . . 156
6.4 Summary and Conclusions . . . 156
References . . . 156
7 Machine Learning Techniques for Face Analysis . . . . 159
Roberto Valenti, Nicu Sebe, Theo Gevers, and Ira Cohen
7.1 Introduction . . . 160
7.2 Background . . . 160
7.2.1 Face Detection . . . 160
7.2.2 Facial Feature Detection . . . 161
7.2.3 Emotion Recognition Research . . . 162
7.3 Learning Classifiers for Human–Computer Interaction . . . 163
7.3.1 Model Is Correct . . . 165
7.3.2 Model Is Incorrect . . . 166
7.3.3 Discussion . . . 167
7.4 Learning the Structure of Bayesian Network Classifiers . . . 168
7.4.1 Bayesian Networks . . . 168
7.4.2 Switching Between Simple Models . . . 169
7.4.3 Beyond Simple Models . . . 169
7.4.4 Classification-Driven Stochastic Structure Search . . . 170
7.4.5 Should Unlabeled Be Weighed Differently? . . . 171
7.4.6 Active Learning . . . 172
7.4.7 Summary . . . 173
7.5 Experiments . . . 173
7.5.1 Face Detection Experiments . . . 174
7.5.2 Facial Feature Detection . . . 178
7.5.3 Facial Expression Recognition Experiments . . . 183
7.6 Conclusion . . . 184
References . . . 185
8 Mental Search in Image Databases: Implicit Versus Explicit Content Query . . . . 189
Simon P. Wilson, Julien Fauqueur, and Nozha Boujemaa
8.1 Introduction . . . 189
8.2 “Mental Image Search” Versus Other Search Paradigms . . . 190
8.3 Implicit Content Query: Mental Image Search Using Bayesian Inference . . . 191
8.3.1 Bayesian Inference for CBIR . . . 191
8.3.2 Mental Image Category Search . . . 193
8.3.3 Evaluation . . . 195
8.3.4 Remarks . . . 196
8.4 Explicit Content Query: Mental Image Search by Visual Composition Formulation . . . 197
8.4.1 System Summary . . . 198
8.4.2 Visual Thesaurus Construction . . . 198
8.4.3 Symbolic Indexing, Boolean Search and Range Query Mechanism . . . 199
8.4.4 Results . . . 201
8.4.5 Summary . . . 203
8.5 Conclusions . . . 203
References . . . 204
9 Combining Textual and Visual Information for Semantic Labeling of Images and Videos . . . . 205
Pınar Duygulu, Muhammet Baştan, and Derya Ozkan
9.1 Introduction . . . 206
9.2 Semantic Labeling of Images . . . 207
9.3 Translation Approach . . . 210
9.3.1 Learning Correspondences Between Words and Regions . . . 211
9.3.2 Linking Visual Elements to Words in News Videos . . . 212
9.3.3 Translation Approach to Solve Video Association Problem . . . 213
9.3.4 Experiments on News Videos Data Set . . . 214
9.4 Naming Faces in News . . . 218
9.4.1 Integrating Names and Faces . . . 218
9.4.2 Finding Similarity of Faces . . . 219
9.4.3 Finding the Densest Component in the Similarity Graph . . . 220
9.4.4 Experiments . . . 221
9.5 Conclusion and Discussion . . . 223
References . . . 223
10 Machine Learning for Semi-structured Multimedia Documents: Application to Pornographic Filtering and Thematic Categorization . . . . 227
Ludovic Denoyer and Patrick Gallinari
10.1 Introduction . . . 227
10.2 Previous Work . . . 229
10.2.1 Structured Document Classification . . . 230
10.2.2 Multimedia Documents . . . 231
10.3 Multimedia Generative Model . . . 231
10.3.1 Classification of Documents . . . 231
10.3.2 Generative Model . . . 232
10.3.3 Description . . . 232
10.4 Learning the Meta Model . . . 238
10.4.1 Maximization of Lstructure . . . 238
10.4.2 Maximization of Lcontent . . . 239
10.5 Local Generative Models for Text and Image . . . 239
10.5.1 Modelling a Piece of Text with Naive Bayes . . . 240
10.5.2 Image Model . . . 240
10.6 Experiments . . . 241
10.6.1 Models and Evaluation . . . 241
10.6.2 Corpora . . . 242
10.6.3 Results over the Pornographic Corpus . . . 243
10.6.4 Results over the Wikipedia Multimedia Categorization Corpus . . . 244
10.7 Conclusion . . . 246
References . . . 246
11 Classification and Clustering of Music for Novel Music Access Applications . . . . 249
Thomas Lidy and Andreas Rauber
11.1 Introduction . . . 250
11.2 Feature Extraction from Audio . . . 251
11.2.1 Low-Level Audio Features . . . 251
11.2.2 MPEG-7 Audio Descriptors . . . 252
11.2.3 MFCCs . . . 255
11.2.4 MARSYAS Features . . . 256
11.2.5 Rhythm Patterns . . . 258
11.2.6 Statistical Spectrum Descriptors . . . 259
11.2.7 Rhythm Histograms . . . 260
11.3 Automatic Classification of Music into Genres . . . 262
11.3.1 Evaluation Through Music Classification . . . 263
11.3.2 Benchmark Data Sets for Music Classification . . . 264
11.4 Creating and Visualizing Music Maps Based on Self-organizing Maps . . . 267
11.4.1 Class Visualization . . . 268
11.4.2 Hit Histograms . . . 269
11.4.3 U-Matrix . . . 270
11.4.4 P-Matrix . . . 271
11.4.5 U*-Matrix . . . 272
11.4.6 Gradient Fields . . . 272
11.4.7 Component Planes . . . 273
11.4.8 Smoothed Data Histograms . . . 274
11.5 PlaySOM – Interaction with Music Maps . . . 276
11.5.1 Interface . . . 276
11.5.2 Interaction . . . 277
11.5.3 Playlist Creation . . . 278
11.6 PocketSOMPlayer – Music Retrieval on Mobile Devices . . . 280
11.6.1 Interaction . . . 281
11.6.2 Playing Scenarios . . . 282
11.6.3 Conclusion . . . 282
11.7 Conclusions . . . 282
References . . . 283
Index . . . 287