Multimedia Databases


(1)

Multimedia Databases

Wolf-Tilo Balke Janus Wawrzinek

Institut für Informationssysteme

Technische Universität Braunschweig

http://www.ifis.cs.tu-bs.de

(2)

• Lecture

– 22.10.2019 – 04.02.2020

– 09:45-12:15 (3 lecture hours with a break)

– Exercises and detours

• 4 or 5 Credits depending on the examination rules

• Exams

– Oral exam

0 Organizational Issues

(3)

Recommended literature

– Schmitt: Ähnlichkeitssuche in Multimedia-Datenbanken,

Oldenbourg, 2005

http://dx.doi.org/10.1524/9783486595048

– Steinmetz: Multimedia-Technologie:

Grundlagen, Komponenten und Systeme, Springer, 1999

0 Organizational Issues

(4)

– Castelli/Bergman: Image Databases, Wiley, 2002

http://dx.doi.org/10.1002/0471224634

– Khoshafian/Baker: Multimedia and Imaging Databases, Morgan

Kaufmann, 1996

– Sometimes: original papers (on our Web page)

0 Organizational Issues

(5)

Course Web page

– http://www.ifis.cs.tu-bs.de/ws-1920/mmdb

– Contains slides, exercises, related papers and a video of the lecture

– Any questions? Just drop us an email…

0 Organizational Issues

(6)

1 Introduction

1.1 What are multimedia databases?

1.2 Multimedia database applications

1.3 Evaluation of retrieval techniques

1 Introduction

(7)

• What are multimedia databases (MMDB)?

Databases + multimedia = MMDB

• Key words: databases and multimedia

• We already know databases, so what is multimedia?

1.1 Multimedia Databases

(8)

Multimedia

– The concept of multimedia expresses the integration of different digital media types

– The integration is usually performed in a document

– Basic media types are text, image, vector graphics, audio and video

1.1 Basic Definitions

(9)

• Text

– Text data, Spreadsheets, E-Mail, …

• Image

– Photos (Bitmaps), Vector graphics, CAD, …

• Audio

– Speech- and music records, annotations, wave files, MIDI, MP3, …

• Video

– Dynamic image recordings, frame sequences, MPEG, AVI, …

1.1 Data Types

(10)

Document types

Media objects are documents which are of only one type (not necessarily text)

Multimedia objects are general documents which allow an arbitrary combination of different types

• Multimedia data is transferred through the use of a medium

1.1 Documents

(11)

Medium

– A medium is a carrier of information in a communication connection

– It is independent of the transported information

– The used medium can also be changed during information transfer

1.1 Basic Definitions

(12)

Book

1.1 Medium Example

– Communication between author and reader

– Independent of the content

– Hierarchically built from text and images

– Reading out loud represents a medium change to sound/audio

(13)

• Based on receiver type

– Visual/optical medium

– Acoustic medium

– Haptic medium – through tactile senses

– Olfactory medium – through smell

– Gustatory medium – through taste

• Based on time

– Dynamic

– Static

1.1 Medium Classification

(14)

• We now have seen…

– …what multimedia is

– …and how it is transported (through some medium)

• But… why do we need databases?

– Most important operations of databases are data storage and data retrieval

1.1 Multimedia Databases

(15)

Persistent storage of multimedia data, e.g.:

– Text documents

– Vector graphics, CAD

– Images, audio, video

Content-based retrieval

– Efficient content-based search

– Standardization of meta-data (e. g., MPEG-7, MPEG-21)

1.1 Multimedia Databases

(16)

Stand-alone vs. database storage model?

– Special retrieval functionality as well as corresponding optimization can be provided in both cases…

– But in the second case we also get the general advantages of databases

• Declarative query language

• Orthogonal combination of the query functionality

• Query optimization, Index structures

• Transaction management, recovery

• ...

1.1 Multimedia Databases

(17)

1.1 Historical Overview

[Timeline 1960–2000: retrieval procedures for text documents (Information Retrieval); relational databases and SQL; the presence of multimedia objects intensifies; SQL-92 introduces BLOBs; first multimedia databases.]

(18)

• Relational Databases use the data type BLOB (binary large object)

– Uninterpreted data

– Retrieval through metadata like e.g., file name, size, author, …

Object-relational extensions feature enhanced retrieval functionality

– Semantic search

– IBM DB2 Extenders, Oracle Cartridges, …

– Integration in DB through UDFs, UDTs, Stored Procedures, …

1.1 Commercial Systems

(19)

Requirements for multimedia databases (Christodoulakis, 1985)

– Classical database functionality

– Maintenance of unformatted data

– Consideration of special storage and presentation devices

1.1 Requirements

(20)

• To comply with these requirements the following aspects need to be considered

Software architecture – new or extension of existing databases?

Content addressing – identification of the objects through content-based features

Performance – improvements using indexes, optimization, etc.

1.1 Requirements

(21)

User interface – how should the user interact with the system? Separate structure from content!

Information extraction – (automatic) generation of content-based features

Storage devices – very large storage capacity, redundancy control and compression

Information retrieval – integration of some extended search functionality

1.1 Requirements

(22)

Retrieval: choosing between data objects.

Based on…

– a SELECT condition (exact match)

– or a defined similarity relation (best match)

• Retrieval may also cover the

delivery of the results to the user

1.1 Retrieval

(23)

• Closer look at the search functionality

– „Semantic“ search functionality

– Orthogonal integration of classical and extended functionality

– Search does not directly access the media objects

– Extraction, normalization and indexing of content-based features

– Meaningful similarity/distance measures

1.1 Retrieval

(24)

• “Retrieve all images showing a sunset !”

• What exactly do these images have in common?

1.1 Content-based Retrieval

(25)

• Usually 2 main steps

– Example: image databases

1.1 Schematic View

[Figure: Creating the database – image collection → digitization → image analysis and feature extraction → image database. Querying the database – image query → digitization → image analysis and feature extraction → similarity search against the image database.]
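The two steps above can be sketched in a few lines of Python: features are extracted once when images are inserted, and a query image is analyzed the same way and matched by distance. This is a minimal illustration, not the lecture's implementation – the 4-bin gray-level histogram feature and all function names are assumptions for the sketch.

```python
def extract_features(pixels, bins=4):
    """Image analysis: reduce an image (list of gray values 0..255)
    to a normalized gray-level histogram."""
    hist = [0] * bins
    for p in pixels:
        hist[min(p * bins // 256, bins - 1)] += 1
    return [h / len(pixels) for h in hist]

def distance(f1, f2):
    """Similarity measure: Euclidean distance between feature vectors."""
    return sum((a - b) ** 2 for a, b in zip(f1, f2)) ** 0.5

def query(database, query_pixels, k=1):
    """Similarity search: return the ids of the k best-matching images."""
    qf = extract_features(query_pixels)
    ranked = sorted(database, key=lambda entry: distance(entry[1], qf))
    return [image_id for image_id, _ in ranked[:k]]

# Creating the database: extract features once at insert time
images = {"dark": [10] * 100, "bright": [240] * 100}
db = [(name, extract_features(px)) for name, px in images.items()]

# Querying the database: a mostly dark query image should match "dark"
print(query(db, [20] * 90 + [200] * 10))  # ['dark']
```

Real systems replace the toy histogram by richer features and index structures, but the division of work – feature extraction at insert time, similarity search at query time – is the same.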

(26)

1.1 Detailed View

[Figure: 1. insert into the database (raw data) → MM-Database (raw & relational data); 2. extraction of features (feature values); 3. query preparation; 4. similarity computation & query processing (query plan & feature values); 5. result preparation (result data) → result.]

(27)

1.1 More Detailed View

[Figure: a query passes through query preparation (normalization, segmentation, feature extraction) and optimization into a query plan with feature values. The MM-Database holds BLOBs/CLOBs, a feature index and a relational DB (metadata, profile, structure data), filled at insert time by pre-processing (decomposition, normalization, segmentation) and feature extraction, recognition and preparation. Similarity computation and query processing yield result data, which result preparation (medium and format transformation) returns to the user; relevance feedback may refine the query.]

(28)

• Lots of multimedia content on the Web

– Social networking e.g., Facebook, MySpace, Hi5, etc.

– Photo sharing e.g., Flickr, Photobucket, Instagram, Picasa, etc.

– Video sharing e.g., YouTube, Metacafe, blip.tv, Liveleak, etc.

1.2 Applications

(29)

Cameras are everywhere

– In London “there are at least 500,000 cameras in the city, and one study showed that in a single

day a person could expect to be filmed 300 times”

1.2 Applications

(30)

• Picasa face recognition

1.2 Applications

(31)

• Picasa, face recognition example

1.2 Applications

(32)

• Picasa, learning phase

1.2 Applications

(33)

• Picasa example

1.2 Applications

(34)

• Consider a police investigation of a large-scale drug operation

• Possible generated data:

– Video data captured by surveillance cameras

– Audio data captured

– Image data consisting of still photographs taken by investigators

– Structured relational data containing background information

– Geographic information system data

1.2 Sample Scenario

(35)

• Possible queries

– Image query by keywords: police officer wants to examine pictures of “Tony Soprano”

• Query: “retrieve all images from the image library in which ‘Tony Soprano’ appears"

– Image query by example: the police officer has a photograph and wants to find the identity of the person in the picture

• He hopes that someone else has already tagged another photo of this person

• Query: “retrieve all images from the database in which the person appearing in the (currently displayed) photograph appears”

1.2 Sample Scenario

(36)

Video Query: (Murder case)

• The police assumes that the killer must have interacted with the victim in the recent past

• Query: “Find all video segments from last week in which Jerry appears”

1.2 Sample Scenario

(37)

Heterogeneous Multimedia Query:

• Find all individuals who have been photographed with “Tony Soprano” and who have been convicted of attempted

murder in New Jersey and who have recently had electronic fund transfers made into their bank accounts from ABC

Corp.

1.2 Sample Scenario

(38)

• … so there are different types of queries

… what about the MMDB characteristics?

– Static: high number of search queries (read access), few modifications of the data

– Dynamic: frequent modifications of the data

– Passive: the database reacts only to requests from outside

– Active: the functionality of the database triggers operations at application level

Standard search: queries are answered through the use of metadata e.g., Google-image search

Retrieval functionality: content based search on the multimedia repository e.g., Picasa face recognition

1.2 Characteristics

(39)

Passive static retrieval

– Art historical use case

1.2 Example

(40)

– Coat of arms: Possible hit in a multimedia database

1.2 Example

(41)

Active dynamic retrieval

– Weather warning through evaluation of satellite photos

1.2 Example

[Figure: satellite photo → extraction → typhoon warning for the Philippines]

(42)

Standard search

– Queries are answered through the use of metadata e.g., Google-image search

1.2 Example

(43)

Retrieval functionality

– Content based e.g., Picasa face recognition

1.2 Example

(44)

• Basic evaluation of retrieval techniques

– Efficiency of the system

• Efficient utilization of system resources

• Scalable also to large collections

– Effectiveness of the retrieval process

• High quality of the results

• Meaningful usage of the system

• What is more important? An effective retrieval process or an efficient one?

Depends on the application!

1.3 Retrieval Evaluation

(45)

• Characteristic values to measure efficiency are e.g.:

– Memory usage

– CPU time

– Number of I/O operations

– Response time

• Depends on the (hardware) environment

Goal: the system should be efficient enough!

1.3 Evaluating Efficiency

(46)

• Measuring effectiveness is more difficult and always depends on the query

• We need to define some query-dependent evaluation measures!

– Objective quality metrics

– Independent of the querying interface and the retrieval procedure

• Allows for comparing different systems/algorithms

1.3 Evaluating Effectiveness

(47)

• Effectiveness can be measured with respect to an explicit query

– Main focus on evaluating the behavior of the system with respect to a query

– Relevance of the result set

• But effectiveness also needs to consider implicit information needs

– Main focus on evaluating the usefulness, usability and user friendliness of the system

– Not relevant for this lecture!

1.3 Evaluating Effectiveness

(48)

Relevance as a measure for retrieval: each document is classified in a binary fashion as relevant or irrelevant with respect to the query

– This classification is manually performed by “experts”

– The response of the system to the query will be compared to this classification

• Compare the obtained response with the “ideal” result

1.3 Relevance

(49)

• Then apply the automatic retrieval system:

1.3 Involved Sets

[Figure: Venn diagram over the document collection – the set searched for (= relevant; experts say: this is relevant) overlaps with the set found (= query result; the automatic retrieval says: this is relevant).]

(50)

• False positives: irrelevant documents, classified as relevant by the system

False alarms

• Needlessly increase the result set

• Usually inevitable (ambiguity)

• Can be easily eliminated by the user

1.3 False Positives


(51)

• False negatives: relevant documents classified by the system as irrelevant

False dismissals

• Dangerous, since they

can’t be detected easily by the user

– Are there “better” documents in the collection which the system didn’t return?

– False alarms are usually not as bad as false dismissals

1.3 False Negatives


(52)

• Correct positives (correct alarms)

– All documents correctly classified by the system as relevant

• Correct negatives (correct dismissals)

– All documents correctly classified by the system as irrelevant

• All sets are disjoint and their union is the entire document collection
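The four sets can be computed directly with set operations. A small sketch – the function name and the toy collection are made up for illustration:

```python
def split_collection(collection, relevant, found):
    """Split the collection into the four disjoint evaluation sets."""
    ca = relevant & found                  # correct alarms
    fa = found - relevant                  # false alarms
    fd = relevant - found                  # false dismissals
    cd = collection - (relevant | found)   # correct dismissals
    return ca, fa, fd, cd

collection = set(range(10))
relevant = {0, 1, 2, 3}   # the experts' classification
found = {2, 3, 4}         # the system's result

ca, fa, fd, cd = split_collection(collection, relevant, found)
print(sorted(ca), sorted(fa), sorted(fd), sorted(cd))
# [2, 3] [4] [0, 1] [5, 6, 7, 8, 9]

# the sets are disjoint and their union gives back the collection
assert ca | fa | fd | cd == collection
```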

1.3 Remaining Sets


(53)

• Confusion matrix: visualizes the effectiveness of an algorithm

1.3 Overview

                        System: irrelevant   System: relevant
User: relevant                  fd                  ca
User: irrelevant                cd                  fa

(54)

• Relevant results = fd + ca

– Handpicked by experts!

• Retrieved results = ca + fa

– Retrieved by the system

1.3 Interpretation


(55)

Precision measures the ratio of correctly returned documents relative to all returned documents

P = ca / (ca + fa)

• Value between [0, 1]

(1 representing the best value)

• A high number of false alarms means worse results

1.3 Precision


(56)

Recall measures the ratio of correctly returned documents relative to all relevant documents

R = ca / (ca + fd)

• Value between [0, 1]

(1 representing the best value)

• A high number of false dismissals means worse results

1.3 Recall


(57)

• Both measures only make sense, if considered at the same time

– E.g., get perfect recall by simply returning all

documents, but then the precision is extremely low…

• Can be balanced by tuning the system

– E.g., smaller result sets lead to better precision rates at the cost of recall

• Usually the average precision and recall over several queries is considered (macro evaluation)

1.3 Precision-Recall Analysis

(58)

Alarms (returned elements) are divided into ca and fa

– Precision is easy to calculate

Dismissals (not returned elements) are not so trivial to divide into cd and fd, because the entire collection has to be classified

– Recall is difficult to calculate

Standardized benchmarks

– Provide collections and queries

– Annotated result sets

1.3 Actual Evaluation


(59)

1.3 Example

Query      ca   fa   fd   cd      P       R
Q1          8    2    2    8     0.8     0.8
Q2          2    6    8    4     0.25    0.2
Average                          0.525   0.5
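The macro evaluation in this example can be reproduced directly from the definitions of precision and recall. A short sketch using the example's numbers:

```python
def precision(ca, fa):
    """P = ca / (ca + fa): correct results relative to all returned results."""
    return ca / (ca + fa)

def recall(ca, fd):
    """R = ca / (ca + fd): correct results relative to all relevant results."""
    return ca / (ca + fd)

# (ca, fa, fd) per query, as in the example: Q1 and Q2
queries = [(8, 2, 2), (2, 6, 8)]

# macro evaluation: compute P and R per query, then average over the queries
avg_p = sum(precision(ca, fa) for ca, fa, _ in queries) / len(queries)
avg_r = sum(recall(ca, fd) for ca, _, fd in queries) / len(queries)
print(avg_p, avg_r)  # 0.525 0.5
```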

(60)

• Precision-Recall Curves

1.3 Representation

[Figure: precision-recall curves for system 1, system 2 and system 3; marked: the average precision of system 3 at a recall level of 0.2]

• Which system is the best?

• What is more important: recall or precision?

(61)

• Retrieval of images by color

• Introduction to color spaces

• Color histograms

• Matching

Next lecture
