Pattern Recognition

Academic year: 2022


Probability Theory

Probability Space

A probability space $(\Omega, \mathcal{A}, P)$ is a three-tuple with:

• $\Omega$ − the set of elementary events

• $\mathcal{A}$ − a $\sigma$-algebra

• $P$ − a probability measure

A $\sigma$-algebra over $\Omega$ is a system of subsets, i.e. $\mathcal{A} \subseteq \mathcal{P}(\Omega)$ ($\mathcal{P}(\Omega)$ is the power set), with:

• $\Omega \in \mathcal{A}$

• $\mathcal{A}$ is closed with respect to the complement and countable intersection. It follows that $\emptyset \in \mathcal{A}$ and that $\mathcal{A}$ is also closed under countable union (due to De Morgan's laws).

Examples:

• $\{\emptyset, \Omega\}$ (smallest) and $\mathcal{P}(\Omega)$ (largest) $\sigma$-algebras over $\Omega$

• the minimal $\sigma$-algebra over $\Omega$ containing a particular subset $A$ is $\{\emptyset, A, \Omega \setminus A, \Omega\}$

• $\Omega$ discrete and finite, $\mathcal{A} = \mathcal{P}(\Omega)$

• $\Omega = \mathbb{R}$, the Borel algebra (contains all intervals, amongst others)

• etc.

Probability Measure

$P$ is a "measure" ($P: \mathcal{A} \to [0, 1]$) with the normalization $P(\Omega) = 1$ and

$\sigma$-additivity: let $A_1, A_2, \ldots \in \mathcal{A}$ be pairwise disjoint subsets, i.e. $A_i \cap A_j = \emptyset$ for $i \neq j$, then

$$P\Big(\bigcup_i A_i\Big) = \sum_i P(A_i)$$

Note: there are sets for which there is no measure.

Examples: the set of irrational numbers, function spaces etc.

Banach–Tarski paradox: a solid ball can be decomposed into finitely many (non-measurable) pieces and reassembled into two balls of the same size as the original.

(For us) practically relevant cases

• The set $\Omega$ is "good-natured", i.e. $\mathbb{R}^n$, discrete finite sets etc.

• $\mathcal{A} = \mathcal{P}(\Omega)$, i.e. the $\sigma$-algebra is the power set

• We often consider a (composite) "event" $A \subseteq \Omega$ as the union of the elementary ones

• Probability of an event: $P(A) = \sum_{\omega \in A} P(\omega)$
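In the discrete, finite case the last formula is directly computable. A minimal Python sketch (not part of the lecture; a fair die is assumed as the probability space):

```python
# Assumed setup: fair six-sided die, Omega = {1,...,6}, A = power set,
# so P is fully determined by the elementary probabilities P(omega) = 1/6.
P_elem = {omega: 1/6 for omega in range(1, 7)}

def prob(event):
    """P(A) = sum of P(omega) over omega in A (finite, discrete case)."""
    return sum(P_elem[omega] for omega in event)

even = {2, 4, 6}       # composite event "the number is even"
p_even = prob(even)    # 3 * 1/6 = 0.5
```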

Random variables

Here a special case – real-valued random variables.

A random variable for a probability space $(\Omega, \mathcal{A}, P)$ is a mapping $X: \Omega \to \mathbb{R}$ satisfying

$$\{\omega : X(\omega) \leq x\} \in \mathcal{A} \quad \text{for all } x \in \mathbb{R}$$

(this always holds for power sets $\mathcal{A} = \mathcal{P}(\Omega)$).

Note: elementary events are not numbers – they are elements of an abstract set $\Omega$.

Random variables, in contrast, are numbers, i.e. they can be summed up, subtracted, squared etc.

Distributions

Cumulative distribution function of a random variable $X$:

$$F(x) = P(X \leq x) = P(\{\omega : X(\omega) \leq x\})$$

Probability distribution of a discrete random variable $X$:

$$p(x) = P(X = x)$$

Probability density of a continuous random variable $X$:

$$p(x) = \frac{dF(x)}{dx}$$

Why is it necessary to proceed in such a complicated way (through the cumulative distribution function)?

Example – a Gaussian. The probability of any particular real value is zero, so a "direct" definition of a "probability distribution" via point probabilities is senseless.

Through the cumulative distribution function, however, the definition does work: interval probabilities are differences of CDF values, and the density is its derivative.
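The Gaussian example can be made concrete. A small Python sketch (illustrative, not from the lecture) using the standard normal CDF, which is expressible via the error function:

```python
import math

# Standard normal N(0, 1): CDF via the error function erf.
def gauss_cdf(x, mu=0.0, sigma=1.0):
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

# Probability of an interval = difference of CDF values ...
p_interval = gauss_cdf(1.0) - gauss_cdf(-1.0)   # ~ 0.6827 ("one sigma")

# ... while the "probability" of any single point is zero:
p_point = gauss_cdf(0.0) - gauss_cdf(0.0)       # 0.0
```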

Mean

The mean (average, expectation, …) of a random variable $X$ is

$$\mathbb{E}[X] = \sum_x x \cdot p(x) \quad \text{(discrete)} \qquad \mathbb{E}[X] = \int x \cdot p(x)\, dx \quad \text{(continuous)}$$

The arithmetic mean is a special case:

$$\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i \quad \text{with} \quad p(x_i) = \frac{1}{n}$$

(uniform probability distribution)

The probability of an event $A$ can be expressed as the mean value of a corresponding "indicator" variable:

$$P(A) = \mathbb{E}[I_A] \quad \text{with} \quad I_A(\omega) = \begin{cases} 1 & \text{if } \omega \in A \\ 0 & \text{otherwise} \end{cases}$$

Often, the set of elementary events can be associated with a random variable (just enumerate all $\omega \in \Omega$).

Then one can speak about a "probability distribution over $\Omega$" (instead of the probability measure).
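The identity $P(A) = \mathbb{E}[I_A]$ can be verified numerically. A minimal Python sketch (again assuming a fair die as the probability space; not part of the lecture):

```python
# Fair die; event A = "even number".
P_elem = {omega: 1/6 for omega in range(1, 7)}
A = {2, 4, 6}

def indicator(omega):
    """I_A(omega): 1 if omega is in A, 0 otherwise."""
    return 1 if omega in A else 0

# E[I_A] = sum over omega of I_A(omega) * P(omega)
mean_indicator = sum(indicator(w) * p for w, p in P_elem.items())

# P(A) computed directly as a sum of elementary probabilities
p_A = sum(P_elem[w] for w in A)
```

Both quantities come out equal (0.5 here), which is exactly the claimed identity.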

Example 1 – numbers of a die

The set of elementary events: $\Omega = \{1, 2, 3, 4, 5, 6\}$

Probability measure: $P(\omega) = \frac{1}{6}$ for all $\omega \in \Omega$

Random variable: $X(\omega) = \omega$

Cumulative distribution: $F(x)$ is a step function, increasing by $\frac{1}{6}$ at each $x \in \{1, \ldots, 6\}$

Probability distribution: $p(x) = \frac{1}{6}$ for $x \in \{1, \ldots, 6\}$

Mean value: $\mathbb{E}[X] = \sum_{x=1}^{6} x \cdot \frac{1}{6} = 3.5$

Another random variable (squared numbers of a die): $Y(\omega) = \omega^2$

Mean value: $\mathbb{E}[Y] = \sum_{\omega=1}^{6} \omega^2 \cdot \frac{1}{6} = \frac{91}{6} \approx 15.17$

Note: $\mathbb{E}[X^2] \neq (\mathbb{E}[X])^2$
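The die example, including the note at the end, can be reproduced in a few lines of Python (illustrative sketch, not part of the lecture):

```python
# Fair die: p(x) = 1/6 for x in {1,...,6}
P = {x: 1/6 for x in range(1, 7)}

# E[X] = sum_x x * p(x)
mean_X = sum(x * p for x, p in P.items())        # 3.5

# E[X^2] = sum_x x^2 * p(x)
mean_X2 = sum(x**2 * p for x, p in P.items())    # 91/6 ~ 15.17

# Note from the slide: E[X^2] differs from (E[X])^2 (= 12.25)
difference = mean_X2 - mean_X**2                 # this is the variance of X
```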

Example 2 – two independent dice numbers

The set of elementary events (6×6 faces): $\Omega = \{1, \ldots, 6\} \times \{1, \ldots, 6\}$

Probability measure: $P(\omega) = \frac{1}{36}$ for all $\omega = (\omega_1, \omega_2)$

Two random variables:

1. The number of the first die: $X_1(\omega) = \omega_1$

2. The number of the second die: $X_2(\omega) = \omega_2$

Probability distributions: $p(x_1) = \frac{1}{6}$ and $p(x_2) = \frac{1}{6}$ (both uniform)

Consider the new random variable $X = X_1 + X_2$.

The probability distribution is not uniform anymore: $p(x) = \frac{6 - |x - 7|}{36}$ for $x \in \{2, \ldots, 12\}$.

The mean value is $\mathbb{E}[X] = 7$.

In general for mean values: $\mathbb{E}[X_1 + X_2] = \mathbb{E}[X_1] + \mathbb{E}[X_2]$
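The distribution of the sum and the linearity of the mean can be checked by brute-force enumeration of the 36 elementary events. A Python sketch (not part of the lecture):

```python
from collections import defaultdict

# Two independent fair dice: P(omega) = 1/36 on all 36 faces
P_joint = {(i, j): 1/36 for i in range(1, 7) for j in range(1, 7)}

# Distribution of X = X1 + X2, obtained by summing over elementary events
P_sum = defaultdict(float)
for (i, j), p in P_joint.items():
    P_sum[i + j] += p

# E[X1 + X2]; linearity says this equals E[X1] + E[X2] = 3.5 + 3.5 = 7
mean_sum = sum(s * p for s, p in P_sum.items())
```

Printing `P_sum` shows the triangular shape: $p(7) = 6/36$ is the largest value, $p(2) = p(12) = 1/36$ the smallest.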

Random variables of higher dimension

Analogously: let $X$ be a mapping $X: \Omega \to \mathbb{R}^n$ ($n = 2$ for simplicity), with $X = (X_1, X_2)$.

Cumulative distribution function:

$$F(x_1, x_2) = P(X_1 \leq x_1, X_2 \leq x_2)$$

Joint probability distribution (discrete):

$$p(x_1, x_2) = P(X_1 = x_1, X_2 = x_2)$$

Joint probability density (continuous):

$$p(x_1, x_2) = \frac{\partial^2 F(x_1, x_2)}{\partial x_1 \, \partial x_2}$$

Independence

Two events $A$ and $B$ are independent if

$$P(A \cap B) = P(A) \cdot P(B)$$

Interesting: events $A$ and $\bar{B}$ are independent if $A$ and $B$ are independent.

Two random variables are independent if

$$P(X_1 \leq x_1, X_2 \leq x_2) = P(X_1 \leq x_1) \cdot P(X_2 \leq x_2)$$

It follows (example for continuous $X$):

$$p(x_1, x_2) = p(x_1) \cdot p(x_2)$$
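For the two-dice example the factorization can be verified exhaustively over all value pairs. A Python sketch (illustrative, not part of the lecture):

```python
# Two independent fair dice: joint distribution p(x1, x2) = 1/36
P_joint = {(i, j): 1/36 for i in range(1, 7) for j in range(1, 7)}

def marg1(x):
    """Marginal p(x1) = sum over x2 of p(x1, x2)."""
    return sum(p for (i, j), p in P_joint.items() if i == x)

def marg2(y):
    """Marginal p(x2) = sum over x1 of p(x1, x2)."""
    return sum(p for (i, j), p in P_joint.items() if j == y)

# Independence: p(x1, x2) == p(x1) * p(x2) must hold for every pair
independent = all(
    abs(P_joint[(x, y)] - marg1(x) * marg2(y)) < 1e-12
    for x in range(1, 7) for y in range(1, 7)
)
```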

Conditional Probabilities

Conditional probability:

$$P(A \mid B) = \frac{P(A \cap B)}{P(B)}$$

Independence ("almost" equivalent): $A$ and $B$ are independent if $P(A \mid B) = P(A)$ and/or $P(B \mid A) = P(B)$.

Bayes' theorem (formula, rule):

$$P(A \mid B) = \frac{P(B \mid A) \cdot P(A)}{P(B)}$$
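Bayes' rule is a one-liner once $P(B)$ is expanded by the law of total probability. A Python sketch with made-up numbers (the values below are illustrative assumptions, not from the lecture):

```python
# Hypothetical inputs: prior P(A) and the two likelihoods P(B|A), P(B|not A)
p_A = 0.3
p_B_given_A = 0.8
p_B_given_notA = 0.2

# Total probability: P(B) = P(B|A) P(A) + P(B|not A) P(not A)
p_B = p_B_given_A * p_A + p_B_given_notA * (1 - p_A)

# Bayes' rule: P(A|B) = P(B|A) P(A) / P(B)
p_A_given_B = p_B_given_A * p_A / p_B   # 0.24 / 0.38 ~ 0.632
```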

Further definitions (for random variables)

Shorthand: $p(x) \equiv p(X = x)$

Marginal probability distribution:

$$p(x) = \sum_y p(x, y)$$

Conditional probability distribution:

$$p(x \mid y) = \frac{p(x, y)}{p(y)}$$

Note: $\sum_x p(x \mid y) = 1$ for every $y$

Independent probability distributions:

$$p(x, y) = p(x) \cdot p(y)$$
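Marginalization and conditioning on a small joint table make these definitions concrete. A Python sketch (the table values are arbitrary illustrative assumptions):

```python
# Hypothetical joint distribution p(x, y) over x, y in {0, 1}
p_xy = {
    (0, 0): 0.1, (0, 1): 0.3,
    (1, 0): 0.2, (1, 1): 0.4,
}

def p_x(x):
    """Marginal: p(x) = sum over y of p(x, y)."""
    return sum(p for (xx, y), p in p_xy.items() if xx == x)

def p_y(y):
    """Marginal: p(y) = sum over x of p(x, y)."""
    return sum(p for (x, yy), p in p_xy.items() if yy == y)

def p_x_given_y(x, y):
    """Conditional: p(x|y) = p(x, y) / p(y)."""
    return p_xy[(x, y)] / p_y(y)

# Note from the slide: sum over x of p(x|y) = 1 for every y
norm = sum(p_x_given_y(x, 1) for x in (0, 1))
```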

Example

Let the probability to be taken ill be $p(\text{ill})$.

Let the conditional probability to have a temperature in that case be $p(\text{temp} \mid \text{ill})$.

However, one may have a temperature without any illness, i.e. $p(\text{temp} \mid \overline{\text{ill}}) > 0$.

What is the probability to be taken ill, provided that one has a temperature?

Bayes' rule:

$$p(\text{ill} \mid \text{temp}) = \frac{p(\text{temp} \mid \text{ill}) \cdot p(\text{ill})}{p(\text{temp} \mid \text{ill}) \cdot p(\text{ill}) + p(\text{temp} \mid \overline{\text{ill}}) \cdot p(\overline{\text{ill}})}$$

The result is not as high as one might expect; the reason is the very low prior probability to be taken ill.
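The effect of the low prior can be seen numerically. The lecture's original numbers are not preserved in this copy, so the values below are hypothetical stand-ins chosen only to illustrate the point:

```python
# Hypothetical values (NOT the lecture's numbers)
p_ill = 0.01                  # prior: probability of being ill
p_temp_given_ill = 0.9        # temperature, given ill
p_temp_given_healthy = 0.05   # temperature, given not ill

# Total probability of having a temperature
p_temp = p_temp_given_ill * p_ill + p_temp_given_healthy * (1 - p_ill)

# Bayes' rule: posterior probability of being ill given a temperature
p_ill_given_temp = p_temp_given_ill * p_ill / p_temp   # ~ 0.15
```

Despite the high likelihood $p(\text{temp} \mid \text{ill})$, the posterior stays modest because the prior is so small.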

Further topics

The model

Let two random variables be given:

• The first one is typically discrete (i.e. $k \in K$ with $K$ finite) and is called "class"

• The second one is often continuous ($x \in \mathbb{R}^n$) and is called "observation"

Let the joint probability distribution $p(x, k)$ be "given".

As $k$ is discrete, it is often specified by $p(x, k) = p(k) \cdot p(x \mid k)$.

The recognition task: given $x$, estimate $k$.

Usual problems (questions):

• How to estimate $k$ from $x$?

• The joint probability is not always explicitly specified.

• The set $K$ is sometimes huge (remember the Hopfield Networks)

The learning task:

Often (almost always) the probability distribution is known only up to free parameters. How to choose them (learn them from examples)?

Next themes:

1. Recognition, Bayesian Decision Theory

2. Probabilistic (generative) learning, Maximum-Likelihood principle

3. Discriminative models, recognition and learning

4. Support Vector Machines
