Maximum Likelihood Parameter Estimation

Academic year: 2021

- 1 - Digitale Signalverarbeitung und Mustererkennung

Maximum Likelihood Parameter Estimation

- 2 -

Motivation: Training of HMMs

Given:

Sample of feature vectors produced by an HMM state (training data)

Wanted:

Emission probability density of the state

Approach so far:

Assume normal distribution with independent components

Compute empirical mean and variance vectors from the training data
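The "approach so far" can be sketched in a few lines of Python. This is an illustration, not code from the slides; the feature vectors are made-up toy data, and the variance is the biased (divide-by-n) ML variant used throughout this deck:

```python
# Sketch: empirical mean and variance vectors from training vectors,
# assuming a normal distribution with independent components
# (diagonal covariance). The data values are made up for illustration.

def mean_and_variance_vectors(samples):
    """Per-component empirical mean and (biased, ML) variance vectors."""
    n = len(samples)
    dim = len(samples[0])
    mean = [sum(x[d] for x in samples) / n for d in range(dim)]
    var = [sum((x[d] - mean[d]) ** 2 for x in samples) / n for d in range(dim)]
    return mean, var

# Example: three 2-dimensional feature vectors
data = [(2.0, 1.0), (5.0, 3.0), (3.0, 2.0)]
mu, sigma2 = mean_and_variance_vectors(data)
```

In an HMM training setting, `data` would hold all feature vectors assigned to one state, and `mu`, `sigma2` would parameterize that state's emission density.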

- 3 -

Simplification: numbers instead of vectors

Given:

Random variable X with unknown density function.

Sample of X: 2, 5, 3

Wanted:

Density function for X.

- 4 -

Simplification: numbers instead of vectors

Given:

Random variable X with unknown density function.

Sample of X: 2, 5, 3

Wanted:

Density function for X.

One candidate: a normal distribution with the empirical mean \hat{\mu} = 10/3 and variance \hat{\sigma}^2 = 14/9 computed from the sample.

- 5 -

Simplification: numbers instead of vectors

Given:

Random variable X with unknown density function.

Sample of X: 2, 5, 3

Wanted:

Density function for X.

But another density function also fits the sample → the solution is not unique.

A density concentrated sharply on the observed values is probably a bad choice: "overfitting".

- 6 -

Correct problem formulation

Given:

Random variable X with parametric density function p(x; \theta)

Sample of X: x_1, x_2, \ldots, x_n

Wanted:

Values for the parameters \theta such that the likelihood of the observed sample is maximum, i.e.

\hat{\theta} = \arg\max_{\theta} L(\theta)

Likelihood function:

L(\theta) = \prod_{i=1}^{n} p(x_i; \theta)

Example: normal distribution with parameters \theta = (\mu, \sigma^2).
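The likelihood function can be made concrete with a short Python sketch (an illustration, not part of the slides): evaluate L(\theta) for the toy sample 2, 5, 3 under a normal density, once with parameters fitted to the sample and once with a clearly wrong mean.

```python
import math

# Sketch: the likelihood L(theta) = prod_i p(x_i; theta) of the sample
# 2, 5, 3 under a normal density with parameters mu, sigma2.

def normal_pdf(x, mu, sigma2):
    """Density of the normal distribution with mean mu and variance sigma2."""
    return math.exp(-(x - mu) ** 2 / (2 * sigma2)) / math.sqrt(2 * math.pi * sigma2)

def likelihood(sample, mu, sigma2):
    """Product of the density values of all observations."""
    prod = 1.0
    for x in sample:
        prod *= normal_pdf(x, mu, sigma2)
    return prod

sample = [2.0, 5.0, 3.0]
L_fit = likelihood(sample, mu=10/3, sigma2=14/9)  # parameters fitted to the sample
L_off = likelihood(sample, mu=0.0, sigma2=14/9)   # clearly mismatched mean
```

Parameters that fit the sample better yield a larger likelihood (`L_fit > L_off`); maximum likelihood estimation searches for the parameters where this value is largest.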

- 7 -

Task:

Find \theta such that L(\theta) is maximum.

Method:

Set the derivatives to zero, i.e. \frac{\partial}{\partial \theta} L(\theta) = 0.

Problem:

Derivatives of products with many factors lead to complicated terms!

Idea: take the logarithm!

Observation:

L(\theta) and \ln L(\theta) have their maximum for the same value of \theta, as ln is monotonically increasing.

Log likelihood function:

\ell(\theta) = \ln L(\theta) = \sum_{i=1}^{n} \ln p(x_i; \theta)
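Beyond simpler derivatives, the logarithm has a numerical benefit worth noting: a product of many density values underflows in floating point, while the sum of logs stays finite. A small Python sketch (illustration only; the repeated toy sample and the deliberately bad parameters are made up):

```python
import math

# Sketch: the product of many small density values underflows to 0.0,
# while the log-likelihood (sum of logs) remains well-behaved.
# Both are maximized by the same parameter value.

def normal_pdf(x, mu, sigma2):
    return math.exp(-(x - mu) ** 2 / (2 * sigma2)) / math.sqrt(2 * math.pi * sigma2)

sample = [5.0] * 2000      # 2000 identical toy observations
mu, sigma2 = 0.0, 1.0      # deliberately mismatched parameters

product = 1.0
for x in sample:
    product *= normal_pdf(x, mu, sigma2)   # underflows to exactly 0.0

log_sum = sum(math.log(normal_pdf(x, mu, sigma2)) for x in sample)
```

After the loop, `product` is exactly `0.0`, so comparing candidate parameters via L(\theta) directly would be impossible, whereas `log_sum` is a finite (large negative) number.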

- 8 -

Example: maximum likelihood estimation of the parameters \mu and \sigma^2 of the normal distribution from a sample x_1, x_2, \ldots, x_n

Likelihood function:

L(\mu, \sigma^2) = \prod_{i=1}^{n} \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\frac{(x_i - \mu)^2}{2\sigma^2}\right)

Log likelihood function:

\ell(\mu, \sigma^2) = -\frac{n}{2} \ln(2\pi\sigma^2) - \frac{1}{2\sigma^2} \sum_{i=1}^{n} (x_i - \mu)^2

- 9 -

Log likelihood function:

\ell(\mu, \sigma^2) = -\frac{n}{2} \ln(2\pi\sigma^2) - \frac{1}{2\sigma^2} \sum_{i=1}^{n} (x_i - \mu)^2

Partial derivative wrt. \mu:

\frac{\partial \ell}{\partial \mu} = \frac{1}{\sigma^2} \sum_{i=1}^{n} (x_i - \mu) \stackrel{!}{=} 0 \quad\Rightarrow\quad \hat{\mu} = \frac{1}{n} \sum_{i=1}^{n} x_i

- 10 -

Log likelihood function:

\ell(\mu, \sigma^2) = -\frac{n}{2} \ln(2\pi\sigma^2) - \frac{1}{2\sigma^2} \sum_{i=1}^{n} (x_i - \mu)^2

Partial derivative wrt. \sigma^2:

\frac{\partial \ell}{\partial \sigma^2} = -\frac{n}{2\sigma^2} + \frac{1}{2\sigma^4} \sum_{i=1}^{n} (x_i - \mu)^2 \stackrel{!}{=} 0 \quad\Rightarrow\quad \hat{\sigma}^2 = \frac{1}{n} \sum_{i=1}^{n} (x_i - \hat{\mu})^2

- 11 -

Maximum likelihood estimation of the parameters \mu and \sigma^2 of the normal distribution from a sample x_1, x_2, \ldots, x_n

Result:

\hat{\mu} = \frac{1}{n} \sum_{i=1}^{n} x_i, \qquad \hat{\sigma}^2 = \frac{1}{n} \sum_{i=1}^{n} (x_i - \hat{\mu})^2

- 12 -

Example: maximum likelihood estimation of the parameter \lambda of the exponential distribution from a sample x_1, x_2, \ldots, x_n

Density of the exponential distribution:

p(x; \lambda) = \begin{cases} \lambda e^{-\lambda x} & \text{if } x \ge 0 \\ 0 & \text{otherwise} \end{cases}

Likelihood function (assume all x_i \ge 0):

L(\lambda) = \prod_{i=1}^{n} \lambda e^{-\lambda x_i} = \lambda^n e^{-\lambda \sum_{i=1}^{n} x_i}

Log likelihood function:

\ell(\lambda) = n \ln \lambda - \lambda \sum_{i=1}^{n} x_i

- 13 -

Log likelihood function:

\ell(\lambda) = n \ln \lambda - \lambda \sum_{i=1}^{n} x_i

Derivative wrt. \lambda:

\frac{d\ell}{d\lambda} = \frac{n}{\lambda} - \sum_{i=1}^{n} x_i \stackrel{!}{=} 0 \quad\Rightarrow\quad \hat{\lambda} = \frac{n}{\sum_{i=1}^{n} x_i}

The estimate is the inverse of the empirical mean.
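A quick sanity check in Python (illustration only; the simulated data and the chosen true rate are made up): drawing many exponentially distributed values and applying \hat{\lambda} = n / \sum x_i recovers the rate used for the simulation.

```python
import random

# Sketch: the ML estimate for the exponential rate is the inverse of
# the empirical mean; it recovers the rate of simulated data.

random.seed(0)
true_lambda = 2.0
sample = [random.expovariate(true_lambda) for _ in range(100_000)]

lambda_hat = len(sample) / sum(sample)   # n / sum(x_i) = 1 / empirical mean
```

With 100,000 samples, `lambda_hat` lands close to `true_lambda`; the residual deviation shrinks as the sample grows.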

- 14 -

Example: maximum likelihood estimation of the parameters a, b of the uniform distribution from a sample x_1, x_2, \ldots, x_n

Density of the uniform distribution:

p(x; a, b) = \begin{cases} \frac{1}{b-a} & \text{if } a \le x \le b \\ 0 & \text{otherwise} \end{cases}

Likelihood function:

L(a, b) = \begin{cases} \frac{1}{(b-a)^n} & \text{if } a \le \min_i x_i \text{ and } b \ge \max_i x_i \\ 0 & \text{otherwise} \end{cases}

L(a, b) is maximum if the covering interval length b - a is minimum, i.e.

\hat{a} = \min_i x_i, \qquad \hat{b} = \max_i x_i
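The case distinction is easy to verify in Python (an illustration, not part of the slides), again using the toy sample 2, 5, 3: the tightest covering interval wins, any wider interval shrinks the likelihood, and a non-covering interval has likelihood zero.

```python
# Sketch: ML estimation for the uniform distribution on [a, b].

def likelihood(sample, a, b):
    """L(a, b) = 1/(b-a)^n if [a, b] covers all samples, else 0."""
    n = len(sample)
    if a <= min(sample) and max(sample) <= b and a < b:
        return 1.0 / (b - a) ** n
    return 0.0

sample = [2.0, 5.0, 3.0]
a_hat, b_hat = min(sample), max(sample)   # tightest covering interval

L_tight = likelihood(sample, a_hat, b_hat)        # 1/3^3
L_wide = likelihood(sample, a_hat - 1.0, b_hat + 1.0)   # wider -> smaller
L_short = likelihood(sample, a_hat + 0.5, b_hat)        # not covering -> 0
```

Note that here the maximum lies on the boundary of the feasible region, so it cannot be found by setting derivatives to zero; this is why the uniform case is argued directly from the shape of L(a, b).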

- 15 -

Example: maximum likelihood estimation of the parameter p of a Bernoulli experiment from a sample x_1, x_2, \ldots, x_n

Probability distribution (discrete distribution!):

P(X = x; p) = p^x (1-p)^{1-x}, \qquad x \in \{0, 1\}

Likelihood function:

L(p) = \prod_{i=1}^{n} p^{x_i} (1-p)^{1-x_i} = p^k (1-p)^{n-k} \quad \text{with } k = \sum_{i=1}^{n} x_i

Log likelihood function:

\ell(p) = k \ln p + (n - k) \ln(1 - p)

- 16 -

Log likelihood function:

\ell(p) = k \ln p + (n - k) \ln(1 - p) \quad \text{with } k = \sum_{i=1}^{n} x_i

Derivative wrt. p:

\frac{d\ell}{dp} = \frac{k}{p} - \frac{n - k}{1 - p} \stackrel{!}{=} 0 \quad\Rightarrow\quad \hat{p} = \frac{k}{n} = \frac{1}{n} \sum_{i=1}^{n} x_i
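The Bernoulli result reduces to counting, which a one-function Python sketch makes explicit (illustration only; the 0/1 sample is made up):

```python
# Sketch: the ML estimate for the Bernoulli parameter p is the
# relative frequency of ones in the sample, p_hat = k / n.

def ml_bernoulli(sample):
    """sample is a list of 0/1 outcomes; returns p_hat = k / n."""
    n = len(sample)
    k = sum(sample)     # number of ones
    return k / n

p_hat = ml_bernoulli([1, 0, 1, 1, 0, 1, 0, 1])   # k = 5, n = 8
```

This is the discrete counterpart of the earlier results: just as \hat{\mu} is the empirical mean of real-valued observations, \hat{p} is the empirical mean of 0/1 observations.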
