
MACHINE LEARNING II

11th SEMINAR – DISCRIMINATIVE LEARNING

Exercise 1. Consider the following probability distribution for two classes and a real-valued observation x ∈ ℝ. Let the prior probabilities for the classes p(k), k = 1, 2, be given.

The conditional probability distribution is

    p(x|k) = (τ/2) · exp(−τ · |x − µ_k|),

where τ > 0 is equal for both classes. Derive the posterior probability distribution p(k|x).

Hint: Note that the conditional probability distribution is not everywhere differentiable. Hence, perform a case-by-case analysis.
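As a quick plausibility check, here is a minimal numerical sketch of the posterior obtained from Bayes' rule for these Laplace class-conditionals; the values of τ, µ_k and the priors below are illustrative assumptions, not part of the exercise.

    import numpy as np

    # Minimal sketch: evaluate p(k|x) via Bayes' rule for the Laplace
    # class-conditionals p(x|k) = tau/2 * exp(-tau*|x - mu_k|).
    tau = 1.5
    mu = np.array([-1.0, 2.0])      # mu_1, mu_2 (illustrative)
    prior = np.array([0.3, 0.7])    # p(k=1), p(k=2) (illustrative)

    def posterior(x):
        joint = prior * (tau / 2) * np.exp(-tau * np.abs(x - mu))  # p(x, k)
        return joint / joint.sum()                                 # p(k|x)

    # Between mu_1 and mu_2 the log-odds are linear in x (a sigmoid posterior);
    # outside that interval |x - mu_1| - |x - mu_2| is constant, hence so is p(k|x).
    for x in [-3.0, 0.0, 1.0, 5.0]:
        print(x, posterior(x))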

Exercise 2. Consider a quadratic classifier f : ℝ^n → {0, 1} for inputs x ∈ ℝ^n:

    f(x) = 1  if  x^T·A·x + ⟨x, b⟩ + c < 0,  and  0  otherwise,

with an n×n matrix A, a vector b ∈ ℝ^n and a constant c ∈ ℝ. Show how to learn the unknown parameters of the classifier by the Perceptron algorithm.

Hint: Transform the input space ℝ^n into an appropriately chosen space of higher dimension, in which the considered classifier is a linear one.
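A minimal sketch of this lifting, assuming the feature map collects all quadratic terms, the linear terms, and a constant; the training loop is then the plain Perceptron rule on the lifted features.

    import numpy as np

    # Sketch: lift x in R^n to phi(x) = (all products x_i*x_j, x, 1).
    # In this space x^T A x + <x, b> + c is a single scalar product,
    # so the ordinary Perceptron update applies.
    def phi(x):
        return np.concatenate([np.outer(x, x).ravel(), x, [1.0]])

    def perceptron_quadratic(X, y, epochs=100):
        # y in {0, 1}; decide class 1 iff <phi(x), w> < 0, as in f(x) above
        w = np.zeros(phi(X[0]).shape)
        for _ in range(epochs):
            errors = 0
            for x, label in zip(X, y):
                fx = phi(x)
                pred = 1 if fx @ w < 0 else 0
                if pred != label:
                    w += -fx if label == 1 else fx   # Perceptron step
                    errors += 1
            if errors == 0:                          # all constraints satisfied
                break
        return w  # first n*n entries ~ A, next n entries ~ b, last entry ~ c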

Exercise 3. Consider the "usual" Perceptron task, i.e. learning the unknown parameters of a linear classifier ⟨x, w⟩ ≷ b given a training set. In addition, let it be required that certain parameters of the classifier are positive; for instance, for a particular component i of the weight vector, w_i > 0 should hold. How can the Perceptron algorithm be modified so that it admits only those classifiers that fulfil these additional constraints?
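One common modification (a sketch, not necessarily the lecture's exact formulation) is to follow every Perceptron update with a projection of the constrained components back onto the feasible set:

    import numpy as np

    def constrained_perceptron(X, y, positive_idx, epochs=100):
        # y in {-1, +1}; an extra coordinate -1 is appended to each input
        # so that the last weight plays the role of the threshold b.
        Xb = np.hstack([X, -np.ones((len(X), 1))])
        w = np.zeros(Xb.shape[1])
        for _ in range(epochs):
            errors = 0
            for x, label in zip(Xb, y):
                if label * (x @ w) <= 0:     # misclassified (or on the boundary)
                    w += label * x           # usual Perceptron step
                    # projection: clip the constrained components at zero
                    # (or at a small eps > 0 to enforce strict positivity)
                    w[positive_idx] = np.maximum(w[positive_idx], 0.0)
                    errors += 1
            if errors == 0:
                break
        return w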

Exercise 4. Prove the correctness of the multi-class Perceptron algorithm considered in the lecture (slide 20). Show that it is a Perceptron algorithm (slide 14) in an appropriately chosen space.

Hint: Represent the constraints of the multi-class problem ⟨x^l, w_{y_l}⟩ > ⟨x^l, w_k⟩ for all l and all k ≠ y_l as scalar products ⟨x̃, w̃⟩ > 0 using suitably defined x̃ and w̃.
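A minimal sketch of this construction (often called Kesler's construction; the block layout below is one standard way to realize the hint, assumed here):

    import numpy as np

    def kesler_vector(x, y, k, K):
        # Stack the K class weight vectors into one long vector
        # w~ = (w_1, ..., w_K). For the constraint <x, w_y> > <x, w_k>,
        # place +x in block y, -x in block k, and zeros elsewhere; then
        # <x~, w~> = <x, w_y> - <x, w_k>, i.e. the constraint reads <x~, w~> > 0.
        n = len(x)
        xt = np.zeros(K * n)
        xt[y * n:(y + 1) * n] = x
        xt[k * n:(k + 1) * n] = -x
        return xt

The multi-class update w_{y_l} += x^l, w_k −= x^l on a violated pair (l, k) is then exactly the ordinary Perceptron update w̃ += x̃ on a vector with ⟨x̃, w̃⟩ ≤ 0.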


Exercise 5. Definition: A classifier family shatters a set of data points if, for every classification of these points, there exists a classifier in the family that makes no errors on that set.

Let a set of data points (x^1, x^2, …, x^L), x^l ∈ ℝ^n, be given. Give a transformation φ : ℝ^n → ℝ^d so that the corresponding family of generalized linear classifiers ⟨φ(x), w⟩ ≷ 0 shatters this training set.

Hint: Define a transformation φ : ℝ^n → ℝ^L so that in the vector φ(x) each component "is responsible" for one example of the training set.
