MACHINE LEARNING II
11th SEMINAR – DISCRIMINATIVE LEARNING
Exercise 1. Consider the following probability distribution for two classes and a real-valued observation x ∈ R. Let the prior probabilities for the classes p(k), k = 1, 2, be given. The conditional probability distribution is

    p(x|k) = (τ/2) · exp(−τ · |x − µ_k|),

where τ > 0 is equal for both classes. Derive the posterior probability distribution p(k|x).
Hint: Note that the conditional probability distribution is not everywhere differentiable. Hence, perform a case-by-case analysis.
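Not part of the exercise, but the resulting posterior can be checked numerically. A minimal sketch via Bayes' rule (function and variable names are illustrative, not from the sheet):

```python
import numpy as np

def posterior(x, mu, prior, tau):
    # Bayes' rule with the Laplace class-conditionals
    # p(x|k) = tau/2 * exp(-tau*|x - mu_k|); the factor tau/2 cancels
    # in the normalisation, so only the priors and exponents remain.
    log_unnorm = np.log(prior) - tau * np.abs(x - mu)
    log_unnorm -= log_unnorm.max()  # numerical stability
    p = np.exp(log_unnorm)
    return p / p.sum()

# Between mu_1 and mu_2 the log-odds are linear in x; to the left and
# right of both means they are constant -- the case analysis from the hint.
p = posterior(x=0.0, mu=np.array([-1.0, 1.0]),
              prior=np.array([0.5, 0.5]), tau=1.0)
```

For symmetric means and equal priors the posterior at x = 0 is 1/2 for each class, which is a quick sanity check of any closed-form derivation.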
Exercise 2. Consider a quadratic classifier f : R^n → {0,1} for inputs x ∈ R^n:

    f(x) = 1 if x^T·A·x + ⟨x,b⟩ + c < 0, and 0 otherwise,

with an n×n matrix A, a vector b ∈ R^n and a constant c ∈ R. Show how to learn the unknown parameters of the classifier by the Perceptron algorithm.
Hint: Transform the input space R^n into an appropriately chosen space of higher dimension, in which the considered classifier is a linear one.
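The lifting the hint describes can be sketched as follows (a possible solution outline, not the sheet's reference solution; names are mine):

```python
import numpy as np

def quad_features(x):
    # Lift x in R^n into the higher-dimensional space where the quadratic
    # classifier becomes linear: all products x_i*x_j (carrying A), the raw
    # components (carrying b), and a constant 1 (carrying c).
    return np.concatenate([np.outer(x, x).ravel(), x, [1.0]])

def train_perceptron(X, y, epochs=200):
    # Ordinary perceptron in the lifted space. The sheet's convention is
    # f(x) = 1 iff the lifted score is negative, so class-1 points are
    # sign-flipped to obtain the usual "<v, s> > 0 for all s" form.
    v = np.zeros(quad_features(X[0]).size)
    for _ in range(epochs):
        mistakes = 0
        for x, label in zip(X, y):
            s = -quad_features(x) if label == 1 else quad_features(x)
            if v @ s <= 0:
                v += s
                mistakes += 1
        if mistakes == 0:
            break
    return v

# Toy data separable by a circle (class 1 inside, class 0 outside):
X = np.array([[0.1, 0.0], [0.0, 0.2], [2.0, 0.0], [0.0, 2.0], [-2.0, 1.0]])
y = np.array([1, 1, 0, 0, 0])
v = train_perceptron(X, y)
preds = [1 if quad_features(x) @ v < 0 else 0 for x in X]
```

The learned vector v encodes the entries of A, b and c, which can be read back off from the corresponding blocks of the feature vector.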
Exercise 3. Consider the "usual" Perceptron task, i.e. learning the unknown parameters of a linear classifier ⟨x,w⟩ ≷ b from a given training set. In addition, let it be required that certain parameters of the classifier are positive. For instance, w_{i*} > 0 should hold for a particular component i* of the weight vector w. How can the Perceptron algorithm be modified so that it allows only those classifiers that fulfil these additional constraints?
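One candidate modification (a sketch of one possible approach, not necessarily the intended solution): the constraint w_{i*} > 0 is itself a linear inequality of Perceptron form, ⟨w, e_{i*}⟩ > 0, so it can be treated like an additional training example.

```python
import numpy as np

def perceptron_positive(S, i_star, epochs=100):
    # S: training vectors with labels already folded into the sign, so a
    # correct w satisfies <w, s> > 0 for every s in S. The requirement
    # w[i_star] > 0 has exactly the same form, <w, e_{i_star}> > 0, so the
    # unit vector e_{i_star} is simply appended as one more "example".
    n = S[0].size
    e = np.zeros(n)
    e[i_star] = 1.0
    data = list(S) + [e]
    w = np.zeros(n)
    for _ in range(epochs):
        mistakes = 0
        for s in data:
            if w @ s <= 0:
                w += s
                mistakes += 1
        if mistakes == 0:
            break
    return w

w = perceptron_positive([np.array([1.0, -0.2])], i_star=1)
```

Note that the strict inequality w_{i*} > 0 and the Perceptron's strict correctness condition have the same form, which is why no separate projection step is needed in this sketch.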
Exercise 4. Prove the correctness of the multi-class Perceptron algorithm considered in the lecture (slide 20).
Show that it is a Perceptron algorithm (slide 14) in an appropriately chosen space.
Hint: Represent the constraints of the multi-class problem ⟨x_l, w_{y_l}⟩ > ⟨x_l, w_k⟩, for all l and all k ≠ y_l, as scalar products ⟨x̃, w̃⟩ > 0 using suitably defined x̃ and w̃.
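The construction from the hint can be illustrated in code: stacking all K weight vectors into one w̃ and placing x_l at block y_l and −x_l at block k yields exactly the familiar multi-class updates. A sketch (variable names are mine):

```python
import numpy as np

def multiclass_perceptron(X, y, K, epochs=100):
    # The constraint <x_l, w_{y_l}> > <x_l, w_k> equals <x_tilde, w_tilde> > 0,
    # where w_tilde stacks all K weight vectors and x_tilde has x_l in block
    # y_l, -x_l in block k, and zeros elsewhere. Running the plain Perceptron
    # on these x_tilde amounts to the updates below: on a violated pair,
    # w_{y_l} += x_l and w_k -= x_l.
    n = X.shape[1]
    W = np.zeros((K, n))
    for _ in range(epochs):
        mistakes = 0
        for x, yl in zip(X, y):
            for k in range(K):
                if k != yl and W[yl] @ x <= W[k] @ x:
                    W[yl] += x
                    W[k] -= x
                    mistakes += 1
        if mistakes == 0:
            break
    return W

# Three linearly separable points, one per class:
X = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]])
y = np.array([0, 1, 2])
W = multiclass_perceptron(X, y, K=3)
preds = [int(np.argmax(W @ x)) for x in X]
```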
Exercise 5. Definition: A classifier family shatters a set of data points if, for every labelling of these points, there exists a classifier in the family that makes no errors when evaluated on that set of data points.
Let a set of data points (x_1, x_2, …, x_L), x_l ∈ R^n, be given. Give a transformation φ : R^n → R^d such that the corresponding family of generalized linear classifiers ⟨φ(x), w⟩ ≷ 0 shatters this training set.
Hint: Define a transformation φ : R^n → R^L so that each component of the vector φ(x) "is responsible" for one example of the training set.