Wissenschaftliches Rechnen II/Scientific Computing II

Academic year: 2021


Summer semester 2016, Prof. Dr. Jochen Garcke, Dipl.-Math. Sebastian Mayer

Exercise sheet 10. To be handed in on Thursday, 30.06.2016.

Principal Component Analysis

1 Group exercises

G 1. Let $\Sigma \in \mathbb{R}^{d \times d}$ be a symmetric matrix with eigenvalues $\lambda_1 \geq \dots \geq \lambda_d \geq 0$.

a) Let $u_i \in \mathbb{R}^d$ be the eigenvector corresponding to eigenvalue $\lambda_i$. Show that $\langle u_i, u_j \rangle = 0$ if $\lambda_i \neq \lambda_j$.

b) Show that

$$\max_{\|w\|_2 = 1} w^T \Sigma w = \lambda_1, \qquad \min_{\|w\|_2 = 1} w^T \Sigma w = \lambda_d.$$
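The statement in G1 b) can be checked numerically before proving it. The following is a minimal sketch for the special case $d = 2$, where unit vectors can be scanned as $w = (\cos t, \sin t)$ and the eigenvalues have a closed form. The function names and the example matrix are assumptions made for illustration; this is a sanity check, not a proof.

```python
import math

def rayleigh_extremes_2d(a, b, c, steps=10_000):
    """Scan unit vectors w = (cos t, sin t) and return (max, min) of w^T S w
    for the symmetric matrix S = [[a, b], [b, c]]."""
    values = []
    for k in range(steps):
        # w and -w give the same value, so scanning [0, pi) is enough
        t = math.pi * k / steps
        w0, w1 = math.cos(t), math.sin(t)
        values.append(a * w0 * w0 + 2.0 * b * w0 * w1 + c * w1 * w1)
    return max(values), min(values)

def eigenvalues_2d(a, b, c):
    """Closed-form eigenvalues (largest first) of [[a, b], [b, c]]."""
    mean = 0.5 * (a + c)
    radius = math.hypot(0.5 * (a - c), b)
    return mean + radius, mean - radius

# Hypothetical example matrix (symmetric, positive semidefinite)
a, b, c = 2.0, 1.0, 2.0
q_max, q_min = rayleigh_extremes_2d(a, b, c)
lam_1, lam_2 = eigenvalues_2d(a, b, c)
print(q_max, lam_1)  # the two numbers should agree up to the scan resolution
print(q_min, lam_2)
```

The scan only approximates the extrema on a grid, so agreement is expected up to the grid resolution rather than exactly.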

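G 2. Let $\lambda_1, \dots, \lambda_d \in \mathbb{R}$ such that $\lambda_1 \geq \lambda_2 \geq \dots \geq \lambda_d \geq 0$. Further let $\alpha_1, \dots, \alpha_d \in [0, 1]$ such that $\sum_{i=1}^{d} \alpha_i = p$, where $p \in \mathbb{N}$ and $p \leq d$. Show that

$$\sum_{i=1}^{d} \lambda_i \alpha_i \leq \sum_{i=1}^{p} \lambda_i.$$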
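A quick randomized experiment can make the inequality in G2 plausible. The sketch below draws admissible weights by averaging 0/1 indicator vectors of random $p$-element subsets, which yields entries in $[0, 1]$ that sum to $p$. The helper names, the sample sizes, and the way the weights are generated are assumptions for illustration only.

```python
import random

def weighted_sum_gap(lams, alphas, p):
    """Return sum_{i<=p} lam_i - sum_i lam_i * alpha_i (claimed to be >= 0)."""
    return sum(lams[:p]) - sum(l * a for l, a in zip(lams, alphas))

def random_weights(d, p, rng, mixes=4):
    """Average `mixes` indicator vectors of random p-subsets of {0, ..., d-1}.
    Every entry lies in [0, 1] and the entries sum to p."""
    alphas = [0.0] * d
    for _ in range(mixes):
        for i in rng.sample(range(d), p):
            alphas[i] += 1.0 / mixes
    return alphas

rng = random.Random(0)
d, p = 6, 3
# Hypothetical eigenvalues, sorted in decreasing order
lams = sorted((rng.uniform(0.0, 10.0) for _ in range(d)), reverse=True)

gaps = [weighted_sum_gap(lams, random_weights(d, p, rng), p) for _ in range(1000)]
min_gap = min(gaps)
print(min_gap)  # never negative if the inequality holds

# Equality case: all weight on the p largest values
tight_gap = weighted_sum_gap(lams, [1.0] * p + [0.0] * (d - p), p)
```

The weights generated this way do not cover every admissible choice of $\alpha$, so the experiment illustrates the claim but does not replace the proof.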

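G 3. Let $y \in \mathbb{R}^d$ be a random variable with $\mathbb{E}[y] = 0$ and $\mathbb{E}[yy^T] = \Sigma_y$. Show that the minimization problem

$$\min_{W \in \mathbb{R}^{d \times p} :\, W^T W = I_p} \mathbb{E}\left[\|y - W W^T y\|_2^2\right]$$

is equivalent to the maximization problem

$$\max_{W \in \mathbb{R}^{d \times p} :\, W^T W = I_p} \operatorname{tr}(W^T \Sigma_y W).$$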
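The link between the two objectives in G3 can be observed on sample data. The sketch below is restricted to $p = 1$ (so $W$ is a single unit vector $w$) and replaces expectations by empirical averages over a hypothetical centered sample; both restrictions are assumptions made for illustration. It compares the mean reconstruction error with $\operatorname{tr}(S) - w^T S w$, where $S$ is the empirical second-moment matrix.

```python
import math
import random

rng = random.Random(1)
d, n = 3, 500

# Hypothetical sample, centered so that the empirical mean is zero
raw = [[rng.gauss(0.0, s) for s in (3.0, 1.0, 0.5)] for _ in range(n)]
mean = [sum(col) / n for col in zip(*raw)]
ys = [[v - m for v, m in zip(row, mean)] for row in raw]

# Empirical second-moment matrix S = (1/n) * sum_k y_k y_k^T
S = [[sum(y[i] * y[j] for y in ys) / n for j in range(d)] for i in range(d)]
trace_S = sum(S[i][i] for i in range(d))

def unit_vector(rng, d):
    """Random direction on the unit sphere in R^d."""
    v = [rng.gauss(0.0, 1.0) for _ in range(d)]
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def reconstruction_error(w, ys):
    """Sample mean of ||y - w w^T y||_2^2."""
    total = 0.0
    for y in ys:
        coeff = sum(wi * yi for wi, yi in zip(w, y))
        total += sum((yi - coeff * wi) ** 2 for wi, yi in zip(w, y))
    return total / len(ys)

def captured_variance(w, S):
    """w^T S w, i.e. tr(W^T S W) for p = 1."""
    m = len(w)
    return sum(w[i] * S[i][j] * w[j] for i in range(m) for j in range(m))

w = unit_vector(rng, d)
lhs = reconstruction_error(w, ys)
rhs = trace_S - captured_variance(w, S)
print(lhs, rhs)  # the two values should agree up to rounding
```

Since $\operatorname{tr}(S)$ does not depend on $w$, the experiment suggests why minimizing the first quantity and maximizing the second pick out the same $w$; the general case $p \geq 1$ is what the exercise asks to prove.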

G 4. Let $y_1, \dots, y_n$ be some data points in $\mathbb{R}^d$. Assume that you cannot access the $y_i$ directly, but only the matrix of squared Euclidean distances $D = (\|y_i - y_j\|_2^2)_{i,j=1}^{n}$. Compute from $D$ the centered Gram matrix $G_c = (\langle y_i - \mu, y_j - \mu \rangle)_{i,j=1}^{n}$, where $\mu = n^{-1} \sum_{i=1}^{n} y_i$.
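One standard way to recover $G_c$ is double centering, $G_c = -\tfrac{1}{2} J D J$ with $J = I - n^{-1} \mathbf{1}\mathbf{1}^T$, as used in classical multidimensional scaling. The sheet does not state which route is intended, so the sketch below is only one possible approach; the function name and the four example points are assumptions. It writes the formula entry-wise and compares the result with the Gram matrix computed directly from the points.

```python
def centered_gram_from_sq_dists(D):
    """Double centering: G_c[i][j] = -1/2 * (D[i][j] - r_i - r_j + m),
    where r_i is the mean of row i and m is the mean of all entries.
    D is symmetric, so column means equal row means."""
    n = len(D)
    row_mean = [sum(row) / n for row in D]
    total_mean = sum(row_mean) / n
    return [
        [-0.5 * (D[i][j] - row_mean[i] - row_mean[j] + total_mean) for j in range(n)]
        for i in range(n)
    ]

# Hypothetical data points, used only to build D and to verify the result
points = [[0.0, 0.0], [2.0, 0.0], [0.0, 2.0], [2.0, 2.0]]
D = [[sum((a - b) ** 2 for a, b in zip(p, q)) for q in points] for p in points]

G_c = centered_gram_from_sq_dists(D)

# Reference: Gram matrix of the explicitly centered points
mu = [sum(col) / len(points) for col in zip(*points)]
centered = [[a - m for a, m in zip(p, mu)] for p in points]
G_direct = [[sum(a * b for a, b in zip(p, q)) for q in centered] for p in centered]

max_diff = max(
    abs(G_c[i][j] - G_direct[i][j])
    for i in range(len(points))
    for j in range(len(points))
)
print(max_diff)  # should be zero up to rounding
```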

2 Homework

H 1. (Principal components)

Let $y \in \mathbb{R}^d$ be a random variable with $\mathbb{E}[y] = 0$ and $\Sigma_y = \mathbb{E}[yy^T]$. Assume that $\operatorname{rank}(\Sigma_y) \geq p$ and denote by $\lambda_1 \geq \lambda_2 \geq \dots \geq \lambda_p > 0$ the $p$ largest eigenvalues of $\Sigma_y$. Prove Theorem 2.3 presented in the lecture, i.e., show that the first $p$ principal components of $y$ are given by

$$x_i = w_i^T y,$$

where $\{w_i\}_{i=1}^{p}$ are the orthonormal eigenvectors of $\Sigma_y$ associated with the eigenvalues $\lambda_1, \dots, \lambda_p$.

Hint: Use G1 b) to determine the $p$ principal components step by step.

(5 points)

H 2. (PCA: Greedy vs. global minimization)

a) A greedy algorithm is an algorithm that tries to solve an optimization problem by making a locally optimal choice at each stage. A solution obtained by a greedy algorithm is called a greedy solution. Argue that the principal components, which you determined in H1, form a greedy solution for the maximization problem considered in G3.

b) Let $\Sigma \in \mathbb{R}^{d \times d}$ be a symmetric matrix with eigenvalue decomposition $\Sigma = U \Lambda U^T$. In the lecture it has been stated that

$$\max_{W \in \mathbb{R}^{d \times p},\, W^T W = I_p} \operatorname{tr}(W^T \Sigma W)$$

is attained for $W = U I_{d \times p}$, where $I_{d \times p} = (\delta_{ij})_{i=1,\dots,d,\ j=1,\dots,p}$. Show that this is indeed true. Hint: Use G2.

c) Use b) to conclude that the principal components from H1 give the optimal solution for the optimization problem considered in G3.

Remark: This fact is remarkable since in general, a greedy algorithm will not find the optimal solution of an optimization problem.

(5 points)

H 3. (PCA: optimal rank-$p$ approximation)

Let $A \in \mathbb{R}^{n \times d}$ with singular value decomposition $A = U \Gamma V^T$. Let $A_p = U \Gamma_p V^T$, where $\Gamma_p$ denotes the matrix obtained from $\Gamma$ by setting to zero its diagonal elements after the $p$th entry.

a) Show that

$$\|A - A_p\|_F^2 = \sum_{i=p+1}^{\min\{n,d\}} \sigma_i^2,$$

where $\sigma_i$ denotes the $i$th singular value of $A$.

b) Show that the solution of

$$\min_{B \in \mathbb{R}^{n \times d},\, \operatorname{rank}(B) = p} \|A - B\|_F^2$$

is given by $B = A_p$.

(5 points)

H 4. (Programming exercise: oddities of high-dimensional data)

In this programming exercise you will experiment with artificial high-dimensional data to observe that distances behave very counterintuitively in high dimensions. See the accompanying notebook for the details.

(5 points)
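The notebook's actual tasks are not reproduced on this sheet. As one illustration of the kind of effect H4 refers to, the sketch below measures how the spread of distances from a reference point shrinks relative to their size as the dimension grows. The choice of uniform data on $[0, 1]^d$, the sample size, the dimensions, and the function name are all assumptions.

```python
import math
import random

def relative_contrast(d, n, rng):
    """Draw a reference point and n further points uniformly from [0, 1]^d
    and return (max - min) / min of the Euclidean distances to the reference."""
    ref = [rng.random() for _ in range(d)]
    dists = []
    for _ in range(n):
        x = [rng.random() for _ in range(d)]
        dists.append(math.dist(ref, x))
    return (max(dists) - min(dists)) / min(dists)

rng = random.Random(0)
contrasts = {d: relative_contrast(d, 200, rng) for d in (2, 10, 100, 1000)}
for d, value in contrasts.items():
    print(d, value)
```

In low dimensions the nearest point is much closer than the farthest one, so the ratio is large. In high dimensions all points end up at nearly the same distance, which is one of the counterintuitive behaviours of distances the exercise alludes to.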
