Eigen Problems and the Singular Value Decomposition

Any matrix $X \in \mathbb{R}^{n \times p}$ defines a linear map $\mathbb{R}^p \to \mathbb{R}^n$ via $w \mapsto Xw$. The singular value decomposition of $X$ is a representation of this map in terms of orthonormal basis vectors for both $\mathbb{R}^p$ and $\mathbb{R}^n$ such that the map defined by $X$ is as simple as possible.

Proposition A.7 (Singular Value Decomposition). For any matrix $X \in \mathbb{R}^{n \times p}$, there is an orthonormal basis $u_1, \ldots, u_p$ of $\mathbb{R}^p$ and a set of orthonormal vectors $v_1, \ldots, v_p \in \mathbb{R}^n$ such that

\[ X u_i = \sigma_i v_i, \qquad \sigma_i \geq 0. \]

The quantities $\sigma_i$ are called the singular values of $X$ and are numbered in decreasing order. In matrix notation, we have

\[ X = V \Sigma U^t \tag{A.8} \]

with $V^t V = I_p$ and $U^t U = I_p$.

It follows immediately that the rank of $X$ equals the number of nonzero singular values (counted with multiplicities). We note that the columns of $V$ that belong to nonzero singular values form a basis of the column space of $X$, and the corresponding columns of $U$ form a basis of the row space of $X$. We can extend the vectors $v_i$ to an orthonormal basis of $\mathbb{R}^n$.
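As a small numerical illustration of Proposition A.7, the following sketch checks the decomposition (A.8) with NumPy. It is not part of the original text: the random test matrix, the sizes $n = 6$, $p = 3$ (so that $n \geq p$) and the tolerance used to count nonzero singular values are assumptions made for the example. Note that `np.linalg.svd` returns the factors in the order left vectors, singular values, transposed right vectors, so its left factor plays the role of $V$ here.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 6, 3                      # illustrative sizes with n >= p
X = rng.standard_normal((n, p))

# NumPy returns X = left @ diag(sigma) @ right_t.
# In the notation of (A.8): V = left (n x p), U = right_t.T (p x p).
V, sigma, Ut = np.linalg.svd(X, full_matrices=False)
U = Ut.T

reconstruction_ok = np.allclose(X, V @ np.diag(sigma) @ U.T)  # X = V Sigma U^t
pairs_ok = np.allclose(X @ U, V * sigma)                      # X u_i = sigma_i v_i
rank_from_sigma = int(np.sum(sigma > 1e-10))                  # count of nonzero sigma_i
```

The expression `V * sigma` scales the $i$-th column of $V$ by $\sigma_i$, so `pairs_ok` checks all relations $X u_i = \sigma_i v_i$ at once.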

Definition A.8. A vector $u \in \mathbb{R}^p \setminus \{0\}$ is called an eigenvector of a square matrix $A \in \mathbb{R}^{p \times p}$ if there is a scalar $\lambda \in \mathbb{R}$ such that $Au = \lambda u$. We call $\lambda$ an eigenvalue of $A$.

The eigendecomposition of $A$ is a representation of the form

\[ A = U \Lambda U^{-1}. \]

For some matrices, there is no eigendecomposition. If, however, $A$ is symmetric, we have an orthogonal eigendecomposition

\[ A = U \Lambda U^t, \qquad U^t U = I_p. \]

The eigenvectors of a symmetric matrix can be computed with the help of the so-called power method.

Algorithm A.9 (Power method). For a symmetric matrix $A$ and an initial vector $b_0$, the power method computes iteratively

\[ \tilde{b}_{k+1} = A b_k \qquad \text{(matrix multiplication)} \]

\[ b_{k+1} = \frac{1}{\|\tilde{b}_{k+1}\|} \tilde{b}_{k+1} \qquad \text{(normalization)} \]

The power algorithm converges to the eigenvector $u$ whose corresponding eigenvalue has the greatest absolute value, provided that this eigenvalue is dominant (in absolute terms) and that the starting vector $b_0$ is not orthogonal to the eigenvector $u$.
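The two steps of Algorithm A.9 can be sketched in a few lines of plain Python. This is only a minimal illustration: the function name `power_method`, the fixed number of iterations, and the $2 \times 2$ test matrix (with eigenvalues 3 and 1) are assumptions made for the example, and the Rayleigh quotient at the end is added merely to read off the dominant eigenvalue.

```python
import math

def power_method(A, b0, num_iter=200):
    """Power iteration for a symmetric matrix A given as a list of rows."""
    b = list(b0)
    for _ in range(num_iter):
        # matrix multiplication: b_tilde = A b
        b_tilde = [sum(a * x for a, x in zip(row, b)) for row in A]
        # normalization: b = b_tilde / ||b_tilde||
        norm = math.sqrt(sum(x * x for x in b_tilde))
        b = [x / norm for x in b_tilde]
    # Rayleigh quotient b^t A b as an estimate of the dominant eigenvalue
    Ab = [sum(a * x for a, x in zip(row, b)) for row in A]
    eigenvalue = sum(x * y for x, y in zip(b, Ab))
    return b, eigenvalue

A = [[2.0, 1.0], [1.0, 2.0]]      # symmetric, eigenvalues 3 and 1
b, lam = power_method(A, [1.0, 0.0])
```

The starting vector $(1, 0)^t$ is not orthogonal to the dominant eigenvector $(1, 1)^t / \sqrt{2}$, so both conditions stated in Algorithm A.9 hold for this example.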

A.4 Projections

Let us consider a general Hilbert space $\mathcal{V}$. For a subspace $\mathcal{U}$ and any vector $v \in \mathcal{V}$, we define the following optimization problem:

\[ \arg\min \|v - u\|, \qquad \text{subject to } u \in \mathcal{U}. \]

As we assume that $\mathcal{V}$ is a Hilbert space, the solution exists if $\mathcal{U}$ is a closed subspace. We call the unique solution the (orthogonal) projection of $v$ onto $\mathcal{U}$ and denote it by $P_{\mathcal{U}} v$.

If $\mathcal{U}$ is finite-dimensional, we can give a short representation of the projection operator. Denote by $U = (u_1, \ldots, u_k)$ any set of vectors that generate the subspace $\mathcal{U}$. For any other set $V = (v_1, \ldots, v_l)$ of vectors, we define the $k \times l$ matrix

\[ \langle U, V \rangle = \left( \langle u_i, v_j \rangle \right). \]

Furthermore, we define the (symbolic) multiplication of $U$ with a vector $\alpha \in \mathbb{R}^k$ as

\[ U \alpha = \sum_{i=1}^{k} \alpha_i u_i. \]

The projection map is then

\[ P_{\mathcal{U}} v = U \left( \langle U, U \rangle \right)^{-} \langle U, v \rangle. \tag{A.9} \]
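To make formula (A.9) concrete, the following sketch evaluates it in $\mathbb{R}^3$ with the standard inner product. The generating vectors and the vector $v$ are illustrative assumptions. The generators are deliberately linearly dependent, so the Gram matrix $\langle U, U \rangle$ is singular and we take the generalized inverse in (A.9) to be the Moore-Penrose inverse, computed here with `np.linalg.pinv`.

```python
import numpy as np

# Columns of U generate the subspace span(e_1, e_2); the third column is the
# sum of the first two, so the Gram matrix <U, U> = U^t U is singular.
U = np.array([[1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0],
              [0.0, 0.0, 0.0]])
v = np.array([2.0, -1.0, 5.0])

gram = U.T @ U                              # <U, U>
coeffs = np.linalg.pinv(gram) @ (U.T @ v)   # (<U, U>)^- <U, v>
projection = U @ coeffs                     # U alpha = sum_i alpha_i u_i
residual = v - projection                   # should be orthogonal to every u_i
```

Since the subspace consists of all vectors whose third coordinate vanishes, the projection simply sets the third coordinate of $v$ to zero.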

We now list some properties of projection operators.

Proposition A.10. Denote by $P_{\mathcal{U}}$ the projection onto the subspace $\mathcal{U}$.

1. $P_{\mathcal{U}}$ is a symmetric map.

2. The projection operator is idempotent, $P_{\mathcal{U}}^2 \equiv P_{\mathcal{U}}$.

3. If the space $\mathcal{U}^{\perp}$ that is orthogonal to $\mathcal{U}$ is a closed subspace, then $(\mathrm{Id}_{\mathcal{V}} - P_{\mathcal{U}})$ is the projection onto that space.

4. If $\mathcal{V}$ is finite-dimensional and $P_{\mathcal{U}}$ can be represented by a matrix $P$, then $\operatorname{trace}(P) = \dim \mathcal{U}$.
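The four properties of Proposition A.10 can be checked numerically in the finite-dimensional case. The sketch below is an illustration under stated assumptions: the ambient space is $\mathbb{R}^5$ with the standard inner product, the subspace is generated by two random vectors (so its dimension is 2 with probability one), and the projection matrix is built as in (A.9).

```python
import numpy as np

rng = np.random.default_rng(1)
U = rng.standard_normal((5, 2))          # two generators of a subspace of R^5
P = U @ np.linalg.pinv(U.T @ U) @ U.T    # matrix of the projection, cf. (A.9)
Q = np.eye(5) - P                        # Id - P, candidate projection onto the complement

symmetric = np.allclose(P, P.T)                                   # property 1
idempotent = np.allclose(P @ P, P)                                # property 2
complement_ok = np.allclose(Q @ U, 0.0) and np.allclose(Q @ Q, Q) # property 3
trace_P = np.trace(P)                                             # property 4: equals dim U
```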

In Chapter 4, we need the first derivative of a projection operator. We now present this result. Let us assume that both vectors $v = v(y),\ z = z(y) \in \mathbb{R}^n$ depend on a vector $y$. The projection of $z$ onto $v$ is defined as (see (A.9))

\[ P_v z = v \left( v^t v \right)^{-1} v^t z. \]

For any function $f$ that depends on $y$, we use $df = df(y)$ as a shortcut. Using Proposition A.5, we have

\begin{align*}
d(P_v z) &= d\left( v (v^t v)^{-1} v^t z \right) \\
&= (dv)(v^t v)^{-1} v^t z + v\, d\left( (v^t v)^{-1} \right) v^t z + v (v^t v)^{-1}\, d\left( v^t z \right) \\
&= (dv)(v^t v)^{-1} v^t z - v (v^t v)^{-1}\, d\left( v^t v \right) (v^t v)^{-1} v^t z + v (v^t v)^{-1} \left( (dv^t) z + v^t dz \right) \\
&= (dv)(v^t v)^{-1} v^t z - v (v^t v)^{-1} (dv^t)\, v (v^t v)^{-1} v^t z \\
&\quad - v (v^t v)^{-1} v^t\, dv\, (v^t v)^{-1} v^t z + v (v^t v)^{-1} (dv^t) z + v (v^t v)^{-1} v^t dz \\
&= (dv)(v^t v)^{-1} v^t z - v (v^t v)^{-1} (dv^t) P_v z - P_v\, dv\, (v^t v)^{-1} v^t z + v (v^t v)^{-1} (dv^t) z + P_v dz \\
&= (dv)(v^t v)^{-1} v^t z - v (v^t v)^{-1} (dv)^t P_v z - P_v\, dv\, (v^t v)^{-1} v^t z + v (v^t v)^{-1} (dv)^t z + P_v dz.
\end{align*}

Using (A.2), this is equivalent to the following. For all $h \in \mathbb{R}^n$,

\begin{align*}
(d(P_v z))\, h &= ((dv) h)(v^t v)^{-1} v^t z - v (v^t v)^{-1} ((dv) h)^t P_v z \\
&\quad - P_v ((dv) h)(v^t v)^{-1} v^t z + v (v^t v)^{-1} ((dv) h)^t z + P_v (dz) h.
\end{align*}

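This can be further simplified by factoring out the expression $(v^t v)^{-1}$ and rearranging some terms. We obtain

\begin{align*}
(d(P_v z))\, h &= \frac{1}{v^t v} \left[ v^t z\, ((dv) h) - v z^t P_v^t ((dv) h) - v^t z\, P_v ((dv) h) + v z^t ((dv) h) \right] + P_v (dz) h \\
&= \frac{1}{v^t v} \left[ v^t z - v z^t P_v - v^t z\, P_v + v z^t \right] ((dv) h) + P_v (dz) h.
\end{align*}

Here the scalar $v^t z$ acts as a multiple of the identity, and the second line uses the symmetry $P_v^t = P_v$.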

Finally, we use the definition of the first derivative A.1 and obtain the following result.

Proposition A.11. The first derivative of the projection operator is

\[ \frac{\partial P_v z}{\partial y} = \frac{1}{v^t v} \left[ v z^t (I - P_v) + v^t z\, (I - P_v) \right] \frac{\partial v}{\partial y} + P_v \frac{\partial z}{\partial y}. \]
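Proposition A.11 can be checked numerically by comparing the formula with central finite differences. The following sketch is only a sanity check under assumptions that are not part of the text: $v(y)$ and $z(y)$ are taken to be affine maps of $y \in \mathbb{R}^3$ with hypothetical Jacobians `Jv` and `Jz`, and the point $y$ and the step size `eps` are chosen arbitrarily.

```python
import numpy as np

# Hypothetical affine maps y -> v(y), z(y); their Jacobians are Jv and Jz.
v0 = np.array([1.0, 2.0, 0.0, -1.0])
z0 = np.array([0.5, -1.0, 2.0, 1.0])
Jv = np.array([[1.0, 0.0, 2.0], [0.0, 1.0, 0.0], [1.0, 1.0, 0.0], [0.0, 2.0, 1.0]])
Jz = np.array([[0.0, 1.0, 0.0], [1.0, 0.0, 1.0], [2.0, 0.0, 0.0], [0.0, 1.0, 1.0]])

def projection(y):
    """P_v z = v (v^t v)^{-1} v^t z for v = v(y), z = z(y)."""
    v, z = v0 + Jv @ y, z0 + Jz @ y
    return v * (v @ z) / (v @ v)

y = np.array([0.3, -0.2, 0.1])
v, z = v0 + Jv @ y, z0 + Jz @ y
P = np.outer(v, v) / (v @ v)             # matrix of P_v
I = np.eye(4)

# Right-hand side of Proposition A.11 with dv/dy = Jv and dz/dy = Jz
analytic = (np.outer(v, z) @ (I - P) + (v @ z) * (I - P)) @ Jv / (v @ v) + P @ Jz

# Central finite differences, one column per coordinate of y
eps = 1e-6
numeric = np.column_stack([
    (projection(y + eps * e) - projection(y - eps * e)) / (2 * eps)
    for e in np.eye(3)
])
max_error = np.max(np.abs(analytic - numeric))
```

A small value of `max_error` indicates that the analytic Jacobian and the finite-difference approximation agree at the chosen point.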

A.5 The Moore-Penrose Inverse

The contents of this section can be found e.g. in Kockelkorn (2000). If a matrix $A$ is not square or is not of full rank, we have to find a suitable surrogate for its inverse. In this work, we use the Moore-Penrose inverse.

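Proposition A.12 (Moore-Penrose Inverse). For any matrix $A \in \mathbb{R}^{p \times l}$, there is a unique matrix $A^{-} \in \mathbb{R}^{l \times p}$ such that

\[ A = A A^{-} A, \qquad A^{-} = A^{-} A A^{-}, \qquad \left( A A^{-} \right)^t = A A^{-}, \qquad \left( A^{-} A \right)^t = A^{-} A. \]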

Proposition A.13. If $A$ is a symmetric matrix with eigendecomposition $A = U \Lambda U^t$, the Moore-Penrose inverse of $A$ is defined in the following way. Set

\[ \Lambda^{-}_{ij} = \begin{cases} 0 & i \neq j, \\ \dfrac{1}{\lambda_i} & i = j \text{ and } \lambda_i \neq 0, \\ 0 & i = j \text{ and } \lambda_i = 0. \end{cases} \]

Then $A^{-} = U \Lambda^{-} U^t$.

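Proof. It follows readily from the definition of $\Lambda^{-}$ that

\[ \Lambda \Lambda^{-} = \Lambda^{-} \Lambda = \operatorname{diag}( \underbrace{1, \ldots, 1}_{\operatorname{rk}(A)\text{ times}}, 0, \ldots, 0 ). \]

This implies that $\Lambda^{-}$ is indeed the Moore-Penrose inverse of $\Lambda$, as the properties in Proposition A.12 are fulfilled. It follows that

\[ A U \Lambda^{-} U^t A = U \Lambda \Lambda^{-} \Lambda U^t = U \Lambda U^t = A \]

and

\[ U \Lambda^{-} U^t A U \Lambda^{-} U^t = U \Lambda^{-} \Lambda \Lambda^{-} U^t = U \Lambda^{-} U^t. \]

Finally, we remark that the matrix

\[ A U \Lambda^{-} U^t = U \Lambda^{-} U^t A = U \operatorname{diag}(1, \ldots, 1, 0, \ldots, 0)\, U^t \]

is symmetric.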
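The construction of Proposition A.13 translates directly into a few lines of NumPy. The sketch below is illustrative only: the rank-deficient test matrix (with eigenvalues 0, 1 and 6) and the numerical threshold `tol` that decides when an eigenvalue counts as zero are assumptions of the example, since in floating-point arithmetic an exact test $\lambda_i = 0$ is not reliable.

```python
import numpy as np

# Symmetric 3 x 3 matrix of rank 2, built as B B^t
B = np.array([[1.0, 2.0], [0.0, 1.0], [1.0, 0.0]])
A = B @ B.T

lam, U = np.linalg.eigh(A)               # A = U diag(lam) U^t with U^t U = I
tol = 1e-10 * np.max(np.abs(lam))        # assumed threshold for "lambda_i = 0"
nonzero = np.abs(lam) > tol

lam_inv = np.zeros_like(lam)             # diagonal of Lambda^-
lam_inv[nonzero] = 1.0 / lam[nonzero]
A_pinv = U @ np.diag(lam_inv) @ U.T      # A^- = U Lambda^- U^t
```

The four conditions of Proposition A.12 can then be verified for `A_pinv`, for instance with `np.allclose(A @ A_pinv @ A, A)`.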

Proposition A.14. The system of linear equations

\[ Ax = b \]

has a solution if and only if $x^{*} = A^{-} b$ is a solution. Any solution of these linear equations has the form

\[ x = x^{*} + \left( I - A^{-} A \right) v \]

for any vector $v$. The two components of $x$ are orthogonal.
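A small underdetermined system illustrates Proposition A.14. The matrix $A$, the right-hand side $b$ and the vector $v$ below are arbitrary choices made for this sketch; the Moore-Penrose inverse is computed with `np.linalg.pinv`.

```python
import numpy as np

A = np.array([[1.0, 1.0, 0.0],
              [0.0, 0.0, 1.0]])          # 2 x 3, so solutions are not unique
b = np.array([2.0, 3.0])

A_pinv = np.linalg.pinv(A)
x_star = A_pinv @ b                      # particular solution x* = A^- b
v = np.array([1.0, -4.0, 0.5])           # arbitrary vector
null_part = (np.eye(3) - A_pinv @ A) @ v # component (I - A^- A) v
x = x_star + null_part                   # another solution of Ax = b

inner_product = x_star @ null_part       # orthogonality of the two components
```

Both `x_star` and `x` solve the system, and `inner_product` vanishes up to rounding, in line with the orthogonality statement of the proposition.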

Results of the Simulation Study

We display the results of the simulation study that is described in Section 7.3. The following tables show the MSE-RATIO for $\hat{\beta}$ as well as for $\hat{y}$. In addition to the MSE-RATIO, we display the optimal number of components for each method. It is interesting to see that the two quantities are the same almost all of the time.

collinearity no no no med. med. med. high high high

stnr 1 3 7 1 3 7 1 2 7

1 0.833 0.861 0.676 0.958 1.000 0.993 1.000 0.999 1.000

2 0.980 0.976 0.975 0.995 0.938 0.864 0.847 0.965 0.866

3 1.000 0.993 1.001 0.969 0.960 0.993 0.954 0.980 0.967

4 1.000 1.001 0.999 0.988 1.000 1.002 0.997 0.993 0.992

$m^{\text{opt}}_{\text{PLS}}$ 2 5 2 2 4 3 1 2 5

$m^{\text{opt}}_{\text{TRN}}$ 2 5 2 2 4 3 1 2 5

Table B.1: MSE-RATIO of $\hat{\beta}$ for $p = 5$. The first two rows display the setting of the parameters. The rows entitled 1-4 display the MSE ratio for the respective number of components.

collinearity no no no med. med. med. high high high

stnr 1 3 7 1 3 7 1 2 7

1 0.775 0.780 0.570 0.919 1.000 0.970 1.004 0.995 0.999

2 0.978 0.972 0.9697 0.994 0.882 0.786 0.828 0.951 0.823

3 1.001 0.990 1.001 0.969 0.967 0.992 0.960 0.977 0.973

4 1.000 1.001 0.999 0.990 1.000 1.001 0.997 0.996 0.993

$m^{\text{opt}}_{\text{PLS}}$ 3 5 3 2 4 4 1 2 5

$m^{\text{opt}}_{\text{TRN}}$ 2 5 2 2 4 4 1 2 3

Table B.2: MSE-RATIO of $\hat{y}$ for $p = 5$.

collinearity no no no med. med. med. high high high

stnr 1 3 7 1 3 7 1 2 7

1 0.929 0.963 0.972 0.98 0.998 0.989 1.000 1.000 1.000

2 0.938 0.959 0.977 0.922 0.91 0.978 0.789 0.793 0.792

3 0.907 0.952 0.981 0.875 0.91 0.945 0.849 0.843 0.849

4 0.905 0.933 0.971 0.879 0.913 0.912 0.857 0.864 0.868

5 0.901 0.942 0.954 0.879 0.924 0.898 0.870 0.883 0.879

6 0.898 0.942 0.945 0.878 0.915 0.891 0.882 0.890 0.893

7 0.892 0.926 0.949 0.887 0.906 0.891 0.891 0.895 0.898

8 0.899 0.926 0.956 0.892 0.904 0.895 0.897 0.897 0.903

9 0.908 0.933 0.955 0.897 0.910 0.895 0.903 0.902 0.904

10 0.913 0.938 0.951 0.900 0.916 0.898 0.902 0.899 0.901

11 0.913 0.937 0.947 0.902 0.919 0.907 0.906 0.901 0.902

12 0.917 0.931 0.944 0.909 0.919 0.917 0.908 0.904 0.906

13 0.924 0.932 0.946 0.919 0.92 0.925 0.913 0.907 0.914

14 0.933 0.939 0.946 0.927 0.917 0.936 0.921 0.911 0.922

15 0.94 0.945 0.95 0.933 0.916 0.936 0.928 0.916 0.931

16 0.949 0.945 0.951 0.935 0.918 0.941 0.938 0.922 0.936

17 0.956 0.945 0.954 0.939 0.922 0.945 0.944 0.926 0.936

18 0.961 0.944 0.959 0.943 0.931 0.95 0.946 0.930 0.935

19 0.968 0.946 0.964 0.946 0.934 0.958 0.953 0.939 0.936

20 0.973 0.951 0.973 0.949 0.935 0.962 0.961 0.947 0.939

21 0.977 0.958 0.977 0.954 0.936 0.966 0.968 0.955 0.943

22 0.98 0.965 0.981 0.961 0.94 0.973 0.972 0.962 0.948

23 0.984 0.97 0.984 0.968 0.945 0.98 0.976 0.967 0.950

24 0.987 0.976 0.988 0.975 0.948 0.983 0.98 0.970 0.953

25 0.989 0.98 0.99 0.978 0.953 0.987 0.981 0.973 0.959

26 0.992 0.985 0.993 0.982 0.959 0.991 0.984 0.977 0.966

27 0.994 0.989 0.996 0.986 0.966 0.992 0.987 0.981 0.975

28 0.995 0.991 0.997 0.988 0.973 0.994 0.99 0.985 0.984

29 0.996 0.993 0.998 0.99 0.978 0.995 0.993 0.988 0.988

30 0.997 0.994 0.999 0.992 0.982 0.996 0.995 0.991 0.99

31 0.998 0.995 0.999 0.994 0.985 0.997 0.996 0.992 0.993

32 0.998 0.996 0.999 0.996 0.99 0.998 0.996 0.994 0.995

33 0.999 0.997 1.000 0.996 0.991 0.999 0.997 0.995 0.996

34 0.999 0.998 1.000 0.997 0.993 0.999 0.998 0.996 0.997

35 0.999 0.999 1.000 0.999 0.994 0.999 0.998 0.997 0.998

36 1.000 1.000 1.000 0.999 0.996 0.999 0.999 0.998 0.998

37 1.000 1.000 1.000 0.999 0.998 1.000 0.999 0.998 0.999

38 1.000 1.000 1.000 1.000 0.999 1.000 0.999 0.999 0.999

39 1.000 1.000 1.000 1.000 0.999 1.000 1.000 0.999 1.000

$m^{\text{opt}}_{\text{PLS}}$ 1 3 5 1 2 2 1 1 1

$m^{\text{opt}}_{\text{TRN}}$ 1 3 5 1 2 2 1 1 1

Table B.3: MSE-RATIO of $\hat{\beta}$ for $p = 40$.

collinearity no no no med. med. med. high high high

stnr 1 3 7 1 3 7 1 2 7

1 0.781 0.797 0.791 0.877 0.983 0.924 1.013 1.004 1.001

2 0.870 0.857 0.853 0.785 0.702 0.868 0.673 0.684 0.680

3 0.853 0.899 0.914 0.776 0.818 0.853 0.778 0.772 0.778

4 0.874 0.891 0.896 0.818 0.836 0.838 0.81 0.818 0.822

5 0.889 0.92 0.893 0.839 0.891 0.846 0.835 0.855 0.856

6 0.898 0.942 0.921 0.844 0.884 0.859 0.862 0.881 0.881

7 0.897 0.938 0.929 0.876 0.902 0.88 0.886 0.898 0.898

8 0.923 0.941 0.943 0.886 0.898 0.896 0.9 0.906 0.914

9 0.924 0.944 0.960 0.904 0.916 0.901 0.915 0.92 0.917

10 0.935 0.958 0.961 0.913 0.93 0.915 0.915 0.914 0.921

11 0.943 0.967 0.959 0.922 0.937 0.916 0.924 0.92 0.927

12 0.954 0.967 0.958 0.929 0.942 0.938 0.932 0.926 0.931

13 0.959 0.967 0.965 0.941 0.95 0.942 0.939 0.933 0.937

14 0.961 0.961 0.966 0.948 0.949 0.954 0.947 0.942 0.942

15 0.97 0.969 0.977 0.954 0.953 0.96 0.953 0.948 0.949

16 0.975 0.971 0.976 0.964 0.962 0.967 0.961 0.954 0.957

17 0.979 0.976 0.983 0.968 0.962 0.974 0.967 0.957 0.957

18 0.982 0.981 0.985 0.972 0.966 0.979 0.968 0.960 0.966

19 0.986 0.985 0.988 0.976 0.969 0.980 0.974 0.965 0.970

20 0.989 0.987 0.991 0.977 0.970 0.983 0.979 0.972 0.974

21 0.991 0.99 0.992 0.980 0.973 0.985 0.984 0.977 0.978

22 0.993 0.99 0.994 0.984 0.979 0.988 0.988 0.982 0.981

23 0.995 0.992 0.996 0.987 0.98 0.991 0.990 0.986 0.983

24 0.996 0.993 0.997 0.989 0.982 0.993 0.992 0.987 0.984

25 0.996 0.995 0.997 0.99 0.983 0.994 0.993 0.989 0.985

26 0.997 0.996 0.998 0.992 0.986 0.996 0.994 0.991 0.987

27 0.998 0.997 0.999 0.994 0.990 0.997 0.995 0.993 0.989

28 0.999 0.997 0.999 0.995 0.991 0.998 0.996 0.994 0.991

29 0.999 0.998 0.999 0.996 0.992 0.998 0.997 0.996 0.994

30 0.999 0.999 1.000 0.997 0.993 0.999 0.998 0.997 0.994

31 0.999 0.999 1.000 0.998 0.994 0.999 0.998 0.997 0.996

32 0.999 0.999 1.000 0.998 0.995 0.999 0.999 0.998 0.997

33 1.000 0.999 1.000 0.998 0.996 0.999 0.999 0.998 0.998

34 1.000 0.999 1.000 0.999 0.997 1.000 0.999 0.999 0.998

35 1.000 1.000 1.000 0.999 0.998 1.000 0.999 0.999 0.999

36 1.000 1.000 1.000 0.999 0.998 1.000 1.000 0.999 0.999

37 1.000 1.000 1.000 1.000 0.998 1.000 1.000 0.999 0.999

38 1.000 1.000 1.000 1.000 0.999 1.000 1.000 0.999 1.000

39 1.000 1.000 1.000 1.000 0.999 1.000 1.000 1.000 1.000

$m^{\text{opt}}_{\text{PLS}}$ 1 3 9 1 1 2 1 1 1

$m^{\text{opt}}_{\text{TRN}}$ 1 3 4 1 2 2 1 1 1

Table B.4: MSE-RATIO of $\hat{y}$ for $p = 40$.

In the document Analysis of High Dimensional Data with Partial Least Squares and Boosting (pages 156-167)
