
Invariant Subspace Problem

Bachelor Thesis

Johannes Kaiser, Matrikelnummer 1225752

Technische Universität Wien

February 3, 2016


Contents

1 Introduction

2 Classical Results about the Existence of Invariant Subspaces
   2.1 Preliminary Notes
   2.2 The Aronszajn-Smith and Bernstein-Robinson Theorems
   2.3 Lomonosov's Theorem

3 Similarity and Quasisimilarity
   3.1 Hyperinvariant Subspaces
   3.2 Contractions Quasisimilar to Unitary Operators

4 Negative Conclusions
   4.1 Polynomial Boundedness
   4.2 Other Counterexamples


1 Introduction

In this paper we want to present a few results related to the invariant subspace problem, that is, the question whether every operator on a certain space, usually a Banach or Hilbert space, has a nontrivial invariant subspace. The problem seems to have been first stated by Beurling and von Neumann, and there are many theorems about it: some give answers under conditions on the Banach or Hilbert space, others under assumptions on the operator.

We divide the paper into two parts. In Sections 2 and 3 we present some of the best-known positive answers to this question, and in Section 4 we discuss one counterexample in detail and mention a few others.

The chapter about similarity and quasisimilarity closely follows Chapter 4 of [1], while the chapter about polynomial boundedness follows Chapter 10 of [2]. The proof of Lomonosov's theorem is extracted from Chapter 4.4 of [3]. The Bernstein-Robinson and the Aronszajn-Smith theorems are taken from [5] and [4].

2 Classical Results about the Existence of Invariant Subspaces

In this section we want to present a few important theorems which establish the existence of an invariant subspace under certain conditions. To start, we present some results which should already be known or have quite simple proofs. We continue with the classical Aronszajn-Smith theorem given in [4] and the Bernstein-Robinson theorem, extracted from [5], whose proofs share some common features. Then we introduce hyperinvariant subspaces and Lomonosov's theorem, whose proof is quite different from the proofs of the first two.

2.1 Preliminary Notes

Before we start with the first theorem, we recall and clarify the use of a few definitions.

Definition 2.1. Generally H and K denote Hilbert spaces, while B denotes a Banach space. B(H) denotes the set of all bounded operators from H into itself. Further:

(i) We call an invariant subspace S of H under an operator T a proper or nontrivial invariant subspace if {0} ≠ S ≠ H.

(ii) A closed linear subspace M of H is said to be reducing for T if M and M⊥ are invariant subspaces for T.

(iii) An operator T with ‖T‖ ≤ 1 is called a contraction.

Now we want to present a few conclusions already known from linear algebra and fundamental functional analysis, and one with a quite simple proof.

Theorem 2.2. On a finite-dimensional complex vector space of dimension greater than one, every operator has an eigenvector and hence a nontrivial invariant subspace.

Theorem 2.3. On a Hilbert space H that is not separable, every operator has a proper invariant subspace.

Proof. Fix a nonzero x ∈ H. The closed linear span of {Tⁿx : n = 0, 1, 2, ...} is separable, hence a proper subspace of H; it is nonzero and invariant under T.

We obtain another conclusion through the spectral theorem for normal operators:


Theorem 2.4. Due to the spectral theorem, every normal operator on a Hilbert space of dimension greater than one has a nontrivial invariant subspace.

Theorem 2.5. Let T and L be nonzero operators on H. If LT = 0, then ker(L) and ran(T)⁻ (the closure of ran(T)) are nontrivial invariant subspaces for T and L, respectively.

Proof. If LT = 0, then ran(T) ⊆ ker(L). Hence T(ker(L)) ⊆ T(H) = ran(T) ⊆ ker(L). Since T ≠ 0, we get ker(L) ⊇ ran(T) ≠ {0}, and on the other hand L ≠ 0, so that ker(L) ≠ H. Therefore ker(L) is a nontrivial invariant subspace for T. Dually, since T*L* = 0, L* ≠ 0 and T* ≠ 0, it follows that ker(T*) is a nontrivial invariant subspace for L*, and hence ran(T)⁻ = ker(T*)⊥ is a nontrivial invariant subspace for L. Finally, recall that ker(L) and ran(T)⁻ are trivially invariant subspaces for L and T respectively.

Corollary 2.6. Every nilpotent operator has a nontrivial invariant subspace.
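To make Theorem 2.5 concrete, here is a small numerical sketch (not part of the original thesis; the construction and all names are ad hoc illustrations). It builds two nonzero matrices L, T on a four-dimensional space with LT = 0 and checks that ker(L) is a nontrivial T-invariant subspace.

```python
import numpy as np
from scipy.linalg import null_space

rng = np.random.default_rng(0)

# Force ran(T) into span{e1, e2} and make L vanish on span{e1, e2},
# so that L @ T = 0 while both operators are nonzero.
P = np.diag([1.0, 1.0, 0.0, 0.0])                   # projection onto span{e1, e2}
T = P @ rng.standard_normal((4, 4))                 # ran(T) contained in span{e1, e2}
L = rng.standard_normal((4, 4)) @ (np.eye(4) - P)   # span{e1, e2} contained in ker(L)

assert np.allclose(L @ T, np.zeros((4, 4)))

K = null_space(L)                       # orthonormal basis of ker(L), here 2-dimensional
print(K.shape[1])                       # 0 < dim ker(L) < 4, so the subspace is nontrivial
print(np.linalg.norm(L @ (T @ K)))      # close to 0: T maps ker(L) back into ker(L)
```

The same computation, applied to a nilpotent T of index k with L = T^{k−1}, illustrates Corollary 2.6.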

We want to finish this section with another theorem, which gives us an example of a Banach space on which every continuous operator has a proper nontrivial invariant subspace. The proof can be found in [6], but it is quite long so it is omitted.

Theorem 2.7. (Argyros, Haydon) There is an indecomposable Banach space whose dual space is isomorphic to ℓ¹. Every bounded linear operator on this space is expressible as λI + K with λ a scalar and K compact. In particular, every continuous operator on this space has a nontrivial invariant subspace.

2.2 The Aronszajn-Smith and Bernstein-Robinson Theorems

We now want to present two of the most classical results in invariant subspace theory. We start with the older Aronszajn-Smith Theorem and the proof we present follows the original one, presented in [4].

Theorem 2.8 (Aronszajn, Smith). Let T be a compact operator on a Banach space B. Then there exist proper invariant subspaces of T.

In the proof of this theorem we use a map P onto an arbitrary finite-dimensional subspace C ⊆ B, defined through

‖x − Px‖ = ρ(x, C) = min_{y∈C} ‖x − y‖.

We are able to limit ourselves to a separable Banach space because of Theorem 2.3, and on a separable Banach space we can define an equivalent strictly convex norm, according to [7], Theorem 9. Because of that, we shall suppose that our norm on B is strictly convex. Therefore, and because C is finite dimensional, there exists a unique point Px ∈ C which realizes the minimal distance.

We refer to the above map P as a "metric projection", because it is quite similar to a projection, but not necessarily linear. Before we begin with the proof, we want to sum up a few general properties of the metric projection:

Lemma 2.9. If P is the metric projection onto a finite-dimensional subspace C, the following properties are fulfilled:

1. P is idempotent: P² = P.

2. P is homogeneous: P(αx) = αPx.

3. P is quasiadditive: P(y + x) = y + Px for every y ∈ C.

4. ‖Px − x‖ ≤ ‖x‖ and ‖Px‖ ≤ 2‖x‖.

5. |‖x − Px‖ − ‖y − Py‖| ≤ ‖x − y‖.

6. If C₀ ⊆ C and P₀ is the metric projection onto C₀, then ‖x − Px‖ ≤ ‖x − P₀x‖.

Proof. Properties 1, 4 and 6 are obvious from the definition, and 5 is the general Lipschitz property of the distance from x to the fixed set C. To prove 2, observe that ‖αx − αP(x)‖ = min_{y∈C} ‖αx − αy‖ = min_{z∈C} ‖αx − z‖ = ‖αx − P(αx)‖, so P(αx) = αP(x) by uniqueness of the minimizer. Property 3 follows from ‖(x + y) − (y + P(x))‖ = min_{z∈C} ‖x + y − z‖ = ‖(x + y) − P(x + y)‖ for y ∈ C, again by uniqueness.
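As a small aside (not from the thesis), the following hypothetical numerical sketch illustrates these properties for a strictly convex norm that does not come from an inner product, namely the ℓ⁴ norm on R³; the subspace C, the helper metric_projection and all parameters are ad hoc choices.

```python
import numpy as np
from scipy.optimize import minimize_scalar

# The l^4 norm is strictly convex, so the nearest point in a subspace is unique,
# but the metric projection is in general not linear.
norm4 = lambda v: np.sum(np.abs(v) ** 4) ** 0.25

c = np.array([1.0, 2.0, -1.0])          # C = span{c}, a one-dimensional subspace

def metric_projection(x):
    # minimize ||x - t*c||_4 over t (a smooth convex one-dimensional problem)
    res = minimize_scalar(lambda t: norm4(x - t * c))
    return res.x * c

x = np.array([3.0, -1.0, 2.0])
Px = metric_projection(x)

# Properties from Lemma 2.9, checked numerically:
print(np.allclose(metric_projection(Px), Px, atol=1e-6))              # 1. idempotent
print(np.allclose(metric_projection(2.5 * x), 2.5 * Px, atol=1e-5))   # 2. homogeneous
y = -0.7 * c                                                          # some y in C
print(np.allclose(metric_projection(x + y), y + Px, atol=1e-5))       # 3. quasiadditive
print(norm4(Px - x) <= norm4(x) and norm4(Px) <= 2 * norm4(x))        # 4.
```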

Now we can begin with a few constructions for the proof of the Aronszajn-Smith Theorem:

In finite dimensional spaces our theorem holds true, because of Theorem 2.2. Because of Theorem 2.3 we can restrict ourselves to infinite dimensional separable spaces and we have

(A) B is separable.

Now we are only interested in the case that

span{Tⁿf : n = 0, 1, 2, ...} is dense in B for every nonzero f ∈ B, (1)

because otherwise the closed span of {Tⁿf : n = 0, 1, 2, ...} would already be a nontrivial invariant subspace. This assumption implies the following property:

(B) Tⁿf ≠ 0 for all n, and the elements {Tⁿf : n = 0, 1, 2, ...} are linearly independent.

To prove (B), suppose that α₁T^{n₁}f + α₂T^{n₂}f + ... + α_k T^{n_k}f = 0, with 0 ≤ n₁ < n₂ < ... < n_k and αᵢ ≠ 0 for i ∈ {1, ..., k}. We obtain that

T^{n_k}f = −(1/α_k)(α₁T^{n₁}f + ... + α_{k−1}T^{n_{k−1}}f),

and hence that all Tⁿf with n ≥ n_k would lie in the subspace generated by the Tⁿf with n < n_k, which is a contradiction to (1) and the infinite dimension of B.

Now we consider a sequence of closed subspaces C_k ⊆ B. The limes inferior of the sequence is defined as

lim inf_k C_k := {x ∈ B : ∃ x_k ∈ C_k, x_k → x}.

The following two properties are easily verified:

(C) lim inf_k C_k is a closed subspace.

(D) If every C_k is finite dimensional, then x ∈ lim inf_k C_k if and only if P_k x → x, where P_k denotes the metric projection onto C_k.

With f satisfying (1) we construct the k-dimensional subspaces C^(k) = span{Tⁿf : 0 ≤ n ≤ k−1}.

We denote by P_k the metric projection onto C^(k). By (1) we obtain that lim inf_k C^(k) = B and hence that

P_k x → x for every x ∈ B. (2)


Now we consider the operator T_k on C^(k) defined by T_k x = P_k T x, x ∈ C^(k).

Now we prove that T_k is linear. Through homogeneity and quasiadditivity of the metric projection, and by writing x = Σ_{i=0}^{k−1} ξᵢ Tⁱf, we get that

T_k x = P_k T x = P_k Σ_{i=0}^{k−1} ξᵢ T^{i+1}f = Σ_{i=0}^{k−2} ξᵢ T^{i+1}f + ξ_{k−1} P_k T^k f,

and hence that T_k is linear.

Because T_k is a linear operator on a k-dimensional space, we can find a triangular matrix representing T_k. Therefore there exists an increasing sequence of subspaces

{0} = C_k^0 ⊂ C_k^1 ⊂ ... ⊂ C_k^k = C^(k), (3)

where C_k^i is an i-dimensional invariant subspace of T_k.
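The chain (3) is just the triangularization of a matrix expressed in terms of nested invariant subspaces. The following numerical sketch (not part of the thesis; it uses the complex Schur decomposition as one concrete way to triangularize, and all names are ad hoc) checks this for a random matrix.

```python
import numpy as np
from scipy.linalg import schur

rng = np.random.default_rng(1)
k = 5
Tk = rng.standard_normal((k, k)) + 1j * rng.standard_normal((k, k))

# Complex Schur form: Tk = Q U Q*, with U upper triangular and Q unitary.
U, Q = schur(Tk, output="complex")

# The spans of the first i columns of Q form an increasing chain of invariant
# subspaces, as in (3): Tk maps span(Q[:, :i]) into itself for every i.
for i in range(1, k + 1):
    V = Q[:, :i]
    residual = (np.eye(k) - V @ V.conj().T) @ (Tk @ V)
    print(i, np.linalg.norm(residual))     # each residual is close to 0
```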

Lemma 2.10. Let {k_m} and {i_m} be sequences such that k_m → ∞ and 0 ≤ i_m ≤ k_m for all m ∈ N. Further let x_m ∈ C_{k_m}^{i_m}. If T x_m → y, then y ∈ lim inf_m C_{k_m}^{i_m}.

Proof. In fact we have P_{k_m} T x_m = T_{k_m} x_m ∈ C_{k_m}^{i_m}. On the other hand, by Lemma 2.9-5, we have

|‖T x_m − P_{k_m} T x_m‖ − ‖y − P_{k_m} y‖| ≤ ‖T x_m − y‖.

With this and (2) we obtain

‖T x_m − P_{k_m} T x_m‖ ≤ ‖y − P_{k_m} y‖ + ‖T x_m − y‖ → 0,
‖y − P_{k_m} T x_m‖ ≤ ‖y − T x_m‖ + ‖T x_m − P_{k_m} T x_m‖ → 0,

which proves the lemma, because T_{k_m} x_m = P_{k_m} T x_m → y and therefore y ∈ lim inf_m C_{k_m}^{i_m}.

We continue the preparation for the proof of the Aronszajn-Smith Theorem with the following corollaries:

Corollary 2.11. For any sequences {k_m} and {i_m} satisfying the conditions of Lemma 2.10, lim inf_m C_{k_m}^{i_m} is an invariant subspace of T.

Proof. If x ∈ lim inf_m C_{k_m}^{i_m}, then by the definition of the lim inf there exist x_m ∈ C_{k_m}^{i_m} with x_m → x. By the continuity of T we obtain T x_m → T x, and by Lemma 2.10, T x ∈ lim inf_m C_{k_m}^{i_m}.

Corollary 2.12. If the lim inf of every subsequence of C_{k_m}^{i_m} is {0}, then for any bounded sequence x_m ∈ C_{k_m}^{i_m} we have T x_m → 0.

Proof. By compactness of T, the bounded sequence x_m is transformed into a relatively compact sequence T x_m. Therefore it is enough to prove that if any subsequence T x_{m_j} converges to some y, then y = 0. But this follows from our hypothesis, since by Lemma 2.10, y ∈ lim inf_j C_{k_{m_j}}^{i_{m_j}}.

Proof. (Aronszajn-Smith) Now we choose α with

0 < α < 1, ‖Tf‖ > α‖T‖‖f‖, (4)

which is possible since Tf ≠ 0 by (B). Writing P_k^i for the metric projection onto C_k^i and noting f ∈ C_k^k = C^(k), we have by Lemma 2.9-3 and 2.9-6

‖f‖ = ‖f − P_k^0 f‖ ≥ ‖f − P_k^1 f‖ ≥ ... ≥ ‖f − P_k^k f‖ = 0.


Therefore there exists a unique index i(k), 0 ≤ i(k) < k, for each k ∈ N such that

‖f − P_k^{i(k)} f‖ ≥ α‖f‖ > ‖f − P_k^{i(k)+1} f‖. (5)

Let u_k, k = 1, 2, ..., be an element of C_k^{i(k)+1} such that

‖u_k‖ = 1, P_k^{i(k)} u_k = 0. (6)

An element with this property can be obtained from an arbitrary v ∈ C_k^{i(k)+1} \ C_k^{i(k)} by setting u_k := ‖v − P_k^{i(k)} v‖^{−1}(v − P_k^{i(k)} v). The property (6) is then proved by homogeneity and quasiadditivity of P.

Since the dimensions of C_k^{i(k)+1} and C_k^{i(k)} differ by one, every element y ∈ C_k^{i(k)+1} is representable in a unique way in the form y = x + βu_k with x = P_k^{i(k)} y. Correspondingly we shall put

P_k^{i(k)+1} f = x_k + β_k u_k,  P_k^{i(k)+1} T f = x'_k + β'_k u_k,  x_k, x'_k ∈ C_k^{i(k)}. (7)

We have by Lemma 2.9-4

‖x_k‖ = ‖P_k^{i(k)} P_k^{i(k)+1} f‖ ≤ 4‖f‖,  ‖x'_k‖ ≤ 4‖Tf‖. (8)

Now we prove the following statements:

(E) For every sequence k_m → ∞, lim inf_m C_{k_m}^{i(k_m)} ≠ B.

(F) For some sequence k'_m → ∞, lim inf_m C_{k'_m}^{i(k'_m)+1} ≠ {0}.

(G) If for every sequence k_m → ∞ we have lim inf_m C_{k_m}^{i(k_m)} = {0}, then for every sequence k'_m → ∞, lim inf_m C_{k'_m}^{i(k'_m)+1} ≠ B.

If lim inf_m C_{k_m}^{i(k_m)} = B, then by (D) we would have P_{k_m}^{i(k_m)} f → f, which contradicts (5); hence (E) holds true.

If (F) were not true, we would have by Corollary 2.12 that the bounded sequence P_k^{i(k)+1} f (see Lemma 2.9-4) is transformed into a sequence T P_k^{i(k)+1} f converging to 0. Since Tf = T(f − P_k^{i(k)+1} f) + T P_k^{i(k)+1} f, we get ‖Tf‖ = lim_k ‖T(f − P_k^{i(k)+1} f)‖ ≤ lim inf_k ‖T‖ ‖f − P_k^{i(k)+1} f‖, which by (5) gives ‖Tf‖ ≤ α‖T‖‖f‖, in contradiction to (4).

Suppose that for some k'_m → ∞, lim inf_m C_{k'_m}^{i(k'_m)+1} = B. By (D) we have P_{k'_m}^{i(k'_m)+1} f → f and P_{k'_m}^{i(k'_m)+1} Tf → Tf. By (7) we have f = lim_m (x_{k'_m} + β_{k'_m} u_{k'_m}) and Tf = lim_m (x'_{k'_m} + β'_{k'_m} u_{k'_m}). Further we obtain Tf = lim_m (T x_{k'_m} + β_{k'_m} T u_{k'_m}) and T²f = lim_m (T x'_{k'_m} + β'_{k'_m} T u_{k'_m}). By (8) and Corollary 2.12 it follows that Tf = lim_m β_{k'_m} T u_{k'_m} and T²f = lim_m β'_{k'_m} T u_{k'_m}. Therefore β'_{k'_m}/β_{k'_m} converges to some γ and T²f = γ Tf, in contradiction to (B); hence (G) is proved.

Now we obtain the proof of our theorem as follows. If there is any sequence k_m → ∞ such that Λ = lim inf_m C_{k_m}^{i(k_m)} ≠ {0}, then in view of (E) and Corollary 2.11, Λ is a proper invariant subspace. If there is no such sequence, then by (F) we choose a sequence k'_m → ∞ such that Λ' = lim inf_m C_{k'_m}^{i(k'_m)+1} ≠ {0}. By (G) and Corollary 2.11, Λ' is a proper invariant subspace.

The Bernstein-Robinson theorem is an extension of the Aronszajn-Smith theorem, originally proved by using nonstandard analysis in [5]. However, we present the proof given by Halmos in [8], which still has a lot of features in common with the Aronszajn-Smith proof, but does not use nonstandard analysis.


Theorem 2.13 (Bernstein-Robinson). If A is an operator on a Hilbert space H of dimension greater than one and if p is a nonzero polynomial such that p(A) is compact, then there exists a nontrivial invariant subspace of H under A.

Before we can begin with the proof we need a short definition:

Definition 2.14. If f_n and g_n are sequences in H, we shall write f_n ∼ g_n for ‖f_n − g_n‖ → 0.

Proof. According to Theorem 2.2, we can assume the existence of a nonzero vector e such that e, Ae, A²e, ... are linearly independent and have H as their closed linear span. Otherwise the closed linear span would already be a nontrivial invariant subspace.

Through Gram-Schmidt orthogonalization we can obtain an orthonormal basis {e₁, e₂, ...} with the property that {e₁, e₂, ..., e_m} has the same linear span as {e, Ae, ..., A^{m−1}e} for m ∈ N. If a_{m,n} := (Ae_n, e_m), it follows that a_{m,n} = 0 if m > n + 1. The matrix entries of the kth power of A are given by a^{(k)}_{m,n} = (A^k e_n, e_m). By induction one sees that a^{(k)}_{m,n} = 0 if m > n + k and

a^{(k)}_{n+k,n} = ∏_{1≤j≤k} a_{n+j,n+j−1}.

Let k ≥ 1 be the degree of our given polynomial p. If the matrix entries of p(A) are given by a^{(p)}_{m,n} = (p(A)e_n, e_m), then a^{(p)}_{n+k,n} is a constant (nonzero) multiple of a^{(k)}_{n+k,n}, because a^{(l)}_{n+k,n} = 0 for l < k. Since ‖p(A)e_n‖ → 0 for n → ∞, which holds because e_n → 0 weakly and p(A) is compact, there exists an increasing sequence {k(n)}_{n∈N} of positive integers such that the corresponding subdiagonal terms a_{k(n)+1,k(n)} converge to 0 for n → ∞.

If H_n is the span of {e₁, ..., e_{k(n)}}, then {H_n}_{n∈N} is an increasing sequence of finite-dimensional subspaces of H with H as their closed span. If P_n is the projection with range H_n, then P_n →s I (I being the identity operator). Since, for each n, the operator P_n A P_n leaves H_n invariant, it follows that for each n there exists a chain of subspaces invariant under P_n A P_n,

{0} = H_n^{(0)} ⊂ H_n^{(1)} ⊂ ... ⊂ H_n^{(k(n))} = H_n,

with dim H_n^{(i)} = i, i = 0, 1, ..., k(n), a construction similar to the one in the proof of the Aronszajn-Smith Theorem.

If f_n ∈ H is a bounded sequence of vectors, we want to prove that

A P_n f_n ∼ P_n A P_n f_n. (9)

For the proof of (9), we have P_n f = Σ_{j=1}^{k(n)} (f, e_j) e_j for f ∈ H, and

A P_n f_n − P_n A P_n f_n = Σ_{j=1}^{k(n)} (f_n, e_j) Σ_{i=k(n)+1}^{∞} a_{ij} e_i.

Since the largest j is k(n) and the smallest i is k(n) + 1, and since a_{ij} = 0 if i > j + 1, it follows that ‖A P_n f_n − P_n A P_n f_n‖ ≤ ‖f_n‖ |a_{k(n)+1,k(n)}| → 0, and therefore we have proved (9). (9) can be generalized to higher exponents:

A^k P_n f_n ∼ (P_n A P_n)^k f_n, k = 1, 2, ..., (10)

which can again be proved by induction. For k = 0, (10) says that ‖P_n f_n − f_n‖ → 0, which is a stringent condition on the bounded sequence f_n. If that is satisfied, then (10) implies that

p(A) P_n f_n ∼ p(P_n A P_n) f_n. (11)


Now we return to our vector e. Since P_n e = e for every n, it follows that p(A) P_n e ∼ p(P_n A P_n) e. Since p(A)e ≠ 0, which follows because the vectors e, Ae, ... are linearly independent, we have

ε := ‖p(A) e‖ = lim_n ‖p(P_n A P_n) e‖ > 0.

Consider for each n the numbers

‖p(P_n A P_n)e − p(P_n A P_n) P_n^{(0)} e‖, ‖p(P_n A P_n)e − p(P_n A P_n) P_n^{(1)} e‖, ..., ‖p(P_n A P_n)e − p(P_n A P_n) P_n^{(k(n))} e‖,

where P_n^{(i)} is the projection with range H_n^{(i)}. Since P_n^{(0)} is the zero projection, the first of these numbers tends to ε. Since, on the other hand, P_n^{(k(n))} = P_n, the last of these numbers is always 0. In view of these facts it is possible to choose for each n, with a finite number of exceptions, a positive integer i(n), 1 ≤ i(n) ≤ k(n), such that

‖p(P_n A P_n)e − p(P_n A P_n) P_n^{(i(n)−1)} e‖ ≥ ε/2 (12)

and

‖p(P_n A P_n)e − p(P_n A P_n) P_n^{(i(n))} e‖ < ε/2. (13)

Further let i(n) be the smallest integer for which these inequalities hold true.

Since both P_n^{(i(n)−1)} and P_n^{(i(n))} are bounded sequences of operators, there exists an increasing sequence n_j of integers such that both P_{n_j}^{(i(n_j)−1)} and P_{n_j}^{(i(n_j))} are weakly convergent. To simplify notation we set Q_j := P_{n_j}^{(i(n_j)−1)} and Q_j⁺ := P_{n_j}^{(i(n_j))}. Let M := {f ∈ H : Q_j f → f} and M⁺ := {f ∈ H : Q_j⁺ f → f}.

Now we are going to prove that M and M⁺ are subspaces of H that are both invariant under A, and that at least one of them is nontrivial. To prove that M is closed, suppose that g is in the closure of M. We have to show that g ∈ M and therefore that Q_j g → g. Given a positive number δ, one finds f ∈ M so that ‖f − g‖ < δ/3 and then j₀ so that ‖Q_j f − f‖ < δ/3 for j ≥ j₀. It follows that, if j ≥ j₀, then ‖Q_j g − g‖ ≤ ‖Q_j g − Q_j f‖ + ‖Q_j f − f‖ + ‖f − g‖ < δ, since ‖Q_j‖ ≤ 1. This proves that M is closed; the proof for M⁺ is the same.

To prove that M is invariant under A, we suppose that f ∈ M, so that Q_j f → f, and infer, first, that A Q_j f → Af, because A is bounded, and second, that Q_j A Q_j f ∼ Q_j A f, because the Q_j are uniformly bounded. Then we reason as follows:

Q_j A f ∼ Q_j A Q_j f =(a) Q_j P_{n_j} A P_{n_j} Q_j f =(b) P_{n_j} A P_{n_j} Q_j f ∼(c) A P_{n_j} Q_j f = A Q_j f → Af.

(a) is valid because Q_j ≤ P_{n_j}, (b) because the range of Q_j is invariant under P_{n_j} A P_{n_j}, and (c) because of (10). This proves that M is invariant under A; the proof that M⁺ is invariant follows the same scheme.

The next step is to prove that M ≠ H. This is done by showing that e ∉ M. For this purpose observe first that the operators p(P_n A P_n) are uniformly bounded, as one can see by observing that

‖(P_n A P_n)^k‖ ≤ ‖P_n A P_n‖^k ≤ ‖A‖^k,

and by using the polynomial whose coefficients are the absolute values of the coefficients of p.

Now, because of (12) we have

ε/2 ≤ ‖p(P_{n_j} A P_{n_j})‖ ‖e − Q_j e‖.

Since ‖p(P_{n_j} A P_{n_j})‖ is bounded from above, its reciprocal is bounded away from zero, and consequently ‖e − Q_j e‖ is bounded away from zero, which makes the convergence Q_j e → e impossible.

The corresponding step for M⁺ says that M⁺ ≠ {0}, but with a quite different proof. The choice of the sequence {n_j} implies that the sequence {Q_j⁺ e} is weakly convergent. The compactness of p(A) therefore implies that the sequence {p(A) Q_j⁺ e} is strongly convergent to, say, f. In the proof that follows we have to show two things:

1. f ≠ 0

2. f ∈ M⁺

To prove 1, note that p(A) Q_j⁺ e ∼ p(P_{n_j} A P_{n_j}) Q_j⁺ e by (11), which by (13) is within ε/2 of p(P_{n_j} A P_{n_j}) e, whose norm tends to ε. It follows that ‖p(A) Q_j⁺ e‖ cannot tend to zero, and hence that f ≠ 0.

To prove 2, we have Q_j⁺ f ∼ Q_j⁺ p(A) Q_j⁺ e, since the Q_j⁺ are uniformly bounded. Then we have Q_j⁺ p(A) Q_j⁺ e ∼ Q_j⁺ p(P_{n_j} A P_{n_j}) Q_j⁺ e by (11) and uniform boundedness. Because the range of Q_j⁺ is invariant under p(P_{n_j} A P_{n_j}), we have Q_j⁺ p(P_{n_j} A P_{n_j}) Q_j⁺ e = p(P_{n_j} A P_{n_j}) Q_j⁺ e ∼ p(A) Q_j⁺ e, with the last relation holding true because of (11). At last p(A) Q_j⁺ e → f by definition, so Q_j⁺ f → f and f ∈ M⁺.

If M⁺ ≠ H all is well. It remains to be proved that if M⁺ = H then M ≠ {0}. If M⁺ = H then Q_j⁺ f → f for all f, in particular weakly. At the same time the sequence {Q_j} is known to be weakly convergent to, say, Q. The operators Q_j and Q_j⁺ are projections such that Q_j ≤ Q_j⁺ and such that Q_j⁺ − Q_j has rank 1. It follows that for each j there exists a unit vector f_j such that (Q_j⁺ − Q_j) f = (f, f_j) f_j for all f.

Observe now that Q_j e cannot tend weakly to e, for if it did, it would tend strongly to e (a property of projections), which was proved not to be the case. This implies that Qe ≠ e, or, equivalently, that (1 − Q)e ≠ 0. Now the numbers |(e, f_j)| cannot become arbitrarily small. Indeed, since |((Q_j⁺ − Q_j)e, g)| ≤ |(e, f_j)| ‖g‖ for all g, if |(e, f_j)| tended to zero along some subsequence, it would follow that ((1 − Q)e, g) = 0 for all g, so that (1 − Q)e = 0, a contradiction. So we have obtained that the numbers |(e, f_j)| are bounded away from zero, which makes it possible to prove that M ≠ {0}.

It turns out that if g ⊥ (1 − Q)e, then g ∈ M. Indeed, since (e, f_j)(f_j, g) = ((Q_j⁺ − Q_j)e, g) → ((1 − Q)e, g) = 0, it follows that (f_j, g) → 0 and hence that (f, f_j)(f_j, g) → 0 for all f. This implies that ((1 − Q)f, g) = 0 for all f, and hence that (1 − Q)g = 0. In other words Q_j g → g weakly, and therefore strongly. From this it follows that g ∈ M. Since the orthogonal complement of the nonzero vector (1 − Q)e is not {0}, we conclude that M ≠ {0}, which completes the proof.

2.3 Lomonosov’s Theorem

Lomonosov's Theorem is another classical result in invariant subspace theory, but it has a completely different proof from the Aronszajn-Smith Theorem and the Bernstein-Robinson Theorem.

The proof we present follows the one in [3], Chapter 6.

Actually Lomonosov's Theorem is split into two parts, a lemma and the actual theorem.

Before we get to these two we need to prove another theorem and provide a few definitions:


Definition 2.15. Let B(H, K) be the space of all bounded operators from H to K, with H and K being Hilbert spaces.

(i) With K(H, K) we denote all compact operators from H to K, and with K(H) all compact operators from H into itself.

(ii) Let Lat(T) denote the set of all invariant subspaces for T.

(iii) If A ⊆ B(H), then let Lat(A) := ∩{Lat(T) : T ∈ A}.

Definition 2.16. If T is a linear operator, we shall call a subspace C ⊆ H hyperinvariant (for T) if S(C) ⊆ C for all S commuting with T.

Definition 2.17. We denote the convex hull of a set S by co(S): the set of all finite convex combinations, co(S) := {Σ_{i=1}^n tᵢxᵢ : n ∈ N, xᵢ ∈ S, tᵢ ≥ 0, Σ_{i=1}^n tᵢ = 1}.

Theorem 2.18. (Mazur's Theorem) If H is a Banach space and K is a compact subset of H, then the closed convex hull of K is compact.

Proof. Obviously, it suffices to show that co(K) is totally bounded, since then its closure is complete and totally bounded, hence compact. Let ε > 0 and choose x₁, ..., x_n in K such that K ⊆ ∪_{j=1}^n B(x_j, ε/3). Put C = co{x₁, ..., x_n}; then C is obviously compact. Hence there are vectors y₁, ..., y_m in C such that C ⊆ ∪_{i=1}^m B(y_i, ε/3). If w is in the closure of co(K), there is a z in co(K) with ‖w − z‖ < ε/3. Thus z = Σ_{p=1}^l α_p k_p, where k_p ∈ K, α_p ≥ 0 and Σ_p α_p = 1. Now for each k_p there is an x_{j(p)} with ‖k_p − x_{j(p)}‖ < ε/3. Therefore

‖z − Σ_{p=1}^l α_p x_{j(p)}‖ = ‖Σ_{p=1}^l α_p (k_p − x_{j(p)})‖ ≤ Σ_{p=1}^l α_p ‖k_p − x_{j(p)}‖ < ε/3.

But Σ_{p=1}^l α_p x_{j(p)} ∈ C, so there is a y_i with ‖Σ_{p=1}^l α_p x_{j(p)} − y_i‖ < ε/3. With the triangle inequality we get ‖w − y_i‖ ≤ ‖w − z‖ + ‖z − Σ_{p=1}^l α_p x_{j(p)}‖ + ‖Σ_{p=1}^l α_p x_{j(p)} − y_i‖ < ε, which shows that the closure of co(K) is contained in ∪_{i=1}^m B(y_i, ε) and so is totally bounded.

Lemma 2.19. If A is a subalgebra of B(H) such that I ∈ A and Lat A = {{0}, H}, and if K is a nonzero compact operator on H, then there is an A ∈ A such that ker(AK − I) ≠ {0}.

Proof. It may be assumed that ‖K‖ = 1. Fix x₀ ∈ H such that ‖Kx₀‖ > 1 and put S = {x ∈ H : ‖x − x₀‖ ≤ 1}. If we had 0 ∈ S, we would get 1 < ‖Kx₀‖ ≤ ‖K‖‖x₀ − 0‖ ≤ 1, a contradiction. On the other hand, if 0 belonged to K(S)⁻, the closure of K(S), there would exist a sequence y_n ∈ S, n ∈ N, such that lim_n K(y_n) = 0, and hence an N ∈ N such that 1 < ‖Kx₀‖ − ‖K(y_N)‖. Then 1 < ‖Kx₀‖ − ‖K(y_N)‖ ≤ ‖K‖‖x₀ − y_N‖ ≤ 1, another contradiction. In conclusion we have

0 ∉ S and 0 ∉ K(S)⁻. (14)

Now if x ∈ H and x ≠ 0, the closure of {Tx : T ∈ A} is an invariant subspace for A, because A is an algebra, and it contains the nonzero vector x, because I ∈ A. By hypothesis we get that {Tx : T ∈ A} is dense in H. By (14) every y ∈ K(S)⁻ is nonzero, so there is a T in A with ‖Ty − x₀‖ < 1. Equivalently

K(S)⁻ ⊆ ∪_{T∈A} {y : ‖Ty − x₀‖ < 1}.


Because K(S)⁻ is compact, there are T₁, ..., T_n in A such that

K(S)⁻ ⊆ ∪_{j=1}^n {y : ‖T_j y − x₀‖ < 1}. (15)

For y ∈ K(S)⁻ and 1 ≤ j ≤ n, let a_j(y) = max{0, 1 − ‖T_j y − x₀‖}. By (15), Σ_{j=1}^n a_j(y) > 0 for all y ∈ K(S)⁻. Define b_j : K(S)⁻ → R by

b_j(y) = a_j(y) / Σ_{i=1}^n a_i(y),

and define Ψ : S → H by

Ψ(x) = Σ_{j=1}^n b_j(Kx) T_j Kx.

It is easy to see that a_j : K(S)⁻ → [0, 1] is a continuous function and hence b_j and Ψ are continuous too.

If x ∈ S, then Kx ∈ K(S). If b_j(Kx) > 0, then a_j(Kx) > 0 and so ‖T_j Kx − x₀‖ < 1; that is, T_j Kx ∈ S whenever b_j(Kx) > 0. Since S is a convex set and Σ_{j=1}^n b_j(Kx) = 1 for x ∈ S, Ψ(S) ⊆ S.

Note that T_j K ∈ K(H) for each j, so that ∪_{j≤n} T_j K(S) has compact closure. By Mazur's Theorem 2.18, the closed convex hull of ∪_j T_j K(S) is compact. But this convex set contains Ψ(S), so that Ψ(S)⁻ is compact. Thus Ψ is a continuous map of the closed, convex, bounded set S into a compact subset of S. By the Schauder Fixed-Point Theorem, there is a vector x₁ ∈ S such that Ψ(x₁) = x₁. Let β_j = b_j(Kx₁) and put A = Σ_{j=1}^n β_j T_j. So A ∈ A and AKx₁ = Ψ(x₁) = x₁. Since x₁ ≠ 0, because x₁ ∈ S and 0 ∉ S, we obtain ker(AK − I) ≠ {0}.

Theorem 2.20. (Lomonosov's Theorem) If H is a Banach space over C, T ∈ B(H), T is not a multiple of the identity, and TK = KT for some nonzero compact operator K, then T has a nontrivial hyperinvariant subspace.

Proof. Let A = {T}′. We want to show that Lat A ≠ {{0}, H}. If this were not the case, then Lomonosov's Lemma would imply that there is an operator A in A such that N = ker(AK − I) ≠ {0}. But N ∈ Lat(AK) and AK|_N is the identity operator. Since AK ∈ K(H), we get dim N < ∞. Since AK ∈ A = {T}′, for any x ∈ N we have AK(Tx) = TA(Kx) = Tx, hence TN ⊆ N. But dim N < ∞, so T|_N must have an eigenvalue λ. Thus M = ker(T − λ) ≠ {0}. But M ≠ H, since T is not a multiple of the identity. It is easy to check that M is hyperinvariant for T: for any S ∈ {T}′,

x ∈ ker(T − λ) ⇒ 0 = S(T − λ)x = (T − λ)Sx ⇒ Sx ∈ ker(T − λ).

3 Similarity and Quasisimilarity

In this section we want to take a look at the invariant subspace problem from a different angle. So far we have proved theorems which give us the existence of an invariant subspace under certain conditions. Now we want to look at how far the existence of an invariant subspace for one operator carries over to another operator with a certain relation to the first one. These relations are going to be similarity and quasisimilarity. We also take a closer look at hyperinvariant subspaces, which we have already introduced in connection with Lomonosov's theorem. This section closely follows Chapter 4 of [1]. We can now start with a few definitions:


Definition 3.1. (i) T ∈ B[H, K] is called quasiinvertible if it is an injective operator with dense range.

(ii) An operator T ∈ B[H] is called a quasiaffine transform of L ∈ B[K] if there exists a quasiinvertible X ∈ B[H, K] such that XT = LX.

(iii) Two operators T ∈ B[H] and L ∈ B[K] are called quasisimilar if there exist two quasiinvertible operators X ∈ B[H, K] and Y ∈ B[K, H] such that XT = LX and YL = TY.

(iv) An operator X ∈ B[H, K] intertwines T ∈ B[H] to L ∈ B[K] if XT = LX.

Lemma 3.2. Let T ∈ B[H], L ∈ B[K] and X ∈ B[H, K] be such that XT = LX. Suppose C ⊂ K is a nontrivial invariant subspace for L. If

ran(X)⁻ = K and ran(X) ∩ C ≠ {0},

then X⁻¹(C) is a nontrivial invariant subspace for T.

Proof. Let C ⊂ K be a nontrivial invariant subspace for L. Since X : H → K is linear and continuous, X⁻¹(C) is a closed subspace of H. Moreover, since X(X⁻¹(C)) ⊆ C, it follows that LX X⁻¹(C) ⊆ L(C). Hence, since L(C) ⊆ C, we have LX X⁻¹(C) ⊆ C. Because LX = XT, we get XT X⁻¹(C) ⊆ C and so X⁻¹(XT X⁻¹(C)) ⊆ X⁻¹(C). On the other hand, we get T X⁻¹(C) ⊆ X⁻¹(XT X⁻¹(C)), because A ⊆ X⁻¹(X(A)) for all sets A ⊆ H. Therefore we have

T X⁻¹(C) ⊆ X⁻¹(C),

in other words X⁻¹(C) is an invariant subspace for T. Now we verify that the assumptions on ran(X) are enough to ensure that the invariant subspace X⁻¹(C) is nontrivial. Take an arbitrary y ∈ ran(X) ∩ C, so that y = Xu ∈ C with u ∈ H. If X⁻¹(C) = {0}, then u = 0, because

X⁻¹(C) = {0} ⇔ {x ∈ H : Xx ∈ C} = {0}.

Hence we obtain that y = 0, because X is linear. We conclude that if X⁻¹(C) = {0}, then ran(X) ∩ C = {0}. Equivalently, we have

ran(X) ∩ C ≠ {0} ⇒ X⁻¹(C) ≠ {0}.

If X⁻¹(C) = H, then ran(X) = X(H) = X(X⁻¹(C)) ⊆ C, and hence ran(X)⁻ ⊆ C⁻ = C ≠ K. We conclude that X⁻¹(C) = H ⇒ ran(X)⁻ ≠ K, or equivalently ran(X)⁻ = K ⇒ X⁻¹(C) ≠ H.

If the intertwining operator X is surjective, then X⁻¹(C) is a nontrivial invariant subspace for T whenever C is a nontrivial invariant subspace for L. An even more particular case reads as follows:

Corollary 3.3. If two operators are similar and one of them has an invariant subspace then so has the other.
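As a quick sanity check of Corollary 3.3 (not part of the thesis; the matrices and names below are ad hoc assumptions), the following sketch builds L with an obvious invariant subspace C = span{e1, e2}, forms a similar operator T = X⁻¹LX, and verifies numerically that X⁻¹(C) is T-invariant.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 4

L = rng.standard_normal((n, n))
L[2:, :2] = 0.0                            # C = span{e1, e2} is invariant for L

X = rng.standard_normal((n, n))            # generically invertible
T = np.linalg.solve(X, L @ X)              # T = X^{-1} L X, so X T = L X

W = np.linalg.solve(X, np.eye(n)[:, :2])   # basis of X^{-1}(C) = span{X^{-1}e1, X^{-1}e2}
Q, _ = np.linalg.qr(W)
residual = (np.eye(n) - Q @ Q.T) @ (T @ Q)
print(np.linalg.norm(residual))            # close to 0: X^{-1}(C) is invariant under T
```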

Proposition 3.4. Let K be a Hilbert space, let M be a finite-dimensional subspace of K, and let R be a linear manifold in K. If R⁻ = K, then

(R ∩ M⊥)⁻ = M⊥.


Proof. The result holds trivially if K is a finite-dimensional Hilbert space, for in such a case R⁻ = K implies R = K and therefore R ∩ M⊥ = M⊥. It also holds trivially if dim(M) = 0, because then we have M = {0} and therefore M⊥ = K, which gives us (R ∩ M⊥)⁻ = R⁻ = K = M⊥. So from now on we can assume that K is infinite-dimensional and that m := dim(M) ≥ 1.

First we shall verify that the result holds true for m = 1. If m = 1, then M = span{e} for some {0} ≠ e ∈ K. Since R is dense in K there exists x ∈ R such that

(e, x) ≠ 0,

because otherwise, if (e, x) = 0 for all x ∈ R, we would have e ⊥ R⁻ = K and hence e = 0, a contradiction. Now take an arbitrary z ∈ M⊥. Since R⁻ = K and z ∈ K, there exists a sequence {z_j ∈ R : j ≥ 1} such that z_j → z as j → ∞. For each j ≥ 1 set

y_j = z_j − ((e, z_j)/(e, x)) x,

and note that y_j ∈ R for every j ≥ 1, because z_j, x ∈ R and R is a linear manifold. Further we have y_j ∈ M⊥ for every j ≥ 1, because (e, y_j) = (e, z_j) − ((e, z_j)/(e, x))(e, x) = 0, so that y_j ⊥ e. Moreover y_j → z as j → ∞, since (e, z) = 0, because z ∈ M⊥ = {e}⊥. Therefore for every z ∈ M⊥ there exists an (R ∩ M⊥)-valued sequence converging to z. Hence R ∩ M⊥ is dense in M⊥ and we conclude that the result holds for m = 1.

Now suppose it holds for some m ≥ 1, that is, suppose

(R ∩ M⊥)⁻ = M⊥

for any m-dimensional subspace M of K. Take an arbitrary (m + 1)-dimensional subspace of K, say N. Let {e_l : 0 ≤ l ≤ m} be an orthonormal basis for N, so that

N = ⊕_{l=0}^m [e_l].

Take an arbitrary integer k ∈ {0, ..., m} and set

M_k = ⊕_{l=0, l≠k}^m [e_l],

so that (R ∩ M_k⊥)⁻ = M_k⊥, since dim M_k = m. Note that there exists x_k ∈ R ∩ M_k⊥ such that

(e_k, x_k) ≠ 0,

because, if (e_k, x_k) = 0 for all x_k ∈ R ∩ M_k⊥, then e_k ∈ (R ∩ M_k⊥)⊥ = ((R ∩ M_k⊥)⁻)⊥ = (M_k⊥)⊥ = M_k, which contradicts the fact that 0 ≠ e_k ⊥ M_k. Take an arbitrary z ∈ N⊥ = (⊕_{l=0}^m [e_l])⊥. Since R⁻ = K and z ∈ K, there exists a sequence {z_j ∈ R : j ≥ 1} such that z_j → z as j → ∞. For each j ≥ 1 set

y_j = z_j − Σ_{k=0}^m ((e_k, z_j)/(e_k, x_k)) x_k

and note that y_j ∈ R for every j ≥ 1, because z_j, x_k ∈ R and R is a linear manifold in K. Further y_j ∈ N⊥ for every j ≥ 1: since x_k ∈ M_k⊥ = (⊕_{l=0, l≠k}^m [e_l])⊥ ⊆ [e_n]⊥ for every n ≠ k, n ∈ {0, ..., m}, it follows that (e_l, x_k) = 0 for every l ≠ k, l ∈ {0, ..., m}. Hence

(e_l, y_j) = (e_l, z_j − Σ_{k=0}^m ((e_k, z_j)/(e_k, x_k)) x_k) = (e_l, z_j) − Σ_{k=0}^m ((e_k, z_j)/(e_k, x_k))(e_l, x_k) = 0

for every j ≥ 1 and every l ∈ {0, ..., m}. Therefore y_j ∈ ∩_{l=0}^m [e_l]⊥ = (⊕_{l=0}^m [e_l])⊥ = N⊥ for every j ≥ 1. Moreover y_j → z as j → ∞, since (e_k, z) = 0 for every k, because z ∈ N⊥ = ∩_{l=0}^m [e_l]⊥. Thus for every z ∈ N⊥ there exists an (R ∩ N⊥)-valued sequence converging to z. Hence R ∩ N⊥ is dense in N⊥ and we conclude that the result holds for m + 1 whenever it holds for m, which ends the proof by induction.

This proposition gives us the following two corollaries:

Corollary 3.5. Take T ∈ B[H], L ∈ B[K] and X ∈ B[H, K] such that XT = LX. Let M ⊂ K be a nontrivial finite-dimensional reducing subspace for L. If ran(X)⁻ = K, then X⁻¹(M⊥) is a nontrivial invariant subspace for T.

Proof. This is an immediate conclusion from Proposition 3.4 and Lemma 3.2, applied with C = M⊥, which is a nontrivial invariant subspace for L because M is reducing and nontrivial.

Corollary 3.6. If an operator T is a quasiaffine transform of another operator L that has a nontrivial finite-dimensional reducing subspace, then T has a nontrivial invariant subspace.

3.1 Hyperinvariant Subspaces

Definition 3.7. The commutant {T}′ of T ∈ B[H] is the set of all operators in B[H] that commute with T, that is,

{T}′ := {U ∈ B[H] : UT = TU}.

Further let T_x := {y ∈ H : y = Ux for some U ∈ {T}′}.

Proposition 3.8. For each x ∈ H, the closure T_x⁻ of T_x is a subspace of H which is hyperinvariant for T.

Proof. Take any x ∈ H and consider the set T_x ⊆ H. If y₁, y₂ ∈ T_x, then there exist U₁, U₂ ∈ {T}′ such that y₁ = U₁x and y₂ = U₂x. Therefore we have y₁ + y₂ = (U₁ + U₂)x ∈ T_x, because obviously U₁, U₂ ∈ {T}′ ⇒ U₁ + U₂ ∈ {T}′. Moreover αy ∈ T_x for every α ∈ C and every y ∈ T_x, trivially. Therefore T_x is a linear manifold in H. Now take U ∈ {T}′ arbitrary. If y ∈ T_x, then y = U₀x for some U₀ ∈ {T}′, so that Uy = UU₀x ∈ T_x, for UU₀ is obviously in {T}′. Thus U(T_x) ⊆ T_x, and hence U(T_x⁻) ⊆ T_x⁻, because U is continuous.

Because the closure of a linear manifold is a subspace, we conclude that T_x⁻ is an invariant subspace for every U ∈ {T}′, or equivalently, T_x⁻ is a hyperinvariant subspace for T.
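To see Proposition 3.8 in action in finite dimensions, here is a hypothetical numerical sketch (not from the thesis). The commutant {T}′ is computed as the null space of U ↦ UT − TU via vectorization, and T_x is then checked to be a nontrivial T-invariant subspace; the block-diagonal T and the vector x are ad hoc choices.

```python
import numpy as np
from scipy.linalg import null_space

# T block diagonal with disjoint spectra, so every U in {T}' is block diagonal too.
T = np.zeros((4, 4))
T[:2, :2] = np.array([[1.0, 1.0], [0.0, 1.0]])    # Jordan block, eigenvalue 1
T[2:, 2:] = np.diag([2.0, 3.0])
x = np.array([1.0, 0.0, 0.0, 0.0])

# Commutant {T}' = null space of U -> UT - TU; with column-major vec,
# vec(UT - TU) = (T^t kron I - I kron T) vec(U).
M = np.kron(T.T, np.eye(4)) - np.kron(np.eye(4), T)
basis = null_space(M)                               # columns are vec(U), U in {T}'

# T_x = span{ U x : U in {T}' }  (closed automatically in finite dimensions)
Tx = np.column_stack([b.reshape(4, 4, order="F") @ x for b in basis.T])

# Orthonormal basis of T_x via SVD (robust to rank deficiency).
Usvd, s, _ = np.linalg.svd(Tx)
r = int(np.sum(s > 1e-10))
Q = Usvd[:, :r]

residual = (np.eye(4) - Q @ Q.T) @ (T @ Q)
print(r, np.linalg.norm(residual))   # here r = 1 (nontrivial) and residual close to 0
```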

Lemma 3.9. Let T ∈ B[H], L ∈ B[K], X ∈ B[H, K] and Y ∈ B[K, H] be such that XT = LX and YL = TY. Suppose C is a nontrivial hyperinvariant subspace of L. If

ran(X)⁻ = K and ker(Y) ∩ C = {0},

then Y(C) ≠ {0} and for each nonzero x ∈ Y(C), T_x⁻ is a nontrivial hyperinvariant subspace for T.


Proof. According to Proposition 3.8 it is enough to verify that, under the above hypotheses, {0} ≠ T_x⁻ ≠ H for every 0 ≠ x ∈ Y(C) ≠ {0}. First note that XUY ∈ {L}′ for every U ∈ {T}′. Indeed, if YL = TY, TU = UT and XT = LX, then

(XUY)L = XUTY = XTUY = L(XUY).

Since C is hyperinvariant for L, it follows that C is invariant for XUY whenever U ∈ {T}′. Now take x ∈ Y(C) arbitrary, so that x = Yu for some u ∈ C ⊂ K. If y ∈ T_x, then y = Ux = UYu for some U ∈ {T}′ and hence Xy = XUYu. But u ∈ C and C is invariant for XUY. Thus Xy ∈ C. Therefore X(T_x) ⊆ C, so that, since X is continuous, we have

X(T_x⁻) ⊆ C⁻ = C.

If T_x⁻ = H, then ran(X) = X(H) = X(T_x⁻) ⊆ C ≠ K, so that ran(X)⁻ ⊆ C ≠ K. We conclude that

ran(X)⁻ = K ⇒ T_x⁻ ≠ H for all x ∈ Y(C).

Finally, if Y(C) = {0}, then obviously C ⊆ ker(Y) and hence C ∩ ker(Y) = C ≠ {0}. Therefore

ker(Y) ∩ C = {0} ⇒ Y(C) ≠ {0}.

Hence T_x⁻ ≠ {0} for every nonzero x ∈ Y(C), because T_x⁻ = {0} if and only if x = 0.

In particular, if XT = LX and YL = TY with ran(X)⁻ = K and ker(Y) = {0}, then there exists x ∈ Y(C) such that T_x⁻ is a nontrivial hyperinvariant subspace for T, whenever C is a nontrivial hyperinvariant subspace for L. An even more particular case reads:

Corollary 3.10. If two operators are quasisimilar and one of them has a nontrivial hyperinvariant subspace, then so has the other.

3.2 Contractions Quasisimilar to Unitary Operators

Throughout this section T shall be a contraction on a Hilbert space H. Now we classify these contractions, but first we need a short definition:

Definition 3.11. We call a contraction strongly stable if {Tⁿ : n ∈ N} converges strongly to the zero operator, in symbols Tⁿ →s 0.

Let C0· be the class of all strongly stable contractions and let C·0 be the class of all contractions whose adjoint is strongly stable. Let C1· and C·1 be the classes of contractions such that Tⁿx ↛ 0 and T*ⁿx ↛ 0, respectively, for every nonzero x ∈ H. All combinations are possible and lead to the classes C00, C10, C01 and C11. If T is a contraction, the sequence {T*ⁿTⁿ : n ∈ N} is a bounded monotone sequence of self-adjoint operators, so that it converges strongly. We denote its strong limit by A, while {TⁿT*ⁿ : n ∈ N} has a strong limit A_*.

Proposition 3.12. Let T*ⁿTⁿ →s A. Then A has the following properties:

(i) A is nonnegative and ‖A‖ ≤ 1.

(ii) ‖Tⁿx‖ → ‖A^{1/2}x‖ as n → ∞ for all x ∈ H.

(iii) ker(A) = {x ∈ H : Tⁿx → 0}.

(iv) T*ⁿATⁿ = A for every n ≥ 1.


Proof. A is nonnegative because it is the strong limit of a sequence of nonnegative operators. A is a contraction, because it is the strong limit of a sequence of contractions; indeed ‖Ax‖ = lim_n ‖T*ⁿTⁿx‖ ≤ ‖x‖ for all x ∈ H, and we have proven (i). We obtain (ii) by observing

‖Tⁿx‖² = (T*ⁿTⁿx, x) → (Ax, x) = ‖A^{1/2}x‖².

Further, (ii) implies (iii), because ker(A) = ker(A^{1/2}) for every nonnegative operator A. Note that T*^{k+n}T^{k+n} = T*ⁿ(T*ᵏTᵏ)Tⁿ for every k, n ≥ 1, and also that T*^{k+n}T^{k+n} →s A and T*ⁿ(T*ᵏTᵏ)Tⁿ → T*ⁿATⁿ as k → ∞, for every n ≥ 1. Thus the identity in (iv) follows by uniqueness of the strong limit.

With Proposition 3.12 it follows that T ∈ C0· if and only if A = 0, and T ∈ C1· if and only if ker(A) = {0}; analogously, T ∈ C·0 if and only if A_* = 0, and T ∈ C·1 if and only if ker(A_*) = {0}. Therefore

T ∈ C00 ⇔ A = A_* = 0,
T ∈ C01 ⇔ A = 0 and ker(A_*) = {0},
T ∈ C10 ⇔ A_* = 0 and ker(A) = {0},
T ∈ C11 ⇔ ker(A) = ker(A_*) = {0}.
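As an aside (not from the thesis), the classes C00 and C11 can be observed numerically in finite dimensions by approximating the strong limits A and A_* with a large power; the genuinely mixed classes C01 and C10 require infinite-dimensional examples such as the unilateral shift. The matrices and the helper approx_limits below are ad hoc assumptions.

```python
import numpy as np

def approx_limits(T, n=2000):
    # Approximate A = lim T*^n T^n and A_* = lim T^n T*^n by a single large power.
    Tn = np.linalg.matrix_power(T, n)
    return Tn.conj().T @ Tn, Tn @ Tn.conj().T

# A strict contraction (spectral radius < 1): both limits vanish, so J is of class C00.
J = 0.9 * np.eye(3) + np.diag([0.3, 0.3], k=1)
J = J / np.linalg.norm(J, 2)                       # rescale so that the norm is at most 1
A, Astar = approx_limits(J)
print(np.linalg.norm(A), np.linalg.norm(Astar))    # both close to 0

# A unitary operator: A = A_* = I, so U is of class C11.
t = 0.7
U = np.array([[np.cos(t), -np.sin(t)], [np.sin(t), np.cos(t)]])
A, Astar = approx_limits(U)
print(np.allclose(A, np.eye(2)), np.allclose(Astar, np.eye(2)))   # True True
```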

These properties enable us to show a few theorems about which contractions possess nontrivial invariant subspaces. To accomplish this we start with the following proposition:

Proposition 3.13. If a contraction is quasisimilar to a unitary operator, then it is of class C11.

Proof. If T ∈ B[H] is a contraction and U ∈ B[K] is a unitary operator such that XT = UX and YU = TY for a pair of quasiinvertible operators X ∈ B[H, K] and Y ∈ B[K, H], then XTⁿ = UⁿX and Y*T*ⁿ = U*ⁿY* for every n ≥ 1. Therefore, if x ∈ H is such that lim_n Tⁿx = 0, then lim_n UⁿXx = 0. Since U is unitary, ‖UⁿXx‖ = ‖Xx‖, hence Xx = 0, and since X is injective, x = 0. That is, by Proposition 3.12, ker(A) = {0}.

If one repeats this argument with U*, T* and Y* in the roles of U, T and X (note that Y* is injective because Y has dense range), one obtains that ker(A_*) = {0}. Thus, if a contraction T is quasisimilar to a unitary operator, then ker(A) = ker(A_*) = {0}, or equivalently T ∈ C11.

In the following theorem we want to show that the converse of that proposition also holds. For this we need to construct an isometry V associated to T such that the equality

V A^{1/2} = A^{1/2} T

holds. Recall that ker(A) = ker(A^{1/2}), ran(A) ⊆ ran(A^{1/2}), and ran(A)⁻ = ran(A^{1/2})⁻ = ker(A^{1/2})⊥ = ker(A)⊥. Now we consider the decomposition H = ker(A) ⊕ ker(A)⊥ = ker(A) ⊕ ran(A)⁻ and define a map

V : ran(A)⁻ → ran(A)⁻

as follows. Take an arbitrary y ∈ ran(A), so that there exists an x_y ∈ ran(A^{1/2}) with y = A^{1/2} x_y. Note that x_y is unique, for A is injective when acting on ran(A)⁻ and so is A^{1/2}. Set V₀ y = A^{1/2} T x_y, which defines a transformation V₀ : ran(A) → ran(A^{1/2}) ⊆ ran(A)⁻ that is clearly linear, for A^{1/2} and T are linear. Extend it to ran(A)⁻ to get the linear transformation V : ran(A)⁻ → ran(A)⁻.
