American Mathematical Monthly
Problem 11415 by Finbarr Holland, generalized
Definitions. a) A matrix is called a complex matrix if all its entries are complex numbers. (In particular, any vector in $\mathbb{C}^n$ is considered an $n\times 1$ complex matrix.)
b) For any complex matrix $A$, we denote the matrix $\overline{A}^T$ by $A^{\ast}$.
c) Let $\operatorname{U}_2(\mathbb{C})$ be the group of all unitary $2\times 2$ complex matrices. In other words, let
\[ \operatorname{U}_2(\mathbb{C}) = \left\{ U\in\operatorname{GL}_2(\mathbb{C}) \mid U^{\ast}=U^{-1} \right\}. \]
d) A complex matrix $A$ is called Hermitian if it satisfies $A^{\ast}=A$.
Problem. Let $A_1, A_2, \ldots, A_n$ be $n$ Hermitian $2\times 2$ complex matrices. Define a function $F$ from the Cartesian product $\left(\operatorname{U}_2(\mathbb{C})\right)^n$ to $\mathbb{R}$ by
\[ F(U_1, U_2, \ldots, U_n) = \det\left( \sum_{k=1}^{n} U_k^{\ast} A_k U_k \right) \]
for every $(U_1, U_2, \ldots, U_n)\in\left(\operatorname{U}_2(\mathbb{C})\right)^n$. Show that
\[ \min_{U\in\left(\operatorname{U}_2(\mathbb{C})\right)^n} F(U) = \sum_{k=1}^{n}\sigma_1(A_k) \cdot \sum_{k=1}^{n}\sigma_2(A_k), \]
where $\sigma_1(A_j)$ and $\sigma_2(A_j)$ denote the greatest and the least eigenvalue of the matrix $A_j$, respectively, for every $j\in\{1,2,\ldots,n\}$.
Solution by Darij Grinberg.
We start with an important lemma:
Lemma 1 (the Spectral Theorem for $2\times 2$ matrices). Let $A$ be a Hermitian $2\times 2$ complex matrix.
a) We have $\operatorname{Tr}A\in\mathbb{R}$, $\det A\in\mathbb{R}$ and $(\operatorname{Tr}A)^2-4\det A\in\mathbb{R}_{\geq 0}$.
b) Let us define two numbers $\lambda(A)$ and $\mu(A)$ by
\[ \lambda(A) = \frac{1}{2}\left(\operatorname{Tr}A+\sqrt{(\operatorname{Tr}A)^2-4\det A}\right); \qquad \mu(A) = \frac{1}{2}\left(\operatorname{Tr}A-\sqrt{(\operatorname{Tr}A)^2-4\det A}\right). \]
These numbers $\lambda(A)$ and $\mu(A)$ are real and satisfy $\lambda(A)\geq\mu(A)$.
c) If the matrix $A$ is positive definite, then $\lambda(A)\geq\mu(A)>0$. If the matrix $A$ is nonnegative definite, then $\lambda(A)\geq\mu(A)\geq 0$.
d) We have $\lambda(A)+\mu(A)=\operatorname{Tr}A$ and $\lambda(A)\cdot\mu(A)=\det A$. In particular, $\det A=\lambda(A)\cdot(\operatorname{Tr}A-\lambda(A))$.
e) The eigenvalues of $A$ (with algebraic multiplicities) are $\lambda(A)$ and $\mu(A)$.
f) There exists a matrix $S(A)\in\operatorname{U}_2(\mathbb{C})$ such that
\[ (S(A))^{\ast}\cdot A\cdot S(A) = \operatorname{diag}(\lambda(A),\mu(A)). \]
g) We have
\[ \lambda(A) = \max\left\{ \frac{v^{\ast}Av}{v^{\ast}v} \;\middle|\; v\in\mathbb{C}^2\setminus\{0\} \right\}; \qquad \mu(A) = \min\left\{ \frac{v^{\ast}Av}{v^{\ast}v} \;\middle|\; v\in\mathbb{C}^2\setminus\{0\} \right\}. \]
h) There exists a matrix $\widetilde{S}(A)\in\operatorname{U}_2(\mathbb{C})$ such that
\[ \widetilde{S}(A)^{\ast}\cdot A\cdot\widetilde{S}(A) = \operatorname{diag}(\lambda(A),\mu(A)) \quad\text{and}\quad \det\left(\widetilde{S}(A)\right) = 1. \]
Proof of Lemma 1. Since $A$ is a $2\times 2$ complex matrix, there exist complex numbers $a, b, c, d$ such that $A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$.
Since $A$ is Hermitian, we have $A = A^{\ast}$, so that
\[ \begin{pmatrix} a & b \\ c & d \end{pmatrix} = A = A^{\ast} = \begin{pmatrix} a & b \\ c & d \end{pmatrix}^{\ast} = \begin{pmatrix} \overline{a} & \overline{c} \\ \overline{b} & \overline{d} \end{pmatrix}, \]
and thus $a = \overline{a}$, $b = \overline{c}$, $c = \overline{b}$, $d = \overline{d}$.
Now, $a = \overline{a}$ yields $a \in \mathbb{R}$, and $d = \overline{d}$ yields $d \in \mathbb{R}$, so that $\operatorname{Tr}A = a + d \in \mathbb{R}$. Also, $a - d \in \mathbb{R}$. Besides, $\det A = \det A^{\ast} = \det\left(\overline{A}^T\right) = \det\overline{A}$ (since $\det B^T = \det B$ for any square matrix $B$) $= \overline{\det A}$ yields $\det A \in \mathbb{R}$. Thus, $(\operatorname{Tr}A)^2 - 4\det A \in \mathbb{R}$. Moreover, $\det A = ad - bc$ yields
\[ (\operatorname{Tr}A)^2 - 4\det A = (a+d)^2 - 4(ad - \overline{c}c) \quad (\text{since } b = \overline{c}) \quad = \underbrace{(a+d)^2 - 4ad}_{=(a-d)^2} + 4\underbrace{\overline{c}c}_{=|c|^2} = (a-d)^2 + 4|c|^2 \in \mathbb{R}_{\geq 0}, \qquad (1) \]
since $(a-d)^2 \in \mathbb{R}_{\geq 0}$ (because $a-d\in\mathbb{R}$ and squares of reals are $\in\mathbb{R}_{\geq 0}$) and $|c|^2 \in \mathbb{R}_{\geq 0}$.
Thus, Lemma 1 a) is proven.
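The identity (1) can be sanity-checked numerically (this is only an illustration, not part of the proof; the sample entries $a$, $d$, $c$ below are arbitrary choices):

```python
# Sample Hermitian 2x2 matrix ((a, b), (c, d)): a, d real, b = conj(c)
a, d = 3.0, -1.0
c = 2.0 + 1.0j
b = c.conjugate()

tr = a + d                         # Tr A
det = (a*d - b*c).real             # det A (real, since A is Hermitian)
discriminant = tr**2 - 4*det

# Identity (1): (Tr A)^2 - 4 det A = (a - d)^2 + 4|c|^2 >= 0
assert abs(discriminant - ((a - d)**2 + 4*abs(c)**2)) < 1e-12
assert discriminant >= 0
print(discriminant)
```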
Lemma 1 a) yields $(\operatorname{Tr}A)^2 - 4\det A \in \mathbb{R}_{\geq 0}$, and thus $\sqrt{(\operatorname{Tr}A)^2 - 4\det A} \in \mathbb{R}_{\geq 0}$, so that
\[ \lambda(A) = \frac{1}{2}\Big(\underbrace{\operatorname{Tr}A}_{\in\mathbb{R}} + \underbrace{\sqrt{(\operatorname{Tr}A)^2 - 4\det A}}_{\in\mathbb{R}_{\geq 0}\subseteq\mathbb{R}}\Big) \in \mathbb{R} \quad\text{and}\quad \mu(A) = \frac{1}{2}\Big(\operatorname{Tr}A - \sqrt{(\operatorname{Tr}A)^2 - 4\det A}\Big) \in \mathbb{R}. \]
In other words, the numbers $\lambda(A)$ and $\mu(A)$ are real. Besides, $\sqrt{(\operatorname{Tr}A)^2 - 4\det A} \in \mathbb{R}_{\geq 0}$ yields $\sqrt{(\operatorname{Tr}A)^2 - 4\det A} \geq -\sqrt{(\operatorname{Tr}A)^2 - 4\det A}$, so that
\[ \lambda(A) = \frac{1}{2}\Big(\operatorname{Tr}A + \sqrt{(\operatorname{Tr}A)^2 - 4\det A}\Big) \geq \frac{1}{2}\Big(\operatorname{Tr}A - \sqrt{(\operatorname{Tr}A)^2 - 4\det A}\Big) = \mu(A).\]
Thus, Lemma 1 b) is proven. Besides,
\[ \lambda(A) + \mu(A) = \frac{1}{2}\Big(\operatorname{Tr}A + \sqrt{(\operatorname{Tr}A)^2 - 4\det A}\Big) + \frac{1}{2}\Big(\operatorname{Tr}A - \sqrt{(\operatorname{Tr}A)^2 - 4\det A}\Big) = \operatorname{Tr}A \]
and
\[ \lambda(A)\cdot\mu(A) = \frac{1}{4}\Big(\operatorname{Tr}A + \sqrt{(\operatorname{Tr}A)^2 - 4\det A}\Big)\Big(\operatorname{Tr}A - \sqrt{(\operatorname{Tr}A)^2 - 4\det A}\Big) = \frac{1}{4}\Big((\operatorname{Tr}A)^2 - \big((\operatorname{Tr}A)^2 - 4\det A\big)\Big) = \frac{1}{4}\cdot 4\det A = \det A. \]
Thus,
\[ \det A = \lambda(A)\cdot\underbrace{\mu(A)}_{=\operatorname{Tr}A - \lambda(A),\text{ since }\lambda(A)+\mu(A)=\operatorname{Tr}A} = \lambda(A)\cdot(\operatorname{Tr}A - \lambda(A)). \]
Thus, Lemma 1 d) is proven.
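Lemma 1 d) is likewise easy to check numerically on a sample Hermitian matrix (an illustration only; the entries are arbitrary choices):

```python
import math

# Sample Hermitian 2x2 matrix ((a, b), (c, d)): a, d real, b = conj(c)
a, d = 3.0, -1.0
c = 2.0 + 1.0j
b = c.conjugate()

tr = a + d
det = (a*d - b*c).real
root = math.sqrt(tr**2 - 4*det)

lam = (tr + root) / 2      # lambda(A)
mu = (tr - root) / 2       # mu(A)

# Lemma 1 d): lambda(A) + mu(A) = Tr A and lambda(A) * mu(A) = det A
assert abs(lam + mu - tr) < 1e-12
assert abs(lam * mu - det) < 1e-12
print(lam, mu)
```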
The characteristic polynomial of the matrix $A$ is
\[ \det(XI_2 - A) = \det\begin{pmatrix} X-a & -b \\ -c & X-d \end{pmatrix} = \underbrace{(X-a)(X-d)}_{=X^2-(a+d)X+ad} - \underbrace{(-b)(-c)}_{=bc} = X^2 - \underbrace{(a+d)}_{=\operatorname{Tr}A=\lambda(A)+\mu(A)}X + \underbrace{(ad-bc)}_{=\det A=\lambda(A)\cdot\mu(A)} \]
\[ = X^2 - (\lambda(A)+\mu(A))X + \lambda(A)\cdot\mu(A) = (X-\lambda(A))(X-\mu(A)). \]
Hence, $\lambda(A)$ and $\mu(A)$ are the roots of the characteristic polynomial of the matrix $A$ (with multiplicities). But the eigenvalues of $A$ (with algebraic multiplicities) are the roots of the characteristic polynomial of the matrix $A$ (with multiplicities). Hence, the eigenvalues of $A$ (with algebraic multiplicities) are $\lambda(A)$ and $\mu(A)$. Thus, Lemma 1 e) is proven.
f) We notice that
\[ \operatorname{diag}(\alpha_1,\alpha_2)\cdot\operatorname{diag}(\beta_1,\beta_2) = \operatorname{diag}(\alpha_1\beta_1,\alpha_2\beta_2) \quad \text{for every } \alpha_1,\alpha_2,\beta_1,\beta_2 \in \mathbb{C}, \qquad (2) \]
and $\overline{\operatorname{diag}(\alpha_1,\alpha_2)} = \operatorname{diag}(\overline{\alpha_1},\overline{\alpha_2})$ for every $\alpha_1,\alpha_2 \in \mathbb{C}$, and thus
\[ (\operatorname{diag}(\alpha_1,\alpha_2))^{\ast} = \overline{\operatorname{diag}(\alpha_1,\alpha_2)}^T = \operatorname{diag}(\overline{\alpha_1},\overline{\alpha_2})^T = \operatorname{diag}(\overline{\alpha_1},\overline{\alpha_2}) \quad \text{for every } \alpha_1,\alpha_2 \in \mathbb{C}. \qquad (3) \]
Let $\rho = \sqrt{(\operatorname{Tr}A)^2 - 4\det A}$. Then, $\rho \in \mathbb{R}_{\geq 0}$ (since $(\operatorname{Tr}A)^2 - 4\det A \in \mathbb{R}_{\geq 0}$ by Lemma 1 a)). Also,
\[ \lambda(A) = \frac{1}{2}(\operatorname{Tr}A + \rho); \qquad \mu(A) = \frac{1}{2}(\operatorname{Tr}A - \rho), \]
so that
\[ \lambda(A) - \mu(A) = \frac{1}{2}(\operatorname{Tr}A + \rho) - \frac{1}{2}(\operatorname{Tr}A - \rho) = \rho. \]
We now distinguish between two cases:
Case 1: We have $\rho = 0$.
Case 2: We have $\rho \neq 0$.
First, let us consider Case 2. In this case, $\rho \neq 0$.
Lemma 1 e) yields that $\lambda(A)$ is an eigenvalue of the matrix $A$. Thus, there exists a vector $e_\lambda \in \mathbb{C}^2$ such that $e_\lambda \neq 0$ and $Ae_\lambda = \lambda(A)e_\lambda$. Since $e_\lambda \in \mathbb{C}^2$, there exist complex numbers $f_\lambda$ and $g_\lambda$ such that $e_\lambda = \begin{pmatrix} f_\lambda \\ g_\lambda \end{pmatrix}$. Then, $e_\lambda^{\ast} = \overline{e_\lambda}^T = \left(\overline{f_\lambda}, \overline{g_\lambda}\right)$, and thus $e_\lambda^{\ast}e_\lambda = \overline{f_\lambda}f_\lambda + \overline{g_\lambda}g_\lambda$. Also, $e_\lambda^{\ast}e_\lambda \in \mathbb{R}_{>0}$ (since $e_\lambda \neq 0$), so that $\sqrt{e_\lambda^{\ast}e_\lambda} \in \mathbb{R}_{>0}$.
Lemma 1 e) yields that $\mu(A)$ is an eigenvalue of the matrix $A$. Thus, there exists a vector $e_\mu \in \mathbb{C}^2$ such that $e_\mu \neq 0$ and $Ae_\mu = \mu(A)e_\mu$. Since $e_\mu \in \mathbb{C}^2$, there exist complex numbers $f_\mu$ and $g_\mu$ such that $e_\mu = \begin{pmatrix} f_\mu \\ g_\mu \end{pmatrix}$. Then, $e_\mu^{\ast} = \left(\overline{f_\mu}, \overline{g_\mu}\right)$, and thus $e_\mu^{\ast}e_\mu = \overline{f_\mu}f_\mu + \overline{g_\mu}g_\mu$. Also, $e_\mu^{\ast}e_\mu \in \mathbb{R}_{>0}$ (since $e_\mu \neq 0$), so that $\sqrt{e_\mu^{\ast}e_\mu} \in \mathbb{R}_{>0}$.
We have
\[ e_\lambda^{\ast}\underbrace{Ae_\mu}_{=\mu(A)e_\mu} = \mu(A)\cdot e_\lambda^{\ast}e_\mu \]
and
\[ e_\lambda^{\ast}\underbrace{A}_{=A^{\ast}}e_\mu = \underbrace{e_\lambda^{\ast}A^{\ast}}_{=(Ae_\lambda)^{\ast}}e_\mu = \big(\underbrace{Ae_\lambda}_{=\lambda(A)e_\lambda}\big)^{\ast}e_\mu = \underbrace{(\lambda(A)e_\lambda)^{\ast}}_{=\lambda(A)\cdot e_\lambda^{\ast},\text{ since }\lambda(A)\in\mathbb{R}}e_\mu = \lambda(A)\cdot e_\lambda^{\ast}e_\mu, \]
so that $\lambda(A)\cdot e_\lambda^{\ast}e_\mu = \mu(A)\cdot e_\lambda^{\ast}e_\mu$. Thus,
\[ 0 = \lambda(A)\cdot e_\lambda^{\ast}e_\mu - \mu(A)\cdot e_\lambda^{\ast}e_\mu = \underbrace{(\lambda(A)-\mu(A))}_{=\rho\neq 0}\cdot e_\lambda^{\ast}e_\mu, \]
so that $e_\lambda^{\ast}e_\mu = 0$. Since $e_\lambda^{\ast}e_\mu = \left(\overline{f_\lambda}, \overline{g_\lambda}\right)\begin{pmatrix} f_\mu \\ g_\mu \end{pmatrix} = \overline{f_\lambda}f_\mu + \overline{g_\lambda}g_\mu$, this becomes
\[ \overline{f_\lambda}f_\mu + \overline{g_\lambda}g_\mu = 0. \]
Thus,
\[ \overline{f_\mu}f_\lambda + \overline{g_\mu}g_\lambda = \overline{\overline{f_\lambda}f_\mu + \overline{g_\lambda}g_\mu} = \overline{0} = 0. \]
Define a matrix $W$ by $W = \begin{pmatrix} f_\lambda & f_\mu \\ g_\lambda & g_\mu \end{pmatrix}$, and define a matrix $S(A)$ by
\[ S(A) = W\cdot\operatorname{diag}\left(\frac{1}{\sqrt{e_\lambda^{\ast}e_\lambda}}, \frac{1}{\sqrt{e_\mu^{\ast}e_\mu}}\right). \qquad (4) \]
(We will see in a moment that $S(A) \in \operatorname{U}_2(\mathbb{C})$.)
Then,
\[ W^{\ast} = \overline{W}^T = \begin{pmatrix} \overline{f_\lambda} & \overline{g_\lambda} \\ \overline{f_\mu} & \overline{g_\mu} \end{pmatrix}, \]
and thus
\[ (S(A))^{\ast} = \left(\operatorname{diag}\left(\frac{1}{\sqrt{e_\lambda^{\ast}e_\lambda}}, \frac{1}{\sqrt{e_\mu^{\ast}e_\mu}}\right)\right)^{\ast}\cdot W^{\ast} = \operatorname{diag}\left(\frac{1}{\sqrt{e_\lambda^{\ast}e_\lambda}}, \frac{1}{\sqrt{e_\mu^{\ast}e_\mu}}\right)\cdot W^{\ast} \qquad (5) \]
(by (3), since $\overline{1/\sqrt{e_\lambda^{\ast}e_\lambda}} = 1/\sqrt{e_\lambda^{\ast}e_\lambda}$ and $\overline{1/\sqrt{e_\mu^{\ast}e_\mu}} = 1/\sqrt{e_\mu^{\ast}e_\mu}$, because $\sqrt{e_\lambda^{\ast}e_\lambda}$ and $\sqrt{e_\mu^{\ast}e_\mu}$ are real), and
\[ W^{\ast}\cdot W = \begin{pmatrix} \overline{f_\lambda} & \overline{g_\lambda} \\ \overline{f_\mu} & \overline{g_\mu} \end{pmatrix}\cdot\begin{pmatrix} f_\lambda & f_\mu \\ g_\lambda & g_\mu \end{pmatrix} = \begin{pmatrix} \overline{f_\lambda}f_\lambda + \overline{g_\lambda}g_\lambda & \overline{f_\lambda}f_\mu + \overline{g_\lambda}g_\mu \\ \overline{f_\mu}f_\lambda + \overline{g_\mu}g_\lambda & \overline{f_\mu}f_\mu + \overline{g_\mu}g_\mu \end{pmatrix} = \operatorname{diag}\left(e_\lambda^{\ast}e_\lambda, e_\mu^{\ast}e_\mu\right) \qquad (6) \]
(since $\overline{f_\lambda}f_\lambda + \overline{g_\lambda}g_\lambda = e_\lambda^{\ast}e_\lambda$, $\overline{f_\mu}f_\mu + \overline{g_\mu}g_\mu = e_\mu^{\ast}e_\mu$, and the off-diagonal entries vanish), so that (4) and (5) yield
\[ (S(A))^{\ast}\cdot S(A) = \operatorname{diag}\left(\frac{1}{\sqrt{e_\lambda^{\ast}e_\lambda}}, \frac{1}{\sqrt{e_\mu^{\ast}e_\mu}}\right)\cdot\underbrace{W^{\ast}\cdot W}_{=\operatorname{diag}(e_\lambda^{\ast}e_\lambda,\,e_\mu^{\ast}e_\mu)\text{ by (6)}}\cdot\operatorname{diag}\left(\frac{1}{\sqrt{e_\lambda^{\ast}e_\lambda}}, \frac{1}{\sqrt{e_\mu^{\ast}e_\mu}}\right) \]
\[ = \operatorname{diag}\left(\sqrt{e_\lambda^{\ast}e_\lambda}\cdot\frac{1}{\sqrt{e_\lambda^{\ast}e_\lambda}}, \sqrt{e_\mu^{\ast}e_\mu}\cdot\frac{1}{\sqrt{e_\mu^{\ast}e_\mu}}\right) \quad (\text{by (2), applied twice}) \quad = \operatorname{diag}(1,1) = I_2. \]
Thus, the matrix $S(A)$ is left-invertible. Hence, the matrix $S(A)$ is invertible (since every left-invertible square matrix is invertible). In other words, $S(A) \in \operatorname{GL}_2(\mathbb{C})$. Besides, $(S(A))^{\ast} = (S(A))^{-1}$ (since $(S(A))^{\ast}\cdot S(A) = I_2$). Thus, $S(A) \in \left\{U \in \operatorname{GL}_2(\mathbb{C}) \mid U^{\ast} = U^{-1}\right\} = \operatorname{U}_2(\mathbb{C})$.
On the other hand,
\[ \begin{pmatrix} af_\lambda + bg_\lambda \\ cf_\lambda + dg_\lambda \end{pmatrix} = \underbrace{\begin{pmatrix} a & b \\ c & d \end{pmatrix}}_{=A}\underbrace{\begin{pmatrix} f_\lambda \\ g_\lambda \end{pmatrix}}_{=e_\lambda} = Ae_\lambda = \lambda(A)e_\lambda = \begin{pmatrix} \lambda(A)f_\lambda \\ \lambda(A)g_\lambda \end{pmatrix} \]
yields
\[ af_\lambda + bg_\lambda = \lambda(A)f_\lambda \quad\text{and}\quad cf_\lambda + dg_\lambda = \lambda(A)g_\lambda. \qquad (7) \]
Also,
\[ \begin{pmatrix} af_\mu + bg_\mu \\ cf_\mu + dg_\mu \end{pmatrix} = Ae_\mu = \mu(A)e_\mu = \begin{pmatrix} \mu(A)f_\mu \\ \mu(A)g_\mu \end{pmatrix} \]
yields
\[ af_\mu + bg_\mu = \mu(A)f_\mu \quad\text{and}\quad cf_\mu + dg_\mu = \mu(A)g_\mu. \qquad (8) \]
Now,
\[ A\cdot W = \begin{pmatrix} a & b \\ c & d \end{pmatrix}\cdot\begin{pmatrix} f_\lambda & f_\mu \\ g_\lambda & g_\mu \end{pmatrix} = \begin{pmatrix} af_\lambda + bg_\lambda & af_\mu + bg_\mu \\ cf_\lambda + dg_\lambda & cf_\mu + dg_\mu \end{pmatrix} = \begin{pmatrix} \lambda(A)f_\lambda & \mu(A)f_\mu \\ \lambda(A)g_\lambda & \mu(A)g_\mu \end{pmatrix} \quad (\text{by (7) and (8)}) \]
and
\[ W\cdot\operatorname{diag}(\lambda(A),\mu(A)) = \begin{pmatrix} f_\lambda & f_\mu \\ g_\lambda & g_\mu \end{pmatrix}\cdot\begin{pmatrix} \lambda(A) & 0 \\ 0 & \mu(A) \end{pmatrix} = \begin{pmatrix} \lambda(A)f_\lambda & \mu(A)f_\mu \\ \lambda(A)g_\lambda & \mu(A)g_\mu \end{pmatrix} \]
yield $A\cdot W = W\cdot\operatorname{diag}(\lambda(A),\mu(A))$, so that (4) and (5) yield
\[ (S(A))^{\ast}\cdot A\cdot S(A) = \operatorname{diag}\left(\frac{1}{\sqrt{e_\lambda^{\ast}e_\lambda}}, \frac{1}{\sqrt{e_\mu^{\ast}e_\mu}}\right)\cdot W^{\ast}\cdot\underbrace{A\cdot W}_{=W\cdot\operatorname{diag}(\lambda(A),\mu(A))}\cdot\operatorname{diag}\left(\frac{1}{\sqrt{e_\lambda^{\ast}e_\lambda}}, \frac{1}{\sqrt{e_\mu^{\ast}e_\mu}}\right) \]
\[ = \operatorname{diag}\left(\frac{1}{\sqrt{e_\lambda^{\ast}e_\lambda}}, \frac{1}{\sqrt{e_\mu^{\ast}e_\mu}}\right)\cdot\underbrace{W^{\ast}\cdot W}_{=\operatorname{diag}(e_\lambda^{\ast}e_\lambda,\,e_\mu^{\ast}e_\mu)\text{ by (6)}}\cdot\operatorname{diag}(\lambda(A),\mu(A))\cdot\operatorname{diag}\left(\frac{1}{\sqrt{e_\lambda^{\ast}e_\lambda}}, \frac{1}{\sqrt{e_\mu^{\ast}e_\mu}}\right) \]
\[ = \operatorname{diag}\left(\sqrt{e_\lambda^{\ast}e_\lambda}, \sqrt{e_\mu^{\ast}e_\mu}\right)\cdot\operatorname{diag}(\lambda(A),\mu(A))\cdot\operatorname{diag}\left(\frac{1}{\sqrt{e_\lambda^{\ast}e_\lambda}}, \frac{1}{\sqrt{e_\mu^{\ast}e_\mu}}\right) \quad (\text{by (2)}) \]
\[ = \operatorname{diag}\left(\sqrt{e_\lambda^{\ast}e_\lambda}\cdot\lambda(A)\cdot\frac{1}{\sqrt{e_\lambda^{\ast}e_\lambda}}, \sqrt{e_\mu^{\ast}e_\mu}\cdot\mu(A)\cdot\frac{1}{\sqrt{e_\mu^{\ast}e_\mu}}\right) \quad (\text{by (2)}) \quad = \operatorname{diag}(\lambda(A),\mu(A)). \]
Thus, we have proven that, in Case 2, there exists a matrix $S(A) \in \operatorname{U}_2(\mathbb{C})$ such that $(S(A))^{\ast}\cdot A\cdot S(A) = \operatorname{diag}(\lambda(A),\mu(A))$. In other words, we have proven that Lemma 1 f) holds in Case 2.
Next, let us consider Case 1. In this case, $\rho = 0$, so that
\[ 0 = \rho^2 = \left(\sqrt{(\operatorname{Tr}A)^2 - 4\det A}\right)^2 = (\operatorname{Tr}A)^2 - 4\det A = \underbrace{(a-d)^2}_{\in\mathbb{R}_{\geq 0}} + 4\underbrace{\overline{c}c}_{=|c|^2} \quad (\text{by (1)}) \qquad (9) \]
\[ \geq 4|c|^2 \quad (\text{since } (a-d)^2 \geq 0, \text{ because } a-d\in\mathbb{R} \text{ and squares of reals are } \geq 0), \]
so that $0 \geq |c|^2$. But $|c|^2 \geq 0$. Thus, $|c|^2 = 0$, so that $|c| = 0$ and thus $c = 0$. Hence, $b = \overline{c} = \overline{0} = 0$. Now, (9) yields $0 = (a-d)^2 + 4\overline{c}\underbrace{c}_{=0} = (a-d)^2$, so that $a - d = 0$, that is, $a = d$. Thus,
\[ A = \begin{pmatrix} a & b \\ c & d \end{pmatrix} = \begin{pmatrix} d & 0 \\ 0 & d \end{pmatrix} \quad (\text{since } b = 0, \ c = 0 \text{ and } a = d) \quad = \operatorname{diag}(d,d), \]
so that $\operatorname{Tr}A = d + d = 2d$. Hence,
\[ \lambda(A) = \frac{1}{2}\Big(\underbrace{\operatorname{Tr}A}_{=2d} + \underbrace{\rho}_{=0}\Big) = \frac{1}{2}(2d + 0) = d; \qquad \mu(A) = \frac{1}{2}(2d - 0) = d, \]
so that $\operatorname{diag}(\lambda(A),\mu(A)) = \operatorname{diag}(d,d)$.
Now, let $S(A) = I_2$. Then, clearly, $S(A) \in \operatorname{U}_2(\mathbb{C})$ and
\[ (S(A))^{\ast}\cdot A\cdot S(A) = I_2\cdot A\cdot I_2 = A = \operatorname{diag}(d,d) = \operatorname{diag}(\lambda(A),\mu(A)). \]
Thus, we have proven that, in Case 1, there exists a matrix $S(A) \in \operatorname{U}_2(\mathbb{C})$ such that $(S(A))^{\ast}\cdot A\cdot S(A) = \operatorname{diag}(\lambda(A),\mu(A))$. In other words, we have proven that Lemma 1 f) holds in Case 1.
Altogether, we have now proven that Lemma 1 f) holds in both Cases 1 and 2. Thus, Lemma 1 f) always holds. Hence, Lemma 1 f) is proven.
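A small numerical sketch of Lemma 1 f) (an illustration only, not part of the proof): for a sample Hermitian matrix falling under Case 2, the columns of $S(A)$ are normalized eigenvectors. Here they are chosen in the particular form $(b, \lambda - a)^T$, which is an assumption of this sketch valid whenever $b \neq 0$:

```python
import math

def cstar(M):
    """Conjugate transpose of a 2x2 matrix given as ((a, b), (c, d))."""
    (a, b), (c, d) = M
    return ((a.conjugate(), c.conjugate()), (b.conjugate(), d.conjugate()))

def mul(M, N):
    """Product of two 2x2 matrices."""
    (a, b), (c, d) = M
    (e, f), (g, h) = N
    return ((a*e + b*g, a*f + b*h), (c*e + d*g, c*f + d*h))

# Sample Hermitian matrix with c != 0, so we are in Case 2 of the proof
a, d = 3.0 + 0j, -1.0 + 0j
c = 2.0 + 1.0j
b = c.conjugate()
A = ((a, b), (c, d))

tr = (a + d).real
det = (a*d - b*c).real
rho = math.sqrt(tr**2 - 4*det)
lam, mu = (tr + rho)/2, (tr - rho)/2

# Columns of S(A): normalized eigenvectors, here in the form (b, ev - a),
# which solves (A - ev*I)v = 0 whenever b != 0 (an assumption of this sketch)
cols = []
for ev in (lam, mu):
    f, g = b, ev - a
    n = math.sqrt(abs(f)**2 + abs(g)**2)
    cols.append((f/n, g/n))
S = ((cols[0][0], cols[1][0]), (cols[0][1], cols[1][1]))

D = mul(mul(cstar(S), A), S)   # should equal diag(lam, mu)
I = mul(cstar(S), S)           # should equal the identity I_2
assert abs(D[0][0] - lam) < 1e-9 and abs(D[1][1] - mu) < 1e-9
assert abs(D[0][1]) < 1e-9 and abs(D[1][0]) < 1e-9
assert abs(I[0][0] - 1) < 1e-9 and abs(I[1][1] - 1) < 1e-9 and abs(I[0][1]) < 1e-9
print(lam, mu)
```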
g) According to Lemma 1 f), there exists a matrix $S(A) \in \operatorname{U}_2(\mathbb{C})$ such that $(S(A))^{\ast}\cdot A\cdot S(A) = \operatorname{diag}(\lambda(A),\mu(A))$.
We have $S(A) \in \operatorname{GL}_2(\mathbb{C})$ (since $S(A) \in \operatorname{U}_2(\mathbb{C}) = \left\{U \in \operatorname{GL}_2(\mathbb{C}) \mid U^{\ast} = U^{-1}\right\}$). In other words, the matrix $S(A)$ is invertible. Also, $S(A) \in \operatorname{U}_2(\mathbb{C})$ yields $(S(A))^{\ast} = (S(A))^{-1}$, so that $(S(A))^{\ast}S(A) = I_2$.
We have
\[ \mathbb{C}^2\setminus\{0\} \supseteq \left\{S(A)w \mid w \in \mathbb{C}^2\setminus\{0\}\right\} \]
(since $S(A)w \in \mathbb{C}^2\setminus\{0\}$ for every $w \in \mathbb{C}^2\setminus\{0\}$, because the matrix $S(A)$ is invertible), and
\[ \mathbb{C}^2\setminus\{0\} \subseteq \left\{S(A)w \mid w \in \mathbb{C}^2\setminus\{0\}\right\} \]
(since for every $v \in \mathbb{C}^2\setminus\{0\}$, there exists some $w \in \mathbb{C}^2\setminus\{0\}$ such that $v = S(A)w$; in fact, let $w = (S(A))^{-1}v$; then $v = S(A)w$, and $w \in \mathbb{C}^2\setminus\{0\}$ because $S(A)w = v \in \mathbb{C}^2\setminus\{0\}$ yields $S(A)w \neq 0$ and thus $w \neq 0$). Thus, $\mathbb{C}^2\setminus\{0\} = \left\{S(A)w \mid w \in \mathbb{C}^2\setminus\{0\}\right\}$. Hence,
\[ \left\{\frac{v^{\ast}Av}{v^{\ast}v} \;\middle|\; v \in \mathbb{C}^2\setminus\{0\}\right\} = \left\{\frac{(S(A)w)^{\ast}AS(A)w}{(S(A)w)^{\ast}S(A)w} \;\middle|\; w \in \mathbb{C}^2\setminus\{0\}\right\} = \left\{\frac{w^{\ast}(S(A))^{\ast}AS(A)w}{w^{\ast}(S(A))^{\ast}S(A)w} \;\middle|\; w \in \mathbb{C}^2\setminus\{0\}\right\} \]
(since $(S(A)w)^{\ast} = w^{\ast}(S(A))^{\ast}$)
\[ = \left\{\frac{w^{\ast}\operatorname{diag}(\lambda(A),\mu(A))w}{w^{\ast}w} \;\middle|\; w \in \mathbb{C}^2\setminus\{0\}\right\} \qquad (10) \]
(since $(S(A))^{\ast}AS(A) = \operatorname{diag}(\lambda(A),\mu(A))$ and $w^{\ast}\underbrace{(S(A))^{\ast}S(A)}_{=I_2}w = w^{\ast}w$).
Now, for every $w \in \mathbb{C}^2\setminus\{0\}$, there exist $w_1, w_2 \in \mathbb{C}$ such that $w = \begin{pmatrix} w_1 \\ w_2 \end{pmatrix}$, and we have $w^{\ast} = \left(\overline{w_1}, \overline{w_2}\right)$, so that
\[ w^{\ast}w = \overline{w_1}w_1 + \overline{w_2}w_2 = |w_1|^2 + |w_2|^2 \]
and
\[ w^{\ast}\operatorname{diag}(\lambda(A),\mu(A))w = \left(\overline{w_1}, \overline{w_2}\right)\begin{pmatrix} \lambda(A)w_1 \\ \mu(A)w_2 \end{pmatrix} = \lambda(A)\underbrace{\overline{w_1}w_1}_{=|w_1|^2} + \mu(A)\underbrace{\overline{w_2}w_2}_{=|w_2|^2} = \lambda(A)|w_1|^2 + \mu(A)|w_2|^2. \]
Hence,
\[ \frac{w^{\ast}\operatorname{diag}(\lambda(A),\mu(A))w}{w^{\ast}w} = \frac{\lambda(A)|w_1|^2 + \mu(A)|w_2|^2}{|w_1|^2 + |w_2|^2} \qquad (11) \]
\[ \leq \frac{\lambda(A)|w_1|^2 + \lambda(A)|w_2|^2}{|w_1|^2 + |w_2|^2} \quad (\text{since } \mu(A) \leq \lambda(A), \ |w_2|^2 \geq 0 \text{ and } |w_1|^2 + |w_2|^2 > 0) \quad = \frac{\lambda(A)\left(|w_1|^2 + |w_2|^2\right)}{|w_1|^2 + |w_2|^2} = \lambda(A), \]
and there exists a $w \in \mathbb{C}^2\setminus\{0\}$ satisfying $\dfrac{w^{\ast}\operatorname{diag}(\lambda(A),\mu(A))w}{w^{\ast}w} = \lambda(A)$ (in fact, set $w = \begin{pmatrix} 1 \\ 0 \end{pmatrix}$; then $w^{\ast} = (1,0)$ and
\[ \frac{w^{\ast}\operatorname{diag}(\lambda(A),\mu(A))w}{w^{\ast}w} = \frac{(1,0)\begin{pmatrix} \lambda(A) \\ 0 \end{pmatrix}}{(1,0)\begin{pmatrix} 1 \\ 0 \end{pmatrix}} = \frac{1\cdot\lambda(A) + 0\cdot 0}{1\cdot 1 + 0\cdot 0} = \lambda(A) \ \text{)}. \]
Thus,
\[ \max\left\{\frac{w^{\ast}\operatorname{diag}(\lambda(A),\mu(A))w}{w^{\ast}w} \;\middle|\; w \in \mathbb{C}^2\setminus\{0\}\right\} = \lambda(A). \qquad (12) \]
Besides, for every $w \in \mathbb{C}^2\setminus\{0\}$, (11) yields
\[ \frac{w^{\ast}\operatorname{diag}(\lambda(A),\mu(A))w}{w^{\ast}w} = \frac{\lambda(A)|w_1|^2 + \mu(A)|w_2|^2}{|w_1|^2 + |w_2|^2} \geq \frac{\mu(A)|w_1|^2 + \mu(A)|w_2|^2}{|w_1|^2 + |w_2|^2} = \mu(A) \quad (\text{since } \lambda(A) \geq \mu(A) \text{ and } |w_1|^2 \geq 0), \]
and there exists a $w \in \mathbb{C}^2\setminus\{0\}$ satisfying $\dfrac{w^{\ast}\operatorname{diag}(\lambda(A),\mu(A))w}{w^{\ast}w} = \mu(A)$ (in fact, set $w = \begin{pmatrix} 0 \\ 1 \end{pmatrix}$; then $w^{\ast} = (0,1)$ and
\[ \frac{w^{\ast}\operatorname{diag}(\lambda(A),\mu(A))w}{w^{\ast}w} = \frac{(0,1)\begin{pmatrix} 0 \\ \mu(A) \end{pmatrix}}{(0,1)\begin{pmatrix} 0 \\ 1 \end{pmatrix}} = \frac{0\cdot 0 + 1\cdot\mu(A)}{0\cdot 0 + 1\cdot 1} = \mu(A) \ \text{)}. \]
Thus,
\[ \min\left\{\frac{w^{\ast}\operatorname{diag}(\lambda(A),\mu(A))w}{w^{\ast}w} \;\middle|\; w \in \mathbb{C}^2\setminus\{0\}\right\} = \mu(A). \qquad (13) \]
Now, (12) and (10) yield
\[ \lambda(A) = \max\left\{\frac{v^{\ast}Av}{v^{\ast}v} \;\middle|\; v \in \mathbb{C}^2\setminus\{0\}\right\}, \]
and (13) and (10) yield
\[ \mu(A) = \min\left\{\frac{v^{\ast}Av}{v^{\ast}v} \;\middle|\; v \in \mathbb{C}^2\setminus\{0\}\right\}. \]
Thus, Lemma 1 g) is proven.
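Lemma 1 g) says that every Rayleigh quotient of $A$ lies between $\mu(A)$ and $\lambda(A)$; a quick randomized check (an illustration only, with an arbitrary sample matrix):

```python
import math, random

random.seed(0)

# Sample Hermitian matrix A = ((a, b), (c, d)): a, d real, b = conj(c)
a, d = 3.0, -1.0
c = 2.0 + 1.0j
b = c.conjugate()

tr, det = a + d, (a*d - b*c).real
rho = math.sqrt(tr**2 - 4*det)
lam, mu = (tr + rho)/2, (tr - rho)/2

def rayleigh(v1, v2):
    """(v* A v) / (v* v) for v = (v1, v2); this is real since A is Hermitian."""
    num = (v1.conjugate()*(a*v1 + b*v2) + v2.conjugate()*(c*v1 + d*v2)).real
    return num / (abs(v1)**2 + abs(v2)**2)

# Lemma 1 g): every Rayleigh quotient lies in the interval [mu(A), lambda(A)]
for _ in range(1000):
    v1 = complex(random.uniform(-1, 1), random.uniform(-1, 1))
    v2 = complex(random.uniform(-1, 1), random.uniform(-1, 1))
    assert mu - 1e-9 <= rayleigh(v1, v2) <= lam + 1e-9
print(mu, lam)
```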
c) According to Lemma 1 f), there exists a matrix $S(A) \in \operatorname{U}_2(\mathbb{C})$ such that $(S(A))^{\ast}\cdot A\cdot S(A) = \operatorname{diag}(\lambda(A),\mu(A))$.
We have $S(A) \in \operatorname{GL}_2(\mathbb{C})$ (since $S(A) \in \operatorname{U}_2(\mathbb{C}) = \left\{U \in \operatorname{GL}_2(\mathbb{C}) \mid U^{\ast} = U^{-1}\right\}$). In other words, the matrix $S(A)$ is invertible. Thus, $S(A)\cdot\begin{pmatrix} 0 \\ 1 \end{pmatrix} \neq 0$ (since $\begin{pmatrix} 0 \\ 1 \end{pmatrix} \neq 0$).
Let $v = S(A)\cdot\begin{pmatrix} 0 \\ 1 \end{pmatrix}$. Then, $v \neq 0$. Now,
\[ v^{\ast}Av = \begin{pmatrix} 0 \\ 1 \end{pmatrix}^{\ast}\cdot\underbrace{(S(A))^{\ast}\cdot A\cdot S(A)}_{=\operatorname{diag}(\lambda(A),\mu(A))}\cdot\begin{pmatrix} 0 \\ 1 \end{pmatrix} = (0,1)\cdot\begin{pmatrix} 0 \\ \mu(A) \end{pmatrix} = 0\cdot 0 + 1\cdot\mu(A) = \mu(A). \qquad (14) \]
Now, if the matrix $A$ is positive definite, then $v^{\ast}Av > 0$ (since $v \neq 0$), which becomes $\mu(A) > 0$ (by (14)), so that $\lambda(A) \geq \mu(A) > 0$ (since $\lambda(A) \geq \mu(A)$). Besides, if the matrix $A$ is nonnegative definite, then $v^{\ast}Av \geq 0$, which becomes $\mu(A) \geq 0$ (by (14)), so that $\lambda(A) \geq \mu(A) \geq 0$ (since $\lambda(A) \geq \mu(A)$). Thus, Lemma 1 c) is proven.
h) According to Lemma 1 f), there exists a matrix $S(A) \in \operatorname{U}_2(\mathbb{C})$ such that $(S(A))^{\ast}\cdot A\cdot S(A) = \operatorname{diag}(\lambda(A),\mu(A))$.
Note that $S(A) \in \operatorname{U}_2(\mathbb{C}) = \left\{U \in \operatorname{GL}_2(\mathbb{C}) \mid U^{\ast} = U^{-1}\right\}$ yields $S(A) \in \operatorname{GL}_2(\mathbb{C})$ and $(S(A))^{\ast} = (S(A))^{-1}$, and thus $(S(A))^{\ast}\cdot S(A) = I_2$. Thus,
\[ 1 = \det I_2 = \det\left((S(A))^{\ast}\cdot S(A)\right) = \det\underbrace{(S(A))^{\ast}}_{=\overline{S(A)}^T}\cdot\det(S(A)) = \underbrace{\det\left(\overline{S(A)}^T\right)}_{=\det\overline{S(A)}=\overline{\det(S(A))}}\cdot\det(S(A)) = \overline{\det(S(A))}\cdot\det(S(A)) = |\det(S(A))|^2, \]
so that $|\det(S(A))| = 1$ (since $|\det(S(A))| \in \mathbb{R}_{\geq 0}$).
Since the field $\mathbb{C}$ is algebraically closed, there exists some $\tau \in \mathbb{C}$ such that $\tau^2 = \det(S(A))$. We have $|\tau|^2 = |\tau^2| = |\det(S(A))| = 1$, so that $|\tau| = 1$ (since $|\tau| \in \mathbb{R}_{\geq 0}$), and thus $\tau \neq 0$ and $\overline{\tau}\tau = |\tau|^2 = 1^2 = 1$, so that $\overline{\tau} = \dfrac{1}{\tau}$.
Define a matrix $\widetilde{S}(A)$ by $\widetilde{S}(A) = \dfrac{1}{\tau}S(A)$. Then, $\widetilde{S}(A) = \dfrac{1}{\tau}S(A) \in \operatorname{GL}_2(\mathbb{C})$ (since $S(A) \in \operatorname{GL}_2(\mathbb{C})$ and $\dfrac{1}{\tau} \neq 0$) and
\[ \widetilde{S}(A)^{\ast} = \left(\frac{1}{\tau}S(A)\right)^{\ast} = \underbrace{\overline{\left(\frac{1}{\tau}\right)}}_{=\overline{\tau}^{-1}}(S(A))^{\ast} = \overline{\tau}^{-1}\underbrace{(S(A))^{\ast}}_{=(S(A))^{-1}} = \Big(\underbrace{\overline{\tau}}_{=1/\tau}S(A)\Big)^{-1} = \left(\frac{1}{\tau}S(A)\right)^{-1} = \widetilde{S}(A)^{-1}. \]
Thus, $\widetilde{S}(A) \in \left\{U \in \operatorname{GL}_2(\mathbb{C}) \mid U^{\ast} = U^{-1}\right\} = \operatorname{U}_2(\mathbb{C})$. Besides,
\[ \widetilde{S}(A)^{\ast}\cdot A\cdot\widetilde{S}(A) = \overline{\tau}^{-1}(S(A))^{\ast}\cdot A\cdot\frac{1}{\tau}S(A) = \underbrace{\frac{1}{\overline{\tau}\tau}}_{=1,\text{ since }\overline{\tau}\tau=1}\underbrace{(S(A))^{\ast}\cdot A\cdot S(A)}_{=\operatorname{diag}(\lambda(A),\mu(A))} = \operatorname{diag}(\lambda(A),\mu(A)) \]
and
\[ \det\left(\widetilde{S}(A)\right) = \det\left(\frac{1}{\tau}S(A)\right) = \left(\frac{1}{\tau}\right)^2\underbrace{\det(S(A))}_{=\tau^2} = \left(\frac{1}{\tau}\right)^2\tau^2 = 1. \]
Thus, we have proven that there exists a matrix $\widetilde{S}(A) \in \operatorname{U}_2(\mathbb{C})$ such that
\[ \widetilde{S}(A)^{\ast}\cdot A\cdot\widetilde{S}(A) = \operatorname{diag}(\lambda(A),\mu(A)) \quad\text{and}\quad \det\left(\widetilde{S}(A)\right) = 1. \]
In other words, we have proven Lemma 1 h).
A conclusion from Lemma 1:
Lemma 2. Let $B_1, B_2, \ldots, B_n$ be $n$ Hermitian $2\times 2$ complex matrices. Then,
\[ \lambda\left(\sum_{k=1}^{n} B_k\right) \leq \sum_{k=1}^{n}\lambda(B_k) \quad\text{and}\quad \det\left(\sum_{k=1}^{n} B_k\right) \geq \sum_{k=1}^{n}\lambda(B_k)\cdot\sum_{k=1}^{n}\mu(B_k). \]
Proof of Lemma 2. For every $k \in \{1,2,\ldots,n\}$, we have
\[ \lambda(B_k) = \max\left\{\frac{v^{\ast}B_k v}{v^{\ast}v} \;\middle|\; v \in \mathbb{C}^2\setminus\{0\}\right\} \]
(by Lemma 1 g), applied to $A = B_k$). Also,
\[ \lambda\left(\sum_{k=1}^{n} B_k\right) = \max\left\{\frac{v^{\ast}\left(\sum_{k=1}^{n} B_k\right)v}{v^{\ast}v} \;\middle|\; v \in \mathbb{C}^2\setminus\{0\}\right\} \quad \left(\text{by Lemma 1 g), applied to } A = \sum_{k=1}^{n} B_k\right) \]
\[ = \max\left\{\sum_{k=1}^{n}\frac{v^{\ast}B_k v}{v^{\ast}v} \;\middle|\; v \in \mathbb{C}^2\setminus\{0\}\right\} \in \left\{\sum_{k=1}^{n}\frac{v^{\ast}B_k v}{v^{\ast}v} \;\middle|\; v \in \mathbb{C}^2\setminus\{0\}\right\}. \]
Hence, there exists some $w \in \mathbb{C}^2\setminus\{0\}$ such that $\lambda\left(\sum_{k=1}^{n} B_k\right) = \sum_{k=1}^{n}\dfrac{w^{\ast}B_k w}{w^{\ast}w}$. Thus,
\[ \lambda\left(\sum_{k=1}^{n} B_k\right) = \sum_{k=1}^{n}\frac{w^{\ast}B_k w}{w^{\ast}w} \leq \sum_{k=1}^{n}\lambda(B_k) \]
\[ \left(\text{since } \frac{w^{\ast}B_k w}{w^{\ast}w} \in \left\{\frac{v^{\ast}B_k v}{v^{\ast}v} \;\middle|\; v \in \mathbb{C}^2\setminus\{0\}\right\} \text{ and thus } \frac{w^{\ast}B_k w}{w^{\ast}w} \leq \max\left\{\frac{v^{\ast}B_k v}{v^{\ast}v} \;\middle|\; v \in \mathbb{C}^2\setminus\{0\}\right\} = \lambda(B_k) \text{ for every } k \in \{1,2,\ldots,n\}\right). \]
Now, let $\widetilde{B} = \sum_{k=1}^{n} B_k$. Then,
\[ \operatorname{Tr}\widetilde{B} = \operatorname{Tr}\left(\sum_{k=1}^{n} B_k\right) = \sum_{k=1}^{n}\operatorname{Tr}B_k \quad (\text{since the trace is linear}). \]
Now, $\lambda(\widetilde{B}) + \mu(\widetilde{B}) = \operatorname{Tr}\widetilde{B}$ (by Lemma 1 d), applied to $A = \widetilde{B}$).
Besides, let $\lambda_\Sigma = \sum_{k=1}^{n}\lambda(B_k)$. Then, $\lambda(\widetilde{B}) = \lambda\left(\sum_{k=1}^{n} B_k\right) \leq \sum_{k=1}^{n}\lambda(B_k) = \lambda_\Sigma$, so that $\lambda(\widetilde{B}) - \lambda_\Sigma \leq 0$. Also,
\[ \operatorname{Tr}\widetilde{B} - \left(\lambda(\widetilde{B}) + \lambda_\Sigma\right) \leq \operatorname{Tr}\widetilde{B} - \left(\lambda(\widetilde{B}) + \lambda(\widetilde{B})\right) \quad (\text{since } \lambda_\Sigma \geq \lambda(\widetilde{B})) \]
\[ \leq \operatorname{Tr}\widetilde{B} - \underbrace{\left(\lambda(\widetilde{B}) + \mu(\widetilde{B})\right)}_{=\operatorname{Tr}\widetilde{B}} \quad \left(\text{since } \lambda(\widetilde{B}) \geq \mu(\widetilde{B}) \text{ by Lemma 1 b), applied to } A = \widetilde{B}\right) \quad = \operatorname{Tr}\widetilde{B} - \operatorname{Tr}\widetilde{B} = 0. \]
Thus,
\[ \lambda(\widetilde{B})\cdot\left(\operatorname{Tr}\widetilde{B} - \lambda(\widetilde{B})\right) - \lambda_\Sigma\cdot\left(\operatorname{Tr}\widetilde{B} - \lambda_\Sigma\right) = \left(\lambda(\widetilde{B}) - \lambda_\Sigma\right)\cdot\operatorname{Tr}\widetilde{B} - \underbrace{\left(\lambda(\widetilde{B})^2 - \lambda_\Sigma^2\right)}_{=\left(\lambda(\widetilde{B})-\lambda_\Sigma\right)\left(\lambda(\widetilde{B})+\lambda_\Sigma\right)} \]
\[ = \underbrace{\left(\lambda(\widetilde{B}) - \lambda_\Sigma\right)}_{\leq 0}\cdot\underbrace{\left(\operatorname{Tr}\widetilde{B} - \left(\lambda(\widetilde{B}) + \lambda_\Sigma\right)\right)}_{\leq 0} \geq 0, \]
so that
\[ \lambda(\widetilde{B})\cdot\left(\operatorname{Tr}\widetilde{B} - \lambda(\widetilde{B})\right) \geq \lambda_\Sigma\cdot\left(\operatorname{Tr}\widetilde{B} - \lambda_\Sigma\right). \qquad (15) \]
Hence,
\[ \det\left(\sum_{k=1}^{n} B_k\right) = \det\widetilde{B} = \lambda(\widetilde{B})\cdot\left(\operatorname{Tr}\widetilde{B} - \lambda(\widetilde{B})\right) \quad \left(\text{by Lemma 1 d), applied to } A = \widetilde{B}\right) \]
\[ \geq \lambda_\Sigma\cdot\left(\operatorname{Tr}\widetilde{B} - \lambda_\Sigma\right) \quad (\text{by (15)}) \]
\[ = \sum_{k=1}^{n}\lambda(B_k)\cdot\left(\sum_{k=1}^{n}\operatorname{Tr}B_k - \sum_{k=1}^{n}\lambda(B_k)\right) \quad \left(\text{since } \lambda_\Sigma = \sum_{k=1}^{n}\lambda(B_k) \text{ and } \operatorname{Tr}\widetilde{B} = \sum_{k=1}^{n}\operatorname{Tr}B_k\right) \]
\[ = \sum_{k=1}^{n}\lambda(B_k)\cdot\sum_{k=1}^{n}\mu(B_k) \]
\[ \left(\text{since } \sum_{k=1}^{n}\operatorname{Tr}B_k - \sum_{k=1}^{n}\lambda(B_k) = \sum_{k=1}^{n}\left(\operatorname{Tr}B_k - \lambda(B_k)\right) = \sum_{k=1}^{n}\mu(B_k), \text{ because } \operatorname{Tr}B_k - \lambda(B_k) = \mu(B_k) \right. \]
\[ \left. \text{for every } k \in \{1,2,\ldots,n\}, \text{ since } \lambda(B_k) + \mu(B_k) = \operatorname{Tr}B_k \text{ by Lemma 1 d), applied to } A = B_k\right), \]
and Lemma 2 is proven.
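Lemma 2 can be stress-tested numerically on random Hermitian matrices (an illustration, not a proof):

```python
import math, random

random.seed(1)

def rand_hermitian():
    """Random Hermitian 2x2 matrix ((a, b), (c, d)): a, d real, b = conj(c)."""
    a, d = random.uniform(-5, 5), random.uniform(-5, 5)
    c = complex(random.uniform(-5, 5), random.uniform(-5, 5))
    return ((a + 0j, c.conjugate()), (c, d + 0j))

def lam_mu(M):
    """The numbers lambda(M) and mu(M) from Lemma 1 b)."""
    (a, b), (c, d) = M
    tr, det = (a + d).real, (a*d - b*c).real
    rho = math.sqrt(max(tr**2 - 4*det, 0.0))
    return (tr + rho)/2, (tr - rho)/2

# Lemma 2: det(sum B_k) >= (sum lambda(B_k)) * (sum mu(B_k))
for _ in range(200):
    Bs = [rand_hermitian() for _ in range(4)]
    S = ((sum(B[0][0] for B in Bs), sum(B[0][1] for B in Bs)),
         (sum(B[1][0] for B in Bs), sum(B[1][1] for B in Bs)))
    det_S = (S[0][0]*S[1][1] - S[0][1]*S[1][0]).real
    lam_sum = sum(lam_mu(B)[0] for B in Bs)
    mu_sum = sum(lam_mu(B)[1] for B in Bs)
    assert det_S >= lam_sum * mu_sum - 1e-6
print("ok")
```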
Now let us solve the problem:
For every $k \in \{1,2,\ldots,n\}$, there exists a matrix $S(A_k) \in \operatorname{U}_2(\mathbb{C})$ such that
\[ (S(A_k))^{\ast}\cdot A_k\cdot S(A_k) = \operatorname{diag}(\lambda(A_k),\mu(A_k)) \qquad (16) \]
(according to Lemma 1 f), applied to $A = A_k$). These matrices $S(A_1), S(A_2), \ldots, S(A_n)$ satisfy $(S(A_1), S(A_2), \ldots, S(A_n)) \in \left(\operatorname{U}_2(\mathbb{C})\right)^n$ and
\[ F(S(A_1), S(A_2), \ldots, S(A_n)) = \det\left(\sum_{k=1}^{n}\underbrace{(S(A_k))^{\ast}\cdot A_k\cdot S(A_k)}_{=\operatorname{diag}(\lambda(A_k),\mu(A_k))\text{ by (16)}}\right) = \det\underbrace{\left(\sum_{k=1}^{n}\operatorname{diag}(\lambda(A_k),\mu(A_k))\right)}_{=\operatorname{diag}\left(\sum_{k=1}^{n}\lambda(A_k),\,\sum_{k=1}^{n}\mu(A_k)\right)} \]
\[ = \det\left(\operatorname{diag}\left(\sum_{k=1}^{n}\lambda(A_k), \sum_{k=1}^{n}\mu(A_k)\right)\right) = \sum_{k=1}^{n}\lambda(A_k)\cdot\sum_{k=1}^{n}\mu(A_k). \]
Hence,
\[ \sum_{k=1}^{n}\lambda(A_k)\cdot\sum_{k=1}^{n}\mu(A_k) \in \left\{F(U) \mid U \in \left(\operatorname{U}_2(\mathbb{C})\right)^n\right\}. \qquad (17) \]
On the other hand, for every $(U_1, U_2, \ldots, U_n) \in \left(\operatorname{U}_2(\mathbb{C})\right)^n$, we have
\[ F(U_1, U_2, \ldots, U_n) = \det\left(\sum_{k=1}^{n} U_k^{\ast}A_k U_k\right) \geq \sum_{k=1}^{n}\lambda\left(U_k^{\ast}A_k U_k\right)\cdot\sum_{k=1}^{n}\mu\left(U_k^{\ast}A_k U_k\right) \]
(by Lemma 2, applied to $B_k = U_k^{\ast}A_k U_k$; this is allowed because the matrix $U_k^{\ast}A_k U_k$ is Hermitian for every $k \in \{1,2,\ldots,n\}$, since $\left(U_k^{\ast}A_k U_k\right)^{\ast} = U_k^{\ast}A_k^{\ast}\left(U_k^{\ast}\right)^{\ast} = U_k^{\ast}A_k U_k$, because $A_k^{\ast} = A_k$ (since the matrix $A_k$ is Hermitian) and $\left(U_k^{\ast}\right)^{\ast} = U_k$)
\[ = \sum_{k=1}^{n}\lambda(A_k)\cdot\sum_{k=1}^{n}\mu(A_k) \]
(since for every $k \in \{1,2,\ldots,n\}$, we have $U_k^{\ast} = U_k^{-1}$ (because $U_k \in \operatorname{U}_2(\mathbb{C})$) and thus $U_k^{\ast}A_k U_k = U_k^{-1}A_k U_k$, so that the matrices $U_k^{\ast}A_k U_k$ and $A_k$ are similar; hence, $\operatorname{Tr}\left(U_k^{\ast}A_k U_k\right) = \operatorname{Tr}A_k$ and $\det\left(U_k^{\ast}A_k U_k\right) = \det A_k$, so that
\[ \lambda\left(U_k^{\ast}A_k U_k\right) = \frac{1}{2}\left(\operatorname{Tr}\left(U_k^{\ast}A_k U_k\right) + \sqrt{\left(\operatorname{Tr}\left(U_k^{\ast}A_k U_k\right)\right)^2 - 4\det\left(U_k^{\ast}A_k U_k\right)}\right) = \frac{1}{2}\left(\operatorname{Tr}A_k + \sqrt{(\operatorname{Tr}A_k)^2 - 4\det A_k}\right) = \lambda(A_k) \]
and similarly $\mu\left(U_k^{\ast}A_k U_k\right) = \mu(A_k)$).
In other words, for every $U \in \left(\operatorname{U}_2(\mathbb{C})\right)^n$, we have
\[ F(U) \geq \sum_{k=1}^{n}\lambda(A_k)\cdot\sum_{k=1}^{n}\mu(A_k). \]
This, together with (17), yields
\[ \min_{U \in \left(\operatorname{U}_2(\mathbb{C})\right)^n} F(U) = \sum_{k=1}^{n}\lambda(A_k)\cdot\sum_{k=1}^{n}\mu(A_k). \qquad (18) \]
Lemma 1 b) and e) yield that for every Hermitian $2\times 2$ complex matrix $A$, the numbers $\lambda(A)$ and $\mu(A)$ are the greatest and the least eigenvalue of the matrix $A$, respectively. Hence, $\lambda(A_j) = \sigma_1(A_j)$ and $\mu(A_j) = \sigma_2(A_j)$ for every $j \in \{1,2,\ldots,n\}$. Thus, (18) becomes
\[ \min_{U \in \left(\operatorname{U}_2(\mathbb{C})\right)^n} F(U) = \sum_{k=1}^{n}\sigma_1(A_k)\cdot\sum_{k=1}^{n}\sigma_2(A_k), \]
and the problem is solved.
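Finally, the whole result can be sanity-checked numerically (an illustration only): $F$ attains the bound $\sum_k \lambda(A_k)\cdot\sum_k \mu(A_k)$ at the diagonalizing unitaries and stays above it at random unitaries. The helper `diagonalizer` below assumes the off-diagonal entry of each matrix is nonzero, which holds almost surely for the random matrices used:

```python
import cmath, math, random

random.seed(2)

def cstar(M):
    """Conjugate transpose of a 2x2 matrix ((a, b), (c, d))."""
    (a, b), (c, d) = M
    return ((a.conjugate(), c.conjugate()), (b.conjugate(), d.conjugate()))

def mul(M, N):
    (a, b), (c, d) = M
    (e, f), (g, h) = N
    return ((a*e + b*g, a*f + b*h), (c*e + d*g, c*f + d*h))

def madd(M, N):
    return tuple(tuple(x + y for x, y in zip(r, s)) for r, s in zip(M, N))

def det2(M):
    return (M[0][0]*M[1][1] - M[0][1]*M[1][0]).real

def rand_hermitian():
    a, d = random.uniform(-5, 5), random.uniform(-5, 5)
    c = complex(random.uniform(-5, 5), random.uniform(-5, 5))
    return ((a + 0j, c.conjugate()), (c, d + 0j))

def rand_unitary():
    """Random 2x2 unitary matrix in the standard SU(2)-style parametrization."""
    t, al, be = (random.uniform(0, 2*math.pi) for _ in range(3))
    return ((cmath.exp(1j*al)*math.cos(t), -cmath.exp(1j*be)*math.sin(t)),
            (cmath.exp(-1j*be)*math.sin(t), cmath.exp(-1j*al)*math.cos(t)))

def lam_mu(M):
    tr, d = (M[0][0] + M[1][1]).real, det2(M)
    rho = math.sqrt(max(tr**2 - 4*d, 0.0))
    return (tr + rho)/2, (tr - rho)/2

def diagonalizer(M):
    """Unitary with normalized eigenvector columns (b, ev - a); needs b != 0."""
    (a, b), (c, d) = M
    cols = []
    for ev in lam_mu(M):
        f, g = b, ev - a
        n = math.sqrt(abs(f)**2 + abs(g)**2)
        cols.append((f/n, g/n))
    return ((cols[0][0], cols[1][0]), (cols[0][1], cols[1][1]))

As = [rand_hermitian() for _ in range(3)]
bound = sum(lam_mu(A)[0] for A in As) * sum(lam_mu(A)[1] for A in As)

def F(Us):
    S = ((0j, 0j), (0j, 0j))
    for U, A in zip(Us, As):
        S = madd(S, mul(mul(cstar(U), A), U))
    return det2(S)

# F attains the bound at the diagonalizing unitaries ...
assert abs(F([diagonalizer(A) for A in As]) - bound) < 1e-6
# ... and never goes below it elsewhere
for _ in range(200):
    assert F([rand_unitary() for _ in range(3)]) >= bound - 1e-6
print("ok")
```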