American Mathematical Monthly
Problem 11415 by Finbarr Holland, generalized
Definitions. a) A matrix is called a complex matrix if all its entries are complex numbers. (In particular, any vector in $\mathbb{C}^n$ is considered an $n\times 1$ complex matrix.)
b) For any complex matrix $A$, we denote the matrix $\overline{A}^T$ by $A^{\ast}$.
c) Let $\operatorname{U}_2(\mathbb{C})$ be the group of all unitary $2\times 2$ complex matrices. In other words, let
\[ \operatorname{U}_2(\mathbb{C}) = \left\{ U\in\operatorname{GL}_2(\mathbb{C}) \mid U^{\ast}=U^{-1} \right\}. \]
d) A complex matrix $A$ is called Hermitian if it satisfies $A^{\ast}=A$.
Problem. Let $A_1, A_2, \ldots, A_n$ be $n$ Hermitian $2\times 2$ complex matrices. Define a function $F$ from the Cartesian product $\left(\operatorname{U}_2(\mathbb{C})\right)^n$ to $\mathbb{R}$ by
\[ F(U_1, U_2, \ldots, U_n) = \det\left( \sum_{k=1}^{n} U_k^{\ast} A_k U_k \right) \]
for every $(U_1, U_2, \ldots, U_n)\in\left(\operatorname{U}_2(\mathbb{C})\right)^n$. Show that
\[ \min_{U\in\left(\operatorname{U}_2(\mathbb{C})\right)^n} F(U) = \sum_{k=1}^{n}\sigma_1(A_k) \cdot \sum_{k=1}^{n}\sigma_2(A_k), \]
where $\sigma_1(A_j)$ and $\sigma_2(A_j)$ denote the greatest and the least eigenvalue of the matrix $A_j$, respectively, for every $j\in\{1,2,\ldots,n\}$.
Solution by Darij Grinberg.
We start with an important lemma:
Lemma 1 (the Spectral Theorem for $2\times 2$ matrices). Let $A$ be a Hermitian $2\times 2$ complex matrix.
a) We have $\operatorname{Tr}A\in\mathbb{R}$, $\det A\in\mathbb{R}$ and $(\operatorname{Tr}A)^2-4\det A\in\mathbb{R}_{\geq 0}$.
b) Let us define two numbers $\lambda(A)$ and $\mu(A)$ by
\[ \lambda(A) = \frac{1}{2}\left(\operatorname{Tr}A+\sqrt{(\operatorname{Tr}A)^2-4\det A}\right); \qquad \mu(A) = \frac{1}{2}\left(\operatorname{Tr}A-\sqrt{(\operatorname{Tr}A)^2-4\det A}\right). \]
These numbers $\lambda(A)$ and $\mu(A)$ are real and satisfy $\lambda(A)\geq\mu(A)$.
c) If the matrix $A$ is positive definite, then $\lambda(A)\geq\mu(A)>0$. If the matrix $A$ is nonnegative definite, then $\lambda(A)\geq\mu(A)\geq 0$.
d) We have $\lambda(A)+\mu(A)=\operatorname{Tr}A$ and $\lambda(A)\cdot\mu(A)=\det A$. In particular, $\det A=\lambda(A)\cdot(\operatorname{Tr}A-\lambda(A))$.
e) The eigenvalues of $A$ (with algebraic multiplicities) are $\lambda(A)$ and $\mu(A)$.
f) There exists a matrix $S(A)\in\operatorname{U}_2(\mathbb{C})$ such that
\[ (S(A))^{\ast}\cdot A\cdot S(A) = \operatorname{diag}(\lambda(A),\mu(A)). \]
g) We have
\[ \lambda(A) = \max\left\{ \frac{v^{\ast}Av}{v^{\ast}v} \;\middle|\; v\in\mathbb{C}^2\setminus\{0\} \right\}; \qquad \mu(A) = \min\left\{ \frac{v^{\ast}Av}{v^{\ast}v} \;\middle|\; v\in\mathbb{C}^2\setminus\{0\} \right\}. \]
h) There exists a matrix $\widetilde{S}(A)\in\operatorname{U}_2(\mathbb{C})$ such that
\[ \widetilde{S}(A)^{\ast}\cdot A\cdot\widetilde{S}(A) = \operatorname{diag}(\lambda(A),\mu(A)) \quad\text{and}\quad \det\left(\widetilde{S}(A)\right) = 1. \]
Proof of Lemma 1. Since $A$ is a $2\times 2$ complex matrix, there exist complex numbers $a, b, c, d$ such that $A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$.
Since $A$ is Hermitian, we have $A = A^{\ast}$, so that
\[ \begin{pmatrix} a & b \\ c & d \end{pmatrix} = A = A^{\ast} = \begin{pmatrix} a & b \\ c & d \end{pmatrix}^{\ast} = \begin{pmatrix} \overline{a} & \overline{c} \\ \overline{b} & \overline{d} \end{pmatrix}, \]
and thus $a = \overline{a}$, $b = \overline{c}$, $c = \overline{b}$, $d = \overline{d}$.
Now, $a = \overline{a}$ yields $a \in \mathbb{R}$, and $d = \overline{d}$ yields $d \in \mathbb{R}$, so that $\operatorname{Tr}A = a + d \in \mathbb{R}$. Also, $a - d \in \mathbb{R}$. Besides, $\det A = \det A^{\ast} = \det\left(\overline{A}^T\right) = \det\overline{A}$ (since $\det B^T = \det B$ for any square matrix $B$) $= \overline{\det A}$ yields $\det A \in \mathbb{R}$. Thus, $(\operatorname{Tr}A)^2 - 4\det A \in \mathbb{R}$. Moreover, $\det A = ad - bc$ yields
\[ (\operatorname{Tr}A)^2 - 4\det A = (a+d)^2 - 4(ad - \overline{c}c) \quad (\text{since } b = \overline{c}) \quad = \underbrace{(a+d)^2 - 4ad}_{=(a-d)^2} + 4\underbrace{\overline{c}c}_{=|c|^2} = (a-d)^2 + 4|c|^2 \in \mathbb{R}_{\geq 0}, \qquad (1) \]
since $(a-d)^2 \in \mathbb{R}_{\geq 0}$ (because $a-d\in\mathbb{R}$ and squares of reals are $\in\mathbb{R}_{\geq 0}$) and $|c|^2 \in \mathbb{R}_{\geq 0}$.
Thus, Lemma 1 a) is proven.
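The identity (1) can be sanity-checked numerically (this is only an illustration, not part of the proof; the sample entries $a$, $d$, $c$ below are arbitrary choices):

```python
# Sample Hermitian 2x2 matrix ((a, b), (c, d)): a, d real, b = conj(c)
a, d = 3.0, -1.0
c = 2.0 + 1.0j
b = c.conjugate()

tr = a + d                         # Tr A
det = (a*d - b*c).real             # det A (real, since A is Hermitian)
discriminant = tr**2 - 4*det

# Identity (1): (Tr A)^2 - 4 det A = (a - d)^2 + 4|c|^2 >= 0
assert abs(discriminant - ((a - d)**2 + 4*abs(c)**2)) < 1e-12
assert discriminant >= 0
print(discriminant)
```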
Lemma 1 a) yields $(\operatorname{Tr}A)^2 - 4\det A \in \mathbb{R}_{\geq 0}$, and thus $\sqrt{(\operatorname{Tr}A)^2 - 4\det A} \in \mathbb{R}_{\geq 0}$, so that
\[ \lambda(A) = \frac{1}{2}\Big(\underbrace{\operatorname{Tr}A}_{\in\mathbb{R}} + \underbrace{\sqrt{(\operatorname{Tr}A)^2 - 4\det A}}_{\in\mathbb{R}_{\geq 0}\subseteq\mathbb{R}}\Big) \in \mathbb{R} \quad\text{and}\quad \mu(A) = \frac{1}{2}\Big(\operatorname{Tr}A - \sqrt{(\operatorname{Tr}A)^2 - 4\det A}\Big) \in \mathbb{R}. \]
In other words, the numbers $\lambda(A)$ and $\mu(A)$ are real. Besides, $\sqrt{(\operatorname{Tr}A)^2 - 4\det A} \in \mathbb{R}_{\geq 0}$ yields $\sqrt{(\operatorname{Tr}A)^2 - 4\det A} \geq -\sqrt{(\operatorname{Tr}A)^2 - 4\det A}$, so that
\[ \lambda(A) = \frac{1}{2}\Big(\operatorname{Tr}A + \sqrt{(\operatorname{Tr}A)^2 - 4\det A}\Big) \geq \frac{1}{2}\Big(\operatorname{Tr}A - \sqrt{(\operatorname{Tr}A)^2 - 4\det A}\Big) = \mu(A).\]
Thus, Lemma 1 b) is proven. Besides,
\[ \lambda(A) + \mu(A) = \frac{1}{2}\Big(\operatorname{Tr}A + \sqrt{(\operatorname{Tr}A)^2 - 4\det A}\Big) + \frac{1}{2}\Big(\operatorname{Tr}A - \sqrt{(\operatorname{Tr}A)^2 - 4\det A}\Big) = \operatorname{Tr}A \]
and
\[ \lambda(A)\cdot\mu(A) = \frac{1}{4}\Big(\operatorname{Tr}A + \sqrt{(\operatorname{Tr}A)^2 - 4\det A}\Big)\Big(\operatorname{Tr}A - \sqrt{(\operatorname{Tr}A)^2 - 4\det A}\Big) = \frac{1}{4}\Big((\operatorname{Tr}A)^2 - \big((\operatorname{Tr}A)^2 - 4\det A\big)\Big) = \frac{1}{4}\cdot 4\det A = \det A. \]
Thus,
\[ \det A = \lambda(A)\cdot\underbrace{\mu(A)}_{=\operatorname{Tr}A - \lambda(A),\text{ since }\lambda(A)+\mu(A)=\operatorname{Tr}A} = \lambda(A)\cdot(\operatorname{Tr}A - \lambda(A)). \]
Thus, Lemma 1 d) is proven.
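Lemma 1 d) is likewise easy to check numerically on a sample Hermitian matrix (an illustration only; the entries are arbitrary choices):

```python
import math

# Sample Hermitian 2x2 matrix ((a, b), (c, d)): a, d real, b = conj(c)
a, d = 3.0, -1.0
c = 2.0 + 1.0j
b = c.conjugate()

tr = a + d
det = (a*d - b*c).real
root = math.sqrt(tr**2 - 4*det)

lam = (tr + root) / 2      # lambda(A)
mu = (tr - root) / 2       # mu(A)

# Lemma 1 d): lambda(A) + mu(A) = Tr A and lambda(A) * mu(A) = det A
assert abs(lam + mu - tr) < 1e-12
assert abs(lam * mu - det) < 1e-12
print(lam, mu)
```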
The characteristic polynomial of the matrix $A$ is
\[ \det(XI_2 - A) = \det\begin{pmatrix} X-a & -b \\ -c & X-d \end{pmatrix} = \underbrace{(X-a)(X-d)}_{=X^2-(a+d)X+ad} - \underbrace{(-b)(-c)}_{=bc} = X^2 - \underbrace{(a+d)}_{=\operatorname{Tr}A=\lambda(A)+\mu(A)}X + \underbrace{(ad-bc)}_{=\det A=\lambda(A)\cdot\mu(A)} \]
\[ = X^2 - (\lambda(A)+\mu(A))X + \lambda(A)\cdot\mu(A) = (X-\lambda(A))(X-\mu(A)). \]
Hence, $\lambda(A)$ and $\mu(A)$ are the roots of the characteristic polynomial of the matrix $A$ (with multiplicities). But the eigenvalues of $A$ (with algebraic multiplicities) are the roots of the characteristic polynomial of the matrix $A$ (with multiplicities). Hence, the eigenvalues of $A$ (with algebraic multiplicities) are $\lambda(A)$ and $\mu(A)$. Thus, Lemma 1 e) is proven.
f) We notice that
\[ \operatorname{diag}(\alpha_1,\alpha_2)\cdot\operatorname{diag}(\beta_1,\beta_2) = \operatorname{diag}(\alpha_1\beta_1,\alpha_2\beta_2) \quad \text{for every } \alpha_1,\alpha_2,\beta_1,\beta_2 \in \mathbb{C}, \qquad (2) \]
and $\overline{\operatorname{diag}(\alpha_1,\alpha_2)} = \operatorname{diag}(\overline{\alpha_1},\overline{\alpha_2})$ for every $\alpha_1,\alpha_2 \in \mathbb{C}$, and thus
\[ (\operatorname{diag}(\alpha_1,\alpha_2))^{\ast} = \overline{\operatorname{diag}(\alpha_1,\alpha_2)}^T = \operatorname{diag}(\overline{\alpha_1},\overline{\alpha_2})^T = \operatorname{diag}(\overline{\alpha_1},\overline{\alpha_2}) \quad \text{for every } \alpha_1,\alpha_2 \in \mathbb{C}. \qquad (3) \]
Let $\rho = \sqrt{(\operatorname{Tr}A)^2 - 4\det A}$. Then, $\rho \in \mathbb{R}_{\geq 0}$ (since $(\operatorname{Tr}A)^2 - 4\det A \in \mathbb{R}_{\geq 0}$ by Lemma 1 a)). Also,
\[ \lambda(A) = \frac{1}{2}(\operatorname{Tr}A + \rho); \qquad \mu(A) = \frac{1}{2}(\operatorname{Tr}A - \rho), \]
so that
\[ \lambda(A) - \mu(A) = \frac{1}{2}(\operatorname{Tr}A + \rho) - \frac{1}{2}(\operatorname{Tr}A - \rho) = \rho. \]
We now distinguish between two cases:
Case 1: We have $\rho = 0$.
Case 2: We have $\rho \neq 0$.
First, let us consider Case 2. In this case, $\rho \neq 0$.
Lemma 1 e) yields that $\lambda(A)$ is an eigenvalue of the matrix $A$. Thus, there exists a vector $e_\lambda \in \mathbb{C}^2$ such that $e_\lambda \neq 0$ and $Ae_\lambda = \lambda(A)e_\lambda$. Since $e_\lambda \in \mathbb{C}^2$, there exist complex numbers $f_\lambda$ and $g_\lambda$ such that $e_\lambda = \begin{pmatrix} f_\lambda \\ g_\lambda \end{pmatrix}$. Then, $e_\lambda^{\ast} = \overline{e_\lambda}^T = \left(\overline{f_\lambda}, \overline{g_\lambda}\right)$, and thus $e_\lambda^{\ast}e_\lambda = \overline{f_\lambda}f_\lambda + \overline{g_\lambda}g_\lambda$. Also, $e_\lambda^{\ast}e_\lambda \in \mathbb{R}_{>0}$ (since $e_\lambda \neq 0$), so that $\sqrt{e_\lambda^{\ast}e_\lambda} \in \mathbb{R}_{>0}$.
Lemma 1 e) yields that $\mu(A)$ is an eigenvalue of the matrix $A$. Thus, there exists a vector $e_\mu \in \mathbb{C}^2$ such that $e_\mu \neq 0$ and $Ae_\mu = \mu(A)e_\mu$. Since $e_\mu \in \mathbb{C}^2$, there exist complex numbers $f_\mu$ and $g_\mu$ such that $e_\mu = \begin{pmatrix} f_\mu \\ g_\mu \end{pmatrix}$. Then, $e_\mu^{\ast} = \left(\overline{f_\mu}, \overline{g_\mu}\right)$, and thus $e_\mu^{\ast}e_\mu = \overline{f_\mu}f_\mu + \overline{g_\mu}g_\mu$. Also, $e_\mu^{\ast}e_\mu \in \mathbb{R}_{>0}$ (since $e_\mu \neq 0$), so that $\sqrt{e_\mu^{\ast}e_\mu} \in \mathbb{R}_{>0}$.
We have
\[ e_\lambda^{\ast}\underbrace{Ae_\mu}_{=\mu(A)e_\mu} = \mu(A)\cdot e_\lambda^{\ast}e_\mu \]
and
\[ e_\lambda^{\ast}\underbrace{A}_{=A^{\ast}}e_\mu = \underbrace{e_\lambda^{\ast}A^{\ast}}_{=(Ae_\lambda)^{\ast}}e_\mu = \big(\underbrace{Ae_\lambda}_{=\lambda(A)e_\lambda}\big)^{\ast}e_\mu = \underbrace{(\lambda(A)e_\lambda)^{\ast}}_{=\lambda(A)\cdot e_\lambda^{\ast},\text{ since }\lambda(A)\in\mathbb{R}}e_\mu = \lambda(A)\cdot e_\lambda^{\ast}e_\mu, \]
so that $\lambda(A)\cdot e_\lambda^{\ast}e_\mu = \mu(A)\cdot e_\lambda^{\ast}e_\mu$. Thus,
\[ 0 = \lambda(A)\cdot e_\lambda^{\ast}e_\mu - \mu(A)\cdot e_\lambda^{\ast}e_\mu = \underbrace{(\lambda(A)-\mu(A))}_{=\rho\neq 0}\cdot e_\lambda^{\ast}e_\mu, \]
so that $e_\lambda^{\ast}e_\mu = 0$. Since $e_\lambda^{\ast}e_\mu = \left(\overline{f_\lambda}, \overline{g_\lambda}\right)\begin{pmatrix} f_\mu \\ g_\mu \end{pmatrix} = \overline{f_\lambda}f_\mu + \overline{g_\lambda}g_\mu$, this becomes
\[ \overline{f_\lambda}f_\mu + \overline{g_\lambda}g_\mu = 0. \]
Thus,
\[ \overline{f_\mu}f_\lambda + \overline{g_\mu}g_\lambda = \overline{\overline{f_\lambda}f_\mu + \overline{g_\lambda}g_\mu} = \overline{0} = 0. \]
Define a matrix $W$ by $W = \begin{pmatrix} f_\lambda & f_\mu \\ g_\lambda & g_\mu \end{pmatrix}$, and define a matrix $S(A)$ by
\[ S(A) = W\cdot\operatorname{diag}\left(\frac{1}{\sqrt{e_\lambda^{\ast}e_\lambda}}, \frac{1}{\sqrt{e_\mu^{\ast}e_\mu}}\right). \qquad (4) \]
(We will see in a moment that $S(A) \in \operatorname{U}_2(\mathbb{C})$.)
Then,
\[ W^{\ast} = \overline{W}^T = \begin{pmatrix} \overline{f_\lambda} & \overline{g_\lambda} \\ \overline{f_\mu} & \overline{g_\mu} \end{pmatrix}, \]
and thus
\[ (S(A))^{\ast} = \left(\operatorname{diag}\left(\frac{1}{\sqrt{e_\lambda^{\ast}e_\lambda}}, \frac{1}{\sqrt{e_\mu^{\ast}e_\mu}}\right)\right)^{\ast}\cdot W^{\ast} = \operatorname{diag}\left(\frac{1}{\sqrt{e_\lambda^{\ast}e_\lambda}}, \frac{1}{\sqrt{e_\mu^{\ast}e_\mu}}\right)\cdot W^{\ast} \qquad (5) \]
(by (3), since $\overline{1/\sqrt{e_\lambda^{\ast}e_\lambda}} = 1/\sqrt{e_\lambda^{\ast}e_\lambda}$ and $\overline{1/\sqrt{e_\mu^{\ast}e_\mu}} = 1/\sqrt{e_\mu^{\ast}e_\mu}$, because $\sqrt{e_\lambda^{\ast}e_\lambda}$ and $\sqrt{e_\mu^{\ast}e_\mu}$ are real), and
\[ W^{\ast}\cdot W = \begin{pmatrix} \overline{f_\lambda} & \overline{g_\lambda} \\ \overline{f_\mu} & \overline{g_\mu} \end{pmatrix}\cdot\begin{pmatrix} f_\lambda & f_\mu \\ g_\lambda & g_\mu \end{pmatrix} = \begin{pmatrix} \overline{f_\lambda}f_\lambda + \overline{g_\lambda}g_\lambda & \overline{f_\lambda}f_\mu + \overline{g_\lambda}g_\mu \\ \overline{f_\mu}f_\lambda + \overline{g_\mu}g_\lambda & \overline{f_\mu}f_\mu + \overline{g_\mu}g_\mu \end{pmatrix} = \operatorname{diag}\left(e_\lambda^{\ast}e_\lambda, e_\mu^{\ast}e_\mu\right) \qquad (6) \]
(since $\overline{f_\lambda}f_\lambda + \overline{g_\lambda}g_\lambda = e_\lambda^{\ast}e_\lambda$, $\overline{f_\mu}f_\mu + \overline{g_\mu}g_\mu = e_\mu^{\ast}e_\mu$, and the off-diagonal entries vanish), so that (4) and (5) yield
\[ (S(A))^{\ast}\cdot S(A) = \operatorname{diag}\left(\frac{1}{\sqrt{e_\lambda^{\ast}e_\lambda}}, \frac{1}{\sqrt{e_\mu^{\ast}e_\mu}}\right)\cdot\underbrace{W^{\ast}\cdot W}_{=\operatorname{diag}(e_\lambda^{\ast}e_\lambda,\,e_\mu^{\ast}e_\mu)\text{ by (6)}}\cdot\operatorname{diag}\left(\frac{1}{\sqrt{e_\lambda^{\ast}e_\lambda}}, \frac{1}{\sqrt{e_\mu^{\ast}e_\mu}}\right) \]
\[ = \operatorname{diag}\left(\sqrt{e_\lambda^{\ast}e_\lambda}\cdot\frac{1}{\sqrt{e_\lambda^{\ast}e_\lambda}}, \sqrt{e_\mu^{\ast}e_\mu}\cdot\frac{1}{\sqrt{e_\mu^{\ast}e_\mu}}\right) \quad (\text{by (2), applied twice}) \quad = \operatorname{diag}(1,1) = I_2. \]
Thus, the matrix $S(A)$ is left-invertible. Hence, the matrix $S(A)$ is invertible (since every left-invertible square matrix is invertible). In other words, $S(A) \in \operatorname{GL}_2(\mathbb{C})$. Besides, $(S(A))^{\ast} = (S(A))^{-1}$ (since $(S(A))^{\ast}\cdot S(A) = I_2$). Thus, $S(A) \in \left\{U \in \operatorname{GL}_2(\mathbb{C}) \mid U^{\ast} = U^{-1}\right\} = \operatorname{U}_2(\mathbb{C})$.
On the other hand,
\[ \begin{pmatrix} af_\lambda + bg_\lambda \\ cf_\lambda + dg_\lambda \end{pmatrix} = \underbrace{\begin{pmatrix} a & b \\ c & d \end{pmatrix}}_{=A}\underbrace{\begin{pmatrix} f_\lambda \\ g_\lambda \end{pmatrix}}_{=e_\lambda} = Ae_\lambda = \lambda(A)e_\lambda = \begin{pmatrix} \lambda(A)f_\lambda \\ \lambda(A)g_\lambda \end{pmatrix} \]
yields
\[ af_\lambda + bg_\lambda = \lambda(A)f_\lambda \quad\text{and}\quad cf_\lambda + dg_\lambda = \lambda(A)g_\lambda. \qquad (7) \]
Also,
\[ \begin{pmatrix} af_\mu + bg_\mu \\ cf_\mu + dg_\mu \end{pmatrix} = Ae_\mu = \mu(A)e_\mu = \begin{pmatrix} \mu(A)f_\mu \\ \mu(A)g_\mu \end{pmatrix} \]
yields
\[ af_\mu + bg_\mu = \mu(A)f_\mu \quad\text{and}\quad cf_\mu + dg_\mu = \mu(A)g_\mu. \qquad (8) \]
Now,
\[ A\cdot W = \begin{pmatrix} a & b \\ c & d \end{pmatrix}\cdot\begin{pmatrix} f_\lambda & f_\mu \\ g_\lambda & g_\mu \end{pmatrix} = \begin{pmatrix} af_\lambda + bg_\lambda & af_\mu + bg_\mu \\ cf_\lambda + dg_\lambda & cf_\mu + dg_\mu \end{pmatrix} = \begin{pmatrix} \lambda(A)f_\lambda & \mu(A)f_\mu \\ \lambda(A)g_\lambda & \mu(A)g_\mu \end{pmatrix} \quad (\text{by (7) and (8)}) \]
and
\[ W\cdot\operatorname{diag}(\lambda(A),\mu(A)) = \begin{pmatrix} f_\lambda & f_\mu \\ g_\lambda & g_\mu \end{pmatrix}\cdot\begin{pmatrix} \lambda(A) & 0 \\ 0 & \mu(A) \end{pmatrix} = \begin{pmatrix} \lambda(A)f_\lambda & \mu(A)f_\mu \\ \lambda(A)g_\lambda & \mu(A)g_\mu \end{pmatrix} \]
yield $A\cdot W = W\cdot\operatorname{diag}(\lambda(A),\mu(A))$, so that (4) and (5) yield
\[ (S(A))^{\ast}\cdot A\cdot S(A) = \operatorname{diag}\left(\frac{1}{\sqrt{e_\lambda^{\ast}e_\lambda}}, \frac{1}{\sqrt{e_\mu^{\ast}e_\mu}}\right)\cdot W^{\ast}\cdot\underbrace{A\cdot W}_{=W\cdot\operatorname{diag}(\lambda(A),\mu(A))}\cdot\operatorname{diag}\left(\frac{1}{\sqrt{e_\lambda^{\ast}e_\lambda}}, \frac{1}{\sqrt{e_\mu^{\ast}e_\mu}}\right) \]
\[ = \operatorname{diag}\left(\frac{1}{\sqrt{e_\lambda^{\ast}e_\lambda}}, \frac{1}{\sqrt{e_\mu^{\ast}e_\mu}}\right)\cdot\underbrace{W^{\ast}\cdot W}_{=\operatorname{diag}(e_\lambda^{\ast}e_\lambda,\,e_\mu^{\ast}e_\mu)\text{ by (6)}}\cdot\operatorname{diag}(\lambda(A),\mu(A))\cdot\operatorname{diag}\left(\frac{1}{\sqrt{e_\lambda^{\ast}e_\lambda}}, \frac{1}{\sqrt{e_\mu^{\ast}e_\mu}}\right) \]
\[ = \operatorname{diag}\left(\sqrt{e_\lambda^{\ast}e_\lambda}, \sqrt{e_\mu^{\ast}e_\mu}\right)\cdot\operatorname{diag}(\lambda(A),\mu(A))\cdot\operatorname{diag}\left(\frac{1}{\sqrt{e_\lambda^{\ast}e_\lambda}}, \frac{1}{\sqrt{e_\mu^{\ast}e_\mu}}\right) \quad (\text{by (2)}) \]
\[ = \operatorname{diag}\left(\sqrt{e_\lambda^{\ast}e_\lambda}\cdot\lambda(A)\cdot\frac{1}{\sqrt{e_\lambda^{\ast}e_\lambda}}, \sqrt{e_\mu^{\ast}e_\mu}\cdot\mu(A)\cdot\frac{1}{\sqrt{e_\mu^{\ast}e_\mu}}\right) \quad (\text{by (2)}) \quad = \operatorname{diag}(\lambda(A),\mu(A)). \]
Thus, we have proven that, in Case 2, there exists a matrix $S(A) \in \operatorname{U}_2(\mathbb{C})$ such that $(S(A))^{\ast}\cdot A\cdot S(A) = \operatorname{diag}(\lambda(A),\mu(A))$. In other words, we have proven that Lemma 1 f) holds in Case 2.
Next, let us consider Case 1. In this case, $\rho = 0$, so that
\[ 0 = \rho^2 = \left(\sqrt{(\operatorname{Tr}A)^2 - 4\det A}\right)^2 = (\operatorname{Tr}A)^2 - 4\det A = \underbrace{(a-d)^2}_{\in\mathbb{R}_{\geq 0}} + 4\underbrace{\overline{c}c}_{=|c|^2} \quad (\text{by (1)}) \qquad (9) \]
\[ \geq 4|c|^2 \quad (\text{since } (a-d)^2 \geq 0, \text{ because } a-d\in\mathbb{R} \text{ and squares of reals are } \geq 0), \]
so that $0 \geq |c|^2$. But $|c|^2 \geq 0$. Thus, $|c|^2 = 0$, so that $|c| = 0$ and thus $c = 0$. Hence, $b = \overline{c} = \overline{0} = 0$. Now, (9) yields $0 = (a-d)^2 + 4\overline{c}\underbrace{c}_{=0} = (a-d)^2$, so that $a - d = 0$, that is, $a = d$. Thus,
\[ A = \begin{pmatrix} a & b \\ c & d \end{pmatrix} = \begin{pmatrix} d & 0 \\ 0 & d \end{pmatrix} \quad (\text{since } b = 0, \ c = 0 \text{ and } a = d) \quad = \operatorname{diag}(d,d), \]
so that $\operatorname{Tr}A = d + d = 2d$. Hence,
\[ \lambda(A) = \frac{1}{2}\Big(\underbrace{\operatorname{Tr}A}_{=2d} + \underbrace{\rho}_{=0}\Big) = \frac{1}{2}(2d + 0) = d; \qquad \mu(A) = \frac{1}{2}(2d - 0) = d, \]
so that $\operatorname{diag}(\lambda(A),\mu(A)) = \operatorname{diag}(d,d)$.
Now, let $S(A) = I_2$. Then, clearly, $S(A) \in \operatorname{U}_2(\mathbb{C})$ and
\[ (S(A))^{\ast}\cdot A\cdot S(A) = I_2\cdot A\cdot I_2 = A = \operatorname{diag}(d,d) = \operatorname{diag}(\lambda(A),\mu(A)). \]
Thus, we have proven that, in Case 1, there exists a matrix $S(A) \in \operatorname{U}_2(\mathbb{C})$ such that $(S(A))^{\ast}\cdot A\cdot S(A) = \operatorname{diag}(\lambda(A),\mu(A))$. In other words, we have proven that Lemma 1 f) holds in Case 1.
Altogether, we have now proven that Lemma 1 f) holds in both Cases 1 and 2. Thus, Lemma 1 f) always holds. Hence, Lemma 1 f) is proven.
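A small numerical sketch of Lemma 1 f) (an illustration only, not part of the proof): for a sample Hermitian matrix falling under Case 2, the columns of $S(A)$ are normalized eigenvectors. Here they are chosen in the particular form $(b, \lambda - a)^T$, which is an assumption of this sketch valid whenever $b \neq 0$:

```python
import math

def cstar(M):
    """Conjugate transpose of a 2x2 matrix given as ((a, b), (c, d))."""
    (a, b), (c, d) = M
    return ((a.conjugate(), c.conjugate()), (b.conjugate(), d.conjugate()))

def mul(M, N):
    """Product of two 2x2 matrices."""
    (a, b), (c, d) = M
    (e, f), (g, h) = N
    return ((a*e + b*g, a*f + b*h), (c*e + d*g, c*f + d*h))

# Sample Hermitian matrix with c != 0, so we are in Case 2 of the proof
a, d = 3.0 + 0j, -1.0 + 0j
c = 2.0 + 1.0j
b = c.conjugate()
A = ((a, b), (c, d))

tr = (a + d).real
det = (a*d - b*c).real
rho = math.sqrt(tr**2 - 4*det)
lam, mu = (tr + rho)/2, (tr - rho)/2

# Columns of S(A): normalized eigenvectors, here in the form (b, ev - a),
# which solves (A - ev*I)v = 0 whenever b != 0 (an assumption of this sketch)
cols = []
for ev in (lam, mu):
    f, g = b, ev - a
    n = math.sqrt(abs(f)**2 + abs(g)**2)
    cols.append((f/n, g/n))
S = ((cols[0][0], cols[1][0]), (cols[0][1], cols[1][1]))

D = mul(mul(cstar(S), A), S)   # should equal diag(lam, mu)
I = mul(cstar(S), S)           # should equal the identity I_2
assert abs(D[0][0] - lam) < 1e-9 and abs(D[1][1] - mu) < 1e-9
assert abs(D[0][1]) < 1e-9 and abs(D[1][0]) < 1e-9
assert abs(I[0][0] - 1) < 1e-9 and abs(I[1][1] - 1) < 1e-9 and abs(I[0][1]) < 1e-9
print(lam, mu)
```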
g) According to Lemma 1 f), there exists a matrix $S(A) \in \operatorname{U}_2(\mathbb{C})$ such that $(S(A))^{\ast}\cdot A\cdot S(A) = \operatorname{diag}(\lambda(A),\mu(A))$.
We have $S(A) \in \operatorname{GL}_2(\mathbb{C})$ (since $S(A) \in \operatorname{U}_2(\mathbb{C}) = \left\{U \in \operatorname{GL}_2(\mathbb{C}) \mid U^{\ast} = U^{-1}\right\}$). In other words, the matrix $S(A)$ is invertible. Also, $S(A) \in \operatorname{U}_2(\mathbb{C})$ yields $(S(A))^{\ast} = (S(A))^{-1}$, so that $(S(A))^{\ast}S(A) = I_2$.
We have
\[ \mathbb{C}^2\setminus\{0\} \supseteq \left\{S(A)w \mid w \in \mathbb{C}^2\setminus\{0\}\right\} \]
(since $S(A)w \in \mathbb{C}^2\setminus\{0\}$ for every $w \in \mathbb{C}^2\setminus\{0\}$, because the matrix $S(A)$ is invertible), and
\[ \mathbb{C}^2\setminus\{0\} \subseteq \left\{S(A)w \mid w \in \mathbb{C}^2\setminus\{0\}\right\} \]
(since for every $v \in \mathbb{C}^2\setminus\{0\}$, there exists some $w \in \mathbb{C}^2\setminus\{0\}$ such that $v = S(A)w$; in fact, let $w = (S(A))^{-1}v$; then $v = S(A)w$, and $w \in \mathbb{C}^2\setminus\{0\}$ because $S(A)w = v \in \mathbb{C}^2\setminus\{0\}$ yields $S(A)w \neq 0$ and thus $w \neq 0$). Thus, $\mathbb{C}^2\setminus\{0\} = \left\{S(A)w \mid w \in \mathbb{C}^2\setminus\{0\}\right\}$. Hence,
\[ \left\{\frac{v^{\ast}Av}{v^{\ast}v} \;\middle|\; v \in \mathbb{C}^2\setminus\{0\}\right\} = \left\{\frac{(S(A)w)^{\ast}AS(A)w}{(S(A)w)^{\ast}S(A)w} \;\middle|\; w \in \mathbb{C}^2\setminus\{0\}\right\} = \left\{\frac{w^{\ast}(S(A))^{\ast}AS(A)w}{w^{\ast}(S(A))^{\ast}S(A)w} \;\middle|\; w \in \mathbb{C}^2\setminus\{0\}\right\} \]
(since $(S(A)w)^{\ast} = w^{\ast}(S(A))^{\ast}$)
\[ = \left\{\frac{w^{\ast}\operatorname{diag}(\lambda(A),\mu(A))w}{w^{\ast}w} \;\middle|\; w \in \mathbb{C}^2\setminus\{0\}\right\} \qquad (10) \]
(since $(S(A))^{\ast}AS(A) = \operatorname{diag}(\lambda(A),\mu(A))$ and $w^{\ast}\underbrace{(S(A))^{\ast}S(A)}_{=I_2}w = w^{\ast}w$).
Now, for every $w \in \mathbb{C}^2\setminus\{0\}$, there exist $w_1, w_2 \in \mathbb{C}$ such that $w = \begin{pmatrix} w_1 \\ w_2 \end{pmatrix}$, and we have $w^{\ast} = \left(\overline{w_1}, \overline{w_2}\right)$, so that
\[ w^{\ast}w = \overline{w_1}w_1 + \overline{w_2}w_2 = |w_1|^2 + |w_2|^2 \]
and
\[ w^{\ast}\operatorname{diag}(\lambda(A),\mu(A))w = \left(\overline{w_1}, \overline{w_2}\right)\begin{pmatrix} \lambda(A)w_1 \\ \mu(A)w_2 \end{pmatrix} = \lambda(A)\underbrace{\overline{w_1}w_1}_{=|w_1|^2} + \mu(A)\underbrace{\overline{w_2}w_2}_{=|w_2|^2} = \lambda(A)|w_1|^2 + \mu(A)|w_2|^2. \]
Hence,
\[ \frac{w^{\ast}\operatorname{diag}(\lambda(A),\mu(A))w}{w^{\ast}w} = \frac{\lambda(A)|w_1|^2 + \mu(A)|w_2|^2}{|w_1|^2 + |w_2|^2} \qquad (11) \]
\[ \leq \frac{\lambda(A)|w_1|^2 + \lambda(A)|w_2|^2}{|w_1|^2 + |w_2|^2} \quad (\text{since } \mu(A) \leq \lambda(A), \ |w_2|^2 \geq 0 \text{ and } |w_1|^2 + |w_2|^2 > 0) \quad = \frac{\lambda(A)\left(|w_1|^2 + |w_2|^2\right)}{|w_1|^2 + |w_2|^2} = \lambda(A), \]
and there exists a $w \in \mathbb{C}^2\setminus\{0\}$ satisfying $\dfrac{w^{\ast}\operatorname{diag}(\lambda(A),\mu(A))w}{w^{\ast}w} = \lambda(A)$ (in fact, set $w = \begin{pmatrix} 1 \\ 0 \end{pmatrix}$; then $w^{\ast} = (1,0)$ and
\[ \frac{w^{\ast}\operatorname{diag}(\lambda(A),\mu(A))w}{w^{\ast}w} = \frac{(1,0)\begin{pmatrix} \lambda(A) \\ 0 \end{pmatrix}}{(1,0)\begin{pmatrix} 1 \\ 0 \end{pmatrix}} = \frac{1\cdot\lambda(A) + 0\cdot 0}{1\cdot 1 + 0\cdot 0} = \lambda(A) \ \text{)}. \]
Thus,
\[ \max\left\{\frac{w^{\ast}\operatorname{diag}(\lambda(A),\mu(A))w}{w^{\ast}w} \;\middle|\; w \in \mathbb{C}^2\setminus\{0\}\right\} = \lambda(A). \qquad (12) \]
Besides, for every $w \in \mathbb{C}^2\setminus\{0\}$, (11) yields
\[ \frac{w^{\ast}\operatorname{diag}(\lambda(A),\mu(A))w}{w^{\ast}w} = \frac{\lambda(A)|w_1|^2 + \mu(A)|w_2|^2}{|w_1|^2 + |w_2|^2} \geq \frac{\mu(A)|w_1|^2 + \mu(A)|w_2|^2}{|w_1|^2 + |w_2|^2} = \mu(A) \quad (\text{since } \lambda(A) \geq \mu(A) \text{ and } |w_1|^2 \geq 0), \]
and there exists a $w \in \mathbb{C}^2\setminus\{0\}$ satisfying $\dfrac{w^{\ast}\operatorname{diag}(\lambda(A),\mu(A))w}{w^{\ast}w} = \mu(A)$ (in fact, set $w = \begin{pmatrix} 0 \\ 1 \end{pmatrix}$; then $w^{\ast} = (0,1)$ and
\[ \frac{w^{\ast}\operatorname{diag}(\lambda(A),\mu(A))w}{w^{\ast}w} = \frac{(0,1)\begin{pmatrix} 0 \\ \mu(A) \end{pmatrix}}{(0,1)\begin{pmatrix} 0 \\ 1 \end{pmatrix}} = \frac{0\cdot 0 + 1\cdot\mu(A)}{0\cdot 0 + 1\cdot 1} = \mu(A) \ \text{)}. \]
Thus,
\[ \min\left\{\frac{w^{\ast}\operatorname{diag}(\lambda(A),\mu(A))w}{w^{\ast}w} \;\middle|\; w \in \mathbb{C}^2\setminus\{0\}\right\} = \mu(A). \qquad (13) \]
Now, (12) and (10) yield
\[ \lambda(A) = \max\left\{\frac{v^{\ast}Av}{v^{\ast}v} \;\middle|\; v \in \mathbb{C}^2\setminus\{0\}\right\}, \]
and (13) and (10) yield
\[ \mu(A) = \min\left\{\frac{v^{\ast}Av}{v^{\ast}v} \;\middle|\; v \in \mathbb{C}^2\setminus\{0\}\right\}. \]
Thus, Lemma 1 g) is proven.
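Lemma 1 g) says that every Rayleigh quotient of $A$ lies between $\mu(A)$ and $\lambda(A)$; a quick randomized check (an illustration only, with an arbitrary sample matrix):

```python
import math, random

random.seed(0)

# Sample Hermitian matrix A = ((a, b), (c, d)): a, d real, b = conj(c)
a, d = 3.0, -1.0
c = 2.0 + 1.0j
b = c.conjugate()

tr, det = a + d, (a*d - b*c).real
rho = math.sqrt(tr**2 - 4*det)
lam, mu = (tr + rho)/2, (tr - rho)/2

def rayleigh(v1, v2):
    """(v* A v) / (v* v) for v = (v1, v2); this is real since A is Hermitian."""
    num = (v1.conjugate()*(a*v1 + b*v2) + v2.conjugate()*(c*v1 + d*v2)).real
    return num / (abs(v1)**2 + abs(v2)**2)

# Lemma 1 g): every Rayleigh quotient lies in the interval [mu(A), lambda(A)]
for _ in range(1000):
    v1 = complex(random.uniform(-1, 1), random.uniform(-1, 1))
    v2 = complex(random.uniform(-1, 1), random.uniform(-1, 1))
    assert mu - 1e-9 <= rayleigh(v1, v2) <= lam + 1e-9
print(mu, lam)
```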
c) According to Lemma 1 f), there exists a matrix $S(A) \in \operatorname{U}_2(\mathbb{C})$ such that $(S(A))^{\ast}\cdot A\cdot S(A) = \operatorname{diag}(\lambda(A),\mu(A))$.
We have $S(A) \in \operatorname{GL}_2(\mathbb{C})$ (since $S(A) \in \operatorname{U}_2(\mathbb{C}) = \left\{U \in \operatorname{GL}_2(\mathbb{C}) \mid U^{\ast} = U^{-1}\right\}$). In other words, the matrix $S(A)$ is invertible. Thus, $S(A)\cdot\begin{pmatrix} 0 \\ 1 \end{pmatrix} \neq 0$ (since $\begin{pmatrix} 0 \\ 1 \end{pmatrix} \neq 0$).
Let $v = S(A)\cdot\begin{pmatrix} 0 \\ 1 \end{pmatrix}$. Then, $v \neq 0$. Now,
\[ v^{\ast}Av = \begin{pmatrix} 0 \\ 1 \end{pmatrix}^{\ast}\cdot\underbrace{(S(A))^{\ast}\cdot A\cdot S(A)}_{=\operatorname{diag}(\lambda(A),\mu(A))}\cdot\begin{pmatrix} 0 \\ 1 \end{pmatrix} = (0,1)\cdot\begin{pmatrix} 0 \\ \mu(A) \end{pmatrix} = 0\cdot 0 + 1\cdot\mu(A) = \mu(A). \qquad (14) \]
Now, if the matrix $A$ is positive definite, then $v^{\ast}Av > 0$ (since $v \neq 0$), which becomes $\mu(A) > 0$ (by (14)), so that $\lambda(A) \geq \mu(A) > 0$ (since $\lambda(A) \geq \mu(A)$). Besides, if the matrix $A$ is nonnegative definite, then $v^{\ast}Av \geq 0$, which becomes $\mu(A) \geq 0$ (by (14)), so that $\lambda(A) \geq \mu(A) \geq 0$ (since $\lambda(A) \geq \mu(A)$). Thus, Lemma 1 c) is proven.
h) According to Lemma 1 f), there exists a matrix $S(A) \in \operatorname{U}_2(\mathbb{C})$ such that $(S(A))^{\ast}\cdot A\cdot S(A) = \operatorname{diag}(\lambda(A),\mu(A))$.
Note that $S(A) \in \operatorname{U}_2(\mathbb{C}) = \left\{U \in \operatorname{GL}_2(\mathbb{C}) \mid U^{\ast} = U^{-1}\right\}$ yields $S(A) \in \operatorname{GL}_2(\mathbb{C})$ and $(S(A))^{\ast} = (S(A))^{-1}$, and thus $(S(A))^{\ast}\cdot S(A) = I_2$. Thus,
\[ 1 = \det I_2 = \det\left((S(A))^{\ast}\cdot S(A)\right) = \det\underbrace{(S(A))^{\ast}}_{=\overline{S(A)}^T}\cdot\det(S(A)) = \underbrace{\det\left(\overline{S(A)}^T\right)}_{=\det\overline{S(A)}=\overline{\det(S(A))}}\cdot\det(S(A)) = \overline{\det(S(A))}\cdot\det(S(A)) = |\det(S(A))|^2, \]
so that $|\det(S(A))| = 1$ (since $|\det(S(A))| \in \mathbb{R}_{\geq 0}$).
Since the field $\mathbb{C}$ is algebraically closed, there exists some $\tau \in \mathbb{C}$ such that $\tau^2 = \det(S(A))$. We have $|\tau|^2 = |\tau^2| = |\det(S(A))| = 1$, so that $|\tau| = 1$ (since $|\tau| \in \mathbb{R}_{\geq 0}$), and thus $\tau \neq 0$ and $\overline{\tau}\tau = |\tau|^2 = 1^2 = 1$, so that $\overline{\tau} = \dfrac{1}{\tau}$.
Define a matrix $\widetilde{S}(A)$ by $\widetilde{S}(A) = \dfrac{1}{\tau}S(A)$. Then, $\widetilde{S}(A) = \dfrac{1}{\tau}S(A) \in \operatorname{GL}_2(\mathbb{C})$ (since $S(A) \in \operatorname{GL}_2(\mathbb{C})$ and $\dfrac{1}{\tau} \neq 0$) and
\[ \widetilde{S}(A)^{\ast} = \left(\frac{1}{\tau}S(A)\right)^{\ast} = \underbrace{\overline{\left(\frac{1}{\tau}\right)}}_{=\overline{\tau}^{-1}}(S(A))^{\ast} = \overline{\tau}^{-1}\underbrace{(S(A))^{\ast}}_{=(S(A))^{-1}} = \Big(\underbrace{\overline{\tau}}_{=1/\tau}S(A)\Big)^{-1} = \left(\frac{1}{\tau}S(A)\right)^{-1} = \widetilde{S}(A)^{-1}. \]
Thus, $\widetilde{S}(A) \in \left\{U \in \operatorname{GL}_2(\mathbb{C}) \mid U^{\ast} = U^{-1}\right\} = \operatorname{U}_2(\mathbb{C})$. Besides,
\[ \widetilde{S}(A)^{\ast}\cdot A\cdot\widetilde{S}(A) = \overline{\tau}^{-1}(S(A))^{\ast}\cdot A\cdot\frac{1}{\tau}S(A) = \underbrace{\frac{1}{\overline{\tau}\tau}}_{=1,\text{ since }\overline{\tau}\tau=1}\underbrace{(S(A))^{\ast}\cdot A\cdot S(A)}_{=\operatorname{diag}(\lambda(A),\mu(A))} = \operatorname{diag}(\lambda(A),\mu(A)) \]
and
\[ \det\left(\widetilde{S}(A)\right) = \det\left(\frac{1}{\tau}S(A)\right) = \left(\frac{1}{\tau}\right)^2\underbrace{\det(S(A))}_{=\tau^2} = \left(\frac{1}{\tau}\right)^2\tau^2 = 1. \]
Thus, we have proven that there exists a matrix $\widetilde{S}(A) \in \operatorname{U}_2(\mathbb{C})$ such that
\[ \widetilde{S}(A)^{\ast}\cdot A\cdot\widetilde{S}(A) = \operatorname{diag}(\lambda(A),\mu(A)) \quad\text{and}\quad \det\left(\widetilde{S}(A)\right) = 1. \]
In other words, we have proven Lemma 1 h).
A conclusion from Lemma 1:
Lemma 2. Let $B_1, B_2, \ldots, B_n$ be $n$ Hermitian $2\times 2$ complex matrices. Then,
\[ \lambda\left(\sum_{k=1}^{n} B_k\right) \leq \sum_{k=1}^{n}\lambda(B_k) \quad\text{and}\quad \det\left(\sum_{k=1}^{n} B_k\right) \geq \sum_{k=1}^{n}\lambda(B_k)\cdot\sum_{k=1}^{n}\mu(B_k). \]
Proof of Lemma 2. For every $k \in \{1,2,\ldots,n\}$, we have
\[ \lambda(B_k) = \max\left\{\frac{v^{\ast}B_k v}{v^{\ast}v} \;\middle|\; v \in \mathbb{C}^2\setminus\{0\}\right\} \]
(by Lemma 1 g), applied to $A = B_k$). Also,
\[ \lambda\left(\sum_{k=1}^{n} B_k\right) = \max\left\{\frac{v^{\ast}\left(\sum_{k=1}^{n} B_k\right)v}{v^{\ast}v} \;\middle|\; v \in \mathbb{C}^2\setminus\{0\}\right\} \quad \left(\text{by Lemma 1 g), applied to } A = \sum_{k=1}^{n} B_k\right) \]
\[ = \max\left\{\sum_{k=1}^{n}\frac{v^{\ast}B_k v}{v^{\ast}v} \;\middle|\; v \in \mathbb{C}^2\setminus\{0\}\right\} \in \left\{\sum_{k=1}^{n}\frac{v^{\ast}B_k v}{v^{\ast}v} \;\middle|\; v \in \mathbb{C}^2\setminus\{0\}\right\}. \]
Hence, there exists some $w \in \mathbb{C}^2\setminus\{0\}$ such that $\lambda\left(\sum_{k=1}^{n} B_k\right) = \sum_{k=1}^{n}\dfrac{w^{\ast}B_k w}{w^{\ast}w}$. Thus,
\[ \lambda\left(\sum_{k=1}^{n} B_k\right) = \sum_{k=1}^{n}\frac{w^{\ast}B_k w}{w^{\ast}w} \leq \sum_{k=1}^{n}\lambda(B_k) \]
\[ \left(\text{since } \frac{w^{\ast}B_k w}{w^{\ast}w} \in \left\{\frac{v^{\ast}B_k v}{v^{\ast}v} \;\middle|\; v \in \mathbb{C}^2\setminus\{0\}\right\} \text{ and thus } \frac{w^{\ast}B_k w}{w^{\ast}w} \leq \max\left\{\frac{v^{\ast}B_k v}{v^{\ast}v} \;\middle|\; v \in \mathbb{C}^2\setminus\{0\}\right\} = \lambda(B_k) \text{ for every } k \in \{1,2,\ldots,n\}\right). \]
Now, let $\widetilde{B} = \sum_{k=1}^{n} B_k$. Then,
\[ \operatorname{Tr}\widetilde{B} = \operatorname{Tr}\left(\sum_{k=1}^{n} B_k\right) = \sum_{k=1}^{n}\operatorname{Tr}B_k \quad (\text{since the trace is linear}). \]
Now, $\lambda(\widetilde{B}) + \mu(\widetilde{B}) = \operatorname{Tr}\widetilde{B}$ (by Lemma 1 d), applied to $A = \widetilde{B}$).
Besides, let $\lambda_\Sigma = \sum_{k=1}^{n}\lambda(B_k)$. Then, $\lambda(\widetilde{B}) = \lambda\left(\sum_{k=1}^{n} B_k\right) \leq \sum_{k=1}^{n}\lambda(B_k) = \lambda_\Sigma$, so that $\lambda(\widetilde{B}) - \lambda_\Sigma \leq 0$. Also,
\[ \operatorname{Tr}\widetilde{B} - \left(\lambda(\widetilde{B}) + \lambda_\Sigma\right) \leq \operatorname{Tr}\widetilde{B} - \left(\lambda(\widetilde{B}) + \lambda(\widetilde{B})\right) \quad (\text{since } \lambda_\Sigma \geq \lambda(\widetilde{B})) \]
\[ \leq \operatorname{Tr}\widetilde{B} - \underbrace{\left(\lambda(\widetilde{B}) + \mu(\widetilde{B})\right)}_{=\operatorname{Tr}\widetilde{B}} \quad \left(\text{since } \lambda(\widetilde{B}) \geq \mu(\widetilde{B}) \text{ by Lemma 1 b), applied to } A = \widetilde{B}\right) \quad = \operatorname{Tr}\widetilde{B} - \operatorname{Tr}\widetilde{B} = 0. \]
Thus,
\[ \lambda(\widetilde{B})\cdot\left(\operatorname{Tr}\widetilde{B} - \lambda(\widetilde{B})\right) - \lambda_\Sigma\cdot\left(\operatorname{Tr}\widetilde{B} - \lambda_\Sigma\right) = \left(\lambda(\widetilde{B}) - \lambda_\Sigma\right)\cdot\operatorname{Tr}\widetilde{B} - \underbrace{\left(\lambda(\widetilde{B})^2 - \lambda_\Sigma^2\right)}_{=\left(\lambda(\widetilde{B})-\lambda_\Sigma\right)\left(\lambda(\widetilde{B})+\lambda_\Sigma\right)} \]
\[ = \underbrace{\left(\lambda(\widetilde{B}) - \lambda_\Sigma\right)}_{\leq 0}\cdot\underbrace{\left(\operatorname{Tr}\widetilde{B} - \left(\lambda(\widetilde{B}) + \lambda_\Sigma\right)\right)}_{\leq 0} \geq 0, \]
so that
\[ \lambda(\widetilde{B})\cdot\left(\operatorname{Tr}\widetilde{B} - \lambda(\widetilde{B})\right) \geq \lambda_\Sigma\cdot\left(\operatorname{Tr}\widetilde{B} - \lambda_\Sigma\right). \qquad (15) \]
Hence,
\[ \det\left(\sum_{k=1}^{n} B_k\right) = \det\widetilde{B} = \lambda(\widetilde{B})\cdot\left(\operatorname{Tr}\widetilde{B} - \lambda(\widetilde{B})\right) \quad \left(\text{by Lemma 1 d), applied to } A = \widetilde{B}\right) \]
\[ \geq \lambda_\Sigma\cdot\left(\operatorname{Tr}\widetilde{B} - \lambda_\Sigma\right) \quad (\text{by (15)}) \]
\[ = \sum_{k=1}^{n}\lambda(B_k)\cdot\left(\sum_{k=1}^{n}\operatorname{Tr}B_k - \sum_{k=1}^{n}\lambda(B_k)\right) \quad \left(\text{since } \lambda_\Sigma = \sum_{k=1}^{n}\lambda(B_k) \text{ and } \operatorname{Tr}\widetilde{B} = \sum_{k=1}^{n}\operatorname{Tr}B_k\right) \]
\[ = \sum_{k=1}^{n}\lambda(B_k)\cdot\sum_{k=1}^{n}\mu(B_k) \]
\[ \left(\text{since } \sum_{k=1}^{n}\operatorname{Tr}B_k - \sum_{k=1}^{n}\lambda(B_k) = \sum_{k=1}^{n}\left(\operatorname{Tr}B_k - \lambda(B_k)\right) = \sum_{k=1}^{n}\mu(B_k), \text{ because } \operatorname{Tr}B_k - \lambda(B_k) = \mu(B_k) \right. \]
\[ \left. \text{for every } k \in \{1,2,\ldots,n\}, \text{ since } \lambda(B_k) + \mu(B_k) = \operatorname{Tr}B_k \text{ by Lemma 1 d), applied to } A = B_k\right), \]
and Lemma 2 is proven.
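Lemma 2 can be stress-tested numerically on random Hermitian matrices (an illustration, not a proof):

```python
import math, random

random.seed(1)

def rand_hermitian():
    """Random Hermitian 2x2 matrix ((a, b), (c, d)): a, d real, b = conj(c)."""
    a, d = random.uniform(-5, 5), random.uniform(-5, 5)
    c = complex(random.uniform(-5, 5), random.uniform(-5, 5))
    return ((a + 0j, c.conjugate()), (c, d + 0j))

def lam_mu(M):
    """The numbers lambda(M) and mu(M) from Lemma 1 b)."""
    (a, b), (c, d) = M
    tr, det = (a + d).real, (a*d - b*c).real
    rho = math.sqrt(max(tr**2 - 4*det, 0.0))
    return (tr + rho)/2, (tr - rho)/2

# Lemma 2: det(sum B_k) >= (sum lambda(B_k)) * (sum mu(B_k))
for _ in range(200):
    Bs = [rand_hermitian() for _ in range(4)]
    S = ((sum(B[0][0] for B in Bs), sum(B[0][1] for B in Bs)),
         (sum(B[1][0] for B in Bs), sum(B[1][1] for B in Bs)))
    det_S = (S[0][0]*S[1][1] - S[0][1]*S[1][0]).real
    lam_sum = sum(lam_mu(B)[0] for B in Bs)
    mu_sum = sum(lam_mu(B)[1] for B in Bs)
    assert det_S >= lam_sum * mu_sum - 1e-6
print("ok")
```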
Now let us solve the problem:
For every $k \in \{1,2,\ldots,n\}$, there exists a matrix $S(A_k) \in \operatorname{U}_2(\mathbb{C})$ such that
\[ (S(A_k))^{\ast}\cdot A_k\cdot S(A_k) = \operatorname{diag}(\lambda(A_k),\mu(A_k)) \qquad (16) \]
(according to Lemma 1 f), applied to $A = A_k$). These matrices $S(A_1), S(A_2), \ldots, S(A_n)$ satisfy $(S(A_1), S(A_2), \ldots, S(A_n)) \in \left(\operatorname{U}_2(\mathbb{C})\right)^n$ and
\[ F(S(A_1), S(A_2), \ldots, S(A_n)) = \det\left(\sum_{k=1}^{n}\underbrace{(S(A_k))^{\ast}\cdot A_k\cdot S(A_k)}_{=\operatorname{diag}(\lambda(A_k),\mu(A_k))\text{ by (16)}}\right) = \det\underbrace{\left(\sum_{k=1}^{n}\operatorname{diag}(\lambda(A_k),\mu(A_k))\right)}_{=\operatorname{diag}\left(\sum_{k=1}^{n}\lambda(A_k),\,\sum_{k=1}^{n}\mu(A_k)\right)} \]
\[ = \det\left(\operatorname{diag}\left(\sum_{k=1}^{n}\lambda(A_k), \sum_{k=1}^{n}\mu(A_k)\right)\right) = \sum_{k=1}^{n}\lambda(A_k)\cdot\sum_{k=1}^{n}\mu(A_k). \]
Hence,
\[ \sum_{k=1}^{n}\lambda(A_k)\cdot\sum_{k=1}^{n}\mu(A_k) \in \left\{F(U) \mid U \in \left(\operatorname{U}_2(\mathbb{C})\right)^n\right\}. \qquad (17) \]
On the other hand, for every $(U_1, U_2, \ldots, U_n) \in \left(\operatorname{U}_2(\mathbb{C})\right)^n$, we have
\[ F(U_1, U_2, \ldots, U_n) = \det\left(\sum_{k=1}^{n} U_k^{\ast}A_k U_k\right) \geq \sum_{k=1}^{n}\lambda\left(U_k^{\ast}A_k U_k\right)\cdot\sum_{k=1}^{n}\mu\left(U_k^{\ast}A_k U_k\right) \]
(by Lemma 2, applied to $B_k = U_k^{\ast}A_k U_k$; this is allowed because the matrix $U_k^{\ast}A_k U_k$ is Hermitian for every $k \in \{1,2,\ldots,n\}$, since $\left(U_k^{\ast}A_k U_k\right)^{\ast} = U_k^{\ast}A_k^{\ast}\left(U_k^{\ast}\right)^{\ast} = U_k^{\ast}A_k U_k$, because $A_k^{\ast} = A_k$ (since the matrix $A_k$ is Hermitian) and $\left(U_k^{\ast}\right)^{\ast} = U_k$)
\[ = \sum_{k=1}^{n}\lambda(A_k)\cdot\sum_{k=1}^{n}\mu(A_k) \]
(since for every $k \in \{1,2,\ldots,n\}$, we have $U_k^{\ast} = U_k^{-1}$ (because $U_k \in \operatorname{U}_2(\mathbb{C})$) and thus $U_k^{\ast}A_k U_k = U_k^{-1}A_k U_k$, so that the matrices $U_k^{\ast}A_k U_k$ and $A_k$ are similar; hence, $\operatorname{Tr}\left(U_k^{\ast}A_k U_k\right) = \operatorname{Tr}A_k$ and $\det\left(U_k^{\ast}A_k U_k\right) = \det A_k$, so that
\[ \lambda\left(U_k^{\ast}A_k U_k\right) = \frac{1}{2}\left(\operatorname{Tr}\left(U_k^{\ast}A_k U_k\right) + \sqrt{\left(\operatorname{Tr}\left(U_k^{\ast}A_k U_k\right)\right)^2 - 4\det\left(U_k^{\ast}A_k U_k\right)}\right) = \frac{1}{2}\left(\operatorname{Tr}A_k + \sqrt{(\operatorname{Tr}A_k)^2 - 4\det A_k}\right) = \lambda(A_k) \]
and similarly $\mu\left(U_k^{\ast}A_k U_k\right) = \mu(A_k)$).
In other words, for every $U \in \left(\operatorname{U}_2(\mathbb{C})\right)^n$, we have
\[ F(U) \geq \sum_{k=1}^{n}\lambda(A_k)\cdot\sum_{k=1}^{n}\mu(A_k). \]
This, together with (17), yields
\[ \min_{U \in \left(\operatorname{U}_2(\mathbb{C})\right)^n} F(U) = \sum_{k=1}^{n}\lambda(A_k)\cdot\sum_{k=1}^{n}\mu(A_k). \qquad (18) \]
Lemma 1 b) and e) yield that for every Hermitian $2\times 2$ complex matrix $A$, the numbers $\lambda(A)$ and $\mu(A)$ are the greatest and the least eigenvalue of the matrix $A$, respectively. Hence, $\lambda(A_j) = \sigma_1(A_j)$ and $\mu(A_j) = \sigma_2(A_j)$ for every $j \in \{1,2,\ldots,n\}$. Thus, (18) becomes
\[ \min_{U \in \left(\operatorname{U}_2(\mathbb{C})\right)^n} F(U) = \sum_{k=1}^{n}\sigma_1(A_k)\cdot\sum_{k=1}^{n}\sigma_2(A_k), \]
and the problem is solved.
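Finally, the whole result can be sanity-checked numerically (an illustration only): $F$ attains the bound $\sum_k \lambda(A_k)\cdot\sum_k \mu(A_k)$ at the diagonalizing unitaries and stays above it at random unitaries. The helper `diagonalizer` below assumes the off-diagonal entry of each matrix is nonzero, which holds almost surely for the random matrices used:

```python
import cmath, math, random

random.seed(2)

def cstar(M):
    """Conjugate transpose of a 2x2 matrix ((a, b), (c, d))."""
    (a, b), (c, d) = M
    return ((a.conjugate(), c.conjugate()), (b.conjugate(), d.conjugate()))

def mul(M, N):
    (a, b), (c, d) = M
    (e, f), (g, h) = N
    return ((a*e + b*g, a*f + b*h), (c*e + d*g, c*f + d*h))

def madd(M, N):
    return tuple(tuple(x + y for x, y in zip(r, s)) for r, s in zip(M, N))

def det2(M):
    return (M[0][0]*M[1][1] - M[0][1]*M[1][0]).real

def rand_hermitian():
    a, d = random.uniform(-5, 5), random.uniform(-5, 5)
    c = complex(random.uniform(-5, 5), random.uniform(-5, 5))
    return ((a + 0j, c.conjugate()), (c, d + 0j))

def rand_unitary():
    """Random 2x2 unitary matrix in the standard SU(2)-style parametrization."""
    t, al, be = (random.uniform(0, 2*math.pi) for _ in range(3))
    return ((cmath.exp(1j*al)*math.cos(t), -cmath.exp(1j*be)*math.sin(t)),
            (cmath.exp(-1j*be)*math.sin(t), cmath.exp(-1j*al)*math.cos(t)))

def lam_mu(M):
    tr, d = (M[0][0] + M[1][1]).real, det2(M)
    rho = math.sqrt(max(tr**2 - 4*d, 0.0))
    return (tr + rho)/2, (tr - rho)/2

def diagonalizer(M):
    """Unitary with normalized eigenvector columns (b, ev - a); needs b != 0."""
    (a, b), (c, d) = M
    cols = []
    for ev in lam_mu(M):
        f, g = b, ev - a
        n = math.sqrt(abs(f)**2 + abs(g)**2)
        cols.append((f/n, g/n))
    return ((cols[0][0], cols[1][0]), (cols[0][1], cols[1][1]))

As = [rand_hermitian() for _ in range(3)]
bound = sum(lam_mu(A)[0] for A in As) * sum(lam_mu(A)[1] for A in As)

def F(Us):
    S = ((0j, 0j), (0j, 0j))
    for U, A in zip(Us, As):
        S = madd(S, mul(mul(cstar(U), A), U))
    return det2(S)

# F attains the bound at the diagonalizing unitaries ...
assert abs(F([diagonalizer(A) for A in As]) - bound) < 1e-6
# ... and never goes below it elsewhere
for _ in range(200):
    assert F([rand_unitary() for _ in range(3)]) >= bound - 1e-6
print("ok")
```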