
3.3 Interpolatory methods for large-scale matrix equations

3.3.7 Sylvester equations

As is well-known, see [117, Section 8.3.1], the latter approach often results in a very slow convergence rate for common PDEs. Since the efficiency of Algorithm 3.3.1 also depends on the convergence rate, we cannot expect it to outperform state-of-the-art low-rank techniques when it comes to computational efficiency. On the other hand, it obviously has the advantage of locally minimizing the residual for a given rank.

Consider now the objective function f(Θ) = tr(B^T X C^T) with X fulfilling (3.47). As is easily seen, this function results from a slight modification of the H2-norm of a dynamical system and thus can be computed as

f(Θ) = vec(BC)^T (−E ⊗ A − H ⊗ M)^{-1} vec(BC).   (3.49)

For later purposes, it is helpful to note that f is invariant under orthonormal transformations.
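For small dense test data, (3.49) can be evaluated directly via its Kronecker formulation. The following is a minimal numpy sketch under that assumption; the helper name f_theta is ours, and column-major (Fortran-order) reshaping is used so that reshape corresponds to the vec operator.

```python
import numpy as np

def f_theta(A, E, M, H, B, C):
    """Evaluate f(Theta) as in (3.49) for small dense data.

    vec(X) solves (-E (x) A - H (x) M) vec(X) = vec(BC), which is the
    vectorized generalized Sylvester equation A X E + M X H + B C = 0
    for symmetric E, H, and f(Theta) = vec(BC)^T vec(X) = tr(B^T X C^T).
    """
    rhs = (B @ C).reshape(-1, order="F")   # vec(BC), column stacking
    K = -np.kron(E, A) - np.kron(H, M)     # -E (x) A - H (x) M
    return rhs @ np.linalg.solve(K, rhs)   # vec(BC)^T vec(X)
```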

Lemma 3.3.2. Let Θ = (A, E, M, H, B, C) denote a set of matrices and let X be the solution of the associated Sylvester equation (3.47). Assume that Λ is a diagonal matrix containing the eigenvalues of the matrix pencil (H, E) and that Q is a matrix containing an orthogonal set of eigenvectors. Then it holds

f(Θ) = tr(B^T X C^T) = tr(B^T Y C̃^T),

where C̃ = CQ and Y is the solution of AY + MYΛ + BC̃ = 0.

Proof. Let Q be the matrix of eigenvectors for the matrix pencil (H, E), i.e., assume that it holds Q^T EQ = I and Q^T HQ = Λ, where Λ is a diagonal matrix consisting of the eigenvalues. Since H = H^T ≺ 0 and E = E^T ≻ 0, this is always possible. If we now postmultiply equation (3.47) with Q, we get

AXEQ + MXHQ + BCQ = 0.

Due to the orthonormality of Q, this can be transformed into

AXQQ^T EQ + MXQQ^T HQ + BCQ = 0.

If we denote Y = XQ and C̃ = CQ, it follows that AY + MYΛ + BC̃ = 0, which implies that

tr(B^T Y C̃^T) = tr(B^T XQQ^T C^T) = tr(B^T X C^T).
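The diagonalization used in the lemma is easy to check numerically: scipy.linalg.eigh applied to the pencil (H, E) returns eigenvectors normalized such that Q^T EQ = I, after which the transformed equation decouples column by column. A small sketch (the helper name is ours), assuming A + λ_i M is nonsingular for all eigenvalues λ_i:

```python
import numpy as np
from scipy.linalg import eigh

def f_theta_diagonalized(A, E, M, H, B, C):
    """Evaluate f(Theta) via Lemma 3.3.2 (sketch for small dense data).

    eigh(H, E) yields lam, Q with Q^T E Q = I and Q^T H Q = diag(lam),
    so Y solving A Y + M Y Lam + B (CQ) = 0 is obtained columnwise as
    y_i = -(A + lam_i M)^{-1} B (CQ)[:, i].
    """
    lam, Q = eigh(H, E)                # pencil eigendecomposition
    Ctil = C @ Q
    Y = np.column_stack([
        np.linalg.solve(-(A + li * M), B @ Ctil[:, i])
        for i, li in enumerate(lam)
    ])
    return np.trace(B.T @ Y @ Ctil.T)  # equals tr(B^T X C^T)
```

Comparing f_theta and f_theta_diagonalized on random symmetric test data illustrates the invariance claim.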


Assume now that we have constructed a reduced set of matrices Θ̂ = (Â, Ê, M̂, Ĥ, B̂, Ĉ) by the following projection:

Â = V^T AV,  Ê = W^T EW,  M̂ = V^T MV,
Ĥ = W^T HW,  B̂ = V^T B,  Ĉ = CW.   (3.50)

Next, for Θ and Θ̂ we define the corresponding error set

Θ_err = (A_err, E_err, M_err, H_err, B_err, C_err), with

A_err = [−A 0; 0 Â],  E_err = [−E 0; 0 Ê],  M_err = [−M 0; 0 M̂],
H_err = [−H 0; 0 Ĥ],  B_err = [B; B̂],  C_err = [C Ĉ].

Similar to the previous cases, it is easy to show that a crucial lower bound is given by the objective function f evaluated at the error set Θ_err.

Corollary 3.3.1. Let Θ and Θ̂ denote two sets of matrices associated with large and reduced generalized Sylvester equations of the form (3.47), respectively. Then, for the associated error set Θ_err, it holds that

f(Θ_err) ≤ f(Θ) − f(Θ̂).
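For experimentation, the error set can be assembled directly with block matrices; combined with the f_theta sketch above, this makes the bound of Corollary 3.3.1 easy to check on small examples. The helper name and the use of scipy.linalg.block_diag are our choices:

```python
import numpy as np
from scipy.linalg import block_diag

def error_set(A, E, M, H, B, C, Ar, Er, Mr, Hr, Br, Cr):
    """Assemble Theta_err from a full set and a reduced set (sketch).

    The 'r' suffix stands for the hatted (reduced) quantities of (3.50).
    """
    return (block_diag(-A, Ar), block_diag(-E, Er),
            block_diag(-M, Mr), block_diag(-H, Hr),
            np.vstack([B, Br]), np.hstack([C, Cr]))

# f_theta(*error_set(...)) is then bounded by the difference of the
# full and reduced objective values, as stated in Corollary 3.3.1.
```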

Hence, analogously to H2-optimal model order reduction, we want to find a local minimizer of f(Θ_err). This can be done by deriving first-order necessary conditions based on the computation formula for f(Θ_err). Due to the structural similarity to the previous sections, we only briefly mention how to proceed. For convenience, let us start with the case B = b and C = c^T. First of all, according to Lemma 3.3.2, we may w.l.o.g. assume that H_err = Λ_err and E_err = I. Consequently, the objective function simplifies according to

f(Θ_err) = b_err^T X_err c_err
         = (c_err^T ⊗ b_err^T)(−Λ_err ⊗ M_err − I ⊗ A_err)^{-1}(c_err ⊗ b_err)
         = ∑_{i=1}^{n+n̂} b_err^T(−λ_i M_err − A_err)^{-1} b_err (c_err^{(i)})^2,

where c_err^{(i)} denotes the i-th component of c_err. Setting the derivative of f(Θ_err) with respect to ĉ^{(j)} equal to zero yields

2 ĉ^{(j)} b_err^T(−λ̂_j M_err − A_err)^{-1} b_err = 0,

with λ̂_j being the j-th eigenvalue of (Ĥ, Ê). However, in terms of interpolation, the above means that

b^T(−λ̂_j M − A)^{-1} b = b̂^T(−λ̂_j M̂ − Â)^{-1} b̂.   (3.51)

Similarly, for the derivative with respect to λ̂_j, we obtain

b^T(−λ̂_j M − A)^{-1} M (−λ̂_j M − A)^{-1} b = b̂^T(−λ̂_j M̂ − Â)^{-1} M̂ (−λ̂_j M̂ − Â)^{-1} b̂.   (3.52)

Hence, these conditions are obviously an extension of the Hermite interpolation conditions for H2-optimality.
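Once a reduced set of matrices is available (see Algorithm 3.3.2 below), condition (3.51) can be checked numerically. A minimal sketch, with Ar, Mr, br standing for the hatted reduced quantities:

```python
import numpy as np

def check_interpolation(A, M, b, Ar, Mr, br, lam):
    """Return both sides of (3.51) at the mirrored eigenvalue lam
    (a sketch; Ar, Mr, br denote the hatted reduced matrices)."""
    lhs = b @ np.linalg.solve(-lam * M - A, b)
    rhs = br @ np.linalg.solve(-lam * Mr - Ar, br)
    return lhs, rhs   # agree up to round-off at a fixed point
```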

On the other hand, we have

f(Θ_err) = b_err^T X_err c_err = c_err^T X_err^T b_err,

and the same argumentation leads to

G(−µ̂_j) = Ĝ(−µ̂_j),  G′(−µ̂_j) = Ĝ′(−µ̂_j),   (3.53)

with

G(s) = c^T(sE − H)^{-1} c,  Ĝ(s) = ĉ^T(sÊ − Ĥ)^{-1} ĉ

and µ̂_j being the eigenvalues of the matrix pencil (Â, M̂). Hence, we propose Algorithm 3.3.2 for iteratively constructing a reduced set of matrices fulfilling these conditions.

Algorithm 3.3.2 IRKA for symmetric Sylvester equations ((Sy)²IRKA)

Input: Initial selection of real interpolation points σ_i and µ_i for i = 1, . . . , n̂ and a convergence tolerance tol.
Output: X_n̂ = V X̂ W^T fulfilling first-order necessary conditions

1: Choose V and W s.t. V = span{(σ_1 M − A)^{-1} b, . . . , (σ_n̂ M − A)^{-1} b}, W = span{(µ_1 E − H)^{-1} c, . . . , (µ_n̂ E − H)^{-1} c} and V^T V = W^T W = I.
2: while relative change in {σ_i, µ_i} > tol do
3:   Â = V^T AV, M̂ = V^T MV, Ê = W^T EW, Ĥ = W^T HW
4:   assign σ_i ← −λ_i(Ĥ, Ê) and µ_i ← −λ_i(Â, M̂) for i = 1, . . . , n̂
5:   update V and W s.t. V = span{(σ_1 M − A)^{-1} b, . . . , (σ_n̂ M − A)^{-1} b}, W = span{(µ_1 E − H)^{-1} c, . . . , (µ_n̂ E − H)^{-1} c} and V^T V = W^T W = I.
6: end while
7: b̂ = V^T b, ĉ = W^T c
8: Solve Â X̂ Ê + M̂ X̂ Ĥ + b̂ ĉ^T = 0.
9: Set X_n̂ = V X̂ W^T.

Remark 3.3.6. Due to the connection to optimal H2-model reduction, it should be mentioned that instead of Step 5 of Algorithm 3.3.2, one can alternatively solve two reduced Sylvester equations of the form

AVÊ + MVĤ + bĉ^T = 0,  EWÂ + HWM̂ + cb̂^T = 0.

For a robust solver for these types of equations, we refer to, e.g., [25].
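The iteration of Algorithm 3.3.2 can be sketched compactly with dense linear algebra. The following Python sketch (function and variable names are ours) assumes symmetric data with E, M positive definite and A, H negative definite, so that scipy's symmetric pencil solver applies in Step 4; it is an illustration, not a robust large-scale implementation.

```python
import numpy as np
from scipy.linalg import eigh, orth

def sy2irka(A, E, M, H, b, c, sigma, mu, tol=1e-8, maxit=50):
    """Dense-algebra sketch of Algorithm 3.3.2, (Sy)^2 IRKA.

    Assumes E, M symmetric positive definite and A, H symmetric
    negative definite; sigma, mu are the initial interpolation points.
    """
    sigma, mu = np.asarray(sigma, float), np.asarray(mu, float)
    for _ in range(maxit):
        # Steps 1/5: orthonormal bases of the rational Krylov spaces
        V = orth(np.column_stack(
            [np.linalg.solve(s * M - A, b) for s in sigma]))
        W = orth(np.column_stack(
            [np.linalg.solve(m * E - H, c) for m in mu]))
        # Step 3: projected matrices as in (3.50)
        Ar, Mr = V.T @ A @ V, V.T @ M @ V
        Er, Hr = W.T @ E @ W, W.T @ H @ W
        # Step 4: mirror the eigenvalues of (Hr, Er) and (Ar, Mr)
        sigma_new = -eigh(Hr, Er, eigvals_only=True)
        mu_new = -eigh(Ar, Mr, eigvals_only=True)
        done = (np.linalg.norm(np.sort(sigma_new) - np.sort(sigma))
                <= tol * np.linalg.norm(sigma))
        sigma, mu = sigma_new, mu_new
        if done:
            break
    # Steps 7-9: solve the reduced equation and lift the solution
    br, cr = V.T @ b, W.T @ c
    K = -np.kron(Er, Ar) - np.kron(Hr, Mr)
    Xr = np.linalg.solve(K, np.outer(br, cr).reshape(-1, order="F"))
    Xr = Xr.reshape(len(sigma), len(sigma), order="F")
    return V @ Xr @ W.T

```

Upon convergence, the returned matrix realizes X_n̂ of Step 9, and the interpolation conditions can be verified with the check_interpolation sketch above.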

It remains to show that, in the case of convergence of Algorithm 3.3.2, the lower bound of Corollary 3.3.1 is actually attained. For this, we assume the following splitting of the solution of (3.47) for the error system:

X_err = [X Y; Z X̂] = [X 0; 0 0] + [0 Y; 0 X̂] + [0 0; Z X̂] − [0 0; 0 X̂].

Hence, we get

f(Θ_err) = f(Θ) − f(Θ̂) + b_err^T [Y; X̂] ĉ + b̂^T [Z X̂] c_err.

A closer look at the right-hand side of the previous equation reveals that

b_err^T [Y; X̂] ĉ = b^T Y ĉ + b̂^T X̂ ĉ,

where Y is the solution of

−AY − MYΞ + b ĉ^T = 0.

Here, we again assumed that the reduced matrix pencil (Ĥ, Ê) is given in its eigenvalue decomposition and that the eigenvalues are contained in the diagonal matrix Ξ. As a consequence, it holds that

(ĉ^T ⊗ b^T) vec(Y) = −(ĉ^T ⊗ b^T)(−Ξ ⊗ M − I ⊗ A)^{-1}(ĉ ⊗ b).

On the other hand, we know that

(ĉ^T ⊗ b̂^T) vec(X̂) = (ĉ^T ⊗ b̂^T)(−Ξ ⊗ M̂ − I ⊗ Â)^{-1}(ĉ ⊗ b̂),

which, together with the interpolation conditions, yields b_err^T [Y; X̂] ĉ = 0. Similarly, we can show that b̂^T [Z X̂] c_err = 0.

Analogously to the proof of Theorem 3.3.1, one can eventually show that

vec(X − V X̂ W^T)^T (−L_S) vec(X − V X̂ W^T) = f(Θ) − f(Θ̂).

Altogether, we have thus proven our main result.

Theorem 3.3.4. Let Θ = (A, E, M, H, b, c^T) denote a set of matrices determining a Sylvester equation as in (3.47) with solution X. Let further X_n̂ be computed by Algorithm 3.3.2 with convergence tolerance 0. Then X_n̂ is a local minimizer of

min_{X_k ∈ M} { vec(X − X_k)^T (−L_S) vec(X − X_k) }.
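For small dense data, the objective of Theorem 3.3.4 can be evaluated directly. A small sketch, assuming (as suggested by (3.49)) that L_S = E ⊗ A + H ⊗ M denotes the vectorized Sylvester operator:

```python
import numpy as np

def energy_error(A, E, M, H, X, Xn):
    """Evaluate vec(X - Xn)^T (-L_S) vec(X - Xn) with
    L_S = E (x) A + H (x) M (sketch for small dense data)."""
    LS = np.kron(E, A) + np.kron(H, M)
    e = (X - Xn).reshape(-1, order="F")
    return e @ (-LS) @ e  # equals f(Theta) - f(Theta_hat) on convergence
```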

Extension to the MIMO case

So far, we have proved the result for a right-hand side of rank 1, i.e., b and c being vectors. As the extension to the 'MIMO' case is straightforward, we only sketch the necessary steps in the following. What remains to be clarified are suitable optimality conditions for the MIMO case in terms of either matrix equations or tangential interpolation conditions.


For this, let us have a look at the objective function f evaluated at the error set,

f(Θ_err) = tr(B_err^T X_err C_err^T),

where X_err = [X Y; Z X̂] is partitioned as before. As we have done for the case of the Lyapunov residual, the first step is to compute the derivative of f with respect to an arbitrary parameter γ that might be one of the entries of A, B, C, E, H or M, respectively.

Accordingly, we obtain

∂f/∂γ = tr( (∂X_err/∂γ) C_err^T B_err^T ) + tr( X_err ∂(C_err^T B_err^T)/∂γ ).

Taking into account that X_err is the solution of the generalized Sylvester equation

A_err X_err E_err + M_err X_err H_err + B_err C_err = 0,

a careful analysis leads to

∂f/∂γ = tr( X_err (∂E_err/∂γ) X_err^T A_err ) + tr( X_err E_err X_err^T (∂A_err/∂γ) )
      + tr( X_err (∂H_err/∂γ) X_err^T M_err ) + tr( X_err H_err X_err^T (∂M_err/∂γ) )
      + 2 tr( X_err ∂(C_err^T B_err^T)/∂γ ).

Depending on the specific choice of γ, we can derive different optimality conditions. For example, setting γ = Ê_{i,j} leads to the condition

−Y^T AY + X̂^T Â X̂ = 0.   (3.54a)

Similarly, for the derivatives with respect to Â_{i,j}, Ĥ_{i,j}, M̂_{i,j}, B̂_{i,j} and Ĉ_{i,j}, we get

−ZEZ^T + X̂ Ê X̂^T = 0,   (3.54b)
−Y^T MY + X̂^T M̂ X̂ = 0,   (3.54c)
−ZHZ^T + X̂ Ĥ X̂^T = 0,   (3.54d)
Y^T B + X̂^T B̂ = 0,   (3.54e)
ZC^T + X̂ Ĉ^T = 0.   (3.54f)
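The off-diagonal blocks Y and Z of X_err follow from the (1,2) and (2,1) blocks of the error Sylvester equation, so conditions (3.54a)-(3.54f) can be checked numerically once a reduced set has been computed. A small dense sketch (helper names ours; the 'r' suffix again denotes the hatted quantities):

```python
import numpy as np

def cross_blocks(A, E, M, H, B, C, Ar, Er, Mr, Hr, Br, Cr):
    """Solve the off-diagonal blocks of X_err (small dense sketch):
    A Y Er + M Y Hr = B Cr  and  Ar Z E + Mr Z H = Br C."""
    n, r = A.shape[0], Ar.shape[0]
    vecY = np.linalg.solve(np.kron(Er, A) + np.kron(Hr, M),
                           (B @ Cr).reshape(-1, order="F"))
    vecZ = np.linalg.solve(np.kron(E, Ar) + np.kron(H, Mr),
                           (Br @ C).reshape(-1, order="F"))
    return vecY.reshape(n, r, order="F"), vecZ.reshape(r, n, order="F")

# With Xr denoting the reduced solution, the residuals
# Y.T @ B + Xr.T @ Br and Z @ C.T + Xr @ Cr.T vanish at a local
# optimum, cf. (3.54e)-(3.54f).
```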

We already know that there is an equivalence between Sylvester equations and the concept of tangential interpolation, see again [62]. Hence, it is not surprising that the above matrix-equation-based conditions can alternatively be replaced by demanding that a reduced-order transfer function matrix tangentially interpolates the original transfer function matrix at given interpolation points. To be precise, let

G(s) = C(sE − H)^{-1} C^T ∈ R(s)^{m×m} and F(s) = B^T(sM − A)^{-1} B ∈ R(s)^{m×m}.

Moreover, assume that (Q, Λ) and (R, Ξ) are the eigenvalue decompositions of the matrix pencils (Ĥ, Ê) and (Â, M̂), respectively. Here, Λ = diag(λ_1, . . . , λ_n̂) and Ξ = diag(µ_1, . . . , µ_n̂) contain the eigenvalues while Q and R consist of a set of Ê- and M̂-orthogonal eigenvectors. A locally optimal reduced set Θ̂ of matrices now has to fulfill

G(−µ_j) b̃_j = Ĝ(−µ_j) b̃_j,   (3.55a)
b̃_j^T G(−µ_j) = b̃_j^T Ĝ(−µ_j),   (3.55b)
b̃_j^T G′(−µ_j) b̃_j = b̃_j^T Ĝ′(−µ_j) b̃_j,   (3.55c)
F(−λ_j) c̃_j = F̂(−λ_j) c̃_j,   (3.55d)
c̃_j^T F(−λ_j) = c̃_j^T F̂(−λ_j),   (3.55e)
c̃_j^T F′(−λ_j) c̃_j = c̃_j^T F̂′(−λ_j) c̃_j,   (3.55f)

with B̃ = B̂^T R and C̃ = ĈQ denoting tangential directions. For the sake of completeness, in Algorithm 3.3.3 we now see a matrix version that upon convergence yields a local minimizer for the MIMO case. Consequently, we have the following result extending Theorem 3.3.4.
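A tangential condition such as (3.55a) can be checked in the same spirit; the sketch below evaluates both sides at s = −µ_j for a given direction b̃_j (again with the 'r' suffix denoting hatted quantities):

```python
import numpy as np

def tangential_match(H, E, C, Hr, Er, Cr, mu, btil):
    """Return G(-mu) btil and G_hat(-mu) btil for (3.55a) (sketch)."""
    lhs = C @ np.linalg.solve(-mu * E - H, C.T @ btil)
    rhs = Cr @ np.linalg.solve(-mu * Er - Hr, Cr.T @ btil)
    return lhs, rhs   # coincide at a locally optimal reduced set
```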

Corollary 3.3.2. Let Θ = (A, E, M, H, B, C) denote a set of matrices determining a Sylvester equation as in (3.47) with solution X. Let further X_n̂ be computed by Algorithm 3.3.3 with convergence tolerance 0. Then X_n̂ is a local minimizer of

min_{X_k ∈ M} { vec(X − X_k)^T (−L_S) vec(X − X_k) }.
