On the genealogy of the IDR family
Jens-Peter M. Zemke
zemke@tu-harburg.de
(partially joint work with Martin Gutknecht)
Institut für Numerische Simulation, Technische Universität Hamburg-Harburg
TU Bergakademie Freiberg, December 17th, 2009
Outline
Ancestors: The year 1950
Birth and Childhood: The years 1976–1982
Adolescence: The years 1984–1992
Adulthood: 1993 and onwards
Rebirth of IDR: The years 2006–2010
Outlook & Conclusion
Ancestors: The year 1950
The origin of transpose-free methods . . .
“Instead of iterating with A and A^T n times, we can also iterate with A alone 2n times. [..] The transposed matrix is not used here at all. E. C. Bouwer of the Douglas Aircraft Co. points out to the author that from the machine viewpoint a uniform iteration scheme of 2n iterations is preferable to a divided scheme of n + n iterations. [..] In case of a symmetric matrix it is evident that after n iterations the basic scalars should be formed, instead of continuing with n more iterations.”
— Cornelius Lanczos, footnote on page 263 in (Lanczos, 1950), referring to his progressive algorithm based on Hankel determinants.
Birth and Childhood: The years 1976–1982
The origin of IDR: poor man’s secant method
In 1976 Peter Sonneveld (Sonneveld, 2006; Sonneveld, 2008) prepared notes for a course on Numerical Analysis at TU Delft. The secant method was part of the course. He generalized it to a multidimensional secant method . . .

Let f(x) := b − Ax, where A ∈ C^{n×n} and b ∈ C^n are given. Then

  F_k := f(X_k) := (f(x_0), . . . , f(x_n)) ∈ C^{n×(n+1)}

is rank deficient. For every solution x̂ of Ax = b,

  F_k = A(x̂ e^T − X_k), where e := ones(n+1, 1).

Thus, for F_k c_k = o_n and e^T c_k ≠ 0,

  b e^T c_k = A x̂ e^T c_k = A X_k c_k  ⇒  x̂ = X_k c_k / (e^T c_k).
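This observation can be checked numerically. The following NumPy sketch (my own illustration, not from the talk, with a hypothetical random test problem) builds the rank-deficient matrix F from n+1 arbitrary iterates and recovers the solution from a null vector of F:

```python
# Multidimensional secant observation: for f(x) = b - A x, the matrix
# F = [f(x_0), ..., f(x_n)] is rank deficient, and any null vector c
# with e^T c != 0 yields the solution as x_hat = X c / (e^T c).
import numpy as np

rng = np.random.default_rng(0)
n = 5
A = 10 * np.eye(n) + rng.standard_normal((n, n))  # well-conditioned test matrix
b = rng.standard_normal(n)

X = rng.standard_normal((n, n + 1))   # n+1 arbitrary "iterates" as columns
F = b[:, None] - A @ X                # F = A (x_hat e^T - X), rank <= n

c = np.linalg.svd(F)[2][-1]           # right null vector of the n x (n+1) matrix F
x_hat = X @ c / c.sum()               # x_hat = X c / (e^T c)

print(np.linalg.norm(b - A @ x_hat))  # tiny: x_hat solves A x = b up to rounding
```

The last right singular vector spans the null space of F for a generic choice of the columns of X.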
For genuinely non-linear (smooth) functions f, we replace A by the Jacobian matrix and b by the function evaluation at an initial guess. Then the process described gives a linearization and updates the iterates to give better approximations.

Updating all columns of F_k is ill-conditioned, as all columns converge to the same vector f := f(x̂). Sonneveld updated only the last two columns:

  F_k := (F^const_{n−1}, f_{k−1}, f_k).

Therefore, with A := ∇f(x̂),

  F_k = (A(x̂ e^T − X_{n−1}) + E_{n−1}, A(x̂ − x_{k−1}) + d_{k−1}, A(x̂ − x_k) + d_k),

where E_{n−1} is a constant matrix and the vectors d_k converge to zero.
Sonneveld used the example Ax = o_n and mimicked the non-linearity by the presence of a constant matrix E_{n−1} in the process.

If used for a matrix of dimension n ∈ N, the process gave (an approximation to) the value zero in step 2n. In the following example I used Maple to exclude finite precision effects, and a badly conditioned matrix A of size 5.

  ‖r_0‖_2 = 7.416198487,  ‖r_1‖_2 = 31.28897569,  ‖r_2‖_2 = 3.838120391,
  ‖r_3‖_2 = 3.944190988,  ‖r_4‖_2 = 1.035754508,  ‖r_5‖_2 = 1.035728492,
  ‖r_6‖_2 = 0.983756197,  ‖r_7‖_2 = 0.983648677,  ‖r_8‖_2 = 0.520741201,
  ‖r_9‖_2 = 0.520740892,  ‖r_{10}‖_2 = ‖r_{2n}‖_2 = 0.

He analyzed this startling behavior: the first IDR method was born.
To analyze it, he realized that c_k is of interest only up to a non-zero scalar factor. He considered the normalization c_{n−1} + c_n = 1, i.e., the sum of the last two elements is scaled to be one, and set c_{n−1} := −γ_k, thus c_n = 1 + γ_k. Now, for the remaining part c^(k)_{n−1}, we have to solve the overdetermined consistent linear system

  F_{n−1} c^(k)_{n−1} = −f_k − γ_k (f_k − f_{k−1}).

As F_{n−1} ∈ C^{n×(n−1)}, there exists a non-zero vector p ∈ C^n in the left null space of F_{n−1}. With this vector,

  0 = p^H F_{n−1} c^(k)_{n−1} = p^H (−f_k − γ_k (f_k − f_{k−1})),

i.e., γ_k is (in case of no breakdown) uniquely determined by

  γ_k := − (p^H f_k) / (p^H (f_k − f_{k−1})).
The vector c^(k)_{n−1} is then (because of the consistency of the given overdetermined system) given by

  c^(k)_{n−1} := −F^†_{n−1} (f_k + γ_k (f_k − f_{k−1})).

The new residual f_{k+1} = o_n − A x_{k+1} satisfies

  f_{k+1} = −A (X_{n−1} c^(k)_{n−1} + x_k + γ_k (x_k − x_{k−1})) / (e^T c^(k)_{n−1} + 1)
          = ((E_{n−1} − F_{n−1}) c^(k)_{n−1} − f_k − γ_k (f_k − f_{k−1})) / (e^T c^(k)_{n−1} + 1)
          = E_{n−1} c^(k)_{n−1} / (e^T c^(k)_{n−1} + 1)
          = E_{n−1} F^†_{n−1} (f_k + γ_k (f_k − f_{k−1})) / (e^T F^†_{n−1} (f_k + γ_k (f_k − f_{k−1})) − 1)
          = ρ_k B (f_k + γ_k (f_k − f_{k−1})).
As the method usually converges, the vector c_k in the null space of F_k will not change much, thus the scaling will not change much; hence, for k ≫ 1,

  ρ_k := 1 / (e^T F^†_{n−1} (f_k + γ_k (f_k − f_{k−1})) − 1) = 1 / (e^T c_k) ≈ const ≠ 0

(with c_k suitably normalized). The finite termination property of the resulting three-term recurrence

  f_{k+1} = ρ_k B (f_k + γ_k (f_k − f_{k−1}))

can thus not depend on the scaling, but only on the way γ_k, and thus f_k, is computed. For this reason, Sonneveld considered the case ρ_k = 1 for all k.

Do we need the information that the matrix B ∈ C^{n×n} is defined by B := E_{n−1} F^†_{n−1}?
The constant matrix E_{n−1} was arbitrarily chosen. Thus, we can represent every matrix of rank at most n−1 with the same kernel as F^†_{n−1}.

The right kernel of F^†_{n−1} is the left kernel of F_{n−1}, i.e., it is spanned by the vector p used in the computation of γ_k,

  γ_k := − (p^H f_k) / (p^H (f_k − f_{k−1})).

The simplified (i.e., scaled) three-term recurrence

  f_{k+1} = B (f_k + γ_k (f_k − f_{k−1}))

is “immune” to changes in B in the direction of p, as the γ_k are chosen to construct vectors orthogonal to p.

We could use any B ∈ C^{n×n} without spoiling the finite termination property!
The origin of IDR: primitive IDR
Sonneveld first made experiments and then gave a rigorous proof. It is easy to see that, apart from the first two (arbitrarily chosen) residuals, the constructed residuals are in the image under B of the space S := p^⊥. The same argument proves that in general (observe that the first two residuals f_0, f_1 are usually not in S), for k ≥ 1,

  f_{2k}, f_{2k+1} ∈ G_k := ⋂_{j=1}^{k} B^j(S) = ( ∑_{j=1}^{k} B^{−jH} {p} )^⊥ = K_k(B^{−H}, B^{−H} p)^⊥.

Sonneveld proved that the dimensions of the spaces constructed are shrinking. This is the essence of the first IDR Theorem. He did not use the description as an orthogonal complement of a Krylov subspace as is done here. We remark that generically dim(K_n(B^{−H}, B^{−H} p)) = n.

Using the Krylov subspace point of view and the explicit orthogonalization against p before multiplication with B, we see that indeed f_{2n} = B o_n = o_n.
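The finite termination for an essentially arbitrary B can be observed numerically. The following NumPy sketch (my own illustration, not from the talk; B is scaled to be a mild contraction only to keep rounding errors tame) runs the three-term recurrence and checks that f_{2n} drops to roundoff:

```python
# IDR Theorem experiment: f_{k+1} = B (f_k + gamma_k (f_k - f_{k-1})),
# gamma_k chosen so that f_k + gamma_k (f_k - f_{k-1}) is orthogonal to p.
# In exact arithmetic f_{2n} = 0 for (almost) any B, p, f_0, f_1.
import numpy as np

rng = np.random.default_rng(1)
n = 5
B = rng.standard_normal((n, n)) / (3 * np.sqrt(n))  # arbitrary, mildly contracting
p = rng.standard_normal(n)

f = [rng.standard_normal(n), rng.standard_normal(n)]  # f_0, f_1 arbitrary
for k in range(1, 2 * n):
    gamma = (p @ f[k]) / (p @ (f[k - 1] - f[k]))      # orthogonalize against p
    f.append(B @ (f[k] + gamma * (f[k] - f[k - 1])))

print(np.linalg.norm(f[2 * n]) / np.linalg.norm(f[0]))  # drops to roundoff
```

The intermediate norms need not decrease monotonically; only the dimensions of the spaces G_k containing the iterates shrink.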
The three-term recurrence

  f_{k+1} = B (f_k + γ_k (f_k − f_{k−1})), where γ_k = (p^H f_k) / (p^H (f_{k−1} − f_k)),

is an “implementation” of the Induced Dimension Reduction (IDR) Theorem. The vectors constructed live in spaces of shrinking dimensions. Methods like this are called “IDR Algorithms”.

Another implementation by Sonneveld can be used to solve “genuine” linear systems. The idea is to rewrite the linear system in Richardson iteration form,

  Ax = b  ⇒  x = (I − A)x + b =: Bx + b.

The classical Richardson iteration with a starting guess x_0 is then given by

  x_{k+1} = (I − A) x_k + b.
With r_0 := b − A x_0, the Richardson iteration is carried out as follows:

  x_{k+1} = x_k + r_k,    r_{k+1} = (I − A) r_k.
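In this residual form, the iteration is two lines of NumPy (my own illustrative sketch; the hypothetical test matrix is scaled so that I − A is a contraction and the iteration converges):

```python
# Classical Richardson iteration via the residual recurrence:
# x_{k+1} = x_k + r_k,  r_{k+1} = (I - A) r_k.  Converges when ||I - A|| < 1.
import numpy as np

rng = np.random.default_rng(2)
n = 20
A = np.eye(n) + 0.3 * rng.standard_normal((n, n)) / np.sqrt(n)  # I - A contracts
b = rng.standard_normal(n)

x = np.zeros(n)
r = b - A @ x
for _ in range(100):
    x = x + r          # x_{k+1} = x_k + r_k
    r = r - A @ r      # r_{k+1} = (I - A) r_k

print(np.linalg.norm(b - A @ x))  # small: the iteration has converged
```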
In a Richardson-type IDR Algorithm, the second equation is replaced by the update

  r_{k+1} = (I − A)(r_k + γ_k (r_k − r_{k−1})),    γ_k = (p^H r_k) / (p^H (r_{k−1} − r_k)).

The update of the iterates has to be modified accordingly:

  −A(x_{k+1} − x_k) = r_{k+1} − r_k = (I − A)(r_k + γ_k (r_k − r_{k−1})) − r_k
                    = (I − A)(r_k − γ_k A(x_k − x_{k−1})) − r_k
                    = −A(r_k + γ_k (I − A)(x_k − x_{k−1}))

  ⇔ x_{k+1} − x_k = r_k + γ_k (I − A)(x_k − x_{k−1})
                  = r_k + γ_k (x_k − x_{k−1} + r_k − r_{k−1}).
Sonneveld terms the outcome the Primitive IDR Algorithm (Sonneveld, 2006):

  r_0 = b − A x_0
  x_1 = x_0 + r_0
  r_1 = r_0 − A r_0
  For k = 1, 2, . . . do
    γ_k = p^T r_k / p^T (r_{k−1} − r_k)
    s_k = r_k + γ_k (r_k − r_{k−1})
    x_{k+1} = x_k + γ_k (x_k − x_{k−1}) + s_k
    r_{k+1} = s_k − A s_k
  done

or, without indices:

  x_old = x_0
  r_old = b − A x_old
  x_new = x_old + r_old
  r_new = r_old − A r_old
  While “not converged” do
    γ = p^T r_new / p^T (r_old − r_new)
    s = r_new + γ (r_new − r_old)
    x_tmp = x_new + γ (x_new − x_old) + s
    r_tmp = s − A s
    x_old = x_new, x_new = x_tmp
    r_old = r_new, r_new = r_tmp
  done
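The index-free form transcribes directly into NumPy (my own illustration on a small, well-conditioned hypothetical problem; as the plots below indicate, for small n the finite termination survives in double precision):

```python
# Primitive IDR Algorithm (PIA): in exact arithmetic r_{2n} = 0.
import numpy as np

rng = np.random.default_rng(3)
n = 5
A = np.eye(n) + 0.3 * rng.standard_normal((n, n)) / np.sqrt(n)
b = rng.standard_normal(n)
p = rng.standard_normal(n)

x_old = np.zeros(n)
r_old = b - A @ x_old
x_new = x_old + r_old
r_new = r_old - A @ r_old
for k in range(1, 2 * n):                 # generates r_2, ..., r_{2n}
    gamma = (p @ r_new) / (p @ (r_old - r_new))
    s = r_new + gamma * (r_new - r_old)
    x_tmp = x_new + gamma * (x_new - x_old) + s
    r_tmp = s - A @ s
    x_old, x_new = x_new, x_tmp
    r_old, r_new = r_new, r_tmp

print(np.linalg.norm(b - A @ x_new))      # near roundoff: finite termination
```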
On the next slide we compare Richardson iteration (red) and PIA (blue).
Impressions of “finite termination” and acceleration in finite precision:

[Figure: true and updated residual norms versus matrix–vector multiplies for PIA with n = 5, 20, 100; top row without scaling (intermediate residuals grow, up to about 10^200 for n = 100, before termination), bottom row with scaling (residuals stay bounded and drop below 10^−10).]
Sonneveld never used PIA, as he considered it to be too unstable; instead he went on with a corresponding acceleration of the Gauß-Seidel method. In (Sonneveld, 2008) he terms this method Accelerated Gauß-Seidel (AGS) and refers to it as “[t]he very first IDR-algorithm [..]”, see page 6, ibid.

This part of the story took place “in the background” in the year 1976.

In September 1979 Sonneveld attended the IUTAM Symposium on Approximation Methods for Navier-Stokes Problems in Paderborn, Germany. At this symposium he presented a new variant of IDR based on a variable splitting I − ω_j A, where ω_j is fixed for two steps and otherwise can be chosen freely, but non-zero.

This algorithm with minimization of every second residual is included in the proceedings from 1980 (Wesseling and Sonneveld, 1980). The connection to Krylov methods, e.g., BiCG/Lanczos, is also given there.
The origin of IDR: classical IDR
  γ_0 = 0, f_0 = A x_0 − b, Δg_0 = o_n, Δy_0 = o_n
  For k = 1, 2, . . . do
    s_k = f_{k−1} + γ_{k−1} Δg_{k−1}
    t_k = A s_k
    if k = 1 or k is even
      ω_k = (t_k^H s_k) / (t_k^H t_k)
    else
      ω_k = ω_{k−1}
    end
    Δx_k = γ_{k−1} Δy_{k−1} − ω_k s_k
    Δf_k = γ_{k−1} Δg_{k−1} − ω_k t_k
    x_k = x_{k−1} + Δx_k
    f_k = f_{k−1} + Δf_k
    if k is even
      Δy_k = Δy_{k−1}, Δg_k = Δg_{k−1}
    else
      Δy_k = Δx_k, Δg_k = Δf_k
    end
    γ_k = −(p^H f_k) / (p^H Δg_k)
  done

This is the original IDR Algorithm from page 551 of (Wesseling and Sonneveld, 1980). It uses OrthoRes(1) in the first step and a residual minimization (the residuals are the −f_{2j}) every second step.

The finite termination property follows from a generalization of the IDR Theorem based on commutativity of the linear polynomials I − ω_j A.
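A direct NumPy transcription of the algorithm above (my own illustration; note the sign convention f = Ax − b, and a hypothetical small well-conditioned test problem) shows the finite termination in about 2n matrix–vector products:

```python
# Original IDR Algorithm (Wesseling/Sonneveld 1980), f = A x - b.
import numpy as np

rng = np.random.default_rng(4)
n = 5
A = np.eye(n) + 0.3 * rng.standard_normal((n, n)) / np.sqrt(n)
b = rng.standard_normal(n)
p = rng.standard_normal(n)

x = np.zeros(n)
f = A @ x - b
gamma, omega = 0.0, 0.0
dy, dg = np.zeros(n), np.zeros(n)
for k in range(1, 2 * n + 4):
    s = f + gamma * dg
    t = A @ s
    if k == 1 or k % 2 == 0:
        omega = (t @ s) / (t @ t)     # residual minimization every second step
    dx = gamma * dy - omega * s
    df = gamma * dg - omega * t
    x = x + dx
    f = f + df
    if k % 2 == 1:                    # k odd: refresh the kept differences
        dy, dg = dx, df
    gamma = -(p @ f) / (p @ dg)
    if np.linalg.norm(f) <= 1e-10 * np.linalg.norm(b):
        break

print(np.linalg.norm(A @ x - b))      # near roundoff after about 2n steps
```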
A numerical comparison of Richardson iteration, original IDR, and PIA.
[Figure: true and updated residual norms versus matrix–vector multiplies for Richardson iteration, original IDR, and PIA (“RIP”) with n = 5, 20, 100; top row without scaling, bottom row with scaling.]
Adolescence: The years 1984–1992
Evolution: CGS and BiCGStab
IDR was presented at a symposium on CFD. The Numerical Linear Algebra community missed it completely. This changed when Sonneveld gained more understanding of Krylov subspace methods and developed “better variants” of IDR.

There are two well-known methods based on IDR: CGS and BiCGStab.

CGS, dating to 1984 (Sonneveld, 1984; Sonneveld, 1989), was the outcome of the understanding that one can do Lanczos without the need for A^T, which follows from the analysis of IDR.

The analysis of IDR from the Krylov subspace point of view was based on the orthogonality properties of the residual polynomials. This immediately leads to the observation that all IDR methods construct residual polynomials that are products of auxiliary polynomials with the Lanczos polynomials.
CGS was based on choosing the auxiliary polynomial equal to the Lanczos polynomial. This has two advantages: it is at hand, and in case of contraction the contraction is enhanced.

CGS has a severe disadvantage: the erratic behavior is amplified as well; thus CGS is more prone to rounding errors than BiCG, and the ultimately attainable accuracy is worse.

If only a moderate backward error reduction is of interest and BiCG converges quite well, CGS is a good choice. But many problems are not of this type, and for these one might want to use a smoother transpose-free method.

Sonneveld thought about rewriting the IDR Algorithm from (Wesseling and Sonneveld, 1980) and discussed this during a weekend with Henk van der Vorst. The resulting BiCGStab (van der Vorst and Sonneveld, 1990; van der Vorst, 1992) is mathematically equivalent to IDR. In the title of the report CGS was explicitly mentioned, and Sonneveld was one of the authors . . .
“Early ideas by Sonneveld (1984) for improvements in the bi-Conjugate Gradient (Bi-CG) method, for the solution of unsymmetric linear systems, intrigued me for a long time. Sonneveld had a brilliant idea for doubling the speed of convergence of Bi-CG for virtually the same computational costs: CGS. He also published a rather obscure method under the name of IDR. I doubt whether that paper got more than two or three citations altogether. The eventual understanding of that method and the reformulation of it, so that rounding errors had much less bad influence on its speed of convergence, led to the so frequently cited Bi-CGSTAB paper (1992).”

— Henk van der Vorst on IDR and CGS by Peter Sonneveld, see in-cites, September 2001, http://www.in-cites.com/papers/dr-henk-van-der-vorst.html.
Adulthood: 1993 and onwards
Evolution: LTPM
Soon it was realized by other researchers that the new methods are based on residual polynomials which are products of auxiliary polynomials and the Lanczos polynomials.

Gutknecht (Gutknecht, 1997) coined the term “Lanczos-type product method” (LTPM) for these methods. A plethora of new Krylov subspace methods popped into existence:

• BiCGStab2 (Gutknecht, 1993),
• BiCGStab(ℓ) (Sleijpen and Fokkema, 1993),
• GCGS (Fokkema et al., 1996), includes CGS2 and shifted CGS,
• GPBiCG (Zhang, 1997) = BiCG×MR2 (Gutknecht, 1997),
• ML(k)BiCGStab (Yeung and Chan, 2000),
• BiCG×MR2_2×2 (Röllin and Gutknecht, 2002),
• GPBiCG(m,l) (Fujino, 2002),
• BiCGSafe (Fujino et al., 2005), . . .
Soon people observed that smoothed variants can be squared and product-type methods can be smoothed. This added to the plethora:

• QMRS (Freund and Szeto, 1991; Freund and Szeto, 1992a; Freund and Szeto, 1992b),
• TFQMR (Freund, 1993),
• QMRCGStab (Chan et al., 1994),
• general smoothing techniques: (Zhou and Walker, 1994).

It was even considered to implement algorithms based on the (two-sided) Lanczos process via “transpose-free implementations” (Chan et al., 1991; Chan et al., 1998). These are called

• squared Lanczos,
• TFiBiCG, and
• TFiQMR.
The main problem, namely the breakdown of the underlying Lanczos process and its instability in finite precision, has only partially been addressed. Look-ahead for (Bi)CGS was considered in (Brezinski and Redivo Zaglia, 1994); the resulting algorithm is called BSMRZS. Look-ahead for BiCGStab (and related LTPMs) was considered in (Brezinski and Redivo-Zaglia, 1995). In (Gutknecht and Ressel, 2000), look-ahead for general LTPMs based on three-term recurrences was considered.

Stability in finite precision was investigated by very few people.

Of all “new” methods, only ML(k)BiCGStab differs substantially from the others: this method is based on s left starting vectors (shadow vectors) and one right starting vector (the zeroth residual).
Rebirth of IDR: The years 2006–2010
The origin of IDR(s): ancestors
ML(k)BiCGStab was largely neglected by the Numerical Linear Algebra community. The main reason is the very technical paper, whose appendix contained the derivation of the computation of the scalars. Currently, Man-Chung Yeung is reconsidering ML(k)BiCGStab and developing variants that exploit the freedom inherent in the method (Yeung, 2009).

Without knowing anything about ML(k)BiCGStab, in 2006 the IDR idea was reconsidered. Peter Sonneveld together with Martin van Gijzen developed a new variant of IDR based on multiple shadow vectors: IDR(s) (≈ IDR(s)ORes).

Nobody was thinking any more about IDR, and Peter Sonneveld calls this “an example of serendipity” . . .

. . . so what did happen?

The following is an excerpt of an e-mail and a copy of slide 36 of the after-dinner talk by Peter Sonneveld at the Thirty-fourth Woudschoten Conference.
The origin of IDR(s): rebirth of IDR
Date: Wed, 17 May 2006 14:02:27 +0200 (CEST) From: Jens-Peter M. Zemke <zemke@xxxxxxxxxxxxx>
To: <p.sonneveld@xxxxxxxxxxxxxx>
Cc: Jens-Peter M. Zemke <zemke@xxxxxxxxxxxxx>
Subject: A question about IDR [..] entitled
"The method of induced dimension reduction, an iterative solver for non-symmetric linear systems"
with the annotation "Publication in preparation".
My question is: What happened to this paper?
More precisely formulated:
- Did it evolve into the CGS paper?
or [..]
7 October 2009, slide 36
Delft University of Technology

Zemke, and a short monologue
• 2006: Jens-Peter Zemke, from Hamburg, mails: What happened to IDR?
• Have to read carefully the 1980 version of the theorem, and the ancient history.
• Theorem used a space S, not just p^⊥.
• Serendipity moment: Why didn’t I use more vectors p, say s instead of 1???
• Because it costs s + 1 matvecs per G_j-space.
• But maybe there is more dimension reduction per G_j.
• .... Never thought about, must try... and call it IDR(s)
The prototype IDR(s) (without the recurrences for x_n, and thus already slightly rewritten)
  r_0 = b − A x_0
  compute R_{s+1} = R_{0:s} = (r_0, . . . , r_s) using, e.g., OrthoRes
  ∇R_{1:s} = (∇r_1, . . . , ∇r_s) = (r_1 − r_0, . . . , r_s − r_{s−1})
  n ← s+1, j ← 1
  while not converged
    c_n = (P^H ∇R_{n−s:n−1})^{−1} P^H r_{n−1}
    v_{n−1} = r_{n−1} − ∇R_{n−s:n−1} c_n
    compute ω_j
    ∇r_n = −∇R_{n−s:n−1} c_n − ω_j A v_{n−1}
    r_n = r_{n−1} + ∇r_n,  n ← n+1
    ∇R_{n−s:n−1} = (∇r_{n−s}, . . . , ∇r_{n−1})
    for k = 1, . . . , s
      c_n = (P^H ∇R_{n−s:n−1})^{−1} P^H r_{n−1}
      v_{n−1} = r_{n−1} − ∇R_{n−s:n−1} c_n
      ∇r_n = −∇R_{n−s:n−1} c_n − ω_j A v_{n−1}
      r_n = r_{n−1} + ∇r_n,  n ← n+1
      ∇R_{n−s:n−1} = (∇r_{n−s}, . . . , ∇r_{n−1})
    end for
    j ← j+1
  end while
A few remarks:

• We can start with any (simple) Krylov subspace method.
• The steps in the s-loop only differ from the first block in that no new ω_j is computed.
• IDR(s)ORes is based on oblique projections and s+1 consecutive multiplications with the same linear factor I − ω_j A.
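The prototype transcribes into a short NumPy sketch (my own illustration, residual recurrences only, as on the slide; the start-up uses local residual-minimizing steps and the test problem is hypothetical and well conditioned):

```python
# Prototype IDR(s): in exact arithmetic the residual vanishes after at most
# n + n/s matrix-vector products.
import numpy as np

rng = np.random.default_rng(5)
n, s = 12, 2
A = np.eye(n) + 0.3 * rng.standard_normal((n, n)) / np.sqrt(n)
b = rng.standard_normal(n)
P = rng.standard_normal((n, s))            # s shadow vectors

r = b.copy()                               # r_0 = b - A x_0 with x_0 = 0
dR = np.zeros((n, s))                      # last s residual differences
for i in range(s):                         # start-up with a simple Krylov method
    t = A @ r
    om = (t @ r) / (t @ t)
    dR[:, i] = -om * t                     # nabla r_{i+1} = r_{i+1} - r_i
    r = r + dR[:, i]

for _ in range(3 * n):                     # outer while-loop, one omega_j per sweep
    for k in range(s + 1):                 # first step computes omega_j, s repeats
        c = np.linalg.solve(P.T @ dR, P.T @ r)
        v = r - dR @ c                     # oblique projection
        Av = A @ v
        if k == 0:
            om = (Av @ v) / (Av @ Av)      # minimize ||v - om * A v||
        dr = -dR @ c - om * Av
        r = r + dr
        dR = np.column_stack([dR[:, 1:], dr])
    if np.linalg.norm(r) <= 1e-10 * np.linalg.norm(b):
        break

print(np.linalg.norm(r) / np.linalg.norm(b))  # near roundoff
```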
Understanding IDR: Hessenberg decompositions
Essential features of Krylov subspace methods can be described by a Hessenberg decomposition

  A Q_n = Q_{n+1} H̄_n = Q_n H_n + q_{n+1} h_{n+1,n} e_n^T.    (1)

Here, H_n denotes an unreduced Hessenberg matrix and H̄_n ∈ C^{(n+1)×n} its extension by the row h_{n+1,n} e_n^T.

In the perturbed case, e.g., in finite precision and/or based on inexact matrix-vector multiplies, we obtain a perturbed Hessenberg decomposition

  A Q_n + F_n = Q_{n+1} H̄_n = Q_n H_n + q_{n+1} h_{n+1,n} e_n^T.    (2)

The matrix H_n of the perturbed variant will, in general, still be unreduced.
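The decomposition (1) can be produced, e.g., by the standard Arnoldi process; the sketch below (my own illustration, not specific to the talk) builds it and verifies the identity numerically:

```python
# Arnoldi: A Q_n = Q_{n+1} H_n with an extended (n+1) x n Hessenberg matrix.
import numpy as np

def arnoldi(A, q1, m):
    """m Arnoldi steps: Q is n x (m+1) orthonormal, H is (m+1) x m Hessenberg."""
    n = A.shape[0]
    Q = np.zeros((n, m + 1))
    H = np.zeros((m + 1, m))
    Q[:, 0] = q1 / np.linalg.norm(q1)
    for k in range(m):
        w = A @ Q[:, k]
        for i in range(k + 1):             # modified Gram-Schmidt
            H[i, k] = Q[:, i] @ w
            w = w - H[i, k] * Q[:, i]
        H[k + 1, k] = np.linalg.norm(w)
        Q[:, k + 1] = w / H[k + 1, k]
    return Q, H

rng = np.random.default_rng(6)
A = rng.standard_normal((8, 8))
Q, H = arnoldi(A, rng.standard_normal(8), 5)
print(np.linalg.norm(A @ Q[:, :5] - Q @ H))  # near machine precision
```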
IDR: Generalized Hessenberg decompositions
In case of IDR, we have to consider generalized Hessenberg decompositions

  A Q_n U_n = Q_{n+1} H̄_n = Q_n H_n + q_{n+1} h_{n+1,n} e_n^T    (3)

and perturbed generalized Hessenberg decompositions

  A Q_n U_n + F_n = Q_{n+1} H̄_n = Q_n H_n + q_{n+1} h_{n+1,n} e_n^T    (4)

with upper triangular (possibly even singular) U_n.

Generalized Hessenberg decompositions correspond to a skew projection of the pencil (A, I) to the pencil (H_n, U_n), as long as Q_{n+1} has full rank.
Understanding IDR: QOR/QMR/Ritz-Galërkin
There are various well-known approaches based on such Hessenberg decompositions, e.g.,

  QOR:           approximate x = A^{−1} r_0 by x_n := Q_n H_n^{−1} e_1 ‖r_0‖,
  QMR:           approximate x = A^{−1} r_0 by x_n := Q_n H̄_n^† e_1 ‖r_0‖,
  Ritz–Galërkin: approximate J = V^{−1} A V by J_n := S_n^{−1} H_n S_n,
                 and V by V_n := Q_n S_n,
  “functions”:   approximate f(A)q = p(A)q by Q_n f(H_n) e_1 or Q_{n+1} f([H̄_n, f]) e_1.

To every method from one class corresponds a method of the other.

These approaches extend easily to generalized Hessenberg decompositions.
Understanding IDR: OrthoRes-type methods
The entries of the Hessenberg matrices of these Hessenberg decompositions are defined in different variations. Three well-known ways of implementing the QOR/QMR approach are commonly denoted as OrthoRes/OrthoMin/OrthoDir.

OrthoRes-type methods have a generalized Hessenberg decomposition

  A R_n U_n = R_{n+1} H̄°_n = R_n H°_n + r_{n+1} h°_{n+1,n} e_n^T,    (5)

where e^T H̄°_n = o_n^T, e^T = (1, . . . , 1), and the matrix

  R_{n+1} = (r_0, . . . , r_n) = Q_{n+1} diag(‖r_0‖/‖q_1‖, . . . , ‖r_n‖/‖q_{n+1}‖)    (6)

is diagonally scaled to be the matrix of residual vectors.
IDR: The underlying Hessenberg decomposition
The IDR recurrences of IDR(s)ORes can be summarized by

  v_{n−1} := r_{n−1} − ∇R_{n−s:n−1} c_n = R_{n−s−1:n−1} y_n
           = (1 − γ^(n)_s) r_{n−1} + ∑_{ℓ=1}^{s−1} (γ^(n)_{s−ℓ+1} − γ^(n)_{s−ℓ}) r_{n−ℓ−1} + γ^(n)_1 r_{n−s−1},
  1 · r_n := (I − ω_j A) v_{n−1}.    (7)

Here, n > s, and the index of the scalar ω_j is defined by j := ⌊n/(s+1)⌋; compare with the so-called “index functions” (Yeung/Boley, 2005).

Removing v_{n−1} from the recurrence we obtain the generalized Hessenberg decomposition

  A R_n Y_n D_ω = R_{n+1} Y°_n.    (8)
IDR: Sonneveld pencil and Sonneveld matrix
The IDR(s)ORes pencil, the so-called Sonneveld pencil (Y°_n, Y_n D^(n)_ω), can be depicted (here s = 3) by the banded unreduced Hessenberg matrix Y°_n,

  × × × × ◦ ◦ ◦ ◦ ◦ ◦ ◦ ◦
  + × × × × ◦ ◦ ◦ ◦ ◦ ◦ ◦
  ◦ + × × × × ◦ ◦ ◦ ◦ ◦ ◦
  ◦ ◦ + × × × × ◦ ◦ ◦ ◦ ◦
  ◦ ◦ ◦ + × × × × ◦ ◦ ◦ ◦
  ◦ ◦ ◦ ◦ + × × × × ◦ ◦ ◦
  ◦ ◦ ◦ ◦ ◦ + × × × × ◦ ◦
  ◦ ◦ ◦ ◦ ◦ ◦ + × × × × ◦
  ◦ ◦ ◦ ◦ ◦ ◦ ◦ + × × × ×
  ◦ ◦ ◦ ◦ ◦ ◦ ◦ ◦ + × × ×
  ◦ ◦ ◦ ◦ ◦ ◦ ◦ ◦ ◦ + × ×
  ◦ ◦ ◦ ◦ ◦ ◦ ◦ ◦ ◦ ◦ + ×

and the banded upper triangular matrix Y_n D^(n)_ω,

  × × × × ◦ ◦ ◦ ◦ ◦ ◦ ◦ ◦
  ◦ × × × × ◦ ◦ ◦ ◦ ◦ ◦ ◦
  ◦ ◦ × × × × ◦ ◦ ◦ ◦ ◦ ◦
  ◦ ◦ ◦ × × × × ◦ ◦ ◦ ◦ ◦
  ◦ ◦ ◦ ◦ × × × × ◦ ◦ ◦ ◦
  ◦ ◦ ◦ ◦ ◦ × × × × ◦ ◦ ◦
  ◦ ◦ ◦ ◦ ◦ ◦ × × × × ◦ ◦
  ◦ ◦ ◦ ◦ ◦ ◦ ◦ × × × × ◦
  ◦ ◦ ◦ ◦ ◦ ◦ ◦ ◦ × × × ×
  ◦ ◦ ◦ ◦ ◦ ◦ ◦ ◦ ◦ × × ×
  ◦ ◦ ◦ ◦ ◦ ◦ ◦ ◦ ◦ ◦ × ×
  ◦ ◦ ◦ ◦ ◦ ◦ ◦ ◦ ◦ ◦ ◦ ×

The upper triangular matrix Y_n D^(n)_ω could be inverted, which results in the Sonneveld matrix, a full unreduced Hessenberg matrix.
Understanding IDR: Purification
We know the eigenvalues ≈ roots of the kernel polynomials 1/ω_j. We are only interested in the other eigenvalues.

The purified IDR(s)ORes pencil (Y°_n, U_n D^(n)_ω), which has only the remaining eigenvalues and some infinite ones as eigenvalues, can be depicted by the same banded unreduced Hessenberg matrix Y°_n as before, together with the block upper triangular matrix U_n D^(n)_ω with zero columns,

  × × × ◦ ◦ ◦ ◦ ◦ ◦ ◦ ◦ ◦
  ◦ × × ◦ ◦ ◦ ◦ ◦ ◦ ◦ ◦ ◦
  ◦ ◦ × ◦ ◦ ◦ ◦ ◦ ◦ ◦ ◦ ◦
  ◦ ◦ ◦ ◦ ◦ ◦ ◦ ◦ ◦ ◦ ◦ ◦
  ◦ ◦ ◦ ◦ × × × ◦ ◦ ◦ ◦ ◦
  ◦ ◦ ◦ ◦ ◦ × × ◦ ◦ ◦ ◦ ◦
  ◦ ◦ ◦ ◦ ◦ ◦ × ◦ ◦ ◦ ◦ ◦
  ◦ ◦ ◦ ◦ ◦ ◦ ◦ ◦ ◦ ◦ ◦ ◦
  ◦ ◦ ◦ ◦ ◦ ◦ ◦ ◦ × × × ◦
  ◦ ◦ ◦ ◦ ◦ ◦ ◦ ◦ ◦ × × ◦
  ◦ ◦ ◦ ◦ ◦ ◦ ◦ ◦ ◦ ◦ × ◦
  ◦ ◦ ◦ ◦ ◦ ◦ ◦ ◦ ◦ ◦ ◦ ◦

Every (s+1)-th column is zero; the singular U_n D^(n)_ω contributes the infinite eigenvalues. We get rid of the infinite eigenvalues using a change of basis (Gauß/Schur).