[Figure: L-curve for the Hilbert matrix of dimension 100 with a perturbation of 0.1%; log-log plot of $\|Ax_\lambda - b\|_2$ (horizontal axis) versus $\|x_\lambda\|_2$ (vertical axis), with marked points for $\alpha = 10^{-2}, 1, 10^{-4}, 10^{-6}, 10^{-8}, 10^{-10}, 10^{-12}, 10^{-14}, 10^{-16}$]
TUHH Heinrich Voss Least Squares Problems Valencia 2010 66 / 82
Toy problem
The following table contains the errors for the linear system $Ax = b$, where $A$ is the Hilbert matrix and $b$ is chosen such that x = ones(n,1) is the solution. The regularization matrix is $L = I$, and the regularization parameter is determined by the L-curve strategy. The normal equations were solved by the Cholesky factorization, the QR factorization, and the SVD.
                    n=10      n=20      n=40
Tikhonov Cholesky   1.41E-3   2.03E-3   3.51E-3
Tikhonov QR         3.50E-6   5.99E-6   7.54E-6
Tikhonov SVD        3.43E-6   6.33E-6   9.66E-6

The following table contains the results for the LS problems (m = n + 20).

                    n=10      n=20      n=40
Tikhonov Cholesky   3.85E-4   1.19E-3   2.27E-3
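The toy problem can be reproduced in a few lines. The sketch below is an illustrative reconstruction (not the original code): the regularization parameter `lam` is hand-picked for the illustration rather than determined by the L-curve strategy.

```python
import numpy as np

def tikhonov_cholesky(A, b, lam):
    """Tikhonov solution via Cholesky of the normal equations (A^T A + lam I) x = A^T b."""
    n = A.shape[1]
    C = np.linalg.cholesky(A.T @ A + lam * np.eye(n))   # C C^T = A^T A + lam I
    return np.linalg.solve(C.T, np.linalg.solve(C, A.T @ b))

def tikhonov_svd(A, b, lam):
    """Tikhonov solution via the filtered SVD: x = sum_i s_i (u_i^T b) / (s_i^2 + lam) v_i."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    return Vt.T @ ((s / (s**2 + lam)) * (U.T @ b))

n = 10
A = 1.0 / (np.add.outer(np.arange(n), np.arange(n)) + 1.0)  # Hilbert matrix
x_true = np.ones(n)
b = A @ x_true
lam = 1e-10   # hand-picked parameter for illustration, not the L-curve choice
err_chol = np.linalg.norm(tikhonov_cholesky(A, b, lam) - x_true)
err_svd = np.linalg.norm(tikhonov_svd(A, b, lam) - x_true)
```

As in the table, the SVD-based solution is typically noticeably more accurate than solving the squared, badly conditioned normal equations by Cholesky.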
Large problems
Large ill-posed problems
Until now we assumed that the SVD of $A$ or the TSVD of $(A, L)$ is available, which is reasonable only for moderate dimensions.
We now assume that the matrix $A$ is so large that computing its singular value decomposition is undesirable or infeasible.
The first method, introduced by Björck (1988), combines the truncated SVD with the Golub–Kahan–Lanczos bidiagonalization.
With the starting vector $b$, put
\[
\beta_1 u_1 = b, \qquad \alpha_1 v_1 = A^T u_1, \qquad (47)
\]
and for $i = 1, 2, \dots$ compute
\[
\beta_{i+1} u_{i+1} = A v_i - \alpha_i u_i, \qquad
\alpha_{i+1} v_{i+1} = A^T u_{i+1} - \beta_{i+1} v_i, \qquad (48)
\]
where $\alpha_i \ge 0$ and $\beta_i \ge 0$, $i = 1, 2, \dots$, are chosen so that $\|u_i\| = \|v_i\| = 1$.
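The recurrence (47)–(48) can be sketched as follows; the function name and the dense-matrix, random-data setting are illustrative assumptions.

```python
import numpy as np

def golub_kahan(A, b, k):
    """k steps of the bidiagonalization (47)-(48): returns U_{k+1}, V_k,
    the (k+1) x k lower bidiagonal matrix Bbar_k, and beta_1 = ||b||."""
    m, n = A.shape
    U = np.zeros((m, k + 1))
    V = np.zeros((n, k))
    Bbar = np.zeros((k + 1, k))
    beta1 = np.linalg.norm(b)
    U[:, 0] = b / beta1                       # beta_1 u_1 = b
    v = A.T @ U[:, 0]
    alpha = np.linalg.norm(v)
    V[:, 0] = v / alpha                       # alpha_1 v_1 = A^T u_1
    Bbar[0, 0] = alpha
    for i in range(k):
        u = A @ V[:, i] - alpha * U[:, i]     # beta_{i+1} u_{i+1} = A v_i - alpha_i u_i
        beta = np.linalg.norm(u)
        U[:, i + 1] = u / beta
        Bbar[i + 1, i] = beta
        if i + 1 < k:
            v = A.T @ U[:, i + 1] - beta * V[:, i]   # alpha_{i+1} v_{i+1} = A^T u_{i+1} - beta_{i+1} v_i
            alpha = np.linalg.norm(v)
            V[:, i + 1] = v / alpha
            Bbar[i + 1, i + 1] = alpha
    return U, V, Bbar, beta1

# small demonstration on a random matrix
rng = np.random.default_rng(0)
A = rng.standard_normal((20, 10))
b = rng.standard_normal(20)
U, V, Bbar, beta1 = golub_kahan(A, b, 5)
```

In exact arithmetic the columns of $U_{k+1}$ and $V_k$ are orthonormal and $A V_k = U_{k+1}\bar B_k$; for a few steps on a well-conditioned random matrix this also holds numerically to high accuracy.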
Bidiagonalization & TSVD
With $U_k = (u_1, \dots, u_k)$, $V_k = (v_1, \dots, v_k)$ and
\[
\bar B_k = \begin{pmatrix}
\alpha_1 & & & \\
\beta_2 & \alpha_2 & & \\
 & \beta_3 & \ddots & \\
 & & \ddots & \alpha_k \\
 & & & \beta_{k+1}
\end{pmatrix} \in \mathbb{R}^{(k+1)\times k}, \qquad (49)
\]
the recurrence relations can be rewritten as
\[
U_{k+1}(\beta_1 e_1) = b, \qquad (50)
\]
\[
A V_k = U_{k+1} \bar B_k, \qquad A^T U_{k+1} = V_k \bar B_k^T + \alpha_{k+1} v_{k+1} e_{k+1}^T. \qquad (51)
\]
We look for an approximate solution $x_k \in \operatorname{span}\{V_k\}$ and write $x_k = V_k y_k$.
Bidiagonalization & TSVD cnt.
From the first equation of (51) it follows that $A x_k = A V_k y_k = U_{k+1} \bar B_k y_k$, and since (in exact arithmetic) $U_{k+1}$ can be shown to be orthogonal, it follows that
\[
\|b - A x_k\| = \|U_{k+1}(\beta_1 e_1 - \bar B_k y_k)\| = \|\beta_1 e_1 - \bar B_k y_k\|.
\]
Hence, $\|b - A x_k\|$ is minimized over $\operatorname{span}\{V_k\}$ if $y_k$ solves the least squares problem
\[
\|\beta_1 e_1 - \bar B_k y_k\| = \min! \qquad (52)
\]
With $d_k := \beta_1 e_1 - \bar B_k y_k$ it holds that
\[
A^T(b - A x_k) = A^T U_{k+1} d_k = V_k \bar B_k^T d_k + \alpha_{k+1} v_{k+1} e_{k+1}^T d_k,
\]
and if $y_k$ solves (52), we get from $\bar B_k^T d_k = 0$
\[
A^T(b - A x_k) = \alpha_{k+1} v_{k+1} e_{k+1}^T d_k. \qquad (53)
\]
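The identity $\|b - Ax_k\| = \|\beta_1 e_1 - \bar B_k y_k\|$ and the normal-equation condition $\bar B_k^T d_k = 0$ can be checked numerically. The snippet below is self-contained (it repeats a compact version of the recurrence (47)–(48)) and uses random illustrative data.

```python
import numpy as np

# compact bidiagonalization (47)-(48) on a random problem
rng = np.random.default_rng(1)
A = rng.standard_normal((30, 12))
b = rng.standard_normal(30)
m, n = A.shape
k = 6
U = np.zeros((m, k + 1)); V = np.zeros((n, k)); Bbar = np.zeros((k + 1, k))
beta1 = np.linalg.norm(b); U[:, 0] = b / beta1
v = A.T @ U[:, 0]; alpha = np.linalg.norm(v); V[:, 0] = v / alpha
Bbar[0, 0] = alpha
for i in range(k):
    u = A @ V[:, i] - alpha * U[:, i]
    beta = np.linalg.norm(u); U[:, i + 1] = u / beta; Bbar[i + 1, i] = beta
    if i + 1 < k:
        v = A.T @ U[:, i + 1] - beta * V[:, i]
        alpha = np.linalg.norm(v); V[:, i + 1] = v / alpha
        Bbar[i + 1, i + 1] = alpha

# solve the projected least squares problem (52) and form x_k = V_k y_k
e1 = np.zeros(k + 1); e1[0] = beta1          # beta_1 e_1
yk = np.linalg.lstsq(Bbar, e1, rcond=None)[0]
xk = V @ yk
dk = e1 - Bbar @ yk                          # d_k = beta_1 e_1 - Bbar_k y_k
```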
Bidiagonalization & TSVD cnt.
Assume that $\alpha_i \ne 0$ and $\beta_i \ne 0$ for $i = 1, \dots, k$. If in the next step $\beta_{k+1} = 0$, then $\bar B_k$ has rank $k$ and it follows that $d_k = 0$, i.e. $A x_k = b$. If $\beta_{k+1} \ne 0$ but $\alpha_{k+1} = 0$, then by (53) $x_k$ is a least squares solution of $Ax = b$.
Thus, the recurrence (51) cannot break down before a solution of $\|Ax - b\| = \min!$ is obtained.
The bidiagonalization algorithm (47), (48) is closely related to the Lanczos process applied to the symmetric matrix $A^T A$. If the starting vector $v_1$ is used, then it follows from (48) that
\[
(A^T A) V_k = A^T U_{k+1} \bar B_k = V_k (\bar B_k^T \bar B_k) + \alpha_{k+1} \beta_{k+1} v_{k+1} e_k^T.
\]
Bidiagonalization & TSVD cnt.
To obtain a regularized solution of $\|Ax - b\| = \min!$, consider the (full) SVD of the matrix $\bar B_k \in \mathbb{R}^{(k+1)\times k}$,
\[
\bar B_k = P_k \begin{pmatrix} \Omega_k \\ 0 \end{pmatrix} Q_k^T = \sum_{i=1}^k \omega_i p_i q_i^T, \qquad (54)
\]
where $P_k$ and $Q_k$ are square orthogonal matrices and
\[
\omega_1 \ge \omega_2 \ge \dots \ge \omega_k > 0. \qquad (55)
\]
Then the solution $y_k$ of the least squares problem (52) can be written as
\[
y_k = \beta_1 Q_k (\Omega_k^{-1}, 0) P_k^T e_1 = \beta_1 \sum_{i=1}^k \omega_i^{-1} \kappa_{1i} q_i, \qquad (56)
\]
with $\kappa_{ij} = (P_k)_{ij}$.
Bidiagonalization & TSVD cnt.
The corresponding residual vector is
\[
d_k = \beta_1 P_k e_{k+1} e_{k+1}^T P_k^T e_1 = \beta_1 \kappa_{1,k+1} p_{k+1},
\]
from which we get
\[
\|b - A x_k\| = \|d_k\| = \beta_1 |\kappa_{1,k+1}|.
\]
For a given threshold $\delta > 0$ we define a regularized solution by the TSVD method:
\[
x_k(\delta) = V_k y_k(\delta), \qquad y_k(\delta) = \beta_1 \sum_{\omega_i > \delta} \omega_i^{-1} \kappa_{1i} q_i. \qquad (57)
\]
Notice that the norm of $x_k(\delta)$ and the residual norm,
\[
\|x_k(\delta)\|^2 = \beta_1^2 \sum_{\omega_i > \delta} (\kappa_{1i}/\omega_i)^2
\quad \text{and} \quad
\|r_k(\delta)\|^2 = \beta_1^2 \sum_{\omega_i \le \delta} \kappa_{1i}^2
\]
(where the second sum includes the index $i = k+1$ with $\omega_{k+1} := 0$), can be obtained for fixed $k$ and $\delta$ without forming $x_k(\delta)$ or $y_k(\delta)$ explicitly.
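These closed-form norm expressions can be verified on a synthetic lower bidiagonal $\bar B_k$; the matrix entries, $\beta_1$, and the threshold $\delta$ below are arbitrary illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(2)
k = 8
Bbar = np.zeros((k + 1, k))
Bbar[np.arange(k), np.arange(k)] = rng.uniform(0.05, 1.0, k)         # alpha_1..alpha_k
Bbar[np.arange(1, k + 1), np.arange(k)] = rng.uniform(0.05, 1.0, k)  # beta_2..beta_{k+1}
beta1 = 2.0

P, w, Qt = np.linalg.svd(Bbar)    # full SVD (54): omega_i = w[i], p_i = P[:, i], q_i = Qt[i, :]
kappa1 = P[0, :]                  # kappa_{1i} = (P_k)_{1i}

delta = 0.3
keep = w > delta                  # indices with omega_i > delta
yk = beta1 * (Qt[keep, :].T @ (kappa1[:k][keep] / w[keep]))   # y_k(delta) from (57)
e1 = np.zeros(k + 1); e1[0] = beta1
rk = e1 - Bbar @ yk                                           # residual of the small problem

# norms from the closed formulas, without forming y_k(delta) or r_k(delta);
# note ||x_k(delta)|| = ||y_k(delta)|| since V_k has orthonormal columns
xnorm2 = beta1**2 * np.sum((kappa1[:k][keep] / w[keep])**2)
rnorm2 = beta1**2 * (np.sum(kappa1[:k][~keep]**2) + kappa1[k]**2)    # omega_{k+1} := 0 term
```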
Tikhonov regularization for large problems
Consider Tikhonov regularization in standard form ($L = I$) and its normal equations
\[
(A^T A + \mu^{-1} I)x = A^T b, \qquad (58)
\]
where the regularization parameter $\mu$ is positive and finite.
Usually $\lambda = \mu^{-1}$ is chosen, but for the method to be developed here, (58) is more convenient.
The solution of (58) is
\[
x_\mu = (A^T A + \mu^{-1} I)^{-1} A^T b \qquad (59)
\]
and the discrepancy is
\[
d_\mu = b - A x_\mu. \qquad (60)
\]
We assume that an estimate $\varepsilon$ of the error of $b$ is explicitly known. We seek a parameter $\hat\mu$ such that
\[
\varepsilon \le \|d_{\hat\mu}\| \le \eta \varepsilon, \qquad (61)
\]
where the choice of $\eta$ depends on the accuracy of the estimate $\varepsilon$.
Tikhonov regularization for large problems cnt.
Let
\[
\phi(\mu) := \|b - A x_\mu\|^2. \qquad (62)
\]
Substituting the expression (59) into (62) and applying the identity
\[
I - A(A^T A + \mu^{-1} I)^{-1} A^T = (\mu A A^T + I)^{-1},
\]
one obtains
\[
\phi(\mu) = b^T (\mu A A^T + I)^{-2} b. \qquad (63)
\]
Hence $\phi'(\mu) \le 0$ and $\phi''(\mu) \ge 0$ for $\mu \ge 0$, i.e. $\phi$ is monotonically decreasing and convex, and for $\tau \in (0, \|b\|^2)$ the equation $\phi(\mu) = \tau$ has a unique solution.
Notice that the function $\nu \mapsto \phi(1/\nu)$ cannot be guaranteed to be convex. This is the reason why the regularization parameter $\mu$ is used instead of $\lambda = 1/\mu$.
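The identity (63) and the monotonicity of $\phi$ are easy to confirm numerically; the data below are random and purely illustrative.

```python
import numpy as np

rng = np.random.default_rng(3)
A = rng.standard_normal((15, 8))
b = rng.standard_normal(15)
m, n = A.shape

def phi_def(mu):
    """phi(mu) = ||b - A x_mu||^2 with x_mu from the normal equations (58)."""
    x = np.linalg.solve(A.T @ A + np.eye(n) / mu, A.T @ b)
    return np.linalg.norm(b - A @ x)**2

def phi_id(mu):
    """The equivalent expression (63): b^T (mu A A^T + I)^{-2} b."""
    z = np.linalg.solve(mu * (A @ A.T) + np.eye(m), b)
    return float(z @ z)

mus = [0.1, 1.0, 10.0, 100.0]
vals = [phi_id(mu) for mu in mus]
```

Evaluating $\phi$ through (63) avoids forming the shifted normal equations; on this small example both expressions agree to rounding accuracy, and the computed values decrease monotonically in $\mu$.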
Tikhonov regularization for large problems cnt.
Newton’s method converges globally to the solution of $\phi(\mu) - \varepsilon^2 = 0$. However, for large dimensions the evaluation of $\phi(\mu)$ and $\phi'(\mu)$ for fixed $\mu$ is much too expensive (or even impossible due to the high condition number of $A$).
Calvetti and Reichel (2003) took advantage of a partial bidiagonalization of $A$ to construct bounds for $\phi(\mu)$ and thus determine a $\hat\mu$ satisfying (61).
Application of $\ell \le \min\{m, n\}$ bidiagonalization steps (cf. (48)) yields the decomposition
\[
A V_\ell = U_\ell B_\ell + \beta_{\ell+1} u_{\ell+1} e_\ell^T, \qquad A^T U_\ell = V_\ell B_\ell^T, \qquad b = \beta_1 U_\ell e_1, \qquad (64)
\]
where $V_\ell \in \mathbb{R}^{n\times\ell}$ and $U_\ell \in \mathbb{R}^{m\times\ell}$ have orthonormal columns and $U_\ell^T u_{\ell+1} = 0$.
$\beta_{\ell+1}$ is a nonnegative scalar, and $\|u_{\ell+1}\| = 1$ when $\beta_{\ell+1} > 0$.
$B_\ell \in \mathbb{R}^{\ell\times\ell}$ is the bidiagonal matrix with diagonal elements $\alpha_j$ and (positive) subdiagonal elements $\beta_j$ ($B_\ell$ is the submatrix of $\bar B_\ell$ in (49) containing its first $\ell$ rows).
Tikhonov regularization for large problems cnt.
Combining the equations in (64) yields
\[
A A^T U_\ell = U_\ell B_\ell B_\ell^T + \alpha_\ell \beta_{\ell+1} u_{\ell+1} e_\ell^T, \qquad (65)
\]
which shows that the columns of $U_\ell$ are Lanczos vectors, and the matrix $T_\ell := B_\ell B_\ell^T$ is the tridiagonal matrix that would be obtained by applying $\ell$ Lanczos steps to the symmetric matrix $A A^T$ with initial vector $b$.
Introduce the functions
\[
\phi_\ell(\mu) := \|b\|^2 e_1^T (\mu B_\ell B_\ell^T + I_\ell)^{-2} e_1, \qquad (66)
\]
\[
\bar\phi_\ell(\mu) := \|b\|^2 e_1^T (\mu \bar B_\ell \bar B_\ell^T + I_{\ell+1})^{-2} e_1. \qquad (67)
\]
Tikhonov regularization for large problems cnt.
Substituting the spectral factorization $A A^T = W \Lambda W^T$ with $\Lambda = \operatorname{diag}\{\lambda_1, \dots, \lambda_m\}$ and $W^T W = I$ into $\phi$, one obtains
\[
\phi(\mu) = b^T W (\mu\Lambda + I)^{-2} W^T b
= \sum_{j=1}^m \frac{\tilde b_j^2}{(\mu\lambda_j + 1)^2}
=: \int_0^\infty \frac{1}{(\mu\lambda + 1)^2}\, d\omega(\lambda), \qquad (68)
\]
where $\tilde b = W^T b$, and $\omega$ is chosen to be a piecewise constant function with jump discontinuities of height $\tilde b_j^2$ at the eigenvalues $\lambda_j$.
Thus, the integral on the right-hand side is a Stieltjes integral defined by the spectral factorization of $A A^T$ and by the vector $b$.
Tikhonov regularization for large problems cnt.
Golub and Meurant (1993) proved that $\phi_\ell$ defined in (66) is the $\ell$-point Gauß quadrature rule associated with the distribution function $\omega$, applied to the function
\[
\psi(t) := (\mu t + 1)^{-2}. \qquad (69)
\]
Similarly, the function $\bar\phi_\ell$ in (67) is the $(\ell+1)$-point Gauß–Radau quadrature rule with an additional node at the origin, associated with the distribution function $\omega$ and applied to the function $\psi$.
Since for any fixed positive value of $\mu$ the derivatives of $\psi$ with respect to $t$ of odd order are strictly negative and the derivatives of even order are strictly positive for $t \ge 0$, the remainder terms of the Gauß and Gauß–Radau quadrature rules show that
\[
\phi_\ell(\mu) < \phi(\mu) < \bar\phi_\ell(\mu). \qquad (70)
\]
Instead of computing a value $\mu$ that satisfies (61), one seeks to determine values $\ell$ and $\mu$ such that
\[
\varepsilon^2 \le \phi_\ell(\mu) \quad \text{and} \quad \bar\phi_\ell(\mu) \le \eta^2 \varepsilon^2, \qquad (71)
\]
which by (70) guarantees (61).
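The bounds (70) can be observed numerically: run $\ell$ bidiagonalization steps and evaluate (66) and (67). The sketch below is self-contained and uses random illustrative data.

```python
import numpy as np

def bidiag(A, b, ell):
    """ell Golub-Kahan bidiagonalization steps; returns Bbar_ell in R^{(ell+1) x ell}."""
    m, n = A.shape
    U = np.zeros((m, ell + 1)); V = np.zeros((n, ell)); Bbar = np.zeros((ell + 1, ell))
    U[:, 0] = b / np.linalg.norm(b)
    v = A.T @ U[:, 0]; alpha = np.linalg.norm(v); V[:, 0] = v / alpha; Bbar[0, 0] = alpha
    for i in range(ell):
        u = A @ V[:, i] - alpha * U[:, i]
        beta = np.linalg.norm(u); U[:, i + 1] = u / beta; Bbar[i + 1, i] = beta
        if i + 1 < ell:
            v = A.T @ U[:, i + 1] - beta * V[:, i]
            alpha = np.linalg.norm(v); V[:, i + 1] = v / alpha; Bbar[i + 1, i + 1] = alpha
    return Bbar

def bounds(A, b, ell, mu):
    """Gauss / Gauss-Radau values phi_ell(mu) and phibar_ell(mu) from (66), (67)."""
    Bbar = bidiag(A, b, ell)
    B = Bbar[:ell, :]                 # first ell rows: B_ell
    nb2 = float(b @ b)
    z = np.linalg.solve(mu * (B @ B.T) + np.eye(ell), np.eye(ell)[:, 0])
    zb = np.linalg.solve(mu * (Bbar @ Bbar.T) + np.eye(ell + 1), np.eye(ell + 1)[:, 0])
    return nb2 * float(z @ z), nb2 * float(zb @ zb)

rng = np.random.default_rng(4)
A = rng.standard_normal((40, 25)); b = rng.standard_normal(40)
mu = 1.0
z = np.linalg.solve(mu * (A @ A.T) + np.eye(40), b)
phi = float(z @ z)                    # exact phi(mu) via (63)
lo3, hi3 = bounds(A, b, 3, mu)
lo6, hi6 = bounds(A, b, 6, mu)
```

With increasing $\ell$ the lower and upper bounds tighten around $\phi(\mu)$, which is exactly the nesting property used below.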
Tikhonov regularization for large problems cnt.
A method for computing a suitable $\mu$ can be based on the fact that for every $\mu > 0$ it holds that
\[
\phi_k(\mu) < \phi_\ell(\mu) < \phi(\mu) < \bar\phi_\ell(\mu) < \bar\phi_k(\mu)
\quad \text{for } 1 \le k < \ell, \qquad (72)
\]
which was proved by Hanke (2003).
In general the value $\ell$ in pairs $(\ell, \mu)$ that satisfy (71) can be chosen quite small (cf. the examples in the paper of Calvetti and Reichel).
Once a suitable value $\hat\mu$ of the regularization parameter is available, the regularized solution is
\[
x_{\hat\mu,\ell} = V_\ell y_{\hat\mu,\ell}, \qquad (73)
\]
where $y_{\hat\mu,\ell} \in \mathbb{R}^\ell$ satisfies the Galerkin equation
\[
V_\ell^T (A^T A + \hat\mu^{-1} I) V_\ell\, y_{\hat\mu,\ell} = V_\ell^T A^T b. \qquad (74)
\]
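Given $\hat\mu$ and the basis $V_\ell$, the Galerkin step (73)–(74) reduces to a small $\ell\times\ell$ linear system. The self-contained sketch below uses random data and an illustrative value of $\hat\mu$; with $\ell = n$ (and no breakdown) the projected solution coincides with the full Tikhonov solution up to rounding.

```python
import numpy as np

def bidiag_V(A, b, ell):
    """V_ell from ell Golub-Kahan bidiagonalization steps (cf. (48))."""
    m, n = A.shape
    U = np.zeros((m, ell + 1)); V = np.zeros((n, ell))
    U[:, 0] = b / np.linalg.norm(b)
    v = A.T @ U[:, 0]; alpha = np.linalg.norm(v); V[:, 0] = v / alpha
    for i in range(ell):
        u = A @ V[:, i] - alpha * U[:, i]
        beta = np.linalg.norm(u); U[:, i + 1] = u / beta
        if i + 1 < ell:
            v = A.T @ U[:, i + 1] - beta * V[:, i]
            alpha = np.linalg.norm(v); V[:, i + 1] = v / alpha
    return V

rng = np.random.default_rng(5)
A = rng.standard_normal((20, 10)); b = rng.standard_normal(20)
n = A.shape[1]
mu_hat = 5.0                      # illustrative regularization parameter

# full Tikhonov solution of (58)
x_full = np.linalg.solve(A.T @ A + np.eye(n) / mu_hat, A.T @ b)

# Galerkin solution (73)-(74) with a full-length basis (ell = n)
ell = n
V = bidiag_V(A, b, ell)
M = V.T @ (A.T @ A + np.eye(n) / mu_hat) @ V     # projected matrix in (74)
y = np.linalg.solve(M, V.T @ (A.T @ b))
x_gal = V @ y                                    # (73)
```

In practice $\ell \ll n$ suffices (cf. the remark on (71) above), and the projected matrix in (74) need never be formed explicitly, since $V_\ell^T A^T A V_\ell = \bar B_\ell^T \bar B_\ell$.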