

3.1.3 Linear and nonlinear approximation results

We state error bounds for linear uniform and nonlinear approximations of $X$, first with respect to the $L_2$-norm; subsequently, we extend our studies and state error bounds with respect to Besov norms.


[Figure 3.1: Setting for Corollary 3.12 illustrated in a DeVore–Triebel diagram; the diagram plots the smoothness $s$ against the integrability index $1/p$, with the line $B^s_\tau(L_\tau)$, $\frac{1}{\tau} = \frac{s-\nu}{d} + \frac{1}{p}$, marking the scale of nonlinear approximation.]

Error bounds with respect to the $L_2$-norm

Here, we have to consider

$$\alpha+\beta > 1, \quad\text{or}\quad \alpha+\beta \ge 1 \text{ and } \gamma d < -1, \qquad \alpha,\gamma\in\mathbb{R},\ \beta\in[0,1], \tag{3.19}$$

in order to ensure $X \in L_2(O)$ $\mathbb{P}$-a.s., cf. Remark 3.1.
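As a small illustrative aside (not part of the thesis), condition (3.19) is easy to check programmatically; the following hypothetical Python helper encodes it verbatim:

```python
def satisfies_condition_319(alpha: float, beta: float, gamma: float, d: int) -> bool:
    """Check condition (3.19), which ensures X in L_2(O) P-a.s. (cf. Remark 3.1)."""
    if not 0.0 <= beta <= 1.0:
        raise ValueError("beta must lie in [0, 1]")
    return alpha + beta > 1 or (alpha + beta >= 1 and gamma * d < -1)


# Example: Brownian-sheet-type parameters alpha = 2, beta = 0, gamma = 2(d-1)/d, d = 2
print(satisfies_condition_319(2.0, 0.0, 1.0, 2))  # True, since alpha + beta = 2 > 1
```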

For linear approximation one considers the best approximation from linear subspaces of dimension at most $N$, which is given by the orthogonal projection onto these subspaces, cf. Section 2.3.2. The corresponding linear approximation error of $X$ with respect to $L_2(O)$ is given by

$$e^{\mathrm{lin}}_N(X) := \inf \Bigl(\mathbb{E}\bigl[\|X-\widetilde X\|^2_{L_2(O)}\bigr]\Bigr)^{1/2}$$

with the infimum taken over all measurable mappings $\widetilde X : \Omega\to L_2(O)$ such that $\dim(\operatorname{span}(\widetilde X(\Omega))) \le N$.

Theorem 3.15. Let $\alpha,\beta,\gamma$ in (3.19) be fixed and let $X$ be defined by (3.3). The linear approximation error with respect to $L_2(O)$ satisfies

$$e^{\mathrm{lin}}_N(X) \asymp (\log_2 N)^{\frac{\gamma d}{2}}\, N^{-\frac{\alpha+\beta-1}{2}}.$$

Proof. To prove the upper bound, we truncate the decomposition (3.3) of $X$ at some level $j_1 \ge j_0$ and we obtain a uniform linear approximation

$$X_{j_1} := \sum_{j=j_0}^{j_1}\sum_{k\in\nabla_j} \sigma_j Y_{j,k} Z_{j,k}\psi_{j,k}, \tag{3.20}$$

which satisfies

$$\mathbb{E}\bigl[\|X-X_{j_1}\|^2_{L_2(O)}\bigr] \asymp \sum_{j=j_1+1}^{\infty} \#\nabla_j\,\sigma_j^2\rho_j \asymp \sum_{j=j_1+1}^{\infty} j^{\gamma d}\,2^{-(\alpha+\beta-1)jd} \asymp j_1^{\gamma d}\,2^{-(\alpha+\beta-1)j_1 d}. \tag{3.21}$$


Since $\dim(\operatorname{span}(X_{j_1}(\Omega))) \asymp \sum_{j=j_0}^{j_1}\#\nabla_j \asymp 2^{j_1 d}$, we get the upper bound as claimed.

To prove the lower bound we use the fact that $\psi_{j,k} = \Phi_{\mathrm{blb}}\, e_{j,k}$ for an orthonormal basis $(e_{j,k})_{j,k}$ in $L_2(O)$ and a bounded linear bijection $\Phi_{\mathrm{blb}} : L_2(O) \to L_2(O)$, cf. Remark 3.2. This implies

$$e^{\mathrm{lin}}_N(X) \asymp e^{\mathrm{lin}}_N(\Phi_{\mathrm{blb}}^{-1}X).$$

Furthermore, $e^{\mathrm{lin}}_N(\Phi_{\mathrm{blb}}^{-1}X)$ depends on $\Phi_{\mathrm{blb}}^{-1}X$ only via its covariance operator $Q$ which, as we also know from Remark 3.2, is given by

$$Q\xi = \sum_{j=j_0}^{\infty}\sigma_j^2\rho_j\sum_{k\in\nabla_j}\langle\xi, e_{j,k}\rangle_{L_2(O)}\, e_{j,k}. \tag{3.22}$$

Consequently, the functions $e_{j,k}$ form an orthonormal basis of eigenfunctions of $Q$ with associated eigenvalues $\sigma_j^2\rho_j$. Due to a theorem by Micchelli and Wahba, see, e.g., Ritter [139, Proposition III.24], we can conclude

$$e^{\mathrm{lin}}_N(\Phi_{\mathrm{blb}}^{-1}X) = \Bigl(\sum_{j=j_1+1}^{\infty}\#\nabla_j\,\sigma_j^2\rho_j\Bigr)^{1/2} \quad\text{with}\quad N = \sum_{j=j_0}^{j_1}\#\nabla_j. \tag{3.23}$$

Inserting (W3), i.e., $\#\nabla_j \asymp 2^{jd}$, and (3.1) into (3.23) implies the asserted lower bound.
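As a rough numerical sanity check (our own addition, not part of the proof), one can evaluate the tail sum in (3.21)/(3.23) for concrete parameters and compare it with the rate of Theorem 3.15; the sketch below assumes the exact forms $\#\nabla_j = 2^{jd}$, $\sigma_j^2 = j^{\gamma d}2^{-\alpha jd}$ and $\rho_j = 2^{-\beta jd}$ in place of the asymptotic equivalences:

```python
import math

def linear_error_tail(j1: int, alpha: float, beta: float, gamma: float, d: int,
                      j_max: int = 200) -> float:
    """(sum_{j > j1} #nabla_j * sigma_j^2 * rho_j)^(1/2), cf. (3.21) and (3.23).

    Exact model assumed: #nabla_j = 2^(jd), sigma_j^2 = j^(gamma*d) * 2^(-alpha*j*d),
    rho_j = 2^(-beta*j*d); the thesis only fixes these up to constants.
    """
    tail = sum(2.0 ** (j * d) * j ** (gamma * d) * 2.0 ** (-(alpha + beta) * j * d)
               for j in range(j1 + 1, j_max + 1))
    return math.sqrt(tail)

alpha, beta, gamma, d = 2.0, 0.0, 0.5, 1
for j1 in (5, 10, 15, 20):
    N = 2 ** (j1 * d)                     # dimension of the truncated expansion
    predicted = math.log2(N) ** (gamma * d / 2) * N ** (-(alpha + beta - 1) / 2)
    print(j1, linear_error_tail(j1, alpha, beta, gamma, d) / predicted)  # ratio levels off
```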

The best $N$-term wavelet approximation imposes a restriction only on

$$\eta(g) := \#\Bigl\{\lambda\in\nabla : g = \sum_{\lambda\in\nabla} c_\lambda\psi_\lambda,\ c_\lambda \ne 0\Bigr\}, \tag{3.24}$$

the number of non-zero wavelet coefficients of $g$. Therefore, the corresponding error of best $N$-term wavelet approximation for $X$ with respect to $L_2(O)$ is given by

$$e^{\mathrm{best}}_N(X) := \inf \Bigl(\mathbb{E}\bigl[\|X-\widetilde X\|^2_{L_2(O)}\bigr]\Bigr)^{1/2}$$

with the infimum taken over all measurable mappings $\widetilde X : \Omega\to L_2(O)$ such that $\eta(\widetilde X(\omega)) \le N$ $\mathbb{P}$-a.s.

For deterministic functions $x$ on $O$ the error of best $N$-term wavelet approximation with respect to the $L_2$-norm is defined by

$$e^{\mathrm{det}}_N(x) := \inf\bigl\{\|x-\widetilde x\|_{L_2(O)} : \widetilde x\in L_2(O),\ \eta(\widetilde x)\le N\bigr\}, \tag{3.25}$$

cf. Section 2.3.2. Clearly, we have

$$e^{\mathrm{best}}_N(X) = \bigl(\mathbb{E}\bigl[e^{\mathrm{det}}_N(X)^2\bigr]\bigr)^{1/2}.$$

Theorem 3.16. Let $\alpha,\beta,\gamma$ in (3.19) be fixed and let $X$ be defined by (3.3). For every $\varepsilon > 0$, the error of best $N$-term wavelet approximation with respect to $L_2(O)$ satisfies

$$e^{\mathrm{best}}_N(X) \preceq \begin{cases} N^{-1/\varepsilon} & \text{if } \beta = 1,\\[2pt] N^{-\frac{\alpha+\beta-1}{2(1-\beta)}+\varepsilon} & \text{otherwise.}\end{cases}$$


Proof. The case $\beta = 1$ is a direct consequence of the definition of $X$. For $\beta < 1$, let $s$ and $\tau$ satisfy (3.15) with $\nu = 0$ and $p = 2$, i.e., $1/\tau = s/d + 1/2$. By Remark 2.26 in Section 2.3.3 we have that $x\in B^s_\tau(L_\tau(O))$ implies $e^{\mathrm{det}}_N(x) \preceq \|x\|_{B^s_\tau(L_\tau(O))}\, N^{-s/d}$, and therefore it remains to apply Corollary 3.12.

For random functions it is also reasonable to impose a constraint on the average number of non-zero wavelet coefficients only, and to study the error of best average $N$-term wavelet approximation

$$e^{\mathrm{avg}}_N(X) := \inf \Bigl(\mathbb{E}\bigl[\|X-\widetilde X\|^2_{L_2(O)}\bigr]\Bigr)^{1/2}$$

with the infimum taken over all measurable mappings $\widetilde X : \Omega\to L_2(O)$ such that $\mathbb{E}[\eta(\widetilde X)] \le N$.

Theorem 3.17. Let $\alpha,\beta,\gamma$ in (3.19) be fixed and let $X$ be defined by (3.3). The error of best average $N$-term wavelet approximation with respect to $L_2(O)$ satisfies

$$e^{\mathrm{avg}}_N(X) \preceq \begin{cases} N^{\frac{\gamma d}{2}}\, 2^{-\frac{\alpha d N}{2}} & \text{if } \beta = 1,\\[2pt] (\log_2 N)^{\frac{\gamma d}{2}}\, N^{-\frac{\alpha+\beta-1}{2(1-\beta)}} & \text{otherwise.}\end{cases}$$

Proof. Let $N_{j_1} := \mathbb{E}[\eta(X_{j_1})]$ for $X_{j_1}$ as in (3.20). Clearly

$$N_{j_1} = \sum_{j=j_0}^{j_1}\#\nabla_j\,\rho_j \asymp \sum_{j=j_0}^{j_1} 2^{(1-\beta)jd} \asymp \begin{cases} j_1 & \text{if } \beta = 1,\\ 2^{(1-\beta)j_1 d} & \text{otherwise.}\end{cases}$$

In particular, $2^{j_1 d} \asymp N_{j_1}^{1/(1-\beta)}$ if $\beta\in[0,1)$. The claim now follows from the error bound (3.21).

The asymptotic behavior of the linear approximation error $e^{\mathrm{lin}}_N(X)$ is determined by the decay of the eigenvalues $\sigma_j^2\rho_j$ of the covariance operator $Q$, see (3.22), i.e., it is essentially determined by the sum $\alpha+\beta$. According to Theorem 3.10, the sum $\alpha+\beta$ also determines the regularity of $X$ in the scale of Sobolev spaces $H^s(O)$.

For $\beta\in(0,1]$ nonlinear approximation is superior to linear approximation. More precisely, the following holds true. By definition, $e^{\mathrm{avg}}_N(X) \le e^{\mathrm{best}}_N(X)$, and for $\beta > 0$ the convergence of $e^{\mathrm{best}}_N(X)$ to zero is faster than that of $e^{\mathrm{lin}}_N(X)$. For $\beta\in(0,1)$ the upper bounds for $e^{\mathrm{avg}}_N(X)$ and $e^{\mathrm{best}}_N(X)$ differ only slightly, and any dependence of $e^{\mathrm{best}}_N(X)$ on the parameter $\gamma$ is swallowed by the term $N^{\varepsilon}$ in the upper bound. For linear and best average $N$-term approximation we have

$$e^{\mathrm{avg}}_{N^{1-\beta}}(X) \preceq e^{\mathrm{lin}}_N(X) \ \text{ if } \beta\in(0,1) \qquad\text{and}\qquad e^{\mathrm{avg}}_{c\log_2 N}(X) \preceq e^{\mathrm{lin}}_N(X) \ \text{ if } \beta = 1$$

with a suitably chosen constant $c > 0$.
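To make this comparison concrete, the following small helper (our own illustration; it records only the polynomial orders from Theorems 3.15–3.17 for $\beta\in(0,1)$ and ignores constants, logarithmic factors, and the $N^{\varepsilon}$ term of Theorem 3.16) contrasts the decay exponents:

```python
def decay_exponents(alpha: float, beta: float) -> dict:
    """Polynomial decay orders in N for beta in (0, 1), cf. Theorems 3.15-3.17.

    Logarithmic factors and the N^epsilon term in Theorem 3.16 are ignored.
    """
    assert 0.0 < beta < 1.0 and alpha + beta > 1.0
    linear = (alpha + beta - 1) / 2                    # e_N^lin       ~ N^(-linear)
    nonlinear = (alpha + beta - 1) / (2 * (1 - beta))  # e_N^best/avg  ~ N^(-nonlinear)
    return {"linear": linear, "nonlinear (best/avg)": nonlinear}

print(decay_exponents(alpha=1.5, beta=0.5))
# {'linear': 0.5, 'nonlinear (best/avg)': 1.0}: the nonlinear exponent is 1/(1-beta) times larger
```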

Remark 3.18. We stress that for $\beta\in(0,1]$ the simulation of the approximation $X_{j_1}$, which achieves the upper bound in Theorem 3.17, is possible at an average computational cost of order $N_{j_1}$. Let us briefly sketch the method of simulation. Set $n_j := \#\nabla_j$. For each level $j$ we first simulate a binomial distribution with parameters $n_j$ and $\rho_j$, which is possible at an average cost of order at most $n_j\rho_j$. Conditional on a realization $L(\omega)$ of this step, the locations of the non-zero coefficients on level $j$ are uniformly distributed on the set of all subsets of $\{1,\dots,n_j\}$ of cardinality $L(\omega)$. Thus, in the second step, we employ acceptance-rejection to collect the elements of such a random subset sequentially. If $L(\omega)\le n_j/2$, then all acceptance probabilities are at least $1/2$; otherwise we switch to complements to obtain the same bound for the acceptance probability. In this way, the average cost of the second step is of order $n_j\rho_j$, too. In the last step we simulate the values of the non-zero coefficients. In total, the average computational cost for each level $j$ is of order $n_j\rho_j$.
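A minimal Python sketch of this level-wise procedure might look as follows (the function name and the use of NumPy are our own; the thesis itself only describes the algorithm):

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate_level(n_j: int, rho_j: float, sigma_j: float):
    """Simulate the non-zero coefficients on one level, following Remark 3.18.

    Returns the locations k with Y_{j,k} = 1 and the values sigma_j * Z_{j,k};
    the average cost is of order n_j * rho_j.
    """
    # Step 1: number L of non-zero coefficients on this level.
    L = rng.binomial(n_j, rho_j)
    # Step 2: a uniformly distributed subset of {1, ..., n_j} of cardinality L,
    # collected sequentially by acceptance-rejection; if L > n_j / 2 we draw the
    # complement instead, so every acceptance probability is at least 1/2.
    target = L if L <= n_j // 2 else n_j - L
    drawn = set()
    while len(drawn) < target:
        candidate = int(rng.integers(1, n_j + 1))
        if candidate not in drawn:            # accept, otherwise reject and redraw
            drawn.add(candidate)
    if L > n_j // 2:
        drawn = set(range(1, n_j + 1)) - drawn
    locations = np.array(sorted(drawn), dtype=int)
    # Step 3: the values of the non-zero coefficients.
    values = sigma_j * rng.standard_normal(len(locations))
    return locations, values

# Example: level j with n_j = 2^(jd) indices and rho_j = 2^(-beta*j*d)
locations, values = simulate_level(n_j=2 ** 10, rho_j=2.0 ** -5, sigma_j=0.1)
```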

Remark 3.19. For Theorems 3.15 and 3.17 we only need the Riesz basis property (W1) and the property (W3), i.e., $\#\nabla_j \asymp 2^{jd}$, of the basis $\Psi$, and (W3) enters only via the asymptotic behavior of the parameters $\rho_j$ and $\sigma_j$. After a lexicographic reordering of the indices $(j,k)$ the two assumptions essentially amount to

$$X = \sum_{n=1}^{\infty} \sigma_n Y_n Z_n \psi_n$$

with any Riesz basis $\{\psi_n\}_{n\in\mathbb{N}}$ for $L_2(O)$, and $\sigma_n \asymp (\log_2 n)^{\gamma d/2}\, n^{-\alpha/2}$ as well as independent random variables $Y_n$ and $Z_n$, where $Z_n$ is $N(0,1)$-distributed and $Y_n$ is Bernoulli distributed with parameter $\rho_n \asymp n^{-\beta}$. Therefore, Theorems 3.15 and 3.17 remain valid beyond the wavelet setting. For instance, let $\rho_n = 1$, which corresponds to $\beta = 0$.

Classical examples for Gaussian random functions on $O = [0,1]^d$ are the Brownian sheet, which corresponds to $\alpha = 2$ and $\gamma = 2(d-1)/d$, and Lévy's Brownian motion, which corresponds to $\alpha = (d+1)/d$ and $\gamma = 0$. Theorem 3.15 is due to Papageorgiou, Wasilkowski [132] and Woźniakowski [175] for the Brownian sheet and due to Wasilkowski [173] for Lévy's Brownian motion. See Ritter [139, Chapter VI] for further results and references on approximation of Gaussian random functions. Therefore, for $\beta > 0$ our stochastic model provides sparse variants of general Gaussian random functions.
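For instance, substituting the Brownian sheet parameters $\alpha = 2$, $\beta = 0$, $\gamma = 2(d-1)/d$ into Theorem 3.15 gives $\gamma d/2 = d-1$ and $(\alpha+\beta-1)/2 = 1/2$, hence $e^{\mathrm{lin}}_N(X) \asymp (\log_2 N)^{d-1}\, N^{-1/2}$; for Lévy's Brownian motion, $\alpha = (d+1)/d$, $\beta = 0$, $\gamma = 0$ yield $e^{\mathrm{lin}}_N(X) \asymp N^{-1/(2d)}$.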

Error bounds with respect to Besov norms

We extend the above findings and state error bounds for linear and nonlinear approximation schemes for the considered random functions with respect to the norms of the Besov spaces $B^\nu_p(L_p(O))$ with $\nu\in\mathbb{R}$ and $1 < p < \infty$.

We define the linear approximation error of $X$ with respect to $B^\nu_p(L_p(O))$ by

$$e^{\mathrm{lin}}_{N,p,\nu}(X) := \inf \Bigl(\mathbb{E}\bigl[\|X-\widetilde X\|^p_{B^\nu_p(L_p(O))}\bigr]\Bigr)^{1/p}$$

with the infimum taken over all measurable mappings $\widetilde X$ such that $\dim(\operatorname{span}(\widetilde X(\Omega))) \le N$.

Theorem 3.20. Let $\beta\in[0,1)$, $\bar m > 0$. For a fixed approximation space $B^\nu_p(L_p(O))$, $\nu\in\mathbb{R}$, $1 < p < \infty$, let $X$ be given by (3.3) with

$$\nu + m < d\Bigl(\frac{\alpha-1}{2} + \frac{\beta}{p}\Bigr) =: \nu + \bar m, \tag{3.26}$$

i.e., $X\in B^{\nu+m}_p(L_p(O))$ for all $m < \bar m$. The linear approximation error with respect to $B^\nu_p(L_p(O))$ satisfies

$$e^{\mathrm{lin}}_{N,p,\nu}(X) \preceq (\log_2 N)^{\frac{\gamma d}{2}}\, N^{-\bigl(\frac{\alpha-1}{2}+\frac{\beta}{p}-\frac{\nu}{d}\bigr)}. \tag{3.27}$$


Proof. Again, as a specific linear approximation, we consider a uniform approximation of the form

$$X_{j_1} = \sum_{j=j_0}^{j_1}\sum_{k\in\nabla_j}\sigma_j Y_{j,k} Z_{j,k}\psi_{j,k}$$

for some $j_1 \ge j_0$, where in particular $N \asymp 2^{j_1 d}$. With $S_{j,p}$ as defined in (3.4) and with (3.5), we obtain

E[∥X−Xj1pBν

p(Lp(O))]≍E

j=j1+1

2jp(ν+d(12p1))σpjSj,p

= E

j=j1+1

2jp((ν+m)+d(121p))2−jpmσjpSj,p

=

j=j1+1

2jpd((ν+m)+d(121p))2−jpmσjp#∇jρj.

Inserting (W3), i.e., #∇j ≍2jd, (3.1), and (3.26) we get E[∥X−Xj1pBν

p(Lp(O))]≍

j=j1+1

2jpd(α−12 +βp+121p)2−jpmjγdp2 2αjdp2 2jd2−βjd

=

j=j1+1

jγdp2 2−jpm

≍j

γdp 2

1 2−j1pm

≍(log2N)γdp2 N−p(α−12 +βpνd),

which yields (3.27).

Remark 3.21. In the setting of Theorem 3.20, with a slightly coarser error estimation, for all $m < \bar m$ we get

$$\mathbb{E}\bigl[\|X-X_{j_1}\|^p_{B^\nu_p(L_p(O))}\bigr] \preceq 2^{-j_1 p m}\,\mathbb{E}\Bigl[\sum_{j=j_0}^{\infty} 2^{jp((\nu+m)+d(\frac12-\frac1p))}\,\sigma_j^p\, S_{j,p}\Bigr] \preceq N^{-pm/d}\,\mathbb{E}\bigl[\|X\|^p_{B^{\nu+m}_p(L_p(O))}\bigr].$$

Since we have $\mathbb{E}\bigl[\|X\|^p_{B^{\nu+m}_p(L_p(O))}\bigr] < \infty$, $m < \bar m$, by (3.10), we derive that the linear approximation error satisfies

$$e^{\mathrm{lin}}_{N,p,\nu}(X) \preceq N^{-m/d}\,\Bigl(\mathbb{E}\bigl[\|X\|^p_{B^{\nu+m}_p(L_p(O))}\bigr]\Bigr)^{1/p}. \tag{3.28}$$

From (3.28), we observe that, similar to the well-known deterministic setting, see Section 2.3.2, the approximation order which can be achieved by uniform linear schemes depends on the regularity of the object under consideration in the same scale of smoothness spaces.


The following theorem is a generalization of Theorem 3.15. It states the error bounds for linear wavelet approximation with respect to $H^\nu(O)$.

Theorem 3.22. Let $\beta\in[0,1)$ and $\bar m > 0$. For a fixed approximation space $H^\nu(O)$, $\nu\in\mathbb{R}$, let $X$ be given by (3.3) with $\nu + m < d(\alpha-1+\beta)/2 =: \nu + \bar m$, that is, $X\in H^{\nu+m}(O)$ for all $m < \bar m$. The linear approximation error with respect to $H^\nu(O)$ satisfies

$$e^{\mathrm{lin}}_{N,2,\nu}(X) \asymp (\log_2 N)^{\frac{\gamma d}{2}}\, N^{-\bigl(\frac{\alpha-1+\beta}{2}-\frac{\nu}{d}\bigr)}.$$

Proof. The upper bound is proven in Theorem 3.20, where $p = 2$. The lower bound is shown analogously to the proof of Theorem 3.15. Given $\{e_{j,k}\}_{j,k}$ and $\Phi_{\mathrm{blb}}$ as in Remark 3.2, we know that

$$e^{\mathrm{lin}}_{N,2,\nu}(X) \asymp e^{\mathrm{lin}}_{N,2,\nu}(\Phi_{\mathrm{blb}}^{-1}X).$$

We also know from Remark 3.2 that the covariance operator $Q$ of $\Phi_{\mathrm{blb}}^{-1}X$ is given by

$$Q\xi = \sum_{j=j_0}^{\infty} 2^{2j\nu}\sigma_j^2\rho_j\sum_{k\in\nabla_j}\langle\xi, e_{j,k}\rangle_{H^\nu(O)}\, e_{j,k},$$

which means that the functions $e_{j,k}$ form an orthonormal basis of eigenfunctions of $Q$ with associated eigenvalues $2^{2j\nu}\sigma_j^2\rho_j$. Using methods as shown, e.g., in Ritter [139, Chapter III], we get

$$e^{\mathrm{lin}}_{N,2,\nu}(\Phi_{\mathrm{blb}}^{-1}X) = \Bigl(\sum_{j=j_1+1}^{\infty}\#\nabla_j\,2^{2j\nu}\sigma_j^2\rho_j\Bigr)^{1/2} \quad\text{with}\quad N = \sum_{j=j_0}^{j_1}\#\nabla_j. \tag{3.29}$$

Inserting (W3), i.e., $\#\nabla_j \asymp 2^{jd}$, and (3.1) into (3.29) yields the claim.

We define the average nonlinear approximation error of $X : \Omega\to B^s_\tau(L_\tau(O))$ with respect to $B^\nu_p(L_p(O))$ in the scale (3.15), i.e.,

$$\frac{1}{\tau} = \frac{s-\nu}{d} + \frac{1}{p}, \qquad 1 < p < \infty, \quad \nu < s,$$

cf. Corollary 3.12, by

$$e^{\mathrm{avg}}_{N,p,\nu}(X) := \inf \Bigl(\mathbb{E}\bigl[\|X-\widetilde X\|^p_{B^\nu_p(L_p(O))}\bigr]\Bigr)^{1/p}$$

with the infimum taken over all measurable mappings $\widetilde X$ such that $\mathbb{E}[\eta(\widetilde X)] \le N$. Again, $\eta(g)$ denotes the number of nonzero wavelet coefficients of $g$, see (3.24).

Theorem 3.23. Let $\beta\in[0,1)$. For a fixed approximation space $B^\nu_p(L_p(O))$, $\nu\in\mathbb{R}$, $1 < p < \infty$, let $X$ be given by (3.3) with $-d/p \le \nu < d\bigl(\tfrac{\alpha-1}{2}+\tfrac{\beta}{p}\bigr)$, that is, $X\in B^s_\tau(L_\tau(O))$ in the scale (3.15) for all $s < \bar s$, where $\bar s$ is given by (3.16). Then the average nonlinear approximation error with respect to $B^\nu_p(L_p(O))$ satisfies

$$e^{\mathrm{avg}}_{N,p,\nu}(X) \preceq (\log_2 N)^{\frac{\gamma d}{2}}\, N^{-\frac{1}{1-\beta}\bigl(\frac{\alpha-1}{2}+\frac{\beta}{p}-\frac{\nu}{d}\bigr)}. \tag{3.30}$$


Proof. As a specific nonlinear approximation of $X$ we consider

$$X_{j_1,N} := \sum_{j=j_0}^{j_1}\sum_{k\in\nabla_j}\sigma_j Y_{j,k} Z_{j,k}\psi_{j,k}$$

for some $j_1 \ge j_0$, where only the non-zero coefficients are retained and $N := \mathbb{E}[\eta(X_{j_1,N})]$. We have

$$N = \sum_{j=j_0}^{j_1}\#\nabla_j\,\rho_j \asymp 2^{(1-\beta)j_1 d}.$$

With $S_{j,p}$ being defined in (3.4) and with (3.5), we use (3.15), where $s = \bar s$ and $\tau = \bar\tau$, to obtain

$$\begin{aligned}
\mathbb{E}\bigl[\|X-X_{j_1,N}\|^p_{B^\nu_p(L_p(O))}\bigr] &\asymp \mathbb{E}\Bigl[\sum_{j=j_1+1}^{\infty} 2^{jp(\nu+d(\frac12-\frac1p))}\,\sigma_j^p\, S_{j,p}\Bigr] \\
&= \mathbb{E}\Bigl[\sum_{j=j_1+1}^{\infty} 2^{jp(\bar s+d(\frac12-\frac1{\bar\tau}))}\,\sigma_j^p\, S_{j,p}\Bigr] = \sum_{j=j_1+1}^{\infty} 2^{jp(\bar s+d(\frac12-\frac1{\bar\tau}))}\,\sigma_j^p\,\#\nabla_j\,\rho_j.
\end{aligned}$$

Inserting (W3), i.e., $\#\nabla_j \asymp 2^{jd}$, (3.1), (3.18), and (3.15), where $s = \bar s$ and $\tau = \bar\tau$, we get

$$\begin{aligned}
\mathbb{E}\bigl[\|X-X_{j_1,N}\|^p_{B^\nu_p(L_p(O))}\bigr] &\asymp \sum_{j=j_1+1}^{\infty} 2^{jpd(\frac{\alpha-1}{2}+\frac{\beta}{\bar\tau}+\frac12-\frac1{\bar\tau})}\,j^{\frac{\gamma d p}{2}}\,2^{-\frac{\alpha jdp}{2}}\,2^{jd}\,2^{-\beta jd} \\
&= \sum_{j=j_1+1}^{\infty} j^{\frac{\gamma d p}{2}}\,2^{-jp(1-\beta)(\bar s-\nu)} \asymp j_1^{\frac{\gamma d p}{2}}\,2^{-j_1 p(1-\beta)(\bar s-\nu)} \asymp (\log_2 N)^{\frac{\gamma d p}{2}}\, N^{-\frac{p}{1-\beta}\bigl(\frac{\alpha-1}{2}+\frac{\beta}{p}-\frac{\nu}{d}\bigr)},
\end{aligned}$$

which yields (3.30).

An analogous statement to Remark 3.21 also holds for the average nonlinear approximation error.

Remark 3.24. Let $\varepsilon > 0$ and $s := \bar s - \varepsilon$ with $\bar s$ being defined in (3.16). In the setting of Theorem 3.23, with a slightly coarser error estimation, we get

$$\begin{aligned}
\mathbb{E}\bigl[\|X-X_{j_1,N}\|^p_{B^\nu_p(L_p(O))}\bigr] &\asymp \mathbb{E}\Bigl[\sum_{j=j_1+1}^{\infty} 2^{j(p-\tau)(\bar s+d(\frac12-\frac1{\bar\tau}))}\,\sigma_j^{p-\tau}\; 2^{j\tau(\bar s+d(\frac12-\frac1{\bar\tau}))}\,\sigma_j^{\tau}\, S_{j,p}\Bigr] \\
&\asymp \mathbb{E}\Bigl[\sum_{j=j_1+1}^{\infty} 2^{j(p-\tau)(\bar s+d(\frac12-\frac1{\bar\tau}-\frac{\alpha}{2}))}\,j^{\frac{\gamma d}{2}(p-\tau)}\; 2^{j\tau(\bar s+d(\frac12-\frac1{\bar\tau}))}\,\sigma_j^{\tau}\, S_{j,p}\Bigr] \\
&\preceq \mathbb{E}\Bigl[\sum_{j=j_1+1}^{\infty} 2^{j(p-\tau)(\bar s+d(\frac12-\frac1{\bar\tau}-\frac{\alpha}{2}))+\delta j}\; 2^{j\tau(s+d(\frac12-\frac1{\tau}))}\,\sigma_j^{\tau}\, S_{j,p}\Bigr]
\end{aligned}$$

for any $\delta > 0$. Inserting $s = \bar s - \varepsilon$ and $\delta := p(s-\nu)(1-\beta)\varepsilon\tau/d$, as well as using (3.18) and (3.15), both with $s = \bar s$, $\tau = \bar\tau$ and with $s$, $\tau$, which yields $1/\bar\tau = 1/\tau + \varepsilon/d$, we get

$$\begin{aligned}
\mathbb{E}\bigl[\|X-X_{j_1,N}\|^p_{B^\nu_p(L_p(O))}\bigr] &\preceq \mathbb{E}\Bigl[\sum_{j=j_1+1}^{\infty} 2^{j(p-\tau)d(\beta-1)(\frac1\tau+\frac{\varepsilon}{d})+\delta j}\; 2^{j\tau(s+d(\frac12-\frac1{\tau}))}\,\sigma_j^{\tau}\, S_{j,p}\Bigr] \\
&= \mathbb{E}\Bigl[\sum_{j=j_1+1}^{\infty} 2^{jp(s-\nu)\tau(\beta-1)(\frac1\tau+\frac{\varepsilon}{d})+\delta j}\; 2^{j\tau(s+d(\frac12-\frac1{\tau}))}\,\sigma_j^{\tau}\, S_{j,p}\Bigr] \\
&= \mathbb{E}\Bigl[\sum_{j=j_1+1}^{\infty} 2^{-jp(1-\beta)(s-\nu)}\; 2^{j\tau(s+d(\frac12-\frac1{\tau}))}\,\sigma_j^{\tau}\, S_{j,p}\Bigr] \\
&\preceq 2^{-j_1 p(1-\beta)(s-\nu)}\,\mathbb{E}\Bigl[\sum_{j=j_1+1}^{\infty} 2^{j\tau(s+d(\frac12-\frac1{\tau}))}\,\sigma_j^{\tau}\, S_{j,p}\Bigr] \\
&\preceq 2^{-j_1 p(1-\beta)(s-\nu)}\,\mathbb{E}\Bigl[\sum_{j=j_0}^{\infty} 2^{j\tau(s+d(\frac12-\frac1{\tau}))}\,\sigma_j^{\tau}\, S_{j,p}\Bigr] \\
&\preceq N^{-p\frac{s-\nu}{d}}\,\mathbb{E}\bigl[\|X\|^{\tau}_{B^s_\tau(L_\tau(O))}\bigr]
\end{aligned}$$

with $\mathbb{E}[S_{j,p}] = \#\nabla_j\,\rho_j\,\nu_p = \#\nabla_j\,\rho_j\,\nu_\tau\,\frac{\nu_p}{\nu_\tau} = \mathbb{E}[S_{j,\tau}]\,\frac{\nu_p}{\nu_\tau}$. Since we have $\mathbb{E}\bigl[\|X\|^{\tau}_{B^s_\tau(L_\tau(O))}\bigr] < \infty$ for $s < \bar s$, by (3.10), we see that the average nonlinear approximation error satisfies

$$e^{\mathrm{avg}}_{N,p,\nu}(X) \preceq N^{-\frac{s-\nu}{d}}\,\Bigl(\mathbb{E}\bigl[\|X\|^{\tau}_{B^s_\tau(L_\tau(O))}\bigr]\Bigr)^{1/p}. \tag{3.31}$$

From (3.31) we observe that, similar to the deterministic setting, the approximation order which can be achieved by nonlinear approximation does not depend on the regularity in the same scale of smoothness spaces of the object under consideration, but on the regularity in the corresponding scale (3.15) of Besov spaces.

For the case $p = 2$, i.e., for nonlinear wavelet approximation with respect to $H^\nu(O)$, a lower bound for the average nonlinear approximation error can also be derived.

Theorem 3.25. Let $\beta\in[0,1)$. For a fixed approximation space $H^\nu(O)$, $\nu\in\mathbb{R}$, let $X$ be given by (3.3) with $-d/2 \le \nu < d(\alpha-1+\beta)/2$, i.e., $X\in B^s_\tau(L_\tau(O))$ in the scale (3.15) for all $s < \bar s$, where $\bar s$ is given by (3.16) with $p = 2$. Then the average nonlinear approximation error in $H^\nu(O)$ satisfies

$$e^{\mathrm{avg}}_{N,2,\nu}(X) \succeq (\log_2 N)^{\frac{\gamma d}{2}}\, N^{-\frac{1}{1-\beta}\bigl(\frac{\alpha-1+\beta}{2}-\frac{\nu}{d}\bigr)}. \tag{3.32}$$

Proof. Let $X$ be defined by (3.3). For every level $j$, we define the number of scaled coefficients of $X$ larger than a threshold $\delta_j > 0$ as

$$M(j,\delta_j) := \#\bigl\{k\in\nabla_j : 2^{j\nu}\sigma_j Y_{j,k}|Z_{j,k}| > \delta_j\bigr\}. \tag{3.33}$$


We set $Y_{j,\beta} := \sum_{k\in\nabla_j} Y_{j,k}$ and obtain $Y_{j,\beta}\sim\mathrm{Bin}(2^{jd}, 2^{-\beta jd})$. Since the $(Y_{j,k})_{j,k}$ are discrete and the $(Z_{j,k})_{j,k}$ are identically distributed, we can compute

$$\begin{aligned}
\mathbb{E}[M(j,\delta_j)] &= \sum_{l=0}^{2^{jd}} \mathbb{E}\Bigl[M(j,\delta_j)\,\Big|\,\sum_{k\in\nabla_j} Y_{j,k} = l\Bigr]\;\mathbb{P}\Bigl(\sum_{k\in\nabla_j} Y_{j,k} = l\Bigr) \\
&= \sum_{l=0}^{2^{jd}} l\;\mathbb{P}\bigl(2^{j\nu}\sigma_j |Z_{j,l}| > \delta_j\bigr)\;\mathbb{P}(Y_{j,\beta} = l) \\
&= \mathbb{E}[Y_{j,\beta}]\;\mathbb{P}\bigl(2^{j\nu}\sigma_j |Z_{j,k}| > \delta_j\bigr) \\
&= 2^{jd(1-\beta)}\cdot 2\Bigl(1 - \Phi_{\mathrm{cdf}}\Bigl(\frac{\delta_j}{2^{j\nu}\sigma_j}\Bigr)\Bigr),
\end{aligned}$$

where $\Phi_{\mathrm{cdf}}$ denotes the cumulative distribution function of the standard normal distribution. Now, we choose

$$\delta_j := 2^{j\nu}\sigma_j \tag{3.34}$$

and we obtain $\mathbb{E}[M(j, 2^{j\nu}\sigma_j)] = c_1\, 2^{jd(1-\beta)}$ with $c_1 := 2(1-\Phi_{\mathrm{cdf}}(1))$. For a given $N\in\mathbb{N}_0$ we set $j_1 := \min\{j : 2N \le c_1 2^{jd}\}$ and determine a level $j_2$ such that

$$\mathbb{E}\bigl[M(j_2, 2^{j_2\nu}\sigma_{j_2})\bigr] \ge c_1\, 2^{j_1 d} \ge 2N. \tag{3.35}$$

This holds for

$$j_2 = \Bigl\lceil\frac{j_1}{1-\beta}\Bigr\rceil. \tag{3.36}$$

Up to this point we have shown that, for $X$ and any given $N\in\mathbb{N}_0$, we can find a level $j_2$ which contains on average at least $2N$ coefficients that are larger than $\delta_{j_2}$.

Let

$$\widetilde X_N := \sum_{j=j_0}^{\infty}\sum_{k\in\widetilde\nabla_j} c_{j,k}\psi_{j,k} =: \sum_{\lambda\in\widetilde\nabla} c_\lambda\psi_\lambda$$

with $\mathbb{E}[\#\widetilde\nabla] = \mathbb{E}[\eta(\widetilde X_N)] \le N$ be any approximation of

$$X = \sum_{j=j_0}^{\infty}\sum_{k\in\nabla_j}\sigma_j Y_{j,k} Z_{j,k}\psi_{j,k} =: \sum_{\lambda\in\nabla} d_\lambda\psi_\lambda.$$

We set $|\lambda| := j$ for $\lambda = (j,k)$. Then, by using the norm equivalence from (W6), we obtain

$$\begin{aligned}
\mathbb{E}\bigl[\|X-\widetilde X_N\|^2_{H^\nu(O)}\bigr] &= \mathbb{E}\Bigl[\Bigl\|\sum_{\lambda\in\nabla} d_\lambda\psi_\lambda - \sum_{\lambda\in\widetilde\nabla} c_\lambda\psi_\lambda\Bigr\|^2_{H^\nu(O)}\Bigr] \\
&= \mathbb{E}\Bigl[\Bigl\|\sum_{\lambda\in\nabla\setminus\widetilde\nabla} d_\lambda\psi_\lambda + \sum_{\lambda\in\widetilde\nabla} (d_\lambda - c_\lambda)\psi_\lambda\Bigr\|^2_{H^\nu(O)}\Bigr] \\
&\asymp \mathbb{E}\Bigl[\sum_{\lambda\in\nabla\setminus\widetilde\nabla} 2^{2|\lambda|\nu}|d_\lambda|^2 + \sum_{\lambda\in\widetilde\nabla} 2^{2|\lambda|\nu}|d_\lambda - c_\lambda|^2\Bigr].
\end{aligned}$$


If we omit the second sum, then by (3.33) and (3.34) we get

$$\begin{aligned}
\mathbb{E}\bigl[\|X-\widetilde X_N\|^2_{H^\nu(O)}\bigr] &\succeq \mathbb{E}\Bigl[\sum_{\lambda\in\nabla\setminus\widetilde\nabla} 2^{2|\lambda|\nu}|d_\lambda|^2\Bigr] \succeq \mathbb{E}\Bigl[\sum_{\lambda\in\nabla_{j_2}\setminus\widetilde\nabla} 2^{2|\lambda|\nu}|d_\lambda|^2\Bigr] \ge \mathbb{E}\Bigl[\sum_{k\in\nabla_{j_2}\setminus\widetilde\nabla} 2^{2j_2\nu}|d_{j_2,k}|^2\Bigr] \\
&\ge \mathbb{E}\Bigl[\#\bigl\{k\in\nabla_{j_2}\setminus\widetilde\nabla : 2^{2j_2\nu}|d_{j_2,k}|^2 > \delta_{j_2}^2\bigr\}\Bigr]\cdot \delta_{j_2}^2 \\
&\ge \mathbb{E}\bigl[M(j_2, 2^{j_2\nu}\sigma_{j_2}) - \#\widetilde\nabla\bigr]\cdot 2^{2j_2\nu}\sigma_{j_2}^2 = \Bigl(\mathbb{E}\bigl[M(j_2, 2^{j_2\nu}\sigma_{j_2})\bigr] - \mathbb{E}[\#\widetilde\nabla]\Bigr)\cdot 2^{2j_2\nu}\sigma_{j_2}^2,
\end{aligned}$$

so that, by inserting (3.1), (3.35), (3.36) and $\mathbb{E}[\#\widetilde\nabla] \le N \le \tfrac{c_1}{2}\,2^{j_1 d}$, we can conclude

$$\mathbb{E}\bigl[\|X-\widetilde X_N\|^2_{H^\nu(O)}\bigr] \succeq \bigl(c_1 2^{j_1 d} - N\bigr)\, 2^{2j_2\nu}\, j_2^{\gamma d}\, 2^{-\alpha j_2 d} \succeq j_1^{\gamma d}\, 2^{j_1 d + \frac{2j_1\nu}{1-\beta} - \frac{\alpha j_1 d}{1-\beta}} \asymp (\log_2 N)^{\gamma d}\, N^{-\frac{1}{1-\beta}\bigl(\alpha-1+\beta-\frac{2\nu}{d}\bigr)},$$

which yields (3.32).

Remark 3.26. For the proof of Theorem 3.25 it is essential to be able to compute the expected value of $M(j,\delta_j)$, i.e., the average number of coefficients on level $j$ which are larger than the threshold $\delta_j$. The distribution of this random variable can be derived explicitly solely due to the structure of $X$. Since the threshold $\delta_j = 2^{j\nu}\sigma_j \asymp 2^{j\nu}\, j^{\gamma d/2}\, 2^{-\alpha jd/2}$ decays with increasing level $j$, cf. (3.34), the growth of $\mathbb{E}[M(j,\delta_j)]$ is in compliance with Theorem 3.10.
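As an illustrative numerical check of this expectation formula (our own addition), $\mathbb{E}[M(j, 2^{j\nu}\sigma_j)] = c_1\, 2^{jd(1-\beta)}$ can be estimated by Monte Carlo simulation; the sketch below assumes the exact forms $\#\nabla_j = 2^{jd}$ and $\rho_j = 2^{-\beta jd}$ and uses that the scaling $2^{j\nu}\sigma_j$ cancels against the threshold $\delta_j = 2^{j\nu}\sigma_j$, so only the event $|Z_{j,k}| > 1$ matters:

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(1)

def mc_expected_M(j: int, d: int, beta: float, n_rep: int = 5000) -> float:
    """Monte Carlo estimate of E[M(j, 2^{j nu} sigma_j)], cf. (3.33) and (3.34).

    Assumes #nabla_j = 2^(jd) and rho_j = 2^(-beta*j*d) exactly; since the
    threshold equals the coefficient scaling, only |Z_{j,k}| > 1 is checked.
    """
    n_j = 2 ** (j * d)
    rho_j = 2.0 ** (-beta * j * d)
    counts = [
        np.count_nonzero((rng.random(n_j) < rho_j) & (np.abs(rng.standard_normal(n_j)) > 1.0))
        for _ in range(n_rep)
    ]
    return float(np.mean(counts))

j, d, beta = 6, 1, 0.5
c1 = 2.0 * (1.0 - norm.cdf(1.0))
print(mc_expected_M(j, d, beta), c1 * 2 ** (j * d * (1 - beta)))  # both approximately 2.54
```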

Remark 3.27. Observe that the upper bound in Theorem 3.23 for p= 2 coincides with the lower bound in Theorem 3.25.