Estimation of the occurring error terms and probabilities of exception sets

3.3 Estimation of the occurring error terms and probabilities of exception sets

We collect a series of auxiliary results before deriving the main results in the next section. We start with the probabilities occurring in Corollary 3.2.7, where we can proceed similarly to Gobet et al. [22]:

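Lemma 3.3.1. For any $i = 0, \dots, N-1$ and $d = 1, \dots, D$ it holds:

$$P\bigl([A^{1,L}_{0,i}]^c\bigr) \le 2E\left[K^L_{0,i}\exp\left(-\frac{L\Delta_i^{\beta}}{8K^L_{0,i}R^2(1+T)^2}\right)\right],$$

$$P\bigl([A^{1,L}_{d,i}]^c\bigr) \le 2E\left[K^L_{d,i}\exp\left(-\frac{L\Delta_i^{\beta+1}}{8K^L_{d,i}R^2(1+T)^2R_0^2}\right)\right],$$

and for $n = 2, 3, \dots$

$$P\bigl([A^{n,L}_{0,i}]^c\bigr) \le 2E\Biggl[\prod_{j=i+1}^{N-1}\Biggl(N_2\Biggl(\frac{\Delta_i^{\beta/2}}{12\sqrt{K^L_{0,i}TK^2\Delta_jN}},\,[P_{0,j}]_y,\,({}_\lambda X^{i}_{t_j},{}_\lambda X_{t_j})_{\lambda=1,\dots,L}\Biggr)\prod_{d=1}^{D}N_2\Biggl(\frac{\Delta_i^{\beta/2}}{12\sqrt{K^L_{0,i}TK^2\Delta_jND}},\,[P_{d,j}]_z,\,({}_\lambda X^{i}_{t_j},{}_\lambda X_{t_j})_{\lambda=1,\dots,L}\Biggr)\Biggr)\cdot K^L_{0,i}\exp\Biggl(-\frac{L\Delta_i^{\beta}}{72K^L_{0,i}R^2(1+T)^2}\Biggr)\Biggr],$$

$$P\bigl([A^{n,L}_{d,i}]^c\bigr) \le 2E\Biggl[\prod_{j=i+1}^{N-1}\Biggl(N_2\Biggl(\frac{\Delta_i^{(\beta+1)/2}}{12\sqrt{K^L_{d,i}TK^2\Delta_jNR_0^2}},\,[P_{0,j}]_y,\,({}_\lambda X^{i}_{t_j},{}_\lambda X_{t_j})_{\lambda=1,\dots,L}\Biggr)\prod_{e=1}^{D}N_2\Biggl(\frac{\Delta_i^{(\beta+1)/2}}{12\sqrt{K^L_{d,i}TK^2\Delta_jNDR_0^2}},\,[P_{e,j}]_z,\,({}_\lambda X^{i}_{t_j},{}_\lambda X_{t_j})_{\lambda=1,\dots,L}\Biggr)\Biggr)\cdot K^L_{d,i}\exp\Biggl(-\frac{L\Delta_i^{\beta+1}}{72K^L_{d,i}R^2(1+T)^2R_0^2}\Biggr)\Biggr].$$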

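Proof. The estimate for the first probability is the least complicated, since the occurring coefficients do not depend on random estimators. For any $i = 0, \dots, N-1$ we have

$$\begin{aligned}
P\bigl([A^{1,L}_{0,i}]^c\bigr) &= P\Biggl(\frac{1}{L}\sum_{\lambda=1}^{L}\bigl|(\tilde\alpha^{1,L}_{0,i}-\alpha^{1,L}_{0,i})^{\top}(p^{\lambda}_{0,i})^{\top}\bigr|^2 \ge \Delta_i^{\beta}\Biggr)\\
&= P\Biggl((\tilde\alpha^{1,L}_{0,i}-\alpha^{1,L}_{0,i})^{\top}\,\frac{1}{L}\sum_{\lambda=1}^{L}(p^{\lambda}_{0,i})^{\top}p^{\lambda}_{0,i}\,(\tilde\alpha^{1,L}_{0,i}-\alpha^{1,L}_{0,i}) \ge \Delta_i^{\beta}\Biggr)\\
&= P\Biggl(\sum_{k=1}^{K^L_{0,i}}\bigl|\tilde\alpha^{1,L}_{0,i,k}-\alpha^{1,L}_{0,i,k}\bigr|^2 \ge \Delta_i^{\beta}\Biggr)\\
&= E\Biggl[P_i^L\Biggl(\sum_{k=1}^{K^L_{0,i}}\bigl|\tilde\alpha^{1,L}_{0,i,k}-\alpha^{1,L}_{0,i,k}\bigr|^2 \ge \Delta_i^{\beta}\Biggr)\Biggr],
\end{aligned}$$

where we used $\frac{(B^L_{0,i})^{\top}B^L_{0,i}}{L} = \mathrm{Id}_{K^L_{0,i}}$. Due to the definition of the coefficients we have: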

$P_i^L$

which are independent of any other occurring random variables. The last equality is true since $p^{\lambda}_{0,i,k}$ is measurable with respect to $F_i^L$, the random variables $(\Delta{}_\lambda W_j, {}_\lambda X_{t_{j+1}})$ and $(\Delta{}_\lambda W_j, {}_\lambda X^{i}_{t_{j+1}})$, $j = i, \dots, N-1$

Now, Hoeffding's inequality (see the appendix) yields for any $k = 1, \dots, K^L_{0,i}$ the following estimate:

P_N^L,i


The second probability is slightly more involved. For any $d = 1, \dots, D$ and $i = 0, \dots, N-1$ the definition of the coefficients yields

$P([A^{1,L}_{d,i}]^c) = P$

Again, we insert a sequence of i.i.d. Bernoulli random variables denoted by $U_\lambda$ with $P(U_\lambda = 1) = \tfrac12$ and $P(U_\lambda = -1) = \tfrac12$. This does not change the probability, since $(\Delta{}_\lambda W_j, {}_\lambda X_{t_{j+1}})$ and $(\Delta{}_\lambda W_j, {}_\lambda X^{i}_{t_{j+1}})$ for $j = i, \dots, N-1$, conditioned on $F_i^L$, are identically distributed. Hence, we further consider:

$P_i^L$

yields, due to the independence of $U_\lambda$, for $k = 1, \dots, K^L_{d,i}$

$$E^{L,i}_N[H_{d,\lambda,k}] = 0$$

and

$$|H_{d,\lambda,k}| \le |p^{\lambda}_{d,i,k}| \cdot 2R(1+T)R_0\sqrt{\Delta_i}.$$

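Hoeffding's inequality implies for arbitrary $k = 1, \dots, K^L_{d,i}$ the following estimates:

$$\begin{aligned}
P_N^{L,i}\Biggl(\Biggl|\frac{1}{L}\sum_{\lambda=1}^{L}H_{d,\lambda,k}\Biggr| \ge \sqrt{\frac{\Delta_i^{\beta+2}}{K^L_{d,i}}}\Biggr)
&\le 2\exp\Biggl(-\frac{2L\Delta_i^{\beta+2}}{K^L_{d,i}\,\frac{1}{L}\sum_{\lambda=1}^{L}\bigl|4R(1+T)R_0\sqrt{\Delta_i}\,p^{\lambda}_{d,i,k}\bigr|^2}\Biggr)\\
&= 2\exp\Biggl(-\frac{L\Delta_i^{\beta+1}}{8K^L_{d,i}R^2(1+T)^2R_0^2\,\frac{1}{L}\sum_{\lambda=1}^{L}|p^{\lambda}_{d,i,k}|^2}\Biggr)\\
&= 2\exp\Biggl(-\frac{L\Delta_i^{\beta+1}}{8K^L_{d,i}R^2(1+T)^2R_0^2}\Biggr).
\end{aligned}$$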
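The way Hoeffding's inequality enters this estimate can be illustrated numerically. The following minimal Python sketch compares the two-sided Hoeffding bound for a mean of sign-weighted bounded terms with a Monte Carlo estimate of the same tail probability. The weights, the sample size and the threshold `t` are hypothetical choices made only for illustration; they are not quantities of the scheme.

```python
import math
import random


def hoeffding_bound(weights, t):
    """Two-sided Hoeffding bound for the mean of U_k * w_k with random signs U_k.

    Each summand lies in [-|w_k|, |w_k|], an interval of width 2|w_k|.
    """
    n = len(weights)
    mean_sq_width = sum((2.0 * abs(w)) ** 2 for w in weights) / n
    return min(1.0, 2.0 * math.exp(-2.0 * n * t * t / mean_sq_width))


def empirical_tail(weights, t, trials, rng):
    """Monte Carlo estimate of P(|mean of U_k * w_k| >= t)."""
    n = len(weights)
    hits = 0
    for _ in range(trials):
        s = sum(w if rng.random() < 0.5 else -w for w in weights)
        if abs(s / n) >= t:
            hits += 1
    return hits / trials


rng = random.Random(0)
weights = [rng.uniform(-1.0, 1.0) for _ in range(200)]  # hypothetical bounded data
t = 0.1                                                  # hypothetical threshold
bound = hoeffding_bound(weights, t)
freq = empirical_tail(weights, t, trials=2000, rng=rng)
print(f"empirical tail {freq:.4f}, Hoeffding bound {bound:.4f}")
```

The bound is distribution-free and therefore usually far from sharp; what matters in the proof is only its exponential decay in the sample size.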

Gathering the results yields

$$P\bigl([A^{1,L}_{d,i}]^c\bigr) \le 2E\left[K^L_{d,i}\exp\left(-\frac{L\Delta_i^{\beta+1}}{8K^L_{d,i}R^2(1+T)^2R_0^2}\right)\right].$$

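For $n = 2, 3, \dots$ we additionally use methods from nonparametric statistics. For a fixed time $i = 0, \dots, N-1$ in the grid we have:

$$\begin{aligned}
P\bigl([A^{n,L}_{0,i}]^c\bigr) &= P\Biggl(\frac{1}{L}\sum_{\lambda=1}^{L}\bigl|(\tilde\alpha^{n,L}_{0,i}-\alpha^{n,L}_{0,i})^{\top}(p^{\lambda}_{0,i})^{\top}\bigr|^2 \ge \Delta_i^{\beta}\Biggr)\\
&= P\Biggl((\tilde\alpha^{n,L}_{0,i}-\alpha^{n,L}_{0,i})^{\top}\,\frac{1}{L}\sum_{\lambda=1}^{L}(p^{\lambda}_{0,i})^{\top}p^{\lambda}_{0,i}\,(\tilde\alpha^{n,L}_{0,i}-\alpha^{n,L}_{0,i}) \ge \Delta_i^{\beta}\Biggr)\\
&= P\Biggl(\sum_{k=1}^{K^L_{0,i}}\bigl|\tilde\alpha^{n,L}_{0,i,k}-\alpha^{n,L}_{0,i,k}\bigr|^2 \ge \Delta_i^{\beta}\Biggr)\\
&= E\Biggl[P_i^L\Biggl(\sum_{k=1}^{K^L_{0,i}}\bigl|\tilde\alpha^{n,L}_{0,i,k}-\alpha^{n,L}_{0,i,k}\bigr|^2 \ge \Delta_i^{\beta}\Biggr)\Biggr].
\end{aligned}$$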

variables $U_\lambda$ cannot be inserted without difficulties as before. The way out of this problem is the introduction of covers: Due to $y^{n-1,L}_j \in [P_{0,j}]_y$ and $z^{d,n-1,L}_j \in [P_{d,j}]_z$ we obtain for any $k = 1, \dots, K^L_{0,i}$:

where again $U_\lambda$ are i.i.d. Bernoulli random variables, independent of all other random variables, with the same property as above. The last equality is true since $(\Delta{}_\lambda W_{j-1}, {}_\lambda X_{t_j})$ and $(\Delta{}_\lambda W_{j-1}, {}_\lambda X^{i}_{t_j})$ for $j = i+1, \dots, N$ conditioned

Without loss of generality we can assume that elements of $G_{0,j}$ are bounded by $C_y$ and elements of $G_{d,j}$ are bounded by $C_z^{\pi}$. Moreover, $G_{0,j}$ and $G_{d,j}$ depend on $({}_\lambda X^{i}_{t_j}, {}_\lambda X_{t_j})_{\lambda=1,\dots,L}$ but not on $U_\lambda$.

variables ${}_\lambda V$, the Lipschitz property of $f$ and Young's inequality further imply:

where in the last inequality we took the number of all combinations of functions from $G_{0,j}$ and $G_{d,j}$, $j = i+1, \dots, N-1$ and $d = 1, \dots, D$, and weighted them with the maximal probability. Applying Hoeffding's inequality in the usual way to the last conditional probability finally yields

$P([A^{n,L}_{0,i}]^c) \le 2E$

Combining the techniques from the estimate for $P([A^{1,L}_{d,i}]^c)$ and the estimate for $P([A^{n,L}_{0,i}]^c)$ concerning the statistical tools, we also obtain an upper bound for $P([A^{n,L}_{d,i}]^c)$. For fixed $i = 0, \dots, N-1$, $d = 1, \dots, D$ and

Due to the definition of the coefficients it holds:

$P_i^L$

Again, we cannot insert the Bernoulli random variables $U_\lambda$ at once and consequently introduce covers of function classes once more: Since $y^{n-1,L}_j \in [P_{0,j}]_y$ and $z^{e,n-1,L}_j \in [P_{e,j}]_z$ ($e = 1, \dots, D$), for any $k = 1, \dots, K^L_{d,i}$

which are independent of any other. The last equality is due to the conditionally identical distribution of $(\Delta{}_\lambda W_{t_j}, {}_\lambda X_{t_{j+1}})$ and $(\Delta{}_\lambda W_{t_j}, {}_\lambda X^{i}_{t_{j+1}})$ for $j = i, \dots, N-1$ conditioned on $F_i^L$. Again, we introduce covers

Without loss of generality we can again assume that elements of $G_{0,j}$ are bounded by $C_y$ and elements of $G_{e,j}$ are bounded by $C_z^{\pi}$. Moreover, $G_{0,j}$ and $G_{e,j}$ depend on $({}_\lambda X^{i}_{t_j}, {}_\lambda X_{t_j})_{\lambda=1,\dots,L}$, but not on $U_\lambda$. The Cauchy-Schwarz inequality, Young's inequality and the Lipschitz continuity of $f$ yield:

Thus, we can conclude

$P_i^L$

Because of the independence of $U_\lambda$ we obtain

$$E_N^{L,i}[H_{d,\lambda,k}] = 0 \quad\text{and}\quad |H_{d,\lambda,k}| \le 2R(1+T)R_0\sqrt{\Delta_i}\,|p^{\lambda}_{d,i,k}|.$$

Hoeffding's inequality implies the following estimate:

$P_N^{L,i}$

Thus, we finally obtain:

$P([A^{n,L}_{d,i}]^c) \le 2E$

The probabilities describing the exception set concerning the change from ghost sample to original sample are estimated next:


which can be seen by further conditioning on $U_\lambda$.

We now introduce a cover $G$ of the function class $[P_{0,j}]_y - y^{n-1}_j$ such that for any $\xi \in [P_{0,j}]_y - y^{n-1}_j$ there

We can assume that elements of $G$ are bounded by $2C_y$. Recall also that $G$ depends on the simulations $({}_\lambda X_{t_j}, {}_\lambda X^{i}_{t_j})_{\lambda=1,\dots,L}$ but not on the Bernoulli random variables $(U_\lambda)_{\lambda=1,\dots,L}$. Moreover, the random variable

Now, let $g \in G$ satisfy (3.28). Then, we obtain

and the first and fourth summand of the last expression can be estimated by:

$\sqrt{2}$

Since on the set we are interested in $A :=$

Thus, combining these results we obtain:

$P_j^{L,i}$

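Hoeffding's inequality implies

$$\begin{aligned}
&P_j^{L,i}\left(\frac{\frac{1}{L}\sum_{\lambda=1}^{L}U_\lambda\bigl\{|g({}_\lambda X^{i}_{t_j})|^2-|g({}_\lambda X_{t_j})|^2\bigr\}}{\sqrt{\frac{1}{L}\sum_{\lambda=1}^{L}|g({}_\lambda X_{t_j})|^2+|g({}_\lambda X^{i}_{t_j})|^2}} \ge \frac{1}{3}\Delta_j^{\beta/2}\right)\\
&\qquad\le 2\exp\left(-\frac{2L\Delta_j^{\beta}\,\frac{1}{L}\sum_{\lambda=1}^{L}|g({}_\lambda X_{t_j})|^2+|g({}_\lambda X^{i}_{t_j})|^2}{36\,\frac{1}{L}\sum_{\lambda=1}^{L}\bigl(|g({}_\lambda X^{i}_{t_j})|^2-|g({}_\lambda X_{t_j})|^2\bigr)^2}\right)\\
&\qquad= 2\exp\left(-\frac{L\Delta_j^{\beta}\sum_{\lambda=1}^{L}|g({}_\lambda X_{t_j})|^2+|g({}_\lambda X^{i}_{t_j})|^2}{18\sum_{\lambda=1}^{L}\bigl(|g({}_\lambda X^{i}_{t_j})|^2-|g({}_\lambda X_{t_j})|^2\bigr)^2}\right).
\end{aligned}$$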

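Moreover, because of

$$\sum_{\lambda=1}^{L}\bigl(|g({}_\lambda X^{i}_{t_j})|^2-|g({}_\lambda X_{t_j})|^2\bigr)^2 \le \sum_{\lambda=1}^{L}|g({}_\lambda X^{i}_{t_j})|^4+|g({}_\lambda X_{t_j})|^4 \le 4C_y^2\sum_{\lambda=1}^{L}|g({}_\lambda X^{i}_{t_j})|^2+|g({}_\lambda X_{t_j})|^2$$

we gain

$$2\exp\left(-\frac{L\Delta_j^{\beta}\sum_{\lambda=1}^{L}|g({}_\lambda X_{t_j})|^2+|g({}_\lambda X^{i}_{t_j})|^2}{18\sum_{\lambda=1}^{L}\bigl(|g({}_\lambda X^{i}_{t_j})|^2-|g({}_\lambda X_{t_j})|^2\bigr)^2}\right) \le 2\exp\left(-\frac{L\Delta_j^{\beta}}{72C_y^2}\right).$$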

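Thus, finally we can conclude

$$P\bigl([A^{n,L}_{y,j^{i}}]^c\bigr) \le 2\exp\left(-\frac{L\Delta_j^{\beta}}{72C_y^2}\right)E\left[N_2\left(\frac{\Delta_j^{\beta/2}}{3\sqrt{2}},\,[P_{0,j}]_y-y^{n-1}_j,\,({}_\lambda X_{t_j},{}_\lambda X^{i}_{t_j})_{\lambda=1,\dots,L}\right)\right].$$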

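For the probability concerning the $Z$-part we can copy large parts of the previous proof, only noting the higher dimension:

Lemma 3.3.3. For $n \in \mathbb{N}$, $i = 0, \dots, N-1$ and $j = i, \dots, N-1$ it holds:

$$P\bigl([A^{n,L}_{z,j^{i}}]^c\bigr) \le 2\exp\left(-\frac{L\Delta_j^{\beta}}{72D(C_z^{\pi})^2}\right)E\left[\prod_{d=1}^{D}N_2\left(\frac{\Delta_j^{\beta/2}}{3\sqrt{2D}},\,[P_{d,j}]_z-z^{d,n-1}_j,\,({}_\lambda X_{t_j},{}_\lambda X^{i}_{t_j})_{\lambda=1,\dots,L}\right)\right].$$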

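Proof. As before we define $(U_\lambda)_{\lambda=1,\dots,L}$ to be i.i.d. Bernoulli random variables independent of everything else, satisfying $P(U_\lambda = 1) = \tfrac12 = P(U_\lambda = -1)$. Furthermore, we define for fixed $i = 0, \dots, N-1$ and fixed $j = i, \dots, N-1$

$$\begin{aligned}
B_\lambda &= {}_\lambda X^{i}_{t_j} \ \text{and}\ B_{L+\lambda} = {}_\lambda X_{t_j}, && \text{if } U_\lambda = 1,\\
B_\lambda &= {}_\lambda X_{t_j} \ \text{and}\ B_{L+\lambda} = {}_\lambda X^{i}_{t_j}, && \text{if } U_\lambda = -1.
\end{aligned}$$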
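The effect of this swapping construction can be made explicit in a short Python sketch. It builds the variables $B_\lambda$ and $B_{L+\lambda}$ from two samples and random signs and checks that the difference of empirical means over the two halves equals the sign-weighted difference of the original samples, which is the form Hoeffding's inequality is applied to. The Gaussian samples, the sample size and the test function `g` are hypothetical stand-ins, not objects of the scheme.

```python
import random


def swap_samples(ghost, orig, signs):
    """Return (B_1..B_L, B_{L+1}..B_{2L}) built from the two samples and the signs U."""
    first, second = [], []
    for x_ghost, x_orig, u in zip(ghost, orig, signs):
        if u == 1:
            first.append(x_ghost)
            second.append(x_orig)
        else:
            first.append(x_orig)
            second.append(x_ghost)
    return first, second


def g(x):
    """Hypothetical bounded test function standing in for a cover element."""
    return max(-1.0, min(1.0, x))


rng = random.Random(2)
size = 500  # hypothetical sample size L
ghost = [rng.gauss(0.0, 1.0) for _ in range(size)]
orig = [rng.gauss(0.0, 1.0) for _ in range(size)]
signs = [rng.choice((1, -1)) for _ in range(size)]

first, second = swap_samples(ghost, orig, signs)
lhs = sum(g(b1) ** 2 - g(b2) ** 2 for b1, b2 in zip(first, second)) / size
rhs = sum(u * (g(xg) ** 2 - g(xo) ** 2) for u, xg, xo in zip(signs, ghost, orig)) / size
print(abs(lhs - rhs) < 1e-12)
```

Swapping only permutes the pooled sample, so any quantity that is symmetric in the two samples, such as the normalising sum of squares, is unaffected by the signs.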


As in the last proof we obtain

P_i^L

which again can be seen by further conditioning on $U_\lambda$.

We now introduce in total $D$ covers, that is $G_d$, $d = 1, \dots, D$, of $[P_{d,j}]_z - z^{d,n-1}_j$ such that for any

Since on the set we are interested in, $A :=$

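Hoeffding's inequality yields

$$\begin{aligned}
&P_j^{L,i}\left(\frac{\frac{1}{L}\sum_{\lambda=1}^{L}U_\lambda\sum_{d=1}^{D}\bigl\{|g_d({}_\lambda X^{i}_{t_j})|^2-|g_d({}_\lambda X_{t_j})|^2\bigr\}}{\sqrt{\frac{1}{L}\sum_{\lambda=1}^{L}\sum_{d=1}^{D}|g_d({}_\lambda X_{t_j})|^2+|g_d({}_\lambda X^{i}_{t_j})|^2}} \ge \frac{1}{3}\Delta_j^{\beta/2}\right)\\
&\qquad\le 2\exp\left(-\frac{2L\Delta_j^{\beta}\,\frac{1}{L}\sum_{\lambda=1}^{L}\sum_{d=1}^{D}|g_d({}_\lambda X_{t_j})|^2+|g_d({}_\lambda X^{i}_{t_j})|^2}{36\,\frac{1}{L}\sum_{\lambda=1}^{L}\Bigl(\sum_{d=1}^{D}|g_d({}_\lambda X^{i}_{t_j})|^2-|g_d({}_\lambda X_{t_j})|^2\Bigr)^2}\right)\\
&\qquad\le 2\exp\left(-\frac{L\Delta_j^{\beta}\sum_{\lambda=1}^{L}\sum_{d=1}^{D}|g_d({}_\lambda X_{t_j})|^2+|g_d({}_\lambda X^{i}_{t_j})|^2}{18D\sum_{\lambda=1}^{L}\sum_{d=1}^{D}\bigl(|g_d({}_\lambda X^{i}_{t_j})|^2-|g_d({}_\lambda X_{t_j})|^2\bigr)^2}\right).
\end{aligned}$$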

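Moreover, from

$$\sum_{\lambda=1}^{L}\sum_{d=1}^{D}\bigl(|g_d({}_\lambda X^{i}_{t_j})|^2-|g_d({}_\lambda X_{t_j})|^2\bigr)^2 \le \sum_{\lambda=1}^{L}\sum_{d=1}^{D}|g_d({}_\lambda X^{i}_{t_j})|^4+|g_d({}_\lambda X_{t_j})|^4 \le 4(C_z^{\pi})^2\sum_{\lambda=1}^{L}\sum_{d=1}^{D}|g_d({}_\lambda X^{i}_{t_j})|^2+|g_d({}_\lambda X_{t_j})|^2$$

we can derive

$$2\exp\left(-\frac{L\Delta_j^{\beta}\sum_{\lambda=1}^{L}\sum_{d=1}^{D}|g_d({}_\lambda X_{t_j})|^2+|g_d({}_\lambda X^{i}_{t_j})|^2}{18D\sum_{\lambda=1}^{L}\sum_{d=1}^{D}\bigl(|g_d({}_\lambda X^{i}_{t_j})|^2-|g_d({}_\lambda X_{t_j})|^2\bigr)^2}\right) \le 2\exp\left(-\frac{L\Delta_j^{\beta}}{72D(C_z^{\pi})^2}\right).$$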

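Gathering the results we can conclude

$$P\bigl([A^{n,L}_{z,j^{i}}]^c\bigr) \le 2\exp\left(-\frac{L\Delta_j^{\beta}}{72D(C_z^{\pi})^2}\right)E\left[\prod_{d=1}^{D}N_2\left(\frac{\Delta_j^{\beta/2}}{3\sqrt{2D}},\,[P_{d,j}]_z-z^{d,n-1}_j,\,({}_\lambda X_{t_j},{}_\lambda X^{i}_{t_j})_{\lambda=1,\dots,L}\right)\right].$$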

As the last exception set we consider a probability which does not appear in the Euler-type scheme of Gobet et al. [22] and can therefore be seen as typical for our kind of approximation:

Lemma 3.3.4. For any $n \in \mathbb{N}$ and $i = 1, \dots, N-1$ it holds

P([A^n,L_z ⁱ⁻¹^,Lⁱ]^c) ≤ E

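$$\Biggl[\prod_{j=i}^{N-1}\Biggl(N_2\Biggl(\frac{\Delta_i^{\beta+1}}{12\sqrt{TK^2\Delta_jN}},\,[P_{0,j}]_y,\,({}_\lambda X^{i-1}_{t_j},{}_\lambda X^{i}_{t_j})_{\lambda=1,\dots,L}\Biggr)\prod_{d=1}^{D}N_2\Biggl(\frac{\Delta_i^{\beta+1}}{12\sqrt{TK^2\Delta_jND}},\,[P_{d,j}]_z,\,({}_\lambda X^{i-1}_{t_j},{}_\lambda X^{i}_{t_j})_{\lambda=1,\dots,L}\Biggr)\Biggr)\Biggr]\cdot 2\exp\Biggl(-\frac{L\Delta_i^{2(\beta+1)}}{72R^2(1+T)^2}\Biggr).$$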


Chow and Teicher [13], Corollary 7.3.3 moreover implies

$E_i^{L,i-1}$

Due to the independence of all occurring Brownian increments, we know that for fixed $\xi_{d,j}$, $d = 0, \dots, D$, $j = i, \dots, N-1$, (3.30) can be written as

$$G(\Delta{}_\lambda W_0, \dots, \Delta{}_\lambda W_{i-2}, \Delta{}_\lambda W_{i-1})$$

for some Borel function $G$. Moreover, since the structure of term (3.31) for fixed $\xi_{d,j}$, $d = 0, \dots, D$ and $j = i, \dots, N-1$ is identical, we obtain that this term can be represented as

$$G(\Delta{}_\lambda W_0, \dots, \Delta{}_\lambda W_{i-2}, \Delta{}_\lambda W_{i-1}).$$

Hence, conditioned on $F_{i-1}^L$, (3.30) and (3.31) are identically distributed. Moreover, $y^{n}_i({}_\lambda X^{i-1}_{t_i})$ and $y^{n}_i({}_\lambda X^{i}_{t_i})$ satisfy this property as well, such that in principle we can proceed similarly to the last proof.

That means, we again define i.i.d. random variables $U_\lambda$, $\lambda = 1, \dots, L$, with $P(U_\lambda = 1) = \tfrac12$ and $P(U_\lambda = -1) = \tfrac12$, which are independent of all other random variables.

Furthermore, we introduce

Since the random variables under the square root given $F_{i-1}^L$ are identically distributed we obtain the equality

for which the analogous properties and notations are used as in the former proofs. To ease notation we define

Let $g_{0,j}$ and $g_{d,j}$ satisfy (3.32) and (3.33), respectively. Then, as in the last proofs, we obtain

which is again due to the Lipschitz continuity of $f$, Young's inequality and the Cauchy-Schwarz inequality. Consequently, it holds

where we used the definition

$V_\lambda :=$

holds, Hoeffding's inequality implies

$P_N^{L,i-1}$

which is due to

$\sum_{\lambda=1}^{L}$

which is the assertion.

We now show that all occurring covering numbers are finite, which implies that the above probabilities, for fixed $|\pi|$ and $K_{d,i}$, $d = 0, \dots, D$, $i = 0, \dots, N-1$, converge to 0 as $L \to \infty$.

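Lemma 3.3.5. For $n \in \mathbb{N}$, $\varepsilon < 1$, $i = 0, \dots, N-1$, $j = i, \dots, N-1$ and $d = 1, \dots, D$ it holds:

$$N_2\Bigl(\varepsilon,\,[P_{0,j}]_y-y^{n}_j,\,({}_\lambda X_{t_j},{}_\lambda X^{i}_{t_j})_{\lambda=1,\dots,L}\Bigr) \le 3\left(\frac{8eC_y^2}{\varepsilon^2}\ln\left(\frac{12eC_y^2}{\varepsilon^2}\right)\right)^{K_{0,j}+1}, \tag{3.34}$$

$$N_2\Bigl(\varepsilon,\,[P_{d,j}]_z-z^{d,n}_j,\,({}_\lambda X_{t_j},{}_\lambda X^{i}_{t_j})_{\lambda=1,\dots,L}\Bigr) \le 3\left(\frac{8e(C_z^{\pi})^2}{\varepsilon^2}\ln\left(\frac{12e(C_z^{\pi})^2}{\varepsilon^2}\right)\right)^{K_{d,j}+1}. \tag{3.35}$$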
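Because the right-hand side of (3.34) does not depend on $L$, it can be combined with an exponential factor of the form $2\exp(-\text{rate}\cdot L)$ to see how large $L$ has to be for a product of both to fall below a given level. The following minimal Python sketch evaluates the logarithm of the bound (to avoid overflow) and solves for such an $L$. The function names, the parameter `rate` (standing for a constant such as $\Delta_j^{\beta}/(72C_y^2)$), the level `delta` and all numerical values are hypothetical and chosen only for illustration.

```python
import math


def log_covering_bound(eps, c, k):
    """Logarithm of 3 * (8 e c^2 / eps^2 * ln(12 e c^2 / eps^2)) ** (k + 1)."""
    base = 8.0 * math.e * c * c / (eps * eps)
    inner_arg = 12.0 * math.e * c * c / (eps * eps)
    if inner_arg <= 1.0:
        raise ValueError("eps is too large for the bound to be meaningful")
    return math.log(3.0) + (k + 1) * math.log(base * math.log(inner_arg))


def min_samples(eps, c, k, rate, delta):
    """Smallest L with covering_bound * 2 * exp(-rate * L) <= delta."""
    log_needed = log_covering_bound(eps, c, k) + math.log(2.0 / delta)
    return max(0, math.ceil(log_needed / rate))


# Hypothetical values: radius 0.5, bound C = 1, three basis functions.
eps, c, k = 0.5, 1.0, 3
rate, delta = 0.01, 0.05
required = min_samples(eps, c, k, rate, delta)
print(required)
```

The required sample size grows only linearly in the number of basis functions, since the covering bound enters through its logarithm.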

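Proof. We can proceed word for word as Lemor [33] and give the proof only for the sake of completeness.

Let $i = 0, \dots, N-1$ be fixed. Lemma 9.2 of Györfi et al. [26] then implies for any $j = i, \dots, N-1$:

$$\begin{aligned}
N_2\Bigl(\varepsilon,\,[P_{0,j}]_y-y^{n}_j,\,({}_\lambda X_{t_j},{}_\lambda X^{i}_{t_j})_{\lambda=1,\dots,L}\Bigr)
&= N_2\Bigl(\varepsilon,\,[P_{0,j}]_y,\,({}_\lambda X_{t_j},{}_\lambda X^{i}_{t_j})_{\lambda=1,\dots,L}\Bigr)\\
&= N_2\Bigl(\varepsilon,\,[P_{0,j}]_y+C_y,\,({}_\lambda X_{t_j},{}_\lambda X^{i}_{t_j})_{\lambda=1,\dots,L}\Bigr)\\
&\le M_2\Bigl(\varepsilon,\,[P_{0,j}]_y+C_y,\,({}_\lambda X_{t_j},{}_\lambda X^{i}_{t_j})_{\lambda=1,\dots,L}\Bigr),
\end{aligned}$$

where $M_2\bigl(\varepsilon,\,[P_{0,j}]_y+C_y,\,({}_\lambda X_{t_j},{}_\lambda X^{i}_{t_j})_{\lambda=1,\dots,L}\bigr)$ denotes the $\varepsilon$-packing number of $[P_{0,j}]_y+C_y$ on the random variables $({}_\lambda X_{t_j},{}_\lambda X^{i}_{t_j})_{\lambda=1,\dots,L}$. For a definition of packing numbers see Appendix A.3.2.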

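Now, $[P_{0,j}]_y+C_y$ is a class of positive functions bounded by $2C_y$. Since usually $0 < \varepsilon < 1 < \frac{C_y}{4}$ holds and the Vapnik-Chervonenkis dimension of the subgraphs $V_{([P_{0,j}]_y+C_y)^+}$ is larger than 2 (for both terms we refer to Appendix A.3.3), we can apply Theorem 9.4 of Györfi et al. [26] and obtain

$$\begin{aligned}
M_2\Bigl(\varepsilon,\,[P_{0,j}]_y+C_y,\,({}_\lambda X_{t_j},{}_\lambda X^{i}_{t_j})_{\lambda=1,\dots,L}\Bigr)
&\le 3\left(\frac{2e(2C_y)^2}{\varepsilon^2}\ln\left(\frac{3e(2C_y)^2}{\varepsilon^2}\right)\right)^{V_{([P_{0,j}]_y+C_y)^+}}\\
&= 3\left(\frac{8eC_y^2}{\varepsilon^2}\ln\left(\frac{12eC_y^2}{\varepsilon^2}\right)\right)^{V_{([P_{0,j}]_y+C_y)^+}}.
\end{aligned}$$

Hence, it is left to bound the exponent. To do so we first show $V_{([P_{0,j}]_y+C_y)^+} \le V_{([P_{0,j}]_y)^+}$: If $V_{([P_{0,j}]_y+C_y)^+} = \infty$ we consider in each case a fixed subset in $\mathbb{R}^2$: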

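$$\{(z_1,t_1), \dots, (z_m,t_m)\}.$$

For any $m \in \mathbb{N}$ and any subset $I$ of $\{1, \dots, m\}$ there is in this case, by the definition of the Vapnik-Chervonenkis dimension, a function $g \in [P_{0,j}]_y+C_y$ such that

$$g(z_k) \ge t_k \ \text{for } k \in I \quad\text{and}\quad g(z_k) < t_k \ \text{for } k \notin I.$$

This implies that for any $m \in \mathbb{N}$ and any subset $I$ of $\{1, \dots, m\}$ there is $g \in [P_{0,j}]_y$ satisfying

$$g(z_k) \ge t_k - C_y \ \text{for } k \in I \quad\text{and}\quad g(z_k) < t_k - C_y \ \text{for } k \notin I,$$

and we can conclude $V_{([P_{0,j}]_y)^+} = \infty$.

If $V_{([P_{0,j}]_y+C_y)^+} = m$ for some $m \in \mathbb{N}$, the above argumentation yields $V_{([P_{0,j}]_y)^+} \ge m$.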
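The shift argument can be checked by brute force on a small example. The following Python sketch uses a hypothetical finite family of affine functions and a hypothetical integer shift (integers keep all comparisons exact), and verifies on a small grid that a point set is shattered by the subgraphs of the shifted class exactly when the point set with thresholds lowered by the shift is shattered by the subgraphs of the original class.

```python
from itertools import combinations


def shatters(functions, points):
    """True if the sets {(z, t): t <= g(z)} pick out every subset of the points."""
    patterns = {tuple(g(z) >= t for z, t in points) for g in functions}
    return len(patterns) == 2 ** len(points)


C_Y = 2  # hypothetical shift constant
# Hypothetical finite function class: affine maps with small integer coefficients.
functions = [lambda z, a=a, b=b: a * z + b for a in (-1, 0, 1) for b in (-1, 0, 1)]
shifted = [lambda z, g=g: g(z) + C_Y for g in functions]
grid = [(z, t) for z in (0, 1, 2) for t in (0, 1, 2, 3)]

# Shifting the class by C_Y corresponds to lowering every threshold by C_Y.
agree = all(
    shatters(shifted, pts) == shatters(functions, [(z, t - C_Y) for z, t in pts])
    for m in (1, 2)
    for pts in combinations(grid, m)
)
print(agree)
```

The check only covers point sets of size one and two taken from a finite grid, so it illustrates the correspondence of shattered sets rather than computing a Vapnik-Chervonenkis dimension.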

In the next step we show $V_{([P_{0,j}]_y)^+} \le V_{(P_{0,j})^+}$:

If $V_{([P_{0,j}]_y)^+} = \infty$ we argue as follows: We again consider a subset of $\mathbb{R}^2$,

$$\{(z_1,t_1), \dots, (z_m,t_m)\},$$

which is now shattered by $([P_{0,j}]_y)^+$, see Definition A.3.4 in the appendix for the meaning of the so-called shatter coefficients. As required, for any $m \in \mathbb{N}$ and any subset $I$ of $\{1, \dots, m\}$ there is a function $g \in P_{0,j}$ satisfying

$$[g]_y(z_k) < t_k \ \text{for } k \in I \quad\text{and}\quad [g]_y(z_k) \ge t_k \ \text{for } k \notin I.$$

It is enough to show

$$g(z_k) \le [g]_y(z_k) \ \text{for } k \in I \quad\text{and}\quad g(z_k) \ge [g]_y(z_k) \ \text{for } k \notin I,$$

to find an element of $(P_{0,j})^+$ which also identifies $I$, and consequently also $V_{(P_{0,j})^+} = \infty$.

Assume there is $k \in I$ with $g(z_k) > [g]_y(z_k)$. Then, necessarily $g(z_k) > C_y$ and hence $[g]_y(z_k) = C_y$ and $t_k > C_y$. Thus, it is impossible to find a function in $[P_{0,j}]_y$ identifying the complement of $k$ in $\{1, \dots, m\}$: For if there was such a $g'$, it must hold $[g']_y(z_k) \ge t_k > C_y$ and we obtain a contradiction. Consequently, it holds $g(z_k) \le [g]_y(z_k)$ for $k \in I$.

On the other hand, assume the existence of $k \notin I$ satisfying $g(z_k) < [g]_y(z_k)$. Then necessarily $g(z_k) < -C_y$ holds and moreover $[g]_y(z_k) = -C_y$ and $t_k \le -C_y$ are valid. Hence, it is not possible to find a function $g' \in [P_{0,j}]_y$ identifying the set $\{(z_k,t_k)\}$. In that case, we would have $[g']_y(z_k) < t_k \le -C_y$, also leading to a contradiction.

If $V_{([P_{0,j}]_y)^+} = m$ for some $m \in \mathbb{N}$ holds, the above argumentation implies $V_{(P_{0,j})^+} \ge m$.

Theorem 9.5 and the remark in the upper part of p. 152 in Györfi et al. [26] now yield $V_{(P_{0,j})^+} \le K_{0,j}+1$.
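The truncation step can likewise be checked by brute force. The Python sketch below uses a hypothetical finite family of affine functions, a hypothetical truncation level and a small grid of candidate points, and verifies that every point set shattered by the subgraphs of the truncated functions is also shattered by the subgraphs of the untruncated ones. It uses the convention $g(z) \ge t$ for membership in a subgraph.

```python
from itertools import combinations


def shatters(functions, points):
    """True if the sets {(z, t): t <= g(z)} pick out every subset of the points."""
    patterns = {tuple(g(z) >= t for z, t in points) for g in functions}
    return len(patterns) == 2 ** len(points)


def truncate(g, bound):
    """Return the function z -> g(z) clipped to the interval [-bound, bound]."""
    return lambda z: max(-bound, min(bound, g(z)))


C_Y = 1  # hypothetical truncation level
coeffs = (-2, -1, 0, 1, 2)
functions = [lambda z, a=a, b=b: a * z + b for a in coeffs for b in coeffs]
truncated = [truncate(g, C_Y) for g in functions]
grid = [(z, t) for z in (0, 1, 2) for t in (-2, -1, 0, 1, 2)]

# Every small point set shattered after truncation is shattered before it as well.
implied = all(
    shatters(functions, pts)
    for m in (1, 2)
    for pts in combinations(grid, m)
    if shatters(truncated, pts)
)
print(implied)
```

The converse fails in general: a point with threshold above the truncation level can be picked out by the original class but never by the truncated one, which is why only an inequality between the two dimensions is claimed.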

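Thus, finally we obtain:

$$N_2\Bigl(\varepsilon,\,[P_{0,j}]_y-y^{n}_j,\,({}_\lambda X_{t_j},{}_\lambda X^{i}_{t_j})_{\lambda=1,\dots,L}\Bigr) \le 3\left(\frac{8eC_y^2}{\varepsilon^2}\ln\left(\frac{12eC_y^2}{\varepsilon^2}\right)\right)^{K_{0,j}+1}.$$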

The estimate for the second covering number can be derived analogously, where we now have to consider subgraphs in $\mathbb{R}^{D+1}$.

Remark 3.3.6. Estimates (3.34) and (3.35) also hold for the covering numbers occurring in Lemma 3.3.1 and 3.3.4. For the former this is obvious. The second group of covering numbers, conditioned on $F_{i-1}^L$, has the same structural properties as those covering numbers we explicitly examined in the above lemma.

The tower property of conditional expectations then completes the proof.

We now turn to the error terms, which can be treated analogously to the error terms in Gobet et al. [22]. At first, we consider the terms containing typical projection errors.

Lemma 3.3.7. For $n \in \mathbb{N}$, $i = 0, \dots, N-1$ and $d = 1, \dots, D$ it holds

$$T_{1,i}^{n,L} \le R^2(1+T)^2\frac{E[K^L_{0,i}]}{L} + \inf_{\alpha}E\Bigl[\bigl|y^{n}_i(X_{t_i})-\alpha\cdot p_{0,i}(X_{t_i})\bigr|^2\Bigr],$$

$$T_{3,d,i}^{n,L} \le R^2(1+T)^2\frac{E[K^L_{d,i}]}{L\Delta_i} + \frac{1}{\Delta_i}\inf_{\alpha}E\Bigl[\bigl|\Delta_i z^{d,n}_i(X_{t_i})-\alpha\cdot p_{d,i}(X_{t_i})\bigr|^2\Bigr].$$

which is due to the measurability of the involved random variables and Jensen's inequality.

We now apply a conclusion of Theorem 11.1 in Györfi et al. [26], also used by Lemor [33] (see Corollary A.4.2 in the appendix), and obtain

$T_{1,i}^{n,L} \le \sup$

dimensional subspace of $P_{0,i}$. The term containing the variance can be estimated using the boundedness of $y^{n}_i$ as follows:

Analogously, we can derive the estimate for $T_{3,d,i}^{n,L}$:

$T_{3,d,i}^{n,L} = E$

Again, the conclusion of Theorem 11.1 in Györfi et al. [26] yields

$T_{3,d,i}^{n,L} \le \sup$

Here, the variance term can be further bounded by

Var[√

The last remaining terms yield estimates which already appeared in the lemma above:

Lemma 3.3.8. For any $n \in \mathbb{N}$, $i = 0, \dots, N-1$ and $d = 1, \dots, D$ it holds

$L = \mathrm{Id}$. Furthermore, we define

$V :=$

This yields the representation

$\tilde\alpha_{0,i}^{n,L} = 1$

independent from each other, it holds

$E^L$

Consequently, the entries of the matrix $(V-E^L[V])(V-E^L[V])^{\top}$ which lie outside of the main diagonal have conditional expectation 0 (with respect to $F^L$). This would be wrong if we had not used the ghost sample. For the terms on the main diagonal we obtain, because of the boundedness of $\varphi$ and $f$:

$E^L[$

Gathering the results yields:

$T_{2,i}^{n,L} = E$

where the inequality is due to (3.36) and $\operatorname{tr}(A)$ denotes the trace of a matrix $A$. Analogously, we can proceed for $T_{4,d,i}^{n,L}$, where now $B_{d,i}^L =$

$L = \mathrm{Id}$. Moreover, we define

$V :=$

From the document: A Picard-type Iteration for Backward Stochastic Differential Equations: Convergence and Importance Sampling (pages 57-83)