
Here we iteratively applied Lemma 2.1.6 and the last estimate is due to Lemma 2.1.5.

The claim now follows by induction. For $n=1$ it is true by Lemma 2.1.5. Now, suppose it is valid for some $(n-1)\in\mathbb{N}$; then

The first term is finite by the induction hypothesis, the second one can be estimated with the above calculation. For the $Z$-part we can proceed analogously.

2.2 Least-squares Monte Carlo

To get a fully implementable algorithm we have to approximate the conditional expectations by some estimator. In this section we describe a simulation-based least-squares Monte Carlo estimator and prove its convergence. Recall that the least-squares method can be applied to estimate the conditional expectation of a square-integrable random variable, see e.g. Carrière [11] or Longstaff and Schwartz [34]. However, we cannot guarantee that the processes $(Y^n,Z^n)$ are square integrable in general under the measure $P$.

Therefore we cannot apply the least-squares approach directly to $(Y^n,Z^n)$, but work with $(\Psi^0 Y^n,\Psi^0 Z^n)$ instead.

As explained above, our remaining task is to estimate the conditional expectations $Y^n_{t_i} = E[\,\cdots\,|\,\mathcal{F}_{t_i}]$ and $Z^n_{d,t_i} = E[\,\cdots\,|\,\mathcal{F}_{t_i}]$, i.e. conditional expectations of square-integrable random variables of the form $\Psi^0_{t_i}V$.

Consequently, $E[\Psi^0_{t_i}V\,|\,\mathcal{F}_{t_i}]$ is the orthogonal projection of $\Psi^0_{t_i}V$ on the space $L^2(\mathcal{G}_i)$, where $\mathcal{G}_i$ denotes the $\sigma$-field generated by the random variables of the form $\Psi^0_{t_i}v(X_{t_i})$ for deterministic and measurable functions $v$.

We now replace this projection by a projection on a finite-dimensional subspace. To do so, we choose, for each time partition point, $D+1$ sets of basis functions

$\{p_{0,i,1}(\cdot),\ldots,p_{0,i,K_{0,i}}(\cdot)\}$ for the estimation of $Y^n_{t_i}$ and $\{p_{d,i,1}(\cdot),\ldots,p_{d,i,K_{d,i}}(\cdot)\}$ for the estimation of $Z^n_{d,t_i}$.


We assume that
\[ \eta_{d,i,k} := \Psi^0_{t_i}\,p_{d,i,k}(X_{t_i}) \]
satisfy $E[|\eta_{d,i,k}|^2] < \infty$ for every $d=0,\ldots,D$, $i=0,\ldots,N-1$ and $k=1,\ldots,K_{d,i}$, and that the vectors $(\eta_{d,i,1},\ldots,\eta_{d,i,K_{d,i}})$ are linearly independent for every $d=0,\ldots,D$, $i=0,\ldots,N-1$. Now, we define $\Lambda_{d,i} = \operatorname{span}(\eta_{d,i,k})$ and denote by $P_{d,i}$ the orthogonal (in the $L^2$-sense) projection on $\Lambda_{d,i}$. As these spaces are finite-dimensional, there are coefficients $\alpha_{d,i,k}(V)$ such that
\[ P_{d,i}[\Psi^0_{t_i}V] = \sum_{k=1}^{K_{d,i}} \alpha_{d,i,k}(V)\,\Psi^0_{t_i}\,p_{d,i,k}(X_{t_i}). \tag{2.8} \]

The inner-product matrices associated to the chosen bases are
\[ B_{d,i} = \Big(E\big[\eta_{d,i,k}\,\eta_{d,i,l}\big]\Big)_{k,l=1,\ldots,K_{d,i}}. \tag{2.9} \]

Hence, we obtain as coefficients
\[ \alpha_{d,i}(V) = (B_{d,i})^{-1} E[\eta_{d,i}\,V], \tag{2.10} \]
where $\eta_{d,i} = (\eta_{d,i,1},\ldots,\eta_{d,i,K_{d,i}})^\top$ and $\alpha_{d,i}(V) = (\alpha_{d,i,1}(V),\ldots,\alpha_{d,i,K_{d,i}}(V))^\top$. Finally, the corresponding estimator for $E[V\,|\,\mathcal{F}_{t_i}] = E[V\,|\,X_{t_i}]$, given the basis $\{p_{d,i,1}(\cdot),\ldots,p_{d,i,K_{d,i}}(\cdot)\}$, is
\[ \sum_{k=1}^{K_{d,i}} \alpha_{d,i,k}(V)\,p_{d,i,k}(X_{t_i}). \]
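For concreteness, the following sketch (not from the thesis; the data, the basis choice and all names are illustrative assumptions) shows the structure behind (2.8)–(2.10) when the expectations are replaced by sample averages: the weighted basis $\eta_{d,i,k} = \Psi^0_{t_i}p_{d,i,k}(X_{t_i})$ enters the normal equations, while the conditional-expectation estimator is read off in the unweighted basis.

```python
import numpy as np

rng = np.random.default_rng(seed=0)

# Hypothetical samples standing in for one time point t_i:
# X ~ the state X_{t_i}, Psi ~ the weight Psi^0_{t_i},
# U ~ the square-integrable random variable to be projected (Psi^0_{t_i} V in the text).
L = 100_000
X = rng.normal(size=L)
Psi = np.exp(0.2 * rng.normal(size=L) - 0.02)
U = Psi * (X**2 + rng.normal(size=L))

# Illustrative polynomial basis p_k(x) = x^(k-1), k = 1, ..., K.
K = 4
p = np.vander(X, K, increasing=True)     # shape (L, K)
eta = Psi[:, None] * p                   # eta_k = Psi^0_{t_i} p_k(X_{t_i}) per sample

# Empirical analogues of (2.9) and of the normal equations behind (2.10).
B = eta.T @ eta / L                      # approximates E[eta_k eta_l]
rhs = eta.T @ U / L                      # approximates E[eta_k U]
alpha = np.linalg.solve(B, rhs)          # coefficients alpha_k

# Projection of U onto span(eta_k) and the induced conditional-expectation estimator.
proj_U = eta @ alpha                     # approximates P_{d,i}[U]
cond_exp_V = p @ alpha                   # sum_k alpha_k p_k(X_{t_i})
```

The same structure reappears below when the expectations in (2.9) and (2.10) are replaced by their simulation-based counterparts.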

Thanks to Theorem 2.1.1 and Corollary 2.1.7 we can apply this machinery for estimating $Y^n_{t_i}$ and $Z^n_{d,t_i}$. As estimators for these quantities we define
\begin{align*}
\widehat Y^n_{t_i} &= (\Psi^0_{t_i})^{-1} P_{0,i}\Big[\Psi^0_{t_N}\phi(X_{t_N}) + \sum_{j=i}^{N-1}\Psi^0_{t_j}\,f(t_j,S_{t_j},\widehat Y^{n-1}_{t_j},\widehat Z^{n-1}_{t_j})\,\Delta_j\Big] = \sum_{k=1}^{K_{0,i}}\alpha^n_{0,i,k}\,p_{0,i,k}(X_{t_i}),\\
\widehat Z^n_{d,t_i} &= (\Psi^0_{t_i})^{-1} P_{d,i}\Big[\frac{\Delta W^h_{d,i}}{\Delta_i}\Big(\Psi^0_{t_N}\phi(X_{t_N}) + \sum_{j=i+1}^{N-1}\Psi^0_{t_j}\,f(t_j,S_{t_j},\widehat Y^{n-1}_{t_j},\widehat Z^{n-1}_{t_j})\,\Delta_j\Big)\Big] = \sum_{k=1}^{K_{d,i}}\alpha^n_{d,i,k}\,p_{d,i,k}(X_{t_i}),
\end{align*}
where
\[ \alpha^n_{0,i} = (B_{0,i})^{-1} E\Big[\eta_{0,i}\Big(\Psi^0_{t_N}\phi(X_{t_N}) + \sum_{j=i}^{N-1}\Psi^0_{t_j}\,f(t_j,S_{t_j},\widehat Y^{n-1}_{t_j},\widehat Z^{n-1}_{t_j})\,\Delta_j\Big)\Big], \tag{2.11} \]
and for $d=1,\ldots,D$
\[ \alpha^n_{d,i} = (B_{d,i})^{-1} E\Big[\eta_{d,i}\,\frac{\Delta W^h_{d,i}}{\Delta_i}\Big(\Psi^0_{t_N}\phi(X_{t_N}) + \sum_{j=i+1}^{N-1}\Psi^0_{t_j}\,f(t_j,S_{t_j},\widehat Y^{n-1}_{t_j},\widehat Z^{n-1}_{t_j})\,\Delta_j\Big)\Big], \tag{2.12} \]
initialized at $(\widehat Y^0,\widehat Z^0) = 0$.
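The weights (2.11)–(2.12) differ only in the lower summation index and in the additional factor $\Delta W^h_{d,i}/\Delta_i$; both regress the same weighted target, namely the terminal value $\Psi^0_{t_N}\phi(X_{t_N})$ plus the accumulated weighted driver. A minimal sketch of assembling these per-path targets is given below; the array layout and the helper name are assumptions for illustration only.

```python
import numpy as np

def picard_targets(Psi, phi_T, f_vals, dt, i):
    """Per-path regression targets entering (2.11)/(2.12) at time index i.

    Psi    : array (L, N+1), weights Psi^0_{t_j} along L simulated paths
    phi_T  : array (L,), terminal values phi(X_{t_N})
    f_vals : array (L, N), driver values f(t_j, S_{t_j}, Yhat^{n-1}_{t_j}, Zhat^{n-1}_{t_j})
    dt     : array (N,), time increments Delta_j
    i      : time index

    Returns the Y-target (sum from j = i) and the Z-target (sum from j = i+1);
    the latter is still to be multiplied by Delta W^h_{d,i} / Delta_i before regressing.
    """
    weighted_terminal = Psi[:, -1] * phi_T
    weighted_driver = Psi[:, :-1] * f_vals * dt       # Psi^0_{t_j} f(...) Delta_j
    y_target = weighted_terminal + weighted_driver[:, i:].sum(axis=1)
    z_target = weighted_terminal + weighted_driver[:, i + 1:].sum(axis=1)
    return y_target, z_target
```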


Remark 2.2.1. Note that Assumption A 7 and Theorem 2.2.2 below guarantee that the weights in (2.11)–(2.12) are finite.

In the following, we analyze the error which results from the approximation of $(\Psi^0_{t_i}Y^n_{t_i},\Psi^0_{t_i}Z^n_{t_i})$ by $(\Psi^0_{t_i}\widehat Y^n_{t_i},\Psi^0_{t_i}\widehat Z^n_{t_i})$. Analogously to Bender and Denk [2] this will be done in terms of the projection errors
\[ |\Psi^0_{t_i}Y^n_{t_i} - P_{0,i}(\Psi^0_{t_i}Y^n_{t_i})| \quad\text{and}\quad |\Psi^0_{t_i}Z^n_{d,t_i} - P_{d,i}(\Psi^0_{t_i}Z^n_{d,t_i})|. \]
We extend their Theorem 11 (which corresponds to the case $h=0$), reflecting the advantage of the Picard-type scheme: the error induced by the approximation of the conditional expectations neither explodes when the number of time steps tends to infinity nor blows up when the number of iterations grows. We simply obtain that the $L^2$-error is bounded by a constant times the worst $L^2$-projection error occurring during the iterations.

Theorem 2.2.2. There is a constant $C$ depending on the data and the bound of $h$ such that
\[ \max_{0\le i\le N} E\big[\,\cdots\,\big] \;\le\; C\,\cdots \]
for sufficiently small $|\pi|$.

Proof. Define $\cdots$. Due to the orthogonality of the projection we also have
\[ E\big[\,\cdots\,\big] \]
and the analogous equation holds for $Z^n_{d,t_i}$. Consequently, we get for any $i=0,\ldots,N$,
\[ \cdots \]
where we used in the last step that nontrivial orthogonal projections have norm 1. In the same way we get


Multiplying these inequalities with the weights $\lambda_i$, satisfying $\lambda_0 = 1$ and $\lambda_i = (1+\Gamma\Delta_{i-1})\lambda_{i-1}$ for some $\Gamma>0$ to be specified later, we obtain
\[ \max_{0\le i\le N}\lambda_i E\big[\,\cdots\,\big] \le \cdots \]
Putting estimates (2.13)–(2.14) together we obtain
\[ \max_{0\le i\le N}\lambda_i E\big[\,\cdots\,\big] \le \cdots \]
Iterating this inequality yields
\[ \max_{0\le i\le N} E\big[\,\cdots\,\big] \le \cdots \]
Hence, the claim finally follows if $|\pi|$ is small enough.

Remark 2.2.3. The proof of the above theorem only made use of the fact that nontrivial orthogonal projections have norm 1. Hence, it holds for orthogonal projections on any, possibly infinite-dimensional, subspaces $\Lambda_{d,i}\subset L^2(\mathcal{G}_i)$.


The final approximation step of our algorithm replaces the expectations in (2.9) and (2.10) by their simulation-based counterparts, i.e. we assume that we have a number of $L \ge \max_{d,i}\{K_{d,i}\}$ independent draws from $(\Delta W^h_i, X_{t_i}, \phi(X_{t_N}), \Psi^0_{t_i}, \eta_{d,i})$, which we denote by $({}_\lambda\Delta W^h_i, {}_\lambda X_{t_i}, \phi({}_\lambda X_{t_N}), {}_\lambda\Psi^0_{t_i}, {}_\lambda\eta_{d,i})$ for $\lambda=1,\ldots,L$.

The column vectors of these copies are denoted by $(\Delta\mathbf{W}^h_i, \mathbf{X}_{t_i}, \boldsymbol{\phi}(X_{t_N}), \boldsymbol{\Psi}^0_{t_i}, \boldsymbol{\eta}_{d,i})$, e.g. $\mathbf{X}_{t_i} = ({}_1X_{t_i},\ldots,{}_LX_{t_i})^\top$. Define

\[ A^L_{d,i} := \frac{1}{\sqrt L}\,\big({}_\lambda\eta_{d,i,k}\big)_{\lambda=1,\ldots,L,\;k=1,\ldots,K_{d,i}}, \qquad d=0,\ldots,D, \]
so that
\[ B^L_{d,i} := (A^L_{d,i})^\top A^L_{d,i} = \frac{1}{L}\Big(\sum_{\lambda=1}^L {}_\lambda\eta_{d,i,k}\,{}_\lambda\eta_{d,i,l}\Big)_{k,l=1,\ldots,K_{d,i}}, \qquad d=0,\ldots,D, \]
are the simulation-based analogues of the matrices $B_{d,i}$. Since the inverses of $B^L_{d,i}$ need not exist, we switch to the pseudo-inverses $(A^L_{d,i})^+$ in order to introduce, in a recursive manner, simulation-based analogues of (2.11)–(2.12) with the help of the least-squares method. In detail we define:

\begin{align*}
\alpha^{0,L}_{d,i} &= 0, \qquad d=0,\ldots,D,\\
{}_\lambda\widehat Y^{n-1}_{t_i} &= \sum_{k=1}^{K_{0,i}} ({}_\lambda\Psi^0_{t_i})^{-1}\,\alpha^{n-1,L}_{0,i,k}\,{}_\lambda\eta_{0,i,k},\\
{}_\lambda\widehat Z^{n-1}_{d,t_i} &= \sum_{k=1}^{K_{d,i}} ({}_\lambda\Psi^0_{t_i})^{-1}\,\alpha^{n-1,L}_{d,i,k}\,{}_\lambda\eta_{d,i,k},\\
\mathbf f(t_i) &= \big(f(t_i,{}_1S_{t_i},{}_1\widehat Y^{n-1}_{t_i},{}_1\widehat Z^{n-1}_{t_i}),\ldots,f(t_i,{}_LS_{t_i},{}_L\widehat Y^{n-1}_{t_i},{}_L\widehat Z^{n-1}_{t_i})\big)^\top,\\
\alpha^{n,L}_{0,i} &= \frac{1}{\sqrt L}\,(A^L_{0,i})^+\Big(\boldsymbol\Psi^0_{t_N}*\boldsymbol\phi(X_{t_N}) + \sum_{j=i}^{N-1}\boldsymbol\Psi^0_{t_j}*\mathbf f(t_j)\,\Delta_j\Big),\\
\alpha^{n,L}_{d,i} &= \frac{1}{\sqrt L}\,(A^L_{d,i})^+\Big(\frac{\Delta\mathbf W^h_{d,i}}{\Delta_i}*\Big(\boldsymbol\Psi^0_{t_N}*\boldsymbol\phi(X_{t_N}) + \sum_{j=i+1}^{N-1}\boldsymbol\Psi^0_{t_j}*\mathbf f(t_j)\,\Delta_j\Big)\Big), \qquad d=1,\ldots,D,
\end{align*}
where $*$ denotes the componentwise multiplication of two vectors and we used the abbreviation $\alpha^{n,L}_{d,i} = (\alpha^{n,L}_{d,i,1},\ldots,\alpha^{n,L}_{d,i,K_{d,i}})^\top$. This enables us to define simulation-based estimators by
\begin{align*}
\widehat Y^{n,L}_{t_i} &= \sum_{k=1}^{K_{0,i}} (\Psi^0_{t_i})^{-1}\alpha^{n,L}_{0,i,k}\,\eta_{0,i,k} = \sum_{k=1}^{K_{0,i}} \alpha^{n,L}_{0,i,k}\,p_{0,i,k}(X_{t_i}),\\
\widehat Z^{n,L}_{d,t_i} &= \sum_{k=1}^{K_{d,i}} (\Psi^0_{t_i})^{-1}\alpha^{n,L}_{d,i,k}\,\eta_{d,i,k} = \sum_{k=1}^{K_{d,i}} \alpha^{n,L}_{d,i,k}\,p_{d,i,k}(X_{t_i}), \qquad d=1,\ldots,D.
\end{align*}

Note that the thus constructed coefficients solve linear least-squares problems, e.g.
\[ \alpha^{n,L}_{0,i} = \operatorname*{arg\,inf}_{\alpha\in\mathbb{R}^{K_{0,i}}} \frac{1}{L}\sum_{\lambda=1}^L \Big|\alpha^\top{}_\lambda\eta_{0,i} - {}_\lambda\Psi^0_{t_N}\phi({}_\lambda X_{t_N}) - \sum_{j=i}^{N-1} {}_\lambda\Psi^0_{t_j}\,f(t_j,{}_\lambda S_{t_j},{}_\lambda\widehat Y^{n-1}_{t_j},{}_\lambda\widehat Z^{n-1}_{t_j})\,\Delta_j\Big|^2, \]
and similarly for $d=1,\ldots,D$.
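The equivalence between the pseudo-inverse formula for $\alpha^{n,L}_{0,i}$ and this least-squares characterization can be checked numerically. The following sketch uses purely synthetic data and is only meant to illustrate the identity, not the algorithm itself.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical simulated design: L paths, K basis functions at one time point.
L, K = 5000, 3
eta = rng.normal(size=(L, K))      # lambda-th row: draw of (eta_{0,i,1}, ..., eta_{0,i,K})
target = rng.normal(size=L)        # lambda-th draw of the weighted regression target

# Definition via the pseudo-inverse: alpha = (1/sqrt(L)) (A^L)^+ target with A^L = eta/sqrt(L).
A = eta / np.sqrt(L)
alpha_pinv = np.linalg.pinv(A) @ target / np.sqrt(L)

# Least-squares characterization: argmin over alpha of (1/L) sum_lambda |alpha^T eta - target|^2.
alpha_lstsq, *_ = np.linalg.lstsq(eta, target, rcond=None)

assert np.allclose(alpha_pinv, alpha_lstsq)
```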

Next, we derive almost sure convergence of the simulation-based estimators, starting with the coefficients.


Lemma 2.2.4. $(\alpha^{n,L}_{0,i},\ldots,\alpha^{n,L}_{D,i})$ converges $P$-almost surely to $(\alpha^n_{0,i},\ldots,\alpha^n_{D,i})$ as $L$ tends to infinity.

Proof. We proceed by induction on $n$. For $n=0$ the claim is true by definition, since $\alpha^{0,L}_{d,i,k} = \alpha^0_{d,i,k} = 0$, $d=0,\ldots,D$, $k=1,\ldots,K_{d,i}$.

Now, we suppose convergence is already proved for some $(n-1)\in\mathbb{N}_0$. We only show the convergence of $\alpha^{n,L}_{d,i,k}$ for some fixed $d=1,\ldots,D$; the arguments for $\alpha^{n,L}_{0,i,k}$ are basically the same.

By the law of large numbers we have

\[ \lim_{L\to\infty} B^L_{d,i} = B_{d,i}, \quad P\text{-a.s.} \tag{2.15} \]

As $B_{d,i}$ is invertible, the same is valid for $B^L_{d,i}$ provided $L$ is large enough. We assume $L$ to satisfy this condition in the following. In particular, $A^L_{d,i}$ then has full rank and its pseudo-inverse can be written as
\[ (A^L_{d,i})^+ = (B^L_{d,i})^{-1}(A^L_{d,i})^\top. \]
Hence, we obtain

\[ \alpha^{n,L}_{d,i} = (B^L_{d,i})^{-1}\Big(\frac{1}{L}\sum_{\lambda=1}^L {}_\lambda\eta_{d,i}\,\frac{{}_\lambda\Delta W^h_{d,i}}{\Delta_i}\Big({}_\lambda\Psi^0_{t_N}\phi({}_\lambda X_{t_N}) + \sum_{j=i+1}^{N-1}{}_\lambda\Psi^0_{t_j}\,f(t_j,{}_\lambda S_{t_j},{}_\lambda\widehat Y^{n-1}_{t_j},{}_\lambda\widehat Z^{n-1}_{t_j})\,\Delta_j\Big)\Big), \]
so that, due to (2.15), we only have to show for all $l=1,\ldots,K_{d,i}$
\[ \frac{1}{L}\sum_{\lambda=1}^L {}_\lambda\eta_{d,i,l}\,\frac{{}_\lambda\Delta W^h_{d,i}}{\Delta_i}\Big({}_\lambda\Psi^0_{t_N}\phi({}_\lambda X_{t_N}) + \sum_{j=i+1}^{N-1}{}_\lambda\Psi^0_{t_j}\,f(t_j,{}_\lambda S_{t_j},{}_\lambda\widehat Y^{n-1}_{t_j},{}_\lambda\widehat Z^{n-1}_{t_j})\,\Delta_j\Big) \]
\[ \longrightarrow\; E\Big[\eta_{d,i,l}\,\frac{\Delta W^h_{d,i}}{\Delta_i}\Big(\Psi^0_{t_N}\phi(X_{t_N}) + \sum_{j=i+1}^{N-1}\Psi^0_{t_j}\,f(t_j,S_{t_j},\widehat Y^{n-1}_{t_j},\widehat Z^{n-1}_{t_j})\,\Delta_j\Big)\Big] \quad P\text{-a.s.} \]

To do so, define
\[ {}_\lambda\widetilde Y^{n-1}_{t_i} = \sum_{k=1}^{K_{0,i}} ({}_\lambda\Psi^0_{t_i})^{-1}\alpha^{n-1}_{0,i,k}\,{}_\lambda\eta_{0,i,k}, \qquad {}_\lambda\widetilde Z^{n-1}_{d,t_i} = \sum_{k=1}^{K_{d,i}} ({}_\lambda\Psi^0_{t_i})^{-1}\alpha^{n-1}_{d,i,k}\,{}_\lambda\eta_{d,i,k}, \tag{2.16} \]
and ${}_\lambda\widetilde Z^{n-1}_{t_i} = ({}_\lambda\widetilde Z^{n-1}_{1,t_i},\ldots,{}_\lambda\widetilde Z^{n-1}_{D,t_i})^\top$. Note that

\[ {}_\lambda\eta_{d,i,l}\,\frac{{}_\lambda\Delta W^h_{d,i}}{\Delta_i}\Big({}_\lambda\Psi^0_{t_N}\phi({}_\lambda X_{t_N}) + \sum_{j=i+1}^{N-1}{}_\lambda\Psi^0_{t_j}\,f(t_j,{}_\lambda S_{t_j},{}_\lambda\widetilde Y^{n-1}_{t_j},{}_\lambda\widetilde Z^{n-1}_{t_j})\,\Delta_j\Big), \qquad \lambda=1,\ldots,L, \]
are independent and identically distributed with the same distribution as

\[ \eta_{d,i,l}\,\frac{\Delta W^h_{d,i}}{\Delta_i}\Big(\Psi^0_{t_N}\phi(X_{t_N}) + \sum_{j=i+1}^{N-1}\Psi^0_{t_j}\,f(t_j,S_{t_j},\widehat Y^{n-1}_{t_j},\widehat Z^{n-1}_{t_j})\,\Delta_j\Big). \]
Moreover, it holds that

\[ E\Big[\eta_{d,i,l}\,\frac{\Delta W^h_{d,i}}{\Delta_i}\Big(\Psi^0_{t_N}\phi(X_{t_N}) + \sum_{j=i+1}^{N-1}\Psi^0_{t_j}\,f(t_j,S_{t_j},\widehat Y^{n-1}_{t_j},\widehat Z^{n-1}_{t_j})\,\Delta_j\Big)\Big] < \infty, \]


where we used Hölder's inequality, the independence of $\Delta W^h_i$ and $X_{t_i}$, the Lipschitz continuity of $f$ and Assumption A 7. Therefore we can apply Kolmogorov's law of large numbers and deduce that



The first factor converges to zero due to the induction hypothesis, the second one to a finite number due to the law of large numbers. Combining (2.15)–(2.18) yields the claim.

Consequently, we obtain the $P$-a.s. convergence of the simulation-based estimators:

Theorem 2.2.5. $(\widehat Y^{n,L}_{t_i},\widehat Z^{n,L}_{t_i})$ converges $P$-almost surely to $(\widehat Y^n_{t_i},\widehat Z^n_{t_i})$ as $L$ tends to infinity.

Hence the claim follows in view of Lemma 2.2.4.

We now summarize the approximation of $(Y_0,Z_0)$ by the modified forward scheme with importance sampling in a least-squares Monte Carlo framework:

The final estimator for $(Y_0,Z_0)$ is $(\widehat Y^{n,L}_{t_0},\widehat Z^{n,L}_{t_0})$. Notice that at time $t_0=0$ the only choice for the projection space is the span of the constant functions, so that the estimator reduces to an average over the simulated paths. It is important to see that the averaging here is over dependent paths, because the weights in the definition of $({}_\lambda\widehat Y^{n-1}_{t_j},{}_\lambda\widehat Z^{n-1}_{t_j})$ depend on the whole collection of sample paths. In the very special case $f=0$, one averages, however, over independent paths and $\widehat Y^{n,L}_{t_0}$ reduces to the usual Monte Carlo estimator for the expectation of $\phi(X_T)$ with importance sampling given by $h$. In the context of option pricing (with $f=0$) in a complete market, $Z_0$ is (up to a linear transformation) the delta of the option. The estimator $\widehat Z^{n,L}_{d,t_0}$ then corresponds to the likelihood ratio delta with importance sampling. For more information on this classical situation we refer to Glasserman [20], Chapter 4.6 for importance sampling, and Chapter 7.3 for the likelihood ratio delta.
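To illustrate this special case $f=0$, here is a minimal importance-sampling sketch for $E[\phi(X_T)]$ in a toy Black–Scholes setting; the dynamics, the constant drift shift $h$ and all parameter values are assumptions chosen for illustration, with the Girsanov weight playing the role of $\Psi^0_{t_N}$.

```python
import numpy as np

rng = np.random.default_rng(2)

# Assumed model: X_T = x0 * exp((r - 0.5*sigma^2) T + sigma W_T), constant drift shift h.
x0, r, sigma, T, strike, h = 100.0, 0.02, 0.2, 1.0, 140.0, 1.5
L = 100_000

W_T_shifted = np.sqrt(T) * rng.normal(size=L) + h * T        # Brownian motion with drift h
X_T = x0 * np.exp((r - 0.5 * sigma**2) * T + sigma * W_T_shifted)
payoff = np.exp(-r * T) * np.maximum(X_T - strike, 0.0)      # out-of-the-money call

# Girsanov weight removing the drift h again (stand-in for Psi^0_{t_N}).
Psi = np.exp(-h * W_T_shifted + 0.5 * h**2 * T)

estimate = np.mean(Psi * payoff)                             # importance-sampled estimator
std_err = np.std(Psi * payoff, ddof=1) / np.sqrt(L)
```

Pushing the simulated paths towards the region where the payoff is nonzero and correcting with the weight is what reduces the variance of the resulting average.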

We now decompose the error into
\[ |Y_0-\widehat Y^{n,L}_{t_0}|^2 + |Z_0-\widehat Z^{n,L}_{t_0}|^2 \le C\Big( |Y_0-Y^n_{t_0}|^2 + |Z_0-Z^n_{t_0}|^2 \;+\; |Y^n_{t_0}-\widehat Y^n_{t_0}|^2 + |Z^n_{t_0}-\widehat Z^n_{t_0}|^2 \;+\; |\widehat Y^n_{t_0}-\widehat Y^{n,L}_{t_0}|^2 + |\widehat Z^n_{t_0}-\widehat Z^{n,L}_{t_0}|^2 \Big). \]


The first term captures the error due to the time discretization and the iteration. We know from the results in Section 3 that this error does not depend on the choice of $h$. In typical situations it is of order 1/2 in the mesh size of the time partition and converges exponentially in the number of iterations, see Corollary 2.1.2. Although the size of this first error term does not depend on $h$, we emphasize that $Y^n_{t_0}$ is the expectation of an expression whose variance changes with $h$. The second term contains the error stemming from the choice of the basis. Obviously, the weights (2.11)–(2.12) in the construction of $(\widehat Y^n_{t_0},\widehat Z^n_{t_0})$ depend on $h$. Hence, for the second error term, the choice of $h$ influences the error term itself and the variances in the computation of the weights. By Theorem 2.2.2 the second term converges to zero when the basis increases in such a way that the projection spaces $\Lambda_{d,i}$ exhaust the space $L^2(\mathcal{G}_i)$. Finally, the third term covers the simulation error. Thanks to Theorem 2.2.5 this error converges to zero almost surely as the number of paths tends to infinity.

To conclude, let us say a few words about what we do and do not do in this chapter: the objective of the importance sampling procedure introduced here is to reduce the third error term in the above decomposition by a judicious choice of $h$. However, nothing is said about how to choose it in practice. For this purpose it would be desirable to have a theoretical criterion. Judging from experience in the context of option pricing, one is rather pessimistic about the ability to establish a general rule covering every setting.

We do not go into further detail here and come back to this task for an important class of BSDEs in Chapter 4, where we also illustrate the success of the importance sampling with several numerical examples.

Chapter 3

$L^2$-convergence for the Picard-type estimator

The scope of this chapter is to establish an $L^2$-convergence theorem for a variant of the Picard-type algorithm of Bender and Denk [2]. Thereby, we have to circumvent the problem that arises out of taking means with respect to dependent random variables; see for this purpose (2.19) and (2.20) with ${}_\lambda\Psi^0_{t_j}\equiv 1$ for $j=0,\ldots,N$. Thus the variance of the estimator cannot be written as a sum of the variances of the individual random variables, and we cannot apply the usual methodology. Nonetheless, such a theorem is highly desirable, since the overall results of Bender and Denk [2] only yield convergence in probability of the final estimator towards the solution of the BSDE at time zero, whereas in the first two approximation steps they can show convergence in the stronger $L^2$-sense.

With the help of concepts from nonparametric statistics we can overcome these difficulties; however, we have to deal with several technical details and a demanding notation. First, we have to make sure that certain processes are bounded.

3.1 Bounded processes

In addition to Assumptions (A 1)–(A 5) we now impose for the whole chapter:

A 8. The functions $f$ and $\phi$ are bounded by some constant $R>0$.

At first sight this seems rather restrictive; however, we can regard this as a first approximation of the original equations in the following sense: defining

\[
f_R(t,x,y,z) := \begin{cases} R, & \text{if } f(t,x,y,z) > R,\\ f(t,x,y,z), & \text{if } -R \le f(t,x,y,z) \le R,\\ -R, & \text{if } f(t,x,y,z) < -R, \end{cases}
\qquad
\phi_R(x) := \begin{cases} R, & \text{if } \phi(x) > R,\\ \phi(x), & \text{if } -R \le \phi(x) \le R,\\ -R, & \text{if } \phi(x) < -R, \end{cases}
\]

we obtain Lipschitz continuous functions bounded by $R$. Thus assuming A 8 is equivalent to considering
\begin{align*}
dS_t &= b(t,S_t)\,dt + \sigma(t,S_t)\,dW_t, & S_0 &= s_0,\\
dY^R_t &= -f_R(t,S_t,Y^R_t,Z^R_t)\,dt + Z^R_t\,dW_t, & Y^R_T &= \phi_R(X_T),
\end{align*}



where we simply truncate the functions $f$ and $\phi$ in absolute value at some large $R$. Theorem 3.3 of Yong and Zhou [44] then implies the unique solvability of this modified equation, and furthermore we can obtain an estimate for the difference between the solution of the original equation and that of the modified one. To avoid an even more complex notation we only mention this point of view and simply accept the additional assumption.
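In code, the truncation defining $f_R$ and $\phi_R$ is just a componentwise clip at $\pm R$; a minimal sketch with hypothetical values:

```python
import numpy as np

R = 10.0  # hypothetical bound from Assumption A 8

def truncate_at_R(values, R):
    """Truncate f or phi in absolute value at R, as in the definition of f_R and phi_R."""
    return np.clip(values, -R, R)

# Example: truncating driver values computed along simulated paths.
f_values = np.array([-25.0, -3.2, 0.0, 7.5, 42.0])
f_R_values = truncate_at_R(f_values, R)   # -> [-10. , -3.2, 0. , 7.5, 10. ]
```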

In the sequel we consider a variant of the Picard-type discretization of BSDEs introduced by Bender and Denk [2].

For a fixed partition, their approximation scheme for the backward part is given for $i=0,\ldots,N$ by
\begin{align*}
Y^n_{t_i} &= E\Big[\phi(X_{t_N}) + \sum_{j=i}^{N-1} f(t_j,S_{t_j},Y^{n-1}_{t_j},Z^{n-1}_{t_j})\,\Delta_j \,\Big|\, \mathcal{F}_{t_i}\Big],\\
Z^n_{d,t_i} &= E\Big[\frac{\Delta W_{d,i}}{\Delta_i}\Big(\phi(X_{t_N}) + \sum_{j=i+1}^{N-1} f(t_j,S_{t_j},Y^{n-1}_{t_j},Z^{n-1}_{t_j})\,\Delta_j\Big) \,\Big|\, \mathcal{F}_{t_i}\Big],
\end{align*}
initialized at $(Y^0,Z^0) = (0,0)$.

In order to obtain bounded discrete processes, we truncate the Brownian increments in the above iteration scheme and end up with the following approximation for the solution of the backward equation, suppressing the dependence on the partition $\pi$:

\begin{align*}
Y^{n,R}_{t_i} &= E\Big[\phi(X_{t_N}) + \sum_{j=i}^{N-1} f(t_j,S_{t_j},Y^{n-1,R}_{t_j},Z^{n-1,R}_{t_j})\,\Delta_j \,\Big|\, \mathcal{F}_{t_i}\Big],\\
Z^{n,R}_{d,t_i} &= E\Big[\frac{[\Delta W_{d,i}]^{w_i}}{\Delta_i}\Big(\phi(X_{t_N}) + \sum_{j=i+1}^{N-1} f(t_j,S_{t_j},Y^{n-1,R}_{t_j},Z^{n-1,R}_{t_j})\,\Delta_j\Big) \,\Big|\, \mathcal{F}_{t_i}\Big],
\end{align*}

which we also initialize with $(Y^{0,R},Z^{0,R}) = (0,0)$ and extend constantly between two points of the time grid.

Here, the mapping $[\cdot]^{w_i}$ truncates at $R'\sqrt{\Delta_i}$ for some $R'>0$, meaning that for $x\in\mathbb{R}$ we have
\[ [x]^{w_i} = \big(-R'\sqrt{\Delta_i} \vee x\big) \wedge R'\sqrt{\Delta_i}. \]
Moreover, for $x\in\mathbb{R}^D$ we also write $[x]^{w_i} = ([x_1]^{w_i},\ldots,[x_D]^{w_i})^\top$.
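A minimal sketch of the truncation $[\cdot]^{w_i}$ applied to simulated Brownian increments; the function name and array shapes are assumptions, and the min/max mirrors the definition above.

```python
import numpy as np

def truncate_increment(dW, dt, R_prime):
    """Componentwise truncation [x]^{w_i} of Brownian increments at R' * sqrt(Delta_i).

    dW      : array of shape (L, D), simulated increments Delta W_i for L paths
    dt      : scalar Delta_i
    R_prime : truncation constant R' > 0
    """
    level = R_prime * np.sqrt(dt)
    return np.minimum(np.maximum(dW, -level), level)   # (-level v x) ^ level, componentwise

# Example: truncating D = 2 dimensional increments on a grid with Delta_i = 0.01.
rng = np.random.default_rng(3)
dW = np.sqrt(0.01) * rng.normal(size=(1000, 2))
dW_trunc = truncate_increment(dW, dt=0.01, R_prime=2.0)
```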

For a fixed partition, we can now show the boundedness of the discretized solution of the BSDE:

Lemma 3.1.1. There is a constant $C_y$ depending only on the bound of $f$ and $\phi$, and a constant $C^\pi_z$ depending additionally on the partition, such that for $i=0,\ldots,N$ and any $n\in\mathbb{N}_0$
\[ |Y^{n,R}_{t_i}| \le C_y \quad\text{and}\quad |Z^{n,R}_{d,t_i}| \le C^\pi_z. \]

Proof. Jensen’s inequality and the boundedness ofφand f imply for anyi=0, . . . ,Nand anyn∈N

|Ytn,Ri | =

¯¯

¯¯E

·

φ(XtN) +

N−1

j=i

f(tj,Stj,Ytn−1,Rj ,Zn−1,Rtj )∆j

¯¯

¯¯Fti

¸¯¯

¯¯

E

·¯¯¯φ(XtN)

¯¯

¯+

N−1

j=i

¯¯

¯f(tj,Stj,Ytn−1,Rj ,Zn−1,Rtj )

¯¯

¯∆j

¯¯

¯¯Fti

¸

E£

R+RT|Fti

¤=R(1+T) =:Cy.


We now turn to the aim of this section. We show that the truncation of the Brownian increments causes an error which converges to zero fast as $R'$ tends to infinity. We separate the proof of this property into different steps to improve clarity:

Proposition 3.1.2. For any $n\in\mathbb{N}$, $\Gamma>0$ and $\lambda_i$, $i=0,\ldots,N-1$, with $\lambda_0=1$ and $\lambda_{i+1} = (1+\Gamma\Delta_i)\lambda_i$ for $i=0,\ldots,N-2$, it holds
\[ \cdots \]

Proof. This can be shown exactly as in Bender and Denk [2], Lemma 7, Step 2.

The second result is devoted to the $Z$-part of the solution and also uses ideas of Gobet et al. [22]:

Proposition 3.1.3. For any $n\in\mathbb{N}$, $\gamma,\Gamma>0$ and $\lambda_i$, $i=0,\ldots,N-1$, with $\lambda_0=1$ and $\lambda_{i+1} = (1+\Gamma\Delta_i)\lambda_i$ for $i=0,\ldots,N-2$, it holds
\[ \cdots \]

Taking squares and expectations and multiplying with $\Delta_i$ yields, together with the inequalities of Young,


We plug in the definitions of the Picard-type scheme and of its modified approximation scheme and introduce the following abbreviation

Multiplying both sides of the inequality with $\lambda_i$ and summing up from $0$ to $N-1$, we obtain via Young's inequality and the Lipschitz property of $f$, for any $\gamma>0$:


Together with Proposition 3.1.2 this yields for the $D$-dimensional process

We are now able to estimate the error in the $n$-th Picard iteration against the error in the first iteration and the error resulting from the truncation of the Brownian increments:

Corollary 3.1.4. For $n\in\mathbb{N}$, $\gamma = 8DK^2(T+1)$, $\Gamma = 4K^2(T+1)(2D\gamma T+1)$ it holds


Proof. Adding the terms in the preceding propositions yields for any $n\in\mathbb{N}$
\[ \max_{0\le i\le N-1}\lambda_i E\big[\,\cdots\,\big] \le \cdots \]
Choosing $\Gamma$ and $\gamma$ as indicated above, we obtain the claim via an iterated application of the above estimate.

It remains to give an upper bound for the error in the first Picard iteration, which we do next:

Lemma 3.1.5. It holds
\[ \max_{0\le i\le N-1}\lambda_i E\big[\,\cdots\,\big] \le \cdots \]

Together with the inequality of Proposition 3.1.3 (see (3.1)–(3.3)) we obtain the following estimate for the $Z$-part of the solution:

To complete the preparations we still need the following lemma, specifying an upper bound for the error resulting from the truncation of the Brownian increments:

Lemma 3.1.6. There is a constant $C$ such that for any $n\in\mathbb{N}$

Proof. For any $n\in\mathbb{N}$ and a generic constant $C$ we can calculate