
$$
\begin{aligned}
& e^{\Gamma T}R^2(1+T)^2 D \sum_{i=0}^{N-1}\frac{1}{\Delta_i}\int_{|x|\ge R_0\sqrt{\Delta_i}} x^2\, dN(0,\Delta_i) \\
&\quad = e^{\Gamma T}R^2(1+T)^2 D\, \frac{1}{\sqrt{2\pi}}\sum_{i=0}^{N-1}\frac{1}{(\Delta_i)^{3/2}}\int_{|x|\ge R_0\sqrt{\Delta_i}} x^2 e^{-\frac{x^2}{2\Delta_i}}\, dx \\
&\quad = e^{\Gamma T}R^2(1+T)^2 D\, \frac{1}{\sqrt{2\pi}}\sum_{i=0}^{N-1}\frac{1}{(\Delta_i)^{3/2}}\,\Delta_i\sqrt{\Delta_i}\int_{|y|\ge R_0} y^2 e^{-\frac{y^2}{2}}\, dy \\
&\quad = e^{\Gamma T}R^2(1+T)^2 D\, \frac{1}{\sqrt{2\pi}}\sum_{i=0}^{N-1}\int_{|y|\ge R_0} y^2 e^{-\frac{y^2}{2}}\, dy \\
&\quad = 2N e^{\Gamma T}R^2(1+T)^2 D\, \frac{1}{\sqrt{2\pi}}\, e^{-R_0^2/4}\int_{R_0}^{\infty} y^2 e^{-\frac{y^2}{2}+R_0^2/4}\, dy \\
&\quad \le 2N e^{\Gamma T}R^2(1+T)^2 D\, \frac{1}{\sqrt{2\pi}}\, e^{-R_0^2/4}\int_{R_0}^{\infty} y^2 e^{-\frac{y^2}{4}}\, dy \\
&\quad \le 2N e^{\Gamma T}R^2(1+T)^2 D\, \frac{1}{\sqrt{2\pi}}\, e^{-R_0^2/4}\, C \\
&\quad = C N R^2 \exp(-R_0^2/4),
\end{aligned}
$$

where $N(\mu,\sigma^2)$ denotes the cumulative distribution function of a Gaussian random variable with mean $\mu$ and variance $\sigma^2$.

Now we are able to bound the aggregate error generated by this kind of truncation:

Theorem 3.1.7. There is a constant $C$ such that for any $n\in\mathbb{N}$

$$\max_{0\le i\le N-1} E\Big[|Y^n_{t_i}-Y^{n,R}_{t_i}|^2\Big] + \sum_{i=0}^{N-1} E\Big[|Z^n_{t_i}-Z^{n,R}_{t_i}|^2\Big]\,\Delta_i \le C N R^2\exp(-R_0^2/4),$$

given $|\pi|$ is small enough.

Proof. Corollary 3.1.4, Lemma 3.1.5 and Lemma 3.1.6 yield

$$
\begin{aligned}
\max_{0\le i\le N-1} E\Big[|Y^n_{t_i}-Y^{n,R}_{t_i}|^2\Big] + \sum_{i=0}^{N-1} E\Big[|Z^n_{t_i}-Z^{n,R}_{t_i}|^2\Big]\,\Delta_i
&\le \max_{0\le i\le N-1} \lambda_i E\Big[|Y^n_{t_i}-Y^{n,R}_{t_i}|^2\Big] + \sum_{i=0}^{N-1} \lambda_i E\Big[|Z^n_{t_i}-Z^{n,R}_{t_i}|^2\Big]\,\Delta_i \\
&\le \sum_{k=0}^{n-1}\Big(\frac{\mu_\Gamma}{4}|\pi|+\frac{1}{2}\Big)^{k}\, C N R^2\exp(-R_0^2/4) \\
&\le C N R^2\exp(-R_0^2/4)
\end{aligned}
$$

for some other constant $C$, provided $|\pi|$ is small enough.

3.2 Projection approach in the case of Markov processes

The next step is the basis for a least-squares Monte Carlo procedure and is easily proven with the results of the second chapter:


Lemma 3.2.1. There are deterministic functions $y^n_i$ and $z^{d,n}_i$ ($i=0,\dots,N$, $n=0,1,\dots$, $d=1,\dots,D$) such that $Y^{n,R}_{t_i} = y^n_i(X_{t_i})$ and $Z^{d,n,R}_{t_i} = z^{d,n}_i(X_{t_i})$. We define $z^n_i = (z^{1,n}_i,\dots,z^{D,n}_i)^\top$.

Proof. We can proceed as in Lemma 2.1.1 and only have to take into account that, since $X_{t_i}$ and $\Delta W_{d,i}$ are independent, the same is true for $X_{t_i}$ and $[\Delta W_{d,i}]^{w_i}$.

Note that the regression functions here also depend on $R$; however, we suppress this dependence for simplicity in the sequel.

Now, as in the last chapter, we choose for $i=0,\dots,N-1$ and any $d=0,\dots,D$ in each case $K_{d,i}$ deterministic basis functions $p_{d,i,k}(\cdot)$, $k=1,\dots,K_{d,i}$, such that $p_{d,i}(X_{t_i})$ is square-integrable. We thereby use the notation

$$p_{d,i}(\cdot) = (p_{d,i,1}(\cdot),\dots,p_{d,i,K_{d,i}}(\cdot))^\top.$$

In the following proofs we make use of a number of structures, which we now describe in detail, starting with the projection spaces:

Projection spaces and matrices:

Usually, projections onto the finite-dimensional subspaces

$$\mathcal{P}_{d,i} := \big\{\alpha\cdot p_{d,i}(\cdot),\ \alpha\in\mathbb{R}^{K_{d,i}}\big\}$$

are considered as approximations. Thereby, we use the basis functions $p_{0,i}$ for the approximation of $y^n_i$ and the functions $p_{d,i}$ with $d=1,\dots,D$ for the approximation of $z^{d,n}_i$. Due to our special situation we apply a slight modification, which we describe in detail later on.

In order to calculate the different projection coefficients we need $L > \max_{d,i} K_{d,i}$ independent Monte Carlo simulations of $X_{t_i}$, $i=0,\dots,N$, and $\Delta W_i$, $i=0,\dots,N-1$. We denote them again by ${}^\lambda X_{t_i}$, $\lambda=1,\dots,L$, $i=0,\dots,N$, and $\Delta^\lambda W_i$, $\lambda=1,\dots,L$, $i=0,\dots,N-1$, respectively.
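For illustration only, the following sketch shows how such simulations could be generated when the forward process is approximated by an Euler-Maruyama scheme on a uniform grid in one dimension; the coefficient functions b and sigma and all other names are hypothetical and not taken from the text.

```python
import numpy as np

def simulate_paths(L, N, T, x0, b, sigma, rng=None):
    """Simulate L independent Euler-Maruyama paths of a 1-d forward SDE
    together with the Brownian increments that drive them."""
    rng = np.random.default_rng() if rng is None else rng
    dt = T / N                                        # uniform grid, Delta_i = T/N
    dW = rng.normal(0.0, np.sqrt(dt), size=(L, N))    # increments Delta^lambda W_i
    X = np.empty((L, N + 1))
    X[:, 0] = x0
    for i in range(N):
        # X_{t_{i+1}} = X_{t_i} + b(t_i, X_{t_i}) Delta_i + sigma(t_i, X_{t_i}) Delta W_i
        X[:, i + 1] = X[:, i] + b(i * dt, X[:, i]) * dt + sigma(i * dt, X[:, i]) * dW[:, i]
    return X, dW

# Toy usage (purely illustrative):
# X, dW = simulate_paths(L=10_000, N=20, T=1.0, x0=1.0,
#                        b=lambda t, x: 0.05 * x, sigma=lambda t, x: 0.2 * x)
```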

To simplify the notation we write $p_{d,i,k}({}^\lambda X_{t_i}) = p^\lambda_{d,i,k}$ and $p^\lambda_{d,i} = (p^\lambda_{d,i,1},\dots,p^\lambda_{d,i,K_{d,i}})^\top$ for $d=0,1,\dots,D$. For $d=0,1,\dots,D$ we define the matrix $B^L_{d,i}$ of dimension $L\times K_{d,i}$ whose rows are just the vectors $(p^\lambda_{d,i})^\top$. The rank of this matrix is denoted by $K^L_{d,i}$; it is random and at most $K_{d,i}$.
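As an illustration that is not part of the text, the matrix $B^L_{d,i}$ could be assembled from a simple monomial basis as follows; the basis choice is a hypothetical example.

```python
import numpy as np

def design_matrix(x_i, K):
    """Rows are p_{d,i}(X_{t_i}^lambda)^T for the monomial basis 1, x, ..., x^(K-1);
    x_i is the length-L vector of simulated values of X_{t_i}."""
    return np.vander(x_i, N=K, increasing=True)       # shape (L, K)

# The (random) rank K^L_{d,i} can be checked numerically:
# B = design_matrix(X[:, i], K=4)
# rank = np.linalg.matrix_rank(B)
```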

As the unknown functions $y^n_i$ and $z^{d,n}_i$ are bounded by $C_y$ and $C^\pi_z$ respectively, their approximations should satisfy this property as well. We therefore approximate $y^n_i$ with an element of

$$[\mathcal{P}_{0,i}]_y := \big\{[\alpha\cdot p_{0,i}]_y(\cdot),\ \alpha\in\mathbb{R}^{K_{0,i}}\big\}$$

and $z^{d,n}_i$ for $d=1,\dots,D$ with an element of

$$[\mathcal{P}_{d,i}]_z := \big\{[\alpha\cdot p_{d,i}]_z(\cdot),\ \alpha\in\mathbb{R}^{K_{d,i}}\big\},$$

where $[\cdot]_y$ and $[\cdot]_z$ truncate at $C_y$ and $C^\pi_z$ respectively, i.e. for any real-valued function $\xi$ we set

$$[\xi]_y(x) := (-C_y\vee\xi(x))\wedge C_y, \qquad [\xi]_z(x) := (-C^\pi_z\vee\xi(x))\wedge C^\pi_z.$$
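In an implementation the truncations $[\cdot]_y$ and $[\cdot]_z$ amount to a componentwise clipping; a minimal sketch, with the bound passed in as a parameter:

```python
import numpy as np

def truncate(values, bound):
    """[xi](x) = (-bound v xi(x)) ^ bound, i.e. clip the values to [-bound, bound]."""
    return np.clip(values, -bound, bound)
```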

‘Ghost samples’:

For the convergence proof we need extra simulations of ${}^\lambda X_{t_j}$, $j=0,\dots,N$, and of the simulated Brownian increments $\Delta^\lambda W_j$, $j=0,\dots,N-1$. More precisely, we create a further set of Brownian increments, independent of everything else, denoted by $\Delta^\lambda W_j$, $j=0,\dots,N-1$, $\lambda=1,\dots,L$, and thereby construct $N$ sets of discrete Markov processes ${}^\lambda X^i$, $i=0,\dots,N-1$, such that for $\lambda=1,\dots,L$, $j=i+1,\dots,N$ the random variables ${}^\lambda X^i_{t_j}$ and ${}^\lambda X_{t_j}$ conditioned on ${}^\lambda X_{t_i}$ are independent and identically distributed, and ${}^\lambda X^i_{t_j} = {}^\lambda X_{t_j}$ for $\lambda=1,\dots,L$ and $j=0,\dots,i$.

If one applies the Euler-Maruyama scheme for the approximation of the forward diffusion, this means for the first components of ${}^\lambda X_{t_j}$ and ${}^\lambda X^i_{t_j}$:

$$
\begin{aligned}
{}^\lambda S_{t_{j+1}} &= {}^\lambda S_{t_j} + b(t_j,{}^\lambda S_{t_j})\Delta_j + \sigma(t_j,{}^\lambda S_{t_j})\,\Delta^\lambda W_j, && j=0,\dots,N-1, \\
{}^\lambda S^i_{t_j} &= {}^\lambda S_{t_j}, && j=0,\dots,i, \\
{}^\lambda S^i_{t_{j+1}} &= {}^\lambda S^i_{t_j} + b(t_j,{}^\lambda S^i_{t_j})\Delta_j + \sigma(t_j,{}^\lambda S^i_{t_j})\,\Delta^\lambda W_j, && j=i,\dots,N-1.
\end{aligned}
$$

The other components are constructed by

$${}^\lambda X^i_{t_{j+1}} = u^\pi(t_{j+1},{}^\lambda X^i_{t_j},\Delta^\lambda W_j), \qquad j=i,\dots,N-1.$$
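A sketch (under the same illustrative assumptions as before, with hypothetical coefficient functions b and sigma) of how the ghost paths could be produced in the Euler-Maruyama case: up to index i the original path is copied, and from index i on the recursion is driven by a fresh, independent set of increments.

```python
import numpy as np

def ghost_paths(S, dW_new, i, b, sigma, dt):
    """Ghost paths S^i: copy the original Euler paths S up to index i and
    continue with the independent increments dW_new from index i on.
    S has shape (L, N+1); dW_new has shape (L, N)."""
    L, N = dW_new.shape
    S_ghost = S.copy()                    # S^i_{t_j} = S_{t_j} for j = 0, ..., i
    for j in range(i, N):                 # re-simulate for j = i, ..., N-1
        S_ghost[:, j + 1] = (S_ghost[:, j]
                             + b(j * dt, S_ghost[:, j]) * dt
                             + sigma(j * dt, S_ghost[:, j]) * dW_new[:, j])
    return S_ghost
```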

Discretized norms:

We introduce the following discretized norms: for an $\mathbb{R}^{D_0}$-valued function $\xi=(\xi_1,\dots,\xi_{D_0})^\top$ we define for $i=0,\dots,N$

$$\|\xi\|_{i,L} := \sqrt{\frac{1}{L}\sum_{\lambda=1}^{L}\sum_{d=1}^{D_0}|\xi_d({}^\lambda X_{t_i})|^2}$$

and for $k=0,\dots,N-1$, $i=k,\dots,N$,

$$\|\xi\|_{i,\bar L_k} := \sqrt{\frac{1}{L}\sum_{\lambda=1}^{L}\sum_{d=1}^{D_0}|\xi_d({}^\lambda X^k_{t_i})|^2}.$$
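Numerically, both quantities are ordinary empirical $L^2$ norms over the simulation cloud at time $t_i$, evaluated either at the original or at the ghost samples; a small sketch:

```python
import numpy as np

def discretized_norm(xi_values):
    """Empirical norm ||xi||_{i,L} computed from the evaluated values
    xi_d(X_{t_i}^lambda), passed as an array of shape (L,) or (L, D0)."""
    v = np.asarray(xi_values, dtype=float).reshape(len(xi_values), -1)
    return np.sqrt(np.mean(np.sum(v ** 2, axis=1)))
```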

Algorithm and projection coefficients:

Our algorithm now works as follows: it is initialized with $y^{0,L}_i = 0 = z^{0,L}_i$, and for any $n\in\mathbb{N}$ with $y^{n,L}_N({}^\lambda X_{t_N}) = \phi({}^\lambda X_{t_N})$ and $z^{n,L}_N = 0$.

Afterwards, we calculate the following projection coefficients:

$$\alpha^{n,L}_{0,i} = \operatorname*{arg\,inf}_{\alpha}\ \frac{1}{L}\sum_{\lambda=1}^{L}\bigg|\Big\{\phi({}^\lambda X_{t_N}) + \sum_{j=i}^{N-1} f\big(t_j,{}^\lambda S_{t_j}, y^{n-1,L}_j({}^\lambda X_{t_j}), z^{n-1,L}_j({}^\lambda X_{t_j})\big)\Delta_j\Big\} - \alpha\cdot p^\lambda_{0,i}\bigg|^2 \qquad (3.4)$$

and for $d=1,\dots,D$

$$\alpha^{n,L}_{d,i} = \operatorname*{arg\,inf}_{\alpha}\ \frac{1}{L}\sum_{\lambda=1}^{L}\bigg|\frac{[\Delta^\lambda W_{d,i}]^{w_i}}{\Delta_i}\Big\{\phi({}^\lambda X_{t_N}) + \sum_{j=i+1}^{N-1} f\big(t_j,{}^\lambda S_{t_j}, y^{n-1,L}_j({}^\lambda X_{t_j}), z^{n-1,L}_j({}^\lambda X_{t_j})\big)\Delta_j\Big\} - \alpha\cdot p^\lambda_{d,i}\bigg|^2. \qquad (3.5)$$

We emphasize that these coefficients are, despite the same notation, not equal to those of Chapter 2. However, since in the remainder of this chapter we will only use the version defined here, no confusion should arise.


Our approximations then finally are

$$y^{n,L}_i({}^\lambda X_{t_i}) = [\alpha^{n,L}_{0,i}\cdot p_{0,i}]_y({}^\lambda X_{t_i}), \qquad z^{d,n,L}_i({}^\lambda X_{t_i}) = [\alpha^{n,L}_{d,i}\cdot p_{d,i}]_z({}^\lambda X_{t_i}).$$
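To make the regression step concrete, the following is a compact sketch of one Picard iteration in the spirit of (3.4), (3.5) and the truncated approximations above, under simplifying assumptions that are not part of the text: D = 1, a one-dimensional X = S, a uniform grid with step dt, a single basis for all d, and numpy's least-squares solver. All function and variable names are hypothetical.

```python
import numpy as np

def picard_step(X, dW, dt, phi, f, basis, y_prev, z_prev, Cy, Cz, w):
    """One Picard iteration of the least-squares scheme for D = 1.
    X: (L, N+1) simulated paths, dW: (L, N) Brownian increments,
    w: truncation levels w_i for the increments,
    y_prev, z_prev: lists of N callables from the previous Picard iteration.
    Returns the new truncated regression functions y_i^{n,L}, z_i^{n,L}."""
    L, N = dW.shape
    t = np.arange(N + 1) * dt
    y_new, z_new = [None] * N, [None] * N
    for i in range(N):
        B = basis(X[:, i])                                   # design matrix, shape (L, K)
        # regressand of (3.4): terminal value plus the driver sum from j = i on
        target_y = phi(X[:, -1]) + sum(
            f(t[j], X[:, j], y_prev[j](X[:, j]), z_prev[j](X[:, j])) * dt
            for j in range(i, N))
        alpha_y, *_ = np.linalg.lstsq(B, target_y, rcond=None)
        # regressand of (3.5): truncated, rescaled increment times the sum from j = i+1 on
        weight = np.clip(dW[:, i], -w[i], w[i]) / dt
        target_z = weight * (phi(X[:, -1]) + sum(
            f(t[j], X[:, j], y_prev[j](X[:, j]), z_prev[j](X[:, j])) * dt
            for j in range(i + 1, N)))
        alpha_z, *_ = np.linalg.lstsq(B, target_z, rcond=None)
        # truncated approximations [alpha . p]_y and [alpha . p]_z
        y_new[i] = lambda x, a=alpha_y: np.clip(basis(x) @ a, -Cy, Cy)
        z_new[i] = lambda x, a=alpha_z: np.clip(basis(x) @ a, -Cz, Cz)
    return y_new, z_new
```

For the first Picard iteration one would pass y_prev = z_prev = [lambda x: 0.0 * x] * N, matching the initialization $y^{0,L}_i = 0 = z^{0,L}_i$; repeating the step n times yields the coefficients $\alpha^{n,L}_{0,i}$ and $\alpha^{n,L}_{d,i}$.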

We additionally need the following projection coefficients to prove convergence: the coefficients $\tilde\alpha^{n,L}_{0,i}$ and $\tilde\alpha^{n,L}_{d,i}$ are defined as the minimizers in (3.4) and (3.5), respectively, with the ghost samples started at time $t_i$ in place of the original samples. In both cases the functions $y^{n-1,L}_j$ and $z^{n-1,L}_j$ are fixed, and we use for the estimation of $\alpha^{n,L}_{d,i}$ the original simulations ${}^\lambda X_{t_i}$, $\lambda=1,\dots,L$, and $\Delta^\lambda W_j$, $j=i,\dots,N-1$, $\lambda=1,\dots,L$, whereas the estimation of $\tilde\alpha^{n,L}_{d,i}$ is based on the ghost samples.

Orthonormality of the basis functions:

The definition of the above projection coefficients is not always unique. In particular, we have to specify which solution we choose if the matrices $(B^L_{d,i})^\top B^L_{d,i}$ for $d=0,1,\dots,D$ and $i=0,\dots,N-1$ are singular. Like Gobet et al. [22], we choose an approach based on the singular value decomposition of a matrix, which we briefly sketch in the appendix following the description in Golub and Van Loan [24]. Proceeding this way, we can suppose that for any $d=0,1,\dots,D$ and $i=0,\dots,N-1$ the identity $\frac{1}{L}(B^L_{d,i})^\top B^L_{d,i} = \mathrm{Id}_{K^L_{d,i}}$ holds and that the coefficients uniquely solve problems (3.4) and (3.5), respectively.
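One way to realize this convention numerically (an illustration, not the text's own implementation) is to solve the least-squares problems through the singular value decomposition, which returns the minimum-norm solution whenever $(B^L_{d,i})^\top B^L_{d,i}$ is rank deficient; numpy.linalg.lstsq and numpy.linalg.pinv follow the same idea.

```python
import numpy as np

def lsq_coefficients(B, target, tol=1e-10):
    """Minimum-norm least-squares solution alpha of ||B alpha - target||_2.
    Singular values below tol * s_max are treated as zero, so a singular
    (B^T B) causes no difficulty."""
    U, s, Vt = np.linalg.svd(B, full_matrices=False)
    cutoff = tol * (s.max() if s.size else 0.0)
    s_inv = np.zeros_like(s)
    s_inv[s > cutoff] = 1.0 / s[s > cutoff]
    return Vt.T @ (s_inv * (U.T @ target))
```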


σ-algebras, conditional expectations and conditional probabilities:

We define the $\sigma$-algebras

$$\mathcal{F}^L_i := \sigma(\Delta^\lambda W_j,\ \lambda=1,\dots,L,\ j=0,\dots,i-1),$$

$$\mathcal{F}^{L,k}_i := \sigma(\Delta^\lambda W_j,\ \lambda=1,\dots,L,\ j=0,\dots,N-1,\ \Delta^\lambda W_k,\dots,\Delta^\lambda W_{i-1},\ \lambda=1,\dots,L) \quad \text{for } i=k+1,\dots,N,$$

$$\mathcal{F}^{L,k}_k := \mathcal{F}^L := \sigma(\Delta^\lambda W_j,\ \lambda=1,\dots,L,\ j=0,\dots,N-1),$$

and we use the corresponding notation for the conditional expectations and conditional probabilities. We introduce the redundant parameter $k$ in the third definition to harmonize the notation in the sequel.

Further abbreviations:

To shorten the notation we define:

$$
\begin{aligned}
{}^\lambda f^{n}_i &:= f\big(t_i,{}^\lambda S_{t_i},y^n_i({}^\lambda X_{t_i}),z^n_i({}^\lambda X_{t_i})\big), \\
{}^\lambda f^{n,L}_i &:= f\big(t_i,{}^\lambda S_{t_i},y^{n,L}_i({}^\lambda X_{t_i}),z^{n,L}_i({}^\lambda X_{t_i})\big), \\
{}^\lambda f^{n,k}_i &:= f\big(t_i,{}^\lambda S^k_{t_i},y^n_i({}^\lambda X^k_{t_i}),z^n_i({}^\lambda X^k_{t_i})\big), && i=k,\dots,N-1, \\
{}^\lambda f^{n,L,k}_i &:= f\big(t_i,{}^\lambda S^k_{t_i},y^{n,L}_i({}^\lambda X^k_{t_i}),z^{n,L}_i({}^\lambda X^k_{t_i})\big), && i=k,\dots,N-1, \\
y^{\lambda,n,L,k}_i &:= E^{L,k}_i\Big[\phi({}^\lambda X^k_{t_N}) + \sum_{j=i}^{N-1}{}^\lambda f^{n-1,L,k}_j\Delta_j\Big] = E^{L,k}_i\big[y^{\lambda,n,L,k}_{i+1}\big] + {}^\lambda f^{n-1,L,k}_i\Delta_i, && i=k,\dots,N-1.
\end{aligned}
$$

For functions $\xi_d:\mathbb{R}^{M_0}\to\mathbb{R}$, $d=0,\dots,D$, we furthermore write

$$
\begin{aligned}
{}^\lambda f_i(\xi_{0,\dots,D}) &:= f\big(t_i,{}^\lambda S_{t_i},\xi_0({}^\lambda X_{t_i}),(\xi_d({}^\lambda X_{t_i}))_{d=1,\dots,D}\big), \\
{}^\lambda f^k_i(\xi_{0,\dots,D}) &:= f\big(t_i,{}^\lambda S^k_{t_i},\xi_0({}^\lambda X^k_{t_i}),(\xi_d({}^\lambda X^k_{t_i}))_{d=1,\dots,D}\big).
\end{aligned}
$$

Regular and exception sets:

The following events play a decisive role in the subsequent proofs. For $n\in\mathbb{N}$, $d=1,\dots,D$, $i=0,\dots,N-1$ and $\beta>0$ to be chosen later on we define:

$$
\begin{aligned}
A^{n,L}_{0,i} &= \Big\{\|\{\tilde\alpha^{n,L}_{0,i}-\alpha^{n,L}_{0,i}\}\cdot p_{0,i}\|^2_{i,L} < \Delta_i^{\beta}\Big\}, \\
A^{n,L}_{d,i} &= \Big\{\|\{\tilde\alpha^{n,L}_{d,i}-\alpha^{n,L}_{d,i}\}\cdot p_{d,i}\|^2_{i,L} < \Delta_i^{\beta}\Big\}, \\
A^{n,L,k}_{y,i} &= \Big\{\forall\,\xi\in[\mathcal{P}_{0,i}]_y - y^{n-1}_i:\ \|\xi\|_{i,\bar L_k} - \|\xi\|_{i,L} < \Delta_i^{\beta/2}\Big\}, \\
A^{n,L,k}_{z,i} &= \Big\{\forall\,\xi\in\big([\mathcal{P}_{1,i}]_z\times\cdots\times[\mathcal{P}_{D,i}]_z\big) - z^{n-1}_i:\ \|\xi\|_{i,\bar L_k} - \|\xi\|_{i,L} < \Delta_i^{\beta/2}\Big\},
\end{aligned}
$$

and for $n\in\mathbb{N}$, $i=1,\dots,N-1$,

$$
A^{n,L}_{z,i-1,i} = \bigg\{\Big(\frac{1}{L}\sum_{\lambda=1}^{L}\big|y^n_i({}^\lambda X^{i-1}_{t_i})-y^{\lambda,n,L,i-1}_i\big|^2\Big)^{1/2} - \Big(\frac{1}{L}\sum_{\lambda=1}^{L}\big|y^n_i({}^\lambda X^{i}_{t_i})-y^{\lambda,n,L,i}_i\big|^2\Big)^{1/2} < \Delta_i^{\beta+1}\bigg\}.
$$


After this series of definitions we now turn to the mathematics. The key tool for the convergence proof is the following observation:

Lemma 3.2.2. Let $k=0,\dots,N-1$, $\Gamma>0$ and $\lambda_i$, $i=0,\dots,N-1$, with $\lambda_0=1$ and $\lambda_{i+1}=(1+\Gamma\Delta_i)\lambda_i$ for

where the second equality is due to the conditional independence of ${}^\lambda X^i_{t_j}$ and ${}^\lambda X_{t_j}$, see e.g. Chow and Teicher [13], Corollary 7.3.2. Moreover, we obtain

$$y^n_i({}^\lambda X^k_{t_i}) - y^{\lambda,n,L,k}_i$$

Via Young's and Jensen's inequalities and because of the Lipschitz continuity of $f$ we obtain for any $\Gamma>0$:

E


Repeated application of Young's inequality and multiplication with $\lambda_i$ yields

$$\lambda_i E$$

It is somewhat inconvenient that the discretized norms are evaluated at the ghost samples; this will be overcome with the help of the exception sets. For the first inequality below we additionally use the following property of non-negative numbers $x,y$:

$$x^2 \le 2\big((x-y)_+\big)^2 + 2y^2,$$

which for $x\le y$ is immediate and for $x>y$ follows from Young's inequality applied to $x=(x-y)+y$.


We start the convergence proof with a result for the first part of the solution of the BSDE:

Proposition 3.2.3. For any $n=2,3,\dots$, $\Gamma,\beta>0$ and $\lambda_i$, $i=0,\dots,N$, with $\lambda_0=1$ and $\lambda_{i+1}=(1+\Gamma\Delta_i)\lambda_i$ for

Due to the equality

$$y^n_i({}^\lambda X_{t_i}) = E$$

and the minimality of the coefficient $\tilde\beta^{n,L}_{0,i}$ we obtain that its conditional expectation $E^L(\tilde\beta^{n,L}_{0,i})$ minimizes

$$\frac{1}{L}\sum_{\lambda=1}^{L}\big|y^n_i({}^\lambda X_{t_i}) - \alpha\cdot p^\lambda_{0,i}\big|^2.$$

The Pythagorean theorem then implies:

E


The second term of (3.6) can be estimated by the contraction property of projections:

$$3E$$

and the first term of (3.7) is an error term:

$$3E\Big[\|\{E^L(\tilde\alpha^{n,L}_{0,i})-\tilde\alpha^{n,L}_{0,i}\}\cdot p_{0,i}\|^2_{i,L}\Big] = 3T^{n,L}_{2,i}. \qquad (3.9)$$

The second term of (3.7) remains, which can again be tackled by the contraction property:

$$3E$$

Multiplying with $\lambda_i$ and passing to the maximum yields:

$$\max_{0\le i\le N-1}\lambda_i E\big[\cdots\big]$$

Lemma 3.2.2 then implies the assertion for $k=i$:

$$\max_{0\le i\le N-1}\lambda_i E$$


Next, we consider the corresponding error for the $Z$ part:

Proposition 3.2.4. For any $n=2,3,\dots$, $\Gamma,\beta,\mu>0$ and $\lambda_i$, $i=0,\dots,N$, with $\lambda_0=1$ and $\lambda_{i+1}=(1+\Gamma\Delta_i)\lambda_i$

since $z^{d,n}_i$ is bounded and $[\cdot]_z$ is Lipschitz continuous:

E

which is due to the conditionally identical and conditionally independent distribution of the involved random variables ${}^\lambda X_{t_j}$ and ${}^\lambda X^i_{t_j}$; because of the minimality of $\tilde\beta^{n,L}_{d,i}$, we know that $E^L(\tilde\beta^{n,L}_{d,i})$ minimizes

The Pythagorean theorem and the triangle inequality then yield:

E


The second term of (3.12) is estimated with $\beta>0$ and the contraction property as follows:

$$
\begin{aligned}
3E\Big[\|\{\tilde\alpha^{n,L}_{d,i}-\alpha^{n,L}_{d,i}\}\cdot p_{d,i}\|^2_{i,L}\Big]
&= 3E\Big[\|\{\tilde\alpha^{n,L}_{d,i}-\alpha^{n,L}_{d,i}\}\cdot p_{d,i}\|^2_{i,L}\,\mathbf{1}_{\{\|\{\tilde\alpha^{n,L}_{d,i}-\alpha^{n,L}_{d,i}\}\cdot p_{d,i}\|^2_{i,L}<\Delta_i^\beta\}}\Big] \\
&\quad + 3E\Big[\|\{\tilde\alpha^{n,L}_{d,i}-\alpha^{n,L}_{d,i}\}\cdot p_{d,i}\|^2_{i,L}\,\mathbf{1}_{\{\|\{\tilde\alpha^{n,L}_{d,i}-\alpha^{n,L}_{d,i}\}\cdot p_{d,i}\|^2_{i,L}\ge\Delta_i^\beta\}}\Big] \\
&\le 3\Delta_i^\beta + 3E\Bigg[\frac{1}{L}\sum_{\lambda=1}^{L}\bigg|\frac{[\Delta^\lambda W_{d,i}]^{w_i}}{\Delta_i}\Big\{\phi({}^\lambda X^i_{t_N})+\sum_{j=i+1}^{N-1}{}^\lambda f^{n-1,L,i}_j\Delta_j\Big\} \\
&\qquad\qquad - \frac{[\Delta^\lambda W_{d,i}]^{w_i}}{\Delta_i}\Big\{\phi({}^\lambda X_{t_N})+\sum_{j=i+1}^{N-1}{}^\lambda f^{n-1,L}_j\Delta_j\Big\}\bigg|^2\,\mathbf{1}_{\{\|\{\tilde\alpha^{n,L}_{d,i}-\alpha^{n,L}_{d,i}\}\cdot p_{d,i}\|^2_{i,L}\ge\Delta_i^\beta\}}\Bigg] \\
&\le 3\Delta_i^\beta + 12R^2(1+T)^2\,\frac{R_0^2}{\Delta_i}\,P\big([A^{n,L}_{d,i}]^c\big). \qquad (3.13)
\end{aligned}
$$

The first summand of (3.12) directly yields an error term:

$$3E\Big[\|\{E^L(\tilde\alpha^{n,L}_{d,i})-\tilde\alpha^{n,L}_{d,i}\}\cdot p_{d,i}\|^2_{i,L}\Big] = 3T^{n,L}_{4,d,i}, \qquad (3.14)$$

and we are left with the second summand of (3.11). The contraction property of projections yields:

$$
\begin{aligned}
& 3E\Big[\|\{E^L(\tilde\beta^{n,L}_{d,i})-E^L(\tilde\alpha^{n,L}_{d,i})\}\cdot p_{d,i}\|^2_{i,L}\Big] \\
&\quad \le 3E\Bigg[\frac{1}{L}\sum_{\lambda=1}^{L}\bigg|E^L\bigg[\frac{[\Delta^\lambda W_{d,i}]^{w_i}}{\Delta_i}\Big\{\phi({}^\lambda X^i_{t_N})+\sum_{j=i+1}^{N-1}{}^\lambda f^{n-1,i}_j\Delta_j-\phi({}^\lambda X^i_{t_N})-\sum_{j=i+1}^{N-1}{}^\lambda f^{n-1,L,i}_j\Delta_j\Big\}\bigg]\bigg|^2\Bigg].
\end{aligned}
$$

The above inequality can be transformed, with the help of the tower property of conditional expectations and the measurability of $\Delta^\lambda W_{d,i}$, to:

$$
\begin{aligned}
& 3E\Big[\|\{E^L(\tilde\beta^{n,L}_{d,i})-E^L(\tilde\alpha^{n,L}_{d,i})\}\cdot p_{d,i}\|^2_{i,L}\Big] \\
&\quad \le 3E\Bigg[\frac{1}{L}\sum_{\lambda=1}^{L}\bigg|E^L\bigg[\frac{[\Delta^\lambda W_{d,i}]^{w_i}}{\Delta_i}\Big\{E^{L,i}_{i+1}\Big[\sum_{j=i+1}^{N-1}\big({}^\lambda f^{n-1,i}_j-{}^\lambda f^{n-1,L,i}_j\big)\Delta_j\Big]\Big\}\bigg]\bigg|^2\Bigg]. \qquad (3.15)
\end{aligned}
$$

As $[\Delta^\lambda W_{d,i}]^{w_i}$ is independent of $\mathcal{F}^L$ and the truncation is symmetric around zero, we have $E^L\big[[\Delta^\lambda W_{d,i}]^{w_i}\,V\big]=0$ for any $\mathcal{F}^L$-measurable random variable $V$. We apply this property to term (3.15) and obtain:

$$3E$$

where the first inequality is due to the Cauchy-Schwarz inequality.

In term (3.16) we add similar terms, however containing the ghost samples starting at time $t_{i-1}$, such that summing them up yields a telescoping sum. Defining $y^n_0({}^\lambda X^{-1}_{t_0}) := 0 =: y^{\lambda,n,L,-1}_0$ simplifies the notation at this juncture. For term (3.17) we exploit the Cauchy-Schwarz inequality and get for $\mu>0$:

3


where in the last inequality we used the Lipschitz continuity of $f$, Young's inequality and the identity ${}^\lambda X^i_{t_i} = {}^\lambda X_{t_i}$.

For $i=0$, (3.19)-(3.20) is negative and we can skip this term. To keep the notation as easy as possible we thus define $A^{n,L}_{z,-1,0} = \emptyset$. Since for $i=1,\dots,N-1$ the second factor of (3.19)-(3.20) is nonnegative, we can estimate this expression further by:

E

Hence, adding (3.13), (3.14), (3.18), (3.21), (3.22) and (3.23), we obtain for the $D$-dimensional process $Z$

$$E$$

Multiplying with $\lambda_i\Delta_i$ and summing up from $0$ to $N-1$ finally yields:

N−1


The assertion is then implied by Lemma 3.2.2.

The whole error originating from the Monte Carlo simulation in the $n$-th Picard iteration can be estimated as follows:

Proof. Gathering the two preceding propositions yields for $n=2,3,\dots$

$$\max_{0\le i\le N-1}\lambda_i E$$


Proceeding exactly as before we get an upper bound for the error occurring in the first Picard iteration:

Lemma 3.2.6. Taking into account the notation of Proposition 3.2.5 it holds:

$$\max_{0\le i\le N-1}\lambda_i E\big[\cdots\big]$$

Proof. Proceeding in the same way as in Proposition 3.2.3, the boundedness and Lipschitz continuity of $[\cdot]_y$ yield for $i=0,\dots,N-1$

and because of the minimality of the coefficient $\tilde\beta^{1,L}_{0,i}$ we know that $E^L[\tilde\beta^{1,L}_{0,i}]$ minimizes the expression $\frac{1}{L}\sum_{\lambda=1}^{L}|y^1_i({}^\lambda X_{t_i})-\alpha\cdot p^\lambda_{0,i}|^2$. Thus, the Pythagorean theorem and the contraction property imply:


Analogously, we obtain for the second part of the solution:

$$E\big[\cdots\big]$$

and due to the minimality of the coefficient $\tilde\beta^{1,L}_{d,i}$ its conditional expectation $E^L[\tilde\beta^{1,L}_{d,i}]$ minimizes the expression $\frac{1}{L}\sum_{\lambda=1}^{L}|z^{d,1}_i({}^\lambda X_{t_i})-\alpha\cdot p^\lambda_{d,i}|^2$. Hence, the Pythagorean theorem and the contraction property imply:

As a consequence we can phrase:

Corollary 3.2.7. Taking into account the definitions of Proposition 3.2.5 we obtain for any $n\in\mathbb{N}$:

$$\max_{0\le i\le N-1} E\big[\cdots\big]$$

for a number of simulations $L$ exceeding all bounds. The next section is dedicated to this task.