
$$\hat Y^{n,\pi}_{t_i} = P_i\left[\phi(X^\pi_{t_N}) + \sum_{j=i}^{N-1} f\bigl(t_j, S^\pi_{t_j}, \hat Y^{n-1,\pi}_{t_j}, \hat Z^{n-1,\pi}_{t_j}\bigr)\,\Delta_j\right],$$

$$\hat Z^{n,\pi}_{d,t_i} = P_{d,i}\left[\frac{\Delta W_{d,i}}{\Delta_i}\left(\phi(X^\pi_{t_N}) + \sum_{j=i+1}^{N-1} f\bigl(t_j, S^\pi_{t_j}, \hat Y^{n-1,\pi}_{t_j}, \hat Z^{n-1,\pi}_{t_j}\bigr)\,\Delta_j\right)\right], \quad d = 1,\dots,D,$$

initialized again at $(\hat Y^{0,\pi}, \hat Z^{0,\pi}) = (0, 0)$.

At this stage the advantage of the forward approximation scheme becomes apparent: Theorem 11 of Bender and Denk [2] quantifies the moderate error incurred when approximating $(Y^{n,\pi}_{t_i}, Z^{n,\pi}_{t_i})$ with $(\hat Y^{n,\pi}_{t_i}, \hat Z^{n,\pi}_{t_i})$. In the forward scheme this error is bounded by a constant times the worst projection error occurring during the iterations. Consequently, it does not explode as the mesh size of the grid tends to zero, as is the case for the backward schemes. For more details, see the discussion in Bender and Denk [2], pp. 1802-1803.

In a final step Bender and Denk [2] replace the theoretical projections $P_{d,i}$ by simulation-based least-squares estimators and show in their Theorem 15 that this estimator converges $P$-almost surely to the approximation coming from the theoretical projection. Overall, they obtain convergence in probability of their final estimator towards the solution of the FBSDE.
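To make the last step concrete, here is a minimal Python sketch of how a conditional expectation is replaced by a simulation-based least-squares regression; the basis functions, sample distributions and payoff are illustrative assumptions, not the specific choices of Bender and Denk [2].

```python
import numpy as np

def ls_projection(x_samples, target_samples, basis):
    """Least-squares estimate of the projection of a random variable
    onto span{b_k(X)} from L Monte Carlo samples."""
    A = np.column_stack([b(x_samples) for b in basis])   # design matrix
    coeff, *_ = np.linalg.lstsq(A, target_samples, rcond=None)
    # The fitted coefficients define an estimator x -> sum_k coeff_k * b_k(x).
    return lambda x: np.column_stack([b(x) for b in basis]) @ coeff

# Toy usage: regress a terminal payoff phi(X_{t_N}) on the state at t_i.
rng = np.random.default_rng(0)
L = 10_000                              # number of Monte Carlo simulations
x_i = rng.normal(size=L)                # samples of the Markov process at t_i
x_N = x_i + rng.normal(size=L)          # samples at the terminal time t_N
target = np.maximum(x_N, 0.0)           # placeholder payoff phi(X_{t_N})
basis = [np.ones_like, lambda x: x, lambda x: x * x]   # polynomial basis
y_hat = ls_projection(x_i, target, basis)
print(y_hat(np.array([-1.0, 0.0, 1.0])))  # estimate of E[phi | X_{t_i} = x]
```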

1.2 Notation

1.2.1 Function spaces

As usual in the theory of BSDEs we deal with the following function spaces:

• $L^2(\mathcal F)$ - the space of $\mathcal F$-measurable random variables $X$ such that $E\bigl[|X|^2\bigr] < \infty$,

• $L^2_{\mathcal F}\bigl(\Omega, C([0,T]), \mathbb R^d\bigr)$ - the space of $(\mathcal F_t)_{t\ge 0}$-adapted $\mathbb R^d$-valued continuous processes $X$ such that $E\bigl[\sup_{t\in[0,T]} |X_t|^2\bigr] < \infty$,

• $L^2\bigl(0,T;\mathbb R^d\bigr)$ - the space of $(\mathcal F_t)_{t\ge 0}$-adapted $\mathbb R^d$-valued processes $X$ such that $E\bigl[\int_0^T |X_t|^2\,dt\bigr] < \infty$, and

• $\mathcal M[0,T] := L^2_{\mathcal F}(\Omega, C([0,T]), \mathbb R^n) \times L^2(0,T;\mathbb R^n)$, equipped with the norm
$$\|(Y(\cdot), Z(\cdot))\|_{\mathcal M[0,T]} := \left( E\left[\sup_{t\in[0,T]} |Y_t|^2\right] + E\left[\int_0^T |Z_t|^2\,dt\right] \right)^{1/2}.$$

Pardoux and Peng [41] showed that in $\mathcal M[0,T]$ there is a unique solution to BSDEs satisfying the above assumptions.
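As an illustration of the norm on $\mathcal M[0,T]$, the following sketch estimates $\|(Y(\cdot),Z(\cdot))\|_{\mathcal M[0,T]}$ by Monte Carlo from time-discretized paths; the processes used are placeholders chosen only so that the snippet runs.

```python
import numpy as np

def m_norm(Y, Z, dt):
    """Monte Carlo estimate of ||(Y, Z)||_{M[0,T]} from discretized paths.

    Y, Z: arrays of shape (L, N+1) holding L simulated paths on the grid.
    dt:   mesh sizes, shape (N,), with sum(dt) = T.
    """
    sup_term = np.mean(np.max(np.abs(Y), axis=1) ** 2)    # E[sup_t |Y_t|^2]
    int_term = np.mean((np.abs(Z[:, :-1]) ** 2) @ dt)     # E[int_0^T |Z_t|^2 dt]
    return np.sqrt(sup_term + int_term)

# Placeholder paths: Brownian motion for Y and a constant process for Z.
rng = np.random.default_rng(1)
L, N, T = 5_000, 50, 1.0
dt = np.full(N, T / N)
dW = rng.normal(scale=np.sqrt(T / N), size=(L, N))
Y = np.concatenate([np.zeros((L, 1)), np.cumsum(dW, axis=1)], axis=1)
Z = np.ones_like(Y)
print(m_norm(Y, Z, dt))
```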

1.2.2 Approximation of stochastic processes

To convey the intuition behind the notation for the different discretizations of the stochastic processes that occur, we introduce them here in full detail.

Time discretization

In any approximation scheme we will consider, the first stage of approximation is with respect to time. That is, we introduce a fixed partition $\pi : 0 = t_0 < \dots < t_N = T$ of the interval $[0,T]$ and compute approximations of the solution of the FBSDE at the partition points $t_i$, $i = 0,\dots,N$. For the solution of the backward part we furthermore use an iterative Picard-type approach and label these iterations with $n \in \mathbb N_0$. Hence, writing $(S^\pi_{t_i}, Y^{n,\pi}_{t_i}, Z^{n,\pi}_{t_i})$ indicates the time-discretized solution of the FBSDE at time $t_i$ and iteration $n$ given the partition $\pi$. Proceeding this way, we have to introduce discretizations of the increments of the Brownian motion with respect to $\pi$, denoted by $\Delta W_i = W_{t_{i+1}} - W_{t_i}$, i.e. we use forward increments.
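In code, this first stage amounts to fixing a grid and drawing the forward increments; a minimal sketch, where the equidistant partition is an assumption made only for the example:

```python
import numpy as np

rng = np.random.default_rng(2)
T, N, L = 1.0, 100, 10_000

# Fixed partition pi: 0 = t_0 < ... < t_N = T (equidistant for simplicity).
t = np.linspace(0.0, T, N + 1)
delta = np.diff(t)                       # mesh sizes Delta_i = t_{i+1} - t_i

# Forward Brownian increments Delta W_i = W_{t_{i+1}} - W_{t_i}.
dW = rng.normal(scale=np.sqrt(delta), size=(L, N))
# Paths on the grid: W[:, i] holds W_{t_i} for each simulation.
W = np.concatenate([np.zeros((L, 1)), np.cumsum(dW, axis=1)], axis=1)
print(t.shape, dW.shape, W.shape)
```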

Chapter 2 introduces a family of FBSDEs which is parameterized by a further stochastic process $h$, which is again chosen once and then fixed throughout all calculations. We thus write $(S^{h,\pi}_{t_i}, Y^{h,n,\pi}_{t_i}, Z^{h,n,\pi}_{t_i})$ for the time-discretized solution of the modified FBSDE at time $t_i$ and iteration $n$ given the partition $\pi$.

The choice $h \equiv 0$ thereby corresponds to the original discretization of Bender and Denk [2]. As our parametrization represents a change of measure, we also have to consider Brownian increments under a further measure and denote them by $\Delta W^h_i = W^h_{t_{i+1}} - W^h_{t_i}$ to distinguish them from the former.

In order to ease notation at a later stage we drop the superindices $h$ and $\pi$ for the time-discrete solution of the FBSDE, i.e. instead of $(S^{h,\pi}_{t_i}, Y^{h,n,\pi}_{t_i}, Z^{h,n,\pi}_{t_i})$ we simply write $(S_{t_i}, Y^n_{t_i}, Z^n_{t_i})$. We can justify this imprecision not only by the fewer indices but also because in the following steps we do not change the partition and the process $h$ but hold them fixed.

Another variant of the equation with $h \equiv 0$ is studied in Chapter 3. Here we focus on drivers $f$ which are bounded by some constant $R$. As a consequence, we will derive that, under mild manipulations of the scheme of Bender and Denk, our time-discrete approximations of the solution of the backward part are bounded. To remind the reader of this property we write $(S_{t_i}, Y^{n,R}_{t_i}, Z^{n,R}_{t_i})$, suppressing the dependence on the time partition $\pi$.

In any setting, there will be Borel functions such that the time-discrete approximations of the solution of the backward SDE can be written as functions of a forward Markov process, which contains as first components the (discrete) forward diffusion, while the other components depend on the shape of the terminal condition. We denote this process by $X_{t_i}$. It turns out that these deterministic functions only depend on the partition point and the number of the Picard iteration, so that in Chapter 2 we will write $Y^n_{t_i} = y^n_i(X_{t_i})$, $Z^n_{t_i} = z^n_i(X_{t_i})$. In Chapter 3 we hereby ignore the influence of the bound $R$ and also write $Y^{n,R}_{t_i} = y^n_i(X_{t_i})$, $Z^{n,R}_{t_i} = z^n_i(X_{t_i})$. We emphasize that these functions are not the same across chapters, but there is no danger of mixing them up, because within one chapter we only deal with one set of functions.

Projections on finite-dimensional spaces

In Chapter 2 conditional expectations are further replaced by orthogonal projections on finite-dimensional spaces. We indicate this step by a hat, i.e. we write $(\hat Y^n_{t_i}, \hat Z^n_{t_i})$ for the projection of the time-discretized solution of the modified BSDE at time $t_i$ and iteration $n$, given the partition $\pi$, on a fixed chosen finite-dimensional subspace.

Monte Carlo simulations

The final approximation step of our procedure in Chapter 2 replaces the orthogonal projections on finite-dimensional subspaces by an estimator coming from a simulation-based least-squares approach. For this purpose we need $L$ independent Monte Carlo simulations of the occurring forward processes. In full detail, we have to simulate, in analogy to Bender and Denk [2], the Brownian increments and the forward Markov process, and denote them in Chapter 2 by $\Delta{}^\lambda W^h_i$ and ${}^\lambda X_{t_i}$ respectively, for $\lambda = 1,\dots,L$ and $i = 0,\dots,N$. Thus it is natural to write $({}^\lambda\hat Y^n_{t_i}, {}^\lambda\hat Z^n_{t_i})$ for the resulting estimators of the discretized solution of the backward equation.
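In an implementation the index $\lambda$ simply becomes the leading axis of the simulation arrays; a sketch with a one-dimensional Euler step whose drift and volatility are placeholder assumptions:

```python
import numpy as np

rng = np.random.default_rng(3)
T, N, L = 1.0, 50, 10_000
delta = np.full(N, T / N)
t = np.concatenate([[0.0], np.cumsum(delta)])

def b(ti, s):      # placeholder drift
    return 0.05 * s

def sigma(ti, s):  # placeholder volatility
    return 0.2 * s

# The lambda-th simulation lives in row lambda: dW[lam, i] plays Delta^lambda W_i.
dW = rng.normal(scale=np.sqrt(delta), size=(L, N))
X = np.empty((L, N + 1))
X[:, 0] = 1.0                            # common starting value X_{t_0}
for i in range(N):                       # Euler step for the forward Markov process
    X[:, i + 1] = X[:, i] + b(t[i], X[:, i]) * delta[i] \
                  + sigma(t[i], X[:, i]) * dW[:, i]
```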

Our approach in Chapter 3 passes directly from the time discretization to a simulation-based least-squares procedure. For this purpose a whole set of further, only imaginary, simulations is required. For each time point $t_i$ in the partition we need extra simulations of the Brownian increments and the forward Markov process running until the end of the time horizon of the equation. These new processes are independent conditionally on the information up to $t_i$ and at the same time are generated identically to the already existing discrete-time processes. To be able to distinguish these sets of processes we mark the imaginary ones with bars, i.e. $\Delta{}^\lambda\bar W_j$ and ${}^\lambda\bar X^i_{t_j}$ denote these processes at time $t_j$. The additional superindex for the discrete Markov process indicates that the additional simulation starts at time $t_i$. We will comment on these so-called 'ghost samples' later on in more detail; a sketch of how they can be generated follows below.
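One possible way to generate such ghost samples: for a fixed $t_i$ we keep the original paths up to $t_i$, draw fresh increments beyond it, and re-run the forward recursion, so the new paths are conditionally independent given the information up to $t_i$ yet identically generated; the one-step update `u_pi` is a placeholder.

```python
import numpy as np

rng = np.random.default_rng(4)
T, N, L = 1.0, 20, 1_000
delta = np.full(N, T / N)
t = np.concatenate([[0.0], np.cumsum(delta)])

def u_pi(ti, x, dw):
    """Placeholder one-step update of the forward Markov process."""
    return x + 0.05 * x * (T / N) + 0.2 * x * dw

# Original simulations.
dW = rng.normal(scale=np.sqrt(delta), size=(L, N))
X = np.empty((L, N + 1)); X[:, 0] = 1.0
for i in range(N):
    X[:, i + 1] = u_pi(t[i], X[:, i], dW[:, i])

def ghost_paths(i):
    """Resimulate from t_i: keep X up to t_i, draw fresh increments beyond.

    Conditionally on the information up to t_i the result is independent of
    the original paths for j > i, yet generated by the same mechanism."""
    dW_bar = rng.normal(scale=np.sqrt(delta[i:]), size=(L, N - i))
    X_bar = np.empty((L, N - i + 1)); X_bar[:, 0] = X[:, i]  # start at t_i
    for j in range(N - i):
        X_bar[:, j + 1] = u_pi(t[i + j], X_bar[:, j], dW_bar[:, j])
    return dW_bar, X_bar

dW_bar, X_bar = ghost_paths(5)
```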

Further notation

A lot of further notation is used in the sequel (see also the index at the end of the thesis); however, it is not helpful to introduce it all here. We will do so at the appropriate places and turn now to a variance-reduced version of the algorithm of Bender and Denk [2].

Chapter 2

Importance sampling

The content of this chapter has already been published in Bender and Moseler [4]. We have only supplemented some comments and explanations to further clarify our procedure. The aim of this chapter is to introduce importance sampling for BSDEs. That is, we develop a variance reduction method for BSDEs via a change of measure, whose basic idea is borrowed from option pricing.

2.1 Modified forward scheme

We now explain the starting point for the algorithm developed later on. Consider the following family of decoupled FBSDEs, parameterized by some measurable, bounded and adapted process $h : [0,T] \longrightarrow \mathbb R^D$:

$$dS^h_t = \bigl(b(t,S^h_t) + \sigma(t,S^h_t)h_t\bigr)\,dt + \sigma(t,S^h_t)\,dW_t,$$
$$dY^h_t = \bigl(-f(t,S^h_t,Y^h_t,Z^h_t) + (Z^h_t)^\top h_t\bigr)\,dt + Z^h_t\,dW_t,$$
$$S^h_0 = s_0, \qquad Y^h_T = \phi(X^h_T),$$

where $\top$ denotes the transposition of a matrix. We denote by $(S,Y,Z) := (S^0,Y^0,Z^0)$ the solution of the original FBSDE with $h \equiv 0$.

The first observation is that the initial value of the backward part does not depend on $h$. In fact, defining a new measure $Q^h$ by $dQ^h = \Psi^h_T\,dP$, where

$$\Psi^h_t = \exp\left\{ -\int_0^t h_u^\top\,dW_u - \frac12 \int_0^t |h_u|^2\,du \right\},$$

we can apply Girsanov's theorem to deduce that the law of $(S^h,Y^h,Z^h)$ under $Q^h$ is the same as that of $(S,Y,Z)$ under $P$. In particular, the constants $(Y_0,Z_0)$ and $(Y^h_0,Z^h_0)$ coincide. We mention, however, that the paths of the processes $(S^h,Y^h,Z^h)$ and $(S,Y,Z)$ differ at later time points. Nonetheless, in many applications, e.g. in option pricing problems, one is mainly interested in estimating $Y_0$. Having the different representations for $Y_0$ at hand, we aim at reducing the variance of Monte Carlo estimators for $Y_0$ by a judicious choice of $h$. This turns out to generalize the importance sampling technique from the calculation of expectations to nonlinear BSDEs.
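The idea borrowed from option pricing is easy to reproduce numerically for a plain expectation: both estimators below are unbiased for $E[\max(W_T - K, 0)]$, but sampling with a constant drift $h$ and reweighting by the Girsanov density sharply reduces the variance for an out-of-the-money payoff. All constants are illustrative.

```python
import numpy as np

rng = np.random.default_rng(5)
L, T, K, h = 200_000, 1.0, 3.0, 3.0     # deep out-of-the-money threshold K

Z = rng.normal(size=L)

# Plain Monte Carlo under P: W_T ~ N(0, T).
W = np.sqrt(T) * Z
plain = np.maximum(W - K, 0.0)

# Importance sampling: simulate with drift h and reweight by the
# Radon-Nikodym density exp{-h W^h_T + h^2 T / 2} (Girsanov, constant h).
Wh = np.sqrt(T) * Z + h * T
weights = np.exp(-h * Wh + 0.5 * h**2 * T)
shifted = np.maximum(Wh - K, 0.0) * weights

print("plain  :", plain.mean(), "+/-", plain.std(ddof=1) / np.sqrt(L))
print("shifted:", shifted.mean(), "+/-", shifted.std(ddof=1) / np.sqrt(L))
```

Both standard errors shrink at rate $L^{-1/2}$, but the shifted estimator's constant is orders of magnitude smaller because the rare event is sampled frequently and the weights compensate.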

We now introduce the time-discretized analog of the Picard-type iteration scheme with importance sampling induced by some process $h$. As it is natural that the choice of $h$ will vary with the partition $\pi$, we assume from now on that the partition $\pi$ is fixed. At first we specify the class of processes which we will consider in the sequel.


A 6. The discretized process $h$ is given by

$$h_{t_i} = \tilde h(t_i, \Delta W_0, \dots, \Delta W_{i-1})$$

for some bounded deterministic function $\tilde h : \pi \times \mathbb R^D \times \dots \times \mathbb R^D \longrightarrow \mathbb R^D$. The bound of $h$ will be denoted by $C_h$.

The modified forward scheme is then given by

$$\Delta W^{h,\pi}_i = \Delta W_i + h_{t_i}\Delta_i, \quad i = 0,\dots,N-1, \qquad \Delta W^{h,\pi}_N = 0,$$

$$\Psi^{h,\pi,j}_{t_i} = \exp\left\{ -\sum_{k=j}^{i-1} h_{t_k}^\top \Delta W_k - \frac12 \sum_{k=j}^{i-1} |h_{t_k}|^2 \Delta_k \right\}, \quad j = 0,\dots,N-1,\ i = j,\dots,N,$$

$$X^{h,\pi}_{t_0} = X_0, \qquad X^{h,\pi}_{t_i} = u^\pi\bigl(t_i, X^{h,\pi}_{t_{i-1}}, \Delta W^{h,\pi}_{i-1}\bigr), \quad i = 1,\dots,N,$$

and, for $i = 0,\dots,N$, $d = 1,\dots,D$,

$$Y^{h,n,\pi}_{t_i} = E\left[ \Psi^{h,\pi,i}_{t_N}\phi(X^{h,\pi}_{t_N}) + \sum_{j=i}^{N-1} \Psi^{h,\pi,i}_{t_j} f\bigl(t_j, S^{h,\pi}_{t_j}, Y^{h,n-1,\pi}_{t_j}, Z^{h,n-1,\pi}_{t_j}\bigr)\Delta_j \,\middle|\, \mathcal F_{t_i} \right], \tag{2.1}$$

$$Z^{h,n,\pi}_{d,t_i} = E\left[ \frac{\Delta W^{h,\pi}_{d,i}}{\Delta_i} \left( \Psi^{h,\pi,i}_{t_N}\phi(X^{h,\pi}_{t_N}) + \sum_{j=i+1}^{N-1} \Psi^{h,\pi,i}_{t_j} f\bigl(t_j, S^{h,\pi}_{t_j}, Y^{h,n-1,\pi}_{t_j}, Z^{h,n-1,\pi}_{t_j}\bigr)\Delta_j \right) \middle|\, \mathcal F_{t_i} \right], \tag{2.2}$$

initialized at $(Y^{h,0,\pi}_{t_i}, Z^{h,0,\pi}_{t_i}) = (0, 0)$. For the special case $h \equiv 0$, we are just back in the forward scheme discussed by Bender and Denk [2]. Note that, by construction, the first $M$ components of $X^{h,\pi}_{t_i}$ coincide with $S^{h,\pi}_{t_i}$ defined via the Euler-Maruyama scheme

$$S^{h,\pi}_{t_0} = s_0, \qquad S^{h,\pi}_{t_{i+1}} = S^{h,\pi}_{t_i} + \bigl(b(t_i, S^{h,\pi}_{t_i}) + \sigma(t_i, S^{h,\pi}_{t_i})h_{t_i}\bigr)\Delta_i + \sigma(t_i, S^{h,\pi}_{t_i})\Delta W_i, \quad i = 0,\dots,N-1.$$
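A compressed numerical sketch of the scheme's ingredients follows, with the first Picard step of (2.1) evaluated at $t_0$, where $\mathcal F_{t_0}$ is trivial and the conditional expectation reduces to a plain mean. Drift, volatility, driver and terminal condition are placeholder assumptions, and for $i > 0$ the conditional expectations would be computed by the regressions justified in Theorem 2.1.1 below.

```python
import numpy as np

rng = np.random.default_rng(6)
T, N, L = 1.0, 20, 50_000
delta = np.full(N, T / N)
t = np.concatenate([[0.0], np.cumsum(delta)])
h = 0.5                                  # a constant h trivially satisfies A 6

def f(ti, s, y, z):                      # placeholder driver
    return -0.05 * y

def phi(s):                              # placeholder terminal condition
    return np.maximum(s - 1.0, 0.0)

# Increments under P and the shifted increments Delta W^{h,pi}_i.
dW = rng.normal(scale=np.sqrt(delta), size=(L, N))
dWh = dW + h * delta                     # would enter the Z-part (2.2)

# Euler-Maruyama scheme for the forward part (here X = S, one-dimensional).
S = np.empty((L, N + 1))
S[:, 0] = 1.0
for i in range(N):
    S[:, i + 1] = S[:, i] + (0.05 * S[:, i] + 0.2 * S[:, i] * h) * delta[i] \
                  + 0.2 * S[:, i] * dW[:, i]

# Weights Psi^{h,pi,0}_{t_j}, j = 0, ..., N.
log_psi = np.concatenate([np.zeros((L, 1)),
                          np.cumsum(-h * dW - 0.5 * h**2 * delta, axis=1)], axis=1)
Psi0 = np.exp(log_psi)

# First Picard step of (2.1) at i = 0 with (Y^{h,0,pi}, Z^{h,0,pi}) = (0, 0).
inner = Psi0[:, N] * phi(S[:, N]) + sum(
    Psi0[:, j] * f(t[j], S[:, j], 0.0, 0.0) * delta[j] for j in range(N))
print("Y^{h,1,pi}_{t_0} estimate:", inner.mean())
```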

Defining a new measure $Q^{h,\pi}$ by $dQ^{h,\pi} = \Psi^{h,\pi,0}_{t_N}\,dP$, the Girsanov theorem implies that the process

$$W^{h,\pi}_t = W_t + \sum_{j=0}^{N-1} h_{t_j}(t_{j+1}\wedge t - t_j\wedge t)$$

is a Brownian motion under $Q^{h,\pi}$. Consequently, the $\Delta W^{h,\pi}_i$ are Brownian increments under this measure. This implies that $(X^{h,\pi}, \mathcal F_{t_i})$ is a Markov process under $Q^{h,\pi}$ and that the transition probabilities of $X^{h,\pi}$ under $Q^{h,\pi}$ are the same as those of $X^\pi$ under $P$.
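This statement can be checked numerically: reweighting by the density $\Psi^{h,\pi,0}_{t_N}$ should turn the shifted increments into centered Gaussian increments with variance $\Delta_i$. A minimal sketch for a constant $h$:

```python
import numpy as np

rng = np.random.default_rng(7)
T, N, L, h = 1.0, 10, 200_000, 0.8
delta = np.full(N, T / N)

dW = rng.normal(scale=np.sqrt(delta), size=(L, N))
dWh = dW + h * delta                     # shifted increments Delta W^{h,pi}

# Density dQ^{h,pi}/dP = Psi^{h,pi,0}_{t_N}.
Psi = np.exp(np.sum(-h * dW - 0.5 * h**2 * delta, axis=1))

# Under Q^{h,pi}, i.e. weighting by Psi, the shifted increments are centered:
print("E_Q[dW^h_0]    :", np.mean(Psi * dWh[:, 0]))      # approx 0
print("E_Q[(dW^h_0)^2]:", np.mean(Psi * dWh[:, 0]**2))   # approx Delta_0 = 0.1
print("E_P[dW^h_0]    :", dWh[:, 0].mean())              # approx h*Delta_0 = 0.08
```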

The following theorem shows that, in this Markovian setting, the conditional expectations in the above iteration scheme actually simplify to regressions on $X^{h,\pi}_{t_i}$. On the one hand this is crucial for the Monte Carlo algorithm described in the next section; on the other hand it also allows us to derive some convergence results for the modified scheme in an elegant way.

Theorem 2.1.1. Under the standing assumptions there are deterministic functions $y^{n,\pi}_i$ and $z^{n,\pi}_i$, not depending on $h$, such that

$$Y^{h,n,\pi}_{t_i} = y^{n,\pi}_i(X^{h,\pi}_{t_i}), \qquad Z^{h,n,\pi}_{t_i} = z^{n,\pi}_i(X^{h,\pi}_{t_i}).$$


Proof. We proceed by a double induction, working forward in the Picard iterations and backward in time.

The claim is true for $n = 0$, $i = 0,\dots,N$, since by definition $Y^{h,0,\pi}_{t_i} = 0 = Z^{h,0,\pi}_{d,t_i}$.

[...] where we first use the martingale property of $\Psi^{h,\pi,i}_{t_j}$, the fifth equality is due to the induction hypothesis, and the sixth one holds because $(X^{h,\pi}_{t_i}, \mathcal F_{t_i})$ is Markovian under the measure $Q^{h,\pi}$. Finally, the function [...]


Since the regression functions do not depend on the choice of $h$ and $X^{h,\pi}_{t_0} = X_0$, we can conclude that the error made by approximating $(Y_0, Z_0)$ with $(Y^{h,n,\pi}_{t_0}, Z^{h,n,\pi}_{t_0})$ is independent of $h$. Hence, we can simply choose $h \equiv 0$, for which case the error estimate was already derived in Theorem 1.1.1.

Corollary 2.1.2. There are constants $C$ and $\tilde C$ (independent of $h$) such that for all $h$

$$|Y^{h,n,\pi}_{t_0} - Y_0|^2 + |Z^{h,n,\pi}_{t_0} - Z_0|^2 \le C\,E\bigl[|X_T - X^\pi_{t_N}|^2\bigr] + C|\pi| + C\left(\frac12 + \tilde C|\pi|\right)^n,$$

where $\tilde C$ is the same constant as in Theorem 1.1.1.

Remark 2.1.3. Another way to prove this result is to rewrite the iteration scheme under the new measure $Q^{h,\pi}$. Since $(S^{h,\pi}, Y^{h,n,\pi}, Z^{h,n,\pi})$ has the same law under the new measure as $(S^\pi, Y^{n,\pi}, Z^{n,\pi})$ has under $P$, we can derive the above error estimate.

So far we have analyzed the approximation error which is due to the choice of $h$ and the partition $\pi$. From now on, we assume that these objects are fixed and concentrate on other features of the scheme.

Hence, in the following we drop the superindices $h$ and $\pi$, writing for the discretized solution of the modified FBSDE at time $t_j$ no longer $(S^{h,\pi}_{t_j}, Y^{h,n,\pi}_{t_j}, Z^{h,n,\pi}_{t_j})$ but $(S_{t_j}, Y^n_{t_j}, Z^n_{t_j})$. Accordingly, we make use of the notations $\Psi^i_{t_j}$ and $X_{t_j}$ instead of $\Psi^{h,\pi,i}_{t_j}$ and $X^{h,\pi}_{t_j}$, respectively. As a last simplification we introduce $\Delta W^h_i$ as a substitute for $\Delta W^{h,\pi}_i$.

We now add a further assumption for the remainder of this chapter, which guarantees that $\Psi^0_{t_i} Y^n_{t_i}$ and $\Psi^0_{t_i} Z^n_{t_i}$ are square-integrable under $P$. This assumption turns out to be essential in order to avoid infinite variances within the Monte Carlo implementation.

A 7. For $i = 0,\dots,N-1$,

$$E\left[\left( \Psi^0_{t_N}\phi(X_{t_N}) + \sum_{j=i}^{N-1} \Psi^0_{t_j} f(t_j, S_{t_j}, 0, 0)\,\Delta_j \right)^2\right] < \infty. \tag{2.3}$$

Remark 2.1.4. If the driver $f$ does not depend on $S_t$ and $\phi(X_{t_N}) \in L^{2+\varepsilon}(\mathcal F)$ for some $\varepsilon > 0$, the above condition is satisfied, since $\Psi^0_{t_j} \in L^p(\mathcal F)$ for any $p > 0$ and $j = 0,\dots,N$. This is seen by

$$E\bigl[|\Psi^0_{t_i}|^p\bigr] = E\left[\exp\left\{ -\sum_{j=0}^{i-1} p\,h_{t_j}^\top \Delta W_j - \frac12 \sum_{j=0}^{i-1} p^2 |h_{t_j}|^2 \Delta_j + \frac12(p^2 - p)\sum_{j=0}^{i-1} |h_{t_j}|^2 \Delta_j \right\}\right]$$

$$\le \exp\left\{\frac12(p^2 - p)C_h^2 T\right\} E\left[\exp\left\{ -\sum_{j=0}^{i-1} p\,h_{t_j}^\top \Delta W_j - \frac12 \sum_{j=0}^{i-1} |p\,h_{t_j}|^2 \Delta_j \right\}\right] = \exp\left\{\frac12(p^2 - p)C_h^2 T\right\},$$

and Hölder's inequality finally implies condition (2.3).
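For a constant $h$ the computation above even yields equality, since then $\sum_j |h_{t_j}|^2\Delta_j = C_h^2 T$ exactly, which allows a quick numerical sanity check; a sketch under that assumption:

```python
import numpy as np

rng = np.random.default_rng(8)
T, N, L, h, p = 1.0, 20, 500_000, 0.7, 3.0
delta = np.full(N, T / N)

dW = rng.normal(scale=np.sqrt(delta), size=(L, N))
# Psi^0_{t_N} = exp{-sum_j h dW_j - 1/2 sum_j h^2 Delta_j} for constant h.
Psi0 = np.exp(np.sum(-h * dW - 0.5 * h**2 * delta, axis=1))

mc = np.mean(Psi0 ** p)                        # Monte Carlo estimate of E[|Psi^0|^p]
bound = np.exp(0.5 * (p**2 - p) * h**2 * T)    # exp{(p^2 - p) C_h^2 T / 2}
print(mc, bound)                               # equality holds for constant h
```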

For the first level of the Picard-iteration the square-integrability is now straightforward:

Lemma 2.1.5. It holds that $(\Psi^0_{t_i} Y^1_{t_i}, \Psi^0_{t_i} Z^1_{t_i}) \in L^2(\mathcal F)$ for every $i = 0,\dots,N$.

Proof. Since $\Psi^i_{t_j} = \Psi^0_{t_j}/\Psi^0_{t_i}$ and $\Psi^0_{t_i}$ is $\mathcal F_{t_i}$-measurable, we obtain for $i = 0,\dots,N$:

$$\Psi^0_{t_i} Y^n_{t_i} = E\left[ \Psi^0_{t_N}\phi(X_{t_N}) + \sum_{j=i}^{N-1} \Psi^0_{t_j} f\bigl(t_j, S_{t_j}, Y^{n-1}_{t_j}, Z^{n-1}_{t_j}\bigr)\Delta_j \,\middle|\, \mathcal F_{t_i} \right], \tag{2.4}$$

$$\Psi^0_{t_i} Z^n_{d,t_i} = E\left[ \frac{\Delta W^h_{d,i}}{\Delta_i} \left( \Psi^0_{t_N}\phi(X_{t_N}) + \sum_{j=i+1}^{N-1} \Psi^0_{t_j} f\bigl(t_j, S_{t_j}, Y^{n-1}_{t_j}, Z^{n-1}_{t_j}\bigr)\Delta_j \right) \middle|\, \mathcal F_{t_i} \right]. \tag{2.5}$$


Consequently, for $n = 1$, [...] and by Hölder's inequality [...]

In order to derive the analogous result for $n > 1$ we now state some a priori estimates generalizing Lemma 7 in Bender and Denk [2].

Lemma 2.1.6. Suppose $\Gamma$ and $\gamma$ are positive real numbers, $y^\iota, z^\iota$, $\iota = 1, 2$, are adapted processes and [...]

[...] is an $\{\mathcal F_{t_j}\}$-martingale, we can rewrite, noting (2.4) and (2.5), $\Psi^0_{t_i} Y^{(\iota)}_{t_i} = E[\,\cdots\,]$ [...]


In a first step we show that [...]

Secondly, we shall derive [...]

To this end we estimate [...]

Multiplying with $\lambda_i$, an iterated application of the above inequality yields (2.7).

Finally, we put (2.6)-(2.7) together and, noting that $Z$ is $D$-dimensional, we deduce [...]

With this result at hand we can conclude:

Corollary 2.1.7. For every $i = 0,\dots,N$ and $n \in \mathbb N$ we have $(\Psi^0_{t_i} Y^n_{t_i}, \Psi^0_{t_i} Z^n_{t_i}) \in L^2(\mathcal F)$.

Proof. Considering $(Y^n, Z^n)$ and $(Y^{n-1}, Z^{n-1})$ we are in the situation of Lemma 2.1.6 with $y^{(1)} = Y^{n-1}$, $y^{(2)} = Y^{n-2}$, $z^{(1)} = Z^{n-1}$ and $z^{(2)} = Z^{n-2}$. Hence, choosing $\gamma = 8DK^2(T+1)$ and $\Gamma = 4K^2(T+1)(2D(\gamma + $ [...]