
A Picard-type Iteration for Backward Stochastic Differential Equations:

Convergence and Importance Sampling

Dissertation zur Erlangung des akademischen Grades Doktor der Naturwissenschaften

am Fachbereich Mathematik und Statistik der Universität Konstanz

vorgelegt von Thilo Moseler

Tag der mündlichen Prüfung: 10.06.2010

Referenten: Prof. Dr. Robert Denk (Universität Konstanz), Prof. Dr. Christian Bender (Universität des Saarlandes, Saarbrücken)

Konstanzer Online-Publikations-System (KOPS) URN: http://nbn-resolving.de/urn:nbn:de:bsz:352-opus-121057

URL: http://kops.ub.uni-konstanz.de/volltexte/2010/12105/


Acknowledgements

I would like to express my gratitude to several people who helped me during the work on this thesis.

First of all, I would like to thank my supervisor Professor Dr. Robert Denk, who gave me the opportunity to study a challenging topic and always supported me in every conceivable way, including many mathematical suggestions.

In the same breath, I want to mention Professor Dr. Christian Bender to whom I am greatly indebted for his continuous support and great hospitality. I enjoyed four very intensive stays in Braunschweig and Saarbrücken, where I had many fruitful discussions with him and his whole research group. Moreover, large parts of this thesis developed during these days.

Moreover, I am very grateful to my fellow students and friends Mario Kaip, Michael Pokojovy and Olaf Weinmann and the other members of the PDE research group. I appreciated the pleasant atmosphere and the permanent cooperativeness. In particular, I would like to highlight their great willingness to discuss any problem despite the fact that our research topics were miles away from each other.

Financial support by the DFG (German Research Foundation) via the research unit 518 ‘Price, Liquidity and Credit Risks: Measurement and Distribution’ is gratefully acknowledged.

Last but not least, I would like to thank my parents Christa and Walter Moseler, my sister Anke, and all my friends for their patience and their constant encouragement during my whole studies.

Konstanz, April 2010 Thilo Moseler



Introduction

The objects of investigation in this thesis are backward stochastic differential equations, BSDEs for short.

More precisely, we aim at numerically solving decoupled forward-backward stochastic differential equations (FBSDEs) driven by a Brownian motion W, of the form

\[
dS_t = b(t, S_t)\, dt + \sigma(t, S_t)\, dW_t, \qquad S_0 = s_0,
\]
\[
dY_t = -f(t, S_t, Y_t, Z_t)\, dt + Z_t\, dW_t, \qquad Y_T = \Phi(S).
\]

The origin of such stochastic equations with terminal condition can be found in Bismut [7] in the early 1970s, where optimal control problems are considered. However, it took until 1990 for Pardoux and Peng [41] to publish their result on existence and uniqueness for a broad class of, in general nonlinear, BSDEs.

Afterwards, a widespread development of the theory of such equations started, driven mainly by the numerous applications in mathematical finance; see the books of El Karoui [17], Ma and Yong [37], and Yong and Zhou [44], as well as the survey article of El Karoui, Peng and Quenez [18].

At first, the numerics of BSDEs could not keep up with the speed of the development of the theory and accelerated only in recent years. The starting point for numerical schemes for FBSDEs was the theoretical Four Step Scheme of Ma, Protter and Yong [35], from which Douglas, Ma and Protter [16] developed an algorithm in 1996 approximating the solution of a parabolic partial differential equation related to the BSDE.

A totally different approach was followed later on by Bally [1] and Chevance [12]. They tried to solve the equation directly with the help of stochastic techniques, using random time partitions under strong regularity assumptions. Unfortunately, their algorithms are hardly implementable. In 2002, Ma et al. [36] suggested a similar approach, in which the Brownian motion in the equation is replaced by a binary random walk.

The trigger for the research on the numerics of FBSDEs was the work of Zhang [45, 46], which established new results about the regularity of the second part Z of the solution without involving the derivatives of the coefficient functions of the BSDE. This allowed for a convergence proof under rather weak assumptions with a deterministic time partition.

In recent years several algorithms were introduced based on these tools. They can be distinguished into categories along different criteria. The first characteristic is the time direction. The algorithms of Bouchard and Touzi [9], Gobet et al. [21, 22]¹ and Zhang [45, 46] work backward in time and are based on a procedure analogous to the Euler-Maruyama scheme for forward SDEs. We therefore talk of Euler-type schemes, whose characteristic is a nesting of conditional expectations backward in time.

Only last year, Zhao, Wang and Peng [47] proposed a θ-scheme for BSDEs, which transfers the ideas of the θ-scheme for forward SDEs and improves the error estimates for Z under rather restrictive assumptions.

In contrast to these algorithms, the Picard-type schemes of Bender and Denk [2] and Labart [31], Chapter III, do not reverse time and thereby avoid nested conditional expectations. However, they have to put up with nestings of Picard iterations, as used in the existence proof of Pardoux and Peng [41].

A second categorization can be made via the type of estimator used to approximate conditional expectations. While Zhao et al. [47] employ a Gauss-Hermite quadrature rule, most fully implementable algorithms apply different Monte Carlo techniques. These in turn are based either on Malliavin calculus, as in the scheme of Bouchard and Touzi [9], on nonparametric regression, as in that of Labart [31], or, most popular, on least-squares Monte Carlo; see Bender and Denk [2], Bender and Zhang [5] and Gobet et al. [21, 22].

¹For [22] we should rather write Lemor et al., since this is the original author order. However, to keep things simpler and more standardized, we use the other notation; we do not want to downweight the contribution of J. Lemor in this way.

The outcomes of these Monte Carlo algorithms, once implemented, are discrete time stochastic processes or, considered at a specific point of the time grid, random variables. Hence, starting the algorithm with different seeds for the simulations, we end up with different realizations of these random variables.

If one focuses on the applications of FBSDEs in mathematical finance, and even more specifically on option pricing, a particularly high empirical variance of the estimators arises especially for out-of-the-money options, or more generally for options containing some rare event feature. In this kind of application, S typically represents the price processes of some underlyings, Y is the price process of the option, Φ is the payoff function, and Z is, in simple cases, a linear transformation of the hedging portfolio.

From the point of view of a practitioner, who is interested in the initial option price Y_0, this variability is clearly annoying, and one wants to reduce it. Using such a Monte Carlo algorithm, one faces an estimator of the form

\[
\widehat{Y}_0 = \frac{1}{L} \sum_{\lambda=1}^{L} \theta_\lambda,
\]

where, to keep things simple, we assume for the moment that θ_λ, λ = 1, ..., L, are independent and identically distributed random variables. Hence,

\[
\mathrm{Var}[\widehat{Y}_0] = \frac{\mathrm{Var}[\theta_1]}{L},
\]

so that one possibility to obtain estimators with lower variance is to increase the number of simulations L. However, this also increases computation time and is therefore not attractive in practice. Instead, a reduction of the numerator also leads to a more stable estimator and is in many cases less costly than a higher number of simulations.

Now, this second possibility is the basic idea of variance reduction methods, which were already applied in special cases in the numerics of BSDEs; see Bender and Denk [2] and Labart [31]. Both schemes use a so-called control variate method to stabilize the estimators.
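The 1/L decay of the estimator variance can be checked empirically. In the following numpy sketch (purely illustrative, not from the thesis), the θ_λ are replaced by standard normal stand-ins for the simulated payoffs:

```python
import numpy as np

def estimator_std(L, reps=2000, seed=0):
    """Empirical standard deviation of the Monte Carlo mean
    (1/L) * sum(theta_1, ..., theta_L) over `reps` independent repetitions,
    with i.i.d. standard normal theta (stand-ins for simulated payoffs)."""
    rng = np.random.default_rng(seed)
    theta = rng.standard_normal((reps, L))
    return theta.mean(axis=1).std()
```

Since Var[θ_1] = 1 here, the empirical standard deviation is close to 1/sqrt(L): quadrupling L only halves it, which is why reducing Var[θ_1] itself is usually the cheaper route.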

The technique we want to apply to BSDEs is the importance sampling approach, which originates from the classical linear option pricing problem. In that context it turns out to be highly efficient for some path-dependent options; see e.g. Glasserman [20]. One popular way to calculate option prices is to simulate paths of the underlying and then average the corresponding discounted payoffs under an equivalent martingale measure Q, i.e. one tries to approximate

\[
E_Q\left[ \Phi(S) B_T^{-1} \right],
\]

where E_Q denotes the expectation under the measure Q and B_t is the price of the risk-free asset. However, if the option under investigation involves a rare event feature, one often ends up with only a few non-zero payoffs, and the Monte Carlo estimator suffers from high empirical variance. The basic idea of importance sampling is to drive more simulated paths of the underlyings into 'interesting' or 'important' regions, e.g. into the money. In doing so, the number of zero payoffs is reduced, and we therefore obtain a more stable estimator.

Mathematically speaking, this drift change is nothing but a change of measure. Hence, adjusting the payoff Φ(S^h), where S^h denotes the price process of the underlying under a new measure Q^h, and the numéraire B^h_t under Q^h by multiplying their product with the stochastic exponential Ψ = dQ^h/dQ yields a Monte Carlo estimator for the initial option price based on a random variable with the same mean:

\[
E_{Q^h}\left[ \Phi(S^h)(B_T^h)^{-1} \right] = E_Q\left[ \Psi\, \Phi(S^h)(B_T^h)^{-1} \right].
\]


At the same time we hope that its variance

\[
\mathrm{Var}_{Q^h}\left[ \Phi(S^h)(B_T^h)^{-1} \right]
= E_Q\left[ \Psi \left( \Phi(S^h)(B_T^h)^{-1} \right)^2 \right]
- E_Q\left[ \Psi\, \Phi(S^h)(B_T^h)^{-1} \right]^2
\]

is smaller than the variance of its original counterpart

\[
\mathrm{Var}_{Q}\left[ \Phi(S)B_T^{-1} \right]
= E_Q\left[ \left( \Phi(S)B_T^{-1} \right)^2 \right]
- E_Q\left[ \Phi(S)B_T^{-1} \right]^2.
\]

This is the delicate feature of this variance reduction technique: choosing the wrong measure, one can be faced with variance blow-ups; therefore the selection of a different drift has to be made very carefully.

The vast existing literature in the context of option pricing in different models reflects the complexity of importance sampling and can be categorized as follows. On the one hand, the optimal selection of a new measure is examined in continuous time; see e.g. the articles of Newton [39], Milstein and Schoenmakers [38] or Guasoni and Robertson [25], who try to find general rules for optimality. On the other hand, authors develop specific methods for special settings in discrete time; see e.g. the articles of Boyle et al. [10], Glasserman et al. [19] or Ökten et al. [40].

The aim of this thesis is twofold. First, we want to introduce importance sampling for BSDEs. This is done in the context of the forward-in-time scheme of Bender and Denk [2]. However, we think that our technique is not limited to this special algorithm and, in principle, can be used in any least-squares Monte Carlo approach for BSDEs. The second concern is to establish an L²-convergence theorem for the original Picard-type algorithm in order to complete the publication of Bender and Denk [2].

The organization of this thesis is as follows: In Chapter 1 we start with the framework, assumptions and definitions which hold throughout this publication. Furthermore, we briefly review the results of Bender and Denk [2], which are generalized later on, and comment on the rather extensive notation used in the sequel.

The second chapter introduces importance sampling to BSDEs and is in large parts already published in Bender and Moseler [4]. More precisely, by a change of measure we parameterize the forward scheme of Bender and Denk [2] to obtain a family of time discretizations for the initial value (Y_0, Z_0) of the solution of the BSDE. That is, for some fixed process h with suitable properties and a time grid π : 0 = t_0 < ... < t_N = T we define discrete time stochastic processes (S^{h,π}, Ψ^{h,π,j}, Y^{h,n,π}, Z^{h,n,π}) by

\[
S^{h,\pi}_{t_{i+1}} = S^{h,\pi}_{t_i}
+ \Bigl( b(t_i, S^{h,\pi}_{t_i}) + \sigma(t_i, S^{h,\pi}_{t_i})\, h_{t_i} \Bigr)(t_{i+1}-t_i)
+ \sigma(t_i, S^{h,\pi}_{t_i})(W_{t_{i+1}}-W_{t_i}),
\]
\[
\Psi^{h,\pi,j}_{t_i} = \exp\Bigl\{ -\sum_{k=j}^{i-1} h_{t_k}^{\top}(W_{t_{k+1}}-W_{t_k})
- \frac{1}{2} \sum_{k=j}^{i-1} |h_{t_k}|^2 (t_{k+1}-t_k) \Bigr\},
\]

and recursively

\[
Y^{h,n,\pi}_{t_i} = E\Bigl[ \Psi^{h,\pi,i}_{t_N}\, \phi(X^{h,\pi}_{t_N})
+ \sum_{j=i}^{N-1} \Psi^{h,\pi,i}_{t_j}\, f\bigl(t_j, S^{h,\pi}_{t_j}, Y^{h,n-1,\pi}_{t_j}, Z^{h,n-1,\pi}_{t_j}\bigr)(t_{j+1}-t_j)
\Bigm| \mathcal{F}_{t_i} \Bigr],
\]
\[
Z^{h,n,\pi}_{t_i} = E\Bigl[ \Bigl( \frac{W_{t_{i+1}}-W_{t_i}}{t_{i+1}-t_i} + h_{t_i} \Bigr)
\Bigl( \Psi^{h,\pi,i}_{t_N}\, \phi(X^{h,\pi}_{t_N})
+ \sum_{j=i+1}^{N-1} \Psi^{h,\pi,i}_{t_j}\, f\bigl(t_j, S^{h,\pi}_{t_j}, Y^{h,n-1,\pi}_{t_j}, Z^{h,n-1,\pi}_{t_j}\bigr)(t_{j+1}-t_j) \Bigr)
\Bigm| \mathcal{F}_{t_i} \Bigr],
\]

starting with (Y^{h,0,π}, Z^{h,0,π}) = (0, 0), where Φ(S) = φ(X_T) for some Markov process (X_t, F_t) which is related to the forward diffusion and X^{h,π}_{t_N} is some approximation of X_T.
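The scheme above can be sketched in code for a strongly simplified special case: one-dimensional, driver f = f(t, s) independent of (Y, Z), so that a single Picard step suffices, a constant drift parameter h, and only Y_0, where the conditional expectation reduces to a plain mean. All names and coefficients below are illustrative, not from the thesis:

```python
import numpy as np

def weighted_forward_y0(b, sigma, f, phi, s0, h, T=1.0, N=50, L=100_000, seed=0):
    """Sketch of the drift-changed forward scheme for Y_0 in the special case
    where the driver f = f(t, s) does not depend on (Y, Z).  One-dimensional,
    equidistant grid; h is a constant drift parameter.  Returns the Monte
    Carlo estimate of Y_0 = E[Psi_T phi(S_T) + sum_j Psi_{t_j} f(t_j, S_{t_j}) dt]."""
    rng = np.random.default_rng(seed)
    dt = T / N
    s = np.full(L, float(s0))
    log_psi = np.zeros(L)           # running log of the weight Psi^{h,pi,0}
    acc = np.zeros(L)               # accumulated weighted driver terms
    for i in range(N):
        t = i * dt
        acc = acc + np.exp(log_psi) * f(t, s) * dt
        dw = rng.standard_normal(L) * np.sqrt(dt)
        # Euler step with the additional drift sigma * h ...
        s = s + (b(t, s) + sigma(t, s) * h) * dt + sigma(t, s) * dw
        # ... compensated by the discrete stochastic exponential
        log_psi = log_psi - h * dw - 0.5 * h**2 * dt
    return np.mean(np.exp(log_psi) * phi(s) + acc)
```

By construction the weights exactly undo the added drift increment by increment, so the estimate is unbiased for the h = 0 value of Y_0; only its variance depends on h.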


A simple but very elegant observation yields an immediate error estimate for this approximation in Corollary 2.1.2, p. 10. We further proceed in the discretization procedure by replacing conditional expectations by least-squares Monte Carlo estimators. In comparison to Bender and Denk [2] and Gobet et al. [22], we thereby face additional technical difficulties, since the time discrete approximation for (Y, Z) is not square-integrable under the original measure. This problem is overcome by exploiting the properties of the density process Ψ^{h,π,0} of the change of measure. In doing so, we can define an appropriate regression basis, and finally we can show convergence of the final estimator, which yields a fully implementable algorithm.

To be more specific, the just mentioned convergence proof is divided into two steps. The first step is devoted to giving, in Theorem 2.2.2, p. 16, an L²-estimate for the error which arises if one replaces conditional expectations by projections on finite-dimensional subspaces. The second stage, Theorem 2.2.5, p. 21, shows almost sure convergence of the final estimator towards this only theoretically feasible approximation under the physical measure P.

Hence, overall we (only) prove convergence in probability of our estimator towards the solution of the BSDE at time zero, though in two out of three steps we are able to derive L²-error estimates. The reason for this shortcoming lies in the fact that the final estimators are not independent, and we average over them to obtain estimators of the next Picard iteration level.

This disadvantage is overcome in Chapter 3 by means of nonparametric statistics. Here, we consider a variant of the Picard-type scheme of Bender and Denk [2] with slightly stronger assumptions. In the discrete time approximation we simply truncate the occurring Brownian increments and analyze this additional approximation error. It turns out to vanish rapidly as the truncation is relaxed more and more; see Theorem 3.1.7, p. 29. Furthermore, the reformulation of the scheme yields bounded approximations for (Y, Z), thereby opening the door to strong statistical tools.

They are used to estimate terms which occur when examining the average L²-error over a number L of Monte Carlo simulations, which in turn are used to obtain discrete versions of the conditional expectations.

The main tool for this purpose is the introduction of a so-called 'ghost sample', that is, a further set of only imaginary Monte Carlo simulations, independent in a suitable sense of the former, actually appearing one.

With the help of these additional random variables we can come back to an average over independent random variables and then apply Hoeffding's inequality for the mean of bounded, independent random variables; see Section 3.3.

Lengthy calculations finally lead to the main theorem (Theorem 3.4.1, p. 71), which gives an upper bound for the L²-error depending on the parameters at our disposal. That is, we obtain an estimate containing the number of time steps, the dimension of the basis spanning the subspace for the approximation of the conditional expectation, and the number of Monte Carlo simulations. We thus establish a rule for how to choose these parameters simultaneously such that convergence of our algorithm is assured at the same time.

Finally, we compare our result to that of Gobet et al. [22]. It turns out that in higher dimensional settings, where hypercubes are used as basis functions, both algorithms reveal the same efficiency. However, for Φ(S) = Φ(S_T) and S one-dimensional, the Euler-type algorithm is slightly more efficient.

Various numerical examples are studied in Chapter 4, where we focus on different aspects of variance reduction and the numerics of BSDEs in general. After outlining our implementation as pseudo MATLAB code, we first test some variance reduction methods stemming from option pricing, also in the context of nonlinear BSDEs.

A first step towards a more general approach for the selection of a new measure inducing variance reduction is made in a further section. We pick up an approach from econometrics and 'translate' it into the BSDE situation. Our main interest lies in the questions of how to choose the new measure and whether we obtain better results than by simply adopting variance reduction techniques from option pricing. Our results for this so-called 'Efficient importance sampling' (EIS) are slightly ambiguous. The technique turns out to be highly efficient for some examples; however, several theoretical and numerical problems remain, which limit the number of cases where this kind of selection approach can be successfully applied.

Finally, we have a look at a potential rival of least-squares Monte Carlo estimators. After a quick review of the theory, we try to use the simplest nonparametric estimator, the so-called Nadaraya-Watson estimator, for the approximation of conditional expectations and report on the numerical and theoretical problems.

The Appendix at the end of the thesis provides several frequently used inequalities and summarizes the tools and results from nonparametric statistics applied in the technical part of Chapter 3.


Contents

Acknowledgements i

Introduction iii

1 Preliminaries 1

1.1 The model and basic assumptions . . . 1

1.2 Notation . . . 3

1.2.1 Function spaces . . . 3

1.2.2 Approximation of stochastic processes . . . 3

2 Importance sampling 7

2.1 Modified forward scheme . . . 7

2.2 Least-squares Monte Carlo . . . 14

3 L²-convergence for the Picard-type estimator 23

3.1 Bounded processes . . . 23

3.2 Projection approach in the case of Markov processes . . . 29

3.3 Estimation of the occurring error terms and probabilities of exception sets . . . 45

3.4 Global estimates . . . 71

3.4.1 Main theorem . . . 71

3.4.2 Simultaneous choice of the parameters . . . 74

3.4.3 Lipschitz continuity of the functions y^n_i and √∆_i z^n_i . . . 75

3.4.4 Error and complexity of the Picard- and Euler-type algorithm . . . 80

3.4.5 Error bound for the error with respect to the distribution ofXti . . . 82

3.4.6 Outlook and perspectives . . . 84

4 Numerical experiments 85

4.1 Implementation of the Picard-algorithm with importance sampling . . . 85

4.2 Variance reduction methods from option pricing . . . 87

4.2.1 Asian call options . . . 87

4.2.2 Lookback options . . . 90

4.2.3 Digital options . . . 92

4.3 First try to a more general approach for effective importance sampling . . . 95

4.3.1 Idea and heuristics . . . 95

4.3.2 Asian call options . . . 98

4.3.3 Lookback options . . . 101

4.3.4 Superhedging . . . 105

4.3.5 Energy derivatives . . . 113

4.3.6 Summary . . . 114

4.4 Nonparametric methods . . . 115



A Appendix 125

A.1 Least-squares problem and singular value decomposition of a matrix . . . 125

A.2 Inequalities . . . 126

A.2.1 Young’s inequality . . . 126

A.2.2 Discrete Gronwall Lemma . . . 126

A.2.3 Hoeffding’s inequality . . . 126

A.3 Definitions and results from nonparametric regression . . . 127

A.3.1 Covering numbers . . . 127

A.3.2 Packing numbers . . . 128

A.3.3 Shatter coefficients and Vapnik-Chervonenkis dimension . . . 128

A.4 Rate of convergence for least-squares estimates . . . 129

Zusammenfassung auf Deutsch 131

Bibliography 137

Index 141


Chapter 1

Preliminaries

1.1 The model and basic assumptions

We investigate numerical solutions of the following decoupled forward-backward stochastic differential equation on a complete probability space (Ω, F, (F_t), P), where the filtration (F_t)_{t∈[0,T]} is the augmentation of the filtration generated by a D-dimensional Brownian motion W, and F = F_T:

\[
dS_t = b(t, S_t)\, dt + \sigma(t, S_t)\, dW_t, \qquad S_0 = s_0,
\]
\[
dY_t = -f(t, S_t, Y_t, Z_t)\, dt + Z_t\, dW_t, \qquad Y_T = \Phi(S).
\]

Here the coefficient functions b : [0,T] × R^M → R^M, σ : [0,T] × R^M → R^{M×D}, f : [0,T] × R^M × R × R^D → R are given. The terminal condition for the BSDE is defined via the functional Φ, which acts on the space of R^M-valued RCLL functions on [0,T] and is Lipschitz continuous in the sup-norm, i.e. there is a constant K such that for all RCLL functions x, x'

\[
|\Phi(x) - \Phi(x')| \le K \sup_{0 \le t \le T} |x(t) - x'(t)|
\]

is satisfied. Recall that a solution of the above equations is a triplet (S, Y, Z) of (F_t)-adapted, square-integrable stochastic processes. We require throughout this thesis the following assumptions, which in particular ensure the existence of a unique solution in the space M[0,T] defined in the next section:

A 1. There is a constant K such that for each (t,s), (t',s') ∈ [0,T] × R^M:

\[
|b(t,s) - b(t',s')| + |\sigma(t,s) - \sigma(t',s')| \le K\bigl( \sqrt{|t-t'|} + |s-s'| \bigr).
\]

A 2. For the same constant K and each (t,s,y,z), (t',s',y',z') ∈ [0,T] × R^M × R × R^D:

\[
|f(t,s,y,z) - f(t',s',y',z')| \le K\bigl( \sqrt{|t-t'|} + |s-s'| + |y-y'| + |z-z'| \bigr).
\]

|t−t0|+|s−s0|+|y−y0|+|z−z0|´ . A 3. There is an M0-dimensional Markov process(Xt,Ft)withStas its first M components such that

E h

sup

0≤t≤T

|Xt|2 i

<

andΦ(S) =φ(XT)for some Lipschitz continuous functionφwith Lipschitz constant K.

A 4. The above constant K satisfies

\[
\sup_{0 \le t \le T} |b(t,0)| + |\sigma(t,0)| + |f(t,0,0,0)| + |\phi(0)| \le K.
\]


For a given, fixed partition π : 0 = t_0 < ... < t_N = T with sup_i |t_{i+1} - t_i| =: |π| < 1, we define ∆_i = t_{i+1} - t_i, and the increments of the Brownian motion are denoted by ∆W_i = W_{t_{i+1}} - W_{t_i}. We add the following structural assumption concerning the time discretization S^π_{t_i} of the solution of the forward equation:

A 5. For every partition π there is a deterministic function u^π : π × R^{M'} × R^D → R^{M'} such that

\[
X^\pi_{t_i} = u^\pi(t_i, X^\pi_{t_{i-1}}, \Delta W_{i-1}), \quad i = 1, \ldots, N, \qquad X^\pi_{t_0} = X_0
\]

satisfies X^π_{m,t_i} = S^π_{m,t_i} for m ≤ M and E[|X^π_{t_N} - X_T|²] → 0 as |π| → 0.

Under Assumption A 5, (X^π_{t_i}, F_{t_i}) is a Markov process under P as well.

Given these assumptions Bender and Denk [2] introduced an approximation scheme, which we now briefly review since it is the starting point for our investigations.

The discretization for the forward equation is not considered further; we can simply apply the existing methods in the literature. However, given the results of Bender and Denk [2], it is by far enough to restrict ourselves to the simplest one, i.e. we can choose the Euler-Maruyama scheme, which reads for the first components of X^π_{t_i} as follows:

\[
S^\pi_{t_{i+1}} = S^\pi_{t_i} + b(t_i, S^\pi_{t_i})\Delta_i + \sigma(t_i, S^\pi_{t_i})\Delta W_i, \quad i = 0, \ldots, N-1, \qquad S^\pi_{t_0} = s_0.
\]
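For concreteness, the Euler-Maruyama step can be written as a minimal one-dimensional simulator (an illustrative sketch; the function name and test coefficients are not from the thesis):

```python
import numpy as np

def euler_maruyama(b, sigma, s0, T, N, L, seed=0):
    """Euler-Maruyama scheme on an equidistant grid:
    S_{t_{i+1}} = S_{t_i} + b(t_i, S_{t_i}) * dt + sigma(t_i, S_{t_i}) * dW_i.
    One-dimensional sketch; returns an (L, N+1) array of simulated paths."""
    rng = np.random.default_rng(seed)
    dt = T / N
    s = np.empty((L, N + 1))
    s[:, 0] = s0
    for i in range(N):
        dw = rng.standard_normal(L) * np.sqrt(dt)
        s[:, i + 1] = s[:, i] + b(i * dt, s[:, i]) * dt + sigma(i * dt, s[:, i]) * dw
    return s
```

For geometric Brownian motion, b(t,s) = μs and sigma(t,s) = σs, the sample mean of S_T is close to s_0 e^{μT}, which provides a quick consistency check of the implementation.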

The approximation scheme for the backward part is now defined recursively for i = 0, ..., N by

\[
Y^{n,\pi}_{t_i} = E\Bigl[ \phi(X^\pi_{t_N})
+ \sum_{j=i}^{N-1} f\bigl(t_j, S^\pi_{t_j}, Y^{n-1,\pi}_{t_j}, Z^{n-1,\pi}_{t_j}\bigr)\Delta_j
\Bigm| \mathcal{F}_{t_i} \Bigr], \tag{1.1}
\]
\[
Z^{n,\pi}_{d,t_i} = E\Bigl[ \frac{\Delta W_{d,i}}{\Delta_i} \Bigl( \phi(X^\pi_{t_N})
+ \sum_{j=i+1}^{N-1} f\bigl(t_j, S^\pi_{t_j}, Y^{n-1,\pi}_{t_j}, Z^{n-1,\pi}_{t_j}\bigr)\Delta_j \Bigr)
\Bigm| \mathcal{F}_{t_i} \Bigr], \quad d = 1, \ldots, D, \tag{1.2}
\]

initialized at (Y^{0,π}, Z^{0,π}) = (0, 0). We apply the convention ∆W_N := 0 and use constant extensions for the approximation, i.e. Y^{n,π}_t := Y^{n,π}_{t_i} and Z^{n,π}_t := Z^{n,π}_{t_i} for t ∈ [t_i, t_{i+1}[. We see that, given the solution of the (n-1)-th iteration, in principle we could calculate the solution at the next iteration level forward in time. For this reason we also talk of a forward scheme. Obviously, nestings of conditional expectations within one Picard iteration are avoided.

Theorem 2 of Bender and Denk [2] gives the convergence of the Picard-type discretization scheme:

Theorem 1.1.1. There is a constant C such that for any n ∈ N

\[
\sup_{0 \le t \le T} E\bigl[ |Y_t - Y^{n,\pi}_t|^2 \bigr]
+ E\Bigl[ \int_0^T |Z_s - Z^{n,\pi}_s|^2 \, ds \Bigr]
\le C\, E\bigl[ |X_T - X^\pi_{t_N}|^2 \bigr] + C|\pi| + C\Bigl( \frac{1}{2} + C|\pi| \Bigr)^n,
\]

where C = K²(T+1)\bigl(4DK²(T+1)DT+1\bigr).

For its proof, results of Bouchard and Touzi [9] and Zhang [46] are used; in particular, the convergence of Bender and Denk's scheme towards that of Bouchard and Touzi is needed. Hence, it is quite natural that, in comparison to these backward schemes and that of Gobet et al. [21, 22], the error estimate of Bender and Denk [2] contains an extra term due to the Picard iterations.

In a further approximation step, Bender and Denk [2] replace the conditional expectations in (1.1) and (1.2), which are actually conditional expectations with respect to the σ-algebra generated by X^π_{t_i}, by orthogonal projections P_{d,i}, d = 0, ..., D, onto D+1 subspaces of L²(σ(X^π_{t_i})) for any i = 0, ..., N, i.e. they define

\[
\widehat{Y}^{n,\pi}_{t_i} = P_{0,i}\Bigl[ \phi(X^\pi_{t_N})
+ \sum_{j=i}^{N-1} f\bigl(t_j, S^\pi_{t_j}, \widehat{Y}^{n-1,\pi}_{t_j}, \widehat{Z}^{n-1,\pi}_{t_j}\bigr)\Delta_j \Bigr],
\]
\[
\widehat{Z}^{n,\pi}_{d,t_i} = P_{d,i}\Bigl[ \frac{\Delta W_{d,i}}{\Delta_i} \Bigl( \phi(X^\pi_{t_N})
+ \sum_{j=i+1}^{N-1} f\bigl(t_j, S^\pi_{t_j}, \widehat{Y}^{n-1,\pi}_{t_j}, \widehat{Z}^{n-1,\pi}_{t_j}\bigr)\Delta_j \Bigr) \Bigr],
\quad d = 1, \ldots, D,
\]

initialized again at (\widehat{Y}^{0,π}, \widehat{Z}^{0,π}) = (0, 0).

At this stage the advantage of the forward approximation scheme reveals itself: Theorem 11 of Bender and Denk [2] specifies the moderate error occurring when approximating (Y^{n,π}_{t_i}, Z^{n,π}_{t_i}) by (Ŷ^{n,π}_{t_i}, Ẑ^{n,π}_{t_i}). In the forward scheme this error is bounded by a constant times the worst projection error occurring during the iterations. Consequently, it does not explode if the mesh grid size tends to zero, as is the case for the backward schemes. For more details, see the discussion in Bender and Denk [2], pp. 1802-1803.

In a final step, Bender and Denk [2] replace the theoretical projections P_{d,i} by simulation based least-squares estimators and show at last, in their Theorem 15, that this estimator converges P-almost surely to the approximation coming from the theoretical projection. Overall, they obtain convergence in probability of their final estimator towards the solution of the FBSDE.
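The projection step just described can be mimicked with a small regression. The following sketch (illustrative, not the thesis's implementation) projects simulated responses onto the span of user-supplied basis functions of the simulated state, which is the core of any least-squares Monte Carlo estimator of a conditional expectation:

```python
import numpy as np

def ls_conditional_expectation(x, response, basis):
    """Least-squares Monte Carlo estimate of E[response | X]:
    regress the simulated response on the basis functions evaluated at the
    simulated x (via numpy's lstsq) and return the fitted values, i.e. the
    estimated conditional expectation at the sample points."""
    A = np.column_stack([g(x) for g in basis])   # design matrix, one column per basis function
    coeff, *_ = np.linalg.lstsq(A, response, rcond=None)
    return A @ coeff
```

For instance, regressing response = X² + noise on the basis {1, x, x²} recovers the conditional expectation x ↦ x² up to Monte Carlo error; richer bases (e.g. indicator functions of hypercubes, as in Gobet et al. [22]) fit into the same template.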

1.2 Notation

1.2.1 Function spaces

As usual in the theory of BSDEs we deal with the following function spaces:

• L²(F) - the space of F-measurable random variables X such that E[|X|²] < ∞,

• L²_F(Ω, C([0,T]), R^d) - the space of (F_t)_{t≥0}-adapted, R^d-valued continuous processes X such that E[sup_{t∈[0,T]} |X_t|²] < ∞,

• L²(0,T; R^d) - the space of (F_t)_{t≥0}-adapted, R^d-valued processes X such that E[∫_0^T |X_t|² dt] < ∞, and

• M[0,T] := L²_F(Ω, C([0,T]), R^n) × L²(0,T; R^n), equipped with the norm

\[
\|(Y(\cdot), Z(\cdot))\|_{M[0,T]} := \left( E\left[ \sup_{t \in [0,T]} |Y_t|^2 \right] + E\left[ \int_0^T |Z_t|^2 \, dt \right] \right)^{1/2}.
\]

Pardoux and Peng [41] showed that in M[0,T] there is a unique solution to BSDEs satisfying the above assumptions.

1.2.2 Approximation of stochastic processes

In order to convey the intuition behind the notation for the different discretizations of the occurring stochastic processes, we introduce them in the following in full detail.

Time discretization

In any approximation scheme we will consider, the first stage of approximation is with respect to time. That is, we introduce a fixed partition π : 0 = t_0 < ... < t_N = T of the interval [0,T] and compute approximations of the solution of the FBSDE at the partition points t_i, i = 0, ..., N. For the solution of the backward part we furthermore use an iterative Picard-type approach and label these iterations with n ∈ N_0. Hence, writing (S^π_{t_i}, Y^{n,π}_{t_i}, Z^{n,π}_{t_i}) indicates the time discretized solution of the FBSDE at time t_i and iteration n, given the partition π. Proceeding this way, we have to introduce increments of the Brownian motion with respect to π, denoted by ∆W_i = W_{t_{i+1}} - W_{t_i}, i.e. we use forward increments.

Chapter 2 introduces a family of FBSDEs which is parameterized by a further stochastic process h, which is chosen once and then held fixed throughout the whole calculations. We thus write (S^{h,π}_{t_i}, Y^{h,n,π}_{t_i}, Z^{h,n,π}_{t_i}) for the time discretized solution of the modified FBSDE at time t_i and iteration n, given the partition π. The choice h ≡ 0 thereby corresponds to the original discretization of Bender and Denk [2]. As our parametrization represents a change of measure, we also have to consider Brownian increments under a further measure and denote them by ∆W^h_i = W^h_{t_{i+1}} - W^h_{t_i} to distinguish them from the former.

In order to ease notation at a later stage, we drop the superindices h and π for the time discrete solution of the FBSDE, i.e. instead of (S^{h,π}_{t_i}, Y^{h,n,π}_{t_i}, Z^{h,n,π}_{t_i}) we simply write (S_{t_i}, Y^n_{t_i}, Z^n_{t_i}). We can justify this imprecision not only by the fewer indices but also because, in the following steps, we do not change the partition and the process h but hold them fixed.

Another variant of the equation with h ≡ 0 is studied in Chapter 3. Here we focus on drivers f which are bounded by some constant R. As a consequence, we will show that, under mild modifications of the scheme of Bender and Denk, our time discrete approximations of the solution of the backward part are bounded. To remind the reader of this property we write (S_{t_i}, Y^{n,R}_{t_i}, Z^{n,R}_{t_i}), suppressing the dependency on the time partition π.

In any setting, there will be Borel functions such that the time discrete approximations of the solution of the backward SDE can be written as functions of a forward Markov process, which contains as first components the (discrete) forward diffusion, while the other components depend on the shape of the terminal condition. We denote this process by X_{t_i}. It turns out that these deterministic functions only depend on the partition point and the number of the Picard iteration, such that in Chapter 2 we will write Y^n_{t_i} = y^n_i(X_{t_i}), Z^n_{t_i} = z^n_i(X_{t_i}). In Chapter 3 we hereby ignore the influence of the bound R and also write Y^{n,R}_{t_i} = y^n_i(X_{t_i}), Z^{n,R}_{t_i} = z^n_i(X_{t_i}). We emphasize that these functions are not the same across chapters, but there is no danger of mixing them up, because within one chapter we only deal with one set of functions.

Projections on finite-dimensional spaces

In Chapter 2, conditional expectations are further replaced by orthogonal projections on finite-dimensional spaces. We indicate this step by a hat, i.e. we write (Ŷ^n_{t_i}, Ẑ^n_{t_i}) for the projection of the time discretized solution of the modified BSDE at time t_i and iteration n, given the partition π, onto a fixed finite-dimensional subspace.

Monte Carlo simulations

The final approximation step of our procedure in Chapter 2 replaces the orthogonal projections on finite-dimensional subspaces by an estimator coming from a simulation based least-squares approach. For this purpose we need L independent Monte Carlo simulations of the occurring forward processes. In full detail, in analogy to Bender and Denk [2], we have to simulate the Brownian increments and the forward Markov process; we denote them in Chapter 2 by ∆_λW^h_i and _λX_{t_i}, respectively, for λ = 1, ..., L and i = 0, ..., N. Thus it is natural to write (_λŶ^n_{t_i}, _λẐ^n_{t_i}) for the resulting estimators of the discretized solution of the backward equation.

Our approach in Chapter 3 passes directly from the time discretization to a simulation based least-squares procedure. For this purpose a whole set of further, only imaginary simulations is required: for each time point t_i in the partition we need extra simulations of the Brownian increments and of the forward Markov process, running until the end of the time horizon of the equation. These new processes are independent, conditionally on the information up to t_i, of the already existing discrete time processes, and at the same time are identically generated. To be able to distinguish these sets of processes, we mark the imaginary ones with bars, i.e. ∆_λW̄_j and _λX̄^i_{t_j} denote these processes at time t_j. The additional superindex i for the discrete Markov process indicates that the additional feature starts at time t_i. We will comment on these so-called 'ghost samples' later on in more detail.

Further notation

A lot of other notation is used in the sequel (see also the index at the end of the thesis); however, it is not helpful to introduce it all here. We will do so at the appropriate places and turn now to a variance reduced version of the algorithm of Bender and Denk [2].


Chapter 2

Importance sampling

The content of this chapter has already been published in Bender and Moseler [4]; we have only supplemented some comments and explanations to further clarify our procedure. The aim of this chapter is to introduce importance sampling for BSDEs. That is, we develop a variance reduction method for BSDEs via a change of measure, whose basic idea is borrowed from option pricing.

2.1 Modified forward scheme

We now explain the starting point for the algorithm developed later on. Consider the following family of decoupled FBSDEs, parameterized by some measurable, bounded and adapted process $h\colon [0,T] \longrightarrow \mathbb{R}^D$:

$$
\begin{aligned}
dS^h_t &= \bigl( b(t, S^h_t) + \sigma(t, S^h_t)\, h_t \bigr)\, dt + \sigma(t, S^h_t)\, dW_t, \\
dY^h_t &= \bigl( -f(t, S^h_t, Y^h_t, Z^h_t) + (Z^h_t)^\top h_t \bigr)\, dt + Z^h_t\, dW_t, \\
S^h_0 &= s_0, \qquad Y^h_T = \phi(X^h_T),
\end{aligned}
$$

where $\top$ denotes matrix transposition. We write $(S, Y, Z) := (S^0, Y^0, Z^0)$ for the solution of the original FBSDE with $h \equiv 0$.

The first observation is that the initial value of the backward part does not depend on $h$. In fact, defining a new measure $Q^h$ by $dQ^h = \Psi^h_T\, dP$, where

$$
\Psi^h_t = \exp\left\{ -\int_0^t h_u^\top\, dW_u - \frac{1}{2} \int_0^t |h_u|^2\, du \right\},
$$

we can apply the Girsanov theorem to deduce that the law of $(S^h, Y^h, Z^h)$ under $Q^h$ is the same as that of $(S, Y, Z)$ under $P$. In particular, the constants $(Y_0, Z_0)$ and $(Y^h_0, Z^h_0)$ coincide. Note, however, that the paths of the processes $(S^h, Y^h, Z^h)$ and $(S, Y, Z)$ differ at later time points. Nonetheless, in many applications, e.g. in option pricing problems, one is mainly interested in estimating $Y_0$. Having the different representations for $Y_0$ at hand, we aim at reducing the variance of Monte Carlo estimators for $Y_0$ by a judicious choice of $h$. This generalizes the importance sampling technique from the calculation of expectations to nonlinear BSDEs.
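For plain expectations, the variance-reduction mechanism behind this change of measure can be shown in a few lines. The following sketch assumes a constant control $h$ and a rare-event payoff; the specific payoff and the choice $h = K/T$ are illustrative only, not a recipe from the thesis. It uses the identity $E[\varphi(W_T)] = E[\Psi_T\, \varphi(W_T + hT)]$ with $\Psi_T = \exp\{-h W_T - \tfrac{1}{2} h^2 T\}$.

```python
import numpy as np

rng = np.random.default_rng(2)

# Crude vs. importance-sampled Monte Carlo for P(W_T > K), a rare event.
T, K, L = 1.0, 3.0, 100_000
phi = lambda x: (x > K).astype(float)   # rare-event payoff

w = rng.normal(scale=np.sqrt(T), size=L)

plain = phi(w)                          # crude Monte Carlo samples

h = K / T                               # push mass toward the event region
weighted = np.exp(-h * w - 0.5 * h**2 * T) * phi(w + h * T)

print(plain.mean(), weighted.mean())    # both are unbiased for P(W_T > K)
print(plain.std(), weighted.std())      # the weighted samples vary far less
```

Both estimators target $P(W_T > K) \approx 0.00135$; the drifted samples hit the event often, and the Girsanov weight corrects the bias, so the sample standard deviation drops by an order of magnitude.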

We now introduce the time-discretized analog of the Picard-type iteration scheme with importance sampling induced by some process $h$. As the choice of $h$ will naturally vary with the partition $\pi$, we assume from now on that the partition $\pi$ is fixed. First we specify the class of processes which we will consider in the sequel.


A 6. The discretized process $h$ is given by
$$
h_{t_i} = \tilde{h}(t_i, \Delta W_0, \ldots, \Delta W_{i-1})
$$
for some bounded deterministic function $\tilde{h}\colon \pi \times \mathbb{R}^D \times \cdots \times \mathbb{R}^D \longrightarrow \mathbb{R}^D$. The bound of $h$ will be denoted by $C_h$.

The modified forward scheme is then given by
$$
\begin{aligned}
\Delta W^{h,\pi}_i &= \Delta W_i + h_{t_i} \Delta_i, \quad i = 0, \ldots, N-1, \qquad \Delta W^{h,\pi}_N = 0, \\
\Psi^{h,\pi,j}_{t_i} &= \exp\left\{ -\sum_{k=j}^{i-1} h_{t_k}^\top \Delta W_k - \frac{1}{2} \sum_{k=j}^{i-1} |h_{t_k}|^2 \Delta_k \right\}, \quad j = 0, \ldots, N-1, \; i = j, \ldots, N, \\
X^{h,\pi}_{t_0} &= X_0, \\
X^{h,\pi}_{t_i} &= u^\pi(t_i, X^{h,\pi}_{t_{i-1}}, \Delta W^{h,\pi}_{i-1}), \quad i = 1, \ldots, N,
\end{aligned}
$$
and, for $i = 0, \ldots, N$, $d = 1, \ldots, D$,
$$
Y^{h,n,\pi}_{t_i} = E\left[ \Psi^{h,\pi,i}_{t_N} \phi(X^{h,\pi}_{t_N}) + \sum_{j=i}^{N-1} \Psi^{h,\pi,i}_{t_j} f(t_j, S^{h,\pi}_{t_j}, Y^{h,n-1,\pi}_{t_j}, Z^{h,n-1,\pi}_{t_j}) \Delta_j \,\middle|\, \mathcal{F}_{t_i} \right], \quad (2.1)
$$
$$
Z^{h,n,\pi}_{d,t_i} = E\left[ \frac{\Delta W^{h,\pi}_{d,i}}{\Delta_i} \left( \Psi^{h,\pi,i}_{t_N} \phi(X^{h,\pi}_{t_N}) + \sum_{j=i+1}^{N-1} \Psi^{h,\pi,i}_{t_j} f(t_j, S^{h,\pi}_{t_j}, Y^{h,n-1,\pi}_{t_j}, Z^{h,n-1,\pi}_{t_j}) \Delta_j \right) \middle|\, \mathcal{F}_{t_i} \right], \quad (2.2)
$$

initialized at $(Y^{h,0,\pi}_{t_i}, Z^{h,0,\pi}_{t_i}) = (0, 0)$. For the special case $h \equiv 0$, we are just back in the forward scheme discussed by Bender and Denk [2]. Note that, by construction, the first $M$ components of $X^{h,\pi}_{t_i}$ coincide with $S^{h,\pi}_{t_i}$ defined via the Euler-Maruyama scheme
$$
\begin{aligned}
S^{h,\pi}_{t_0} &= s_0, \\
S^{h,\pi}_{t_{i+1}} &= S^{h,\pi}_{t_i} + \bigl( b(t_i, S^{h,\pi}_{t_i}) + \sigma(t_i, S^{h,\pi}_{t_i})\, h_{t_i} \bigr) \Delta_i + \sigma(t_i, S^{h,\pi}_{t_i})\, \Delta W_i, \quad i = 0, \ldots, N-1.
\end{aligned}
$$

Defining a new measure $Q^{h,\pi}$ by $dQ^{h,\pi} = \Psi^{h,\pi,0}_{t_N}\, dP$, the Girsanov theorem implies that the process
$$
W^{h,\pi}_t = W_t + \sum_{j=0}^{N-1} h_{t_j} (t_{j+1} \wedge t - t_j \wedge t)
$$
is a Brownian motion under $Q^{h,\pi}$. Consequently, the $\Delta W^{h,\pi}_i$ are Brownian increments under this measure. This implies that $(X^{h,\pi}, \mathcal{F}_{t_i})$ is a Markov process under $Q^{h,\pi}$ and that the transition probabilities of $X^{h,\pi}$ under $Q^{h,\pi}$ are the same as those of $X^\pi$ under $P$.
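The discrete ingredients of the modified forward scheme, the drift-adjusted increments and the Girsanov weights, can be sketched numerically as follows; a minimal one-dimensional illustration assuming an equidistant grid and a constant control (all choices are hypothetical):

```python
import numpy as np

rng = np.random.default_rng(3)

# One-dimensional sketch of
#   dW^{h,pi}_i = dW_i + h_{t_i} * Delta_i   and
#   Psi^{h,pi,j}_{t_i} = exp( -sum_{k=j}^{i-1} h_{t_k} dW_k
#                             - 0.5 * sum_{k=j}^{i-1} h_{t_k}^2 Delta_k ).
N, T = 8, 1.0
delta = T / N                               # equidistant Delta_i
h = 0.5 * np.ones(N)                        # some bounded control h_{t_k}
dW = rng.normal(scale=np.sqrt(delta), size=N)

dW_h = dW + h * delta                       # modified increments

log_incr = -h * dW - 0.5 * h**2 * delta     # per-step log weight
cum = np.concatenate([[0.0], np.cumsum(log_incr)])
# psi[j, i] = Psi^{h,pi,j}_{t_i}; empty sums (j = i) give weight 1.
psi = np.exp(cum[None, :] - cum[:, None])

assert np.allclose(np.diag(psi), 1.0)       # Psi^{h,pi,i}_{t_i} = 1
```

Feeding `dW_h` into the Euler map and weighting payoffs with `psi[i, N]` is exactly the combination that makes the weighted simulation unbiased for the original scheme.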

The following theorem shows that, in this Markovian setting, the conditional expectations in the above iteration scheme actually simplify to regressions on $X^{h,\pi}_{t_i}$. On the one hand this is crucial for the Monte Carlo algorithm described in the next section; on the other hand it also allows us to derive some convergence results for the modified scheme in an elegant way.

Theorem 2.1.1. Under the standing assumptions there are deterministic functions $y^{n,\pi}_i$ and $z^{n,\pi}_i$, not depending on $h$, such that
$$
Y^{h,n,\pi}_{t_i} = y^{n,\pi}_i(X^{h,\pi}_{t_i}), \qquad Z^{h,n,\pi}_{t_i} = z^{n,\pi}_i(X^{h,\pi}_{t_i}).
$$
