
March 2021

645

Multidimensional Singular Control and Related Skorokhod Problem: Sufficient Conditions for the Characterization of Optimal Controls

Jodi Dianetti and Giorgio Ferrari

Center for Mathematical Economics (IMW), Bielefeld University

Universitätsstraße 25, D-33615 Bielefeld, Germany. E-mail: imw@uni-bielefeld.de

bielefeld.de/zwe/imw/research/working-papers/

ISSN: 0931-6558

This work is licensed under a Creative Commons “Attribution 4.0 International” license.


MULTIDIMENSIONAL SINGULAR CONTROL AND RELATED SKOROKHOD PROBLEM: SUFFICIENT CONDITIONS FOR THE CHARACTERIZATION OF OPTIMAL CONTROLS

JODI DIANETTI AND GIORGIO FERRARI

Abstract. We characterize the optimal control for a class of singular stochastic control problems as the unique solution to a related Skorokhod reflection problem. The considered optimization problems concern the minimization of a discounted cost functional over an infinite time-horizon through a process of bounded variation affecting an Itô diffusion. The setting is multidimensional, the dynamics of the state and the costs are convex, and the volatility matrix can be constant or linear in the state. We prove that the optimal control acts only when the underlying diffusion attempts to exit the so-called waiting region, and that the direction of this action is prescribed by the derivative of the value function. Our approach is based on the study of a suitable monotonicity property of the derivative of the value function through its interpretation as the value of an optimal stopping game. Such a monotonicity allows us to construct nearly optimal policies which reflect the underlying diffusion at the boundary of approximating waiting regions. The limit of this approximation scheme then provides the desired characterization. Our result applies to a relevant class of linear-quadratic models, among others. Furthermore, it allows us to construct the optimal control in degenerate and nondegenerate settings considered in the literature, where this important aspect was only partially addressed.

Keywords: Dynkin games, reflected diffusion, singular stochastic control, Skorokhod problem, variational inequalities.

AMS subject classification: 93E20, 60G17, 91A55, 49J40.

1. Introduction

This paper considers the problem of characterizing optimal policies for singular stochastic control problems in multidimensional settings. More precisely, we consider the problem of controlling, through a one-dimensional càdlàg (i.e., right-continuous with left limits) process v with locally bounded variation, the first component of a multidimensional diffusion with initial condition x. Namely, the controller can affect a state process X^{x;v} which evolves according to the equation

(1.1)  dX_t^{x;v} = b(X_t^{x;v}) dt + σ(X_t^{x;v}) dW_t + e_1 dv_t,  t ≥ 0,  X_{0−}^{x;v} = x,

for a multidimensional Brownian motion W, a suitable convex Lipschitz function b, and a volatility matrix σ, which is either constant or linear in the state. The aim of the controller is to minimize the expected discounted cost

(1.2)  J(x;v) := E[ ∫_0^∞ e^{−ρt} h(X_t^{x;v}) dt + ∫_{[0,∞)} e^{−ρt} d|v|_t ],

for a given convex function h and a suitable discount factor ρ > 0. Here, |v| denotes the total variation of the process v. The value function V of the problem is defined, at any given initial condition x, as the minimum of J(x;v) over the choice of controls v. Also, a control v̄ is said to be optimal for x if J(x; v̄) = V(x). Existence of optimal controls can be proved

Date: March 15, 2021.


in very general frameworks using different probabilistic compactification methods (see, e.g., [11, 18, 34, 47, 52]).

Natural questions that immediately arise are whether it is possible to characterize V, and how one should act on the system in order to obtain the minimal cost V. As a matter of fact, the Markovian nature of the problem, together with mild regularity and growth conditions on b and h, allows one to employ the dynamic-programming approach. This leads to the characterization of the value function as a solution (in a suitable sense) to the Hamilton-Jacobi-Bellman equation

(1.3)  max{ρV − b·DV − tr(σσ^⊤ D²V)/2 − h, |V_{x_1}| − 1} = 0.

This equation provides key insights on the way the controller should act on the system in order to minimize the cost of her actions. Indeed, when V is sufficiently regular, an application of Itô's formula suggests that the controller should prevent the state process from leaving the set W := {|V_{x_1}| < 1}, usually referred to as the waiting region. In fact, in many examples (see, e.g., [23, 32, 44, 45, 50, 57, 62], among others) it is possible to construct the optimal control as the solution to a related Skorokhod reflection problem; that is, the optimal control can be characterized as that process v̄, with minimal total variation, which is able to keep the process X^{x;v̄} inside the closure of the waiting region W, by reflecting it in a direction prescribed by the gradient of the value function. However, in multidimensional settings, such a characterization often remains a conjecture (see the discussion in Chapter 6 in [58], Remark 5.2 in [8], and also [15, 16, 26, 27]), and many questions about the properties of optimal controls remain open, representing a strong limitation to the theory.

We now discuss in more detail the problem of the characterization of optimal rules. When the state process is one-dimensional, optimal controls can be explicitly constructed as Skorokhod reflections in a general class of models in [1, 22, 38, 39, 50, 61], among others. Also, in the (not necessarily Markovian) one-dimensional case, a similar characterization of optimal controls has been achieved in [2, 3, 4], without relying on the dynamic-programming approach. When the dimension of the problem becomes larger than one, the difficulty of characterizing optimal controls drastically increases. Indeed, classical results on the existence of solutions to the Skorokhod reflection problem in the multidimensional domain W require some regularity of the boundary of W and of the direction of reflection, which are, in most of the cases, unknown. When the value function V is convex, this difficulty is overcome in some specific settings. A celebrated example is presented in [57], where the problem of controlling a two-dimensional Brownian motion with a two-dimensional process of bounded variation is considered. There, the authors show that the boundary of the waiting region (the so-called free boundary) is of class C², and they are therefore able to construct the optimal policy as a solution to the associated Skorokhod problem. The problem of the characterization is also encountered in [15, 16, 26, 27], where the construction of the optimal control can be provided only by requiring additional properties on the boundary of the waiting region. Another example is exhibited in [22], in which the case of controlling a multidimensional Brownian motion with a multidimensional control is considered for a radial running cost h(x) = |x|². We also refer to [44], where the construction of the optimal policy is provided in a two-dimensional context in which the drift is non-zero.
To the best of our knowledge, in the case of a convex V, the most general multidimensional setting in which this characterization is shown is presented in [45], and in its finite time-horizon counterpart [9]. There, the problem of controlling a multidimensional Brownian motion with a multidimensional control is considered for a convex running cost. Remarkably, in [45] (and in [9]) the author presents an approach which allows one to construct the unique optimal policy as a solution to the related Skorokhod problem, bypassing the problems related to the regularity of the free boundary. In non-convex settings, contributions are even rarer. The suitable regularity of


the boundary of W is shown, in two-dimensional settings, in [32] and in [23], while a multidimensional case is considered in [62], via a connection with Dynkin games. We also mention that the construction of multidimensional reflected diffusions in polyhedral domains has been recently studied in [19, 31, 33], in the context of games with singular controls. To conclude, despite many decades of research in the field, the nature of optimal controls is, in general, far from being completely understood, and this motivates our study.

In this paper, we provide sufficient conditions for the characterization of the optimal policy of the singular control problem specified by (1.1) and (1.2) as the solution to the related Skorokhod reflection problem. Although in our setting the control is one-dimensional, the multidimensional nature of the problem arises from the fact that the components of the state process are interconnected; in particular, the action of the controller on the first component of the state process can affect all the other components. We will show the claimed characterization under two main classes of assumptions, in which the volatility matrix is constant or linearly dependent on the state. In both cases additional monotonicity assumptions are enforced on the running cost h and on the drift b. These structural conditions are satisfied in a relevant class of linear-quadratic models, and in some specific settings considered in the literature for which the problem of constructing the optimal control remained partially open (see [15, 16, 26, 27]). The strategy of our proof is inspired by [45] and can be summarized in three main steps.

(1) We first derive important monotonicity properties of V_{x_1}. This is done by identifying V_{x_1} as the value of a related Dynkin game, through a variational formulation in the spirit of [15].

(2) We construct solutions v̄^ε to a family of Skorokhod problems in domains W^ε approximating W. Here the monotonicity of V_{x_1} plays a crucial role in order to show the regularity of W^ε. The controls v̄^ε are ε-optimal for (1.2); i.e., J(x; v̄^ε) ≤ V(x) + ε.

(3) We find a control v̄ such that v̄^ε → v̄, as ε → 0. This implies that v̄ is optimal for x, and, thanks to the properties of v̄^ε, that v̄ solves the Skorokhod problem on the original domain W. This then provides the desired characterization of the optimal policy v̄.

As a consequence of our result, some works (in particular [15] and [62]) in the literature on singular control can be revisited, and the characterization of optimal controls can be provided under slightly different assumptions. Also, our approach allows us to treat the singular control problems with degenerate diffusion matrix studied in [26, 27]. The results apply to problems with monotone controls, and to the case in which increasing the underlying diffusion has a different cost than decreasing it. The approach presented in this paper seems to be suitable to treat also singular control problems over a finite time-horizon.

Clearly, our results relate to stochastic differential equations (SDEs, in short) with reflecting boundary conditions, also known as Skorokhod reflection problems for SDEs. In this field, existence and uniqueness of strong solutions to reflected SDEs in convex time-independent domains with normal reflection was first shown in the seminal [60]. These results were then generalized to non-convex smooth domains with smooth oblique reflection in [48], and subsequently refined in [55]. Existence of strong solutions in a class of non-smooth domains has been proved in [24], and then generalized to the time-dependent case in [49]. This list is, however, far from being exhaustive, and we therefore refer the interested reader to [12, 13, 20, 21, 53, 59] and to the references therein. From this point of view, our results provide existence and uniqueness of a (strong) solution to a Skorokhod problem in which the domain is given by the noncoincidence set W of the solution of the variational inequality with gradient constraint (1.3), and in which the reflection direction is prescribed by its gradient.

An essential tool for our analysis is the connection between optimal stopping and singular stochastic control theory. This connection has been known since the seminal [5], where the authors


observed that the derivative of the value function of a singular control problem identifies with the value of an optimal stopping problem. Since then, this connection has been elaborated through different approaches (see [6, 8, 42], among others), until the more recent interpretation given in [47]. When the control is assumed to be of locally bounded variation, and the system has dynamics with independent components, with one of them being controlled, the space derivative of the value function of the control problem coincides with the value of a zero-sum game of stopping; i.e., a Dynkin game (cf. [7, 15, 16, 32, 43]). This connection was described in a multidimensional setting with interconnected dynamics in [15] and [16] by employing a variational formulation of the problem. In this paper, we employ essentially the formulation and the techniques elaborated in [15], however extending their arguments to fit our convex setting.

The rest of this paper is organised as follows. In Section 2 we formulate the problem, we enforce some structural conditions, and we state the main result of this paper. The proof of the main result for a constant volatility is presented in Section 3, while the proof for a linear volatility is discussed in Section 4. Extensions and examples are provided in Section 5, while Appendix A and Appendix B are devoted to some auxiliary technical results.

1.1. Notation. For d ∈ N with d ≥ 1, an open set B ⊂ R^d, α = (α_1, ..., α_d) ∈ N^d and a function f : B → R, we denote by D^α f := D_1^{α_1}···D_d^{α_d} f the weak derivative of f, where D_i f := f_{x_i} := ∂f/∂x_i, and we set |α| := α_1 + ... + α_d. For ℓ ∈ N, q ∈ [1,∞], and a measure space (E, E, m), we define the spaces:

• L^q(E) := {measurable f : E → R s.t. ‖f‖_{L^q(E)} < ∞}, where ‖f‖_{L^q(E)} := (∫_E |f|^q dm)^{1/q} if q < ∞, and ‖f‖_{L^∞(E)} := ess sup_E |f| for q = ∞;

• C^ℓ(B) := {f : B → R with continuous ℓ-order derivatives} and C_c^∞(B) := {f : B → R with compact support, s.t. f ∈ C^ℓ(B) for each ℓ ∈ N};

• C^{ℓ;1}(B) := {f : B → R with ‖f‖_{C^{ℓ;1}(B)} < ∞}, where ‖f‖_{C^0(B)} := sup_{x∈B} |f(x)|, ‖f‖_{Lip(B)} := sup_{x,y∈B} |f(y) − f(x)|/|y − x|, and ‖f‖_{C^{ℓ;1}(B)} := Σ_{|α|≤ℓ} ‖D^α f‖_{C^0(B)} + Σ_{|α|=ℓ} ‖D^α f‖_{Lip(B)};

• W^{ℓ;q}(B) := {f ∈ L^q(B) with ‖f‖_{W^{ℓ;q}(B)} < ∞}, W_{loc}^{ℓ;q}(B) := {f | f ∈ W^{ℓ;q}(D) for each bounded open set D ⊂ B}, and W_0^{ℓ;q}(B) as the closure of C_c^∞(B) in the norm ‖·‖_{W^{ℓ;q}(B)}, where ‖f‖_{W^{ℓ;q}(B)} := Σ_{|α|≤ℓ} ‖D^α f‖_{L^q(B)}.

For x ∈ R^d we denote by x^⊤ the transpose of x. The vector e_i ∈ R^d indicates the i-th element of the canonical basis of R^d and, for x ∈ R^d and R > 0, we set B_R(x) := {y ∈ R^d | |y − x| < R}.

Finally, in this paper C indicates a generic positive constant, which may change from line to line.

2. Problem formulation and main result

2.1. Singular control and Skorokhod problem. Fix d ∈ N, d ≥ 2, and a d-dimensional Brownian motion W = (W^1, ..., W^d) on a filtered probability space (Ω, F, F, P) satisfying the usual conditions. For each x = (x_1, ..., x_d) ∈ R^d, let the process X^x = (X^{1,x}, ..., X^{d,x}) denote the solution to the stochastic differential equation (SDE, in short)

(2.1)
dX_t^{1,x} = (a_1 + b_{11}X_t^{1,x}) dt + σ̄(X_t^{1,x}) dW_t^1,  t ≥ 0,  X_{0−}^{1,x} = x_1,
dX_t^{i,x} = b^i(X_t^{1,x}, X_t^{i,x}) dt + σ̄(X_t^{i,x}) dW_t^i,  t ≥ 0,  X_{0−}^{i,x} = x_i,  i = 2, ..., d.

Here a_1, b_{11} are constants, while the coefficients b^i ∈ C(R²) and σ̄ ∈ C(R) are deterministic Lipschitz continuous functions. The drift b̄(x) := (a_1 + b_{11}x_1, b^2(x_1, x_2), ..., b^d(x_1, x_d))^⊤ and the function σ̄ satisfy Assumption 2.1 below. Next, introduce the set of admissible controls as

V := {R-valued F-adapted and càdlàg processes with locally bounded variation},


and, for each v ∈ V and x ∈ R^d, let the process X^{x;v} = (X^{1,x;v}, ..., X^{d,x;v}) denote the unique strong solution to the controlled stochastic differential equation

(2.2)
dX_t^{1,x;v} = (a_1 + b_{11}X_t^{1,x;v}) dt + σ̄(X_t^{1,x;v}) dW_t^1 + dv_t,  t ≥ 0,  X_{0−}^{1,x;v} = x_1,
dX_t^{i,x;v} = b^i(X_t^{1,x;v}, X_t^{i,x;v}) dt + σ̄(X_t^{i,x;v}) dW_t^i,  t ≥ 0,  X_{0−}^{i,x;v} = x_i,  i = 2, ..., d.

For any given initial condition x ∈ Rd, consider the problem of minimizing the expected discounted cost

(2.3)  J(x;v) := E[ ∫_0^∞ e^{−ρt} h(X_t^{x;v}) dt + ∫_{[0,∞)} e^{−ρt} d|v|_t ],  v ∈ V,

where |v| denotes the total variation of the process v, h : R^d → R is a continuous function, and ρ > 0 is a constant discount factor. We will say that the control v̄ ∈ V is optimal if

(2.4)  V(x) := inf_{v∈V} J(x;v) = J(x; v̄),

and, in the following, we will refer to the function V as the value function of the problem, and to X^{x;v̄} as the optimal trajectory.

The second integral appearing in (2.3) has to be understood in the Lebesgue-Stieltjes sense, and it is defined as

∫_{[0,∞)} e^{−ρt} d|v|_t := |v|_0 + ∫_{(0,∞)} e^{−ρt} d|v|_t,

in order to take into account possible jumps of the control at time zero. Moreover, for v ∈ V we will often write dv = γ d|v| to denote the disintegration

v_t = ∫_0^t γ_s d|v|_s,  for each t ≥ 0,  P-a.s.,

where |v| denotes the total variation of the signed measure v, and the process γ is the Radon-Nikodym derivative of the signed measure v with respect to |v|. Also, for a control v, the nondecreasing càdlàg processes ξ^+, ξ^− will denote the minimal decomposition of the signed measure v; that is, v = ξ^+ − ξ^−, and ξ^+ ≤ ξ̄^+ and ξ^− ≤ ξ̄^− for any other couple of nondecreasing càdlàg processes ξ̄^+, ξ̄^− which satisfy v = ξ̄^+ − ξ̄^−.
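In discrete time, the minimal decomposition is just the running Jordan decomposition of the increments of v, with the convention v_{0−} = 0, so that a nonzero initial value counts as a jump at time zero. The following is a minimal illustrative sketch of this fact (not taken from the paper):

```python
def minimal_decomposition(v):
    """Running Jordan decomposition of a discrete-time path v of bounded
    variation, with the convention v_{0-} = 0 (so v[0] is an initial jump).

    Returns nondecreasing paths (xi_plus, xi_minus) with
    v[t] = xi_plus[t] - xi_minus[t] and total variation
    |v|[t] = xi_plus[t] + xi_minus[t].
    """
    xi_plus, xi_minus = [], []
    up = down = prev = 0.0
    for x in v:
        inc = x - prev
        up += max(inc, 0.0)       # accumulate upward moves only
        down += max(-inc, 0.0)    # accumulate downward moves only
        xi_plus.append(up)
        xi_minus.append(down)
        prev = x
    return xi_plus, xi_minus

# Example: path with an initial jump of +1 and a final downward move.
xp, xm = minimal_decomposition([1.0, 3.0, 2.0, 2.0, -1.0])
```

Any other decomposition v = ξ̄^+ − ξ̄^− dominates these paths componentwise, and ξ^+_t + ξ^−_t is exactly the total variation |v|_t entering the cost functional (2.3).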

Finally, recall from [45] the following notion of solution to the Skorokhod problem, which we adapt to our setting.

Definition 1. Let O be an open subset of R^d with closure Ō, x ∈ Ō, and set S := ∂O. Let ν̄ be a continuous vector field on S, with ν̄ = e_1ν and |ν(y)| = 1 for each y ∈ S.

We say that the process v ∈ V is a solution to the modified Skorokhod problem for the SDE (2.2) in Ō starting at x with reflection direction ν̄ if

(1) P[X_t^{x;v} ∈ Ō, ∀t ≥ 0] = 1;

(2) P-a.s., for each t ≥ 0 one has dv = γ d|v|, with

|v|_t = ∫_0^t 1_{{X_{s−}^{x;v} ∈ S, ν(X_{s−}^{x;v}) = γ_s}} d|v|_s;

(3) P-a.s., for each t ≥ 0, a possible jump of the process X^{x;v} at time t occurs on some interval I ⊂ S parallel to the vector field ν̄; i.e., such that ν̄(y) is parallel to I for each y ∈ I. If X^{x;v} encounters such an interval I, it instantaneously jumps to its endpoint in the direction ν̄ on I.

Moreover, if v is continuous, then we say that v is a solution to the (classical) Skorokhod problem for the SDE (2.2) in Ō starting at x with reflection direction ν̄.
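In one dimension, with domain [0, ∞) and reflection direction e_1, conditions (1)-(2) reduce to "the reflected path stays nonnegative and the control grows only at the boundary", and the solution is given explicitly by the classical Skorokhod map. A minimal discrete-time sketch of this one-dimensional case (illustrative only; the paper's construction takes place in the multidimensional domain W):

```python
def skorokhod_map(x):
    # One-sided Skorokhod map at 0 for a discrete path x with x[0] >= 0:
    # returns (y, l) with y[t] = x[t] + l[t] >= 0, l nondecreasing, l[0] = 0,
    # and l increasing only at times where the reflected path y sits at 0.
    y, l = [], []
    running = 0.0
    for xt in x:
        running = max(running, -xt)  # minimal cumulative push needed so far
        l.append(running)
        y.append(xt + running)
    return y, l

# Free path dips below 0 twice; the reflected path is pushed back each time.
y, l = skorokhod_map([1.0, -0.5, 0.25, -1.0])
```

The minimality of l is the discrete analogue of condition (2): the control acts with the least effort compatible with keeping the state inside the domain.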

(7)

2.2. Assumptions and main result. The main objective of this paper is to characterize optimal control policies for Problem (2.4) as solutions to related Skorokhod problems.

We will prove our main result under the following structural conditions, which we enforce throughout the rest of this paper. We postpone the discussion of some generalizations to Section 5.

Assumption 2.1. For p ≥ 2 we have:

(1) The running cost h is C^{2;1}(R^d), convex, and, for suitable constants K, κ_1, κ_2 > 0, it satisfies, for each x, y ∈ R^d and for all λ ∈ [0,1], the conditions

κ_1|x_1|^p − κ_2 ≤ h(x) ≤ K(1 + |x|^p),
|h(y) − h(x)| ≤ K(1 + |x|^{p−1} + |y|^{p−1})|y − x|,
λh(x) + (1−λ)h(y) − h(λx + (1−λ)y) ≤ Kλ(1−λ)(1 + |x|^{p−2} + |y|^{p−2})|x − y|²,
0 < h_{x_1x_1}(x).

(2) There exists a constant L̄ ≥ 0 such that, for each x, y ∈ R^d, we have

|b̄(x)| ≤ L̄(1 + |x|),
|b̄(y) − b̄(x)| ≤ L̄|y − x|.

The functions b^i are convex and of class C^{2;1}(R²). Furthermore, we assume that h_{x_i} ≥ 0 and b^i_{x_1}, b^i_{x_1x_i}, h_{x_1x_i} ≤ 0 for each i = 2, ..., d, and that Db̄ is globally Lipschitz.

(3) For ρ̄ := p(2p−1) and a constant σ > 0, either of the two conditions below is satisfied:

(a) σ̄(y) = σ, y ∈ R, and the discount factor satisfies the relation ρ > 3ρ̄L̄.

(b) σ̄(y) = σy, y ∈ R, and the discount factor satisfies the relation ρ > 2ρ̄(L̄ + σ²(ρ̄ − 1)). In this case, we also assume that there exists x̂_1 > 0 such that h_{x_1}(x) ≤ min{0, −b_{11}} for each x with x_1 < 2x̂_1, that b^i(x_1, x_i) ≥ 0 for x_1, x_i ≥ 0 for each i = 2, ..., d, and that a_1 ≥ 0.

Natural examples in which the conditions above are satisfied are given in Section 5, after discussing generalizations of Assumption 2.1. These include a relevant class of linear-quadratic singular stochastic control problems (see Example 1 and Subsection 5.4 below).

Notice that the nature of problem (2.4) is genuinely multidimensional, as the components of the dynamics (2.2) are interconnected.

Remark 2.2 (On the role of Assumption 2.1). We underline that the particular choice of p ≥ 2 is motivated by quadratic running costs (cf. Example 1 in Section 5). From Condition 2 one can see that quite strong requirements are needed in order to treat models with a general b^i. However, when b^i has a simpler form, some conditions on the derivatives b^i_{x_1}, b^i_{x_1x_i}, h_{x_i}, h_{x_1x_i} can be weakened (see Subsections 5.1.1 and 5.1.2). Also, the assumption on h_{x_1} in Condition 3b is to enforce that the optimal trajectories live in the set R^d_+ := {x ∈ R^d | x_i > 0, i = 1, ..., d} whenever the initial condition x ∈ R^d_+ (cf. Lemma 4.1 below). This condition is a natural substitute, for minimization problems in dimension d ≥ 2, of the classical Inada condition at 0 (see, e.g., equation (2.5) in [30]). The latter is typically assumed in profit maximization problems, and it is satisfied by Cobb-Douglas production functions. Finally, the conditions on the discount factor ρ are in place in order to ensure a suitable “integrability” of the optimal trajectories, which allows us to prove some semiconcavity estimates for the value function V (see steps 2 and 3 in the proof of Theorem A.1 in Appendix A).

Observe that, when Condition 3a is in place, a generic controlled trajectory X^{x;v}, v ∈ V, can reach the whole space with positive probability. On the other hand, under Condition 3b, as


mentioned in Remark 2.2, the natural domain for a controlled trajectory is R^d_+. This suggests defining a domain D in the following way:

(2.5)  D := R^d if Condition 3a holds,  D := R^d_+ if Condition 3b holds.

Indeed, it is possible to show that the value function V is finite and that it is a convex solution in W^{2;∞}_{loc}(D) of the Hamilton-Jacobi-Bellman (HJB, in short) equation

(2.6)  max{ρV − LV − h, |V_{x_1}| − 1} = 0,  a.e. in D,

where LV(x) := b̄(x)·DV(x) + (1/2)Σ_{i=1}^d σ̄²(x_i)V_{x_ix_i}(x), x ∈ D, is the generator of the uncontrolled SDE (2.1). For completeness, a proof of this result is provided in Appendix A (see Theorem A.1). During the proof of Theorem A.1, the convergence of a certain penalization method is studied: this convergence will be a useful tool in many of the proofs in this paper.

Define next the waiting region W as

(2.7)  W := {x ∈ D | |V_{x_1}(x)| < 1},

and notice that, by the W^{2;∞}_{loc}-regularity of V, W is an open subset of D. Also, for each z ∈ R^{d−1}, we define the sets

D_1(z) := {y ∈ R | (y, z) ∈ D}  and  W_1(z) := {y ∈ R | (y, z) ∈ W}.

In the sequel, the closure of W (resp. W_1(z)) in D (resp. D_1(z)) will be denoted by W̄ (resp. W̄_1(z)). We state here a technical lemma, whose proof is given in Appendix B.

Lemma 2.3. For any x = (x_1, z) ∈ D, with z ∈ R^{d−1}, the set W_1(z) is a nonempty open interval; in particular, W is nonempty.

Remark 2.4 (Existence and uniqueness of optimal controls). For each x̄ ∈ D, it is possible to show that, under Assumption 2.1, there exists a unique optimal control v̄ ∈ V. This is a classical result when the drift is affine. In the case of a convex drift, it essentially follows from the convexity of J w.r.t. (x, v). The latter in turn follows from the convexity of the drift, the monotonicity of h, and a comparison principle for SDEs. The argument can be recovered from the proof of Lemma 3.7 below, which works for any sequence of controls minimizing the cost functional J. Finally, the uniqueness of the optimal control is a consequence of the strict convexity of h in the variable x_1.

The following is the main result of our paper, characterizing the optimal policies in terms of the waiting region W and the derivative V_{x_1}, in the sense of Definition 1.

Theorem 2.5. Let x̄ = (x̄_1, z̄) ∈ D, with z̄ ∈ R^{d−1}. The following statements hold true:

(1) If x̄ ∈ W̄, then the optimal control v̄ is the unique solution to the modified Skorokhod problem for the SDE (2.2) in W̄ starting at x̄ with reflection direction −V_{x_1}e_1;

(2) If x̄ ∉ W̄, then the optimal control v̄ can be written as v̄ = ȳ_1 − x̄_1 + w̄, where ȳ_1 is the metric projection of x̄_1 onto the set W̄_1(z̄), and w̄ is the unique solution to the modified Skorokhod problem for the SDE (2.2) in W̄ starting at ȳ := (ȳ_1, z̄) with reflection direction −V_{x_1}e_1.
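Since W_1(z̄) is an interval (Lemma 2.3), the metric projection in statement (2) is a simple clamp onto its closure: starting outside the waiting region, the optimal control prescribes an initial jump of the first coordinate to the projected point, and then behaves as a reflection. A minimal sketch with illustrative endpoints (the actual interval is determined by V_{x_1} and is not computed here):

```python
def initial_jump(x1, a, b):
    # Metric projection of x1 onto the closed interval [a, b] (standing in for
    # the closure of W1(z)), together with the initial jump v_0 = y1 - x1 of
    # the control. The endpoints a, b are hypothetical free boundaries.
    y1 = min(max(x1, a), b)
    return y1, y1 - x1

# Starting outside the (illustrative) waiting region on either side:
print(initial_jump(3.0, -1.0, 1.0))   # jump down to the upper boundary
print(initial_jump(-4.0, -1.0, 1.0))  # jump up to the lower boundary
print(initial_jump(0.5, -1.0, 1.0))   # already inside: no jump
```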

In Section 3 we provide a proof of Theorem 2.5 under Condition 3a in Assumption 2.1.

The strategy of the proof can be summarized in three main steps:

Step a. In Subsection 3.1 we study an important monotonicity property of V_{x_1}, through a connection with Dynkin games.

Step b. In Subsection 3.2, this property will allow us to construct ε-optimal policies as solutions to Skorokhod problems in domains W^ε approximating W.


Step c. Finally, in Subsection 3.3 we prove that the ε-optimal policies approximate the optimal policy, and that the latter is a solution to the Skorokhod problem in the original domain W.

The proof of Theorem 2.5 under Condition 3b in Assumption 2.1 follows a similar rationale, and it is discussed in Section 4. In particular, in Subsection 4.1 a preliminary lemma is proved, while in Subsection 4.2 we show how to use this lemma in order to repeat (with minor modifications) the arguments of Section 3.

3. Proof of Theorem 2.5 for constant volatility

In this section we assume that Condition 3a in Assumption 2.1 holds. To simplify the notation, the proof is given for d = 2, so that D = R². The generalization to the case d > 2 is straightforward.

3.1. Step a: A connection to Dynkin games and the monotonicity property. In this subsection we adopt an approach based on the variational formulation of the problem in order to show, in the spirit of [15], a connection between the singular control problem (2.4) and a Dynkin game. This connection will enable us to prove a monotonicity property of V_{x_1}, which will then be fundamental in order to construct ε-optimal controls.

3.1.1. The related Dynkin game. We begin by characterizing V_{x_1} as a W^{2;∞}_{loc}-solution to a two-obstacle problem. The proof of the next result borrows arguments from [15] (see in particular Theorem 3.9, Proposition 3.10, and Theorem 3.11 therein). However, since in our case b can be convex, the techniques used in [15] need to be refined and combined with suitable estimates (described in more detail in the proof of Theorem A.1 in Appendix A) on a penalization method. We provide a detailed proof for the sake of completeness.

Theorem 3.1. The function V_{x_1} is a W^{2;∞}_{loc}(R²)-solution to the equation

(3.1)  max{(ρ − b_{11})V_{x_1} − LV_{x_1} − ĥ, |V_{x_1}| − 1} = 0,  a.e. in R²,

where ĥ := h_{x_1} + b²_{x_1}V_{x_2}.

Proof. We organize the proof in two steps.

Step 1. In this step we show that the function V_{x_1} is a solution to a variational inequality with a local operator, and that V_{x_1} ∈ W^{2;∞}_{loc}(R²). Fix B ⊂ R² open and bounded and consider a nonnegative localizing function ψ ∈ C_c^∞(B). Define the sets

K := {U ∈ W^{1;2}_{loc}(R²) | |U| ≤ 1 a.e.}  and  K_ψ := {ψU | U ∈ K}.

We show in the sequel that the function W := V_{x_1}ψ is a solution in K_ψ to the variational inequality

(3.2)  A_B(W, U − W) ≥ ⟨Ĥ, U − W⟩_B,  for each U ∈ K_ψ,

where Ĥ := ĥψ − V_{x_1}Lψ − DV_{x_1}·Dψ, the operator A_B : W^{1;2}(B) × W^{1;2}(B) → R is given by

A_B(Ū, U) := (σ²/2) Σ_{i=1}^2 ⟨Ū_{x_i}, U_{x_i}⟩_B − ⟨b̄·DŪ, U⟩_B + (ρ − b_{11})⟨Ū, U⟩_B

for each Ū, U ∈ W^{1;2}(B), and ⟨·,·⟩_B denotes the scalar product in L²(B).

Let us begin by introducing a family of penalized versions of the HJB equation (2.6). Let β ∈ C^∞(R) be a convex nondecreasing function with β(r) = 0 if r ≤ 0 and β(r) = 2r − 1 if


r ≥ 1. For each ε > 0, let V^ε be defined as in (A.2). As in Step 1 in the proof of Theorem A.1 in Appendix A, V^ε is a C²-solution to the partial differential equation

(3.3)  ρV^ε − LV^ε + (1/ε)β((V^ε_{x_1})² − 1) = h,  x ∈ R².

It is possible to show (see Step 2 in the proof of Theorem A.1 in Appendix A) that, for each R > 0, there exists a constant C_R such that

(3.4)  sup_{ε∈(0,1)} ‖V^ε‖_{W^{2;∞}(B_R)} ≤ C_R.

Moreover (as in (A.18) in the proof of Theorem A.1), as ε → 0, on each subsequence we have:

(3.5)  (V^ε, DV^ε) converges to (V, DV) uniformly in B_R;
        D²V^ε converges to D²V weakly in L²(B_R).

We now show that V_{x_1} ∈ K. Since the W^{1;2}_{loc}-regularity of V_{x_1} is already known (cf. Theorem A.1 in Appendix A), we only need to show that |V_{x_1}| ≤ 1 in R². To this end, take R > 0 and observe that, by (3.4) and (3.3), we have

(3.6)  sup_{ε∈(0,1)} ‖β((V^ε_{x_1})² − 1)‖_{L²(B_R)} ≤ C_R ε,

where the constant C_R > 0 does not depend on ε. Moreover, up to considering a larger C_R, by the estimate (3.4) and the definition of β, we also have the pointwise estimate

(3.7)  |β((V^ε_{x_1})² − 1)| ≤ 2((V^ε_{x_1})² + 1) ≤ C_R,  on B_R, for each ε ∈ (0,1).

Therefore, the limits in (3.5) and the estimates (3.7) allow us to invoke the dominated convergence theorem to deduce, thanks to (3.6), that

‖β((V_{x_1})² − 1)‖_{L²(B_R)} = lim_{ε→0} ‖β((V^ε_{x_1})² − 1)‖_{L²(B_R)} = 0.

Since R is arbitrary, we conclude that |V_{x_1}| ≤ 1 a.e. in R², and therefore that W ∈ K_ψ. We continue by proving (3.2). Since V^ε is a solution to (3.3), a standard bootstrapping argument (using Theorem 6.17 at p. 109 in [29]) allows us to improve the regularity of V^ε and to prove that V^ε ∈ C⁴. Therefore, we can differentiate equation (3.3) with respect to x_1 in order to get an equation for V^ε_{x_1}. That is,

(3.8)  [(ρ − b_{11}) − L]V^ε_{x_1} + (2/ε)β′((V^ε_{x_1})² − 1)V^ε_{x_1}V^ε_{x_1x_1} = ĥ^ε,  x ∈ R²,

where we have defined ĥ^ε := h_{x_1} + b²_{x_1}V^ε_{x_2}. Moreover, by (3.8), the localized function V^ε_ψ := V^ε_{x_1}ψ is a solution to the equation

(3.9)  [(ρ − b_{11}) − L]V^ε_ψ + (2/ε)β′((V^ε_{x_1})² − 1)V^ε_ψ V^ε_{x_1x_1} = Ĥ^ε,  x ∈ R²,

where Ĥ^ε := ĥ^ε ψ − V^ε_{x_1}Lψ − DV^ε_{x_1}·Dψ.

Let now U ∈ K_ψ. Taking the scalar product of (3.9) with U − V^ε_ψ, an integration by parts gives

(3.10)  A_B(V^ε_ψ, U − V^ε_ψ) + (2/ε)⟨β′((V^ε_{x_1})² − 1)V^ε_ψ V^ε_{x_1x_1}, U − V^ε_ψ⟩_B = ⟨Ĥ^ε, U − V^ε_ψ⟩_B.

Moreover, since σ > 0, the operator U ↦ ((σ²/2) Σ_{i=1}^2 ⟨U_{x_i}, U_{x_i}⟩_B)^{1/2}, U ∈ W^{1;2}(B), defines a norm on W^{1;2}_0(B), and it is therefore lower semi-continuous with respect to the weak convergence


in W^{1;2}_0(B). By the limits in (3.5), this implies that

(3.11)  lim inf_{ε→0} (σ²/2) Σ_{i=1}^2 ⟨V^ε_{ψx_i}, V^ε_{ψx_i}⟩_B ≥ (σ²/2) Σ_{i=1}^2 ⟨W_{x_i}, W_{x_i}⟩_B.

Therefore, exploiting the convergences in (3.5) and (3.11) and taking the liminf as ε → 0 in (3.10), we obtain

(3.12)  A_B(W, U − W) + lim inf_{ε→0} (2/ε)⟨β′((V^ε_{x_1})² − 1)V^ε_ψ V^ε_{x_1x_1}, U − V^ε_ψ⟩_B ≥ ⟨Ĥ, U − W⟩_B.

In order to prove (3.2), it thus only remains to show that the scalar product in (3.12) involving β′ is nonpositive. Write U as U = ψŪ, with Ū ∈ K. If x ∈ R² is such that (V^ε_{x_1}(x))² ≤ (Ū(x))², then β′((V^ε_{x_1}(x))² − 1) = 0 since Ū ∈ K. On the other hand, if (V^ε_{x_1}(x))² > (Ū(x))², then we have 2V^ε_ψ(U − V^ε_ψ) ≤ U² − (V^ε_ψ)² < 0. Hence, since V^ε is convex and β′ nonnegative, in both cases we deduce that

(2/ε)β′((V^ε_{x_1})² − 1)V^ε_ψ V^ε_{x_1x_1}(U − V^ε_ψ) ≤ 0.

Therefore, we conclude that W is a solution to the variational inequality (3.2).

Finally, since σ > 0, the W^{2;∞}_{loc}-regularity of V_{x_1} follows from Theorem 4.1 at p. 31 in [28], slightly modified in order to fit problem (3.2) (see Problem 1 at p. 44, combined with Problems 2 and 5 at p. 29 in [28]).

Step 2. We now prove that V_{x_1} is a pointwise solution to (3.1). For B ⊂ R² open and bounded and ψ ∈ C_c^∞(B), by Step 1 we have that V_{x_1}ψ is a solution to the variational inequality (3.2).

Moreover, thanks to the regularity of V_{x_1}, an integration by parts in (3.2) reveals that

(3.13)  ⟨L̂ψ, (U − V_{x_1})ψ⟩_B ≥ 0,  for each U ∈ K,

where L̂ := [(ρ − b_{11}) − L]V_{x_1} − ĥ. For every ε > 0, define the sets Ŵ^ε := {|V_{x_1}| < 1 − ε} and, for N > 0 and 0 < η < ε/N, set ψ̂ := −ηL̂ 1_{Ŵ^ε} 1_{{|L̂|<N}}. Define next U := V_{x_1} + ψ̂, and observe that U ∈ K.

With this choice of U, the inequality (3.13) rewrites as

0 ≤ ∫_B L̂(U − V_{x_1})ψ² dx = −η ∫_{R²} L̂²ψ² 1_{Ŵ^ε} 1_{{|L̂|<N}} dx,

which in turn implies that ∫_{R²} L̂²ψ² 1_{Ŵ^ε} 1_{{|L̂|<N}} dx = 0. Taking limits as N → ∞ and ε → 0, by the monotone convergence theorem we conclude that ∫_W L̂²ψ² dx = 0; that is, L̂ = 0 a.e. in W.

Finally, defining the two regions

(3.14)  I⁻ := {x ∈ R² | V_{x_1}(x) = −1}  and  I⁺ := {x ∈ R² | V_{x_1}(x) = 1},

we can repeat the arguments above with ψ̂ := −ηL̂⁺ 1_{I⁺} 1_{{|L̂|<N}} and ψ̂ := −ηL̂⁻ 1_{I⁻} 1_{{|L̂|<N}}, in order to deduce that L̂ ≤ 0 a.e. in I⁺ ∪ I⁻, thus completing the proof of the theorem.

Theorem 3.1 allows to provide a probabilistic representation of $V_{x_1}$ in terms of a Dynkin game. Let $\mathcal{T}$ be the set of $\mathbb{F}$-stopping times and, for $\tau_1, \tau_2 \in \mathcal{T}$, define the functional

$G(x; \tau_1, \tau_2) := \mathbb{E}\Big[ \int_0^{\tau_1 \wedge \tau_2} e^{-\hat\rho t}\, \hat{h}(X^x_t)\, dt - e^{-\hat\rho \tau_1} \mathbf{1}_{\{\tau_1 \leq \tau_2,\, \tau_1 < \infty\}} + e^{-\hat\rho \tau_2} \mathbf{1}_{\{\tau_2 < \tau_1\}} \Big],$

where $\hat{h} = h_{x_1} + b^2_{x_1} V_{x_2}$ (cf. Theorem 3.1), the process $X^x$ denotes the solution to the uncontrolled SDE (2.1), and $\hat\rho := \rho - b^1_{x_1}$. Consider the 2-player stochastic differential game of optimal stopping in which Player 1 (resp. Player 2) is allowed to choose a stopping time $\tau_1$ (resp. $\tau_2$) in order to maximize (resp. minimize) the functional $G$.

Recalling the definitions of $\mathcal{I}_-$ and $\mathcal{I}_+$ given in (3.14), from Theorem 3.1 we obtain the following verification theorem. Its proof is based on a generalized version of Itô's formula (see Theorem 1 at p. 122 in [46]), which can be applied to the process $(e^{-\hat\rho t} V_{x_1}(X^x_t))_{t \geq 0}$ because $V_{x_1} \in W^{2,\infty}_{loc}(\mathbb{R}^2)$ by Theorem 3.1. Since these arguments are standard, we omit the details in the interest of length.

Theorem 3.2. For each $x \in \mathbb{R}^2$, the profile strategy $(\bar\tau_1, \bar\tau_2)$ given by the stopping times

$\bar\tau_1 := \inf\{t \geq 0 \,|\, X^x_t \in \mathcal{I}_-\}$ and $\bar\tau_2 := \inf\{t \geq 0 \,|\, X^x_t \in \mathcal{I}_+\}$

is a saddle point of the Dynkin game, and its corresponding value equals $V_{x_1}(x)$; that is, $G(x; \tau_1, \bar\tau_2) \leq V_{x_1}(x) = G(x; \bar\tau_1, \bar\tau_2) \leq G(x; \bar\tau_1, \tau_2)$, for each $\tau_1, \tau_2 \in \mathcal{T}$. Moreover, we have

(3.15) $V_{x_1}(x) = \sup_{\tau_1} \inf_{\tau_2} G(x; \tau_1, \tau_2) = \inf_{\tau_2} \sup_{\tau_1} G(x; \tau_1, \tau_2).$
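To see the mechanics of this representation, the following toy computation solves a discrete-time analogue of the Dynkin game by backward induction. It is a sketch under simplifying assumptions, not the paper's model: the chain is a symmetric random walk on a one-dimensional grid, and `h_hat`, `rho_hat`, and the grid are illustrative stand-ins for $\hat h$, $\hat\rho$ and the diffusion $X^x$. Stopping payoffs of $-1$ (for the maximizer) and $+1$ (for the minimizer) confine the value to $[-1, 1]$, mirroring the gradient bound $|V_{x_1}| \leq 1$.

```python
import numpy as np

def dynkin_value(h_hat, rho_hat=0.5, dt=0.01, n_steps=200, n_space=201):
    """Backward induction for a toy discrete Dynkin game on a random-walk chain.

    Player 1 may stop and receive -1; Player 2 may stop and pay +1; until
    someone stops, the running payoff h_hat accrues. Clipping the continuation
    value to [-1, 1] encodes both stopping options at once.
    """
    x = np.linspace(-2.0, 2.0, n_space)
    disc = np.exp(-rho_hat * dt)
    V = np.clip(h_hat(x) * dt, -1.0, 1.0)        # terminal layer
    for _ in range(n_steps):
        # symmetric random-walk transition: average of the two neighbours
        cont = 0.5 * (np.roll(V, 1) + np.roll(V, -1))
        cont[0], cont[-1] = V[1], V[-2]          # reflecting grid edges
        V = np.clip(h_hat(x) * dt + disc * cont, -1.0, 1.0)
    return x, V

x, V = dynkin_value(lambda s: 3.0 * s)           # increasing running payoff
# the game value stays in [-1, 1], as the gradient constraint predicts
```

The regions where the value sticks at $-1$ or $+1$ are the discrete analogues of $\mathcal{I}_-$ and $\mathcal{I}_+$, where immediate stopping is optimal for the respective player.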

3.1.2. The monotonicity property. We now show how Condition 2 in Assumption 2.1, together with Theorems 3.1 and 3.2, leads to an important monotonicity of $V_{x_1}$.

Proposition 3.3. We have $b^2_{x_1} V_{x_1 x_2} \geq 0$ in $\mathbb{R}^2$.

Proof. Since $b^2_{x_1} \leq 0$ by Condition 2 in Assumption 2.1, it is enough to show that $V_{x_1 x_2} \leq 0$. Fix an initial condition $x \in \mathbb{R}^2$, take $r > 0$, and define a new initial condition $x_r \in \mathbb{R}^2$ by setting $x_r := x + r e_2$. Let $X^{x_r} = (X^{1,x_r}, X^{2,x_r})$ be the solution to the uncontrolled dynamics (2.1) with initial condition $x_r$. By the structure we assumed on the drift, this perturbation of the initial condition affects only the second component of $X^{x_r}$. Indeed, since the second component of $x_r$ dominates that of $x$, a standard comparison principle for SDEs (see [37]) gives $X^{2,x_r}_t - X^{2,x}_t \geq 0$ for each $t \geq 0$, $\mathbb{P}$-a.s., while $X^{1,x_r} = X^{1,x}$. Hence, since $h_{x_1 x_2} \leq 0$, we have

(3.16) $h_{x_1}(X^{x_r}_t) \leq h_{x_1}(X^x_t)$, for each $t \geq 0$, $\mathbb{P}$-a.s.

Moreover, since $b^2_{x_1} \leq 0$, we can exploit the convexity of $V$ to obtain

(3.17) $b^2_{x_1}(X^{x_r}_t)\big(V_{x_2}(X^{x_r}_t) - V_{x_2}(X^x_t)\big) = b^2_{x_1}(X^{x_r}_t)\big(X^{2,x_r}_t - X^{2,x}_t\big) \int_0^1 V_{x_2 x_2}\big(X^x_t + s(X^{x_r}_t - X^x_t)\big)\, ds \leq 0$, for each $t \geq 0$, $\mathbb{P}$-a.s.

Let us now prove that $V_{x_2}(y) \geq 0$ for each $y \in \mathbb{R}^2$. Fix $y \in \mathbb{R}^2$ and let $v$ be an optimal control for $y$. Observe that, for each $\delta > 0$, we can again employ a comparison principle to deduce that $X^{1,y;v}_t - X^{1,y-\delta e_2;v}_t = 0$ and $X^{2,y;v}_t - X^{2,y-\delta e_2;v}_t \geq 0$, for each $t \geq 0$, $\mathbb{P}$-a.s. Since $h_{x_2} \geq 0$ and $V \in C^1(\mathbb{R}^2)$, this in turn implies that

(3.18) $V_{x_2}(y) = \lim_{\delta \to 0} \frac{V(y) - V(y - \delta e_2)}{\delta} \geq \lim_{\delta \to 0} \frac{J(y; v) - J(y - \delta e_2; v)}{\delta} \geq \lim_{\delta \to 0} \frac{1}{\delta}\, \mathbb{E}\Big[\int_0^\infty e^{-\rho t} \big(h(X^{y;v}_t) - h(X^{y - \delta e_2; v}_t)\big)\, dt\Big] \geq 0,$

where we have used that the control $v$ is suboptimal for the initial condition $y - \delta e_2$. Hence, since $b^2_{x_1 x_2} \leq 0$, we obtain that

(3.19) $\big(b^2_{x_1}(X^{x_r}_t) - b^2_{x_1}(X^x_t)\big) V_{x_2}(X^x_t) \leq 0$, for each $t \geq 0$, $\mathbb{P}$-a.s.

Summing now the inequalities (3.16), (3.17) and (3.19), we find

(3.20) $h_{x_1}(X^{x_r}_t) + b^2_{x_1}(X^{x_r}_t) V_{x_2}(X^{x_r}_t) \leq h_{x_1}(X^x_t) + b^2_{x_1}(X^x_t) V_{x_2}(X^x_t)$, for each $t \geq 0$, $\mathbb{P}$-a.s.;

that is, $\hat{h}(X^{x_r}) \leq \hat{h}(X^x)$. Therefore, for all stopping times $\tau_1, \tau_2 \in \mathcal{T}$, we deduce that $G(x_r; \tau_1, \tau_2) \leq G(x; \tau_1, \tau_2)$. Taking the supremum over $\tau_1 \in \mathcal{T}$ and the infimum over $\tau_2 \in \mathcal{T}$ in the latter inequality, we deduce, in light of (3.15) in Theorem 3.2, that $V_{x_1}(x_r) \leq V_{x_1}(x)$. Hence, we conclude that $V_{x_1 x_2} \leq 0$ in $\mathbb{R}^2$, which completes the proof of the proposition.
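The pathwise comparison used repeatedly in the proof above can be checked numerically. The sketch below is illustrative only: the one-dimensional drift `b` and constant volatility are hypothetical stand-ins for the dynamics (2.1). Two Euler-Maruyama paths driven by the same Brownian increments start from ordered initial conditions; with a Lipschitz drift and a step size below the reciprocal of the Lipschitz constant, the one-step map $y \mapsto y + b(y)\Delta t$ is increasing, so the ordering is preserved along the whole paths.

```python
import numpy as np

def euler_paths(x0, drift, sigma, dw, dt):
    """Euler-Maruyama path for dX = drift(X) dt + sigma dW, with given increments dw."""
    x = np.empty(len(dw) + 1)
    x[0] = x0
    for k, inc in enumerate(dw):
        x[k + 1] = x[k] + drift(x[k]) * dt + sigma * inc
    return x

rng = np.random.default_rng(0)
dt, n = 0.01, 1000
dw = rng.normal(0.0, np.sqrt(dt), n)    # common Brownian increments
b = lambda y: 1.0 - y                   # toy Lipschitz drift (constant L = 1, dt < 1/L)
lo = euler_paths(0.0, b, 0.3, dw, dt)
hi = euler_paths(0.5, b, 0.3, dw, dt)   # perturbed initial condition, as x_r = x + r e_2
assert np.all(hi >= lo)                 # pathwise ordering, as in the comparison principle
```

With additive noise and shared increments, the difference of the two paths is deterministic and contracts geometrically, which is exactly why the ordering never breaks.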

3.2. Step b: Construction of $\varepsilon$-optimal policies. For every $\varepsilon > 0$ define the sets

$\mathcal{W}^\varepsilon := \{x \in \mathbb{R}^2 \,|\, V^2_{x_1}(x) < 1 - \varepsilon\}$, $\quad \mathcal{S}^\varepsilon := \partial \mathcal{W}^\varepsilon$.

The proof of the following lemma is obtained by combining arguments from [45] with the monotonicity property shown in Proposition 3.3.

Lemma 3.4. For each $\varepsilon > 0$ such that $\bar{x} \in \mathcal{W}^\varepsilon$, there exists a solution $v^\varepsilon \in \mathcal{V}$ to the (classical) Skorokhod problem for the SDE (2.2) in $\mathcal{W}^\varepsilon$, starting at $\bar{x}$, with reflection direction $-V_{x_1}/|V_{x_1}|\, e_1$.

Proof. Fix $\varepsilon > 0$ such that $\bar{x} \in \mathcal{W}^\varepsilon$. In order to employ the results of [48] to construct $v^\varepsilon$ as the solution of the Skorokhod problem with reflection along $\mathcal{S}^\varepsilon$, we first show that $\mathcal{S}^\varepsilon$ is a $C^3$ hypersurface.

To this end, we begin the proof by showing that

(3.21) $V_{x_1 x_1}(x) > 0$, for each $x \in \mathcal{W}$.

Take indeed $x \in \mathcal{W}$ and $\delta > 0$ such that $B_\delta(x) \subset \mathcal{W}$. Since $V$ solves the linear equation $\rho V - \mathcal{L} V = h$ in $\mathcal{W}$, from Theorem 6.17 at p. 109 in [29] it follows that $V \in C^4(\mathcal{W})$. Therefore, we can differentiate the HJB equation (2.6) twice with respect to $x_1$ and obtain an equation for $V_{x_1 x_1}$:

(3.22) $(\rho - 2 b^1_{x_1}) V_{x_1 x_1} - \mathcal{L} V_{x_1 x_1} = h_{x_1 x_1} + 2 b^2_{x_1} V_{x_1 x_2} + b^2_{x_1 x_1} V_{x_2}$, in $B_\delta(x)$.

Since $h_{x_1 x_1} > 0$ by assumption, thanks to Proposition 3.3 we have that $h_{x_1 x_1} + 2 b^2_{x_1} V_{x_1 x_2} > 0$. By inequality (3.18) in the proof of Proposition 3.3 and the convexity of $b^2$, we deduce that $b^2_{x_1 x_1} V_{x_2} \geq 0$. Therefore, the right-hand side of (3.22) is positive. Next, by the strong maximum principle (see Theorem 3.5 at p. 35 in [29]), $V_{x_1 x_1}$ cannot achieve a nonpositive local minimum in $B_\delta(x)$, unless it is constant. If $V_{x_1 x_1}$ is constant in $B_\delta(x)$, then by (3.22) we obtain $V_{x_1 x_1} > 0$, as desired. If $V_{x_1 x_1}$ attains its minimum at the boundary $\partial B_\delta(x)$, then by convexity of $V$ we still have $V_{x_1 x_1}(y) > \min_{\partial B_\delta(x)} V_{x_1 x_1} \geq 0$ for each $y \in B_\delta(x)$, which also proves (3.21).

Next, define $\bar\nu(x) := V_{x_1}(x)/|V_{x_1}(x)|\, e_1$ for each $x \in \mathcal{S}^\varepsilon$, and $w(y) := |V_{x_1}(y)|^2$ for each $y \in \mathcal{W}$. Notice that $\sqrt{w(y)} = |\partial_{\bar\nu} V(y)|$. For $R > 0$, by compactness of $\mathcal{W}^{\varepsilon/2}_R := \mathcal{W}^{\varepsilon/2} \cap B_R$, in light of (3.21) we can find a constant $c^R_\varepsilon > 0$ such that

(3.23) $\inf_{x \in \mathcal{W}^{\varepsilon/2}_R} V_{x_1 x_1}(x) \geq c^R_\varepsilon > 0.$

Therefore, for $x \in \mathcal{S}^\varepsilon$ and $R$ large enough, by (3.23) we have

$\sqrt{w(x + \lambda \bar\nu)} = \partial_{\bar\nu} V(x + \lambda \bar\nu) \geq \partial_{\bar\nu} V(x) + \lambda c^R_\varepsilon/2 = \sqrt{w(x)} + \lambda c^R_\varepsilon/2,$

and hence

(3.24) $\partial_{\bar\nu} \sqrt{w}(x) \geq c^R_\varepsilon/2.$

It thus follows that $\partial_{\bar\nu} w \neq 0$ on $\mathcal{S}^\varepsilon$. This implies, by the implicit function theorem, that $\mathcal{S}^\varepsilon$ is a $C^3$-hypersurface.

Now, by (3.24), arguing as in Lemma 2.7 in [45], it is possible to show that the vector $-\bar\nu$ is not tangential to $\mathcal{S}^\varepsilon$; moreover, by the definitions of $\mathcal{W}^\varepsilon$ and $\bar\nu$, the vector $-\bar\nu$ points inside $\mathcal{W}^\varepsilon$. Therefore, we can employ a version of Theorem 4.4 in [48] for unbounded domains in order to find a solution $v^\varepsilon \in \mathcal{V}$ to the Skorokhod problem for the SDE (2.2) in $\mathcal{W}^\varepsilon$, starting at $\bar{x}$, with reflection direction $-V_{x_1}/|V_{x_1}|\, e_1$.

We conclude this section with the following lemma. We omit its proof, since it can be established as in the proof of Lemma 2.8 in [45].

Lemma 3.5. For each $\bar{x} \in \mathcal{W}$ and $\varepsilon > 0$ such that $\bar{x} \in \mathcal{W}^\varepsilon$, let the control $v^\varepsilon$ be as in Lemma 3.4. Then $J(\bar{x}; v^\varepsilon) \to V(\bar{x})$ as $\varepsilon \to 0$.
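The Skorokhod construction behind Lemma 3.4 can be sketched in one dimension, where the waiting region is an interval. The code below is a hedged illustration, not the construction of [48]: the interval `[lo, hi]`, the drift, and the volatility are toy assumptions. Each Euler step first moves the state freely and then projects it back into the domain; the accumulated projection amounts, stored in `xi`, play the role of the reflection process $|v^\varepsilon|$, and they increase only when the boundary is reached.

```python
import numpy as np

def reflected_euler(x0, lo, hi, drift, sigma, dt, n, seed=1):
    """Euler scheme for an SDE reflected at the endpoints of [lo, hi].

    Returns the path and the cumulative reflection effort xi, which grows
    only at steps where the unconstrained move would leave the domain.
    """
    rng = np.random.default_rng(seed)
    x = np.empty(n + 1)
    xi = np.zeros(n + 1)
    x[0] = x0
    for k in range(n):
        free = x[k] + drift(x[k]) * dt + sigma * rng.normal(0.0, np.sqrt(dt))
        proj = min(max(free, lo), hi)          # projection onto the domain
        x[k + 1] = proj
        xi[k + 1] = xi[k] + abs(free - proj)   # bounded-variation reflection term
    return x, xi

x, xi = reflected_euler(0.0, -1.0, 1.0, lambda y: 0.0, 1.0, 0.01, 2000)
assert np.all((x >= -1.0) & (x <= 1.0))        # the path never leaves the domain
moved = np.diff(xi) > 0
assert np.all(np.isin(x[1:][moved], [-1.0, 1.0]))  # reflection acts only at the boundary
```

This minimal-pushing behaviour is exactly the property that makes the reflected policy a natural candidate for optimality: the control acts only when the state attempts to exit the (approximating) waiting region.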

3.3. Step c: Characterization of the optimal control. Thanks to the results of Subsections 3.1 and 3.2, we can now prove Theorem 2.5. We provide a separate proof for each of the two claims.

3.3.1. Proof of Claim 1. We first prove Claim 1 for $\bar{x} \in \mathcal{W}$ and then, at the end of this subsection, give a proof for a general initial condition $\bar{x}$. Fix $\bar{x} \in \mathcal{W}$ and a sequence $(\varepsilon_n)_{n \in \mathbb{N}}$ converging to zero. To simplify the notation, in accordance with Lemma 3.4 we define the processes

$X^n := X^{\bar{x}; v^{\varepsilon_n}}, \quad v^n := v^{\varepsilon_n}, \quad \xi^n := |v^{\varepsilon_n}|, \quad$ for each $n \in \mathbb{N}$.

Bear in mind that the processes $v^n$, $\gamma^n$ and $\xi^n$ depend on the initial condition $\bar{x}$, and that, according to Lemma 3.5, the sequence of controls $(v^n)_{n \in \mathbb{N}}$ is a minimizing sequence; that is, $J(\bar{x}; v^n) \to V(\bar{x})$ as $n \to \infty$.

We begin with the following estimate.

Lemma 3.6. Let $p_0 := (2p-1)/2$. We have

(3.25) $\sup_n \int_0^\infty e^{-\rho t} \big(\mathbb{E}[|X^{1,n}_t|^p] + \mathbb{E}[|X^n_t|^{p_0}]\big)\, dt \leq C(1 + |\bar{x}|^p).$

Proof. Denoting by $X^{\bar{x}}$ the solution to (2.1), a standard use of Grönwall's inequality and of the Burkholder-Davis-Gundy inequality leads to the classical estimate

$\mathbb{E}[|X^{\bar{x}}_t|^p] \leq C e^{p \bar{L} t}(1 + |\bar{x}|^p)$, for each $t \geq 0$,

where $\bar{L}$ is the Lipschitz constant of $\bar{b}$ and $C > 0$ is a generic constant. Therefore, since the control constantly equal to zero is admissible, though not necessarily optimal, for $\bar{x}$, from the latter estimate and the growth rate of $h$ we obtain

$V(\bar{x}) \leq \mathbb{E}\Big[\int_0^\infty e^{-\rho t}\, h(X^{\bar{x}}_t)\, dt\Big] \leq C \int_0^\infty e^{-\rho t}\big(1 + \mathbb{E}[|X^{\bar{x}}_t|^p]\big)\, dt \leq C \int_0^\infty e^{-(\rho - p\bar{L})t}(1 + |\bar{x}|^p)\, dt \leq C(1 + |\bar{x}|^p),$

where we have used that $\rho > p\bar{L}$ by Condition 3a in Assumption 2.1. Therefore, since $(v^n)_{n \in \mathbb{N}}$ is a minimizing sequence, for all $n$ large enough we find the estimate

$\kappa_1 \int_0^\infty e^{-\rho t}\, \mathbb{E}[|X^{1,n}_t|^p]\, dt - \kappa_2 \leq J(\bar{x}; v^n) \leq C(1 + |\bar{x}|^p),$
