
3.5 Examples

3.5.1 Dynamic Entropic Risk Measures

As a first fundamental example we consider dynamic entropic risk measures or, equivalently, dynamic multiplier preferences. The robust representation is intuitive: the agent regards a reference distribution Q ∈ M as most likely, and distributions further away from it as increasingly unlikely. Hence, nature shall be punished the more severely the further "away" the chosen distribution is from that specific Q. Relative entropy turns out to be the measure of distance in the robust representation. We introduce multiplier preferences as in [Maccheroni et al., 06b]; [Cheridito et al., 06] and [Föllmer & Penner, 06] equivalently consider this example as dynamic entropic risk measures. Let again (Ω, F, (F_t)_{t≤T}, P_0), T ∈ ℕ ∪ {∞}, be the underlying space and let τ denote a stopping time.

Definition 3.5.1. For P ≪ Q locally,^17 we define the relative entropy of P with respect to Q at time t ≥ 0 as
\[
H_t(P|Q) := E_P[\ln(Z_t)],
\]
where Z_t := dP/dQ|_{F_t}. Furthermore, we define the conditional relative entropy of P with respect to Q at time t ≥ 0 as
\[
\hat H_t(P|Q) := E_P\Big[\ln\frac{Z_T}{Z_t}\,\Big|\,\mathcal F_t\Big]
= E_Q\Big[\frac{Z_T}{Z_t}\ln\frac{Z_T}{Z_t}\,\Big|\,\mathcal F_t\Big]\, I_{\{Z_t>0\}}.
\]

17 By definition of M this is satisfied for all distributions under consideration.

Basic properties of relative entropy are stated in [Csiszar, 75]: H_t(P|Q) ≥ 0, with H_t(P|Q) = 0 if and only if P = Q on F_t, i.e. Z_t = 1. As we assume the distributions under consideration to be locally equivalent, the indicator function in the last equation vanishes.

We now formally introduce dynamic multiplier preferences:

Definition 3.5.2. Let θ > 0. We say that the dynamic variational expected reward π_t^e(X_τ), t, τ ≤ T, is obtained by dynamic multiplier preferences given reference model Q or, equivalently, by dynamic entropic risk measures, if its robust representation is of the form^18
\[
\pi_t^e(X_\tau) = \operatorname*{ess\,inf}_{P\in\mathcal M}\Big\{ E_P[X_\tau\,|\,\mathcal F_t] + \theta\,\hat H_t(P|Q) \Big\}. \tag{3.10}
\]

Remark 3.5.3. The variational formula for relative entropy implies
\[
\pi_t^e(X_\tau) = -\theta\,\ln\Big(E_Q\big[e^{-\frac{1}{\theta}X_\tau}\,\big|\,\mathcal F_t\big]\Big).
\]
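To make this identity tangible, the following minimal numerical sketch (an illustration only; the finite sample space, the value of θ and the optimizer are our own choices and not part of the formal development) checks the static version of the formula: the infimum of E_P[X] + θH(P|Q) over all P ≪ Q coincides with −θ ln E_Q[e^{−X/θ}] and is attained at the Gibbs-type density dP*/dQ ∝ e^{−X/θ}.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)
n, theta = 6, 0.7
q = rng.dirichlet(np.ones(n))      # reference distribution Q on a finite space
x = rng.normal(size=n)             # payoff X

def objective(z):
    # parametrize P on the simplex via a softmax of z, so the search is unconstrained
    p = np.exp(z - z.max()); p /= p.sum()
    rel_entropy = np.sum(p * np.log(p / q))        # relative entropy H(P|Q)
    return p @ x + theta * rel_entropy

numeric = min(minimize(objective, rng.normal(size=n)).fun for _ in range(10))
closed_form = -theta * np.log(q @ np.exp(-x / theta))

p_star = q * np.exp(-x / theta); p_star /= p_star.sum()   # candidate minimizer dP*/dQ ∝ e^{-X/θ}
attained = p_star @ x + theta * np.sum(p_star * np.log(p_star / q))

print(numeric, closed_form, attained)   # all three agree up to solver tolerance
```

The same identity, applied conditionally given F_t, is what underlies the representation above.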

Proposition 3.5.4. Dynamic multiplier preferences with reference distribution Q ∈ M are time-consistent: the robust representation has minimal penalty α_t^min(P) = θ Ĥ_t(P|Q) for t ≤ T, P ∈ M, satisfying the no-gain condition. Hence, we have

\[
\pi_t^e(X_\tau) = X_t\, I_{\{\tau=t\}}
+ \operatorname*{ess\,inf}_{\mu\in\mathcal M|_{\mathcal F_{t+1}}}\Big\{ \int \pi_{t+1}^e(X_\tau)\,d\mu + \theta\,\hat H_{t+1}(\mu\,|\,Q(\cdot|\mathcal F_t)) \Big\}\, I_{\{\tau\ge t+1\}},
\]

18 This is the generalized version of the respective definition in [Maccheroni et al., 06b]. By conditional cash invariance, for τ ≤ t both sides of the equation equal X_τ.

where we set
\[
\hat H_{t+1}(\mu\,|\,Q(\cdot|\mathcal F_t)) := E_\mu\Big[\ln\Big(\frac{d\mu}{dQ(\cdot|\mathcal F_t)|_{\mathcal F_{t+1}}}\Big)\Big], \qquad \mu\in\mathcal M|_{\mathcal F_{t+1}},
\]
which, by abuse of notation, we also write as E_μ[ln(dμ/dQ(·|F_t) | F_{t+1})].

Proof. The specific form of the penalty is shown in [Föllmer & Penner, 06], Lemma 6.2, in terms of dynamic entropic risk measures: robust representations of these are equal to those of multiplier preferences up to a minus sign. Time-consistency is shown in [Föllmer & Penner, 06], p. 92.

We now show the specific form of π_t^e: by Corollary 3.2.16, we have to show γ_t(μ) = θ Ĥ_{t+1}(μ|Q(·|F_t)). For μ ∈ M|_{F_{t+1}} we recall γ_t(μ) := ess inf_{P∈M} α_t^min(μ ⊗_{t+1} P). As α_t^min only depends on the conditional distributions given F_t, we may write α_t^min(μ ⊗_{t+1} P) := α_t^min(Q ⊗_t μ ⊗_{t+1} P) for all Q ∈ M. Hence,
\[
\frac{1}{\theta}\gamma_t(\mu) = \operatorname*{ess\,inf}_{P\in\mathcal M} \alpha_t^{\min}(Q\otimes_t\mu\otimes_{t+1}P)
= \operatorname*{ess\,inf}_{P\in\mathcal M} E_{Q\otimes_t\mu\otimes_{t+1}P}\left[\ln\left(\frac{\frac{d(Q\otimes_t\mu\otimes_{t+1}P)}{dQ}\big|_{\mathcal F_T}}{\frac{d(Q\otimes_t\mu\otimes_{t+1}P)}{dQ}\big|_{\mathcal F_t}}\right)\,\middle|\,\mathcal F_t\right].
\]

First, note that, by dμ = d(Q ⊗_t μ)(·|F_t), we have
\[
E_\mu\left[\ln\left(\frac{d\mu}{dQ(\cdot|\mathcal F_t)}\Big|_{\mathcal F_{t+1}}\right)\right]
= E_{Q\otimes_t\mu}\left[\ln\left(\frac{d(Q\otimes_t\mu)}{dQ}\Big|_{\mathcal F_{t+1}}\right)\,\middle|\,\mathcal F_t\right].
\]

As the integrand is F_{t+1}-measurable and \(\frac{d(Q\otimes_t\mu)}{dQ}\big|_{\mathcal F_{t+1}} = \frac{d(Q\otimes_t\mu\otimes_{t+1}P)}{dQ}\big|_{\mathcal F_{t+1}}\), the following equation holds for all P ∈ M:
\[
E_\mu\left[\ln\left(\frac{d\mu}{dQ(\cdot|\mathcal F_t)}\Big|_{\mathcal F_{t+1}}\right)\right]
= E_{Q\otimes_t\mu\otimes_{t+1}P}\left[\ln\left(\frac{d(Q\otimes_t\mu\otimes_{t+1}P)}{dQ}\Big|_{\mathcal F_{t+1}}\right)\,\middle|\,\mathcal F_t\right].
\]

Hence, it remains to show for all R ∈ M that
\[
\begin{aligned}
E_{Q\otimes_t\mu\otimes_{t+1}R}\left[\ln\left(\frac{d(Q\otimes_t\mu\otimes_{t+1}R)}{dQ}\Big|_{\mathcal F_{t+1}}\right)\,\middle|\,\mathcal F_t\right]
&= \operatorname*{ess\,inf}_{P\in\mathcal M} E_{Q\otimes_t\mu\otimes_{t+1}P}\left[\ln\left(\frac{\frac{d(Q\otimes_t\mu\otimes_{t+1}P)}{dQ}\big|_{\mathcal F_T}}{\frac{d(Q\otimes_t\mu\otimes_{t+1}P)}{dQ}\big|_{\mathcal F_t}}\right)\,\middle|\,\mathcal F_t\right]\\
&= \operatorname*{ess\,inf}_{P\in\mathcal M} E_{Q\otimes_t\mu\otimes_{t+1}P}\left[\ln\left(\frac{d(Q\otimes_t\mu\otimes_{t+1}P)}{dQ}\Big|_{\mathcal F_T}\right)\,\middle|\,\mathcal F_t\right],
\end{aligned}
\]
where the last equation follows as \(\frac{d(Q\otimes_t\mu\otimes_{t+1}P)}{dQ}\big|_{\mathcal F_t} = 1\).

We know from the properties of the entropy that Ĥ_t(P|Q) ≥ 0, with equality if and only if P = Q on F_t. In the same way, we have that
\[
Q \in \arg\operatorname*{ess\,inf}_{P\in\mathcal M} E_{Q\otimes_t\mu\otimes_{t+1}P}\left[\ln\left(\frac{d(Q\otimes_t\mu\otimes_{t+1}P)}{dQ}\Big|_{\mathcal F_T}\right)\,\middle|\,\mathcal F_t\right].
\]
More precisely,
\[
\arg\operatorname*{ess\,inf}_{P\in\mathcal M} E_{Q\otimes_t\mu\otimes_{t+1}P}\left[\ln\left(\frac{d(Q\otimes_t\mu\otimes_{t+1}P)}{dQ}\Big|_{\mathcal F_T}\right)\,\middle|\,\mathcal F_t\right]
= \{ V\in\mathcal M \mid V = R\otimes_t\mu\otimes_{t+1}Q \text{ for some } R\in\mathcal M \}.
\]

Hence, we have
\[
\begin{aligned}
\operatorname*{ess\,inf}_{P\in\mathcal M} E_{Q\otimes_t\mu\otimes_{t+1}P}\left[\ln\left(\frac{d(Q\otimes_t\mu\otimes_{t+1}P)}{dQ}\Big|_{\mathcal F_T}\right)\,\middle|\,\mathcal F_t\right]
&= E_{Q\otimes_t\mu\otimes_{t+1}Q}\left[\ln\left(\frac{d(Q\otimes_t\mu\otimes_{t+1}Q)}{dQ}\Big|_{\mathcal F_T}\right)\,\middle|\,\mathcal F_t\right]\\
&= E_{Q\otimes_t\mu\otimes_{t+1}Q}\left[\ln\left(\frac{d(Q\otimes_t\mu\otimes_{t+1}Q)}{dQ}\Big|_{\mathcal F_{t+1}}\right)\,\middle|\,\mathcal F_t\right],
\end{aligned}
\]
where the second equality follows since q_t := dQ/dQ|_{F_t} = 1 for all t ≤ T and hence \(\frac{d(Q\otimes_t\mu\otimes_{t+1}Q)}{dQ}\big|_{\mathcal F_{t+1}} = \frac{d(Q\otimes_t\mu\otimes_{t+1}Q)}{dQ}\big|_{\mathcal F_\eta}\) for all η ≥ t + 1. This completes the proof.

For the value function, we thus have
\[
\begin{aligned}
V_t &= \operatorname*{ess\,sup}_{t\le\tau\le T} \pi_t^e(X_\tau)\\
&= \operatorname*{ess\,sup}_{t\le\tau\le T}\Big\{ X_t I_{\{\tau=t\}} + \operatorname*{ess\,inf}_{\mu\in\mathcal M|_{\mathcal F_{t+1}}}\Big[ \int \pi_{t+1}^e(X_\tau)\,d\mu + \theta\hat H_{t+1}(\mu|Q(\cdot|\mathcal F_t)) \Big] I_{\{\tau\ge t+1\}} \Big\}\\
&= \max\Big\{ X_t \,;\, \operatorname*{ess\,sup}_{t+1\le\tau\le T}\ \operatorname*{ess\,inf}_{\mu\in\mathcal M|_{\mathcal F_{t+1}}}\Big[ \int \pi_{t+1}^e(X_\tau)\,d\mu + \theta\hat H_{t+1}(\mu|Q(\cdot|\mathcal F_t)) \Big] \Big\}\\
&= \max\Big\{ X_t \,;\, \operatorname*{ess\,inf}_{\mu\in\mathcal M|_{\mathcal F_{t+1}}}\Big[ \int \operatorname*{ess\,sup}_{t+1\le\tau\le T}\pi_{t+1}^e(X_\tau)\,d\mu + \theta\hat H_{t+1}(\mu|Q(\cdot|\mathcal F_t)) \Big] \Big\}\\
&= \max\Big\{ X_t \,;\, \operatorname*{ess\,inf}_{Q\in\mathcal M}\Big[ E_Q[V_{t+1}|\mathcal F_t] + \alpha_t^{\min}(Q) \Big] \Big\},
\end{aligned}
\]
again showing the Bellman principle to hold, but having applied our minimax theorem. As we want to achieve explicit solutions, we further confine ourselves:

Assumption 3.5.5. Let the underlying probability space (Ω, F, (F_t)_{t≤T}, P_0) be given as the independent product of the time-t state spaces (S, 𝒮, ν_0), S ⊂ ℝ. Then P_0 = ⊗_{t=1}^T ν_0 and F_s is generated by the projection mappings ε_t : Ω → S, t ≤ s. In particular, the ε_t are i.i.d. with distribution ν_0 under P_0.

As in [Riedel, 09], we confine ourselves to the set
\[
\mathcal M_{[a,b]} := \Big\{ P^\beta \approx P_0 \,:\, \frac{dP^\beta}{dP_0}\Big|_{\mathcal F_t} = D_t^\beta \ \forall t,\ (\beta_t)_t \subset [a,b] \text{ predictable} \Big\},
\qquad
D_t^\beta := \exp\Big( \sum_{s=1}^t \beta_s\varepsilon_s - \sum_{s=1}^t L(\beta_s) \Big)
\]
for some predictable process (β_t)_{t≤T} ⊂ [a, b] ⊂ ℝ and L(β_t) := ln ∫_S e^{β_t x} ν_0(dx).
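As a small illustration of this parametrization (assuming, only for concreteness, the Bernoulli state space ν_0 = ½(δ_0 + δ_1) that reappears later in Example 3.5.17), the following sketch computes L(β) and the one-step factor of D_t^β; it also recovers the conditional probability P^β[ε_t = 1 | F_{t−1}] = e^β/(1+e^β) used later.

```python
import numpy as np

support = np.array([0.0, 1.0])     # state space S (Bernoulli case, as in Example 3.5.17)
nu0 = np.array([0.5, 0.5])         # reference one-step distribution nu_0

def L(beta):
    """Cumulant generating function L(beta) = ln of the integral of e^{beta x} over nu_0."""
    return np.log(np.sum(nu0 * np.exp(beta * support)))

def one_step_factor(beta, eps):
    """One-step factor of the density D_t^beta, i.e. exp(beta * eps_t - L(beta))."""
    return np.exp(beta * eps - L(beta))

beta = 0.4
# P^beta[eps_t = 1 | F_{t-1}] = nu_0({1}) * exp(beta - L(beta)) = e^beta / (1 + e^beta)
print(nu0[1] * one_step_factor(beta, 1.0), np.exp(beta) / (1 + np.exp(beta)))
```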

Remark 3.5.6. As we have now constrained the set of possible probability distributions, we note that we are no longer in the context of general dynamic entropic risk measures.

Notation 3.5.7. We write the reference distribution of the entropic penalty as Q := P^{β¹}, i.e. (β¹_t)_{t≤T} denotes the process defining the penalty's reference distribution. Note that Q is in general not equal to P_0. Other distributions in M we write as P := P^{β²}. Then

\[
\frac{dP}{dQ}\Big|_{\mathcal F_t} = \frac{D_t^{\beta^2}}{D_t^{\beta^1}}\,\frac{dP_0}{dP_0}\Big|_{\mathcal F_t}
= \exp\left( \sum_{s=1}^t (\beta_s^2 - \beta_s^1)\varepsilon_s - \sum_{s=1}^t [L(\beta_s^2) - L(\beta_s^1)] \right),
\]
and the entropic penalty with reference distribution Q is given by

\[
\alpha_t^{\min}(P) = \theta\,\hat H_t(P|Q)
= \theta\, E_P\left[ \sum_{s=t+1}^T (\beta_s^2-\beta_s^1)\varepsilon_s - \sum_{s=t+1}^T [L(\beta_s^2)-L(\beta_s^1)] \,\middle|\, \mathcal F_t \right].
\]
We write E^β := E_{P^β} and Ĥ_t(β²|β¹) := Ĥ_t(P^{β²}|P^{β¹}) as well as α_t^min(β²). Note that in case Q = P_0 we have (β¹_t)_{t≤T} ≡ 0 and hence, for P = P^{β²},
\[
\alpha_t^{\min}(P) = \theta\, E_P\left[ \sum_{s=t+1}^T \beta_s^2\varepsilon_s - \sum_{s=t+1}^T L(\beta_s^2) \,\middle|\, \mathcal F_t \right].
\]
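For constant parameters β¹ and β², the per-period contribution to this penalty is E^{β²}[(β² − β¹)ε_s − (L(β²) − L(β¹))] = (β² − β¹)L′(β²) − (L(β²) − L(β¹)), the Bregman divergence of L, which is nonnegative and vanishes exactly at β² = β¹. A minimal numerical sketch of this (again with the purely illustrative Bernoulli ν_0 from above):

```python
import numpy as np

support, nu0 = np.array([0.0, 1.0]), np.array([0.5, 0.5])   # illustrative Bernoulli nu_0

def L(b):
    return np.log(np.sum(nu0 * np.exp(b * support)))

def L_prime(b):
    w = nu0 * np.exp(b * support)
    return np.sum(support * w) / np.sum(w)      # E^beta[eps] under the tilted one-step law

def H_step(b2, b1):
    """Per-period conditional relative entropy of P^{b2} w.r.t. P^{b1} (constant parameters)."""
    return (b2 - b1) * L_prime(b2) - (L(b2) - L(b1))

b1 = 0.5
for b2 in (-1.0, 0.0, 0.5, 1.0, 2.0):
    print(b2, round(H_step(b2, b1), 4))   # nonnegative, and zero exactly at b2 = b1
```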

To make the value function (V_t)_{t≤T} more explicit, note that for μ ∈ M|_{F_{t+1}} given by a previsible (β²_t)_{t≤T} and the penalty's reference distribution Q ∈ M given by a previsible (β¹_t)_{t≤T}, we have
\[
\hat H_{t+1}(\mu\,|\,Q(\cdot|\mathcal F_t)) = E_\mu\Big[\ln\Big(\frac{d\mu}{dQ(\cdot|\mathcal F_t)}\Big|_{\mathcal F_{t+1}}\Big)\Big]
= E^{\beta^2}_{t+1}\big[(\beta^2_{t+1}-\beta^1_{t+1})\varepsilon_{t+1} - (L(\beta^2_{t+1})-L(\beta^1_{t+1}))\big].
\]
Hence, as above, the value is given by

\[
\begin{aligned}
V_t &= \operatorname*{ess\,sup}_{t\le\tau\le T}\ \operatorname*{ess\,inf}_{\beta^2\subset[a,b]} \Big\{ E^{\beta^2}[X_\tau\,|\,\mathcal F_t] + \theta\hat H_t(\beta^2|\beta^1) \Big\} \qquad\qquad (3.11)\\
&= \operatorname*{ess\,sup}_{t\le\tau\le T}\ \operatorname*{ess\,inf}_{\beta^2\subset[a,b]}\ E^{\beta^2}\!\left[ X_\tau + \theta\!\left( \sum_{s=t+1}^T (\beta_s^2-\beta_s^1)\varepsilon_s - \sum_{s=t+1}^T [L(\beta_s^2)-L(\beta_s^1)] \right) \middle|\ \mathcal F_t \right]\\
&= \max\Big\{ X_t \;;\; \operatorname*{ess\,sup}_{t+1\le\tau\le T}\ \operatorname*{ess\,inf}_{\beta^2_{t+1}\in[a,b]}\ E^{\beta^2}_{t+1}\Big[ \pi_{t+1}(X_\tau) + \theta\big( (\beta^2_{t+1}-\beta^1_{t+1})\varepsilon_{t+1} - (L(\beta^2_{t+1})-L(\beta^1_{t+1})) \big) \Big] \Big\}\\
&= \max\Big\{ X_t \;;\; \operatorname*{ess\,inf}_{\beta^2_{t+1}\in[a,b]}\ E^{\beta^2}_{t+1}\Big[ V_{t+1} + \theta\big( (\beta^2_{t+1}-\beta^1_{t+1})\varepsilon_{t+1} - (L(\beta^2_{t+1})-L(\beta^1_{t+1})) \big) \Big] \Big\},
\end{aligned}
\]
where the last equality follows from the minimax result. In particular, we see that the value of the problem – and hence the worst case distribution – depends on the reference distribution Q = P^{β¹} of the penalty.
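The inner infimum of this recursion is a one-dimensional minimization over β²_{t+1} ∈ [a, b] at each F_t-node and can be evaluated directly, e.g. on a grid. The following sketch (Bernoulli ν_0 as above; all numerical inputs are hypothetical) returns both the one-step value and the minimizing β̄²_{t+1}, which is exactly the previsible worst-case parameter discussed further below; iterating it backwards from T − 1 yields the value process together with the worst-case process.

```python
import numpy as np

support, nu0 = np.array([0.0, 1.0]), np.array([0.5, 0.5])   # illustrative Bernoulli nu_0

def L(b):
    return np.log(np.sum(nu0 * np.exp(b * support)))

def one_step_inf(v_next, beta1, theta, a, b, grid=2001):
    """
    Inner infimum of the recursion at one F_t-node:
        inf over b2 in [a,b] of E^{b2}[ v_next(eps) + theta*((b2-beta1)*eps - (L(b2)-L(beta1))) ].
    v_next: array of V_{t+1}-values, one entry per realization of eps_{t+1} (here eps = 0, 1).
    Returns (value, minimizing b2).
    """
    best_val, best_b2 = np.inf, None
    for b2 in np.linspace(a, b, grid):
        density = nu0 * np.exp(b2 * support - L(b2))          # one-step law of eps under P^{b2}
        penalty = (b2 - beta1) * support - (L(b2) - L(beta1))
        val = np.sum(density * (v_next + theta * penalty))
        if val < best_val:
            best_val, best_b2 = val, b2
    return best_val, best_b2

# one Bellman step with hypothetical inputs: V_t = max(X_t, inner infimum)
X_t, v_next = 1.0, np.array([0.6, 1.8])
val, beta_bar = one_step_inf(v_next, beta1=1.0, theta=1.0, a=-1.0, b=1.0)
print(max(X_t, val), beta_bar)
```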

In case T < ∞, the same recursion has to hold for the Snell envelope (U_t)_{t≤T} by Theorem 3.4.1:
\[
\begin{aligned}
U_t &= \max\{ X_t \,;\, \pi_t(U_{t+1}) \}\\
&= \max\Big\{ X_t \;;\; \operatorname*{ess\,inf}_{\mu\in\mathcal M|_{\mathcal F_{t+1}}} \Big[ \int \pi_{t+1}(U_{t+1})\,d\mu + \theta\hat H_{t+1}(\mu\,|\,Q(\cdot|\mathcal F_t)) \Big] \Big\}\\
&= \max\Big\{ X_t \;;\; \operatorname*{ess\,inf}_{\mu\in\mathcal M|_{\mathcal F_{t+1}}} \Big[ \int U_{t+1}\,d\mu + \theta\hat H_{t+1}(\mu\,|\,Q(\cdot|\mathcal F_t)) \Big] \Big\}\\
&= \max\Big\{ X_t \;;\; \operatorname*{ess\,inf}_{\beta^2_{t+1}\in[a,b]}\ E^{\beta^2}_{t+1}\Big[ U_{t+1} + \theta\big( (\beta^2_{t+1}-\beta^1_{t+1})\varepsilon_{t+1} - (L(\beta^2_{t+1})-L(\beta^1_{t+1})) \big) \Big] \Big\}.
\end{aligned}
\]
To further solve problems under entropic risk, we have to make specific properties of the payoff process explicit. We confine ourselves to monotone problems:

Assumption 3.5.8. X_t := f(t, ε_t), where f is a bounded measurable function that is strictly monotone in the state variable ε_t.

For monotone payoff processes in the ambiguous, i.e. multiple priors, case it is shown in [Riedel, 09] that U_t is increasing in ε_t. However, having a look at the proof therein (Appendix F), we see that this crucially depends on ε_t being independent of F_{t−1} (cf. equation (12) in [Riedel, 09]), as the process (β²_t)_t yielding the worst case distribution under multiple priors is constant, and on the worst case distribution being the one that is stochastically dominated for the payoff process (Lemma 13). We will see that these arguments do not have to hold in the case of variational preferences. Furthermore, in the multiple priors case of [Riedel, 09], the calculation of a worst case measure is done by virtue of stochastic dominance on the payoff process. It is intuitive that this cannot work as elegantly under variational preferences: the penalty is not trivial, i.e. not zero on some set of priors and infinite else. In particular, in the entropic case, the worst-case measure depends on the reference distribution Q: there might be a trade-off between stochastic dominance on (X_t)_t and the penalty. The penalty increases the further nature moves away from Q in the direction of a distribution minimizing the expectation of the payoff process.

To gain insights, we have a look at a special case for the reference distribution of the penalty:

Example 3.5.9. Let f be increasing and the reference distribution be Q = P^a, the distribution given by β¹_t = a for all t ≤ T. Consider the first term in the value function, E^{β²}[f(τ, ε_τ)|F_t]: P^a is stochastically dominated, i.e. it minimizes that term on M_{[a,b]}. P^a also minimizes the penalty: Ĥ_t(β²|a) := Ĥ_t(P^{β²}|P^a) is increasing in β² on [a, b], Ĥ_t ≥ 0, and Ĥ_t = 0 if and only if P^{β²} = P^a. Hence we have equivalence of the problem under dynamic multiplier preferences and the optimal stopping problem under the worst case distribution P^a as in Theorem 5 in [Riedel, 09].

Proposition 3.5.10. Let f be increasing, T < ∞, and let τ^a denote the optimal stopping time for the classical optimal stopping problem of (X_t)_{t≤T} under subjective distribution P^a, i.e. τ^a solves max_{0≤τ≤T} E^a[X_τ]. Let Q = P^a be the reference measure for the penalty, i.e. β¹_t = a, t ≤ T, in equation (3.11). Then τ^a is the solution to the robust problem with dynamic multiplier preferences (π_t^e)_{t≤T} as given in equation (3.11).

Proof. Intuitively, in Appendix F in [Riedel, 09] it is shown that P^a is the worst case distribution for the first term in the value function (3.11). As Ĥ_t(a|a) = 0 ≤ Ĥ_t(β²|a) for all β², P^a also minimizes the penalty and hence is the worst case distribution in the multiplier case when Q = P^a.

Formally: for all increasing bounded measurable functions h : Ω → ℝ and all t ≥ 1, we have by Lemma 13 in [Riedel, 09]
\[
\begin{aligned}
E^a[h(\varepsilon_t)\,|\,\mathcal F_{t-1}] &= \operatorname*{ess\,inf}_{\beta^2\in[a,b]} E^{\beta^2}[h(\varepsilon_t)\,|\,\mathcal F_{t-1}]\\
&= \operatorname*{ess\,inf}_{\beta^2\in[a,b]} E^{\beta^2}[h(\varepsilon_t)\,|\,\mathcal F_{t-1}] + \underbrace{\min_{\beta^2\in[a,b]} \theta\hat H_{t-1}(\beta^2|a)}_{=\,\hat H_{t-1}(a|a)\,=\,0}\\
&= \operatorname*{ess\,inf}_{\beta^2\in[a,b]}\Big\{ E^{\beta^2}[h(\varepsilon_t)\,|\,\mathcal F_{t-1}] + \theta\hat H_{t-1}(\beta^2|a) \Big\},
\end{aligned}
\]
where the last equation follows as the joint minimizer of both summands is P^a. Given this result, we can mimic the proof of Theorem 5 in [Riedel, 09]: let (U_t)_{t≤T} denote the variational Snell envelope of the problem under multiplier preferences and (U_t^a)_{t≤T} the classical Snell envelope with respect to subjective prior P^a. For t = T, we have U_T = X_T = f(T, ε_T) = U_T^a, and hence U_T is increasing in ε_T. As by induction hypothesis U_{t+1} is an increasing function of ε_{t+1}, say U_{t+1} = u(ε_{t+1}) for some bounded measurable increasing u, we have for all t < T

\[
\begin{aligned}
U_t &= \max\Big\{ f(t,\varepsilon_t),\ \operatorname*{ess\,inf}_{\beta^2\in[a,b]}\big\{ E^{\beta^2}[U_{t+1}|\mathcal F_t] + \theta\hat H_t(\beta^2|a) \big\} \Big\}\\
&= \max\Big\{ f(t,\varepsilon_t),\ E^a[U_{t+1}|\mathcal F_t] + \underbrace{\theta\hat H_t(a|a)}_{=0} \Big\}\\
&= \max\{ f(t,\varepsilon_t),\ E^a[U_{t+1}|\mathcal F_t] \} =: U_t^a.
\end{aligned}
\]
This shows the assertion by equality of the recursion formulas: (U_t)_{t≤T} = (U_t^a)_{t≤T} and hence the optimal stopping times coincide.

Remark 3.5.11. The foregoing proof particularly shows that U_t is increasing in ε_t in case Q = P^a: ε_{t+1} is independent of F_t under P^a and hence
\[
U_t = \max\{ f(t,\varepsilon_t),\ E^a[u(\varepsilon_{t+1})\,|\,\mathcal F_t] \} = \max\{ f(t,\varepsilon_t),\ E^a[u(\varepsilon_{t+1})] \}.
\]

The argument in the foregoing proof for the case Q = P^a is that P^a minimizes E_P[f(t, ε_t)] as well as Ĥ_t(P|a). Of course, this does not hold true if the reference measure Q = P^{β¹} is such that β¹_t is not identically a. Then we have a trade-off between a decrease in the first term, E_P[f(t, ε_t)], which is independent of P^{β¹}, and an increase of the penalty in the second term, Ĥ_t(P|β¹), the further nature deviates from the reference distribution P^{β¹} "downwards" towards the distribution P^a. More elaborately, consider a distribution P^{β²} with β²_t ∈ [a, β¹_t], t ≤ T: when nature moves towards P^a, decreasing the first term, the second term increases; when nature moves towards the reference distribution P^{β¹}, minimizing the second term, the first term increases. However, moving from P^{β¹} in the direction of the upper extremal distribution P^b, both terms increase:

Proposition 3.5.12. Let Q = P^{β¹} ∈ M_{[a,b]} be the reference distribution of the entropic penalty, and let f be increasing. Then the worst case distribution P^{β̄²} satisfies β̄²_t ∈ [a, β¹_t].

Proof. For h as above, we have
\[
\operatorname*{ess\,inf}_{\beta\in[a,b]}\Big\{ E^{\beta}[h(\varepsilon_t)\,|\,\mathcal F_{t-1}] + \hat H_{t-1}(\beta|\beta^1) \Big\}
\le E^{\beta^2}[h(\varepsilon_t)\,|\,\mathcal F_{t-1}] + \hat H_{t-1}(\beta^2|\beta^1)
\]
for all β²_t ∈ [β¹_t, b] and all t, as Ĥ_{t−1}(β¹|β¹) = 0 and ≥ 0 else, and furthermore E^{β²}[h(ε_t)|F_{t−1}] is increasing in β², as seen in the proof of Lemma 13 in [Riedel, 09]. As Ĥ_t(·|β¹) is strictly increasing on [β¹_t, b], we have strict inequality on ]β¹_t, b].

We see that the approaches, e.g. in [Karatzas & Zamfirescu, 08], with nature maximizing over the set of priors, are easier to handle in this context as there is no trade-off.

Example 3.5.13. The second extreme case for monotone increasing problems to be considered is the penalty's reference distribution set to Q = P^b: here, the smaller (β²_t)_t is chosen, and hence the smaller the first term, the more the penalty increases as nature deviates further from the reference distribution. In particular, we see that the worst case distribution depends on the specific form of f, not just on f being increasing: due to the trade-off, it depends on the slope of f at a particular state of the world. This has severe consequences for the complexity of calculations. Let us, for example, take the case of an American call as considered in [Riedel, 09]. As long as it is in the money, the slope of f is one, whereas it is zero when out of the money. That is, when out of the money, nature cannot just apply a distribution low enough to make staying out of the money likely, but also has to take care that it is close enough to Q so as not to increase the penalty too much. In this sense, the penalty comes relatively more severely into account when the call is out of the money and, hence, the one-step-ahead worst case distribution depends on the current state:

Remark 3.5.14. In the case of variational preferences, correlation is already introduced for the call, which has independent rewards under multiple priors as shown in [Riedel, 09].

In general, we see that an increase in penalty from deviating further from P^{β¹} towards P^a is less severe the steeper f is, i.e. the trade-off effect then favors minimizing the first part of the value function, the expectation. In extreme cases we might even still have P^a as the worst case distribution if f is "steep enough", i.e. the increase in penalty might be outweighed by the decrease in the expectation of f, and P^{β¹} "is not too far away" from P^a. To sum up:

Proposition 3.5.15. As we have already seen, the worst case distribution depends on the reference distribution Q of the penalty, i.e. on (β¹_t)_{t≤T}. Furthermore, as we have argued, it is a function of the current state of the world and the specific form of the function f at that state and, in particular, of the whole history.

It is hence immediate that not even a constant reference process (β¹_t)_{t≤T} induces a constant worst case (β̄²_t)_{t≤T}. This insight can be seen in the following calculations: let U_t = h(ε_1, …, ε_t) be bounded and F_t-measurable. Then the right hand side of the Snell envelope becomes
\[
E^{\beta^2_t}[h(\varepsilon_1,\dots,\varepsilon_t)\,|\,\mathcal F_{t-1}] + \theta\hat H_t(\beta^2_t\,|\,P^{\beta^1}(\cdot|\mathcal F_{t-1}))
= E^{\beta^2_t}\big[ h(\varepsilon_1,\dots,\varepsilon_t) + \theta\big( (\beta^2_t-\beta^1_t)\varepsilon_t - (L(\beta^2_t)-L(\beta^1_t)) \big) \,\big|\,\mathcal F_{t-1}\big].
\]
In order to recursively obtain a worst-case distribution, we have to minimize this expression with respect to β²_t ∈ [a, b] and obtain some β̄²_t = β̄²_t(ε_1, …, ε_{t−1}, β¹_t). In particular, we can see that the process achieving the worst-case distribution is again previsible, i.e. β̄²_t is F_{t−1}-measurable. Hence, given a specific structure of (X_t)_{t≤T} and a reference P^{β¹} for the penalty, we obtain a worst case measure P^{β̄²}, where (β̄²_t)_t is obtained as above. Having obtained this worst case distribution, we can calculate the optimal stopping time τ. However, as in general Ĥ_t(β̄²_t|β¹_t) ≠ 0, we obtain a negation of Theorem 5 in [Riedel, 09] for our approach:

Proposition 3.5.16. Let (β̄²_t)_t denote the process inducing the worst-case distribution for the monotone problem under dynamic multiplier preferences (π_t^e)_{t≤T}. Then
\[
\begin{aligned}
U_t &= \max\Big\{ X_t \;;\; \operatorname*{ess\,inf}_{\beta^2_{t+1}\in[a,b]} \big\{ E^{\beta^2_{t+1}}[U_{t+1}|\mathcal F_t] + \theta\hat H_{t+1}(\beta^2_{t+1}\,|\,P^{\beta^1}(\cdot|\mathcal F_t)) \big\} \Big\}\\
&= \max\Big\{ X_t \;;\; E^{\bar\beta^2_{t+1}}[U_{t+1}|\mathcal F_t] + \theta\hat H_{t+1}(\bar\beta^2_{t+1}\,|\,P^{\beta^1}(\cdot|\mathcal F_t)) \Big\}\\
&\ge \max\Big\{ X_t \;;\; E^{\bar\beta^2_{t+1}}[U_{t+1}|\mathcal F_t] \Big\} = U_t^{\bar\beta^2},
\end{aligned}
\]
where U_t^{β̄²} denotes the classical Snell envelope of the optimal stopping problem under the subjective prior given by β̄². In particular, we see that
\[
\tau = \inf_t\{ X_t = U_t \} \ \ge\ \inf_t\{ X_t = U_t^{\bar\beta^2} \} = \tau^{\bar\beta^2}.
\]
As the recursion formulas for the Snell envelopes, and hence the optimal stopping times, of the problem under dynamic multiplier preferences and of the one for an expected utility maximizer under the respective worst case distribution differ, we see that the intuition in [Riedel, 09] is not valid anymore: the agent does not behave as the expected utility maximizer under the worst case distribution.

As a tangible example, we apply the problem of an American put to variational preferences. We assume the agent considers the market to be "emerging", i.e. she considers distributions under which the value of the underlying is likely to go up to be more plausible. We hence set the reference distribution of the entropic penalty to P^b. We will formally show the following result: as the value of the American put is decreasing in the value of the underlying and the penalty is minimal for P^b, the worst case distribution is given by P^b. Moreover, as Ĥ_t(P^b|P^b) = 0 for all t, the agent behaves as an expected utility maximizer under the subjective prior P^b. Formally:

Example 3.5.17 (American Options in the CRR Model). Let the agent assess utility in terms of dynamic multiplier preferences with entropic penalty given by parameter θ = 1 and reference distribution P^b. We consider American options in the Cox–Ross–Rubinstein (CRR) model: let Ω := {0,1}^T, T < ∞.^19 Let ε_t : Ω → {0,1}, t ≤ T, be the projection mappings and P_0 such that the ε_t are i.i.d. under P_0 with P_0[ε_t = 1] = P_0[ε_t = 0] = 1/2. Let M_{[a,b]} be given as in Assumption 3.5.5. As in [Riedel, 09], we then have for all β := (β_t)_t that P^β[ε_t = 1 | F_{t−1}] ∈ [p; p̄], where p := e^a/(1+e^a) and p̄ := e^b/(1+e^b). Let P^a again be the distribution induced by the constant process with β_t = a for all t, and equivalently for P^b. Then, under P^a, the ε_t are i.i.d. with P^a[ε_t = 1] = p, and equivalently for P^b with P^b[ε_t = 1] = p̄.

The “ingredients” of the CRR model are given by a riskless asset with value process B_t = (1+r)^t for some fixed interest rate r > −1 and a risky asset with value process S_t such that S_0 = 1 and
\[
S_{t+1} = S_t\cdot\begin{cases} (1+d) & \text{if } \varepsilon_{t+1} = 1,\\ (1+c) & \text{if } \varepsilon_{t+1} = 0,\end{cases}
\]
where we assume the constants to satisfy −1 < c < r < d so that the model does not allow for arbitrage opportunities.

19 The infinite case can be achieved by virtue of Theorem 3.4.6.

Now, consider an American option with payoff A(t, S_t) from exercising at time t. The agent has to solve the problem^20
\[
\operatorname*{ess\,sup}_{\tau}\ \operatorname*{ess\,inf}_{P\in\mathcal M_{[a,b]}} \big\{ E_P[A(\tau, S_\tau)] + \hat H_0(P|P^b) \big\}.
\]

To further elaborate the example, assume A_p(t, S_t) is an American put and hence decreasing in S_t for all t.^21 Let (U_t^b)_{t≤T} denote the classical Snell envelope of A_p(t, S_t) under subjective probability P^b, i.e.
\[
U^b(t, S_t) = \max\big\{ A_p(t, S_t)\;;\ \bar p\,U^b(t+1, S_t(1+d)) + (1-\bar p)\,U^b(t+1, S_t(1+c)) \big\}.
\]
The following assertion holds: the variational Snell envelope (U_t)_{t≤T} of the American put problem with dynamic multiplier preferences and reference distribution P^b satisfies (U_t)_{t≤T} = (U_t^b)_{t≤T}. In particular, the worst case distribution is given by P^b and, as the penalty vanishes for this distribution, the optimal stopping time is given by τ* = inf{t ≥ 0 | A_p(t, S_t) = U_t^b} = τ_b^*, i.e. the optimal stopping time τ_b^* of the problem under subjective prior P^b.

The proof of this assertion is immediate by virtue of stochastic dominance: as in Appendix H in [Riedel, 09], we show for the variational Snell envelope (U_t)_{t≤T} that U_t = u(t, S_t) = U_t^b, t ≤ T, for a function u that is decreasing in the second variable. First, we have U_T = A_p(T, S_T) = U_T^b by definition. For an inductive proof, we write, with a slight but intuitively understandable misuse of notation, Ĥ_t(p_{t+1} ⊗ p_{t+2} ⊗ … | P^b)^22 for p_i ∈ [p; p̄] and note that Ĥ_t(p̄ ⊗ p̄ ⊗ … | P^b) = 0 and ≥ 0 else, i.e. p̄ at any t minimizes the penalty. From the induction hypothesis, we have u(t+1, S_t(1+d)) ≤ u(t+1, S_t(1+c)).

20 [Riedel, 09] achieves a general theory for American options under multiple priors.
21 Equivalent results hold for an American call with P^a as reference distribution.
22 Formally: Ĥ_t(p_{t+1} ⊗ p_{t+2} ⊗ … | P^b) := Ĥ_t(P^β|P^b) with (β_t)_{t≤T} such that P^β[ε_t = 1 | F_{t−1}] = p_t for t ≤ T; this is well defined as p_1, …, p_t drop out by the general definition of Ĥ_t.

We hence have
\[
\begin{aligned}
U_t &= \max\Big\{ A_p(t, S_t)\ ;\ \min_{p_{t+1}\in[p;\,\bar p]}\big\{ p_{t+1}\,u(t+1, S_t(1+d)) + (1-p_{t+1})\,u(t+1, S_t(1+c)) + \hat H_t(p_{t+1}\otimes\bar p\otimes\dots|P^b) \big\} \Big\}\\
&= \max\Big\{ A_p(t, S_t)\ ;\ \bar p\,u(t+1, S_t(1+d)) + (1-\bar p)\,u(t+1, S_t(1+c)) + \underbrace{\hat H_t(\bar p\otimes\bar p\otimes\dots|P^b)}_{=0} \Big\}\\
&= U_t^b.
\end{aligned}
\]

Thus, we have equality of the variational Snell envelope and the classical Snell envelope under the worst case measure, i.e. (U_t)_{t≤T} = (U_t^b)_{t≤T}, and the coincidence of the respective optimal stopping times, i.e. τ_b^*.

To conclude: the problem of optimally exercising an American put under dynamic entropic risk with reference distribution P^b for the entropic penalty coincides with the problem for the American put for an expected utility maximizer with respect to subjective prior P^b.
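A minimal numerical sketch of the example (all parameter values are ours and purely illustrative; discounting is omitted for simplicity): a backward induction on the recombining CRR tree that, at each node, minimizes the conditional expectation plus the one-step entropic penalty over p_{t+1} ∈ [p; p̄] with reference p̄, and compares the result with the classical Snell envelope under P^b. The one-step penalty is written as the binary relative entropy of p with respect to p̄, which is the (β² − β¹)ε − (L(β²) − L(β¹)) form above expressed in the p-parametrization.

```python
import numpy as np

# illustrative CRR parameters
T, S0, K = 10, 100.0, 100.0
d, c = 0.05, -0.05                       # up factor 1+d, down factor 1+c
theta, a, b = 1.0, -1.0, 1.0
p_lo, p_hi = np.exp(a) / (1 + np.exp(a)), np.exp(b) / (1 + np.exp(b))

def put(t, S):
    return max(K - S, 0.0)

def kl(p, q):
    """Binary relative entropy of Bernoulli(p) w.r.t. Bernoulli(q) -- the one-step penalty."""
    def xlog(x, y): return 0.0 if x == 0.0 else x * np.log(x / y)
    return xlog(p, q) + xlog(1.0 - p, 1.0 - q)

def snell(payoff, p_ref, robust, grid=401):
    """Backward induction on the recombining tree; node j at time t has seen j up-moves."""
    ps = np.linspace(p_lo, p_hi, grid)
    U = np.array([payoff(T, S0 * (1 + d) ** j * (1 + c) ** (T - j)) for j in range(T + 1)])
    for t in range(T - 1, -1, -1):
        new = np.empty(t + 1)
        for j in range(t + 1):
            S, up, down = S0 * (1 + d) ** j * (1 + c) ** (t - j), U[j + 1], U[j]
            if robust:   # worst case over p plus entropic penalty with reference p_ref
                cont = min(p * up + (1 - p) * down + theta * kl(p, p_ref) for p in ps)
            else:        # classical conditional expectation under p_ref
                cont = p_ref * up + (1 - p_ref) * down
            new[j] = max(payoff(t, S), cont)
        U = new
    return U[0]

print(snell(put, p_hi, robust=True))    # robust value with reference P^b
print(snell(put, p_hi, robust=False))   # classical Snell envelope under P^b: coincides
```

Swapping the put payoff for a call payoff while keeping reference p̄ lets one observe node-dependent minimizers, in line with the trade-off described in Example 3.5.13.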

In a way, the result in the example is more like a self-fulfilling prophecy, as the agent assumes the worst-case distribution to be the most likely one. The same holds true for an American call with reference distribution P^a: in that case, the reference distribution is also the worst-case one. However, due to the trade-off effects, P^a is not the worst-case distribution for the American call when P^b is the reference distribution, just as P^b is not the worst-case distribution for the American put when P^a is the reference one.

[Föllmer & Schied, 02] introduce convex risk measures based on expected loss or shortfall risk in a static framework. Entropic risk measures are just the special case in which the loss function is exponential. Carrying these risk measures over to a dynamic framework could yield a fruitful further application, as risk measures based on shortfall risk have a quite intuitive appeal.