
DOI: 10.1111/mafi.12289

ORIGINAL ARTICLE

Markov chains under nonlinear expectation

Max Nendel

Center for Mathematical Economics, Bielefeld University, Bielefeld, Germany

Correspondence

Max Nendel, Center for Mathematical Economics, Bielefeld University, 33615 Bielefeld, Germany.

Email: Max.Nendel@uni-bielefeld.de

Funding information

Deutsche Forschungsgemeinschaft, Grant/Award Number: CRC 1283

Abstract

In this paper, we consider continuous-time Markov chains with a finite state space under nonlinear expectations. We define so-called Q-operators as an extension of Q-matrices or rate matrices to a nonlinear setup, where the nonlinearity is due to model uncertainty. The main result gives a full characterization of convex Q-operators in terms of a positive maximum principle, a dual representation by means of Q-matrices, time-homogeneous Markov chains under convex expectations, and a class of nonlinear ordinary differential equations. This extends a classical characterization of generators of Markov chains to the case of model uncertainty in the generator. We further derive an explicit primal and dual representation of convex semigroups arising from Markov chains under convex expectations via the Fenchel–Legendre transformation of the generator. We illustrate the results with several numerical examples, where we compute price bounds for European contingent claims under model uncertainty in terms of the rate matrix.

KEYWORDS

generator of nonlinear semigroup, imprecise Markov chain, model uncertainty, nonlinear expectation, nonlinear ODE

This is an open access article under the terms of the Creative Commons Attribution License, which permits use, distribution and reproduction in any medium, provided the original work is properly cited.

© 2020 The Authors. Mathematical Finance published by Wiley Periodicals LLC

Mathematical Finance. 2021;31:474–507. wileyonlinelibrary.com/journal/mafi


1 INTRODUCTION AND MAIN RESULT

In mathematical finance, model uncertainty or ambiguity is an almost omnipresent phenomenon, which, for example, appears due to incomplete information about certain aspects of an underlying asset or insufficient data to perform reliable statistical estimation of the parameters of a stochastic process. The latter typically leads to so-called parameter uncertainty in the generator of a stochastic process. Prominent examples of this type of uncertainty include a Black–Scholes model with uncertain volatility, the so-called uncertain volatility model, cf. Avellaneda, Levy, and Parás (1995), Avellaneda and Parás (1996), and Vorbrink (2014), and a Brownian motion under drift or volatility uncertainty, leading to the g-framework, see, for example, Coquet, Hu, Mémin, and Peng (2002), or the G-framework by Peng (2007) and Peng (2008), respectively. Lately, these approaches have been generalized to Lévy processes with uncertainty in the Lévy triplet, cf. Denk, Kupper, and Nendel (2020), Hu and Peng (2009), and Neufeld and Nutz (2017), and uncertainty in the generator of Feller processes, cf. Nendel and Röckner (2019). While these works give sufficient conditions that guarantee the existence of stochastic processes under model uncertainty and establish a connection to nonlinear partial differential equations, there is no necessary condition that determines the maximal degree of ambiguity that can be captured by an uncertain process.

In the present paper, we address this issue in a simplified setup, where we consider a finite state space. We provide sufficient and necessary conditions in terms of the generators of time-homogeneous continuous-time Markov chains that guarantee the existence of a continuous-time Markov chain under a convex expectation. We further establish a one-to-one relation between the transition operators of convex Markov chains and a class of nonlinear ordinary differential equations. In particular, we extend a classical relation between Markov chains, rate matrices, and ordinary differential equations to the case of model uncertainty. The ordinary differential equation related to a convex Markov chain is a spatially discretized version of a Hamilton–Jacobi–Bellman equation, and the nonlinear transition operators are related, via a dual representation, to a control problem where, roughly speaking, "nature" tries to control the system into the worst possible scenario (see Remark 4.18). The explicit description of the transition operators gives rise to a numerical scheme, different from Runge–Kutta methods, for the computation of price bounds for European contingent claims under model uncertainty. We illustrate this method and other numerical methods in several examples, where we consider an underlying Markov chain that is a discrete version of a Brownian motion with uncertain drift, cf. Coquet et al. (2002), and uncertain volatility, cf. Peng (2007) and Peng (2008); more precisely, its generator is a finite difference discretization of the corresponding generator. The main tools we use in our analysis are convex duality, a semigroup-theoretic approach to control problems due to Nisio (1976/77), see also Denk et al. (2020) and Nendel and Röckner (2019), and a convex version of Kolmogorov's extension theorem due to Denk, Kupper, and Nendel (2018), which allows us to extend the expectation to functionals that depend on the whole path. Restricting the time parameter, in the present work, to the set of natural numbers leads to a discrete-time Markov chain in the sense of Denk et al. (2018, Example 5.3).

The concept we use to describe ambiguity is the notion of a nonlinear expectation introduced by Peng (2005). Nonlinear expectations closely relate to other concepts describing model uncertainty, backward stochastic differential equations (BSDEs), cf. Cohen (2012) and Coquet et al. (2002), and 2BSDEs, cf. Cheridito, Soner, Touzi, and Victoir (2007) and Denis, Hu, and Peng (2011). We refer to Pardoux and Peng (1992), Pardoux and Peng (1990), and El Karoui, Peng, and Quenez (1997) for a detailed study of BSDEs and their applications within the field of mathematical finance. If a nonlinear expectation $\mathcal{E}$ is sublinear, then $\rho(X) := \mathcal{E}(-X)$ defines a coherent monetary risk measure as introduced by Artzner, Delbaen, Eber, and Heath (1999), Delbaen (2000), and Delbaen (2002), see also Föllmer and Schied (2011) for an overview of monetary risk measures. Moreover, if $\mathcal{E}$ is a sublinear expectation, then $\mathcal{E}$ is a coherent upper prevision, cf. Walley (1991), and vice versa. There is a similar one-to-one relation between convex expectations, convex upper previsions, cf. Pelessoni and Vicig (2003) and Pelessoni and Vicig (2005), and convex risk measures, cf. Föllmer and Schied (2002) and Frittelli and Rosazza Gianin (2002). Further concepts, which are closely related to nonlinear expectations and describe model uncertainty, are Choquet capacities (see, e.g., Dellacherie & Meyer, 1978), game-theoretic probability by Vovk and Shafer (2014), and niveloids, see, for example, Cerreia-Vioglio, Maccheroni, Marinacci, and Rustichini (2014).

Our setup is inspired by Peng (2005), where Markov chains under nonlinear expectations are considered in an axiomatic way. However, the existence of stochastic processes under nonlinear expectations has only been considered in terms of finite-dimensional nonlinear marginal distributions, whereas completely path-dependent functionals could not be treated. Markov chains under model uncertainty have been considered, among others, by Avellaneda and Buff (1999), De Cooman, Hermans, and Quaeghebeur (2009), Hartfiel (1998), and Škulj (2009). Avellaneda and Buff (1999) study a finite difference discretization of the uncertain volatility model leading to a Markov chain setting. Hartfiel (1998) considers so-called Markov set-chains in discrete time, using matrix intervals in order to describe model uncertainty in the transition matrices. Later, Škulj (2009) approached Markov chains under model uncertainty using Choquet capacities, which results in higher dimensional matrices on the power set, while De Cooman et al. (2009) considered imprecise Markov chains using an operator-theoretic approach with upper and lower expectations. In Denk et al. (2018, Example 5.3), Denk et al. describe model uncertainty in the transition matrix via a nonlinear transition operator, which, together with the results obtained in Denk et al. (2018), allows the construction of discrete-time Markov chains on the canonical path space. In continuous time, in particular, computational aspects of sublinear imprecise Markov chains have been studied, amongst others, by Krak, De Bock, and Siebes (2017) and Škulj (2015).

Another concept that is closely related to Markov chains under nonlinear expectations, as discussed in the present paper, is the theory of BSDEs on Markov chains by Cohen and Elliott (2008) and Cohen and Elliott (2010a), see also Cohen and Szpruch (2012), Cohen and Hu (2013), and Cohen and Elliott (2010b) for the discrete-time case. Here, a reference Markov chain $X = (X_t)_{t \ge 0}$ with generator $(q_t)_{t \ge 0}$ is fixed, and one considers BSDEs driven by $X$. This can be viewed as a discretization of the classical BSDE setup, where the state space is $\mathbb{R}$, the driving process is a Brownian motion, and the generator is $\frac{1}{2}\partial_{xx}$. Cohen and Szpruch (2012) show that Markovian solutions to BSDEs on Markov chains are related via their driver to a system

$$u'(t) = f(t, u(t)) + A(t)u(t) \quad \text{for all } t \ge 0, \qquad u(0) = u_0$$

of nonlinear ordinary differential equations with a nonlinear function $f$ that is assumed to be globally Lipschitz in the variable $u$. In the present paper, $f(t, u) = \mathcal{Q}u$ for a convex operator $\mathcal{Q}$. The biggest difference between our approach and the theory of BSDEs on Markov chains lies in the fact that we do not consider a fixed reference Markov chain that drives the model. On the other hand, our approach is restricted to considering Markovian solutions to BSDEs on Markov chains.

From a technical standpoint, further differences are that the theory of BSDEs allows for more generality in terms of the nonlinearity of the driver, while we do not require global Lipschitz continuity of the generator, allowing for a possibly unbounded convex conjugate. Additionally, we only focus on the time-homogeneous case. However, regarding the existence of Markov chains under convex expectations and their connection to nonlinear ordinary differential equations (ODEs), this restriction could easily be overcome with a slight modification of the construction of the transition operators.

Dentcheva and Ruszczyński (2018) consider Markov risk measures for a countable state space, see also Fan and Ruszczyński (2018a), Fan and Ruszczyński (2018b), and Ruszczyński (2010) for the discrete-time case. Here, the focus lies on time-consistent risk measurement related to a fixed reference continuous-time Markov chain $X = (X_t)_{t \ge 0}$. Using so-called semiderivatives in the direction of the generator $A$, the authors derive, in the case of a coherent risk measure, a sublinear ordinary differential equation related to the risk measure, where the dual representation of the nonlinear generator depends on the generator $A$ of the baseline model $X$. Clearly, in the theory of Markov risk measures, the focus lies more on law-invariant risk measures such as the average value at risk, and it is therefore not directly comparable with our approach, where we explicitly avoid fixing a baseline model and rather try to capture very general forms of uncertainty in the generator. However, on a technical level, our approach also allows us to consider risk evaluations related to convex generators that do not depend on a fixed reference generator.

In view of the aforementioned existing literature on imprecise versions of Markov chains, the contribution of this paper can be summarized as follows (see Remark 2.6 for further details):

– We propose a framework describing Markov chains under model uncertainty in terms of the rate matrix. Our approach complements the existing literature on BSDEs on Markov chains and Markov risk measures, covering a different range of examples and applications in a consistent way. The key difference between our framework and the aforementioned existing approaches lies in the fact that we do not consider a fixed reference Markov chain describing the dynamics of an underlying asset. Moreover, our approach relies on analytic rather than stochastic methods, using distributional rather than pathwise properties, thus leading to restrictions in certain directions but advantages in others.

– We show that, as in the linear case, Markov chains under convex expectations with certain regularity at time 0 are linked via a one-to-one relation to certain convex functions (their generator) and to solutions of convex differential equations, which can be solved, for example, using an explicit Euler method or any other Runge–Kutta method. In particular, we prove the global existence of solutions to a class of convex differential equations with unbounded convex conjugate, that is, without a global Lipschitz condition on the generator.

– We show that the transition semigroup of a convex Markov chain can be explicitly constructed using any (!) dual representation of the generator. In particular, for numerical computations, a "minimal" dual representation in terms of certain "corner points" can be used to solve the nonlinear Kolmogorov equation. Based on the explicit construction of the semigroup, we propose a novel algorithm for the numerical computation of solutions to a class of nonlinear ODEs. Moreover, we show that every convex transition semigroup is the least upper bound (in the sense of semigroups) of a family of linear transition semigroups, and vice versa.

– The convex expectations we consider are defined on the whole path space without fixing any reference measure. We show that the nonlinear expectation, although possibly undominated, always admits a dual representation in terms of countably additive probability measures. Moreover, we derive an explicit dual representation in terms of an optimal control problem, where nature tries to control the system into the worst possible scenario, giving a control-theoretic interpretation to Markov chains under convex expectations.


1.1 Structure of the paper

In Section 2, we fix the notation, introduce our setup and basic definitions, and state the main result (Theorem 2.5). In Section 3, we prove the first part of Theorem 2.5 (implications $(v) \Rightarrow (ii) \Rightarrow (i) \Rightarrow (iii)$). The main tool we use in this part is convex duality in $\mathbb{R}^d$. Moreover, we discuss how, in the sublinear case, computational efficiency can be improved by reducing compact and suitably convex sets of generator matrices to their "corner points." The effectiveness of this reduction is demonstrated in Section 5. In Section 4, we prove the remaining implications $(iii) \Rightarrow (iv) \Rightarrow (v)$ of Theorem 2.5. Here, we use a combination of so-called Nisio semigroups, as introduced in Nisio (1976/77), the theory of ordinary differential equations, and a Kolmogorov-type extension theorem for convex expectations derived in Denk et al. (2018). We conclude this section by showing that the semigroup envelope admits a dual representation as a cost functional related to an optimal control problem. In Section 5, we use and compare two different numerical methods, based on the results from Sections 3 and 4, in order to compute price bounds for European contingent claims, where the underlying is a discrete version of a Brownian motion with drift uncertainty (g-framework) and volatility uncertainty (G-framework).

2 NOTATION, BASIC DEFINITIONS, AND MAIN RESULT

Given a measurable space $(\Omega, \mathcal{F})$, we denote the space of all bounded measurable functions $\Omega \to \mathbb{R}$ by $\mathcal{L}^\infty(\Omega, \mathcal{F})$. A nonlinear expectation is then a functional $\mathcal{E} \colon \mathcal{L}^\infty(\Omega, \mathcal{F}) \to \mathbb{R}$ which satisfies

• $\mathcal{E}(X) \le \mathcal{E}(Y)$ whenever $X(\omega) \le Y(\omega)$ for all $\omega \in \Omega$,

• $\mathcal{E}(\alpha 1_\Omega) = \alpha$ for all $\alpha \in \mathbb{R}$.

If $\mathcal{E}$ is additionally convex, that is, for all $X, Y \in \mathcal{L}^\infty(\Omega, \mathcal{F})$ and $\theta \in [0, 1]$,

$$\mathcal{E}(\theta X + (1 - \theta)Y) \le \theta \mathcal{E}(X) + (1 - \theta)\mathcal{E}(Y),$$

we say thatis aconvex expectation. It is well known (see, e.g., Denk et al.,2018or Fâllmer &

Schied,2011) that every convex expectationadmits a dual representation in terms of finitely addi- tive probability measures. If, however, even admits a dual representation in terms of (countably additive) probability measures, we say that(Ξ©,,)is aconvex expectationspace. More precisely, we say that(Ξ©,,)is aconvex expectationspace if there exists a setof probability measures on (Ξ©,)and a family(𝛼ℙ)β„™βˆˆξˆΌ βŠ‚ [0, ∞)withinfβ„™βˆˆξˆΌπ›Όβ„™= 0such that

(𝑋) = sup

β„™βˆˆξˆΌ(𝔼ℙ(𝑋) βˆ’ 𝛼ℙ)

for all𝑋 ∈∞(Ξ©,). Here,𝔼ℙdenotes the expectation w.r.t. a probability measureβ„™on(Ξ©,). If 𝛼ℙ= 0 for all β„™ ∈, we say that (Ξ©,,) is a sublinear expectation space. Here, the set

represents the set of all models that are relevant under the expectation. In the case of a sublin- ear expectation space, the functional is the best case among all plausible models. In the case of a convex expectation space, the functionalis a weighted best case among all plausible models

with an additional penalization term𝛼ℙfor everyβ„™ ∈. Intuitively,𝛼ℙcan be seen as a mea- sure for how much importance we give to the priorβ„™ ∈under the expectation. For example,


a low penalization, that is, $\alpha_{\mathbb{P}}$ close or equal to 0, gives more importance to the model $\mathbb{P} \in \mathcal{P}$ than a high penalization.
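To make the dual representation concrete, the following minimal sketch (illustrative, not from the paper) evaluates a convex expectation on a three-point sample space as a penalized supremum over finitely many priors; the measures and penalties are arbitrary choices.

```python
import numpy as np

# Illustrative priors (rows are probability vectors on Omega = {1, 2, 3})
# and penalties alpha_P; one penalty is zero, so that inf_P alpha_P = 0.
priors = np.array([[0.2, 0.5, 0.3],
                   [0.6, 0.2, 0.2],
                   [1/3, 1/3, 1/3]])
alpha = np.array([0.0, 0.1, 0.4])

def convex_expectation(X):
    """E(X) = sup_P ( E_P[X] - alpha_P ); a finite maximum in this sketch."""
    return np.max(priors @ X - alpha)

X = np.array([1.0, -2.0, 3.0])              # a bounded random variable
print(convex_expectation(X))                # penalized upper expectation
print(convex_expectation(np.full(3, 5.0)))  # constants are preserved: 5.0
```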

Throughout, we consider a finite nonempty state space $S$ with cardinality $d := |S| \in \mathbb{N}$. We endow $S$ with the discrete topology $2^S$ and w.l.o.g. assume that $S = \{1, \dots, d\}$. The space of all bounded measurable functions $S \to \mathbb{R}$ can therefore be identified with $\mathbb{R}^d$ via

$$u = (u_1, \dots, u_d)^T \quad \text{with } u_i := u(i) \text{ for all } i \in \{1, \dots, d\}.$$

Therefore, we denote bounded measurable functions $u$ as vectors of the form $u = (u_1, \dots, u_d)^T \in \mathbb{R}^d$, where $u_i$ represents the value of $u$ in the state $i \in \{1, \dots, d\}$. On $\mathbb{R}^d$, we consider the norm

$$\|u\|_\infty := \max_{i=1,\dots,d} |u_i| = \max_{i \in \{1,\dots,d\}} |u(i)|$$

for a vector $u \in \mathbb{R}^d$. Moreover, for $\alpha \in \mathbb{R}$, the vector $\alpha \in \mathbb{R}^d$ denotes the constant vector $u \in \mathbb{R}^d$ with $u_i = \alpha$ for all $i \in \{1, \dots, d\}$. For an arbitrary matrix $q = (q_{ij})_{1 \le i,j \le d} \in \mathbb{R}^{d \times d}$, we denote by $\|q\|$ the operator norm of $q \colon \mathbb{R}^d \to \mathbb{R}^d$ w.r.t. the norm $\|\cdot\|_\infty$, that is,

$$\|q\| = \sup_{v \in \mathbb{R}^d \setminus \{0\}} \frac{\|qv\|_\infty}{\|v\|_\infty} = \max_{i=1,\dots,d} \Bigg( \sum_{j=1}^d |q_{ij}| \Bigg).$$
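As a quick numerical aside (an illustrative snippet, not part of the paper), this operator norm is simply the maximal absolute row sum:

```python
import numpy as np

q = np.array([[-1.0, 1.0],
              [2.0, -2.0]])           # an illustrative 2x2 matrix
# ||q|| w.r.t. ||.||_inf equals the maximal absolute row sum
print(np.max(np.abs(q).sum(axis=1)))  # 4.0
```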

Inequalities of vectors are always understood componentwise, that is, for $u, v \in \mathbb{R}^d$,

$$u \le v \iff \forall i \in \{1, \dots, d\} \colon u_i \le v_i.$$

In the same way, all concepts in $\mathbb{R}^d$ that include inequalities are to be understood componentwise. For example, a vector field $F \colon \mathbb{R}^d \to \mathbb{R}^d$ is called convex if

$$F_i(\lambda u + (1 - \lambda)v) \le \lambda F_i(u) + (1 - \lambda)F_i(v)$$

for all $i \in \{1, \dots, d\}$, $u, v \in \mathbb{R}^d$, and $\lambda \in [0, 1]$. A vector field $F$ is called sublinear if it is convex and positive homogeneous (of degree 1). Moreover, for a set $M \subset \mathbb{R}^d$ of vectors, we write $u = \sup M$ if $u_i = \sup_{v \in M} v_i$ for all $i \in \{1, \dots, d\}$, and $u = \max M$ if $u = \sup M$ and, for all $i \in \{1, \dots, d\}$, there exists some $v \in M$ with $u_i = v_i$.

In the following, we briefly recall the basic definitions and concepts from the theory of (time-homogeneous) Markov chains. A (time-homogeneous) Markov chain is a quadruple $(\Omega, \mathcal{F}, (\mathbb{P}_1, \dots, \mathbb{P}_d), (X_t)_{t \ge 0})$, where:

(M1) $(\Omega, \mathcal{F})$ is a measurable space.

(M2) $X_t \colon \Omega \to \{1, \dots, d\}$ is $\mathcal{F}$-measurable for all $t \ge 0$.

(M3) $(\mathbb{P}_1, \dots, \mathbb{P}_d)$ is a collection of probability measures, where, for $i \in \{1, \dots, d\}$, $\mathbb{P}_i(X_0 = i) = 1$, that is, $\mathbb{P}_i$ denotes the probability distribution under which the Markov chain starts in the state $i$. Moreover, we use the notation

$$\mathbb{E}_i(Y) := \mathbb{E}_{\mathbb{P}_i}(Y) \quad \text{and} \quad \mathbb{E}(Y) := (\mathbb{E}_1(Y), \dots, \mathbb{E}_d(Y))^T$$

for $i \in \{1, \dots, d\}$ and all random variables $Y \colon \Omega \to \mathbb{R}$.


(M4) For all $s, t \ge 0$ and $i \in \{1, \dots, d\}$,

$$\mathbb{E}_i(u(X_{s+t}) \mid \mathcal{F}_s) = \mathbb{E}_i(u(X_{t+s}) \mid X_s) = \mathbb{E}_{X_s}(u(X_t)).$$

In particular, $\mathbb{E}_i(u(X_{t+s}) \mid X_s = j) = \mathbb{E}_j(u(X_t))$ for all $i, j \in \{1, \dots, d\}$.

A matrixπ‘ž = (π‘žπ‘–π‘—)1≀𝑖,π‘—β‰€π‘‘βˆˆ ℝ𝑑×𝑑is called aQ-matrixorrate matrixif it satisfies the following conditions:

(Q1) π‘žπ‘–π‘–β‰€0for all𝑖 ∈ {1, … , 𝑑},

(Q2) π‘žπ‘–π‘—β‰₯0for all𝑖, 𝑗 ∈ {1, … , 𝑑}with𝑖≠𝑗, (Q3) βˆ‘π‘‘

𝑗=1π‘žπ‘–π‘— = 0for all𝑖 ∈ {1, … , 𝑑}.

It is well known that every continuous-time Markov chain with certain regularity properties at time $t = 0$ can be related to a Q-matrix and vice versa. More precisely, for a matrix $q \in \mathbb{R}^{d \times d}$, the following statements are equivalent:

(i) $q$ is a Q-matrix.

(ii) There is a Markov chain $(\Omega, \mathcal{F}, (\mathbb{P}_1, \dots, \mathbb{P}_d), (X_t)_{t \ge 0})$ such that

$$qu_0 = \lim_{h \searrow 0} \frac{\mathbb{E}(u_0(X_h)) - u_0}{h}$$

for all $u_0 \in \mathbb{R}^d$, where $u_0(i)$ is the $i$th component of $u_0$ for $i \in \{1, \dots, d\}$.

In this case, for each vector $u_0 \in \mathbb{R}^d$, the function $u \colon [0, \infty) \to \mathbb{R}^d$, $t \mapsto \mathbb{E}(u_0(X_t))$ is the unique classical solution $u \in C^1([0, \infty); \mathbb{R}^d)$ to the initial value problem

$$u'(t) = qu(t), \quad t \ge 0, \qquad u(0) = u_0,$$

that is, $u(t) = e^{tq}u_0$ for all $t \ge 0$, where $e^{tq}$ is the matrix exponential of $tq$. We refer to Norris (1998) for a detailed illustration of this relation.
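In the linear case, this relation is directly computable; the following sketch (with an illustrative Q-matrix and payoff, not taken from the paper) evaluates $u(t) = e^{tq}u_0$ using SciPy's matrix exponential:

```python
import numpy as np
from scipy.linalg import expm

q = np.array([[-1.0, 1.0, 0.0],
              [0.5, -1.5, 1.0],
              [0.0, 2.0, -2.0]])   # an illustrative Q-matrix
u0 = np.array([0.0, 1.0, 4.0])     # payoff u_0(i) in state i

t = 0.75
u_t = expm(t * q) @ u0             # i-th component: E_i(u_0(X_t))
print(u_t)
```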

We say that a (possibly nonlinear) operator $\mathcal{Q} \colon \mathbb{R}^d \to \mathbb{R}^d$ satisfies the positive maximum principle if, for every $u = (u_1, \dots, u_d)^T \in \mathbb{R}^d$ and $i \in \{1, \dots, d\}$,

$$(\mathcal{Q}u)_i \le 0 \quad \text{whenever } u_i \ge u_j \text{ for all } j \in \{1, \dots, d\}.$$

This notion is motivated by the positive maximum principle for generators of Feller processes, see, for example, Jacob (2001, Equation (0.8)). Notice that a matrix $q \in \mathbb{R}^{d \times d}$ is a Q-matrix if and only if it satisfies the positive maximum principle and $q1 = 0$, where $1 := (1, \dots, 1)^T \in \mathbb{R}^d$ denotes the constant 1 vector. In fact, Property (Q3) is just a reformulation of $q1 = 0$. Moreover, if $q$ satisfies the positive maximum principle, then $q_{ii} = (qe_i)_i \le 0$ for all $i \in \{1, \dots, d\}$ and $-q_{ij} = (q(-e_j))_i \le 0$ for all $i, j \in \{1, \dots, d\}$ with $i \ne j$. That is, $q$ fulfills (Q1) and (Q2). On the other hand, if $q$ is a Q-matrix, $u = (u_1, \dots, u_d)^T \in \mathbb{R}^d$ and $i \in \{1, \dots, d\}$ with $u_i \ge u_j$ for all $j \in \{1, \dots, d\}$, then

$$(qu)_i = \sum_{j=1}^d q_{ij}u_j \le u_i \sum_{j=1}^d q_{ij} = 0,$$

which shows that $q$ satisfies the positive maximum principle.


To state the main result, we introduce the following definitions.

Definition 2.1. A (possibly nonlinear) map $\mathcal{Q} \colon \mathbb{R}^d \to \mathbb{R}^d$ is called a Q-operator if the following conditions are satisfied:

(i) $(\mathcal{Q}(\lambda e_i))_i \le 0$ for all $\lambda > 0$ and all $i \in \{1, \dots, d\}$,

(ii) $(\mathcal{Q}(-\lambda e_j))_i \le 0$ for all $\lambda > 0$ and all $i, j \in \{1, \dots, d\}$ with $i \ne j$,

(iii) $\mathcal{Q}\alpha = 0$ for all $\alpha \in \mathbb{R}$, where we identify $\alpha$ with $(\alpha, \dots, \alpha)^T \in \mathbb{R}^d$.

Definition 2.2. A convex Markov chain is a quadruple $(\Omega, \mathcal{F}, \mathcal{E}, (X_t)_{t \ge 0})$ that satisfies the following conditions:

(i) $(\Omega, \mathcal{F})$ is a measurable space.

(ii) $X_t \colon \Omega \to \{1, \dots, d\}$ is $\mathcal{F}$-measurable for all $t \ge 0$.

(iii) $\mathcal{E} = (\mathcal{E}_1, \dots, \mathcal{E}_d)^T$, where $(\Omega, \mathcal{F}, \mathcal{E}_i)$ is a convex expectation space for all $i \in \{1, \dots, d\}$ and $\mathcal{E}(u_0(X_0)) = u_0$. Here and in the following, we use the notation

$$\mathcal{E}(Y) := (\mathcal{E}_1(Y), \dots, \mathcal{E}_d(Y))^T \in \mathbb{R}^d$$

for $Y \in \mathcal{L}^\infty(\Omega, \mathcal{F})$.

(iv) The following version of the Markov property is satisfied: For all $s, t \ge 0$, $n \in \mathbb{N}$, $0 \le t_1 < \dots < t_n \le s$, and $v_0 \in (\mathbb{R}^d)^{(n+1)}$,

$$\mathcal{E}(v_0(Y, X_{s+t})) = \mathcal{E}\big[\mathcal{E}_{X_s,t}(v_0(Y, \cdot))\big], \qquad (1)$$

where $Y := (X_{t_1}, \dots, X_{t_n})$ and $\mathcal{E}_{i,t}(u_0) := \mathcal{E}_i(u_0(X_t))$ for all $u_0 \in \mathbb{R}^d$ and $i \in \{1, \dots, d\}$.

We say that the Markov chain $(\Omega, \mathcal{F}, \mathcal{E}, (X_t)_{t \ge 0})$ is linear or sublinear if the mapping $\mathcal{E} \colon \mathcal{L}^\infty(\Omega, \mathcal{F}) \to \mathbb{R}^d$ is, additionally, linear or sublinear, respectively.

Notice that the properties (i)–(iii) in the previous definition are a one-to-one translation of (M1)–(M3) to a convex setup. The Markov property given in (iv) of the previous definition is the nonlinear analog of the classical Markov property (M4) without using conditional expectations. Due to the nonlinearity of the expectation, the definition and, in particular, the existence of a conditional (nonlinear) expectation are quite involved, which is why we avoid introducing this concept. In order to get the idea behind the formulation in (iv), choose $v_0 = u(X_{s+t})1_B(Y)$ for a measurable function $u \colon \{1, \dots, d\} \to \mathbb{R}$ and arbitrary $B \subset \{1, \dots, d\}^n$. Then, if $\mathcal{E}$ is linear, Equation (1) reads as

$$\mathcal{E}\big(u(X_{s+t})1_B(Y)\big) = \mathcal{E}\big(\mathcal{E}_{X_s,t}(u)1_B(Y)\big),$$

which is equivalent to (M4). On the other hand, for every linear Markov chain, Property (M4) implies Property (iv). Hence, in the linear case, Definition 2.2 is consistent with the classical definition of a Markov chain.

In line with Denk et al. (2018, Definition 5.1), we say that a (possibly nonlinear) map $S \colon \mathbb{R}^d \to \mathbb{R}^d$ is a kernel if $S$ is monotone, that is, $S(u) \le S(v)$ for all $u, v \in \mathbb{R}^d$ with $u \le v$, and preserves constants, that is, $S(\alpha) = \alpha$ for all $\alpha \in \mathbb{R}$.


Definition 2.3. A family $S = (S(t))_{t \ge 0}$ of (possibly nonlinear) operators $S(t) \colon \mathbb{R}^d \to \mathbb{R}^d$ is called a semigroup if

(i) $S(0) = I$, where $I = I_d$ is the $d$-dimensional identity matrix,

(ii) $S(s + t) = S(s)S(t)$ for all $s, t \ge 0$.

Here and throughout, we make use of the notation $S(s)S(t) := S(s) \circ S(t)$. If, additionally, $S(h) \to I$ uniformly on compact sets as $h \searrow 0$, we say that the semigroup $S$ is uniformly continuous. We call $S$ Markovian if $S(t)$ is a kernel for all $t \ge 0$. We say that $S$ is linear, sublinear, or convex if $S(t)$ is linear, sublinear, or convex for all $t \ge 0$, respectively.

Definition 2.4. Let $\mathfrak{Q} \subset \mathbb{R}^{d \times d}$ be a set of Q-matrices and $f = (f_q)_{q \in \mathfrak{Q}}$ a family of vectors with $\sup_{q \in \mathfrak{Q}} f_q = f_{q_0} = 0$ for some $q_0 \in \mathfrak{Q}$, that is, $f_q \le 0$ for all $q \in \mathfrak{Q}$ and there exists some $q_0 \in \mathfrak{Q}$ with $f_{q_0} = 0$. We denote by

$$S_q(t)u_0 := e^{tq}u_0 + \int_0^t e^{sq}f_q \, \mathrm{d}s = u_0 + \int_0^t e^{sq}\big(qu_0 + f_q\big) \, \mathrm{d}s$$

for $t \ge 0$, $u_0 \in \mathbb{R}^d$, and $q \in \mathfrak{Q}$. Then, $S_q = (S_q(t))_{t \ge 0}$ is an affine linear semigroup. We call a semigroup $S$ the (upper) semigroup envelope (later also Nisio semigroup) of $(\mathfrak{Q}, f)$ if

(i) $S(t)u_0 \ge S_q(t)u_0$ for all $t \ge 0$, $u_0 \in \mathbb{R}^d$, and $q \in \mathfrak{Q}$,

(ii) for any other semigroup $T$ satisfying (i), we have that $S(t)u_0 \le T(t)u_0$ for all $t \ge 0$ and $u_0 \in \mathbb{R}^d$.

That is, the semigroup envelope $S$ is the smallest semigroup that dominates all semigroups $(S_q)_{q \in \mathfrak{Q}}$.

The following main theorem gives a full characterization of convex Q-operators.

Theorem 2.5. Let $\mathcal{Q} \colon \mathbb{R}^d \to \mathbb{R}^d$ be a mapping. Then, the following statements are equivalent:

(i) $\mathcal{Q}$ is a convex Q-operator.

(ii) $\mathcal{Q}$ is convex, satisfies the positive maximum principle, and $\mathcal{Q}\alpha = 0$ for all $\alpha \in \mathbb{R}$, where $\alpha := (\alpha, \dots, \alpha)^T \in \mathbb{R}^d$.

(iii) There exists a set $\mathfrak{Q} \subset \mathbb{R}^{d \times d}$ of Q-matrices and a family $f = (f_q)_{q \in \mathfrak{Q}} \subset \mathbb{R}^d$ of vectors with $f_q \le 0$ for all $q \in \mathfrak{Q}$ and $f_{q_0} = 0$ for some $q_0 \in \mathfrak{Q}$, such that

$$\mathcal{Q}u_0 = \sup_{q \in \mathfrak{Q}} \big(qu_0 + f_q\big) \qquad (2)$$

for all $u_0 \in \mathbb{R}^d$, where the supremum is to be understood componentwise.

(iv) There exists a uniformly continuous convex Markovian semigroup $S$ with

$$\mathcal{Q}u_0 = \lim_{h \searrow 0} \frac{S(h)u_0 - u_0}{h}$$

for all $u_0 \in \mathbb{R}^d$.

(v) There is a convex Markov chain $(\Omega, \mathcal{F}, \mathcal{E}, (X_t)_{t \ge 0})$ such that

$$\mathcal{Q}u_0 = \lim_{h \searrow 0} \frac{\mathcal{E}(u_0(X_h)) - u_0}{h}$$

for all $u_0 \in \mathbb{R}^d$.

In this case, for each initial value $u_0 \in \mathbb{R}^d$, the function $u \colon [0, \infty) \to \mathbb{R}^d$, $t \mapsto \mathcal{E}(u_0(X_t))$ is the unique classical solution $u \in C^1([0, \infty); \mathbb{R}^d)$ to the initial value problem

$$u'(t) = \mathcal{Q}u(t) = \sup_{q \in \mathfrak{Q}} \big(qu(t) + f_q\big), \quad t \ge 0, \qquad (3)$$
$$u(0) = u_0.$$

Moreover, the Markovian semigroup $S$ from (iv) is the (upper) semigroup envelope of $(\mathfrak{Q}, f)$, and $u(t) = S(t)u_0$ for all $t \ge 0$.
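As a concrete illustration of how the initial value problem (3) can be solved numerically, the following sketch implements an explicit Euler scheme (cf. Remark 2.6(g) below) for a finite dual representation $(\mathfrak{Q}, f)$; the two rate matrices, the zero penalties, and the step count are illustrative assumptions, not data from the paper.

```python
import numpy as np

def Q_op(u, qs, fs):
    """Componentwise supremum (Q u)_i = max_k ((q_k u)_i + (f_k)_i)."""
    return np.max(np.stack([q @ u + f for q, f in zip(qs, fs)]), axis=0)

def solve_nonlinear_ode(u0, qs, fs, t, n_steps=1000):
    """Explicit Euler for u'(t) = Q u(t), u(0) = u0."""
    u, dt = np.array(u0, dtype=float), t / n_steps
    for _ in range(n_steps):
        u = u + dt * Q_op(u, qs, fs)
    return u

q1 = np.array([[-1.0, 1.0], [2.0, -2.0]])   # illustrative Q-matrices
q2 = np.array([[-3.0, 3.0], [1.0, -1.0]])
qs, fs = [q1, q2], [np.zeros(2), np.zeros(2)]        # f = 0: sublinear case
print(solve_nonlinear_ode([1.0, 0.0], qs, fs, t=1.0))  # E(u_0(X_1)) componentwise
```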

Remark 2.6. Consider the situation of Theorem 2.5.

(a) The dual representation in (iii) gives a model uncertainty interpretation to Q-operators. The set $\mathfrak{Q}$ can be seen as the set of all plausible rate matrices when considering the Q-operator $\mathcal{Q}$. For every $q \in \mathfrak{Q}$, the vector $f_q \le 0$ can be interpreted as a penalization, which measures how much importance we give to each rate matrix $q$. The requirement that there exists some $q_0 \in \mathfrak{Q}$ with $f_{q_0} = 0$ can be interpreted in the following way: There exists at least one rate matrix $q_0$ within the set of all plausible rate matrices to which we assign the maximal importance, which is the minimal penalization.

(b) The semigroup envelope $S$ of $(\mathfrak{Q}, f)$ can be constructed more explicitly; in particular, an explicit (in terms of $(\mathfrak{Q}, f)$) dual representation can be derived. For details, we refer to Section 4 (Definition 4.2 and Remark 4.18). Moreover, we would like to highlight that the semigroup envelope $S$ can be constructed w.r.t. any dual representation $(\mathfrak{Q}, f)$ as in (iii) and results in the unique classical solution to (3) independent of the choice of the dual representation $(\mathfrak{Q}, f)$ of $\mathcal{Q}$. This gives, in some cases, the opportunity to efficiently compute the semigroup envelope numerically via its primal/dual representation (see Remark 3.3 and Example 5.2).

(c) The same equivalence as in Theorem 2.5 holds if convexity is replaced by sublinearity in (i), (ii), (iv), and (v) and $f_q = 0$ for all $q \in \mathfrak{Q}$ in (iii). In this case, the set $\mathfrak{Q}$ in (iii) can be chosen to be compact, as we will see in the proof of Theorem 2.5.

(d) Theorem 2.5 extends and includes the well-known relation between (linear) Markov chains, Q-matrices, and ordinary differential equations.

(e) A remarkable consequence of Theorem 2.5 is that every convex Markovian semigroup, which is differentiable at time $t = 0$, is the semigroup envelope with respect to the Fenchel–Legendre transformation (or any other dual representation as in (iii)) of its generator, which is a convex Q-operator.


(f) Althoughhas an unbounded convex conjugate, the convex initial value problem

𝑒′(𝑑) =ξˆ½π‘’(𝑑) for all𝑑β‰₯0, 𝑒(0) = 𝑒0, (4) has a unique global solution.

(g) Solutions to (4) remain bounded. Therefore, a Picard iteration or Runge–Kutta methods, such as the explicit Euler method, can be used for numerical computations, and the convergence rate (depending on the size of the initial value𝑒0) can be derived from the a priori estimate in Banach’s fixed point theorem.

(h) As in the linear case, by solving the differential equation (4), one can (numerically) compute expressions of the form

𝑒(𝑑) =(𝑒0(𝑋𝑑)).

We illustrate this computation procedure in Example5.1.

3 PROOF OF (v) ⇒ (ii) ⇒ (i) ⇒ (iii)

We say that a set $\mathfrak{Q} \subset \mathbb{R}^{d \times d}$ of matrices is row-convex if, for any diagonal matrix $\theta \in \mathbb{R}^{d \times d}$ with $\theta_i := \theta_{ii} \in [0, 1]$ for all $i \in \{1, \dots, d\}$,

$$\theta p + (I - \theta)q \in \mathfrak{Q} \quad \text{for all } p, q \in \mathfrak{Q},$$

where $I = I_d \in \mathbb{R}^{d \times d}$ is the $d$-dimensional identity matrix. Notice that, for all $i \in \{1, \dots, d\}$, the $i$th row of the matrix $\theta p + (I - \theta)q$ is the convex combination of the $i$th rows of $p$ and $q$ with weight $\theta_i$. Notice that a set $\mathfrak{Q} \subset \mathbb{R}^{d \times d}$ is row-convex if and only if it is convex and, for arbitrary $p, q \in \mathfrak{Q}$, the matrix that results from replacing the $i$th row of $p$ by the $i$th row of $q$ is again an element of $\mathfrak{Q}$. For example, the set of all Q-matrices is row-convex.

Remark 3.1. Let $\mathcal{Q}$ be a convex Q-operator. For every matrix $q \in \mathbb{R}^{d \times d}$, let

$$\mathcal{Q}^*(q) := \sup_{u \in \mathbb{R}^d} \big(qu - \mathcal{Q}(u)\big) \in [0, \infty]^d$$

be the conjugate function of $\mathcal{Q}$. Notice that $0 \le \mathcal{Q}^*(q)$ for all $q \in \mathbb{R}^{d \times d}$, since $\mathcal{Q}(0) = 0$. Let

$$\mathfrak{Q}^* := \big\{q \in \mathbb{R}^{d \times d} \,\big|\, \mathcal{Q}^*(q) \in [0, \infty)^d\big\}$$

and $f_q^* := -\mathcal{Q}^*(q)$ for all $q \in \mathfrak{Q}^*$. Then, the following facts are well-known results from convex duality theory in $\mathbb{R}^d$.

(a) The set $\mathfrak{Q}^*$ is row-convex and the mapping $\mathfrak{Q}^* \to \mathbb{R}^d$, $q \mapsto \mathcal{Q}^*(q)$ is lower semicontinuous.


(b) Let $M \ge 0$ and $\mathfrak{Q}^*_M := \{q \in \mathbb{R}^{d \times d} \mid \mathcal{Q}^*(q) \le M\}$. Then, $\mathfrak{Q}^*_M \subset \mathbb{R}^{d \times d}$ is compact and row-convex. Therefore,

$$\mathcal{Q}_M \colon \mathbb{R}^d \to \mathbb{R}^d, \quad u \mapsto \max_{q \in \mathfrak{Q}^*_M} \big(qu + f_q^*\big) \qquad (5)$$

defines a convex operator, which is Lipschitz continuous. Notice that the maximum in (5) is to be understood componentwise. However, for fixed $u_0 \in \mathbb{R}^d$, the maximum can be attained, simultaneously in every component, by a single element of $\mathfrak{Q}^*_M$, that is, for all $u_0 \in \mathbb{R}^d$, there exists some $q_0 \in \mathfrak{Q}^*_M$ with

$$\mathcal{Q}_M u_0 = q_0 u_0 + f^*_{q_0}.$$

This is due to the fact that $\mathfrak{Q}^*_M$ is row-convex and that, for $q \in \mathfrak{Q}^*$, the $i$th component of the vector $f^*_q$ only depends on the $i$th row of $q$.

(c) Let $R \ge 0$. Then, there exists some $M \ge 0$ such that

$$\mathcal{Q}u_0 = \max_{q \in \mathfrak{Q}^*_M} \big(qu_0 + f_q^*\big) = \mathcal{Q}_M u_0$$

for all $u_0 \in \mathbb{R}^d$ with $\|u_0\|_\infty \le R$. In particular, $\mathcal{Q}$ is locally Lipschitz continuous and

$$\mathcal{Q}u_0 = \max_{q \in \mathfrak{Q}^*} \big(qu_0 + f_q^*\big) \quad \text{for all } u_0 \in \mathbb{R}^d,$$

where, for fixed $u_0 \in \mathbb{R}^d$, the maximum can be attained, simultaneously in every component, by a single element of $\mathfrak{Q}^*$. In particular, there exists some $q_0 \in \mathfrak{Q}^*$ with $f^*_{q_0} = \sup_{q \in \mathfrak{Q}^*} f^*_q = \mathcal{Q}(0) = 0$.

Proof of Theorem 2.5. (v) ⇒ (ii): As $\mathcal{E}_i$ is a convex expectation for all $i \in \{1, \dots, d\}$, it follows that the operator $\mathcal{Q}$ is convex with $\mathcal{Q}\alpha = 0$ for all $\alpha \in \mathbb{R}$. Now, let $u_0 \in \mathbb{R}^d$ and $i \in \{1, \dots, d\}$ with $u_{0,i} \ge u_{0,j}$ for all $j \in \{1, \dots, d\}$. Let $\alpha > 0$ be such that

$$\|u_0 + \alpha\|_\infty = (u_0 + \alpha)_i = u_{0,i} + \alpha,$$

and define $v_0 := u_0 + \alpha$. Then,

$$\mathcal{Q}v_0 = \lim_{h \searrow 0} \frac{\mathcal{E}(u_0(X_h) + \alpha) - v_0}{h} = \lim_{h \searrow 0} \frac{\mathcal{E}(u_0(X_h)) - u_0}{h} = \mathcal{Q}u_0.$$

Assume that $(\mathcal{Q}u_0)_i > 0$. Then, there exists some $h > 0$ such that

$$\mathcal{E}_i(v_0(X_h)) - v_{0,i} > 0.$$

Hence,

$$\|\mathcal{E}(v_0(X_h))\|_\infty \ge \mathcal{E}_i(v_0(X_h)) > v_{0,i} = \|v_0\|_\infty,$$


which is a contradiction to $\|\mathcal{E}(v_0(X_h))\|_\infty \le \|v_0\|_\infty$. This shows that $\mathcal{Q}$ satisfies the positive maximum principle.

(ii) ⇒ (i): This follows directly from the positive maximum principle, considering the vectors $\lambda e_i$ and $-\lambda e_i$ for all $\lambda > 0$ and $i \in \{1, \dots, d\}$.

(i) ⇒ (iii): Let $\mathcal{Q}$ be a convex Q-operator. Moreover, let $\mathfrak{Q}^*$ and $f^* = (f_q^*)_{q \in \mathfrak{Q}^*}$ be as in Remark 3.1. Then, by Remark 3.1(c), it only remains to show that every $q \in \mathfrak{Q}^*$ is a Q-matrix. To this end, fix an arbitrary $q \in \mathfrak{Q}^*$. Then, for all $\alpha \in \mathbb{R}$,

$$q\alpha = \frac{1}{\lambda} q(\lambda\alpha) \le \frac{1}{\lambda} \big(\mathcal{Q}(\lambda\alpha) + \mathcal{Q}^*(q)\big) = \frac{1}{\lambda} \mathcal{Q}^*(q) \to 0 \quad \text{as } \lambda \to \infty.$$

Therefore, $q\alpha \le 0$ for all $\alpha \in \mathbb{R}$. Since $q$ is linear, it follows that $q1 = 0$. Now, let $i \in \{1, \dots, d\}$. Then, by the definition of a Q-operator, we obtain that

$$q_{ii} \le \frac{1}{\lambda} \big(\mathcal{Q}(\lambda e_i) + \mathcal{Q}^*(q)\big)_i \le \frac{1}{\lambda} \big(\mathcal{Q}^*(q)\big)_i \to 0 \quad \text{as } \lambda \to \infty,$$

that is, $q_{ii} \le 0$. Now, let $i, j \in \{1, \dots, d\}$ with $i \ne j$. Then, again by the definition of a Q-operator, it follows that

$$-q_{ij} \le \frac{1}{\lambda} \big(\mathcal{Q}(-\lambda e_j) + \mathcal{Q}^*(q)\big)_i \le \frac{1}{\lambda} \big(\mathcal{Q}^*(q)\big)_i \to 0 \quad \text{as } \lambda \to \infty,$$

that is, $q_{ij} \ge 0$. Therefore, $q$ is a Q-matrix.

It remains to show the implications (iii) ⇒ (iv) ⇒ (v), which occupies the entire next section. □

Before we start with the proof of the remaining implications (iii) ⇒ (iv) ⇒ (v), we would like to point out how, in the sublinear case, the set $\mathfrak{Q}^*$ of Q-matrices from Remark 3.1 can be reduced to certain "corner points." This can be done using the concept of row-convexity, introduced at the beginning of this section, together with Minkowski's theorem on extremal points of convex sets in $\mathbb{R}^d$. Let $\mathcal{B} \subset \mathbb{R}^{d \times d}$ be a nonempty set of matrices. Then, we define the row-convex hull of $\mathcal{B}$ by

$$\operatorname{rch}(\mathcal{B}) := \Bigg\{ \sum_{i=1}^n \theta_i q_i \;\Bigg|\; n \in \mathbb{N},\ \theta_1, \dots, \theta_n \in [0, \infty)^{d \times d},\ \sum_{i=1}^n \theta_i = I,\ q_1, \dots, q_n \in \mathcal{B} \Bigg\}.$$

For a convex set $C \subset \mathbb{R}^d$, we denote the set of all extreme points of $C$ by $E(C)$. Recall that an extreme point of a convex set $C \subset \mathbb{R}^d$ is an element $x \in C$ such that $x = \lambda y + (1 - \lambda)z$, for $\lambda \in (0, 1)$ and $y, z \in C$, implies that $x = y = z$. For a matrix $q \in \mathbb{R}^{d \times d}$ and $i \in \{1, \dots, d\}$, we denote by

$$q_i := (q_{i1}, \dots, q_{id}) \in \mathbb{R}^d$$


the𝑖th row ofπ‘ž. LetξˆΌβŠ‚ ℝ𝑑×𝑑be a nonempty compact row-convex set of matrices. Then, we say that a setξˆΎβŠ‚ξˆΌis-row-extremeif

{π‘žπ‘–|π‘ž ∈} = 𝐸({π‘žπ‘–|π‘ž ∈}) for all𝑖 ∈ {1, … , 𝑑}.

That is, the set of all𝑖th rows ofis the set of all extreme points of the𝑖th rows of. We say that a setξˆΉβŠ‚ξˆΌisminimal-row-extreme, ifis row-extreme for andξˆΎβŠ‚ξˆΉimplies= for any-row-extreme setξˆΎβŠ‚ξˆΌ.

Proposition 3.2. Let $\mathfrak{Q} \subset \mathbb{R}^{d \times d}$ be nonempty, compact, and row-convex. Then, there exists a minimal $\mathfrak{Q}$-row-extreme set $\mathcal{B} \subset \mathfrak{Q}$. Moreover, $\mathfrak{Q} = \operatorname{rch}(\mathcal{R})$ is the row-convex hull of any (minimal) $\mathfrak{Q}$-row-extreme set $\mathcal{R} \subset \mathfrak{Q}$, and

$$\max_{q \in \mathfrak{Q}} qu_0 = \max_{q \in \mathcal{R}} qu_0 \quad \text{for all } u_0 \in \mathbb{R}^d, \qquad (6)$$

where the maxima are to be understood componentwise.

Proof. By Minkowski's theorem, the set of all $\mathfrak{Q}$-row-extreme sets is nonempty, and one readily verifies that the latter together with the partial order $\preceq$, given by $\mathcal{R}_1 \preceq \mathcal{R}_2$ if and only if $\mathcal{R}_1 \supset \mathcal{R}_2$, has the chain property. Hence, by Zorn's lemma, there exists a maximal element $\mathcal{B}$ within the set of all $\mathfrak{Q}$-row-extreme sets, which, by definition, is a minimal $\mathfrak{Q}$-row-extreme set. Now, let $\mathcal{R}$ be an arbitrary $\mathfrak{Q}$-row-extreme set and $u_0 \in \mathbb{R}^d$. Then,

$$\max_{q \in \mathfrak{Q}} (qu_0)_i = \max_{q \in \mathfrak{Q}} (q_i \cdot u_0) = \max_{q \in \mathcal{R}} (q_i \cdot u_0) = \max_{q \in \mathcal{R}} (qu_0)_i. \qquad \Box$$

Remark 3.3. Let $\mathcal{Q} \colon \mathbb{R}^d \to \mathbb{R}^d$ be a sublinear Q-operator, and $\mathfrak{Q}^*$ as in Remark 3.1. Then,

$$\mathfrak{Q}^* = \big\{q \in \mathbb{R}^{d \times d} \,\big|\, f_q^* = \mathcal{Q}^*(q) = 0\big\}$$

is a nonempty, compact, and row-convex set. By the previous proposition, there exists a minimal $\mathfrak{Q}^*$-row-extreme set $\mathcal{B} \subset \mathfrak{Q}^*$, and, for all $u_0 \in \mathbb{R}^d$,

$$\mathcal{Q}u_0 = \max_{q \in \mathcal{B}} qu_0,$$

where the maximum is to be understood componentwise. Since $f^* = (f_q^*)_{q \in \mathfrak{Q}^*} = 0$, it follows that $(\mathcal{B}, 0)$ is a dual representation as in Theorem 2.5(iii). Notice that, in many cases, the cardinality of $\mathcal{B}$ is much smaller than the cardinality of $\mathfrak{Q}^*$. Therefore, concerning computational aspects, the dual representation $(\mathcal{B}, 0)$ is often far more tractable than the dual representation $(\mathfrak{Q}^*, 0)$, and, by Theorem 2.5, both representations result in the same semigroup envelope, and thus, the same solution to the ODE (3).


Example 3.4. Let $q_0, q \in \mathbb{R}^{d \times d}$ be two fixed Q-matrices and $\lambda_l, \lambda_h \in \mathbb{R}$ with $\lambda_l \le \lambda_h$. We define the sublinear Q-operator $\mathcal{Q} \colon \mathbb{R}^d \to \mathbb{R}^d$ by

$$\mathcal{Q}u_0 := q_0 u_0 + \max_{\lambda \in [\lambda_l, \lambda_h]} \lambda qu_0 \quad \text{for all } u_0 \in \mathbb{R}^d.$$

We consider the maximal row-convex set $\mathfrak{Q}^* \subset \mathbb{R}^{d \times d}$ representing $\mathcal{Q}$, defined as in Remark 3.1. Then,

$$\mathfrak{Q}^* = \big\{q_0 + \lambda q \,\big|\, \lambda \in \operatorname{diag}([\lambda_l, \lambda_h])\big\},$$

where $\operatorname{diag}([\lambda_l, \lambda_h])$ denotes the set of all diagonal matrices $\lambda \in \mathbb{R}^{d \times d}$ with diagonal entries $\lambda_{ii} \in [\lambda_l, \lambda_h]$ for all $i \in \{1, \dots, d\}$. Now, let

$$\mathcal{B} := \{q_0 + \lambda_l q,\; q_0 + \lambda_h q\}.$$

Then, $\mathcal{B}$ is a minimal $\mathfrak{Q}^*$-row-extreme set, and thus, $\mathfrak{Q}^* = \operatorname{rch}(\mathcal{B})$. In particular, by the previous remark, the tuple

$$\big(\{q_0 + \lambda_l q,\; q_0 + \lambda_h q\},\,(0, 0)\big)$$

is a dual representation as in Theorem 2.5(iii), which is far more tractable than the dual representation $(\mathfrak{Q}^*, 0)$.
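The reduction to corner points is easy to verify numerically. The sketch below (with illustrative matrices $q_0$, $q$ and bounds $\lambda_l$, $\lambda_h$, all made up for the example) checks that the direct evaluation of $\mathcal{Q}$ agrees with the componentwise maximum over the two corner matrices.

```python
import numpy as np

q0 = np.array([[-1.0, 1.0], [1.0, -1.0]])   # illustrative Q-matrices
q = np.array([[-2.0, 2.0], [3.0, -3.0]])
lam_l, lam_h = 0.5, 2.0
u = np.array([1.0, -1.0])

# direct evaluation: the scalar maximum over [lam_l, lam_h] is attained
# at an endpoint, componentwise
direct = q0 @ u + np.maximum(lam_l * (q @ u), lam_h * (q @ u))

# corner-point dual representation ({q0 + lam_l*q, q0 + lam_h*q}, (0, 0))
corners = [q0 + lam_l * q, q0 + lam_h * q]
dual = np.max(np.stack([p @ u for p in corners]), axis=0)

print(np.allclose(direct, dual))  # True
```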

4 PROOF OF (iii) ⇒ (iv) ⇒ (v)

Throughout, let $\mathfrak{Q} \subset \mathbb{R}^{d \times d}$ be a set of Q-matrices and $f = (f_q)_{q \in \mathfrak{Q}} \subset \mathbb{R}^d$ with $f_q \le 0$ for all $q \in \mathfrak{Q}$ and $f_{q_0} = 0$ for some $q_0 \in \mathfrak{Q}$, such that the map

$$\mathcal{Q} \colon \mathbb{R}^d \to \mathbb{R}^d, \quad u \mapsto \sup_{q \in \mathfrak{Q}} \big(qu + f_q\big)$$

is well-defined. For every $q \in \mathfrak{Q}$, we consider the linear ODE

$$u'(t) = qu(t) + f_q \quad \text{for } t \ge 0, \qquad (7)$$

with $u(0) = u_0 \in \mathbb{R}^d$. Then, by variation of constants, the solution to (7) is given by

$$u(t) = e^{tq}u_0 + \int_0^t e^{sq}f_q \, \mathrm{d}s = u_0 + \int_0^t e^{sq}\big(qu_0 + f_q\big) \, \mathrm{d}s =: S_q(t)u_0 \qquad (8)$$

for $t \ge 0$, where $e^{tq} \in \mathbb{R}^{d \times d}$ is the matrix exponential of $tq$ for all $t \ge 0$. Then, the family $S_q = (S_q(t))_{t \ge 0}$ defines a uniformly continuous semigroup of affine linear operators (see Definition 2.3).
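Numerically, $S_q(t)u_0$ can be evaluated without quadrature via the standard augmented-matrix device for affine linear ODEs (a common trick assumed here, not a construction from the paper):

```python
import numpy as np
from scipy.linalg import expm

def S_q(t, u0, q, f_q):
    """Evaluate S_q(t) u0 = e^{tq} u0 + int_0^t e^{sq} f_q ds exactly
    (up to expm accuracy) via one (d+1)x(d+1) matrix exponential."""
    d = len(u0)
    A = np.zeros((d + 1, d + 1))
    A[:d, :d] = q                  # generator block
    A[:d, d] = f_q                 # affine (penalty) block
    return (expm(t * A) @ np.append(u0, 1.0))[:d]

q = np.array([[-1.0, 1.0], [2.0, -2.0]])   # an illustrative Q-matrix
f_q = np.array([-0.1, 0.0])                # a penalty vector with f_q <= 0
print(S_q(1.0, np.array([1.0, 0.0]), q, f_q))
```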

Remark 4.1. Note that, for all $q \in \mathfrak{Q}$ and $t \ge 0$, the matrix exponential $e^{tq} \in \mathbb{R}^{d \times d}$ is a stochastic matrix, that is,

(i) $(e^{tq})_{ij} \ge 0$ for all $i, j \in \{1, \dots, d\}$,

(ii) $e^{tq}1 = 1$.

Therefore, $e^{tq} \in \mathbb{R}^{d \times d}$ is a linear kernel, that is, $e^{tq}u_0 \le e^{tq}v_0$ for all $u_0, v_0 \in \mathbb{R}^d$ with $u_0 \le v_0$ and $e^{tq}\alpha = \alpha$ for all $\alpha \in \mathbb{R}$, which implies that $S_q(t)$ is monotone for all $q \in \mathfrak{Q}$ and $t \ge 0$.

For the family $(S_q)_{q \in \mathfrak{Q}}$ or, more precisely, for $(\mathfrak{Q}, f)$, we will now construct the Nisio semigroup, and show that it gives rise to the unique classical solution to the nonlinear ODE (3). To this end, we consider the set of finite partitions

$$P := \big\{\pi \subset [0, \infty) \,\big|\, 0 \in \pi,\ |\pi| < \infty\big\}.$$

The set of partitions with end point $t \ge 0$ will be denoted by $P_t$, that is, $P_t := \{\pi \in P \mid \max \pi = t\}$. Notice that

$$P = \bigcup_{t \ge 0} P_t.$$

For all $h \ge 0$ and $u_0 \in \mathbb{R}^d$, we define

$$\mathcal{E}_h u_0 := \sup_{q \in \mathfrak{Q}} S_q(h)u_0,$$

where the supremum is taken componentwise. Note that $\mathcal{E}_h$ is well-defined since

$$S_q(h)u_0 = e^{hq}u_0 + \int_0^h e^{sq}f_q \, \mathrm{d}s \le e^{hq}u_0 \le \|u_0\|_\infty$$

for all $q \in \mathfrak{Q}$, $h \ge 0$, and $u_0 \in \mathbb{R}^d$, where we used the fact that $e^{hq}$ is a kernel. Moreover, $\mathcal{E}_h$ is a convex kernel, for all $h \ge 0$, as it is monotone and

$$\mathcal{E}_h \alpha = \alpha + \sup_{q \in \mathfrak{Q}} \int_0^h e^{sq}f_q \, \mathrm{d}s = \alpha$$

for all $\alpha \in \mathbb{R}$, where we used the fact that there is some $q_0 \in \mathfrak{Q}$ with $f_{q_0} = 0$. For a partition $\pi = \{t_0, t_1, \dots, t_m\} \in P$ with $m \in \mathbb{N}$ and $0 = t_0 < t_1 < \dots < t_m$, we set

$$\mathcal{E}_\pi := \mathcal{E}_{t_1 - t_0} \cdots \mathcal{E}_{t_m - t_{m-1}}.$$

Moreover, we set $\mathcal{E}_{\{0\}} := \mathcal{E}_0$. Then, $\mathcal{E}_\pi$ is a convex kernel for all $\pi \in P$ since it is a concatenation of convex kernels.

Definition 4.2. The Nisio semigroup $S = (S(t))_{t \ge 0}$ of $(\mathfrak{Q}, f)$ is defined by

$$S(t)u_0 := \sup_{\pi \in P_t} \mathcal{E}_\pi u_0 \quad \text{for all } u_0 \in \mathbb{R}^d \text{ and } t \ge 0.$$


Notice that $S(t) \colon \mathbb{R}^d \to \mathbb{R}^d$ is well-defined and a convex kernel for all $t \ge 0$ since $\mathcal{E}_\pi$ is a convex kernel for all $\pi \in P$. In many of the subsequent proofs, we will first concentrate on the case where the family $f$ is bounded, and then use an approximation of the Nisio semigroup by means of other Nisio semigroups. This approximation procedure is specified in the following remark.

Remark 4.3. Let $M \ge 0$, $\mathfrak{Q}_M := \{q \in \mathfrak{Q} \mid \|f_q\|_\infty \le M\}$ and $f_M := (f_q)_{q \in \mathfrak{Q}_M}$. Notice that, by assumption, there exists some $q_0 \in \mathfrak{Q}$ with $f_{q_0} = 0$, which implies that $q_0 \in \mathfrak{Q}_M$. Since $\mathfrak{Q}_M \subset \mathfrak{Q}$ (and by definition of $f_M$), the operator

$$\mathcal{Q}_M \colon \mathbb{R}^d \to \mathbb{R}^d, \quad v \mapsto \sup_{q \in \mathfrak{Q}_M} \big(qv + f_q\big)$$

is well-defined. Let $S_M$ be the Nisio semigroup w.r.t. $(\mathfrak{Q}_M, f_M)$ for all $M \ge 0$. Since

$$\bigcup_{M \ge 0} \mathfrak{Q}_M = \mathfrak{Q},$$

it follows that $\mathcal{Q}_M \nearrow \mathcal{Q}$ and $S_M(t) \nearrow S(t)$, for all $t \ge 0$, as $M \to \infty$. Moreover, for all $q \in \mathfrak{Q}_M$, $u_0 \in \mathbb{R}^d$ with $\|u_0\|_\infty = 1$, and $i \in \{1, \dots, d\}$,

$$(qu_0)_i \le \big(\mathcal{Q}u_0 - f_q\big)_i \le \|\mathcal{Q}u_0\|_\infty + \|f_q\|_\infty \le M + \max_{v \in \mathbb{S}^{d-1}} \|\mathcal{Q}v\|_\infty,$$

where $\mathbb{S}^{d-1} := \{v \in \mathbb{R}^d \mid \|v\|_\infty = 1\}$ and, in the last step, we used the fact that $\mathcal{Q} \colon \mathbb{R}^d \to \mathbb{R}^d$ is convex and therefore continuous. This implies that the set $\mathfrak{Q}_M$ is bounded in the sense that $\sup_{q \in \mathfrak{Q}_M} \|q\| < \infty$. In particular,

$$\sup_{q \in \mathfrak{Q}_M} \|qu_0 + f_q\|_\infty \le \sup_{q \in \mathfrak{Q}_M} \big(\|q\| \|u_0\|_\infty + \|f_q\|_\infty\big) \le M + \sup_{q \in \mathfrak{Q}_M} \|q\| \|u_0\|_\infty < \infty \qquad (9)$$

for all $u_0 \in \mathbb{R}^d$.

Lemma 4.4. Assume that the family $f$ is bounded, that is, $(\mathfrak{Q}, f) = (\mathfrak{Q}_M, f_M)$ for some $M \ge 0$. Then, for all $u_0 \in \mathbb{R}^d$, the mapping $[0, \infty) \to \mathbb{R}^d$, $h \mapsto \mathcal{E}_h u_0$ is Lipschitz continuous.

Proof. Let $u_0 \in \mathbb{R}^d$ and $0 \le h_1 < h_2$. Then, by (8), for all $q \in \mathfrak{Q}$, we have that

$$\|S_q(h_2)u_0 - S_q(h_1)u_0\|_\infty \le \int_{h_1}^{h_2} \big\|e^{sq}(qu_0 + f_q)\big\|_\infty \, \mathrm{d}s \le (h_2 - h_1)\|qu_0 + f_q\|_\infty,$$

which implies that

$$\|\mathcal{E}_{h_2}u_0 - \mathcal{E}_{h_1}u_0\|_\infty \le \sup_{q \in \mathfrak{Q}} \|S_q(h_2)u_0 - S_q(h_1)u_0\|_\infty \le (h_2 - h_1) \Big( \sup_{q \in \mathfrak{Q}} \|qu_0 + f_q\|_\infty \Big). \qquad (10)$$

Note that $\sup_{q \in \mathfrak{Q}} \|qu_0 + f_q\|_\infty < \infty$ by (9). □
