

In the document Probability in High Dimension (pages 27-32)

Part I Concentration

2.2 Markov semigroups


A (homogeneous) Markov process (X_t)_{t∈R+} is a random process that satisfies the Markov property: for every bounded measurable function f and s, t ∈ R+, there is a bounded measurable function P_s f such that

E[f(X_{t+s}) | {X_r}_{r≤t}] = (P_s f)(X_t).

[We do not put any restrictions on the state space: X_t can take values in any measurable space E, and the functions above are of the form f : E → R.]

The interpretation, of course, is classical: the behavior of the process in the future X_{t+s} depends on the history to date {X_r}_{r≤t} only through the current state X_t, and is independent of the prior history; that is, the dynamics of the Markov process are memoryless. The assumption that P_s f does not also depend on t in the above expression (the homogeneity property) indicates that the same dynamical mechanism is used at each time.

A probability measure µ is called stationary or invariant if µ(P_t f) = µ(f) for all t ∈ R+ and all bounded measurable f.

To interpret this notion, suppose that X_0 ∼ µ. Then

E[f(X_t)] = E[E[f(X_t) | X_0]] = E[P_t f(X_0)] = µ(P_t f).

Thus if µ is stationary, then E[f(X_t)] = µ(f) for every t ∈ R+ and f: in particular, if the process is initially distributed according to the stationary measure X_0 ∼ µ, then the process remains distributed according to the stationary measure X_t ∼ µ for every time t. In other words, stationary measures describe the “steady-state” or “equilibrium” behavior of a Markov process.
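To make the stationarity identity concrete, here is a minimal numerical sketch (not part of the text) for a two-state chain whose transition probabilities are available in closed form; the rates a, b and the function f below are arbitrary choices for illustration.

```python
import numpy as np

def Q(t, a=2.0, b=3.0):
    """Closed-form transition matrix Q_t of the two-state chain with
    jump rates a (state 1 -> 2) and b (state 2 -> 1)."""
    e = np.exp(-(a + b) * t)
    return np.array([[b + a * e, a * (1 - e)],
                     [b * (1 - e), a + b * e]]) / (a + b)

a, b = 2.0, 3.0
mu = np.array([b, a]) / (a + b)   # candidate stationary measure
f = np.array([1.0, -4.0])         # an arbitrary "function" f : {1, 2} -> R

for t in [0.1, 0.5, 2.0]:
    assert np.allclose(mu @ Q(t), mu)            # mu is invariant: mu Q_t = mu
    assert np.isclose(mu @ (Q(t) @ f), mu @ f)   # E[f(X_t)] = mu(f) when X_0 ~ mu
```

Here (P_t f)(i) is the i-th entry of Q(t) @ f, and the invariance mu @ Q(t) == mu is exactly the statement µ(P_t f) = µ(f) specialized to a finite state space.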

Let us describe a few basic facts about the functionsPtf.

Lemma 2.7. Let µ be a stationary measure. Then the following hold for all p ≥ 1, t, s ∈ R+, α, β ∈ R, and bounded measurable functions f, g:

1. ‖P_t f‖_{L^p(µ)} ≤ ‖f‖_{L^p(µ)} := µ(|f|^p)^{1/p} (contraction).

2. P_t(αf + βg) = αP_t f + βP_t g µ-a.s. (linearity).

3. P_{t+s} f = P_t P_s f µ-a.s. (semigroup property).

4. P_t 1 = 1 µ-a.s. (conservativeness).

In particular, {P_t}_{t∈R+} defines a semigroup of linear operators on L^p(µ).

Proof. Assume that X_0 ∼ µ. To prove contraction, note that

‖P_t f‖^p_{L^p(µ)} = E[|E[f(X_t) | X_0]|^p] ≤ E[E[|f(X_t)|^p | X_0]] = ‖f‖^p_{L^p(µ)},

where we have used Jensen’s inequality. Linearity follows similarly as

E[αf(X_t) + βg(X_t) | X_0] = αE[f(X_t) | X_0] + βE[g(X_t) | X_0].

To prove the semigroup property, note that

E[f(X_{t+s}) | X_0] = E[E[f(X_{t+s}) | {X_r}_{r≤t}] | X_0] = E[P_s f(X_t) | X_0].

The last property is trivial. □
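All four properties of Lemma 2.7 can be verified numerically for the two-state chain with closed-form semigroup Q_t; this is only a sanity-check sketch, with rates and test functions chosen arbitrarily.

```python
import numpy as np

def Q(t, a=2.0, b=3.0):
    """Closed-form semigroup of the two-state chain with rates a, b,
    so that (P_t f)(i) is the i-th entry of Q(t) @ f."""
    e = np.exp(-(a + b) * t)
    return np.array([[b + a * e, a * (1 - e)],
                     [b * (1 - e), a + b * e]]) / (a + b)

mu = np.array([3.0, 2.0]) / 5.0            # stationary measure for a=2, b=3
f, g = np.array([1.0, -4.0]), np.array([0.5, 2.0])
t, s, p = 0.7, 0.4, 3

lp = lambda h: (mu @ np.abs(h) ** p) ** (1 / p)   # L^p(mu) norm
assert lp(Q(t) @ f) <= lp(f) + 1e-12              # 1. contraction
assert np.allclose(Q(t) @ (2*f + 5*g), 2*(Q(t) @ f) + 5*(Q(t) @ g))  # 2. linearity
assert np.allclose(Q(t + s) @ f, Q(t) @ (Q(s) @ f))                  # 3. semigroup
assert np.allclose(Q(t) @ np.ones(2), np.ones(2))                    # 4. P_t 1 = 1
```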

Remark 2.8. Let µ be a stationary measure. In view of Lemma 2.7, it is easily seen that the definition and basic properties of P_t f make sense not only for bounded measurable functions f, but also for every f ∈ L^1(µ). From now on, we will assume that P_t f is defined in this manner for every f ∈ L^1(µ).

As an illustration of these basic properties, let us prove the following elementary observation. In the sequel, we will write Var_µ(f) := µ(f^2) − µ(f)^2.

Lemma 2.9. Let µ be a stationary measure. Then t ↦ Var_µ(P_t f) is a decreasing function of time for every function f ∈ L^2(µ).

Proof. Note that

Var_µ(P_t f) = ‖P_t f − µf‖²_{L²(µ)} = ‖P_t(f − µf)‖²_{L²(µ)} = ‖P_{t−s} P_s (f − µf)‖²_{L²(µ)}
≤ ‖P_s(f − µf)‖²_{L²(µ)} = ‖P_s f − µf‖²_{L²(µ)} = Var_µ(P_s f)

for every 0 ≤ s ≤ t. □
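The monotonicity in Lemma 2.9 is easy to observe numerically; the following sketch (with arbitrarily chosen rates and f) checks that t ↦ Var_µ(P_t f) decreases along a grid of times for the two-state chain.

```python
import numpy as np

def Q(t, a=2.0, b=3.0):
    """Closed-form semigroup of the two-state chain with rates a, b."""
    e = np.exp(-(a + b) * t)
    return np.array([[b + a * e, a * (1 - e)],
                     [b * (1 - e), a + b * e]]) / (a + b)

mu = np.array([3.0, 2.0]) / 5.0          # stationary measure for a=2, b=3
f = np.array([1.0, -4.0])
var = lambda h: mu @ h**2 - (mu @ h) ** 2   # Var_mu(h)

vs = [var(Q(t) @ f) for t in np.linspace(0.0, 3.0, 30)]
assert all(v1 >= v2 - 1e-12 for v1, v2 in zip(vs, vs[1:]))  # variance decreases
```

For this two-state chain one can even see the exponential rate: P_t f − µf = e^{−(a+b)t}(f − µf), so the variance decays like e^{−2(a+b)t}, foreshadowing the connection to Poincaré inequalities in the next section.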

We now turn to an important notion for Markov processes in continuous time. If you are familiar with Markov chains in discrete time with a finite state space, you will be used to the idea that the dynamics of the chain is defined in terms of a matrix of transition probabilities. This matrix describes with what probability the chain moves from one state to another in one time step, and forms the basic ingredient in the analysis of the behavior of Markov chains.

This idea does not make sense in continuous time, as a Markov process evolves continuously and not in individual steps. Nonetheless, there is an object that plays the analogous role in continuous time, called the generator of a Markov process. We will first describe the general notion, and then investigate the finite state space case as an example (in which case the generator can be interpreted as a matrix of transition rates rather than probabilities).

From now on, we will fix a Markov process with stationary measure µ and consider {P_t}_{t∈R+} as a semigroup of linear operators on L²(µ).

Definition 2.10 (Generator). The generator L is defined as

Lf := lim_{t↓0} (P_t f − f)/t

for every f ∈ L²(µ) for which the above limit exists in L²(µ). The set of f for which Lf is defined is called the domain Dom(L) of the generator, and L defines a linear operator from Dom(L) ⊆ L²(µ) to L²(µ).

Remark 2.11 (Warning). For Markov processes whose sample paths are of pure jump type (i.e., piecewise constant as a function of time) it is often the case that Dom(L) = L²(µ). This is the simplest setting for the theory of Markov processes in continuous time, and here many computations can be done without any technicalities. On the other hand, for Markov processes with continuous sample paths (such as Brownian motion, for example), it is an unfortunate fact of life that Dom(L) ⊊ L²(µ). In this setting, a rigorous treatment of semigroups, generators, and domains requires functional analytic machinery that is not assumed as a prerequisite for this course. While we should therefore ideally restrict attention to the pure jump case, many important applications (for example, the proof of the Poincaré inequality for Gaussian variables) will require the use of continuous Markov processes.

Fortunately, it turns out that domain problems are of a purely technical nature in all the applications that we will encounter: results that we will derive for the case Dom(L) = L²(µ) will be directly applicable even when this condition fails. While a rigorous proof would require checking carefully that no domain issues arise, addressing such issues would take significant time and does not provide much insight into the high-dimensional phenomena that are of interest in this course. As a compromise, we will therefore generally ignore domain problems and assume implicitly that Dom(L) = L²(µ) when deriving general results, while we will still apply these results in more general cases. The interested reader should be aware when a shortcut is being taken, and refer to the literature for a careful treatment of such technical issues.

How can one use the generator L? We have defined the generator in terms of the semigroup; however, it is in fact possible to define the semigroup in terms of the generator, in analogy to the definition of a discrete Markov chain in terms of its transition probability matrix. To see this, note that

d/dt P_t f = lim_{δ↓0} (P_{t+δ} f − P_t f)/δ = lim_{δ↓0} P_t (P_δ f − f)/δ = P_t Lf.

Thus P_t can be recovered as the solution of the Kolmogorov equation

d/dt P_t f = P_t Lf,   P_0 f = f.
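For a finite state space (as in the example below, where P_t and L become matrices), the Kolmogorov equation is an ordinary matrix ODE that can be integrated directly. The following sketch, with a hypothetical 3-state generator, checks a simple Euler discretization of d/dt Q_t = Q_t L against the matrix exponential.

```python
import numpy as np

# hypothetical 3-state generator: off-diagonal entries are jump rates,
# rows sum to zero (cf. the finite state space example below)
L = np.array([[-1.0, 1.0, 0.0],
              [0.5, -1.5, 1.0],
              [0.0, 2.0, -2.0]])
f = np.array([1.0, 0.0, -1.0])
t, n = 1.0, 20000
dt = t / n

# Euler scheme for the Kolmogorov equation d/dt Q_t = Q_t L, Q_0 = I
Q = np.eye(3)
for _ in range(n):
    Q = Q @ (np.eye(3) + dt * L)

# reference solution: truncated power series for e^{tL}
E, term = np.eye(3), np.eye(3)
for k in range(1, 40):
    term = term @ (t * L) / k
    E = E + term

assert np.allclose(Q @ f, E @ f, atol=1e-3)   # P_t f from the ODE matches e^{tL} f
```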

This computation could also have been performed in a different order:

d/dt P_t f = lim_{δ↓0} (P_{t+δ} f − P_t f)/δ = lim_{δ↓0} (P_δ P_t f − P_t f)/δ = L P_t f.

Thus we have demonstrated a basic property: the generator and the semigroup commute, that is, LP_t = P_t L. [These statements are entirely clear when Dom(L) = L²(µ), and must be given a careful interpretation otherwise.]

Example 2.12 (Finite state space). Let (X_t)_{t∈R+} be a Markov process with values in a finite state space X_t ∈ {1, . . . , d}. Such processes are typically described in terms of their transition rates λ_{ij} ≥ 0 for i ≠ j:

P[X_{t+δ} = j | X_t = i] = λ_{ij} δ + o(δ) for i ≠ j.

Evidently, the transition rates λ_{ij} describe the infinitesimal rate of growth of the probability of jumping from state i to state j (informally, if X_t = i, then the probability that X_{t+dt} = j is λ_{ij} dt).
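As a quick numerical sanity check of this rate interpretation (a sketch only; the rate matrix below is a hypothetical example), the entries of e^{δΛ} match λ_{ij} δ to within o(δ) for small δ:

```python
import numpy as np

L = np.array([[-3.0, 1.0, 2.0],
              [0.5, -0.5, 0.0],
              [1.0, 4.0, -5.0]])   # hypothetical rate matrix, rows sum to zero

def expm(A, terms=40):
    """Truncated power series for the matrix exponential (fine for small A)."""
    out, term = np.eye(len(A)), np.eye(len(A))
    for k in range(1, terms):
        term = term @ A / k
        out = out + term
    return out

delta = 1e-4
P = expm(delta * L)                # transition probabilities over time delta
for i in range(3):
    for j in range(3):
        if i != j:
            # P[X_delta = j | X_0 = i] = lambda_ij * delta + o(delta)
            assert abs(P[i, j] - L[i, j] * delta) < 1e-6
```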

Let us organize the transition probabilities q_{t,ij} = P[X_t = j | X_0 = i] and rates λ_{ij} into matrices Q_t = (q_{t,ij})_{1≤i,j≤d} and Λ = (λ_{ij})_{1≤i,j≤d}, respectively, where we define the diagonal entries of Λ as λ_{ii} := −∑_{j≠i} λ_{ij} ≤ 0. Then

lim_{t↓0} (q_{t,ij} − q_{0,ij})/t = λ_{ij}

for every 1 ≤ i, j ≤ d (the diagonal entries λ_{ii} were chosen precisely to enforce the law of total probability ∑_j q_{t,ij} = 1). In particular, we have

Lf(i) = lim_{t↓0} ∑_{j=1}^d f(j) (q_{t,ij} − q_{0,ij})/t = ∑_{j=1}^d λ_{ij} f(j) = (Λf)_i,

where we identify the function f with the vector (f(1), . . . , f(d)) ∈ R^d. We therefore conclude that the generator of a Markov process in a finite state space corresponds precisely to the matrix of transition rates. The Kolmogorov equation now reduces to the matrix differential equation

d/dt Q_t = Q_t Λ,   Q_0 = I.

This differential equation is the basic tool for computing probabilities of finite state space Markov processes. The solution is in fact easily obtained as

Q_t = e^{tΛ},

from which we readily see why P_t and L must commute.
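These facts are easy to verify numerically; the sketch below (with a hypothetical 3-state rate matrix) checks that e^{tΛ} is a stochastic matrix, commutes with Λ, and satisfies the semigroup identity.

```python
import numpy as np

L = np.array([[-1.0, 1.0, 0.0],
              [0.5, -1.5, 1.0],
              [2.0, 0.0, -2.0]])   # hypothetical rate matrix, rows sum to zero

def expm(A, terms=40):
    """Truncated power series for the matrix exponential (fine for small A)."""
    out, term = np.eye(len(A)), np.eye(len(A))
    for k in range(1, terms):
        term = term @ A / k
        out = out + term
    return out

Qt = expm(1.3 * L)
assert np.allclose(Qt.sum(axis=1), 1.0)            # Q_t is a stochastic matrix
assert np.allclose(L @ Qt, Qt @ L)                 # generator and semigroup commute
assert np.allclose(expm(0.6*L) @ expm(0.7*L), Qt)  # semigroup: Q_s Q_t = Q_{s+t}
```

The commutation L @ Qt == Qt @ L is immediate from the power series for e^{tΛ}, since Λ commutes with each of its own powers.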

The above example provides some intuition for the notion of a generator.

Further examples of Markov semigroups will be given in the next section.

Remark 2.13. In analogy with the above example, we can formally express the relation between the semigroup and generator of a Markov process as P_t = e^{tL}. This expression is readily made precise in the case Dom(L) = L²(µ) by interpreting e^{tL} as a power series. While this does not work in the case Dom(L) ⊊ L²(µ), the intuition extends also to this setting; however, in this case the meaning of the exponential function must be carefully defined.

We conclude this section by introducing one more fundamental idea in the theory of Markov processes. Recall that we have defined the semigroup P_t as a family of linear operators on L²(µ). The latter is a Hilbert space, and we denote its inner product as ⟨f, g⟩_µ := µ(fg) (so that ‖f‖²_{L²(µ)} = ⟨f, f⟩_µ).

Definition 2.14 (Reversibility). The Markov semigroup P_t with stationary measure µ is called reversible if ⟨f, P_t g⟩_µ = ⟨P_t f, g⟩_µ for every f, g ∈ L²(µ).

Thus the Markov process is reversible if the operators P_t are self-adjoint on L²(µ). Equivalently, as P_t = e^{tL}, the Markov process is reversible if its generator L is self-adjoint. The reversibility property has a probabilistic interpretation: if the Markov process is reversible, then (assuming X_0 ∼ µ)

⟨P_t f, g⟩_µ = ⟨f, P_t g⟩_µ = E[f(X_0) E[g(X_t) | X_0]] = E[f(X_0) g(X_t)] = E[E[f(X_0) | X_t] g(X_t)]

for every f, g ∈ L²(µ), so that in particular

P_t f(x) := E[f(X_t) | X_0 = x] = E[f(X_0) | X_t = x].

This implies that when the Markov process (X_t)_{t∈[0,a]} is viewed backwards in time, the reversed process (X_{a−t})_{t∈[0,a]} has the same law: that is, the law of the Markov process is invariant under time reversal; hence the name reversibility.

We will see in the following section that reversible Markov processes are the most natural objects connected to Poincar´e inequalities (and to other functional inequalities that we will encounter in later chapters). However, the notion of time reversal will not play any role in our proofs. Rather, for reasons that will become evident in the sequel, the self-adjointness of the generator L will allow us to obtain a very complete characterization of exponential convergence of the Markov semigroup to the stationary measure.

Example 2.15 (Finite state space continued). In the setting of Example 2.12, it is evident that the Markov process is reversible if and only if

∑_{i,j=1}^d µ_i f_i Λ_{ij} g_j = ∑_{i,j=1}^d µ_j g_j Λ_{ji} f_i

for all f, g ∈ R^d, or equivalently

µ_i Λ_{ij} = µ_j Λ_{ji} for all i, j ∈ {1, . . . , d},

where µ denotes the stationary measure of the Markov process. The latter condition is often called “detailed balance” in the physics literature.
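A standard class of reversible chains is the birth-death chain, which only makes nearest-neighbor jumps. The sketch below (with hypothetical birth and death rates) verifies detailed balance and the resulting self-adjointness of the generator on a 3-state example.

```python
import numpy as np

# birth-death generator on {1, 2, 3}: only nearest-neighbor jumps
b1, b2 = 2.0, 1.0    # birth rates (hypothetical)
d2, d3 = 3.0, 4.0    # death rates (hypothetical)
L = np.array([[-b1, b1, 0.0],
              [d2, -(d2 + b2), b2],
              [0.0, d3, -d3]])

# stationary measure mu_i proportional to the product of birth/death ratios
mu = np.array([1.0, b1 / d2, b1 * b2 / (d2 * d3)])
mu /= mu.sum()

assert np.allclose(mu @ L, 0.0)                          # mu is stationary
for i in range(3):
    for j in range(3):
        assert np.isclose(mu[i]*L[i, j], mu[j]*L[j, i])  # detailed balance

f, g = np.array([1.0, 0.0, 2.0]), np.array([-1.0, 3.0, 0.5])
inner = lambda u, v: mu @ (u * v)                        # <u, v>_mu
assert np.isclose(inner(f, L @ g), inner(L @ f, g))      # L is self-adjoint
```

Note that detailed balance implies stationarity: summing µ_i Λ_{ij} = µ_j Λ_{ji} over i gives (µΛ)_j = µ_j ∑_i Λ_{ji} = 0, since the rows of Λ sum to zero.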

Problems

2.6 (Some elementary identities). Let P_t be a Markov semigroup with generator L and stationary measure µ. Prove the following elementary facts:

a. Show that µ(Lf) = 0 for every f ∈ Dom(L).

b. If φ : R → R is convex, then P_t φ(f) ≥ φ(P_t f) when f, φ(f) ∈ L²(µ).

c. If φ : R → R is convex, then Lφ(f) ≥ φ′(f) Lf when f, φ(f) ∈ Dom(L).

d. Let f ∈ Dom(L). Show that the following process is a martingale:

M_t^f := f(X_t) − ∫_0^t Lf(X_s) ds

