Non-symmetric Hypercontractivity and the p-Entropy

In this section we show how to deduce the rate of convergence to equilibrium for the family of $p$-entropies, with $1<p<2$, from that of $e_2$. The main ingredient is a non-symmetric hypercontractivity property of our Fokker–Planck equation: any solution to the equation with (initially only) a finite $p$-entropy will eventually be “pushed” into $L^2(\mathbb{R}^d, f_\infty^{-1})$, at which point we can use the information we gained on $e_2$.

Before we show this result, and see how it implies our main theorem, we explain why and how this non-symmetric hypercontractivity helps.

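Lemma 1.4.1. Let $f \in L^1_+(\mathbb{R}^d)$ with unit mass. Then

(i)
\[
e_p(f|f_\infty) = \frac{1}{p(p-1)}\left( \|f\|^p_{L^p(\mathbb{R}^d, f_\infty^{1-p})} - 1 \right).
\]

(ii) For any $1<p_1<p_2\le 2$ there exists a constant $C_{p_1,p_2}>0$ such that
\[
e_{p_1}(f|f_\infty) \le C_{p_1,p_2}\, e_{p_2}(f|f_\infty).
\]
In particular, for any $1<p<2$,
\[
e_p(f|f_\infty) \le C_p\, e_2(f|f_\infty)
\]
for a fixed geometric constant $C_p$.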

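Proof. (i) is trivial. To prove (ii) we consider the function
\[
g(y) := \begin{cases}
\dfrac{p_2(p_2-1)}{p_1(p_1-1)}\,\dfrac{y^{p_1}-p_1(y-1)-1}{y^{p_2}-p_2(y-1)-1}, & y\ge 0,\ y\ne 1,\\[2mm]
1, & y=1.
\end{cases}
\]
Clearly $g\ge 0$ on $\mathbb{R}^+$, and it is easy to check that it is continuous. Since $\lim_{y\to\infty} g(y)=0$, the function $g$ is bounded on $\mathbb{R}^+$, and we can conclude the result using (1.1.4).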
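As a numerical illustration of the boundedness argument, the following minimal sketch evaluates $g$ on a grid. It assumes the generating function $\psi_p(y) = (y^p - p(y-1) - 1)/(p(p-1))$, which is what the quotient defining $g$ suggests; the helper names `psi`, `g` and `sup_g`, the grid and the cutoff `y_max` are illustrative choices and not part of the original argument.

```python
def psi(p, y):
    """Assumed generating function psi_p(y) = (y^p - p(y-1) - 1) / (p(p-1))."""
    return (y ** p - p * (y - 1.0) - 1.0) / (p * (p - 1.0))


def g(p1, p2, y):
    """Quotient psi_{p1}(y) / psi_{p2}(y), set to its limit value 1 at y = 1."""
    if abs(y - 1.0) < 1e-6:
        return 1.0
    return psi(p1, y) / psi(p2, y)


def sup_g(p1, p2, y_max=50.0, steps=20000):
    """Grid estimate of the supremum of g over [0, y_max]."""
    return max(g(p1, p2, y_max * k / steps) for k in range(steps + 1))
```

For instance, `sup_g(1.5, 2.0)` gives a grid estimate of an admissible constant $C_{p_1,p_2}$ for $p_1 = 3/2$ and $p_2 = 2$. A grid maximum is only indicative and does not replace the proof.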

It is worth noting that the second point of part (ii) of Lemma 1.4.1 can be extended to a general generating function for an admissible relative entropy. The following is taken from [3]:

Lemma 1.4.2. Let $\psi$ be a generating function for an admissible relative entropy. Then one has that
\[
\psi(y) \le 2\psi''(1)\,\psi_2(y), \qquad y\ge 0.
\]
In particular, $e_p \le 2e_2$ for any $1<p<2$ whenever $e_2$ is finite.
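For the special case $\psi = \psi_p$ the inequality of Lemma 1.4.2 can be probed numerically. The sketch below assumes $\psi_p(y) = (y^p - p(y-1) - 1)/(p(p-1))$, so that $\psi_p''(1) = 1$ and $2\psi_2(y) = (y-1)^2$; the function names and the grid are hypothetical and serve only as an illustration.

```python
def psi_p(p, y):
    """Assumed generating function of the p-entropy."""
    return (y ** p - p * (y - 1.0) - 1.0) / (p * (p - 1.0))


def psi_2(y):
    """The case p = 2 of the formula above: (y - 1)^2 / 2."""
    return 0.5 * (y - 1.0) ** 2


def max_violation(p, y_max=100.0, steps=100000):
    """Largest value of psi_p(y) - 2 * psi_2(y) on a uniform grid of [0, y_max].

    A non-positive result (up to rounding) is consistent with Lemma 1.4.2.
    """
    worst = -float("inf")
    for k in range(steps + 1):
        y = y_max * k / steps
        worst = max(worst, psi_p(p, y) - 2.0 * psi_2(y))
    return worst
```

Calling `max_violation(p)` for a few values of $p$ in $(1,2)$ should return values that are zero up to rounding, attained at $y = 1$ where both sides vanish.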

Lemma 1.4.1 assures us that, if we start with initial data in $L^2(\mathbb{R}^d, f_\infty^{-1})$, then $e_p$ will be finite. Moreover, due to Theorem 1.1.4 for $p=2$, and the fact that the solution to (1.1.2) remains in $L^2(\mathbb{R}^d, f_\infty^{-1})$, we have that
\[
e_p(f(t)|f_\infty) \le 2e_2(f(t)|f_\infty) \le C\, e_2(f_0|f_\infty)\left(1+t^{2n}\right)e^{-2\mu t}.
\]

However, one can easily find initial data $f_0 \notin L^2(\mathbb{R}^d, f_\infty^{-1})$ with finite $p$-entropy. If one can show that the flow of the Fokker–Planck equation eventually forces the solution to enter $L^2(\mathbb{R}^d, f_\infty^{-1})$, we would be able to utilise the idea we just presented, at least from that time on.

This explicit non-symmetric hypercontractivity result is the main new theorem we present in this section.

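Theorem 1.4.3. Consider the Fokker–Planck equation (1.1.2) with diffusion and drift matrices $D$ and $C$ satisfying Conditions (A)–(C). Let $f_0\in L^1_+(\mathbb{R}^d)$ be a function with unit mass and assume there exists $\varepsilon>0$ such that
\[
\int_{\mathbb{R}^d} e^{\varepsilon|x|^2} f_0(x)\,dx < \infty. \tag{1.4.1}
\]

(i) Then, for any $q>1$, there exists an explicit $t_0>0$ that depends only on geometric constants of the problem such that the solution to (1.1.2) satisfies
\[
\int_{\mathbb{R}^d} f(t,x)^q f_\infty^{-1}(x)\,dx \le \left(\frac{q}{\pi(q+1)}\right)^{\frac{qd}{2}} \left(\frac{8\pi^2}{q-1}\right)^{\frac{d}{2}} \left(\int_{\mathbb{R}^d} e^{\varepsilon|x|^2} f_0(x)\,dx\right)^{q} \tag{1.4.2}
\]
for all $t\ge t_0$.

(ii) In particular, if $f_0$ satisfies $e_p(f_0|f_\infty)<\infty$ for some $1<p<2$ we have that
\[
e_2(f(t)|f_\infty) \le \frac{1}{2}\left( \left(\frac{8\sqrt{2}}{3\cdot 2^{1/p}}\right)^{d} \left(p(p-1)e_p(f_0|f_\infty)+1\right)^{\frac{2}{p}} - 1 \right) \tag{1.4.3}
\]
for $t\ge \tilde t_0(p)>0$, which can be given explicitly.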

Remark 1.4.4. As we consider $e_p$ in our hypercontractivity, which is, up to constants, the $p$-th power of the $L^p$ norm of $g := \frac{f}{f_\infty}$ with the measure $f_\infty(x)\,dx$, one can view our result as a hypercontractivity property of the Ornstein-Uhlenbeck operator $P$ (for an appropriate choice of the diffusion matrix $Q$ and drift matrix $B$) discussed in §1.3. With this notation, (1.4.3) is equivalent to
\[
\|g(t)\|_{L^2(f_\infty)} \le C_{p,d}\,\|g_0\|_{L^p(f_\infty)}, \qquad t\ge \tilde t_0(p), \tag{1.4.4}
\]
for $1<p<2$, where
\[
C_{p,d} := \left(\frac{8\sqrt{2}}{3\cdot 2^{1/p}}\right)^{\frac{d}{2}}.
\]
Since $e_2$ decreases along the flow of our equation, (1.4.4) is valid for $p=2$ with $C_{2,d}=1$. Thus, by using the Riesz-Thorin theorem one can improve inequality (1.4.4) to the same inequality with the constant $C_{p,d}^{\frac{2}{p}-1}$. We would like to point out that a simple limit process shows that (1.4.4) is also valid for $p=1$, but there is no connection between the $L^1$ norm of $g$ and the Boltzmann entropy, $e_1$, of $f_0$.

Remark 1.4.5. Since its original definition for the Ornstein-Uhlenbeck semigroup in the work of Nelson [16], the notion of hypercontractivity has been studied extensively for Markov diffusive operators (implying selfadjointness). A contemporary review of this topic can be found in [4]. For such selfadjoint generators, hypercontractivity is equivalent to the validity of a logarithmic Sobolev inequality, as proved by Gross [10]. For non-symmetric generators, however, this equivalence does not hold: while a log Sobolev inequality still implies hypercontractivity of related semigroups (cf. the proof of Theorem 5.2.3 in [4]), the reverse implication is not true in general (cf. Remark 5.1.1 in [22]). In particular, hypocoercive degenerate parabolic equations cannot give rise to a log Sobolev inequality, but they may exhibit hypercontractivity (as just stated above).

The last 20 years have seen the emergence of the more delicate study of hypercontractivity for non-symmetric and even degenerate semigroups. Notable works in the field are the paper of Fuhrman [9], and more recently the work of Wang et al. [6, 7, 21]. Most of these works consider an abstract Hilbert space as an underlying domain for the semigroup, and to our knowledge none of them give an explicit time after which one can observe the hypercontractivity phenomenon (Fuhrman gives a condition on the time in [9]).

Our hypercontractivity theorem, which we will prove shortly, gives not only an explicit and quantitative inequality, but also provides an estimate of the time one needs to wait before the hypercontractivity occurs. To keep the formulation of Theorem 1.4.3 simple we did not include this “waiting time” there, but we emphasise it in its proof. Moreover, the hypercontractivity estimate from Theorem 1.4.3 (i) only requires (1.4.1), a weighted $L^1$ norm of $f_0$. This is weaker than in usual hypercontractivity estimates, which use $L^p$ norms as on the r.h.s. of (1.4.4).

It is worth noting that we prove our theorem under the setting of the $e_p$ entropies, which can be thought of as $L^p$ spaces with a weight function that depends on $p$.

In order to be able to prove Theorem 1.4.3 we will need a few technical lemmas.

Lemma 1.4.6. Given $f_0\in L^1_+(\mathbb{R}^d)$ with unit mass, the solution to the Fokker–Planck equation (1.1.2) with diffusion and drift matrices $D$ and $C$ that satisfy Conditions (A)–(C) is given by
\[
f(t,x) = \frac{1}{(2\pi)^{\frac{d}{2}}\sqrt{\det W(t)}} \int_{\mathbb{R}^d} e^{-\frac{1}{2}\left(x-e^{-Ct}y\right)^T W(t)^{-1}\left(x-e^{-Ct}y\right)} f_0(y)\,dy, \tag{1.4.5}
\]
where
\[
W(t) := 2\int_0^t e^{-Cs} D e^{-C^T s}\,ds.
\]

This is a well known result, see for instance §1 in [12] or §6.5 in [19].

Lemma 1.4.7. Assume that the diffusion and drift matrices, $D$ and $C$, satisfy Conditions (A)–(C), and let $K$ be the unique positive definite matrix that satisfies
\[
2D = CK + KC^T.
\]
Then (in any matrix norm)
\[
\|W(t)-K\| \le c\left(1+t^{2n}\right)e^{-2\mu t}, \qquad t\ge 0,
\]
where $c>0$ is a geometric constant depending on $n$ and $\mu$, with $n$ being the maximal defect of the eigenvalues of $C$ with real part $\mu$, defined in (1.1.5).

Proof. We start the proof by noticing that $K$ is given by
\[
K = 2\int_0^\infty e^{-Cs} D e^{-C^T s}\,ds
\]
(see for instance [18]). As such
\[
\|W(t)-K\| \le 2\int_t^\infty \left\|e^{-Cs} D e^{-C^T s}\right\| ds \le 2\|D\| \int_t^\infty \left\|e^{-Cs}\right\| \left\|e^{-C^T s}\right\| ds.
\]
Using the fact that
\[
A e^{-Ct} A^{-1} = e^{-ACA^{-1}t}
\]
for any regular matrix $A$, we conclude that, if $J$ is the Jordan form of $C$, then
\[
\left\|e^{-Ct}\right\| \le \|A_J\|\,\|A_J^{-1}\|\,\left\|e^{-Jt}\right\|, \tag{1.4.6}
\]
where $A_J$ is the similarity matrix between $C$ and its Jordan form.

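For a single Jordan block $\tilde J$ of size $n+1$ (corresponding to a defect of $n$ in the eigenvalue $\lambda$), we find that
\[
e^{\tilde J t} = \begin{pmatrix}
e^{\lambda t} & t e^{\lambda t} & \cdots & \frac{t^n}{n!}e^{\lambda t}\\
 & e^{\lambda t} & \ddots & \frac{t^{n-1}}{(n-1)!}e^{\lambda t}\\
 & & \ddots & \vdots\\
0 & & & e^{\lambda t}
\end{pmatrix},
\qquad\text{where}\qquad
\tilde J = \begin{pmatrix}
\lambda & 1 & & 0\\
 & \ddots & \ddots & \\
 & & \ddots & 1\\
0 & & & \lambda
\end{pmatrix}.
\]
Thus, we conclude that
\[
\left\|e^{\tilde J t}x\right\|_1 \le \sum_{i=1}^{n+1}\sum_{j=i}^{n+1} \frac{t^{j-i}}{(j-i)!}\,e^{\mathrm{Re}(\lambda)t}\,|x_j| \le \left(\sum_{i=1}^{n+1}\left(1+t^n\right)e^{\mathrm{Re}(\lambda)t}\right)\|x\|_1 = (n+1)\left(1+t^n\right)e^{\mathrm{Re}(\lambda)t}\|x\|_1, \qquad t\ge 0.
\]
Due to the equivalence of norms on finite dimensional spaces, there exists a geometric constant $c_1>0$, that depends on $n$, such that
\[
\left\|e^{\tilde J t}\right\| \le c_1\left(1+t^n\right)e^{\mathrm{Re}(\lambda)t}. \tag{1.4.7}
\]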
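The bound on a single Jordan block can be checked numerically. The sketch below builds the matrix $e^{\tilde J t}$ entry by entry from the closed form above and compares its induced 1-norm (the largest absolute column sum) with $(n+1)(1+t^n)e^{\mathrm{Re}(\lambda)t}$. The function names are illustrative, and the choice of the induced 1-norm is an assumption made for the example.

```python
import cmath
import math


def exp_jordan_block(lam, n, t):
    """Entries of exp(J t) for the (n+1) x (n+1) Jordan block J with eigenvalue lam."""
    size = n + 1
    scale = cmath.exp(lam * t)
    return [
        [scale * t ** (j - i) / math.factorial(j - i) if j >= i else 0.0
         for j in range(size)]
        for i in range(size)
    ]


def induced_one_norm(matrix):
    """Operator norm induced by the vector 1-norm: the largest absolute column sum."""
    size = len(matrix)
    return max(sum(abs(matrix[i][j]) for i in range(size)) for j in range(size))


def one_norm_bound(lam, n, t):
    """Right-hand side (n+1)(1+t^n) e^{Re(lam) t} of the estimate in the text."""
    return (n + 1) * (1.0 + t ** n) * math.exp(complex(lam).real * t)
```

For example, with `lam = -1 + 2j` and `n = 2` the value `induced_one_norm(exp_jordan_block(lam, n, t))` should stay below `one_norm_bound(lam, n, t)` for every `t >= 0`.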

Coming back to $C$, we see that the above inequality together with (1.4.6) imply that $\|e^{-Ct}\|$ is controlled by the norm of $C$'s largest (measured by the defect number) Jordan block of the eigenvalue with smallest real part. From this, and (1.4.7), we conclude that
\[
\left\|e^{-Ct}\right\| \le c_2\left(1+t^n\right)e^{-\mu t}, \qquad t\ge 0. \tag{1.4.8}
\]
The same estimate for $\|e^{-C^T t}\|$ implies that
\[
\|W(t)-K\| \le c_3\int_t^\infty \left(1+s^{2n}\right)e^{-2\mu s}\,ds,
\]
for some geometric constant $c_3>0$ that depends on $n$. Since
\[
\int_t^\infty s^{2n}e^{-2\mu s}\,ds = \left[\frac{1}{2\mu}t^{2n} + \frac{2n}{(2\mu)^2}t^{2n-1} + \frac{2n(2n-1)}{(2\mu)^3}t^{2n-2} + \dots + \frac{(2n)!}{(2\mu)^{2n+1}}\right]e^{-2\mu t},
\]
we conclude the desired result.
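The closed form of the tail integral, obtained by repeated integration by parts, can be compared with a direct quadrature. The following sketch is only a sanity check: the composite Simpson rule, the truncation length and the function names are assumptions made for this illustration.

```python
import math


def tail_closed_form(n, mu, t):
    """Closed form of the integral of s^(2n) e^(-2 mu s) over [t, infinity)."""
    m, a = 2 * n, 2.0 * mu
    total = 0.0
    for k in range(m + 1):
        # Coefficient m (m-1) ... (m-k+1) multiplying t^(m-k) / a^(k+1).
        coeff = math.factorial(m) / math.factorial(m - k)
        total += coeff * t ** (m - k) / a ** (k + 1)
    return total * math.exp(-a * t)


def tail_numeric(n, mu, t, length=60.0, steps=200000):
    """Composite Simpson rule on [t, t + length]; steps must be even.

    The part of the integral beyond t + length is neglected, which is
    harmless as long as mu * length is large.
    """
    m, a = 2 * n, 2.0 * mu
    h = length / steps
    f = lambda s: s ** m * math.exp(-a * s)
    acc = f(t) + f(t + length)
    for i in range(1, steps):
        acc += (4.0 if i % 2 else 2.0) * f(t + i * h)
    return acc * h / 3.0
```

For $n=1$, $\mu = 1/2$ and $t=1$ the closed form reduces to $(t^2+2t+2)e^{-t} = 5e^{-1}$, which the quadrature should reproduce to high accuracy.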

While we could continue with a general matrix $K$, our computations simplify greatly if $K = I$. Since we are working under the assumption that $D = C_S$, the normalization from Theorem 1.2.5 implies exactly that. Thus, from this point onwards we will assume that $K = I$.

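Lemma 1.4.8. For any $\varepsilon>0$ there exists an explicit $t_1>0$ such that for all $t\ge t_1$
\[
\left\|W^{-1}(t)-I\right\| \le \varepsilon,
\]
where $W(t)$ is as in Lemma 1.4.7. An explicit, but not optimal, choice for $t_1$ is given by
\[
t_1(\varepsilon) := \frac{1}{2(\mu-\alpha)}\log\left(\frac{c(1+\varepsilon)\left(1+\left(\frac{n}{\alpha e}\right)^{2n}\right)}{\varepsilon}\right), \tag{1.4.9}
\]
where $0<\alpha<\mu$ is arbitrary and $c>0$ is given by Lemma 1.4.7.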

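Proof. We have that for any invertible matrix $A$
\[
\left\|A^{-1}-I\right\| = \left\|(A-I)A^{-1}\right\| \le \|A-I\|\,\left\|A^{-1}\right\|.
\]
In addition, if $\|A-I\|<1$, then
\[
\left\|A^{-1}\right\| = \left\|\left(I-(I-A)\right)^{-1}\right\| \le \frac{1}{1-\|A-I\|}.
\]
Thus, for any $t>0$ such that $\|W(t)-I\|<1$ we have that
\[
\left\|W^{-1}(t)-I\right\| \le \frac{\|W(t)-I\|}{1-\|W(t)-I\|}. \tag{1.4.10}
\]
Defining $\tilde t_1(\varepsilon)$ as
\[
\tilde t_1(\varepsilon) := \min\left\{ s\ge 0 \;\middle|\; \left(1+t^{2n}\right)e^{-2\mu t} \le \frac{\varepsilon}{c(1+\varepsilon)},\ \forall t\ge s \right\}, \tag{1.4.11}
\]
with the constant $c$ given by Lemma 1.4.7, we see from Lemma 1.4.7 that for any $t\ge \tilde t_1(\varepsilon)$
\[
\|W(t)-I\| \le \frac{\varepsilon}{1+\varepsilon}.
\]
Combining the above with (1.4.10) shows the first result for $t_1 = \tilde t_1(\varepsilon)$.

To prove the second claim we will show that $t_1(\varepsilon)\ge \tilde t_1(\varepsilon)$.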

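For this elementary proof we use the fact that
\[
\max_{t\ge 0} e^{-at}t^b = \left(\frac{b}{ae}\right)^b
\]
for any $a,b>0$. Thus, choosing $a=2\alpha$, where $0<\alpha<\mu$ is arbitrary, and $b=2n$ we have that
\[
\left(1+t^{2n}\right)e^{-2\mu t} \le \left(1+\left(\frac{n}{\alpha e}\right)^{2n}\right)e^{-2(\mu-\alpha)t}, \qquad t\ge 0.
\]
As a consequence, if
\[
\left(1+\left(\frac{n}{\alpha e}\right)^{2n}\right)e^{-2(\mu-\alpha)t} \le \frac{\varepsilon}{c(1+\varepsilon)}, \qquad \forall t\ge s, \tag{1.4.12}
\]
then $s\ge \tilde t_1(\varepsilon)$ due to (1.4.11). The smallest possible $s$ in (1.4.12) is obtained by solving the corresponding equality for $t$, and yields (1.4.9), concluding the proof.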
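To make the role of the waiting time concrete, the sketch below evaluates $t_1(\varepsilon)$ from (1.4.9) and checks that the threshold in (1.4.11) is indeed met from that time on. The numerical values used for $c$, $\mu$, $n$ and $\alpha$ in the usage note are hypothetical sample inputs, not constants from the text, and the function names are illustrative.

```python
import math


def t1(eps, c, mu, n, alpha):
    """Waiting time (1.4.9); requires 0 < alpha < mu."""
    factor = 1.0 + (n / (alpha * math.e)) ** (2 * n)
    return math.log(c * (1.0 + eps) * factor / eps) / (2.0 * (mu - alpha))


def decay(t, mu, n):
    """The quantity (1 + t^(2n)) e^(-2 mu t) controlled in (1.4.11)."""
    return (1.0 + t ** (2 * n)) * math.exp(-2.0 * mu * t)


def inverse_bound(delta):
    """Bound (1.4.10) on ||W^{-1} - I|| when ||W - I|| <= delta < 1."""
    return delta / (1.0 - delta)
```

With the sample values `eps = 0.1`, `c = 2.0`, `mu = 1.0`, `n = 1`, `alpha = 0.5`, one can check that `decay(t, mu, n) <= eps / (c * (1 + eps))` for `t >= t1(eps, c, mu, n, alpha)`, and that `inverse_bound(eps / (1 + eps))` returns `eps` up to rounding, which is the first claim of the lemma.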

We now have all the tools to prove Theorem 1.4.3.

Proof of Theorem 1.4.3. To show (i) we recall Minkowski’s integral inequality, which will play an important role in estimating the $L^p$ norms of $f(t)$.

Minkowski’s Integral Inequality: For any non-negative measurable function $F$ on $(X_1\times X_2, \mu_1\times\mu_2)$, and any $q\ge 1$ one has that
\[
\left(\int_{X_2}\left|\int_{X_1} F(x_1,x_2)\,d\mu_1(x_1)\right|^q d\mu_2(x_2)\right)^{\frac{1}{q}} \le \int_{X_1}\left(\int_{X_2} |F(x_1,x_2)|^q\,d\mu_2(x_2)\right)^{\frac{1}{q}} d\mu_1(x_1). \tag{1.4.13}
\]
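A discrete analogue of (1.4.13), with both measures replaced by finitely many weighted atoms, can be tested directly. In the sketch below the weights `w1`, `w2` and the sample matrix `F` are randomly generated illustrative data, and all names are hypothetical.

```python
import random


def minkowski_sides(F, w1, w2, q):
    """Both sides of (1.4.13) for discrete measures.

    F[i][j] >= 0 samples F(x1_i, x2_j); w1 and w2 are the positive weights
    of the atoms of mu_1 and mu_2.
    """
    n1, n2 = len(w1), len(w2)
    lhs = sum(
        w2[j] * sum(w1[i] * F[i][j] for i in range(n1)) ** q
        for j in range(n2)
    ) ** (1.0 / q)
    rhs = sum(
        w1[i] * sum(w2[j] * F[i][j] ** q for j in range(n2)) ** (1.0 / q)
        for i in range(n1)
    )
    return lhs, rhs


def random_instance(n1, n2, seed):
    """Reproducible random non-negative data for a quick experiment."""
    rng = random.Random(seed)
    F = [[rng.random() for _ in range(n2)] for _ in range(n1)]
    w1 = [rng.random() + 0.1 for _ in range(n1)]
    w2 = [rng.random() + 0.1 for _ in range(n2)]
    return F, w1, w2
```

In this discrete form the inequality is the triangle inequality in a weighted $\ell^q$ space, and for $q=1$ both sides coincide.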

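Next, we fix an $\varepsilon_1=\varepsilon_1(\varepsilon,q)\in(0,1)$, to be chosen later. From Lemmas 1.4.7 and 1.4.8 we see that, for $t\ge t_1(\varepsilon_1)$ with
\[
t_1(\varepsilon_1) := \frac{1}{2(\mu-\alpha)}\log\left(\frac{c(1+\varepsilon_1)\left(1+\left(\frac{n}{\alpha e}\right)^{2n}\right)}{\varepsilon_1}\right)
\]
for some fixed $0<\alpha<\mu$, we have that
\[
\|W(t)-I\| \le \frac{\varepsilon_1}{1+\varepsilon_1} < \varepsilon_1, \qquad \left\|W^{-1}(t)-I\right\| \le \varepsilon_1,
\]
and hence
\[
W(t) > (1-\varepsilon_1)I, \qquad W(t)^{-1} \ge (1-\varepsilon_1)I.
\]
As such, for $t\ge t_1(\varepsilon_1)$
\[
\left| e^{-\frac{1}{2}\left(x-e^{-Ct}y\right)^T W(t)^{-1}\left(x-e^{-Ct}y\right)} f_0(y) \right|^q \le e^{-\frac{q}{2}(1-\varepsilon_1)\left|x-e^{-Ct}y\right|^2} |f_0(y)|^q \tag{1.4.14}
\]
and
\[
\det W(t) \ge (1-\varepsilon_1)^d. \tag{1.4.15}
\]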

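We conclude, using (1.4.13), the exact solution formula (1.4.5), (1.4.14) and (1.4.15), that for $t\ge t_1(\varepsilon_1)$ it holds:
\[
\begin{aligned}
\int_{\mathbb{R}^d} |f(t,x)|^q f_\infty^{-1}(x)\,dx
&\le \frac{(2\pi)^{\frac{d}{2}}}{\left(2\pi(1-\varepsilon_1)\right)^{\frac{qd}{2}}} \left(\int_{\mathbb{R}^d}\left(\int_{\mathbb{R}^d} e^{-\frac{q}{2}(1-\varepsilon_1)\left|x-e^{-Ct}y\right|^2} |f_0(y)|^q\, e^{\frac{|x|^2}{2}}\,dx\right)^{\frac{1}{q}} dy\right)^{q}\\
&= \frac{(2\pi)^{\frac{d}{2}}}{\left(2\pi(1-\varepsilon_1)\right)^{\frac{qd}{2}}} \left(\int_{\mathbb{R}^d}\left(\int_{\mathbb{R}^d} e^{-\frac{q}{2}(1-\varepsilon_1)\left|x-e^{-Ct}y\right|^2} e^{\frac{|x|^2}{2}}\,dx\right)^{\frac{1}{q}} |f_0(y)|\,dy\right)^{q}.
\end{aligned} \tag{1.4.16}
\]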

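We proceed by choosing $\varepsilon_1>0$ such that $q(1-\varepsilon_1)>1$ (or equivalently $\varepsilon_1<\frac{q-1}{q}$) and denoting
\[
\eta := q(1-\varepsilon_1)-1 > 0.
\]
Shifting the $x$ variable by $\frac{1}{2}e^{-Ct}y$ and completing the square, we find that
\[
\begin{aligned}
\int_{\mathbb{R}^d} e^{-\frac{q}{2}(1-\varepsilon_1)\left|x-e^{-Ct}y\right|^2} e^{\frac{|x|^2}{2}}\,dx
&= \int_{\mathbb{R}^d} e^{-\frac{\eta+1}{2}\left|x-\frac{1}{2}e^{-Ct}y\right|^2} e^{\frac{1}{2}\left|x+\frac{1}{2}e^{-Ct}y\right|^2}\,dx\\
&= \int_{\mathbb{R}^d} e^{x\cdot e^{-Ct}y}\, e^{-\frac{\eta}{2}\left|x-\frac{1}{2}e^{-Ct}y\right|^2}\,dx\\
&= \int_{\mathbb{R}^d} e^{-\frac{\eta}{2}\left|x-\frac{1}{2}\left(1+\frac{2}{\eta}\right)e^{-Ct}y\right|^2} e^{\left(\frac{1}{2}+\frac{1}{2\eta}\right)\left|e^{-Ct}y\right|^2}\,dx\\
&= \left(\frac{2\pi}{\eta}\right)^{\frac{d}{2}} e^{\left(\frac{1}{2}+\frac{1}{2\eta}\right)\left|e^{-Ct}y\right|^2}.
\end{aligned} \tag{1.4.17}
\]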
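The Gaussian integral (1.4.17) can be verified numerically in dimension $d=1$. In the sketch below the scalar `z` stands in for $e^{-Ct}y$, and $q(1-\varepsilon_1)$ is written as $\eta+1$; the Simpson quadrature, the truncation window and the function names are assumptions made for this illustration.

```python
import math


def simpson(f, a, b, steps):
    """Composite Simpson rule on [a, b] with an even number of steps."""
    h = (b - a) / steps
    acc = f(a) + f(b)
    for i in range(1, steps):
        acc += (4.0 if i % 2 else 2.0) * f(a + i * h)
    return acc * h / 3.0


def lhs_numeric(eta, z, half_width=12.0, steps=20000):
    """Integral of exp(-(eta+1)/2 (x-z)^2 + x^2/2) over the real line (d = 1)."""
    # After completing the square the integrand is a Gaussian centred here
    # with variance 1/eta, so a window of 12 standard deviations suffices.
    center = (eta + 1.0) * z / eta
    radius = half_width / math.sqrt(eta)
    integrand = lambda x: math.exp(-0.5 * (eta + 1.0) * (x - z) ** 2 + 0.5 * x * x)
    return simpson(integrand, center - radius, center + radius, steps)


def rhs_closed_form(eta, z):
    """Right-hand side of (1.4.17) for d = 1."""
    return math.sqrt(2.0 * math.pi / eta) * math.exp((0.5 + 0.5 / eta) * z * z)
```

For $\eta = 1$ and $z = 1$ the closed form equals $\sqrt{2\pi}\,e$, and the quadrature should agree with it to many digits.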

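Using (1.4.8) we can find a uniform geometric constant $c_2$ such that
\[
\left\|e^{-Ct}\right\|^2 \le c_2^2\left(1+t^n\right)^2 e^{-2\mu t} \le 2c_2^2\left(1+t^{2n}\right)e^{-2\mu t}.
\]
Following the proof of Lemma 1.4.8 we recall that, for any $\tilde c,\varepsilon_2>0$ and arbitrary $0<\alpha<\mu$, if
\[
t \ge \frac{1}{2(\mu-\alpha)}\log\left(\frac{\tilde c(1+\varepsilon_2)\left(1+\left(\frac{n}{\alpha e}\right)^{2n}\right)}{\varepsilon_2}\right),
\]
then
\[
\left(1+t^{2n}\right)e^{-2\mu t} \le \frac{\varepsilon_2}{\tilde c(1+\varepsilon_2)}.
\]
Thus, choosing
\[
\tilde c = \frac{c_2^2(1+\eta)}{q\eta} = \frac{c_2^2(1-\varepsilon_1)}{q(1-\varepsilon_1)-1} \qquad\text{and}\qquad \varepsilon_2 = \frac{\varepsilon_1}{1-\varepsilon_1},
\]
we get that if
\[
t \ge t_2(\varepsilon_1) := \frac{1}{2(\mu-\alpha)}\log\left(\frac{c_2^2(1-\varepsilon_1)\left(1+\left(\frac{n}{\alpha e}\right)^{2n}\right)}{\left(q(1-\varepsilon_1)-1\right)\varepsilon_1}\right),
\]
where $0<\alpha<\mu$ is arbitrary, then
\[
\left(\frac{1}{2}+\frac{1}{2\eta}\right)\left\|e^{-Ct}\right\|^2 \le \frac{c_2^2(1+\eta)}{q\eta}\,q\left(1+t^{2n}\right)e^{-2\mu t} \le q\varepsilon_1.
\]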

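Combining this with our previous computations ((1.4.16) and (1.4.17)), we find that for any $t\ge t_0(\varepsilon_1) := \max\left(t_1(\varepsilon_1), t_2(\varepsilon_1)\right)$
\[
\int_{\mathbb{R}^d} |f(t,x)|^q f_\infty^{-1}(x)\,dx \le \frac{(2\pi)^{d\left(1-\frac{q}{2}\right)}}{(1-\varepsilon_1)^{\frac{qd}{2}}\,\eta^{\frac{d}{2}}} \left(\int_{\mathbb{R}^d} e^{\varepsilon_1|y|^2} f_0(y)\,dy\right)^{q}.
\]
If $\varepsilon_1$ is chosen more restrictively than before, namely $\varepsilon_1\le\frac{q-1}{2q}$, then we have
\[
\frac{q-1}{2} \le \eta < q-1 \qquad\text{and}\qquad 1-\varepsilon_1 \ge \frac{q+1}{2q},
\]
which implies the first statement of the theorem by choosing $\varepsilon_1 := \min\left(\varepsilon, \frac{q-1}{2q}\right)$.

For the proof of (ii) we note that (1.4.3) is equivalent to
\[
\|f(t)\|^2_{L^2(\mathbb{R}^d, f_\infty^{-1})} \le \left(\frac{8\sqrt{2}}{3\cdot 2^{1/p}}\right)^{d} \|f_0\|^2_{L^p(\mathbb{R}^d, f_\infty^{1-p})}. \tag{1.4.18}
\]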

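With the Hölder inequality we obtain
\[
\int_{\mathbb{R}^d} e^{\frac{p-1}{4p}|x|^2} f_0(x)\,dx \le \left(\int_{\mathbb{R}^d} e^{-\frac{|x|^2}{4}}\,dx\right)^{\frac{p-1}{p}} \left(\int_{\mathbb{R}^d} e^{\frac{p-1}{2}|x|^2} f_0^p(x)\,dx\right)^{\frac{1}{p}} = 2^{\frac{d}{2}\frac{p-1}{p}}\, \|f_0\|_{L^p(\mathbb{R}^d, f_\infty^{1-p})}.
\]
Hence, $e_p(f_0|f_\infty)<\infty$ implies (1.4.1) with $\varepsilon=\frac{p-1}{4p}$, and (1.4.18) follows from (1.4.2) with $q=2$ and $\tilde t_0(p) = t_0\left(\frac{p-1}{4p}\right)$.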
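The bookkeeping of constants in this proof can be double-checked with a few lines of arithmetic. The sketch below compares the constant obtained in the proof for $\varepsilon_1 = \frac{q-1}{2q}$ with the one stated in (1.4.2), and then combines the case $q=2$ with the square of the Hölder factor $2^{\frac{d}{2}\frac{p-1}{p}}$ to recover the constant in (1.4.18). The function names are illustrative.

```python
import math


def constant_from_proof(q, d, eps1):
    """Prefactor (2 pi)^(d(1-q/2)) / ((1-eps1)^(qd/2) eta^(d/2)) with eta = q(1-eps1) - 1."""
    eta = q * (1.0 - eps1) - 1.0
    numerator = (2.0 * math.pi) ** (d * (1.0 - q / 2.0))
    return numerator / ((1.0 - eps1) ** (q * d / 2.0) * eta ** (d / 2.0))


def constant_in_theorem(q, d):
    """Prefactor of (1.4.2)."""
    first = (q / (math.pi * (q + 1.0))) ** (q * d / 2.0)
    second = (8.0 * math.pi ** 2 / (q - 1.0)) ** (d / 2.0)
    return first * second


def constant_in_1_4_18(p, d):
    """Prefactor of (1.4.18)."""
    return (8.0 * math.sqrt(2.0) / (3.0 * 2.0 ** (1.0 / p))) ** d


def holder_factor_squared(p, d):
    """Square of the factor 2^((d/2)(p-1)/p) from the Hölder step."""
    return 2.0 ** (d * (p - 1.0) / p)
```

One expects `constant_from_proof(q, d, (q - 1) / (2 * q))` to equal `constant_in_theorem(q, d)`, and `constant_in_theorem(2, d) * holder_factor_squared(p, d)` to equal `constant_in_1_4_18(p, d)`, both up to rounding.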

Remark 1.4.9. If the condition (1.4.1) holds for $\varepsilon=\frac{1}{2}$ we can give an explicit upper bound for the “waiting time” in the hypercontractivity estimate (1.4.2). For such $\varepsilon$ we have $\varepsilon_1 := \min\left(\varepsilon, \frac{q-1}{2q}\right) = \frac{q-1}{2q}$, and by choosing $\alpha=\frac{\mu}{2}$ we can see that $t_0(\varepsilon_1)$ from the proof of Theorem 1.4.3 is
\[
t_0(q) := \frac{1}{\mu}\log\left(\frac{\max\left(c(3q-1),\ 2c_2^2\,\frac{q+1}{q-1}\right)\left(1+\left(\frac{2n}{\mu e}\right)^{2n}\right)}{q-1}\right),
\]
where $c, c_2$ are geometric constants found in the proof of Lemma 1.4.7.
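The formula for $t_0(q)$ can be cross-checked against the definitions of $t_1(\varepsilon_1)$ and $t_2(\varepsilon_1)$ from the proof of Theorem 1.4.3. In the sketch below the values passed for $c$, $c_2$, $\mu$ and $n$ are hypothetical sample inputs (the text does not give numerical values), and the function names are illustrative.

```python
import math


def growth_factor(n, alpha):
    """The factor 1 + (n / (alpha e))^(2n) appearing in t_1 and t_2."""
    return 1.0 + (n / (alpha * math.e)) ** (2 * n)


def t1(eps1, c, mu, n, alpha):
    """Waiting time t_1(eps1) from the proof of Theorem 1.4.3."""
    arg = c * (1.0 + eps1) * growth_factor(n, alpha) / eps1
    return math.log(arg) / (2.0 * (mu - alpha))


def t2(eps1, q, c2, mu, n, alpha):
    """Waiting time t_2(eps1) from the proof of Theorem 1.4.3."""
    eta = q * (1.0 - eps1) - 1.0
    arg = c2 ** 2 * (1.0 - eps1) * growth_factor(n, alpha) / (eta * eps1)
    return math.log(arg) / (2.0 * (mu - alpha))


def t0_remark(q, c, c2, mu, n):
    """Closed form of Remark 1.4.9 (eps = 1/2, alpha = mu / 2)."""
    leading = max(c * (3.0 * q - 1.0), 2.0 * c2 ** 2 * (q + 1.0) / (q - 1.0))
    return math.log(leading * growth_factor(n, mu / 2.0) / (q - 1.0)) / mu
```

With `eps1 = (q - 1) / (2 * q)` and `alpha = mu / 2`, the value `max(t1(...), t2(...))` should coincide with `t0_remark(q, c, c2, mu, n)` up to rounding.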

With the non-symmetric hypercontractivity result at hand, we can finally complete the proof of our main theorem for 1<p<2.

Proof of Theorem 1.1.4 for $1<p<2$. Using Theorem 1.4.3 (ii) we find an explicit $T_0(p)$ such that for any $t\ge T_0(p)$ the solution to the Fokker–Planck equation, $f(t)$, is in $L^2(\mathbb{R}^d, f_\infty^{-1})$. Proceeding similarly to the previous remark (but now with $q=2$ and $\varepsilon=\frac{p-1}{4p}$) we have $\varepsilon_1 := \min\left(\frac{p-1}{4p}, \frac{1}{4}\right) = \frac{p-1}{4p}$. This yields the following upper bound for the “waiting time” in the hypercontractivity estimate (1.4.3):
\[
T_0(p) := \frac{1}{\mu}\log\left(\frac{\max\left(c(5p-1),\ 2c_2^2\,\frac{3p^2+p}{p+1}\right)\left(1+\left(\frac{2n}{\mu e}\right)^{2n}\right)}{p-1}\right).
\]

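Using Lemma 1.4.2, Theorem 1.1.4 for $p=2$ (which was already proven in §1.3), and inequality (1.4.3) we conclude that for any $t\ge T_0(p)$
\[
\begin{aligned}
e_p(f(t)|f_\infty) \le 2e_2(f(t)|f_\infty) &\le 2\tilde c_2\, e_2\left(f(T_0(p))|f_\infty\right)\left(1+\left(t-T_0(p)\right)^{2n}\right)e^{-2\mu(t-T_0(p))}\\
&\le 2\tilde c_p\, e^{2\mu T_0(p)}\left(p(p-1)e_p(f_0|f_\infty)+1\right)^{\frac{2}{p}}\left(1+t^{2n}\right)e^{-2\mu t}.
\end{aligned} \tag{1.4.19}
\]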

To complete the proof we recall that any admissible relative entropy decreases along the flow of the Fokker–Planck equation (see [2] for instance). Thus, for any $t\le T_0(p)$ we have that
\[
e_p(f(t)|f_\infty) \le e_p(f_0|f_\infty) \le e_p(f_0|f_\infty)\,e^{2\mu T_0(p)}\left(1+t^{2n}\right)e^{-2\mu t}. \tag{1.4.20}
\]
The theorem now follows from (1.4.19) and (1.4.20), together with the fact that for any $1<p<2$
\[
e_p(f_0|f_\infty) \le C_p\left(p(p-1)e_p(f_0|f_\infty)+1\right)^{\frac{2}{p}},
\]
where
\[
C_p := \sup_{x\ge 0}\frac{x}{\left(p(p-1)x+1\right)^{\frac{2}{p}}} < \infty.
\]
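The finiteness of $C_p$ can be made explicit. Since $\frac{2}{p}>1$, the quotient tends to zero as $x\to\infty$, and a short calculus computation (not taken from the text, and only sketched here) locates the maximum at $x^* = \frac{1}{(p-1)(2-p)}$. The code below compares this value with a brute-force grid search; the function names and grid parameters are illustrative.

```python
def ratio(p, x):
    """The quotient x / (p(p-1)x + 1)^(2/p) whose supremum defines C_p."""
    return x / (p * (p - 1.0) * x + 1.0) ** (2.0 / p)


def c_p_closed_form(p):
    """Value of the quotient at the critical point x* = 1 / ((p-1)(2-p)).

    Setting the derivative of the quotient to zero gives
    p(p-1) x (2/p - 1) = 1, i.e. x* = 1 / ((p-1)(2-p)).
    """
    x_star = 1.0 / ((p - 1.0) * (2.0 - p))
    return ratio(p, x_star)


def c_p_grid(p, x_max=1000.0, steps=200000):
    """Brute-force maximum of the quotient on a uniform grid of [0, x_max]."""
    return max(ratio(p, x_max * k / steps) for k in range(steps + 1))
```

For $p = 3/2$ this gives $x^* = 4$ and $C_p = 4^{-1/3}$, and the grid search should agree with the closed form up to the grid resolution.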

We end this section with a slight generalization of our main theorem:

Theorem 1.4.10. Let $\psi$ be a generating function for an admissible relative entropy. Assume in addition that there exists $C_\psi>0$ such that
\[
\psi_p(y) \le C_\psi\,\psi(y) \tag{1.4.21}
\]
for some $1<p<2$ and all $y\in\mathbb{R}^+$. Then, under the same setting of Theorem 1.1.4 (but now with the assumption $e_\psi(f_0|f_\infty)<\infty$) we have that
\[
e_\psi(f(t)|f_\infty) \le c_{p,\psi}\left(e_\psi(f_0|f_\infty)+1\right)^{\frac{2}{p}}\left(1+t^{2n}\right)e^{-2\mu t}, \qquad t\ge 0,
\]
where $c_{p,\psi}>0$ is a fixed geometric constant.

Proof. The proof is almost identical to the proof of Theorem 1.1.4. Due to (1.4.21) we know that $e_p(f_0|f_\infty)<\infty$. As such, according to Theorem 1.4.3 (ii) there exists an explicit $T_0(p)$ such that for all $t\ge T_0(p)$ we have that $f(t)\in L^2(\mathbb{R}^d, f_\infty^{-1})$ and
\[
e_2(f(t)|f_\infty) \le \frac{1}{2}\left( \left(\frac{8\sqrt{2}}{3\cdot 2^{1/p}}\right)^{d} \left(C_\psi\, p(p-1)e_\psi(f_0|f_\infty)+1\right)^{\frac{2}{p}} - 1 \right).
\]
The above, together with Lemma 1.4.2, gives the appropriate decay estimate on $e_\psi$ for $t\ge T_0(p)$. Since $e_\psi$ decreases along the flow of our equation, we can deal with the interval $t\le T_0(p)$ like in the previous proof, yielding the desired result.

In the next, and last, section of this chapter we will mention another natural quantity in the theory of Fokker–Planck equations: the Fisher information. We will briefly explain how the method we presented here differs from the usual technique one considers when dealing with the entropy. Moreover, we describe how to infer from our main theorem an improved rate of convergence to equilibrium in relative Fisher information.
