(1)

Entropy

In this lecture we start discussing the notion of entropy. In dynamical systems, there is topological entropy (which is independent of invariant measures), and in ergodic theory there is measure-theoretic entropy, also misnamed as metric entropy. It depends on the choice of invariant measure $\mu$.

Topological and measure-theoretic entropy are related by the Variational Principle, which says that

$$h_{top}(T) = \sup\{h_\mu(T) : \mu \text{ is a } T\text{-invariant probability measure}\}.$$

If $\mu$ is such that $h_\mu(T) = h_{top}(T)$, then $\mu$ is called a measure of maximal entropy.

We give some history first.

(2)

Entropy

• Entropy was first used by the 19th century physicists Rudolf Clausius and Ludwig Boltzmann as part of thermodynamics.

• Andrej Kolmogorov introduced a mathematical version in probability in the 1950s, and used it as an isomorphism invariant. This is the way we still define it in ergodic theory.

• Around the same time, Claude Shannon used it in information theory.

• Major work in the 1960s by Yakov Sinai on measure-theoretic entropy and generators.

• Topological entropy was introduced in 1969 by Roy Adler, Alan Konheim and Harry McAndrew.

• In the early 1970s, Rufus Bowen and Efim Dinaburg independently introduced a more user-friendly version of topological entropy. Around this time, the Variational Principle was proved.

• In 1974 Don Ornstein published his theorem that entropy is a complete invariant for two-sided Bernoulli shifts.

(3)

Entropy

The current mathematical definition has very little to do anymore with the original definition from thermodynamics; it still expresses the amount of disorder in the system, though.

• For the circle rotation $(S^1, \mathcal{B}, \mu, R_\alpha)$ with Lebesgue measure, $h_{top}(R_\alpha) = h_\mu(R_\alpha) = 0$.

• For the doubling map $(S^1, \mathcal{B}, \mu, T)$ with Lebesgue measure, $h_{top}(T) = h_\mu(T) = \log 2$.

• For the Bernoulli shift $(\{1, \dots, N\}^{\mathbb{N} \text{ or } \mathbb{Z}}, \mathcal{B}, \mu_p, \sigma)$ with probability vector $p = (p_1, \dots, p_N)$,

$$h_{top}(\sigma) = \log N \ \ge\ h_{\mu_p}(\sigma) = -\sum_{i=1}^N p_i \log p_i.$$
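As a quick numeric illustration of this last formula, here is a minimal Python sketch (the probability vectors are arbitrary examples, not from the lecture):

```python
import math

def shannon_entropy(p):
    """Compute -sum p_i log p_i of a probability vector (natural logarithm)."""
    return -sum(x * math.log(x) for x in p if x > 0)

N = 4
p_uniform = [1 / N] * N
p_skewed = [0.7, 0.1, 0.1, 0.1]

print(shannon_entropy(p_uniform), math.log(N))  # both log 4 ~ 1.386: maximal entropy
print(shannon_entropy(p_skewed))                # ~ 0.94, strictly below log 4
```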

(4)

Jensen's Inequality

A function $f: \mathbb{R} \to \mathbb{R}$ is concave if

$$f(\alpha x + (1-\alpha) y) \ge \alpha f(x) + (1-\alpha) f(y) \qquad (1)$$

for all $x, y \in \mathbb{R}$ and $\alpha \in [0,1]$.

If $f: \mathbb{R} \to \mathbb{R}$ is $C^2$ and $f'' \le 0$, then $f$ is concave.

Theorem: For every strictly concave function $f: [0,\infty) \to \mathbb{R}$, and all $\alpha_i > 0$ with $\sum_{i=1}^n \alpha_i = 1$ and $x_i \in [0,\infty)$, we have

$$\sum_{i=1}^n \alpha_i f(x_i) \le f\Big(\sum_{i=1}^n \alpha_i x_i\Big), \qquad (2)$$

with equality if and only if all the $x_i$ are the same.
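Before the proof, a small numeric sanity check of (2) in Python, using the strictly concave function $f(x) = \sqrt{x}$ and randomly drawn weights (a sketch; all choices here are illustrative):

```python
import math
import random

def jensen_sides(f, alphas, xs):
    """Return (sum alpha_i f(x_i), f(sum alpha_i x_i)); Jensen says the first <= the second."""
    lhs = sum(a * f(x) for a, x in zip(alphas, xs))
    rhs = f(sum(a * x for a, x in zip(alphas, xs)))
    return lhs, rhs

random.seed(1)
raw = [random.random() for _ in range(5)]
alphas = [r / sum(raw) for r in raw]            # positive weights summing to 1
xs = [random.uniform(0, 10) for _ in range(5)]

lhs, rhs = jensen_sides(math.sqrt, alphas, xs)
print(lhs <= rhs)                               # True: strict inequality for distinct x_i
print(jensen_sides(math.sqrt, alphas, [3.0] * 5))  # equal x_i: both sides coincide
```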

(5)

Jensen's Inequality

Proof of Jensen's Inequality: We prove this by induction on $n$. For $n = 2$ it is simply (1). So assume that (2) holds for some $n$. For $n+1$, take $\alpha_i > 0$ with $\sum_{i=1}^{n+1} \alpha_i = 1$ and write $B = \sum_{i=1}^n \alpha_i$. Then

$$\begin{aligned}
f\Big(\sum_{i=1}^{n+1} \alpha_i x_i\Big) &= f\Big(B \sum_{i=1}^n \frac{\alpha_i}{B} x_i + \alpha_{n+1} x_{n+1}\Big) \\
&\ge B\, f\Big(\sum_{i=1}^n \frac{\alpha_i}{B} x_i\Big) + \alpha_{n+1} f(x_{n+1}) && \text{by (1)} \\
&\ge B \sum_{i=1}^n \frac{\alpha_i}{B} f(x_i) + \alpha_{n+1} f(x_{n+1}) && \text{by (2) for } n \\
&= \sum_{i=1}^{n+1} \alpha_i f(x_i).
\end{aligned}$$

Equality also carries over by induction. If the $x_i$ are all equal for $i \le n$, then (1) for $n+1$ is an equality only if $x_{n+1} = \sum_{i=1}^n \frac{\alpha_i}{B} x_i = x_1$.

(6)

Jensen's Inequality

For measure-theoretic entropy, the function $\varphi: [0,1] \to \mathbb{R}$ defined as

$$\varphi(x) = -x \log x, \qquad \varphi(0) = \varphi(1) = 0,$$

is important. Compute

$$\varphi'(x) = -1 - \log x, \qquad \varphi''(x) = -\frac{1}{x} < 0,$$

so $\varphi$ is concave.

By Jensen's Inequality (with all $\alpha_i = \frac{1}{N}$),

$$-\sum_{i=1}^N p_i \log p_i = N \sum_{i=1}^N \frac{1}{N} \varphi(p_i) \le N \varphi\Big(\frac{1}{N} \sum_{i=1}^N p_i\Big) = N \varphi\Big(\frac{1}{N}\Big) = \log N,$$

with equality if and only if all $p_i$ are the same.

(7)

Entropy

Let $(X, \mathcal{B}, \mu)$ be a probability measure space. We call a collection $\mathcal{P} = \{P_i\}$ of measurable sets a (measurable) partition if all $P_i$'s are pairwise disjoint and $X = \cup_i P_i$. We say that a partition $\mathcal{P}$ is finer than $\mathcal{Q}$ (written as $\mathcal{P} \succeq \mathcal{Q}$) if every $P \in \mathcal{P}$ is contained in some $Q \in \mathcal{Q}$. For example, the finest possible partition is the one with $P_i = \{i\}$, $i \in X$; this is the point partition. It is (uncountably) infinite if $X$ is. The coarsest partition $\mathcal{P} = \{X\}$ is called the trivial partition.

Given two partitions $\mathcal{P}$ and $\mathcal{Q}$, the joint is

$$\mathcal{P} \vee \mathcal{Q} := \{P \cap Q : P \in \mathcal{P},\ Q \in \mathcal{Q}\};$$

this joint is finer than both $\mathcal{P}$ and $\mathcal{Q}$.
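To make these definitions concrete, here is a small Python sketch for partitions of a finite set; the set-based representation is my own illustration, not from the notes:

```python
def joint(P, Q):
    """The joint P v Q of two partitions of the same finite set: all nonempty intersections."""
    return [p & q for p in P for q in Q if p & q]

X = set(range(6))
P = [{0, 1, 2}, {3, 4, 5}]
Q = [{0, 3}, {1, 2, 4, 5}]

print(joint(P, Q))  # [{0}, {1, 2}, {3}, {4, 5}] -- finer than both P and Q
```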

(8)

Entropy

If $T: X \to X$ is measurable, then the $n$-th joint of $\mathcal{P}$ is defined as

$$\mathcal{P}^n = \begin{cases} \bigvee_{k=-n}^{n-1} T^{-k} \mathcal{P} & \text{if } T \text{ is invertible,} \\[4pt] \bigvee_{k=0}^{n-1} T^{-k} \mathcal{P} & \text{if } T \text{ is non-invertible.} \end{cases}$$

For example, if $T: S^1 \to S^1$ is the doubling map (non-invertible), and

$$\mathcal{P} = \{[0, \tfrac12), [\tfrac12, 1)\},$$

then

$$\mathcal{P}^n = \{[i/2^n, (i+1)/2^n) : i = 0, \dots, 2^n - 1\},$$

so $\#\mathcal{P}^n = 2^n$. A partition is generating for $T$ if for almost all $x \neq y \in X$, there is $n$ such that $x$ and $y$ lie in different elements of $\mathcal{P}^n$.
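The dyadic structure of $\mathcal{P}^n$ can be seen by grouping points according to their itineraries; below is a minimal sketch using exact dyadic arithmetic via fractions (the sampling of left endpoints is my own shortcut):

```python
from fractions import Fraction

HALF = Fraction(1, 2)

def itinerary(x, n):
    """Length-n symbol sequence of x under P = {[0,1/2), [1/2,1)} and the doubling map."""
    symbols = []
    for _ in range(n):
        symbols.append(0 if x < HALF else 1)
        x = (2 * x) % 1          # doubling map on the circle [0,1)
    return tuple(symbols)

n = 3
cells = {}
for i in range(2 ** n):          # left endpoints of the dyadic intervals of level n
    x = Fraction(i, 2 ** n)
    cells.setdefault(itinerary(x, n), []).append(x)

print(len(cells))                # 8 = 2^3 distinct cells, matching #P^n = 2^n
```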

(9)

Entropy

Given a finite partition $\mathcal{P}$ of a probability space $(X, \mu)$, let

$$H_\mu(\mathcal{P}) = \sum_{P \in \mathcal{P}} \varphi(\mu(P)) = -\sum_{P \in \mathcal{P}} \mu(P) \log(\mu(P)), \qquad (3)$$

where we can ignore the partition elements with $\mu(P) = 0$ because $\varphi(0) = 0$. For a $T$-invariant probability measure $\mu$ on $(X, \mathcal{B}, T)$, and a partition $\mathcal{P}$, define the entropy of $\mu$ w.r.t. $\mathcal{P}$ as

$$h_\mu(T, \mathcal{P}) = \lim_{n \to \infty} \frac{1}{n} H_\mu\Big(\bigvee_{k=0}^{n-1} T^{-k} \mathcal{P}\Big). \qquad (4)$$

Finally, the measure-theoretic entropy of $\mu$ is

$$h_\mu(T) = \sup\{h_\mu(T, \mathcal{P}) : \mathcal{P} \text{ is a finite partition of } X\}. \qquad (5)$$
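Formulas (3) and (4) translate directly into Python; the sketch below estimates $h_\mu(T, \mathcal{P})$ for the doubling map with Lebesgue measure, using the fact from the previous slide that $\bigvee_{k=0}^{n-1} T^{-k}\mathcal{P}$ consists of $2^n$ intervals of mass $2^{-n}$ each. The finite-$n$ cutoff is my own choice:

```python
import math

def H(masses):
    """H_mu(P) = -sum mu(P) log mu(P), as in (3); zero-mass elements are skipped."""
    return -sum(m * math.log(m) for m in masses if m > 0)

def h_estimate(n):
    """(1/n) H_mu of the n-th joint for the doubling map with Lebesgue measure."""
    masses = [2.0 ** -n] * 2 ** n      # 2^n dyadic intervals of equal mass
    return H(masses) / n

for n in (1, 5, 10):
    print(n, h_estimate(n), math.log(2))   # the estimate equals log 2 for every n
```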

(10)

Fekete's Lemma

For this definition to make sense, we need to verify that the limit in (4) exists. For this we need:

Definition: We call a real sequence $(a_n)_{n \ge 1}$ subadditive if $a_{m+n} \le a_m + a_n$ for all $m, n \in \mathbb{N}$, and a positive sequence $(a_n)_{n \ge 1}$ submultiplicative if $a_{m+n} \le a_m \cdot a_n$ for all $m, n \in \mathbb{N}$.

NB: If $(a_n)$ is submultiplicative, then $(\log a_n)$ is subadditive.

Fekete's Lemma: If $(a_n)_{n \ge 1}$ is subadditive, then $\lim_n \frac{a_n}{n} = \inf_{r \ge 1} \frac{a_r}{r}$.
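A quick numeric illustration of Fekete's Lemma with the subadditive sequence $a_n = \sqrt{n}$ (my own example): the ratios $a_n/n$ decrease toward $\inf_r a_r/r$, which here is $0$.

```python
import math

def a(n):
    return math.sqrt(n)   # subadditive: sqrt(m+n) <= sqrt(m) + sqrt(n)

# spot-check subadditivity on a grid
assert all(a(m + n) <= a(m) + a(n) + 1e-12
           for m in range(1, 50) for n in range(1, 50))

for n in (1, 10, 100, 1000):
    print(n, a(n) / n)    # 1.0, 0.316..., 0.1, 0.0316... -> inf_r a_r/r = 0
```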

(11)

Fekete's Lemma

Proof of Fekete's Lemma: Every integer $n$ can be written uniquely as $n = p \cdot q + r$ for $0 \le r < q$. Therefore

$$\limsup_{n \to \infty} \frac{a_n}{n} = \limsup_{p \to \infty} \frac{a_{p \cdot q + r}}{p \cdot q + r} \le \limsup_{p \to \infty} \frac{p\, a_q + a_r}{p \cdot q + r} = \frac{a_q}{q}.$$

This holds for all $q \in \mathbb{N}$, so we obtain

$$\inf_q \frac{a_q}{q} \le \liminf_n \frac{a_n}{n} \le \limsup_n \frac{a_n}{n} \le \inf_q \frac{a_q}{q},$$

as required.

(12)

Fekete's Lemma

Call $a_n = H_\mu(\bigvee_{k=0}^{n-1} T^{-k} \mathcal{P})$. Then (Prop. 13 of the Class Notes)

$$\begin{aligned}
a_{m+n} &= H_\mu\Big(\bigvee_{k=0}^{m+n-1} T^{-k} \mathcal{P}\Big) \\
&\le H_\mu\Big(\bigvee_{k=0}^{m-1} T^{-k} \mathcal{P}\Big) + H_\mu\Big(\bigvee_{k=m}^{m+n-1} T^{-k} \mathcal{P}\Big) \\
&= H_\mu\Big(\bigvee_{k=0}^{m-1} T^{-k} \mathcal{P}\Big) + H_\mu\Big(\bigvee_{k=0}^{n-1} T^{-k} \mathcal{P}\Big) && \text{by } T\text{-invariance of } \mu \\
&= a_m + a_n.
\end{aligned}$$

Therefore $H_\mu(\bigvee_{k=0}^{n-1} T^{-k} \mathcal{P})$ is subadditive, and the existence of the limit $h_\mu(T, \mathcal{P}) = \lim_n \frac{1}{n} H_\mu(\bigvee_{k=0}^{n-1} T^{-k} \mathcal{P})$ follows.

(13)

Entropy

The second natural question about computing entropy:

How can one possibly consider all partitions of $X$?

By the next theorem, which we state without proof (see Theorem 22 in the Class Notes), we can reduce all partitions to a single generating partition:

Theorem (Kolmogorov-Sinai): Let $(X, \mathcal{B}, T, \mu)$ be a measure-preserving dynamical system. If the partition $\mathcal{P}$ is such that

$\bigvee_{k=0}^{\infty} T^{-k} \mathcal{P}$ generates $\mathcal{B}$ if $T$ is non-invertible,
$\bigvee_{k=-\infty}^{\infty} T^{-k} \mathcal{P}$ generates $\mathcal{B}$ if $T$ is invertible,

then $h_\mu(T) = h_\mu(T, \mathcal{P})$.

(14)

Entropy

Now a good property of entropy:

Theorem: Two isomorphic measure-preserving systems have the same entropy.

Indeed, let $(X, \mathcal{B}, T, \mu)$ and $(Y, \mathcal{C}, S, \nu)$ have full-measure sets $X_0 \subset X$, $Y_0 \subset Y$ and a bi-measurable invertible measure-preserving map $\phi: X_0 \to Y_0$ such that the diagram

$$\begin{array}{ccc}
(X_0, \mathcal{B}, \mu) & \stackrel{T}{\longrightarrow} & (X_0, \mathcal{B}, \mu) \\
\phi \downarrow & & \downarrow \phi \\
(Y_0, \mathcal{C}, \nu) & \stackrel{S}{\longrightarrow} & (Y_0, \mathcal{C}, \nu)
\end{array}$$

commutes (i.e., $\phi \circ T = S \circ \phi$); then $h_\mu(T) = h_\nu(S)$.

This holds because the bi-measurable measure-preserving map $\phi$ preserves all the quantities involved in (3)-(5), including the class of partitions for both systems.

(15)

Entropy

For two-sided (i.e., invertible) Bernoulli shifts $(X = \{1, \dots, N\}^{\mathbb{Z}}, \mathcal{B}, \mu_p)$ based on the probability vector $p = (p_1, \dots, p_N)$, the cylinder partition

$$\mathcal{P} = \{[i] : i = 1, \dots, N\}$$

is generating.

Lemma: For the cylinder partition $\mathcal{P}$,

$$h_{\mu_p}(\sigma, \mathcal{P}) = -\sum_i p_i \log p_i.$$

By the Kolmogorov-Sinai Theorem, this generating partition suffices to compute the entropy.

Theorem (Ornstein 1974): Two two-sided Bernoulli shifts $(X, \mu_p, \sigma)$ and $(X', \mu_{p'}, \sigma)$ are isomorphic if and only if $h_{\mu_p}(\sigma) = h_{\mu_{p'}}(\sigma)$.

(16)

Entropy

Let us now compute that $h_{\mu_p}(\sigma, \mathcal{P}) = -\sum_i p_i \log p_i$, just for two symbols, and using the cylinder partition $\mathcal{P} = \{[0], [1]\}$, which can denote heads and tails in coin flips. We have

$$P(k \text{ heads in } n \text{ flips}) = \binom{n}{k} p^k (1-p)^{n-k},$$

so by full probability:

$$\sum_{k=0}^{n} \binom{n}{k} p^k (1-p)^{n-k} = 1.$$

Here $\binom{n}{k} = \frac{n!}{k!(n-k)!}$ are the binomial coefficients, and

$$k \binom{n}{k} = \frac{n!}{(k-1)!(n-k)!} = n \frac{(n-1)!}{(k-1)!(n-k)!} = n \binom{n-1}{k-1}, \qquad (n-k) \binom{n}{k} = \frac{n!}{k!(n-k-1)!} = n \frac{(n-1)!}{k!(n-k-1)!} = n \binom{n-1}{k}. \qquad (6)$$
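The two identities in (6) are easy to spot-check with Python's math.comb:

```python
from math import comb

for n in range(1, 12):
    for k in range(1, n + 1):
        assert k * comb(n, k) == n * comb(n - 1, k - 1)
    for k in range(n):
        assert (n - k) * comb(n, k) == n * comb(n - 1, k)
print("identities (6) hold on the tested range")
```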

(17)

Entropy

We compute (where $\rho(0) = p$ and $\rho(1) = 1-p$ denote the symbol probabilities):

$$\begin{aligned}
H_\mu\Big(\bigvee_{k=0}^{n-1} \sigma^{-k} \mathcal{P}\Big)
&= -\sum_{x_0, \dots, x_{n-1}=0}^{1} \mu([x_0, \dots, x_{n-1}]) \log \mu([x_0, \dots, x_{n-1}]) \\
&= -\sum_{x_0, \dots, x_{n-1}=0}^{1} \prod_{j=0}^{n-1} \rho(x_j) \log \prod_{j=0}^{n-1} \rho(x_j) \\
&= -\sum_{k=0}^{n} \binom{n}{k} p^k (1-p)^{n-k} \log\big(p^k (1-p)^{n-k}\big) \\
&= -\sum_{k=0}^{n} \binom{n}{k} p^k (1-p)^{n-k}\, k \log p - \sum_{k=0}^{n} \binom{n}{k} p^k (1-p)^{n-k} (n-k) \log(1-p).
\end{aligned}$$

In the first sum, the term $k = 0$ gives zero, as does the term $k = n$ for the second sum.

(18)

Entropy

Thus we leave out these terms and rearrange by (6):

$$\begin{aligned}
H_\mu\Big(\bigvee_{k=0}^{n-1} \sigma^{-k} \mathcal{P}\Big)
&= -p \log p \sum_{k=1}^{n} k \binom{n}{k} p^{k-1} (1-p)^{n-k} - (1-p) \log(1-p) \sum_{k=0}^{n-1} (n-k) \binom{n}{k} p^k (1-p)^{n-k-1} \\
&= -p \log p \sum_{k=1}^{n} n \binom{n-1}{k-1} p^{k-1} (1-p)^{n-k} - (1-p) \log(1-p) \sum_{k=0}^{n-1} n \binom{n-1}{k} p^k (1-p)^{n-k-1} \\
&= n \big(-p \log p - (1-p) \log(1-p)\big),
\end{aligned}$$

since both remaining sums equal $(p + (1-p))^{n-1} = 1$ by the binomial theorem. $\mathcal{P}$ is generating, so by the Kolmogorov-Sinai Theorem,

$$h_\mu(\sigma) = h_\mu(\sigma, \mathcal{P}) = \lim_n \frac{1}{n} H_\mu\Big(\bigvee_{k=0}^{n-1} \sigma^{-k} \mathcal{P}\Big) = -p \log p - (1-p) \log(1-p).$$
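As a cross-check of the whole computation, the sketch below evaluates $H_\mu(\bigvee_{k=0}^{n-1} \sigma^{-k} \mathcal{P})$ from the binomial formula on the previous slide and compares $\frac{1}{n} H_\mu$ with $-p \log p - (1-p) \log(1-p)$; the choice $p = 0.3$ is arbitrary:

```python
from math import comb, log

def H_n(p, n):
    """H_mu of the n-th joint of the cylinder partition for a (p, 1-p) Bernoulli shift."""
    total = 0.0
    for k in range(n + 1):
        mass = comb(n, k) * p**k * (1 - p)**(n - k)  # total mass of cylinders with k heads
        total -= mass * (k * log(p) + (n - k) * log(1 - p))
    return total

p = 0.3
h = -p * log(p) - (1 - p) * log(1 - p)
for n in (1, 10, 50):
    print(n, H_n(p, n) / n, h)   # (1/n) H_n equals h for every n
```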
