Variational Principle

Measure-theoretic and topological entropy are related via the Variational Principle:

Theorem: Let $(X,d)$ be a compact metric space and $T : X \to X$ a continuous map. Then

\[ h_{\mathrm{top}}(T) = \sup\{ h_\mu(T) : \mu \text{ is a } T\text{-invariant probability measure} \}. \]

Any measure $\mu$ such that $h_{\mathrm{top}}(T) = h_\mu(T)$ is called a measure of maximal entropy.

If there is a unique measure of maximal entropy $\mu_{\max}$, then $(X,T)$ is called intrinsically ergodic.

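Remark: A unique measure of maximal entropy is automatically ergodic.

Indeed, if $\mu_{\max}$ is not ergodic, say $\mu_{\max} = \alpha\mu_1 + (1-\alpha)\mu_2$ with $0 < \alpha < 1$ and $T$-invariant probability measures $\mu_1 \neq \mu_2$, then

\[ h_{\mu_{\max}}(T) = \alpha h_{\mu_1}(T) + (1-\alpha) h_{\mu_2}(T), \]

because measure-theoretic entropy is affine in the measure (check the definitions). But this means that at least one of the $\mu_i$, $i = 1,2$, has $h_{\mu_i}(T) \ge h_{\mu_{\max}}(T)$, so it is a measure of maximal entropy as well, contradicting uniqueness.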

Remark: A measure of maximal entropy need not exist if $T$ is discontinuous. For example, the Gauss map $G(x) = \frac{1}{x} - \lfloor \frac{1}{x} \rfloor$ has no measure of maximal entropy.

Exercise: Show that the Gauss map has infinite topological entropy.

Measures of Maximal Entropy

Most dynamical systems we see are intrinsically ergodic, but finding the measure of maximal entropy is not always simple.

- Any uniquely ergodic system is intrinsically ergodic.

- For the full shift on $N$ symbols, the $(\frac{1}{N}, \dots, \frac{1}{N})$-Bernoulli measure is the unique measure of maximal entropy.

- For transitive maps $T : [0,1] \to [0,1]$ of constant slope $\pm s$, $|s| > 1$, the invariant measure that is absolutely continuous w.r.t. Lebesgue measure is the unique measure of maximal entropy.

- Lebesgue measure is the unique measure of maximal entropy of a hyperbolic toral automorphism.
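The full-shift example can be checked numerically. The sketch below uses the standard formula $-\sum_i p_i \log p_i$ for the entropy of a Bernoulli measure and compares the uniform weights with one arbitrarily chosen non-uniform vector; the values `N = 3` and `skewed` are illustrative assumptions, not taken from the slides.

```python
import math

def bernoulli_entropy(probs):
    """Entropy -sum p_i log p_i of the Bernoulli measure with weights probs."""
    return -sum(p * math.log(p) for p in probs if p > 0)

N = 3
uniform = [1.0 / N] * N
skewed = [0.5, 0.3, 0.2]  # hypothetical comparison weights

h_uniform = bernoulli_entropy(uniform)
h_skewed = bernoulli_entropy(skewed)

print(h_uniform, math.log(N))  # the two values agree up to rounding
print(h_skewed)                # smaller than log N
```

The uniform weights reach $\log N$, the topological entropy of the full shift on $N$ symbols, while the non-uniform weights fall short of it.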

The next main result (a theorem due to Parry) is about finding the measure of maximal entropy for subshifts of finite type.

Subshifts of Finite Type

Let $A = (A_{i,j})_{i,j=1}^N$ be a non-negative $N \times N$ integer matrix.

- We call $A$ a transition matrix because $A_{i,j}$ usually indicates whether (or in how many ways) you can go from state $i$ to state $j$ in a Markov partition.

- $A$ is irreducible if for every $i,j$ there is $k \ge 1$ such that the $(i,j)$-entry of $A^k$ is positive.

- Let $p(i) = \gcd\{k \ge 1 : \text{the } (i,i)\text{-entry of } A^k \text{ is positive}\}$ be the period of state $i$. $A$ is aperiodic if $\gcd\{p(i) : p(i) \text{ exists}\} = 1$.

- $A$ is primitive if $A$ is both irreducible and aperiodic. Equivalently, there is $k$ such that $A^k$ is a strictly positive matrix.
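The conditions in this list can be tested mechanically for small matrices. The following is a minimal sketch, assuming the matrix is given as a list of rows; the function names are hypothetical. The primitivity test relies on Wielandt's bound: for a primitive $N \times N$ matrix, the power $A^k$ is strictly positive for some $k \le (N-1)^2 + 1$.

```python
def mat_mul(A, B):
    """Product of two square integer matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def sign_pattern(A):
    """Replace every positive entry by 1, so entries cannot grow."""
    return [[1 if a > 0 else 0 for a in row] for row in A]

def is_irreducible(A):
    """Every pair (i, j) is joined by a path of some length k, 1 <= k <= N."""
    n = len(A)
    B = sign_pattern(A)
    power = B
    reached = [row[:] for row in B]
    for _ in range(n - 1):
        power = sign_pattern(mat_mul(power, B))
        reached = [[max(r, p) for r, p in zip(rrow, prow)]
                   for rrow, prow in zip(reached, power)]
    return all(all(row) for row in reached)

def is_primitive(A):
    """Some power A^k with k <= (N-1)^2 + 1 is strictly positive."""
    n = len(A)
    B = sign_pattern(A)
    power = B
    for _ in range((n - 1) ** 2 + 1):
        if all(all(row) for row in power):
            return True
        power = sign_pattern(mat_mul(power, B))
    return False

golden_mean = [[1, 1], [1, 0]]
swap = [[0, 1], [1, 0]]
identity = [[1, 0], [0, 1]]

print(is_irreducible(golden_mean), is_primitive(golden_mean))
print(is_irreducible(swap), is_primitive(swap))
print(is_irreducible(identity), is_primitive(identity))
```

Only sign patterns are multiplied, which is enough because positivity of an entry of $A^k$ depends only on which entries of $A$ are positive.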

The set of (bi)infinite strings

\[ \Sigma_A = \{ (x_i)_{i \in \mathbb{Z}} : x_i \in \{1,\dots,N\},\ A_{x_i,x_{i+1}} > 0 \text{ for all } i \in \mathbb{Z} \} \]

is shift-invariant and closed in the standard product topology of $\{1,\dots,N\}^{\mathbb{Z}}$. Hence it is a subshift.

It is called a subshift of finite type (SFT) because a finite collection of forbidden words (namely the pairs $ij$ such that $A_{i,j} = 0$) fully determines $\Sigma_A$.

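The word-complexity is

\[ p_n(\Sigma_A) := \#\{ x_0 \dots x_{n-1} : x_0 \dots x_{n-1} \text{ is a subword appearing in } \Sigma_A \}. \]

Because the $n$-cylinders form an open $2^{-n}$-cover of $\Sigma_A$:

\[ h_{\mathrm{top}}(\sigma|_{\Sigma_A}) = \lim_{n\to\infty} \frac{1}{n} \log p_n(\Sigma_A) = \log \lambda, \]

where $\lambda$ is the leading eigenvalue of the transition matrix $A$.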
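As an illustration of the growth rate, the sketch below counts admissible words for the golden mean shift (an example chosen here, with states indexed 0 and 1 and the word 11 forbidden). It assumes a 0,1-matrix in which every admissible word extends to a bi-infinite string, as is the case for irreducible $A$, so that $p_n$ equals the sum of the entries of $A^{n-1}$.

```python
import math
from itertools import product

A = [[1, 1],
     [1, 0]]  # golden mean shift: the transition 1 -> 1 is forbidden

def count_words_brute(A, n):
    """Enumerate all words of length n and keep the admissible ones."""
    N = len(A)
    return sum(
        all(A[w[k]][w[k + 1]] > 0 for k in range(n - 1))
        for w in product(range(N), repeat=n)
    )

def count_words_matrix(A, n):
    """Sum of the entries of A^(n-1), via repeated action on the all-ones vector."""
    N = len(A)
    vec = [1] * N
    for _ in range(n - 1):
        vec = [sum(A[i][j] * vec[j] for j in range(N)) for i in range(N)]
    return sum(vec)

counts = [count_words_matrix(A, n) for n in range(1, 7)]
print(counts)  # Fibonacci numbers 2, 3, 5, 8, 13, 21

n = 200
estimate = math.log(count_words_matrix(A, n)) / n
print(estimate, math.log((1 + math.sqrt(5)) / 2))
```

The estimate approaches $\log\lambda$ with an error of order $1/n$, so it agrees with the logarithm of the golden ratio only to a few decimals at $n = 200$.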

Perron-Frobenius Theorem: Let $A$ be a primitive non-negative $N \times N$ matrix. Then $A$ has a unique (up to scaling) eigenvector with all entries $> 0$. The corresponding eigenvalue $\lambda$ is positive, has multiplicity one, and is larger than the absolute value of every other eigenvalue of $A$.

- $\lambda$ is called the leading or Perron-Frobenius eigenvalue.

- If $A$ is not irreducible, then $\lambda$ can have higher multiplicity. For example $A = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$.

- If $A$ is not aperiodic, then there can be other eigenvalues of the same absolute value as $\lambda$. For example $A = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$.

- The Perron-Frobenius Theorem holds both for left and right eigenvectors.
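For a primitive matrix, the leading eigenvalue and a positive eigenvector can be approximated by power iteration, which converges precisely because all other eigenvalues are strictly smaller in absolute value. The sketch below is one such approximation, not a method prescribed by the slides; `eigenvalues_2x2` is a hypothetical helper that assumes real eigenvalues and is used only to inspect the two counterexamples in the list.

```python
import math

def power_iteration(A, steps=200):
    """Approximate the leading eigenvalue and a positive right eigenvector."""
    N = len(A)
    v = [1.0 / N] * N          # positive start vector with entries summing to 1
    lam = 0.0
    for _ in range(steps):
        w = [sum(A[i][j] * v[j] for j in range(N)) for i in range(N)]
        lam = sum(w)           # equals the eigenvalue once v has converged
        v = [x / lam for x in w]
    return lam, v

def eigenvalues_2x2(A):
    """Both eigenvalues of a 2x2 matrix, assuming they are real."""
    tr = A[0][0] + A[1][1]
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    root = math.sqrt(tr * tr - 4 * det)
    return ((tr + root) / 2, (tr - root) / 2)

lam, v = power_iteration([[1, 1], [1, 0]])
print(lam, v)                              # golden ratio, positive eigenvector
print(eigenvalues_2x2([[1, 0], [0, 1]]))   # eigenvalue 1 twice: not irreducible
print(eigenvalues_2x2([[0, 1], [1, 0]]))   # eigenvalues 1 and -1: not aperiodic
```

Applying the same function to the transpose of $A$ yields the left eigenvector.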

Parry Measure

Bill Parry constructed the measure of maximal entropy, which is now named after him. Let $(\Sigma_A, \sigma)$ be a subshift of finite type on the alphabet $\{1,\dots,N\}$ with transition matrix $A = (A_{i,j})_{i,j=1}^N$, $A_{i,j} \in \{0,1\}$, so $x = (x_n) \in \Sigma_A$ if and only if $A_{x_n,x_{n+1}} = 1$ for all $n$.

We assume that $A$ is aperiodic and irreducible. Then by the Perron-Frobenius Theorem, the leading eigenvalue $\lambda$ has multiplicity one, is larger in absolute value than every other eigenvalue, and $h_{\mathrm{top}}(\sigma) = \log\lambda$.

The left and right eigenvectors

\[ u = (u_1,\dots,u_N) \quad\text{and}\quad v = (v_1,\dots,v_N)^T \]

associated to $\lambda$ are unique up to a multiplicative factor. We will scale them such that they are positive and

\[ \sum_{i=1}^N u_i v_i = 1. \]

Define the Parry measure by

\[ p_i := u_i v_i = \mu([i]), \qquad p_{i,j} := \frac{A_{i,j} v_j}{\lambda v_i} = \mu([ij] \mid [i]), \]

so $p_{i,j}$ indicates the conditional probability that $x_{n+1} = j$ knowing that $x_n = i$. Therefore $\mu([ij]) = \mu([i])\,\mu([ij] \mid [i]) = p_i p_{i,j}$. It is stationary (i.e., shift-invariant) but not quite a product measure:

\[ \mu([i_m \dots i_n]) = p_{i_m} \cdot p_{i_m,i_{m+1}} \cdots p_{i_{n-1},i_n}. \]

Theorem: The Parry measure $\mu$ is the unique measure of maximal entropy for a subshift of finite type with aperiodic irreducible transition matrix.
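To make the definition concrete, here is a minimal sketch that builds $p_i$ and $p_{i,j}$ for the golden mean shift, an example chosen for illustration. The eigenvectors are approximated by power iteration, which is an assumption of this sketch and requires $A$ to be primitive; the function names are hypothetical.

```python
import math

def leading_eigenpair(A, steps=500):
    """Power iteration: leading eigenvalue and positive right eigenvector."""
    N = len(A)
    v = [1.0 / N] * N
    lam = 1.0
    for _ in range(steps):
        w = [sum(A[i][j] * v[j] for j in range(N)) for i in range(N)]
        lam = sum(w)
        v = [x / lam for x in w]
    return lam, v

def parry_measure(A):
    """Return (lambda, p, P) with p[i] = u_i v_i and P[i][j] = A_ij v_j / (lambda v_i)."""
    N = len(A)
    At = [[A[j][i] for j in range(N)] for i in range(N)]
    lam, v = leading_eigenpair(A)
    _, u = leading_eigenpair(At)   # left eigenvector of A = right eigenvector of A^T
    scale = sum(u[i] * v[i] for i in range(N))
    u = [x / scale for x in u]     # now sum_i u_i v_i = 1
    p = [u[i] * v[i] for i in range(N)]
    P = [[A[i][j] * v[j] / (lam * v[i]) for j in range(N)] for i in range(N)]
    return lam, p, P

def cylinder_mass(p, P, word):
    """Mass of the cylinder [i_m ... i_n] as a product of p and transition terms."""
    mass = p[word[0]]
    for a, b in zip(word, word[1:]):
        mass *= P[a][b]
    return mass

A = [[1, 1], [1, 0]]
lam, p, P = parry_measure(A)
print(lam)   # golden ratio
print(p)     # masses of the two 1-cylinders
print(P)     # conditional probabilities; P[1][1] is 0 for the forbidden word
print(cylinder_mass(p, P, [0, 1, 0]))
```

States are indexed from 0 in the code, so index `i` corresponds to symbol $i+1$ in the text.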

Proof: In this proof, we will only show that $h_\mu(\sigma) = h_{\mathrm{top}}(\sigma) = \log\lambda$, and skip the (more complicated) uniqueness part.

The definitions of the masses of 1-cylinders and 2-cylinders are compatible, because (since $v$ is a right eigenvector)

\[ \sum_{j=1}^N \mu([ij]) = \sum_{j=1}^N p_i p_{i,j} = p_i \sum_{j=1}^N \frac{A_{i,j} v_j}{\lambda v_i} = p_i \frac{\lambda v_i}{\lambda v_i} = p_i = \mu([i]). \]

Summing over $i$, we get $\sum_{i=1}^N \mu([i]) = \sum_{i=1}^N p_i = \sum_{i=1}^N u_i v_i = 1$, due to our scaling.

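To show that $\mu$ is shift-invariant, we take any cylinder set $Z = [i_m \dots i_n]$ and compute

\begin{align*}
\mu(\sigma^{-1} Z) &= \sum_{i=1}^N \mu([i\, i_m \dots i_n]) = \sum_{i=1}^N \frac{p_i p_{i,i_m}}{p_{i_m}} \mu([i_m \dots i_n]) \\
&= \mu([i_m \dots i_n]) \sum_{i=1}^N \frac{u_i v_i A_{i,i_m} v_{i_m}}{\lambda v_i u_{i_m} v_{i_m}} \\
&= \mu(Z) \sum_{i=1}^N \frac{u_i A_{i,i_m}}{\lambda u_{i_m}} = \mu(Z) \frac{\lambda u_{i_m}}{\lambda u_{i_m}} = \mu(Z).
\end{align*}

This invariance carries over to all sets in the $\sigma$-algebra $\mathcal{B}$ generated by the cylinder sets.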
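The compatibility and invariance computations can be sanity-checked numerically. The sketch below does so for a hypothetical 3-state primitive matrix (with leading eigenvalue 2), again approximating the eigenvectors by power iteration; it checks that the rows of $(p_{i,j})$ sum to 1, that $\sum_i p_i p_{i,j} = p_j$, and that $\mu(\sigma^{-1}Z) = \mu(Z)$ for one sample cylinder.

```python
def leading_eigenpair(A, steps=500):
    """Power iteration: leading eigenvalue and positive right eigenvector."""
    N = len(A)
    v = [1.0 / N] * N
    lam = 1.0
    for _ in range(steps):
        w = [sum(A[i][j] * v[j] for j in range(N)) for i in range(N)]
        lam = sum(w)
        v = [x / lam for x in w]
    return lam, v

A = [[1, 1, 0],
     [0, 0, 1],
     [1, 1, 1]]  # hypothetical example; states are indexed from 0
N = len(A)

lam, v = leading_eigenpair(A)
_, u = leading_eigenpair([[A[j][i] for j in range(N)] for i in range(N)])
scale = sum(u[i] * v[i] for i in range(N))
p = [u[i] * v[i] / scale for i in range(N)]
P = [[A[i][j] * v[j] / (lam * v[i]) for j in range(N)] for i in range(N)]

def mass(word):
    """Parry mass of the cylinder given by a finite word."""
    m = p[word[0]]
    for a, b in zip(word, word[1:]):
        m *= P[a][b]
    return m

row_sums = [sum(row) for row in P]
stationary = [sum(p[i] * P[i][j] for i in range(N)) for j in range(N)]
Z = [0, 1, 2]
preimage_mass = sum(mass([i] + Z) for i in range(N))

print(row_sums)                # each row sums to 1 (v is a right eigenvector)
print(stationary, p)           # equal vectors (u is a left eigenvector)
print(preimage_mass, mass(Z))  # equal: the measure is shift-invariant
```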

Based on the interpretation of conditional probabilities, the identities

\[
\begin{aligned}
\sum_{\substack{i_{m+1},\dots,i_n=1 \\ A_{i_k,i_{k+1}}=1}}^{N} p_{i_m} p_{i_m,i_{m+1}} \cdots p_{i_{n-1},i_n} &= p_{i_m} \\
\text{and} \qquad \sum_{\substack{i_m,\dots,i_{n-1}=1 \\ A_{i_k,i_{k+1}}=1}}^{N} p_{i_m} p_{i_m,i_{m+1}} \cdots p_{i_{n-1},i_n} &= p_{i_n}
\end{aligned}
\tag{1}
\]

follow because the left-hand side indicates the total probability of starting in state $i_m$ and reaching some state after $n-m$ steps, respectively of starting at some state and reaching state $i_n$ after $n-m$ steps.

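To compute $h_\mu(\sigma)$, we will take the partition $\mathcal{P}$ of 1-cylinder sets; this partition is generating, so this restriction is justified by the Kolmogorov-Sinai Theorem (on generating partitions).

\begin{align*}
H_\mu\Big(\bigvee_{k=0}^{n-1} \sigma^{-k}\mathcal{P}\Big)
&= -\sum_{\substack{i_0,\dots,i_{n-1}=1 \\ A_{i_k,i_{k+1}}=1}}^{N} \mu([i_0 \dots i_{n-1}]) \log \mu([i_0 \dots i_{n-1}]) \\
&= -\sum_{\substack{i_0,\dots,i_{n-1}=1 \\ A_{i_k,i_{k+1}}=1}}^{N} p_{i_0} p_{i_0,i_1} \cdots p_{i_{n-2},i_{n-1}} \big( \log p_{i_0} + \log p_{i_0,i_1} + \cdots + \log p_{i_{n-2},i_{n-1}} \big) \\
&= -\sum_{i_0=1}^{N} p_{i_0} \log p_{i_0} - (n-1) \sum_{i,j=1}^{N} p_i p_{i,j} \log p_{i,j},
\end{align*}

by (1) used repeatedly.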

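Hence

\begin{align*}
h_\mu(\sigma) &= \lim_{n\to\infty} \frac{1}{n} H_\mu\Big(\bigvee_{k=0}^{n-1} \sigma^{-k}\mathcal{P}\Big) \\
&= -\sum_{i,j=1}^{N} p_i p_{i,j} \log p_{i,j} \\
&= -\sum_{i,j=1}^{N} \frac{u_i A_{i,j} v_j}{\lambda} \big( \log A_{i,j} + \log v_j - \log v_i - \log\lambda \big).
\end{align*}

The first term in the brackets is zero because $A_{i,j} \in \{0,1\}$ (with the convention $0 \log 0 = 0$).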

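The second term $-\sum_{i,j=1}^N \frac{u_i A_{i,j} v_j}{\lambda} \log v_j$ (summing first over $i$) simplifies to

\[ -\sum_{j=1}^N \frac{\lambda u_j v_j}{\lambda} \log v_j = -\sum_{j=1}^N u_j v_j \log v_j. \]

The third term $\sum_{i,j=1}^N \frac{u_i A_{i,j} v_j}{\lambda} \log v_i$ (summing first over $j$) simplifies to

\[ \sum_{i=1}^N \frac{u_i \lambda v_i}{\lambda} \log v_i = \sum_{i=1}^N u_i v_i \log v_i. \]

Hence these two terms cancel each other.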

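The remaining term is

\[ \sum_{i,j=1}^N \frac{u_i A_{i,j} v_j}{\lambda} \log\lambda = \sum_{i=1}^N \frac{u_i \lambda v_i}{\lambda} \log\lambda = \sum_{i=1}^N u_i v_i \log\lambda = \log\lambda. \]

This finishes the proof.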
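The identity $h_\mu(\sigma) = -\sum_{i,j} p_i p_{i,j} \log p_{i,j} = \log\lambda$ can be confirmed numerically. The sketch below evaluates the sum for the golden mean shift, an illustrative choice, with eigenvectors approximated by power iteration (an assumption of the sketch; it requires $A$ to be primitive).

```python
import math

def leading_eigenpair(A, steps=500):
    """Power iteration: leading eigenvalue and positive right eigenvector."""
    N = len(A)
    v = [1.0 / N] * N
    lam = 1.0
    for _ in range(steps):
        w = [sum(A[i][j] * v[j] for j in range(N)) for i in range(N)]
        lam = sum(w)
        v = [x / lam for x in w]
    return lam, v

def parry_entropy(A):
    """Return (lambda, h) with h = -sum_{i,j} p_i p_ij log p_ij for the Parry measure."""
    N = len(A)
    lam, v = leading_eigenpair(A)
    _, u = leading_eigenpair([[A[j][i] for j in range(N)] for i in range(N)])
    scale = sum(u[i] * v[i] for i in range(N))
    p = [u[i] * v[i] / scale for i in range(N)]
    P = [[A[i][j] * v[j] / (lam * v[i]) for j in range(N)] for i in range(N)]
    h = -sum(p[i] * P[i][j] * math.log(P[i][j])
             for i in range(N) for j in range(N) if P[i][j] > 0)
    return lam, h

lam, h = parry_entropy([[1, 1], [1, 0]])
print(h, math.log(lam))  # the two values agree up to rounding
```

Entries with $p_{i,j} = 0$ are skipped, matching the convention $0 \log 0 = 0$.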

Remark: To deal with entries $A_{i,j} \in \{2,3,4,\dots\}$, we can split states and regain a 0,1-matrix.
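One way to read this remark, sketched below under that assumption, is to pass to the edge graph: every edge of the multigraph with adjacency matrix $A$ becomes a state, and edge $e$ may be followed by edge $f$ when $e$ ends where $f$ starts. This may not be the exact splitting intended in the remark. The example matrix and the function names are hypothetical; the resulting 0,1-matrix has the same leading eigenvalue, hence the same entropy $\log\lambda$.

```python
import math

def edge_matrix(A):
    """0/1 transition matrix on the edges of the multigraph with adjacency matrix A."""
    N = len(A)
    # One state per edge: A[i][j] parallel edges from i to j.
    edges = [(i, j) for i in range(N) for j in range(N) for _ in range(A[i][j])]
    # Edge e may be followed by edge f when e ends where f starts.
    return [[1 if e[1] == f[0] else 0 for f in edges] for e in edges]

def leading_eigenvalue(A, steps=500):
    """Power iteration for the leading eigenvalue of a primitive matrix."""
    N = len(A)
    v = [1.0 / N] * N
    lam = 1.0
    for _ in range(steps):
        w = [sum(A[i][j] * v[j] for j in range(N)) for i in range(N)]
        lam = sum(w)
        v = [x / lam for x in w]
    return lam

A = [[2, 1],
     [1, 0]]          # hypothetical matrix with an entry equal to 2
E = edge_matrix(A)    # 4 x 4 matrix with entries in {0, 1}

print(E)
print(leading_eigenvalue(A), leading_eigenvalue(E), 1 + math.sqrt(2))
```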