Topological Entropy of Formal Languages


arXiv:1507.03393v1 [math.DS] 13 Jul 2015


Friedrich Martin Schneider, Daniel Borchmann July 14, 2015

In this work we shall consider a notion of complexity of formal languages $L$ that is inspired by the concept of entropy from dynamical systems. More precisely, we shall define the topological entropy of $L$ to be the exponential growth rate of the restrictions of the Nerode congruence relation of $L$ to words of length at most $n$. We shall show that the topological entropy of regular languages is always zero, but that there are also non-regular languages with vanishing entropy, for example Dyck languages. Furthermore, we shall establish a way to compute the entropy of a formal language that is given by a topological automaton accepting it. Finally, we shall point out that the topological entropy of a formal language can be seen as the entropic dimension of a suitable precompact pseudo-metric space.

1 Introduction

A variety of notions has been developed to assess different aspects of the complexity of formal languages. Most of these notions have been devised with an understanding of complexity in mind that comes from classical complexity theory, and thus these notions are formulated as decision problems. Examples of this are the word problem and the equivalence problem for formal languages, where the complexity of a formal language is measured by the complexity class for which these problems are complete. Other notions quantify complexity by other means. Examples are the state complexity [18] of a regular language, which gives the complexity of the language as the number of states in its minimal automaton, or the syntactic complexity of a regular language, which instead considers the size of the corresponding syntactic semigroup [10].

The core idea of the present article is to expand the methods of measuring a formal language’s complexity by a topological approach in terms of topological entropy, which proved tremendously useful to dynamical systems. Topological entropy was introduced by Adler et al. [1] for single homeomorphisms (or continuous transformations) on a compact Hausdorff space. The literature provides several essentially different extensions of this concept for continuous group and semigroup actions. Among others, there is an approach towards topological entropy for continuous actions of finitely generated (pseudo-)groups due to Ghys et al. [13] (see also [3, 5, 7, 16]), which has also been investigated for continuous semigroup actions in [4, 6, 14].


By a dynamical system we mean a continuous semigroup action on a compact Hausdorff topological space. Topological entropy measures the ability of an observer to distinguish between points of the dynamical system just by recognizing transitions at equal time intervals, i.e., with respect to a fixed generating system of transformations, starting from the initial state. Since the above notion of dynamical system may very well be regarded as the topological counterpart of a finite automaton, it seems natural to utilize the dynamical approach for applications to automata theory.

To view a formal language $L$ over an alphabet $\Sigma$ as some kind of dynamical system we take inspiration from the characterization of regular languages as languages whose Myhill-Nerode congruence relation $\Theta(L)$ has finite index. Recall that for $u, v \in \Sigma^*$ we have

\[ (u,v) \in \Theta(L) \iff \forall w \in \Sigma^* \colon (uw \in L \iff vw \in L). \]

The relation $\Theta(L)$ can be seen as some way of measuring the complexity of $L$: if $L$ is regular, the number of equivalence classes is finite and equals the number of states in the minimal automaton of $L$. This is the idea behind the notion of state complexity.

However, if $L$ is not regular, $\Theta(L)$ does not have finite index, and asking for the state complexity of $L$ is not a reasonable undertaking. The notion of topological entropy that we shall introduce in this paper tries to overcome this issue in the following way. Instead of considering the relation $\Theta(L)$ alone, we shall also take into account how $\Theta(L)$ arises as a limit of a sequence of certain equivalence relations $\Theta(\Sigma^{(n)}, L)$. The behavior of this sequence $(\Theta(\Sigma^{(n)}, L) \mid n \in \mathbb{N})$ then gives rise to our notion of topological complexity of $L$.

It is the purpose of this paper to present a first investigation of the notion of topological entropy of formal languages. We shall show that all regular languages have vanishing topological entropy, but that there are also non-regular languages whose topological entropy is zero, most notably Dyck languages [2]. We shall also give examples of context-free languages with non-vanishing entropy. Furthermore, we shall discuss how the topological entropy of a formal language can be computed if the language itself is represented by means of a topological automaton [15].

This paper is structured as follows. We shall first introduce and investigate the topological entropy of formal languages in Section 2. In Section 3 we shall have a closer look at the connection between the topological entropy of formal languages and the topological complexity of semigroup actions on compact Hausdorff spaces. In particular, we shall show how the topological entropy of a language can be obtained if the language is given by a topological automaton accepting it. Finally, Section 4 shall show that the topological entropy coincides with the entropic dimension of a suitable precompact pseudo-ultrametric space.

2 Topological entropy of formal languages

In this section we shall introduce our new notion of topological entropy of formal languages. As already sketched in the introduction, this notion is inspired by the characterization of regular languages as those languages whose Nerode congruence has finite index.

To make our argumentation easier to follow, we shall thus first recall this famous result.


Thereafter, we shall introduce and investigate our notion of topological entropy. In particular, we shall show that all regular languages have entropy zero, but that there are also non-regular languages with vanishing entropy. This latter discussion shall be embedded in a more general observation about languages defined via groups with sub-exponential growth. Finally, we shall discuss some more examples of languages and determine their entropy.

Let us recall some basic notation. Let $\Theta$ be an equivalence relation on a set $Y$. For $y \in Y$ we put $[y]_\Theta := \{x \in Y \mid (x,y) \in \Theta\}$, and we set $Y/\Theta := \{[y]_\Theta \mid y \in Y\}$. Furthermore, the index of $\Theta$ on $Y$ is defined as $\operatorname{ind}(\Theta) := |Y/\Theta|$. For a mapping $f\colon X \to Y$ we set $f^{-1}(\Theta) := \{(s,t) \in X \times X \mid (f(s), f(t)) \in \Theta\}$. Clearly, $f^{-1}(\Theta)$ then constitutes an equivalence relation on $X$.

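Now let $\Sigma$ be an alphabet, i.e., a finite and non-empty set. The Nerode congruence of a language $L \subseteq \Sigma^*$ is the equivalence relation

\[ \Theta(L) := \{(u,v) \in \Sigma^* \times \Sigma^* \mid \forall w \in \Sigma^* \colon uw \in L \Leftrightarrow vw \in L\}. \]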

Recall that $L$ is regular if and only if it is accepted by a finite automaton. The following characterization of regular languages in terms of the Nerode congruence relation is well-known.

Theorem 2.1 (Myhill-Nerode) Let $\Sigma$ be a finite alphabet. A language $L \subseteq \Sigma^*$ is regular if and only if $\Theta(L)$ has finite index.

For regular languages $L$ the number of equivalence classes of the Nerode congruence relation can thus be considered as a measure of the complexity of $L$. However, if $L$ is not regular, this measure is no longer available. We shall remedy this by considering not the number of equivalence classes of $\Theta(L)$, but the growth of the number of equivalence classes of a particular approximation of $\Theta(L)$. Based on this growth we introduce our notion of topological entropy of $L$.

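Definition 2.2 Let $\Sigma$ be an alphabet. For $F \subseteq \Sigma^*$ finite and $L \subseteq \Sigma^*$ define

\[ \Theta(F,L) := \{(u,v) \in \Sigma^* \times \Sigma^* \mid \forall w \in F \colon uw \in L \Leftrightarrow vw \in L\}. \]

△

Now, the equivalence relations $\Theta(F,L)$ constitute an approximation of $\Theta(L)$ in the sense that

\[ \Theta(L) = \bigcap \{\Theta(F,L) \mid F \subseteq \Sigma^* \text{ finite}\}. \tag{1} \]

Furthermore, it can be seen quite easily that $\Theta(F,L)$ always has finite index. The mapping $\Phi_{F,L}\colon \Sigma^*/\Theta(F,L) \to \{0,1\}^F$ given by

\[ \Phi_{F,L}([u]_{\Theta(F,L)})(w) := \begin{cases} 1 & \text{if } uw \in L, \\ 0 & \text{otherwise} \end{cases} \]

is a well-defined injection. Because of this we have $\operatorname{ind}\Theta(F,L) \le 2^{|F|}$, and thus $\Theta(F,L)$ has finite index. Thus the following definition is reasonable.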
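As an illustration of the injection $\Phi_{F,L}$, the following Python sketch computes the 0/1 vector attached to a word $u$ and counts the distinct vectors over a finite sample of words. The helper names (`words_up_to`, `signature`, `count_classes`) and the example language $\{a^n b^n\}$ are assumptions made for this illustration only. Because the sample is finite, the count is in general only a lower bound for $\operatorname{ind}\Theta(F,L)$.

```python
from itertools import product

def words_up_to(alphabet, n):
    """All words over `alphabet` of length at most n, i.e. the set Sigma^(n)."""
    return ["".join(p) for k in range(n + 1) for p in product(alphabet, repeat=k)]

def signature(u, F, in_L):
    """The 0/1 vector (uw in L for w in F), i.e. Phi_{F,L} applied to the class of u."""
    return tuple(int(in_L(u + w)) for w in F)

def count_classes(sample, F, in_L):
    """Number of distinct signatures among the sampled words u.

    Only a lower bound for ind Theta(F, L), because `sample` is finite.
    """
    return len({signature(u, F, in_L) for u in sample})

def in_anbn(w):
    """Hypothetical example language L = {a^n b^n : n >= 0}."""
    n = len(w) // 2
    return len(w) % 2 == 0 and w == "a" * n + "b" * n

F = words_up_to("ab", 2)        # the 7 suffixes of length at most 2
sample = words_up_to("ab", 4)   # the 31 sampled words of length at most 4
classes = count_classes(sample, F, in_anbn)
print(classes, "<=", 2 ** len(F))
```

Two words receive the same signature exactly when they are related by $\Theta(F,L)$, which is why counting signatures counts equivalence classes among the sampled words.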


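Definition 2.3 Let $\Sigma$ be an alphabet, and denote by $\mathcal{F}(\Sigma^*)$ the set of finite subsets of $\Sigma^*$. Define

\[ \gamma\colon \mathcal{F}(\Sigma^*) \times \mathcal{P}(\Sigma^*) \to \mathbb{N}, \quad (F,L) \mapsto \operatorname{ind}\Theta(F,L). \]

Given $L \subseteq \Sigma^*$, we call

\[ \gamma_L\colon \mathcal{F}(\Sigma^*) \to \mathbb{N}, \quad F \mapsto \gamma(F,L) \]

the topological complexity function of $L$. The topological entropy of a language $L \subseteq \Sigma^*$ is defined to be

\[ h(L) := \limsup_{n \to \infty} \frac{\log_2 \gamma_L(\Sigma^{(n)})}{n}, \]

where $\Sigma^{(n)}$ is the set of all words over $\Sigma$ of length at most $n$. △

Next we want to collect some obvious but useful properties of topological complexity functions.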

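Proposition 2.4 Let $\Sigma$ be a finite alphabet, let $E, F \subseteq \Sigma^*$ be finite, and let $L, L_0, L_1 \subseteq \Sigma^*$. Then

(a) $\gamma(F,\emptyset) = \gamma(F,\Sigma^*) = 1$, and thus $h(\emptyset) = h(\Sigma^*) = 0$.

(b) $\gamma(E \cup F, L) \le \gamma(E,L) \cdot \gamma(F,L)$. If $E \subseteq F$, then $\gamma(E,L) \le \gamma(F,L)$.

(c) $\gamma(F,L) = \gamma(F, \Sigma^* \setminus L)$, and hence $h(L) = h(\Sigma^* \setminus L)$.

(d) $\gamma(F, L_0 \cup L_1) \le \gamma(F,L_0) \cdot \gamma(F,L_1)$, and thus $h(L_0 \cup L_1) \le h(L_0) + h(L_1)$.

(e) $\gamma(F, L_0 \cap L_1) \le \gamma(F,L_0) \cdot \gamma(F,L_1)$, and thus $h(L_0 \cap L_1) \le h(L_0) + h(L_1)$.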

Proof Clearly $\Theta(F,\emptyset) = \Theta(F,\Sigma^*) = \Sigma^* \times \Sigma^*$ and hence $\gamma(F,\emptyset) = \gamma(F,\Sigma^*) = 1$. Secondly, because of $\Theta(E \cup F, L) = \Theta(E,L) \cap \Theta(F,L)$ we obtain $\gamma(E \cup F, L) \le \gamma(E,L) \cdot \gamma(F,L)$. In particular, if $E \subseteq F$, then $\Theta(F,L) \subseteq \Theta(E,L)$ and hence $\gamma(E,L) \le \gamma(F,L)$. Moreover, $\Theta(F,L) = \Theta(F, \Sigma^* \setminus L)$ and therefore $\gamma(F,L) = \gamma(F, \Sigma^* \setminus L)$. Finally, it is easy to check that $\Theta(F,L_0) \cap \Theta(F,L_1) \subseteq \Theta(F, L_0 \cup L_1)$. Consequently, $\gamma(F, L_0 \cup L_1) \le \gamma(F,L_0) \cdot \gamma(F,L_1)$. Utilizing the previous observations, we can furthermore conclude that

\[ \begin{aligned} \Theta(F,L_0) \cap \Theta(F,L_1) &= \Theta(F, \Sigma^* \setminus L_0) \cap \Theta(F, \Sigma^* \setminus L_1) \\ &\subseteq \Theta(F, (\Sigma^* \setminus L_0) \cup (\Sigma^* \setminus L_1)) \\ &= \Theta(F, \Sigma^* \setminus (L_0 \cap L_1)) \\ &= \Theta(F, L_0 \cap L_1) \end{aligned} \]

and therefore $\gamma(F, L_0 \cap L_1) \le \gamma(F,L_0) \cdot \gamma(F,L_1)$.

As a consequence of this proposition we immediately obtain that the class of languages with zero entropy and the class of languages with finite entropy are closed under Boolean operations. Since finite languages are easily seen to have zero entropy, the same holds for cofinite languages. In fact, all regular languages have zero entropy.

The following result gives a precise formulation of this fact, together with a proof.

Theorem 2.5 Let $\Sigma$ be an alphabet and $L \subseteq \Sigma^*$. The following are equivalent:

(a) $L$ is regular,

(b) $\gamma_L$ is bounded, and

(c) there exists some finite subset $F \subseteq \Sigma^*$ such that $\Theta(F,L) = \Theta(L)$.

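Proof (a) $\Longrightarrow$ (b). Due to Theorem 2.1, $\Theta(L)$ has finite index. Note that $\Theta(L) \subseteq \Theta(F,L)$ and hence $\gamma_L(F) \le \operatorname{ind}\Theta(L)$ for all finite $F \subseteq \Sigma^*$. Thus, $\gamma_L$ is bounded.

(b) $\Longrightarrow$ (c). Suppose that $\gamma_L$ is bounded. Then there exists some finite $F_0 \subseteq \Sigma^*$ such that $\gamma_L(F_0) = \sup\{\gamma_L(F) \mid F \subseteq \Sigma^* \text{ finite}\}$. We shall show that $\Theta(F_0,L) = \Theta(L)$. Of course, $\Theta(L) \subseteq \Theta(F_0,L)$. Let $(u,v) \in (\Sigma^* \times \Sigma^*) \setminus \Theta(L)$. By (1) there exists some finite $F_1 \subseteq \Sigma^*$ such that $(u,v) \notin \Theta(F_1,L)$. Obviously, $F_0 \cup F_1 \subseteq \Sigma^*$ is finite and $\Theta(F_0 \cup F_1, L) \subseteq \Theta(F_0,L)$. By assumption, $\gamma_L(F_0 \cup F_1) \le \gamma_L(F_0)$. Consequently, $\Theta(F_0 \cup F_1, L) = \Theta(F_0,L)$ and therefore

\[ (u,v) \in (\Sigma^* \times \Sigma^*) \setminus \Theta(F_1,L) \subseteq (\Sigma^* \times \Sigma^*) \setminus \Theta(F_0 \cup F_1, L) = (\Sigma^* \times \Sigma^*) \setminus \Theta(F_0,L). \]

This substantiates that $\Theta(F_0,L) = \Theta(L)$.

(c) $\Longrightarrow$ (a). By assumption $\Theta(L) = \Theta(F,L)$, and since $\Theta(F,L)$ has finite index, $\Theta(L)$ has finite index as well. Hence, $L$ is regular due to Theorem 2.1.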

Corollary 2.6 Let $\Sigma$ be an alphabet. If $L \subseteq \Sigma^*$ is regular, then $h(L) = 0$.
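The boundedness of the complexity function for a regular language can be observed numerically. The following Python sketch is an illustration under stated assumptions: the language of words over $\{a,b\}$ with an even number of $a$'s, the helper names, and the finite sample of prefixes are all hypothetical choices. For this language the count stays at 2, so the normalized logarithm behaves like $1/n$.

```python
from itertools import product
from math import log2

def words_up_to(alphabet, n):
    """All words over `alphabet` of length at most n, i.e. the set Sigma^(n)."""
    return ["".join(p) for k in range(n + 1) for p in product(alphabet, repeat=k)]

def gamma_lower_bound(in_L, alphabet, n, sample_len):
    """Distinct membership patterns over F = Sigma^(n), taken over sampled prefixes.

    In general this only bounds gamma_L(Sigma^(n)) from below (finite sample).
    """
    F = words_up_to(alphabet, n)
    sample = words_up_to(alphabet, sample_len)
    return len({tuple(in_L(u + w) for w in F) for u in sample})

def even_a(w):
    """Hypothetical regular example: words with an even number of a's."""
    return w.count("a") % 2 == 0

lengths = (1, 2, 3)
counts = [gamma_lower_bound(even_a, "ab", n, 4) for n in lengths]
rates = [log2(c) / n for c, n in zip(counts, lengths)]
print(counts, rates)
```

Here the membership pattern of a prefix depends only on the parity of its number of $a$'s, so the sampled count is exact for this particular language.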

The converse of this corollary does not hold, i.e., there are non-regular languages with vanishing topological entropy. To see this we shall show that the Dyck language with one sort of parentheses has zero entropy (cf. Theorems 2.12 and 2.13). We shall put the corresponding argumentation in a more general framework, by estimating the entropy of languages defined by groups. For this purpose, we recall the concept of growth in groups. Consider a finitely generated group $G$. Let $S$ be a finite symmetric generating subset of $G$ containing the neutral element. The exponential growth rate of $G$ with respect to $S$ is defined to be

\[ \operatorname{egr}(G,S) := \limsup_{n \to \infty} \frac{\log_2 |S^n|}{n}. \]

Note that this quantity is finite as $|S^n| \le |S|^n$ for every $n \in \mathbb{N}$. Furthermore,

\[ \operatorname{egr}(G,S) = \lim_{n \to \infty} \frac{\log_2 |S^n|}{n} \]

due to a well-known result by Fekete [12]. Of course, the precise value of the exponential growth rate depends upon the particular choice of a generating set.

However, if $T$ is another finite symmetric generating subset of $G$ containing the neutral element, then

\[ \frac{1}{k} \cdot \operatorname{egr}(G,T) \le \operatorname{egr}(G,S) \le l \cdot \operatorname{egr}(G,T), \]

where $k := \inf\{m \in \mathbb{N} \setminus \{0\} \mid T \subseteq S^m\}$ and $l := \inf\{m \in \mathbb{N} \setminus \{0\} \mid S \subseteq T^m\}$. This justifies the following definition: $G$ is said to have sub-exponential growth if $\operatorname{egr}(G,S) = 0$ for some (and thus any) finite symmetric generating set $S$ of $G$ containing the neutral element. The class of finitely generated groups with sub-exponential growth encompasses all finitely generated abelian groups. In fact, if $G$ is abelian, then

\[ S^n \subseteq \Bigl\{ \prod_{s \in S} s^{\alpha(s)} \Bigm| \alpha\colon S \to \{0, \dots, n\} \Bigr\} \]

and thus $|S^n| \le (n+1)^{|S|}$ for all $n \in \mathbb{N}$. Now let us return to formal languages.
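As a numerical aside on the growth notions recalled above, the following Python sketch enumerates the sets $S^n$ for two small examples: the abelian group $\mathbb{Z}^2$ and the free group on two generators. The function names and the encoding of free group elements as reduced words (uppercase letters standing for inverses) are assumptions made for this illustration.

```python
def ball_sizes(generators, identity, multiply, n_max):
    """Return [|S^0|, |S^1|, ..., |S^n_max|] for a generating set S containing the identity."""
    ball = {identity}
    sizes = [1]
    for _ in range(n_max):
        ball = {multiply(g, s) for g in ball for s in generators}
        sizes.append(len(ball))
    return sizes

# Abelian example: Z^2 with S = {0, +-e1, +-e2}.
S_abelian = [(0, 0), (1, 0), (-1, 0), (0, 1), (0, -1)]

def add(g, s):
    return (g[0] + s[0], g[1] + s[1])

abelian = ball_sizes(S_abelian, (0, 0), add, 4)

# Free group on {a, b}: elements are reduced words, uppercase letters denote inverses.
S_free = ["", "a", "A", "b", "B"]

def free_multiply(g, s):
    """Right-multiply the reduced word g by the generator s and reduce."""
    if s == "":
        return g
    if g and g[-1] == s.swapcase():
        return g[:-1]          # cancel x x^{-1}
    return g + s

free = ball_sizes(S_free, "", free_multiply, 4)
print(abelian, free)
```

The abelian counts grow polynomially and respect the bound $(n+1)^{|S|}$, whereas the free group counts roughly triple with each step, which corresponds to a positive exponential growth rate.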


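Theorem 2.7 Let $\Sigma$ be an alphabet. Let $G$ be a group, $\varphi\colon \Sigma^* \to G$ a homomorphism, $H \subseteq G$, and $E \subseteq G$ finite. Define

\[ P_\varphi(H) := \{w \in \Sigma^* \mid \forall u \text{ prefix of } w \colon \varphi(u) \in H\}, \qquad L_\varphi(H,E) := P_\varphi(H) \cap \varphi^{-1}(E). \]

Then $\gamma(F, L_\varphi(H,E)) \le |E| \cdot |\varphi(F)| + 1$ for all finite $F \subseteq \Sigma^*$. In particular,

\[ h(L_\varphi(H,E)) \le \limsup_{n \to \infty} \frac{\log_2 |\varphi(\Sigma^{(n)})|}{n} \le \log_2 |\Sigma|. \]

Furthermore, if $S$ is a finite symmetric generating subset of $G$ containing the neutral element and $k := \inf\{m \in \mathbb{N} \setminus \{0\} \mid \varphi(\Sigma) \subseteq S^m\}$, then

\[ h(L_\varphi(H,E)) \le k \cdot \operatorname{egr}(G,S). \]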

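Proof We abbreviate $P := P_\varphi(H)$ and $L := L_\varphi(H,E)$. Consider a finite subset $F \subseteq \Sigma^*$. Then $Q := E \cdot \varphi(F)^{-1}$ is a finite subset of $G$. Fix any object $\infty \notin Q$ and define $Q_\infty := Q \cup \{\infty\}$. Let us consider the map $\psi\colon \Sigma^* \to Q_\infty$ given by

\[ \psi(u) := \begin{cases} \varphi(u) & \text{if } u \in P \cap \varphi^{-1}(Q), \\ \infty & \text{otherwise} \end{cases} \qquad (u \in \Sigma^*). \]

We show $\ker\psi \subseteq \Theta(F,L)$. To this end, let $(u,v) \in \ker\psi$. We proceed by case analysis.

First case: $\psi(u) = \psi(v) \ne \infty$. Now, $u,v \in P \cap \varphi^{-1}(Q)$ and $\varphi(u) = \psi(u) = \psi(v) = \varphi(v)$. Let $w \in F$ and suppose that $uw \in L$. We show $vw \in L$. We observe that

\[ \varphi(vw) = \varphi(v)\varphi(w) = \varphi(u)\varphi(w) = \varphi(uw) \in E, \]

i.e., $vw \in \varphi^{-1}(E)$. In order to prove that $vw \in P$, let $x$ be a prefix of $vw$. If $x$ is a prefix of $v$, then $\varphi(x) \in H$ as $v \in P$. Otherwise, there exists a prefix $y$ of $w$ such that $x = vy$, and so we conclude that $\varphi(x) = \varphi(vy) = \varphi(v)\varphi(y) = \varphi(u)\varphi(y) = \varphi(uy) \in H$, because $uw \in P$ and $uy$ is a prefix of $uw$. Hence, $vw \in L$. On account of symmetry, it follows that $(u,v) \in \Theta(F,L)$.

Second case: $\psi(u) = \psi(v) = \infty$. Let $x \in \{u,v\}$. If $x \notin \varphi^{-1}(Q)$, then we conclude that $\varphi(xw) = \varphi(x)\varphi(w) \notin E$ and thus $xw \notin L$ for any $w \in F$. If $x \notin P$, then $xw \notin P$ and hence $xw \notin L$ for any $w \in F$. This proves that $\{uw, vw\} \cap L = \emptyset$ for all $w \in F$. Consequently, $(u,v) \in \Theta(F,L)$.

This substantiates that $\ker\psi \subseteq \Theta(F,L)$. Therefore

\[ \gamma(F,L) = \operatorname{ind}\Theta(F,L) \le \operatorname{ind}(\ker\psi) \le |Q_\infty| \le |Q| + 1 \le |E| \cdot |\varphi(F)| + 1. \]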

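In particular, it follows that

\[ \begin{aligned} h(L) = \limsup_{n \to \infty} \frac{\log_2 \gamma_L(\Sigma^{(n)})}{n} &\le \limsup_{n \to \infty} \frac{\log_2(|E| \cdot |\varphi(\Sigma^{(n)})| + 1)}{n} \\ &= \limsup_{n \to \infty} \frac{\log_2 |\varphi(\Sigma^{(n)})|}{n} \le \limsup_{n \to \infty} \frac{\log_2((n+1) \cdot |\Sigma|^n)}{n} = \log_2 |\Sigma|. \end{aligned} \]

Finally, suppose $S$ to be a finite symmetric generating subset of $G$ containing the neutral element. Since $\Sigma$ is finite, $M := \{m \in \mathbb{N} \setminus \{0\} \mid \varphi(\Sigma) \subseteq S^m\}$ is not empty. Let $k := \inf M$. Since $\varphi(\Sigma^{(n)}) \subseteq S^{kn}$ for every $n$, our considerations above now readily imply that

\[ h(L_\varphi(H,E)) \le \limsup_{n \to \infty} \frac{\log_2 |\varphi(\Sigma^{(n)})|}{n} \le k \cdot \limsup_{n \to \infty} \frac{\log_2 |S^n|}{n} = k \cdot \operatorname{egr}(G,S). \]

For groups whose growth is sub-exponential the previous theorem yields that the corresponding languages $L_\varphi(H,E)$ have zero entropy.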

Corollary 2.8 Let $\Sigma$ be an alphabet, let $G$ be a group with sub-exponential growth, and $\varphi\colon \Sigma^* \to G$ a homomorphism. Then for each $H \subseteq G$ and finite $E \subseteq G$, it is true that $h(L_\varphi(H,E)) = 0$.

We immediately obtain the following statement.

Corollary 2.9 Let $\Sigma$ be an alphabet, let $G$ be a finitely generated abelian group, and $\varphi\colon \Sigma^* \to G$ a homomorphism. Then for each $H \subseteq G$ and finite $E \subseteq G$, it is true that $h(L_\varphi(H,E)) = 0$.

The following corollaries are immediate consequences of Theorem 2.7 for $H = G$.

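Corollary 2.10 Let $\Sigma$ be a finite alphabet and $L \subseteq \Sigma^*$. Let $G$ be a group, $\varphi\colon \Sigma^* \to G$ a homomorphism and $E \subseteq G$ finite such that $L = \varphi^{-1}(E)$. Then $\gamma(F,L) \le |E| \cdot |\varphi(F)| + 1$ for all finite $F \subseteq \Sigma^*$. In particular,

\[ h(L) \le \limsup_{n \to \infty} \frac{\log_2 |\varphi(\Sigma^{(n)})|}{n} \le \log_2 |\Sigma|. \]

Corollary 2.11 Let $\Sigma$ be a finite alphabet and $L \subseteq \Sigma^*$. Let $G$ be an abelian group, $\varphi\colon \Sigma^* \to G$ a homomorphism and $E \subseteq G$ finite such that $L = \varphi^{-1}(E)$. Then $h(L) = 0$.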

With the previous results in place, we are now able to argue that Dyck languages have finite entropy. Recall that the Dyck language with $k$ sorts of parentheses consists of all balanced strings over $\{(_1, )_1, \dots, (_k, )_k\}$. Alternatively, we can view the Dyck language with $k$ sorts of parentheses as the set of all strings that can be reduced to the empty word by successively eliminating matching pairs of parentheses.

We can formalize this as follows. Let $\Sigma$, $\overline{\Sigma}$ be two alphabets, $\Delta := \Sigma \cup \overline{\Sigma}$, and let $\kappa\colon \Sigma \to \overline{\Sigma}$ be a bijection. Consider the free group $F(\Sigma)$ with generator set $\Sigma$, and denote by $\varphi\colon \Delta^* \to F(\Sigma)$ the unique homomorphism satisfying $\varphi(a) = a$ and $\varphi(\kappa(a)) = a^{-1}$ for all $a \in \Sigma$. Define

\[ D(\kappa) := \{w \in \Delta^* \mid \varphi(w) = e \wedge (\forall u \text{ prefix of } w\ \forall a \in \Sigma \colon |u|_a \ge |u|_{\kappa(a)})\}. \]

If $\Sigma = \{(_1, \dots, (_k\}$, $\overline{\Sigma} = \{)_1, \dots, )_k\}$, and $\kappa((_i) = )_i$, then the set $D(\kappa)$ coincides with the Dyck language with $k$ sorts of parentheses.


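Theorem 2.12 Let $\kappa\colon \Sigma \to \overline{\Sigma}$ be a bijection between finite sets. Then

\[ \log_2 |\Sigma| \le h(D(\kappa)) \le \operatorname{egr}(F(\Sigma), S) \]

for $S := \Sigma \cup \Sigma^{-1} \cup \{e\}$, where $e$ denotes the neutral element of $F(\Sigma)$.

Proof Put $L := D(\kappa)$. We first show $\operatorname{ind}\Theta(\Delta^{(n)}, L) \ge |\Sigma^n|$, since this implies $\log_2 |\Sigma| \le h(D(\kappa))$. For this let $u, v \in \Sigma^n$, $u \ne v$. Define $\kappa(u) := \kappa(u_{|u|}) \cdots \kappa(u_1)$, where $u = u_1 \cdots u_{|u|}$. Then $u \cdot \kappa(u) \in L$, but $v \cdot \kappa(u) \notin L$. Thus $(u,v) \notin \Theta(\Delta^{(n)}, L)$ and therefore $\operatorname{ind}\Theta(\Delta^{(n)}, L) \ge |\Sigma^n|$ as required.

For the second inequality let us consider the unique homomorphism $\psi\colon F(\Sigma) \to \mathbb{Z}^\Sigma$ satisfying

\[ \psi(b)(a) := \begin{cases} 1 & \text{if } a = b, \\ 0 & \text{otherwise} \end{cases} \qquad (a, b \in \Sigma). \]

We observe $D(\kappa) = L_\varphi(\psi^{-1}(\mathbb{N}^\Sigma), \{e\})$, where the mapping $\varphi$ is as above. Since $\varphi(\Delta) \subseteq S$, we have $h(D(\kappa)) \le \operatorname{egr}(F(\Sigma), S)$ by Theorem 2.7.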

The reason that Dyck languages with more than one type of parentheses have positive entropy is the requirement that in a word $w$ the different types of parentheses need to be mutually balanced, i.e., $\varphi(w) = e$. Indeed, if we replace this requirement by the weaker condition that each opening parenthesis has to be closed eventually, then we obtain a class of languages with zero entropy.

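Theorem 2.13 Let $\kappa\colon \Sigma \to \overline{\Sigma}$ be a bijection between finite sets, let $\Delta := \Sigma \cup \overline{\Sigma}$, and consider the language

\[ D'(\kappa) := \{w \in \Delta^* \mid \forall a \in \Sigma \colon (|w|_a = |w|_{\kappa(a)}) \wedge (\forall u \text{ prefix of } w \colon |u|_a \ge |u|_{\kappa(a)})\}. \]

Then $h(D'(\kappa)) = 0$.

Proof Let us consider the homomorphism $\varphi\colon \Delta^* \to \mathbb{Z}^\Sigma$ given by

\[ \varphi(w)(a) := |w|_a - |w|_{\kappa(a)} \qquad (w \in \Delta^*,\ a \in \Sigma). \]

We observe that $D'(\kappa) = L_\varphi(\mathbb{N}^\Sigma, \{0\})$, wherefore $h(D'(\kappa)) = 0$ by Corollary 2.9.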
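The counting condition defining $D'(\kappa)$ can be sketched in Python as follows. The function name `in_d_prime` and the representation of $\kappa$ as a dictionary from opening to closing symbols are assumptions made for this illustration. The sketch tracks the vector $\varphi(u) \in \mathbb{Z}^\Sigma$ along the prefixes $u$ of a word.

```python
def in_d_prime(word, kappa):
    """Membership in D'(kappa): per-sort counters never drop below zero on a prefix
    and all equal zero at the end of the word.

    `kappa` maps each opening symbol to its closing symbol.
    """
    closing_to_opening = {close: open_ for open_, close in kappa.items()}
    balance = {open_: 0 for open_ in kappa}   # phi(prefix) as a vector in Z^Sigma
    for symbol in word:
        if symbol in kappa:
            balance[symbol] += 1
        elif symbol in closing_to_opening:
            balance[closing_to_opening[symbol]] -= 1
        else:
            return False                      # symbol outside Delta
        if any(value < 0 for value in balance.values()):
            return False                      # some prefix leaves N^Sigma
    return all(value == 0 for value in balance.values())

kappa = {"(": ")", "[": "]"}
print(in_d_prime("([)]", kappa), in_d_prime(")(", kappa))
```

The word `([)]` is accepted although its parentheses are not properly nested, which reflects that $D'(\kappa)$ only counts each sort of parenthesis separately.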

Note that for $|\Sigma| = 1$ we have $D(\kappa) = D'(\kappa)$ and hence $h(D(\kappa)) = 0$, and thus we obtain an example of a language with zero entropy that is not regular. Other non-regular languages with vanishing entropy are discussed in the following examples.

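Example 2.14 Let $\Sigma$ be an alphabet.

(a) Let $m \in \mathbb{N}$ and $a, b \in \Sigma$, $a \ne b$. Then $L := \{w \in \Sigma^* \mid |w|_a = |w|_b + m\}$ is not regular. However, $h(L) = 0$ by Corollary 2.11. To see this, note that the mapping $\varphi\colon \Sigma^* \to \mathbb{Z}$, $w \mapsto |w|_a - |w|_b$ constitutes a homomorphism where $L = \varphi^{-1}(\{m\})$.

(b) Suppose that $\Sigma = \{a,b,c\}$. Then $L := \{a^n b^n c^n \mid n \in \mathbb{N}\}$ is not context-free, but $h(L) = 0$. To see this we show that $\Theta = \Theta(\Sigma^{(n)}, L)$ has the equivalence classes

\[ \begin{aligned} &[c^k]_\Theta, && k \le n/2, \\ &[b^\ell c^k]_\Theta, && 1 \le \ell \le k,\ 2k - \ell \le n, \\ &[a^\ell b^k c^k]_\Theta, && 1 \le \ell \le k,\ k - \ell \le n, \\ &[b]_\Theta. \end{aligned} \tag{2} \]

From this it follows that $\operatorname{ind}\Theta(\Sigma^{(n)}, L) \in O(n^2)$, and thus $h(L) = 0$.

To see that the sets in (2) are indeed all equivalence classes of $\Theta(\Sigma^{(n)}, L)$, let $w \in \Sigma^*$ be such that $w$ is not an element of the first three types of classes in (2). We need to show that then $w \in [b]_{\Theta(\Sigma^{(n)}, L)}$. We do this by showing that there is no $u \in \Sigma^{(n)}$ such that $uw \in L$.

Assume by contradiction that such a word $u$ exists. Then $u$ must be of one of the following forms:

\[ \begin{aligned} u &= a^k b^k c^\ell, && 2k + \ell \le n,\ 0 \le \ell \le k, \\ u &= a^k b^\ell, && k + \ell \le n,\ 0 \le \ell < k, \\ u &= a^\ell, && 0 \le \ell < n. \end{aligned} \]

If $u = a^k b^k c^\ell$, $2k + \ell \le n$, $0 \le \ell \le k$, then $w = c^{k-\ell}$, $k - \ell \le n/2$, and therefore $w \in [c^{k-\ell}]_{\Theta(\Sigma^{(n)}, L)}$, a contradiction. If $u = a^k b^\ell$, $k + \ell \le n$, $0 \le \ell < k$, then $w = b^{k-\ell} c^k$, and $k - \ell > 0$, $2k - (k - \ell) \le n$, thus $w \in [b^{k-\ell} c^k]_{\Theta(\Sigma^{(n)}, L)}$, again a contradiction. If $u = a^\ell$, then $w = a^{k-\ell} b^k c^k$, and $k - (k - \ell) \le n$, so $w \in [a^{k-\ell} b^k c^k]_{\Theta(\Sigma^{(n)}, L)}$, another contradiction.

Thus, our assumption that $u$ exists is false. The same is true for the word $b$, and thus $w \in [b]_{\Theta(\Sigma^{(n)}, L)}$, as required.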

⋄
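Part (a) of Example 2.14 can be checked on a small instance. The following Python sketch is an illustration under stated assumptions: the alphabet $\{a,b\}$, the values $m = 1$ and $n = 2$, the helper names, and the finite sample of prefixes are all hypothetical choices. It compares the number of observed classes for $F = \Sigma^{(n)}$ with the bound $|E| \cdot |\varphi(F)| + 1$ from Corollary 2.10, where $E = \{m\}$.

```python
from itertools import product

def words_up_to(alphabet, n):
    """All words over `alphabet` of length at most n, i.e. the set Sigma^(n)."""
    return ["".join(p) for k in range(n + 1) for p in product(alphabet, repeat=k)]

def phi(w):
    """The homomorphism Sigma^* -> Z, w -> |w|_a - |w|_b."""
    return w.count("a") - w.count("b")

m, n = 1, 2

def in_L(w):
    """L = phi^{-1}({m})."""
    return phi(w) == m

F = words_up_to("ab", n)                # F = Sigma^(n)
sample = words_up_to("ab", n + m + 1)   # finite sample of prefixes (an assumption)

# Distinct membership patterns among the sampled prefixes.
classes = len({tuple(in_L(u + w) for w in F) for u in sample})
# The bound |E| * |phi(F)| + 1 with E = {m}.
bound = 1 * len({phi(w) for w in F}) + 1
print(classes, bound)
```

In this instance the bound is attained: the class of a prefix $u$ is determined by $\varphi(u)$ as long as $m - \varphi(u)$ lies in $\varphi(F)$, and all remaining prefixes fall into one further class.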

The following example shows that there are natural examples of “simple” languages whose entropy is not zero, but still finite.

Example 2.15 Suppose $|\Sigma| \ge 2$. Then the palindrome language $L := \{ww^R \mid w \in \Sigma^*\}$ is not regular, but context-free, and $h(L) \in (0,\infty)$.

To see $h(L) > 0$, observe that for each $n \in \mathbb{N}$ and all $u, v \in \Sigma^n$, if $(u,v) \in \Theta(\Sigma^{(n)}, L)$, then $u = v$. Indeed, since $vv^R \in L$, we also have $uv^R \in L$, and hence $u = v$. Thus

\[ [u]_{\Theta(\Sigma^{(n)}, L)} \ne [v]_{\Theta(\Sigma^{(n)}, L)} \qquad (u \ne v). \]

Thus $\operatorname{ind}\Theta(\Sigma^{(n)}, L) \ge |\Sigma^n| = |\Sigma|^n$, and we obtain

\[ h(L) \ge \limsup_{n \to \infty} \frac{\log_2 |\Sigma|^n}{n} = \log_2 |\Sigma| > 0. \]
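The separation argument for the lower bound can be verified exhaustively for small parameters. The following Python sketch uses hypothetical helper names and the assumed alphabet $\{a,b\}$ with $n = 3$. It checks that for distinct words $u, v$ of length $n$ the suffix $u^R$ separates them.

```python
from itertools import product

def is_even_palindrome(w):
    """Membership in L = {x x^R : x in Sigma^*}."""
    return len(w) % 2 == 0 and w == w[::-1]

def separated_words(alphabet, n):
    """Check that the suffix u^R separates u from every other word v of length n.

    Returns the number of words of length n, all of which are pairwise inequivalent.
    """
    words = ["".join(p) for p in product(alphabet, repeat=n)]
    for u in words:
        assert is_even_palindrome(u + u[::-1])          # u u^R lies in L
        for v in words:
            if u != v:
                assert not is_even_palindrome(v + u[::-1])  # v u^R does not
    return len(words)

count = separated_words("ab", 3)
print(count)
```

Since every separating suffix has length $n$, it belongs to $\Sigma^{(n)}$, so the returned count is a lower bound for $\operatorname{ind}\Theta(\Sigma^{(n)}, L)$.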


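To see $h(L) < \infty$ we shall consider the relation $\Theta^*$ defined by

\[ (u,v) \in \Theta^* \iff (u,v) \in \Theta(\Sigma^{(n)}, L) \text{ and } (|u| \le n \iff |v| \le n). \]

Then $\operatorname{ind}\Theta(\Sigma^{(n)}, L) \le \operatorname{ind}\Theta^*$. We shall show

\[ \limsup_{n \to \infty} \frac{\log_2(\operatorname{ind}\Theta^*)}{n} < \infty. \]

There are at most $|\Sigma^{(n)}|$ many equivalence classes $[u]_{\Theta^*}$ for $u \in \Sigma^*$, $|u| \le n$. To count the number of equivalence classes for $|u| > n$ we define

\[ \ell_n(u) := \{a_1 \cdots a_i \mid 1 \le i \le n,\ a_1, \dots, a_i \in \Sigma,\ u = a_1 \cdots a_i u',\ u' \in L\}. \]

Then for $u, v \in \Sigma^* \setminus \Sigma^{(n)}$ we have

\[ (u,v) \in \Theta^* \iff (u,v) \in \Theta(\Sigma^{(n)}, L) \iff \ell_n(u) = \ell_n(v). \tag{3} \]

The first equivalence is clear. To see the second equivalence let $(u,v) \in \Theta(\Sigma^{(n)}, L)$, and let $a_1 \cdots a_i \in \ell_n(u)$. By definition of $\ell_n(u)$ it is then true that

\[ u (a_1 \cdots a_i)^R \in L. \]

Because $(u,v) \in \Theta(\Sigma^{(n)}, L)$ we therefore obtain

\[ v (a_1 \cdots a_i)^R \in L, \]

i.e., $v$ is of the form $v = a_1 \cdots a_i v'$ for some $v' \in L$. This yields $a_1 \cdots a_i \in \ell_n(v)$. By symmetry we obtain $\ell_n(u) = \ell_n(v)$ as required.

Conversely, assume $\ell_n(u) = \ell_n(v)$, and let $w \in \Sigma^{(n)}$ be such that $uw \in L$. Because $|u| \ge n$, there exists $u' \in L$ with $uw = w^R u' w$. Then $w^R \in \ell_n(u) = \ell_n(v)$, and therefore $v = w^R v'$ for some $v' \in L$. But then $vw \in L$. By symmetry, $vw \in L \Longrightarrow uw \in L$ for each $w \in \Sigma^{(n)}$, and therefore $(u,v) \in \Theta(\Sigma^{(n)}, L)$, as required.

Using the characterization from Equation (3) we have

\[ |(\Sigma^* \setminus \Sigma^{(n)})/\Theta^*| = |\{\ell_n(u) \mid u \in \Sigma^* \setminus \Sigma^{(n)}\}|. \]

Now every set $\ell_n(u)$ with $u = u_1 \cdots u_k$, $k \ge n$, can be represented by the prefix $u_1 \cdots u_n$ of $u$ of length $n$ together with a tuple $t \in \{0,1\}^n$ defined by

\[ t_i = 1 \iff u_1 \cdots u_i \in \ell_n(u). \]

Therefore,

\[ |(\Sigma^* \setminus \Sigma^{(n)})/\Theta^*| = |\{\ell_n(u) \mid u \in \Sigma^* \setminus \Sigma^{(n)}\}| \le |\Sigma|^n \cdot 2^n. \]

This yields

\[ \operatorname{ind}\Theta^* = |\Sigma^{(n)}/\Theta^*| + |(\Sigma^* \setminus \Sigma^{(n)})/\Theta^*| \le |\Sigma^{(n)}| + |\Sigma|^n \cdot 2^n, \]

and thus

\[ \limsup_{n \to \infty} \frac{\log_2(\operatorname{ind}\Theta^*)}{n} \le \log_2(2|\Sigma|) < \infty. \]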

⋄

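It is not hard to see that the entropy of a formal language can very well be infinite. This is illustrated by the following example.

Example 2.16 Let $|\Sigma| \ge 2$, and choose mappings $\varphi_n\colon \Sigma^{2^n} \to \mathcal{P}(\Sigma^n)$ for each $n \in \mathbb{N}$ such that $|\operatorname{im}(\varphi_n)| = |\Sigma|^{2^n} = 2^{2^n}$. Then define a language $L \subseteq \Sigma^*$ by

\[ L \cap \Sigma^m := \begin{cases} \{uv \mid u \in \Sigma^{2^n},\ v \in \varphi_n(u)\} & \text{if } m = 2^n + n \text{ for some } n \in \mathbb{N}, \\ \emptyset & \text{otherwise.} \end{cases} \]

Then $2^{2^n} \le \gamma_L(\Sigma^n)$, i.e.,

\[ 2^{2^n} \le \operatorname{ind}\Theta(\Sigma^n, L). \tag{4} \]

To see this we shall show that each set $\varphi_n(u)$ defines its own equivalence class, i.e., for words $u_0, u_1 \in \Sigma^{2^n}$ with $\varphi_n(u_0) \ne \varphi_n(u_1)$ we have $(u_0,u_1) \notin \Theta(\Sigma^n, L)$. This is because if $\varphi_n(u_0) \ne \varphi_n(u_1)$ we can assume without loss of generality that there exists some word $v \in \varphi_n(u_0) \setminus \varphi_n(u_1)$. By definition of $L$ we then have $u_0 v \in L$, but since $|u_1 v| = 2^n + n$ and $v \notin \varphi_n(u_1)$ we also get $u_1 v \notin L$. Thus $(u_0,u_1) \notin \Theta(\Sigma^n, L)$.

Since $\gamma_L(\Sigma^{(n)}) \ge \gamma_L(\Sigma^n)$ by Proposition 2.4(b), (4) implies

\[ \limsup_{n \to \infty} \frac{\log_2 \gamma_L(\Sigma^{(n)})}{n} \ge \limsup_{n \to \infty} \frac{\log_2 2^{2^n}}{n} = \infty, \]

and thus $h(L) = \infty$.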

⋄

3 Entropy of semigroup actions and topological automata

Regular languages are exactly those accepted by finite automata. For some non-regular languages, there exist similar characterizations in terms of finite state machines, e.g., pushdown automata, linearly bounded Turing machines, and Turing machines. However, there is a different approach in terms of topological automata [15], which in contrast to finite automata have an infinite state set which itself is equipped with a compact Hausdorff topology. In this case, every language $L$ is accepted by a topological automaton, and one can ask whether the topological entropy of $L$ can be expressed in terms of a topological automaton accepting it. In this section we shall show that this question has a positive answer.

The concept of topological automata arises from the observation that the transition function of a finite automaton gives rise to a monoid action of the set $\Sigma^*$ of all words over $\Sigma$ on the set of states. Recall that an action of a monoid $S$ on a set $X$ is a mapping $\alpha\colon X \times S \to X$ such that

\[ \alpha(x, e_S) = x, \qquad \alpha(x, st) = \alpha(\alpha(x,s), t) \]

for all $x \in X$, $s, t \in S$.

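Recall that a (deterministic) automaton over an alphabet $\Sigma$ is a tuple $\mathcal{A} = (Q, \Sigma, \delta, q_0, F)$ consisting of a finite set $Q$ of states, a transition function $\delta\colon Q \times \Sigma \to Q$, a set $F \subseteq Q$ of final states, and an initial state $q_0 \in Q$. The transition function is usually extended to the set of all words over $\Sigma$ by virtue of

\[ \delta^*(q, \varepsilon) := q, \qquad \delta^*(q, wa) := \delta(\delta^*(q,w), a) \]

for $q \in Q$, $a \in \Sigma$, and $w \in \Sigma^*$. It is not difficult to see that in this case $\delta^*$ is a monoid action of $\Sigma^*$ on $Q$. The language accepted by $\mathcal{A}$ is then

\[ L(\mathcal{A}) := \{w \in \Sigma^* \mid \delta^*(q_0, w) \in F\}. \]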
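The extension of the transition function to words and the resulting action law can be sketched in Python. The automaton below (words over $\{a,b\}$ with an even number of $a$'s), the state names, and the helper names are assumptions made for this illustration.

```python
from functools import reduce

def extend(delta):
    """Extended transition function delta*(q, w), folding delta over the letters of w."""
    return lambda q, w: reduce(lambda state, a: delta[(state, a)], w, q)

# Hypothetical automaton over {a, b} accepting words with an even number of a's.
delta = {
    ("even", "a"): "odd", ("even", "b"): "even",
    ("odd", "a"): "even", ("odd", "b"): "odd",
}
delta_star = extend(delta)
initial, final = "even", {"even"}

def accepts(w):
    """w is accepted iff delta*(q0, w) is a final state."""
    return delta_star(initial, w) in final

print(accepts("abab"), accepts("ab"))
```

The fold makes the action law $\delta^*(q, uv) = \delta^*(\delta^*(q,u), v)$ visible: processing $uv$ from $q$ is the same as processing $v$ from the state reached after $u$.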

We can extend the notion of deterministic finite automata to an infinite state set as follows [15].

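Definition 3.1 A topological automaton over an alphabet $\Sigma$ is a tuple $\mathcal{A} = (X, \Sigma, \alpha, x_0, F)$ consisting of

• a compact Hausdorff space $X$, called the set of states of $\mathcal{A}$,

• a continuous action $\alpha$ of $\Sigma^*$ on $X$, called the transition function of $\mathcal{A}$,

• a point $x_0 \in X$, called the initial state of $\mathcal{A}$, and

• a clopen subset $F \subseteq X$, called the set of final states of $\mathcal{A}$.

We say that $\mathcal{A}$ is trim if $\alpha(x_0, \Sigma^*)$ is dense in $X$. The language recognized by $\mathcal{A}$ is defined as

\[ L(\mathcal{A}) := \{w \in \Sigma^* \mid \alpha(x_0, w) \in F\}. \]

Let $\mathcal{B} = (Y, \Sigma, \beta, y_0, G)$ be another topological automaton. We shall say that $\mathcal{A}$ and $\mathcal{B}$ are isomorphic, and write $\mathcal{A} \cong \mathcal{B}$, if there exists a homeomorphism $\varphi\colon X \to Y$ such that

\[ \varphi(\alpha(x,\sigma)) = \beta(\varphi(x), \sigma) \]

for all $x \in X$, $\sigma \in \Sigma$, $\varphi(x_0) = y_0$, and $\varphi(F) = G$. △

Evidently, isomorphic automata accept the same language.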

Observe that every automaton accepting $L$ can be turned into an automaton that is trim: if $\mathcal{A} = (X, \Sigma, \alpha, x_0, F)$ is a topological automaton accepting $L$, then replacing $X$ with the closure $\overline{\alpha(x_0, \Sigma^*)}$ and $F$ with $F \cap \overline{\alpha(x_0, \Sigma^*)}$ always yields a trim automaton accepting the same language $L$.

As already stated, and in contrast to regular languages, every formal language $L \subseteq \Sigma^*$ is accepted by a topological automaton, cf. [15, Proposition 2.1].

Proposition 3.2 Let $L \subseteq \Sigma^*$ and $\chi_L$ the characteristic function of $L$. Equip $X := \{0,1\}^{\Sigma^*}$ with the product topology, and define the mapping $\delta\colon X \times \Sigma^* \to X$ by

\[ \delta(f,u)(v) := f(uv). \]

Then $L$ is accepted by the topological automaton $(X, \Sigma, \delta, \chi_L, T)$ for $T := \{f \in X \mid f(\varepsilon) = 1\}$.
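The automaton of Proposition 3.2 can be imitated in Python by representing elements of $\{0,1\}^{\Sigma^*}$ as functions on strings. This is a sketch under stated assumptions: the names `shift` and `accepts` and the choice of the even-length palindromes as example language are hypothetical, and the topology plays no role in the code.

```python
from itertools import product

def shift(f, u):
    """The action delta(f, u): the function v -> f(uv)."""
    return lambda v: f(u + v)

def accepts(chi_L, w):
    """Run (X, Sigma, delta, chi_L, T) on w: accept iff delta(chi_L, w)(epsilon) = 1."""
    return shift(chi_L, w)("") == 1

def chi_L(w):
    """Hypothetical example: characteristic function of the even-length palindromes."""
    return int(len(w) % 2 == 0 and w == w[::-1])

words = ["".join(p) for k in range(5) for p in product("ab", repeat=k)]
agrees = all(accepts(chi_L, w) == bool(chi_L(w)) for w in words)
print(agrees)
```

Acceptance reduces to evaluating $\chi_L$ at $w$, since $\delta(\chi_L, w)(\varepsilon) = \chi_L(w)$. This is why the construction works for every language, regular or not.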


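With the notation of Proposition 3.2, we define the minimal automaton of $L$ to be

\[ \mathcal{A}_L = \bigl(\overline{\chi_L(\Sigma^*)}, \Sigma, \delta, \chi_L, T_L\bigr), \]

where $\overline{\chi_L(\Sigma^*)}$ is the closure of $\chi_L(\Sigma^*)$ in $\{0,1\}^{\Sigma^*}$, and $T_L = T \cap \overline{\chi_L(\Sigma^*)}$. Clearly, $\mathcal{A}_L$ is trim. Indeed we have the following fact, cf. [15, Theorem 2.2].

Proposition 3.3 Let $L \subseteq \Sigma^*$, and let $\mathcal{A} = (X, \Sigma, \delta, x_0, F)$ be a topological automaton accepting $L$. Then $\mathcal{A} \cong \mathcal{A}_L$ if and only if for every automaton $\mathcal{B} = (Y, \Sigma, \lambda, y_0, G)$ accepting $L$ there exists a uniquely determined surjective continuous function

\[ \varphi\colon Y \to X \]

satisfying $\varphi(\lambda(y,\sigma)) = \delta(\varphi(y), \sigma)$, $\varphi(y_0) = x_0$, and $G = \varphi^{-1}(F)$.

Since $\mathcal{A}_L \cong \mathcal{A}_L$, this proposition immediately yields that the minimal automaton is indeed minimal in the above sense. Moreover, in the case that $L$ is regular, $\mathcal{A}_L$ is finite and is the usual minimal automaton of regular languages.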

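Example 3.4 Let $\Sigma$ be a finite alphabet and let $a, b \in \Sigma$, $a \ne b$. We consider the Alexandroff compactification $\mathbb{Z}_\infty$ of the discrete space of integers $\mathbb{Z}$, that is, the set $\mathbb{Z}_\infty = \mathbb{Z} \cup \{\infty\}$ equipped with the topology

\[ \{M \subseteq \mathbb{Z} \cup \{\infty\} \mid \infty \in M \Longrightarrow \mathbb{Z} \setminus M \text{ is finite}\}. \]

We define an action $\alpha$ of $\Sigma^*$ on $\mathbb{Z}_\infty$ by setting $\alpha(m,a) = m + 1$, $\alpha(m,b) = m - 1$ and $\alpha(m,c) = m$ for all $m \in \mathbb{Z}_\infty$ and $c \in \Sigma \setminus \{a,b\}$. Then $\alpha$ constitutes a continuous action of $\Sigma^*$ on $\mathbb{Z}_\infty$, and for each $n \in \mathbb{N}$ the topological automaton $\mathcal{A} = (\mathbb{Z}_\infty, \Sigma, \alpha, 0, \{n\})$ accepts the language $L = \{w \in \Sigma^* \mid |w|_a = |w|_b + n\}$.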

⋄
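The automaton of Example 3.4 can be simulated directly. In the following Python sketch the added point $\infty$ is represented by `math.inf`, and it is an assumption of this illustration that $\infty \pm 1 = \infty$, which floating-point arithmetic happens to satisfy. The names `alpha` and `accepts` and the alphabet $\{a,b,c\}$ are hypothetical choices.

```python
import math
from itertools import product

INF = math.inf  # stands for the added point of the compactification (assumption)

def alpha(m, w):
    """Action of the word w on a state m: a adds 1, b subtracts 1, other letters do nothing."""
    for letter in w:
        if letter == "a":
            m = m + 1
        elif letter == "b":
            m = m - 1
    return m

def accepts(w, n):
    """The automaton with initial state 0 and final states {n}."""
    return alpha(0, w) == n

words = ["".join(p) for k in range(6) for p in product("abc", repeat=k)]
agrees = all(accepts(w, 2) == (w.count("a") == w.count("b") + 2) for w in words)
print(agrees)
```

Starting from 0, the state after reading $w$ is $|w|_a - |w|_b$, so the point $\infty$ is never reached. It is needed only to make the state space compact.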

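We now shall express the topological complexity of the language $L$ accepted by a topological automaton $\mathcal{A} = (Q, \Sigma, \alpha, x_0, F)$ by the topological entropy of the continuous action $\alpha$ of $\Sigma^*$ on $Q$ [1, 8, 17]. To this end, we shall first fix some useful notation and recall some important definitions about continuous actions on compact Hausdorff spaces.

Let $X$ again be a compact Hausdorff space. We shall denote by $\mathcal{C}(X)$ the set of all finite open covers of $X$. If $f\colon X \to X$ is continuous and $\mathcal{U} \in \mathcal{C}(X)$, then $f^{-1}(\mathcal{U}) := \{f^{-1}(U) \mid U \in \mathcal{U}\}$ is a finite open cover of $X$ as well. Given $\mathcal{U}, \mathcal{V} \in \mathcal{C}(X)$, we say that $\mathcal{V}$ refines $\mathcal{U}$ and write $\mathcal{U} \preceq \mathcal{V}$ if

\[ \forall V \in \mathcal{V}\ \exists U \in \mathcal{U} \colon V \subseteq U, \]

and we say that $\mathcal{U}$ and $\mathcal{V}$ are refinement-equivalent and write $\mathcal{U} \equiv \mathcal{V}$ if $\mathcal{U} \preceq \mathcal{V}$ and $\mathcal{V} \preceq \mathcal{U}$. Furthermore, if $(\mathcal{U}_i \mid i \in I)$ is a finite family of finite open covers of $X$, then

\[ \bigvee_{i \in I} \mathcal{U}_i := \Bigl\{ \bigcap_{i \in I} U_i \Bigm| (U_i)_{i \in I} \in \prod_{i \in I} \mathcal{U}_i \Bigr\} \]

is a finite open cover of $X$ as well. For $\mathcal{U} \in \mathcal{C}(X)$ let

\[ N(\mathcal{U}) := \inf\Bigl\{ |\mathcal{V}| \Bigm| \mathcal{V} \subseteq \mathcal{U},\ X = \bigcup \mathcal{V} \Bigr\}. \]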

In preparation for some later considerations, let us recall the following basic observations.

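Remark 3.5 ([1]) Let $X$ be a compact Hausdorff space, $\mathcal{U}, \mathcal{V} \in \mathcal{C}(X)$, $I$ be a finite set, $(\mathcal{U}_i)_{i \in I}, (\mathcal{V}_i)_{i \in I} \in \mathcal{C}(X)^I$, and $f\colon X \to X$ be a continuous map. Then the following statements hold:

(1) $\mathcal{U} \preceq \mathcal{V} \Longrightarrow N(\mathcal{U}) \le N(\mathcal{V})$,

(2) $\mathcal{U} \preceq \mathcal{V} \Longrightarrow f^{-1}(\mathcal{U}) \preceq f^{-1}(\mathcal{V})$,

(3) $(\forall i \in I \colon \mathcal{U}_i \preceq \mathcal{V}_i) \Longrightarrow \bigvee_{i \in I} \mathcal{U}_i \preceq \bigvee_{i \in I} \mathcal{V}_i$.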

⋄
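On a finite discrete space every cover is a finite open cover, so the notions of join, refinement and $N(\mathcal{U})$ can be explored by brute force. The following Python sketch uses hypothetical names (`join`, `N`, `refines`) and a hypothetical six-point space; covers are represented as collections of frozensets.

```python
from itertools import combinations, product

def join(*covers):
    """Common refinement: all intersections with one member taken from each cover."""
    return {frozenset.intersection(*choice) for choice in product(*covers)}

def N(cover, X):
    """Minimal cardinality of a subfamily of `cover` that still covers X (brute force)."""
    members = list(cover)
    for size in range(1, len(members) + 1):
        for sub in combinations(members, size):
            if frozenset().union(*sub) >= X:
                return size
    raise ValueError("not a cover of X")

def refines(V, U):
    """True if every member of V is contained in some member of U."""
    return all(any(v <= u for u in U) for v in V)

X = frozenset(range(6))
U = [frozenset({0, 1, 2, 3}), frozenset({2, 3, 4, 5}), frozenset({0, 5})]
V = [frozenset({0, 1, 2}), frozenset({3, 4, 5}), frozenset({1, 2, 3, 4})]
W = join(U, V)
print(N(U, X), N(V, X), N(W, X))
```

The join refines both covers, so statement (1) of Remark 3.5 gives $N(\mathcal{U}) \le N(\mathcal{U} \vee \mathcal{V})$. The search in `N` is exponential in the size of the cover and is meant only for such toy examples.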

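Now we come to dynamical systems, i.e., continuous semigroup actions. Let $S$ be a semigroup and consider a continuous action $\alpha\colon X \times S \to X$ of $S$ on $X$, that is, $\alpha$ is supposed to be an action of $S$ on $X$ where $\alpha_s\colon X \to X$, $\alpha_s(x) = \alpha(x,s)$ is continuous for every $s \in S$. For $\mathcal{U} \in \mathcal{C}(X)$ we write

\[ s^{-1}(\mathcal{U}) := \alpha_s^{-1}(\mathcal{U}). \]

For every finite $F \subseteq S$ and $\mathcal{U} \in \mathcal{C}(X)$ let

\[ (F : \mathcal{U})_\alpha := N\Bigl( \bigvee_{s \in F} s^{-1}(\mathcal{U}) \Bigr). \]

Assume $F$ to be a finite generating subset of $S$. If $\mathcal{U}$ is a finite open cover of $X$, then we define

\[ \eta(\alpha, F, \mathcal{U}) := \limsup_{n \to \infty} \frac{\log_2 (F^n : \mathcal{U})_\alpha}{n}. \]

Furthermore, the topological entropy of $\alpha$ with respect to $F$ is defined to be the quantity

\[ \eta(\alpha, F) := \sup\{\eta(\alpha, F, \mathcal{U}) \mid \mathcal{U} \in \mathcal{C}(X)\}. \]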

Of course, the precise value of this quantity depends on the choice of a finite generating system. However, we observe the following fact.

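Proposition 3.6 Let $S$ be a semigroup and let $\alpha$ be a continuous action of $S$ on some compact Hausdorff space $X$. Suppose $E, F \subseteq S$ to be finite subsets generating $S$. Then

\[ \frac{1}{m} \cdot \eta(\alpha, F) \le \eta(\alpha, E) \le n \cdot \eta(\alpha, F), \]

where $m := \inf\{k \in \mathbb{N} \mid F \subseteq E^k\}$ and $n := \inf\{k \in \mathbb{N} \mid E \subseteq F^k\}$.

Proof Let $\mathcal{U} \in \mathcal{C}(X)$. Evidently, $(E^k : \mathcal{U})_\alpha \le (F^{kn} : \mathcal{U})_\alpha$ for all $k \in \mathbb{N}$, whence

\[ \begin{aligned} \eta(\alpha, E, \mathcal{U}) = \limsup_{k \to \infty} \frac{\log_2 (E^k : \mathcal{U})_\alpha}{k} &\le \limsup_{k \to \infty} \frac{\log_2 (F^{kn} : \mathcal{U})_\alpha}{k} \\ &= n \limsup_{k \to \infty} \frac{\log_2 (F^{kn} : \mathcal{U})_\alpha}{kn} \le n \limsup_{k \to \infty} \frac{\log_2 (F^k : \mathcal{U})_\alpha}{k} = n \cdot \eta(\alpha, F, \mathcal{U}). \end{aligned} \]

Thus, $\eta(\alpha, E, \mathcal{U}) \le n \cdot \eta(\alpha, F, \mathcal{U})$. This shows that $\eta(\alpha, E) \le n \cdot \eta(\alpha, F)$. Due to symmetry, it follows that $\eta(\alpha, F) \le m \cdot \eta(\alpha, E)$ as well.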

We shall now show that the entropy of a formal language is bounded from above by the entropy of any topological automaton accepting it. In the case that the automaton is trim, these two notions even coincide.

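Theorem 3.7 Suppose $\mathcal{A} = (X, \Sigma, \alpha, x_0, F)$ to be a topological automaton. Consider $S := \Sigma \cup \{\varepsilon\}$ and $\mathcal{U} := \{F, X \setminus F\}$. Then $h(L(\mathcal{A})) \le \eta(\alpha, S, \mathcal{U})$. If $\mathcal{A}$ is trim, then $h(L(\mathcal{A})) = \eta(\alpha, S, \mathcal{U})$.

We prove this theorem with the following three auxiliary statements.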

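Lemma 3.8 Let $\mathcal{A} = (X, \Sigma, \alpha, x_0, F)$ be a topological automaton. Let $\Phi\colon \Sigma^* \to X$, $w \mapsto \alpha(x_0, w)$ and $\mathcal{U} := \{F, X \setminus F\}$. Consider a finite subset $E \subseteq \Sigma^*$ as well as the equivalence relation

\[ \Lambda_E := \{(x,y) \mid \forall w \in E \colon \alpha(x,w) \in F \iff \alpha(y,w) \in F\}. \]

Then the following statements hold:

(1) $X/\Lambda_E = \bigl(\bigvee_{w \in E} w^{-1}(\mathcal{U})\bigr) \setminus \{\emptyset\}$.

(2) $\Theta(E, L(\mathcal{A})) = (\Phi \times \Phi)^{-1}(\Lambda_E)$.

(3) If $\mathcal{A}$ is trim, then $\Phi(\Sigma^*) \cap V \ne \emptyset$ for every $V \in X/\Lambda_E$.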

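Proof (1): We observe that $\mathcal{V} := \bigl(\bigvee_{w \in E} w^{-1}(\mathcal{U})\bigr) \setminus \{\emptyset\}$ constitutes a finite partition of $X$ into clopen subsets. For any $V \in \mathcal{V}$ and $x \in V$, we observe that

\[ \begin{aligned} [x]_{\Lambda_E} &= \{y \in X \mid \forall w \in E \colon \alpha(x,w) \in F \Leftrightarrow \alpha(y,w) \in F\} \\ &= \{y \in X \mid \forall w \in E \colon x \in w^{-1}(F) \Leftrightarrow y \in w^{-1}(F)\} \\ &= \{y \in X \mid \forall w \in E\ \forall U \in \mathcal{U} \colon x \in w^{-1}(U) \Leftrightarrow y \in w^{-1}(U)\} \\ &= \{y \in X \mid \forall W \in \mathcal{V} \colon x \in W \Leftrightarrow y \in W\} \\ &= \{y \in X \mid y \in V\} \\ &= V. \end{aligned} \]

We conclude that $X/\Lambda_E = \mathcal{V}$.

(2): Let $L := L(\mathcal{A})$. For any two words $u, v \in \Sigma^*$, it follows that

\[ \begin{aligned} (u,v) \in \Theta(E,L) &\iff \forall w \in E \colon uw \in L \Leftrightarrow vw \in L \\ &\iff \forall w \in E \colon \alpha(x_0, uw) \in F \Leftrightarrow \alpha(x_0, vw) \in F \\ &\iff \forall w \in E \colon \alpha(\alpha(x_0,u), w) \in F \Leftrightarrow \alpha(\alpha(x_0,v), w) \in F \\ &\iff \forall w \in E \colon \alpha(\Phi(u), w) \in F \Leftrightarrow \alpha(\Phi(v), w) \in F \\ &\iff (\Phi(u), \Phi(v)) \in \Lambda_E. \end{aligned} \]

That is, $\Theta(E,L) = (\Phi \times \Phi)^{-1}(\Lambda_E)$.

(3): By (1), the set $X/\Lambda_E$ is a collection of open, non-empty subsets of $X$. If $\mathcal{A}$ is trim, then $\Phi(\Sigma^*)$ is dense in $X$, and thus $\Phi(\Sigma^*) \cap V \ne \emptyset$ for every $V \in X/\Lambda_E$.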