
4.1 Closure Properties of Language Families


Academic year: 2022

Contents

1 Fundamentals
  1.1 Sets and Multisets of Words
  1.2 Polynomials and Linear Algebra
  1.3 Graph Theory
  1.4 Intuitive Algorithms

A SEQUENTIAL GRAMMARS

2 Basic Families of Grammars and Languages
  2.1 Definitions and Examples
  2.2 Normal forms
  2.3 Iteration Theorems

3 Languages as Accepted Sets of Words
  3.1 Turing Machines versus Phrase Structure Grammars
    3.1.1 Turing Machines and Their Accepted Languages
    3.1.2 Nondeterministic Turing Machines and Their Accepted Languages
    3.1.3 A Short Introduction to Computability and Complexity
  3.2 Finite Automata versus Regular Grammars
  3.3 Push-Down Automata versus Context-Free Languages

4 Algebraic Properties of Language Families
  4.1 Closure Properties of Language Families
  4.2 Algebraic Characterizations of Language Families
    4.2.1 Characterizations of Language Families by Operations
    4.2.2 Characterizations of Regular Language Families by Congruence Relations

5 Decision Problems for Formal Languages

B Formal Languages and Linguistics

8 Some Extensions of Context-Free Grammars
  8.1 Families of Weakly Context-Sensitive Grammars
  8.2 Index Grammars
  8.3 Tree-Adjoining Grammars
  8.4 Head Grammars
  8.5 Comparison of Generative Power

9 Contextual Grammars and Languages
  9.1 Basic Families of Contextual Languages
  9.2 Maximally Locally Contextual Grammars

10 Restart Automata

D Formal Languages and Pictures

14 Chain Code Picture Languages
  14.1 Chain Code Pictures
  14.2 Hierarchy of Chain Code Picture Languages
  14.3 Decision Problem for Chain Code Picture Languages
    14.3.1 Classical Decision Problems
    14.3.2 Decidability of Properties Related to Subpictures
    14.3.3 Decidability of "Geometric" Properties
    14.3.4 Stripe Languages
  14.4 Some Generalizations
  14.5 Lindenmayer Chain Code Picture Languages and Turtle Grammars
    14.5.1 Definitions and some Theoretical Considerations
    14.5.2 Applications for Simulations of Plant Developments
    14.5.3 Space-Filling Curves
    14.5.4 Kolam Pictures

15 Siromoney Matrix Grammars and Languages
  15.1 Definitions and Examples
  15.2 Hierarchies of Siromoney Matrix Languages
  15.3 Hierarchies of Siromoney Matrix Languages
  15.4 Decision Problems for Siromoney Matrix Languages
    15.4.1 Classical Problems
    15.4.2 Decision Problems related to Submatrices and Subpictures
    15.4.3 Decidability of geometric properties

16 Collage Grammars
  16.1 Collage Grammars
  16.2 Collage Grammars with Chain Code Pictures as Parts

Bibliography


Chapter 4

Algebraic Properties of Language Families

In this chapter we study the behaviour of languages under certain operations. In particular, we are interested in the question whether or not the application of some operation to languages of some language family again yields a language of that family. The results will be used to present some characterizations of language families by operations. In addition, we give a characterization of the set of regular languages by properties of associated congruence classes.

4.1 Closure Properties of Language Families

The basic definition for the behaviour of language families with respect to operations is the following one.

Definition 4.1 We say that a family L of languages is closed under the n-ary operation τ if, for any languages L1, L2, ..., Ln of L, τ(L1, L2, ..., Ln) ∈ L.

We first study the closure properties of the families of the Chomsky hierarchy under set-theoretic operations.

Lemma 4.2 The families L(REG), L(LIN), L(CF), L(CS) and L(RE) are closed under union.

Proof. Let L1 and L2 be two languages in L(X) with X ∈ {REG, LIN, CF, CS, RE}.

Then there are grammars G1 = (N1, T1, P1, S1) and G2 = (N2, T2, P2, S2) of type X such that L(G1) = L1 and L(G2) = L2. Without loss of generality we assume that N1 ∩ N2 = ∅ (if this is not the case, we rename some nonterminals such that the required disjointness is obtained). We construct the grammar

G = ({S} ∪ N1 ∪ N2, T1 ∪ T2, {S → S1, S → S2} ∪ P1 ∪ P2, S),

where S is a new symbol not contained in N1 ∪ N2 ∪ T1 ∪ T2. Obviously, G is of type X, too. Moreover, any derivation has the form

S =⇒ Si =⇒*_{Gi} w ∈ L(Gi)

for some i ∈ {1, 2}, because by N1 ∩ N2 = ∅ the rules of Pi do not produce nonterminals of Nj, j ≠ i, i. e., we cannot mix the productions of P1 and P2. Thus we can derive all words of L(G1) ∪ L(G2) = L1 ∪ L2 and only these words. Therefore L1 ∪ L2 = L(G) ∈ L(X). □

Lemma 4.3 The families L(REG), L(CS) and L(RE) are closed under intersection. The families L(LIN) and L(CF) are not closed under intersection.

Proof. L(REG). We have to show that, for two regular languages L1 and L2, their intersection L1 ∩ L2 is a regular language, too. We only give the proof for the case that λ ∉ L1 ∩ L2 and leave the modifications for the general case to the reader.

Let

G1 = (N1, T1, P1, S1) and G2 = (N2, T2, P2, S2)

be two regular grammars with

L(G1) = L1 and L(G2) = L2.

By Theorem 2.28, we can assume that both grammars are in normal form, i. e., the rules have the form A → aB or A → a with nonterminals A, B and a terminal a. We consider the regular grammar

G = (N1 × N2, T, P, (S1, S2))

with

P = {(A1, B1) → a(A2, B2) : A1 → aA2 ∈ P1, B1 → aB2 ∈ P2}
    ∪ {(A, B) → a : A → a ∈ P1, B → a ∈ P2}.

It is easy to see that a derivation

(S1, S2) =⇒ a1(A1, B1) =⇒ a1a2(A2, B2) =⇒ ... =⇒ a1a2...a_{n−1}(A_{n−1}, B_{n−1}) =⇒ a1a2...a_{n−1}a_n

exists in G if and only if derivations

S1 =⇒ a1A1 =⇒ a1a2A2 =⇒ ... =⇒ a1a2...a_{n−1}A_{n−1} =⇒ a1a2...a_{n−1}a_n

and

S2 =⇒ a1B1 =⇒ a1a2B2 =⇒ ... =⇒ a1a2...a_{n−1}B_{n−1} =⇒ a1a2...a_{n−1}a_n

exist in G1 and G2, respectively. Therefore w ∈ L(G) holds if and only if w ∈ L(G1) and w ∈ L(G2). Hence

L(G) = L(G1) ∩ L(G2) = L1 ∩ L2.

Since G is a regular grammar, L1 ∩ L2 is a regular language.
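The product construction for regular grammars in normal form can be tried out on small examples. The following Python sketch is only an illustration: the encoding of a rule A → aB as the triple (A, a, B) and of A → a as (A, a, None), the function names, and the two example grammars are assumptions made here and do not come from the text.

```python
from itertools import product


def intersect_grammars(p1, s1, p2, s2):
    """Product construction for two regular grammars in normal form.

    Assumed encoding: (A, a, B) stands for A -> aB, (A, a, None) for A -> a.
    Returns the rule set P and the start symbol (S1, S2).
    """
    rules = set()
    for (a1, x, b1), (a2, y, b2) in product(p1, p2):
        if x != y:
            continue  # both rules must produce the same terminal
        if b1 is not None and b2 is not None:
            rules.add(((a1, a2), x, (b1, b2)))  # (A1, B1) -> a(A2, B2)
        elif b1 is None and b2 is None:
            rules.add(((a1, a2), x, None))      # (A, B) -> a
    return rules, (s1, s2)


def words(rules, start, max_len):
    """All words of length at most max_len generated from start."""
    found = set()
    frontier = {("", start)}
    for _ in range(max_len):
        next_frontier = set()
        for prefix, nonterminal in frontier:
            for lhs, x, rhs in rules:
                if lhs != nonterminal:
                    continue
                if rhs is None:
                    found.add(prefix + x)
                else:
                    next_frontier.add((prefix + x, rhs))
        frontier = next_frontier
    return found


# Hypothetical example grammars over {a, b}:
# P1 generates the words ending with b, P2 the words starting with a.
P1 = {("S", "a", "S"), ("S", "b", "S"), ("S", "b", None)}
P2 = {("T", "a", "A"), ("T", "a", None),
      ("A", "a", "A"), ("A", "b", "A"), ("A", "a", None), ("A", "b", None)}
P, START = intersect_grammars(P1, "S", P2, "T")
```

Comparing words(P, START, n) with the intersection of words(P1, "S", n) and words(P2, "T", n) for small n gives a bounded sanity check of the construction; it is of course not a proof.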

L(RE). Let L1 ∈ L(RE) and L2 ∈ L(RE) be given. By Theorem 3.19, there are deterministic Turing machines

M1 = (X, Z1, z_{01}, Q1, δ1, Q1) and M2 = (X, Z2, z_{02}, Q2, δ2, Q2)

with

T(M1) = L1 and T(M2) = L2.

Without loss of generality we can assume that Z1 ∩ Z2 = ∅. We construct a Turing machine M which works as follows (the formal description is left to the reader). First the machine replaces any letter x of the input word by (x, x). Then it works as M1 using only the letters of the first components; thus the input word remains stored in the second components (if ∗ is read, then it is handled as (∗, ∗)). If M reaches a state from Q1, then it replaces all letters (a, b) by their second component b, i. e., the input word is on the tape again. Now M starts to work as M2 and stops if a state of Q2 is reached.

In this way M first checks whether the input is accepted by M1 and then whether the input is in T(M2). Thus M accepts a word w if and only if w is accepted by M1 as well as by M2. Consequently,

T(M) = T(M1) ∩ T(M2) = L1 ∩ L2,

which proves that L1 ∩ L2 ∈ L(RE) by Theorem 3.19.

L(CS). The proof can be given analogously to that for recursively enumerable languages, but we use linearly bounded automata and Theorem 3.23.

L(LIN) and L(CF). In order to prove the assertion it is sufficient to give two linear languages which have a non-context-free intersection. We consider the linear grammars

G1 = ({S, A}, {a, b, c}, {S → Sc, S → Ac, A → aAb, A → ab}, S),
G2 = ({S, A}, {a, b, c}, {S → aS, S → aA, A → bAc, A → bc}, S).

It is easy to see that

L(G1) = {a^n b^n c^m | n ≥ 1, m ≥ 1} and L(G2) = {a^m b^n c^n | n ≥ 1, m ≥ 1}.

Obviously, L(G1) ∩ L(G2) = {a^n b^n c^n | n ≥ 1}. By the proof of Theorem 16.13 we know that L(G1) ∩ L(G2) is not context-free. □
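The claimed intersection can be checked by brute force for short words. The sketch below does not use the grammars G1 and G2 themselves; as an assumption of this illustration, the two languages are described by hand-written membership tests (the names in_l1, in_l2 and intersection_up_to are hypothetical).

```python
import re
from itertools import product


def in_l1(word):
    """Membership in {a^n b^n c^m | n >= 1, m >= 1}."""
    match = re.fullmatch(r"(a+)(b+)(c+)", word)
    return bool(match) and len(match.group(1)) == len(match.group(2))


def in_l2(word):
    """Membership in {a^m b^n c^n | n >= 1, m >= 1}."""
    match = re.fullmatch(r"(a+)(b+)(c+)", word)
    return bool(match) and len(match.group(2)) == len(match.group(3))


def intersection_up_to(max_len):
    """All words over {a, b, c} of length <= max_len lying in both languages."""
    result = set()
    for length in range(max_len + 1):
        for letters in product("abc", repeat=length):
            word = "".join(letters)
            if in_l1(word) and in_l2(word):
                result.add(word)
    return result
```

Up to any fixed length bound, only words with equally many letters a, b and c in this order survive, in agreement with the set {a^n b^n c^n | n ≥ 1} stated above.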

Lemma 4.4 The families L(REG) and L(CS) are closed under complement. The families L(LIN), L(CF) and L(RE) are not closed under complement.

Proof. L(REG). Let L be a regular language. Then there is a deterministic finite automaton A = (alph(L), Z, z0, F, δ) such that L = T(A). Thus w ∈ L if and only if δ(z0, w) ∈ F. Consequently, w ∈ C(L) if and only if δ(z0, w) ∉ F if and only if δ(z0, w) ∈ Z \ F. Thus the automaton A' = (alph(L), Z, z0, Z \ F, δ) accepts C(L).

Therefore C(L) is regular.
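As a small illustration of this construction, the following Python sketch swaps the set of accepting states of a total deterministic finite automaton. The dictionary encoding of δ and the example automaton (words over {a, b} with an even number of a's) are assumptions of this sketch.

```python
def accepts(delta, start, finals, word):
    """Run a total deterministic finite automaton and test the reached state."""
    state = start
    for letter in word:
        state = delta[(state, letter)]
    return state in finals


# Hypothetical example: words over {a, b} with an even number of a's.
STATES = {0, 1}
DELTA = {(0, "a"): 1, (0, "b"): 0, (1, "a"): 0, (1, "b"): 1}
FINALS = {0}

# The complement automaton keeps states and transitions and uses Z \ F.
CO_FINALS = STATES - FINALS
```

The construction relies on δ being total and deterministic; for a nondeterministic or partial automaton, exchanging accepting and non-accepting states does not yield the complement in general.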

L(CS). We omit the proof since it requires some knowledge not presented in this book and is relatively long. We refer to [32] and [17] and the original papers [13], [29].

L(RE). If L(RE) were closed under complement, then any recursively enumerable language would be recursive by Theorem 3.10, in contradiction to Theorem 3.11.

L(CF). Let us assume that L(CF) is closed under complement. Let L1 and L2 be two arbitrary context-free languages. We set

X = alph(L1) ∪ alph(L2), X1 = X \ alph(L1), X2 = X \ alph(L2).

Let R1 and R2 be the sets of all words over X which contain at least one letter of X1 and X2, respectively. If Xi = ∅ for some i ∈ {1, 2}, then Ri is the empty set, and therefore Ri is a regular set. If Xi ≠ ∅, then the regular grammar

Gi = ({S, A}, X, ⋃_{a ∈ alph(Li)} {S → aS} ∪ ⋃_{b ∈ Xi} {S → bA, S → b} ∪ ⋃_{x ∈ X} {A → xA, A → x}, S)

generates Ri (since we can only terminate from S or switch from S to A if a letter from Xi is generated). Hence in all cases R1 and R2 are regular languages and therefore context-free, too. By our assumption and Lemma 4.2, for i ∈ {1, 2},

X* \ Li = ((alph(Li))* \ Li) ∪ Ri = C(Li) ∪ Ri

is a context-free language. Again, by Lemma 4.2,

R = (X* \ L1) ∪ (X* \ L2)

is context-free. Now our assumption gives the context-freeness of

L1 ∩ L2 = X* \ ((X* \ L1) ∪ (X* \ L2)) = (alph(R))* \ R,

which means that the intersection of arbitrary context-free languages is context-free. Thus we have a contradiction to Lemma 4.3. Therefore our assumption is not valid, i. e., L(CF) is not closed under complement.

L(LIN). We repeat the proof for L(CF) (word by word), but replace context-free in all cases by linear and L(CF) by L(LIN). □

Lemma 4.5 The families L(REG) and L(CS) are closed under set-theoretic difference. The families L(LIN), L(CF) and L(RE) are not closed under set-theoretic difference.

Proof. Let X and Y be two languages and V = alph(X) ∪ alph(Y). Let us assume that alph(X) \ alph(Y) is not empty (the easy modifications for alph(X) ⊆ alph(Y) are left to the reader). From the proof of Lemma 4.4, we know that V* \ (alph(Y))* is in L(REG) and therefore in L(CS), too. Because

X \ Y = (V* \ Y) ∩ X = ((V* \ (alph(Y))*) ∪ ((alph(Y))* \ Y)) ∩ X
      = ((V* \ (alph(Y))*) ∪ C(Y)) ∩ X,

the first assertion follows by Lemmas 4.2 to 4.4.

Since X* is a regular language and belongs to all language families under consideration, the complement is a special case of difference. Thus the second statement of Lemma 4.4 implies the second assertion. □

We now mention a special case of intersection: we require that the language of the family under consideration is intersected with a regular set.

Lemma 4.6 The families L(REG), L(LIN), L(CF), L(CS), and L(RE) are closed under intersection with regular languages.

Proof. The statement holds trivially for L(REG), L(CS), and L(RE), because each of these language families is closed under intersection (see Lemma 4.3) and contains all regular languages (see Theorem 2.37).

In order to prove the statement for L(CF) we construct a pushdown automaton M which accepts L ∩ R for a given context-free language L and a given regular language R.

Let

M1 = (X, Z1, Γ, z_{0,1}, F1, δ1) and A2 = (X, Z2, z_{0,2}, F2, δ2)

be a pushdown automaton and a finite automaton, respectively, such that T(M1) = L and T(A2) = R. We construct the pushdown automaton

M = (X, Z1 × Z2, Γ, (z_{0,1}, z_{0,2}), F1 × F2, δ)

where

((z1', z2'), R, β) ∈ δ((z1, z2), a, γ) if (z1', R, β) ∈ δ1(z1, a, γ) and δ2(z2, a) = z2',
((z1', z2), N, β) ∈ δ((z1, z2), a, γ) if (z1', N, β) ∈ δ1(z1, a, γ).

By definition, M behaves on the first component of the state and on the pushdown tape as M1 and on the second component of the state as A2 (where a letter is only read by A2 if M1 moves to the right). Hence M accepts a word w if and only if w is accepted by M1 as well as by A2. Thus T(M) = L ∩ R.

For the family of linear languages, we only note that the construction of M from M1 gives a 1-turn pushdown automaton if M1 is a 1-turn pushdown automaton. □

We now study the algebraically motivated operations concatenation and Kleene closure and those operations related to homomorphisms.

Lemma 4.7 The families L(REG), L(CF), L(CS), and L(RE) are closed under concatenation. L(LIN) is not closed under concatenation.

Proof. L(CF). Again, we start with two context-free grammars G1 = (N1, T, P1, S1) and G2 = (N2, T, P2, S2) with N1 ∩ N2 = ∅ and show that the grammar

G = (N1 ∪ N2 ∪ {S}, T, P1 ∪ P2 ∪ {S → S1S2}, S)

generates L(G1)·L(G2). It is sufficient to mention that, up to the order of the applications of rules, any derivation in G has the form

S =⇒ S1S2 =⇒* w1S2 =⇒* w1w2

where, for i ∈ {1, 2}, Si =⇒* wi is a derivation in Gi (i. e., the derivation only uses rules of Pi). Since G is a context-free grammar, L(G1)·L(G2) is a context-free language.

L(CS) and L(RE). We repeat the proof for L(CF) where we suppose without loss of generality that the grammars are in Kuroda normal form (see Theorem 2.19). This ensures that the derivations in G1 and G2 cannot be influenced by the contexts of the other part. Furthermore, we have to take care of the empty word in the case of L(CS), which requires representing the concatenation as a union of languages without the empty word and the language consisting only of the empty word; e. g., if λ ∈ L(G1) and λ ∈ L(G2), then

L(G1)·L(G2) = ((L(G1) \ {λ})·(L(G2) \ {λ})) ∪ (L(G1) \ {λ}) ∪ (L(G2) \ {λ}) ∪ {λ}.

The details are left to the reader.

L(REG). The above proof (for L(CF)) does not work for regular languages since the newly introduced rule S → S1S2 does not have the required form.

Let G1 = (N1, T1, P1, S1) and G2 = (N2, T2, P2, S2) be regular grammars such that L(G1) = L1, L(G2) = L2 and N1 ∩ N2 = ∅. Then we construct the grammar

G = (N1 ∪ N2, T, P1' ∪ P2, S1)

where

P1' = {A → wB : A → wB ∈ P1, B ∈ N1} ∪ {A → wS2 : A → w ∈ P1, w ∈ T*}.

According to this construction, all derivations in G have the form

S1 =⇒* w'A =⇒ w'wS2 =⇒* w'ww2

where

S1 =⇒* w'A =⇒ w'w = w1 and S2 =⇒* w2

are derivations in G1 and G2, respectively. Hence

L(G) = {w1w2 : w1 ∈ L(G1), w2 ∈ L(G2)} = L(G1)·L(G2).
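The construction of P1' amounts to redirecting every terminating rule of G1 to the start symbol of G2. A minimal Python sketch, assuming that a rule A → wB is encoded as (A, w, B) and a rule A → w as (A, w, None), and that the nonterminal sets are disjoint; the function names and the example grammars are hypothetical.

```python
def concat_grammars(p1, s1, p2, s2):
    """Rules and start symbol of a regular grammar for L(G1)·L(G2).

    Assumed encoding: (A, w, B) for A -> wB, (A, w, None) for A -> w,
    where w is a (possibly empty) terminal word.
    """
    # A -> wB is kept, A -> w is replaced by A -> w S2.
    p1_prime = {(a, w, b if b is not None else s2) for (a, w, b) in p1}
    return p1_prime | set(p2), s1


def words(rules, start, max_len):
    """All generated words of length at most max_len."""
    found = set()
    seen = {("", start)}
    todo = [("", start)]
    while todo:
        prefix, nonterminal = todo.pop()
        for lhs, w, rhs in rules:
            if lhs != nonterminal or len(prefix) + len(w) > max_len:
                continue
            if rhs is None:
                found.add(prefix + w)
            elif (prefix + w, rhs) not in seen:
                seen.add((prefix + w, rhs))
                todo.append((prefix + w, rhs))
    return found


# Hypothetical example: G1 generates a^+, G2 generates b^+.
P1 = {("S1", "a", "S1"), ("S1", "a", None)}
P2 = {("S2", "b", "S2"), ("S2", "b", None)}
P, START = concat_grammars(P1, "S1", P2, "S2")
```

For the example, words(P, START, n) lists the words a^i b^j with i, j ≥ 1 and i + j ≤ n, as the lemma predicts for the concatenation of the two languages.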

L(LIN). The method used for L(REG) does not work since the derivation of the first grammar can end somewhere in the middle of the word and not at the end as in the case of regular grammars.

By Example 2.5, L = {a^n b^n | n ≥ 1} is a linear language. However, the language L·L = {a^n b^n a^m b^m | n ≥ 1, m ≥ 1} is not linear, as we have shown in the proof of Theorem 2.34. □

Lemma 4.8 The families L(REG), L(CF), L(CS), and L(RE) are closed under (positive) Kleene closure. L(LIN) is not closed under (positive) Kleene closure.

Proof. We first prove the statement for positive Kleene closure.

L(CF). Let L be a context-free language. Let G = (N, T, P, S) be a context-free grammar which generates L. We set

G' = (N ∪ {S'}, T, P ∪ {S' → SS', S' → S}, S')

(where S' is an additional symbol, again). Up to the order of the application of the rules, any derivation in G' has the form

S' =⇒ SS' =⇒* w1S' =⇒ w1SS' =⇒* w1w2S' =⇒ w1w2SS' =⇒* ...
   =⇒* w1w2...w_{n−1}S' =⇒ w1w2...w_{n−1}S =⇒* w1w2...w_{n−1}w_n,

where, for 1 ≤ i ≤ n, each derivation S =⇒* wi uses only rules of P. Thus we have wi ∈ L(G) = L for 1 ≤ i ≤ n. Hence w1w2...wn ∈ L^n. It is obvious that any word w ∈ L^n can be generated for any n ≥ 1, and that only words of L^m with m ≥ 1 can be generated. Therefore

L(G') = ⋃_{n ≥ 1} L^n = L+,

which proves the context-freeness of L+.

L(CS) and L(RE). Let L be a language of L(X), X ∈ {CS, RE}. Then L can be generated by a grammar G = (N, T, P, S) in Kuroda normal form (see Theorem 2.19).

We set

G' = (N ∪ {S', S''}, T, P ∪ P', S')

where P' consists of the rules

S' → S, S' → SS'',

xS'' → xSS'', xS'' → xS for x ∈ T.

These rules ensure that the subderivations starting from S cannot influence each other by context (since a new derivation can only be started if the preceding one has already produced its last terminal letter). Now we get L(G') = L+ as above. The details of the proof are left to the reader.

L(REG). Let G = (N, T, P, S) be a regular grammar with L(G) = L. We construct the regular grammar G' = (N, T, P', S) where P' is obtained by adding all rules of the form

A → wS for A → w ∈ P, w ∈ T*

to P. Then the derivations of G' have the form

S =⇒* w1'A1 =⇒ w1'w1''S =⇒* w1'w1''w2'A2 =⇒ w1'w1''w2'w2''S
  =⇒* w1'w1''...w_{n−1}'w_{n−1}''S =⇒* w1'w1''...w_{n−1}'w_{n−1}''w_n,

where wi'wi'' ∈ L(G) for 1 ≤ i ≤ n−1 and wn ∈ L(G). Now L(G') = L+ can easily be proved.

Kleene closure. If λ ∈ L, then L* = L+ and we can use the above constructions. If λ ∉ L, then L* = L+ ∪ {λ}; because a grammar with the only rule S → λ generates the language which consists only of the empty word, the assertion follows by the above result for L+ and Lemma 4.2.

L(LIN). We consider the linear language L(G2) = {a^n b^n | n ≥ 1} from Example 2.5.

It is easy to see that

L(G2)+ = {a^{n1} b^{n1} a^{n2} b^{n2} ... a^{nt} b^{nt} | t ≥ 1, ni ≥ 1, 1 ≤ i ≤ t}.

Let us assume that L(G2)+ is linear. Because R = {a^p b^q a^r b^s | p, q, r, s ≥ 1} is regular (the verification is left to the reader),

L(G2)+ ∩ R = {a^n b^n a^m b^m | n, m ≥ 1}

is also linear by Lemma 4.6. However, as an application of the pumping lemma for linear languages we have shown that L(G2)+ ∩ R is not linear. This contradiction shows that our above assumption is wrong, i. e., L(G2)+ is not a linear language. Thus we have shown the non-closure of the family of linear languages under positive Kleene closure. The analogous statement for the Kleene closure follows as above, taking into consideration that

L(G2)* ∩ R = L(G2)+ ∩ R. □

Lemma 4.9 The families L(REG), L(LIN), L(CF), and L(RE) are closed under homomorphisms.

Proof. Let h be a homomorphism which maps T* to Y*.

L(CF). Let L be a context-free language. Then there is a context-free grammar G = (N, T, P, S) in Chomsky normal form such that L(G) = L (see Theorem 2.26).

Therefore all rules are of the form A → BC or A → a with A, B, C ∈ N and a ∈ T. Moreover, we can arrange the order of the applications of rules such that any derivation has the form

S =⇒* A1A2...Ak =⇒ a1A2A3...Ak =⇒ a1a2A3A4...Ak =⇒ ... =⇒ a1a2...ak

(where we apply only rules of the form A → BC in the subderivation S =⇒* A1A2...Ak). We now construct the grammar G' = (N, Y, P', S) where P' is obtained from P by replacing any rule of the form A → a ∈ P by A → h(a). Then it follows that, without loss of generality, the derivations in G' have the form

S =⇒* A1A2...Ak =⇒ h(a1)A2A3...Ak =⇒ h(a1)h(a2)A3A4...Ak =⇒ ...
  =⇒ h(a1)h(a2)...h(ak) = h(a1a2...ak).

Thus h(w) ∈ L(G') holds for every w ∈ L(G), and every word of L(G') has the form h(w) for some w ∈ L(G). Therefore L(G') = h(L(G)) = h(L). Furthermore, G' is a context-free grammar. Hence L(CF) is closed under homomorphisms.
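The replacement of the terminal rules can be written down directly. The following Python sketch assumes an encoding in which a rule A → BC is the pair (A, (B, C)) and a rule A → a is the pair (A, "a"); the bounded enumeration helper, the example grammar for {a^n b^n | n ≥ 1} and the homomorphism H are hypothetical illustrations.

```python
def apply_hom(rules, h):
    """Replace every rule A -> a by A -> h(a); rules A -> BC are kept."""
    return {(lhs, h[rhs]) if isinstance(rhs, str) else (lhs, rhs)
            for lhs, rhs in rules}


def words(rules, start, max_len):
    """Words of length <= max_len derivable from start (fixed-point iteration)."""
    lang = {}
    changed = True
    while changed:
        changed = False
        for lhs, rhs in rules:
            if isinstance(rhs, str):
                new = {rhs}
            else:
                left, right = rhs
                new = {u + v for u in lang.get(left, set())
                       for v in lang.get(right, set())}
            new = {w for w in new if len(w) <= max_len}
            known = lang.setdefault(lhs, set())
            if not new <= known:
                known |= new
                changed = True
    return lang.get(start, set())


# Hypothetical example: Chomsky normal form grammar for {a^n b^n | n >= 1}
# and a homomorphism with h(a) = 01 and h(b) = λ (encoded as "").
RULES = {("S", ("A", "B")), ("S", ("A", "C")), ("C", ("S", "B")),
         ("A", "a"), ("B", "b")}
H = {"a": "01", "b": ""}
IMAGE_RULES = apply_hom(RULES, H)
```

In this example the image language consists of the words (01)^n with n ≥ 1. Note that an erasing homomorphism introduces rules with the empty word on the right-hand side, so the resulting grammar is context-free but no longer in Chomsky normal form.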

L(RE). We repeat the proof for L(CF) but use the Kuroda normal form instead of the Chomsky normal form.

L(LIN). Let L be a linear language. Then there is a linear grammar G = (N, T, P, S) generating L. Moreover, any derivation in G has the form

S =⇒ w1A1v1 =⇒ w1w2A2v2v1 =⇒ ... =⇒ w1w2...wkAkvkv_{k−1}...v1
  =⇒ w1w2...wk u vkv_{k−1}...v1

where the rules S → w1A1v1, Ai → w_{i+1}A_{i+1}v_{i+1} for 1 ≤ i ≤ k−1, and Ak → u are applied.

We now define the grammar G' = (N, Y, P', S) by

P' = {A → h(w)Bh(v) | A → wBv ∈ P} ∪ {A → h(w) | A → w ∈ P}.

Any derivation in G' has the form

S =⇒ h(w1)A1h(v1) =⇒ h(w1)h(w2)A2h(v2)h(v1)
  =⇒ ... =⇒ h(w1)h(w2)...h(wk)Akh(vk)h(v_{k−1})...h(v1)
  =⇒ h(w1)h(w2)...h(wk)h(u)h(vk)h(v_{k−1})...h(v1)
  = h(w1w2...wk u vkv_{k−1}...v1).

Again, h(z) ∈ L(G') holds for every z ∈ L(G), and every word of L(G') has the form h(z) for some z ∈ L(G). Therefore L(G') = h(L(G)) = h(L). The assertion follows because G' is linear.

L(REG). The construction given in the proof for L(LIN) gives a regular grammar G' if G is regular. □

We have not yet discussed the behaviour of L(CS) under homomorphisms. This will be added in Chapter 5.

Lemma 4.10 The families L(REG), L(LIN), L(CF), L(CS), and L(RE) are closed under inverse homomorphisms.

Proof. L(REG). Let L be a regular language. Then there is a deterministic finite automaton A = (X, Z, z0, F, δ) such that T(A) = L. Now let h : Y* → X* be a homomorphism. Then a1a2...an ∈ h^{−1}(L), ai ∈ Y for 1 ≤ i ≤ n, if and only if h(a1a2...an) = h(a1)h(a2)...h(an) ∈ L. We construct the automaton A' = (Y, Z, z0, F, δ') by setting

δ'(z, a) = δ(z, h(a)) for a ∈ Y.

By the definition of δ', we immediately have

δ'(z0, a1a2...an) = δ(z0, h(a1)h(a2)...h(an)).

Therefore a1a2...an ∈ T(A') if and only if h(a1)h(a2)...h(an) ∈ T(A). This implies that A' accepts h^{−1}(T(A)) = h^{−1}(L). Hence h^{−1}(L) is regular.
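The definition δ'(z, a) = δ(z, h(a)) translates almost literally into code. The Python sketch below assumes a dictionary encoding of δ; the example automaton (even number of a's over {a, b}) and the homomorphism H from {c, d}* are hypothetical and only serve as an illustration.

```python
def run(delta, state, word):
    """Extended transition function: state reached from `state` on `word`."""
    for letter in word:
        state = delta[(state, letter)]
    return state


def inverse_hom_dfa(delta, states, h):
    """Transition function delta'(z, a) = delta(z, h(a)) for all letters a of Y."""
    return {(z, a): run(delta, z, image)
            for z in states for a, image in h.items()}


def accepts(delta, start, finals, word):
    return run(delta, start, word) in finals


# Hypothetical example: A accepts words over {a, b} with an even number of a's.
STATES = {0, 1}
DELTA = {(0, "a"): 1, (0, "b"): 0, (1, "a"): 0, (1, "b"): 1}
FINALS = {0}
H = {"c": "ab", "d": "aa"}  # homomorphism from {c, d}* to {a, b}*
DELTA_INV = inverse_hom_dfa(DELTA, STATES, H)
```

States, initial state and accepting states are reused unchanged; only the transition function is replaced, which is why the construction stays within deterministic finite automata.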

L(CF). Let L be a context-free language and M = (X, Z, Γ, z0, F, δ) be a pushdown automaton with T(M) = L. Moreover, let h : Y* → X* be a homomorphism. For any letter a ∈ Y with h(a) = b1b2...b_{ra}, we introduce new symbols (a, i), 1 ≤ i ≤ ra + 1. Let Z' be the set of all new symbols. Then we consider the pushdown automaton

M' = (Y, {(z, z) | z ∈ Z} ∪ (Z × Z'), Γ, (z0, z0), {(z, z) | z ∈ F}, δ'),

where δ' is defined as follows:

δ'((z, z), a, #) = {((z, (a, 1)), λ)} for z ∈ Z, a ∈ Y,
δ'((z, z), a, γ) = {((z, (a, 1)), γ)} for z ∈ Z, a ∈ Y, γ ∈ Γ,
δ'((z, (a, i)), λ, γ) ⊇ {((z', (a, i+1)), β) | (z', β) ∈ δ(z, bi, γ)} for z ∈ Z, a ∈ Y, 1 ≤ i ≤ ra, γ ∈ Γ ∪ {#},
δ'((z, (a, i)), λ, γ) ⊇ {((z', (a, i)), β) | (z', β) ∈ δ(z, λ, γ)} for z ∈ Z, a ∈ Y, 1 ≤ i ≤ ra, γ ∈ Γ ∪ {#},
δ'((z, (a, ra+1)), λ, γ) = {((z, z), γ)} for z ∈ Z, a ∈ Y, γ ∈ Γ,
δ'((z, (a, ra+1)), λ, #) = {((z, z), λ)} for z ∈ Z, a ∈ Y.

After reading a letter a in state (z, z), we change to (z, (a, 1)) and simulate the work of M on h(a) = b1b2...b_{ra} by changing the first component according to M and moving to (a, i+1) if bi is "read". A state (z', (a, ra+1)) indicates that the work on h(a) has been simulated, and we enter (z', z'). Therefore the pushdown automaton M' accepts a1a2...an if and only if we obtain (q, q) for some q ∈ F on the input a1a2...an if and only if the simulation on h(a1)h(a2)...h(an) leads to q ∈ F. Thus a1a2...an ∈ T(M') if and only if h(a1)h(a2)...h(an) ∈ T(M) = L if and only if a1a2...an ∈ h^{−1}(L).

We omit the proofs for L(LIN), L(CS), and L(RE), which can be given analogously, i. e., the automaton for h^{−1}(L) simulates the work of the automaton for L. □

The proof of the following lemma is left to the reader (see Exercise ???).

Lemma 4.11 The families L(REG), L(LIN), L(CF), L(CS), and L(RE) are closed under reversal. □

We summarize the closure properties of the families of the Chomsky hierarchy in the table given in Figure 4.1, where a + or – at the intersection of the column associated with a family L and the row associated with an operation τ means that L is closed or not closed under τ, respectively.

                                  L(RE)   L(CS)   L(CF)   L(LIN)   L(REG)
union                               +       +       +       +        +
intersection                        +       +       –       –        +
intersection with regular sets      +       +       +       +        +
complement                          –       +       –       –        +
product                             +       +       +       –        +
(positive) Kleene closure           +       +       +       –        +
homomorphisms                       +       –       +       +        +
non-erasing homomorphisms           +       +       +       +        +
inverse homomorphisms               +       +       +       +        +
reversal                            +       +       +       +        +

Figure 4.1: Table of closure properties

We now show that a family of languages which is closed under certain operations is also closed under some further operations. In order to shorten the statements we give the following notation.

Definition 4.12 A family L of languages is called an abstract family of languages (abbreviated by AFL) if

• it contains at least one non-empty language,

• it is closed under union, product, positive Kleene closure, non-erasing homomorphisms, inverse homomorphisms and intersections with regular languages.

The family L is called a full AFL if, in addition, it is closed under (arbitrary) homomorphisms.

By Figure 4.1, L(REG), L(CF), L(CS), and L(RE) are AFLs; L(REG), L(CF), and L(RE) are full AFLs; L(LIN) is not an abstract family of languages.

Lemma 4.13 Any full AFL is closed under Kleene closure.

Proof. Since L* = L+ ∪ {λ} for any language L and any full AFL is closed under positive Kleene closure and union, it is sufficient to show that any full AFL contains {λ}.

Let L be a full AFL. We show that {λ} ∈ L. By definition, L contains a non-empty language K. If K = {λ}, then the assertion holds. If K ≠ {λ}, then K contains a non-empty word z. We define the homomorphism h : (alph(K))* → (alph(K))* by h(a) = λ for all a ∈ alph(K). Then

{λ} = h(K ∩ {z}).

Because L is closed under intersections with regular sets and homomorphisms, we obtain {λ} ∈ L. □

Theorem 4.14 Any AFL is closed under set-theoretic subtraction of regular languages.

Proof. Let L be an AFL. For a language L ⊆ X* from the family L and a regular set R ⊆ X*, we have L \ R = L ∩ (X* \ R). Since the complement of a regular set is regular, too (see Lemma 4.4), L \ R is the intersection of a language in L with a regular set. Thus L \ R ∈ L by the closure properties required for an AFL. □

Theorem 4.15 Any full AFL is closed under left and right quotients by regular sets, i. e., for any language L of the full AFL L and any regular set R, the quotients Dl(L, R) and Dr(L, R) belong to L.

Proof. We only give the proof for the left quotient; the proof for the right quotient is analogous.

Let L be a full AFL, L a language in L, and R a regular set. Furthermore, let

X = alph(L) ∪ alph(R) and X' = {a' | a ∈ X}.

We define the homomorphisms

h : X* → (X')*, h1 : (X ∪ X')* → X* and h2 : (X ∪ X')* → X*

by

h(a) = a', h1(a') = a, h1(a) = a, h2(a') = λ, h2(a) = a for a ∈ X.

Additionally, we consider the set

Q = h(R)·(alph(L))*.

By the closure of L(REG) under homomorphisms and concatenation (see Lemmas 4.7 and 4.9), Q is regular. Because

h2(h1^{−1}(L) ∩ Q) = h2({w'v | w' ∈ h(R), v ∈ (alph(L))*, wv ∈ L})
                   = h2({w'v | w ∈ R, v ∈ (alph(L))*, wv ∈ L})
                   = {h2(w')h2(v) | w ∈ R, v ∈ (alph(L))*, wv ∈ L}
                   = {v | wv ∈ L for some w ∈ R},

where w' = h(w), we have

Dl(L, R) = h2(h1^{−1}(L) ∩ Q).

By the closure properties of a full AFL, we obtain Dl(L, R) ∈ L. □
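For finite languages the identity Dl(L, R) = h2(h1^{−1}(L) ∩ Q) can be evaluated directly. The following Python sketch is a toy check under several assumptions: a primed letter a' is encoded as the upper-case letter, L and R are small finite sets of lower-case words chosen here for illustration, and all function names are hypothetical.

```python
from itertools import product


def h1_inverse(language):
    """All words over X ∪ X' mapped into `language` by h1 (a' is encoded as A)."""
    return {"".join(choice)
            for word in language
            for choice in product(*[(x, x.upper()) for x in word])}


def in_q(word, r):
    """Membership in Q = h(R)·(alph(L))*: a primed prefix from h(R), then unprimed letters."""
    k = 0
    while k < len(word) and word[k].isupper():
        k += 1
    rest = word[k:]
    return rest == rest.lower() and word[:k].lower() in r


def h2(word):
    """Erase the primed (upper-case) letters, keep the unprimed ones."""
    return "".join(x for x in word if x.islower())


def left_quotient_via_afl(language, r):
    return {h2(w) for w in h1_inverse(language) if in_q(w, r)}


def left_quotient_direct(language, r):
    """{v | wv in L for some w in R}, computed from the definition."""
    return {u[len(w):] for u in language for w in r if u.startswith(w)}


# Hypothetical finite example.
L_EXAMPLE = {"ab", "abb", "ba"}
R_EXAMPLE = {"a", "ab"}
```

Both functions return the same set on the example, which illustrates how the primed copy of the alphabet marks the part of the word that is checked against R and then erased by h2.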


Theorem 4.16 Any full AFL is closed under substitutions by regular sets.

Proof. Let L be a full AFL, L ⊆ X* a language of L and τ : X* → Y* a substitution such that τ(a) is a regular set for any a ∈ X. Let X = {a1, a2, ..., an} and τ(ai) = Ri ∈ L(REG) for 1 ≤ i ≤ n. We define

X' = {a' | a ∈ X},

h1 : (X' ∪ Y)* → X* by h1(x') = x for x ∈ X and h1(y) = λ for y ∈ Y,

h2 : (X' ∪ Y)* → Y* by h2(x') = λ for x ∈ X and h2(y) = y for y ∈ Y,

R = (⋃_{i=1}^{n} ai'Ri)*.

Then we get

h1^{−1}(L) = {u0 x1' u1 x2' u2 ... xr' ur | x1x2...xr ∈ L, ui ∈ Y* for 0 ≤ i ≤ r},
h1^{−1}(L) ∩ R = {x1' u1 x2' u2 ... xr' ur | x1x2...xr ∈ L, ui ∈ τ(xi) for 1 ≤ i ≤ r},
h2(h1^{−1}(L) ∩ R) = {u1u2...ur | x1x2...xr ∈ L, ui ∈ τ(xi) for 1 ≤ i ≤ r},

and finally,

τ(L) = h2(h1^{−1}(L) ∩ R).

By the closure properties required for a full AFL, we obtain τ(L) ∈ L. □

4.2 Algebraic Characterizations of Language Families

4.2.1 Characterizations of Language Families by Operations

The aim of this section is to present some characterizations of language families by algebraic means. We start with characterizations by closure properties under certain operations and containments of very special languages.

Definition 4.17 Regular expressions over an alphabet X are inductively defined as follows:

1. ∅, λ and x with x ∈ X are regular expressions.

2. If R1, R2 and R are regular expressions, then (R1 + R2), (R1·R2) and R* are also regular expressions.

With any regular expression we associate a regular language.

Definition 4.18 For a regular expression U over the alphabet X, the associated set M(U) is inductively defined by the following settings:

1. M(∅) = ∅, M(λ) = {λ} and M(x) = {x} for x ∈ X,

2. If R1, R2 and R are regular expressions, then

M((R1 + R2)) = M(R1) ∪ M(R2),
M((R1·R2)) = M(R1)·M(R2),
M(R*) = (M(R))*.

Example 4.19 Let X = {a, b, c}. By condition 1 of Definition 4.17,

R0 = λ, R1 = a, R2 = b, R3 = c

are regular expressions over X. By condition 2 of Definition 4.17, the following constructs are also regular expressions:

R1' = (R1·R1) = (a·a),
R1'' = (R1'·R1) = ((a·a)·a),
R2' = R2* = b*,
R2'' = (R2' + R1'') = (b* + ((a·a)·a)),
R3' = R3* = c*,
R3'' = (R3·R3') = (c·c*),
R4 = (R2''·R3'') = ((b* + ((a·a)·a))·(c·c*)),
R5 = (R0 + R4) = (λ + ((b* + ((a·a)·a))·(c·c*))).

According to Definition 4.18 we obtain the following associated sets (where obvious simplifications are done):

M(R0) = {λ}, M(R1) = {a}, M(R2) = {b}, M(R3) = {c},
M(R1') = M((R1·R1)) = {a}·{a} = {a^2},
M(R1'') = M((R1'·R1)) = {a^2}·{a} = {a^3},
M(R2') = M(R2*) = {b}* = {b^m : m ≥ 0},
M(R2'') = M((R2' + R1'')) = {b^m : m ≥ 0} ∪ {a^3},
M(R3') = M(R3*) = {c}* = {c^n : n ≥ 0},
M(R3'') = M((R3·R3')) = {c}·{c^n : n ≥ 0} = {c^n : n ≥ 1},
M(R4) = M((R2''·R3'')) = ({b^m : m ≥ 0} ∪ {a^3})·{c^n : n ≥ 1}
      = {b^m c^n : m ≥ 0, n ≥ 1} ∪ {a^3 c^n : n ≥ 1},
M(R5) = M((R0 + R4)) = {λ} ∪ ({b^m c^n : m ≥ 0, n ≥ 1} ∪ {a^3 c^n : n ≥ 1})
      = {λ} ∪ {b^m c^n : m ≥ 0, n ≥ 1} ∪ {a^3 c^n : n ≥ 1}.
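The inductive definition of M(U) can be mirrored by a recursive function if all sets are cut off at a fixed word length. The Python sketch below represents regular expressions as nested tuples; this representation, the name m and the length bound are assumptions of the sketch and not notation from the text.

```python
def m(u, max_len):
    """Words of length <= max_len in the set M(U) of a regular expression U.

    Assumed encoding: ("empty",), ("lam",), ("sym", x),
    ("+", R1, R2), (".", R1, R2) and ("*", R).
    """
    kind = u[0]
    if kind == "empty":
        return set()
    if kind == "lam":
        return {""}
    if kind == "sym":
        return {u[1]} if max_len >= 1 else set()
    if kind == "+":
        return m(u[1], max_len) | m(u[2], max_len)
    if kind == ".":
        return {v + w for v in m(u[1], max_len) for w in m(u[2], max_len)
                if len(v + w) <= max_len}
    if kind == "*":
        base = m(u[1], max_len)
        result, frontier = {""}, {""}
        while frontier:  # add longer products until nothing new fits the bound
            frontier = {v + w for v in frontier for w in base
                        if len(v + w) <= max_len} - result
            result |= frontier
        return result
    raise ValueError(f"unknown kind: {kind}")


# The expression R5 of Example 4.19 in the assumed encoding.
A, B, C = ("sym", "a"), ("sym", "b"), ("sym", "c")
R1_PP = (".", (".", A, A), A)          # ((a·a)·a)
R2_PP = ("+", ("*", B), R1_PP)         # (b* + ((a·a)·a))
R3_PP = (".", C, ("*", C))             # (c·c*)
R4 = (".", R2_PP, R3_PP)
R5 = ("+", ("lam",), R4)
```

Evaluating m(R5, n) for a small n lists exactly the words of length at most n from {λ} ∪ {b^m c^n : m ≥ 0, n ≥ 1} ∪ {a^3 c^n : n ≥ 1}, which gives a quick check of the sets computed in the example.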

If U = ((...((R1 + R2) + R3) + ...) + Rn), then to shorten the notation we write

U = Σ_{i=1}^{n} Ri.

Obviously,

M(U) = ⋃_{i=1}^{n} M(Ri).

In an analogous way we use sums and unions over certain sets of indices.

Theorem 4.20 A language L is regular if and only if there is a regular expression R such that M(R) =L.

Proof. (⇐) We show inductively that, for any regular expression U, the associated set M(U) is regular.

If U is a regular expression by condition 1 of Definition 4.17, then the associated sets M(∅) = ∅, M(λ) = {λ} and M(x) = {x} with x ∈ X are finite and therefore regular (see Exercise ???).

Now let U be a regular expression which is obtained from regular expressions R1, R2, and R according to condition 2 of Definition 4.17, and let M(R1), M(R2), and M(R) be the sets associated with R1, R2, and R, respectively. By induction hypothesis, M(R1), M(R2), and M(R) are regular. If U = (R1 + R2), then M(U) = M(R1) ∪ M(R2). By Lemma 4.2, M(U) is regular. If U = (R1·R2) or U = R*, then the associated sets M(U) = M(R1)·M(R2) or M(U) = (M(R))*, respectively, are also regular by Lemmas 4.7 and 4.8, respectively.

(⇒) Let L be a regular language. Then there is a deterministic finite automaton A = (X, Z, z0, F, δ) with T(A) = L. Without loss of generality we can assume that

Z = {0, 1, 2, ..., r} and z0 = 0

for some r ≥ 0. For i, j ∈ Z and 0 ≤ k ≤ r + 1, by L^k_{i,j} we denote the set of all words w satisfying the following two conditions:

(a) δ(i, w) = j,

(b) for any u ≠ λ with w = uu' and |u| < |w|, we have δ(i, u) < k.

Obviously,

L = T(A) = ⋃_{j ∈ F} L^{r+1}_{0,j}. (4.1)

We now prove that, for any set L^k_{i,j}, i, j ∈ Z, 0 ≤ k ≤ r + 1, there is a regular expression R^k_{i,j} with M(R^k_{i,j}) = L^k_{i,j}. The proof will be given by induction on k.

Let k = 0. For i ≠ j, by definition, L^0_{i,j} consists of all words w which directly transform the state i into the state j, because by condition (b) no intermediate states occur.

Thus w is a word of length 1. Therefore

L^0_{i,j} = {x : x ∈ X, δ(i, x) = j}.

This can be written as

L^0_{i,j} = ⋃_{x ∈ X, δ(i,x)=j} {x}.

Thus we also have

L^0_{i,j} = M(Σ_{x ∈ X, δ(i,x)=j} x) = ⋃_{x ∈ X, δ(i,x)=j} {x},

which proves our assertion. If i = j, in addition to the words of length 1 which transform i into i, the empty word is in L^0_{i,i}. Hence

L^0_{i,i} = M(λ + Σ_{x ∈ X, δ(i,x)=i} x) = {λ} ∪ ⋃_{x ∈ X, δ(i,x)=i} {x}
