Theory of Computer Science

B3. Regular Languages

Gabriele Röger

University of Basel

March 15, 2021

Introduction

Repetition: Regular Grammars

Definition (Regular Grammars)

A regular grammar is a 4-tuple ⟨V, Σ, R, S⟩ with

V finite set of variables (nonterminal symbols)

Σ finite alphabet of terminal symbols with V ∩ Σ = ∅

R ⊆ (V × (Σ ∪ ΣV)) ∪ {⟨S, ε⟩} finite set of rules

if S → ε ∈ R, there is no X ∈ V, y ∈ Σ with X → yS ∈ R, i.e. S never occurs in the right-hand side of a rule

S ∈ V start variable

Rule X → ε is only allowed if X = S and S never occurs in the right-hand side of a rule.
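The conditions of this definition can be checked mechanically. The following is a minimal sketch, not part of the lecture material: it assumes that every symbol is a string and that a rule X → w is stored as a pair (X, w) where w is a tuple of symbols. The function name `is_regular` and this encoding are illustrative assumptions.

```python
def is_regular(variables, alphabet, rules, start):
    """Check the regular-grammar conditions for <variables, alphabet, rules, start>.

    Assumed encoding: a rule X -> w is the pair (X, w), with w a tuple of
    symbols, so () stands for the empty word.
    """
    if start not in variables or variables & alphabet:
        return False
    start_to_eps = (start, ()) in rules
    for lhs, rhs in rules:
        if lhs not in variables:
            return False
        if rhs == ():
            # The only epsilon rule allowed is S -> epsilon.
            if lhs != start:
                return False
        elif len(rhs) == 1:
            # Rule of the form X -> a.
            if rhs[0] not in alphabet:
                return False
        elif len(rhs) == 2:
            # Rule of the form X -> aY.
            if rhs[0] not in alphabet or rhs[1] not in variables:
                return False
            # If S -> epsilon is a rule, S must not occur on a right-hand side.
            if start_to_eps and rhs[1] == start:
                return False
        else:
            return False
    return True


# Hypothetical example grammar: S -> eps, S -> aA, A -> aA, A -> a
example_rules = {("S", ()), ("S", ("a", "A")), ("A", ("a", "A")), ("A", ("a",))}
print(is_regular({"S", "A"}, {"a"}, example_rules, "S"))
```

Adding the rule A → aS or the rule A → ε to the example makes the check fail, which mirrors the two restrictions in the definition.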

Question (Slido)

With a regular grammar, how many steps does it take to derive a non-empty word (over Σ) from the start variable?

Repetition: Regular Languages

A language is regular if it is generated by some regular grammar.

Definition (Regular Language)

A language L ⊆ Σ* is regular if there exists a regular grammar G with L(G) = L.

Questions

How restrictive is the requirement on rules?

If we don’t restrict the usage of ε as right-hand side of a rule, what does this change?

How do regular languages relate to finite automata?

Can all regular languages be recognized by a finite automaton? And vice versa?

With which operations can we “combine” regular languages so that the result is again a regular language?

E.g. is the intersection of two regular languages regular?

Epsilon Rules

Repetition: Regular Grammars

How restrictive is this?

Our Plan

We are going to show that every grammar with rules R ⊆ V × (Σ ∪ ΣV ∪ {ε}) generates a regular language.

Question

This is much simpler!

Why don’t we define regular languages via such grammars?

Question

Both variants (restricting the occurrence of ε on the right-hand side of rules or not) characterize exactly the regular languages.

In the following situations, which variant would you prefer?

You want to prove something for all regular languages.

You want to specify a grammar to establish that a certain language is regular.

You want to write an algorithm that takes a grammar for a regular language as input.

Our Plan

We are going to show that every grammar with rules R ⊆ V × (Σ ∪ ΣV ∪ {ε}) generates a regular language.

The proof will be constructive, i.e. it will tell us how to construct a regular grammar for a language that is given by such a more general grammar.

Two steps:

1 Eliminate the start variable from the right-hand side of rules.

2 Eliminate forbidden occurrences of ε.

Start Variable in Right-Hand Side of Rules

For every type-0 language L there is a grammar where the start variable does not occur on the right-hand side of any rule.

Theorem

For every grammar G = ⟨V, Σ, R, S⟩ there is a grammar G′ = ⟨V′, Σ, R′, S⟩ with rules

R′ ⊆ (V′ ∪ Σ)* V′ (V′ ∪ Σ)* × ((V′ \ {S}) ∪ Σ)*

such that L(G) = L(G′).

Note: this theorem is true for all grammars.

Start Variable in Right-Hand Side of Rules: Example

Before we prove the theorem, let’s illustrate its idea.

Consider G = ⟨{S, X}, {a, b, c}, R, S⟩ with the following rules in R:

bS → ε   S → XabS   bX → aSa   X → abc

The new grammar has all original rules except that S is replaced with a new variable S′, so that everything that could originally be derived from the start variable S can now be derived from S′:

bS′ → ε   S′ → XabS′   bX → aS′a   X → abc

In addition, it has rules that allow a derivation to start from the original start variable but switch to S′ after the first rule application:

S → XabS′

Start Variable in Right-Hand Side of Rules: Proof

Proof.

Let G = ⟨V, Σ, R, S⟩ be a grammar and S′ ∉ V be a new variable.

Construct rule set R′ from R as follows:

for every rule r ∈ R, add a rule r′ to R′, where r′ is the result of replacing all occurrences of S in r with S′.

for every rule S → w ∈ R, add a rule S → w′ to R′, where w′ is the result of replacing all occurrences of S in w with S′.

Then L(G) = L(⟨V ∪ {S′}, Σ, R′, S⟩).

Note that the rules in R′ are not fundamentally different from the rules in R. In particular, with V′ = V ∪ {S′}:

If R ⊆ V × (Σ ∪ ΣV ∪ {ε}) then R′ ⊆ V′ × (Σ ∪ ΣV′ ∪ {ε}).

If R ⊆ V × (V ∪ Σ)* then R′ ⊆ V′ × (V′ ∪ Σ)*.
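The construction from this proof can be sketched in a few lines of Python. This is an illustration under assumptions, not code from the lecture: a rule is stored as a pair (lhs, rhs) of tuples of symbols, and the function name `remove_start_from_rhs` and the default name "S'" for the new variable are hypothetical.

```python
def remove_start_from_rhs(variables, rules, start, fresh="S'"):
    """Return (V', R') such that the start variable occurs on no right-hand side.

    Assumed encoding: a rule is a pair (lhs, rhs), both tuples of symbols,
    so general (type-0) rules such as bS -> epsilon can be represented.
    """
    assert fresh not in variables, "the new variable must not be in V"

    def rename(word):
        # Replace every occurrence of the start variable with the new one.
        return tuple(fresh if symbol == start else symbol for symbol in word)

    # Step 1: a renamed copy r' of every rule r.
    new_rules = {(rename(lhs), rename(rhs)) for lhs, rhs in rules}
    # Step 2: for every rule S -> w, a rule S -> w' with the renamed w.
    new_rules |= {(lhs, rename(rhs)) for lhs, rhs in rules if lhs == (start,)}
    return variables | {fresh}, new_rules


# The example grammar from the slides: bS -> eps, S -> XabS, bX -> aSa, X -> abc
R = {
    (("b", "S"), ()),
    (("S",), ("X", "a", "b", "S")),
    (("b", "X"), ("a", "S", "a")),
    (("X",), ("a", "b", "c")),
}
V_new, R_new = remove_start_from_rhs({"S", "X"}, R, "S")
for lhs, rhs in sorted(R_new):
    print("".join(lhs), "->", "".join(rhs) or "eps")
```

On the example grammar, the sketch yields the four renamed rules plus the additional rule S → XabS′.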

Epsilon Rules

Theorem

For every grammar G with rules R ⊆ V × (Σ ∪ ΣV ∪ {ε}) there is a regular grammar G′ with L(G) = L(G′).

Epsilon Rules: Example

Let’s again first illustrate the idea.

Consider G = ⟨{S, X, Y}, {a, b}, R, S⟩ with the following rules in R:

S → ε   S → aX   X → aX   X → aY   Y → bY   Y → ε

1 The start variable does not occur on a right-hand side. ✓

2 Determine the set of variables that can be replaced with the empty word: V_ε = {S, Y}.

3 Eliminate forbidden rules: remove Y → ε.

4 If a variable from V_ε occurs in the right-hand side, add another rule that directly emulates a subsequent replacement with the empty word: X → a and Y → b.

Epsilon Rules

Theorem

For every grammar G with rules R ⊆ V × (Σ ∪ ΣV ∪ {ε}) there is a regular grammar G′ with L(G) = L(G′).

Proof.

Let G = ⟨V, Σ, R, S⟩ be a grammar s.t. R ⊆ V × (Σ ∪ ΣV ∪ {ε}).

Use the previous proof to construct grammar G′ = ⟨V′, Σ, R′, S⟩ s.t. R′ ⊆ V′ × (Σ ∪ Σ(V′ \ {S}) ∪ {ε}) and L(G′) = L(G).

Let V_ε = {A | A → ε ∈ R′}.

Let R″ be the rule set that is created from R′ by removing all rules of the form A → ε (A ≠ S). Additionally, for every rule of the form B → xA with A ∈ V_ε, B ∈ V′, x ∈ Σ we add a rule B → x to R″.

Then G″ = ⟨V′, Σ, R″, S⟩ is regular and L(G) = L(G″).
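The second step of the proof can be sketched as follows. The sketch assumes that the start variable has already been removed from all right-hand sides, and that a rule A → w is stored as a pair (A, w) with w a tuple of the form (), (x,) or (x, B). The function name `eliminate_epsilon_rules` and this encoding are illustrative assumptions, not part of the lecture.

```python
def eliminate_epsilon_rules(rules, start):
    """Build R'' from R' as in the proof.

    Assumed precondition: every rule is (A, rhs) with rhs one of
    (), (x,), (x, B), and `start` occurs on no right-hand side.
    """
    # V_eps: variables that can be replaced with the empty word directly.
    nullable = {lhs for lhs, rhs in rules if rhs == ()}
    # Remove all rules A -> eps with A != S.
    kept = {(lhs, rhs) for lhs, rhs in rules if rhs != () or lhs == start}
    # For every rule B -> xA with A in V_eps, add B -> x.
    added = {
        (lhs, (rhs[0],))
        for lhs, rhs in rules
        if len(rhs) == 2 and rhs[1] in nullable
    }
    return kept | added


# The example grammar from the slides.
R = {
    ("S", ()), ("S", ("a", "X")),
    ("X", ("a", "X")), ("X", ("a", "Y")),
    ("Y", ("b", "Y")), ("Y", ()),
}
for lhs, rhs in sorted(eliminate_epsilon_rules(R, "S")):
    print(lhs, "->", "".join(rhs) or "eps")
```

For the example grammar this drops Y → ε and adds X → a and Y → b, matching steps 3 and 4 of the illustration.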

Exercise (Slido)

Consider G = ⟨{S, X, Y}, {a, b}, R, S⟩ with the following rules in R:

S → ε   S → aX

X → aX   X → aY

Y → bY   Y → ε

Is G a regular grammar?

Is L(G) regular?

What is L(G)?

Finite Automata

Languages Recognized by DFAs are Regular

Theorem

Every language recognized by a DFA is regular (type 3).

Proof.

Let M = ⟨Q, Σ, δ, q₀, F⟩ be a DFA.

We define a regular grammar G with L(G) = L(M).

Define G = ⟨Q, Σ, R, q₀⟩ where R contains a rule q → aq′ for every δ(q, a) = q′, and a rule q → ε for every q ∈ F.

(We can eliminate forbidden epsilon rules as described at the start of the chapter.)

For every w = a₁a₂...aₙ ∈ Σ*:

w ∈ L(M)

iff there is a sequence of states q′₀, q′₁, ..., q′ₙ with q′₀ = q₀, q′ₙ ∈ F and δ(q′ᵢ₋₁, aᵢ) = q′ᵢ for all i ∈ {1, ..., n}

iff there is a sequence of variables q′₀, q′₁, ..., q′ₙ where q′₀ is the start variable and we have q′₀ ⇒ a₁q′₁ ⇒ a₁a₂q′₂ ⇒ ··· ⇒ a₁a₂...aₙq′ₙ ⇒ a₁a₂...aₙ

iff w ∈ L(G)
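The grammar defined in this proof can be generated directly from a transition table. The sketch below is illustrative and rests on assumptions: δ is stored as a dictionary from (state, symbol) pairs to states, a rule q → w is a pair (q, w) with w a tuple, and the function name `dfa_to_grammar` as well as the example DFA (words over {0, 1} that end in 1) are hypothetical.

```python
def dfa_to_grammar(states, alphabet, delta, q0, accepting):
    """Return the grammar <Q, Sigma, R, q0> from the proof.

    Assumed encoding: delta maps (q, a) to the successor state and is total.
    The result may still contain forbidden epsilon rules q -> eps.
    """
    # A rule q -> a q' for every transition delta(q, a) = q'.
    rules = {(q, (a, delta[(q, a)])) for q in states for a in alphabet}
    # A rule q -> eps for every accepting state q.
    rules |= {(q, ()) for q in accepting}
    return states, alphabet, rules, q0


# Hypothetical DFA: accepts all words over {0, 1} that end in 1.
delta = {
    ("q0", "0"): "q0", ("q0", "1"): "q1",
    ("q1", "0"): "q0", ("q1", "1"): "q1",
}
_, _, rules, start = dfa_to_grammar({"q0", "q1"}, {"0", "1"}, delta, "q0", {"q1"})
for lhs, rhs in sorted(rules):
    print(lhs, "->", " ".join(rhs) or "eps")
```

The rule q1 → ε produced here is a forbidden epsilon rule in the sense of the regular grammar definition, so the result still has to go through the elimination described at the start of the chapter.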

Exercise (Slido)

[Figure: state diagram of a DFA with states q0, q1, q2 and transition labels 0 and 1]

Specify a regular grammar that generates the language recognized by this DFA.

Question

Is the converse true as well: for every regular language, is there a DFA that recognizes it? That is, are the languages recognized by DFAs exactly the regular languages?

Yes!

We will prove this via a detour.

Regular Grammars are No More Powerful than NFAs

Theorem

For every regular grammar G there is an NFA M with L(G) = L(M).

Proof illustration:

Consider G = ⟨{S, A, B}, {a, b}, R, S⟩ with the following rules in R:

S → ε   S → aA   A → aA   A → aB

A → a   B → bB   B → b

Proof.

Let G = ⟨V, Σ, R, S⟩ be a regular grammar.

Define NFA M = ⟨Q, Σ, δ, q₀, F⟩ with

Q = V ∪ {X}, X ∉ V

q₀ = S

F = {S, X} if S → ε ∈ R, and F = {X} if S → ε ∉ R

B ∈ δ(A, a) if A → aB ∈ R

X ∈ δ(A, a) if A → a ∈ R

For every w = a₁a₂...aₙ ∈ Σ* with n ≥ 1:

w ∈ L(G)

iff there is a sequence of variables A₁, A₂, ..., Aₙ₋₁ with S ⇒ a₁A₁ ⇒ a₁a₂A₂ ⇒ ··· ⇒ a₁a₂...aₙ₋₁Aₙ₋₁ ⇒ a₁a₂...aₙ

iff there is a sequence of variables A₁, A₂, ..., Aₙ₋₁ with A₁ ∈ δ(S, a₁), A₂ ∈ δ(A₁, a₂), ..., X ∈ δ(Aₙ₋₁, aₙ)

iff w ∈ L(M).

Case w = ε is also covered because S ∈ F iff S → ε ∈ R.
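The NFA from this proof can be built and simulated with a short Python sketch. It is an illustration under assumptions: a rule A → w is stored as a pair (A, w) with w a tuple of the form (), (a,) or (a, B), δ is a dictionary from (state, symbol) pairs to sets of states, and the function names `grammar_to_nfa` and `nfa_accepts` are hypothetical.

```python
def grammar_to_nfa(variables, rules, start, extra="X"):
    """Return (Q, delta, q0, F) for a regular grammar, as in the proof.

    Assumed encoding: a rule is (A, rhs) with rhs one of (), (a,), (a, B).
    """
    assert extra not in variables, "the additional state must not be in V"
    delta = {}
    for lhs, rhs in rules:
        if len(rhs) == 2:
            # A -> aB gives B in delta(A, a).
            delta.setdefault((lhs, rhs[0]), set()).add(rhs[1])
        elif len(rhs) == 1:
            # A -> a gives X in delta(A, a).
            delta.setdefault((lhs, rhs[0]), set()).add(extra)
    accepting = {extra}
    if (start, ()) in rules:
        # S -> eps makes the start state accepting as well.
        accepting.add(start)
    return variables | {extra}, delta, start, accepting


def nfa_accepts(delta, q0, accepting, word):
    """Simulate the NFA by tracking the set of currently reachable states."""
    current = {q0}
    for symbol in word:
        current = {q2 for q in current for q2 in delta.get((q, symbol), set())}
    return bool(current & accepting)


# The grammar from the proof illustration.
R = {
    ("S", ()), ("S", ("a", "A")), ("A", ("a", "A")), ("A", ("a", "B")),
    ("A", ("a",)), ("B", ("b", "B")), ("B", ("b",)),
}
Q, delta, q0, F = grammar_to_nfa({"S", "A", "B"}, R, "S")
for word in ["", "a", "aa", "aab", "ab"]:
    print(repr(word), nfa_accepts(delta, q0, F, word))
```

Tracking the set of reachable states avoids backtracking over the nondeterministic choices. The variable names S, A, B double as state names, exactly as in the proof.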

Finite Automata and Regular Languages

[Figure: diagram relating DFA, regular grammar and NFA]

In particular, this implies:

Corollary

L regular ⇐⇒ L is recognized by a DFA.

L regular ⇐⇒ L is recognized by an NFA.

Summary

Regular grammars restrict the usage of ε in rules.

This restriction is not necessary for the characterization of regular languages but convenient if we want to prove something for all regular languages.

Finite automata (DFAs and NFAs) recognize exactly the regular languages.