Theory of Computer Science
C5. Context-free Languages: Normal Form and PDA
Gabriele Röger
University of Basel
April 1, 2019
Overview
[Figure: course map. Automata & Formal Languages: Languages & Grammars; Regular Languages; Context-free Languages (ε-rules, Chomsky Normal Form, PDAs, Pumping Lemma, Closure Properties, Decidability); Context-sensitive & Type-0 Languages]
Context-free Grammars andε-Rules Chomsky Normal Form Push-Down Automata Summary
Context-free Grammars and ε-Rules
Repetition: Context-free Grammars
Definition (Context-free Grammar)
A context-free grammar is a 4-tuple ⟨Σ, V, P, S⟩ with
1 Σ finite alphabet of terminal symbols,
2 V finite set of variables (with V ∩ Σ = ∅),
3 P ⊆ (V × (V ∪ Σ)⁺) ∪ {⟨S, ε⟩} finite set of rules, where, if S → ε ∈ P, all other rules are in V × ((V \ {S}) ∪ Σ)⁺,
4 S ∈ V start variable.
A rule X → ε is only allowed if X = S and S never occurs on a right-hand side.
With regular grammars, this restriction could be lifted.
How about context-free grammars?
ε-Rules
Theorem
For every grammar G with rules P ⊆ V × (V ∪ Σ)* there is a context-free grammar G′ with L(G) = L(G′).
Proof.
Let G = ⟨Σ, V, P, S⟩ be a grammar with P ⊆ V × (V ∪ Σ)*. Let V_ε = {A ∈ V | A ⇒* ε}. We can find this set V_ε by first collecting all variables A with a rule A → ε ∈ P and then successively adding further variables B if there is a rule B → A_1 A_2 . . . A_k ∈ P and the variables A_i are already in the set for all 1 ≤ i ≤ k. . . .
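The fixed-point computation of V_ε described above can be sketched in Python. This is a minimal illustration, not part of the original slides: the rule encoding (pairs of a left-hand side and a tuple of symbols, with the empty tuple standing for ε), the function name nullable_variables, and the example grammar are all assumptions made for this sketch.

```python
def nullable_variables(rules):
    """Return the set V_eps of variables A with A =>* eps.

    `rules` is a set of (lhs, rhs) pairs; rhs is a tuple of symbols,
    and the empty tuple () encodes eps (an encoding assumed for this sketch).
    """
    # Start with all variables A that have a rule A -> eps.
    nullable = {lhs for lhs, rhs in rules if rhs == ()}
    changed = True
    while changed:
        changed = False
        for lhs, rhs in rules:
            # B -> A_1 ... A_k where every A_i is already in the set.
            if lhs not in nullable and all(s in nullable for s in rhs):
                nullable.add(lhs)
                changed = True
    return nullable


# Hypothetical example grammar: S -> AB | a,  A -> eps | a,  B -> A | b
example_rules = {
    ("S", ("A", "B")), ("S", ("a",)),
    ("A", ()), ("A", ("a",)),
    ("B", ("A",)), ("B", ("b",)),
}
print(sorted(nullable_variables(example_rules)))
```

Terminal symbols never enter the set because only left-hand sides are added, so a rule whose right-hand side contains a terminal can never trigger the update.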
Proof (continued).
Let P′ be the rule set that is constructed from P by
1 adding rules that obviate the need for A → ε rules: for every existing rule B → w with B ∈ V, w ∈ (V ∪ Σ)⁺, let I_ε be the set of positions where w contains a variable A ∈ V_ε. For every non-empty set I′ ⊆ I_ε, add a new rule B → w′, where w′ is constructed from w by removing the variables at all positions in I′.
2 removing all rules of the form A → ε (after the previous step).
. . .
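The construction of P′ can be sketched as follows. This is an illustrative sketch under assumptions: rules are encoded as (lhs, rhs) pairs with rhs a tuple of symbols and () for ε, the set V_ε is passed in as the parameter nullable, and the function name remove_epsilon_rules is made up for this example.

```python
from itertools import combinations


def remove_epsilon_rules(rules, nullable):
    """Build P' from P: add shortened variants of each rule, then drop eps-rules."""
    new_rules = set(rules)
    for lhs, rhs in rules:
        if not rhs:
            continue  # only rules B -> w with non-empty w are expanded
        # I_eps: positions of w that hold a variable from V_eps
        positions = [i for i, s in enumerate(rhs) if s in nullable]
        # every non-empty subset I' of I_eps yields one shortened rule B -> w'
        for size in range(1, len(positions) + 1):
            for drop in combinations(positions, size):
                shortened = tuple(s for i, s in enumerate(rhs) if i not in drop)
                new_rules.add((lhs, shortened))
    # Remove all rules A -> eps, including those created by dropping every symbol.
    return {(lhs, rhs) for lhs, rhs in new_rules if rhs}


# Hypothetical example: S -> AB | a,  A -> eps | a,  B -> A | b  with V_eps = {S, A, B}
p = {
    ("S", ("A", "B")), ("S", ("a",)),
    ("A", ()), ("A", ("a",)),
    ("B", ("A",)), ("B", ("b",)),
}
p_prime = remove_epsilon_rules(p, {"S", "A", "B"})
```

Note that the number of added rules can grow exponentially in the number of nullable positions of a single right-hand side, since every non-empty subset of I_ε is enumerated.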
Proof (continued).
Then L(G) \ {ε} = L(⟨Σ, V, P′, S⟩) and P′ contains no rule A → ε. If the start variable S of G is not in V_ε, we are done.
Otherwise, let S′ be a new variable and construct P″ from P′ by
1 replacing all occurrences of S on the right-hand side of rules with S′,
2 adding the rule S′ → w for every rule S → w, and
3 adding the rule S → ε.
Then L(G) = L(⟨Σ, V ∪ {S′}, P″, S⟩).
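The three steps that turn P′ into P″ can be sketched like this. The encoding of rules as (lhs, rhs) pairs with tuples of symbols, the function name add_start_epsilon, and the example rule set are assumptions of this sketch; the caller is assumed to supply a variable name for S′ that does not yet occur in the grammar.

```python
def add_start_epsilon(rules, start, fresh):
    """Build P'' from P' when the start variable is in V_eps.

    `fresh` plays the role of S' and is assumed not to occur in `rules`.
    """
    # 1. Replace all occurrences of S on right-hand sides with S'.
    renamed = {
        (lhs, tuple(fresh if s == start else s for s in rhs))
        for lhs, rhs in rules
    }
    # 2. Add the rule S' -> w for every rule S -> w.
    copies = {(fresh, rhs) for lhs, rhs in renamed if lhs == start}
    # 3. Add the rule S -> eps, encoded as the empty tuple.
    return renamed | copies | {(start, ())}


# Hypothetical P' obtained from S -> aSb | eps:  S -> aSb | ab
p_prime = {("S", ("a", "S", "b")), ("S", ("a", "b"))}
p_double_prime = add_start_epsilon(p_prime, "S", "S'")
```

After this step S no longer occurs on any right-hand side, so the rule S → ε satisfies the restriction from the definition of context-free grammars.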
Chomsky Normal Form
Chomsky Normal Form: Motivation
As with logical formulas (and other kinds of structured objects), normal forms for grammars are useful:
they show which aspects are critical for defining grammars and which ones are just syntactic sugar,
they allow proofs and algorithms to be restricted to a limited set of grammars (inputs): those in normal form.
Hence we now consider a normal form for context-free grammars.
Chomsky Normal Form: Definition
Definition (Chomsky Normal Form)
A context-free grammar G is in Chomsky normal form (CNF) if all rules have one of the following three forms:
A → BC with variables A, B, C, or
A → a with variable A, terminal symbol a, or
S → ε with start variable S.
In short: rule set P ⊆ (V × (VV ∪ Σ)) ∪ {⟨S, ε⟩}
German: Chomsky-Normalform
Chomsky Normal Form: Theorem
Theorem
For every context-free grammar G there is a context-free grammar G′ in Chomsky normal form with L(G) = L(G′).
Proof.
The following algorithm converts the rule set of G into CNF:
Step 1: Eliminate rules of the form A → B with variables A, B.
If there are sets of variables {B_1, . . . , B_k} with rules B_1 → B_2, B_2 → B_3, . . . , B_{k−1} → B_k, B_k → B_1, then replace these variables by a new variable B.
Define a strict total order < on the variables such that A → B ∈ P implies that A < B. Iterate from the largest to the smallest variable A and eliminate all rules of the form A → B while adding rules A → w for every rule B → w with w ∈ (V ∪ Σ)⁺. . . .
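The second half of Step 1 can be sketched in Python. This sketch assumes that cycles among rules of the form A → B have already been collapsed, so it only covers the ordering and elimination part. The rule encoding (pairs of lhs and a tuple of symbols), the function name eliminate_chain_rules, and the use of a depth-first post-order to obtain the order < are choices made for this illustration.

```python
def eliminate_chain_rules(rules, variables):
    """Remove rules A -> B (B a variable); assumes these rules form no cycle."""
    chain = {(a, rhs[0]) for a, rhs in rules
             if len(rhs) == 1 and rhs[0] in variables}

    # Depth-first post-order: every B with A -> B is listed before A,
    # so `order` runs from the largest to the smallest variable w.r.t. <.
    order, seen = [], set()

    def visit(v):
        if v in seen:
            return
        seen.add(v)
        for a, b in chain:
            if a == v:
                visit(b)
        order.append(v)

    for v in sorted(variables):
        visit(v)

    result = set(rules)
    for a in order:
        targets = [rhs[0] for lhs, rhs in result
                   if lhs == a and len(rhs) == 1 and rhs[0] in variables]
        for b in targets:
            result.discard((a, (b,)))
            # B was processed earlier, so none of its rules is a chain rule.
            result |= {(a, w) for lhs, w in result if lhs == b}
    return result


# Hypothetical example: S -> A | ab,  A -> B | a,  B -> b
rules = {("S", ("A",)), ("S", ("a", "b")), ("A", ("B",)),
         ("A", ("a",)), ("B", ("b",))}
no_chains = eliminate_chain_rules(rules, {"S", "A", "B"})
```

Processing the largest variable first is what keeps the result free of chain rules: when A → B is replaced, all rules of B already have a right-hand side that is not a single variable.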
Proof (continued).
Step 2: Eliminate rules with terminal symbols on the right-hand side that do not have the form A → a.
For every terminal symbol a ∈ Σ add a new variable A_a and the rule A_a → a.
Replace all terminal symbols in all rules that do not have the form A → a with the corresponding newly added variables. . . .
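Step 2 can be sketched as below. The rule encoding (pairs of lhs and a tuple of symbols), the function name isolate_terminals, and the naming scheme "A_" + terminal for the new variables are assumptions of this sketch; the generated names are assumed not to clash with existing variables.

```python
def isolate_terminals(rules, terminals):
    """Replace terminals by variables A_a in all rules not of the form A -> a."""
    # New variable A_a with rule A_a -> a for every terminal a.
    result = {(f"A_{a}", (a,)) for a in terminals}
    for lhs, rhs in rules:
        if len(rhs) == 1 and rhs[0] in terminals:
            result.add((lhs, rhs))  # already of the form A -> a, keep as is
        else:
            replaced = tuple(f"A_{s}" if s in terminals else s for s in rhs)
            result.add((lhs, replaced))
    return result


# Hypothetical example: S -> aXb,  X -> c
rules = {("S", ("a", "X", "b")), ("X", ("c",))}
step2 = isolate_terminals(rules, {"a", "b", "c"})
```

After this step every right-hand side is either a single terminal or a sequence of variables only.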
Proof (continued).
Step 3: Eliminate rules of the form A → B_1 B_2 . . . B_k with k > 2.
For every rule of the form A → B_1 B_2 . . . B_k with k > 2, add new variables C_2, . . . , C_{k−1} and replace the rule with
A → B_1 C_2
C_2 → B_2 C_3
...
C_{k−1} → B_{k−1} B_k
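Step 3 can be sketched as follows. The rule encoding (pairs of lhs and a tuple of symbols), the function name binarize, and the helper names of the form C<rule number>_<index> are assumptions of this sketch; the helper names are assumed to be unused in the input grammar.

```python
def binarize(rules):
    """Replace every rule A -> B_1 ... B_k with k > 2 by a chain of binary rules."""
    result = set()
    counter = 0
    for lhs, rhs in sorted(rules):  # sorted only to make helper names deterministic
        k = len(rhs)
        if k <= 2:
            result.add((lhs, rhs))
            continue
        counter += 1
        helpers = [f"C{counter}_{i}" for i in range(2, k)]  # C_2, ..., C_{k-1}
        heads = [lhs] + helpers
        for i, head in enumerate(heads):
            # A -> B_1 C_2, C_2 -> B_2 C_3, ..., C_{k-1} -> B_{k-1} B_k
            tail = helpers[i] if i < len(helpers) else rhs[-1]
            result.add((head, (rhs[i], tail)))
    return result


# Hypothetical example: S -> ABCD
step3 = binarize({("S", ("A", "B", "C", "D"))})
```

A rule with k symbols on the right-hand side is replaced by k − 1 binary rules using k − 2 new variables.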
Chomsky Normal Form: Length of Derivations
Observation
Let G be a grammar in Chomsky normal form, and let w ∈ L(G) be a non-empty word generated by G.
Then all derivations of w have exactly 2|w| − 1 derivation steps.
Proof.
See the exercises.
Push-Down Automata
Limitations of Finite Automata
[Figure: example finite automaton with states q_0, q_1, q_2 and transitions labeled 0 and 0,1]
Language L is regular.
⇐⇒ There is a finite automaton that accepts L.
What information can a finite automaton “store” about the already read part of the word?
Infinite memory would be required for
L = {x_1 x_2 . . . x_n x_n . . . x_2 x_1 | n > 0, x_i ∈ {a, b}}.
Therefore: extension of the automata model with memory.
Stack
A stack is a data structure following the last-in-first-out (LIFO) principle, supporting the following operations:
push: puts an object on top of the stack
pop: removes the object at the top of the stack
peek: returns the top object without removing it
[Figure: a stack with push and pop at the top]
German: Keller, Stapel
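As a small illustration of the three operations, a Python list can serve as a stack when its end is treated as the top. The variable names are chosen for this example only.

```python
stack = []            # empty stack; the end of the list is the top
stack.append("A")     # push A
stack.append("B")     # push B, which is now on top
top = stack[-1]       # peek: reads the top object, stack is unchanged
popped = stack.pop()  # pop: removes and returns the top object
```

The object pushed last (B) is the first one to be removed again, which is the LIFO principle.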
Push-down Automata: Visually
[Figure: a push-down automaton with a read head on an input tape (holding the word “Input”) and access to a stack]
German: Kellerautomat, Eingabeband, Lesekopf, Kellerzugriff
Push-down Automata: Definition
Definition (Push-down Automaton)
A push-down automaton (PDA) is a 6-tuple M = ⟨Q, Σ, Γ, δ, q_0, #⟩ with
Q finite set of states
Σ the input alphabet
Γ the stack alphabet
δ : Q × (Σ ∪ {ε}) × Γ → P_f(Q × Γ*) the transition function (where P_f is the set of all finite subsets)
q_0 ∈ Q the start state
# ∈ Γ the bottommost stack symbol
German: Kellerautomat, Eingabealphabet, Kelleralphabet, Überführungsfunktion
Push-down Automata: Transition Function
Let M = ⟨Q, Σ, Γ, δ, q_0, #⟩ be a push-down automaton.
What is the intuitive meaning of the transition function δ?
⟨q′, B_1 . . . B_k⟩ ∈ δ(q, a, A): If M is in state q, reads symbol a and has A as the topmost stack symbol, then M can transition to q′ in the next step while replacing A with B_1 . . . B_k (afterwards B_1 is the topmost stack symbol).
[Figure: edge from q to q′ labeled a, A → B_1 . . . B_k]
The special case a = ε is allowed (spontaneous transition).
Push-down Automata: Example
[Figure: PDA with states q and q′. Loop at q: a,A→AA; a,B→AB; a,#→A#; b,A→BA; b,B→BB; b,#→B#. Edge from q to q′: a,A→ε; b,B→ε. Loop at q′: a,A→ε; b,B→ε; ε,#→ε]
M = ⟨{q, q′}, {a, b}, {A, B, #}, δ, q, #⟩ with
δ(q, a, A) = {⟨q, AA⟩, ⟨q′, ε⟩}    δ(q, b, A) = {⟨q, BA⟩}    δ(q, ε, A) = ∅
δ(q, a, B) = {⟨q, AB⟩}    δ(q, b, B) = {⟨q, BB⟩, ⟨q′, ε⟩}    δ(q, ε, B) = ∅
δ(q, a, #) = {⟨q, A#⟩}    δ(q, b, #) = {⟨q, B#⟩}    δ(q, ε, #) = ∅
δ(q′, a, A) = {⟨q′, ε⟩}    δ(q′, b, A) = ∅    δ(q′, ε, A) = ∅
δ(q′, a, B) = ∅    δ(q′, b, B) = {⟨q′, ε⟩}    δ(q′, ε, B) = ∅
δ(q′, a, #) = ∅    δ(q′, b, #) = ∅    δ(q′, ε, #) = {⟨q′, ε⟩}
Push-down Automata: Configuration
Definition (Configuration of a Push-down Automaton)
A configuration of a push-down automaton M = ⟨Q, Σ, Γ, δ, q_0, #⟩ is given by a triple c ∈ Q × Σ* × Γ*.
German: Konfiguration
Example
[Figure: PDA in state q reading the input tape “Input”]
Configuration ⟨q, ut, BAC#⟩.
Push-down Automata: Steps
Definition (Transition/Step of a Push-down Automaton)
We write c ⊢_M c′ if a push-down automaton M = ⟨Q, Σ, Γ, δ, q_0, #⟩ can transition from configuration c to configuration c′ in one step.
Exactly the following transitions are possible:
⟨q, a_1 . . . a_n, A_1 . . . A_m⟩ ⊢_M ⟨q′, a_2 . . . a_n, B_1 . . . B_k A_2 . . . A_m⟩ if ⟨q′, B_1 . . . B_k⟩ ∈ δ(q, a_1, A_1)
⟨q, a_1 . . . a_n, A_1 . . . A_m⟩ ⊢_M ⟨q′, a_1 a_2 . . . a_n, B_1 . . . B_k A_2 . . . A_m⟩ if ⟨q′, B_1 . . . B_k⟩ ∈ δ(q, ε, A_1)
German: Übergang
If M is clear from context, we only write c ⊢ c′.
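The step relation can be sketched as a function that lists all successor configurations. The encoding is an assumption of this sketch: configurations are triples of strings (state, remaining input, stack with the topmost symbol first), δ is a dictionary whose missing keys stand for ∅, the empty string stands for ε, and the state q′ is written "q'". The transition table is the one from the example PDA; the function name successors is made up.

```python
# Transition function of the example PDA; absent keys mean the empty set.
DELTA = {
    ("q", "a", "A"): {("q", "AA"), ("q'", "")},
    ("q", "b", "A"): {("q", "BA")},
    ("q", "a", "B"): {("q", "AB")},
    ("q", "b", "B"): {("q", "BB"), ("q'", "")},
    ("q", "a", "#"): {("q", "A#")},
    ("q", "b", "#"): {("q", "B#")},
    ("q'", "a", "A"): {("q'", "")},
    ("q'", "b", "B"): {("q'", "")},
    ("q'", "", "#"): {("q'", "")},
}


def successors(delta, config):
    """Return all configurations c' reachable from `config` in one step."""
    state, word, stack = config
    if not stack:
        return set()  # no topmost symbol A_1, so no transition applies
    top, rest = stack[0], stack[1:]
    result = set()
    if word:
        # read a_1 and replace A_1 by B_1 ... B_k
        for new_state, push in delta.get((state, word[0], top), set()):
            result.add((new_state, word[1:], push + rest))
    # spontaneous transition: the input stays untouched
    for new_state, push in delta.get((state, "", top), set()):
        result.add((new_state, word, push + rest))
    return result


print(successors(DELTA, ("q", "ab", "A#")))
```

Because δ maps to sets, a configuration can have several successors; here reading a with A on top either pushes another A or moves to q′ while popping.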
Push-down Automata: Reachability of Configurations
Definition (Reachable Configuration)
Configuration c′ is reachable from configuration c in PDA M (c ⊢*_M c′) if there are configurations c_0, . . . , c_n (n ≥ 0) where
c_0 = c,
c_i ⊢_M c_{i+1} for all i ∈ {0, . . . , n − 1}, and
c_n = c′.
German: c′ ist in M von c erreichbar
Push-down Automata: Recognized Words
Definition (Recognized Word of a Push-down Automaton)
PDA M = ⟨Q, Σ, Γ, δ, q_0, #⟩ recognizes the word w = a_1 . . . a_n iff the configuration ⟨q, ε, ε⟩ (word processed and stack empty) for some q ∈ Q is reachable from the start configuration ⟨q_0, w, #⟩.
M recognizes w iff ⟨q_0, w, #⟩ ⊢*_M ⟨q, ε, ε⟩ for some q ∈ Q.
German: M erkennt w, Startkonfiguration
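Recognition can be sketched as a breadth-first search over reachable configurations. Everything about the encoding is an assumption of this sketch: configurations are triples of strings with the topmost stack symbol first, the empty string stands for ε, "q'" stands for q′, and the function name recognizes is made up. The bound max_configs is also an addition of this sketch: a PDA with spontaneous transitions can reach infinitely many configurations, so the search gives up (and returns False) once the bound is exceeded.

```python
from collections import deque

# Transition function of the example PDA; absent keys mean the empty set.
DELTA = {
    ("q", "a", "A"): {("q", "AA"), ("q'", "")},
    ("q", "b", "A"): {("q", "BA")},
    ("q", "a", "B"): {("q", "AB")},
    ("q", "b", "B"): {("q", "BB"), ("q'", "")},
    ("q", "a", "#"): {("q", "A#")},
    ("q", "b", "#"): {("q", "B#")},
    ("q'", "a", "A"): {("q'", "")},
    ("q'", "b", "B"): {("q'", "")},
    ("q'", "", "#"): {("q'", "")},
}


def recognizes(delta, start_state, bottom, word, max_configs=100_000):
    """Search for a configuration <q, eps, eps> reachable from <q_0, w, #>."""
    start = (start_state, word, bottom)
    seen, queue = {start}, deque([start])
    while queue and len(seen) <= max_configs:
        state, rest, stack = queue.popleft()
        if rest == "" and stack == "":
            return True  # word processed and stack empty
        if not stack:
            continue     # empty stack with input left: dead end
        top, below = stack[0], stack[1:]
        moves = []
        if rest:
            moves += [(q2, rest[1:], push + below)
                      for q2, push in delta.get((state, rest[0], top), ())]
        moves += [(q2, rest, push + below)
                  for q2, push in delta.get((state, "", top), ())]
        for config in moves:
            if config not in seen:
                seen.add(config)
                queue.append(config)
    return False  # not found (or search bound exceeded)


print(recognizes(DELTA, "q", "#", "bbabbabb"))
```

For the example PDA every transition except ε,#→ε consumes an input symbol, so the search always terminates well within the bound.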
Push-down Automata: Recognized Word Example
[Figure: the PDA with states q and q′ from the slide “Push-down Automata: Example”]
Example: this PDA recognizes bbabbabb (worked out on the blackboard).
Push-down Automata: Accepted Language
Definition (Accepted Language of a Push-down Automaton)
Let M be a push-down automaton with input alphabet Σ.
The language accepted by M is defined as
L(M) = {w ∈ Σ* | M recognizes w}.
Example: on the blackboard.
PDAs Accept Exactly the Context-free Languages
Theorem
A language L is context-free if and only if L is accepted by a push-down automaton.
Summary
Every context-free language has a grammar in Chomsky normal form. All rules have the form
A → BC with variables A, B, C, or
A → a with variable A, terminal symbol a, or
S → ε with start variable S.
Push-down automata (PDAs) extend NFAs with memory.
PDAs accept not with end states but with an empty stack.
The languages accepted by PDAs are exactly the context-free languages.