Context-Free Analysis
Lecture Compilers SS 2009
Dr.-Ing. Ina Schaefer
Software Technology Group TU Kaiserslautern
Ina Schaefer Context-Free Analysis 1
Content of Lecture
1. Introduction: Overview and Motivation
2. Syntax and Type Analysis
2.1 Lexical Analysis
2.2 Context-free Syntax Analysis
2.3 Context-sensitive Syntax Analysis
3. Translation to Target Language
3.1 Translation of Imperative Language Constructs
3.2 Translation of Object-Oriented Language Constructs
4. Selected Aspects of Compilers
4.1 Intermediate Languages
4.2 Optimization
4.3 Command Selection
4.4 Register Allocation
4.5 Code Generation
5. Garbage Collection
6. XML Processing (DOM, SAX, XSLT)
Outline of Context-Free Analysis
1. Specification of Parsers
2. Implementation of Parsers
• Top-Down Syntax Analysis
  - Recursive Descent
  - LL(k) Parsing Theory
  - LL Parser Generation
• Bottom-Up Syntax Analysis
  - Principles of LR Parsing
  - LR Parsing Theory
  - SLR, LALR, LR(k) Parsing
  - LALR Parser Generation
3. Error Handling
4. Concrete and Abstract Syntax
Context-Free Syntax Analysis
Tasks
• Check whether the token stream (from the scanner) matches the context-free syntax of the language
  - Error case: error handling
  - Correct case: construction of the syntax tree
[Figure: the token stream is the input of the parser]
Context-Free Syntax Analysis (2)
Remarks:
• Parsing can be interleaved with other actions that process the program (e.g. attribute evaluation).
• The syntax tree controls important parts of the translation. Hence, we distinguish:
  - The concrete syntax tree corresponds to the context-free grammar.
  - The abstract syntax tree is a compact representation of the essential information, aimed at the further processing steps.
Specification of Parsers
Two general specification techniques:
• Syntax Diagrams
• Context-Free Grammars (often in extended form)
Context-Free Grammars
Definition
Let N and T be two alphabets with N ∩ T = ∅, Π a finite subset of N × (N ∪ T)∗, and S ∈ N. Then Γ = (N, T, Π, S) is a context-free grammar (CFG), where
• N is the set of non-terminals
• T is the set of terminals
• Π is the set of production rules
• S is the start symbol (axiom)
Context-Free Grammars (2)
Notations:
• A,B,C, . . . denote non-terminals
• a,b,c, . . . denote terminals
• x,y,z, . . . denote strings of terminals, i.e. x ∈ T∗
• α, β, γ, ψ, φ, σ, τ denote strings of terminals and non-terminals, i.e. α ∈ (N ∪ T)∗
Productions are denoted by A → α.
The notation A → α | β | γ | . . . is an abbreviation for the productions A → α, A → β, A → γ, . . .
Derivation
Let Γ = (N, T, Π, S) be a CFG:
• ψ is directly derivable from φ in Γ (φ produces ψ directly), written φ ⇒ ψ, if there exist σ, τ, A and α such that φ = σAτ, ψ = σατ and A → α ∈ Π.
• ψ is derivable from φ in Γ, written φ ⇒∗ ψ, if there exist φ_0, . . . , φ_n with φ = φ_0 and ψ = φ_n such that φ_i ⇒ φ_{i+1} holds for all i ∈ {0, . . . , n − 1}. The sequence φ_0, . . . , φ_n is a derivation of ψ from φ.
• ⇒∗ is the reflexive, transitive closure of ⇒.
Derivation (2)
• The derivation φ_0, . . . , φ_n is a left derivation (right derivation) if in each step from φ_i to φ_{i+1} the left-most (right-most) non-terminal is replaced. Left derivation steps are denoted by φ ⇒_lm ψ, right derivation steps by φ ⇒_rm ψ.
• The tree-like representation of a derivation is a syntax tree.
• L(Γ) = {z ∈ T∗ | S ⇒∗ z} is the language generated by Γ.
• x ∈ L(Γ) is a sentence of Γ.
• φ ∈ (N ∪T)∗ with S ⇒∗ φ is a sentential form of Γ.
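The notions of derivation step and left derivation can be made concrete with a few lines of Python. This is only a sketch: the dict-based grammar encoding and the helper names `leftmost_step` and `left_derivation` are assumptions made for illustration, and the grammar used is the expression grammar Γ1 that appears later in the lecture.

```python
# Grammar as a dict: non-terminal -> list of alternatives (each a list of symbols).
# Every symbol that is not a key of the dict is treated as a terminal.
GAMMA1 = {
    "S": [["E"]],
    "E": [["T", "+", "E"], ["T"]],
    "T": [["F", "*", "T"], ["F"]],
    "F": [["(", "E", ")"], ["ID"]],
}


def leftmost_step(form, grammar, choice):
    """One left derivation step: replace the left-most non-terminal of `form`
    by alternative number `choice` of that non-terminal."""
    for i, symbol in enumerate(form):
        if symbol in grammar:
            return form[:i] + grammar[symbol][choice] + form[i + 1:]
    raise ValueError("sentential form contains no non-terminal")


def left_derivation(grammar, start, choices):
    """Return all sentential forms of the left derivation given by `choices`."""
    forms = [[start]]
    for choice in choices:
        forms.append(leftmost_step(forms[-1], grammar, choice))
    return forms


# S => E => T + E => F + E => ID + E => ID + T => ID + F => ID + ID
forms = left_derivation(GAMMA1, "S", [0, 0, 1, 1, 1, 1, 1])
for form in forms:
    print(" ".join(form))
```

Every printed line except the last is a sentential form that still contains a non-terminal; the last line is a sentence of Γ1.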
Ambiguity in Grammars
• A sentence is unambiguous if it has exactly one syntax tree. A sentence is ambiguous if it has more than one syntax tree.
• For each syntax tree, there exists exactly one left derivation and exactly one right derivation.
• Thus it holds: A sentence is unambiguous if and only if it has exactly one left (right) derivation.
• A grammar is ambiguous if it generates an ambiguous sentence; otherwise it is unambiguous.
• For programming languages, unambiguous grammars are essential, as the semantics and the translation are defined by the syntactic structure.
Ambiguity in Grammars (2)
Example 1: Grammar for Expressions Γ0
• S → E
• E → E + E
• E → E ∗ E
• E → (E)
• E → ID
Consider the input string (av + av) ∗ bv + cv + dv, which the scanner passes to the context-free analysis as the token stream ( ID + ID ) ∗ ID + ID + ID.
Ambiguity in Grammars (3)
[Figure: one syntax tree for ( ID + ID ) ∗ ID + ID + ID according to Γ0]
• Syntax tree does not match conventional rules of arithmetic.
• There are several syntax trees according to Γ0 for this input, hence Γ0 is ambiguous.
Ambiguity in Grammars (4)
Example 2: Ambiguity in if-then-else construct
if B1 then if B2 then A := 9 else A := 7
The else branch can be attached to either of the two ifs, which yields two different syntax trees.
First derivation:
[Figure: first syntax tree over the token stream IF ID THEN IF ID THEN ID EQ CO ELSE ID EQ CO]
Ambiguity in Grammars (5)
Second derivation:
[Figure: second syntax tree over the token stream IF ID THEN IF ID THEN ID EQ CO ELSE ID EQ CO]
Ambiguity in Grammars (6)
Remarks:
• Each derivation corresponds to exactly one syntax tree. Conversely, there can be several derivations for one syntax tree.
• Instead of "syntax tree", the terms "structure tree" and "derivation tree" are also often used.
• For each language, there can be several generating grammars, i.e. the mapping L: Grammar → Language is in general not injective.
Ambiguity as Grammar Property
Ambiguity is a grammar property. The grammar for expressions Γ0 is an example of an ambiguous grammar.
Γ0:
• S → E
• E → E + E
• E → E ∗ E
• E → (E)
• E → ID
[Figure: two different syntax trees for ID + ID ∗ ID according to Γ0]
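Ambiguity of a concrete sentence can also be checked mechanically by counting its syntax trees. The following Python sketch does this for Γ0 and, for comparison, for the grammar Γ1 with one level per operator. The dict encoding and the name `count_trees` are assumptions made for illustration; the counting scheme only works for grammars without ε-productions and without cyclic unit rules.

```python
from functools import lru_cache

GAMMA0 = {
    "S": [["E"]],
    "E": [["E", "+", "E"], ["E", "*", "E"], ["(", "E", ")"], ["ID"]],
}
GAMMA1 = {
    "S": [["E"]],
    "E": [["T", "+", "E"], ["T"]],
    "T": [["F", "*", "T"], ["F"]],
    "F": [["(", "E", ")"], ["ID"]],
}


def count_trees(grammar, start, tokens):
    """Count the syntax trees of `tokens` with root `start`."""
    tokens = tuple(tokens)

    @lru_cache(maxsize=None)
    def sym(symbol, i, j):
        # Number of trees with root `symbol` that span tokens[i:j].
        if symbol not in grammar:  # terminal symbol
            return 1 if j == i + 1 and tokens[i] == symbol else 0
        return sum(seq(tuple(rhs), i, j) for rhs in grammar[symbol])

    @lru_cache(maxsize=None)
    def seq(rhs, i, j):
        # Number of ways in which the symbol sequence `rhs` derives tokens[i:j].
        if len(rhs) == 1:
            return sym(rhs[0], i, j)
        total = 0
        # The first symbol covers tokens[i:m]; each remaining symbol
        # needs at least one token because there are no ε-productions.
        for m in range(i + 1, j - len(rhs) + 2):
            left = sym(rhs[0], i, m)
            if left:
                total += left * seq(rhs[1:], m, j)
        return total

    return sym(start, 0, len(tokens))


tokens = "ID + ID * ID".split()
print(count_trees(GAMMA0, "S", tokens))  # 2: the sentence is ambiguous in Γ0
print(count_trees(GAMMA1, "S", tokens))  # 1: exactly one tree in Γ1
```

A count greater than one for a single sentence is enough to show that the grammar is ambiguous; a count of one for some sentences proves nothing about the grammar as a whole.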
Ambiguity as Grammar Property (2)
But there exists an unambiguous grammar for the same language:
Γ1:
• S → E
• E → T + E | T
• T → F ∗ T | F
• F → (E) | ID
[Figure: syntax tree for ( ID + ID ) ∗ ID + ID according to Γ1]
However, there are also context-free languages that can only be described by ambiguous grammars.
Literature
Recommended Reading:
• Wilhelm, Maurer: Chapter 8, pp. 271 - 283 (Syntactic Analysis)
• Appel: Chapter 3, pp. 40-47
Implementation of Parsers
Overview
• Top-Down Parsing
  - Recursive Descent
  - LL Parsing
  - LL Parser Generation
• Bottom-Up Parsing
  - LR Parsing
  - LALR, SLR, LR(k) Parsing
  - LALR Parser Generation
Methods for Context-Free Analysis
• Manually developed, grammar-specific implementation (error-prone, inflexible)
• Backtracking (simple, but inefficient)
• Cocke-Younger-Kasami algorithm (1967):
  - works for all CFGs in Chomsky normal form
  - based on the idea of dynamic programming
  - time complexity O(n^3) in the input length n, whereas linear complexity is desired
• Top-down methods: from the axiom to the token stream
• Bottom-up methods: from the token stream to the axiom
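To illustrate the dynamic-programming idea behind the Cocke-Younger-Kasami algorithm, here is a minimal Python sketch. The grammar encoding (`UNIT` for rules A → a, `BINARY` for rules A → B C) and the example grammar for the language a^n b^n (n ≥ 1) are assumptions chosen for illustration and do not come from the lecture.

```python
def cyk(word, unit, binary, start="S"):
    """Decide whether `word` is generated by a grammar in Chomsky normal form."""
    n = len(word)
    if n == 0:
        return False
    # table[i][l] holds all non-terminals that derive word[i:i+l].
    table = [[set() for _ in range(n + 1)] for _ in range(n)]
    for i, symbol in enumerate(word):
        table[i][1] = {a for a, terminal in unit if terminal == symbol}
    for length in range(2, n + 1):              # length of the substring
        for i in range(n - length + 1):         # start of the substring
            for split in range(1, length):      # length of the left part
                left = table[i][split]
                right = table[i + split][length - split]
                for a, (b, c) in binary:
                    if b in left and c in right:
                        table[i][length].add(a)
    return start in table[0][n]


# CNF grammar for { a^n b^n | n >= 1 }:  S -> A B | A X,  X -> S B,  A -> a,  B -> b
UNIT = [("A", "a"), ("B", "b")]
BINARY = [("S", ("A", "B")), ("S", ("A", "X")), ("X", ("S", "B"))]

print(cyk("aabb", UNIT, BINARY))  # True
print(cyk("aab", UNIT, BINARY))   # False
```

The three nested loops over length, start position and split point are the source of the cubic running time.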
Example: Top-Down Analysis
According to Γ1 (the result of the top-down analysis is a left derivation):
S =>
E =>
T + E =>
F * T + E =>
( E ) * T + E =>
( T + E ) * T + E =>
( F + E ) * T + E =>
( ID + E ) * T + E =>
( ID + T ) * T + E =>
( ID + F ) * T + E =>
( ID + ID ) * T + E =>
( ID + ID ) * F + E =>
( ID + ID ) * ID + E =>
( ID + ID ) * ID + T =>
( ID + ID ) * ID + F =>
( ID + ID ) * ID + ID
Example: Bottom-Up Analysis
According to Γ1 (the result of the bottom-up analysis is a right derivation, constructed in reverse):
( ID + ID ) * ID + ID <=
( F + ID ) * ID + ID <=
( T + ID ) * ID + ID <=
( T + F ) * ID + ID <=
( T + T ) * ID + ID <=
( T + E ) * ID + ID <=
( E ) * ID + ID <=
F * ID + ID <=
F * F + ID <=
F * T + ID <=
T + ID <=
T + F <=
T + T <=
T + E <=
E <=
S
Context-free Analysis with linear complexity
• Restrictions on the grammar (not every CFG has a linear-time parser)
• Use of push-down automata or systems of recursive procedures
• Use of lookahead on the remaining input in order to select the next production rule to be applied
Syntax Analysis Methods and Parser Generators
• Basic knowledge of syntax analysis is essential for the use of parser generators.
• Parser generators are not always applicable.
• Often, error handling has to be done manually.
• The methods underlying parser generation are a good example of a generic technique (and a highlight of computer science!).
Implementation of Parsers Top-Down Syntax Analysis
Top-Down Syntax Analysis
Educational Objectives
• General Principle of Top-Down Syntax Analysis
• Recursive Descent Parsing (by Example)
• Expressiveness of Top-Down Parsing
• Basic Concepts of LL(k) Parsing
Recursive Descent
Basic Idea
• Each non-terminal A is associated with a procedure. This procedure accepts a partial sentence derived from A.
• The procedure implements a finite automaton constructed from the productions whose left-hand side is A.
• Recursion in the grammar is mapped to mutually recursive procedures, so that the call stack of a higher-level programming language can be used for the implementation.
Construction of Recursive Descent Parser
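Let Γ′1 be a CFG (like Γ1) with an additional terminal # denoting the end of the input.
Γ′1:
• S → E#
• E → T + E | T
• T → F ∗ T | F
• F → (E) | ID
Construct the item automaton for each non-terminal A. The item automaton describes the analysis of those productions whose left-hand side is A.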
Item Automata
S → E#
[S → .E#] --E--> [S → E.#] --#--> [S → E#.]
E → T + E | T
{[E → .T+E], [E → .T]} --T--> {[E → T.+E], [E → T.]} --+--> [E → T+.E] --E--> [E → T+E.]
T → F ∗ T | F
{[T → .F∗T], [T → .F]} --F--> {[T → F.∗T], [T → F.]} --∗--> [T → F∗.T] --T--> [T → F∗T.]
Item Automata (2)
F → (E) | ID
{[F → .(E)], [F → .ID]} --(--> [F → (.E)] --E--> [F → (E.)] --)--> [F → (E).]
{[F → .(E)], [F → .ID]} --ID--> [F → ID.]
Recursive Descent Parsing Procedures
• Item Automata can be mapped to recursive procedures.
• The input is a token stream terminated by #.
• The variable currSymbol contains one token look ahead, i.e. the first symbol of the stream.
Recursive Descent Parsing Procedures (2)
Production: S → E#
void S() {
  E();
  if (currSymbol == '#') {
    accept();
  } else {
    error();
  }
}
Recursive Descent Parsing Procedures (3)
Production: E → T + E | T
void E() {
  T();
  if (currSymbol == '+') {
    readSymbol();
    E();
  }
}
Production: T → F ∗ T | F
void T() {
  F();
  if (currSymbol == '*') {
    readSymbol();
    T();  // parse the rest only after a '*' has been consumed
  }
}
Recursive Descent Parsing Procedures (4)
Production: F → (E) | ID
void F() {
  if (currSymbol == '(') {
    readSymbol();
    E();
    if (currSymbol == ')') {
      readSymbol();
    } else {
      error();
    }
  } else if (currSymbol == ID) {
    readSymbol();
  } else {
    error();
  }
}
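The four procedures S, E, T and F can be assembled into a complete, runnable recognizer. The following Python sketch is one possible rendering: the class name `Recognizer`, the exception `ParseError`, the helper `accepts` and the representation of tokens as strings are assumptions made for illustration.

```python
class ParseError(Exception):
    """Raised where the procedures in the lecture call error()."""


class Recognizer:
    def __init__(self, tokens):
        self.tokens = list(tokens) + ["#"]  # '#' marks the end of the input
        self.pos = 0

    @property
    def curr(self):
        # One token of lookahead (currSymbol in the lecture).
        return self.tokens[self.pos]

    def read(self):
        # Advance to the next token (readSymbol in the lecture).
        self.pos += 1

    def S(self):  # S -> E #
        self.E()
        if self.curr != "#":
            raise ParseError(f"unexpected token {self.curr!r}")

    def E(self):  # E -> T + E | T
        self.T()
        if self.curr == "+":
            self.read()
            self.E()

    def T(self):  # T -> F * T | F
        self.F()
        if self.curr == "*":
            self.read()
            self.T()

    def F(self):  # F -> ( E ) | ID
        if self.curr == "(":
            self.read()
            self.E()
            if self.curr != ")":
                raise ParseError("expected ')'")
            self.read()
        elif self.curr == "ID":
            self.read()
        else:
            raise ParseError(f"unexpected token {self.curr!r}")


def accepts(tokens):
    """Return True if the token sequence is a sentence of the grammar."""
    try:
        Recognizer(tokens).S()
        return True
    except ParseError:
        return False


print(accepts("( ID + ID ) * ID + ID".split()))
print(accepts("ID + * ID".split()))
```

Raising an exception instead of calling a global error() procedure is a design choice of this sketch; it stops the analysis at the first syntax error.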
Recursive Descent Parsing Procedures (5)
Remarks:
• Recursive descent
  - is relatively easy to implement
  - can easily be combined with other tasks (see following example)
  - is a typical example of syntax-directed methods (see also following example)
• The example uses one token of lookahead.
Recursive Descent and Evaluation
Example: Interpreter for Expressions using recursive descent
int env(Ident); // maps ID -> int
// the local variable int_result stores intermediate results
int S() {
  int int_result = E();
  if (currSymbol == '#') {
    return int_result;
  } else {
    error();
    return error_result;
  }
}
Recursive Descent and Evaluation (2)
int E() {
  int int_result = T();
  if (currSymbol == '+') {
    readSymbol();
    return int_result + E();
  }
  return int_result;  // alternative E → T
}
int T() {
  int int_result = F();
  if (currSymbol == '*') {
    readSymbol();
    return int_result * T();
  }
  return int_result;  // alternative T → F
}
Recursive Descent and Evaluation (3)
int F() {
  int int_result;
  if (currSymbol == '(') {
    readSymbol();
    int_result = E();
    if (currSymbol == ')') {
      readSymbol();
      return int_result;
    } else {
      error();
      return error_result;
    }
  } else if (currSymbol == ID) {
    int_result = env(code(ID));  // look up the identifier before advancing
    readSymbol();
    return int_result;
  } else {
    error();
    return error_result;
  }
}
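The interpreter can be tried out with the following self-contained Python sketch. It mirrors the procedures S, E, T and F; the function name `evaluate`, the use of a dict for the environment env, the convention that identifier tokens are plain names, and the use of Python's built-in `SyntaxError` in place of error() are assumptions made for illustration.

```python
def evaluate(tokens, env):
    """Evaluate an expression over +, *, parentheses and identifiers.

    `tokens` is a list of strings; `env` maps identifier names to integers.
    """
    toks = list(tokens) + ["#"]  # '#' marks the end of the input
    pos = 0

    def curr():
        return toks[pos]

    def read():
        nonlocal pos
        pos += 1

    def E():  # E -> T + E | T
        value = T()
        if curr() == "+":
            read()
            return value + E()
        return value

    def T():  # T -> F * T | F
        value = F()
        if curr() == "*":
            read()
            return value * T()
        return value

    def F():  # F -> ( E ) | ID
        if curr() == "(":
            read()
            value = E()
            if curr() != ")":
                raise SyntaxError("expected ')'")
            read()
            return value
        if curr() in env:
            value = env[curr()]  # look up the identifier before advancing
            read()
            return value
        raise SyntaxError(f"unexpected token {curr()!r}")

    result = E()  # S -> E #
    if curr() != "#":
        raise SyntaxError(f"unexpected token {curr()!r}")
    return result


env = {"av": 1, "bv": 2, "cv": 3, "dv": 4}
print(evaluate("( av + av ) * bv + cv + dv".split(), env))  # 11
```

Parsing and evaluation are interleaved here: no syntax tree is ever built, each procedure returns the value of the partial sentence it has accepted.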
Recursive Descent and Evaluation (4)
• Extending the parser with actions/computations is easy to implement, but mixing conceptually different tasks makes programs hard to maintain.
• For which grammars does the recursive descent technique work?
→ LL(k) Parsing Theory
LL Parsing
• Basis for Top-Down Syntax Analysis
• First L: Read Input from Left to Right
• Second L: Search for Left Derivations
LL(k) Grammars
Definition (LL(k) Grammar)
Let Γ = (N, T, Π, S) be a CFG and k ∈ ℕ.
Γ is an LL(k) grammar if for any two left derivations
S ⇒∗_lm uAα ⇒_lm uβα ⇒∗_lm ux and
S ⇒∗_lm uAα ⇒_lm uγα ⇒∗_lm uy
it holds that prefix(k, x) = prefix(k, y) implies β = γ.
LL(k) Grammars (2)
Remarks:
• A grammar is an LL(k) grammar if, in every step of a left derivation, the next k tokens of lookahead determine the production to apply.
• A language L ⊆ Σ∗ is LL(k) if there exists an LL(k) grammar Γ with L(Γ) = L.
• The definition of LL(k) grammars provides no method to test whether a grammar has the LL(k) property.
Non LL(k) Grammars
Example 1: Grammar with Left Recursion
Γ2:
• S → E#
• E → E + T | T
• T → T ∗ F | F
• F → (E) | ID
Elimination of Left Recursion:
Replace productions of the form A → Aα | β, where β does not start with A, by A → βA′ and A′ → αA′ | ε.
Non LL(k) Grammars (2)
Elimination of Left Recursion: From Γ2 we obtain Γ3.
Γ2:
• S → E#
• E → E + T | T
• T → T ∗ F | F
• F → (E) | ID
Γ3:
• S → E#
• E → TE′
• E′ → +TE′ | ε
• T → FT′
• T′ → ∗FT′ | ε
• F → (E) | ID
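The replacement rule for left-recursive productions is mechanical, so it can be sketched in a few lines of Python. The dict encoding, the function name and the convention of naming the new non-terminal by appending an apostrophe are assumptions made for illustration; the sketch handles only immediate left recursion (A → Aα), not left recursion through other non-terminals.

```python
def eliminate_immediate_left_recursion(grammar):
    """Rewrite A -> A α | β into A -> β A' and A' -> α A' | ε.

    `grammar` maps each non-terminal to a list of alternatives; an
    alternative is a list of symbols, and the empty list stands for ε.
    """
    result = {}
    for a, alternatives in grammar.items():
        alphas = [rhs[1:] for rhs in alternatives if rhs and rhs[0] == a]
        betas = [rhs for rhs in alternatives if not rhs or rhs[0] != a]
        if not alphas:  # no left recursion: keep the productions unchanged
            result[a] = [list(rhs) for rhs in alternatives]
            continue
        a_prime = a + "'"  # assumed not to clash with an existing name
        result[a] = [beta + [a_prime] for beta in betas]
        result[a_prime] = [alpha + [a_prime] for alpha in alphas] + [[]]
    return result


GAMMA2 = {
    "S": [["E", "#"]],
    "E": [["E", "+", "T"], ["T"]],
    "T": [["T", "*", "F"], ["F"]],
    "F": [["(", "E", ")"], ["ID"]],
}

GAMMA3 = eliminate_immediate_left_recursion(GAMMA2)
for lhs, alternatives in GAMMA3.items():
    print(lhs, "->", " | ".join(" ".join(rhs) or "ε" for rhs in alternatives))
```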
Non LL(k) Grammars (3)
Example 2: Grammar Γ4 with unlimited look ahead
• STM → VAR := VAR | ID(IDLIST)
• VAR → ID | ID(IDLIST)
• IDLIST → ID | ID, IDLIST
Γ4 is not an LL(k) grammar for any k.
(Proof: cf. Wilhelm, Maurer, Example 8.3.4, p. 319)
Transformation to LL(2) grammar Γ′4:
• STM → ASS_CALL | ID := VAR
• ASS_CALL → ID(IDLIST) ASS_CALL_REST
• ASS_CALL_REST → := VAR | ε
Non LL(k) Grammars (4)
Remark:
The transformed grammars generate the same language, but yield different syntax trees.
From a theoretical point of view, this is acceptable.
From a programming language implementation perspective, this is in general not acceptable.
There are languages L for which no LL(k) grammar Γ with L(Γ) = L exists. (Example: the language of grammar Γ5)
Non LL(k) Grammars (5)
Example 3: For L(Γ5), there exists no LL(k) grammar.
• S → A|B
• A → aAb | 0
• B → aBbb | 1
We show that there is no k such that Γ5 is an LL(k) grammar.
Proof.
Let k be arbitrary but fixed. Choose two derivations according to the LL(k) definition and show that, despite equal prefixes of length k, β and γ are not equal:
S ⇒∗_lm S ⇒_lm A ⇒∗_lm a^k 0 b^k
S ⇒∗_lm S ⇒_lm B ⇒∗_lm a^k 1 b^{2k}
Then prefix(k, a^k 0 b^k) = prefix(k, a^k 1 b^{2k}) = a^k, but β = A ≠ B = γ.
FIRST and FOLLOW Sets
Definition
Let Γ = (N, T, Π, S) be a CFG and k ∈ ℕ. T^{≤k} = {u ∈ T∗ | length(u) ≤ k} denotes the set of all terminal strings of length at most k.
We define:
• FIRST_k : (N ∪ T)∗ → P(T^{≤k})
with FIRST_k(α) = {prefix(k, u) | α ⇒∗ u, u ∈ T∗}
where prefix(n, u) = u for all u with length(u) ≤ n.
• FOLLOW_k : (N ∪ T)∗ → P(T^{≤k})
with FOLLOW_k(α) = {w | S ⇒∗ βαγ and w ∈ FIRST_k(γ)}
FIRST and FOLLOW Sets in Parse Trees
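[Figure: parse tree with root S and inner node X. FIRST_k(X) contains the first k terminals of the words derived from X, FOLLOW_k(X) the first k terminals that can follow them.]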
Characterization of LL(1) Grammars
Definition (Reduced CFG)
A CFG Γ = (N,T,Π,S) is reduced if each non-terminal occurs in a derivation and each non-terminal derives at least one word.
Lemma
A reduced CFG is LL(1) if and only if for any two distinct productions A → β and A → γ it holds that
(FIRST_1(β) ⊕_1 FOLLOW_1(A)) ∩ (FIRST_1(γ) ⊕_1 FOLLOW_1(A)) = ∅
where L1 ⊕_1 L2 = {prefix(1, vw) | v ∈ L1, w ∈ L2}.
Remark: FIRST and FOLLOW sets are computable, so this criterion can be checked automatically.
Examples: FIRST_k and FOLLOW_k
Check that the modified expression grammar Γ3 is LL(1).
• S → E#
• E → TE′
• E′ → +TE′ | ε
• T → FT′
• T′ → ∗FT′ | ε
• F → (E) | ID
Compute FIRST1 and FOLLOW1 for each non-terminal.
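The FIRST_1 and FOLLOW_1 sets of Γ3 and the LL(1) criterion can be cross-checked with a small fixed-point computation. The Python sketch below is one common way to do this, not necessarily the construction used in the literature cited by the lecture. Assumptions: the dict encoding, the names `first_of`, `first_follow` and `is_ll1`, the empty string as stand-in for ε, and the convention that FOLLOW sets never contain ε (this relies on the end marker # following every complete input).

```python
EPS = ""  # stands for the empty word ε

# Γ3 from the lecture; E' and T' play the role of E′ and T′, [] is ε.
GAMMA3 = {
    "S": [["E", "#"]],
    "E": [["T", "E'"]],
    "E'": [["+", "T", "E'"], []],
    "T": [["F", "T'"]],
    "T'": [["*", "F", "T'"], []],
    "F": [["(", "E", ")"], ["ID"]],
}


def first_of(seq, first, grammar):
    """FIRST_1 of a symbol sequence, given the FIRST_1 sets of the non-terminals."""
    out = set()
    for symbol in seq:
        f = first[symbol] if symbol in grammar else {symbol}
        out |= f - {EPS}
        if EPS not in f:
            return out
    out.add(EPS)  # every symbol of seq can derive ε (or seq is empty)
    return out


def first_follow(grammar):
    """Compute FIRST_1 and FOLLOW_1 of all non-terminals by fixed-point iteration."""
    first = {a: set() for a in grammar}
    follow = {a: set() for a in grammar}
    changed = True
    while changed:
        changed = False
        for a, alternatives in grammar.items():
            for rhs in alternatives:
                new = first_of(rhs, first, grammar)
                if not new <= first[a]:
                    first[a] |= new
                    changed = True
    changed = True
    while changed:
        changed = False
        for a, alternatives in grammar.items():
            for rhs in alternatives:
                for i, symbol in enumerate(rhs):
                    if symbol not in grammar:
                        continue
                    rest = first_of(rhs[i + 1:], first, grammar)
                    new = rest - {EPS}
                    if EPS in rest:  # whatever follows A can follow symbol
                        new |= follow[a]
                    if not new <= follow[symbol]:
                        follow[symbol] |= new
                        changed = True
    return first, follow


def is_ll1(grammar, first, follow):
    """Check that the lookahead sets of the alternatives of each non-terminal are disjoint."""
    for a, alternatives in grammar.items():
        seen = set()
        for rhs in alternatives:
            f = first_of(rhs, first, grammar)
            # FIRST_1(rhs) ⊕_1 FOLLOW_1(A): FIRST without ε, plus FOLLOW if rhs can vanish
            look = (f - {EPS}) | (follow[a] if EPS in f else set())
            if look & seen:
                return False
            seen |= look
    return True


FIRST, FOLLOW = first_follow(GAMMA3)
for a in GAMMA3:
    print(a, sorted(FIRST[a]), sorted(FOLLOW[a]))
print(is_ll1(GAMMA3, FIRST, FOLLOW))
```

Applied to the left-recursive grammar Γ2, the same check reports a conflict, because both alternatives of E start with the same terminals.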
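Examples: FIRST_k and FOLLOW_k (2)
• F → (E) | ID:
(FIRST_1((E)) ⊕_1 FOLLOW_1(F)) ∩ (FIRST_1(ID) ⊕_1 FOLLOW_1(F))
= ({(} ⊕_1 FOLLOW_1(F)) ∩ ({ID} ⊕_1 FOLLOW_1(F))
= {(} ∩ {ID}
= ∅
• E′ → +TE′ | ε:
(FIRST_1(+TE′) ⊕_1 FOLLOW_1(E′)) ∩ (FIRST_1(ε) ⊕_1 FOLLOW_1(E′))
= ({+} ⊕_1 FOLLOW_1(E′)) ∩ ({ε} ⊕_1 FOLLOW_1(E′))
= {+} ∩ {#, )}
= ∅
• T′ → ∗FT′ | ε:
(FIRST_1(∗FT′) ⊕_1 FOLLOW_1(T′)) ∩ (FIRST_1(ε) ⊕_1 FOLLOW_1(T′))
= ({∗} ⊕_1 FOLLOW_1(T′)) ∩ ({ε} ⊕_1 FOLLOW_1(T′))
= {∗} ∩ {+, #, )}
= ∅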
Proof of LL Characterization Lemma
• Left-To-Right Direction: Γ is LL(1) implies the FIRST and FOLLOW characterization.
Proof by Contradiction:
Suppose there are two productions A → β and A → γ with β ≠ γ and Φ = (FIRST_1(β) ⊕_1 FOLLOW_1(A)) ∩ (FIRST_1(γ) ⊕_1 FOLLOW_1(A)) ≠ ∅. Then there exists z ∈ Φ with length(z) = 1.
As Γ is reduced, there are two derivations:
S ⇒∗ ψAα ⇒ ψβα ⇒∗ ψzx
S ⇒∗ ψAα ⇒ ψγα ⇒∗ ψzy
Proof of LL Characterization Lemma (2)
Thus, we can construct the following left derivations:
S ⇒∗_lm uAα ⇒_lm uβα ⇒∗_lm uzx
S ⇒∗_lm uAα ⇒_lm uγα ⇒∗_lm uzy
with prefix(1, zx) = z = prefix(1, zy), which contradicts the LL(1) property of Γ.
Proof of LL Characterization Lemma (3)
• Right-To-Left Direction: The FIRST and FOLLOW characterization implies that Γ is LL(1).
Proof by Contradiction:
Suppose Γ is not LL(1). Then there are two different derivations with β ≠ γ and length(z) = 1:
S ⇒∗ ψAα ⇒ ψβα ⇒∗ ψzx
S ⇒∗ ψAα ⇒ ψγα ⇒∗ ψzy
But then z ∈ (FIRST_1(β) ⊕_1 FOLLOW_1(A)) ∩ (FIRST_1(γ) ⊕_1 FOLLOW_1(A)), which contradicts the assumption that this intersection is empty.
Parser Generation for LL(k) Languages
[Figure: the LL(k) parser generator takes a grammar as input and produces either the table for a push-down automaton or the error message "Grammar is not LL(k)".]
Parser Generation for LL(k) Languages (2)
Remarks:
• Use of push-down automata with look ahead
• Select Production from Tables
• Advantages over bottom-up techniques in error analysis and error handling
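A table-driven parser of this kind can be sketched as a push-down automaton with one token of lookahead. The Python sketch below uses the left-recursion-free expression grammar Γ3; the table was filled in by hand from the FIRST_1 and FOLLOW_1 sets of Γ3, and the names `TABLE`, `NONTERMINALS` and `ll1_parse` as well as the token encoding are assumptions made for illustration. It does not reflect how ANTLR or any other generator lays out its tables.

```python
# LL(1) table for Γ3: (non-terminal, lookahead) -> right-hand side to push.
# E' and T' stand for E′ and T′; the empty list is the ε-alternative.
TABLE = {
    ("S", "("): ["E", "#"], ("S", "ID"): ["E", "#"],
    ("E", "("): ["T", "E'"], ("E", "ID"): ["T", "E'"],
    ("E'", "+"): ["+", "T", "E'"], ("E'", ")"): [], ("E'", "#"): [],
    ("T", "("): ["F", "T'"], ("T", "ID"): ["F", "T'"],
    ("T'", "*"): ["*", "F", "T'"],
    ("T'", "+"): [], ("T'", ")"): [], ("T'", "#"): [],
    ("F", "("): ["(", "E", ")"], ("F", "ID"): ["ID"],
}
NONTERMINALS = {"S", "E", "E'", "T", "T'", "F"}


def ll1_parse(tokens):
    """Return the applied productions (a left derivation) or None on a syntax error."""
    stream = list(tokens) + ["#"]  # '#' marks the end of the input
    stack = ["S"]                  # top of the stack is the end of the list
    pos = 0
    applied = []
    while stack:
        top = stack.pop()
        look = stream[pos] if pos < len(stream) else None
        if top in NONTERMINALS:
            rhs = TABLE.get((top, look))
            if rhs is None:        # empty table entry: syntax error
                return None
            applied.append((top, rhs))
            stack.extend(reversed(rhs))  # left-most symbol ends up on top
        elif top == look:          # terminal on the stack matches the input
            pos += 1
        else:
            return None
    return applied if pos == len(stream) else None


for lhs, rhs in ll1_parse("ID * ID".split()):
    print(lhs, "->", " ".join(rhs) or "ε")
```

The control loop is independent of the grammar; only the table changes from grammar to grammar, which is what makes generating such parsers practical.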
Example System: ANTLR (http://www.antlr.org/)
Recommended Reading for Top-Down Analysis:
• Wilhelm, Maurer: Chapter 8, Sections 8.3.1 to 8.3.4, pp. 312 - 329