
Context-Free Analysis

Lecture Compilers SS 2009

Dr.-Ing. Ina Schaefer

Software Technology Group TU Kaiserslautern

Ina Schaefer Context-Free Analysis 1

Content of Lecture

1. Introduction: Overview and Motivation
2. Syntax and Type Analysis
   2.1 Lexical Analysis
   2.2 Context-free Syntax Analysis
   2.3 Context-sensitive Syntax Analysis
3. Translation to Target Language
   3.1 Translation of Imperative Language Constructs
   3.2 Translation of Object-Oriented Language Constructs
4. Selected Aspects of Compilers
   4.1 Intermediate Languages
   4.2 Optimization
   4.3 Command Selection
   4.4 Register Allocation
   4.5 Code Generation
5. Garbage Collection
6. XML Processing (DOM, SAX, XSLT)

Outline of Context-Free Analysis

1. Specification of Parsers
2. Implementation of Parsers
   Top-Down Syntax Analysis
     Recursive Descent
     LL(k) Parsing Theory
     LL Parser Generation
   Bottom-Up Syntax Analysis
     Principles of LR Parsing
     LR Parsing Theory
     SLR, LALR, LR(k) Parsing
     LALR-Parser Generation
3. Error Handling
4. Concrete and Abstract Syntax

Context-Free Syntax Analysis

Tasks

Check whether the token stream (from the scanner) matches the context-free syntax of the language.

- Error case: error handling
- Correct input: construction of the syntax tree

[Figure: the token stream is the input of the parser.]

Context-Free Syntax Analysis (2)

Remarks:

Parsing can be interleaved with other actions processing the program (e.g. attributation).

The syntax tree controls important parts of the translation. Hence, we distinguish:

- The concrete syntax tree corresponds to the context-free grammar.
- The abstract syntax tree aims at further processing steps and is a compact representation of the essential information.

Specification of Parsers

Two general specification techniques:

- Syntax diagrams
- Context-free grammars (often in extended form)

Context-Free Grammars

Definition

Let $N$ and $T$ be two alphabets with $N \cap T = \emptyset$, let $\Pi$ be a finite subset of $N \times (N \cup T)^*$, and let $S \in N$. Then $\Gamma = (N, T, \Pi, S)$ is a context-free grammar (CFG), where

$N$ is the set of non-terminals

$T$ is the set of terminals

$\Pi$ is the set of production rules

$S$ is the start symbol (axiom)
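The definition can be written down directly as a data structure. The following is a minimal sketch, not part of the lecture: the class name `CFG`, its field names and the method `is_well_formed` are assumptions made for illustration, and the expression grammar Γ1 from later in this lecture serves only as sample data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CFG:
    """A context-free grammar (N, T, Pi, S); all names here are illustrative."""
    nonterminals: frozenset   # N
    terminals: frozenset      # T
    productions: tuple        # Pi, as pairs (A, right-hand side as a tuple of symbols)
    start: str                # S

    def is_well_formed(self) -> bool:
        """Check N and T are disjoint, S is in N, and Pi is a subset of N x (N u T)*."""
        symbols = self.nonterminals | self.terminals
        return (
            not (self.nonterminals & self.terminals)
            and self.start in self.nonterminals
            and all(
                lhs in self.nonterminals and all(s in symbols for s in rhs)
                for lhs, rhs in self.productions
            )
        )


# Sample data: the expression grammar called Gamma1 later in the lecture.
GAMMA1 = CFG(
    nonterminals=frozenset({"S", "E", "T", "F"}),
    terminals=frozenset({"+", "*", "(", ")", "ID"}),
    productions=(
        ("S", ("E",)),
        ("E", ("T", "+", "E")), ("E", ("T",)),
        ("T", ("F", "*", "T")), ("T", ("F",)),
        ("F", ("(", "E", ")")), ("F", ("ID",)),
    ),
    start="S",
)
```

Storing right-hand sides as tuples keeps the grammar immutable and lets the empty tuple stand for the empty word.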

Context-Free Grammars (2)

Notations:

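$A, B, C, \ldots$ denote non-terminals.

$a, b, c, \ldots$ denote terminals.

$x, y, z, \ldots$ denote strings of terminals, i.e. $x \in T^*$.

$\alpha, \beta, \gamma, \psi, \varphi, \sigma, \tau$ are strings of terminals and non-terminals, i.e. $\alpha \in (N \cup T)^*$.

Productions are denoted by $A \to \alpha$.

The notation $A \to \alpha \mid \beta \mid \gamma \mid \ldots$ is an abbreviation for the productions $A \to \alpha$, $A \to \beta$, $A \to \gamma$, $\ldots$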

Derivation

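Let $\Gamma = (N, T, \Pi, S)$ be a CFG.

$\psi$ is directly derivable from $\varphi$ in $\Gamma$ ($\varphi$ produces $\psi$ directly), written $\varphi \Rightarrow \psi$, if there exist $\sigma, \tau, A, \alpha$ with $\varphi = \sigma A \tau$, $\psi = \sigma \alpha \tau$ and $A \to \alpha \in \Pi$.

$\psi$ is derivable from $\varphi$ in $\Gamma$, written $\varphi \Rightarrow^* \psi$, if there exist $\varphi_0, \ldots, \varphi_n$ with $\varphi = \varphi_0$ and $\psi = \varphi_n$ such that $\varphi_i \Rightarrow \varphi_{i+1}$ holds for all $i \in \{0, \ldots, n-1\}$. The sequence $\varphi_0, \ldots, \varphi_n$ is a derivation of $\psi$ from $\varphi$.

$\Rightarrow^*$ is the reflexive, transitive closure of $\Rightarrow$.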
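The two relations can be made concrete with a small sketch. It is only an illustration under assumptions: the names `PRODUCTIONS`, `derive_step` and `is_derivation` are hypothetical, sentential forms are modelled as Python lists of symbols, and the expression grammar Γ1 from later in the lecture is used as sample data.

```python
# Productions of the expression grammar Gamma1 (sample data for this sketch).
PRODUCTIONS = {
    "S": [["E"]],
    "E": [["T", "+", "E"], ["T"]],
    "T": [["F", "*", "T"], ["F"]],
    "F": [["(", "E", ")"], ["ID"]],
}


def derive_step(form, position, rhs):
    """Direct derivation: replace the non-terminal A at `position` by `rhs`,
    i.e. sigma A tau => sigma rhs tau, provided A -> rhs is a production."""
    nonterminal = form[position]
    if rhs not in PRODUCTIONS.get(nonterminal, []):
        raise ValueError(f"{nonterminal} -> {' '.join(rhs)} is not a production")
    return form[:position] + rhs + form[position + 1:]


def is_derivation(forms):
    """Check that every form follows from its predecessor by one direct step.
    A single form is a derivation with zero steps (reflexivity)."""
    for phi, psi in zip(forms, forms[1:]):
        if not any(
            derive_step(phi, i, rhs) == psi
            for i, sym in enumerate(phi) if sym in PRODUCTIONS
            for rhs in PRODUCTIONS[sym]
        ):
            return False
    return True
```

For example, `is_derivation([["S"], ["E"], ["T"], ["F"], ["ID"]])` checks the derivation S ⇒ E ⇒ T ⇒ F ⇒ ID step by step.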


Derivation (2)

The derivation $\varphi_0, \ldots, \varphi_n$ is a left derivation (right derivation) if in each $\varphi_i$ the left-most (right-most) non-terminal is replaced. Left derivation steps are denoted by $\varphi \Rightarrow_{lm} \psi$, right derivation steps by $\varphi \Rightarrow_{rm} \psi$.

The tree-like representation of a derivation is a syntax tree.

$L(\Gamma) = \{ z \in T^* \mid S \Rightarrow^* z \}$ is the language generated by $\Gamma$.

$x \in L(\Gamma)$ is a sentence of $\Gamma$.

$\varphi \in (N \cup T)^*$ with $S \Rightarrow^* \varphi$ is a sentential form of $\Gamma$.

Ambiguity in Grammars

A sentence is unambiguous if it has exactly one syntax tree. A sentence is ambiguous if it has more than one syntax tree.

For each syntax tree, there exists exactly one left derivation and exactly one right derivation.

Thus, a sentence is unambiguous if and only if it has exactly one left (right) derivation.

A grammar is ambiguous if it generates an ambiguous sentence; otherwise it is unambiguous.

For programming languages, unambiguous grammars are essential, as the semantics and the translation are defined by the syntactic structure.

Ambiguity in Grammars (2)

Example 1: Grammar for Expressions Γ0

S → E
E → E + E
E → E * E
E → ( E )
E → ID

Consider the input string (av + av) * bv + cv + dv, which yields the following input to the context-free analysis: ( ID + ID ) * ID + ID + ID

Ambiguity in Grammars (3)

[Figure: a syntax tree for ( ID + ID ) * ID + ID + ID according to Γ0.]

The syntax tree shown does not match the conventional rules of arithmetic.

There are several syntax trees according to Γ0 for this input; hence Γ0 is ambiguous.
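How many syntax trees Γ0 admits for a given token stream can be counted mechanically. The sketch below is an illustration rather than part of the lecture: the function name `count_trees` is hypothetical, and it counts trees rooted in E (the production S → E adds exactly one node on top of each of them) by trying every operator as the root of the tree.

```python
from functools import lru_cache


def count_trees(tokens):
    """Number of syntax trees with root E for `tokens` in the grammar
    E -> E + E | E * E | ( E ) | ID."""
    tokens = tuple(tokens)

    @lru_cache(maxsize=None)
    def count(i, j):
        # Number of trees deriving tokens[i:j] from E.
        if j <= i:
            return 0
        if j - i == 1:
            return 1 if tokens[i] == "ID" else 0
        total = 0
        if tokens[i] == "(" and tokens[j - 1] == ")":
            total += count(i + 1, j - 1)            # E -> ( E )
        for k in range(i + 1, j - 1):
            if tokens[k] in ("+", "*"):              # E -> E op E with op at index k
                total += count(i, k) * count(k + 1, j)
        return total

    return count(0, len(tokens))
```

With this sketch, `count_trees("ID + ID * ID".split())` yields 2 and `count_trees("( ID + ID ) * ID + ID + ID".split())` yields 5, so both inputs are ambiguous sentences of Γ0.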


Ambiguity in Grammars (4)

Example 2: Ambiguity in if-then-else construct

if B1 then if B2 then A := 9 else A := 7

First derivation:

[Figure: syntax tree over the token stream IF ID THEN IF ID THEN ID EQ CO ELSE ID EQ CO, built from the nodes IFTHENELSE, IFTHEN, ANW (statement) and ZW (assignment).]

Ambiguity in Grammars (5)

Second derivation:

[Figure: a second syntax tree over the same token stream, built from the same kinds of nodes.]

The two syntax trees differ in which if the else branch is attached to.

Ambiguity in Grammars (6)

Remarks:

Each derivation corresponds to exactly one syntax tree. Conversely, for each syntax tree there can be several derivations.

Instead of "syntax tree", the terms "structure tree" and "derivation tree" are also common.

For each language, there can be several generating grammars, i.e. the mapping $L : \text{Grammar} \to \text{Language}$ is in general not injective.

Ambiguity as Grammar Property

Ambiguity is a grammar property. The grammar for expressions Γ0 is an example of an ambiguous grammar.

Γ0:

S → E
E → E + E
E → E * E
E → ( E )
E → ID

[Figure: two different syntax trees according to Γ0 for the input ID + ID * ID.]

Ambiguity as Grammar Property (2)

But there exists an unambiguous grammar for the same language:

Γ1:

S → E
E → T + E | T
T → F * T | F
F → ( E ) | ID

[Figure: the syntax tree for ( ID + ID ) * ID + ID according to Γ1.]

(However, there are also context-free languages that can only be described by ambiguous grammars.)

Literature

Recommended Reading:

Wilhelm, Maurer: Chapter 8, pp. 271 - 283 (Syntactic Analysis)

Appel: Chapter 3, pp. 40-47


Implementation of Parsers

Overview

Top-Down Parsing

- Recursive Descent
- LL-Parsing
- LL-Parser Generation

Bottom-Up Parsing

- LR-Parsing
- LALR, SLR, LR(k)-Parsing
- LALR-Parser Generation

Methods for Context-Free Analysis

Manually developed, grammar-specific implementation (error-prone, inflexible)

Backtracking (simple, but inefficient)

Cocke-Younger-Kasami algorithm (1967):

- applicable to all CFGs in Chomsky normal form
- based on the idea of dynamic programming
- time complexity $O(n^3)$, whereas linear complexity is desired

Top-down methods: from the axiom to the token stream

Bottom-up methods: from the token stream to the axiom
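The dynamic-programming idea behind the Cocke-Younger-Kasami algorithm can be sketched as follows. This is a minimal recognizer under assumptions: the function name `cyk`, the rule encoding, and the sample grammar for { aⁿbⁿ | n ≥ 1 } are all chosen for illustration and do not appear in the lecture.

```python
def cyk(word, unit_rules, pair_rules, start="S"):
    """CYK recognizer for a grammar in Chomsky normal form.

    unit_rules: dict terminal -> set of non-terminals A with A -> terminal
    pair_rules: dict (B, C)   -> set of non-terminals A with A -> B C
    """
    n = len(word)
    if n == 0:
        return False  # the empty word is not handled in this sketch
    # table[length - 1][i] holds the non-terminals that derive word[i:i + length]
    table = [[set() for _ in range(n)] for _ in range(n)]
    for i, symbol in enumerate(word):
        table[0][i] = set(unit_rules.get(symbol, ()))
    for length in range(2, n + 1):                 # substring length
        for i in range(n - length + 1):            # start position
            for split in range(1, length):         # length of the left part
                for left in table[split - 1][i]:
                    for right in table[length - split - 1][i + split]:
                        table[length - 1][i] |= pair_rules.get((left, right), set())
    return start in table[n - 1][0]


# Hypothetical sample grammar in Chomsky normal form for { a^n b^n | n >= 1 }:
# S -> A B | A X,  X -> S B,  A -> a,  B -> b
UNIT = {"a": {"A"}, "b": {"B"}}
PAIR = {("A", "B"): {"S"}, ("A", "X"): {"S"}, ("S", "B"): {"X"}}
```

The three nested loops over length, start position and split point give the cubic running time in the length of the input.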


Example: Top-Down Analysis

According to Γ1. The result of the top-down analysis is a left derivation.

S =>
E =>
T + E =>
F * T + E =>
( E ) * T + E =>
( T + E ) * T + E =>
( F + E ) * T + E =>
( ID + E ) * T + E =>
( ID + T ) * T + E =>
( ID + F ) * T + E =>
( ID + ID ) * T + E =>
( ID + ID ) * F + E =>
( ID + ID ) * ID + E =>
( ID + ID ) * ID + T =>
( ID + ID ) * ID + F =>
( ID + ID ) * ID + ID

Example: Bottom-Up Analysis

According to Γ1. Read from bottom to top, the result of the bottom-up analysis is a right derivation.

( ID + ID ) * ID + ID <=
( F + ID ) * ID + ID <=
( T + ID ) * ID + ID <=
( T + F ) * ID + ID <=
( T + T ) * ID + ID <=
( T + E ) * ID + ID <=
( E ) * ID + ID <=
F * ID + ID <=
F * F + ID <=
F * T + ID <=
T + ID <=
T + F <=
T + T <=
T + E <=
E <=
S

Context-free Analysis with linear complexity

Restrictions on the grammar (not every CFG has a linear-time parser)

Use of push-down automata or systems of recursive procedures

Use of look-ahead into the remaining input in order to select the next production rule to be applied

Syntax Analysis Methods and Parser Generators

Basic knowledge of syntax analysis is essential for the use of parser generators.

Parser generators are not always applicable.

Often, error handling has to be done manually.

The methods underlying parser generation are a good example of a generic technique (and a highlight of computer science!).

Implementation of Parsers Top-Down Syntax Analysis

Top-Down Syntax Analysis

Educational Objectives

General Principle of Top-Down Syntax Analysis

Recursive Descent Parsing (by example)

Expressiveness of Top-Down Parsing

Basic Concepts of LL(k) Parsing

Recursive Descent

Basic Idea

Each non-terminal A is associated with a procedure. This procedure accepts a partial sentence derived from A.

The procedure implements a finite automaton constructed from the productions whose left-hand side is A.

The recursion of the grammar is mapped to mutually recursive procedures, so that the procedure stack of higher-level programming languages can be used for the implementation.

Construction of Recursive Descent Parser

Let Γ'1 be a CFG (like Γ1) with a terminal # denoting the end of the input.

Γ'1:

S → E#
E → T + E | T
T → F * T | F
F → ( E ) | ID

Construct an item automaton for each non-terminal A. The item automaton describes the analysis of those productions whose left-hand side is A.

Item Automata

Items written side by side in braces form one state; arrows are labelled with the symbol that is read.

S → E#:

{ [S → .E#] } --E--> { [S → E.#] } --#--> { [S → E#.] }

E → T + E | T:

{ [E → .T+E], [E → .T] } --T--> { [E → T.+E], [E → T.] } --+--> { [E → T+.E] } --E--> { [E → T+E.] }

T → F * T | F:

{ [T → .F*T], [T → .F] } --F--> { [T → F.*T], [T → F.] } --*--> { [T → F*.T] } --T--> { [T → F*T.] }

Item Automata (2)

F → ( E ) | ID:

{ [F → .(E)], [F → .ID] } --(--> { [F → (.E)] } --E--> { [F → (E.)] } --)--> { [F → (E).] }
{ [F → .(E)], [F → .ID] } --ID--> { [F → ID.] }

Recursive Descent Parsing Procedures

Item Automata can be mapped to recursive procedures.

The input is a token stream terminated by #.

The variable currSymbol contains one token look ahead, i.e. the first symbol of the stream.


Recursive Descent Parsing Procedures (2)

Production: S → E#

void S() {
  E();
  // after the expression, only the end marker may follow
  if (currSymbol == '#') {
    accept();
  } else {
    error();
  }
}

Recursive Descent Parsing Procedures (3)

Production: E → T + E | T

void E() {
  T();
  // one token of look-ahead selects between T + E and T
  if (currSymbol == '+') {
    readSymbol();
    E();
  }
}

Production: T → F * T | F

void T() {
  F();
  // one token of look-ahead selects between F * T and F
  if (currSymbol == '*') {
    readSymbol();
    T();
  }
}

Recursive Descent Parsing Procedures (4)

Production: F → ( E ) | ID

void F() {
  if (currSymbol == '(') {
    readSymbol();
    E();
    if (currSymbol == ')') {
      readSymbol();
    } else {
      error();
    }
  } else if (currSymbol == ID) {
    readSymbol();
  } else {
    error();
  }
}
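The procedures for S, E, T and F can be put together into a runnable recognizer. The following is a sketch in Python under assumptions, not the lecture's implementation: tokens are modelled as plain strings, `accept()` and `error()` are replaced by a boolean result and a hypothetical `ParseError` exception, and the names `Recognizer` and `accepts` are chosen for illustration.

```python
class ParseError(Exception):
    """Raised where the procedures in the text call error()."""


class Recognizer:
    """Recursive descent for S -> E#, E -> T+E | T, T -> F*T | F, F -> (E) | ID."""

    def __init__(self, tokens):
        self.tokens = list(tokens) + ["#"]   # append the end marker
        self.pos = 0

    @property
    def curr_symbol(self):
        return self.tokens[self.pos]         # one token of look-ahead

    def read_symbol(self):
        self.pos += 1

    def S(self):
        self.E()
        if self.curr_symbol != "#":
            raise ParseError(f"unexpected {self.curr_symbol!r} after expression")

    def E(self):
        self.T()
        if self.curr_symbol == "+":
            self.read_symbol()
            self.E()

    def T(self):
        self.F()
        if self.curr_symbol == "*":
            self.read_symbol()
            self.T()

    def F(self):
        if self.curr_symbol == "(":
            self.read_symbol()
            self.E()
            if self.curr_symbol != ")":
                raise ParseError("expected ')'")
            self.read_symbol()
        elif self.curr_symbol == "ID":
            self.read_symbol()
        else:
            raise ParseError(f"unexpected {self.curr_symbol!r}")


def accepts(tokens):
    """Return True if the token list is a sentence of the grammar."""
    try:
        Recognizer(tokens).S()
        return True
    except ParseError:
        return False
```

For example, `accepts("( ID + ID ) * ID + ID".split())` returns `True`, while `accepts("ID + * ID".split())` returns `False`.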


Recursive Descent Parsing Procedures (5)

Remarks:

Recursive descent

- is relatively easy to implement
- can easily be combined with other tasks (see the following example)
- is a typical example of syntax-directed methods (see also the following example)

The example uses one token of look-ahead.

Recursive Descent and Evaluation

Example: Interpreter for expressions using recursive descent

int env(Ident); // ID -> int

// The local variable int_result stores intermediate results.
int S() {
  int int_result = E();
  if (currSymbol == '#') {
    return int_result;
  } else {
    error();
    return error_result;
  }
}

Recursive Descent and Evaluation (2)

int E() {
  int int_result = T();
  if (currSymbol == '+') {
    readSymbol();
    return int_result + E();
  }
  return int_result; // production E → T
}

int T() {
  int int_result = F();
  if (currSymbol == '*') {
    readSymbol();
    return int_result * T();
  }
  return int_result; // production T → F
}

Recursive Descent and Evaluation (3)

int F() {
  int int_result;
  if (currSymbol == '(') {
    readSymbol();
    int_result = E();
    if (currSymbol == ')') {
      readSymbol();
      return int_result;
    } else {
      error();
      return error_result;
    }
  } else if (currSymbol == ID) {
    readSymbol();
    return env(code(ID));
  } else {
    error();
    return error_result;
  }
}
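The interpreter can be tried out with a small runnable sketch. It is written in Python under assumptions that are not in the lecture: the environment `env` is a dictionary from identifier names to integers, every token found in `env` counts as an ID, errors are reported with Python's built-in `SyntaxError` instead of `error_result`, and the function name `evaluate` is hypothetical.

```python
def evaluate(tokens, env):
    """Evaluate a token list such as ["(", "a", "+", "b", ")", "*", "c"]
    according to E -> T+E | T, T -> F*T | F, F -> (E) | ID."""
    tokens = list(tokens) + ["#"]    # append the end marker
    pos = 0

    def curr():
        return tokens[pos]

    def read():
        nonlocal pos
        pos += 1

    def E():
        result = T()
        if curr() == "+":
            read()
            return result + E()
        return result

    def T():
        result = F()
        if curr() == "*":
            read()
            return result * T()
        return result

    def F():
        if curr() == "(":
            read()
            result = E()
            if curr() != ")":
                raise SyntaxError("expected ')'")
            read()
            return result
        if curr() in env:            # the token is an identifier
            value = env[curr()]
            read()
            return value
        raise SyntaxError(f"unexpected symbol {curr()!r}")

    result = E()
    if curr() != "#":
        raise SyntaxError(f"unexpected symbol {curr()!r} after expression")
    return result
```

With `env = {"a": 1, "b": 2, "c": 3, "d": 4}`, the call `evaluate("( a + b ) * c + d".split(), env)` returns 13. The grammar groups `+` and `*` to the right, which does not change the value because both operations are associative.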


Recursive Descent and Evaluation (4)

Extending the parser with actions or computations is easy to implement, but it mixes conceptually different tasks and makes programs hard to maintain.

For which grammars does the recursive descent technique work?

LL(k) Parsing Theory

LL Parsing

Basis for Top-Down Syntax Analysis

First L: read the input from left to right

Second L: search for left derivations

LL(k) Grammars

Definition (LL(k) Grammar)

Let $\Gamma = (N, T, \Pi, S)$ be a CFG and $k \in \mathbb{N}$.

$\Gamma$ is an LL(k) grammar if for any two left derivations

$$S \Rightarrow^*_{lm} u A \alpha \Rightarrow_{lm} u \beta \alpha \Rightarrow^*_{lm} u x \qquad \text{and} \qquad S \Rightarrow^*_{lm} u A \alpha \Rightarrow_{lm} u \gamma \alpha \Rightarrow^*_{lm} u y$$

it holds that if $\mathit{prefix}(k, x) = \mathit{prefix}(k, y)$ then $\beta = \gamma$.

LL(k) Grammars (2)

Remarks:

A grammar is an LL(k) grammar if, in every left derivation, a look-ahead of k tokens suffices to determine the correct production for the next derivation step.

A language $L \subseteq \Sigma^*$ is LL(k) if there exists an LL(k) grammar $\Gamma$ with $L(\Gamma) = L$.

The definition of LL(k) grammars provides no method to test whether a grammar has the LL(k) property.

Non LL(k) Grammars

Example 1: Grammar with left recursion

Γ2:

S → E#
E → E + T | T
T → T * F | F
F → ( E ) | ID

Elimination of left recursion:

Replace productions of the form A → Aα | β, where β does not start with A, by A → βA' and A' → αA' | ε.
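The replacement rule above can be sketched as a small function. This is an illustration under assumptions: the function name `eliminate_immediate_left_recursion` is hypothetical, right-hand sides are tuples of symbols, the empty tuple stands for ε, the new non-terminal is named by appending `'`, and only immediate left recursion with non-empty α is handled.

```python
EPSILON = ()  # the empty right-hand side


def eliminate_immediate_left_recursion(nonterminal, alternatives):
    """Rewrite A -> A a1 | ... | b1 | ... as A -> b A' and A' -> a A' | epsilon.

    `alternatives` is a list of right-hand sides (tuples of symbols).
    Assumes every recursive alternative has a non-empty rest alpha.
    """
    recursive = [alt[1:] for alt in alternatives if alt[:1] == (nonterminal,)]
    others = [alt for alt in alternatives if alt[:1] != (nonterminal,)]
    if not recursive:
        return {nonterminal: list(alternatives)}    # nothing to do
    fresh = nonterminal + "'"
    return {
        nonterminal: [beta + (fresh,) for beta in others],
        fresh: [alpha + (fresh,) for alpha in recursive] + [EPSILON],
    }
```

Applied to `E → E + T | T`, i.e. `eliminate_immediate_left_recursion("E", [("E", "+", "T"), ("T",)])`, the sketch returns the productions `E → T E'` and `E' → + T E' | ε`.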


Non LL(k) Grammars (2)

Elimination of left recursion: from Γ2 we obtain Γ3.

Γ2:

S → E#
E → E + T | T
T → T * F | F
F → ( E ) | ID

Γ3:

S → E#
E → TE'
E' → +TE' | ε
T → FT'
T' → *FT' | ε
F → ( E ) | ID

Non LL(k) Grammars (3)

Example 2: Grammar Γ4 with unlimited look ahead

STM → VAR := VAR | ID ( IDLIST )
VAR → ID | ID ( IDLIST )
IDLIST → ID | ID , IDLIST

Γ4 is not an LL(k) grammar for any k.

(Proof: cf. Wilhelm, Maurer, Example 8.3.4, p. 319)

Transformation to the LL(2) grammar Γ'4:

STM → ASS_CALL | ID := VAR
ASS_CALL → ID ( IDLIST ) ASS_CALL_REST
ASS_CALL_REST → := VAR | ε

Non LL(k) Grammars (4)

Remark:

The transformed grammars generate the same language, but yield different syntax trees.

From a theoretical point of view, this is acceptable.

From a programming language implementation perspective, this is in general not acceptable.

There are languages L for which no LL(k) grammar Γ exists that generates the language, i.e. with L(Γ) = L. (Example: the language of grammar Γ5)

Non LL(k) Grammars (5)

Example 3: For L(Γ5), there exists no LL(k) grammar.

S → A | B
A → aAb | 0
B → aBbb | 1

We show that there is no k such that Γ5 is an LL(k) grammar.

Proof.

Let k be arbitrary but fixed. Choose two derivations according to the LL(k) definition and show that, despite equal prefixes of length k, β and γ are not equal:

$$S \Rightarrow^*_{lm} S \Rightarrow_{lm} A \Rightarrow^*_{lm} a^k 0 b^k \qquad S \Rightarrow^*_{lm} S \Rightarrow_{lm} B \Rightarrow^*_{lm} a^k 1 b^{2k}$$

Then $\mathit{prefix}(k, a^k 0 b^k) = \mathit{prefix}(k, a^k 1 b^{2k}) = a^k$, but $\beta = A \neq B = \gamma$.

FIRST and FOLLOW Sets

Definition

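Let $\Gamma = (N, T, \Pi, S)$ be a CFG and $k \in \mathbb{N}$. $T^{\le k} = \{ u \in T^* \mid \mathit{length}(u) \le k \}$ denotes the set of all terminal strings of length at most $k$.

We define:

$\mathit{FIRST}_k : (N \cup T)^* \to \mathcal{P}(T^{\le k})$

with $\mathit{FIRST}_k(\alpha) = \{ \mathit{prefix}(k, u) \mid \alpha \Rightarrow^* u \}$

where $\mathit{prefix}(n, u) = u$ for all $u$ with $\mathit{length}(u) \le n$.

$\mathit{FOLLOW}_k : (N \cup T)^* \to \mathcal{P}(T^{\le k})$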

FIRST and FOLLOW Sets in Parse Trees

[Figure: parse tree with root S containing a node X; FIRST_k(X) and FOLLOW_k(X) are marked along the leaves.]

Characterization of LL(1) Grammars

Definition (Reduced CFG)

A CFG $\Gamma = (N, T, \Pi, S)$ is reduced if each non-terminal occurs in a derivation from the start symbol and each non-terminal derives at least one word.

Lemma

A reduced CFG is LL(1) if and only if for any two different productions $A \to \beta$ and $A \to \gamma$ it holds that

$$(\mathit{FIRST}_1(\beta) \oplus_1 \mathit{FOLLOW}_1(A)) \cap (\mathit{FIRST}_1(\gamma) \oplus_1 \mathit{FOLLOW}_1(A)) = \emptyset$$

where $L_1 \oplus_1 L_2 = \{ \mathit{prefix}(1, vw) \mid v \in L_1, w \in L_2 \}$.

Remark: FIRST and FOLLOW sets are computable, so this criterion can be checked automatically.

Examples: FIRST_k and FOLLOW_k

Check that the modified expression grammar Γ3 is LL(1).

S → E#
E → TE'
E' → +TE' | ε
T → FT'
T' → *FT' | ε
F → ( E ) | ID

Compute FIRST1 and FOLLOW1 for each non-terminal.
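The FIRST1 and FOLLOW1 sets of Γ3 can be computed with a fixed-point iteration. The sketch below is an illustration under assumptions: it uses the usual textbook characterization (a terminal is in FOLLOW1(X) if it can appear directly after X in a sentential form), the names `GAMMA3`, `first_of` and `first_follow` are hypothetical, and the string `"ε"` marks the empty word inside FIRST1 sets.

```python
EPS = "ε"  # marker for the empty word inside FIRST sets

GAMMA3 = {
    "S": [["E", "#"]],
    "E": [["T", "E'"]],
    "E'": [["+", "T", "E'"], []],
    "T": [["F", "T'"]],
    "T'": [["*", "F", "T'"], []],
    "F": [["(", "E", ")"], ["ID"]],
}


def first_of(symbols, first):
    """FIRST_1 of a symbol string, given the FIRST_1 sets of the non-terminals."""
    result = set()
    for sym in symbols:
        sym_first = first[sym] if sym in first else {sym}   # a terminal yields itself
        result |= sym_first - {EPS}
        if EPS not in sym_first:
            return result
    result.add(EPS)          # every symbol of the string can derive the empty word
    return result


def first_follow(grammar):
    """Iterate until neither the FIRST_1 nor the FOLLOW_1 sets change any more."""
    first = {a: set() for a in grammar}
    follow = {a: set() for a in grammar}
    changed = True
    while changed:
        changed = False
        for a, alternatives in grammar.items():
            for rhs in alternatives:
                new_first = first_of(rhs, first)
                if not new_first <= first[a]:
                    first[a] |= new_first
                    changed = True
                for i, sym in enumerate(rhs):
                    if sym not in grammar:
                        continue                              # only non-terminals
                    rest = first_of(rhs[i + 1:], first)
                    new_follow = rest - {EPS}
                    if EPS in rest:
                        new_follow |= follow[a]               # rest can vanish
                    if not new_follow <= follow[sym]:
                        follow[sym] |= new_follow
                        changed = True
    return first, follow
```

For Γ3 this yields, for instance, FIRST1(E') = {+, ε} and FOLLOW1(E') = {#, )}.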


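Examples: FIRST_k and FOLLOW_k (2)

$F \to ( E ) \mid ID$:

$$(\mathit{FIRST}_1((E)) \oplus_1 \mathit{FOLLOW}_1(F)) \cap (\mathit{FIRST}_1(ID) \oplus_1 \mathit{FOLLOW}_1(F))$$
$$= (\{ ( \} \oplus_1 \mathit{FOLLOW}_1(F)) \cap (\{ ID \} \oplus_1 \mathit{FOLLOW}_1(F))$$
$$= \emptyset$$

$E' \to +TE' \mid \varepsilon$:

$$(\mathit{FIRST}_1(+TE') \oplus_1 \mathit{FOLLOW}_1(E')) \cap (\mathit{FIRST}_1(\varepsilon) \oplus_1 \mathit{FOLLOW}_1(E'))$$
$$= (\{ + \} \oplus_1 \mathit{FOLLOW}_1(E')) \cap (\{ \varepsilon \} \oplus_1 \mathit{FOLLOW}_1(E'))$$
$$= \{ + \} \cap \{ \#, ) \}$$
$$= \emptyset$$

$T' \to *FT' \mid \varepsilon$:

$$(\mathit{FIRST}_1(*FT') \oplus_1 \mathit{FOLLOW}_1(T')) \cap (\mathit{FIRST}_1(\varepsilon) \oplus_1 \mathit{FOLLOW}_1(T'))$$
$$= \{ * \} \cap \{ +, \#, ) \}$$
$$= \emptyset$$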
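The three checks can be replayed mechanically. The following sketch is an illustration under assumptions: words are modelled as tuples of tokens with the empty tuple for ε, the FIRST1 and FOLLOW1 sets of Γ3 are entered by hand (only FOLLOW1(E') = {#, )} is stated explicitly in the text, the others follow from the usual textbook construction), and the names `oplus1`, `FIRST1_OF_ALTERNATIVES`, `FOLLOW1` and `alternatives_are_disjoint` are hypothetical.

```python
def prefix1(word):
    """prefix(1, word) for a word given as a tuple of tokens."""
    return word[:1]


def oplus1(l1, l2):
    """L1 (+)1 L2 = { prefix(1, vw) | v in L1, w in L2 }."""
    return {prefix1(v + w) for v in l1 for w in l2}


# FIRST_1 of the two alternatives of each non-terminal with two productions.
FIRST1_OF_ALTERNATIVES = {
    "F": [{("(",)}, {("ID",)}],
    "E'": [{("+",)}, {()}],
    "T'": [{("*",)}, {()}],
}

# FOLLOW_1 sets of Gamma3, entered by hand for this sketch.
FOLLOW1 = {
    "F": {("*",), ("+",), ("#",), (")",)},
    "E'": {("#",), (")",)},
    "T'": {("+",), ("#",), (")",)},
}


def alternatives_are_disjoint(nonterminal):
    """LL(1) condition of the lemma for a non-terminal with two alternatives."""
    first_beta, first_gamma = FIRST1_OF_ALTERNATIVES[nonterminal]
    follow = FOLLOW1[nonterminal]
    return not (oplus1(first_beta, follow) & oplus1(first_gamma, follow))
```

All three non-terminals pass the check; the remaining non-terminals of Γ3 have only one production each, so the condition holds for them trivially.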


Proof of LL Characterization Lemma

Left-To-Right Direction: Γ is LL(1) implies FIRST and FOLLOW characterization.

Proof by Contradiction:

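Suppose there are two productions $A \to \beta$ and $A \to \gamma$ with $\beta \neq \gamma$ and $\Phi = (\mathit{FIRST}_1(\beta) \oplus_1 \mathit{FOLLOW}_1(A)) \cap (\mathit{FIRST}_1(\gamma) \oplus_1 \mathit{FOLLOW}_1(A)) \neq \emptyset$. Then there exists $z \in \Phi$ with $\mathit{length}(z) = 1$.

As $\Gamma$ is reduced, there are two derivations:

$$S \Rightarrow^* \psi A \alpha \Rightarrow \psi \beta \alpha \Rightarrow^* \psi z x \qquad S \Rightarrow^* \psi A \alpha \Rightarrow \psi \gamma \alpha \Rightarrow^* \psi z y$$

Proof of LL Characterization Lemma (2)

Thus, we can construct the following left derivations:

$$S \Rightarrow^*_{lm} u A \alpha \Rightarrow_{lm} u \beta \alpha \Rightarrow^*_{lm} u z x \qquad S \Rightarrow^*_{lm} u A \alpha \Rightarrow_{lm} u \gamma \alpha \Rightarrow^*_{lm} u z y$$

with $\mathit{prefix}(1, zx) = z = \mathit{prefix}(1, zy)$, which contradicts the LL(1) property of $\Gamma$.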

Proof of LL Characterization Lemma (3)

Right-To-Left Direction: FIRST and FOLLOW characterization implies Γ is LL(1) .

Proof by Contradiction:

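Suppose $\Gamma$ is not LL(1). Then there are two different derivations with $\beta \neq \gamma$ and $\mathit{length}(z) = 1$:

$$S \Rightarrow^* \psi A \alpha \Rightarrow \psi \beta \alpha \Rightarrow^* \psi z x \qquad S \Rightarrow^* \psi A \alpha \Rightarrow \psi \gamma \alpha \Rightarrow^* \psi z y$$

But then $z \in (\mathit{FIRST}_1(\beta) \oplus_1 \mathit{FOLLOW}_1(A)) \cap (\mathit{FIRST}_1(\gamma) \oplus_1 \mathit{FOLLOW}_1(A))$, so this intersection is not empty, which is a contradiction.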

Parser Generation for LL(k) Languages

[Figure: an LL(k) parser generator takes a grammar as input and produces either the table for a push-down automaton or the error "Grammar is not LL(k)".]

Parser Generation for LL(k) Languages (2)

Remarks:

Use of push-down automata with look-ahead

Selection of productions from tables

Advantages over bottom-up techniques in error analysis and error handling

Example system: ANTLR (http://www.antlr.org/)

Recommended Reading for Top-Down Analysis:

Wilhelm, Maurer: Chapter 8, Sections 8.3.1 to 8.3.4, pp. 312-329
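How a push-down automaton selects productions from a table can be sketched for the LL(1) grammar Γ3. This is a hand-written illustration, not the output of a parser generator such as ANTLR: the table entries were derived by hand from the FIRST1 and FOLLOW1 sets of Γ3, and the names `TABLE`, `NONTERMINALS` and `ll1_parse` are assumptions.

```python
NONTERMINALS = {"S", "E", "E'", "T", "T'", "F"}

# TABLE[(non-terminal, look-ahead token)] = right-hand side to expand with.
TABLE = {
    ("S", "("): ("E", "#"), ("S", "ID"): ("E", "#"),
    ("E", "("): ("T", "E'"), ("E", "ID"): ("T", "E'"),
    ("E'", "+"): ("+", "T", "E'"), ("E'", ")"): (), ("E'", "#"): (),
    ("T", "("): ("F", "T'"), ("T", "ID"): ("F", "T'"),
    ("T'", "*"): ("*", "F", "T'"),
    ("T'", "+"): (), ("T'", ")"): (), ("T'", "#"): (),
    ("F", "("): ("(", "E", ")"), ("F", "ID"): ("ID",),
}


def ll1_parse(tokens):
    """Return the productions of the left derivation, or raise SyntaxError."""
    tokens = list(tokens) + ["#"]     # append the end marker
    stack = ["S"]                     # the top of the stack is the end of the list
    pos = 0
    derivation = []
    while stack:
        top = stack.pop()
        look = tokens[pos] if pos < len(tokens) else None
        if top in NONTERMINALS:
            rhs = TABLE.get((top, look))
            if rhs is None:
                raise SyntaxError(f"no production for {top} with look-ahead {look!r}")
            derivation.append((top, rhs))
            stack.extend(reversed(rhs))   # left-most symbol ends up on top
        elif top == look:
            pos += 1                      # match a terminal
        else:
            raise SyntaxError(f"expected {top!r}, found {look!r}")
    if pos != len(tokens):
        raise SyntaxError("input continues after the end marker")
    return derivation
```

For the single token `ID`, the sketch returns the six productions S → E#, E → TE', T → FT', F → ID, T' → ε, E' → ε, in the order of the left derivation.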
