
Implementation of Parsers Bottom-Up Syntax Analysis

Bottom-Up Syntax Analysis

Educational Objectives:

General Principles of Bottom-Up Syntax Analysis

LR(k) Analysis

Resolving Conflicts in Parser Generation

Connection between CFGs and push-down automata

Ina Schaefer Context-Free Analysis 59


Basic Ideas: Bottom-Up Syntax Analysis

Bottom-up analysis is more powerful than top-down analysis: a production is chosen only at the end, once its complete right-hand side has been read, whereas top-down analysis has to select the production up front.

LR: read the input from left to right (L) and construct a rightmost derivation (R) in reverse.


Principles of LR Parsing

1. Reduce from the sentence to the axiom (start symbol).

2. Construct sentential forms from prefixes in (N ∪ T)* and input rests in T*. The prefixes are prefixes of right sentential forms of the grammar; such prefixes are called viable prefixes. This prefix property has to hold invariantly to avoid dead ends.

3. Reductions are always made at the left-most possible position.


Viable Prefix

Definition

Let $S \Rightarrow_{rm}^{*} \beta A u \Rightarrow_{rm} \beta\alpha u$ be a right derivation of Γ, so that βαu is a right sentential form of Γ.

Then α is called the handle or redex of the right sentential form βαu.

Each prefix of βα is a viable prefix of Γ.


Regularity of Viable Prefixes

Theorem

The language of viable prefixes of a grammar Γ is regular.

Proof.

Cf. Wilhelm, Maurer, Thm. 8.4.1 and Corollary 8.4.2.1 (pp. 361, 362). The essential proof steps are illustrated in the following by the construction of LR-DFA(Γ).


LR Parsing: Examples

Consider Γ1:

S → aCD

C → b

D → a | b

The analysis of aba can lead to a dead end (cf. lecture): after reading ab, reducing b to D instead of C yields aD, which is not a viable prefix.

Considering viable prefixes avoids this.


LR Parsing: Examples (2)

Consider Γ2:

S → E#

E → a | (E) | EE

Analysis of ( ( a ) ) ( a ) # (cf. lecture).

The stack can manage the prefixes already read.


LR Parsing: Examples (3)

Consider Γ3:

S → E#

E → E+T | T

T → ID

Analysis of ID + ID + ID # (cf. lecture)


LR Parsing: Shift and Reduce

Schematic syntax tree for input xay with a ∈ T, x, y ∈ T* and start symbol S:

[Figure: schematic syntax tree over the input x a y, with the position of the read pointer marked.]


LR Parsing: Shift and Reduce (2)

Shift step: the read pointer moves over the next terminal a, which extends the prefix read so far.

[Figure: schematic syntax tree before and after a shift step.]

Reduce step: a handle at the end of the prefix read so far is replaced by the left-hand side of the corresponding production; the read pointer does not move.

[Figure: schematic syntax tree before and after a reduce step.]


LR Parsing: Shift and Reduce (3)

Main Problems:

Reductions may only be performed if the resulting prefix is still a viable prefix.

When to shift? When to reduce? Which production to use?

Solution:

For each grammar Γ, construct the automaton LR-DFA(Γ) (also called the LR(0) automaton), which describes the viable prefixes.


Construction of LR-DFA

Let Γ = (N, T, Π, S) be a CFG.

For each non-terminal n ∈ N, construct its item automaton.

Build the union of the item automata: the start state is the start state of the item automaton for S; the final states are the final states of the item automata.

Add ε-transitions from each state in which the position point stands in front of a non-terminal A to the start state of the item automaton of A.

If all states of the LR-DFA are considered final states, the accepted language is the language of viable prefixes.
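The construction can be sketched in a few lines of Python. This is a minimal sketch, not the lecture's implementation: it fuses the ε-transitions and the determinisation into the usual closure/goto formulation, the names `closure`, `goto`, `build_lr_dfa` and `show` are chosen here for illustration, and an item is assumed to be encoded as a pair (production index, position of the point). The grammar is Γ3 (S → E#, E → E+T | T, T → ID).

```python
from collections import deque

# Grammar Γ3 as (left-hand side, right-hand side) pairs; "#" is the end marker.
GRAMMAR = [
    ("S", ("E", "#")),
    ("E", ("E", "+", "T")),
    ("E", ("T",)),
    ("T", ("ID",)),
]
NONTERMINALS = {lhs for lhs, _ in GRAMMAR}


def closure(items):
    """Follow ε-transitions: add [B -> .γ] whenever the point stands before B."""
    result, todo = set(items), list(items)
    while todo:
        prod, dot = todo.pop()
        rhs = GRAMMAR[prod][1]
        if dot < len(rhs) and rhs[dot] in NONTERMINALS:
            for i, (lhs, _) in enumerate(GRAMMAR):
                if lhs == rhs[dot] and (i, 0) not in result:
                    result.add((i, 0))
                    todo.append((i, 0))
    return frozenset(result)


def goto(state, symbol):
    """Move the point over `symbol` in every item of `state` that allows it."""
    moved = {(p, d + 1) for p, d in state
             if d < len(GRAMMAR[p][1]) and GRAMMAR[p][1][d] == symbol}
    return closure(moved) if moved else None


def build_lr_dfa():
    """Return (item set -> state number, (state number, symbol) -> state number)."""
    start = closure({(0, 0)})
    states, edges, queue = {start: 0}, {}, deque([start])
    while queue:
        state = queue.popleft()
        symbols = {GRAMMAR[p][1][d] for p, d in state if d < len(GRAMMAR[p][1])}
        for sym in sorted(symbols):
            target = goto(state, sym)
            if target not in states:
                states[target] = len(states)
                queue.append(target)
            edges[(states[state], sym)] = states[target]
    return states, edges


def show(item):
    """Render an item in the bracket notation of the slides."""
    prod, dot = item
    lhs, rhs = GRAMMAR[prod]
    return f"[{lhs} -> {' '.join(rhs[:dot])}.{' '.join(rhs[dot:])}]"


states, edges = build_lr_dfa()
for state, number in sorted(states.items(), key=lambda kv: kv[1]):
    print(number, sorted(show(item) for item in state))
```

The sketch numbers the item sets in breadth-first order, so the numbering need not match the slides. The error state is left implicit: a missing entry in `edges` plays the role of an error transition.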


Construction of LR-DFA: Example

Γ3: S → E#, E → E+T | T, T → ID

[Figure: item automata for Γ3 before determinisation.]

[S → .E#] --E--> [S → E.#] --#--> [S → E#.]

[E → .E+T] --E--> [E → E.+T] --+--> [E → E+.T] --T--> [E → E+T.]

[E → .T] --T--> [E → T.]

[T → .ID] --ID--> [T → ID.]

ε-transitions lead from each item with the position point in front of E or T to the initial items of E or T, respectively.


Construction of LR-DFA: Example (2)

Determinisation:

[Figure: LR-DFA for Γ3 with states q0, ..., q7; q7 is the error state, and all transitions not listed are error transitions.]

q0: [S → .E#], [E → .E+T], [E → .T], [T → .ID]

q1: [S → E.#], [E → E.+T]

q2: [S → E#.]

q3: [E → E+.T], [T → .ID]

q4: [E → E+T.]

q5: [E → T.]

q6: [T → ID.]

Transitions: q0 --E--> q1, q0 --T--> q5, q0 --ID--> q6, q1 --#--> q2, q1 --+--> q3, q3 --T--> q4, q3 --ID--> q6

Viable prefixes of maximal length: E#, T, ID, E+ID, E+T


Construction of LR-DFA: Example (3)

Remarks:

In the example, each final state contains exactly one completely read production. In general, this is not the case.

If a final state contains more than one completely read production, we have a reduce/reduce conflict.

If a final state contains a completely read production and an incompletely read production with a terminal after the position point, we have a shift/reduce conflict.


Analysis with LR-DFA

Analysis of ID + ID + ID # with the LR-DFA (in the original slide, the viable prefix of each sentential form is underlined):

ID + ID + ID # ⇐

T + ID + ID # ⇐

E + ID + ID # ⇐

E + T + ID # ⇐

E + ID # ⇐

E + T # ⇐

E # ⇐

S


Analysis with LR-DFA (2)

Note:

The sentential forms always consist of a viable prefix and the remaining input.

If only the LR-DFA is used for the analysis, the sentential form has to be read again from the beginning after each reduction.

Thus: use a pushdown automaton for the analysis.


LR pushdown automaton

Definition

Let Γ = (N, T, Π, S) be a CFG. The LR-DFA pushdown automaton for Γ contains:

a finite set of states Q (the states of LR-DFA(Γ))

a set of actions Act = {shift, accept, error} ∪ red(Π), where red(Π) contains an action reduce(A → α) for each production A → α

an action table at : Q → Act

a successor table succ : P × (N ∪ T) → Q with P = {q ∈ Q | at(q) = shift}


LR pushdown automaton (2)

Remarks:

The LR-DFA pushdown automaton is a variant of pushdown automata particularly designed for LR parsing.

States encode the left context read so far.

If there are no conflicts, the action table can be directly constructed from the LR-DFA:

- accept: final state of the item automaton of the start symbol

- reduce: all other final states

- error: error state

- shift: all other states


Execution of Pushdown Automaton

Configuration: an element of Q* × T*, where the variable stack denotes the sequence of states and the variable inr denotes the remaining input

Start configuration: (q0, input), where q0 is the start state of the LR-DFA

Interpretation Procedure:

(stack, inr) := (q0, input);

do {

step(stack, inr);

} while ( at(top(stack)) != accept and at(top(stack)) != error );

if ( at(top(stack)) == error ) return error;

with


Execution of Push-Down Automaton (2)

void step ( var StateSeq stack, var SymbolSeq inr ) {
  State tk := top(stack);
  switch ( at(tk) ) {
    case shift:
      // push the successor state for the next input symbol and consume it
      stack := push( succ(tk, top(inr)), stack );
      inr := tail(inr);
      break;
    case reduce A -> a:
      // pop one state per symbol of the right-hand side, then follow A
      stack := mpop( length(a), stack );
      stack := push( succ(top(stack), A), stack );
      break;
  }
}
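The interpretation procedure and step can be tried out with a small Python sketch. It is an illustration under assumptions rather than the lecture's code: the tables are those of the LR-DFA for Γ3 with states q0, ..., q7, a reduce action is assumed to be encoded as a tuple (`"reduce"`, left-hand side, length of the right-hand side), missing successor entries are assumed to lead to the error state q7, and the names `parse` and `trace` are chosen here.

```python
SHIFT, ACCEPT, ERROR = "shift", "accept", "error"

# Action table at: Q -> Act for Γ3 (S -> E#, E -> E+T | T, T -> ID).
ACTION = {
    "q0": SHIFT, "q1": SHIFT, "q2": ACCEPT, "q3": SHIFT,
    "q4": ("reduce", "E", 3),   # E -> E + T
    "q5": ("reduce", "E", 1),   # E -> T
    "q6": ("reduce", "T", 1),   # T -> ID
    "q7": ERROR,
}

# Successor table succ; entries not listed lead to the error state q7.
SUCC = {
    ("q0", "ID"): "q6", ("q0", "E"): "q1", ("q0", "T"): "q5",
    ("q1", "+"): "q3", ("q1", "#"): "q2",
    ("q3", "ID"): "q6", ("q3", "T"): "q4",
}


def succ(state, symbol):
    return SUCC.get((state, symbol), "q7")


def step(stack, inr):
    """Perform one shift or reduce step; returns the new remaining input."""
    action = ACTION[stack[-1]]
    if action == SHIFT:
        if not inr:                      # input exhausted in a shift state
            stack.append("q7")
        else:
            stack.append(succ(stack[-1], inr[0]))
            inr = inr[1:]
    else:                                # reduce A -> a
        _, lhs, length = action
        del stack[-length:]              # mpop(length(a), stack); length >= 1 here
        stack.append(succ(stack[-1], lhs))
    return inr


def parse(tokens):
    """Run the automaton; returns (accepted?, list of stack contents)."""
    stack, inr = ["q0"], list(tokens)
    trace = [" ".join(stack)]
    while ACTION[stack[-1]] not in (ACCEPT, ERROR):
        inr = step(stack, inr)
        trace.append(" ".join(stack))
    return ACTION[stack[-1]] == ACCEPT, trace


accepted, trace = parse("ID + ID + ID #".split())
print(accepted)
for line in trace:
    print(line)
```

Because the start state q0 is a shift state, the `while` loop behaves like the do-while loop of the interpretation procedure. The sketch relies on Γ3 having no ε-productions; a reduce of length 0 would need a separate case.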


LR push down automaton: Example

LR-DFA with states q0, ..., q7 for grammar Γ3

Action Table

State   Action
q0      shift
q1      shift
q2      accept
q3      shift
q4      reduce E → E+T
q5      reduce E → T
q6      reduce T → ID
q7      error

Successor Table

State   ID   +    #    E    T
q0      q6   q7   q7   q1   q5
q1      q7   q3   q2   q7   q7
q3      q6   q7   q7   q7   q4


LR push down automaton: Example (2)

Computation for Input ID + ID + ID #

Stack          Input Rest        Action
q0             ID + ID + ID #    shift
q0 q6          + ID + ID #       reduce T → ID
q0 q5          + ID + ID #       reduce E → T
q0 q1          + ID + ID #       shift
q0 q1 q3       ID + ID #         shift
q0 q1 q3 q6    + ID #            reduce T → ID
q0 q1 q3 q4    + ID #            reduce E → E+T
q0 q1          + ID #            shift
q0 q1 q3       ID #              shift
q0 q1 q3 q6    #                 reduce T → ID
q0 q1 q3 q4    #                 reduce E → E+T
q0 q1          #                 shift
q0 q1 q2                         accept


LR-DFA Construction

Questions:

Does LR-DFA construction work for all unambiguous grammars?

For which grammars does the construction work?

How can the construction be generalized / made more expressive?


Example LR-DFA

LR-DFA for Γ6: S → E#, E → T+E | T, T → ID | N(), N → ID

[Figure: LR-DFA for Γ6 with states q0, ..., q10 and an error state; all transitions not shown are error transitions. State q4 contains the items [E → T.+E] and [E → T.]; state q6 contains the items [T → ID.] and [N → ID.].]


LR Parsing Conflicts

2 Kinds of Conflicts:

Shift/Reduce Conflicts (q4 in example)

Reduce/Reduce Conflicts (q6 in example)
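Both kinds of conflict can be detected mechanically from the items of a state. The following Python sketch is illustrative only: the function name `classify` and the encoding of an item as (left-hand side, right-hand side, position of the point) are assumptions made here, and the two states are the item sets that the SLR(1) discussion of Γ6 names as conflicting.

```python
def classify(state, terminals):
    """Return the kinds of LR(0) conflicts found in one state.

    An item is a triple (lhs, rhs, dot) with rhs a tuple of symbols.
    """
    complete = [it for it in state if it[2] == len(it[1])]
    shiftable = [it for it in state
                 if it[2] < len(it[1]) and it[1][it[2]] in terminals]
    found = set()
    if len(complete) > 1:
        found.add("reduce/reduce")   # several completely read productions
    if complete and shiftable:
        found.add("shift/reduce")    # complete item next to a terminal after the point
    return found


TERMINALS = {"ID", "+", "#", "(", ")"}

# States of the LR-DFA for Γ6 (S -> E#, E -> T+E | T, T -> ID | N(), N -> ID).
q4 = {("E", ("T", "+", "E"), 1), ("E", ("T",), 1)}
q6 = {("T", ("ID",), 1), ("N", ("ID",), 1)}

print(classify(q4, TERMINALS))
print(classify(q6, TERMINALS))
```

A state without any conflict yields the empty set, so a grammar can be checked by running `classify` over all states of its LR-DFA.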


LR Parsing Theory

Definition

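Let Γ = (N, T, Π, S) be a CFG and $k \in \mathbb{N}$. Γ is an LR(k) grammar if for any two right derivations

$S \Rightarrow_{rm}^{*} \alpha A u \Rightarrow_{rm} \alpha\beta u$

$S \Rightarrow_{rm}^{*} \gamma B v \Rightarrow_{rm} \alpha\beta w$

it holds that:

If prefix(k, u) = prefix(k, w), then α = γ, A = B and v = w.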


LR Parsing Theory (2)

Remarks:

While for LL grammars the selection of the production depends on the non-terminal to be derived, for LR grammars it depends on the complete left context.

For LL grammars, the look ahead considers the language to be generated from the non-terminal. For LR grammars, the look ahead considers the language generated from not yet read non-terminals.


Characterization of LR(0)

Theorem

A reduced CFG Γ is LR(0) if and only if LR-DFA(Γ) contains no conflicts.

Proof.

cf. Lecture


Characterization of LR(0) (2)

Example: Application of the LR(0) Characterization

Show (using the above theorem) that Γ5 is LR(0).

Γ5:

S → A | B

A → aAb | 0

B → aBbb | 1


Expressiveness of LR(k)

For each deterministic context-free language L with the prefix property (i.e. ∀ v, w ∈ L with v ≠ w: v is not a prefix of w), there exists an LR(0) grammar.

Grammar Γ5 is not LL(k) for any k, but it is LR(0).

Methods for LR(1) can be generalized to LR(k), SLR(k) and LALR(k).


Resolving Conflicts by Look Ahead

Compute look ahead sets from (N ∪ T)^k for items. The look ahead set of an item approximates the set of prefixes of length k with which the input rest at this item can start.

If the look ahead sets at an item are disjoint, then the action to be executed (shift, reduce) can be determined by k symbols look ahead.

For an item, select the action whose look ahead set contains the prefix of the input rest. Action table has to be extended.

For computation of look ahead sets, there are different methods.


Common Methods for Look Ahead Computation

SLR(k) uses LR-DFA and FOLLOWk of conflicting items for look ahead

LALR(k) - look ahead LR - uses LR-DFA with state-dependent look ahead sets

LR(k) integrates computation of look ahead sets in automata construction (LR(k) automaton)


SLR Grammars

Definition (SLR(1) grammar)

Let Γ = (N, T, Π, S) be a CFG and LA([A → α.]) = FOLLOW1(A).

A state of LR-DFA(Γ) has an SLR(1) conflict if it contains two different reduce items [A → α.] and [B → β.] with LA([A → α.]) ∩ LA([B → β.]) ≠ ∅, or two items [A → α.] and [B → γ.aδ] with a ∈ LA([A → α.]).

Γ is SLR(1) if no state has an SLR(1) conflict.


SLR Grammars (2)

Example: Γ6 is an SLR(1) grammar

S → E#

E → T+E | T

T → ID | N()

N → ID

Consider the conflicts between [E → T.] and [E → T.+E] and between [T → ID.] and [N → ID.]:

FOLLOW1(E) ∩ {+} = {#} ∩ {+} = ∅

FOLLOW1(T) ∩ FOLLOW1(N) = {#, +} ∩ {(} = ∅
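The FOLLOW1 sets used in this check can be computed by a fixed-point iteration. The Python sketch below is a simplified version under a stated assumption: the grammar has no ε-productions (true for Γ6), so FIRST1 of a symbol sequence is determined by its first symbol alone. The function names `first_sets` and `follow_sets` and the dictionary encoding of the grammar are chosen here for illustration.

```python
# Γ6: S -> E#, E -> T+E | T, T -> ID | N(), N -> ID
GRAMMAR = {
    "S": [["E", "#"]],
    "E": [["T", "+", "E"], ["T"]],
    "T": [["ID"], ["N", "(", ")"]],
    "N": [["ID"]],
}


def first_sets(grammar):
    """FIRST1 per non-terminal; assumes the grammar has no ε-productions."""
    first = {n: set() for n in grammar}
    changed = True
    while changed:
        changed = False
        for lhs, alternatives in grammar.items():
            for rhs in alternatives:
                head = rhs[0]
                new = first[head] if head in grammar else {head}
                if not new <= first[lhs]:
                    first[lhs] |= new
                    changed = True
    return first


def follow_sets(grammar):
    """FOLLOW1 per non-terminal; assumes the grammar has no ε-productions."""
    first = first_sets(grammar)
    follow = {n: set() for n in grammar}
    changed = True
    while changed:
        changed = False
        for lhs, alternatives in grammar.items():
            for rhs in alternatives:
                for i, sym in enumerate(rhs):
                    if sym not in grammar:
                        continue                     # terminals have no FOLLOW set
                    if i + 1 < len(rhs):
                        nxt = rhs[i + 1]
                        new = first[nxt] if nxt in grammar else {nxt}
                    else:
                        new = follow[lhs]            # sym ends the right-hand side
                    if not new <= follow[sym]:
                        follow[sym] |= new
                        changed = True
    return follow


follow = follow_sets(GRAMMAR)
print(follow["E"] & {"+"})            # shift/reduce check for [E -> T.] vs [E -> T.+E]
print(follow["T"] & follow["N"])      # reduce/reduce check for [T -> ID.] vs [N -> ID.]
```

Both intersections come out empty for Γ6, matching the calculation above. For grammars with ε-productions, the FIRST and FOLLOW computations would need the usual nullable handling, which this sketch omits.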


SLR Grammars (3)

Example: Γ7 (simplified C expressions) is not an SLR(1) grammar

S → E#

E → L=R | R

L → ∗R | ID

R → L


SLR Grammars (4)

LR-DFA for Γ7

[Figure: LR-DFA for Γ7.]

The only state with a conflict contains the items [E → L.=R] and [R → L.], with

FOLLOW1(R) ∩ {=} = {=, #} ∩ {=} = {=} ≠ ∅


Construction of LR(1) Automata

The LR(1) automaton contains items [A → α.β, V] with V ⊆ T, where

α is on top of the stack

the beginning of the input rest is derivable from βc with c ∈ V, i.e. V ⊆ FOLLOW1(A).


Construction of LR(1) Automata (2)

LR(1) automaton for Γ7. The conflict is resolved, as {=} ∩ {#} = ∅.

[Figure: LR(1) automaton for Γ7 with items of the form [A → α.β, V]. The state that has the conflict in the LR-DFA now contains [E → L.=R, {#}] and [R → L., {#}]: shift on =, reduce on #.]


LALR(1) Automata

LALR(1) automata are constructed from LR(1) automata by merging states in which the items differ only in their look ahead sets. The look ahead sets of equal items are united. The resulting automaton has the same states as the LR-DFA.

However, LALR(1) automata can be constructed directly, which is more efficient.


LALR(1) Automata (2)

LALR(1) automaton for Γ7.

[Figure: LALR(1) automaton for Γ7 with states q0, ..., q9, obtained by merging the LR(1) states whose items differ only in their look ahead sets, e.g. into [L → ID., {#, =}] and [R → L., {#, =}].]


Grammar Classes

[Figure: relationship of the grammar classes. Within the unambiguous grammars: LR(k) ⊇ LR(1) ⊇ LALR(1) ⊇ SLR(1) ⊇ LR(0) and LL(k) ⊇ LL(1) ⊇ LL(0). The ambiguous grammars lie outside all of these classes.]


Literature

Recommended Reading for Bottom-Up Analysis:

Wilhelm, Maurer: Chapter 8, Sections 8.4.1 - 8.4.5, pp. 353 - 383


Parser Generators

Educational Objectives

Usage of Parser Generators

Characteristics of Parser Generators


JavaCUP Parser Generator

CUP - Constructor of Useful Parsers

http://www2.cs.tum.edu/projects/cup/

Java-based Generator for LALR-Parsers

JFlex can be used to generate a matching scanner.

Running JavaCUP:

java -jar java-cup-11a.jar options inputfile


Structure of JavaCUP Specification

package JavaPackageName;

import java_cup.runtime.*;

/* User supplied code for scanner, actions, ... */

/* Terminals (tokens returned by the scanner). */

terminal TerminalDecls;

/* Non-terminals */

non terminal NonTerminalDecls;

/* Precedences */

precedence [left | right | nonassoc ] TerminalList;

/* Grammar */

start with non_terminalName;

non_terminalName ::= prod_1 | ... | prod_n ;


Example: JavaCUP Specification for Γ7

import java_cup.runtime.*;

/* Terminals (tokens returned by the scanner). */

terminal ID, EQ, MULT;

/* Non terminals */

non terminal S, E, L, R;

/* The grammar */

start with S;

S ::= E;

E ::= L EQ R | R;

L ::= MULT R | ID;

R ::= L;


Structure of Generated Parser Code

Output files: parser.java and sym.java

Tables for the LALR automaton:

- Production table: provides the symbol number of the left-hand side non-terminal, along with the length of the right-hand side, for each production in the grammar

- Action table: indicates what action (shift, reduce, or error) is to be taken on each lookahead symbol when encountered in each state

- Reduce-goto table: indicates which state to shift to after a reduce


Usage of Generated Parser

Parser calls scanner with scan() method when a new terminal is needed

Initialising Parser with new Scanner

parser parser_obj = new parser(new my_scanner());

Usage of Parser:

Symbol parse_tree = parser_obj.parse();


Error Handling

Educational Objectives:

Problems and Principles of Error Handling

Techniques of Error Handling for Context-Free Analysis


Principles of Error Handling

Error handling is required in all analysis phases and at runtime. One distinguishes

lexical errors

parse errors (in context-free analysis)

errors in name and type analysis

runtime errors (cannot be avoided in most cases)

logical errors (behavioural errors)

The first two (three) kinds of errors are syntactic errors. We only consider error handling in context-free analysis.

The specification of error handling essentially follows from the language specification.


Requirements for error handling

Errors should be localized as exactly as possible.

(Problem: an error is not always detected at the position where it occurs.)

As many errors as possible should be detected in one run, but only real errors and no follow-up errors.

Errors are not always unambiguous, i.e. in general it is not clear how to correct an error: class int { Int a; .... } or int a = 1-;

Error handling should not slow down the analysis of correct programs.

Therefore, error handling is non-trivial and depends on the source language to be analysed.


Error Handling in Context-Free Analysis

1. Panic Error Handling

Mark synchronizing terminal symbols, e.g. end or ;

If the parser reaches the error state, all symbols up to the next synchronizing symbol are skipped and the stack is corrected as if the production with the synchronizing symbol had been read correctly.

- Pros: easy to implement, termination guaranteed

- Cons: large parts of the program can be skipped or misinterpreted

- Example: incorrect input a := b *** c;

Read until ; then correct the stack and resume as if the statement had been accepted
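The skipping part of panic error handling can be sketched as follows. This is a minimal Python illustration, not a complete recovery scheme: the function name `panic_skip`, the token list and the position at which the error is assumed to be detected are all chosen here, and the correction of the parser stack is not modelled.

```python
def panic_skip(tokens, pos, sync=frozenset({";", "end"})):
    """Return the index just after the next synchronizing symbol at or after pos.

    If no synchronizing symbol follows, the whole rest of the input is skipped.
    """
    while pos < len(tokens) and tokens[pos] not in sync:
        pos += 1
    return min(pos + 1, len(tokens))


# Incorrect input a := b *** c; followed by a correct statement.
tokens = ["a", ":=", "b", "*", "*", "*", "c", ";", "d", ":=", "e", ";"]
error_pos = 4                      # assumed: the second "*" triggers the error state
resume = panic_skip(tokens, error_pos)
print(tokens[resume:])             # the parser would continue with these tokens
```

Termination is easy to see here: `pos` only grows and is bounded by the input length. The price is the one named above, since everything between the error and the synchronizing symbol is discarded unanalysed.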


Error Handling in Context-Free Analysis (2)

2. Error Productions

Extend the grammar with productions describing typical error situations, so-called error productions. Error messages can be directly associated with error productions.

- Pros: easy to implement, termination guaranteed

- Cons: the extended grammar can belong to a more general grammar class; knowledge of typical error situations is necessary

- Example: typical error in PASCAL: if ... then A := E; else ...

Error production:

Stmt → if Expr then Stmt ; else Stmt


Error Handling in Context-Free Analysis (3)

3. Production-Local Error Correction

The goal is a local correction of the input such that the analysis can be resumed. Local means that the parser tries to correct the input for the current production.

- Pros: flexible and powerful technique

- Cons: problematic if errors occur earlier than they can be detected; correction operations can lead to a non-terminating analysis


Error Handling in Context-Free Analysis (4)

4. Global Error Correction

Attempt to get a correction that is as good as possible by altering the read input or the look ahead input.

Idea: Define distance or quality measure on inputs. For each incorrect input, look for a syntactically correct input that is best according to the used measure.

- Pros: very powerful technique

- Cons: the analysis effort can be rather high; the implementation is complex and poses a risk of non-termination


Error Handling in Context-Free Analysis (5)

5. Interactive Error Correction

In modern programming languages, syntactic analysis is often already supported by editors. In this case, editor marks error positions.

- Pros: quick feedback, possible error positions are shown directly, interaction with the programmer is possible

- Cons: editing can be disturbed, the analysis must be able to handle incomplete programs

The presented techniques can also be combined. For selection of technique, programming language syntax is important. Error handling also depends on grammar class and implementation techniques used for parser.


Burke-Fischer Error Handling

Example of global error correction technique

Procedure: Use a correction window of n symbols before the symbol at which the error was detected. Check all variations of the symbol sequence in the correction window that can be obtained by insertion, deletion or replacement of a symbol at any position.

Quality measure: Choose the variation that allows the longest continuation of the parsing procedure.

Implementation: Work with two pushdown automata: one represents the configuration at the beginning of the correction window, the other the configuration at the end of the correction window. In case of an error, the automaton running behind can be used to resume at the old position and to test the computed variations.


Literature

Recommended Reading: Wilhelm, Maurer: Chapter 8, Sections 8.3.6 and 8.4.6 (general understanding sufficient)


Concrete and Abstract Syntax

Educational Objectives

Connection of parsing to other phases of program processing and translation

Differences between abstract and concrete syntax

Language concepts for describing syntax trees

Syntax tree construction


Connection of Parsers to other Phases

1. Parser directly controls following phases

2. Concrete Syntax Tree as Interface

3. Abstract Syntax Tree as Interface


Direct Control by Parser

Example: Recursive Descent: Parser calls other actions after each derivation/reduction step

Pros:
• simple (if realisable)
• flexible
• efficient (especially memory efficient)

Cons:
• non-modular, no clear interfaces
• not suitable for global aspects of translation
• following phases depend on parsing
• cannot be used with every parser generator
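As an illustration of direct control, the following Python sketch shows a recursive descent parser that emits postfix code as soon as a production has been recognised, without building any tree. It assumes the expression grammar S → E #, E → T + E | T, T → F ∗ T | F, F → ( E ) | ID with single-string tokens; the function name `translate` and the choice of postfix output as the "following phase" are assumptions made for this example.

```python
def translate(tokens):
    """Parse tokens and emit postfix code directly, without building a syntax tree."""
    pos = 0
    out = []                        # stands in for the following phase

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        token = peek()
        pos += 1
        return token

    def expect(symbol):
        if peek() != symbol:
            raise SyntaxError(f"expected {symbol!r} at position {pos}, found {peek()!r}")
        advance()

    def parse_e():                  # E -> T + E | T
        parse_t()
        if peek() == "+":
            expect("+")
            parse_e()
            out.append("+")         # action runs right after the production is recognised

    def parse_t():                  # T -> F * T | F
        parse_f()
        if peek() == "*":
            expect("*")
            parse_t()
            out.append("*")

    def parse_f():                  # F -> ( E ) | ID
        if peek() == "(":
            expect("(")
            parse_e()
            expect(")")
        elif peek() is not None and peek().isalnum():
            out.append(advance())   # action for an identifier
        else:
            raise SyntaxError(f"unexpected {peek()!r} at position {pos}")

    parse_e()
    expect("#")
    return out
```

Calling `translate(list("(a+b)*c#"))` yields the postfix sequence `a b + c *`. The sketch also shows the drawback listed above: the emitting actions are woven into the parsing functions, so the later phase cannot be exchanged without touching the parser.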


Abstract Syntax vs. Concrete Syntax

Definition (Concrete Syntax)

The concrete syntax of a programming language determines the actual text representation of programs (including keywords and separators).

If Γ is the CFG used for parsing a program P in a certain language, the syntax tree of P according to Γ is the concrete syntax tree of P.

Definition (Abstract Syntax)

The abstract syntax of a programming language describes the tree structure of programs in a form that is sufficient and suitable for further processing.

A tree representing a program P according to the abstract syntax of a language is called an abstract syntax tree of P.


Abstract Syntax

• abstraction from keywords and separators
• operator precedences are represented in the tree structure (different non-terminals are not necessary)
• better incorporation of symbol information
• simplifying transformations

Remarks:
• The abstract syntax of a language is often not specified in the language report.
• The abstract syntax usually also comprises information about source code positions.


Example: Concrete vs. Abstract Syntax

Concrete Syntax: Γ2

S → E #
E → T + E | T
T → F ∗ T | F
F → ( E ) | ID

Abstract Syntax:

Exp = Add | Mult | Ident
Add (Exp left, Exp right)
Mult (Exp left, Exp right)
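One possible way to write this abstract syntax down in a programming language is sketched below with Python dataclasses. The field `name` of `Ident` is an assumption, since the source does not show the components of Ident, and the helper `identifiers` is a hypothetical example of further processing.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import List, Union


@dataclass(frozen=True)
class Ident:
    name: str                       # assumption: the source does not list Ident's fields


@dataclass(frozen=True)
class Add:
    left: Exp
    right: Exp


@dataclass(frozen=True)
class Mult:
    left: Exp
    right: Exp


# Exp = Add | Mult | Ident
Exp = Union[Add, Mult, Ident]


def identifiers(exp: Exp) -> List[str]:
    """Collect the identifier names of an expression from left to right."""
    if isinstance(exp, Ident):
        return [exp.name]
    return identifiers(exp.left) + identifiers(exp.right)


# Abstract syntax tree of (a+b)*c: no parentheses and no T or F nodes are needed,
# because the grouping is expressed by the tree structure itself.
tree = Mult(Add(Ident("a"), Ident("b")), Ident("c"))
```

Here `identifiers(tree)` returns the names `a`, `b`, `c`. The single type `Exp` replaces the three non-terminals E, T, and F of the concrete grammar.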


Example: Concrete vs. Abstract Syntax (2)

Text: (a+b)∗c

[Figure: concrete syntax tree (left) and abstract syntax tree (right) for the text representation ( a + b ) * c, i.e. the token sequence ( ID + ID ) * ID with the identifiers a, b, c. The concrete syntax tree has the root S with children E and #; the abstract syntax tree is Mult(Add(a, b), c).]

Figure source: © A. Poetzsch-Heffter, TU Kaiserslautern, 07.05.2007
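The relation between the two trees for (a+b)∗c can be reproduced with a small Python sketch. It assumes the grammar Γ2 with single-string tokens, represents trees as nested tuples whose first component is the node label, and uses the hypothetical names `parse_concrete`, `to_ast`, and `size`; it is an illustration, not the construction used in the lecture.

```python
def parse_concrete(tokens):
    """Build the concrete syntax tree for Γ2 as nested tuples (label, child, ...)."""
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def eat(symbol):
        nonlocal pos
        if peek() != symbol:
            raise SyntaxError(f"expected {symbol!r} at position {pos}, found {peek()!r}")
        pos += 1
        return symbol               # terminals become string leaves

    def eat_id():
        nonlocal pos
        name = peek()
        if name is None or not name.isalnum():
            raise SyntaxError(f"expected identifier at position {pos}, found {name!r}")
        pos += 1
        return ("ID", name)

    def e():                        # E -> T + E | T
        left = t()
        if peek() == "+":
            return ("E", left, eat("+"), e())
        return ("E", left)

    def t():                        # T -> F * T | F
        left = f()
        if peek() == "*":
            return ("T", left, eat("*"), t())
        return ("T", left)

    def f():                        # F -> ( E ) | ID
        if peek() == "(":
            return ("F", eat("("), e(), eat(")"))
        return ("F", eat_id())

    return ("S", e(), eat("#"))     # S -> E #


def to_ast(node):
    """Transform a concrete syntax tree into an abstract syntax tree."""
    label, children = node[0], node[1:]
    if label == "ID":
        return ("Ident", children[0])
    if label == "S":                # drop the end marker #
        return to_ast(children[0])
    if label == "F" and len(children) == 3:
        return to_ast(children[1])  # drop the parentheses
    if len(children) == 3:          # E -> T + E  or  T -> F * T
        operator = "Add" if label == "E" else "Mult"
        return (operator, to_ast(children[0]), to_ast(children[2]))
    return to_ast(children[0])      # chain productions E -> T, T -> F, F -> ID


def size(tree):
    """Count the nodes and leaves of a tuple tree."""
    if isinstance(tree, str):
        return 1
    return 1 + sum(size(child) for child in tree[1:])
```

For the tokens of `(a+b)*c#`, `to_ast(parse_concrete(...))` gives `("Mult", ("Add", ("Ident", "a"), ("Ident", "b")), ("Ident", "c"))`, and `size` confirms that the abstract tree is smaller than the concrete one because chain productions, parentheses, and the end marker disappear.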


Concrete Syntax Tree as Interface

Token Stream → Parser (with Tree Construction) → Concrete Syntax Tree → Further Language Processing

Counters disadvantages of direct control by parser

Advantages over Abstract Syntax:
• No additional specification of abstract syntax required
• Tree construction does not have to be described.
• Tree construction can be done automatically by parser generators.


Abstract Syntax Tree as Interface

Token Stream → Parser (with Transforming Tree Construction) → Abstract Syntax Tree → Further Language Processing

Advantages over Concrete Syntax:
• Simpler, more compact tree representation
• Simplifies later phases
• Often implemented by programming or specification language as
