
Compilers and Language Processing Tools

Summer Term 2013

Arnd Poetzsch-Heffter Annette Bieniusa

Software Technology Group TU Kaiserslautern

© Arnd Poetzsch-Heffter

Content of Lecture

1. Introduction

2. Syntax and Type Analysis
   2.1 Lexical Analysis
   2.2 Context-Free Syntax Analysis
   2.3 Context-Dependent Analysis (Semantic Analysis)

3. Translation to Intermediate Representation
   3.1 Languages for Intermediate Representation
   3.2 Translation of Imperative Language Constructs
   3.3 Translation of Object-Oriented Language Constructs
   3.4 Translation of Procedures

4. Optimization and Code Generation
   4.1 Assembly and Machine Code
   4.2 Optimization
   4.3 Register Allocation
   4.4 Further Aspects

5. Selected Topics in Compiler Construction
   5.1 Garbage Collection
   5.2 Just-in-time Compilation
   5.3 XML Processing (DOM, SAX, XSLT)


Overview and Motivation Language Processing Tools

Language processing tools

Processing of source texts in (source) languages

Analysis of (source) texts

Translation to target languages


Typical source languages

Programming languages: C, C++, C#, Java, Scala, Haskell, ML, Smalltalk, Prolog

Scripting languages: JavaScript, bash

Languages for configuration management: make, Ant, Maven

Domain-, application-, and tool-specific languages: BPEL, Excel, JFlex, CUP

Specification languages: Z, CASL, Isabelle/HOL, JML

Formatting and data description languages: LaTeX, HTML, XML

Design and architecture description languages: UML, SDL, VHDL, Verilog, ABS


Typical target languages

Assembly, machine, and bytecode languages

Programming languages

Data and layout description languages

Languages for printer control

...


Language implementation tasks

Tool support for language processing

Integration into existing systems

Connection to other systems


Overview and Motivation Application Domains

Application domains

Programming environments (IDE)

- Context-sensitive editors, class browsers
- Graphical programming tools
- Pre-processors
- Compilers
- Interpreters
- Debuggers
- Run-time environments (loading, linking, execution, memory management; e.g. JVM, .NET)


Generation of programs from models (e.g. UML)

Program comprehension, re-engineering

Design and implementation of domain-specific languages

- Robot control
- Simulation tools
- Spreadsheets, active documents

Web technology

- Analysis of Web sites
- Active Web sites (with integrated functionality)
- Optimization of caching


Related fields

Formal languages, language specification and design

Programming and specification languages

Programming, software engineering, software generation, software architecture

System software, computer architecture


Overview and Motivation Tasks of Language Processing Tools

Tasks of Language-Processing Tools

Analyzer: Source Code → Analysis Results

Translator: Source Code → Target Code

Interpreter: Source Code + Input Data → Output Data

Analysis, translation and interpretation are often combined.

1. Translation

- Compiler implements analysis and translation
- OS and real machine implement interpretation

Pros:

- Most efficient solution
- One interpreter for different programming languages
- Prerequisite for other solutions

2. Direct interpretation

- Interpreter implements all tasks.
- Examples: JavaScript, command line languages (bash)
- Pros: No translation necessary (but analysis at run-time)

3. Abstract and virtual machines

- Compiler implements analysis and translation to abstract machine code
- Abstract machine works as interpreter
- Examples: Java/JVM, C#/.NET
- Pros:
  - Platform independent (portability, mobile code)
  - Self-modifying programs possible

4. Other combinations
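The split of work between a compiler and an abstract machine can be sketched in a few lines of Python. This is only an illustration under assumed conventions: the tuple-based expression trees, the instruction names PUSH/ADD/SUB/MUL and the functions compile_expr and run are hypothetical and are not taken from the JVM or .NET instruction sets.

```python
# Hypothetical mini source language: arithmetic expressions as nested tuples,
# e.g. ("+", 1, ("*", 2, 3)) for 1 + 2 * 3.
OPCODES = {"+": "ADD", "-": "SUB", "*": "MUL"}


def compile_expr(expr):
    """Compiler: translate an expression tree into abstract machine code."""
    if isinstance(expr, int):
        return [("PUSH", expr)]
    operator, left, right = expr
    # Post-order: code for both operands first, then the operation itself.
    return compile_expr(left) + compile_expr(right) + [(OPCODES[operator], None)]


def run(code):
    """Abstract machine: interpret the code with an operand stack."""
    stack = []
    for opcode, argument in code:
        if opcode == "PUSH":
            stack.append(argument)
            continue
        right = stack.pop()
        left = stack.pop()
        if opcode == "ADD":
            stack.append(left + right)
        elif opcode == "SUB":
            stack.append(left - right)
        elif opcode == "MUL":
            stack.append(left * right)
        else:
            raise ValueError(f"unknown opcode {opcode!r}")
    return stack.pop()


machine_code = compile_expr(("+", 1, ("*", 2, 3)))
print(machine_code)
print(run(machine_code))  # prints 7
```

The list produced by compile_expr does not depend on the platform; only run would have to be re-implemented for a new machine, which is the portability argument made above.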


Overview and Motivation Examples

Example: Analysis

Source text b1_1/Weltklasse.java (the keyword "implements" is misspelled as "implement", so the analysis reports an error):

    package b1_1
    ;
    class Weltklasse extends Superklasse implement BesteBohnen {Qualifikation studieren ( Arbeit schweiss) { return new
    Qualifikation ();}}

Tool: javac analyzer, which also reads Superklasse.class, Qualifikation.class, Arbeit.class, BesteBohnen.class, ...

Output:

    b1_1/Weltklasse.java:4: '{' expected.
    extends Superklasse
                       ^
    1 error

Example: Translation

Source text b1_1/Weltklasse.java:

    package b1_1;

    class Weltklasse extends Superklasse implements BesteBohnen {
        Qualifikation studieren(Arbeit schweiss) {
            return new Qualifikation();
        }
    }

Tool: javac, which also reads Superklasse.class, Qualifikation.class, Arbeit.class, BesteBohnen.class, ...

Result of translation (class file shown in disassembled form):

    Compiled from Weltklasse.java
    class b1_1/Weltklasse extends b1_1.Superklasse implements b1_1.BesteBohnen {
        b1_1/Weltklasse();
        b1_1.Qualifikation studieren(b1_1.Arbeit);
    }

    Method b1_1/Weltklasse()
       0 aload_0
       1 invokespecial #6 <Method b1_1.Superklasse()>
       4 return

    Method b1_1.Qualifikation studieren(b1_1.Arbeit)
       0 new #2 <Class b1_1.Qualifikation>
       3 dup
       4 invokespecial #5 <Method b1_1.Qualifikation()>
       7 areturn

Example 2: Translation

Source text hello_world.c:

    #include <stdio.h>

    int main() {
        printf("Willkommen zur Vorlesung!");
        return 0;
    }

Tool: gcc

Result of translation:

            .file   "hello_world.c"
            .version "01.01"
    gcc2_compiled.:
    .section .rodata
    .LC0:
            .string "Willkommen zur Vorlesung!"
    .text
            .align 16
    .globl main
            .type main,@function
    main:
            pushl %ebp
            movl %esp,%ebp
            subl $8,%esp
            addl $-12,%esp
            pushl $.LC0
            call printf
            addl $16,%esp
            xorl %eax,%eax
            jmp .L2
            .p2align 4,,7
    .L2:
            movl %ebp,%esp
            popl %ebp
            ret
    .Lfe1:
            .size main,.Lfe1-main
            .ident "GCC: (GNU) 2.95.2 19991024 (release)"

Example 3: Translation

Source text groovy.tex (104 bytes):

    \documentclass{article}
    \begin{document}
    \vspace*{7cm}
    \centerline{\Huge\bf It's groovy}
    \end{document}

latex translates groovy.tex to groovy.dvi (207 bytes, binary); dvips translates groovy.dvi to groovy.ps (7136 bytes):

    %!PS-Adobe-2.0
    %%Creator: dvips(k) 5.86 ...
    %%Title: groovy.dvi
    ...

Example: Interpretation

Excerpt from the .class file:

    ...
    14 iload_1
    15 iload_2
    16 idiv
    17 istore_3
    ...

.class file + Input Data → Java Virtual Machine (JVM) → Output Data
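How an interpreter might process the four instructions of the excerpt can be sketched as follows. The function execute and the list-based model of local variables are assumptions made for illustration; a real JVM additionally handles 32-bit overflow, exceptions, frames and many more instructions.

```python
def truncating_div(dividend, divisor):
    """Integer division that rounds toward zero, as the JVM's idiv does."""
    quotient = abs(dividend) // abs(divisor)
    return quotient if (dividend < 0) == (divisor < 0) else -quotient


def execute(instructions, local_vars):
    """Interpret iload_<n>, istore_<n> and idiv on an operand stack."""
    local_vars = list(local_vars)  # work on a copy of the local variable array
    stack = []
    for instruction in instructions:
        name, _, index = instruction.partition("_")
        if name == "iload":
            stack.append(local_vars[int(index)])    # push local variable
        elif name == "istore":
            local_vars[int(index)] = stack.pop()    # pop into local variable
        elif name == "idiv":
            divisor = stack.pop()
            dividend = stack.pop()
            stack.append(truncating_div(dividend, divisor))
        else:
            raise ValueError(f"unsupported instruction {instruction!r}")
    return local_vars


fragment = ["iload_1", "iload_2", "idiv", "istore_3"]
print(execute(fragment, [0, 17, 5, 0]))  # prints [0, 17, 5, 3]
```

The fragment divides local variable 1 by local variable 2 and stores the quotient in local variable 3.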

Example: Combined technique

Java implementation with just-in-time (JIT) compiler:

Java Source Code Unit → javac (Analyzer, Translator) → Java Byte Code (.class file)

.class file + Input Data → JVM with JIT Translator → Output Data

JIT Translator → Machine Code → Real Machine / Hardware


Language Processing Terminology and Requirements

Language processing: The task of translation

Source Code → Translator → Error Message or Target Code

Translator (in a broader sense):
Analysis, optimization and translation

Source code:
Input (string) for translator in syntax of source language (SL)

Target code:
Output (string) of translator in syntax of target language (TL)

Phases of language processing

Analysis of input:

- Program text
- Specification
- Diagrams

Depending on the goal of the implementation:

- Transformation (XSLT, refactoring)
- Pretty printing, formatting
- Semantic analysis (program comprehension)
- Optimization
- (Actual) translation

Compile time vs. run-time

Compile time: during the run-time of the compiler/translator

Static: all information/aspects known at compile time, e.g.:

- Type checks
- Evaluation of constant expressions
- Relative addresses

Run-time: during the run-time of the compiled program

Dynamic: all information that is not statically known, e.g.:

- Allocation of dynamic arrays
- Bounds checks of arrays
- Dynamic binding of methods
- Memory management of recursive procedures

For dynamic aspects that cannot be handled at compile time, the compiler generates code that handles these aspects at run-time.
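The distinction can be illustrated with the evaluation of constant expressions. In the sketch below, fold_constants plays the role of the compiler and evaluate the role of the generated code; both functions, the tuple encoding of expressions and the restriction to "+" and "*" are assumptions made for this example.

```python
import operator

OPERATIONS = {"+": operator.add, "*": operator.mul}


def fold_constants(expr):
    """Compile time: evaluate every subexpression whose operands are constants."""
    if not isinstance(expr, tuple):
        return expr  # an int constant or a variable name (str)
    op, left, right = expr
    left, right = fold_constants(left), fold_constants(right)
    if isinstance(left, int) and isinstance(right, int):
        return OPERATIONS[op](left, right)  # statically known value
    return (op, left, right)                # must wait for run-time values


def evaluate(expr, environment):
    """Run-time: evaluate the residual expression with the variable values."""
    if isinstance(expr, int):
        return expr
    if isinstance(expr, str):
        return environment[expr]
    op, left, right = expr
    return OPERATIONS[op](evaluate(left, environment), evaluate(right, environment))


expression = ("+", "n", ("*", 60, 60))
residual = fold_constants(expression)
print(residual)                       # prints ('+', 'n', 3600)
print(evaluate(residual, {"n": 5}))   # prints 3605
```

The product 60 * 60 is static and disappears from the residual expression, while the addition depends on n and is left to run-time.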


What is a good compiler?


Requirements for translators

Error handling (static/dynamic)

Efficient target code

Choice: Fast translation with slow code vs. slow translation with fast code

Semantically correct translation

Intuitive definition: The compiled program behaves according to the language definition of the source language.

Formal definition:

sem_SL : SL_Program × SL_Data → SL_Data
sem_TL : TL_Program × TL_Data → TL_Data
compile : SL_Program → TL_Program
code : SL_Data → TL_Data
decode : TL_Data → SL_Data

Semantic correctness:

sem_SL(P, D) = decode(sem_TL(compile(P), code(D)))
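The correctness equation can be tried out on a toy translator. Everything in the following sketch is hypothetical: the source language (expression trees over one input "x"), the target language (stack code), and the choice of lists as target data. Checking sample inputs like this only tests the equation; it does not prove it.

```python
def sem_sl(program, data):
    """Source semantics: evaluate an expression tree; "x" denotes the input."""
    if program == "x":
        return data
    if isinstance(program, int):
        return program
    op, left, right = program
    l, r = sem_sl(left, data), sem_sl(right, data)
    return l + r if op == "+" else l * r


def compile_program(program):
    """compile: SL_Program -> TL_Program (a list of stack instructions)."""
    if program == "x":
        return [("ARG",)]
    if isinstance(program, int):
        return [("PUSH", program)]
    op, left, right = program
    operation = ("ADD",) if op == "+" else ("MUL",)
    return compile_program(left) + compile_program(right) + [operation]


def sem_tl(instructions, data):
    """Target semantics: run stack code; cell 0 of the stack holds the input."""
    stack = list(data)
    for instruction in instructions:
        if instruction[0] == "ARG":
            stack.append(stack[0])
        elif instruction[0] == "PUSH":
            stack.append(instruction[1])
        else:
            right, left = stack.pop(), stack.pop()
            stack.append(left + right if instruction[0] == "ADD" else left * right)
    return [stack[-1]]


def code(sl_data):
    return [sl_data]      # code: SL_Data -> TL_Data


def decode(tl_data):
    return tl_data[0]     # decode: TL_Data -> SL_Data


def agrees(program, data):
    """Check sem_SL(P, D) = decode(sem_TL(compile(P), code(D))) for one input."""
    return sem_sl(program, data) == decode(sem_tl(compile_program(program), code(data)))


square_plus_one = ("+", ("*", "x", "x"), 1)
print(all(agrees(square_plus_one, d) for d in range(-3, 4)))  # prints True
```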


Language Processing Compiler Architecture

Compiler Architecture

Analysis:

Source Code as String
  → Scanner → Token Stream
  → Parser → Syntax Tree
  → Name and Type Analysis → Decorated Syntax Tree (close to SL)

Synthesis:

Decorated Syntax Tree
  → Translator → Intermediate Language
  → Code Generator → Target Code as String

Optimizations: Attribution & Optimization on the decorated syntax tree, Attribution & Optimization on the intermediate language, Peep Hole Optimization on the target code
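As an illustration of the first phase of this architecture, the following sketch turns a source string into a token stream. The token classes NUMBER, IDENT and OP and the function scan are assumptions chosen for this example; they do not describe any particular compiler's scanner.

```python
import re

# Hypothetical token classes, each described by a regular expression.
TOKEN_SPEC = [
    ("NUMBER", r"\d+"),
    ("IDENT", r"[A-Za-z_]\w*"),
    ("OP", r"[+\-*/=()]"),
    ("SKIP", r"\s+"),
]
TOKEN_RE = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def scan(source):
    """Scanner: convert the source string into a list of (class, text) tokens."""
    tokens = []
    position = 0
    while position < len(source):
        match = TOKEN_RE.match(source, position)
        if match is None:
            raise SyntaxError(
                f"unexpected character {source[position]!r} at position {position}"
            )
        if match.lastgroup != "SKIP":  # white space separates tokens, nothing more
            tokens.append((match.lastgroup, match.group()))
        position = match.end()
    return tokens


print(scan("x = 3 * (y + 42)"))
```

The resulting list is what a parser would consume next; later phases no longer see white space or individual characters.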

Properties of compiler architectures

Phases are conceptual units of translation

Phases can be interleaved

Design of phases depends on source language, target language and design decisions

Phase vs. pass (a phase can comprise more than one pass)

Separate compilation of program parts (interface information must be accessible)

Combination with other architecture decisions:

Common intermediate language

Common intermediate language

Source Language 1, Source Language 2, ..., Source Language n
  → Intermediate Language →
Target Language 1, Target Language 2, ..., Target Language m
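One commonly cited benefit of a common intermediate language is the number of translators that have to be written. The helper below is a hypothetical worked calculation, assuming one front end per source language and one back end per target language.

```python
def translators_needed(n_sources, m_targets, common_intermediate):
    """Count the translators required to connect every source to every target."""
    if common_intermediate:
        # n front ends into the intermediate language plus m back ends out of it
        return n_sources + m_targets
    # one direct translator for every (source, target) pair
    return n_sources * m_targets


print(translators_needed(4, 3, common_intermediate=False))  # prints 12
print(translators_needed(4, 3, common_intermediate=True))   # prints 7
```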

Dimensions of compiler construction

Programming languages

- Sequential procedural, imperative, OO-languages
- Functional, logical languages
- Parallel languages/language constructs

Target languages/machines

- Code for abstract machines
- Assembler
- Machine languages (CISC, RISC, ...)
- Multi-processor/multi-core architectures
- Memory hierarchy

Translation tasks: analysis, optimization, synthesis

Construction techniques and tools: bootstrapping, generators

Portability, specification, correctness


Compiler Construction

Compiler construction techniques

1. Stepwise construction

- Construction with compiler for different language
- Construction with compiler for different machine
- Bootstrapping

2. Compiler-compilers: Tools for compiler generation

- Scanner generators (regular expressions)
- Parser generators (context-free grammars)
- Attribute evaluation generators (attribute grammars)
- Code generator generators (machine specification)
- Interpreter generators (semantics of language)
- Other phase-specific tools

3. Special programming techniques

- General technique: syntax-driven
- Special technique: recursive descent
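Recursive descent, the special technique named last, uses one procedure per grammar nonterminal. The sketch below assumes a small expression grammar chosen for illustration (expr → term (("+" | "-") term)*, term → factor (("*" | "/") factor)*, factor → NUMBER | "(" expr ")"); the class Parser and the simplistic tokenize helper, which silently skips characters it does not know, are hypothetical.

```python
import re


def tokenize(text):
    """Minimal tokenizer: numbers, operators and parentheses only."""
    return re.findall(r"\d+|[-+*/()]", text)


class Parser:
    def __init__(self, tokens):
        self.tokens = tokens
        self.position = 0

    def peek(self):
        return self.tokens[self.position] if self.position < len(self.tokens) else None

    def advance(self):
        token = self.peek()
        self.position += 1
        return token

    def expr(self):
        # expr -> term (("+" | "-") term)*
        node = self.term()
        while self.peek() in ("+", "-"):
            node = (self.advance(), node, self.term())
        return node

    def term(self):
        # term -> factor (("*" | "/") factor)*
        node = self.factor()
        while self.peek() in ("*", "/"):
            node = (self.advance(), node, self.factor())
        return node

    def factor(self):
        # factor -> NUMBER | "(" expr ")"
        token = self.advance()
        if token == "(":
            node = self.expr()
            if self.advance() != ")":
                raise SyntaxError("expected ')'")
            return node
        if token is not None and token.isdigit():
            return int(token)
        raise SyntaxError(f"unexpected token {token!r}")


def parse(text):
    parser = Parser(tokenize(text))
    tree = parser.expr()
    if parser.peek() is not None:
        raise SyntaxError(f"unexpected token {parser.peek()!r}")
    return tree


print(parse("1 + 2 * (3 - 4)"))  # prints ('+', 1, ('*', 2, ('-', 3, 4)))
```

Each method mirrors one grammar rule, so the call structure at run-time follows the shape of the syntax tree that is being built.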

Stepwise construction

Programming typically depends on an existing compiler for the implementation language. For compiler construction, this does not hold in general.

Source, target, and implementation languages of compilers can be denoted in T-diagrams.

    +-------------------+
    |   SL   -->   TL   |
    +----+         +----+
         |   CL    |
         +---------+

The T-diagram denotes a compiler from source language SL to target language TL (SL → TL compiler) written in language CL.

Construction with compiler for different language

Given: C → ML (machine language) compiler in ML

Construct: SL → ML compiler in ML

Solution: Develop an SL → ML compiler in C, then translate that compiler from C to ML by using the existing C → ML compiler.

- To be developed: SL → ML compiler written in C
- Existing: C → ML compiler written in ML
- By translation: SL → ML compiler written in ML
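The construction can be modelled by treating a T-diagram as a record of three languages. The class Compiler and the function translate below are hypothetical names for this sketch; the model assumes that the translating compiler is itself executable and checks only that the languages fit together.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Compiler:
    """A T-diagram: source -> target, written in an implementation language."""
    source: str
    target: str
    implementation: str


def translate(program, translator):
    """Translate the compiler `program` by running it through `translator`."""
    if program.implementation != translator.source:
        raise ValueError(
            f"translator accepts {translator.source}, "
            f"but the program is written in {program.implementation}"
        )
    # Source and target of the program stay the same; only the language
    # it is written in changes to the translator's target language.
    return Compiler(program.source, program.target, translator.target)


existing = Compiler(source="C", target="ML", implementation="ML")
developed = Compiler(source="SL", target="ML", implementation="C")
print(translate(developed, existing))
# prints Compiler(source='SL', target='ML', implementation='ML')
```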

Construction with compiler for different machine

Construct: C → ML1 compiler in ML1

Given:

1. C → ML1 compiler in C
2. C → ML2 compiler in ML2

Method: construct a cross compiler

First step:

- Given: C → ML1 compiler written in C
- Given: C → ML2 compiler written in ML2
- By translation: C → ML1 compiler written in ML2 (cross compiler)

Second step:

- Given: C → ML1 compiler written in C
- Cross compiler: C → ML1 compiler written in ML2
- By translation: C → ML1 compiler written in ML1 (resulting compiler)

Bootstrapping

Construct: SL → ML compiler in ML

Assumption: no compiler exists yet

Method:

1. Construct partial languages SL_i of SL such that SL_0 ⊂ SL_1 ⊂ SL_2 ⊂ ... ⊂ SL
2. Implement SL_0 compiler for ML in ML
3. Implement SL_{i+1} compiler for ML in SL_i
4. Create SL_{i+1} compiler for ML in ML

For the chain SL_0 ⊂ SL_1 ⊂ SL_2 ⊂ SL:

- Manually: SL_0 → ML compiler written in ML
- By extension: SL_1 → ML compiler written in SL_0
- By translation: SL_1 → ML compiler written in ML
- By extension: SL_2 → ML compiler written in SL_1
- By translation: SL_2 → ML compiler written in ML
- By extension: SL → ML compiler written in SL_2
- By translation: SL → ML compiler written in ML
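The bootstrapping method can be traced as a loop over the language levels. In this sketch a compiler is a plain tuple (source, target, implementation); the function bootstrap and the level names are assumptions made for illustration, and the model tracks only which languages are involved, not the compilers' actual code.

```python
def bootstrap(levels, machine="ML"):
    """Return every compiler created while bootstrapping through `levels`."""
    # Step 2: the compiler for the smallest subset is written in machine language.
    executable = (levels[0], machine, machine)
    history = [executable]
    for previous, current in zip(levels, levels[1:]):
        # Step 3: write the compiler for the next level in the previous level.
        written = (current, machine, previous)
        if written[2] != executable[0]:
            raise ValueError("no executable compiler for the implementation language")
        # Step 4: translate it with the executable compiler of the previous level,
        # which replaces the implementation language by that compiler's target.
        executable = (written[0], written[1], executable[1])
        history += [written, executable]
    return history


for source, target, implementation in bootstrap(["SL0", "SL1", "SL2", "SL"]):
    print(f"{source} -> {target} compiler written in {implementation}")
```

The last compiler in the returned list is the desired SL → ML compiler written in ML.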

Recommended reading

Wilhelm, Seidl, [Hack]: Übersetzerbau

- Virtuelle Maschinen (Vol. 1): Einleitung (pp. 1–6)
- Das Frontend (Vol. 2): Die Struktur von Übersetzern (pp. 1–11)

Wilhelm, Maurer:

- Chap. 1, Introduction (pp. 1–5)
- Chap. 6, Structure of Compilers (pp. 225–238)

Appel:

- Chap. 1, Introduction (pp. 3–14)
