3.1 Translation of Imperative Language Constructs


Summer Term 2011

Prof. Dr. Arnd Poetzsch-Heffter

Software Technology Group TU Kaiserslautern

© Prof. Dr. Arnd Poetzsch-Heffter

Content of Lecture

1. Introduction
2. Syntax and Type Analysis
   2.1 Lexical Analysis
   2.2 Context-Free Syntax Analysis
   2.3 Context-Dependent Analysis
3. Translation to Target Language
   3.1 Translation of Imperative Language Constructs
   3.2 Translation of Object-Oriented Language Constructs
4. Selected Aspects of Compilers
   4.1 Intermediate Languages
   4.2 Optimization
   4.3 Data Flow Analysis
   4.4 Register Allocation
   4.5 Code Generation
5. Garbage Collection

3. Translation to Target Language


Chapter Outline

3. Translation to Target Language
   3.1 Translation of Imperative Language Constructs
       3.1.1 Language Constructs of Procedural Languages
       3.1.2 Assembly and Machine Languages
       3.1.3 Translation of Variables and Data Types
       3.1.4 Translation of Expressions
       3.1.5 Translation of Statements
       3.1.6 Translation of Procedures and Local Objects
   3.2 Translation of Object-Oriented Language Constructs
       3.2.1 Concepts of Object-Oriented Programming Languages
       3.2.2 Translation with Procedural Languages
       3.2.3 Translation of Classes
       3.2.4 Problems of Multiple Inheritance
       3.2.5 Further Aspects of Object-Oriented Languages

Focus:

• Differences between source languages and target languages/target machines

• Most important translation techniques for different programming paradigms (procedural/object-oriented)

Translation to Target Language (2)

Learning Objectives:

• Overview of imperative and procedural language constructs

• Typical language constructs of assembler languages

• Translation techniques for procedural language constructs

• Translation of object-oriented language constructs

3.1 Translation of Imperative Language Constructs

Section Outline

3.1.1 Language Constructs of Procedural Languages
3.1.2 Assembly and Machine Languages
3.1.3 Translation of Variables and Data Types
3.1.4 Translation of Expressions
3.1.5 Translation of Statements
3.1.6 Translation of Procedures and Local Objects

3.1.1 Language Constructs of Procedural Languages

From a conceptual and semantic viewpoint, procedural languages have the following constructs:

• Domains with operations (often typed)
  - pre-defined: int, boolean, ...
  - user-defined: records, classes, ...
  - implicitly defined: array types, address types, function types
• Variables
  - simple and compound types
  - global, local, statically/dynamically allocated
  - define memory state
• Expressions
  - computation of values with implicit intermediate results
  - possibly in combination with execution control and state

Language Constructs of Procedural Languages (2)

• Statements
  - simple and combined statements
  - define execution control and state modification
• Procedures
  - abstraction of parametrized statements
  - may be recursive
  - may be nested

Modules usually do not have a semantic meaning and are only relevant for translation in name analysis and for binding and loading.

Nested Procedures

Nested (local) procedures are supported, for example, by Pascal and Ada.

Example (from [Wilhelm, Maurer; Fig. 2.9]):

    proc P(a)
      var b
      var c
      proc Q
        var a
        proc R
          var b
        begin
          ... b ...
          ... a ...
          ... c ...
        end
      begin
        ... a ...
        ... b ...
        ... call Q ...
      end
      proc S
        var a
      begin
        ... a ...
        ...

3.1.2 Assembly and Machine Languages

Assembly languages have the following language constructs:

• Finite bit sequences of various lengths: byte, halfword, word, ...
• Global memory
  - registers, flags (addressed by name)
  - indexed, mostly word-addressed main memory
• Instructions
  - load, store
  - arithmetic and boolean operations
  - execution control (jumps, procedures)
  - simple statements only, no combined statements
  - possibly complex addressing of operands
• Initialization instructions

The MIPS Assembler

MIPS: Microprocessor without Interlocked Pipeline Stages

• RISC architecture, originally 32-bit (64-bit since 1991)
• developed by John Hennessy (Stanford) starting in 1981
• MARS Simulator
  http://courses.missouristate.edu/KenVollmar/MARS/

MIPS Architecture

• Arithmetic-Logic Unit (ALU)
• Floating-Point Unit (FPU)
• 32 registers (incl. stack pointer, frame pointer, global pointer, return address)
• Main memory, 2³⁰ memory words (4 bytes each)
• 5-stage pipeline

MIPS Architecture

[Figure: five-stage MIPS pipeline with the stages Instruction Fetch (IF), Instruction Decode / Register Fetch (ID), Execute / Address Calculation (EX), Memory Access (MEM), and Write Back (WB), separated by the pipeline registers IF/ID, ID/EX, EX/MEM, and MEM/WB. Image: Wikipedia]

Memory Structure

Addresses                  Segment
0x80000000 - 0xFFFFFFFF    Reserved for OS
up to 0x7FFFFFFF           Stack Segment ($sp points to the top of the stack)
                           free
                           Heap Segment
from 0x10000000            Data Segment
from 0x00400000            Text Segment
from 0x00000000            Reserved

Data Types and Literals in MIPS Assembly Language

Data Types

• Instructions are all 32 bits
• byte (8 bits), halfword (2 bytes), word (4 bytes)
• integer (1 word storage)
• single precision floats (1 word storage)
• double precision floats (2 word storage)

Literals

• Integers (e.g. 4, 2, -236, 0x44)
• Floats (e.g. 3.41, -0.323e5)
• Characters in single quotes, e.g. 'b'
• Strings in double quotes, e.g. "Hello World"

MIPS Registers

No      Name         P*    Description
0       $zero        -     the constant 0
1       $at          -     assembler temporary (reserved by the assembler)
2-3     $v0, $v1     no    values for function results and expression evaluation
4-7     $a0 - $a3    no    arguments for subroutine calls
8-15    $t0 - $t7    no    temporaries
16-23   $s0 - $s7    yes   saved temporaries
24-25   $t8 - $t9    no    additional temporaries
26-27   $k0, $k1     no    reserved for OS kernel
28      $gp          yes   global pointer
29      $sp          yes   stack pointer
30      $fp          yes   frame pointer
31      $ra          yes   return address

*P: callee must preserve value

MIPS Instruction Format

• Instructions are always 32 bits long
• Opcode in the first 6 bits
• 3 types of instructions: R-, I-, and J-instructions

R-Instructions

    opcode (6)  rs (5)  rt (5)  rd (5)  shamt (5)  funct (6)

I-Instructions

    opcode (6)  rs (5)  rt (5)  immediate (16)

J-Instructions

    opcode (6)  address (26)
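The three formats can be illustrated by packing the fields into a 32-bit word. The following C sketch is only an illustration: the function names are hypothetical, and the concrete opcode and funct values used in the usage note (add: opcode 0, funct 0x20; addi: opcode 8; j: opcode 2) come from the MIPS32 instruction set, not from these slides.

```c
#include <assert.h>
#include <stdint.h>

/* R-instruction: opcode(6) rs(5) rt(5) rd(5) shamt(5) funct(6) */
uint32_t encode_r(uint32_t opcode, uint32_t rs, uint32_t rt,
                  uint32_t rd, uint32_t shamt, uint32_t funct) {
    assert(opcode < 64 && rs < 32 && rt < 32 && rd < 32 && shamt < 32 && funct < 64);
    return (opcode << 26) | (rs << 21) | (rt << 16) | (rd << 11) | (shamt << 6) | funct;
}

/* I-instruction: opcode(6) rs(5) rt(5) immediate(16) */
uint32_t encode_i(uint32_t opcode, uint32_t rs, uint32_t rt, int16_t imm) {
    assert(opcode < 64 && rs < 32 && rt < 32);
    /* The immediate is stored as its 16-bit two's-complement pattern. */
    return (opcode << 26) | (rs << 21) | (rt << 16) | (uint32_t)(uint16_t)imm;
}

/* J-instruction: opcode(6) address(26) */
uint32_t encode_j(uint32_t opcode, uint32_t address) {
    assert(opcode < 64 && address < (1u << 26));
    return (opcode << 26) | address;
}
```

With the register numbers from the register table ($t0 = 8, $t1 = 9, $t2 = 10, $sp = 29), `encode_r(0, 9, 10, 8, 0, 0x20)` would be the word for `add $t0, $t1, $t2`, and `encode_i(8, 29, 29, -4)` the word for `addi $sp, $sp, -4`.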

MIPS Instructions

In the following, let r1, r2, r3 be registers (e.g. $s1, $t3) and let c be a constant value (e.g. 4, 100, -4).

Arithmetic

add              add r1, r2, r3     r1 = r2 + r3
subtract         sub r1, r2, r3     r1 = r2 - r3
add immediate    addi r1, r2, c     r1 = r2 + c
multiply         mul r1, r2, r3     r1 = r2 * r3 (lower 32 bits of result)
move             move r1, r2        r1 = r2 (pseudo-instruction, e.g. addi r1, r2, 0)

MIPS Instructions (2)

Data Transfer

load word         lw r1, c(r2)     r1 = Memory[r2 + c]
store word        sw r1, c(r2)     Memory[r2 + c] = r1
load immediate    li r1, c         r1 = c
load half         lh r1, c(r2)     r1 = Memory[r2 + c]
store half        sh r1, c(r2)     Memory[r2 + c] = r1
load byte         lb r1, c(r2)     r1 = Memory[r2 + c]
store byte        sb r1, c(r2)     Memory[r2 + c] = r1

MIPS Instructions (3)

Logical

and                    and r1, r2, r3     r1 = r2 & r3
or                     or r1, r2, r3      r1 = r2 | r3
nor                    nor r1, r2, r3     r1 = ¬(r2 | r3)
and immediate          andi r1, r2, c     r1 = r2 & c
or immediate           ori r1, r2, c      r1 = r2 | c
shift left logical     sll r1, r2, c      r1 = r2 << c
shift right logical    srl r1, r2, c      r1 = r2 >> c

MIPS Instructions (4)

Conditional Branches

branch on equal               beq r1, r2, label     if (r1 == r2) goto label
branch on not equal           bne r1, r2, label     if (r1 != r2) goto label
set on less than              slt r1, r2, r3        if (r2 < r3) r1 := 1 else r1 := 0
set on less than immediate    slti r1, r2, c        if (r2 < c) r1 := 1 else r1 := 0

Unconditional Branches

jump             j label       goto label
jump register    jr r1         goto r1
jump and link    jal label     $ra = PC + 4; goto label

Subroutine Calls

Subroutine call (jump and link)

    jal label    # jump and link

• store the return address (the address of the next instruction) in $ra
• jump to label
• Note: jal overwrites $ra, so save the current $ra on the stack before the call

Subroutine return (jump register)

    jr $ra       # jump register

Working with the Stack

Push data on the stack

    sw   $ra, ($sp)       # save return address on stack
    addi $sp, $sp, -4     # decrement stack pointer
    sw   $fp, ($sp)       # save frame pointer on stack
    addi $sp, $sp, -4     # decrement stack pointer

Pop data from the stack

    addi $sp, $sp, 4      # increment stack pointer
    lw   $fp, ($sp)       # pop saved frame pointer
    addi $sp, $sp, 4      # increment stack pointer
    lw   $ra, ($sp)       # pop saved return address

Addressing in MIPS

• Immediate: Operand is a constant, e.g. 25
• Register: Operand is a register, e.g. $s2
• Base or Displacement Addressing: Operand is a memory location whose address is the sum of a register and a constant, e.g. 8($sp)
• PC relative: Address is the sum of the PC and a constant
• Pseudodirect Addressing: Jump address is formed from the 26 address bits of the instruction and the upper bits of the PC
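The last two modes can be made concrete with a small C sketch. It rests on details of the classic MIPS32 encoding that the slides do not spell out and that are therefore assumptions here: the 16-bit branch constant counts words relative to the instruction following the branch, and the 26-bit jump field is shifted left by two bits and combined with the upper four bits of that same address. The function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* PC-relative addressing: target = (PC + 4) + 4 * signed 16-bit constant. */
uint32_t branch_target(uint32_t pc, int16_t imm) {
    /* Unsigned arithmetic wraps around, which models negative offsets. */
    return pc + 4u + (uint32_t)((int32_t)imm * 4);
}

/* Pseudodirect addressing: upper 4 bits of PC + 4, then the 26 address bits, then 00. */
uint32_t jump_target(uint32_t pc, uint32_t addr26) {
    assert(addr26 < (1u << 26));
    return ((pc + 4u) & 0xF0000000u) | (addr26 << 2);
}
```

For example, a branch at address 0x00400010 with constant -2 would continue at 0x0040000C, and a jump in the text segment with address field 0x100000 would continue at 0x00400000.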


Syscalls for MARS/SPIM Simulators

How to use System Calls:

• load service number into register $v0
• load argument values, if any, into $a0, $a1, $a2
• issue call instruction syscall
• retrieve return values, if any

Example:

    li   $v0, 1        # print integer
    move $a0, $t0      # load value into $a0
    syscall

List of System Services

Service                         Code in $v0    Arguments
print integer                   1              $a0 = integer to print
print string                    4              $a0 = address of null-terminated string to print
exit (terminate execution)      10
print character                 11             $a0 = character to print
exit2 (terminate with value)    17             $a0 = termination result

MIPS Assembly Program Structure

.data # data declarations follow this line

# ...

.text # instructions follow this line

# ...

main: # indicates the first instruction to execute

# ...

Data Declarations

Format

    <name>: <type> (<initial values> | <allocated space>)

Example

    .data                    # data declarations follow
    var:    .word 3          # integer variable with initial value 3
    array1: .byte 'a','b'    # 2-element character array initialized with 'a' and 'b'
    array2: .space 40        # allocate 40 consecutive bytes, uninitialized

Example: Translation to MIPS

The example illustrates the MIPS assembler and typical translation tasks.

Code quality is not considered.

Source Code in C

    char a[3], b[3];
    int i;
    char res;
    int main() {
      i = 2;
      res = 1;
      while( -1 < i ) {
        if( res ) {
          res = (a[i]==b[i]);
          i = i-1;
        } else {
          i = i-1;
        }
      }
      return res;
    }

Source Code in C with Labels

    char a[3], b[3];
    int i;
    char res;
    int main() {
    main:     i = 2;
              res = 1;
    loop:     while( -1 < i ) {
                if( res ) {
                  res = (a[i]==b[i]);
    after:        i = i-1;
                } else {
    elseif:       i = i-1;
                }  // afterif:
              }
    exit:     return res;
    }

Source Code in C with Gotos

    char a[3], b[3];
    int i;
    char res;
    int main() {
              i = 2;
              res = 1;
    loop:     if (! (-1 < i ))
                goto exit;
              if( !res )
                goto elseif;
              if (a[i]==b[i])
                goto equal;
              res = 0;
              goto after;
    equal:    res = 1;
    after:    i = i-1;
              goto afterif;
    elseif:   i = i-1;
    afterif:  goto loop;
    exit:     return res;
    }

MIPS Program

    # sp + 0 : i
    # sp + 4 : res
    # sp + 5 : base address of a[3]
    # sp + 8 : base address of b[3]
    main:
        addi $sp, $sp, -12       # make space for the variables
        li   $t1, 2
        sw   $t1, 0($sp)         # i = 2
        li   $t1, 1
        sb   $t1, 4($sp)         # res = 1 (stored at sp + 4)
    loop:
        lw   $t2, 0($sp)         # load i into $t2
        li   $t3, -1             # load -1 into $t3
        slt  $t0, $t3, $t2       # -1 < i ?
        beq  $t0, $zero, exit    # if not -1 < i goto exit
        lb   $t1, 4($sp)         # load res from stack into $t1
        beq  $t1, $zero, elseif  # if res == 0 goto elseif
        addi $t4, $sp, 5         # base address of array a
        add  $t4, $t4, $t2       # add offset / array index
        lb   $t0, 0($t4)         # load a[i]
        addi $t4, $sp, 8         # base address of array b
        add  $t4, $t4, $t2       # add offset / array index
        lb   $t1, 0($t4)         # load b[i]
        beq  $t0, $t1, equal     # if a[i] == b[i] goto equal
        sb   $zero, 4($sp)       # set res to 0
        j    after
    equal:
        addi $t3, $zero, 1       # $t3 = 1
        sb   $t3, 4($sp)         # res = $t3
    after:
        subi $t2, $t2, 1         # i = i-1
        sw   $t2, 0($sp)         # store i at sp + 0
        j    afterif             # goto end of if statement
    elseif:
        subi $t2, $t2, 1         # i = i-1
        sw   $t2, 0($sp)         # store i at sp + 0
    afterif:
        j    loop                # return to loop
    exit:
        lb   $a0, 4($sp)         # terminate with exit code res
        addi $sp, $sp, 12        # reset stack pointer
        li   $v0, 17             # service exit2
        syscall

Translation to MIPS

Remarks:

The example illustrates typical translation tasks:

• Translation of data types, memory management, addressing

• Translation of expressions, management of intermediate results, mapping of operations of the source language to operations of the target language

• Translation of statements by implementation with jumps

• A simple, systematic translation approach yields code of poor quality

Translation Process

[Figure: Concrete Syntax SL → Lexical and Context-Free Analysis → AST SL → Context-Dependent Analysis → Translator → AST MIPS → Code Generator → Concrete Syntax MIPS]

MIPS Abstract Syntax

Prog * Instruction

Instruction = ADD (Register reg0, Register reg1, Register reg2)

| ADDI (Register reg0, Register reg1, Const const0)

| BEQ (Register reg0, Register reg1, Label label0)

| SLT (Register reg0, Register reg1, Register reg2)

| SLTI (Register reg0, Register reg1, Const const0)

| J (Label label0)

| JR (Register reg0)

| JAL (Label label0)
| ...

Const ( Integer value )
Label ( Integer labelId )

Register = ... | KReg | GP () | SP () | FP () | RA ()
VReg = V0 () | V1 ()
AReg = A0 () | A1 () | A2 () | A3 ()
...

3.1.3 Translation of Variables and Data Types

Translation of variables and data types

The compiler maps concepts of the programming language to concepts of the assembly language:

• named variables → addresses of memory regions
• complex types → index and offset computation

Translation of variables and data types (2)

The translation of variables and data types comprises:

• handling of primitive data types

• conversion of data types (e.g. int → float)

• memory organisation

• translation of arrays

• translation of records and classes

• implementation of dynamic objects

Primitive data types

Usually, the primitive data types of source languages are supported by the target machine:

• int, long → 4-byte word with integer arithmetic
• float, double → accordingly

Some data types may have to be encoded:

• boolean → 1 byte or 4-byte word

Problems arise if the target machine does not comply with the requirements of the source language, e.g.

• floating point arithmetic is not handled according to the IEEE standard
• overflows are not dealt with correctly (cf. Java FP-strict expressions)
• operations for conversion are missing on the target machine

Memory layout

The conceptual memory layout of most imperative programming languages and target machines is similar. (Details depend on OS and machine.)

[Figure: memory layout with the labelled regions OS kernel; intermediate results, ...; heap (dynamic variables, objects, ...); global values (global and static variables, constants, ...); program (towards low addresses)]

Translation of arrays

Efficient translation of arrays is important for many tasks.

One-dimensional static arrays

• Allocate memory in the segment for global data (starting at $gp)
• Compute addresses from the base address of the array, the index of the array element, and the size of the element type

Consider the array declaration T tarr[57]:

• $gp contains the base address of the global memory region
• Let R_rel contain the relative address of the array tarr in the global memory region
• Let R_i contain the index i of the array component

If k = sizeof(T), then the address of tarr[i] is $gp + R_rel + k ∗ R_i.

Translation of Arrays (2)

Computation in MIPS

    li  $ti, k
    mul $ti, R_i, $ti
    add $ti, R_rel, $ti
    add $ti, $gp, $ti
    lw  $ti, ($ti)
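The same address arithmetic can be written down in C. This is a minimal sketch under stated assumptions: the global data segment is simulated by a byte array whose start plays the role of $gp, the element type is a 4-byte word, and all names (`global_segment`, `array_element_offset`, `store_word`, `load_word`) are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Simulated global data segment; its first byte stands for the address in $gp. */
static unsigned char global_segment[256];

/* Offset of element i relative to $gp: R_rel + k * R_i. */
size_t array_element_offset(size_t rel, size_t k, size_t i) {
    return rel + k * i;
}

/* Counterpart of a store to tarr[i] for 4-byte elements. */
void store_word(size_t rel, size_t i, int32_t value) {
    size_t off = array_element_offset(rel, sizeof value, i);
    assert(off + sizeof value <= sizeof global_segment);
    memcpy(global_segment + off, &value, sizeof value);
}

/* Counterpart of the final lw in the MIPS sequence. */
int32_t load_word(size_t rel, size_t i) {
    int32_t value;
    size_t off = array_element_offset(rel, sizeof value, i);
    assert(off + sizeof value <= sizeof global_segment);
    memcpy(&value, global_segment + off, sizeof value);
    return value;
}
```

For an array at relative address 16, element 3 lies at offset 16 + 4 ∗ 3 = 28 from the start of the segment.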

More Translation of Arrays

Multi-dimensional static arrays

Consider as an example the Pascal declaration

    var a: array[-5..5, 1..9] of integer;

which corresponds to 99 integer variables:

    a[-5,1] ... a[-5,9]
    ...
    a[5,1]  ... a[5,9]

The matrix is stored row by row in memory (row-major order). This is usually more efficient than storing it column by column, because the second index is often the one incremented in inner loops.

Further Translation of Arrays (2)

Translation of access to a[E1,E2]:

Assume the results of evaluating E1 and E2 are stored in $t1 and $t2.

As a is a static array, we know the dimensions at compile time.

a[$t1,$t2] is the r-th component of a linear array with

    r = ($t1 − (−5)) ∗ ((9 − 1) + 1) + ($t2 − 1)
      = 9 ∗ $t1 + 45 + $t2 − 1
      = 9 ∗ $t1 + $t2 + 44

Result: Store the address of component 44 (counting from 0) as the base address of the array in the symbol table. Then it suffices to add 9 ∗ $t1 + $t2, scaled by the element size, to this base address.

Further Translation of Arrays (3)

Code example for access to a[E1,E2]:

    [Code for E1 -> $t1]
    [Code for E2 -> $t2]
    LI   ($t3, 9)
    MULT ($t1, $t1, $t3)
    ADD  ($t1, $t1, $t2)
    LI   ($t2, 4)
    MULT ($t1, $t1, $t2)
    ADDI ($t1, $t1, relA)
    ADD  ($t1, $t1, $gp)
    LW   ($t1, 0, $t1)

where relA = offset(a) + 44 ∗ 4. The constant is added after the index has been scaled by the element size of 4 bytes, so it has to be given in bytes as well.
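The folding of the constant part can be checked with a small C sketch for the Pascal array above. The function names are hypothetical, and the element size of 4 bytes is the one used in the code example.

```c
#include <assert.h>

enum { LO1 = -5, HI1 = 5, LO2 = 1, HI2 = 9, ROW_LEN = HI2 - LO2 + 1 };

/* Direct formula: r = (e1 - (-5)) * 9 + (e2 - 1). */
long linear_index(long e1, long e2) {
    assert(LO1 <= e1 && e1 <= HI1 && LO2 <= e2 && e2 <= HI2);
    return (e1 - LO1) * ROW_LEN + (e2 - LO2);
}

/* Folded formula: only 9 * e1 + e2 depends on runtime values. */
long linear_index_folded(long e1, long e2) {
    const long folded = -LO1 * ROW_LEN - LO2;   /* 45 - 1 = 44, known at compile time */
    return ROW_LEN * e1 + e2 + folded;
}

/* Byte offset relative to offset(a) for 4-byte integers. */
long byte_offset(long e1, long e2) {
    return 4 * linear_index_folded(e1, e2);
}
```

Both index functions agree on every valid index pair; a[-5,1] maps to component 0 and a[5,9] to component 98.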

General Translation of Arrays

General array declaration of dimension k

    var a: array [u1..o1, ..., uk..ok] of T;

Storing the array row by row yields the following relative address for accessing a[R1, ..., Rk]:

    r = (R1 − u1) ∗ size(array[u2..o2, ..., uk..ok] of T)
      + (R2 − u2) ∗ size(array[u3..o3, ..., uk..ok] of T)
      + ...
      + (Rk − uk) ∗ size(T)

General Translation of Arrays (2)

For i = 1, ..., k−1, let

    size(i) := size(array[u_{i+1}..o_{i+1}, ..., u_k..o_k] of T)
    size(k) := size(T)

This implies

    size(i−1) = size(i) ∗ (o_i − u_i + 1)

Simplification yields:

$$ r = \sum_{i=1}^{k} R_i \cdot \mathit{size}(i) \;-\; \sum_{i=1}^{k} u_i \cdot \mathit{size}(i) $$

At runtime, only the first sum has to be computed, so code has to be generated only for it. The second sum is known at compile time.
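The recurrence for size(i) and the split into a runtime part and a compile-time part can be sketched in C as follows. The sketch uses 0-based C arrays, so `size[i-1]` here corresponds to size(i) above. All names, including the limit `MAX_DIMS`, are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_DIMS 8

/* size[k-1] = size(T); size[i-1] = size[i] * (hi[i] - lo[i] + 1). */
void dimension_sizes(size_t k, const long lo[], const long hi[],
                     long elem_size, long size[]) {
    assert(k >= 1 && k <= MAX_DIMS);
    size[k - 1] = elem_size;
    for (size_t i = k - 1; i > 0; i--) {
        assert(lo[i] <= hi[i]);
        size[i - 1] = size[i] * (hi[i] - lo[i] + 1);
    }
}

/* Sum of x[i] * size[i] over all dimensions. */
long weighted_sum(size_t k, const long x[], const long size[]) {
    long sum = 0;
    for (size_t i = 0; i < k; i++)
        sum += x[i] * size[i];
    return sum;
}

/* r = sum(R_i * size(i)) - sum(u_i * size(i)). */
long row_major_offset(size_t k, const long lo[], const long hi[],
                      long elem_size, const long idx[]) {
    long size[MAX_DIMS];
    dimension_sizes(k, lo, hi, elem_size, size);
    long constant_part = weighted_sum(k, lo, size);   /* compile-time constant */
    long runtime_part = weighted_sum(k, idx, size);   /* needs generated code */
    return runtime_part - constant_part;
}

/* The Pascal example a: array[-5..5, 1..9] of integer with 4-byte integers. */
long pascal_example_offset(long e1, long e2) {
    const long lo[] = { -5, 1 }, hi[] = { 5, 9 }, idx[] = { e1, e2 };
    return row_major_offset(2, lo, hi, 4, idx);
}
```

For the Pascal example the sizes are 36 and 4 bytes, the constant part is −176, and an access with both index values 0 would yield the offset 176 = 44 ∗ 4, in line with the folded constant derived for the two-dimensional case.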

Code Generation for Array Access

Abstract syntax of source language:

ArrayAccess ( UsedId uid, IndexExps ies )
UsedId ( Ident id )
IndexExps = IndexExpElem | IndexExp
IndexExpElem ( IndexExp ie, IndexExps ies )
IndexExp ( ... )

Relative address for addressing an array a:

$$ \mathit{relA} = \mathit{offset}(a) - \sum_{i=1}^{k} u_i \cdot \mathit{size}(i) $$

Code Generation for Array Access (2)

Attributes:

• Symbol table
• Result register Ri
• Address of array element
• Code for subtree
• List of sizes for each array dimension
• Relative address for array a

Code Generation for Array Access (3)

Operations for attribution:

• lkupRA: Ident × SymTab → Address

• lkupSZL: Ident × SymTab → IntList

• + : List concatenation, for an element e, [e] is the list containing only e.

In the following, the SymTab attribute is only explicitly given where it is required.

Code Generation for Array Access (4)

[Figure: attribution of ArrayAccess ( UsedId, IndexExps ). The relative address RA and the list of sizes are obtained from the identifier of UsedId with lkupRA and lkupSZL. Each IndexExpElem uses first(_) of the size list and passes rest(_) on to its IndexExps subtree. The code of ArrayAccess is the code of IndexExps followed by ADD(Ri, Ri, $gp) and ADD(Ri, Ri, RA), where Ri is the result register of IndexExps and $gp holds the base address of the memory region in which the array is stored.]

Code Generation for Array Access (5)

To keep the attribution diagrams readable, identifiers can be used for attribute values. For IndexExpElem, let CL and RL be the code and result register of the IndexExp subtree, CR and RR those of the IndexExps subtree, and FI the first element of the size list, first(_); rest(_) is passed on to the IndexExps subtree. The generated code is:

    CL + CR +
    [LOADI (RT, FI)] + [MUL (RL, RL, RT)] + [ADD (RR, RR, RL)]

As before, only the first sum has to be computed at runtime, so code has to be generated only for it. During stepwise computation, array bounds can also be checked.

Array Access

Remarks:

• Computation of array indices offers great potential for optimizations.

• For translation of dynamic arrays, addressing has to be generalized appropriately. (cf. Wilhelm/Maurer, Sect. 2.6.2)

Translation of Records

Translation of records is similar to translation of arrays:

• Determine size and memory layout

• Compute addresses for selection of record components and pointer dereferencing

• Translation of record operations, e.g. assignments to record components

Recommended Reading: Wilhelm, Maurer, Section 2.6.2


Implementation of Dynamic Objects

Dynamic objects = dynamically allocated variables and objects in the sense of OO programming

Dynamic objects are stored on the heap:

• the number of dynamic objects is in general not known at compile time; objects are created at runtime
• the lifetime of dynamic objects in general does not follow a stack discipline, so they cannot be managed on the stack

Memory representation and addressing of components is similar to static records.

Implementation of Dynamic Objects (2)

Example:

    #include <stdlib.h>

    typedef struct listelem {
      int head;
      struct listelem* tail;
    }* list;

    #define listelemSIZE sizeof(struct listelem)

    list append( int i, list l ) {
      list lvar = (list) calloc(1, listelemSIZE);
      lvar->head = i;
      lvar->tail = l;
      return lvar;
    }

Dynamic Memory Management

Dynamic memory management

• is handled by the runtime environment
• can be supported by the compiler
• can partially be handled by the user program

The runtime environment provides operations for dynamic memory management:

• for the programmer, e.g. in C: malloc, calloc, realloc, free
• for the compiler, as in Pascal, Java, Ada
• in some languages, e.g. Java, the programmer cannot deallocate memory; the runtime environment reclaims it by garbage collection

Dynamic Memory Management (2)

General problem: Provide memory blocks of different sizes from a linear memory and reuse memory after it has been freed.

Simple memory management: keep a linear list of free memory areas.

Structure of a free memory area of variable length: a header containing the size, followed by the user data.

Dynamic Memory Management (3)

List of free memory areas:

[Figure: linear memory with alternating free and used areas; each area consists of a header (size) followed by user data; freelist refers to the list of free areas]

Procedure to allocate and deallocate memory:

• Allocate memory
  - Search for a free memory area B of appropriate size
  - Update references:
      • If the area has exactly the required size, remove it from the list.
      • Otherwise update the header of the area, create a header for the rest of the free memory, and add this rest to the list in place of the old area.
  - Return a pointer to the memory cell after the header (the size information has to be kept).
  - If no memory area of the required size is found, new memory has to be requested from the OS.
• Free memory
  - Find the header of the memory area to be freed via the pointer to this area
  - If the previous or next memory area is free, join the areas
  - Add the resulting memory area to the list
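The procedure above can be sketched as a first-fit allocator in C. This is an illustration under assumptions, not the runtime environment of any particular language: the linear memory is a fixed static array, the free list is kept sorted by address so that neighbouring free areas can be joined, requests are rounded up to a multiple of the header size to keep headers aligned, and instead of requesting memory from the OS the sketch simply returns NULL. All names (`block`, `mm_init`, `mm_alloc`, `mm_free`, ...) are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Header stored in front of every memory area, free or used. */
typedef struct block {
    size_t size;          /* bytes of user data following the header */
    struct block *next;   /* next free area; meaningful only while free */
} block;

#define ARENA_BYTES 1024
static block arena[ARENA_BYTES / sizeof(block)];   /* the linear memory */
static block *freelist;

static unsigned char *area_end(block *b) {
    return (unsigned char *)(b + 1) + b->size;
}

void mm_init(void) {
    arena[0].size = sizeof arena - sizeof(block);
    arena[0].next = NULL;
    freelist = &arena[0];
}

void *mm_alloc(size_t n) {
    n = (n + sizeof(block) - 1) / sizeof(block) * sizeof(block);
    if (n == 0) n = sizeof(block);
    for (block **link = &freelist; *link != NULL; link = &(*link)->next) {
        block *b = *link;
        if (b->size < n) continue;        /* too small, keep searching */
        if (b->size == n) {
            *link = b->next;              /* exact fit: remove area from list */
        } else {
            block *rest = (block *)((unsigned char *)(b + 1) + n);
            rest->size = b->size - n - sizeof(block);
            rest->next = b->next;
            *link = rest;                 /* rest replaces the old area */
            b->size = n;
        }
        return b + 1;                     /* user data starts after the header */
    }
    return NULL;   /* a real runtime would request memory from the OS here */
}

void mm_free(void *p) {
    if (p == NULL) return;
    block *b = (block *)p - 1;            /* header precedes the user data */
    assert(b >= arena && b < arena + sizeof arena / sizeof arena[0]);
    block *prev = NULL, *next = freelist;
    while (next != NULL && next < b) { prev = next; next = next->next; }
    if (next != NULL && area_end(b) == (unsigned char *)next) {
        b->size += sizeof(block) + next->size;       /* join with next area */
        next = next->next;
    }
    b->next = next;
    if (prev != NULL && area_end(prev) == (unsigned char *)b) {
        prev->size += sizeof(block) + b->size;       /* join with previous area */
        prev->next = b->next;
    } else if (prev != NULL) {
        prev->next = b;
    } else {
        freelist = b;
    }
}

/* Helpers for inspecting the free list. */
size_t mm_free_areas(void) {
    size_t count = 0;
    for (block *b = freelist; b != NULL; b = b->next) count++;
    return count;
}

size_t mm_free_bytes(void) {
    size_t bytes = 0;
    for (block *b = freelist; b != NULL; b = b->next) bytes += b->size;
    return bytes;
}
```

Keeping the list sorted by address makes joining cheap to detect at the cost of a linear walk on every free; a production allocator would typically use a different organisation.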


Dynamic Memory Management (5)

Remarks:

• If a program writes beyond its assigned memory area, references or size information can be destroyed, with severe consequences.
• If memory cannot be allocated at byte granularity, alignment restrictions have to be obeyed.
• For practical use the above principle can be improved by
  - non-linear search
  - searching for areas of exactly matching size, which reduces fragmentation
  - support for joining memory areas after deallocation

3.1.4 Translation of Expressions

Translation of Expressions

Difficulties for translation of expressions

• Management of intermediate results on stack or in registers
• Translation of source language operations
  - no counterpart in target language
  - addressing
  - context-dependent (a Boolean expression used as a condition is handled differently than a Boolean expression in an assignment)

Translation of Expressions (2)

The general problems are illustrated with a small example of direct expression translation. More advanced techniques are treated later in the lecture.

Abstract Syntax of Expressions:

Exp = ArithmExp | Relation | IntConst | CharConst | ArrayAccess | Var
ArithmExp = Add | Sub
Add, Sub ( Exp left, Exp right )
Relation = Lt | Eq
Lt, Eq ( Exp left, Exp right )
IntConst ( Int i )
CharConst ( Char c )
ArrayAccess ( UsedId uid, Exp e )
Var ( UsedId uid )
UsedId ( Ident id )

Translation of Expressions (3)

Design decisions:

• Intermediate results are stored on the stack.
• Comparisons are implemented by jumps:
  - compare values on stack
  - depending on the result, jump to the instruction pushing 1 or to the one pushing 0
  - generate associated labels

Translation of Expressions (4)

Attributes:

• Relative address of a variable or array
• Type of an expression (int, char, int[], char[])
• Code for the subtree, of type CodeList
• Unique label for an expression, of type String

Translation of Expressions (5)

Attribution for the code attribute of Add, where CL and CR are the code of the left and right subexpression:

    CL + CR +
    [LOAD  (R2, 0, $sp)
     ADD   ($sp, $sp, 4)
     LOAD  (R1, 0, $sp)
     ADD   (R1, R1, R2)
     STORE (R1, 0, $sp)]

(The attribution for Sub is analogous.)

Translation of Expressions (6)

Attribution for the code attribute of Lt, where CL and CR are the code of the left and right subexpression and M is the unique label of the expression:

    CL + CR +
    [LOAD  (R2, 0, $sp)
     ADD   ($sp, $sp, 4)
     LOAD  (R1, 0, $sp)
     SLT   (R1, R1, R2)
     BEQ   (R1, $zero, "PUSH_0_" + M)
     LOADI (R1, 1)
     STORE (R1, 0, $sp)
     JUMP  ("ENDREL_" + M)
     LABEL ("PUSH_0_" + M)
     LOADI (R1, 0)
     STORE (R1, 0, $sp)
     LABEL ("ENDREL_" + M)]

(The attribution for Eq is analogous.)
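How such an attribution turns into a recursive code generator can be sketched in C. The sketch emits the instruction names used on the slides as text lines and covers only integer constants, Add, and Lt. It assumes that the unique label M is drawn from a simple counter; the types and functions (`Exp`, `gen`, `emit`, `generated_code`, `reset_code`) are hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

typedef enum { INT_CONST, ADD, LT } Kind;

typedef struct Exp {
    Kind kind;
    int value;                        /* used by INT_CONST only */
    const struct Exp *left, *right;   /* used by ADD and LT */
} Exp;

static char code[4096];               /* generated code, one instruction per line */
static size_t code_len;
static int next_label;                /* assumption: label M comes from a counter */

void reset_code(void) { code_len = 0; code[0] = '\0'; next_label = 0; }
const char *generated_code(void) { return code; }

static void emit(const char *line) {
    size_t n = strlen(line);
    assert(code_len + n + 2 <= sizeof code);
    memcpy(code + code_len, line, n);
    code_len += n;
    code[code_len++] = '\n';
    code[code_len] = '\0';
}

/* Right operand to R2, pop it, left operand to R1 (its slot receives the result). */
static void load_operands(void) {
    emit("LOAD (R2, 0, $sp)");
    emit("ADD ($sp, $sp, 4)");
    emit("LOAD (R1, 0, $sp)");
}

void gen(const Exp *e) {
    char buf[64];
    switch (e->kind) {
    case INT_CONST:
        snprintf(buf, sizeof buf, "LOADI (R1, %d)", e->value);
        emit(buf);
        emit("SUB ($sp, $sp, 4)");
        emit("STORE (R1, 0, $sp)");
        break;
    case ADD:
        gen(e->left);                 /* CL */
        gen(e->right);                /* CR */
        load_operands();
        emit("ADD (R1, R1, R2)");
        emit("STORE (R1, 0, $sp)");
        break;
    case LT: {
        int m = next_label++;
        gen(e->left);
        gen(e->right);
        load_operands();
        emit("SLT (R1, R1, R2)");
        snprintf(buf, sizeof buf, "BEQ (R1, $zero, PUSH_0_%d)", m);
        emit(buf);
        emit("LOADI (R1, 1)");
        emit("STORE (R1, 0, $sp)");
        snprintf(buf, sizeof buf, "JUMP (ENDREL_%d)", m);
        emit(buf);
        snprintf(buf, sizeof buf, "LABEL (PUSH_0_%d)", m);
        emit(buf);
        emit("LOADI (R1, 0)");
        emit("STORE (R1, 0, $sp)");
        snprintf(buf, sizeof buf, "LABEL (ENDREL_%d)", m);
        emit(buf);
        break;
    }
    }
}
```

Calling `gen` on the tree for 1 + 2 produces the code for both constants followed by the five instructions of the Add scheme, which shows how much stack traffic the simple scheme generates even for a constant expression.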

Translation of Expressions (7)

IntConst, where int is the value of the constant:

    [LOADI (Ri, int)] + [SUB ($sp, $sp, 4)] + [STORE (Ri, 0, $sp)]

Var, where TV is the type of the variable and RA its relative address:

    if TV = int then
      [SUB   ($sp, $sp, 4)
       LOADI (R1, RA)
       ADD   (R1, R1, $gp)
       LOAD  (R2, 0, R1)
       STORE (R2, 0, $sp)]
    else // TV = char
      [SUB    ($sp, $sp, 1)
       LOADI  (R1, RA)
       ADD    (R1, R1, $gp)
       LOAD   (R2, 0, R1)
       STOREB (R2, 0, $sp)]

Note: The attribution of Var and ArrayAccess generates code that pushes the value of the expression, not code that computes the address of the access.


Translation of Expressions (9)

ArrayAccess (attribute TV), with children UsedId (attribute RA) and Exp (attribute CR):

CR +

[LOAD (R1, 0, $sp) LOADI (R2, RA) ADD (R1, R1, R2) ADD (R1, R1, $gp)] +

if TV = int then

[LOAD (R2, 0, R1) STORE (R2, 0, $sp)]

else // TV = char

[LOADB (R2, 0, R1) STOREB (R2, 0, $sp)]

Note: The attribution of Var and ArrayAccess generates code that pushes the value of the expression onto the stack, not code that computes the address of the access.
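The translation schemes for IntConst, Var and ArrayAccess can be read as a recursive function over the expression tree. The following Python sketch illustrates this under some assumptions: the class names, the field names tv, ra and index, and the function name code are hypothetical, instructions are represented as plain strings, and R1 is used where the slides write Ri. It mirrors the schemes as given and is not meant as the lecture's actual implementation.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class IntConst:
    value: int          # the integer literal


@dataclass
class Var:
    tv: str             # attribute TV: "int" or "char"
    ra: int             # attribute RA: relative address of the variable


@dataclass
class ArrayAccess:
    tv: str             # attribute TV: element type, "int" or "char"
    ra: int             # attribute RA: relative address of the array
    index: object       # index expression (IntConst, Var or ArrayAccess)


def code(e) -> List[str]:
    """Return instructions that push the value of e onto the stack."""
    if isinstance(e, IntConst):
        return [f"LOADI (R1, {e.value})", "SUB ($sp, $sp, 4)", "STORE (R1, 0, $sp)"]
    if isinstance(e, Var):
        size, store = (4, "STORE") if e.tv == "int" else (1, "STOREB")
        return [
            f"SUB ($sp, $sp, {size})",
            f"LOADI (R1, {e.ra})",
            "ADD (R1, R1, $gp)",
            "LOAD (R2, 0, R1)",
            f"{store} (R2, 0, $sp)",
        ]
    if isinstance(e, ArrayAccess):
        cr = code(e.index)  # attribute CR: code of the index expression
        address = [
            "LOAD (R1, 0, $sp)",
            f"LOADI (R2, {e.ra})",
            "ADD (R1, R1, R2)",
            "ADD (R1, R1, $gp)",
        ]
        if e.tv == "int":
            access = ["LOAD (R2, 0, R1)", "STORE (R2, 0, $sp)"]
        else:  # TV = char
            access = ["LOADB (R2, 0, R1)", "STOREB (R2, 0, $sp)"]
        return cr + address + access
    raise TypeError(f"unsupported expression: {e!r}")


if __name__ == "__main__":
    for instruction in code(ArrayAccess("int", 8, IntConst(2))):
        print(instruction)
```

Concatenating lists of instructions corresponds to the + operator in the schemes; the code of a subexpression is simply placed in front of the code that consumes its value from the stack.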

Improvements

• Improvement of generated code by
   - Storage of intermediate results in registers
   - Context-dependent optimizing instruction selection
   - Avoiding redundant computations by evaluating common subexpressions only once

• Improvement of translation technique by usage of intermediate language


3.1.5 Translation of Statements


Translation of Statements

Most statements can be translated by translation schemes with jumps:


While (attribute M), with children Exp (attribute CE) and Stat (attribute CS):

[LABEL ("BEGWHILE_"+M)] + CE +

[LOAD (R1, 0, $sp) ADD ($sp, $sp, 4) BEQ (R1, $zero, "ENDWHILE_"+M)] + CS +

[JUMP ("BEGWHILE_"+M)] + [LABEL ("ENDWHILE_"+M)]
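The While scheme can be sketched as a small function that wraps the code of the condition and of the body in labels and jumps. In this Python sketch the function name translate_while and the representation of instructions as strings are assumptions made for illustration; m, ce and cs stand for the attributes M, CE and CS.

```python
from typing import List


def translate_while(m: int, ce: List[str], cs: List[str]) -> List[str]:
    """Build the code of a while loop from the code of its condition (ce)
    and the code of its body (cs); m makes the two labels unique."""
    beg = f'"BEGWHILE_{m}"'
    end = f'"ENDWHILE_{m}"'
    return (
        [f"LABEL ({beg})"]
        + ce  # pushes the value of the condition onto the stack
        # pop the condition value and leave the loop if it is 0 (false)
        + ["LOAD (R1, 0, $sp)", "ADD ($sp, $sp, 4)", f"BEQ (R1, $zero, {end})"]
        + cs
        + [f"JUMP ({beg})", f"LABEL ({end})"]
    )


if __name__ == "__main__":
    for instruction in translate_while(1, ["<code of condition>"], ["<code of body>"]):
        print(instruction)
```

Passing a different m for every loop keeps the labels of nested or consecutive loops apart.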


More Complex Translation of Statements

Translating switch statements well and handling non-strict expressions efficiently is more difficult.

We consider the translation of non-strict Boolean expressions as an example of an optimizing translation and of the use of context information.

Example: Abstract Syntax

Stat = While | IfThenElse | ...

BExp = And | Or | Not | StrictExp

While ( BExp c, Stat b )

IfThenElse ( BExp c, Stat then, Stat else )

And, Or ( BExp left, BExp right )

Not ( BExp e )

StrictExp ( Exp e )
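One common way to translate non-strict Boolean expressions with context information is to pass the jump targets for the true and the false outcome down the tree as inherited information. The Python sketch below illustrates that idea for the BExp syntax; it is an assumption about how such a translation could look, not the scheme of the lecture. The names cond, Labels, POP and the parameter fall (the label that directly follows the generated code) are hypothetical, and BNE is assumed to exist by analogy to BEQ.

```python
from dataclasses import dataclass
from itertools import count
from typing import List


@dataclass
class StrictExp:
    code: List[str]     # code that pushes the value of the strict expression


@dataclass
class Not:
    e: object


@dataclass
class And:
    left: object
    right: object


@dataclass
class Or:
    left: object
    right: object


class Labels:
    """Source of fresh label names L1, L2, ..."""

    def __init__(self) -> None:
        self._numbers = count(1)

    def fresh(self) -> str:
        return f"L{next(self._numbers)}"


# pops the value of a strict expression into R1
POP = ["LOAD (R1, 0, $sp)", "ADD ($sp, $sp, 4)"]


def cond(b, t: str, f: str, fall: str, labels: Labels) -> List[str]:
    """Code that continues at label t if b is true and at label f otherwise.
    fall is the label placed directly after this code, so no jump to it is needed."""
    if isinstance(b, StrictExp):
        if fall == t:
            jumps = [f"BEQ (R1, $zero, {f})"]
        elif fall == f:
            jumps = [f"BNE (R1, $zero, {t})"]
        else:
            jumps = [f"BEQ (R1, $zero, {f})", f"JUMP ({t})"]
        return b.code + POP + jumps
    if isinstance(b, Not):
        return cond(b.e, f, t, fall, labels)    # swap the targets, emit no code
    if isinstance(b, And):
        mid = labels.fresh()                    # start of the right operand
        return (cond(b.left, mid, f, mid, labels) + [f"LABEL ({mid})"]
                + cond(b.right, t, f, fall, labels))
    if isinstance(b, Or):
        mid = labels.fresh()
        return (cond(b.left, t, mid, mid, labels) + [f"LABEL ({mid})"]
                + cond(b.right, t, f, fall, labels))
    raise TypeError(f"unsupported Boolean expression: {b!r}")


if __name__ == "__main__":
    b = And(Or(StrictExp(["<B1>"]), StrictExp(["<B2>"])), Not(StrictExp(["<B3>"])))
    for instruction in cond(b, "THEN", "ELSE", "THEN", Labels()):
        print(instruction)
```

In this sketch the right operand of And or Or is skipped as soon as the left operand decides the result, and every jump targets either one of the inherited labels or the start of an operand's evaluation code, so no jump inside the condition leads to an unconditional jump.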


More Complex Translation of Statements (2)

As in C and Java, the Boolean operators || and && are non-strict: in the fragment below, for example, if B1 and B2 both evaluate to false, B3 need not, and must not, be evaluated. In addition, the generated code should avoid jump chains, i.e. jumps to unconditional jump instructions.

A program fragment:

if ( (B1 || B2) && !B3 ) {
    while ( !(B4 || B5) ) A1
} else {
    A2
}

where

• A1, A2 are statements

• B1 – B5 are strict expressions