Summer Term 2011
Prof. Dr. Arnd Poetzsch-Heffter
Software Technology Group TU Kaiserslautern
© Prof. Dr. Arnd Poetzsch-Heffter
Content of Lecture
1. Introduction
2. Syntax and Type Analysis
2.1 Lexical Analysis
2.2 Context-Free Syntax Analysis
2.3 Context-Dependent Analysis
3. Translation to Target Language
3.1 Translation of Imperative Language Constructs
3.2 Translation of Object-Oriented Language Constructs
4. Selected Aspects of Compilers
4.1 Intermediate Languages
4.2 Optimization
4.3 Data Flow Analysis
4.4 Register Allocation
4.5 Code Generation
5. Garbage Collection
3. Translation to Target Language
Chapter Outline
3. Translation to Target Language
3.1 Translation of Imperative Language Constructs
3.1.1 Language Constructs of Procedural Languages
3.1.2 Assembly and Machine Languages
3.1.3 Translation of Variables and Data Types
3.1.4 Translation of Expressions
3.1.5 Translation of Statements
3.1.6 Translation of Procedures and Local Objects
3.2 Translation of Object-Oriented Language Constructs
3.2.1 Concepts of Object-Oriented Programming Languages
3.2.2 Translation with Procedural Languages
3.2.3 Translation of Classes
3.2.4 Problems of Multiple Inheritance
3.2.5 Further Aspects of Object-Oriented Languages
Focus:
• Differences between source languages and target languages/target machines
• Most important translation techniques for different programming paradigms (procedural/object-oriented)
Translation to Target Language (2)
Learning Objectives:
• Overview of imperative and procedural language constructs
• Typical language constructs of assembler languages
• Translation techniques for procedural language constructs
• Translation of object-oriented language constructs
3.1 Translation of Imperative Language Constructs
3.1.1 Language Constructs of Procedural Languages
From a conceptual and semantic viewpoint, procedural languages have the following constructs:
• Domains with operations (often typed)
  - pre-defined: int, boolean, ...
  - user-defined: records, classes, ...
  - implicitly defined: array types, address types, function types
• Variables
  - simple and compound types
  - global, local, statically/dynamically allocated
  - define the memory state
• Expressions
  - computation of values with implicit intermediate results
  - possibly in combination with execution control and state
Language Constructs of Procedural Languages (2)
• Statements
  - simple and combined statements
  - define execution control and state modification
• Procedures
  - abstraction of parametrized statements
  - may be recursive
  - may be nested
Modules usually do not have a semantic meaning and are only relevant for translation in name analysis and for binding and loading.
Nested Procedures
Example from [Wilhelm, Maurer; Fig. 2.9]
Nested (local) procedures are supported, for example, by Pascal and Ada.
Example (nested procedures):
proc P(a)
  var b
  var c
  proc Q
    var a
    proc R
      var b
      begin ... b ... a ... c ... end
    begin ... a ... b ... call Q ... end
  proc S
    var a
    begin ... a ...
3.1.2 Assembly and Machine Languages
Assembly languages have the following language constructs:
• Finite sequences of bits of various lengths: byte, halfword, word, ...
• Global memory
  - registers, flags (addressed by name)
  - indexed, mostly word-addressed main memory
• Instructions
  - load, store
  - arithmetic and boolean operations
  - execution control (jumps, procedures)
  - simple, not combined statements
  - possibly complex addressing of operands
• Initialization instructions
The MIPS Assembler
MIPS - Microprocessor without interlocked pipeline stages
• RISC architecture, originally 32 bit (since 1991 64 bit)
• developed by John Hennessy (Stanford) starting 1981
• MARS Simulator
http://courses.missouristate.edu/KenVollmar/MARS/
MIPS Architecture
• Arithmetic-Logic Unit (ALU)
• Floating-Point Unit (FPU)
• 32 registers (incl. stack pointer, frame pointer, global pointer, return address)
• Main memory, 2^30 memory words (4 bytes each)
• 5-stage pipeline
[Figure: MIPS five-stage pipeline with the stages IF (instruction fetch), ID (instruction decode, register fetch), EX (execute, address calculation), MEM (memory access), and WB (write back). Image: Wikipedia]
Memory Structure
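Address range              Contents
0x80000000 - 0xFFFFFFFF    Reserved for OS
0x10000000 - 0x7FFFFFFF    Data Segment (lowest), Heap Segment, free memory, Stack Segment (highest; $sp marks the top of the stack)
0x00400000 - 0x0FFFFFFF    Text Segment
0x00000000 - 0x003FFFFF    Reserved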
Data Types and Literals in MIPS Assembly Language
Data Types
• Instructions are all 32 bits
• byte (8 bits), halfword (2 bytes), word (4 bytes)
• integer (1 word storage)
• single precision floats (1 word storage)
• double precision floats (2 word storage)
Literals
• Integers (e.g. 4, 2, -236, 0x44)
• Floats (e.g. 3.41, -0.323e5)
• Characters in single quotes, e.g. 'b'
• Strings in double quotes, e.g. "Hello World"
MIPS Registers
No      Name         P*    Description
0       $zero        -     the constant 0
1       $at          -     assembler temporary (reserved by the assembler)
2-3     $v0, $v1     no    values for function results and expression evaluation
4-7     $a0 - $a3    no    arguments for subroutine calls
8-15    $t0 - $t7    no    temporaries
16-23   $s0 - $s7    yes   saved temporaries
24-25   $t8 - $t9    no    additional temporaries
26-27   $k0, $k1     no    reserved for OS kernel
28      $gp          yes   global pointer
29      $sp          yes   stack pointer
30      $fp          yes   frame pointer
31      $ra          yes   return address
* P: callee must preserve value
MIPS Instruction Format
• Instructions are always 32 bit
• Opcode in first 6 bits
• 3 types of instructions: R-, I-, and J-instructions
Field widths in bits are given in parentheses:
R-Instructions: opcode (6) rs (5) rt (5) rd (5) shamt (5) funct (6)
I-Instructions: opcode (6) rs (5) rt (5) immediate (16)
J-Instructions: opcode (6) address (26)
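The three formats can be illustrated by packing the fields into a 32-bit word. The following is a minimal sketch, not part of the original slides; the function names encode_r, encode_i and encode_j are hypothetical, and the opcode and funct values used in the usage note are taken from the MIPS32 instruction set rather than from this lecture.

```c
#include <stdint.h>

/* R-instruction: opcode(6) rs(5) rt(5) rd(5) shamt(5) funct(6). */
uint32_t encode_r(uint32_t opcode, uint32_t rs, uint32_t rt,
                  uint32_t rd, uint32_t shamt, uint32_t funct) {
    return ((opcode & 0x3Fu) << 26) | ((rs & 0x1Fu) << 21) |
           ((rt & 0x1Fu) << 16) | ((rd & 0x1Fu) << 11) |
           ((shamt & 0x1Fu) << 6) | (funct & 0x3Fu);
}

/* I-instruction: opcode(6) rs(5) rt(5) immediate(16).
   A negative immediate is stored in 16-bit two's complement. */
uint32_t encode_i(uint32_t opcode, uint32_t rs, uint32_t rt, int32_t imm) {
    return ((opcode & 0x3Fu) << 26) | ((rs & 0x1Fu) << 21) |
           ((rt & 0x1Fu) << 16) | ((uint32_t)imm & 0xFFFFu);
}

/* J-instruction: opcode(6) address(26). */
uint32_t encode_j(uint32_t opcode, uint32_t address) {
    return ((opcode & 0x3Fu) << 26) | (address & 0x03FFFFFFu);
}
```

For example, add $t0, $t1, $t2 uses opcode 0 and funct 0x20 with rs = 9, rt = 10, rd = 8, so encode_r(0, 9, 10, 8, 0, 0x20) gives 0x012A4020.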
MIPS Instructions
In the following, let r1, r2, r3 be registers (e.g. $s1, $t3) and let c be a constant value (e.g. 4, 100, -4).
Arithmetic
Instruction      Syntax            Meaning
add              add r1, r2, r3    r1 = r2 + r3
subtract         sub r1, r2, r3    r1 = r2 - r3
add immediate    addi r1, r2, c    r1 = r2 + c
multiply         mul r1, r2, r3    r1 = r2 * r3 (lower 32 bits of result)
move             move r1, r2       r1 = r2 (same effect as addi r1, r2, 0)
MIPS Instructions (2)
Data Transfer
Instruction      Syntax           Meaning
load word        lw r1, c(r2)     r1 = Memory[r2 + c]
store word       sw r1, c(r2)     Memory[r2 + c] = r1
load immediate   li r1, c         r1 = c
load half        lh r1, c(r2)     r1 = Memory[r2 + c] (halfword)
store half       sh r1, c(r2)     Memory[r2 + c] = r1 (halfword)
load byte        lb r1, c(r2)     r1 = Memory[r2 + c] (byte)
store byte       sb r1, c(r2)     Memory[r2 + c] = r1 (byte)
MIPS Instructions (3)
Logical
Instruction           Syntax            Meaning
and                   and r1, r2, r3    r1 = r2 & r3
or                    or r1, r2, r3     r1 = r2 | r3
nor                   nor r1, r2, r3    r1 = ¬(r2 | r3)
and immediate         andi r1, r2, c    r1 = r2 & c
or immediate          ori r1, r2, c     r1 = r2 | c
shift left logical    sll r1, r2, c     r1 = r2 << c
shift right logical   srl r1, r2, c     r1 = r2 >> c
MIPS Instructions (4)
Conditional Branches
Instruction                  Syntax               Meaning
branch on equal              beq r1, r2, label    if (r1 == r2) goto label
branch on not equal          bne r1, r2, label    if (r1 != r2) goto label
set on less than             slt r1, r2, r3       if (r2 < r3) r1 := 1 else r1 := 0
set on less than immediate   slti r1, r2, c       if (r2 < c) r1 := 1 else r1 := 0
Unconditional Branches
Instruction      Syntax       Meaning
jump             j label      goto label
jump register    jr r1        goto r1
jump and link    jal label    $ra = PC + 4; goto label
Subroutine Calls
Subroutine call (jump and link)
jal label # jump and link
• copy the return address (PC + 4) to $ra
• jump to label
• Note: before a call, store the current $ra on the stack, because jal overwrites it
Subroutine return (jump register)
jr $ra # jump register
Working with the Stack
Push data on the stack
sw $ra, ($sp)        # save return address on stack
addi $sp, $sp, -4    # decrement stack pointer
sw $fp, ($sp)        # save frame pointer on stack
addi $sp, $sp, -4    # decrement stack pointer
Pop data from the stack
addi $sp, $sp, 4     # increment stack pointer
lw $fp, ($sp)        # pop saved frame pointer
addi $sp, $sp, 4     # increment stack pointer
lw $ra, ($sp)        # pop saved return address
Addressing in MIPS
• Immediate: Operand is a constant, e.g. 25
• Register: Operand is a register, e.g. $s2
• Base or Displacement Addressing: Operand is a memory location whose address is the sum of a register and a constant, e.g. 8($sp)
• PC relative: Address is the sum of the PC and a constant contained in the instruction
• Pseudodirect Addressing: Jump address is formed from the 26 address bits of the instruction, shifted left by 2, combined with the upper 4 bits of the PC
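The last two addressing modes can be written out as address arithmetic. This is a sketch based on the MIPS32 definition of branches and jumps, not on the slides: the function names are hypothetical, the 16-bit branch constant is assumed to be a signed word offset relative to the instruction after the branch, and branch delay slots are ignored.

```c
#include <stdint.h>

/* PC-relative: target = (PC + 4) + 4 * signed 16-bit offset. */
uint32_t branch_target(uint32_t pc, int16_t imm) {
    return pc + 4u + (uint32_t)((int32_t)imm * 4);
}

/* Pseudodirect: upper 4 bits of PC + 4, then the 26 address bits,
   then two zero bits (instructions are word aligned). */
uint32_t jump_target(uint32_t pc, uint32_t addr26) {
    return ((pc + 4u) & 0xF0000000u) | ((addr26 & 0x03FFFFFFu) << 2);
}
```

With the text segment starting at 0x00400000, a jump whose address field is 0x100004 therefore leads to 0x00400010.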
Syscalls for MARS/SPIM Simulators
How to use System Calls:
• load service number into register $v0
• load argument values, if any, into $a0, $a1, $a2
• issue call instruction syscall
• retrieve return values, if any
Example:
li $v0, 1        # print integer
move $a0, $t0    # load value into $a0
syscall
List of System Services
Service                        Code in $v0   Arguments
print integer                  1             $a0 = integer to print
print string                   4             $a0 = address of null-terminated string to print
exit (terminate execution)     10
print character                11            $a0 = character to print
exit2 (terminate with value)   17            $a0 = termination result
MIPS Assembly Program Structure
.data # data declarations follow this line
# ...
.text # instructions follow this line
# ...
main: # indicates the first instruction to execute
# ...
Data Declarations
Format
<name>: <type> (<initial values> | <allocated space>)
Example
.data # data declarations follow
var: .word 3              # integer variable with initial value 3
array1: .byte 'a','b'     # 2-element character array initialized
                          # with 'a' and 'b'
array2: .space 40         # allocate 40 consecutive bytes, uninitialized
Example: Translation to MIPS
The example illustrates the MIPS assembler and typical translation tasks.
Code quality is not considered.
Source Code in C
char a[3], b[3];
int i;
char res;
int main() {
  i = 2;
  res = 1;
  while( -1 < i ) {
    if( res ) {
      res = (a[i]==b[i]);
      i = i-1;
    } else {
      i = i-1;
    }
  }
  return res;
}
Source Code in C with Labels
char a[3], b[3];
int i;
char res;
int main() {
  main: i = 2;
  res = 1;
  loop: while( -1 < i ) {
    if( res ) {
      res = (a[i]==b[i]);
      after: i = i-1;
    } else {
      elseif: i = i-1;
    } // afterif:
  }
  exit: return res;
}
Source Code in C with Gotos
char a[3], b[3];
int i;
char res;
int main() {
  i = 2;
  res = 1;
  loop: if (! (-1 < i ))
    goto exit;
  if( !res )
    goto elseif;
  if (a[i]==b[i])
    goto equal;
  res = 0;
  goto after;
  equal: res = 1;
  after: i = i-1;
  goto afterif;
  elseif: i = i-1;
  afterif: goto loop;
  exit: return res;
}
MIPS Program
# sp + 0 : i
# sp + 4 : res
# sp + 5 : base address of a[3]
# sp + 8 : base address of b[3]
main:
  addi $sp, $sp, -12      # make space for the variables
  li $t1, 2
  sw $t1, 0($sp)          # i = 2
  li $t1, 1
  sb $t1, 4($sp)          # res = 1 (stored at sp + 4)
loop:
  lw $t2, 0($sp)          # load i into $t2
  li $t3, -1              # load -1 into $t3
  slt $t0, $t3, $t2       # -1 < i ?
  beq $t0, $zero, exit    # if not -1 < i goto exit
  lb $t1, 4($sp)          # load res from stack into $t1
  beq $t1, $zero, elseif  # if res == 0 goto elseif
  addi $t4, $sp, 5        # base address of array a
  add $t4, $t4, $t2       # add offset / array index
  lb $t0, 0($t4)          # load a[i]
  addi $t4, $sp, 8        # base address of array b
  add $t4, $t4, $t2       # add offset / array index
  lb $t1, 0($t4)          # load b[i]
  beq $t0, $t1, equal     # if a[i] == b[i] goto equal
  sb $zero, 4($sp)        # set res to 0
  j after
equal:
  addi $t3, $zero, 1      # $t3 = 1
  sb $t3, 4($sp)          # res = $t3
after:
  subi $t2, $t2, 1        # i = i-1
  sw $t2, 0($sp)          # store i at sp + 0
  j afterif               # goto end of if statement
elseif:
  subi $t2, $t2, 1        # i = i-1
  sw $t2, 0($sp)          # store i at sp + 0
afterif:
  j loop                  # return to loop
exit:
  lb $a0, 4($sp)          # terminate with exit code res (res is a single byte)
  addi $sp, $sp, 12       # reset stack pointer
  li $v0, 17
  syscall
Translation to MIPS
Remarks:
The example illustrates typical translation tasks:
• Translation of data types, memory management, addressing
• Translation of expressions, management of intermediate results, mapping of operations of the source language to operations of the target language
• Translation of statements by implementation with jumps
• A simple systematic approach yields poor code quality
Translation Process
[Figure: translation process. Concrete syntax (SL) → lexical and context-free analysis → AST (SL) → context-dependent analysis → translator → AST (MIPS) → code generator → concrete syntax (MIPS)]
MIPS Abstract Syntax
Prog * Instruction
Instruction = ADD (Register reg0, Register reg1, Register reg2)
| ADDI (Register reg0, Register reg1, Const const0)
| BEQ (Register reg0, Register reg1, Label label0)
| SLT (Register reg0, Register reg1, Register reg2)
| SLTI (Register reg0, Register reg1, Const const0)
| J (Label label0)
| JR (Register reg0)
| JAL (Label label0)
| ...
Const ( Integer value )
Label ( Integer labelId )
Register = Zero () | AT () | VReg | AReg | TReg | SReg
| KReg | GP () | SP () | FP () | RA ()
VReg = V0 () | V1 ()
AReg = A0 () | A1 () | A2 () | A3 ()
...
3.1.3 Translation of Variables and Data Types
Translation of variables and data types
The compiler maps concepts of the programming language to concepts of the assembly language:
• named variables → addresses of memory regions
• complex types → index and offset computation
Translation of variables and data types (2)
The translation of variables and data types comprises:
• handling of primitive data types
• conversion of data types (e.g. int → float)
• memory organisation
• translation of arrays
• translation of records and classes
• implementation of dynamic objects
Primitive data types
Usually, the primitive data types of source languages are supported by the target machine:
• int, long → 4-byte word with integer arithmetic
• float, double → accordingly
Some data types may have to be encoded:
• boolean → 1 byte or a 4-byte word
Problems arise if the target machine does not meet the requirements of the source language, e.g.
• floating point arithmetic does not follow the IEEE standard
• overflows are not handled as the source language requires (cf. Java FP-strict expressions)
• conversion operations are missing on the target machine
Memory layout
The conceptual memory layout of most imperative programming languages and target machines is similar. (Details depend on the OS and the machine.)
[Figure: memory layout with regions for the OS kernel, intermediate results, the heap (dynamic variables, objects, ...), global values (global and static variables, constants, ...), and the program; the low-address end is marked.]
Translation of arrays
Efficient translation of arrays is important for many tasks.
One-dimensional static arrays
• Allocate memory in the segment for global data (starting at $gp)
• Address computation with base address of array, index of array element and size of element type
Consider the array declaration T tarr[57]:
• $gp contains the base address of the global memory region
• Let R_rel contain the relative address of the array tarr in the global memory region
• Let Ri contain the index i of the array component
If k = sizeof(T), then the address of tarr[i] is $gp + R_rel + k * Ri.
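The address formula can be checked against the addresses a C compiler assigns. This is a minimal sketch under stated assumptions: the struct globals is a hypothetical stand-in for the global memory region, its start address plays the role of $gp, and offsetof supplies the relative address R_rel.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical global data region; its start address stands in for $gp. */
struct globals {
    int32_t counter;
    int32_t tarr[57];
};

/* Address of tarr[i]: base of region + relative address + k * index. */
uintptr_t element_address(uintptr_t gp, size_t r_rel, size_t k, size_t i) {
    return gp + r_rel + k * i;
}
```

For a variable g of type struct globals, element_address((uintptr_t)&g, offsetof(struct globals, tarr), sizeof(int32_t), 10) yields the address of g.tarr[10].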
Translation of Arrays (2)
Computation in MIPS
li $ti, k
mul $ti, Ri, $ti
add $ti, R_rel, $ti
add $ti, $gp, $ti
lw $ti, ($ti)
More Translation of Arrays
Multi-dimensional static arrays
Consider as example the Pascal declaration
var a:array[-5..5][1..9] of integer;
which corresponds to 99 integer variables:
a[-5, 1] ... a[-5,9]
...
a[5,1] ... a[5,9]
The matrix is stored row by row in memory. Storing rows is usually more efficient than storing columns because the second index is often the one incremented in inner loops.
Further Translation of Arrays(2)
Translation of access to a[E1,E2]:
Assume results of evaluating E1 and E2 are stored in $t1 and $t2.
As a is a static array, we know the dimensions at compile time.
a[$t1,$t2] is the r-th component of a linear array with
r = ($t1 - (-5)) * ((9 - 1) + 1) + ($t2 - 1)
  = 9 * $t1 + 45 + $t2 - 1
  = 9 * $t1 + $t2 + 44
Result: Store the address of the 44th component as the base address of the array in the symbol table. Then it suffices to add 9 * $t1 + $t2 to the base address.
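The simplification can be verified by enumerating all valid index pairs of the example array. The following sketch is only an illustration; the function names are hypothetical.

```c
/* Index of a[e1, e2] in the linear layout, computed directly from the
   bounds of array[-5..5][1..9]. */
int index_direct(int e1, int e2) {
    return (e1 - (-5)) * ((9 - 1) + 1) + (e2 - 1);
}

/* The same index with the constant part folded into the number 44. */
int index_folded(int e1, int e2) {
    return 9 * e1 + e2 + 44;
}

/* Returns 1 if both formulas agree for every valid index pair. */
int formulas_agree(void) {
    for (int e1 = -5; e1 <= 5; e1++)
        for (int e2 = 1; e2 <= 9; e2++)
            if (index_direct(e1, e2) != index_folded(e1, e2))
                return 0;
    return 1;
}
```

The first component a[-5,1] gets index 0 and the last component a[5,9] gets index 98, which matches the 99 integer variables of the declaration.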
Further Translation of Arrays (3)
Code example for access to a[E1,E2]:
[Code for E1 -> $t1]
[Code for E2 -> $t2]
LI ($t3, 9)
MULT ($t1, $t1, $t3)
ADD ($t1, $t1, $t2)
LI ($t2, 4)
MULT ($t1, $t1, $t2)
ADDI ($t1, $t1, relA)
ADD ($t1, $t1, $gp)
LW ($t1, 0, $t1)
where relA = offset(a) + 4 * 44, i.e. the byte offset of the 44th component, since each integer occupies 4 bytes
General Translation of Arrays
General array declaration of dimension k
var a: array [u1..o1], ..., [uk..ok] of T;
Storing rows yields the following relative address for accessing a[R1, ..., Rk]:
r = (R1 - u1) * size(array[u2..o2, ..., uk..ok] of T)
  + (R2 - u2) * size(array[u3..o3, ..., uk..ok] of T)
  + ...
  + (Rk - uk) * size(T)
General Translation of Arrays (2)
For i = 1, ..., k-1, define
size(i) := size(array[u{i+1}..o{i+1}, ..., uk..ok] of T)
size(k) := size(T)
This implies, for i = 2, ..., k,
size(i-1) = size(i) * (oi - ui + 1)
Simplification yields:
\[ r = \sum_{i=1}^{k} R_i \cdot \mathit{size}(i) \;-\; \sum_{i=1}^{k} u_i \cdot \mathit{size}(i) \]
Only the first sum depends on values that are known at runtime, so code has to be generated only for it. The second sum is a compile-time constant.
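The split into a compile-time part and a runtime part can be sketched as follows. The function names and the use of long for bounds are assumptions made for this illustration; array index 0 in the C arrays corresponds to dimension 1 in the formulas.

```c
#include <stddef.h>

/* Compile-time part: fills size[0..k-1] with size(1)..size(k) and
   returns the constant sum of u_i * size(i). */
long array_strides(size_t k, const long *u, const long *o,
                   long elem_size, long *size) {
    long constant = 0;
    long s = elem_size;                 /* size(k) = size(T) */
    for (size_t i = k; i-- > 0; ) {
        size[i] = s;
        constant += u[i] * s;
        s *= (o[i] - u[i] + 1);         /* size(i-1) = size(i) * (o_i - u_i + 1) */
    }
    return constant;
}

/* Runtime part: sum of R_i * size(i) for the index values r[0..k-1]. */
long runtime_part(size_t k, const long *r, const long *size) {
    long sum = 0;
    for (size_t i = 0; i < k; i++)
        sum += r[i] * size[i];
    return sum;
}
```

For the Pascal example array[-5..5][1..9] of 4-byte integers, the sizes are 36 and 4 and the constant is -176, so the relative address of a[5,9] is runtime_part minus the constant, 216 + 176 = 392 = 98 * 4.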
Code Generation for Array Access
Abstract syntax of source language:
ArrayAccess ( UsedId uid, IndexExps ies )
UsedId ( Ident id )
IndexExps = IndexExpElem | IndexExp
IndexExpElem ( IndexExp ie, IndexExps ies )
IndexExp ( ... )
Code Generation for Array Access (2)
Attribution:
Attributes used in the attribution:
• Symbol table
• Result register Ri (the register that holds the result)
• Address of the array element
• Code for the subtree
• List of sizes for each array dimension
• Relative address for array a: \( \mathit{relA} = \mathit{offset}(a) - \sum_{i=1}^{k} u_i \cdot \mathit{size}(i) \)
Code Generation for Array Access (3)
Operations for attribution:
• lkupRA: Ident × SymTab → Address
• lkupSZL: Ident × SymTab → IntList
• + : List concatenation, for an element e, [e] is the list containing only e.
In the following, the SymTab attribute is only explicitly given where it is required.
Code Generation for Array Access (4)
[Figure: attribution for ArrayAccess ( UsedId, IndexExps ). For the identifier of the UsedId, the relative address RA = lkupRA(_,_) and the size list lkupSZL(_,_) are looked up in the symbol table. The size list is passed to IndexExps; each IndexExpElem uses first(_) for its own dimension and passes rest(_) on. With Ri the result register of IndexExps, the code of ArrayAccess is the code of IndexExps followed by ADD(Ri, Ri, $gp) and ADD(Ri, Ri, RA).]
Code Generation for Array Access (5)
To keep the attribution diagrams readable, names can be used for attribute values.
[Figure: attribution for IndexExpElem ( IndexExp, IndexExps ). Let RL and CL be the result register and code of the IndexExp, RR and CR those of the remaining IndexExps, and FI = first(_) the first entry of the size list; rest(_) is passed to the remaining IndexExps. The code of IndexExpElem is
CL + CR + [LOADI (RT, FI)] + [MUL (RL, RL, RT)] + [ADD (RR, RR, RL)] ]
During stepwise computation, array bounds can also be checked.
Array Access
Remarks:
• Computation of array indices offers great potential for optimizations.
• For translation of dynamic arrays, addressing has to be generalized appropriately. (cf. Wilhelm/Maurer, Sect. 2.6.2)
Translation of Records
Translation of records is similar to translation of arrays:
• Determine size and memory layout
• Compute addresses for selection of record components and pointer dereferencing
• Translation of record operations, e.g. assignments to record components
Recommended Reading: Wilhelm, Maurer, Section 2.6.2
Implementation of Dynamic Objects
Dynamic objects = dynamically allocated variables and objects in the sense of OO programming
Dynamic objects are stored on the heap:
• the number of dynamic objects is in general not known at compile time, so the objects are created at runtime
• dynamic objects have a lifetime that in general does not follow a stack discipline, so they cannot be managed on the stack
Memory representation and addressing of components are similar to static records.
Implementation of Dynamic Objects (2)
Example:
#include <stdlib.h>

typedef struct listelem {
  int head;
  struct listelem* tail;
}* list;

#define listelemSIZE sizeof(struct listelem)

/* creates a new list element with head i and tail l on the heap */
list append( int i, list l ) {
  list lvar = (list) calloc(1, listelemSIZE);
  if (lvar == NULL) return NULL;   /* allocation failed */
  lvar->head = i;
  lvar->tail = l;
  return lvar;
}
Dynamic Memory Management
Dynamic memory management
• is handled by runtime environment
• can be supported by compiler
• can partially be handled by user program
The runtime environment provides operations for dynamic memory management:
• to the programmer, e.g. malloc, calloc, realloc, free in C
• to the compiler, as in Pascal, Java, Ada
• with garbage collection by the runtime environment instead of deallocation by the programmer, e.g. in Java
Dynamic Memory Management (2)
General Problem: Provide memory blocks of different sizes from a linear memory and reuse memory after it has been freed
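Simple memory management uses a linear list of free memory areas.
Structure of a free memory area of variable length:
[Figure: a memory area consists of a header, which stores the size, followed by the user data.]
Dynamic Memory Management (3)
List of free memory areas:
[Figure: a sequence of free and used memory areas; the pointer freelist refers to the list of free areas.]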
Procedure to allocate and deallocate memory:
• Allocate memory
  - Search for a free memory area B of sufficient size
  - Update references:
    • If the area has exactly the required size, remove it from the list.
    • Otherwise, update the header of the area, create a header for the rest of the free memory, and add the rest to the list in place of the old area.
  - Return a pointer to the memory cell after the header (the size information has to be kept).
  - If no memory area of the required size is found, new memory has to be requested from the OS.
• Free memory
  - Find the header of the memory area to be freed via the pointer to this area
  - If the previous or next memory areas are free, join the areas
  - Add the resulting memory area to the list
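The procedure above can be sketched as a small first-fit allocator. This is an illustration under simplifying assumptions, not the implementation of any particular runtime environment: the names header, area_alloc and area_free are hypothetical, memory comes from a fixed static arena instead of the OS, sizes are counted in header-sized units to keep headers aligned, and the free list is kept sorted by address so that neighbouring free areas can be joined.

```c
#include <stddef.h>

typedef struct header {
    size_t units;          /* size of the user data in header-sized units */
    struct header *next;   /* next free area (used only while free) */
} header;

#define ARENA_UNITS 64
static header arena[ARENA_UNITS];   /* fixed arena instead of OS requests */
static header *freelist = NULL;
static int initialized = 0;

void *area_alloc(size_t nbytes) {
    if (!initialized) {             /* initially one big free area */
        arena[0].units = ARENA_UNITS - 1;
        arena[0].next = NULL;
        freelist = &arena[0];
        initialized = 1;
    }
    size_t n = (nbytes + sizeof(header) - 1) / sizeof(header);
    if (n == 0) n = 1;
    for (header **link = &freelist; *link != NULL; link = &(*link)->next) {
        header *b = *link;
        if (b->units < n) continue;          /* too small, keep searching */
        if (b->units >= n + 2) {             /* split: rest gets its own header */
            header *rest = b + 1 + n;
            rest->units = b->units - n - 1;
            rest->next = b->next;
            b->units = n;
            *link = rest;                    /* rest replaces the old area */
        } else {
            *link = b->next;                 /* hand out the whole area */
        }
        return b + 1;                        /* memory cell after the header */
    }
    return NULL;   /* a real allocator would request memory from the OS */
}

void area_free(void *p) {
    if (p == NULL) return;
    header *b = (header *)p - 1;             /* header precedes the user data */
    header *prev = NULL, *cur = freelist;
    while (cur != NULL && cur < b) { prev = cur; cur = cur->next; }
    b->next = cur;
    if (prev == NULL) freelist = b; else prev->next = b;
    if (cur != NULL && b + 1 + b->units == cur) {      /* join with next area */
        b->units += 1 + cur->units;
        b->next = cur->next;
    }
    if (prev != NULL && prev + 1 + prev->units == b) { /* join with previous */
        prev->units += 1 + b->units;
        prev->next = b->next;
    }
}
```

After two allocations are freed again, the neighbouring areas are joined into one, so a later request of the first size is served from the start of the arena again.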
Dynamic Memory Management (5)
Remarks:
• If a program writes beyond its assigned memory area, references or size information can be destroyed, with severe consequences.
• If memory cannot be allocated at byte granularity, alignment restrictions have to be obeyed.
• For practical use, the above principle can be improved by
  - non-linear search
  - searching for areas of exactly matching size, which reduces fragmentation
  - support for joining memory areas after deallocation
3.1.4 Translation of Expressions
Translation of Expressions
Difficulties for translation of expressions
• Management of intermediate results on the stack or in registers
• Translation of source language operations
  - no counterpart in the target language
  - addressing
  - context-dependent (a Boolean expression used as a condition is handled differently than a Boolean expression in an assignment)
Translation of Expressions (2)
Abstract Syntax of Expressions:
The general problems are demonstrated with a small example of direct expression translation.
Exp = ArithmExp | Relation | IntConst | CharConst | ArrayAccess | Var
ArithmExp = Add | Sub
Add, Sub ( Exp left, Exp right )
Relation = Lt | Eq
Lt, Eq ( Exp left, Exp right )
IntConst ( Int i )
CharConst ( Char c )
ArrayAccess ( UsedId uid, Exp e )
Var ( UsedId uid )
UsedId ( Ident id )
Translation of Expressions (3)
Design decisions:
• Intermediate results are stored on the stack.
• Comparisons are implemented by jumps:
  - compare the values on the stack
  - depending on the result, jump to an instruction that pushes 1 or to one that pushes 0
  - generate the associated labels
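These design decisions can be sketched as a recursive code generator for a tiny expression tree. Everything here is hypothetical and only meant as an illustration: the target is an imaginary stack machine with the instructions push, add, sub, jlt and jump rather than MIPS, and the label names PUSH1_ and ENDREL_ follow the naming that appears in the attribution slides.

```c
#include <stdio.h>
#include <string.h>

typedef enum { E_CONST, E_ADD, E_LT } Kind;

typedef struct Exp {
    Kind kind;
    int value;                        /* used by E_CONST only */
    const struct Exp *left, *right;   /* used by E_ADD and E_LT */
} Exp;

static int label_counter = 0;         /* source of unique label numbers */

/* Appends one line of code to the string in out (capacity cap). */
static void emit(char *out, size_t cap, const char *line) {
    size_t len = strlen(out);
    if (len + 1 < cap)
        snprintf(out + len, cap - len, "%s\n", line);
}

/* Generates code that leaves the value of e on top of the stack. */
void gen(const Exp *e, char *out, size_t cap) {
    char line[64];
    switch (e->kind) {
    case E_CONST:
        snprintf(line, sizeof line, "push %d", e->value);
        emit(out, cap, line);
        break;
    case E_ADD:
        gen(e->left, out, cap);
        gen(e->right, out, cap);
        emit(out, cap, "add");        /* pops two values, pushes the sum */
        break;
    case E_LT: {
        int m = label_counter++;      /* unique label number for this node */
        gen(e->left, out, cap);
        gen(e->right, out, cap);
        emit(out, cap, "sub");        /* pops two values, pushes left - right */
        snprintf(line, sizeof line, "jlt PUSH1_%d", m);   /* pop, jump if < 0 */
        emit(out, cap, line);
        emit(out, cap, "push 0");
        snprintf(line, sizeof line, "jump ENDREL_%d", m);
        emit(out, cap, line);
        snprintf(line, sizeof line, "PUSH1_%d:", m);
        emit(out, cap, line);
        emit(out, cap, "push 1");
        snprintf(line, sizeof line, "ENDREL_%d:", m);
        emit(out, cap, line);
        break;
    }
    }
}
```

The caller passes a buffer that starts as an empty string. The code for the subexpressions always comes first, so their results are on the stack when the operation for the parent node runs.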
Translation of Expressions (4)
Attribution:
Attribute declarations:
• RA: relative address of a variable or array
• TV: type of an expression (int, char, int[], char[])
• C (CL, CR for the left and right subtree): code for the subtree, of type CodeList
• M: unique label for an expression, of type String
Attribution for the code attribute (MI code):
Add:
  CL + CR +
  [ Add2( W, Postinc(SP), Regdef(SP) ) ]
Lt:
  CL + CR +
  [ Sub2( W, Postinc(SP), Regdef(SP) ) ] +
  [ Jlt( Label( "PUSH1_" + M ) ) ] +
  [ Move( W, Imm(0), Regdef(SP) ) ] +
  [ Jump( Label( "ENDREL_" + M ) ) ] +
  [ Label( "PUSH1_" + M ) ] +
  [ Move( W, Imm(1), Regdef(SP) ) ] +
  [ Label( "ENDREL_" + M ) ]
(The attributions for Sub and Eq are analogous.)
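The attribution for the code attribute can be read as a recursive function over the syntax tree. The following is a minimal sketch in Python, assuming a hypothetical tree representation (`IntConst`, `Add`, `Lt` as dataclasses) and a hypothetical global counter as the source of the unique label M; instructions are kept as plain strings in the MI notation of the slides.

```python
from dataclasses import dataclass
from itertools import count


@dataclass
class IntConst:
    value: int


@dataclass
class Add:
    left: object
    right: object


@dataclass
class Lt:
    left: object
    right: object


_labels = count(1)  # assumption: a simple counter supplies the unique label M


def code(e):
    """Return the instruction list (code attribute) for expression e."""
    if isinstance(e, IntConst):
        return [f"Move(W, Imm({e.value}), Predec(SP))"]
    cl, cr = code(e.left), code(e.right)  # CL and CR of the two subtrees
    if isinstance(e, Add):
        return cl + cr + ["Add2(W, Postinc(SP), Regdef(SP))"]
    if isinstance(e, Lt):
        m = str(next(_labels))
        return cl + cr + [
            "Sub2(W, Postinc(SP), Regdef(SP))",
            f'Jlt(Label("PUSH1_{m}"))',
            "Move(W, Imm(0), Regdef(SP))",
            f'Jump(Label("ENDREL_{m}"))',
            f'Label("PUSH1_{m}")',
            "Move(W, Imm(1), Regdef(SP))",
            f'Label("ENDREL_{m}")',
        ]
    raise TypeError(f"unsupported node: {e!r}")


for line in code(Lt(Add(IntConst(1), IntConst(2)), IntConst(4))):
    print(line)
```

Because the lists are simply concatenated, the code of the operands always precedes the code of the operator, which is what keeps the intermediate results in the right order on the stack.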
Translation of Expressions (5)
Attribution for the code attribute of Add (load/store code):
Add:
  CL + CR +
  [ LOAD (R2, 0, $sp)
    ADD ($sp, $sp, 4)
    LOAD (R1, 0, $sp)
    ADD (R1, R1, R2)
    STORE (R1, 0, $sp) ]
(The attributions for Sub and Eq are analogous.)
Translation of Expressions (6)
Attribution for the code attribute of Lt (load/store code):
Lt:
  CL + CR +
  [ LOAD (R2, 0, $sp)
    ADD ($sp, $sp, 4)
    LOAD (R1, 0, $sp)
    SLT (R1, R1, R2)
    BEQ (R1, $zero, "PUSH_0_"+M)
    LOADI (R1, 1)
    STORE (R1, 0, $sp)
    JUMP ("ENDREL_"+M)
    LABEL ("PUSH_0_"+M)
    LOADI (R1, 0)
    STORE (R1, 0, $sp)
    LABEL ("ENDREL_"+M) ]
(The attributions for Sub and Eq are analogous.)
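To check the stack effect of the Lt sequence, it can be run on a toy interpreter. The following is a minimal sketch; the instruction semantics (a word stack that grows downward, ADD accepting an immediate operand, SLT producing 1 or 0) are assumptions inferred from the slides, and `lt_code` and `run` are hypothetical helper names.

```python
def lt_code(m):
    """Instruction list for Lt with unique label suffix m (operands already pushed)."""
    return [
        ("LOAD", "R2", 0, "$sp"),          # R2 := right operand (top of stack)
        ("ADD", "$sp", "$sp", 4),          # pop one word
        ("LOAD", "R1", 0, "$sp"),          # R1 := left operand
        ("SLT", "R1", "R1", "R2"),         # R1 := 1 if left < right else 0
        ("BEQ", "R1", "$zero", "PUSH_0_" + m),
        ("LOADI", "R1", 1),
        ("STORE", "R1", 0, "$sp"),         # overwrite left operand with 1
        ("JUMP", "ENDREL_" + m),
        ("LABEL", "PUSH_0_" + m),
        ("LOADI", "R1", 0),
        ("STORE", "R1", 0, "$sp"),         # overwrite left operand with 0
        ("LABEL", "ENDREL_" + m),
    ]


def run(program, left, right):
    """Push left and right (the effect of CL and CR), run program, return (top, $sp)."""
    regs = {"$zero": 0, "$sp": 1000, "R1": 0, "R2": 0}
    mem = {}
    for v in (left, right):
        regs["$sp"] -= 4
        mem[regs["$sp"]] = v
    labels = {ins[1]: i for i, ins in enumerate(program) if ins[0] == "LABEL"}

    def val(x):  # register name or immediate value
        return regs[x] if isinstance(x, str) else x

    pc = 0
    while pc < len(program):
        op, *a = program[pc]
        pc += 1
        if op == "LOAD":
            regs[a[0]] = mem[regs[a[2]] + a[1]]
        elif op == "STORE":
            mem[regs[a[2]] + a[1]] = regs[a[0]]
        elif op == "LOADI":
            regs[a[0]] = a[1]
        elif op == "ADD":
            regs[a[0]] = val(a[1]) + val(a[2])
        elif op == "SLT":
            regs[a[0]] = 1 if val(a[1]) < val(a[2]) else 0
        elif op == "BEQ" and val(a[0]) == val(a[1]):
            pc = labels[a[2]]
        elif op == "JUMP":
            pc = labels[a[0]]
        # LABEL and a BEQ that is not taken fall through
    return mem[regs["$sp"]], regs["$sp"]


print(run(lt_code("1"), 3, 5))  # (1, 996)
print(run(lt_code("1"), 5, 3))  # (0, 996)
```

In both cases the two operand words are replaced by a single result word, so the sequence leaves the stack exactly one word shorter than it found it.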
Translation of Expressions (7)
Attribution for the code attribute (MI code):
IntConst:
  [ Move( W, Imm(_), Predec(SP) ) ]      (_ is the value of the Int child)
Var:
  if TV = int then
    [ Move( W, Bdisp(Reg(R0), RA), Predec(SP) ) ]
  else // TV = char
    [ Conv( Bdisp(Reg(R0), RA), Predec(SP) ) ]
ArrayAccess:
  CR +
  [ Move( W, Regdef(SP), Reg(R1) ) ] +
  if TV = int then
    [ Move( W, Bdispx( Reg(R0), Reg(R1), RA ), Regdef(SP) ) ]
  else // TV = char
    [ Conv( Bdispx( Reg(R0), Reg(R1), RA ), Regdef(SP) ) ]
Note: The attribution of Var and ArrayAccess generates code that pushes the value of the expression onto the stack, not code for addressing the access.
IntConst (load/store code):
  [ LOADI (Ri, int) ] +
  [ SUB ($sp, $sp, 4) ] +
  [ STORE (Ri, 0, $sp) ]
Translation of Expressions (8)
Var (load/store code):
  if TV = int then
    [ SUB ($sp, $sp, 4)
      LOADI (R1, RA)
      ADD (R1, R1, $gp)
      LOAD (R2, 0, R1)
      STORE (R2, 0, $sp) ]
  else // TV = char
    [ SUB ($sp, $sp, 1)
      LOADI (R1, RA)
      ADD (R1, R1, $gp)
      LOAD (R2, 0, R1)
      STOREB (R2, 0, $sp) ]
Translation of Expressions (9)
ArrayAccess (load/store code):
  CR +
  [ LOAD (R1, 0, $sp)
    LOADI (R2, RA)
    ADD (R1, R1, R2)
    ADD (R1, R1, $gp) ] +
  if TV = int then
    [ LOAD (R2, 0, R1)
      STORE (R2, 0, $sp) ]
  else // TV = char
    [ LOADB (R2, 0, R1)
      STOREB (R2, 0, $sp) ]
Improvements
• Improvement of the generated code by
  - storing intermediate results in registers
  - context-dependent, optimizing instruction selection
  - avoiding redundant computations by evaluating common subexpressions only once
• Improvement of the translation technique by using an intermediate language
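As an illustration of evaluating common subexpressions only once, the following minimal sketch emits three-address code and reuses the temporary of any subexpression it has already seen. It is not the technique of the lecture, only one simple way to realize the idea; it assumes side-effect-free expressions, a hypothetical tuple encoding `(op, left, right)` with variable names as strings, and hypothetical temporaries `t1`, `t2`, ...

```python
def emit(expr, table, out):
    """Return the name holding expr; emit code only for subexpressions not seen before."""
    if isinstance(expr, str):  # a variable is already available under its own name
        return expr
    op, left, right = expr
    key = (op, emit(left, table, out), emit(right, table, out))
    if key not in table:       # first occurrence: compute it into a fresh temporary
        table[key] = f"t{len(table) + 1}"
        out.append(f"{table[key]} = {key[1]} {op} {key[2]}")
    return table[key]


out = []
emit(("+", ("+", "a", "b"), ("+", "a", "b")), {}, out)
print(out)  # ['t1 = a + b', 't2 = t1 + t1']
```

The temporaries also play the role of registers holding intermediate results, so no stack traffic is needed for this expression.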
3.1.5 Translation of Statements
Translation of Statements
Most statements can be translated by translation schemes with jumps:
While (CE: code of the condition Exp, CS: code of the body Stat, M: unique label):
MI code:
  [ Label( "BEGWHILE_" + M ) ] +
  CE +
  [ Cmp( W, Imm(0), Postinc(SP) ) ] +
  [ Jeq( Label( "ENDWHILE_" + M ) ) ] +
  CS +
  [ Jump( Label( "BEGWHILE_" + M ) ) ] +
  [ Label( "ENDWHILE_" + M ) ]
Load/store code:
  [ LABEL ("BEGWHILE_"+M) ] +
  CE +
  [ LOAD (R1, 0, $sp)
    ADD ($sp, $sp, 4)
    BEQ (R1, $zero, "ENDWHILE_"+M) ] +
  CS +
  [ JUMP ("BEGWHILE_"+M) ] +
  [ LABEL ("ENDWHILE_"+M) ]
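The While scheme composes directly with itself, provided that every loop gets its own label M. The following minimal sketch shows this for two nested loops; `while_code` is a hypothetical helper, the counter is an assumed source of unique labels, and the placeholder strings stand in for the code lists CE and CS.

```python
from itertools import count

_fresh = count(1)  # assumption: a counter supplies the unique label M


def while_code(cond_code, body_code):
    """Build the instruction list for a while loop from CE and CS."""
    m = str(next(_fresh))
    return (
        [f'LABEL ("BEGWHILE_{m}")']
        + cond_code
        + [
            "LOAD (R1, 0, $sp)",                  # fetch the condition value
            "ADD ($sp, $sp, 4)",                  # pop it
            f'BEQ (R1, $zero, "ENDWHILE_{m}")',   # leave the loop if it is 0
        ]
        + body_code
        + [f'JUMP ("BEGWHILE_{m}")', f'LABEL ("ENDWHILE_{m}")']
    )


inner = while_code(["<CE inner>"], ["<CS inner>"])
outer = while_code(["<CE outer>"], inner)
print("\n".join(outer))
```

Without distinct labels, the jump at the end of the inner loop could not be told apart from the one of the outer loop, which is why the unique label attribute is needed.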
More Complex Translation of Statements
Harder to translate well are switch statements and non-strict expressions, which should also be handled efficiently.
We consider the translation of non-strict Boolean expressions as an example of an optimizing translation and of the use of context (inherited) information.
Example: Abstract Syntax
We consider the following language fragment:
Stat = While | IfThenElse | ...
BExp = And | Or | Not | StrictExp
While ( BExp c, Stat b )
IfThenElse ( BExp c, Stat then, Stat else )
And, Or ( BExp left, BExp right )
Not ( BExp e )
StrictExp ( Exp e )
More Complex Translation of Statements (2)
A program fragment:
if( (B1 || B2) && !B3 ) {
    while( !(B4 || B5) ) A1
} else {
    A2
}
where
• A1, A2 are statements
• B1 to B5 are strict expressions
As in C and Java, the Boolean operators || and && are non-strict: for example, if B1 and B2 both evaluate to false, B3 need not and must not be evaluated.
In addition, jump chains, i.e. jumps to unconditional jump instructions, are to be avoided.
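One common way to translate non-strict Boolean expressions passes the jump targets for "true" and "false" down the tree as inherited information. The following is a minimal sketch of that textbook idea, not necessarily the attribution developed in the lecture; the tuple encoding of BExp, the pseudo-instructions `IF ... GOTO`, `GOTO` and `LABEL`, and the helper names are all assumptions made for illustration.

```python
from itertools import count


def make_fresh():
    """Return a function that yields a new label name on every call."""
    c = count(1)
    return lambda: f"L{next(c)}"


def jumping_code(b, t, f, out, fresh):
    """Emit code that jumps to label t if b is true and to label f otherwise."""
    kind = b[0]
    if kind == "and":
        mid = fresh()                         # reached only if the left operand is true
        jumping_code(b[1], mid, f, out, fresh)
        out.append(f"LABEL {mid}")
        jumping_code(b[2], t, f, out, fresh)
    elif kind == "or":
        mid = fresh()                         # reached only if the left operand is false
        jumping_code(b[1], t, mid, out, fresh)
        out.append(f"LABEL {mid}")
        jumping_code(b[2], t, f, out, fresh)
    elif kind == "not":
        jumping_code(b[1], f, t, out, fresh)  # negation just swaps the targets
    else:                                     # ("strict", name)
        out.append(f"IF {b[1]} GOTO {t}")
        out.append(f"GOTO {f}")


cond = ("and", ("or", ("strict", "B1"), ("strict", "B2")), ("not", ("strict", "B3")))
out = []
jumping_code(cond, "THEN", "ELSE", out, make_fresh())
print("\n".join(out))
```

For the condition of the example, B3 is only reached via the label placed after the code for B1 || B2, so it is skipped whenever both B1 and B2 are false. The sketch still emits unconditional jumps to the immediately following label; removing those would be a further improvement.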