Compilers and Language Processing Tools
Summer Term 2013
Arnd Poetzsch-Heffter Annette Bieniusa
Software Technology Group TU Kaiserslautern
© Arnd Poetzsch-Heffter
Content of Lecture
1. Introduction
2. Syntax and Type Analysis
2.1 Lexical Analysis
2.2 Context-Free Syntax Analysis
2.3 Context-Dependent Analysis (Semantic Analysis)
3. Translation to Intermediate Representation
3.1 Languages for Intermediate Representation
3.2 Translation of Imperative Language Constructs
3.3 Translation of Object-Oriented Language Constructs
3.4 Translation of Procedures
4. Optimization and Code Generation
4.1 Assembly and Machine Code
4.2 Optimization
4.3 Register Allocation
4.4 Further Aspects
5. Selected Topics in Compiler Construction
5.1 Garbage Collection
5.2 Just-in-time Compilation
5.3 XML Processing (DOM, SAX, XSLT)
4. Optimization and Code Generation
Chapter Outline
4. Optimization and Code Generation
4.1 Assembly and Machine Languages
4.2 Optimization
4.3 Register Allocation
4.4 Further Aspects
Learning objectives
• Introduction to assembly and machine languages
• Different optimization techniques
• Different static analysis techniques
• Register allocation
• Further aspects of code generation
4.1 Assembly and Machine Languages
Introduction
Assembly languages have the following language constructs:
• Finite sequences of bits of various lengths: byte, word, halfword, ...
• Global memory
  - registers, flags (addressed by name)
  - indexed, mostly word-addressed main memory
• Instructions
  - load, store
  - arithmetic and boolean operations
  - execution control (jumps, procedures)
  - simple, not combined statements
  - possibly complex addressing of operands
• Initialization instructions
The MIPS Assembler
MIPS: Microprocessor without Interlocked Pipeline Stages
• RISC architecture, originally 32 bit (64 bit since 1991)
• developed by John Hennessy (Stanford) starting 1981
• SPIM Simulator
MIPS Architecture
• Arithmetic-Logic Unit (ALU)
• Floating-Point Unit (FPU)
• 32 registers (incl. stack pointer, frame pointer, global pointer, return address)
• Main memory, 2^30 memory words (4 bytes each)
• 5-stage pipeline
[Figure: five-stage MIPS pipeline with the stages Instruction Fetch (IF), Instruction Decode / Register Fetch (ID), Execute / Address Calculation (EX), Memory Access (MEM), Write Back (WB). Image: Wikipedia]
Memory Structure
Address range              Segment
0x80000000 - 0xFFFFFFFF    Reserved for OS
up to 0x7FFFFFFF           Stack Segment ($sp marks the current top of the stack)
                           free
                           Heap Segment
from 0x10000000            Data Segment
from 0x00400000            Text Segment
from 0x00000000            Reserved
Data Types and Literals in MIPS Assembly Language
Data Types
• Instructions are all 32 bits
• byte (8 bits), halfword (2 bytes), word (4 bytes)
• integer (1 word storage)
• single precision floats (1 word storage)
• double precision floats (2 word storage)
Literals
• Integers (e.g. 4, 2, -236, 0x44)
• Floats (e.g. 3.41, -0.323e5)
• Characters in single quotes, e.g. 'b'
• Strings in double quotes, e.g. "Hello World"
MIPS Registers
No      Name         P*    Description
0       $zero        -     the constant 0
1       $at          -     assembler temporary (reserved by the assembler)
2-3     $v0, $v1     no    values for function results and expression evaluation
4-7     $a0 - $a3    no    arguments for subroutine calls
8-15    $t0 - $t7    no    temporaries
16-23   $s0 - $s7    yes   saved temporaries
24-25   $t8 - $t9    no    additional temporaries
26-27   $k0, $k1     no    reserved for OS kernel
28      $gp          yes   global pointer
29      $sp          yes   stack pointer
30      $fp          yes   frame pointer
31      $ra          yes   return address
*P: callee must preserve value
MIPS Instruction Format
• Instructions are always 32 bit
• Opcode in first 6 bits
• 3 types of instructions: R-, I-, and J-instructions
R-Instructions
opcode (6) rs (5) rt (5) rd (5) shamt (5) funct (6)
I-Instructions
opcode (6) rs (5) rt (5) immediate (16)
J-Instructions
opcode (6) address (26)
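The three layouts can be sketched as bit-packing functions in C. This is only an illustration: the function names are hypothetical, and the concrete opcode and funct values used in the usage note (funct 0x20 for add, opcode 8 for addi, opcode 2 for j) come from the MIPS32 instruction set, not from these slides.

```c
#include <stdint.h>

/* R-instruction: opcode(6) rs(5) rt(5) rd(5) shamt(5) funct(6). */
uint32_t encode_r(uint32_t opcode, uint32_t rs, uint32_t rt,
                  uint32_t rd, uint32_t shamt, uint32_t funct) {
    return (opcode & 0x3Fu) << 26 | (rs & 0x1Fu) << 21 | (rt & 0x1Fu) << 16 |
           (rd & 0x1Fu) << 11 | (shamt & 0x1Fu) << 6 | (funct & 0x3Fu);
}

/* I-instruction: opcode(6) rs(5) rt(5) immediate(16).
   The signed immediate is stored in two's complement. */
uint32_t encode_i(uint32_t opcode, uint32_t rs, uint32_t rt, int16_t imm) {
    return (opcode & 0x3Fu) << 26 | (rs & 0x1Fu) << 21 | (rt & 0x1Fu) << 16 |
           (uint32_t)(uint16_t)imm;
}

/* J-instruction: opcode(6) address(26). */
uint32_t encode_j(uint32_t opcode, uint32_t address) {
    return (opcode & 0x3Fu) << 26 | (address & 0x03FFFFFFu);
}
```

With the register numbers from the table above ($t0 = 8, $t1 = 9, $t2 = 10, $sp = 29), encode_r(0, 9, 10, 8, 0, 0x20) yields the word for add $t0, $t1, $t2, and encode_i(8, 29, 29, -4) the word for addi $sp, $sp, -4.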
MIPS Instructions
In the following, let r1, r2, r3 be registers (e.g. $s1, $t3) and let c be a constant value (e.g. 4, 100, -4).
Arithmetic
add             add r1, r2, r3    r1 = r2 + r3
subtract        sub r1, r2, r3    r1 = r2 - r3
add immediate   addi r1, r2, c    r1 = r2 + c
multiply        mul r1, r2, r3    r1 = r2 * r3 (lower 32 bits of result)
move            move r1, r2       r1 = r2 (same effect as addi r1, r2, 0)
MIPS Instructions (2)
Data Transfer
load word        lw r1, c(r2)    r1 = Memory[r2 + c]
store word       sw r1, c(r2)    Memory[r2 + c] = r1
load immediate   li r1, c        r1 = c
load half        lh r1, c(r2)    r1 = Memory[r2 + c] (halfword)
store half       sh r1, c(r2)    Memory[r2 + c] = r1 (halfword)
load byte        lb r1, c(r2)    r1 = Memory[r2 + c] (byte)
store byte       sb r1, c(r2)    Memory[r2 + c] = r1 (byte)
MIPS Instructions (3)
Logical
and                   and r1, r2, r3    r1 = r2 & r3
or                    or r1, r2, r3     r1 = r2 | r3
nor                   nor r1, r2, r3    r1 = ~(r2 | r3)
and immediate         andi r1, r2, c    r1 = r2 & c
or immediate          ori r1, r2, c     r1 = r2 | c
shift left logical    sll r1, r2, c     r1 = r2 << c
shift right logical   srl r1, r2, c     r1 = r2 >> c
MIPS Instructions (4)
Conditional Branches
branch on equal             beq r1, r2, label    if (r1 == r2) goto label
branch on not equal         bne r1, r2, label    if (r1 != r2) goto label
set on less than            slt r1, r2, r3       if (r2 < r3) r1 := 1 else r1 := 0
set on less than immediate  slti r1, r2, c       if (r2 < c) r1 := 1 else r1 := 0
Unconditional Branches
jump                        j label              goto label
jump register               jr r1                goto r1
jump and link               jal label            $ra = PC + 4; goto label
Subroutine Calls
Subroutine call (jump and link)
jal label # jump and link
• copy the address of the next instruction (PC + 4) to $ra
• jump to label
• Note: store $ra on the stack before the call, since jal overwrites it
Subroutine return (jump register)
jr $ra # jump register
• jump to return address in $ra
Working with the Stack
Push data on the stack
sw   $ra, ($sp)       # save return address on stack
addi $sp, $sp, -4     # decrement stack pointer
sw   $fp, ($sp)       # save frame pointer on stack
addi $sp, $sp, -4     # decrement stack pointer
Pop data from the stack
addi $sp, $sp, 4      # increment stack pointer
lw   $fp, ($sp)       # pop saved frame pointer
addi $sp, $sp, 4      # increment stack pointer
lw   $ra, ($sp)       # pop saved return address
Addressing in MIPS
• Immediate: Operand is a constant, e.g. 25
• Register: Operand is a register, e.g. $s2
• Base or Displacement Addressing: Operand is a memory location whose address is the sum of a register and a constant, e.g. 8($sp)
• PC relative: Address is the sum of the PC and a constant
• Pseudodirect Addressing: Jump address is formed from the 26-bit address field of the instruction and the upper bits of the PC
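The address computations behind the last three modes can be sketched in C as follows. The function names are hypothetical; the details that go beyond the slide (offsets are sign-extended, branch offsets count words relative to PC + 4, jump fields are shifted left by two bits) are taken from the MIPS32 architecture and should be read as assumptions of this sketch.

```c
#include <stdint.h>

/* Base/displacement: register value plus sign-extended 16-bit constant. */
uint32_t base_displacement(uint32_t base, int16_t offset) {
    return base + (uint32_t)(int32_t)offset;
}

/* PC-relative: the 16-bit offset counts words relative to the
   instruction that follows the branch (PC + 4). */
uint32_t branch_target(uint32_t pc, int16_t offset) {
    return pc + 4u + ((uint32_t)(int32_t)offset << 2);
}

/* Pseudodirect: upper 4 bits of PC + 4, then the 26-bit address field,
   then two zero bits (instructions are word-aligned). */
uint32_t jump_target(uint32_t pc, uint32_t addr26) {
    return ((pc + 4u) & 0xF0000000u) | ((addr26 & 0x03FFFFFFu) << 2);
}
```

For example, a jump located at 0x00400010 whose address field is 0x00100000 targets 0x00400000, the start of the text segment.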
Syscalls for MARS/SPIM Simulators
How to use System Calls:
• load service number into register $v0
• load argument values, if any into $a0, $a1, $a2
• issue the call instruction syscall
• retrieve return values, if any
Example:
li   $v0, 1      # print integer
move $a0, $t0    # load value into $a0
syscall
List of System Services
Service                        Code in $v0   Arguments
print integer                  1             $a0 = integer to print
print string                   4             $a0 = address of null-terminated string to print
exit (terminate execution)     10
print character                11            $a0 = character to print
exit2 (terminate with value)   17            $a0 = termination result
MIPS Assembly Program Structure
.data # data declarations follow this line
# ...
.text # instructions follow this line
# ...
main: # indicates the first instruction to execute
# ...
Data Declarations
Format
<name>: <type> (<initial values> | <allocated space>)
Example
.data                      # data declarations follow
var:    .word 3            # integer variable with initial value 3
array1: .byte 'a','b'      # 2-element character array initialized
                           # with 'a' and 'b'
array2: .space 40          # allocate 40 bytes, uninitialized
Example: Translation to MIPS
The example illustrates the MIPS assembler and typical translation tasks.
Code quality is not considered.
Source Code in C
char a[3], b[3];
int i;
char res;
int main() {
  i = 2;
  res = 1;
  while( -1 < i ) {
    if( res ) {
      res = (a[i]==b[i]);
      i = i-1;
    } else {
      i = i-1;
    }
  }
  return res;
}
Source Code in C with Labels
char a[3], b[3];
int i;
char res;
int main() {
main:     i = 2;
          res = 1;
loop:     while( -1 < i ) {
            if( res ) {
              res = (a[i]==b[i]);
after:        i = i-1;
            } else {
elseif:       i = i-1;
            } // afterif:
          }
exit:     return res;
}
Source Code in C with Gotos
char a[3], b[3];
int i;
char res;
int main() {
          i = 2;
          res = 1;
loop:     if (! (-1 < i ))
            goto exit;
          if( !res )
            goto elseif;
          if (a[i]==b[i])
            goto equal;
          res = 0;
          goto after;
equal:    res = 1;
after:    i = i-1;
          goto afterif;
elseif:   i = i-1;
afterif:  goto loop;
exit:     return res;
}
MIPS Program
# sp + 0 : i
# sp + 4 : res
# sp + 5 : base address of a[3]
# sp + 8 : base address of b[3]
main:
    addi $sp, $sp, -12       # make space for the variables
    li   $t1, 2
    sw   $t1, 0($sp)         # i = 2
    li   $t1, 1
    sb   $t1, 4($sp)         # res = 1 (stored at sp + 4)
MIPS Program (2)
loop:
    lw   $t2, 0($sp)         # load i into $t2
    li   $t3, -1             # load -1 into $t3
    slt  $t0, $t3, $t2       # -1 < i ?
    beq  $t0, $zero, exit    # if not -1 < i goto exit
    lb   $t1, 4($sp)         # load res from stack into $t1
    beq  $t1, $zero, elseif  # if res == 0 goto elseif
    addi $t4, $sp, 5         # base address of array a
    add  $t4, $t4, $t2       # add offset / array index
    lb   $t0, 0($t4)         # load a[i]
    addi $t4, $sp, 8         # base address of array b
    add  $t4, $t4, $t2       # add offset / array index
    lb   $t1, 0($t4)         # load b[i]
    beq  $t0, $t1, equal     # if a[i] == b[i] goto equal
    sb   $zero, 4($sp)       # set res to 0
    j    after
MIPS Program (3)
equal:
    addi $t3, $zero, 1       # $t3 = 1
    sb   $t3, 4($sp)         # res = $t3
after:
    addi $t2, $t2, -1        # i = i-1
    sw   $t2, 0($sp)         # store i at sp + 0
    j    afterif             # goto end of if statement
elseif:
    addi $t2, $t2, -1        # i = i-1
    sw   $t2, 0($sp)         # store i at sp + 0
afterif:
    j    loop                # return to loop
exit:
    lb   $a0, 4($sp)         # terminate with exit code res (res is a single byte)
    addi $sp, $sp, 12        # reset stack pointer
    li   $v0, 17             # service 17: exit2
    syscall
Translation to MIPS
Remarks:
The example illustrates typical translation tasks:
• Translation of data types, memory management, addressing
• Translation of expressions, management of intermediate results, mapping of operations of the source language to operations of the target language
• Translation of statements by implementation with jumps
• A simple, systematic approach yields poor code quality
MIPS Abstract Syntax
Prog * Instruction
Instruction = ADD (Register reg0, Register reg1, Register reg2)
| ADDI (Register reg0, Register reg1, Const const0)
| BEQ (Register reg0, Register reg1, Label label0)
| SLT (Register reg0, Register reg1, Register reg2)
| SLTI (Register reg0, Register reg1, Const const0)
| J (Label label0)
| JR (Register reg0)
| JAL (Label label0)
...
Const ( Integer value )
Label ( Integer labelId )
Register = Zero () | AT () | VReg | AReg | TReg | SReg
         | KReg | GP () | SP () | FP () | RA ()
VReg = V0 () | V1 ()
AReg = A0 () | A1 () | A2 () | A3 ()
...
4.2 Optimization
Optimization
Optimization refers to improving the code with the following goals:
• Runtime behavior
• Memory consumption
• Size of code
• Energy consumption
Optimization (2)
We distinguish the following kinds of optimizations:
• machine-independent optimizations
• machine-dependent optimizations (exploit properties of a particular real machine)
and
• local optimizations
• intra-procedural optimizations
• inter-procedural/global optimizations
Remark on Optimization
Appel (Chap. 17, p 350):
"In fact, there can never be a complete list [of optimizations]. "
"Computability theory shows that it will always be possible to invent new optimizing transformations."
4.2.1 Classical Optimization Techniques
Constant propagation
If the value of a variable is constant, the variable can be replaced with the constant.
Constant folding
Evaluate all expressions with constants as operands at compile time.
Iteration of Constant Folding and Propagation:
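How propagation and folding feed each other within one basic block can be sketched in C. Everything here is an assumption made for illustration: the Instr and Operand types, the limit NVARS, and the convention that a folded instruction is rewritten as dst := r + 0. Integer overflow is ignored.

```c
#include <stdbool.h>
#include <stddef.h>

#define NVARS 8   /* hypothetical upper bound on variable indices */

/* val is the constant itself or, if is_const is false, a variable index. */
typedef struct { bool is_const; int val; } Operand;
/* Three-address instruction dst := a op b with op in {'+', '-', '*'}. */
typedef struct { int dst; char op; Operand a, b; } Instr;

/* One forward pass over a basic block: operands with a known constant
   value are replaced by it (propagation), and instructions whose operands
   are both constants are evaluated (folding). A folded result is itself
   propagated into later instructions. */
void fold_block(Instr *code, size_t n) {
    bool known[NVARS] = { false };
    int value[NVARS] = { 0 };
    for (size_t i = 0; i < n; i++) {
        Operand *ops[2] = { &code[i].a, &code[i].b };
        for (int k = 0; k < 2; k++) {
            if (!ops[k]->is_const && known[ops[k]->val]) {
                ops[k]->val = value[ops[k]->val];
                ops[k]->is_const = true;
            }
        }
        int d = code[i].dst;
        if (code[i].a.is_const && code[i].b.is_const) {
            int x = code[i].a.val, y = code[i].b.val;
            int r = code[i].op == '+' ? x + y
                  : code[i].op == '-' ? x - y : x * y;
            code[i].op = '+';          /* rewrite as dst := r + 0 */
            code[i].a.val = r;
            code[i].b.val = 0;
            known[d] = true;
            value[d] = r;
        } else {
            known[d] = false;          /* dst no longer has a known value */
        }
    }
}
```

Applied to t0 := 3 + 4; t1 := t0 * 2; t2 := t1 + t3, the pass turns the first two instructions into the constants 7 and 14 and rewrites the third to t2 := 14 + t3.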
Non-local constant optimization
For each program position, the possible values for each variable are required. If the set of possible values is infinite, it has to be abstracted appropriately.
Copy propagation
Eliminate all copies of variables, i.e., if several variables x, y, z are known to have the same value at a program position, all uses of y and z are replaced by x.
Copy propagation (2)
This can also be done at join points of control flow or for loops:
For each program point, the information which variables have the same value is required.
Common subexpression elimination
If an expression or a statement contains the same partial expression several times, the goal is to evaluate this subexpression only once.
Common subexpression elimination (2)
Optimization of a basic block is done after transformation to SSA and construction of a DAG:
Common subexpression elimination (3)
Remarks:
• The elimination of repeated computations is often done before the transformation to three-address code (3AC), but can also be reasonable following other transformations.
• The DAG representation of expressions is also used as intermediate language by some authors.
Algebraic optimizations
Algebraic laws can be applied in order to be able to use other optimizations. For example, use associativity and commutativity of addition:
Caution: For finite data types (e.g., fixed-width integers or floating-point numbers), common algebraic laws are not valid in general.
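Two small C functions illustrate the caution. The function names are hypothetical, and the concrete values in the usage note assume IEEE 754 double arithmetic and 32-bit unsigned wraparound.

```c
#include <stdint.h>

/* Both functions compute a + b + c; only the association differs. */
double sum_left(double a, double b, double c)  { return (a + b) + c; }
double sum_right(double a, double b, double c) { return a + (b + c); }

/* Over the integers, (2 * x) / 2 == x. With 32-bit wraparound the
   multiplication can lose the top bit, so the law no longer holds. */
uint32_t twice_then_half(uint32_t x) { return (x * 2u) / 2u; }
```

With a = 1e20, b = -1e20 and c = 1.0, sum_left returns 1.0 but sum_right returns 0.0, because 1.0 is absorbed when added to -1e20. Likewise twice_then_half(0x80000000) returns 0, so an optimizer may only reassociate or cancel where the data type guarantees the law.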
Strength reduction
Replace expensive operations by more efficient operations (partially machine-dependent).
For example: y := 2 * x can be replaced by
y := x + x
or by
y := x << 1
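The three variants can be written as C functions to check that they agree. This is a sketch with hypothetical names; unsigned 32-bit arithmetic is used so that all three are well defined (wrapping modulo 2^32) for every input.

```c
#include <stdint.h>

uint32_t twice_mul(uint32_t x)   { return 2u * x; }   /* y := 2 * x  */
uint32_t twice_add(uint32_t x)   { return x + x; }    /* y := x + x  */
uint32_t twice_shift(uint32_t x) { return x << 1; }   /* y := x << 1 */
```

Which form is cheapest depends on the target machine, which is why the slide calls the technique partially machine-dependent.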
Inline expansion of procedure calls
Replace call to non-recursive procedure by its body with appropriate substitution of parameters.
Note: This reduces execution time, but increases code size.
Inline expansion of procedure calls (2)
Remarks:
• Expansion is in general more than text replacement:
Inline expansion of procedure calls (3)
• In OO programs with relatively short methods, expansion is an important optimization technique. But, precise information about the target object is required.
• A refinement of inline expansion is the specialization of procedures/functions if some of the actual parameters are known. This technique can also be applied to recursive procedures/functions.
Dead code elimination
Remove code that is not reached during execution or that has no influence on execution.
In one of the above examples, constant folding and propagation produced the following code:
Provided, t3 and t4 are no longer used after the basic block (not live).
Dead code elimination (2)
A typical example for non-reachable and thus, dead code that can be eliminated:
Dead code elimination (3)
Remarks:
• Dead code is often caused by optimizations.
• Another source of dead code is program modification.
• In the first case, liveness information is the prerequisite for dead code elimination.
Code motion
Move commands over branching points in the control flow graph such that they end up in basic blocks that are less often executed.
We consider two cases:
• Move commands into succeeding or preceding branches
• Move code out of loops
Optimization of loops is very profitable, because code inside loops is executed more often than code not contained in a loop.
Move code over branching points
If a sequential computation branches, the branches are less often executed than the sequence.
Move code over branching points (2)
Prerequisite for this optimization is that a defined variable is only used in one branch.
Moving the command over a preceding join point can be advisable if the command can then be eliminated from one of the branches by optimization.
Partial redundancy elimination
Definition (Partial Redundancy)
An assignment is redundant at a program position s if it has already been executed on all paths to s.
An expression e is redundant at s if the value of e has already been calculated on all paths to s.
An assignment/expression is partially redundant at s if it is redundant with respect to some execution paths leading to s.
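A small C sketch of the definition and of its elimination, with hypothetical function names. In pre_before the expression a + b is partially redundant at the assignment to y: it has already been computed on the path through the if-branch, but not on the other path.

```c
/* Before: a + b is evaluated twice whenever cond holds. */
int pre_before(int a, int b, int cond) {
    int x = 0;
    if (cond) {
        x = a + b;        /* a + b computed on this path only */
    }
    int y = a + b;        /* partially redundant at this position */
    return x + y;
}

/* After: the computation is inserted on the path that lacked it, which
   makes it fully redundant at the join, where the saved value is reused. */
int pre_after(int a, int b, int cond) {
    int x = 0;
    int t;
    if (cond) {
        t = a + b;
        x = t;
    } else {
        t = a + b;        /* inserted computation */
    }
    int y = t;            /* reuse instead of recomputation */
    return x + y;
}
```

After the transformation, a + b is evaluated exactly once on every path.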
Partial redundancy elimination (2)
Example:
Partial redundancy elimination (3)
Elimination of partial redundancy:
Partial redundancy elimination (4)
Remarks:
• PRE can be seen as a combination and extension of common subexpression elimination and code motion.
• Extension: Elimination of partial redundancy according to estimated probability for execution of specific paths.
Code motion from loops
Idea: Computations in loops whose operands are not changed inside the loop should be done outside the loop.
Provided, t1 is not live at the end of the top-most block on the left side.
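A minimal C sketch of moving a loop-invariant computation out of a loop; the function names and the temporary t1 are chosen for illustration only.

```c
/* Before: a * b is recomputed in every iteration although neither
   operand changes inside the loop. */
long scale_sum_before(const int *v, int n, int a, int b) {
    long s = 0;
    for (int i = 0; i < n; i++) {
        int t1 = a * b;
        s += (long)v[i] * t1;
    }
    return s;
}

/* After: the invariant computation is done once, before the loop. */
long scale_sum_after(const int *v, int n, int a, int b) {
    long s = 0;
    int t1 = a * b;
    for (int i = 0; i < n; i++) {
        s += (long)v[i] * t1;
    }
    return s;
}
```

Note that the hoisted computation is now executed even if the loop body never runs (n = 0). That is harmless here, but has to be checked in general, e.g. when the moved command can raise an error.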
Optimization of loop variables
Variables and expressions that are not changed during the execution of a loop are called loop invariant.
Loops often have variables that are increased/decreased systematically in each loop execution, e.g., for-loops.
Often, a loop variable depends on another loop variable, e.g., a relative address depends on the loop counter variable.
Optimization of loop variables (2)
Definition (Loop Variables)
A variable i is called an explicit loop variable of a loop S if there is exactly one definition of i in S, of the form i := i + c, where c is loop invariant.
A variable k is called a derived loop variable of a loop S if there is exactly one definition of k in S, of the form k := j * c or k := j + d, where j is a loop variable and c and d are loop invariant.
Induction variable analysis
Compute derived loop variables inductively, i.e., instead of computing them from the value of the loop variable, compute them from their value in the previous loop execution.
Note: For optimization of derived loop variables, the dependencies between variable definitions have to be precisely understood.
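The idea can be sketched in C for a derived loop variable k := i * c, where i is the explicit loop variable and c is loop invariant. The function names are hypothetical.

```c
/* Before: k is recomputed from the loop variable i in every iteration. */
void fill_before(int *out, int n, int c) {
    for (int i = 0; i < n; i++) {
        int k = i * c;
        out[i] = k;
    }
}

/* After: k is initialized once and then updated from its previous value.
   Since i grows by 1 per iteration, k grows by c. */
void fill_after(int *out, int n, int c) {
    int k = 0;                 /* value of i * c for i = 0 */
    for (int i = 0; i < n; i++) {
        out[i] = k;
        k = k + c;
    }
}
```

The multiplication in the loop body is replaced by an addition, which ties induction variable analysis to strength reduction.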
Loop unrolling
If the number of loop executions is known statically or properties about the number of loop executions (e.g., always an even number) can be inferred, the loop body can be copied several times to save comparisons and jumps.
Provided that ix is dead at the end of the fragment.
Note the static computation of ix's values in the unrolled loop.
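A C sketch of unrolling by a factor of two, assuming (as in the text) that the number of loop executions is known to be even. The function names are hypothetical.

```c
/* Rolled loop: one comparison and one jump per element. */
long sum_rolled(const int *v, int n) {
    long s = 0;
    for (int i = 0; i < n; i++) {
        s += v[i];
    }
    return s;
}

/* Unrolled by two: one comparison and one jump per two elements.
   Only correct if n is even; otherwise v[i + 1] would read past the end. */
long sum_unrolled2(const int *v, int n) {
    long s = 0;
    for (int i = 0; i < n; i += 2) {
        s += v[i];
        s += v[i + 1];
    }
    return s;
}
```

The index i + 1 in the second copy of the body is the statically computed value of the loop variable for that copy.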
Loop unrolling (2)
Remarks:
• Partial loop unrolling aims at obtaining larger basic blocks in loops to have more optimization options.
• Loop unrolling is in particular important for parallel processor architectures and pipelined processing (machine-dependent).
Optimization for other language classes
The discussed optimizations aim at imperative languages. For optimizing programs of other language classes, special techniques have been developed.
For example:
• Object-oriented languages: Optimization of dynamic binding (type analysis)
• Non-strict functional languages: Optimization of lazy function calls (strictness analysis)
• Logic programming languages: Optimization of unification
4.2.2 Potential of Optimizations
Potential of optimizations - Example
Consider the procedure skprod for the evaluation of the optimization techniques:
skprod:
B1:  res := 0
     ix := 0
B2:  t0 := lng-1
     if ix <= t0        (true: B3, false: B4)
B3:  t1 := i1+ix
     tx := t1*4
     ta := a+tx
     t2 := *ta
     t1 := i2+ix
     tx := t1*4
     tb := b+tx
     t3 := *tb
     t1 := t2*t3
     res := res+t1
     ix := ix+1         (back to B2)
B4:  return res
Evaluation:
Number of steps depending on lng:
2 + 2 + 13 * lng + 1 = 13 * lng + 5
lng = 100: 1305
lng = 1000: 13005
Potential of optimizations - Example (2)
Move the computation of the loop invariant t0 out of the loop:
skprod:
B1:  res := 0
     ix := 0
     t0 := lng-1
B2:  if ix <= t0        (true: B3, false: B4)
B3:  t1 := i1+ix
     tx := t1*4
     ta := a+tx
     t2 := *ta
     t1 := i2+ix
     tx := t1*4
     tb := b+tx
     t3 := *tb
     t1 := t2*t3
     res := res+t1
     ix := ix+1         (back to B2)
B4:  return res
Evaluation: 3 + 1 + 12 * lng + 1 = 12 * lng + 5
lng = 100: 1205
lng = 1000: 12005
Potential of optimizations - Example (3)
Optimization of loop variables (1): Initially, there are no derived loop variables, because t1 and tx have several definitions. Transformation to SSA for t1 and tx turns t11, tx1, ta, t12, tx2, tb into derived loop variables:
skprod:
B1:  res := 0
     ix := 0
     t0 := lng-1
B2:  if ix <= t0        (true: B3, false: B4)
B3:  t11 := i1+ix
     tx1 := t11*4
     ta := a+tx1
     t2 := *ta
     t12 := i2+ix
     tx2 := t12*4
     tb := b+tx2
     t3 := *tb
     t13 := t2*t3
     res := res+t13
     ix := ix+1         (back to B2)
B4:  return res
Potential of optimizations - Example (4)
Optimization of loop variables (2): Initialization and inductive definition of the loop variables:
skprod:
B1:  res := 0
     ix := 0
     t0 := lng-1
     t11 := i1-1
     tx1 := t11*4
     ta := a+tx1
     t12 := i2-1
     tx2 := t12*4
     tb := b+tx2
B2:  if ix <= t0        (true: B3, false: B4)
B3:  t11 := t11+1
     tx1 := tx1+4
     ta := ta+4
     t2 := *ta
     t12 := t12+1
     tx2 := tx2+4
     tb := tb+4
     t3 := *tb
     t13 := t2*t3
     res := res+t13
     ix := ix+1         (back to B2)
B4:  return res
Potential of optimizations - Example (5)
Dead Code Elimination: The assignments to t11, tx1, t12, tx2 inside the loop are dead code, because they do not influence the result:
skprod:
B1:  res := 0
     ix := 0
     t0 := lng-1
     t11 := i1-1
     tx1 := t11*4
     ta := a+tx1
     t12 := i2-1
     tx2 := t12*4
     tb := b+tx2
B2:  if ix <= t0        (true: B3, false: B4)
B3:  ta := ta+4
     t2 := *ta
     tb := tb+4
     t3 := *tb
     t13 := t2*t3
     res := res+t13
     ix := ix+1         (back to B2)
B4:  return res
Evaluation: 9 + 1 + 8 * lng + 1 = 8 * lng + 11
lng = 100: 811
lng = 1000: 8011
Potential of optimizations - Example (6)
Algebraic Optimizations: Use the invariant ta = 4 * (i1-1+ix) + a to replace the comparison ix <= t0 by the comparison ta <= 4 * (i1-1+t0) + a:
skprod:
B1:  res := 0
     ix := 0
     t0 := lng-1
     t11 := i1-1
     tx1 := t11*4
     ta := a+tx1
     t12 := i2-1
     tx2 := t12*4
     tb := b+tx2
     t4 := t11+t0
     t5 := 4*t4
     t6 := t5+a
B2:  if ta <= t6        (true: B3, false: B4)
B3:  ta := ta+4
     t2 := *ta
     tb := tb+4
     t3 := *tb
     t13 := t2*t3
     res := res+t13
     ix := ix+1         (back to B2)
B4:  return res
Potential of optimizations - Example (7)
Dead Code Elimination: Due to the transformation of the loop condition, the assignments to ix have become dead code and can be eliminated:
skprod:
B1:  res := 0
     t0 := lng-1
     t11 := i1-1
     tx1 := t11*4
     ta := a+tx1
     t12 := i2-1
     tx2 := t12*4
     tb := b+tx2
     t4 := t11+t0
     t5 := 4*t4
     t6 := t5+a
B2:  if ta <= t6        (true: B3, false: B4)
B3:  ta := ta+4
     t2 := *ta
     tb := tb+4
     t3 := *tb
     t13 := t2*t3
     res := res+t13     (back to B2)
B4:  return res
Evaluation: 11 + 1 + 7 * lng + 1 = 7 * lng + 13
lng = 100: 713
lng = 1000: 7013
Potential of optimizations - Example (8)
Remarks:
• Reduction of execution steps by almost half, where the most significant reductions are achieved by loop optimization.
• Combination of optimization techniques is important. Determining the ordering of optimizations is in general difficult.
• We have only considered optimizations by way of examples. The difficulty is to find algorithms and heuristics for detecting optimization potential automatically and for executing the optimizing transformations.
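The first and the last version of skprod can be transcribed to C to check that the transformations preserve the result. This is a sketch under assumptions: the C signatures are hypothetical, the arrays hold int elements, and addresses are modelled as element indices (so the slide's byte offset 4 * k corresponds to index k and the explicit multiplications by 4 disappear). The two step-count functions restate the evaluation formulas from the slides.

```c
/* Unoptimized version: index computation from ix in every iteration. */
int skprod_naive(const int *a, int i1, const int *b, int i2, int lng) {
    int res = 0;
    for (int ix = 0; ix <= lng - 1; ix++) {
        int t2 = a[i1 + ix];
        int t3 = b[i2 + ix];
        res = res + t2 * t3;
    }
    return res;
}

/* Final version: ta and tb are advanced inductively, and the loop
   condition compares ta against the precomputed bound t6. */
int skprod_opt(const int *a, int i1, const int *b, int i2, int lng) {
    int res = 0;
    int t0 = lng - 1;
    int ta = i1 - 1;            /* index form of a + 4 * (i1 - 1) */
    int tb = i2 - 1;
    int t6 = (i1 - 1) + t0;     /* index form of 4 * (i1 - 1 + t0) + a */
    while (ta <= t6) {
        ta = ta + 1;
        int t2 = a[ta];
        tb = tb + 1;
        int t3 = b[tb];
        res = res + t2 * t3;
    }
    return res;
}

/* Step counts from the evaluations of the first and the last version. */
long steps_before(long lng) { return 13 * lng + 5; }
long steps_after(long lng)  { return 7 * lng + 13; }
```

For lng = 100 the step count drops from 1305 to 713, i.e. to roughly 55 percent, which matches the remark that the number of execution steps is reduced by almost half.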
4.2.3 Data flow analysis
Data flow analysis: Introduction
For optimizations, data flow information is required that can be obtained by data flow analysis.
Goal: Explanation of the basic concepts of data flow analysis by way of examples
Outline:
• Liveness analysis (Typical example of data flow analysis)
• Data flow equations
• Important analysis classes
Each analysis has an exact specification of the information it provides.
Control flow graphs
Data flow analyses are usually defined based on a representation of a program or procedure as a control flow graph, CFG for short:
• nodes are individual program statements or basic blocks
• an edge from n to n' represents a potential control transfer from (the end of) n to (the beginning of) n'
Out-edges from node n lead to successor nodes, succ(n).
In-edges to node n come from predecessor nodes, pred(n).
Data flow information is mostly computed for (CF-)positions before and after nodes.
Liveness analysis
A temporary or variable is live at a position of a CFG if it holds a value that may be needed in the future. More precisely:
Definition (Liveness Analysis)
Let P be a program. A variable v is live at a CF-position S in P if there is an execution path π from S to a use of v such that there is no definition of v on π.
The liveness analysis determines for all positions S in P which variables are live at S.
Liveness analysis (2)
Remarks:
• The definition of liveness of variables is static/syntactic. (In contrast, dead code was defined dynamically/semantically.)
• The result of the liveness analysis for a program P can be represented as a function live mapping positions in P to bit vectors, where a bit vector contains an entry for each variable in P. Let i be the index of a variable v in P; then:
live(S)[i] = 1 iff v is live at position S
Liveness analysis (3)
Idea:
• In a procedure-local analysis, exactly the global variables are live at the end of the exit block of the procedure.
• If the live variables out(n) after a node n are known, the live variables in(n) before n are computed by:
in(n) = gen(n) ∪ (out(n) \ kill(n))
where
  - gen(n) is the set of variables v such that v is used in n without a prior definition of v in n
  - kill(n) is the set of variables that are defined in n
Liveness analysis (4)
As the set in(n) is computed from out(n), we have a backward analysis.
For n not the exit block of the procedure, out(n) is obtained by
out(n) = ⋃_{ni ∈ succ(n)} in(ni)
Thus, for a program without loops, in(n) and out(n) are defined for all nodes n. Otherwise, we obtain a system of recursive equations.
Liveness analysis - Example
Question: How do we compute out(B2)?
Data flow equations
Theory:
• There is always a solution for equations of the considered form.
• There is always a smallest solution that is obtained by an iteration starting from empty in and out sets.
Note: The equations may have several solutions.
Ambiguity of solutions - Example
B0: a := a      (successors: B0 and B1)
B1: b := 7      (exit block)
out(B0) = in(B0) ∪ in(B1)
out(B1) = { }
in(B0) = gen(B0) ∪ (out(B0) \ kill(B0))
       = {a} ∪ out(B0)
in(B1) = gen(B1) ∪ (out(B1) \ kill(B1))
       = { }
Thus, out(B0) = in(B0), and hence in(B0) = {a} ∪ in(B0).
Possible solutions: in(B0) = {a} or in(B0) = {a,b}
Computation of smallest fixpoint
foreach n
    gen(n) := ...
    kill(n) := ...
    in(n) := ∅
    out(n) := ∅
    if n is exit node then out(n) := ...
repeat
    foreach n
        in'(n) := in(n)
        out'(n) := out(n)
        in(n) := gen(n) ∪ (out(n) \ kill(n))
        if n is not exit node then out(n) := ⋃_{s ∈ succ(n)} in(s)
until ∀n. in'(n) = in(n) ∧ out'(n) = out(n)
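The iteration can be sketched in C with bit vectors. The Node type, the limit of two successors per node, the NO_SUCC marker and the parameter live_at_exit are assumptions of this sketch; nodes without successors are treated as exit nodes. In contrast to the pseudocode, out(n) is recomputed before in(n) within one step, which reaches the same smallest fixpoint.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_SUCC 2
#define NO_SUCC  (-1)

typedef struct {
    uint32_t gen, kill;     /* bit i is set iff variable i is in the set */
    int succ[MAX_SUCC];     /* indices of successor nodes, NO_SUCC if unused */
    uint32_t in, out;       /* results of the analysis */
} Node;

/* Computes the smallest solution of
     in(n)  = gen(n) | (out(n) & ~kill(n))
     out(n) = union of in(s) over all successors s of n
   by iterating from empty sets until nothing changes. For nodes without
   successors, out(n) is fixed to live_at_exit. */
void liveness(Node *cfg, size_t n, uint32_t live_at_exit) {
    for (size_t i = 0; i < n; i++) {
        cfg[i].in = 0;
        cfg[i].out = 0;
    }
    bool changed;
    do {
        changed = false;
        for (size_t i = 0; i < n; i++) {
            uint32_t out = 0;
            bool is_exit = true;
            for (int k = 0; k < MAX_SUCC; k++) {
                if (cfg[i].succ[k] != NO_SUCC) {
                    out |= cfg[cfg[i].succ[k]].in;
                    is_exit = false;
                }
            }
            if (is_exit) {
                out = live_at_exit;
            }
            uint32_t in = cfg[i].gen | (out & ~cfg[i].kill);
            if (in != cfg[i].in || out != cfg[i].out) {
                changed = true;
            }
            cfg[i].in = in;
            cfg[i].out = out;
        }
    } while (changed);
}
```

For the two-block example above (a as bit 0, b as bit 1, nothing live at exit), the iteration stops with in(B0) = out(B0) = {a}, i.e. the smaller of the two possible solutions.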
Complexity
Let N be the size of the input program.
• ≤ N nodes in the CFG
  ⇒ ≤ N variables
  ⇒ ≤ N elements per in/out set
  ⇒ O(N) time per set union
• The foreach loop performs a constant number of set operations per node
  ⇒ O(N^2) time for the foreach loop
• Each iteration of the repeat loop can only add to each set, and sets can contain at most every variable
  ⇒ the sizes of all in and out sets sum to at most 2N^2, bounding the number of iterations of the repeat loop
  ⇒ worst-case complexity of O(N^4)
• A suitable ordering of the nodes can cut the repeat loop down to 2-3 iterations
  ⇒ O(N) or O(N^2) in practice
Further analyses and classes of analyses
Many data flow analyses can be described as bit vector problems:
• Reaching definitions: Which definitions reach a positionS?
• Available expressions for elimination of repeated computations
• Very busy expressions: Which expression is needed for all subsequent computations?
The corresponding analyses can be treated analogously to liveness analysis, but differ in
• the definition of the data flow information
• the definition of gen and kill
• the direction of the analysis and the equations
Further analyses and classes of analyses (2)
For backward analyses, the data flow information before a node n is obtained from the information after n:
in(n) = gen(n) ∪ (out(n) \ kill(n))
Analyses can be distinguished according to whether they take the union or the intersection of the successor information:
out(n) = ⋃_{ni ∈ succ(n)} in(ni)
or
out(n) = ⋂_{ni ∈ succ(n)} in(ni)
Further analyses and classes of analyses (3)
For forward analyses, the dependency is the other way round:
out(n) = gen(n) ∪ (in(n) \ kill(n))
with
in(n) = ⋃_{ni ∈ pred(n)} out(ni)
or
in(n) = ⋂_{ni ∈ pred(n)} out(ni)
Further analyses and classes of analyses (4)
Examples for each class of analysis:
            union                   intersection
forward     reaching definitions    available expressions
backward    live variables          very busy expressions
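The four classes share one iteration scheme and differ only in direction and in the combining operator. The following sketch shows this with a single solver that is instantiated for available expressions (forward, intersection) and for live variables (backward, union). The function `solve`, its parameters, and the four-node example program are hypothetical. Starting intersection problems from the full universe and treating boundary nodes as empty are assumptions of this sketch.

```python
def solve(nodes, edges, gen, kill, forward, use_union, universe):
    """Round-robin solver; returns the sets before and after the transfer step.

    For a forward analysis these are (in, out), for a backward one (out, in).
    """
    sources = {n: [] for n in nodes}  # nodes whose result flows into n
    for a, b in edges:
        if forward:
            sources[b].append(a)
        else:
            sources[a].append(b)

    # Union problems start from empty sets, intersection problems from the
    # full universe, so the iteration only grows or only shrinks the sets.
    start = set() if use_union else set(universe)
    before = {n: set(start) for n in nodes}
    after = {n: set(start) for n in nodes}
    changed = True
    while changed:
        changed = False
        for n in nodes:
            incoming = [after[m] for m in sources[n]]
            if not incoming:
                new_before = set()  # assumption: nothing holds at the boundary
            elif use_union:
                new_before = set().union(*incoming)
            else:
                new_before = set.intersection(*incoming)
            new_after = gen[n] | (new_before - kill[n])
            if new_before != before[n] or new_after != after[n]:
                before[n], after[n] = new_before, new_after
                changed = True
    return before, after


# n1: x := a+b   n2: y := a+b   n3: a := 1   n4: z := a+b
nodes = ["n1", "n2", "n3", "n4"]
edges = [("n1", "n2"), ("n1", "n3"), ("n2", "n4"), ("n3", "n4")]

# Available expressions: forward, intersection.
ae_gen = {"n1": {"a+b"}, "n2": {"a+b"}, "n3": set(), "n4": {"a+b"}}
ae_kill = {"n1": set(), "n2": set(), "n3": {"a+b"}, "n4": set()}
avail_in, avail_out = solve(nodes, edges, ae_gen, ae_kill, True, False, {"a+b"})

# Live variables: backward, union.
lv_gen = {"n1": {"a", "b"}, "n2": {"a", "b"}, "n3": set(), "n4": {"a", "b"}}
lv_kill = {"n1": {"x"}, "n2": {"y"}, "n3": {"a"}, "n4": {"z"}}
live_out, live_in = solve(nodes, edges, lv_gen, lv_kill, False, True, set())

print(avail_in["n2"], avail_in["n4"])  # {'a+b'} set()
print(live_in["n3"])                   # {'b'}
```

The expression a+b is available on entry to n2 but not on entry to n4, because the path through n3 redefines a.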
Further analyses and classes of analyses (5)
For bit vector problems, data flow information consists of subsets of finite sets.
For other analyses, the collected information is more complex, e.g., for constant propagation, we consider mappings from variables to values.
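As an illustration of such mappings, the sketch below combines the constant propagation information of two predecessors. The marker `NAC` and the convention that a variable missing from a mapping carries no information yet are assumptions of this sketch, not definitions from the lecture.

```python
NAC = "not-a-constant"  # hypothetical marker: the value differs between paths


def join(m1, m2):
    """Combine two variable-to-value mappings at a control flow merge."""
    result = {}
    for var in m1.keys() | m2.keys():
        if var in m1 and var in m2:
            # keep the value only if both paths agree on it
            result[var] = m1[var] if m1[var] == m2[var] else NAC
        else:
            # assumption: no information on one path, take the other path's value
            result[var] = m1[var] if var in m1 else m2[var]
    return result


merged = join({"x": 1, "y": 2}, {"x": 1, "y": 3, "z": 5})
print(merged["x"], merged["y"], merged["z"])  # 1 not-a-constant 5
```

Unlike a bit vector problem, the information per variable is a value from a lattice rather than a single bit.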
For interprocedural analyses, complexity increases because the flow graph is not static.
The formal basis for the development and correctness of optimizations is provided by the theory of abstract interpretation.
Literature
Recommended reading:
• Flemming Nielson, Hanne R. Nielson, Chris Hankin: Principles of Program Analysis (Springer-Verlag, corrected 2nd printing, 2005)
4.2.4 Non-Local Program Analysis
Non-local program analysis
We use a points-to analysis to demonstrate:
• interprocedural aspects: The analysis crosses the borders of single procedures.
• constraints: Program analysis very often involves solving or refining constraints.
• complex analysis results: The analysis result cannot be represented locally for a statement.
• analysis as abstraction: The result of the analysis is an abstraction of all possible program executions.
Points-to analysis
Analysis for programs with pointers and for object-oriented programs
Goal: Compute which references to which records/objects a variable can hold.
Applications of Analysis Results:
Basis for optimizations
• Alias information (e.g., important for code motion)
  - Can p.f = x cause changes to an object referenced by q?
  - Can z = p.f read information that is written by p.f = x?
• Call graph construction
• Resolution of virtual method calls
• Escape analysis
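Two of these applications reduce to simple queries on the analysis result. The sketch below assumes that the result is available as a dictionary from variables to sets of abstract objects; the helper names `may_alias` and `unique_target` and the sample data are hypothetical.

```python
def may_alias(points_to, p, q):
    """p and q may alias if their points-to sets share an abstract object."""
    return bool(points_to.get(p, set()) & points_to.get(q, set()))


def unique_target(points_to, class_of, receiver):
    """Return the only class of the objects the receiver may reference, else None."""
    classes = {class_of[o] for o in points_to.get(receiver, set())}
    return classes.pop() if len(classes) == 1 else None


# Hypothetical analysis result: three abstract objects of classes A and B.
points_to = {"p": {"o1"}, "q": {"o2"}, "r": {"o1", "o3"}}
class_of = {"o1": "B", "o2": "B", "o3": "A"}

print(may_alias(points_to, "p", "q"))           # False: p.f = x cannot affect q.f
print(may_alias(points_to, "p", "r"))           # True
print(unique_target(points_to, class_of, "p"))  # B: a call p.m(...) can be bound statically
print(unique_target(points_to, class_of, "r"))  # None
```

Because points-to sets over-approximate all executions, a negative alias answer is safe to use for code motion, whereas a positive answer only means that aliasing cannot be excluded.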
Alias information
Example (use of points-to analysis information):
(1) p.f = x;
(2) y = q.f;
(3) q.f = z;
If p == q, statement (2) can be simplified:
(1) p.f = x;
(2) y = x;
(3) q.f = z;
If p != q, the first statement can be swapped with the other two.
Elimination of dynamic binding
Example (use of points-to analysis information):
class A {
    void m(...) { ... }
}
class B extends A {
    void m(...) { ... }
}
...
A p;
p = new B();
p.m(...) // call of B::m
Escape analysis
Example:
R m(A p) {
    B q;
    q = new B(); // can be stored on stack
    q.f = p;
    q.g = p.n();
    return q.g;
}
Points-to analysis for Java
Simplifications and assumptions about underlying language
• Complete program is known.
• Only assignments and method calls of the following form are used:
  - Direct assignment: l = r
  - Write to instance variables: l.f = r
  - Read of instance variables: l = r.f
  - Object creation: l = new C()
  - Simple method call: l = r0.m(r1, ...)
• Expressions without side effects
• Compound statements
Points-to analysis for Java (2)
Analysis type
• Flow-insensitive: The control flow of the program has no influence on the analysis result. The states of the variables at different program points are combined.
• Context-insensitive: Method calls at different program points are not distinguished.
Points-to analysis for Java (3)
Points-to graph as abstraction
The result of the analysis is a so-called points-to graph having
• abstract variables and abstract objects as nodes
• edges, each representing that an abstract variable may hold a reference to an abstract object
Abstract variables V represent sets of concrete variables at runtime.
Abstract objects O represent sets of concrete objects at runtime.
An edge between V and O means that in a certain program state, a concrete variable in V may reference an object in O.
Points-to graph - Example
Example (points-to graph):
class Y { ... }
class X {
    Y f;
    void set(Y r) { this.f = r; }
    static void main() {
        X p = new X(); // s1 "creates" o1
        Y q = new Y(); // s2 "creates" o2
        p.set(q);
    }
}

Points-to graph (edges):
p → o1
this → o1
q → o2
r → o2
o1 —f→ o2
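A flow-insensitive, context-insensitive analysis for the statement forms listed earlier can be sketched as a fixpoint iteration over all statements. The tuple encoding of statements and the function name are hypothetical. The method call p.set(q) is modelled as two assignments (receiver to this, argument to r), which assumes that the call target is already known.

```python
from collections import defaultdict


def points_to(statements):
    """Apply all statements repeatedly, in any order, until no edge is added."""
    var_pts = defaultdict(set)    # abstract variable -> abstract objects
    field_pts = defaultdict(set)  # (abstract object, field) -> abstract objects
    changed = True

    def add(target, objs):
        nonlocal changed
        if not objs <= target:
            target |= objs        # mutates the set stored in the graph
            changed = True

    while changed:
        changed = False
        for stmt in statements:
            kind = stmt[0]
            if kind == "new":         # l = new C() at allocation site o
                _, l, o = stmt
                add(var_pts[l], {o})
            elif kind == "assign":    # l = r
                _, l, r = stmt
                add(var_pts[l], set(var_pts[r]))
            elif kind == "store":     # l.f = r
                _, l, f, r = stmt
                for o in list(var_pts[l]):
                    add(field_pts[(o, f)], set(var_pts[r]))
            elif kind == "load":      # l = r.f
                _, l, r, f = stmt
                for o in list(var_pts[r]):
                    add(var_pts[l], set(field_pts[(o, f)]))
    return var_pts, field_pts


program = [
    ("new", "p", "o1"),           # X p = new X();  s1 creates o1
    ("new", "q", "o2"),           # Y q = new Y();  s2 creates o2
    ("assign", "this", "p"),      # p.set(q): receiver is bound to this
    ("assign", "r", "q"),         # p.set(q): argument is bound to r
    ("store", "this", "f", "r"),  # this.f = r
]

var_pts, field_pts = points_to(program)
print(dict(var_pts))    # {'p': {'o1'}, 'q': {'o2'}, 'this': {'o1'}, 'r': {'o2'}}
print(dict(field_pts))  # {('o1', 'f'): {'o2'}}
```

The computed edges coincide with the points-to graph of the example: p and this reference o1, q and r reference o2, and field f of o1 references o2.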