Application: Loop-invariant Code

Academic year: 2022

(1)

1.8

Application: Loop-invariant Code

Example:

for (i = 0; i < n; i++) a[i] = b + 3;

// The expression b + 3 is recomputed in every iteration :-(

// This should be avoided :-)
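The effect of hoisting the invariant expression can be sketched in Python (function names and the list standing in for the array a are illustrative):

```python
def fill_naive(a, b, n):
    # Original loop: b + 3 is recomputed in every iteration.
    for i in range(n):
        a[i] = b + 3
    return a

def fill_hoisted(a, b, n):
    # After loop-invariant code motion: the invariant expression
    # is computed once, before the loop.
    t = b + 3
    for i in range(n):
        a[i] = t
    return a
```

Both versions compute the same array; the second performs the addition once instead of n times.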

(2)

The Control-flow Graph:

[Figure: control-flow graph of the loop, nodes 0–7; edge labels: i = 0;  Neg(i < n)  Pos(i < n)  y = b + 3;  A1 = A + i;  M[A1] = y;]

(3)

Warning:

The assignment T = b + 3; may not be placed before the loop:

[Figure: the same control-flow graph, nodes 0–7; edge labels: i = 0;  Neg(i < n)  Pos(i < n)  A1 = A + i;  i = i + 1;  T = b + 3;  y = T;  M[A1] = y;]

⟹ There is no decent place for T = b + 3; :-(

(4)

Idea:

Transform into a do-while-loop ...

[Figure: the rotated control-flow graph, nodes 0–5; edge labels: i = 0;  Pos(i < n)  Neg(i < n)  y = b + 3;  A1 = A + i;  i = i + 1;  M[A1] = y;]

(5)

... now there is a place for T = e; :-)

[Figure: the rotated control-flow graph with the new edge inserted, nodes 0–7; edge labels: i = 0;  Pos(i < n)  Neg(i < n)  T = b + 3;  y = T;  A1 = A + i;  i = i + 1;  M[A1] = y;]

(6)

Application of T5 (PRE) :

[Figure: the rotated control-flow graph, nodes 0–5; edge labels: i = 0;  Pos(i < n)  Neg(i < n)  y = b + 3;  A1 = A + i;  i = i + 1;  M[A1] = y;]

     A          B
0    ∅          ∅
1    ∅          ∅
2    ∅          {b + 3}
3    {b + 3}    ∅
4    {b + 3}    ∅
5    {b + 3}    ∅
6    ∅          ∅

(7)

Application of T5 (PRE) :

[Figure: the same graph after T5, nodes 0–7 — the computation of b + 3 is moved onto the entry edge; edge labels: i = 0;  Pos(i < n)  Neg(i < n)  A1 = A + i;  i = i + 1;  y = b + 3;  M[A1] = y;]

     A          B
0    ∅          ∅
1    ∅          ∅
2    ∅          {b + 3}
3    {b + 3}    ∅
4    {b + 3}    ∅
5    {b + 3}    ∅
6    ∅          ∅
7    ∅          ∅

(8)

Conclusion:

• Elimination of partial redundancies may move loop-invariant code out of the loop :-))

• This only works properly for do-while-loops :-(

• To optimize other loops, we transform them into do-while-loops before-hand:

while (b) stmt   ⟹   if (b) do stmt while (b);
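The rotation can be sketched generically in Python; b_pred and stmt are illustrative stand-ins for the loop condition and body, threading an explicit state value:

```python
def while_loop(b_pred, stmt, state):
    # while (b) stmt
    while b_pred(state):
        state = stmt(state)
    return state

def rotated(b_pred, stmt, state):
    # if (b) do stmt while (b);
    if b_pred(state):
        while True:
            state = stmt(state)
            if not b_pred(state):
                break
    return state
```

Both forms agree on every input, including the case where the body never executes.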

(9)

Problem:

If we do not have the source program at hand, we must reconstruct potential loop headers ;-)

⟹ Pre-dominators

u pre-dominates v, if every path π : start → v contains u. We write: u ⇒ v.

“⇒” is reflexive, transitive and anti-symmetric :-)

(10)

Computation:

We collect the nodes along paths by means of the analysis:

P = 2^Nodes ,  ⊑ = ⊇

[[(_, _, v)]] P = P ∪ {v}

Then the set P[v] of pre-dominators is given by:

P[v] = ⋂ { [[π]] {start} | π : start → v }

(11)

Since the [[k]] are distributive, the P[v] can be computed by means of fixpoint iteration :-)
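A sketch of the fixpoint iteration in Python; the graph encoding (a node list plus edge pairs) is an illustrative assumption, and the algorithm uses the standard initialisation to the full node set (the top element, since the ordering is ⊇):

```python
def pre_dominators(nodes, edges, start):
    # Round-robin fixpoint iteration for P[v] = {v} ∪ ⋂ P[u] over all
    # predecessors u of v.  Every node except start begins at the full
    # set, the top element of the lattice (⊑ = ⊇).
    preds = {v: [u for (u, w) in edges if w == v] for v in nodes}
    P = {v: set(nodes) for v in nodes}
    P[start] = {start}
    changed = True
    while changed:
        changed = False
        for v in nodes:
            if v == start:
                continue
            new = {v} | set.intersection(*(P[u] for u in preds[v]))
            if new != P[v]:
                P[v] = new
                changed = True
    return P
```

Run on the rotated-loop graph of the following example (edges 0→1, 1→2, 2→3, 3→4, 4→1, 1→5), it reproduces the table shown there.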

Example:

[Figure: the rotated-loop CFG, nodes 0–5]

     P
0    {0}
1    {0, 1}
2    {0, 1, 2}
3    {0, 1, 2, 3}
4    {0, 1, 2, 3, 4}
5    {0, 1, 5}

(12)

The partial ordering “⇒” in the example:

[Figure: the pre-dominator tree of the example, nodes 0–5]

     P
0    {0}
1    {0, 1}
2    {0, 1, 2}
3    {0, 1, 2, 3}
4    {0, 1, 2, 3, 4}
5    {0, 1, 5}

(13)

Apparently, the result is a tree :-) In fact, we have:

Theorem:

Every node v has at most one immediate pre-dominator.

Proof:

Assume:

there are u1 ≠ u2 which immediately pre-dominate v.

If u1 ⇒ u2, then u1 is not immediate.

Consequently, u1, u2 are incomparable :-)

(14)

Now for every π : start → v :

π = π1 π2 with π1 : start → u1 and π2 : u1 → v

If, however, u1, u2 are incomparable, then there is a path start → v avoiding u2 :

[Figure: a path from start through u1 to v that bypasses u2]

(15)


Observation:

The loop head of a while-loop pre-dominates every node in the body.

A back edge from the exit u to the loop head v can be identified through v ∈ P[u] :-)

Accordingly, we define:

(17)

Transformation 6:

[Figure: rewrite rule of T6 — the condition edge lab ∈ {Pos(e), Neg(e)} at the loop head v is duplicated onto the back edge from u; side conditions: u2, v ∈ P[u] and u1 ∉ P[u]]

We duplicate the entry check to all back edges :-)
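Given the pre-dominator sets, back edges are found by a one-line check; the graph and its P sets below are the rotated-loop example from before, assumed to match the earlier fixpoint computation:

```python
# Pre-dominator sets for the rotated-loop example (nodes 0–5),
# as computed by the fixpoint iteration sketched earlier.
P = {
    0: {0},
    1: {0, 1},
    2: {0, 1, 2},
    3: {0, 1, 2, 3},
    4: {0, 1, 2, 3, 4},
    5: {0, 1, 5},
}
edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 1), (1, 5)]

def back_edges(edges, P):
    # An edge (u, v) is a back edge iff its target v pre-dominates
    # its source u, i.e. v ∈ P[u].
    return [(u, v) for (u, v) in edges if v in P[u]]
```

Here back_edges(edges, P) yields [(4, 1)]: the edge from the loop exit back to the loop head 1.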

(18)

... in the Example:

[Figure: the example CFG before rotation, nodes 0–7; edge labels: i = 0;  Neg(i < n)  Pos(i < n)  A1 = A + i;  i = i + 1;  y = b + 3;  M[A1] = y;]

(19)

... in the Example:

[Figure: the same CFG annotated with pre-dominator sets — {0}, {0, 1}, {0, 1, 2}, {0, 1, 2, 3}, {0, 1, 2, 3, 4}, {0, 1, 2, 3, 4, 5}, {0, 1, 2, 3, 4, 5, 6}, {0, 1, 7}; edge labels: i = 0;  Neg(i < n)  Pos(i < n)  A1 = A + i;  i = i + 1;  y = b + 3;  M[A1] = y;]

(20)

... in the Example:

[Figure: the same CFG and pre-dominator sets; the back edge into the loop head is identified]

(21)

... in the Example:

[Figure: the CFG after T6 — the entry check Pos(i < n) / Neg(i < n) has been duplicated onto the back edge; edge labels: i = 0;  Neg(i < n)  Pos(i < n)  y = b + 3;  A1 = A + i;  i = i + 1;  M[A1] = y;]

(22)

Warning:

There are unusual loops which cannot be rotated:

[Figure: a loop with two entry points, nodes 0–4, together with its pre-dominators]

(23)

... but also common ones which cannot be rotated:

[Figure: the example CFG, nodes 0–5, and its pre-dominator tree]

Here, the complete block between back edge and conditional jump should be duplicated :-(

(24)


1.9

Eliminating Partially Dead Code

Example:

[Figure: CFG, nodes 0–4, with edges labeled T = x + 1; and M[x] = T; — the value of T is used on only one of the two branches]

(27)

Idea:

[Figure: before/after — the assignment T = x + 1; is sunk past the branch onto the path that contains M[x] = T;]

(28)

Problem:

• The definition x = e; (x ∉ Vars_e) may only be moved to an edge where e is safe ;-)

• The definition must still be available for uses of x ;-)

⟹

We define an analysis which maximally delays computations:

[[;]] D = D

[[x = e;]] D = D \ (Use_e ∪ Def_x) ∪ {x = e;}   if x ∉ Vars_e
[[x = e;]] D = D \ (Use_e ∪ Def_x)              if x ∈ Vars_e

(29)

... where:

Use_e = { y = e′; | y ∈ Vars_e }

Def_x = { y = e′; | y ≡ x ∨ x ∈ Vars_e′ }

(30)

For the remaining edges, we define:

[[x = M[e];]] D = D \ (Use_e ∪ Def_x)

[[M[e1] = e2;]] D = D \ (Use_e1 ∪ Use_e2)

[[Pos(e)]] D = [[Neg(e)]] D = D \ Use_e
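The transfer functions can be sketched in Python; representing a delayed assignment as a pair (x, vars_of_e) is an illustrative encoding, not part of the slides:

```python
# A delayable assignment x = e; is encoded as (x, frozenset of the
# variables occurring in e).  D is a set of such pairs.
def use(D, e_vars):
    # Use_e: delayed assignments whose left-hand side is used in e
    return {a for a in D if a[0] in e_vars}

def defs(D, x):
    # Def_x: delayed assignments that define x or read x
    return {a for a in D if a[0] == x or x in a[1]}

def assign(D, x, e_vars):
    # [[x = e;]] D — the assignment itself becomes delayable if x ∉ Vars_e
    D2 = D - use(D, e_vars) - defs(D, x)
    if x not in e_vars:
        D2 = D2 | {(x, frozenset(e_vars))}
    return D2

def load(D, x, e_vars):
    # [[x = M[e];]] D — kills, but is never delayed itself
    return D - use(D, e_vars) - defs(D, x)
```

Applied to the warning example below: after T = x + 1; the set is {T = x + 1;}, but the subsequent x = M[T]; both uses T and redefines x, so the set collapses to ∅.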

(31)

Warning:

We may move y = e; beyond a join only if y = e; can be delayed along all joining edges:

[Figure: CFG, nodes 0–4; one join edge carries T = x + 1;, and after the join follows x = M[T];]

Here, T = x + 1; cannot be moved beyond node 1 !!!

(32)

We conclude:

• The partial ordering of the lattice for delayability is given by “⊇”.

• At program start: D0 = ∅.

Therefore, the sets D[u] of assignments delayable at u can be computed by solving a system of constraints.

• We delay only assignments a for which a a (executed twice in a row) has the same effect as a alone.

• The extra insertions turn the original assignments into assignments to dead variables ...

(33)

Transformation 7:

[Figure: rewrite rule of T7 — on an edge (u, lab, v), every a ∈ D[u] \ [[lab]](D[u]) is inserted before lab, and every a ∈ [[lab]](D[u]) \ D[v] is inserted after lab]

(34)

[Figure: rewrite rule of T7 for condition edges — at u with edges Pos(e) to v1 and Neg(e) to v2, every a ∈ D[u] \ [[Pos(e)]](D[u]) is inserted before the branch, every a ∈ [[Neg(e)]](D[u]) \ D[v1] after Neg(e), and every a ∈ [[Pos(e)]](D[u]) \ D[v2] after Pos(e)]

Note:

Transformation T7 is only meaningful if we subsequently eliminate assignments to dead variables by means of transformation T2 :-)

In the example, the partially dead code is eliminated:

(35)

[Figure: the example CFG, nodes 0–4, with T = x + 1; and M[x] = T;]

     D
0    ∅
1    {T = x + 1;}
2    {T = x + 1;}
3    ∅
4    ∅

(36)

[Figure: after T7 — T = x + 1; is inserted on both edges leaving the branch; the original occurrence before M[x] = T; remains]

     D
0    ∅
1    {T = x + 1;}
2    {T = x + 1;}
3    ∅
4    ∅

(37)

[Figure: after T2 — the now-dead original assignments are replaced by the empty statement ; — only M[x] = T; and the sunk T = x + 1; remain]

     L
0    {x}
1    {x}
2    {x, T}
3    ∅
4    ∅

(38)

Remarks:

• After T7, all original assignments y = e; with y ∉ Vars_e are assignments to dead variables and thus can always be eliminated :-)

• By this, it can be proven that the transformation is guaranteed not to degrade the efficiency of the code :-))

• Similar to the elimination of partial redundancies, the transformation can be repeated :-}

(39)

Conclusion:

→ The design of a meaningful optimization is non-trivial.

→ Many transformations are advantageous only in connection with other optimizations :-)

→ The ordering of applied optimizations matters !!

→ Some optimizations can be iterated !!!

(40)

... a meaningful ordering:

T4          Constant Propagation, Interval Analysis, Alias Analysis
T6          Loop Rotation
T1, T3, T2  Available Expressions
T2          Dead Variables
T7, T2      Partially Dead Code
T5, T3, T2  Partially Redundant Code

(41)

2 Replacing Expensive Operations by Cheaper Ones

2.1

Reduction of Strength

(1) Evaluation of Polynomials

f(x) = an · x^n + an−1 · x^(n−1) + . . . + a1 · x + a0

               Multiplications   Additions
naive          n · (n + 1) / 2   n
re-use         2n − 1            n
Horner scheme  n                 n

(42)

Idea:

f (x) = (. . .((an · x + an−1) · x + an−2). . .) · x + a0
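The scheme is a single left-to-right fold; a Python sketch (the coefficient order [an, …, a1, a0] is an assumed convention):

```python
def horner(coeffs, x):
    # coeffs = [a_n, ..., a_1, a_0]: exactly n multiplications
    # and n additions for a degree-n polynomial.
    acc = 0
    for a in coeffs:
        acc = acc * x + a
    return acc
```

For the example polynomial of the next slide, f(x) = 3x³ − 5x² + 4x + 13, horner([3, -5, 4, 13], 3) gives 61, matching the table there.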

(2) Tabulation of a polynomial

f(x) of degree n :

→ To recompute f(x) for every argument x is too expensive :-)

→ Luckily, the n-th differences are constant !!!

(43)

Example:

f(x) = 3x3 − 5x2 + 4x + 13

n    f(n)   ∆    ∆²   ∆³
0    13     2    8    18
1    15     10   26
2    25     36
3    61
4    . . .

Here, the n-th difference is always

∆^n_h (f) = n! · an · h^n    (h the step width)

(44)

Costs:

• n times evaluation of f ;

• (n − 1) · n / 2 subtractions to determine the ∆k ;

• n additions for every further value :-)

⟹

Number of multiplications only depends on n :-))
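The tabulation can be sketched in Python: a few direct evaluations seed the difference table, and every further value then costs only n additions (function and parameter names are illustrative):

```python
def tabulate(f, x0, h, n, count):
    # Seed with n+1 direct evaluations of the degree-n polynomial f.
    seed = [f(x0 + i * h) for i in range(n + 1)]
    # Build the difference triangle by repeated subtraction.
    table = [seed]
    for _ in range(n):
        prev = table[-1]
        table.append([prev[i + 1] - prev[i] for i in range(len(prev) - 1)])
    # row = [f(x), ∆f(x), ∆²f(x), ..., ∆ⁿf(x)] at the current x.
    row = [t[0] for t in table]
    out = []
    for _ in range(count):
        out.append(row[0])
        # Advance by one step: each entry absorbs the next-higher
        # difference — additions only, no multiplications.
        for k in range(n):
            row[k] += row[k + 1]
    return out
```

On f(x) = 3x³ − 5x² + 4x + 13 with h = 1 this reproduces the table above — 13, 15, 25, 61 — and continues with 141 using only additions.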

(45)

Simple Case: f(x) = a1 · x + a0

• ... naturally occurs in many numerical loops :-)

• The first differences are already constant:

f (x + h) − f (x) = a1 · h

• Instead of the sequence: yi = f (x0 + i · h), i ≥ 0 we compute: y0 = f (x0), ∆ = a1 · h

yi = yi−1 + ∆, i > 0
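For the linear case the incremental scheme can be sketched as (names illustrative):

```python
def linear_values(a1, a0, x0, h, count):
    # y0 = f(x0); every further value needs one addition: yi = yi-1 + ∆.
    delta = a1 * h
    y = a1 * x0 + a0
    out = []
    for _ in range(count):
        out.append(y)
        y += delta
    return out
```

The sequence agrees with evaluating f directly at x0, x0 + h, x0 + 2h, …, but after the first value no multiplication occurs.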

(46)

Example:

for (i = i0; i < n; i = i + h) {
    A = A0 + b · i;
    M[A] = . . .;
}

[Figure: CFG of the loop, nodes 0–6; edge labels: i = i0;  Pos(i < n)  Neg(i < n)  A = A0 + b · i;  i = i + h;  M[A] = . . .;]

(47)

... or, after loop rotation:

i = i0;
if (i < n) do {
    A = A0 + b · i;
    M[A] = . . .;
    i = i + h;
} while (i < n);

[Figure: the rotated CFG with the entry check duplicated at the back edge, nodes 0–6; edge labels: i = i0;  Pos(i < n)  Neg(i < n)  A = A0 + b · i;  i = i + h;  M[A] = . . .;]

(48)

... and reduction of strength:

i = i0;
if (i < n) {
    ∆ = b · h;
    A = A0 + b · i0;
    do {
        M[A] = . . .;
        i = i + h;
        A = A + ∆;
    } while (i < n);
}

[Figure: the CFG after reduction of strength, nodes 0–6; edge labels: i = i0;  ∆ = b · h;  A = A0 + b · i0;  Pos(i < n)  Neg(i < n)  M[A] = . . .;  i = i + h;  A = A + ∆;]

(49)

Warning:

• The values of b, h, A0 must not change during the loop.

• i, A may be modified at exactly one position in the loop :-(

• One may try to eliminate the variable i altogether :

→ i may not be used elsewhere.

→ The initialization must be transformed into:

A = A0 + b · i0 .

→ The loop condition i < n must be transformed into:

A < N for N = A0 + b · n .

→ b must always be different from zero !!!
