Constructing a Heap


Mathematics and Computer Science

Informatik-Berichte 03 – 03/1981

Revised 1981 version of Informatik-Bericht No. 3 of 1980


Abstract: The well known algorithms due to Williams and to Floyd for the construction of a heap are considered with respect to their probabilistic properties. For Williams' algorithm the distribution of the heaps generated by this method is investigated by means of a special kind of labeled trees.

Floyd's algorithm is considered in greater detail: the expected numbers of interchanges and of comparisons are derived, and the leading terms of the asymptotic expansions are given.

* A preliminary version has been reported at the 21st Symposium on Foundations of Computer Science, Syracuse, N.Y., Oct. 13-15, 1980

** Fachbereich Mathematik-Informatik, Fernuniversität Hagen, Feithstr. 140, D-5800 Hagen, West Germany (on leave)


1. Introduction

Heapsort is a well known algorithm for sorting. It works on a data structure commonly called a heap, and consists of two phases: in the first, the items to be sorted are made into a heap, and in the second, the root of the heap, which contains the greatest item, is deleted and the heap property is reconstructed; this is repeated until all items are processed. There are two methods known for heap construction: the first, originally invented by Williams (WI), constructs the heap by repeated insertions into an originally empty heap structure; the second, devised by Floyd (FL), starts from heaps consisting of just one point each and proceeds to combine smaller heaps into greater ones until all items are made into a heap.

In this paper a closer look is taken at the distributional aspects of heap construction by means of Williams' as well as Floyd's method. In the case of repeated insertions it will be shown how these insertions affect equidistribution (or randomness, as it is sometimes called).

This is done by means of some kind of trees: given a heap, a labeled tree is defined which allows us to reconstruct all possible preimages of this heap, hence all possible configurations of items which result in the same heap when repeated insertions are used. Thus the number of leaves of this tree permits us to count the number of ways the heap may be constructed, and this observation permits us to formulate the distribution of the heaps when initially all possible configurations of items are equally probable. In contrast to this, Floyd's method preserves equidistribution: if all configurations initially have the same probability to occur, so does any heap which is constructed from them. This property of Floyd's algorithm is well known, and has been proved by Knuth in a purely combinatorial manner (KN, Theorem 5.2.3.H).


In this paper an attempt is made to use methods of Mathematical Analysis for the derivation of the distributions by means of a method which may be called the method of transformed measures. The idea behind this method may be explained in the following way: suppose we want to analyze the behavior of an algorithm when its input is assumed to be governed by a particularly chosen distribution Dis, hence Dis(A) is the probability chosen for the event that the input is taken from the set A. Let f be the function which is computed by the algorithm, hence $\mathrm{Dis}(f^{-1}[B])$ is the probability that the output of the algorithm will be an element of the set B, which is part of the range space for the algorithm. Consequently, we have the originally given distribution Dis, and the image distribution $\mathrm{Dis} \circ f^{-1}$. By having a careful look at the correspondence between Dis and its image $\mathrm{Dis} \circ f^{-1}$ it will be possible to make some observations on the average performance of the algorithm.

But before giving the impression that too much is promised, it should be pointed out that the initial distribution considered here is closely related to equidistribution by some special kind of density.

After having established the quoted result for Floyd's algorithm, it is used for a derivation of the expected number of interchanges, and of comparisons, which are needed by this algorithm. These theoretically derived expectations are confirmed experimentally: running the algorithm repeatedly on randomly chosen data reveals the same behavior as predicted mathematically.

We are mainly interested in the distribution, and the average number of interchanges, and comparisons, respectively. Here we greatly profit from the investigation of the anatomy of a heap, as studied by Knuth in greater detail (KN, 5.2.3.). From these geometrical results he derives some aspects of the average behavior of the heap construction phase in heapsort due to Floyd, which is called sift up there. For example, Knuth has a look at the average number of keys which are promoted during sift-ups. Knuth's approach is based on counting. In contrast to this, Gonnet and Rodgers use some continuous models in order to investigate a family of data structures which include heaps (GOR).

Usually, a heap of N elements may be considered as a labeled tree: let the set {1, ..., N} be represented as a tree by regarding 1 as the root of the tree and letting the node $i \div 2$ ($\div$ denoting integer division) be the father of node i, $2 \le i \le N$. Moreover the i-th element of the heap is thought of as a label for node i. Gonnet and Rodgers observe that there may be more general "father functions" than the assignment $i \mapsto i \div 2$, and consequently they investigate structures arising from some such father functions. Among others, they have a look at heap construction, but their results are addressed mainly to finding information theoretic values.

Let us have a look at the organization of this paper: in Section 2, heaps are defined formally, some anatomical details of heaps are quoted from KN, and the algorithms investigated are presented for the reader's convenience. In Section 3, measure transformations are discussed briefly. Moreover, a classical result for measure transformations, viz., the Change of Variables Formula from Classical Analysis is quoted. Section 4 deals with Williams' algorithm for heap construction by repeated insertions. The trees already mentioned are discussed, and it is shown how these trees may be used in a formulation of the probability distribution resulting from this algorithm. Finally, Section 5 deals with Floyd's algorithm. By introducing viable paths for a description of the ways keys move in the tree corresponding to the heap, the result that this algorithm preserves equidistribution is


respectively. Moreover some experimental results are displayed.

Closing this Introduction, the author gratefully wants to acknowledge discussions with Philippe Flajolet, and with Gaston Gonnet on the subject presented here.

2. Heaps

Fix N as the number of items to be processed for the rest of this paper. The set {1, ..., N} may be thought to be represented as a tree in the following manner: 1 is the root of the tree, and the node i has the node $i \div 2$ as its father. Now label this tree by some real numbers in such a manner that no son is assigned a greater label than its father has. This labeling then is said to constitute a heap. More formally, a vector $(x_1, \ldots, x_N)$ is said to be a heap iff $x_{i \div 2} \ge x_i$ holds for $2 \le i \le N$. Let us assume that N has the binary representation $(1 b_{n-1} \ldots b_1 b_0)_2$, in particular $n = \lfloor \log_2 N \rfloor$ holds. Since $N = (1 b_{n-1} \ldots b_0)_2$ is the rightmost node on level n, and since $(1 b_{n-1} \ldots b_j)_2$ is the father of node $(1 b_{n-1} \ldots b_{j-1})_2$, the nodes $(1 b_{n-1} \ldots b_i)_2$, $0 \le i \le n$, constitute the path from the rightmost leaf on level n to the root. This path is geometrically distinguished from other paths in the tree, and Knuth calls the corresponding nodes the special nodes for the heap. These nodes are distinguished in another respect from the nonspecial nodes in this tree: having a look at Fig. 1, it is seen that the subtrees rooted at nonspecial nodes are always perfectly balanced. Now let y(i) denote the size of the subtree rooted at i, then it is shown in KN, Ex. 5.2.3.20 that


holds. Let us call the node i a left resp. right node on level k if $2^k \le i < (1 b_{n-1} \ldots b_{n-k})_2$ resp. if $(1 b_{n-1} \ldots b_{n-k})_2 < i \le 2^{k+1} - 1$ holds. It is easily observed that there exist

$$(1 b_{n-1} \ldots b_{n-k})_2 - 2^k$$

left nodes, and

$$2^{k+1} - 1 - (1 b_{n-1} \ldots b_{n-k})_2$$

right nodes on level k. A left node on level k has $2^{n-k+1} - 1$ offspring. These numbers will be used when calculating expectations later.

Fig. 1 displays the canonical tree belonging to N = 13; with each node there is associated the number which represents it, and the special path is given particular attention by drawing it with heavier lines.

Note that 13 has the binary representation $(1101)_2$, hence n = 3.

- Fig. 1 -
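The anatomy described above can be checked mechanically for N = 13. The following is a minimal Python sketch, not part of the original report; the function names `special_path` and `subtree_size` are chosen here for illustration only.

```python
def special_path(n_items):
    """Special nodes: the path from node n_items up to the root 1."""
    path = []
    i = n_items
    while i >= 1:
        path.append(i)
        i //= 2          # the father of node i is i div 2
    return path


def subtree_size(i, n_items):
    """Number of nodes j <= n_items lying in the subtree rooted at i."""
    size, lo, hi = 0, i, i
    while lo <= n_items:
        # nodes lo..hi form one level of the subtree; clip at n_items
        size += min(hi, n_items) - lo + 1
        lo, hi = 2 * lo, 2 * hi + 1
    return size


N = 13
path = special_path(N)
sizes = {i: subtree_size(i, N) for i in range(1, N + 1)}
print(path)
print(sizes)
```

For N = 13 the special path consists of the nodes 13, 6, 3, 1, and every node to the left of the special node on its level roots a perfectly balanced subtree.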

Now assume that we have a heap with N elements, and we want to insert a new element into this heap in such a way that the heap structure is not affected. This is done in the following way: a new node N+1 is created, and appended to the tree. Now the new element traverses the special path with respect to {1, ..., N+1} in order to find a position which is not in conflict with the heap property. The node found in this way is labeled with the new element, and all special nodes the new element has investigated receive their father's label. In this way a heap with N+1 elements is created. Observing that an array with just one component constitutes a heap, the procedure described above can be used to create a heap. Assume a is an array [1..N] of reals; the following procedure w creates a heap from a:

procedure w(var a);
begin
  for k := 2 to N do begin
    j := k; i := j div 2; x := a[j];      (* div: integer division *)
    while (i > 0) and (x > a[i]) do begin
      a[j] := a[i];
      j := i; i := i div 2
    end (*while*);
    a[j] := x
  end (*for k*)
end (*w*).

This is the algorithm invented originally by Williams. It will be investigated in Section 4.
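For readers who want to experiment, the procedure w can be transcribed almost literally. The following Python sketch is an illustration under the assumption that the array is stored 1-based with an unused slot at index 0; the names `williams_heap` and `is_heap` are not from the report.

```python
def williams_heap(items):
    """Build a heap by repeated insertion (a transcription of procedure w)."""
    a = [None] + list(items)      # a[1..N]; index 0 is unused
    n = len(items)
    for k in range(2, n + 1):
        j, i, x = k, k // 2, a[k]
        # Python's `and` short-circuits, so a[0] is never inspected
        while i > 0 and x > a[i]:
            a[j] = a[i]           # the father's label moves down
            j, i = i, i // 2
        a[j] = x
    return a[1:]


def is_heap(v):
    """Heap property: no son carries a greater label than its father."""
    return all(v[i // 2 - 1] >= v[i - 1] for i in range(2, len(v) + 1))


print(williams_heap([3, 2, 1, 4]))
```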

Let us have a look at Floyd's method to construct heaps. Suppose the node j has the property that the subtrees rooted at the left and the right son of j both enjoy the heap property already. Then the tree rooted at j enjoys this property, provided the label of j is not smaller than the maximum of the labels of its sons (if there are any). If this is not the case, the label of j will be interchanged with the label of the greatest son, and the considerations are repeated with the subtree rooted at this son. In this way the label of j percolates through the subtree rooted at j, until a position is reached which satisfies the heap property. This idea is expressed in the well known procedure heapify(j), which has the array a as a global variable; the formulation of this procedure is taken from AHU, p.90.

procedure heapify(k: 1..N);
begin
  if k is not a leaf and a son of k has a larger element than k has then
  begin
    let j be the son of k with the largest element;
    interchange a[j] and a[k];
    heapify(j)
  end (*if*)
end (*heapify*).

Define a vector $(x_1, \ldots, x_N)$ to be a k-heap iff $x_{i \div 2} > x_i$ holds for any i with $k \le i \div 2 < i \le N$ (or, less formally, if the subtree rooted at k enjoys the heap property). Then it is seen from the procedure that after leaving it, a will be a k-heap, provided it has been a (k+1)-heap upon entry, when heapify is called with the actual parameter k. The procedure f, which makes a into a heap, is then formulated as follows:

procedure f(var a);
  procedure heapify(k: 1..N);
    ...
  end (*heapify*);
begin
  for k := N div 2 downto 1 do heapify(k)
end (*f*).

This is Floyd's algorithm for heap construction. It will be considered in Section 5 in greater detail.
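The bottom-up construction can be sketched in Python as follows. This is an illustration only: the recursion of heapify is replaced by an equivalent loop, the array is assumed to be 1-based with an unused slot at index 0, and the return value of `heapify` (the number of interchanges) is an addition made here for later experiments.

```python
def heapify(a, k, n):
    """Let the label of node k percolate down; return the number of interchanges."""
    swaps = 0
    while 2 * k <= n:                     # k is not a leaf
        j = 2 * k                         # left son
        if j + 1 <= n and a[j + 1] > a[j]:
            j += 1                        # right son carries the larger label
        if a[j] <= a[k]:
            break                         # heap property holds at k
        a[j], a[k] = a[k], a[j]
        k = j
        swaps += 1
    return swaps


def floyd_heap(items):
    """Build a heap bottom-up (a transcription of procedure f)."""
    a = [None] + list(items)              # a[1..N]; index 0 is unused
    n = len(items)
    for k in range(n // 2, 0, -1):
        heapify(a, k, n)
    return a[1:]


print(floyd_heap([1, 2, 3, 4, 5]))
```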


3. Transformed Measures

This short Section is intended to give a general idea of measure transformations. Suppose we have measurable spaces X and Y, serving as the respective spaces of inputs and outputs for some algorithm, which is represented by a measurable map $f: X \to Y$. Here measurability of a space means that it is a set endowed with a $\sigma$-field, and measurability of a map means that the preimage of a measurable set in the range is a measurable set in the domain. Now let $\mu$ be a probability measure on X, which describes the probabilistic nature of the input to the algorithm.

Given a measurable set B in Y, $\mu(f^{-1}[B])$ then is the probability that the algorithm produces an element of B, hence the measure

$$\mu \circ f^{-1}: B \mapsto \mu(\{x;\ f(x) \in B\})$$

describes the probabilistic characteristics of the output of the algorithm conditioned on its input characteristic.

It is clear that $\mu \circ f^{-1}$ is a probability measure, and it is easily shown that any measurable map g from Y to the reals is integrable with respect to $\mu \circ f^{-1}$ iff the pulled back map $g \circ f: X \to \mathbb{R}$ is integrable with respect to $\mu$. Moreover it is not difficult to show that in the case of integrability the equality

$$\int_Y g(y)\,(\mu \circ f^{-1})(dy) = \int_X g(f(x))\,\mu(dx)$$

holds. This transformation might be called output oriented, since any measurable subset of the output space is assigned the probability that the output produced will be a member of it.
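On a finite input space the integrals reduce to sums, and the equality can be checked directly. The following Python sketch is a toy illustration only; the measure `mu`, the map `f` and the integrand `g` are arbitrary choices made here and do not occur in the report.

```python
from collections import defaultdict
from fractions import Fraction


def pushforward(mu, f):
    """Image measure mu o f^{-1} of a finite probability measure mu."""
    image = defaultdict(Fraction)          # Fraction() is zero
    for x, p in mu.items():
        image[f(x)] += p
    return dict(image)


mu = {x: Fraction(1, 6) for x in range(1, 7)}   # uniform inputs 1..6


def f(x):
    return x % 3                                 # the "algorithm"


def g(y):
    return y * y                                 # a cost on outputs


nu = pushforward(mu, f)
lhs = sum(g(y) * p for y, p in nu.items())       # integral of g w.r.t. mu o f^{-1}
rhs = sum(g(f(x)) * p for x, p in mu.items())    # integral of g o f w.r.t. mu
print(lhs, rhs)
```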

This equality is too general to be of use in the present situation.

Let us consider the case in which both X and Y are open and bounded subsets of $\mathbb{R}^n$ for some n, and that $f: X \to Y$ is a differentiable homeomorphism with a continuous inverse. Then the situation becomes easier, if we assume that $\mu$ is of the form

$$\mu(A) = \int_A F(x_1, \ldots, x_n)\,dx_1 \ldots dx_n = \int_A F\,d\lambda^n,$$

where $\lambda^n$ denotes n-dimensional Lebesgue measure; here F is a suitably chosen density. Now consider $\mu \circ f^{-1}$; at first it will be shown that $(\mu \circ f^{-1})(A) = 0$ holds for any Borel set A, provided $\lambda^n(A) = 0$ holds:

$$= \int_A F \circ f\,d\lambda^n = 0.$$

Hence by the classical Radon-Nikodym Theorem (RU, p.166) we see that there exists a measurable function $G: Y \to \mathbb{R}$ such that

$$(\mu \circ f^{-1})(A) = \int_A G\,d\lambda^n$$

holds for any measurable set A in Y. But now the Change of Variables Formula from Calculus applies:

$$\int_A G\,d\lambda^n = \int_{f^{-1}[A]} G(f(x_1, \ldots, x_n))\,|\det f'(x_1, \ldots, x_n)|\,dx_1 \ldots dx_n = \int_{f^{-1}[A]} G \circ f\,|\det f'|\,d\lambda^n$$

(RU, p.185f.). Thus we see that the computation of the distributional effect of the algorithm can be reduced to the computation of some n-dimensional integrals (which may be difficult enough). A sometimes more convenient statement of the latter equality is that we have for any continuous $t: Y \to \mathbb{R}$ the equality

$$\int_Y t\,d(\mu \circ f^{-1}) = \int_X (t \cdot G) \circ f\,|\det f'|\,d\lambda^n.$$

At first glance, however, this approach does not look promising, simply since the Radon-Nikodym Theorem is an Existence Theorem, which does not give any hints how to construct the function G above. The practicability of the integrals above stands or falls, however, with the possibility to handle G in a convenient manner. But it will be seen below how to overcome this difficulty in concrete cases.

A second remark addresses the applicability of the conditions under which the formula above is established: when constructing heaps, it is not clear where a differentiable homeomorphism with a continuous inverse might arise at all. It will turn out later that this kind of map emerges from suitably chosen restricted input domains to various parts of the algorithms in question.

In order to close this Section, an overall assumption is made: it is assumed that the inputs for the algorithms considered are chosen from the set G of all vectors with N real components, which are mutually distinct and lie in the closed interval [0,1]. The initial distribution $\mu$ is assumed to be

$$\mu(A) = \int_A F\,d\lambda^N,$$

where $F: G \to \mathbb{R}_+$ is measurable and symmetric, that is,

$$F(x_1, \ldots, x_N) = F(x_{p(1)}, \ldots, x_{p(N)})$$

holds for any permutation p of {1, ..., N}.

Note that two well known cases are covered as special cases by this assumption:

1.) Consider the special case that $F(x_1, \ldots, x_N) = f(x_1) \cdot \ldots \cdot f(x_N)$.

Having in mind the components $x_i$, rather than the vector $(x_1, \ldots, x_N)$, this is the model in which the components from which a heap is to be constructed are identically distributed and independent random variables.

This holds in particular in case f is the uniform distribution (GOR), but f may be any measurable function on [0,1] such that

$$\int_0^1 f(x)\,dx = 1.$$

2.) Let P(N) be the symmetric group over {1, ..., N}. Given $x \in G$, let p(x) be the canonical renumbering of x, i.e. p(x) is a vector of N components taken from the set {1, ..., N} such that the i-th component of p(x) is k iff $x_i$ is the k-th smallest component in x. Then p(x) may be considered as a member of P(N), and $x \mapsto p(x)$ is onto. Let us have a look at the distributional properties of this map: given $p \in P(N)$, consider the set

$$H(p) := \{x \in G;\ p(x) = p\}$$

of all x which are mapped onto p. Now let g(x) be x with its i-th and j-th component interchanged, and denote by <i,j> the permutation which interchanges i with j, and fixes the other components. Then we have

$$x \in H(p \circ \langle i,j \rangle) \text{ iff } g(x) \in H(p),$$

hence the Change of Variables Formula, together with the symmetry of F, implies that

$$\int_{H(p \circ \langle i,j \rangle)} F\,d\lambda^N = \int_{H(p)} F\,d\lambda^N.$$

Since cycles of the form <i,j> generate P(N) algebraically, from this argument it is concluded that

$$\int_{H(p)} F\,d\lambda^N = \int_{H(\mathrm{id})} F\,d\lambda^N,$$

denoting by id the identity permutation. Hence any permutation has the same probability to occur in this derived model. Consequently, random permutations as considered by Knuth (KN, Theorem 5.2.3.H, p.155) are covered by this model, too.
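The canonical renumbering is easy to state in code. The following Python sketch is an illustration only; the names `renumber` and `swap` and the sample vectors are chosen here and are not taken from the report. Components are 0-based in the code, while the ranks are 1-based as in the text.

```python
from itertools import permutations


def renumber(x):
    """Canonical renumbering p(x): component i becomes k iff x[i] is the k-th smallest."""
    order = sorted(range(len(x)), key=lambda i: x[i])
    p = [0] * len(x)
    for rank, i in enumerate(order, start=1):
        p[i] = rank
    return tuple(p)


def swap(x, i, j):
    """The vector x with its i-th and j-th components interchanged."""
    y = list(x)
    y[i], y[j] = y[j], y[i]
    return tuple(y)


x = (0.3, 0.9, 0.1)
print(renumber(x))

# every permutation of {1, 2, 3} occurs as the renumbering of some vector
images = {renumber(v) for v in permutations((0.2, 0.5, 0.7))}
print(len(images))
```

Interchanging two components of x interchanges the same two components of p(x), which is the correspondence between H(p∘<i,j>) and H(p) used in the argument above.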

4. Williams' Method: Computing the Distribution

Consider the heap (5,4,3,1,2), then it may be shown that there are exactly 26 permutations in P(5) which generate this heap by repeated insertion, whereas the heap (5,4,1,3,2) is produced only six times.

This shows that Williams' algorithm for constructing heaps does affect equidistribution rather heavily. In this Section it will be shown what the corresponding distribution looks like.
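The two counts quoted for N = 5 can be reproduced by brute force over all 120 permutations. The following Python sketch is an illustration only; `williams_heap` is a transcription of procedure w with the array assumed to be 1-based, and the name is chosen here.

```python
from collections import Counter
from itertools import permutations


def williams_heap(items):
    """Heap construction by repeated insertion (procedure w)."""
    a = [None] + list(items)              # a[1..N]; index 0 is unused
    for k in range(2, len(items) + 1):
        j, i, x = k, k // 2, a[k]
        while i > 0 and x > a[i]:
            a[j] = a[i]
            j, i = i, i // 2
        a[j] = x
    return tuple(a[1:])


# how often does each heap arise from the 5! equally likely permutations?
counts = Counter(williams_heap(p) for p in permutations(range(1, 6)))
for heap, c in sorted(counts.items()):
    print(heap, c)
```

If repeated insertion preserved equidistribution, each of the eight heaps on five elements would arise 15 times; the tally shows that this is not the case.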

Let W(k) be the set of all $x \in G$ with the property that the first k components form a heap already, and define the map $T_{k+1}: W(k) \to W(k+1)$ by the following procedure

procedure insert(k+1);
begin
  j := k+1; i := j div 2; y := x[k+1];   (* y keeps the inserted value, as x does in procedure w *)
  while (i > 0) and (y > x[i]) do begin
    x[j] := x[i]; j := i; i := j div 2
  end;
  x[j] := y
end (*insert*).

Hence $T_{k+1}$ has the effect of inserting the (k+1)-st component into such a place that a member of W(k+1) arises, provided the argument comes from W(k). Since W(N) is the set of all heaps, and since $T_k$ will be shown to be surjective for any k, the effect of Williams' procedure w, as described in Section 2, is $T_N \circ T_{N-1} \circ \ldots \circ T_2$, when the algorithm is considered as a map.

Let us describe the maps $(T_k)_{2 \le k \le N}$ more carefully. For this, define

$$t(k,0) := k, \qquad t(k,j) := t(k,j-1) \div 2, \text{ provided } 1 \le j \le \lfloor \log_2 k \rfloor,$$

hence $\{t(k,j);\ 0 \le j \le \lfloor \log_2 k \rfloor\}$ is the special path in the tree corresponding to {1, ..., k}. It will be helpful to make use of the brothers of these special nodes: let b(k,j) be the brother of t(k,j) for $1 \le j \le \lfloor \log_2 k \rfloor$. Care must be taken with b(k,0), since t(k,0) has no brother in the tree corresponding to {1, ..., k} in case k is even. In this case $x_{b(k,0)}$ is defined to be $-\infty$. Now consider $x \in W(k-1)$ together with its image $z := T_k(x)$. Assume that $x_{t(k,r)} < x_k < x_{t(k,r+1)}$ holds for some r, $0 < r < \lfloor \log_2 k \rfloor$. Then we will have

$$z_{t(k,j)} = x_{t(k,j+1)} \text{ for any } j \text{ with } 0 \le j \le r-1,$$

$$z_{t(k,r)} = x_k,$$

and

$$z_i = x_i, \text{ if } i \notin \{t(k,j);\ 0 \le j \le r\}.$$

Consequently, $z_{t(k,j)}$ is greater than $z_{b(k,j)}$, since $x \in W(k-1)$, provided $0 \le j \le r-1$. Hence as the image of an element in W(k), z is somewhat skew, since some nodes on the special path w.r.t. {1, ..., k} have greater labels than their brothers. It turns out that this property will make it possible to describe the effect of $T_k$.

Define

$$W(k,r) := \{x \in W(k);\ x_{t(k,j)} > x_{b(k,j)} \text{ holds for } 0 \le j \le r-1\},$$

thus $x \in W(k,r)$ iff there is an initial segment of r nodes on the special path which have a greater label than their brothers have. In order to have a look at the preimage of W(k,r) under $T_k$, let V(k-1,r) be the set of all those $x \in W(k-1)$ for which the following holds:

$$\begin{cases} x_k < x_{t(k,1)}, & \text{if } r = 0,\\ x_{t(k,r)} < x_k < x_{t(k,r+1)}, & \text{if } 0 < r < \lfloor \log_2 k \rfloor,\\ x_{t(k,\lfloor \log_2 k \rfloor)} < x_k, & \text{if } r = \lfloor \log_2 k \rfloor. \end{cases}$$

Hence, if $x \in V(k-1,r)$, the position $x_k$ will have after executing the procedure corresponding to $T_k$ is clear:

4.1 Lemma:

$T_k: V(k-1,r) \to W(k,r)$ is a bijection for any r, $0 \le r \le \lfloor \log_2 k \rfloor$. □

Note that $\{V(k-1,r);\ 0 \le r \le \lfloor \log_2 k \rfloor\}$ is a disjoint cover for W(k-1), whereas $W(k) = W(k,0) \supset W(k,1) \supset \ldots \supset W(k,\lfloor \log_2 k \rfloor)$. Since $\{W(k,r);\ 0 \le r \le \lfloor \log_2 k \rfloor\}$ forms a chain under inclusion, for any $x \in W(k)$ there exists a maximal r such that $x \in W(k,r)$. Consequently, x has exactly r+1 preimages, one in each V(k-1,s), $0 \le s \le r$.

When restricted to V(k-1,r), the map $T_k$ equals that permutation of the components which corresponds to the cycle <t(k,r), ..., t(k,0)>. Denote the restriction of $T_k$ to V(k-1,r) by $Y_{k,r}$, and let $X_{k,r}: W(k,r) \to V(k-1,r)$ be the inverse map (corresponding to the cycle <t(k,0), ..., t(k,r)>) which undoes the effect of insertion. These maps will serve as labels for a tree now. Define for $x \in W(2)$ the tree $T_2(x)$ by

- Fig. 2 -

This means that x may have exactly two preimages in W(1) with respect to $T_2$, viz., $x = Y_{2,0}(x')$ for some $x' \in W(1)$, or the first and the second component of some $x'' \in W(1)$ had to be interchanged, hence $x = Y_{2,1}(x'')$. Now let us proceed in an inductive manner: assume that for any $x \in W(k)$ a tree $T_k(x)$ is defined, and let for $x \in W(k+1)$ r be the greatest index s such that $x \in W(k+1,s)$ holds, hence $x \in W(k+1,j)$ holds for $0 \le j \le r$. By construction, there are exactly r+1 preimages $X_{k+1,j}(x) \in V(k,j)$. Then define $T_{k+1}(x)$ as the tree which has k+1 as its root, and $T_k(X_{k+1,j}(x))$, $0 \le j \le r$, as subtrees of the root, such that the edge connecting k+1 with the root of $T_k(X_{k+1,j}(x))$ is labeled by $X_{k+1,j}$. These trees yield a complete description of the history of the respective vectors in the following sense: any path from the root of $T_{k+1}(x)$ to some leaf corresponds to a mapping which is obtained by composing the maps which serve as labels for the respective edges. If x serves as an input at the root, and for any path the map so obtained is applied, then some elements from W(1) will result as outputs at the leaves. It is then not difficult to show that exactly these output vectors are mapped by the composition $T_{k+1} \circ T_k \circ \ldots \circ T_2$ onto x. This applies to permutations as well: consider as an example $(4,3,1,2) \in W(4,2)$. The corresponding

tree is

- Fig. 3 -

From this it is seen that the heap in question is constructed by Williams' method from exactly the following permutations: (4,3,1,2),

(3,4,1,2), (4,2,1,3), (2,4,1,3), (3,2,1,4), (2,3,1,4).
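The leaves of such a tree can be enumerated by undoing insertions along the special path. The following Python sketch is a reading of the construction above, not code from the report; the names `preimages` and `leaves` are chosen here, vectors are stored 1-based with an unused slot at index 0, and all labels are assumed to be distinct.

```python
def preimages(z, k):
    """All x whose first k-1 components form a heap and whose insertion step k yields z."""
    path = []                         # special path t(k,0), t(k,1), ..., 1
    i = k
    while i >= 1:
        path.append(i)
        i //= 2
    result = []
    for r in range(len(path)):
        x = list(z)
        for j in range(r):            # undo the cycle along the special path
            x[path[j + 1]] = z[path[j]]
        x[k] = z[path[r]]
        result.append(x)
        if r == len(path) - 1:
            break                     # the root has been reached
        t = path[r]
        brother = t ^ 1               # sibling of node t (t >= 2)
        # a missing brother counts as minus infinity; otherwise z must
        # lie in W(k, r+1), i.e. the special node must beat its brother
        if brother <= k and z[t] < z[brother]:
            break
    return result


def leaves(z, k):
    """Inputs that repeated insertion maps onto the heap z[1..k]."""
    if k == 1:
        return [tuple(z[1:])]
    return [leaf for x in preimages(z, k) for leaf in leaves(x, k - 1)]


heap = (4, 3, 1, 2)
found = sorted(leaves([None] + list(heap), len(heap)))
print(found)
```

The number of leaves returned for a heap is the quantity that the text goes on to call M_k(x).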

Let us now have a look at the distributional properties which are induced by this method, thus by the sequence $(T_k)_{2 \le k \le N}$ of maps. Define $\mu_k := \mu \circ (T_k \circ T_{k-1} \circ \ldots \circ T_2)^{-1}$, hence $\mu_k$ is the distributional description of the insertion process. It turns out that the trees defined above are a helpful tool in describing $\mu_k$: given $x \in W(k)$, denote by $M_k(x)$ the number of leaves of $T_k(x)$.

4.2 Theorem:

$$\mu_k(A) = \int_A F \cdot M_k\,d\lambda^N$$

holds for every Borel set A in W(k).

Before proving this statement, let us remember the indicator function $1_A$ of a set A:

$$1_A(x) := \begin{cases} 1, & \text{if } x \in A,\\ 0, & \text{otherwise.} \end{cases}$$

Proof: Proceeding by induction on k, we see that the case k = 2 is obvious. In order to establish the inductive step, let $h: W(k+1) \to \mathbb{R}$ be bounded and continuous, then we have the following chain of equalities:

$$\begin{aligned}
\int_{W(k+1)} h\,d\mu_{k+1}
&\overset{(1)}{=} \int_{W(k)} h \circ T_{k+1}\,d\mu_k\\
&\overset{(2)}{=} \sum\Big\{\int_{V(k,r)} h(Y_{k+1,r}(x)) \cdot F(x) \cdot M_k(X_{k+1,r}(Y_{k+1,r}(x)))\,\lambda^N(dx);\ 0 \le r \le \lfloor \log_2(k+1) \rfloor\Big\}\\
&\overset{(3)}{=} \sum_r \int_{W(k+1,r)} h(x) \cdot F(x) \cdot M_k(X_{k+1,r}(x))\,\lambda^N(dx)\\
&\overset{(4)}{=} \int_{W(k+1)} h(x) \cdot F(x) \cdot \sum_r M_k(X_{k+1,r}(x)) \cdot 1_{W(k+1,r)}(x)\,\lambda^N(dx)\\
&\overset{(5)}{=} \int_{W(k+1)} h(x) \cdot F(x) \cdot M_{k+1}(x)\,\lambda^N(dx).
\end{aligned}$$

Let us comment on the equalities:

(1) holds by definition of $\mu_{k+1}$, since $\mu_{k+1}$ may be written as $\mu_k \circ T_{k+1}^{-1}$,

(2) holds since the sets $\{V(k,r);\ 0 \le r \le \lfloor \log_2(k+1) \rfloor\}$ form a disjoint cover of W(k), hence the additivity of the integral may be used together with the induction hypothesis,

(3) is a joint application of Lemma 4.1 and the Change of Variables Formula,

(4) is the additivity of the integral,

(5) is implied by $M_{k+1}(x) = \sum_r M_k(X_{k+1,r}(x)) \cdot 1_{W(k+1,r)}(x)$; this equality holds because of the definition of $T_{k+1}(x)$. □

Note that this Theorem removes the question mark with respect to heaps in FFV, Table 4, where the standard probabilities, i.e., the effect of repeated insertions, are listed for some data structures.

At this point we stop the investigation of Williams' algorithm for heap construction. There remains plenty of work to be done, since the determination of the distribution above is only the first step towards the computation of the expected number of comparisons, and the expected data movement.

5. Floyd's Algorithm: Computing the Expectations

Floyd's algorithm is based on the idea that in case both the left and the right subtree of a node k enjoy the heap property, the label of the root will be interchanged with the greatest label of one of its sons, if necessary, and that the same considerations must be done for the subtree rooted at this son. In this way the label of k percolates through the tree. It will be convenient to be able to keep track of the movements of this label. This is done by means of a word v over the alphabet {0,1}, such that the letter 0 means that the label of k is interchanged with the left son of the node which it labels at the moment, and 1 means that it takes the right way. Denote by H(k,v) the final node of the path v when this path is started in node k. More formally, let s(v) be the numerical value of v when considered as a binary number (e.g., $s(00110) = (110)_2 = 6$), and let |v| be the length of the word v, then H(k,v) is easily seen to be equal to $2^{|v|} \cdot k + s(v)$.

As above, let us agree that $x \in G$ is said to be a k-heap if the subtree rooted at k is a heap, hence if the heap defining property $x_{i \div 2} > x_i$ holds for any i such that $i \div 2$ is in the subtree rooted at k. Let

$$B(N \div 2 + 1) := G,$$

$$B(k) := \{x \in B(k+1);\ x \text{ is a } k\text{-heap}\},$$

then a call to heapify(k) transforms a member of B(k+1) into a member of B(k), provided it is implemented by the array a, which is handled in the formulation of the procedure.

Denote by $Z_k: B(k+1) \to B(k)$ the map which corresponds to the effect of a call to heapify(k). Hence heap creation by Floyd's algorithm corresponds to the map $Z_1 \circ Z_2 \circ \ldots \circ Z_{N \div 2 - 1} \circ Z_{N \div 2}$.

Call a path v viable in k iff the path, when started in k, does not end outside the tree corresponding to {1, ..., N}; an equivalent formulation is to say that v is viable in k iff $H(k,v) \le N$ holds. Now denote for a viable path v by $S_k(v)$ the set of all (k+1)-heaps which have the property that the way the label of the node k goes is described by v. More formally, a vector $x \in B(k+1)$ is a member of $S_k(v)$ iff the following conditions hold:

$$x_k < \max\{x_{H(k,v_{i-1}0)}, x_{H(k,v_{i-1}1)}\} = x_{H(k,v_i)}$$

is true for $1 \le i \le |v|$, and if $H(k,v) < (N-1)/2$, then

$$x_k > \max\{x_{H(k,v0)}, x_{H(k,v1)}\},$$

if however v ends in (N-1)/2, i.e. if $H(k,v) = (N-1)/2$, then

Here $v_i$ is the length i prefix of v. Note that $\{S_k(v);\ v \text{ is viable in } k\}$ constitutes a disjoint cover of B(k+1). It will be shown now that $Z_k$ is a bijection onto B(k), when restricted to an arbitrary $S_k(v)$. Hence all of B(k+1) folds onto B(k), and any member in B(k) has the same number of preimages. From this it is deduced now that equidistribution is preserved during heap construction.
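The bookkeeping with words over {0,1} is straightforward to mirror in code. The following Python sketch is an illustration only; paths are represented as strings of the characters 0 and 1, and the names `s`, `end_node`, `walk` and `viable_paths` are chosen here.

```python
def s(v):
    """Numerical value of the word v read as a binary number (0 for the empty word)."""
    return int(v, 2) if v else 0


def end_node(k, v):
    """H(k, v) computed by the closed form 2^|v| * k + s(v)."""
    return (2 ** len(v)) * k + s(v)


def walk(k, v):
    """H(k, v) computed by following the path: 0 = left son, 1 = right son."""
    for letter in v:
        k = 2 * k + int(letter)
    return k


def viable_paths(k, n_items):
    """All words v whose path from k stays inside the tree on {1, ..., n_items}."""
    paths, frontier = [], [""]
    while frontier:
        v = frontier.pop()
        if end_node(k, v) <= n_items:
            paths.append(v)
            frontier += [v + "0", v + "1"]
    return paths


print(s("00110"), end_node(3, "01"), walk(3, "01"))
print(sorted(viable_paths(3, 13)))
```

For N = 13 and k = 3 there are five viable paths, one for each node of the subtree rooted at 3.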

5.1 Lemma:

Given a viable path v, $Z_k: S_k(v) \to B(k)$ is a bijection.

Proof: From the definition of $Z_k$ it is deduced that $Z_k[S_k(v)] \subset B(k)$ holds, and it is readily seen that $Z_k$ is 1-1, when restricted to $S_k(v)$.

In order to establish that $Z_k$ is onto, consider an arbitrary $z \in B(k)$, and undo the effect of $Z_k$ having the course of v in mind. It is readily verified that in this way a member of $S_k(v)$ is found, the image of which under $Z_k$ is z. □

Now let us have a look at the distributional effect of the maps $\{Z_\ell;\ 1 \le \ell \le N \div 2\}$. For this, define

$$m_k := \mu \circ (Z_k \circ Z_{k+1} \circ \ldots \circ Z_{N \div 2})^{-1}$$

as the distribution of the k-heap, when the inputs are governed by the originally given distribution $\mu$.

5.2 Theorem:

Let $g_k$ be the product of all subtree sizes down to and including k, i.e. $g_k := y(N \div 2 + 1) \cdot \ldots \cdot y(k)$. Then

$$(*) \quad m_k(A) = g_k \int_A F\,d\lambda^N$$

holds for any Borel set $A \subset G$.

The proof proceeds by a backward induction on k, starting at $k = N \div 2 + 1$.

At first it is noted that the number of viable paths in any node coincides with the size of the subtree rooted at that node. This is so since any path is determined uniquely by its endnode.

1. Let $k = N \div 2 + 1$, then $Z_k$ is the identity, and $g_k$ equals unity, since only the empty path is viable. Hence $m_k$ equals $\mu$, as asserted.

2. Now assume that (*) is true for some $k+1 \le N \div 2 + 1$, and let $h: B(k) \to \mathbb{R}$ be a bounded and continuous map, then we have

$$\begin{aligned}
\int_{B(k)} h\,dm_k
&= \int_{B(k+1)} h \circ Z_k\,dm_{k+1}\\
&= g_{k+1} \int_{B(k+1)} h(Z_k(x)) \cdot F(x)\,\lambda^N(dx)\\
&= g_{k+1} \sum\Big\{\int_{S_k(v)} h(Z_k(x)) \cdot F(Z_k(x))\,\lambda^N(dx);\ v \text{ is a viable path for } k\Big\}\\
&= g_{k+1} \sum\Big\{\int_{B(k)} h(x) \cdot F(x)\,\lambda^N(dx);\ v \text{ is a viable path for } k\Big\}\\
&= g_{k+1} \cdot y(k) \int_{B(k)} h(x) \cdot F(x)\,\lambda^N(dx).
\end{aligned}$$

The first equality is the definition of $m_k$, the second uses the induction hypothesis, the third follows from the additivity of the integral. The next equality is a combination of Lemma 5.1 and the Change of Variables Formula (note that $F(Z_k(x)) = F(x)$ holds for every x, since F is assumed to be symmetric, and $Z_k$ permutes the order of the components of its argument). The last equality holds since the number of viable paths for k equals y(k). Having established this chain of equalities, the assertion is proved. □
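For the discrete special case of equally likely permutations, the statement can be checked by exhaustive enumeration. The following Python sketch is an illustration only, with N = 5 chosen here; `floyd_heap` is an iterative transcription of procedure f on a 1-based array.

```python
from collections import Counter
from itertools import permutations


def floyd_heap(items):
    """Bottom-up heap construction (procedure f), returned as a tuple."""
    a = [None] + list(items)              # a[1..N]; index 0 is unused
    n = len(items)
    for start in range(n // 2, 0, -1):
        k = start
        while 2 * k <= n:
            j = 2 * k
            if j + 1 <= n and a[j + 1] > a[j]:
                j += 1                    # pick the son with the larger label
            if a[j] <= a[k]:
                break
            a[j], a[k] = a[k], a[j]
            k = j
    return tuple(a[1:])


N = 5
counts = Counter(floyd_heap(p) for p in permutations(range(1, N + 1)))
for heap, c in sorted(counts.items()):
    print(heap, c)
```

For N = 5 the subtree sizes are 5, 3, 1, 1, 1, so their product is 15, and each of the 120 / 15 = 8 heaps is produced by exactly 15 permutations.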

It might be interesting to compare the proof above with the proof given by Knuth for the fact that heap creation preserves equidistribution in KN, p.155: Knuth works backward, too; he fixes some result in B(k) after having executed heapify(k) on a member of B(k+1), and shows then that this result has exactly y(k) possible ancestors in B(k+1) (we have changed notations to that employed here). The difference to the proof given here is that a much broader class of distributions is covered by 5.2, using methods from Calculus and Measure Theory.

Having established the fact that any heap has the same probability to occur, it is not difficult to extend this result to subheaps in case $F(x_1, \ldots, x_N) = f(x_1) \cdot \ldots \cdot f(x_N)$, i.e. in case the components of the originally given vectors are stochastically independent and identically distributed. Let k be an arbitrary node, and denote by $G_k$ the set of all heaps with components from [0,1] having y(k) components. Let

$$\mathrm{proj}_k: G_1 \to G_k$$

be the projection which assigns any heap the subheap rooted at k.

5.3 Corollary:

Assume that F is given as above. Then we have for every Borel set $A \subset G_k$ the equality

$$(m_1 \circ \mathrm{proj}_k^{-1})(A) = \prod\{y(j);\ j \text{ is in the subtree rooted at } k\} \cdot \int_A f(x_1) \cdot \ldots \cdot f(x_{y(k)})\,dx_1 \ldots dx_{y(k)},$$

stating that any subheap of a fixed size is equiprobable.

Proof: It will be more convenient to prove a seemingly sharper result, viz.,

$$\int_{G_1} \varphi \circ \mathrm{proj}_k\,dm_1 = \Pi \cdot \int_{G_k} \varphi(x_1, \ldots, x_{y(k)}) \cdot f(x_1) \cdot \ldots \cdot f(x_{y(k)})\,dx_1 \ldots dx_{y(k)}$$

for any continuous and bounded map $\varphi: G_k \to \mathbb{R}$. Here $\Pi$ is used as an abbreviation for the product above. But the latter identity is nothing but an easy application of Fubini's Theorem on product integration (RU, p.150f.), which is made possible by Theorem 5.2. □

Since the continuous model investigated here contains the discrete model of originally equidistributed permutations as a special case, the last Corollary applies to permutations as well. Consequently, 5.3 is another version of Lemma 1 in POS, in which any subheap is shown to be equiprobable. This Lemma is used by Porter and Simon to analyze heap insertion, i.e. the algorithm which operates on a heap and a new element which is to be inserted.


Given $x \in B(k+1)$, let $V_k(x)$ be the number of interchanges which are necessary in a call to heapify(k), when $x$ is implemented by the array a.

We would like to calculate the expectation of this map, i.e., the integral

$$\int_{B(k+1)} V_k \, dm_{k+1}.$$

From the definition of $S_k(v)$ it is immediate that $V_k(x)$ equals $|v|$, whenever $x \in S_k(v)$. Since every path has the same probability to be pursued, an educated guess might be that the expected value of $V_k$ coincides with the average path length in the subtree rooted at $k$.

5.4 Corollary:

$$\int_{B(k+1)} V_k \, dm_{k+1} = \gamma(k)^{-1} \sum\{|v|;\ v \text{ is a path for } k\}.$$

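Proof: 1. By the additivity of the integral, and the remark preceding the Corollary, the integral in question equals

$$\sum\{|v| \cdot m_{k+1}(S_k(v));\ v \text{ is a viable path for } k\},$$

hence $m_{k+1}(S_k(v))$ should be known.

2. By the Change of Variables Formula, and by an application of Lemma 5.1, we have

$$m_{k+1}(S_k(v)) = g_{k+1} \int_{S_k(v)} F(x)\, \lambda^N(dx) = g_{k+1} \int_{B(k)} F(x)\, \lambda^N(dx) = \frac{g_{k+1}}{g_k} \cdot g_k \int_{B(k)} F(x)\, \lambda^N(dx).$$

Since $g_k \int_{B(k)} F(x)\, \lambda^N(dx) = m_k(B(k)) = 1$, the value $m_{k+1}(S_k(v)) = g_{k+1}/g_k$ does not depend on $v$, so all viable paths for $k$ are equally probable.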
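For permutations, the statement that every viable path is pursued with the same probability can be observed directly. The sketch below is an illustration under assumptions made here (1-indexed max-heap, hypothetical helper names `heapify`, `end_node_counts`, `mean_interchanges`): it prepares a member of $B(k+1)$ from every permutation, calls heapify(k), and records the node at which the sinking item comes to rest.

```python
from collections import Counter
from fractions import Fraction
from itertools import permutations

def heapify(a, k, n):
    """Sink a[k] in the 1-indexed array a[1..n]; return the node where it stops."""
    while 2 * k <= n:
        j = 2 * k
        if j < n and a[j + 1] > a[j]:   # pick the larger son
            j += 1
        if a[k] >= a[j]:
            break
        a[k], a[j] = a[j], a[k]
        k = j
    return k

def end_node_counts(n, k):
    """Over all permutations: run heapify(n//2), ..., heapify(k+1), then
    heapify(k), and count the end node of the path taken by a[k]."""
    counts = Counter()
    for p in permutations(range(1, n + 1)):
        a = [None] + list(p)            # a[0] is unused
        for i in range(n // 2, k, -1):  # heap condition below k holds afterwards
            heapify(a, i, n)
        counts[heapify(a, k, n)] += 1
    return counts

def mean_interchanges(n, k):
    """Average number of interchanges in heapify(k); a path from k to j
    has length floor(log2 j) - floor(log2 k)."""
    counts = end_node_counts(n, k)
    total = sum(counts.values())
    moved = sum(c * (j.bit_length() - k.bit_length()) for j, c in counts.items())
    return Fraction(moved, total)

print(end_node_counts(7, 1))
print(mean_interchanges(7, 1))
```

For N = 7 and k = 1 each of the seven nodes is reached equally often, and the mean number of interchanges is the average path length in the tree.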

Now we are in a position that allows us to compute the expected number of interchanges during the heap creation. Given $x \in G$, let $V(x)$ be this number. The integer valued map $V$ is related to $\{V_k;\ N \div 2 \ge k \ge 1\}$ in the following manner:

$$V = \sum_{k=1}^{N \div 2} V_k \circ Z_{k+1} \circ \cdots \circ Z_{N \div 2}.$$

This is so since $V$ is the sum over all interchanges which arise in all possible calls to heapify.

In order to calculate

$$\int_G V \, d\mu,$$

note that by construction of the distributions $\{m_k;\ N \div 2 \ge k \ge 1\}$ the integral

$$\int_G V_k \circ Z_{k+1} \circ \cdots \circ Z_{N \div 2} \, d\mu$$

coincides with

$$\int_{B(k+1)} V_k \, dm_{k+1}.$$

Hence the computation of the integral in question is done by summing up the expected values for $V_{N \div 2}, V_{N \div 2 - 1}, \ldots, V_1$. In order to simplify sums, we should have in mind the numbers given in Section 2 concerning the subtree sizes, depending on the position of the respective node in the tree. It also should be taken into consideration that if $v$ is a viable path for $k$ ending in the node $j$, then $v$ has the length $\lfloor \log_2 j \rfloor - \lfloor \log_2 k \rfloor$. The calculation can be as precise as we want, but here only the leading term of the expected value is given. Denote by $\alpha_i$ the infinite sum $\sum_{k \ge 1} (2^k - 1)^{-i}$ for $i = 1, 2$.

5.5 Theorem:

The expected number of interchanges during heap-creation by means of Floyd's method equals

$$(\alpha_1 + \alpha_2 - 2) N + O((\log N)^2).$$

■
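To see where the constant comes from, it may help to work through the perfectly balanced case $N = 2^n - 1$. This is only a sketch under that simplifying assumption, which is made here for illustration; the theorem itself covers arbitrary $N$. It uses Corollary 5.4 and the path length formula given above.

```latex
% Assumption for this sketch: N = 2^n - 1, so every subtree is perfect.
% A node whose subtree has m levels has \gamma = 2^m - 1; there are 2^{n-m} such nodes.
\begin{align*}
\sum\{|v|;\ v \text{ is a path}\} &= \sum_{d=0}^{m-1} d\,2^{d} = (m-2)\,2^{m} + 2, \\
\frac{(m-2)\,2^{m} + 2}{2^{m}-1} &= (m-2) + \frac{m}{2^{m}-1}, \\
\int_G V\,d\mu &= \sum_{m=2}^{n} 2^{\,n-m}\Bigl[(m-2) + \frac{m}{2^{m}-1}\Bigr]
  = (N+1)\sum_{m=2}^{n} 2^{-m}\Bigl[(m-2) + \frac{m}{2^{m}-1}\Bigr].
\end{align*}
% Extending the sums to infinity changes the result only by O(n) = O(\log N), and
\begin{align*}
\sum_{m\ge 2} (m-2)\,2^{-m} &= \tfrac12, \\
\sum_{m\ge 2} \frac{m\,2^{-m}}{2^{m}-1}
  &= \sum_{m\ge 2} m\Bigl[\frac{1}{2^{m}-1} - 2^{-m}\Bigr]
   = (\alpha_1 + \alpha_2 - 1) - \tfrac32 .
\end{align*}
% The last step uses \sum_{m\ge 1} m/(2^m-1) = \alpha_1 + \alpha_2, which follows
% from expanding (2^k-1)^{-2} = \sum_{j\ge 1} j\,2^{-k(j+1)} and summing over k.
% Adding up: 1/2 + (\alpha_1+\alpha_2-1) - 3/2 = \alpha_1 + \alpha_2 - 2.
```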


In case N is a power of 2, the expectation can be calculated in a convenient manner with greater precision:

5.6 Corollary:

Let $N = 2^n$, then the expectation in question equals

n2 20

(a.1+a.2-2)•N +2 - 3n +

3 - 2a.

1-a.2 ⁺o(l). D

Let us turn our attention to the number of comparisons which are necessary to construct a heap. If $x \in S_k(v)$ for a viable path $v$, then any node on $v$ is compared to its brother, and $x_k$ is compared to the node on $v$. Moreover, if the node at which $v$ ends has two offspring, then these offspring are compared, and $x_k$ is compared to the winner. Let $k$ be no special node or assume $N$ to be odd. Then any node in the subtree rooted at $k$ has either exactly two offspring or none, thus if $x \in S_k(v)$, then there are either $2 \cdot |v| + 2$ or $2 \cdot |v|$ comparisons necessary, depending on the number of sons which $H(k,v)$ has. Thus in this case we have for the number $C_k(x)$ of comparisons which arise in a call to heapify(k) the relation

$$C_k(x) = 2 \cdot V_k(x) + 2 - 2 \cdot \sum\{1_{S_k(v)}(x);\ H(k,v) \text{ has no sons}\}.$$

Consequently,

$$\int_{B(k+1)} C_k \, dm_{k+1} = 2 \int_{B(k+1)} V_k \, dm_{k+1} + 2 - 2 \cdot \frac{\beta(k)}{\gamma(k)},$$

where $\beta(k)$ denotes the number of leaves in the subtree rooted at $k$.

This follows from the fact that $m_{k+1}(S_k(v)) = \gamma(k)^{-1}$ holds for any path $v$ viable in $k$.
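The relation above can be summed exactly when $N$ is odd, since then no special node occurs. The following sketch uses hypothetical names (`expected_comparisons_odd`, `size`, `leaves`, `dsum`) and assumes the usual array layout with sons 2k and 2k+1. It computes subtree sizes, leaf counts and path length sums bottom-up and adds the per-node expectations with exact rational arithmetic.

```python
from fractions import Fraction

def expected_comparisons_odd(n):
    """Sum over k <= n // 2 of 2*E[V_k] + 2 - 2*beta(k)/gamma(k), for odd n."""
    if n % 2 == 0:
        raise ValueError("this sketch only covers odd n (no special nodes)")
    size = [0] * (2 * n + 2)     # gamma(k): nodes in the subtree rooted at k
    leaves = [0] * (2 * n + 2)   # beta(k): leaves in the subtree rooted at k
    dsum = [0] * (2 * n + 2)     # sum of path lengths from k to every node below
    total = Fraction(0)
    for k in range(n, 0, -1):
        l, r = 2 * k, 2 * k + 1
        size[k] = 1 + size[l] + size[r]
        leaves[k] = 1 if l > n else leaves[l] + leaves[r]
        dsum[k] = dsum[l] + size[l] + dsum[r] + size[r]
        if l <= n:               # heapify(k) is only called for inner nodes
            mean_v = Fraction(dsum[k], size[k])
            total += 2 * mean_v + 2 - 2 * Fraction(leaves[k], size[k])
    return total

for n in (7, 127, 1023):
    print(n, float(expected_comparisons_odd(n)) / n)
```

For N = 7 this gives 54/7; for larger odd N the ratio to N stays somewhat below the leading coefficient, in line with the negative lower-order terms.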

This observation extends to the case that $N$ is even and $k$ is a special node. In this case the corresponding quantity equals

$$C_k(x) = 2 \cdot V_k(x) + 2 - 2 \sum\{1_{S_k(v)}(x);\ H(k,v) \text{ has no sons}\} - 1_{S_k(v^*)}(x),$$

where $v^*$ is the unique path leading to $N/2$.

Computing only the leading term, the sums of the ratios for special nodes may be neglected (since they are of the order of $\log N$), and it is seen that

$$\sum\Bigl\{2 \cdot \frac{\beta(k)}{\gamma(k)};\ k \text{ is not special},\ k \le N \div 2\Bigr\} = (2 \cdot \alpha_1 - 2) \cdot N \div 2 + O(\log N)$$

holds. This is calculated using the quantities given in Section 2.

This yields

5.7 Corollary:

Heap creation using Floyd's method requires

$$(\alpha_1 + 2\alpha_2 - 2) \cdot N + O((\log N)^2)$$

comparisons on the average.

■

Note that in DO the leaves in the tree corresponding to {1, ... ,N} do not receive sufficient attention.

Having established these theoretical results, heap creation has been simulated in order to get an impression of the precision of the results. For any $N$, $8 \le N \le 800$, one hundred vectors of length $N$ with components in the interval [0,1] were created randomly, and Floyd's algorithm was performed on them. The numbers of interchanges and comparisons were counted and the corresponding average values computed.
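An experiment of this kind can be sketched in a few lines. This is not the program used for the report: the names `floyd_counts` and `simulate`, the seed, and the use of Python's `random` module are assumptions made here. The counting follows the conventions above: two comparisons at a node with two sons, one at a node with a single son.

```python
import random

def floyd_counts(values):
    """Build a max-heap bottom-up; return (interchanges, comparisons)."""
    n = len(values)
    a = [None] + list(values)          # 1-indexed array, a[0] is unused
    swaps = comps = 0
    for k in range(n // 2, 0, -1):     # heapify(k) for k = n // 2, ..., 1
        i = k
        while 2 * i <= n:
            j = 2 * i
            if j < n:                  # two sons: compare them with each other
                comps += 1
                if a[j + 1] > a[j]:
                    j += 1
            comps += 1                 # compare the sinking item with the winner
            if a[i] >= a[j]:
                break
            a[i], a[j] = a[j], a[i]
            swaps += 1
            i = j
    return swaps, comps

def simulate(n, trials=100, seed=0):
    """Average interchanges and comparisons per element over random vectors."""
    rng = random.Random(seed)
    swaps = comps = 0
    for _ in range(trials):
        s, c = floyd_counts([rng.random() for _ in range(n)])
        swaps += s
        comps += c
    return swaps / (trials * n), comps / (trials * n)

for n in (128, 512):
    print(n, simulate(n))
```

With only one hundred vectors per length, the averages fluctuate in the third decimal place from run to run, so individual values should not be over-interpreted.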

Some of these results are reported in the table below, where each expected value is given divided by the length N of the vector.

      N    interchanges    comparisons
     21    0.584500        1.494000
     34    0.616765        1.613529
     57    0.668947        1.712632
    124    0.698952        1.777016
    125    0.696000        1.777760
    126    0.698968        1.781190
    127    0.694173        1.774803
    128    0.693437        1.778125
    129    0.699070        1.792248
    130    0.705000        1.791000
    131    0.705267        1.798015
    508    0.730079        1.849232
    509    0.730020        1.852063
    510    0.727863        1.847373
    511    0.727025        1.848532
    512    0.729883        1.849395
    513    0.729084        1.847329
    514    0.731265        1.851148
    515    0.729650        1.853359

Fig.4: Some experimental results

Note that

$$\alpha_1 + \alpha_2 - 2 = 0.7430567, \qquad \alpha_1 + 2 \cdot \alpha_2 - 2 = 1.8803951.$$

The experimental results show that the leading term of the expansion is not too unrealistic, but convergence is slow (due to the $O((\log N)^2)$ term). As mentioned, the expectations may be computed with more precision (involving the binary representation of $N$ heavily), but then the results look rather tangled.
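The slow convergence can also be seen without simulation by evaluating the sum of average path lengths from Corollary 5.4 exactly. The sketch below, with hypothetical names `alpha` and `expected_interchanges`, does this and also evaluates the two series numerically. With many terms it yields $\alpha_1 \approx 1.606695$ and $\alpha_2 \approx 1.137339$, so the two coefficients come out near 0.744034 and 1.881373, which agrees with the constants quoted above to about three decimal places.

```python
from fractions import Fraction

def alpha(i, terms=200):
    """Partial sum of sum_{k>=1} (2^k - 1)^(-i); 200 terms exceed float precision."""
    return sum((2.0 ** k - 1.0) ** (-i) for k in range(1, terms + 1))

def expected_interchanges(n):
    """Exact sum over k <= n // 2 of the average path length below node k."""
    size = [0] * (2 * n + 2)   # size[k]: nodes in the subtree rooted at k
    dsum = [0] * (2 * n + 2)   # dsum[k]: sum of path lengths from k to each node
    total = Fraction(0)
    for k in range(n, 0, -1):
        l, r = 2 * k, 2 * k + 1
        size[k] = 1 + size[l] + size[r]
        dsum[k] = dsum[l] + size[l] + dsum[r] + size[r]
        total += Fraction(dsum[k], size[k])   # leaves contribute 0
    return total

c = alpha(1) + alpha(2) - 2
for n in (128, 512, 4096):
    print(n, float(expected_interchanges(n)) / n, c)
```

The exact ratios approach the leading coefficient from below, and the gap per element shrinks only like (log N)/N, which is consistent with the experimental values staying below the limit for the lengths shown in Fig.4.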


REFERENCES

AHU Aho, A.V., Hopcroft, J.E., Ullman, J.D.: The Design and Analysis of Computer Algorithms. Addison-Wesley, Reading, Mass., 1974

DO Doberkat, E.-E.: Some Observations on the Average Performance of Heapsort - Preliminary Report. 21st Annual Symposium on Foundations of Computer Science, Syracuse, N.Y., 1980, 229-237

FFV Flajolet, P., Francon, J., Vuillemin, J.: Sequence of Operations Analysis of Dynamic Data Structures. 20th Annual Symposium on Foundations of Computer Science, Puerto Rico, 1979, 183-195

FL Floyd, R.: Algorithm 245: Treesort 3. Comm. ACM 7 (1964), 701

GOR Gonnet, G.H., Rogers, L.D.: An Algorithmic and Complexity Analysis of the Heap as a Data Structure. Research Report CS-75-20, Department of Computer Science, University of Waterloo, Waterloo, 1975

KN Knuth, D.E.: The Art of Computer Programming, Vol.3 - Sorting and Searching. Addison-Wesley, Reading, Mass., 1973

POS Porter, Th., Simon, I.: Random Insertion into a Priority Queue Structure. IEEE Trans. Softw. Eng., vol. SE-1 (1975), 292-298

RU Rudin, W.: Real and Complex Analysis. Tata McGraw-Hill, New Delhi, Second Edition, 1974

WI Williams, J.W.J.: Algorithm 232, Heapsort. Comm. ACM 7 (1964), 347-348
