University of Freiburg Dept. of Computer Science Prof. Dr. F. Kuhn
Algorithms and Datastructures Summer Term 2021
Exercise Sheet 3
Exercise 1: Bucket Sort
Bucketsort is an algorithm to stably sort an array A[0..n−1] of n elements where the sorting keys of the elements take values in {0, ..., k}. That is, we have a function key assigning a key key(x) ∈ {0, ..., k} to each x ∈ A.
The algorithm works as follows. First we construct an array B[0..k] consisting of (initially empty) FIFO queues. That is, for each i ∈ {0, ..., k}, B[i] is a FIFO queue. Then we iterate through A and for each j ∈ {0, ..., n−1} we attach A[j] to the queue B[key(A[j])] using the function enqueue. Finally we empty all queues B[0], ..., B[k] using dequeue and write the returned values back to A, one after the other. After that, A is sorted with respect to key, and elements x, y ∈ A with key(x) = key(y) are in the same order as before.
Implement Bucketsort based on this description. You can use the template BucketSort.py, which uses an implementation of FIFO queues that is available in Queue.py and ListElement.py.¹ An example of how to use these queues is the following:
    from Queue import Queue
    from ListElement import ListElement

    q = Queue()
    q.enqueue(ListElement(5))
    q.enqueue(ListElement(17))
    q.enqueue(ListElement(8))
    while not q.isempty():
        print(q.dequeue().getkey())

This would print the numbers 5, 17, 8 on three separate lines.
Solution:
    from Queue import Queue
    from ListElement import ListElement


    def bucketsort(array, k, key=lambda x: x):
        '''
        Implements the bucketsort algorithm to sort data elements using a key function to
        assign sorting keys to data elements
        Args:
            array: array of data elements
            k: largest key
            key: a function mapping data elements to values in range(k+1)
                 (identity function as default)

        >>> bucketsort([210, 121, 203, 420, 307], 2, lambda x: int(x / 10) % 10)
        [203, 307, 210, 121, 420]
        >>> bucketsort([], 10)
        []
        >>> bucketsort([10 - i for i in range(10)], 10)
        [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
        '''
        # One (initially empty) FIFO queue per possible key 0, ..., k
        bucket = [Queue() for i in range(k + 1)]
        # Distribute the elements into the queues in their input order
        for i in range(len(array)):
            bucket[key(array[i])].enqueue(ListElement(array[i]))
        # Empty the queues in increasing key order and write the values back
        i = 0
        for j in range(k + 1):
            while not bucket[j].isempty():
                array[i] = bucket[j].dequeue().getkey()
                i += 1
        return array

¹ Remember to write unit tests and to add comments to your source code.
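For readers without the Queue.py template at hand, the same procedure can be sketched with collections.deque from the standard library standing in for the FIFO queues. The function name bucketsort_deque and the sample records are illustrative assumptions and not part of the template. The records share keys so that the stability of the sort is visible.

```python
from collections import deque


def bucketsort_deque(array, k, key=lambda x: x):
    """Stable bucket sort of `array` in place; keys must lie in range(k + 1).

    Sketch only: collections.deque stands in for the template's Queue class.
    """
    # One FIFO queue per possible key value 0..k.
    buckets = [deque() for _ in range(k + 1)]
    # Distribute: appending preserves the input order inside each bucket.
    for x in array:
        buckets[key(x)].append(x)
    # Collect: empty the buckets in increasing key order, oldest element first.
    i = 0
    for bucket in buckets:
        while bucket:
            array[i] = bucket.popleft()
            i += 1
    return array


# Hypothetical records (key, label); equal keys keep their input order.
records = [(2, "a"), (0, "b"), (2, "c"), (1, "d"), (0, "e")]
print(bucketsort_deque(records, 2, key=lambda r: r[0]))
# prints [(0, 'b'), (0, 'e'), (1, 'd'), (2, 'a'), (2, 'c')]
```

Because each bucket is emptied from the front, two elements with the same key leave the bucket in the order in which they entered it, which is exactly the stability property required above.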
Exercise 2: Radix Sort
Assume we want to sort an array A[0..n−1] of size n containing integer values from {0, ..., k} for some k ∈ N. We describe the algorithm Radixsort, which uses Bucketsort as a subroutine.
Let $m = \lfloor \log_b k \rfloor$. We assume each key x ∈ A is given in base-b representation, i.e., $x = \sum_{i=0}^{m} c_i \cdot b^i$ for some $c_i \in \{0, \dots, b-1\}$. First we sort the keys according to $c_0$ using Bucketsort, afterwards we sort according to $c_1$, and so on.²
(a) Implement Radixsort based on this description. You may assume b = 10, i.e., your algorithm should work for arrays containing numbers in base-10 representation. Use Bucketsort as a subroutine.
(b) Compare the runtimes of Bucketsort and Radixsort. For both algorithms and each $k \in \{i \cdot 10^4 \mid i = 1, \dots, 50\}$, use an array of size $10^4$ with randomly chosen keys from {0, ..., k} as input and plot the runtimes. Briefly discuss your results.
(c) Explain the asymptotic runtime of your implementations of Bucketsort and Radixsort depending on n and k.
Solution:
(a)
    import math
    import BucketSort


    def radixsort(array, k):
        '''
        Implements the radixsort algorithm to sort data elements with keys in range(k+1)
        Args:
            array: array of data elements
            k: largest key

        >>> radixsort([123, 1111, 789, 456, 0, 12, 13, 247], 2000)
        [0, 12, 13, 123, 247, 456, 789, 1111]
        >>> radixsort([1000 - i for i in range(0, 1000)], 1000) == [i for i in range(1, 1001)]
        True
        '''
        # Index of the highest digit position. Rounding up instead of down costs at
        # most one extra pass but is robust against rounding errors in math.log.
        m = math.ceil(math.log(k, 10))
        for i in range(m + 1):
            # i-th base-10 digit of x (see footnote 2)
            key = lambda x: (x % 10 ** (i + 1)) // 10 ** i
            BucketSort.bucketsort(array, 10, key)
        return array

² The i-th digit $c_i$ of a number x ∈ N in base-b representation (i.e., $x = c_0 \cdot b^0 + c_1 \cdot b^1 + c_2 \cdot b^2 + \dots$) can be obtained via the formula $c_i = (x \bmod b^{i+1}) \operatorname{div} b^i$, where mod is the modulo operation and div is integer division.
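To see why sorting by the least significant digit first yields a sorted array, the following self-contained sketch records the array after every pass. It is not the template-based solution: plain Python lists serve as buckets, the number of passes is determined with integer arithmetic, and the name radixsort_trace is an illustrative assumption.

```python
def radixsort_trace(array, k, b=10):
    """LSD radix sort of non-negative integers <= k.

    Returns the list of intermediate arrays, one per pass. Sketch only:
    the digit extraction follows c_i = (x mod b^(i+1)) div b^i.
    """
    array = list(array)      # work on a copy, leave the input untouched
    passes = []
    power = 1                # b^i for the current digit position i
    while True:
        buckets = [[] for _ in range(b)]
        for x in array:
            digit = (x % (power * b)) // power
            buckets[digit].append(x)   # appending keeps each pass stable
        array = [x for bucket in buckets for x in bucket]
        passes.append(array)
        power *= b
        if power > k:        # every digit position of k has been processed
            break
    return passes


for step in radixsort_trace([123, 1111, 789, 456, 0, 12, 13, 247], 2000):
    print(step)
# prints
# [0, 1111, 12, 123, 13, 456, 247, 789]
# [0, 1111, 12, 13, 123, 247, 456, 789]
# [0, 12, 13, 1111, 123, 247, 456, 789]
# [0, 12, 13, 123, 247, 456, 789, 1111]
```

After pass i the array is sorted with respect to the lowest i + 1 digits. This only works because every pass is stable: ties in the current digit keep the order established by the earlier passes.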
(b) See Figure 1. We see that Bucketsort is linear in k. For Radixsort the situation is not that clear. At first sight, the runtime could be constant, but upon closer examination (see Figure 2) we see a step at $k = 10^5$. The reason is that Radixsort calls Bucketsort once for each digit of the keys, and the number of these digits (and therefore the number of calls of Bucketsort) increases from 5 to 6 at $k = 10^5$.
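The position of the step can be checked without any timing. A Radixsort that performs exactly one pass per digit needs as many Bucketsort calls as the largest key has base-10 digits. The helper below counts digits with integer arithmetic; the name num_digits and the chosen sample values are assumptions made for illustration.

```python
def num_digits(k):
    """Number of base-10 digits of an integer k >= 0 (integer arithmetic only)."""
    digits = 1
    while k >= 10:
        k //= 10
        digits += 1
    return digits


# Largest keys around the step discussed above.
for k in (10**4, 99_999, 10**5, 5 * 10**5, 10**6):
    print(k, num_digits(k))
# prints
# 10000 5
# 99999 5
# 100000 6
# 500000 6
# 1000000 7
```

The count is constant between consecutive powers of 10, which matches a runtime that looks flat except for jumps at $k = 10^5$ and $k = 10^6$.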
(c) Bucketsort goes through A twice, once to write all values from A into the buckets and a second time to write the values back to A. This takes time O(n), as writing a value into a bucket and from a bucket back to A costs O(1). Additionally, Bucketsort needs to allocate k + 1 empty queues and store them in an array of size k + 1, which takes time O(k). Hence, the runtime is O(n + k).
Radixsort calls Bucketsort for each digit. The keys have at most m + 1 = O(log k) digits, so we call Bucketsort O(log k) times. One run of Bucketsort takes O(n) here, as the keys according to which Bucketsort sorts the elements are from the range {0, ..., 9}. The overall runtime is therefore O(n log k).
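As a rough illustration of these bounds, the following worked example plugs in the largest instance from part (b), $n = 10^4$ and $k = 5 \cdot 10^5$, with $b = 10$ and $m = \lfloor \log_{10} k \rfloor = 5$. The counts ignore all constant factors, so they indicate growth rather than predict measured times.

```latex
\begin{align*}
T_{\text{Bucketsort}}(n, k) &= O(n + k),
  & n + k &= 10^4 + 5 \cdot 10^5 = 5.1 \cdot 10^5, \\
T_{\text{Radixsort}}(n, k) &= O\bigl((m + 1)(n + b)\bigr) = O(n \log k),
  & (m + 1)(n + b) &= 6 \cdot (10^4 + 10) = 60\,060.
\end{align*}
```

For fixed n, the first expression grows linearly in k, while the second changes only when k gains a digit.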
[Plot: elapsed time in ms (vertical axis, 0 to 700) against the largest key (horizontal axis, 0 to 500000) for radixsort and bucketsort.]
Figure 1: Plot for exercise 2 b).
[Plot: sorting time in ms (vertical axis, 0 to 250) against the largest key (horizontal axis, 0 to 120000) for radixsort and bucketsort.]
Figure 2: Considering a larger range of keys to visualize the second step at $10^6$.