
Algorithms and Data Structures I.

Lecture Notes

Tibor Ásványi
Department of Computer Science

Eötvös Loránd University, Budapest

[email protected]

July 23, 2022


Contents

1 Notations and Basic Notions 4


1.1 Time complexities of algorithms . . . . . . . . . . . . . . . . . 5
1.2 The big-O notation . . . . . . . . . . . . . . . . . . . . . . . . 6
1.3 Note on time complexity measures . . . . . . . . . . . . . . . . 12
1.4 Space complexities of algorithms . . . . . . . . . . . . . . . . . 13

2 Elementary Data Structures and Data Types 14


2.1 Arrays, memory allocation and deallocation . . . . . . . . . . 14
2.2 Stacks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
2.3 Queues . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17

3 Algorithms in computing: Insertion Sort 20


4 Fast sorting algorithms based on the divide and conquer ap-
proach 23
4.1 Merge sort . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
4.1.1 The time complexity of merge sort . . . . . . . . . . . 25
4.1.2 The space complexity of merge sort . . . . . . . . . . . 27
4.2 Quicksort . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27

5 Linked Lists 33
5.1 One-way or singly linked lists . . . . . . . . . . . . . . . . . . 33
5.1.1 Simple one-way lists (S1L) . . . . . . . . . . . . . . . . 33
5.1.2 One-way lists with header node (H1L) . . . . . . . . . 34
5.1.3 One-way lists with trailer node . . . . . . . . . . . . . 34
5.1.4 Handling one-way lists . . . . . . . . . . . . . . . . . . 35
5.1.5 Insertion sort of H1Ls . . . . . . . . . . . . . . . . . . 36
5.1.6 Merge sort of S1Ls . . . . . . . . . . . . . . . . . . . . 36
5.1.7 Cyclic one-way lists . . . . . . . . . . . . . . . . . . . . 37
5.2 Two-way or doubly linked lists . . . . . . . . . . . . . . . . . . 37
5.2.1 Simple two-way lists (S2L) . . . . . . . . . . . . . . . . 38
5.2.2 Cyclic two-way lists (C2L) . . . . . . . . . . . . . . . . 38
5.2.3 Example programs on C2Ls . . . . . . . . . . . . . . . 41

6 Trees, binary trees 45


6.1 General notions . . . . . . . . . . . . . . . . . . . . . . . . . . 45
6.2 Binary trees . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47
6.3 Linked representations of binary trees . . . . . . . . . . . . . . 48
6.4 Binary tree traversals . . . . . . . . . . . . . . . . . . . . . . . 50
6.4.1 An application of traversals: the height of a binary tree 51

6.4.2 Using parent pointers . . . . . . . . . . . . . . . . . . . 52
6.5 Parenthesized, i.e. textual form of binary trees . . . . . . . . . 52
6.6 Binary search trees . . . . . . . . . . . . . . . . . . . . . . . . 53
6.7 Level-continuous binary trees, and heaps . . . . . . . . . . . . 56
6.8 Arithmetic representation of level-continuous binary trees . . . 56
6.9 Heaps and priority queues . . . . . . . . . . . . . . . . . . . . 57
6.10 Heap sort . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62

7 Lower bounds for sorting 65


7.1 Comparison sorts and the decision tree model . . . . . . . . . 65
7.2 A lower bound for the worst case . . . . . . . . . . . . . . . . 66

8 Sorting in Linear Time 68


8.1 Radix sort . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68
8.2 Distributing sort . . . . . . . . . . . . . . . . . . . . . . . . . 68
8.3 Radix-Sort on lists . . . . . . . . . . . . . . . . . . . . . . . . 69
8.4 Counting sort . . . . . . . . . . . . . . . . . . . . . . . . . . . 71
8.5 Radix-Sort on arrays ([1] 8.3) . . . . . . . . . . . . . . . . . . 74
8.6 Bucket sort . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75

9 Hash Tables 76
9.1 Direct-address tables . . . . . . . . . . . . . . . . . . . . . . . 76
9.2 Hash tables . . . . . . . . . . . . . . . . . . . . . . . . . . . . 77
9.3 Collision resolution by chaining . . . . . . . . . . . . . . . . . 77
9.4 Good hash functions . . . . . . . . . . . . . . . . . . . . . . . 79
9.5 Open addressing . . . . . . . . . . . . . . . . . . . . . . . . . 80
9.5.1 Open addressing: insertion and search, without deletion 80
9.5.2 Open addressing: insertion, search, and deletion . . . . 81
9.5.3 Linear probing . . . . . . . . . . . . . . . . . . . . . . 82
9.5.4 Quadratic probing . . . . . . . . . . . . . . . . . . . . 83
9.5.5 Double hashing . . . . . . . . . . . . . . . . . . . . . . 84

References

[1] Cormen, T.H., Leiserson, C.E., Rivest, R.L., Stein, C.,
    Introduction to Algorithms (Third Edition), The MIT Press, 2009.

[2] Cormen, Thomas H., Algorithms Unlocked, The MIT Press, 2013.

[3] Narasimha Karumanchi,
    Data Structures and Algorithms Made Easy, CareerMonk Publications, 2016.

[4] Neapolitan, Richard E., Foundations of Algorithms (Fifth Edition),
    Jones & Bartlett Learning, 2015. ISBN 978-1-284-04919-0 (pbk.)

[5] Weiss, Mark Allen, Data Structures and Algorithm Analysis in C++ (Fourth Edition),
    Pearson, 2014.
    http://aszt.inf.elte.hu/~asvanyi/ds/DataStructuresAndAlgorithmAnalysisInCpp_2

[6] Wirth, N., Algorithms and Data Structures,
    Prentice-Hall Inc., 1976, 1985, 2004.
    http://aszt.inf.elte.hu/~asvanyi/ds/AD.pdf

[7] Burch, Carl, B+ trees,
    http://aszt.inf.elte.hu/~asvanyi/ds/B+trees.zip

1 Notations and Basic Notions

N = {0; 1; 2; 3; . . . } = natural numbers
Z = {. . . ; −3; −2; −1; 0; 1; 2; 3; . . . } = integer numbers
R = real numbers
P = positive real numbers
P0 = nonnegative real numbers
log n = log2 n if n > 0, and log n = 0 if n = 0
half(n) = ⌊n/2⌋, where n ∈ N
Half(n) = ⌈n/2⌉, where n ∈ N

We use the structogram notation in our pseudocodes. In the structograms we use a UML-like notation with some C++/Java/Pascal flavour. Main points:

1. A program is made up of declarations. Each of them is represented by a structogram, but we do not care about their order.

2. We can declare subroutines, classes, variables and constants.

3. The subroutines (i.e. subprograms) are the functions, the procedures


(i.e. void functions), and the methods of the classes.

4. The while loop is the default loop; for loops are also used.

5. The operators ≠, ≥, ≤ are written with the usual mathematical notation. The assignment statements and the is-equal-to comparisons are written in Pascal style. For example, x := y assigns the value of y to x, and x = y checks their equality.

6. Only scalar parameters can be passed by value (and it is their default).


Non-scalar parameters are always passed by reference. For example,
an object cannot be passed by value, just by reference.

7. An assignment statement can copy just a scalar value or a subarray of


scalar values like A[u..v] := B[p..q] where v − u = q − p. A[u..v] := x
means that each element of A[u..v] is initialized with x.

8. In the definitions of classes we use a simple UML notation. The name of the class comes in the first part, the data members in the second part, and the methods, if any, in the third part. A private declaration/specification is prefixed by a − sign. A public declaration/specification is prefixed by a + sign. We use neither templates nor inheritance (nor protected members/methods).

9. We do not use libraries.

10. We do not use exception handling.

1.1 Time complexities of algorithms


Time complexity (operational complexity) of an algorithm reflects some kind of abstract time. We count the subroutine calls plus the number of loop iterations during the run of the program. (This measure is approximately proportional to the real runtime of the program, and we can omit constant factors here, because they are useful only if we know the programming environment [for example, (the speed of) the hardware, the operating system, the compiler, etc.].)

We calculate the time complexity of an algorithm typically as a function of the size of the input data structure(s) (for example, the length of the input array). We distinguish MT(n) (Maximum Time), AT(n) (Average or expected Time), and mT(n) (minimum Time). Clearly MT(n) ≥ AT(n) ≥ mT(n). If MT(n) = mT(n), then we can speak of a general time complexity T(n), where T(n) = MT(n) = AT(n) = mT(n).
Typically, the time complexities of the algorithms are not calculated exactly. We only calculate their asymptotic order or make asymptotic estimation(s) using the big-O notation.
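As a tiny illustration of this counting (our own C++ sketch, not part of the original notes), summing the n elements of an array costs 1 subroutine call plus n loop iterations, so T(n) = 1 + n ∈ Θ(n), independently of the contents of the array:

    // Sum the n elements of A.
    // Counted steps: 1 subroutine call + n loop iterations, so T(n) = 1 + n.
    int sumArray(const int A[], int n) {
        int s = 0;
        for (int i = 0; i < n; ++i)   // iterates n times
            s += A[i];
        return s;
    }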

1.2 The big-O notation


Definition 1.1 Given g : N → R; g is asymptotically positive (AP), iff there exists an N ∈ N so that g(n) > 0 for each n ≥ N.

In this chapter we suppose that f, g, h denote asymptotically positive functions of type N → R in each case, because they denote time (or space) complexity functions, and these satisfy this property. Usually it is not easy to give such a function exactly. We just make estimations.

• When we make an asymptotic upper estimate g of f, then we say that f is big-O of g, and write f ∈ O(g). Informally, f ∈ O(g) means that function f is at most proportional to g. (f(n) ≤ d ∗ g(n) for some d > 0, if n is large enough.)

• When we make an asymptotic lower estimate g of f, then we say that f is Omega of g, and write f ∈ Ω(g). Informally, f ∈ Ω(g) means that function f is at least proportional to g. (f(n) ≥ c ∗ g(n) for some c > 0, if n is large enough.)

• When we make an asymptotic upper and lower estimate g of f, then we say that f is Theta of g, and write f ∈ Θ(g), which means that f ∈ O(g) ∧ f ∈ Ω(g). In this case, we also say that the asymptotic order of f is Θ(g).

Definition 1.2
f ≺ g ⇐⇒ lim_{n→∞} f(n)/g(n) = 0

f ≺ g is read as "f is asymptotically less than g". It can also be written as f ∈ o(g), which means that

o(g) = {h | h ≺ g}

Definition 1.3
g ≻ f ⇐⇒ f ≺ g

g ≻ f is read as "g is asymptotically greater than f". It can also be written as f ∈ ω(g), which means that

ω(g) = {h | h ≻ g}

Definition 1.4 O(g) = {f | there exist positive constants d and n0 such that f(n) ≤ d ∗ g(n) for all n ≥ n0}

(f ∈ O(g) can be read as "f is at most proportional to g".)

Definition 1.5 Ω(g) = {f | there exist positive constants c and n0 such that f(n) ≥ c ∗ g(n) for all n ≥ n0}

(f ∈ Ω(g) can be read as "f is at least proportional to g".)

Definition 1.6
Θ(g) = O(g) ∩ Ω(g)

(f ∈ Θ(g) can be read as "f is roughly proportional to g".)

Consequence 1.7
f ∈ O(g) ⇐⇒ ∃ d, n0 > 0 and ψ : N → R so that lim_{n→∞} ψ(n)/g(n) = 0, and

f(n) ≤ d ∗ g(n) + ψ(n)

for each n ≥ n0.

Consequence 1.8
f ∈ Ω(g) ⇐⇒ ∃ c, n0 > 0 and ϕ : N → R so that lim_{n→∞} ϕ(n)/g(n) = 0, and

c ∗ g(n) + ϕ(n) ≤ f(n)

for each n ≥ n0.

Consequence 1.9 f ∈ Θ(g) ⇐⇒ f ∈ O(g) ∧ f ∈ Ω(g) .

Note 1.10

• If f ∈ O(g), we can say that g is an asymptotic upper bound of f.

• If f ∈ Ω(g), we can say that g is an asymptotic lower bound of f.

• If f ∈ Θ(g), we can say that f and g are asymptotically equivalent.

• We never use the popular notations f = O(g), f = Ω(g) or f = Θ(g). We think one should not use them, for at least two reasons.

  1. O(g), Ω(g) and Θ(g) are sets of asymptotically positive (AP) functions, while f is just a single AP function. Thus the equalities above are simply false claims.

  2. Let us suppose one is aware that, for example, f = O(g) here means f ∈ O(g). Still, f = O(g) and h = O(g) does not imply f = h. Similarly, according to this popular notation, 1 = Θ(1) ∧ 2 = Θ(1) ∧ 1 ≠ 2, etc.

Theorem 1.11

lim_{n→∞} f(n)/g(n) = 0 =⇒ f ≺ g =⇒ f ∈ O(g)

lim_{n→∞} f(n)/g(n) = c ∈ P =⇒ f ∈ Θ(g)

lim_{n→∞} f(n)/g(n) = ∞ =⇒ f ≻ g =⇒ f ∈ Ω(g)

Proof. The first and the last statements follow directly from the definitions of the ≺ and ≻ relations. In order to prove the middle one, consider that lim_{n→∞} f(n)/g(n) = c. Consequently, if n is sufficiently large, |f(n)/g(n) − c| < c/2. Thus

c/2 < f(n)/g(n) < 2c

Because g is AP, g(n) > 0 for sufficiently large n values, and we can multiply both sides of this inequality by it. Therefore

(c/2) ∗ g(n) < f(n) < 2c ∗ g(n)

As a result, f ∈ Θ(g).

Consequence 1.12
k ∈ N ∧ a_0, a_1, . . . , a_k ∈ R ∧ a_k > 0 =⇒ a_k n^k + a_{k−1} n^{k−1} + · · · + a_1 n + a_0 ∈ Θ(n^k)

Proof.
lim_{n→∞} (a_k n^k + a_{k−1} n^{k−1} + · · · + a_1 n + a_0) / n^k =
lim_{n→∞} (a_k n^k / n^k + a_{k−1} n^{k−1} / n^k + · · · + a_1 n / n^k + a_0 / n^k) =
lim_{n→∞} (a_k + a_{k−1}/n + · · · + a_1/n^{k−1} + a_0/n^k) =
lim_{n→∞} a_k + lim_{n→∞} a_{k−1}/n + · · · + lim_{n→∞} a_1/n^{k−1} + lim_{n→∞} a_0/n^k =
a_k + 0 + · · · + 0 + 0 = a_k ∈ P =⇒
a_k n^k + a_{k−1} n^{k−1} + · · · + a_1 n + a_0 ∈ Θ(n^k)
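To connect Consequence 1.12 with Definitions 1.4 and 1.5, here is a concrete worked instance (ours, added for illustration): for f(n) = 3n^2 + 10n + 5 we have f(n) ≤ 3n^2 + 10n^2 + 5n^2 = 18n^2 for all n ≥ 1, so d = 18 and n_0 = 1 witness f ∈ O(n^2); and f(n) ≥ 3n^2 for all n ≥ 1, so c = 3 and n_0 = 1 witness f ∈ Ω(n^2). Hence f ∈ Θ(n^2), as Consequence 1.12 predicts with k = 2.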

Property 1.13 (Relations of the different classes of functions)

Θ(g) = O(g) ∩ Ω(g)

o(g) ⊊ O(g) \ Ω(g)

ω(g) ⊊ Ω(g) \ O(g)

Example of the asymptotic order of AP functions:

log n ≺ n ≺ n log n ≺ n^2 ≺ n^2 log n ≺ n^3

Property 1.14 (Transitivity)

f ∈ O(g) ∧ g ∈ O(h) =⇒ f ∈ O(h)
f ∈ Ω(g) ∧ g ∈ Ω(h) =⇒ f ∈ Ω(h)
f ∈ Θ(g) ∧ g ∈ Θ(h) =⇒ f ∈ Θ(h)
f ≺ g ∧ g ≺ h =⇒ f ≺ h
f ≻ g ∧ g ≻ h =⇒ f ≻ h

Property 1.15 (Symmetry)

f ∈ Θ(g) ⇐⇒ g ∈ Θ(f)

Property 1.16 (Exchanged symmetry)

f ∈ O(g) ⇐⇒ g ∈ Ω(f)
f ≺ g ⇐⇒ g ≻ f

Property 1.17 (Asymmetry)

f ≺ g =⇒ ¬(g ≺ f)
f ≻ g =⇒ ¬(g ≻ f)

Property 1.18 (Reflexivity)

f ∈ O(f) ∧ f ∈ Ω(f) ∧ f ∈ Θ(f)

Consequence 1.19 (=⇒: 1.15, 1.14.3; ⇐=: 1.18.3)

f ∈ Θ(g) ⇐⇒ Θ(f) = Θ(g)

Property 1.20 (Relations ≺ and ≻ are irreflexive.)

¬(f ≺ f)
¬(f ≻ f)

Consequence 1.21 Since the binary relation · ∈ Θ(·) is reflexive, symmetric and transitive, it gives a classification of the set of asymptotically positive functions, where f and g belong to the same equivalence class iff f ∈ Θ(g). In this case, we can say that function f is asymptotically equivalent to function g.

We are going to see that such equivalence classes can be identified, and they will be fundamental from the point of view of calculating the efficiency of the algorithms. For example, we have already seen that any k-degree polynomial with positive major coefficient is asymptotically equivalent to the function n^k. (See Consequence 1.12.) The asymptotic order of such equivalence classes can be established, and it is based on the following property.

Property 1.22
f1, g1 ∈ Θ(h1) ∧ f2, g2 ∈ Θ(h2) ∧ f1 ≺ f2 =⇒ g1 ≺ g2

Definition 1.23
Θ(f) ≺ Θ(g) ⇐⇒ f ≺ g

Lemma 1.24 Sometimes the following so-called L'Hospital rule can be applied for computing the limit lim_{n→∞} f(n)/g(n) in Theorem 1.11.

If the real extensions of the functions f and g are differentiable for sufficiently large substitution values, then

lim_{n→∞} f(n) = ∞ ∧ lim_{n→∞} g(n) = ∞ ∧ ∃ lim_{n→∞} f′(n)/g′(n) =⇒

lim_{n→∞} f(n)/g(n) = lim_{n→∞} f′(n)/g′(n)

Property 1.25 (Based on Theorem 1.11 and on Lemma 1.24.)

c, d ∈ R ∧ c < d =⇒ n^c ≺ n^d
c, d ∈ P0 ∧ c < d =⇒ c^n ≺ d^n
c, d ∈ R ∧ d > 1 =⇒ n^c ≺ d^n
d ∈ P0 =⇒ d^n ≺ n! ≺ n^n
c, d ∈ P ∧ c, d > 1 =⇒ log_c n ∈ Θ(log_d n)
ε ∈ P =⇒ log n ≺ n^ε
c ∈ R ∧ ε ∈ P =⇒ n^c log n ≺ n^(c+ε)

Proof. In order to prove that ε ∈ P =⇒ log n ≺ n^ε, we need the L'Hospital rule, i.e. Lemma 1.24.

lim_{n→∞} (log n)/n^ε = log e ∗ lim_{n→∞} (ln n)/n^ε = log e ∗ lim_{n→∞} (ln n)′/(n^ε)′ =
log e ∗ lim_{n→∞} (1/n)/(ε ∗ n^(ε−1)) = (log e)/ε ∗ lim_{n→∞} 1/n^ε = (log e)/ε ∗ 0 = 0

Consequence 1.26
Θ(log n) ≺ Θ(n) ≺ Θ(n ∗ log n) ≺ Θ(n^2) ≺ Θ(n^2 ∗ log n) ≺ Θ(n^3)

Property 1.27
(Function classes O(·), Ω(·), Θ(·), o(·), ω(·) are closed under the following operations.)

f ∈ O(g) ∧ c ∈ P =⇒ c ∗ f ∈ O(g)
f ∈ O(h1) ∧ g ∈ O(h2) =⇒ f + g ∈ O(h1 + h2)
f ∈ O(h1) ∧ g ∈ O(h2) =⇒ f ∗ g ∈ O(h1 ∗ h2)
f ∈ O(g) ∧ ϕ : N → R ∧ lim_{n→∞} ϕ(n)/f(n) = 0 =⇒ f + ϕ ∈ O(g)

(Similarly for function classes Ω(·), Θ(·), o(·), ω(·).)

Property 1.28
max(f, g) ∈ Θ(f + g), where max(f, g)(n) = max(f(n), g(n)) (n ∈ N)

Proof. Because f and g are AP, f(n) > 0 and g(n) > 0 for sufficiently large n ∈ N values, and

max(f(n), g(n)) = (f(n) + g(n))/2 + |f(n) − g(n)|/2 ∧

(f(n) + g(n))/2 + |f(n) − g(n)|/2 ≤ (f(n) + g(n))/2 + (f(n) + g(n))/2 = f(n) + g(n)

∧ (f(n) + g(n))/2 + |f(n) − g(n)|/2 ≥ (1/2) ∗ (f(n) + g(n))

=⇒ (1/2) ∗ (f(n) + g(n)) ≤ max(f(n), g(n)) ≤ 1 ∗ (f(n) + g(n))

=⇒ max(f, g) ∈ Θ(f + g)

1.3 Note on time complexity measures

Actually, we measure the efficiency of each algorithm by counting the subroutine calls plus the iterations. (The nonrecursive calls may be omitted.) In the case of comparison sorts (like insertion sort, merge sort, heap sort, quicksort, etc.) counting the key comparisons also works, because the two measures have the same asymptotic order. But counting the subroutine calls plus iterations is more general, because it can be applied to each algorithm; otherwise we would have to find a good measure of running time efficiency separately for each algorithm that is not a comparison sort. Therefore we use this universal measure of time complexity: we count the subroutine calls plus iterations for all the algorithms.
1.4 Space complexities of algorithms
Space complexity of an algorithm is an abstract measure of space (i.e. mem-
ory) requirements of that algorithm.
It depends on the depth of the subroutine calls and on the sizes of the
data structures (like arrays, linked lists, trees, etc.) generated or needed by
that algorithm. The size of the input is typically excluded from the space
requirements.

We calculate the space complexity of an algorithm typically as a function of the size of the input data structure(s) (for example, the length of the input array).
Even if the size of the input is the same, different runs of an algorithm may need different amounts of space. Thus we distinguish MS(n) (Maximum Space complexity), AS(n) (Average or expected Space complexity), and mS(n) (minimum Space complexity). Clearly MS(n) ≥ AS(n) ≥ mS(n). If MS(n) = mS(n), then we can speak of a general space complexity S(n), where S(n) = MS(n) = AS(n) = mS(n).
Typically, the space complexities of the algorithms are not calculated exactly. We only calculate their asymptotic order or make asymptotic estimation(s) using the big-O notation.

Clearly, for any algorithm, mS(n) ∈ Ω(1). Thus, provided that MS(n) ∈ O(1), it follows that MS(n), AS(n), mS(n) ∈ Θ(1).

Definition 1.29 An algorithm works in-place, iff MS(n) ∈ O(1) is satisfied for it.

Definition 1.30 An algorithm works non-strictly in-place, iff MS(n) ∈ O(log n) is satisfied for it.

Notice that this latter definition is based on the fact that although lim_{n→∞} log n = ∞, the function log n still grows very slowly. For example, log2 10^3 < 10, log2 10^6 < 20, log2 10^9 < 30, log2 10^12 < 40.
2 Elementary Data Structures and Data Types

A data structure (DS) is a way to store and organize data in order to facilitate access and modifications. No single data structure works well for all purposes, and so it is important to know the strengths and limitations of several of them ([1] 1.1).
A data type (DT) is a data structure + its operations.
An abstract data type (ADT) is a mathematical or informal structure + its operations described mathematically or informally.
A representation of an ADT is an appropriate DS.
An implementation of an ADT is some program code of its operations.

[1] Chapter 10; [6] 4.1 - 4.4.2; [3] 3-5

2.1 Arrays, memory allocation and deallocation


The most common data type is the array. It is a finite sequence of data. Its data elements can be accessed and updated directly and efficiently through indexing. Most programming languages support arrays, and we can access any element of an array in Θ(1) time.
In this book, arrays must be declared in the following way.

A, Z : T[n]

In this example A and Z are two arrays of element type T and of size n. In our model, the array data structure is an array object containing its size and elements, and it is accessed through a so-called array pointer containing its memory address. The operations of the array can
• read its size, like A.length and Z.length (A.length = Z.length = n here),
• access (read and write) its elements through indexing, as usual, for example A[i] and Z[j] here.
Arrays are indexed from 0.
Provided that an object or variable is created by declaring it, this object or variable is deleted automatically when the subroutine containing it finishes. Then the memory area reserved by it can be reused for other purposes.

If we want to declare array pointers, we can do it in the following way.

P : T[]

Now, given the declarations above, after the assignment statement P := Z ,


P and Z refer to the same array object, P [0] is identical with Z[0], and so
on, P [n − 1] is identical with Z[n − 1].

Array objects can be created (i.e. allocated) dynamically like in C++, but our array objects always contain their size, unlike in C++. For example, the statement

P := new T[m]

creates a new array object, pointer P refers to it, and P.length = m.

Note that any object (and especially any array object) generated dynamically must be deleted (i.e. deallocated) explicitly when the object is not needed any more. Deletion is done like in C++, in order to avoid memory leaks.

delete P

suffices here. It does not delete the pointer but the pointed object. Having deleted the object, the memory area reserved by it can be reused for other purposes.
Unfortunately we cannot tell anything about the efficiency of memory allocation and deallocation in general. Sometimes these can be performed with Θ(1) time complexity, but usually they need much more time, and their efficiency is often unpredictable. Therefore we avoid overusing them and apply them only when it is really necessary.
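For comparison, the same allocation and deallocation in real C++ (our own sketch; note that a raw C++ array does not store its length, so the size m must be remembered separately, unlike the array objects of this book):

    #include <cstddef>

    void arrayDemo() {
        std::size_t m = 10;
        int* P = new int[m];            // P := new T[m]
        for (std::size_t i = 0; i < m; ++i)
            P[i] = 0;                   // P[0..m) := 0
        delete[] P;                     // delete P  (in C++, delete[] is required for arrays)
    }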

If we want to pass an array parameter to a subroutine, the actual parameter must be the identifier of the array, the formal parameter has to be specified as an array pointer, and the parameter passing copies the address of the actual array object into the formal parameter. If we write an identifier between the square brackets, it is a short notation for the length of the array.
The following two procedures are equivalent, because in the first structogram n = A.length.

init(A : T[n] ; x : T)
  for i := 0 to n − 1
    A[i] := x

init(Z : T[] ; x : T)
  for i := Z.length − 1 downto 0
    Z[i] := x

Given an array A, subarray A[u..v] denotes the following sequence of elements of A: ⟨A[u], . . . , A[v]⟩, where u and v are valid indexes in the array. If u > v, the subarray is empty (i.e. ⟨⟩). In this case u or v may be an invalid index.

Given an array A, subarray A[u..v) denotes the following sequence of elements of A: ⟨A[u], . . . , A[v − 1]⟩, where u and v − 1 are valid indexes in the array. If u ≥ v, the subarray is empty (i.e. ⟨⟩).

2.2 Stacks
A stack is a LIFO (Last-In First-Out) data storage. It can be imagined as a
vertical sequence of items similar to a tower of plates on a table. We can push
a new item at the top, and we can check or remove (i.e. pop) the topmost
item.
In the following representation the elements of the stack are stored in
subarray A[0..n). Provided that n > 0, A[n − 1] is the top of the stack.
Provided that n = 0, the stack is empty.
T (n) ∈ Θ(1) for each method, because there is neither iteration nor sub-
routine invocation in their code. The time complexities of the constructor
and of the destructor depend on that of the new and delete expressions.
Stack
−A : T[] // T is some known type ; A.length is the max. size of the stack
−n : N // n ∈ 0..A.length is the actual size of the stack
+ Stack(m : N) {A := new T[m] ; n := 0} // create an empty stack
+ push(x : T) // push x onto the top of the stack
+ pop() : T // remove and return the top element of the stack
+ top() : T // return the top element of the stack
+ isFull() : B {return n = A.length}
+ isEmpty() : B {return n = 0}
+ setEmpty() {n := 0} // reinitialize the stack
+ ∼ Stack() { delete A }

Some operations of a Stack (the contents of A grow from index 0; the top is A[n − 1]):

Stack(4):  A = [ ,  ,  ,  ],  n = 0
push(5):   A = [5,  ,  ,  ],  n = 1
push(3):   A = [5, 3,  ,  ],  n = 2
push(7):   A = [5, 3, 7,  ],  n = 3
push(2):   A = [5, 3, 7, 2],  n = 4


Stack::push(x : T)
  if n < A.length
    A[n] := x
    n++
  else
    StackOverflow

Stack::pop() : T
  if n > 0
    n−−
    return A[n]
  else
    StackUnderflow

Stack::top() : T
  if n > 0
    return A[n − 1]
  else
    StackUnderflow

Example of a simple use of a stack: printing n input items in reversed order. We suppose that read(x) reads the next piece of data from the current input, and write(x) prints the value of x to the current output. T_reverse(n) ∈ Θ(n).

reverse(n : N)
  v : Stack(n)
  x : T
  while n > 0
    read(x)
    v.push(x)
    n := n − 1
  while ¬v.isEmpty()
    write(v.pop())
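A possible C++ rendering of the Stack class and the reverse procedure above (a sketch of ours; the element type is fixed to int, and overflow/underflow handling is reduced to preconditions, whereas the notes signal StackOverflow/StackUnderflow):

    #include <iostream>
    typedef int T;              // element type (the notes use a generic type T)

    class Stack {
        T*  A;                  // array holding the elements
        int n;                  // actual size of the stack, 0..capacity
        int cap;                // capacity, A.length in the pseudocode
    public:
        explicit Stack(int m) : A(new T[m]), n(0), cap(m) {}
        ~Stack() { delete[] A; }
        bool isFull()  const { return n == cap; }
        bool isEmpty() const { return n == 0; }
        void setEmpty()      { n = 0; }
        void push(const T& x) { if (!isFull()) A[n++] = x; }   // else: StackOverflow
        T pop()        { return A[--n]; }                       // precondition: n > 0
        T top() const  { return A[n - 1]; }                     // precondition: n > 0
    };

    // Print n input items in reversed order, as in the reverse(n) structogram.
    void reverse(int n) {
        Stack v(n);
        T x;
        while (n > 0 && (std::cin >> x)) { v.push(x); --n; }
        while (!v.isEmpty()) std::cout << v.pop() << '\n';
    }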

2.3 Queues
A queue is a FIFO (First-In First-Out) data storage. It can be imagined as a horizontal sequence of items, similar to a queue at the cashier's desk. We can add a new item to the end of the queue, and we can check or remove the first item.
In the following representation the elements of the queue are stored in

⟨Z[k], Z[(k + 1) mod Z.length], . . . , Z[(k + n − 1) mod Z.length]⟩

Provided that n > 0, Z[k] is the first element of the queue, where n is the length of the queue. Provided that n = 0, the queue is empty.
T(n) ∈ Θ(1) for each method, because there is neither iteration nor subroutine invocation in their code. The time complexities of the constructor and of the destructor depend on those of the new and delete expressions.

Queue
−Z : T[] // T is some known type
−n : N // n ∈ 0..Z.length is the actual length of the queue
−k : N // k ∈ 0..(Z.length−1) is the starting position of the queue in array Z
+ Queue(m : N){ Z := new T[m] ; n := 0 ; k := 0 } // create an empty queue
+ add(x : T) // join x to the end of the queue
+ rem() : T // remove and return the rst element of the queue
+ first() : T // return the first element of the queue
+ length() : N {return n}
+ isFull() : B {return n = Z.length}
+ isEmpty() : B {return n = 0}
+ ∼ Queue() { delete Z }
+ setEmpty() {n := 0} // reinitialize the queue

Some operations of a Queue (Z has 4 slots; k is the index of the first element):

Queue(4):   Z = [ ,  ,  ,  ],  k = 0, n = 0
add(5):     Z = [5,  ,  ,  ],  k = 0, n = 1
add(3):     Z = [5, 3,  ,  ],  k = 0, n = 2
rem() : 5   Z = [ , 3,  ,  ],  k = 1, n = 1
rem() : 3   Z = [ ,  ,  ,  ],  k = 2, n = 0
add(7):     Z = [ ,  , 7,  ],  k = 2, n = 1
add(2):     Z = [ ,  , 7, 2],  k = 2, n = 2
add(4):     Z = [4,  , 7, 2],  k = 2, n = 3   (the queue wraps around: ⟨7, 2, 4⟩)

Queue::add(x : T)
  if n < Z.length
    Z[(k + n) mod Z.length] := x
    n++
  else
    QueueOverflow

Queue::rem() : T
  if n > 0
    n−−
    i := k
    k := (k + 1) mod Z.length
    return Z[i]
  else
    QueueUnderflow

Queue::first() : T
  if n > 0
    return Z[k]
  else
    QueueUnderflow
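A matching C++ sketch of the circular-buffer Queue (ours; as with the Stack sketch, overflow/underflow handling is reduced to preconditions):

    typedef int T;              // element type

    class Queue {
        T*  Z;                  // circular buffer
        int n;                  // current length of the queue
        int k;                  // index of the first element
        int cap;                // capacity, Z.length in the pseudocode
    public:
        explicit Queue(int m) : Z(new T[m]), n(0), k(0), cap(m) {}
        ~Queue() { delete[] Z; }
        int  length()  const { return n; }
        bool isFull()  const { return n == cap; }
        bool isEmpty() const { return n == 0; }
        void setEmpty()      { n = 0; }
        void add(const T& x) {                  // join x to the end of the queue
            if (!isFull()) { Z[(k + n) % cap] = x; ++n; }   // else: QueueOverflow
        }
        T rem() {                               // remove and return the first element
            --n;                                // precondition: n > 0
            int i = k;
            k = (k + 1) % cap;
            return Z[i];
        }
        T first() const { return Z[k]; }        // precondition: n > 0
    };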

We described the methods of stacks and queues with simple code: we applied neither iterations nor recursion. Therefore the time complexity of each method is Θ(1). This is a fundamental requirement for each implementation of stacks and queues.
Note that with linked list representations this constraint can be guaranteed only if MT_new, MT_delete ∈ Θ(1).
Considering the above implementations of stacks and queues, the space complexity of the destructor and of each method is also Θ(1), because they create no data structure, but use only a few temporary variables. The space complexity of the constructor is clearly Θ(m).

3 Algorithms in computing: Insertion Sort

[1] Chapter 1-3; [4] Chapter 1, 2, 7


insertionSort(A : T[n])
  for i := 1 to n − 1
    if A[i − 1] > A[i]
      x := A[i]
      A[i] := A[i − 1]
      j := i − 2
      while j ≥ 0 ∧ A[j] > x
        A[j + 1] := A[j]
        j := j − 1
      A[j + 1] := x
    else
      SKIP
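A direct C++ transcription of the structogram above (our own sketch, with the element type fixed to int):

    // Sort A[0..n) non-decreasingly by insertion sort (stable, in-place).
    void insertionSort(int A[], int n) {
        for (int i = 1; i < n; ++i) {
            if (A[i - 1] > A[i]) {            // A[i] is out of order
                int x = A[i];
                A[i] = A[i - 1];
                int j = i - 2;
                while (j >= 0 && A[j] > x) {  // shift the greater elements right
                    A[j + 1] = A[j];
                    --j;
                }
                A[j + 1] = x;                 // insert x into its place
            }
        }
    }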

mT_IS(n) = 1 + (n − 1) = n

MT_IS(n) = 1 + (n − 1) + Σ_{i=1}^{n−1} (i − 1) = n + Σ_{j=0}^{n−2} j = n + (n − 1)(n − 2)/2

MT_IS(n) = (1/2)n^2 − (1/2)n + 1

MT_IS(n) ≈ (1/2) ∗ n^2, if n is large.

Let us suppose that our computer can perform 2 ∗ 10^9 elementary operations per second. Considering the code of insertion sort above, and counting these operations as a function of n, we get that mT(n) ≥ 8 ∗ n and MT(n) > 6 ∗ n^2 elementary operations. Counting with mT(n) ≈ 8 ∗ n and MT(n) ≈ 6 ∗ n^2, we receive the following table of running times as lower estimates:
5 2 7 1 4 6 8 2'

2 5 7 1 4 6 8 2'

2 5 7 1 4 6 8 2'

1 2 5 7 4 6 8 2' (*)

1 2 4 5 7 6 8 2'

1 2 4 5 6 7 8 2'

1 2 4 5 6 7 8 2'

1 2 2' 4 5 6 7 8

(*) detailed:

1 2 5 7 4 6 8 2'

x=4
1 2 5 7 6 8 2'

x=4
1 2 5 7 6 8 2'

x=4

Figure 1: An illustration of Insertion Sort. In such examples, key = key′ for any key. In this way, we demonstrate the handling of different occurrences of the same key. Exercise: Give a detailed illustration of the last insertion.

n       mT(n)       time           MT(n)       time
1000    8000        4 ∗ 10^−6 s    6 ∗ 10^6    0.003 s
10^6    8 ∗ 10^6    0.004 s        6 ∗ 10^12   50 min
10^7    8 ∗ 10^7    0.04 s         6 ∗ 10^14   ≈ 3.5 days
10^8    8 ∗ 10^8    0.4 s          6 ∗ 10^16   ≈ 347 days
10^9    8 ∗ 10^9    4 s            6 ∗ 10^18   ≈ 95 years

This means that in the worst case insertion sort is too slow to sort one million elements, and it becomes completely impractical if we try to sort a huge amount of data. Let us consider the average case:

AT_IS(n) ≈ 1 + (n − 1) + Σ_{i=1}^{n−1} (i − 1)/2 = n + (1/2) ∗ Σ_{j=0}^{n−2} j =
n + (1/2) ∗ (n − 1)(n − 2)/2 = (1/4)n^2 + (1/4)n + (1/2)

This calculation shows that the expected or average running time of insertion sort is roughly half of the time needed in the worst case, so even the expected running time of insertion sort is too long to sort one million elements, and it becomes completely impractical if we try to sort a huge amount of data.
The asymptotic time complexities:

mT_IS(n) ∈ Θ(n)
AT_IS(n), MT_IS(n) ∈ Θ(n^2)

Let us notice that the minimum running time is very good. There is no chance to sort elements faster than in linear time, because each piece of data must be checked. One may say that this best case is not much gain, because in this case the items are already sorted. The fact is that if the input is nearly sorted, we can remain close to the best case, and insertion sort turns out to be the best method to sort such data.

Insertion sort is also stable: stable sorting algorithms maintain the relative order of records with equal keys (i.e. values). That is, a sorting algorithm is stable if, whenever there are two records R and S with the same key and with R appearing before S in the original list, R will appear before S in the sorted list. (For example, see keys 2 and 2′ in Figure 1.) Stability is an important property of sorting methods in some applications.

The space complexity of insertion sort is Θ(1), because it creates no data structure, but uses only a few temporary variables.
4 Fast sorting algorithms based on the divide and conquer approach

In computer science, divide and conquer is an algorithm design paradigm based on multi-branched recursion. A divide and conquer algorithm works by recursively breaking down a problem into two or more sub-problems of the same or related type, until these become simple enough to be solved directly. The solutions to the sub-problems are then combined to give a solution to the original problem.
This divide and conquer technique is the basis of efficient algorithms for many kinds of problems, for example sorting (e.g., quicksort, merge sort).

4.1 Merge sort

5 3 1 6 8 2 3'

5 3 1 6 8 2 3'

5 3 1 6 8 2 3'

3 1 6 8 2 3'

1 3 6 8 2 3'

1 3 5 2 3' 6 8

1 2 3 3' 5 6 8

Figure 2: An illustration of merge sort.

The divide and conquer paradigm is often used to find the optimal solution of a problem. Its basic idea is to decompose a given problem into two or more similar, but simpler, subproblems, to solve them in turn, and to compose their solutions to solve the given problem. Problems of sufficient simplicity are solved directly. For example, to sort a given list of n keys, split it into two lists of about n/2 keys each, sort each of them in turn, and interleave (i.e. merge) both results appropriately to obtain the sorted version of the given list. (See Figure 2.) This approach is known as the merge sort algorithm.
Merge sort is stable (preserving the input order of items with equal keys), and its worst-case time complexity is asymptotically optimal among comparison sorts. (See section 7 for details.)


mergeSort(A : T[n])
  B : T[n] ; B[0..n) := A[0..n)
  // Using B[0..n), sort A[0..n) non-decreasingly:
  ms(B, A, 0, n)

ms(B, A : T[] ; u, v : N)
  // Initially B[u..v) = A[u..v).
  // Using B[u..v), sort A[u..v) non-decreasingly:
  if v − u > 1
    m := ⌊(u + v)/2⌋
    ms(A, B, u, m)          // using A[u..m), sort B[u..m)
    ms(A, B, m, v)          // using A[m..v), sort B[m..v)
    merge(B, A, u, m, v)    // merge B[u..m) and B[m..v) into A[u..v)
  else
    SKIP

Using m = ⌊(u + v)/2⌋, A[u..m) and A[m..v) have the same length if the length of A[u..v) is an even number; and A[u..m) is shorter by one than A[m..v) if the length of A[u..v) is an odd number; because

length(A[u..m)) = m − u = ⌊(u + v)/2⌋ − u = ⌊(v − u)/2⌋ = ⌊length(A[u..v))/2⌋


merge(B, A : T[] ; u, m, v : N)
  // sorted merge of B[u..m) and B[m..v) into A[u..v)
  k := u                // in the loop, copy into A[k]
  i := u ; j := m       // from B[i] or B[j]
  while i < m ∧ j < v
    if B[i] ≤ B[j]
      A[k] := B[i] ; i := i + 1
    else
      A[k] := B[j] ; j := j + 1
    k := k + 1
  if i < m
    A[k..v) := B[i..m)
  else
    A[k..v) := B[j..v)

The stability of merge sort is ensured in the explicit loop of merge, because in case of B[i] = B[j], B[i] is copied into A[k], and B[i] came earlier in the input.
Merge makes l loop iterations, where l = v − u is the length of the actual subarrays A[u..v) and B[u..v). In order to prove this statement, consider the value of k after the explicit loop of merge. Clearly, this loop fills subarray A[u..k), and it copies one element per iteration. Thus it iterates k − u times. In addition, the implicit loop hidden in A[k..v) := . . . iterates v − k times. Therefore the sum of the loop iterations is (k − u) + (v − k) = v − u = l. Consequently, for the body of procedure merge (mb) we have:

T_mb(l) = l   (l = v − u)
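The structograms above translate to C++ roughly as follows (our own sketch; it allocates the auxiliary array B inside mergeSort, as the pseudocode does, and fixes the element type to int):

    #include <algorithm>   // std::copy

    // Sorted merge of B[u..m) and B[m..v) into A[u..v).
    static void merge(const int B[], int A[], int u, int m, int v) {
        int k = u, i = u, j = m;
        while (i < m && j < v)
            A[k++] = (B[i] <= B[j]) ? B[i++] : B[j++];   // stable: B[i] wins on equality
        while (i < m) A[k++] = B[i++];
        while (j < v) A[k++] = B[j++];
    }

    // Using B[u..v) (initially a copy of A[u..v)), sort A[u..v) non-decreasingly.
    static void ms(int B[], int A[], int u, int v) {
        if (v - u > 1) {
            int m = (u + v) / 2;
            ms(A, B, u, m);        // using A[u..m), sort B[u..m)
            ms(A, B, m, v);        // using A[m..v), sort B[m..v)
            merge(B, A, u, m, v);  // merge the sorted halves of B into A
        }
    }

    void mergeSort(int A[], int n) {
        int* B = new int[n];
        std::copy(A, A + n, B);    // B[0..n) := A[0..n)
        ms(B, A, 0, n);
        delete[] B;
    }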

4.1.1 The time complexity of merge sort


Merge sort is one of the fastest sorting algorithms, and there is not a big difference between its worst-case and best-case (i.e. maximal and minimal) running times. For our array sorting version, we state:

MT_mergeSort(n) = mT_mergeSort(n) = T_mergeSort(n) ∈ Θ(n log n)


Proof: Clearly T_mergeSort(0) = 2. We suppose that n > 0. First we count all the loop iterations of procedure merge. Next we count the procedure calls of merge sort.
Loop iterations: We proved above that a single call of merge(B, A, u, m, v) makes T_mb(l) = l iterations, where l = v − u. Let us consider the levels of recursion of procedure ms(B, A, u, v). At level 0 of the recursion, ms is called for the whole array A[0..n). Considering all the recursive calls and the corresponding subarrays at a given depth of recursion, at level 1 this array is divided into two halves, at level 2 into 4 parts, and so on. Let n_ij be the length of the j-th subarray of the form A[u..v) at level i, and let m = ⌊log n⌋. We have:

At level 0:      2^m ≤ n_01 = n < 2^(m+1)
At level 1:      2^(m−1) ≤ n_1j ≤ 2^(m+1−1)     (j ∈ [1..2^1])
At level 2:      2^(m−2) ≤ n_2j ≤ 2^(m+1−2)     (j ∈ [1..2^2])
...
At level i:      2^(m−i) ≤ n_ij ≤ 2^(m+1−i)     (i ∈ [1..m], j ∈ [1..2^i])
...
At level m − 1:  2 ≤ n_(m−1)j ≤ 4 = 2^2         (j ∈ [1..2^(m−1)])
At level m:      1 ≤ n_mj ≤ 2 = 2^1             (j ∈ [1..2^m])
At level m + 1:  n_(m+1)j = 1                   (j ∈ [1..(n − 2^m)])

Thus at levels [0..m] these subarrays cover the whole array. At levels [0..m), merge is called for each subarray, but at level m it is called only for those subarrays with length 2, and the number of these subarrays is clearly n − 2^m. Merge makes as many iterations as the length of the actual A[u..v). Consequently, at each level in [0..m), merge makes n iterations altogether in all the merge calls of the level. At level m the sum of the iterations is 2 ∗ (n − 2^m), and there is no iteration at level m + 1. Therefore the sum total of all the iterations during the merge calls is

T_mb[0..m](n) = n ∗ m + 2 ∗ (n − 2^m).
The number of procedure calls: Clearly, the ms calls form a strictly binary tree. (See section 6.2.) The leaves of this tree correspond to the subarrays with length 1. Thus this strictly binary tree has n leaves and n − 1 internal nodes. Consequently we have 2n − 1 calls of ms, and n − 1 calls of merge. Adding to this the single call of mergeSort, we receive 3n − 1 procedure calls altogether.
And there are n iterations hidden in the initial assignment B[0..n) := A[0..n) in procedure mergeSort. Thus the number of steps of mergeSort is

T(n) = (n + n ∗ m + 2 ∗ (n − 2^m)) + (3 ∗ n − 1) = n ∗ ⌊log n⌋ + 2 ∗ (n − 2^⌊log n⌋) + 4 ∗ n − 1

n ∗ log n + 2 ∗ n ≤ n ∗ (log n − 1) + 4 ∗ n − 1 ≤ T(n) < n ∗ log n + 6 ∗ n

Thus T(n) ∈ Ω(n log n) (see Consequence 1.8) and T(n) ∈ O(n log n) (see Consequence 1.7). After all, we have for mergeSort(A : T[n])

T(n) ∈ Θ(n log n).
4.1.2 The space complexity of merge sort

Considering the above code of merge sort, in the main procedure we create the temporary array B : T[n], and we need a temporary variable for the implicit for loop. Altogether we need Θ(n) working memory here. In addition, we have seen in the previous subsection (4.1.1) that the number of levels of recursion is ⌊log n⌋ + 1, which is approximately log n, and we have a constant amount of temporary variables at each level. Thus we need Θ(log n) working memory inside the call stack in order to control the recursion. And Θ(log n) ≺ Θ(n).
These measures clearly do not depend on the content of the array to be sorted. For this reason, the space complexity of this array version of merge sort is S(n) ∈ Θ(n).

4.2 Quicksort

Quicksort is a divide and conquer algorithm. Quicksort first divides a large array into two smaller sub-arrays: the low elements and the high elements. Quicksort can then recursively sort the sub-arrays.

The steps are:

• Pick an element, called the pivot, from the array.

• Partitioning: reorder the array so that all elements with values less than the pivot come before the pivot, while all elements with values greater than the pivot come after it (equal values can go either way). After this partitioning, the pivot is in its final position. This is called the partition operation.

• Recursively apply the above steps to the sub-array of elements with smaller values and separately to the sub-array of elements with greater values.

• The base case of the recursion is arrays of size zero or one, which are in order by definition, so they never need to be sorted.

The pivot selection and partitioning steps can be done in several different ways; the choice of specific implementation schemes greatly affects the algorithm's performance.


quicksort(A : T[n])
  QS(A, 0, n − 1)

QS(A : T[] ; p, r : N)
  if p < r
    q := partition(A, p, r)
    QS(A, p, q − 1)
    QS(A, q + 1, r)
  else
    SKIP

partition(A : T[] ; p, r : N) : N
  i := random(p, r)
  pivot := A[i] ; A[i] := A[r]
  i := p
  while i < r ∧ A[i] ≤ pivot
    i := i + 1
  if i < r                  // A[i] > pivot
    j := i + 1
    while j < r
      if A[j] < pivot
        swap(A[i], A[j])
        i := i + 1
      else
        SKIP
      j := j + 1
    A[r] := A[i] ; A[i] := pivot
  else                      // i = r, i.e. A[p..r) ≤ pivot
    A[r] := pivot
  return i

Explanation of function partition: In order to illustrate the operation of function partition, let us introduce the following notation:

• A[k..m) ≤ pivot, iff for each l ∈ [k..m), A[l] ≤ pivot

• A[k..m) ≥ pivot, iff for each l ∈ [k..m), A[l] ≥ pivot

We suppose that subarray A[p..r] is cut into pieces and the pivot is the second 5, i.e. the 4th element of this subarray (with index p + 3).

Input:
p r
A: 5 3 8 5 6 4 7 1

Preparations for the first loop:
p r
A: 5 3 8 1 6 4 7 pivot = 5

The first loop searches for the first item greater than the pivot, if there is such an element. (The occurrences of index i reflect its left-to-right movement.)
i=p i i r
A: 5 3 8 1 6 4 7 pivot = 5

We have found an item greater than the pivot. Variable j starts at the next
item.
p i j r
A: 5 3 8 1 6 4 7 pivot = 5

We have cut A[p..r] into four sections:

p ≤ i < j ≤ r
A[p..i) ≤ pivot,   A[i..j) ≥ pivot,   A[j..r) (unknown),   A[r] (undefined).

And this is an invariant of the second loop. (Notice that items equal to the pivot may be either in the first or in the second section of A[p..r].)

Starting the run of the second loop, it turns out that A[j] = 1 < pivot must be exchanged with A[i] = 8 ≥ pivot, in order to move A[j] into the first section of A[p..r] (which contains items ≤ the pivot), and to move A[i] (A[i] ≥ pivot) to the end of the second section (which contains items ≥ the pivot). (The items to be exchanged are printed in bold.)
p i j r
A: 5 3 8 1 6 4 7 pivot = 5

Now we exchange A[i] and A[j]. Thus the length of the first section of A[p..r] (containing items ≤ the pivot) has been increased by 1, while its second section (containing items ≥ the pivot) has been moved by one position. So we increment variables i and j by 1, in order to keep the invariant of the loop.
p i j r
A: 5 3 1 8 6 4 7 pivot = 5

Now A[j] = 6 ≥ pivot = 5, so we add A[j] to the second section of A[p..r],


i.e. we increment j by 1.
p i j r
A: 5 3 1 8 6 4 7 pivot = 5

And now A[j] = 4 < pivot = 5, so A[j] must be exchanged with A[i]. (The
items to be exchanged are printed in bold.)

Now we exchange A[i] and A[j]. Thus the length of the first section of A[p..r] (containing items ≤ the pivot) has been increased by 1, while its second section (containing items ≥ the pivot) has been moved by one position. So we increment variables i and j by 1, in order to keep the invariant of the loop.
p i j r
A: 5 3 1 4 6 8 7 pivot = 5

Now A[j] = 7 ≥ pivot = 5, so we add A[j] to the second section of A[p..r],


i.e. we increment j by 1.
And now j = r , therefore the third (the unknown) section of A[p..r]
disappeared.
p i j=r
A: 5 3 1 4 6 8 7 pivot = 5

Now the first two sections cover A[p..r), and the second section is not empty because of the invariant i < j. Thus the pivot can be put in between the items ≤ it and the items ≥ it: first we move the first element (A[i]) of the second section (A[i..j)) to the end of A[p..r], then we put the pivot into A[i]. (The pivot is printed in bold.)
p i j=r
A: 5 3 1 4 5 8 7 6

Now the partitioning of subarray A[p..r] has been completed. We return the position of the pivot (i), in order to inform procedure QS(A, p, r) which subarrays of A[p..r] must be sorted recursively.

In the other case of procedure partition, the pivot is a maximum of A[p..r]. This case is trivial. Let the reader consider it.

The time complexity of function partition is linear, because the two loops
together perform r−p−1 or r−p iterations.
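A C++ sketch of this randomized quicksort (ours; rand() is used as a stand-in for the random(p, r) of the pseudocode, and the element type is fixed to int):

    #include <cstdlib>    // rand()
    #include <utility>    // std::swap

    // Partition A[p..r] around a randomly chosen pivot; return the pivot's final index.
    static int partition(int A[], int p, int r) {
        int i = p + rand() % (r - p + 1);       // i := random(p, r)
        int pivot = A[i];
        A[i] = A[r];                            // park A[r] at the pivot's old place
        i = p;
        while (i < r && A[i] <= pivot) ++i;     // first element > pivot, if any
        if (i < r) {
            for (int j = i + 1; j < r; ++j)
                if (A[j] < pivot) { std::swap(A[i], A[j]); ++i; }
            A[r] = A[i];
            A[i] = pivot;
        } else {
            A[r] = pivot;                       // A[p..r) <= pivot
        }
        return i;
    }

    static void QS(int A[], int p, int r) {
        if (p < r) {
            int q = partition(A, p, r);
            QS(A, p, q - 1);
            QS(A, q + 1, r);
        }
    }

    void quicksort(int A[], int n) { QS(A, 0, n - 1); }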

There is a big difference between the best-case and worst-case time complexities of quicksort. At each level of recursion, counting all the elements in the subarrays of that level, we have to divide altogether at most n elements. Therefore the time complexity of each level of recursion is O(n).
In a lucky case, at each level of recursion, in the partitioning process, the pivot divides the actual subarray into two parts of approximately equal lengths, and the depth of recursion will be about log n. Thus the best-case time complexity of quicksort is O(n log n). It can be proved that it is Θ(n log n).
In an unlucky case, at each level of recursion, in the partitioning process, the pivot will be the maximum or minimum of the actual subarray. And after partitioning, the shorter partition will be empty, but the longer partition will have all the elements, except for the pivot. Considering always the longer subarray: at level zero, we have n elements; at level one, n − 1 elements; . . . at level i, n − i elements (i ∈ 0..(n − 1)). As a result, in this case, we have n levels of recursion, and we need approximately n − i steps only for partitioning at level i. Consequently the worst-case time complexity of quicksort is Ω(n^2). It can be proved that it is Θ(n^2).
Fortunately, the probability of the worst case is extremely low, and the average case is much closer to the best case. As a result,

mT_quicksort(n), AT_quicksort(n) ∈ Θ(n log n) ∧ MT_quicksort(n) ∈ Θ(n^2)

Considering its space complexity, quicksort does not use any temporary data structure. Thus the memory needs are determined by the number of levels of recursion. As we have seen before, in the worst case it is n, and in the best case it is about log n. Fortunately, in the average case, it is Θ(log n). Consequently

mS_quicksort(n), AS_quicksort(n) ∈ Θ(log n) ∧ MS_quicksort(n) ∈ Θ(n)

It is known that on short arrays insertion sort is more efficient than the fast sorting methods (merge sort, heap sort, quicksort). Thus procedure quicksort(A, p, r) can be optimized by switching to insertion sort on short subarrays. Because in the recursive calls we determine a lot of short subarrays, the speed-up can be significant, although it changes neither the time nor the space complexities.


QS(A : T[] ; p, r : N)
  if p + c < r
    q := partition(A, p, r)
    QS(A, p, q − 1)
    QS(A, q + 1, r)
  else
    insertionSort(A, p, r)

c ∈ N is a constant. Its optimal value depends on many factors, but usually it is between 20 and 50.

Exercise 4.1 How can merge sort be sped up in a similar way?

Considering its expected running time, quicksort is one of the most efficient sorting methods. However, it slows down if (for example) by chance function partition always selects the maximal or minimal element of the actual subarray. [In this case, the time complexity of quicksort is Θ(n^2).] The probability of such cases is low. However, if we want to avoid such cases, we can pay attention to the depth of recursion, and switch to heap sort (see later) when the recursion becomes too deep (for example, deeper than 2 ∗ log n). With this optimization, even the worst-case time complexity is Θ(n log n), and even the worst-case space complexity is Θ(log n).

5 Linked Lists

One-way lists can represent finite sequences of data. They consist of zero or more elements (i.e. nodes), where each element of the list contains some data and linking information.

5.1 One-way or singly linked lists

In singly linked lists each element contains a key and a next pointer referring to the next element of the list. A pointer referring to the front of the list identifies the list. The next pointer of the last element typically¹ contains the so-called NULL (i.e. ∅) pointer, which is the address of no object. (With some abstraction, the nonexistent object with address ∅ can be called the NO object.) In this book E1 is the element type of one-way lists.

E1
+key : T
. . . // satellite data may come here
+next : E1*
+E1() { next := ∅ }

5.1.1 Simple one-way lists (S1L)

An S1L is identified by a pointer referring to its first element, or this pointer is ∅ if the list is empty.

L1 = ∅
L1 → 9 → 16 → 4 → 1 → ∅
L1 → 25 → 9 → 16 → 4 → 1 → ∅
L1 → 25 → 9 → 16 → 4 → ∅

Figure 3: Pointer L1 identifies simple one-way lists at different stages of an imaginary program. In the first line the S1L is empty.

¹ except in cyclic lists

S1L_length(L : E1*) : N
  n := 0
  p := L
  while p ≠ ∅
    n := n + 1
    p := p → next
  return n

If n is the length of list L, then the loop iterates n times, so T_S1L_length(n) = 1 + n, therefore T_S1L_length(n) ∈ Θ(n).
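In C++, the element type E1 and S1L_length might look like this (a sketch of ours with the key type fixed to int; the notes' ∅ corresponds to nullptr):

    struct E1 {
        int key;           // the key (satellite data could be added here)
        E1* next;          // pointer to the next element, or nullptr
        E1() : key(0), next(nullptr) {}
    };

    // Length of a simple one-way list: 1 call + n iterations, so T(n) ∈ Θ(n).
    int S1L_length(const E1* L) {
        int n = 0;
        for (const E1* p = L; p != nullptr; p = p->next)
            ++n;
        return n;
    }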

5.1.2 One-way lists with header node (H1L)

The header node, or simply header, of a list is an extra zeroth element of the list with undefined key. A list with a header cannot be ∅; it always contains a header. The pointer identifying the list always refers to its header.

L2 → [header] → ∅
L2 → [header] → 4 → 8 → 2 → ∅
L2 → [header] → 17 → 4 → 8 → 2 → ∅
L2 → [header] → 17 → 8 → 2 → ∅

Figure 4: Pointer L2 identifies one-way lists with header at different stages of an imaginary program. In the first line the H1L is empty.

H1L_length(H : E1*) : N
  return S1L_length(H → next)

T_H1L_length(n) ∈ Θ(n), where n is the length of H1L H.

5.1.3 One-way lists with trailer node

Some one-way lists contain a trailer node instead of a header. A trailer node is always at the end of such lists. The data members (key and next) of the trailer node are typically undefined, and we have two pointers identifying the list: one of them refers to its first element, and the other to its trailer node. If the list is empty, both identifying pointers refer to its trailer node.

Such a list is ideal for representing a queue. Its method add(x : T) copies x into the key of the trailer node, and joins a new trailer node to the end of the list.

5.1.4 Handling one-way lists

insertAtFront(&L, q : E1*)
  q → next := L
  L := q

unlinkFirst(&L : E1*) : E1*
  q := L
  L := q → next
  q → next := ∅
  return q

follow(p, q : E1*)
  q → next := p → next
  p → next := q

unlinkNext(p : E1*) : E1*
  q := p → next
  p → next := q → next
  q → next := ∅
  return q

T_insertAtFront, T_unlinkFirst, T_follow, T_unlinkNext ∈ Θ(1)
S_insertAtFront, S_unlinkFirst, S_follow, S_unlinkNext ∈ Θ(1)

cut(L : E1* ; n : N) : E1*
  p := L
  while n > 1
    n := n − 1
    p := p → next
  q := p → next
  p → next := ∅
  return q

H1L_read() : E1*
  H := v := new E1
  while ⟨there is input data⟩
    read(x)
    v := v → next := new E1
    v → key := x
    // satellite data may be read here
  return H

T_cut(n) ∈ Θ(n) ∧ S_cut(n) ∈ Θ(1)
T_H1L_read(n) ∈ Θ(n) ∧ S_H1L_read(n) ∈ Θ(n)
where n is the length of the list.

5.1.5 Insertion sort of H1Ls

H1L_insertionSort(H : E1*)
  s := H → next
  if s ≠ ∅
    u := s → next
    while u ≠ ∅
      if s → key ≤ u → key
        s := u
      else
        s → next := u → next
        p := H ; q := H → next
        while q → key ≤ u → key
          p := q ; q := q → next
        u → next := q ; p → next := u
      u := s → next
  else
    SKIP

mT_IS(n) ∈ Θ(n) ∧ AT_IS(n), MT_IS(n) ∈ Θ(n^2) ∧ S_IS(n) ∈ Θ(1)

where n is the length of H1L H. Clearly, procedure H1L_insertionSort(H : E1*) is stable. Like the array version, it also sorts in-place.

5.1.6 Merge sort of S1Ls

mergeSort(&L : E1*)
  // L is an S1L.
  n := S1L_length(L)
  ms(L, n)

ms(&L : E1* ; n : N)
  if n > 1
    n1 := ⌊n/2⌋
    L2 := cut(L, n1)
    ms(L, n1)
    ms(L2, n − n1)
    L := merge(L, L2)
  else
    SKIP

merge(L1, L2 : E1*) : E1*
  if L1 → key ≤ L2 → key
    L := t := L1 ; L1 := L1 → next
  else
    L := t := L2 ; L2 := L2 → next
  while L1 ≠ ∅ ∧ L2 ≠ ∅
    if L1 → key ≤ L2 → key
      t := t → next := L1 ; L1 := L1 → next
    else
      t := t → next := L2 ; L2 := L2 → next
  if L1 ≠ ∅
    t → next := L1
  else
    t → next := L2
  return L

mT_MS(n), MT_MS(n) ∈ Θ(n log n) ∧ S_MS(n) ∈ Θ(log n)

where n is the length of S1L L. Clearly, procedure mergeSort(&L : E1*) is stable. Unlike the array version of merge sort, it sorts non-strictly in-place.
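For illustration, the merge step above in C++ (our sketch, building on the E1 type from section 5.1.1; L1 and L2 are assumed to be non-empty, non-decreasing S1Ls):

    // Stable, sorted merge of two non-empty, non-decreasing S1Ls; returns the merged list.
    E1* merge(E1* L1, E1* L2) {
        E1 *L, *t;
        if (L1->key <= L2->key) { L = t = L1; L1 = L1->next; }
        else                    { L = t = L2; L2 = L2->next; }
        while (L1 != nullptr && L2 != nullptr) {
            if (L1->key <= L2->key) { t = t->next = L1; L1 = L1->next; }
            else                    { t = t->next = L2; L2 = L2->next; }
        }
        t->next = (L1 != nullptr) ? L1 : L2;   // append the remaining tail
        return L;
    }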

5.1.7 Cyclic one-way lists

The last element of the list does not contain ∅ in its next field; instead this pointer points back to the beginning of the list.
If a cyclic one-way list does not contain a header node, and it is not empty, the next field of its last element points back to its first element. If it is empty, it is represented by the ∅ pointer. A pointer identifying a nonempty cyclic one-way list typically refers to its last element.
If a cyclic one-way list has a header node, the next field of its last element points back to its header node. If it is empty, the next field of its header node points to itself. Notice that the header of a cyclic list is also the trailer node of that list.
Such lists are also good choices for representing queues. Given a queue represented by a cyclic list with header, the method add(x : T) copies x into the key of the trailer/header node, and inserts a new trailer/header node into the list.

5.2 Two-way or doubly linked lists

In a two-way list each element contains a key and two pointers: a next and a prev pointer referring to the next and previous elements of the list. A pointer referring to the front of the list identifies the list.

5.2.1 Simple two-way lists (S2L)

An S2L is identified by a pointer referring to its first element, or this pointer is ∅ if the list is empty.
The prev pointer of the first element, and the next pointer of the last element, contain the ∅ pointer.

L1 = ∅                       (each element: prev | key | next)
L1 → 9 ⇄ 16 ⇄ 4 ⇄ 1
L1 → 25 ⇄ 9 ⇄ 16 ⇄ 4 ⇄ 1
L1 → 25 ⇄ 9 ⇄ 16 ⇄ 1

Figure 5: Pointer L1 identifies simple two-way lists (S2L) at different stages of an imaginary program. In the first line list L1 is initialized to empty.

Handling of S2Ls is inconvenient because all the modifications of an S2L must be done differently at the front of the list, at the end of the list, and in between. Consequently we prefer cyclic two-way lists (C2L) in general. (In hash tables, however, S2Ls are usually preferred to C2Ls.)

5.2.2 Cyclic two-way lists (C2L)

A cyclic two-way list (C2L) contains a header node (i.e. header) by default, and it is identified by a pointer referring to its header. The next field of its last element points back to its header, which is considered a zeroth element. The prev pointer of its first element points back to its header, and the prev pointer of its header points back to its last element. If a C2L is empty, both the prev and next fields of its header point to itself. Notice that the header of a cyclic list is also the trailer node of the list.

The element type of C2Ls follows.

E2
+prev, next : E2* // refer to the previous and next neighbour, or to this
+key : T
+ E2() { prev := next := this }

L2 → [header]
L2 → [header] ⇄ 9 ⇄ 16 ⇄ 4 ⇄ 1
L2 → [header] ⇄ 25 ⇄ 9 ⇄ 16 ⇄ 4 ⇄ 1
L2 → [header] ⇄ 25 ⇄ 9 ⇄ 16 ⇄ 4

Figure 6: Pointer L2 identifies cyclic two-way lists (C2L) at different stages of an imaginary program. In the first line list L2 is empty.

Some basic operations on C2Ls follow. Notice that these are simpler than the appropriate operations of S2Ls. Clearly, T, S ∈ Θ(1) for each of them.


precede(q, r : E2*)
  // (∗q) will precede (∗r)
  p := r → prev
  q → prev := p ; q → next := r
  p → next := r → prev := q

follow(p, q : E2*)
  // (∗q) will follow (∗p)
  r := p → next
  q → prev := p ; q → next := r
  p → next := r → prev := q

unlink(q : E2*)
  // remove (∗q)
  p := q → prev ; r := q → next
  p → next := r ; r → prev := p
  q → prev := q → next := q
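In C++, the E2 element type and these three operations could be sketched as follows (ours; key type fixed to int):

    struct E2 {
        E2 *prev, *next;   // neighbours; both point to this while the node is unlinked
        int key;
        E2() : prev(this), next(this), key(0) {}
    };

    // (*q) will precede (*r) in r's list.
    void precede(E2* q, E2* r) {
        E2* p = r->prev;
        q->prev = p; q->next = r;
        p->next = r->prev = q;
    }

    // (*q) will follow (*p) in p's list.
    void follow(E2* p, E2* q) {
        E2* r = p->next;
        q->prev = p; q->next = r;
        p->next = r->prev = q;
    }

    // Remove (*q) from its list; q becomes a one-element cycle again.
    void unlink(E2* q) {
        E2 *p = q->prev, *r = q->next;
        p->next = r; r->prev = p;
        q->prev = q->next = q;
    }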

[Figure 7 (diagram): lists L1 and L2 shown step by step, with pointers p and r marked.]

Figure 7: Illustrating unlink(q) and precede(q, r). Insertion is called on the red element (∗q), and it is inserted left of (∗r) into list L2.

5.2.3 Example programs on C2Ls
We can imagine a C2L straightened, for example
H → [/][5][2][7][/] ← H, representing ⟨5; 2; 7⟩,
or empty: H → [/][/] ← H, representing ⟨⟩.

C2L_read(&H : E2*)
  H := new E2
  while ⟨there is input data⟩
    read(x)
    p := new E2
    p → key := x
    precede(p, H)

setEmpty(H : E2*)
  p := H → prev
  while p ≠ H
    unlink(p)
    delete p
    p := H → prev

The stages of the list built by C2L_read:
H → [/][/] ← H
H → [/][5][/] ← H
H → [/][5][2][/] ← H
H → [/][5][2][7][/] ← H

T_C2L_read(n), T_setEmpty(n), T_length(n) ∈ Θ(n)



insertionSort(H : E2*)
  s := H → next ; u := s → next
  while u ≠ H
    if s → key ≤ u → key
      s := u
    else
      unlink(u)
      p := s → prev
      while p ≠ H ∧ p → key > u → key
        p := p → prev
      follow(p, u)
    u := s → next

length(H : E2*) : N
  n := 0
  p := H → next
  while p ≠ H
    n := n + 1
    p := p → next
  return n

H → [/][5][2][7][2′][/] ← H
H → [/][2][5][7][2′][/] ← H
H → [/][2][5][7][2′][/] ← H
H → [/][2][2′][5][7][/] ← H

mT_IS(n) ∈ Θ(n) ; AT_IS(n), MT_IS(n) ∈ Θ(n^2) ∧ S_IS(n) ∈ Θ(1)

where n is the length of C2L H. Clearly, procedure insertionSort(H : E2*) is stable.

Example 5.1 Let us suppose that Hu and Hi are strictly increasing C2Ls. We write procedure unionIntersection(Hu, Hi : E2∗), which calculates the strictly increasing union of the two lists in Hu and their strictly increasing intersection in Hi.
We use neither memory allocation (new) nor deallocation (delete) statements, nor explicit assignment statements to data members. We rearrange the lists only with unlink(q), precede(q, r), follow(p, q). MT(n, m) ∈ O(n + m) and S(n, m) ∈ Θ(1), where n = length(Hu) and m = length(Hi).

Let us have q, r : E2∗ initialized to point to the first items of Hu and Hi respectively. For example:
     q
Hu → [/][2][4][6][/] ← Hu
Hi → [/][1][4][8][9][/] ← Hi
     r

We work with the following invariant, where

key(q, H) = q → key if q ≠ H, and key(q, H) = ∞ if q = H,

(H, q) is the sublist of the items between H and q, and
[q, H) is the sublist of the items starting with q but before H.

• C2Ls Hu and Hi are strictly increasing, consisting of the original bag of list items; q is a pointer into Hu, and r into Hi,

• (Hu, q) is the prefix of the sorted union, containing the keys less than min(key(q, Hu), key(r, Hi)),

• (Hi, r) is the prefix of the sorted intersection, containing the keys less than min(key(q, Hu), key(r, Hi)),

• [q, Hu) and [r, Hi) are still unaltered.

Illustration of the run of the program:


q
Hu → [/][2][4][6][/] ← Hu
Hi → [/][1][4][5][8][9][/] ← Hi
r
q
Hu → [/][1][2][4][6][/] ← Hu
Hi → [/][4][5][8][9][/] ← Hi
r

q
Hu → [/][1][2][4][6][/] ← Hu
Hi → [/][4][5][8][9][/] ← Hi
r
q
Hu → [/][1][2][4][6][/] ← Hu
Hi → [/][4][5][8][9][/] ← Hi
r
q
Hu → [/][1][2][4][5][6][/] ← Hu
Hi → [/][4][8][9][/] ← Hi
r
q
Hu → [/][1][2][4][5][6][/] ← Hu
Hi → [/][4][8][9][/] ← Hi
r
q
Hu → [/][1][2][4][5][6][8][/] ← Hu
Hi → [/][4][9][/] ← Hi
r
q
Hu → [/][1][2][4][5][6][8][9][/] ← Hu
Hi → [/][4][/] ← Hi
r


unionIntersection(Hu, Hi : E2∗)
  q := Hu → next ; r := Hi → next
  while q ≠ Hu ∧ r ≠ Hi
    if q → key < r → key
      q := q → next
    else if q → key > r → key
      p := r ; r := r → next
      unlink(p)
      precede(p, q)
    else  // q → key = r → key
      q := q → next
      r := r → next
  while r ≠ Hi
    p := r ; r := r → next ; unlink(p)
    precede(p, Hu)
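Using the C++ operations sketched in section 5.2.2, the same procedure might read as follows (our sketch):

    // Hu becomes the strictly increasing union, Hi the strictly increasing intersection.
    void unionIntersection(E2* Hu, E2* Hi) {
        E2* q = Hu->next;
        E2* r = Hi->next;
        while (q != Hu && r != Hi) {
            if (q->key < r->key) {
                q = q->next;                   // key only in Hu: union only
            } else if (q->key > r->key) {
                E2* p = r; r = r->next;        // key only in Hi: move it into the union
                unlink(p);
                precede(p, q);
            } else {                           // equal keys: in the union and the intersection
                q = q->next;
                r = r->next;
            }
        }
        while (r != Hi) {                      // remaining Hi items go to the end of the union
            E2* p = r; r = r->next;
            unlink(p);
            precede(p, Hu);
        }
    }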

Exercise 5.2 Write the structograms of quickSort(H :E2∗){ QS(H, H ) }
which sorts C2L H increasingly. QS(p, s:E2∗) can be a recursive procedure
sorting sublist (p, s) with quicksort. It starts with partitioning sublist (p, s).
Let the pivot (∗q ) be the first element of the sublist. Let your version of
quicksort be stable.

43
Illustration of the partition part of QS(H, H ) on C2L H below:
(Partition of the sublist strictly between p and s, i.e. partition of sublist
(p, s). The pivot is ∗q , r goes over sublist (q, s), and elements smaller than the
pivot are moved before the pivot.)
p s
H → [/][5][2][4][6][5][20 ][/] ← H
q r
p s
H → [/][2][5][4][6][5][20 ][/] ← H
q r
p s
H → [/][2][4][5][6][5][20 ][/] ← H
q r
p s
H → [/][2][4][5][6][5][20 ][/] ← H
q r
p s
H → [/][2][4][5][6][5][20 ][/] ← H
q r
p s
H → [/][2][4][20 ][5][6][5][/] ← H
q r

44
6 Trees, binary trees

Up till now, we worked with arrays and linked lists. They have a common
property: there is a first item in the data structure, and each element has
exactly one other item next to it, except the last element, which has no
successor, according to the following scheme: O → O → O → O → O.
Therefore arrays and linked lists are called linear data structures.
(Although in the case of cyclic lists the linear data structure is circular.)
Thus arrays and linked lists can be considered representations of linear
graphs.

6.1 General notions


At an abstract level, trees are also special finite graphs. They consist of
nodes (i.e. vertices) and edges (i.e. arcs). If the tree is empty, it contains
neither node nor edge. The empty tree is denoted by . If it is nonempty,²
it always contains a root node. Each node of the graph has zero or more
immediate successor nodes which are called its children, and it is called their
parent. There is a directed edge going from the parent to each of its children.
If a node has no child, it is called a leaf. The root has no parent. Each other
node of the tree has exactly one parent. The non-leaf nodes are also called
internal nodes.
Notice that a linear data structure can be considered a tree where each
internal node has a single child. Such trees will be called linear trees.
The descendants of a node are its children, and the descendants of its
children. The ancestors of a node are its parent, and the ancestors of its
parent. The root (node) has no ancestor, but each other node of the tree is
a descendant of it; thus the root is an ancestor of each other node in the tree. Thus
the root of a tree identifies the whole tree. The leaves have no descendant.
Traditionally a tree is drawn in reversed manner: the topmost node is
the root, and the edges go downwards (so the arrows at the ends of the edges
are often omitted). See figure 8.
Given a tree, it is a subtree of itself. If the tree is not empty, its other subtrees
are the subtrees of the trees rooted by its children (recursively).

The size of a tree is the number of nodes in the tree. The size of tree t is
denoted by n(t) or |t| (n(t) = |t|). The number of internal nodes of t is i(t).
The number of leaves of t is l(t). Clearly n(t) = i(t) + l(t) for a tree. For
example, n() = 0, and considering figure 8, n(t1 ) = i(t1 )+l(t1 ) = 1+2 = 3,

2 In
this chapter the trees are always rooted, directed trees. We do not consider free trees
here which are undirected connected graphs without cycles.

45
Figure 8: Simple binary trees (t1 , t2 , t3 , t4 ; the levels 0–3 and the height of a
tree are also indicated). The nodes of the trees are represented by circles. If
the structure of a subtree is not important or unknown, we denote it with a
triangle.

n(t2 ) = i(t2 ) + l(t2 ) = 1 + 1 = 2, n(t3 ) = i(t3 ) + l(t3 ) = 0 + 1 = 1, and


n(t4 ) = 2 + n(t4 →lef t) where t4 →lef t is the unknown left subtree of t4 .

We can speak of the levels of a nonempty tree. The root is at level zero
(which is the topmost level), its children are at level one, its grandchildren
at level two and so on. Given a node at level i, its children are at level i+1
(always in downwards direction).
The height of a nonempty tree is equal to the level of the leaves at its
lowest level. The height of the empty tree is −1. The height of tree t
is denoted by h(t). For example, h() = −1, and considering figure 8,
h(t1 ) = 1, h(t2 ) = 1, h(t3 ) = 0, and h(t4 ) = 1 + max(h(t4 →left), 0) where
t4 →left is the unknown left subtree of t4 .

All the hierarchical structures can be modeled by trees, for example the
directory hierarchy of a computer. Typically, each node of a tree is labeled
by some key , and maybe with other data.

Sources: [1] Chapter 10, 6, 12; [6] 4.4; [3] 6-7

46
6.2 Binary trees
Binary trees are useful, for example for representing sets and multisets (i.e.
bags) like dictionaries and priority queues.

A binary tree is a tree where each internal (i.e. non-leaf ) node has at most
two children. If a node has two children, they are the lef t child, and the
right child of their parent. If a node has exactly one child, then this child is
the lef t or right child of its parent. This distinction is important.
If t is a nonempty binary tree (i.e. t 6= ), then ∗t is the root node of
the tree, t → key is the key labeling ∗t, t → lef t is the left subtree of
t, and t → right is the right subtree of t. If ∗t does not have left child,
t → lef t = , i.e. the left subtree of t is empty. Similarly, if ∗t does not have
right child, t → right = , i.e. the right subtree of t is empty.
If ∗p is a node of a binary tree, p is the (sub)tree rooted by ∗p, thus
p → key is its key, p → lef t is its left subtree, and p → right is its right
subtree. If ∗p has left child, it is ∗p → lef t. If ∗p has right child, it is
∗p → right. If ∗p has parent, it is ∗p → parent. The tree rooted by the
parent is p → parent. (Notice that the inx operator → binds stronger than
the prex operator ∗. For example, ∗p → lef t = ∗(p → lef t).) If ∗p does
not have parent, p → parent = .
If p = , all the expressions ∗p, p → key , p → lef t, p → right, p →
parent etc. are erroneous.

Properties 6.1
h() = −1
t 6= binary tree =⇒ h(t) = 1 + max(h(t → lef t), h(t → right))

A strictly binary tree is a binary tree where each internal node has two chil-
dren. The following property can be proved easily by induction on i(t).

Property 6.2 Given a t 6= strictly binary tree,


l(t) = i(t) + 1.
Thus n(t) = 2i(t) + 1 ∧ n(t) = 2l(t) − 1

A complete binary tree is a strictly binary tree where all the leaves are at
the same level. It follows that in a complete nonempty binary tree of height
h we have 20 nodes at level 0, 21 nodes at level 1, and 2i nodes at level i
(0 ≤ i ≤ h). Therefore we get the following property.

Property 6.3 Given a t 6= complete binary tree (n = n(t), h = h(t))


n = 20 + 21 + · · · + 2h = 2h+1 − 1

47
A nonempty nearly complete binary tree is a binary tree which becomes com-
plete if we delete its lowest level. The is also nearly complete binary tree.
(An equivalent definition: A nearly complete binary tree is a binary tree
which can be received form a complete binary tree by deleting zero or more
nodes from its lowest level.) Notice that the complete binary trees are also
nearly complete, according to this denition.
Because a nonempty nearly complete binary tree is complete, possibly
except at its lowest level h, it has 2^h − 1 nodes on levels 0, . . . , h−1, and at
least 1 node and at most 2^h nodes at its lowest level. Thus n ≥ 2^h − 1 + 1 ∧
n ≤ 2^h − 1 + 2^h where n is the size of the tree.

Properties 6.4 Given a t ≠  nearly complete binary tree
(n = n(t), h = h(t)). Then
n ∈ 2^h .. (2^(h+1) − 1). Thus
h = ⌊log n⌋.

Now we consider the relations between the size n and the height h of a
nonempty binary tree. These will be important for binary search trees, which
are good representations of data sets stored in RAM.
Clearly, a nonempty binary tree of height h has the largest possible number of nodes
if it is complete. Therefore from property 6.3 we obtain that for the size n
of any nonempty binary tree n < 2^(h+1). Thus log n < log(2^(h+1)) = h + 1,³ so
⌊log n⌋ < h + 1. Finally ⌊log n⌋ ≤ h.
On the other hand, a nonempty binary tree of height h has the least
number of nodes if it contains just one node at each level (i.e. it is linear),
and so for the size n of any nonempty binary tree n ≥ h + 1. Thus h ≤ n − 1.

Property 6.5 Given a t ≠  binary tree (n = n(t), h = h(t)),
⌊log n⌋ ≤ h ≤ n − 1,
where h = ⌊log n⌋ if the tree is nearly complete, and
h = n − 1 iff the tree is linear (see page 45).

Notice that there are binary trees with h = ⌊log n⌋, although they are
not nearly complete.

6.3 Linked representations of binary trees


The empty tree is represented by a null pointer, i.e. . A nonempty binary
tree is identied by a pointer referring to a Node which represents the root
of the tree:
3 Remember the denition of log n given in section 1: log n = log2 (n) if n > 0, and
log 0 = 0.

48
Node
+ key : T // T is some known type
+ lef t, right : Node*
+ Node() { lef t := right := } // generate a tree of a single node.
+ Node(x : T ) { lef t := right := ; key := x }

Sometimes it is useful, if we also have a parent pointer in the nodes of the


tree. We can see a binary tree with parent pointers on gure 9.

Figure 9: Binary tree with parent pointers. (We omitted the keys of the
nodes here.)

Node3
+ key : T // T is some known type
+ lef t, right, parent : Node3*
+ Node3(p:Node3*) { lef t := right := ; parent := p }
+ Node3(x : T , p:Node3*) { lef t := right := ; parent := p ; key := x }

49
6.4 Binary tree traversals

preorder(t : Node*)
    if t ≠ 
        process(t)
        preorder(t → left)
        preorder(t → right)

inorder(t : Node*)
    if t ≠ 
        inorder(t → left)
        process(t)
        inorder(t → right)

postorder(t : Node*)
    if t ≠ 
        postorder(t → left)
        postorder(t → right)
        process(t)

levelOrder(t : Node*)
    if t ≠ 
        Q : Queue ; Q.add(t)
        while ¬Q.isEmpty()
            s := Q.rem()
            process(s)
            if s → left ≠ 
                Q.add(s → left)
            if s → right ≠ 
                Q.add(s → right)

Tpreorder (n), Tinorder (n), Tpostorder (n), TlevelOrder (n) ∈ Θ(n) where n = n(t) and
the time complexity of process(t) is Θ(1). For example, process(t) can print
t → key :

process(t)

cout << t → key << ` `

We can slightly reduce the running times of preoder and inorder traversals,
if we eliminate the tail recursions. (It does not aect the time complexities.)

preorder(t : Node*)
    while t ≠ 
        process(t)
        preorder(t → left)
        t := t → right

inorder(t : Node*)
    while t ≠ 
        inorder(t → left)
        process(t)
        t := t → right
50
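For illustration, here is a C++ sketch of the recursive inorder traversal and of the queue-based level order traversal (our own transcription: Node is reduced to a plain record with int keys, and process simply prints the key):

    #include <iostream>
    #include <queue>

    struct Node {
        int key;
        Node *left, *right;
    };

    void process(Node *t) { std::cout << t->key << ' '; }

    void inorder(Node *t) {
        if (t != nullptr) {
            inorder(t->left);
            process(t);
            inorder(t->right);
        }
    }

    void levelOrder(Node *t) {
        if (t == nullptr) return;
        std::queue<Node*> Q;
        Q.push(t);
        while (!Q.empty()) {
            Node *s = Q.front(); Q.pop();
            process(s);
            if (s->left  != nullptr) Q.push(s->left);
            if (s->right != nullptr) Q.push(s->right);
        }
    }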
Figure 10: left upper part: preorder, right upper part: inorder, left lower
part: postorder, right lower part: level order traversal of binary tree t (each
node is labelled by the position it gets in the given traversal).

6.4.1 An application of traversals: the height of a binary tree

Postorder:

h(t : Node*) : Z
    if t ≠ 
        return 1 + max(h(t → left), h(t → right))
    else
        return −1

Preorder:

preorder_h(t : Node* ; level, &max : Z)
    while t ≠ 
        level := level + 1
        if level > max
            max := level
        preorder_h(t → left, level, max)
        t := t → right

h(t : Node*) : Z
    max := level := −1
    preorder_h(t, level, max)
    return max

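Both height computations in C++ (a sketch; Node is the plain record used in the traversal sketch above):

    #include <algorithm>

    // Postorder: the height is computed bottom-up from the subtree heights.
    int height(Node *t) {
        if (t == nullptr) return -1;
        return 1 + std::max(height(t->left), height(t->right));
    }

    // Preorder: track the current level and remember the deepest one seen.
    void preorderH(Node *t, int level, int &maxLevel) {
        while (t != nullptr) {
            ++level;
            if (level > maxLevel) maxLevel = level;
            preorderH(t->left, level, maxLevel);
            t = t->right;
        }
    }

    int heightPre(Node *t) {
        int maxLevel = -1;
        preorderH(t, -1, maxLevel);   // the root gets level 0
        return maxLevel;
    }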
51
6.4.2 Using parent pointers

inorder_next(p : Node3*) : Node3*
    q := p → right
    if q ≠ 
        while q → left ≠ 
            q := q → left
    else
        q := p → parent
        while q ≠  ∧ q → left ≠ p
            p := q ; q := q → parent
    return q

M T (h) ∈ O(h) where h = h(t)
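The same routine in C++ (a sketch; Node3 is assumed to be a plain record with key, left, right and parent fields):

    struct Node3 {
        int key;
        Node3 *left, *right, *parent;
    };

    // Inorder successor of *p, or nullptr if *p holds the largest key.
    Node3* inorderNext(Node3 *p) {
        Node3 *q = p->right;
        if (q != nullptr) {                    // go right once, then left as far as possible
            while (q->left != nullptr) q = q->left;
        } else {                               // climb while we come from a right subtree
            q = p->parent;
            while (q != nullptr && q->left != p) { p = q; q = q->parent; }
        }
        return q;
    }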

Exercise 6.6 What can we say about the space complexities of the dierent
binary tree traversals and other subroutines above?

6.5 Parenthesized, i.e. textual form of binary trees


Parenthesized, i.e. textual form of a nonempty binary tree:

( leftSubtree Root rightSubtree )

The empty tree is represented by the empty string. We can use dierent
kinds of parentheses for easier reading. For example, see Figure 11.

The binary tree of Figure 11 (root 3, left subtree over 2, right subtree over 6) in

• simple textual form: ( ( (1) 2 ) 3 ( ( 4 (5) ) 6 (7) ) )

• elegant parenthesized form: { [ (1) 2 ] 3 [ ( 4 ⟨5⟩ ) 6 (7) ] }

Figure 11: The same binary tree in graphical and textual representations.
Notice that omitting the parentheses from a textual form of a tree, we receive
the inorder traversal of that tree.

52
6.6 Binary search trees
Binary search trees (BST) are a particular type of containers: data struc-
tures that store "items" (such as numbers, names etc.) in memory. They
allow fast lookup, addition and removal of items, and can be used to im-
plement either dynamic sets of items, or lookup tables that allow nding an
item by its key (e.g., nding the phone number of a person by name).
Binary search trees keep their keys in sorted order, so that lookup and
other operations can use the principle of binary search: when looking for a
key in a tree (or a place to insert a new key), they traverse the tree from
root to leaf, making comparisons to keys stored in the nodes of the tree and
deciding, on the basis of the comparison, to continue searching in the left
or right subtrees. On average, this means that each comparison allows the
operations to skip about half of the tree, so that each lookup, insertion or
deletion takes time proportional to the logarithm of the number of items
stored in the tree. This is much better than the linear time required to
nd items by key in an (unsorted) array, but slower than the corresponding
operations on hash tables.
(The source of the text above is Wikipedia.)

A binary search tree (BST) is a binary tree, whose nodes each store a key
(and optionally, some associated value). The tree additionally satises the
binary search tree property, which states that the key in each node must be
greater than any key stored in the left sub-tree, and less than any key stored
in the right sub-tree. (See Figure 12. Notice that there is no duplicated key
in a BST.)

h 6

d l 3 7

b f j n 1 4 8

a c e g i 2

Figure 12: Two binary search trees (BSTs)

53

inorderPrint(t : Node*)

A t 6=
inorderPrint(t → lef t)

write(t → key ) SKIP

inorderPrint(t → right)

Property 6.7 Procedure inorderPrint(t : Node*) prints the keys of binary


tree t in strictly increasing order, i binary tree t is a search tree.


search(t : Node* ; k : T) : Node*
    while t ≠  ∧ t → key ≠ k
        if k < t → key
            t := t → left
        else
            t := t → right
    return t

insert(&t : Node* ; k : T)    // See Figure 13.
    if t = 
        t := new Node(k)
    else if k < t → key
        insert(t → left, k)
    else if k > t → key
        insert(t → right, k)
    // k = t → key : SKIP, the key is already in the tree

min(t : Node*) : Node*
    while t → left ≠ 
        t := t → left
    return t

remMin(&t, &minp : Node*)    // See Figure 14.
    if t → left = 
        minp := t
        t := minp → right
        minp → right := 
    else
        remMin(t → left, minp)

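For illustration, the search and insert operations in C++ (a sketch with int keys; Node is the plain record {key, left, right} from the traversal sketch):

    Node* search(Node *t, int k) {
        while (t != nullptr && t->key != k)
            t = (k < t->key) ? t->left : t->right;
        return t;                       // nullptr if k is not in the tree
    }

    void insert(Node *&t, int k) {
        if (t == nullptr)
            t = new Node{k, nullptr, nullptr};
        else if (k < t->key)
            insert(t->left, k);
        else if (k > t->key)
            insert(t->right, k);
        // k == t->key: duplicate keys are not stored, do nothing
    }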
54
6 6

3 7 3 7

1 4 8 1 4 8

2 2 5

Figure 13: Inserting 5 into the BST on the left gives the BST on the right:
First we compare 5 with the key of the root, which is 6, and 5 < 6. Thus
we insert 5 into the left subtree. Next we we compare 5 with 3, and 5 > 3,
so we insert 5 into the right subtree of node 3, and again 5 > 4, therefore
we insert 5 into the right subtree of node 4. The right subtree of this node
is empty. Consequently we create a new node with key 5, and we substitute
that empty right subtree with this new subtree consisting of this single new
node.


del(&t : Node* ; k : T)    // See Figure 16.
    if t ≠ 
        if k < t → key
            del(t → left, k )
        else if k > t → key
            del(t → right, k )
        else
            delRoot(t)
    // t =  : SKIP, the key is not in the tree

delRoot(&t : Node*)    // See Figure 17.
    if t → left = 
        p := t ; t := p → right ; delete p
    else if t → right = 
        p := t ; t := p → left ; delete p
    else
        remMin(t → right, p)
        p → left := t → left ; p → right := t → right
        delete t ; t := p

M Tsearch (h), M Tinsert (h), M Tmin (h) ∈ Θ(h)


M TremMin (h), M Tdel (h), M TdelRoot (h) ∈ Θ(h)
where h = h(t).

Exercise 6.8 What can we say about the space complexities of the dierent
BST operations above?

55
6 6

3 7 3 7

1 4 8 2 4 8

Figure 14: Removing the node with minimal key from the BST on the left
gives the BST on the right: The leftmost node of the original BST is substi-
tuted by its right subtree. Notice that deleting the node with key 1 from
the BST on the left gives the same BST as result: The left subtree of this
node is empty. Thus deleting it means substituting it with its right subtree.

6.7 Level-continuous binary trees, and heaps


A binary tree is level-continuous, i all levels of the tree, except possibly
the last (deepest) one are completely lled (i.e. it is nearly complete), and,
provided that the last level of the tree is not complete, the nodes of that
level are lled from left to right. Clearly a node is internal node of a level-
continuous tree, i it has left child in the tree.
A binary maximum heap is a level-continuous binary tree where the key
stored in each node is greater than or equal to (≥) the keys in the node's
children. (See Figure 18.)
A binary minimum heap is a level-continuous binary tree where the key
stored in each node is less than or equal to (≤) the keys in the node's children.
In this lecture notes a heap is a binary maximum heap by default.

Priority queues are usually represented by heaps in arithmetic representation.

6.8 Arithmetic representation of level-continuous bi-


nary trees
A level-continuous binary tree of size n is typically represented by the rst n
elements of an array A: We put the nodes of the tree into the array in level
order. (See Figure 19.)
If an internal node has index i in arrayA, let lef t(i) be the index of its
left child. If it has right child, too, let right(i) be the index of it. Clearly
right(i) = lef t(i) + 1 in the latter case. If A[j] represents a node dierent
from the root, Let parent(j) be the index of its parent.

56
4 4

3 6 3 6

2 5 9 2 5 7

Figure 15: Removing the node with maximal key from the BST on the
left gives the BST on the right: The rightmost node of the original BST is
substituted by its left subtree. Notice that deleting the node with key 9
from the BST on the left gives the same BST as result: The right subtree
of this node is empty. Thus deleting it means substituting it with its left
subtree.

4 4

3 6 3 7

2 5 9 2 5 9

Figure 16: Deleting node 6 of the BST on the left gives the BST on the
right: Node 6 has two children. Thus we remove the minimal node of its
right subtree and substitute node 6 with it.

Let us suppose that we use the first n elements of an array (indexed from
zero), for example A : T[m], i.e. we use the subarray A[0..n). Then the root
node of a nonempty tree is A[0]. Node A[i] is an internal node iff left(i) < n.
Then left(i) = 2i + 1. The right child of node A[i] exists iff right(i) < n.
Then right(i) = 2i + 2. Node A[j] is not the root node of the tree iff j > 0.
Then parent(j) = ⌊(j − 1)/2⌋.

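In C++ these index calculations can be written as follows (a small sketch; i and j are 0-based indices as above):

    inline int left  (int i) { return 2 * i + 1; }
    inline int right (int i) { return 2 * i + 2; }
    inline int parent(int j) { return (j - 1) / 2; }   // integer division gives the floor for j > 0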
6.9 Heaps and priority queues


A priority queue is a bag (i.e. multiset). We can add a new item to it, and we
4
can check or remove a maximal item of it. Often there is a priority function:

4 There are also min priority queues where we can check or remove a minimal item.

57
4 5

3 7 3 7

2 5 9 2 6 9

Figure 17: Deleting the root node of the BST on the left gives the BST on
the right: The root node has two children. Thus we remove the minimal
node of its right subtree and substitute the root node with it. (Notice that
we could remove the maximal node of the left subtree and substitute the root
node with it.)

6 7

6 5 3 4

4 1 3 5 2

Figure 18: A (binary maximum) heap. It is a nearly complete binary tree,


and the last level is lled from left to right. The key of each parent ≥ the
key of the child.

T→ some number type, where T is the element type of the priority queue,
and the items are compared according to their priorities. For example, the
processes of a computer are often scheduled according to their priorities.
Our representation is similar to our representation of the Stack type,
and some operations are also the same, except of the names. The actual
elements of the priority queue are in the subarray A[0..n) containing a (binary
maximum) heap.

58
A[0]

A[1] A[2]

A[3] A[4] A[5] A[6]

A[7] A[8] A[9] A[10] A[11]

Figure 19: A level-continuous binary tree of size 12 represented by the rst


12 elements of an array A: We put the nodes of the tree into the array in
level order.

PrQueue
− A : T[] // T is some known type
− n : N // n ∈ 0..A.length is the actual length of the priority queue
+ PrQueue(m : N){ A := new T[m]; n := 0 } // create an empty priority queue
+ add(x : T ) // insert x into the priority queue
+ remMax():T // remove and return the maximal element of the priority queue
+ max():T // return the maximal element of the priority queue
+ isFull() : B {return n = A.length}
+ isEmpty() : B {return n = 0}
+ ∼ PrQueue() { delete A }
+ setEmpty() {n := 0} // reinitialize the priority queue

Provided that subarray A[0..n) is the arithmetic representation of a heap,


M Tadd (n) ∈ Θ(log n), mTadd (n) ∈ Θ(1), M TremMax (n) ∈ Θ(log n),
mTremMax (n) ∈ Θ(1), Tmax (n) ∈ Θ(1).
M Tadd (n) ∈ Θ(log n), and M TremMax (n) ∈ Θ(log n), because the main
loops of subroutines add" and sink" below iterates maximum h times if h
is the height of the heap, and h = blog nc.

59

PrQueue::add(x : T)
    if n < A.length
        j := n ; n := n + 1
        A[j] := x ; i := parent(j)
        while j > 0 ∧ A[i] < A[j]
            swap(A[i], A[j])
            j := i ; i := parent(i)
    else
        error PrQueueOverflow

PrQueue::max() : T
    if n > 0
        return A[0]
    else
        error PrQueueUnderflow

PrQueue::remMax() : T
    if n > 0
        max := A[0]
        n := n − 1 ; A[0] := A[n]
        sink(A, 0, n)
        return max
    else
        error PrQueueUnderflow

sink(A : T[] ; k, n : N)
    i := k ; j := left(k) ; b := true
    while j < n ∧ b
        // A[j] is the left child of A[i]
        if j + 1 < n ∧ A[j + 1] > A[j]
            j := j + 1
        // A[j] is the greater child of A[i]
        if A[i] < A[j]
            swap(A[i], A[j])
            i := j ; j := left(j)
        else
            b := false

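A C++ transcription of sink (a sketch; the element type T is fixed to int, and left() is the index helper sketched in the previous subsection):

    #include <utility>   // std::swap

    void sink(int A[], int k, int n) {
        int i = k, j = left(k);
        bool b = true;
        while (j < n && b) {              // A[j] is the left child of A[i]
            if (j + 1 < n && A[j + 1] > A[j])
                j = j + 1;                // A[j] is now the greater child of A[i]
            if (A[i] < A[j]) {
                std::swap(A[i], A[j]);
                i = j; j = left(j);
            } else {
                b = false;                // heap property restored
            }
        }
    }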
Exercise 6.9 What can we say about the space complexities of the dierent
heap and priority queue operations above?

60
op 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14
- 8 6 7 6 5 3 4 4 1 3 5 2
add(8) 8 6 7 6 5 *3 4 4 1 3 5 2 #8
. .. 8 6 *7 6 5 #8 4 4 1 3 5 2 3
. .. *8 6 #8 6 5 7 4 4 1 3 5 2 3
. 8 6 8 6 5 7 4 4 1 3 5 2 3
add(2) 8 6 8 6 5 7 *4 4 1 3 5 2 3 #2
. 8 6 8 6 5 7 4 4 1 3 5 2 3 2
add(9) 8 6 8 6 5 7 *4 4 1 3 5 2 3 2 #9
. .. 8 6 *8 6 5 7 #9 4 1 3 5 2 3 2 4
. .. *8 6 #9 6 5 7 8 4 1 3 5 2 3 2 4
. 9 6 8 6 5 7 8 4 1 3 5 2 3 2 4
remMax() ∼9 6 8 6 5 7 8 4 1 3 5 2 3 2 ∼4
max := 9 *4 6 #8 6 5 7 8 4 1 3 5 2 3 2
. .. 8 6 *4 6 5 7 #8 4 1 3 5 2 3 2
. .. 8 6 8 6 5 7 *4 4 1 3 5 2 3 #2
return 9 8 6 8 6 5 7 4 4 1 3 5 2 3 2
remMax() ∼8 6 8 6 5 7 4 4 1 3 5 2 3 ∼2
max := 8 *2 6 #8 6 5 7 4 4 1 3 5 2 3
. .. 8 6 *2 6 5 #7 4 4 1 3 5 2 3
. .. 8 6 7 6 5 *2 4 4 1 3 5 2 #3
return 8 8 6 7 6 5 3 4 4 1 3 5 2 2
Figure 20: Changes of A : Z[15] while applying add() and remMax() opera-
tions to it. At the add() operations, # prex identies the actual element,
and ∗ is the prex of its parent. Similarly, at sinking, ∗ denotes the
parent, and # is the prex of its greatest child. At remMax(), ∼ is the
prex of both of the maximum to be removed and of the key to be moved
into its place.

61
6.10 Heap sort
In Heap sort, rst we take the array to be sorted and build a heap form it.
We consider the array as a level-continuous binary tree.
While building the heap we consider the level order of the tree, and we
start from the last internal node of it. We go back in level order and sink
the root node of each subtree making a heap out of it. (See Figure 21.)


buildMaxHeap(A : T[n])

k := parent(n − 1) downto 0
sink(A, k, n)

When the heap is ready, we swap its rst and last items. Then the last
element is maximal element of the array. We cut it from the tree. Next we
sink the root of the tree, and receive a heap again. Again we swap, cut and
sink, and we have the two largest elements at the end of the array and a heap
without these elements before them. We repeat this process of swap, cut and
sink until we have only a single element in the heap which is a minimum
of the original array. And the whole array is sorted at this moment. (See
Figure 22.)

heapSort(A : T[n])

buildMaxHeap(A)

m := n
m>1
m := m − 1 ; swap(A[0], A[m])
sink(A, 0, m)

The time complexity of each sinking is O(log n) because h ≤ ⌊log n⌋ for each
subtree. Thus MT_buildMaxHeap(n) ∈ O(n log n), and the time complexity of the
main loop of heapSort() is also O(n log n). Consequently MT_heapSort(n) ∈ O(n log n).

In chapter 7, we prove that MT_heapSort(n) ∈ Θ(n log n).

Heap sort has a space complexity of Θ(1) since it is not recursive and uses no
temporary data structures, only a few simple temporary variables. Thus Heap
sort sorts in-place, like Insertion sort.

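A C++ sketch of buildMaxHeap and heapSort, built on the sink and parent helpers sketched earlier:

    void buildMaxHeap(int A[], int n) {
        for (int k = parent(n - 1); k >= 0; --k)
            sink(A, k, n);
    }

    void heapSort(int A[], int n) {
        buildMaxHeap(A, n);
        for (int m = n; m > 1; ) {
            --m;
            std::swap(A[0], A[m]);   // move the current maximum to its final place
            sink(A, 0, m);           // restore the heap property on A[0..m)
        }
    }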
62
t t

18 18

23 51 23 51

42 14 35 97 42 19 35 97

53 60 19 53 60 14

t t

18 18

23 51 23 97

60 19 35 97 60 19 35 51

53 42 14 53 42 14

t t

18 18

60 97 60 97

23 19 35 51 53 19 35 51

53 42 14 23 42 14

t t

97 97

60 18 60 51

53 19 35 51 53 19 35 18

23 42 14 23 42 14

Figure 21: From array h18, 23, 51, 42, 14, 35, 97, 53, 61, 19i we build a maxi-
mum heap. Notice that we work on the array from the beginning to the end.
The binary trees show the logical structure of the array. Finally we receive
array h97, 60, 51, 53, 19, 35, 18, 23, 42, 14i.
op 0 1 2 3 4 5 6 7 8 9

sink 68 23 51 42 *19 35 97 53 60 #14


sink 68 23 51 *42 19 35 97 53 #60 14
sink 68 23 *51 60 19 35 #97 53 42 14
sink 68 *23 97 #60 19 35 51 53 42 14
... 68 60 97 *23 19 35 51 #53 42 14
sink *68 60 #97 53 19 35 51 23 42 14
... 97 60 *68 53 19 35 #51 23 42 14
. 97 60 68 53 19 35 51 23 42 14

swap ∼97 60 68 53 19 35 51 23 42 ∼14


sink *14 60 #68 53 19 35 51 23 42 +97
... 68 60 *14 53 19 35 #51 23 42 +97
swap ∼68 60 51 53 19 35 14 23 ∼42 +97
sink *42 #60 51 53 19 35 14 23 +68 +97
... 60 *42 51 #53 19 35 14 23 +68 +97
. 60 53 51 *42 19 35 14 #23 +68 +97
swap ∼60 53 51 42 19 35 14 ∼23 +68 +97
sink *23 #53 51 42 19 35 14 +60 +68 +97
... 53 *23 51 #42 19 35 14 +60 +68 +97
swap ∼53 42 51 23 19 35 ∼14 +60 +68 +97
sink *14 42 #51 23 19 35 +53 +60 +68 +97
... 51 42 *14 23 19 #35 +53 +60 +68 +97
swap ∼51 42 35 23 19 ∼14 +53 +60 +68 +97
sink *14 #42 35 23 19 +51 +53 +60 +68 +97
... 42 *14 35 #23 19 +51 +53 +60 +68 +97
swap *42 23 35 14 #19 +51 +53 +60 +68 +97
sink *19 23 #35 14 +42 +51 +53 +60 +68 +97
swap ∼35 23 19 ∼14 +42 +51 +53 +60 +68 +97
sink *14 #23 19 +35 +42 +51 +53 +60 +68 +97
swap ∼23 14 ∼19 +35 +42 +51 +53 +60 +68 +97
sink *19 #14 +23 +35 +42 +51 +53 +60 +68 +97
swap *19 #14 +23 +35 +42 +51 +53 +60 +68 +97
sink *14 +19 +23 +35 +42 +51 +53 +60 +68 +97
. +14 +19 +23 +35 +42 +51 +53 +60 +68 +97

Figure 22: Full example of Heapsort on array A : Z[10]. At sinking, ∗


denotes the parent, and # is the prex of its greatest child. At the swap
operations, ∼ is the prex of both of the items to be exchanged. At their
nal places, the keys have + prexes.

64
7 Lower bounds for sorting

Theorem 7.1 For any sorting algorithm mT (n) ∈ Ω(n).

Proof. Clearly we have to check all the n items, and only a limited number of
items is checked in a subroutine call or loop iteration (without the embedded
subroutine calls and loop iterations). Let this limit be k. Thus
mT(n) ∗ k ≥ n =⇒ mT(n) ≥ (1/k) · n =⇒ mT(n) ∈ Ω(n).

7.1 Comparison sorts and the decision tree model


Denition 7.2 A sorting algorithm is comparison sort, i it gains informa-
tion about the sorted order of the input items only by comparing them. That
is, given two items ai and aj , in order to compare them it performs one of
the tests ai < a j , ai ≤ aj , ai = aj , ai 6= aj , ai ≥ aj , or ai > aj . We may
not inspect the values of the elements or gain order information about them
in any other way. [1]

The sorting algorithms we have studied before, that is insertion sort, heap
sort, merge sort, and quick sort are comparison sorts.
In this section, we assume without loss of generality that all the input
5
elements are distinct . Given this assumption, comparisons of the form ai =
aj and ai 6= aj are useless, so we can assume that no comparisons of this form
6
are made . We also note that the comparisons ai < aj , ai ≤ aj , ai ≥ aj , and
ai > a j are all equivalent in that they yield identical information about the
relative order of ai and aj . We therefore assume that all comparisons have
the form ai ≤ aj . [1]

We can view comparison sorts abstractly in terms of decision trees. A decision


tree is a strictly binary tree that represents the comparisons between elements
that are performed by a particular sorting algorithm operating on an input of
a given size. Control, data movement, and all other aspects of the algorithm
are ignored. [1]
Figure 23 shows the decision tree corresponding to the insertion sort al-
gorithm from Section 3 operating on an input sequence of three elements.

5 In
this way we restrict the set of possible inputs and we are going to give a lower bound
for the worst case of comparison sorts. Thus, if we give a lower bound for the maximum
number of key comparisons (M C(n)) and for maximum time complexity (M T (n)) on this
restricted set of input sequences, it is also a lower bound for them on the whole set of input
sequences, because M C(n) and M T (n) are surely ≥ on a larger set than on a smaller set.
6 Anyway, if such comparisons are made, neither M C(n) nor M T (n) are decreased.

65
a≤b
a≤b b<a

b≤c a≤c
b≤c c<b a≤c c<a

ha, b, ci a≤c hb, a, ci b≤c


a≤c c<a b≤c c<b

ha, c, bi hc, a, bi hb, c, ai hc, b, ai

Figure 23: The decision tree for insertion sort operating on the input sequence
ha, b, ci. There are 3! = 6 permutations of the 3 input items. Thus the
decision tree must have at least 3! = 6 leaves.

Let us suppose that ha1 , a2 , . . . an i is the input sequence to be sorted. In a


decision tree, each internal node is labeled by ai ≤ aj for some elements of the
input. We also annotate each leaf by a permutation of ha1 , a2 , . . . an i. The
execution of the sorting algorithm corresponds to tracing a simple path from
the root of the decision tree down to a leaf. Each internal node indicates
a comparison ai ≤ aj . The left subtree then dictates subsequent compar-
isons once we know that ai ≤ aj , and the right subtree dictates subsequent
comparisons knowing that ai > aj . When we come to a leaf, the sorting al-
gorithm has established the appropriate ordering of ha1 , a2 , . . . an i. Because
any correct sorting algorithm must be able to produce each permutation of
its input, each of the n! permutations on n elements must appear as one of
the leaves of the decision tree for a comparison sort to be correct. Thus, we
shall consider only decision trees in which each permutation appears as a leaf
of the tree. [1]

7.2 A lower bound for the worst case


Theorem 7.3 Any comparison sort algorithm requires M C(n) ∈ Ω(n log n)
comparisons in the worst case.

66
Proof. From the preceding discussion, it suffices to determine the height h =
MC(n) of a decision tree in which each permutation appears as a reachable
leaf. Consider a decision tree of height h with l leaves corresponding to a
comparison sort on n elements. Because each of the n! permutations of the
input appears as some leaf, we have n! ≤ l. Since a binary tree of height h
has no more than 2^h leaves, we have n! ≤ l ≤ 2^h. [1]
Consequently

MC(n) = h ≥ log n! = Σ_{i=1..n} log i ≥ Σ_{i=⌈n/2⌉..n} log i ≥ Σ_{i=⌈n/2⌉..n} log⌈n/2⌉ ≥ ⌈n/2⌉ · log⌈n/2⌉ ≥

≥ (n/2) · log(n/2) = (n/2) · (log n − log 2) = (n/2) · (log n − 1) = (n/2) log n − n/2 ∈ Ω(n log n)

Theorem 7.4 For any comparison sort algorithm M T (n) ∈ Ω(n log n).

Proof. Only a limited number of key comparisons are performed in a
subroutine call or loop iteration (without the embedded subroutine calls and
loop iterations). Let this upper limit be k. Then MT(n) ∗ k ≥ MC(n) =⇒
MT(n) ≥ (1/k) · MC(n) =⇒ MT(n) ∈ Ω(MC(n)). Together with theorem 7.3
(MC(n) ∈ Ω(n log n)) and the transitivity of relation · ∈ Ω(·) we receive this
theorem (MT(n) ∈ Ω(n log n)).

Let us notice that heap sort and merge sort are asymptotically optimal in
the sense that their MT(n) ∈ O(n log n) asymptotic upper bound meets the
MT(n) ∈ Ω(n log n) asymptotic lower bound from Theorem 7.4. This proves
that MT(n) ∈ Θ(n log n) for both of them.

67
8 Sorting in Linear Time

8.1 Radix sort


The abstract code for radix sort is straightforward. We assume that each
element of abstract list A is a natural number of d digits, where digit 1 is the
lowest-order digit and digit d is the highest-order digit.


radix_sort(A : dDigitNumberhi ; d : N)

i := 1 to d
use a stable sort to sort list A on digit i

Radix sort solves the problem of sorting counterintuitively by sorting


on the least signicant (i.e. rst) digit in its rst pass.
In the second pass it takes the result of the rst pass and applies a stable
sort to sort it on the second digit. It is important to apply stable sort so that
those numbers with equal second digit remain sorted according to their rst
digit. Consequently, after the second pass the numbers are already sorted on
the rst two digits (which are the two lowest order digits).
In the third pass Radix sort takes the result of the second pass and applies
a stable sort to sort it on the third digit. It is important to apply stable sort
so that those numbers with equal third digit remain sorted according to their
rst two digits. Consequently, after the third pass the numbers are already
sorted on the rst three digits (which are the three lowest order digits).
This process continues until the items have been sorted on all d digits.
Remarkably, at that point the items are fully sorted on the d-digit number.
Thus, only d passes through the numbers are required to sort. [1]
The sorting method used in each pass can be any stable sort, but it should
run in linear time in order to maintain eciency.
Distributing sort works well on linked lists, and counting sort on arrays.
Both of them is stable and works in linear time.

8.2 Distributing sort


Distributing sort is ecient on linked lists, and a version of radix sort can
be built on it.
Remember that stable sorting algorithms maintain the relative order of
records with equal keys (i.e. values).
Distributing sort is ideal auxiliary method of radix sort because of its
stability and linear time complexity.

68
Here comes a bit more general study than needed for radix sort. When
distributing sort is used for radix sort, its key function ϕ must select the
appropriate digit.

The sorting problem: Given abstract list L of length n with element type
T , r ∈ O(n) positive integer,
ϕ : T → 0..(r−1) key selection function.
Let us sort list L with stable sort, and with linear time complexity.


distributing_sort(L : Thi ; r : N ; ϕ : T → 0..(r−1))

B : Thi[r] // array of lists, i.e. bins

k := 0 to r−1
Let B[k] be empty list // init the array of bins

L is not empty

Remove the rst element x of list L


Insert x at the end of B[ϕ(x)] // stable distribution

k := r−1 downto 0
L := B[k] + L // append B[k] before L

Eciency of distributing sort: The rst and the last loop iterates r
times, and the middle loop n times. Provided that insertion and concatena-
tion can be performed in Θ(1) time, the time complexity of distributing sort
is consequently Θ(n + r) = Θ(n) because of the natural condition r ∈ O(n).
Its space complexity is determined by the size of array B . Thus it is Θ(r).
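A C++ sketch of distributing sort on std::list<int> (our own illustration; the key selector ϕ is passed as a function object, and splice keeps both the insertion at the end of a bin and the final concatenation Θ(1)):

    #include <list>
    #include <vector>
    #include <functional>

    // Stable distributing sort of L according to phi : T -> 0..r-1 (here T = int).
    void distributingSort(std::list<int> &L, int r, std::function<int(int)> phi) {
        std::vector<std::list<int>> B(r);          // r empty bins
        while (!L.empty()) {                       // stable distribution into the bins
            int k = phi(L.front());
            B[k].splice(B[k].end(), L, L.begin()); // move the first element of L, O(1)
        }
        for (int k = r - 1; k >= 0; --k)           // L := B[k] + L
            L.splice(L.begin(), B[k]);
    }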

8.3 Radix-Sort on lists


The next example shows how radix sort operates on an abstract list of seven
3-digit numbers with base (i.e. radix) 4. In each pass we apply distributing
sort.

The input list (with a symbolic notation):


L = h103, 232, 111, 013, 211, 002, 012i

First pass (according to the rightmost digits of the numbers):


B0 = hi
B1 = h111, 211i
B2 = h232, 002, 012i

69
B3 = h103, 013i
L = h111, 211, 232, 002, 012, 103, 013i

Second pass (according to the middle digits of the numbers):


B0 = h002, 103i
B1 = h111, 211, 012, 013i
B2 = hi
B3 = h232i
L = h002, 103, 111, 211, 012, 013, 232i

Third pass (according to the leftmost digits of the numbers):


B0 = h002, 012, 013i
B1 = h103, 111i
B2 = h211, 232i
B3 = hi
L = h002, 012, 013, 103, 111, 211, 232i

In order for radix sort to work correctly, the digit sorts must be stable.
Distributing sort satises this requirement. If distributing sort runs in linear
time (Θ(n) where n is the length of the input list), and the number of digits
is constant d, radix sort also runs in linear time Θ(d ∗ n) = Θ(n).

Provided that we have to sort linked lists where the keys of the list-elements
are d-digit natural numbers with number base r, the implementation of the
algorithm above is straightforward. For example, let us suppose that we
have an L C2L with header, and function digit(i, r, x) can extract the ith
digit of number x, where digit 1 is the lowest-order digit and digit d is the
highest-order digit, in Θ(1) time.


radix_sort( L : E2* ; d, r : N )

BinHead : E2[r] // the headers of the lists representing the bins

B : E2*[r] // pointers to the headers

i := 0 to r − 1
B[i] := &BinHead[i] // Initialize the ith pointer.

i := 1d to

distribute(L, i, B ) // Distribute L on the ith digits of keys.

gather(B, L) // Gather form the bins back into L

70

distribute( L : E2* ; i : N;B : E2*[r] )

L → next 6= L
p := L → next ; unlink(p)
precede( p, B[digit(i, r, p → key)]

gather( B : E2*[r] ; L : E2* )

i := 0 to r−1
append(L, B[i]) // add to the end of L the elements form B[i]

append( L, Bi : E2* )

A Bi → next 6= Bi
p := L → prev ; q := Bi → next ; r := Bi → prev
p → next := q ; q → prev := p
SKIP
r → next := L ; L → prev := r
Bi → next := Bi ; Bi → prev := Bi

Clearly,Tappend ∈ Θ(1), so Tgather ∈ Θ(r) where r = B.length.


And Tdistribute (n) ∈ Θ(n) where n = |L|.
Thus Tradix_sort (n, d, r) ∈ Θ(r + d(n + r)).
Consequently, if d is constant and r ∈ O(n), then Tradix_sort (n) ∈ Θ(n).
The space complexity of Radix sort is determined by the common length
of its temporal arrays which is r. Thus the space complexity of Radix sort
is Θ(r).
In a typical computer, which is a sequential random-access machine, we some-
times use radix sort to sort records of information that are keyed by multiple
elds. For example, we might wish to sort dates by three keys: year, month,
and day. We could run a sorting algorithm with a comparison function that,
given two dates, compares years, and if there is a tie, compares months, and
if another tie occurs, compares days. Alternatively, we could sort the in-
formation three times with a stable sort: rst on day, next on month, and
nally on year. [1]

8.4 Counting sort


While the previous version of radix sort is ecient on linked lists, counting
sort can be applied eciently to arrays, and another version of radix sort can
be built on it.

71
Remember that stable sorting algorithms maintain the relative order of
records with equal keys (i.e. values).
Counting sort is stable, and it is ideal auxiliary method of radix sort
because of its linear time complexity.
Here comes a bit more general study than needed for radix sort. When
counting sort is used for radix sort, its key function ϕ must select the appro-
priate digit.

The sorting problem: Given array A:T[n], r ∈ O(n) positive integer,


ϕ : T → 0..(r−1) key selection function.
Let us sort array A with stable sort, and with linear time complexity so
that the result is produced in array B.


counting_sort(A, B : T[n] ; r : N ; ϕ : T → 0..(r−1))

C : N[r] // counter array

k := 0 to r−1
C[k] := 0 // init the counter array

i := 0 to n−1
C[ϕ(A[i])]++ // count the items with the given key

k := 1 to r−1
C[k] += C[k − 1] // C[k] := the number of items with key ≤k
i := n − 1 downto 0
k := ϕ(A[i]) // k := the key of A[i]
C[k] − − // The next one with key k must be put before A[i] where

B[C[k]] := A[i] //Let A[i] be the last of the {unprocessed items with key k }

The rst loop of the procedure above assigns zero to each element of the
counting array C.
The second loop counts the number of occurences of key k in C[k], for
each possible key k.
The third loop sums the number of occurences of keys ≤ than k, consid-
ering each possible key k.
The number of keys ≤ 0 is the same as the number of keys = 0, so the
value of C[0] is not changed in the third loop. Considering greater keys we
have that the number of keys ≤k equals to the number of keys =k + the
number of keys ≤ k − 1. Thus the new value of C[k] can be counted with
adding the new value of C[k−1] to the old value C[k].

72
The fourth loop goes on the input array in reverse direction. We put the
elements of input array A into output array B : Considering any key k in the
input array, rst we process its last occurence. The element containing it is
put into the last place reserved for keys = k, i.e. intoB[C[k] − 1]: First we
decrease C[k] by one, and put this element into B[C[k]]. According to this
reverse direction the next occurence of key k is going to be the immediate
predecessor of the actual item etc. Thus the elements with the same key
remain in their original order, and we receive a stable sort.

The time complexity is clearly Θ(n + r). Provided that r ∈ O(n), Θ(n + r) =
Θ(n), and so T (n) ∈ Θ(n).
Regarding the space complexity, besides the input array, we have output
arrayB of n items and counter array C of r items. As a result, S(n) ∈
Θ(n + r) = Θ(n) since r ∈ O(n).
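A C++ sketch of counting sort specialised to int keys in 0..r−1 (i.e. ϕ is the identity); the array names follow the pseudocode above:

    #include <vector>

    // Stable counting sort of A into B; every A[i] is assumed to lie in 0..r-1.
    void countingSort(const std::vector<int> &A, std::vector<int> &B, int r) {
        int n = (int)A.size();
        B.assign(n, 0);
        std::vector<int> C(r, 0);                      // counter array
        for (int i = 0; i < n; ++i) ++C[A[i]];         // count the items with each key
        for (int k = 1; k < r; ++k) C[k] += C[k - 1];  // C[k] = number of items with key <= k
        for (int i = n - 1; i >= 0; --i) {             // reverse scan keeps the sort stable
            int k = A[i];
            B[--C[k]] = A[i];
        }
    }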

Illustration of counting sort: We suppose that we have to sort numbers


of two digits with number base 4 according to their right-side digit, i.e.
function ϕ selects the rightmost digit.

0 1 2 3 4 5
The input:
A: 02 32 30 13 10 12

The changes of counter array C [the rst column reects the rst loop ini-
tializing counter array C to zero, the next six columns reect the second loop
counting
P the items with key k, for each possible key; the column labeled by
reects the third loop which sums up the number of items with keys ≤ k;
and the last six columns reect the fourth loop placing each item of the input
array into its place in the output array]:

P
C 02 32 30 13 10 12 12 10 13 30 32 02
0 0 1 2 2 1 0
1 0 2
2 0 1 2 3 5 4 3 2
3 0 1 6 5

0 1 2 3 4 5
The output:
B: 30 10 02 32 12 13

Now we suppose that the result of the previous counting sort is to be sorted
according to the left-side digits of the numbers (of number base 4), i.e. func-
tion ϕ selects the leftmost digits of the numbers.

73
0 1 2 3 4 5
The input:
B: 30 10 02 32 12 13

The changes of counter array C :N[4]:P


C 30 10 02 32 12 13 13 12 32 02 10 30
0 0 1 1 0
1 0 1 2 3 4 3 2 1
2 0 4
3 0 1 2 6 5 4

0 1 2 3 4 5
The output:
A: 02 10 12 13 30 32

The rst counting sort ordered the input according to the right-side digits
of the numbers. The second counting sort ordered the result of the rst sort
according to the left-side digits of the numbers using a stable sort. Thus
in the nal result the numbers with the same left-side digits remained in
order according to their right-side digits. Consequently in the nal result the
numbers are already sorted according to their both digits.
Therefore the two counting sorts illustrated above form the rst and sec-
ond passes of a radix sort. And our numbers have just two digits now, so we
have performed a complete radix sort in this example.

8.5 Radix-Sort on arrays ([1] 8.3)


The keys of array A are d-digit natural numbers with number base r. The
rightmost digit is the least signicant (digit 1), and the leftmost digit is the
most signicant (digit d).

radix_sort(A : dDigitNumber[n] ; d : N)

i := 1 to d
use a stable sort to sort array A on digit i

Provided that the stable sort is counting sort, the time complexity of Radix
sort isΘ(d(n + r)). If d is a constant and r ∈ O(n), Θ(d(n + r)) = Θ(n), i.e.
T (n) ∈ Θ(n).
The space complexity of Radix sort based on Counting sort is clearly
determined by that of the later one. Thus S(n) ∈ Θ(n + r) = Θ(n) since
r ∈ O(n).

74
8.6 Bucket sort
We suppose that the items to be sorted are elements of the real interval [0; 1).
This algorithm is ecient, if the keys of the input are equally distributed
on [0; 1). We sort the buckets with some known sorting method like insertion
sort or merge sort.


bucket_sort( L : list )

n := the length of L
B : list[n] // Create the buckets B[0..(n−1)]
j := 0 to (n−1)
Let B[j] be empty list

L 6=
Remove the rst element of list L
Insert this element according to its key k into list B[bn ∗ kc]
j := 0 to (n−1)
Sort list B[j] nondecreasingly

Append lists B[0], B[1], ..., B[n − 1] in order into list L

ClearlymT (n) ∈ Θ(n). If the keys of the input are equally distributed
on [0; 1), AT (n) ∈ Θ(n). M T (n) depends on the sorting method we use
when sorting lists B[j] nondecreasingly. For example, using Insertion sort
M T (n) ∈ Θ(n2 ); using Merge sort M T (n) ∈ Θ(n log n).
Considering the space complexity of Bucket sort, we have temporal array
B of n elements. If we suppose that our lists are linked lists, we do not have
other extra data structure. Provided that we use Merge sort as the subroutine
of Bucket sort, the depth of recursion is O(log n) in any case. And Insertion
sort sorts in-place. Consequently mS(n), M S(n) ∈ Θ(n), if we sort a linked
list, and we use Insertion Sort or Merge sort as the subroutine of Bucket sort.

75
9 Hash Tables

In everyday programming practice we often need so called dictionaries. These


are collections of records with unique keys. Their operations: (1) inserting a
new record into the dictionary, (2) searching for a record identied by a key,
(3) removing a record identied by a key (or localized by a previous search).
Besides AVL trees, B+ trees (see them in the next semester), and other
kinds of balanced search trees dictionaries are often represented and imple-
mented with hash tables, provided that we would like to optimize the average
running time of the operations. Using hash tables the average running time
of the operations above is the ideal Θ(1), while the maximum of it is Θ(n).
(With balanced search trees the worst case and average case performance of
each key-based operation are Θ(log n).)

Notations:
m : the size of the hash table
T [0..(m − 1)] : the hash table
T [0], T [1], . . . , T [m − 1] : the slots of the hash table
: empty slot in the hash table, if we use direct-address tables or key
collisions are resolved by chaining
E : the key of empty slots in case of open addressing
D : the key of deleted slots in case of open addressing
n : the number of records stored in the hash table
α = n/m : load factor
U : the universe of keys; k, k 0 , ki ∈ U
h : U → 0..(m − 1) : hash function

We suppose that the hash table does not contain two or more records with
the same key, and that h(k) can be calculated in Θ(1) time.

9.1 Direct-address tables


In case of direct address tables we do not have hash functions. We suppose
thatU = 0..(m−1) where m ≥ n, but m is not too big compared to n.
T : D∗[m] is the direct-address table. Its slots are pointers referring to
records of type D. Each record has a key k : U and contains satellite data.
The direct-address table is initialized with pointers.

D
+ k : U // k is the key
+ . . . // satellite data

76

init( T :D*[m] )

search( T :D*[] ; k :U ):D*
i := 0 to m−1
T [i] := return T [k]

insert( T :D*[] ; p:D* ):B remove( T :D*[] ; k :U ):D*

A T [p → k] = p := T [k]
T [p → k] := p T [k] :=
return f alse
return true return p

Clearly Tinit (m) ∈ Θ(m). And for the other three operations we have T ∈
Θ(1).

9.2 Hash tables


Hash function: Provided that |U | >> n, direct address tables cannot be
applied or at least applying them is wasting space. Thus we use a hash
function h : U → 0..(m−1) where typically |U | >> m (the size of the
universe U of keys is much greater than the size m of the hash table). The
record with key k is to be stored in slot T [h(k)] of hash table T [0..(m−1)].
Remember that the hash table should not contain two or more records
with the same key, and that h(k) must be calculated in Θ(1) time.
Function h : U → 0..(m−1) is simple uniform hashing, if it distributes
the keys evenly into the slots, i.e. any given key is equally likely to hash
into any of the m slots, independently of where other items have hashed to.
Simple uniform hashing is a general requirement about hash functions.

Collision of keys: Provided that h(k1 ) = h(k2 ) for keys k1 6= k2 we speak of


key collision. Usually |U | >> m, so key collisions probably happen, and we
have to handle this situation.
For example, let us suppose that the keys are integer numbers and
h(k) = k mod m. Then exactly those keys are hashed to slot s for which
s = k mod m.

9.3 Collision resolution by chaining


We suppose that the slots of the hash table identify simple linked lists (S1L)
that is T :E1*[m] where the elements of the lists contain the regular elds
key and next, plus usually additional elds (satellite data). Provided that

77
the hash function maps two or more keys to the same slot, the corresponding
records are stored in the list identied by this slot.

E1
+key : U
. . . // satellite data may come here
+next : E1*
+E1() { next := }

init( T :E1*[m] )

search( T :E1*[] ; k :U ):E1*
i := 0 to m−1
T [i] := return searchS1L(T [h(k)],k)


insert( T :E1*[] ; p:E1* ):B

k := p → key ; s := h(k)
searchS1L( q :E1* ; k :U ):E1*
A searchS1L(T [s], k) =
p → next := T [s] q 6= ∧ q → key 6= k
T [s] := p return f alse q := q → next
return true return q


remove( T :E1*[] ; k :U ):E1*

s := h(k) ; p := ; q := T [s]
6 ∧ q → key 6= k
q=
p := q ; q := q → next
A q 6=

A p=
T [s] := q → next p → next := q → next SKIP

q → next :=
return q

Clearly T_init(m) ∈ Θ(m). For the other three operations mT ∈ Θ(1),
MT(n) ∈ Θ(n), AT(n, m) ∈ Θ(1 + n/m).
AT(n, m) ∈ Θ(1 + n/m) is satisfied if function h : U → 0..(m−1) is simple
uniform hashing, because the average length of the lists of the slots is equal
to n/m = α.

78
Usually n/m ∈ O(1) is required. In this case AT(n, m) ∈ Θ(1) is also
satisfied for insertion, search, and removal.

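As an illustration, a small C++ sketch of a hash table with chaining (our own simplified version: int keys, no satellite data, and std::forward_list playing the role of the S1Ls):

    #include <forward_list>
    #include <vector>
    #include <algorithm>

    struct ChainedHash {
        std::vector<std::forward_list<int>> T;            // one chain per slot
        ChainedHash(int m) : T(m) {}
        int h(int k) const {                              // division method
            int m = (int)T.size();
            return ((k % m) + m) % m;                     // nonnegative slot index
        }
        bool search(int k) const {
            const auto &L = T[h(k)];
            return std::find(L.begin(), L.end(), k) != L.end();
        }
        bool insert(int k) {                              // no duplicate keys allowed
            if (search(k)) return false;
            T[h(k)].push_front(k);
            return true;
        }
        bool remove(int k) {
            auto &L = T[h(k)];
            auto before = L.before_begin();
            for (auto it = L.begin(); it != L.end(); ++it, ++before)
                if (*it == k) { L.erase_after(before); return true; }
            return false;
        }
    };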
9.4 Good hash functions


Division method: Provided that the keys are integer numbers,

h(k) = k mod m

is often a good choice, because it can calculated simply and eciently. And
if m is a prime number not too close to a power of two, it usually distributes
the keys evenly among the slots, i.e. on the integer interval 0..(m−1).
For example, if we want to resolve key collision by chaining, and we would
like to store approximately 2000 records with maximum load factor α ≈ 3,
then m = 701 is a good choice: 701 is a prime number which is close to
2000/3, and it is far enough from the neighboring powers of two, i.e. from
512, and 1024.

Keys in interval [ 0 ; 1): Provided that the keys are evenly distributed on
[ 0 ; 1), function
h(k) = bk ∗ mc
is also simple uniform hashing.

Multiplication method: Provided that the keys are real numbers, and
A ∈ (0; 1) is a constant,

h(k) = b{k ∗ A} ∗ mc

is a hash function. ({x} is the fraction part of x.) It does not distribute the
keys equally well with all the possible values of A.
Knuth proposes A = (√5 − 1)/2 ≈ 0.618, because it is likely to work reasonably
well. Compared to the division method it has the advantage that the value
of m is not critical.

Each method above supposes that the keys are numbers. If the keys are
strings, the characters can be considered digits of unsigned integers with the
appropriate number base. Thus the strings can be interpreted as big natural
numbers.

79
9.5 Open addressing
The hash table is T : R[m]. The records of type R are directly in the slots.
Each record has a key eld k : U ∪ {E, D} where E 6= D; E, D ∈
/ U are
global constants in order to indicate empty (E ) and deleted (D ) slots.

R
+ k : U ∪ {E, D} // k is a key or it is Empty or Deleted
+ . . . // satellite data


init( T :R[m] )

i := 0 to m−1
T [i].k := E

Notations for open addressing:


h : U × 0..(m − 1) → 0..(m − 1) : probing function
hh(k, 0), h(k, 1), . . . , h(k, m − 1)i : potential probing sequence

The hash table does not contain double keys.

The empty and deleted slots together are free slots. The other slots are
occupied.) Instead of a single hash function we have m hash functions now:

h(·, i) : U → 0..(m − 1) (i ∈ 0..(m − 1))

In open addressing we try these in this order one after the other if needed.

9.5.1 Open addressing: insertion and search, without deletion


In many applications of dictionaries (and hash tables) we do not need dele-
tion. Insertion and search are sucient. In this case insertion is simpler.

Let us suppose that we want to insert record r with key k into the hash
table. First we probe slot h(k, 0). If it is occupied and its key is not k, we
try h(k, 1), and so on, throughout hh(k, 0), h(k, 1), . . . , h(k, m − 1)i until
- we nd an empty slot, or
- we nd an occupied slot with key k, or
- all the slots of the potential probing sequence have been considered but
found neither empty slot nor occupied slot with key k.
+ If we nd an empty slot, we put r into it. Otherwise insertion fails.

80
hh(k, 0), h(k, 1), . . . , h(k, m − 1)i is called potential probing sequence because
during insertion, or search (or deletion) only a prex of it is actually gener-
ated. This prex is called actual probing sequence.
The potential probing sequence must be a permutation of h0, 1, . . . , (m −
1)i, which means, that it covers the whole hash table, i.e. it does not refer
twice to the same slot.
The length of the actual probing sequence of insertion/search/deletion is
i, i this operation stops at probe h(k, i − 1).
When we search for the record with key k , again we follow the potential
probing sequence hh(k, 0), h(k, 1), . . . , h(k, m − 1)i.
- We stop successfully when we nd an occupied slot with key k.
- The search fails, if we nd an empty slot, or we use up the potential probing
sequence unsuccessfully.

In ideal case we have uniform hashing: the potential probe sequence of each
key is equally likely to be any of the m! permutations of h0, 1, . . . , (m − 1)i.
Provided that
- the hash table does not contain deleted slots,
- its load factor α = n/m satises 0 < α < 1, and
- we have uniform hashing,
+ the expected length of an unsuccessful search / successful insertion is at
most
1/(1 − α),
+ and the expected length of a successful search / unsuccessful insertion is
at most
(1/α) ln(1/(1 − α)).
For example, the rst result above implies that with uniform hashing, if the
hash table is half full, the expected number of probes in a unsuccessful search
(or in a successful insertion) is less than 2. If the hash table is 90% full, the
expected number of probes is less then 10.
Similarly, the second result above implies that with uniform hashing, if
the hash table is half full, the expected number of probes in a successful
search (or in an unsuccessful insertion) is less than 1.387. If the hash table
is 90% full, the expected number of probes is less then 2.559. [1].

9.5.2 Open addressing: insertion, search, and deletion


A successful deletion consists of a successful search for the slot T [s] containing
a given key + the assignment T [s].k := D (let the slot be deleted). T [s].k :=

81
E (let the slot be empty) is not correct. For example, let us suppose that we
inserted record r with key k into the hash table, but it could not be put into
T [h(k, 0)] because of key conict, and we put it into T [h(k, 1)]. And then we
delete the record at T [h(k, 0)]. If this deletion performed T [h(k, 0)].k := E ,
a subsequent search for key k would stop at the empty slot T [h(k, 0)], and it
would not nd record r with key k in T [h(k, 1)]. Instead, deletion performs
T [h(k, 0)].k := D. Then the subsequent search for key k does not stop at
slot T [h(k, 0)] (because neither it is empty nor it contains key k ), and it
nds record r with key k in T [h(k, 1)]. (Clearly the procedures of search and
deletion are not changed in spite of the presence of deleted slots.)

Thus during search we go through deleted slots and we stop when


- we nd the slot with the key we search for (successful search), or
- we nd an empty slot or use up the the potential probe sequence of the
given key (unsuccessful search).

Insertion becomes more complex because of the presence of deleted slots.


During the insertion of record r with key k, we perform a full search for k
but we also remember the rst deleted slot found during the search.
- If the search is successful, then the insertion fails (because we do not allow
duplicated keys).
- If the search is unsuccessful, but some deleted slot is remembered, then we
put r into it.
- If the search is unsuccessful, no deleted slot is remembered, but the search
stops at an empty slot, then we put r into it.
- If the search is unsuccessful, and neither deleted nor empty slot is found,
then the insertion fails, because the hash table is full.

If we use a hash table for a long time, there may be many deleted slots and
no empty slot, although the table is far from being full. This means that the
unsuccessful searches will check all the slots, and also the other operations
slow down. So we have to get rid of the deleted slots, for example, by
rebuilding the whole table.

9.5.3 Linear probing


In this subsection, and in the next two, we consider three strategies in order
to generate actual probing sequences, that is, the adequate prefix of
⟨h(k, 0), h(k, 1), . . . , h(k, m−1)⟩. In each case we have a primary hash function
h1 : U → 0..(m−1) where h(k, 0) = h1 (k). If needed, starting from this
slot we go on step by step throughout the slots of the hash table, according
to a well defined rule, until we find the appropriate slot, or we find that the
actual operation is impossible. h1 must be a simple uniform hash function.
The simplest strategy is linear probing:

h(k, i) = (h1 (k) + i) mod m (i ∈ 0..(m − 1))

It is easy to implement linear probing, but we have only m different probing
sequences instead of the m! probing sequences needed for uniform hashing:
given two keys, k1 and k2, if h(k1 , 0) = h(k2 , 0) then their whole probing
sequences are the same. In addition, different probing sequences tend to be
linked into continuous, long runs of occupied slots, increasing the expected
time of searching. This problem is called primary clustering. The longer
such a cluster is, the more probable it is that it becomes even longer after
the next insertion. For example, suppose we have two free slots with a run of
i occupied slots between them. The run grows if either of the two bounding
free slots gets filled: the slot before the run is filled if the first probe of the
next insertion hits it, and the slot after the run is filled if the first probe hits
it or any of the i occupied slots. So the probability that the run will be
extended by the next insertion is at least (i + 2)/m. And the run may even be
linked with another cluster. Linear probing should be selected only if the
probability of key collision is extremely low.
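As a small illustration (in Python, with ad hoc names), two keys that collide
on their first probe share their entire probe sequence under linear probing:

    def linear_probes(k, m, h1):
        # the full potential probe sequence of key k under linear probing
        return [(h1(k) + i) % m for i in range(m)]

    m = 8
    h1 = lambda k: k % m
    print(linear_probes(3, m, h1))   # [3, 4, 5, 6, 7, 0, 1, 2]
    print(linear_probes(11, m, h1))  # [3, 4, 5, 6, 7, 0, 1, 2] -- the same sequence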

9.5.4 Quadratic probing


h(k, i) = (h1 (k) + c1 i + c2 i²) mod m (i ∈ 0..(m − 1))
where h1 : U → 0..(m−1) is the primary hash function; c1 , c2 ∈ R; c2 ≠ 0.
The different probing sequences are not linked together, but we have only
m different probing sequences instead of the m! probing sequences needed
for uniform hashing: given two keys, k1 and k2, if h(k1 , 0) = h(k2 , 0) then
their whole probing sequences are the same. This problem is called secondary
clustering.

Choosing the constants of quadratic probing: In this case the potential
probing sequence, i.e. ⟨h(k, 0), h(k, 1), . . . , h(k, m − 1)⟩, may have equal
members, which implies that it does not cover the hash table. Therefore we
must be careful about selecting the constants of quadratic probing.
For example, if size m of the hash table is a power of 2, then c1 = c2 = 1/2
is appropriate. In this case

    h(k, i) = (h1 (k) + (i + i²)/2) mod m    (i ∈ 0..(m − 1))

Thus

    (h(k, i + 1) − h(k, i)) mod m = (((i + 1) + (i + 1)²)/2 − (i + i²)/2) mod m
                                  = (i + 1) mod m

So it is easy to compute the slots of the probing sequence recursively:

    h(k, i + 1) = (h(k, i) + i + 1) mod m
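This recursion can be checked with a few lines of Python (illustration only):
for m a power of 2 and c1 = c2 = 1/2 the generated probe sequence is a
permutation of 0..(m − 1), so it covers the whole table.

    def quadratic_probes(k, m, h1):
        # c1 = c2 = 1/2, computed with the recursion h(k,i+1) = (h(k,i) + i + 1) mod m
        probes = [h1(k) % m]
        for i in range(m - 1):
            probes.append((probes[-1] + i + 1) % m)
        return probes

    m = 8
    seq = quadratic_probes(5, m, lambda k: k % m)
    print(seq)                            # [5, 6, 0, 3, 7, 4, 2, 1]
    print(sorted(seq) == list(range(m)))  # True: every slot occurs exactly once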

Exercise 9.1 Write the structograms of the operations of hash tables with
quadratic probing (c1 = c2 = 1/2) applying the previous recursive formula.

9.5.5 Double hashing


h(k, i) = (h1 (k) + i · h2 (k)) mod m (i ∈ 0..(m − 1))
where h1 : U → 0..(m−1) and h2 : U → 1..(m−1) are hash functions.
The probing sequence covers the hash table iff h2 (k) and m are relatively
prime. This is satisfied, for example, if m > 1 is a power of 2 and h2 (k) is an
odd number for each k ∈ U , or if m is a prime number. For example, if m is
a prime number (which should not be close to a power of 2) and m′ is a bit
smaller (let m′ = m − 1 or m′ = m − 2), then

    h1 (k) = k mod m
    h2 (k) = 1 + (k mod m′)

is an eligible choice.
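For instance, the following Python fragment (illustration only, with m = 13
and m′ = 11 chosen ad hoc) checks that every probe sequence is a permutation
of the slots when m is prime:

    m, m1 = 13, 11                    # m prime, m' = m - 2

    def probes(k):
        h1 = k % m
        h2 = 1 + (k % m1)             # 1 <= h2 <= 11 < m, so gcd(h2, m) = 1
        return [(h1 + i * h2) % m for i in range(m)]

    print(all(sorted(probes(k)) == list(range(m)) for k in range(1000)))  # True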
In case of double hashing, for each different pair (h1 (k), h2 (k)) there is a
different probing sequence, and so we have Θ(m²) different probing sequences.
Although the number of the probing sequences of double hashing is far
from the ideal number m! of probing sequences, its performance appears to
be very close to that of the ideal scheme of uniform hashing.

Illustration of the operations of double hashing: Because
h(k, i) = (h1 (k) + i · h2 (k)) mod m (i ∈ 0..(m−1)), we have h(k, 0) = h1 (k)
and h(k, i + 1) = (h(k, i) + d) mod m where d = h2 (k). After calculating
the place of the first probe (h1 (k)), we always make a step of distance d
cyclically around the table.

Example: m = 11, h1 (k) = k mod 11, h2 (k) = 1 + (k mod 10).


In the operations (op) column of the next table, ins = insert, src = search,
del = delete. Next comes the key of the operation (which is neither E nor D).
In this table we do not handle satellite data; we show just the keys. In the
next column of the table there is h2 (key), but only if it is needed. Next we
find the actual probing sequence. Insertion remembers the first deleted slot
of the actual probing sequence, if any; in such cases the index of this slot is
marked with an asterisk (*). In column s we have a "+" sign for a successful,
and a "−" sign for an unsuccessful operation. (See the details in section 9.5.2.)
In the last 11 columns of the table we represent the actual state of the
hash table. The cells representing empty slots are simply left empty. We
write the occupying key into each cell of an occupied slot, while the cells of
the deleted slots contain the letter D.
 op   key  h2  probes         s |  0 |  1 |  2 |  3 |  4  |  5  |  6 |  7 |  8 |  9 | 10
 init                         + |    |    |    |    |     |     |    |    |    |    |
 ins   32      10             + |    |    |    |    |     |     |    |    |    |    | 32
 ins   40      7              + |    |    |    |    |     |     |    | 40 |    |    | 32
 ins   37      4              + |    |    |    |    | 37  |     |    | 40 |    |    | 32
 ins   15   6  4; 10; 5       + |    |    |    |    | 37  | 15  |    | 40 |    |    | 32
 ins   70   1  4; 5; 6        + |    |    |    |    | 37  | 15  | 70 | 40 |    |    | 32
 src   15   6  4; 10; 5       + |    |    |    |    | 37  | 15  | 70 | 40 |    |    | 32
 src  104   5  5; 10; 4; 9    − |    |    |    |    | 37  | 15  | 70 | 40 |    |    | 32
 del   15   6  4; 10; 5       + |    |    |    |    | 37  | D   | 70 | 40 |    |    | 32
 src   70   1  4; 5; 6        + |    |    |    |    | 37  | D   | 70 | 40 |    |    | 32
 ins   70   1  4; 5*; 6       − |    |    |    |    | 37  | D   | 70 | 40 |    |    | 32
 del   37      4              + |    |    |    |    | D   | D   | 70 | 40 |    |    | 32
 ins  104   5  5*; 10; 4; 9   + |    |    |    |    | D   | 104 | 70 | 40 |    |    | 32
 src   15   6  4; 10; 5; 0    − |    |    |    |    | D   | 104 | 70 | 40 |    |    | 32
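The probe sequences in the table can be reproduced with a few lines of Python
(illustration only):

    m = 11
    h1 = lambda k: k % 11
    h2 = lambda k: 1 + (k % 10)

    def probe_sequence(k):             # the full potential probe sequence of key k
        return [(h1(k) + i * h2(k)) % m for i in range(m)]

    print(probe_sequence(15)[:3])      # [4, 10, 5]    -- as in the table
    print(probe_sequence(104)[:4])     # [5, 10, 4, 9]
    print(probe_sequence(70)[:3])      # [4, 5, 6]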

Exercise 9.2 (Programming of double hashing) Write the structograms


of insertion, search, and deletion where x is the record to be inserted, and k
is the key we search for, and it is also the key of the record we want to delete.
The hash table is T [0..(m−1)].
In a search we try to find the record identified by the given key. After a
successful search we return the position (index of the slot) of the appropriate
record. After an unsuccessful search we return −1.
After a successful insertion we return the position of the insertion. At
an unsuccessful insertion there are two cases. If there is no free place in the
hash table, we return −(m + 1). If the key of the record to be inserted is
found at slot j, we return −(j + 1).
In a deletion we try to find and delete the record identified by the given
key. After a successful deletion we return the position (index of the slot) of
the appropriate record. After an unsuccessful deletion we return −1.

Solution:


insert( T : R[m] ; x : R ) : Z
    k := x.k ; j := h1 (k)
    i := 0 ; d := h2 (k)
    while i < m ∧ T [j].k ∉ {E, D}
        if T [j].k = k
            return −(j + 1)
        i++ ; j := (j + d) mod m
    if i < m                            // T [j] is the first empty or deleted slot
        ii := j
        while i < m ∧ T [j].k ≠ E       // go on searching for k through deleted slots
            if T [j].k = k
                return −(j + 1)
            i++ ; j := (j + d) mod m
        T [ii] := x
        return ii
    else
        return −(m + 1)                 // the hash table is full

search( T : R[m] ; k : U ) : Z
    i := 0 ; j := h1 (k)
    b := true ; d := h2 (k)
    while b
        if T [j].k = k
            return j
        i++
        b := (T [j].k ≠ E ∧ i < m)
        j := (j + d) mod m
    return −1

delete( T : R[] ; k : U ) : Z
    j := search(T, k)
    if j ≥ 0
        T [j].k := D
    return j
