Alg Ds 1 Lecture Notes
Tibor Ásványi
Department of Computer Science
5 Linked Lists 33
5.1 One-way or singly linked lists . . . . . . . . . . . . . . . . . . 33
5.1.1 Simple one-way lists (S1L) . . . . . . . . . . . . . . . . 33
5.1.2 One-way lists with header node (H1L) . . . . . . . . . 34
5.1.3 One-way lists with trailer node . . . . . . . . . . . . . 34
5.1.4 Handling one-way lists . . . . . . . . . . . . . . . . . . 35
5.1.5 Insertion sort of H1Ls . . . . . . . . . . . . . . . . . . 36
5.1.6 Merge sort of S1Ls . . . . . . . . . . . . . . . . . . . . 36
5.1.7 Cyclic one-way lists . . . . . . . . . . . . . . . . . . . . 37
5.2 Two-way or doubly linked lists . . . . . . . . . . . . . . . . . . 37
5.2.1 Simple two-way lists (S2L) . . . . . . . . . . . . . . . . 38
5.2.2 Cyclic two-way lists (C2L) . . . . . . . . . . . . . . . . 38
5.2.3 Example programs on C2Ls . . . . . . . . . . . . . . . 41
6.4.2 Using parent pointers . . . . . . . . . . . . . . . . . . . 52
6.5 Parenthesized, i.e. textual form of binary trees . . . . . . . . . 52
6.6 Binary search trees . . . . . . . . . . . . . . . . . . . . . . . . 53
6.7 Level-continuous binary trees, and heaps . . . . . . . . . . . . 56
6.8 Arithmetic representation of level-continuous binary trees . . . 56
6.9 Heaps and priority queues . . . . . . . . . . . . . . . . . . . . 57
6.10 Heap sort . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62
9 Hash Tables 76
9.1 Direct-address tables . . . . . . . . . . . . . . . . . . . . . . . 76
9.2 Hash tables . . . . . . . . . . . . . . . . . . . . . . . . . . . . 77
9.3 Collision resolution by chaining . . . . . . . . . . . . . . . . . 77
9.4 Good hash functions . . . . . . . . . . . . . . . . . . . . . . . 79
9.5 Open addressing . . . . . . . . . . . . . . . . . . . . . . . . . 80
9.5.1 Open addressing: insertion and search, without deletion 80
9.5.2 Open addressing: insertion, search, and deletion . . . . 81
9.5.3 Linear probing . . . . . . . . . . . . . . . . . . . . . . 82
9.5.4 Quadratic probing . . . . . . . . . . . . . . . . . . . . 83
9.5.5 Double hashing . . . . . . . . . . . . . . . . . . . . . . 84
References
[2] Cormen, Thomas H., Algorithms Unlocked, The MIT Press, 2013.
[5] Weiss, Mark Allen, Data Structures and Algorithm Analysis in C++ (Fourth Edition), Pearson, 2014.
http://aszt.inf.elte.hu/~asvanyi/ds/DataStructuresAndAlgorithmAnalysisInCpp_2
[6] Wirth, N., Algorithms and Data Structures, Prentice-Hall Inc., 1976, 1985, 2004.
http://aszt.inf.elte.hu/~asvanyi/ds/AD.pdf
[7] Burch, Carl, B+ trees.
http://aszt.inf.elte.hu/~asvanyi/ds/B+trees.zip
2. We can declare subroutines, classes, variables and constants.
4. The while loop is the default loop; for loops are also used.
5. The operators ≠, ≥, ≤ are written with the usual mathematical notation.
The assignment statements and the is-equal-to comparisons are written in Pascal style. For example, x := y assigns the value of y to x, and x = y checks their equality.
expected Time), and mT(n) (minimum Time). Clearly MT(n) ≥ AT(n) ≥ mT(n). If MT(n) = mT(n), then we can speak of a general time complexity T(n), where T(n) = MT(n) = AT(n) = mT(n).
Typically, the time complexities of algorithms are not calculated exactly. We only calculate their asymptotic order or make asymptotic estimation(s) using the big-O notation.
Definition 1.2
f ≺ g ⇐⇒ lim_{n→∞} f(n)/g(n) = 0
o(g) = {h | h ≺ g}
Definition 1.3
g ≻ f ⇐⇒ f ≺ g
g ≻ f is read as "g is asymptotically greater than f". It can also be written as g ∈ ω(f), where in general
ω(g) = {h | h ≻ g}
Definition 1.6
Θ(g) = O(g) ∩ Ω(g)
(f ∈ Θ(g) can be read as "f is roughly proportional to g".)
Consequence 1.7
f ∈ O(g) ⇐⇒ ∃ d, n0 > 0 and ψ : N → R such that lim_{n→∞} ψ(n)/g(n) = 0 and f(n) ≤ d ∗ g(n) + ψ(n) for each n ≥ n0.
Consequence 1.8
f ∈ Ω(g) ⇐⇒ ∃ c, n0 > 0 and ϕ : N → R such that lim_{n→∞} ϕ(n)/g(n) = 0 and f(n) ≥ c ∗ g(n) + ϕ(n) for each n ≥ n0.
Note 1.10 .
• If f ∈ Θ(g), we can say that f and g are asymptotically equivalent.
1. O(g), Ω(g) and Θ(g) are sets of asymptotically positive (AP) func-
tions while f is just a single AP function. Thus the equalities
above are simply false claims.
Theorem 1.11
lim_{n→∞} f(n)/g(n) = 0 =⇒ f ≺ g =⇒ f ∈ O(g)
lim_{n→∞} f(n)/g(n) = c ∈ P =⇒ f ∈ Θ(g)
lim_{n→∞} f(n)/g(n) = ∞ =⇒ f ≻ g =⇒ f ∈ Ω(g)
Proof. The first and last statements follow directly from the definitions of the ≺ and ≻ relations. In order to prove the middle one, consider that lim_{n→∞} f(n)/g(n) = c. Consequently, if n is sufficiently large, |f(n)/g(n) − c| < c/2. Thus
c/2 < f(n)/g(n) < 2c
Because g is AP, g(n) > 0 for sufficiently large n values, and we can multiply both sides of this inequality by it. Therefore
(c/2) ∗ g(n) < f(n) < 2c ∗ g(n)
As a result, f ∈ Θ(g).
Consequence 1.12
k ∈ N ∧ a0, a1, . . . , ak ∈ R ∧ ak > 0 =⇒ ak n^k + ak−1 n^(k−1) + · · · + a1 n + a0 ∈ Θ(n^k)
Proof.
lim_{n→∞} (ak n^k + ak−1 n^(k−1) + · · · + a1 n + a0) / n^k =
lim_{n→∞} (ak n^k / n^k + ak−1 n^(k−1) / n^k + · · · + a1 n / n^k + a0 / n^k) =
lim_{n→∞} (ak + ak−1/n + · · · + a1/n^(k−1) + a0/n^k) =
lim_{n→∞} ak + lim_{n→∞} ak−1/n + · · · + lim_{n→∞} a1/n^(k−1) + lim_{n→∞} a0/n^k =
ak + 0 + · · · + 0 + 0 = ak ∈ P =⇒
ak n^k + ak−1 n^(k−1) + · · · + a1 n + a0 ∈ Θ(n^k)
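As a quick illustration (a worked example added here, not part of the original notes), Theorem 1.11 classifies a concrete polynomial directly; in LaTeX notation:

\[
  \lim_{n\to\infty}\frac{3n^2+5n+7}{n^2}
  =\lim_{n\to\infty}\Bigl(3+\frac{5}{n}+\frac{7}{n^2}\Bigr)=3\in\mathbb{P}
  \;\Longrightarrow\; 3n^2+5n+7\in\Theta(n^2).
\]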
f ∈ Θ(g) ⇐⇒ g ∈ Θ(f )
f ∈ O(g) ⇐⇒ g ∈ Ω(f )
f ≺ g ⇐⇒ g ≻ f
Property 1.17 (Asymmetry)
f ≺ g =⇒ ¬(g ≺ f )
f ≻ g =⇒ ¬(g ≻ f )
¬(f ≺ f )
¬(f ≻ f )
We are going to see that such equivalence classes can be identified, and they will be fundamental from the point of view of calculating the efficiency of the algorithms. For example, we have already seen that any k-degree polynomial with positive major coefficient is asymptotically equivalent to the n^k function. (See Consequence 1.12.) The asymptotic order of such equivalence classes can be established, and it is based on the following property.
Property 1.22
f1, g1 ∈ Θ(h1) ∧ f2, g2 ∈ Θ(h2) ∧ f1 ≺ f2 =⇒ g1 ≺ g2
Definition 1.23
Θ(f ) ≺ Θ(g) ⇐⇒ f ≺ g
Lemma 1.24 Sometimes the following so-called L'Hospital rule can be applied for computing the limit lim_{n→∞} f(n)/g(n) in Theorem 1.11.
If the real extensions of the functions f and g are differentiable for sufficiently large substitution values, then
lim_{n→∞} f(n) = ∞ ∧ lim_{n→∞} g(n) = ∞ ∧ ∃ lim_{n→∞} f′(n)/g′(n) =⇒
lim_{n→∞} f(n)/g(n) = lim_{n→∞} f′(n)/g′(n)
c, d ∈ R ∧ c < d =⇒ n^c ≺ n^d
c, d ∈ P0 ∧ c < d =⇒ c^n ≺ d^n
c, d ∈ R ∧ d > 1 =⇒ n^c ≺ d^n
d ∈ P0 =⇒ d^n ≺ n! ≺ n^n
c, d ∈ P ∧ c, d > 1 =⇒ log_c n ∈ Θ(log_d n)
ε ∈ P =⇒ log n ≺ n^ε
c ∈ R ∧ ε ∈ P =⇒ n^c log n ≺ n^(c+ε)
= (log e / ε) ∗ lim_{n→∞} 1/n^ε = (log e / ε) ∗ 0 = 0
Consequence 1.26
Θ(log n) ≺ Θ(n) ≺ Θ(n ∗ log n) ≺ Θ(n²) ≺ Θ(n² ∗ log n) ≺ Θ(n³)
Property 1.27
(The function classes O(·), Ω(·), Θ(·), o(·), ω(·) are closed under the following operations.)
f ∈ O(g) ∧ c ∈ P =⇒ c ∗ f ∈ O(g)
f ∈ O(h1) ∧ g ∈ O(h2) =⇒ f + g ∈ O(h1 + h2)
f ∈ O(h1) ∧ g ∈ O(h2) =⇒ f ∗ g ∈ O(h1 ∗ h2)
f ∈ O(g) ∧ ϕ : N → R ∧ lim_{n→∞} ϕ(n)/f(n) = 0 =⇒ f + ϕ ∈ O(g)
(Similarly for the function classes Ω(·), Θ(·), o(·), ω(·).)
Property 1.28
max(f, g) ∈ Θ(f + g) where max(f, g)(n) = max(f(n), g(n)) (n ∈ N)
Proof. Because f and g are AP, f(n) > 0 and g(n) > 0 for sufficiently large n ∈ N values, and for such n we have max(f(n), g(n)) ≤ f(n) + g(n) ≤ 2 ∗ max(f(n), g(n)).
1.4 Space complexities of algorithms
Space complexity of an algorithm is an abstract measure of space (i.e. mem-
ory) requirements of that algorithm.
It depends on the depth of the subroutine calls and on the sizes of the
data structures (like arrays, linked lists, trees, etc.) generated or needed by
that algorithm. The size of the input is typically excluded from the space
requirements.
Notice that this latter definition is based on the fact that although lim_{n→∞} log(n) = ∞, the function log(n) still grows very slowly. For example, log₂ 10³ < 10, log₂ 10⁶ < 20, log₂ 10⁹ < 30, log₂ 10¹² < 40.
2 Elementary Data Structures and Data Types
A, Z : T[n]
In this example A and Z are two arrays of element type T and of size n.
In our model, the array data structure is an array object containing its size
and elements, and it is accessed through a so called array pointer containing
its memory address. The operations of the array can
read its size like A.length and Z.length: A.length = Z.length = n here,
access (read and write) its elements through indexing as it is usual, for
example A[i] and Z[j] here.
Arrays are indexed from 0.
Provided that an object or variable is created by declaring it, this object or
variable is deleted automatically when the subroutine containing it nishes.
Then the memory area reserved by it can be reused for other purposes.
P : T[]
Array objects can be created (i.e. allocated) dynamically like in C++, but our array objects always contain their size, unlike in C++. For example, the statement
P := new T[m]
creates a new array object, pointer P refers to it, and P.length = m.
Note that any object (and especially any array object) generated dynamically must be deleted (i.e. deallocated) explicitly when the object is not needed any more, in order to avoid memory leaks. Deletion is done like in C++:
delete P
suffices here. It does not delete the pointer but the pointed object. Having deleted the object, the memory area reserved by it can be reused for other purposes.
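To make this model concrete, here is a minimal C++ sketch of a size-carrying array object corresponding to P := new T[m] and delete P (the name Arr and its members are assumptions introduced for this example, not part of the notes):

#include <cstddef>

// A size-carrying array object, mimicking the notes' "T[n]" model.
template <typename T>
struct Arr {
    std::size_t length;  // corresponds to A.length in the notes
    T*          data;    // the elements

    explicit Arr(std::size_t m) : length(m), data(new T[m]) {}  // P := new T[m]
    ~Arr() { delete[] data; }                                    // delete P

    T&       operator[](std::size_t i)       { return data[i]; }
    const T& operator[](std::size_t i) const { return data[i]; }
};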
Unfortunately, we cannot say anything in general about the efficiency of memory allocation and deallocation. Sometimes these can be performed with Θ(1) time complexity, but usually they need much more time, and their efficiency is often unpredictable. Therefore we avoid overusing them and apply them only when it is really necessary.
2.2 Stacks
A stack is a LIFO (Last-In First-Out) data storage. It can be imagined as a
vertical sequence of items similar to a tower of plates on a table. We can push
a new item at the top, and we can check or remove (i.e. pop) the topmost
item.
In the following representation the elements of the stack are stored in
subarray A[0..n). Provided that n > 0, A[n − 1] is the top of the stack.
Provided that n = 0, the stack is empty.
T (n) ∈ Θ(1) for each method, because there is neither iteration nor sub-
routine invocation in their code. The time complexities of the constructor
and of the destructor depend on that of the new and delete expressions.
Stack
−A : T[] // T is some known type ; A.length is the max. size of the stack
−n : N // n ∈ 0..A.length is the actual size of the stack
+ Stack(m : N) {A := new T[m] ; n := 0} // create an empty stack
+ push(x : T) // push x onto the top of the stack
+ pop() : T // remove and return the top element of the stack
+ top() : T // return the top element of the stack
+ isFull() : B {return n = A.length}
+ isEmpty() : B {return n = 0}
+ setEmpty() {n := 0} // reinitialize the stack
+ ∼ Stack() { delete A }
Stack::push(x : T)
  if n < A.length
    A[n] := x
    n++
  else
    error StackOverflow
Stack::pop() : T
  if n > 0
    n−−
    return A[n]
  else
    error StackUnderflow

Stack::top() : T
  if n > 0
    return A[n − 1]
  else
    error StackUnderflow
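A compact C++ sketch of the same array-based stack (illustrative only; exceptions stand in for the StackOverflow/StackUnderflow signals of the notes):

#include <cstddef>
#include <stdexcept>

template <typename T>
class Stack {
    T*          A;    // element array
    std::size_t cap;  // A.length in the notes: maximal size of the stack
    std::size_t n;    // actual size of the stack
public:
    explicit Stack(std::size_t m) : A(new T[m]), cap(m), n(0) {}
    ~Stack() { delete[] A; }

    bool isFull()  const { return n == cap; }
    bool isEmpty() const { return n == 0; }

    void push(const T& x) {                       // Θ(1)
        if (n == cap) throw std::overflow_error("StackOverflow");
        A[n++] = x;
    }
    T pop() {                                     // Θ(1)
        if (n == 0) throw std::underflow_error("StackUnderflow");
        return A[--n];
    }
    T top() const {                               // Θ(1)
        if (n == 0) throw std::underflow_error("StackUnderflow");
        return A[n - 1];
    }
};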
Example for a simple use of a stack: printing n input items in reversed order.
We suppose that read(x) reads form the current input the next piece of data
and write(x) prints to the current output the value of x. Treverse (n) ∈ Θ(n).
reverse(n : N)
  v : Stack(n)
  x : T
  while n > 0
    read(x)
    v.push(x)
    n := n − 1
  while ¬v.isEmpty()
    write(v.pop())
2.3 Queues
A queue is a FIFO (First-In First-Out) data storage. It can be imagined as
a horizontal sequence of items similar to a queue at the cashier's desk. We
can add a new item to the end of the queue, and we can check or remove the
rst item.
In the following representation the elements of the queue are stored in
Provided that n > 0, Z[k] is the rst element of the queue where n is the
length of the queue. Provided that n = 0, the queue is empty.
T (n) ∈ Θ(1) for each method, because there is neither iteration nor sub-
routine invocation in their code. The time complexities of the constructor
and of the destructor depend on that of the new and delete expressions.
Queue
−Z : T[] // T is some known type
−n : N // n ∈ 0..Z.length is the actual length of the queue
−k : N // k ∈ 0..(Z.length−1) is the starting position of the queue in array Z
+ Queue(m : N){ Z := new T[m] ; n := 0 ; k := 0 } // create an empty queue
+ add(x : T) // join x to the end of the queue
+ rem() : T // remove and return the rst element of the queue
+ rst() : T // return the rst element of the queue
+ length() : N {return n}
+ isFull() : B {return n = Z.length}
+ isEmpty() : B {return n = 0}
+ ∼ Queue() { delete Z }
+ setEmpty() {n := 0} // reinitialize the queue
Queue::add(x : T)
  if n < Z.length
    Z[(k + n) mod Z.length] := x
    n++
  else
    error QueueOverflow
Queue::rem() : T
  if n > 0
    n−−
    i := k
    k := (k + 1) mod Z.length
    return Z[i]
  else
    error QueueUnderflow

Queue::first() : T
  if n > 0
    return Z[k]
  else
    error QueueUnderflow
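For comparison, a minimal C++ sketch of the same circular-buffer queue (illustrative; exceptions stand in for QueueOverflow/QueueUnderflow):

#include <cstddef>
#include <stdexcept>

template <typename T>
class Queue {
    T*          Z;    // element array
    std::size_t cap;  // Z.length in the notes
    std::size_t n;    // actual length of the queue
    std::size_t k;    // index of the first element
public:
    explicit Queue(std::size_t m) : Z(new T[m]), cap(m), n(0), k(0) {}
    ~Queue() { delete[] Z; }

    void add(const T& x) {                         // join x to the end, Θ(1)
        if (n == cap) throw std::overflow_error("QueueOverflow");
        Z[(k + n) % cap] = x;
        ++n;
    }
    T rem() {                                      // remove the first element, Θ(1)
        if (n == 0) throw std::underflow_error("QueueUnderflow");
        T x = Z[k];
        k = (k + 1) % cap;
        --n;
        return x;
    }
    T first() const {                              // return the first element, Θ(1)
        if (n == 0) throw std::underflow_error("QueueUnderflow");
        return Z[k];
    }
    std::size_t length() const { return n; }
};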
We described the methods of stacks and queues with simple codes: we ap-
plied neither iterations nor recursions. Therefore the time complexity of each
method is Θ(1). This is a fundamental requirement for each implementation
of stacks and queues.
Note that with linked list representations this constraint can be guaran-
teed only if M Tnew , M Tdelete ∈ Θ(1).
Considering the above implementations of stacks and queues, the space
complexity of the destructor and of each method is also Θ(1), because they
create no data structure, but they use only a few temporal variables. The
space complexity of the constructor is clearly Θ(m).
3 Algorithms in computing: Insertion Sort
insertionSort(A : T[n])
  for i := 1 to n − 1
    if A[i − 1] > A[i]
      x := A[i]
      A[i] := A[i − 1]
      j := i − 2
      while j ≥ 0 ∧ A[j] > x
        A[j + 1] := A[j]
        j := j − 1
      A[j + 1] := x
    else SKIP
mT_IS(n) = 1 + (n − 1) = n
MT_IS(n) = 1 + (n − 1) + Σ_{i=1}^{n−1} (i − 1) = n + Σ_{j=0}^{n−2} j = n + (n − 1)(n − 2)/2
MT_IS(n) = (1/2) n² − (1/2) n + 1
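For reference, a direct C++ transcription of the insertionSort pseudocode above (an illustrative sketch, using std::vector in place of the notes' T[n]):

#include <cstddef>
#include <vector>

// Insertion sort, following the pseudocode above: stable and in-place.
template <typename T>
void insertionSort(std::vector<T>& A) {
    for (std::size_t i = 1; i < A.size(); ++i) {
        if (A[i - 1] > A[i]) {
            T x = A[i];
            A[i] = A[i - 1];
            std::size_t j = i - 1;             // position of the "hole"
            while (j > 0 && A[j - 1] > x) {    // shift larger elements right
                A[j] = A[j - 1];
                --j;
            }
            A[j] = x;
        }
    }
}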
5 2 7 1 4 6 8 2'
2 5 7 1 4 6 8 2'
2 5 7 1 4 6 8 2'
1 2 5 7 4 6 8 2' (*)
1 2 4 5 7 6 8 2'
1 2 4 5 6 7 8 2'
1 2 4 5 6 7 8 2'
1 2 2' 4 5 6 7 8
(*) detailed:
1 2 5 7 4 6 8 2'
x=4
1 2 5 7 6 8 2'
x=4
1 2 5 7 6 8 2'
x=4
n       mT(n)      in secs     MT(n)      in time
1000    8000       4 ∗ 10⁻⁶    6 ∗ 10⁶    0.003 sec
10⁶     8 ∗ 10⁶    0.004       6 ∗ 10¹²   50 min
10⁷     8 ∗ 10⁷    0.04        6 ∗ 10¹⁴   ≈ 3.5 days
10⁸     8 ∗ 10⁸    0.4         6 ∗ 10¹⁶   ≈ 347 days
10⁹     8 ∗ 10⁹    4           6 ∗ 10¹⁸   ≈ 95 years
This means that in the worst case insertion sort is too slow to sort one million elements, and it becomes completely impractical if we try to sort a huge amount of data. Let us consider the average case:
AT_IS(n) ≈ 1 + (n − 1) + Σ_{i=1}^{n−1} (i − 1)/2 = n + (1/2) ∗ Σ_{j=0}^{n−2} j =
= n + (1/2) ∗ (n − 1)(n − 2)/2 = (1/4) n² + (1/4) n + 1/2
This calculation shows that the expected (average) running time of insertion sort is roughly half of the time needed in the worst case, so even the expected running time of insertion sort is too long to sort one million elements, and it becomes completely impractical if we try to sort a huge amount of data.
The asymptotic time complexities:
Let us notice that the minimum running time is very good. There is no
chance to sort elements faster than in linear time, because each piece of data
must be checked. One may say that this best case is not much gain, because
in this case the items are already sorted. The fact is, that if the input is
nearly sorted, we can remain close to the best case, and insertion sort turns
out to be the best to sort such data.
Insertion sort is also stable: stable sorting algorithms maintain the relative order of records with equal keys (i.e. values). That is, a sorting algorithm is stable if whenever there are two records R and S with the same key and with R appearing before S in the original list, R will appear before S in the sorted list. (For example, see keys 2 and 2' in the insertion sort example above.) Stability is an important property of sorting methods in some applications.
4 Fast sorting algorithms based on the divide
and conquer approach
In computer science, divide and conquer is an algorithm design paradigm
based on multi-branched recursion. A divide and conquer algorithm works
by recursively breaking down a problem into two or more sub-problems of the
same or related type, until these become simple enough to be solved directly.
The solutions to the sub-problems are then combined to give a solution to
the original problem.
This divide and conquer technique is the basis of efficient algorithms for many kinds of problems, for example sorting (e.g., quicksort, mergesort).
5 3 1 6 8 2 3'
5 3 1 6 8 2 3'
5 3 1 6 8 2 3'
3 1 6 8 2 3'
1 3 6 8 2 3'
1 3 5 2 3' 6 8
1 2 3 3' 5 6 8
The divide and conquer paradigm is often used to find the optimal solution of a problem. Its basic idea is to decompose a given problem into two or more similar, but simpler, subproblems, to solve them in turn, and to compose their solutions to solve the given problem. Problems of sufficient simplicity are solved directly. For example, to sort a given list of n keys, split it into two lists of about n/2 keys each, sort each of them in turn, and interleave (i.e. merge) both results appropriately to obtain the sorted version of the given list. (See Figure 2.) This approach is known as the merge sort algorithm.
Merge sort is stable (preserving the input order of items with equal keys), and its worst-case time complexity is asymptotically optimal among comparison sorts. (See section 7 for details.)
mergeSort(A : T[n])
  B : T[n] ; B[0..n) := A[0..n)
  // Using B[0..n) sort A[0..n) non-decreasingly:
  ms(B, A, 0, n)

ms(B, A : T[] ; u, v : N)
  // Initially B[u..v) = A[u..v).
  // Using B[u..v) sort A[u..v) non-decreasingly:
  if v − u > 1
    m := ⌊(u + v)/2⌋
    ms(A, B, u, m)         // Using A[u..m) sort B[u..m)
    ms(A, B, m, v)         // Using A[m..v) sort B[m..v)
    merge(B, A, u, m, v)   // merge B[u..m) and B[m..v) into A[u..v)
  else SKIP
Using m = ⌊(u + v)/2⌋, A[u..m) and A[m..v) have the same length if the length of A[u..v) is an even number, and A[u..m) is shorter by one than A[m..v) if the length of A[u..v) is an odd number, because
length(A[u..m)) = m − u = ⌊(u + v)/2⌋ − u = ⌊(v − u)/2⌋ = ⌊length(A[u..v)) / 2⌋
merge(B, A : T[] ; u, m, v : N)
// sorted merge of B[u..m) and B[m..v) into A[u..v)
k := u // in loop, copy into A[k]
The stability of merge sort is ensured in the explicit loop of merge, because
in case of B[i] = B[j], B[i] is copied into A[k], and B[i] came earlier in the
input.
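Since the body of merge is only sketched above, here is one possible C++ rendering of the described merge step (an illustrative sketch consistent with the stated behaviour; the ≤ comparison is what keeps the sort stable):

#include <cstddef>
#include <vector>

// Merge the sorted runs B[u..m) and B[m..v) into A[u..v).
template <typename T>
void merge(const std::vector<T>& B, std::vector<T>& A,
           std::size_t u, std::size_t m, std::size_t v) {
    std::size_t i = u, j = m, k = u;
    while (i < m && j < v) {
        if (B[i] <= B[j])            // on equal keys the left run wins => stable
            A[k++] = B[i++];
        else
            A[k++] = B[j++];
    }
    while (i < m) A[k++] = B[i++];   // copy the rest of the left run
    while (j < v) A[k++] = B[j++];   // copy the rest of the right run
}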
Merge makes l loop iterations where l = v − u is the length of the actual subarrays A[u..v) and B[u..v). In order to prove this statement, consider the value of k after the explicit loop of merge. Clearly, this loop fills subarray A[u..k), and it copies one element per iteration. Thus it iterates k − u times. In addition, the implicit loop hidden in A[k..v) := . . . iterates v − k times. Therefore the sum of the loop iterations is (k − u) + (v − k) = v − u = l. Consequently, for the body of procedure merge (mb) we have:
T_mb(l) = l (where l = v − u)
called for the whole array A[0..n). Considering all the recursive calls and the corresponding subarrays at a given depth of recursion, at level 1 this array is divided into two halves, at level 2 into 4 parts, and so on. Let n_ij be the length of the j-th subarray of the form A[u..v) at level i, and let m = ⌊log n⌋. We have:
Thus at levels [0..m] these subarrays cover the whole array. At levels [0..m) merge is called for each subarray, but at level m it is called only for those subarrays with length 2, and the number of these subarrays is clearly n − 2^m. Merge makes as many iterations as the length of the actual A[u..v). Consequently at each level in [0..m) merge makes n iterations in all the merge calls of the level altogether. At level m the sum of the iterations is 2 ∗ (n − 2^m), and there is no iteration at level m + 1. Therefore the sum total of all the iterations during the merge calls is
T_mb[0..m](n) = n ∗ m + 2 ∗ (n − 2^m).
The number of procedure calls: Clearly, the ms calls form a strictly binary
tree. (See section 6.2.) The leaves of this tree correspond to the subarrays
with length 1. Thus this strictly binary tree has n leaves and n − 1 internal
nodes. Consequently we have 2n − 1 calls of ms, and n − 1 calls of merge.
Adding to this the single call of mergeSort we receive 3n − 1 procedure calls
altogether.
And there are n iterations hidden into the initial assignment B[0..n) :=
A[0..n) in procedure mergeSort. Thus the number of steps of mergeSort is
4.1.2 The space complexity of merge sort
Considering the above code of merge sort, in the main procedure we create the temporary array B : T[n], and we need a temporary variable for the implicit for loop. Altogether we need Θ(n) working memory here. In addition, we have seen in the previous subsection (4.1.1) that the number of levels of recursion is ⌊log n⌋ + 1, which is approximately log n, and we have a constant amount of temporary variables at each level. Thus we need Θ(log n) working memory inside the call stack in order to control the recursion. And Θ(log n) ≺ Θ(n).
These measures clearly do not depend on the content of the array to be
sorted. For this reason, the space complexity of this array version of merge
sort is S(n) ∈ Θ(n).
4.2 Quicksort
Quicksort is a divide and conquer algorithm. Quicksort rst divides a large
array into two smaller sub-arrays: the low elements and the high elements.
Quicksort can then recursively sort the sub-arrays.
• Partitioning: reorder the array so that all elements with values less than the pivot come before the pivot, while all elements with values greater than the pivot come after it (equal values can go either way). After this partitioning, the pivot is in its final position. This is called the partition operation.
• The base case of the recursion is arrays of size zero or one, which are in order by definition, so they never need to be sorted.
The pivot selection and partitioning steps can be done in several different ways; the choice of specific implementation schemes greatly affects the algorithm's performance.
quicksort(A : T[n])
QS(A, 0, n − 1)
QS(A : T[] ; p, r : N)
  if p < r
    q := partition(A, p, r)
    QS(A, p, q − 1)
    QS(A, q + 1, r)
  else SKIP
partition(A : T[] ; p, r : N) : N
i := random(p, r )
Input:
p r
A: 5 3 8 5 6 4 7 1
Preparations for the first loop:
p r
A: 5 3 8 1 6 4 7 pivot = 5
The first loop searches for the first item greater than the pivot, if there is such an element. (The occurrences of index i reflect its left-to-right movement.)
i=p i i r
A: 5 3 8 1 6 4 7 pivot = 5
We have found an item greater than the pivot. Variable j starts at the next
item.
p i j r
A: 5 3 8 1 6 4 7 pivot = 5
p ≤ i < j ≤ r
A[p..i) ≤ pivot, A[i..j) ≥ pivot, A[j..r) (unknown), A[r] (undefined).
And this is an invariant of the second loop. (Notice that items equal to the pivot may be either in the first or in the second section of A[p..r].)
Starting the run of the second loop, it turns out that A[j] = 1 < pivot must be exchanged with A[i] = 8 ≥ pivot, in order to move A[j] into the first section of A[p..r] (which contains items ≤ the pivot), and to move A[i] (A[i] ≥ pivot) to the end of the second section (which contains items ≥ the pivot). (The items to be exchanged are printed in bold.)
p i j r
A: 5 3 8 1 6 4 7 pivot = 5
Now we exchange A[i] and A[j]. Thus the length of the first section of A[p..r] (containing items ≤ the pivot) has been increased by 1, while its second section (containing items ≥ the pivot) has been moved by one position. So we increment variables i and j by 1, in order to keep the invariant of the loop.
p i j r
A: 5 3 1 8 6 4 7 pivot = 5
And now A[j] = 4 < pivot = 5, so A[j] must be exchanged with A[i]. (The items to be exchanged are printed in bold.)
Again we exchange A[i] and A[j]. Thus the length of the first section of A[p..r] (containing items ≤ the pivot) has been increased by 1, while its second section (containing items ≥ the pivot) has been moved by one position. So we increment variables i and j by 1, in order to keep the invariant of the loop.
p i j r
A: 5 3 1 4 6 8 7 pivot = 5
Now the first two sections cover A[p..r), and the second section is not empty because of the invariant i < j. Thus the pivot can be put in between the items ≤ it and the items ≥ it: first we move the first element (A[i]) of the second section (A[i..j)) to the end of A[p..r], then we put the pivot into A[i]. (The pivot is printed in bold.)
p i j=r
A: 5 3 1 4 5 8 7 6
The time complexity of function partition is linear, because the two loops
together perform r−p−1 or r−p iterations.
There is a big difference between the best-case and worst-case time complexities of quicksort. At each level of recursion, counting all the elements in the subarrays of that level, we have to divide altogether at most n elements. Therefore the time complexity of each level of recursion is O(n).
In a lucky case, at each level of recursion, in the partitioning process, the pivot divides the actual subarray into two parts of approximately equal
lengths, and the depth of recursion will be about log n. Thus the best-
case time complexity of quicksort is O(n log n). It can be proved that it is
Θ(n log n).
In an unlucky case, at each level of recursion, in the partitioning process, the pivot will be the maximum or minimum of the actual subarray. And after partitioning, the shorter partition will be empty, but the longer partition will have all the elements, except for the pivot. Considering always the longer subarray: at level zero, we have n elements; at level one, n − 1 elements; . . . at level i, n − i elements (i ∈ 0..(n−1)). As a result, in this case, we have n levels of recursion, and we need approximately n − i steps only for partitioning at level i. Consequently the worst-case time complexity of quicksort is Ω(n²). It can be proved that it is Θ(n²).
Fortunately, the probability of the worst case is extremely low, and the average case is much closer to the best case. As a result, the average-case time complexity of quicksort is Θ(n log n).
Considering its space complexity, quicksort does not use any temporary data structure. Thus the memory needs are determined by the number of levels of recursion. As we have seen before, in the worst case it is n, and in the best case it is about log n. Fortunately, in the average case it is Θ(log n). Consequently the average-case space complexity of quicksort is Θ(log n).
It is known that on short arrays insertion sort is more efficient than the fast sorting methods (merge sort, heap sort, quicksort). Thus procedure QS(A, p, r) can be optimized by switching to insertion sort on short subarrays. Because the recursive calls produce a lot of short subarrays, the speed-up can be significant, although it changes neither the time nor the space complexity.
QS(A : T[] ; p, r : N)
  if p + c < r
    q := partition(A, p, r)
    QS(A, p, q − 1)
    QS(A, q + 1, r)
  else
    insertionSort(A, p, r)
c∈N is a constant. Its optimal value depends on many factors, but usually
it is between 20 and 50.
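To illustrate the cutoff idea in runnable C++ (a sketch only; it uses a simple Lomuto-style partition with a random pivot rather than the exact two-loop partition of the notes, and the cutoff constant C is an assumed value from the suggested 20..50 range):

#include <algorithm>
#include <cstddef>
#include <cstdlib>
#include <vector>

const std::size_t C = 30;  // assumed cutoff value

// Insertion sort on A[p..r] (inclusive bounds).
template <typename T>
void insertionSort(std::vector<T>& A, std::size_t p, std::size_t r) {
    for (std::size_t i = p + 1; i <= r; ++i) {
        T x = A[i];
        std::size_t j = i;
        while (j > p && A[j - 1] > x) { A[j] = A[j - 1]; --j; }
        A[j] = x;
    }
}

// Lomuto partition with a random pivot; returns the pivot's final index.
template <typename T>
std::size_t partition(std::vector<T>& A, std::size_t p, std::size_t r) {
    std::size_t piv = p + std::rand() % (r - p + 1);
    std::swap(A[piv], A[r]);
    T x = A[r];
    std::size_t i = p;
    for (std::size_t j = p; j < r; ++j)
        if (A[j] < x) std::swap(A[i++], A[j]);
    std::swap(A[i], A[r]);
    return i;
}

// Quicksort that switches to insertion sort on short subarrays.
template <typename T>
void QS(std::vector<T>& A, std::size_t p, std::size_t r) {
    if (p + C < r) {
        std::size_t q = partition(A, p, r);
        if (q > p) QS(A, p, q - 1);   // guard against unsigned wrap-around
        QS(A, q + 1, r);
    } else {
        insertionSort(A, p, r);
    }
}

template <typename T>
void quicksort(std::vector<T>& A) {
    if (A.size() > 1) QS(A, 0, A.size() - 1);
}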
5 Linked Lists
One-way lists can represent finite sequences of data. They consist of zero or more elements (i.e. nodes), where each element of the list contains some data and linking information.
E1
+key : T
. . . // satellite data may come here
+next : E1*
+E1() { next := }
L1 =
L1 9 16 4 1
L1 25 9 16 4 1
L1 25 9 16 4
S1L_length(L : E1*) : N
  n := 0
  p := L
  while p ≠ ∅
    n := n + 1
    p := p → next
  return n

If n is the length of list L, then the loop iterates n times, so T_S1L_length(n) = 1 + n, therefore T_S1L_length(n) ∈ Θ(n).
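The same S1L length computation in C++ (an illustrative sketch; the node type mirrors E1 above, with nullptr playing the role of the empty list):

#include <cstddef>

// One-way (singly linked) list element, mirroring E1 of the notes.
template <typename T>
struct E1 {
    T   key;
    E1* next = nullptr;
};

// Length of a simple one-way list: Θ(n) for a list of n elements.
template <typename T>
std::size_t S1L_length(const E1<T>* L) {
    std::size_t n = 0;
    for (const E1<T>* p = L; p != nullptr; p = p->next)
        ++n;
    return n;
}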
L2
L2 4 8 2
L2 17 4 8 2
L2 17 8 2
H1L_length(H : E1*) : N
  return S1L_length(H → next)

T_H1L_length(n) ∈ Θ(n), where n is the length of the H1L H.
Such a list is ideal for representing a queue. Its method add(x : T ) copies
x into the key of the trailer node, and joins a new trailer node to the end of
the list.
q := p → next v → key := x
p → next := // satellite data may be read here
return q return H
5.1.5 Insertion sort of H1Ls
H1L_insertionSort(H : E1*)
s := H → next
A s 6=
u := s → next
u 6=
A s → key ≤ u → key
s → next := u → next
p := H ; q := H → next SKIP
s := u q → key ≤ u → key
p := q ; q := q → next
u → next := q ; p → next := u
u := s → next
mT_IS(n) ∈ Θ(n) ∧ AT_IS(n), MT_IS(n) ∈ Θ(n²) ∧ S_IS(n) ∈ Θ(1)
A n>1
n1 := b n2 c
mergeSort(&L : E1*)
L2 := cut(L, n1)
merge(L1, L2 : E1*) : E1*
A L1 → key ≤ L2 → key
L := t := L1 L := t := L2
L1 := L1 → next L2 := L2 → next
L1 6= ∧ L2 6=
A L1 → key ≤ L2 → key
t := t → next := L1 t := t → next := L2
L1 := L1 → next L2 := L2 → next
A L1 6=
t → next := L1 t → next := L2
return L
pointer referring to the front of the list identifies the list.
L1 9 16 4 1
L1 25 9 16 4 1
L1 25 9 16 1
E2
+prev, next : E2* // refer to the previous and next neighbour or be this
+key : T
+ E2() { prev := next := this }
L2
L2 9 16 4 1
L2 25 9 16 4 1
L2 25 9 16 4
Some basic operations on C2Ls follow. Notice that these are simpler than
the appropriate operations of S2Ls. Clearly, T, S ∈ Θ(1) for each of them.
precede(q, r : E2*)
  // (∗q) will precede (∗r)
  p := r → prev
  q → prev := p ; q → next := r
  p → next := r → prev := q

follow(p, q : E2*)
  // (∗q) will follow (∗p)
  r := p → next
  q → prev := p ; q → next := r
  p → next := r → prev := q

unlink(q : E2*)
  // remove (∗q)
  p := q → prev ; r := q → next
  p → next := r ; r → prev := p
  q → prev := q → next := q
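The three C2L operations above translate almost literally to C++ (a sketch; E2 mirrors the node type of the notes, and a fresh node forms a one-element cycle):

// Two-way (doubly linked) cyclic list element, mirroring E2 of the notes.
template <typename T>
struct E2 {
    E2* prev = this;
    E2* next = this;
    T   key{};
};

template <typename T>
void precede(E2<T>* q, E2<T>* r) {      // (*q) will precede (*r)
    E2<T>* p = r->prev;
    q->prev = p; q->next = r;
    p->next = q; r->prev = q;
}

template <typename T>
void follow(E2<T>* p, E2<T>* q) {       // (*q) will follow (*p)
    E2<T>* r = p->next;
    q->prev = p; q->next = r;
    p->next = q; r->prev = q;
}

template <typename T>
void unlink(E2<T>* q) {                 // remove (*q) from its list
    E2<T>* p = q->prev; E2<T>* r = q->next;
    p->next = r; r->prev = p;
    q->prev = q->next = q;
}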
5.2.3 Example programs on C2Ls
We can imagine a C2L straightened, for example
H → [/][5][2][7][/] ← H representing ⟨5; 2; 7⟩,
or empty: H → [/][/] ← H representing ⟨⟩.
C2L_read(&H : E2*) setEmpty(H : E2*)
H := new E2 p := H → prev
read(x) p 6= H
p := new E2 unlink(p)
p → key := x delete p
precede(p, H ) p := H → prev
H → [/][/] ← H
H → [/][5][/] ← H
H → [/][5][2][/] ← H
H → [/][5][2][7][/] ← H
u 6= H
length(H : E2*) : N
A s → key ≤ u → key
unlink(u) n := 0
p := s → prev p := H → next
s := u p 6= H ∧ p → key > u → key p 6= H
p := p → prev n := n + 1
follow(p, u) p := p → next
u := s → next return n
H → [/][5][2][7][2'][/] ← H
H → [/][2][5][7][2'][/] ← H
H → [/][2][5][7][2'][/] ← H
H → [/][2][2'][5][7][/] ← H
mT_IS(n) ∈ Θ(n); AT_IS(n), MT_IS(n) ∈ Θ(n²) ∧ S_IS(n) ∈ Θ(1)
where n is the length of C2L H. Clearly procedure insertionSort(H : E2*)
is stable.
We work with the following invariant, where
key(q, H) = q → key, if q ≠ H
key(q, H) = ∞, if q = H
(H, q) is the sublist of the items between H and q,
[q, H) is the sublist of the items starting with q but before H.
q
Hu → [/][1][2][4][6][/] ← Hu
Hi → [/][4][5][8][9][/] ← Hi
r
q
Hu → [/][1][2][4][6][/] ← Hu
Hi → [/][4][5][8][9][/] ← Hi
r
q
Hu → [/][1][2][4][5][6][/] ← Hu
Hi → [/][4][8][9][/] ← Hi
r
q
Hu → [/][1][2][4][5][6][/] ← Hu
Hi → [/][4][8][9][/] ← Hi
r
q
Hu → [/][1][2][4][5][6][8][/] ← Hu
Hi → [/][4][9][/] ← Hi
r
q
Hu → [/][1][2][4][5][6][8][9][/] ← Hu
Hi → [/][4][/] ← Hi
r
unionIntersection(Hu , Hi : E2∗)
q := Hu → next ; r := Hi → next
q 6= Hu ∧ r 6= Hi
A q → key < r → key A q → key > r → key A q → key = r → key
p := r
r := r → next q := q → next
q := q → next
unlink(p) r := r → next
precede(p, q )
r 6= Hi
p := r ; r := r → next ; unlink(p)
precede(p, Hu )
Illustration of the partition part of QS(H, H ) on C2L H below:
(Partition of the sublist strictly between p and s, i.e. partition of sublist
(p, s). The pivot is ∗q, r goes on sublist (q, s), elements smaller than the
pivot are moved before the pivot.)
p s
H → [/][5][2][4][6][5][2'][/] ← H
q r
p s
H → [/][2][5][4][6][5][2'][/] ← H
q r
p s
H → [/][2][4][5][6][5][2'][/] ← H
q r
p s
H → [/][2][4][5][6][5][2'][/] ← H
q r
p s
H → [/][2][4][5][6][5][2'][/] ← H
q r
p s
H → [/][2][4][2'][5][6][5][/] ← H
q r
6 Trees, binary trees
Up till now, we have worked with arrays and linked lists. They have a common property: there is a first item in the data structure, and each element has exactly one item next to it, except for the last element, which has no successor, according to the following scheme: O O O O O.
Therefore arrays and linked lists are called linear data structures. (Although in the case of cyclic lists the linear data structure is circular.) Thus arrays and linked lists can be considered representations of linear graphs.
The size of a tree is the number of nodes in the tree. The size of tree t is denoted by n(t) or |t| (n(t) = |t|). The number of internal nodes of t is i(t). The number of leaves of t is l(t). Clearly n(t) = i(t) + l(t) for a tree. For example, n(∅) = 0 for the empty tree, and considering figure 8, n(t1) = i(t1) + l(t1) = 1 + 2 = 3,
2 In
this chapter the trees are always rooted, directed trees. We do not consider free trees
here which are undirected connected graphs without cycles.
t1 t2 t3 t4
6 level 0
3 7 level 1
height = 3
2 5 8 level 2
1 4 level 3
Figure 8: Simple binary trees. The nodes of the trees are represented by
circles. If the structure of a subtree is not important or unknown, we denote
it with a triangle.
We can speak of the levels of a nonempty tree. The root is at level zero
(which is the topmost level), its children are at level one, its grandchildren
at level two and so on. Given a node at level i, its children are at level i+1
(always in downwards direction).
The height of a nonempty tree is equal to the level of the leaves at its lowest level. The height of the empty tree is −1. The height of tree t is denoted by h(t). For example, h(∅) = −1, and considering figure 8, h(t1) = 1, h(t2) = 1, h(t3) = 0, and h(t4) = 1 + max(h(t4 → left), 0), where t4 → left is the unknown left subtree of t4.
All the hierarchical structures can be modeled by trees, for example the
directory hierarchy of a computer. Typically, each node of a tree is labeled
by some key , and maybe with other data.
6.2 Binary trees
Binary trees are useful, for example for representing sets and multisets (i.e.
bags) like dictionaries and priority queues.
A binary tree is a tree where each internal (i.e. non-leaf) node has at most two children. If a node has two children, they are the left child and the right child of their parent. If a node has exactly one child, then this child is the left or the right child of its parent. This distinction is important.
If t is a nonempty binary tree (i.e. t ≠ ∅), then ∗t is the root node of the tree, t → key is the key labeling ∗t, t → left is the left subtree of t, and t → right is the right subtree of t. If ∗t does not have a left child, t → left = ∅, i.e. the left subtree of t is empty. Similarly, if ∗t does not have a right child, t → right = ∅, i.e. the right subtree of t is empty.
If ∗p is a node of a binary tree, p is the (sub)tree rooted at ∗p; thus p → key is its key, p → left is its left subtree, and p → right is its right subtree. If ∗p has a left child, it is ∗p → left. If ∗p has a right child, it is ∗p → right. If ∗p has a parent, it is ∗p → parent. The tree rooted at the parent is p → parent. (Notice that the infix operator → binds stronger than the prefix operator ∗. For example, ∗p → left = ∗(p → left).) If ∗p does not have a parent, p → parent = ∅.
If p = ∅, all the expressions ∗p, p → key, p → left, p → right, p → parent etc. are erroneous.
Properties 6.1
h(∅) = −1
t ≠ ∅ binary tree =⇒ h(t) = 1 + max(h(t → left), h(t → right))
A strictly binary tree is a binary tree where each internal node has two children. The following property can be proved easily by induction on i(t).
A complete binary tree is a strictly binary tree where all the leaves are at the same level. It follows that in a complete nonempty binary tree of height h we have 2⁰ nodes at level 0, 2¹ nodes at level 1, and 2^i nodes at level i (0 ≤ i ≤ h). Therefore we get the following property.
A nonempty nearly complete binary tree is a binary tree which becomes complete if we delete its lowest level. The empty tree is also a nearly complete binary tree. (An equivalent definition: a nearly complete binary tree is a binary tree which can be obtained from a complete binary tree by deleting zero or more nodes from its lowest level.) Notice that the complete binary trees are also nearly complete, according to this definition.
Because a nonempty nearly complete binary tree is complete, possibly except at its lowest level h, it has 2^h − 1 nodes at levels 0..(h−1), and at least 1 node and at most 2^h nodes at its lowest level. Thus n ≥ 2^h − 1 + 1 ∧ n ≤ 2^h − 1 + 2^h, where n is the size of the tree.
Notice that there are binary trees with h = ⌊log n⌋, although they are not nearly complete.
Node
+ key : T // T is some known type
+ left, right : Node*
+ Node() { left := right := ∅ } // generate a tree of a single node
+ Node(x : T) { left := right := ∅ ; key := x }
Figure 9: Binary tree with parent pointers. (We omitted the keys of the
nodes here.)
Node3
+ key : T // T is some known type
+ left, right, parent : Node3*
+ Node3(p : Node3*) { left := right := ∅ ; parent := p }
+ Node3(x : T, p : Node3*) { left := right := ∅ ; parent := p ; key := x }
6.4 Binary tree traversals
preorder(t : Node*)
  if t ≠ ∅
    process(t)
    preorder(t → left)
    preorder(t → right)
  else SKIP

inorder(t : Node*)
  if t ≠ ∅
    inorder(t → left)
    process(t)
    inorder(t → right)
  else SKIP

postorder(t : Node*)
  if t ≠ ∅
    postorder(t → left)
    postorder(t → right)
    process(t)
  else SKIP

levelOrder(t : Node*)
  if t ≠ ∅
    Q : Queue ; Q.add(t)
    while ¬Q.isEmpty()
      s := Q.rem()
      process(s)
      if s → left ≠ ∅ then Q.add(s → left)
      if s → right ≠ ∅ then Q.add(s → right)
  else SKIP
Tpreorder (n), Tinorder (n), Tpostorder (n), TlevelOrder (n) ∈ Θ(n) where n = n(t) and
the time complexity of process(t) is Θ(1). For example, process(t) can print
t → key :
process(t)
cout << t → key << ` `
We can slightly reduce the running times of the preorder and inorder traversals if we eliminate the tail recursions. (It does not affect the time complexities.)

preorder(t : Node*)
  while t ≠ ∅
    process(t)
    preorder(t → left)
    t := t → right

inorder(t : Node*)
  while t ≠ ∅
    inorder(t → left)
    process(t)
    t := t → right
t t
1 4
2 5 2 6
3 4 6 7 1 3 5 7
t t
7 1
3 6 2 3
1 2 4 5 4 5 6 7
Figure 10: left upper part: preorder, right upper part: inorder, left lower
part: postorder, right lower part: level order traversal of binary tree t.
Postorder:
h(t : Node*) : Z
  if t ≠ ∅
    return 1 + max(h(t → left), h(t → right))
  else
    return −1
Preorder:
preorder_h(t : Node* ; level, &max : Z)
t 6=
level := level + 1
h(t : Node*) : Z
A level > max
max := level := −1 max := level SKIP
6.4.2 Using parent pointers
inorder_next(p : Node3*) : Node3*
q := p → right
A q 6=
q → lef t 6= q := p → parent
q 6= ∧ q → lef t 6= p
q := q → lef t
p := q ; q := q → parent
return q
Exercise 6.6 What can we say about the space complexities of the different binary tree traversals and other subroutines above?
The empty tree is represented by the empty string. We can use different kinds of parentheses for easier reading. For example, see Figure 11.
Figure 11: The same binary tree in graphical and textual representations.
Notice that omitting the parentheses from the textual form of a tree, we obtain the inorder traversal of that tree.
6.6 Binary search trees
Binary search trees (BST) are a particular type of containers: data struc-
tures that store "items" (such as numbers, names etc.) in memory. They
allow fast lookup, addition and removal of items, and can be used to im-
plement either dynamic sets of items, or lookup tables that allow finding an item by its key (e.g., finding the phone number of a person by name).
Binary search trees keep their keys in sorted order, so that lookup and
other operations can use the principle of binary search: when looking for a
key in a tree (or a place to insert a new key), they traverse the tree from
root to leaf, making comparisons to keys stored in the nodes of the tree and
deciding, on the basis of the comparison, to continue searching in the left
or right subtrees. On average, this means that each comparison allows the
operations to skip about half of the tree, so that each lookup, insertion or
deletion takes time proportional to the logarithm of the number of items
stored in the tree. This is much better than the linear time required to find items by key in an (unsorted) array, but slower than the corresponding operations on hash tables.
(The source of the text above is Wikipedia.)
A binary search tree (BST) is a binary tree whose nodes each store a key (and optionally, some associated value). The tree additionally satisfies the binary search tree property, which states that the key in each node must be greater than any key stored in the left sub-tree, and less than any key stored in the right sub-tree. (See Figure 12. Notice that there is no duplicated key in a BST.)
h 6
d l 3 7
b f j n 1 4 8
a c e g i 2
inorderPrint(t : Node*)
  if t ≠ ∅
    inorderPrint(t → left)
    write(t → key)
    inorderPrint(t → right)
search(t : Node* ; k : T) : Node*
  while t ≠ ∅ ∧ t → key ≠ k
    if k < t → key
      t := t → left
    else
      t := t → right
  return t
insert(&t : Node* ; k : T)
  // See Figure 13.
  if t = ∅
    t := new Node(k)
  else if k < t → key
    insert(t → left, k)
  else if k > t → key
    insert(t → right, k)
  else SKIP
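A C++ sketch of the BST node together with the search and insert operations above (illustrative only; insert takes a pointer reference to mirror the &t parameter of the pseudocode):

// Binary search tree node and two basic operations, mirroring the pseudocode above.
template <typename T>
struct Node {
    T     key;
    Node* left  = nullptr;
    Node* right = nullptr;
    explicit Node(const T& k) : key(k) {}
};

// Iterative search: returns the node with key k, or nullptr if k is not present.
template <typename T>
Node<T>* search(Node<T>* t, const T& k) {
    while (t != nullptr && t->key != k)
        t = (k < t->key) ? t->left : t->right;
    return t;
}

// Recursive insert into the (sub)tree referred to by t; duplicate keys are ignored.
template <typename T>
void insert(Node<T>*& t, const T& k) {
    if (t == nullptr)      t = new Node<T>(k);
    else if (k < t->key)   insert(t->left, k);
    else if (k > t->key)   insert(t->right, k);
    // k == t->key : SKIP (no duplicated keys in a BST)
}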
min(t : Node*) : Node*
  while t → left ≠ ∅
    t := t → left
  return t

remMin(&t, &minp : Node*)
  // See Figure 14.
  if t → left = ∅
    minp := t
    t := minp → right
    minp → right := ∅
  else
    remMin(t → left, minp)
6 6
3 7 3 7
1 4 8 1 4 8
2 2 5
Figure 13: Inserting 5 into the BST on the left gives the BST on the right:
First we compare 5 with the key of the root, which is 6, and 5 < 6. Thus
we insert 5 into the left subtree. Next we compare 5 with 3, and 5 > 3,
so we insert 5 into the right subtree of node 3, and again 5 > 4, therefore
we insert 5 into the right subtree of node 4. The right subtree of this node
is empty. Consequently we create a new node with key 5, and we substitute
that empty right subtree with this new subtree consisting of this single new
node.
del(&t : Node* ; k : T)
  // See Figure 16.
  if t ≠ ∅
    if k < t → key
      del(t → left, k)
    else if k > t → key
      del(t → right, k)
    else
      delRoot(t)
  else SKIP

delRoot(&t : Node*)
  // See Figure 17.
Exercise 6.8 What can we say about the space complexities of the different BST operations above?
6 6
3 7 3 7
1 4 8 2 4 8
Figure 14: Removing the node with minimal key from the BST on the left
gives the BST on the right: The leftmost node of the original BST is substi-
tuted by its right subtree. Notice that deleting the node with key 1 from
the BST on the left gives the same BST as result: The left subtree of this
node is empty. Thus deleting it means substituting it with its right subtree.
4 4
3 6 3 6
2 5 9 2 5 7
Figure 15: Removing the node with maximal key from the BST on the
left gives the BST on the right: The rightmost node of the original BST is
substituted by its left subtree. Notice that deleting the node with key 9
from the BST on the left gives the same BST as result: The right subtree
of this node is empty. Thus deleting it means substituting it with its left
subtree.
4 4
3 6 3 7
2 5 9 2 5 9
Figure 16: Deleting node 6 of the BST on the left gives the BST on the
right: Node 6 has two children. Thus we remove the minimal node of its
right subtree and substitute node 6 with it.
Let us suppose that we use the first n elements of an array (indexed from zero), for example A : T[m], i.e. we use the subarray A[0..n). Then the root node of a nonempty tree is A[0]. Node A[i] is an internal node iff left(i) < n. Then left(i) = 2i + 1. The right child of node A[i] exists iff right(i) < n. Then right(i) = 2i + 2. Node A[j] is not the root node of the tree iff j > 0. Then parent(j) = ⌊(j − 1)/2⌋.
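In code, this arithmetic representation reduces to three index functions (a small C++ sketch):

#include <cstddef>

// Index arithmetic of the level-continuous (array) representation.
inline std::size_t left  (std::size_t i) { return 2 * i + 1; }   // exists iff left(i)  < n
inline std::size_t right (std::size_t i) { return 2 * i + 2; }   // exists iff right(i) < n
inline std::size_t parent(std::size_t j) { return (j - 1) / 2; } // defined iff j > 0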
4 There are also min priority queues where we can check or remove a minimal item.
4 5
3 7 3 7
2 5 9 2 6 9
Figure 17: Deleting the root node of the BST on the left gives the BST on
the right: The root node has two children. Thus we remove the minimal
node of its right subtree and substitute the root node with it. (Notice that
we could remove the maximal node of the left subtree and substitute the root
node with it.)
6 7
6 5 3 4
4 1 3 5 2
T→ some number type, where T is the element type of the priority queue,
and the items are compared according to their priorities. For example, the
processes of a computer are often scheduled according to their priorities.
Our representation is similar to our representation of the Stack type,
and some operations are also the same, except of the names. The actual
elements of the priority queue are in the subarray A[0..n) containing a (binary
maximum) heap.
A[0]
A[1] A[2]
PrQueue
− A : T[] // T is some known type
− n : N // n ∈ 0..A.length is the actual length of the priority queue
+ PrQueue(m : N){ A := new T[m]; n := 0 } // create an empty priority queue
+ add(x : T ) // insert x into the priority queue
+ remMax():T // remove and return the maximal element of the priority queue
+ max():T // return the maximal element of the priority queue
+ isFull() : B {return n = A.length}
+ isEmpty() : B {return n = 0}
+ ∼ PrQueue() { delete A }
+ setEmpty() {n := 0} // reinitialize the priority queue
PrQueue::add(x : T)
  if n < A.length
    j := n ; n := n + 1
    A[j] := x ; i := parent(j)
    while j > 0 ∧ A[i] < A[j]
      swap(A[i], A[j])
      j := i ; i := parent(i)
  else
    error PrQueueOverflow
PrQueue::max() : T
  if n > 0
    return A[0]
  else
    error PrQueueUnderflow
PrQueue::remMax() : T
  if n > 0
    max := A[0]
    n := n − 1 ; A[0] := A[n]
    sink(A, 0, n)
    return max
  else
    error PrQueueUnderflow
sink(A : T[] ; k, n : N)
  i := k ; j := left(k) ; b := true
  while j < n ∧ b
    // A[j] is the left child of A[i]
    if j + 1 < n ∧ A[j + 1] > A[j]
      j := j + 1
    // now A[j] is the greater child of A[i]
    if A[j] > A[i]
      swap(A[i], A[j])
      i := j ; j := left(j)
    else
      b := false
Exercise 6.9 What can we say about the space complexities of the different heap and priority queue operations above?
op 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14
- 8 6 7 6 5 3 4 4 1 3 5 2
add(8) 8 6 7 6 5 *3 4 4 1 3 5 2 #8
. .. 8 6 *7 6 5 #8 4 4 1 3 5 2 3
. .. *8 6 #8 6 5 7 4 4 1 3 5 2 3
. 8 6 8 6 5 7 4 4 1 3 5 2 3
add(2) 8 6 8 6 5 7 *4 4 1 3 5 2 3 #2
. 8 6 8 6 5 7 4 4 1 3 5 2 3 2
add(9) 8 6 8 6 5 7 *4 4 1 3 5 2 3 2 #9
. .. 8 6 *8 6 5 7 #9 4 1 3 5 2 3 2 4
. .. *8 6 #9 6 5 7 8 4 1 3 5 2 3 2 4
. 9 6 8 6 5 7 8 4 1 3 5 2 3 2 4
remMax() ∼9 6 8 6 5 7 8 4 1 3 5 2 3 2 ∼4
max := 9 *4 6 #8 6 5 7 8 4 1 3 5 2 3 2
. .. 8 6 *4 6 5 7 #8 4 1 3 5 2 3 2
. .. 8 6 8 6 5 7 *4 4 1 3 5 2 3 #2
return 9 8 6 8 6 5 7 4 4 1 3 5 2 3 2
remMax() ∼8 6 8 6 5 7 4 4 1 3 5 2 3 ∼2
max := 8 *2 6 #8 6 5 7 4 4 1 3 5 2 3
. .. 8 6 *2 6 5 #7 4 4 1 3 5 2 3
. .. 8 6 7 6 5 *2 4 4 1 3 5 2 #3
return 8 8 6 7 6 5 3 4 4 1 3 5 2 2
Figure 20: Changes of A : Z[15] while applying add() and remMax() operations to it. At the add() operations, the # prefix identifies the actual element, and ∗ is the prefix of its parent. Similarly, at sinking, ∗ denotes the parent, and # is the prefix of its greatest child. At remMax(), ∼ is the prefix both of the maximum to be removed and of the key to be moved into its place.
6.10 Heap sort
In heap sort, first we take the array to be sorted and build a heap from it. We consider the array as a level-continuous binary tree.
While building the heap we consider the level order of the tree, and we start from the last internal node of it. We go back in level order and sink the root node of each subtree, making a heap out of it. (See Figure 21.)
buildMaxHeap(A : T[n])
k := parent(n − 1) downto 0
sink(A, k, n)
When the heap is ready, we swap its first and last items. Then the last element is a maximal element of the array. We cut it from the tree. Next we sink the root of the tree, and we receive a heap again. Again we swap, cut and sink, and we have the two largest elements at the end of the array and a heap without these elements before them. We repeat this process of swap, cut and sink until we have only a single element left in the heap, which is a minimum of the original array. And the whole array is sorted at this moment. (See Figure 22.)
heapSort(A : T[n])
buildMaxHeap(A)
m := n
m>1
m := m − 1 ; swap(A[0], A[m])
sink(A, 0, m)
The time complexity of each sinking is O(log n), because h ≤ ⌊log n⌋ for each subtree. Thus MT_buildMaxHeap(n) ∈ O(n log n), and the time complexity of the main loop of heapSort() is also O(n log n). Consequently MT_heapSort(n) ∈ O(n log n).
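Putting the pieces together in C++ (an illustrative sketch of sink, buildMaxHeap and heapSort, following the pseudocode above):

#include <algorithm>
#include <cstddef>
#include <vector>

// Sink A[k] within the heap A[0..n): swap it downwards with its greater child
// until the heap property holds again.
template <typename T>
void sink(std::vector<T>& A, std::size_t k, std::size_t n) {
    std::size_t i = k, j = 2 * k + 1;           // j = left(k)
    while (j < n) {
        if (j + 1 < n && A[j + 1] > A[j]) ++j;  // j = index of the greater child
        if (A[j] > A[i]) {
            std::swap(A[i], A[j]);
            i = j; j = 2 * i + 1;
        } else break;
    }
}

template <typename T>
void buildMaxHeap(std::vector<T>& A) {
    std::size_t n = A.size();
    if (n < 2) return;
    for (std::size_t k = (n - 2) / 2 + 1; k-- > 0; )  // k = parent(n-1) downto 0
        sink(A, k, n);
}

template <typename T>
void heapSort(std::vector<T>& A) {
    buildMaxHeap(A);
    for (std::size_t m = A.size(); m > 1; ) {
        --m;
        std::swap(A[0], A[m]);   // move the current maximum to its final place
        sink(A, 0, m);           // restore the heap on A[0..m)
    }
}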
t t
18 18
23 51 23 51
42 14 35 97 42 19 35 97
53 60 19 53 60 14
t t
18 18
23 51 23 97
60 19 35 97 60 19 35 51
53 42 14 53 42 14
t t
18 18
60 97 60 97
23 19 35 51 53 19 35 51
53 42 14 23 42 14
t t
97 97
60 18 60 51
53 19 35 51 53 19 35 18
23 42 14 23 42 14
Figure 21: From array ⟨18, 23, 51, 42, 14, 35, 97, 53, 60, 19⟩ we build a maximum heap. Notice that we work on the array from the beginning to the end. The binary trees show the logical structure of the array. Finally we receive array ⟨97, 60, 51, 53, 19, 35, 18, 23, 42, 14⟩.
op 0 1 2 3 4 5 6 7 8 9
7 Lower bounds for sorting
Proof. Clearly we have to check all the n items and only a limited number of
items is checked in a subroutine call or loop iteration (without the embedded
subroutine calls and loop iterations). Let this limit be k . Thus
mT(n) ∗ k ≥ n =⇒ mT(n) ≥ (1/k) ∗ n =⇒ mT(n) ∈ Ω(n).
The sorting algorithms we have studied before, that is insertion sort, heap
sort, merge sort, and quick sort are comparison sorts.
In this section, we assume without loss of generality that all the input elements are distinct.⁵ Given this assumption, comparisons of the form ai = aj and ai ≠ aj are useless, so we can assume that no comparisons of this form are made.⁶ We also note that the comparisons ai < aj, ai ≤ aj, ai ≥ aj, and ai > aj are all equivalent in that they yield identical information about the relative order of ai and aj. We therefore assume that all comparisons have the form ai ≤ aj. [1]
5 In
this way we restrict the set of possible inputs and we are going to give a lower bound
for the worst case of comparison sorts. Thus, if we give a lower bound for the maximum
number of key comparisons (M C(n)) and for maximum time complexity (M T (n)) on this
restricted set of input sequences, it is also a lower bound for them on the whole set of input
sequences, because M C(n) and M T (n) are surely ≥ on a larger set than on a smaller set.
6 Anyway, if such comparisons are made, neither M C(n) nor M T (n) are decreased.
a≤b
a≤b b<a
b≤c a≤c
b≤c c<b a≤c c<a
Figure 23: The decision tree for insertion sort operating on the input sequence
ha, b, ci. There are 3! = 6 permutations of the 3 input items. Thus the
decision tree must have at least 3! = 6 leaves.
Proof. From the preceding discussion, it suffices to determine the height h = MC(n) of a decision tree in which each permutation appears as a reachable leaf. Consider a decision tree of height h with l leaves corresponding to a comparison sort on n elements. Because each of the n! permutations of the input appears as some leaf, we have n! ≤ l. Since a binary tree of height h has no more than 2^h leaves, we have n! ≤ l ≤ 2^h. [1]
Consequently
MC(n) = h ≥ log n! = Σ_{i=1}^{n} log i ≥ Σ_{i=⌈n/2⌉}^{n} log i ≥ Σ_{i=⌈n/2⌉}^{n} log⌈n/2⌉ ≥ ⌈n/2⌉ ∗ log⌈n/2⌉ ≥
≥ (n/2) ∗ log(n/2) = (n/2) ∗ (log n − log 2) = (n/2) ∗ (log n − 1) = (n/2) log n − n/2 ∈ Ω(n log n)
Theorem 7.4 For any comparison sort algorithm M T (n) ∈ Ω(n log n).
Let us notice that heap sort and merge sort are asymptotically optimal in the sense that their MT(n) ∈ O(n log n) asymptotic upper bound meets the MT(n) ∈ Ω(n log n) asymptotic lower bound from Theorem 7.4. This proves that MT(n) ∈ Θ(n log n) for both of them.
8 Sorting in Linear Time
radix_sort(A : dDigitNumber⟨⟩ ; d : N)
  for i := 1 to d
    use a stable sort to sort list A on digit i
Here comes a bit more general study than needed for radix sort. When
distributing sort is used for radix sort, its key function ϕ must select the
appropriate digit.
The sorting problem: Given abstract list L of length n with element type
T , r ∈ O(n) positive integer,
ϕ : T → 0..(r−1) key selection function.
Let us sort list L with stable sort, and with linear time complexity.
distributing_sort(L : T⟨⟩ ; r : N ; ϕ : T → 0..(r−1))
  B : T⟨⟩[r] // array of lists, i.e. bins
  for k := 0 to r−1
    Let B[k] be an empty list // init the array of bins
  while L is not empty
    Remove the first element x of list L
    Append x to the end of list B[ϕ(x)]
  for k := r−1 downto 0
    L := B[k] + L // append B[k] before L
Efficiency of distributing sort: the first and the last loop iterate r times, and the middle loop n times. Provided that insertion and concatenation can be performed in Θ(1) time, the time complexity of distributing sort is consequently Θ(n + r) = Θ(n) because of the natural condition r ∈ O(n). Its space complexity is determined by the size of array B. Thus it is Θ(r).
B3 = ⟨103, 013⟩
L = ⟨111, 211, 232, 002, 012, 103, 013⟩
In order for radix sort to work correctly, the digit sorts must be stable. Distributing sort satisfies this requirement. If distributing sort runs in linear time (Θ(n), where n is the length of the input list), and the number of digits is a constant d, radix sort also runs in linear time: Θ(d ∗ n) = Θ(n).
Provided that we have to sort linked lists where the keys of the list-elements
are d-digit natural numbers with number base r, the implementation of the
algorithm above is straightforward. For example, let us suppose that we
have a C2L L with a header node, and that function digit(i, r, x) can extract the i-th digit of number x in Θ(1) time, where digit 1 is the lowest-order digit and digit d is the highest-order digit.
radix_sort(L : E2* ; d, r : N)
  BinHead : E2[r] // the headers of the lists representing the bins
  B : E2*[r]
  for i := 0 to r − 1
    B[i] := &BinHead[i] // Initialize the i-th pointer.
  for i := 1 to d
    distribute(L, i, B)
    gather(B, L)
distribute(L : E2* ; i : N ; B : E2*[r])
  while L → next ≠ L
    p := L → next ; unlink(p)
    precede(p, B[digit(i, r, p → key)])

gather(B : E2*[r] ; L : E2*)
  for i := 0 to r−1
    append(L, B[i]) // add the elements of B[i] to the end of L

append(L, Bi : E2*)
  if Bi → next ≠ Bi
    p := L → prev ; q := Bi → next ; r := Bi → prev
    p → next := q ; q → prev := p
    r → next := L ; L → prev := r
    Bi → next := Bi ; Bi → prev := Bi
  else SKIP
Remember that stable sorting algorithms maintain the relative order of
records with equal keys (i.e. values).
Counting sort is stable, and it is an ideal auxiliary method for radix sort because of its linear time complexity.
Here comes a bit more general study than needed for radix sort. When
counting sort is used for radix sort, its key function ϕ must select the appro-
priate digit.
counting_sort(A, B : T[n] ; r : N ; ϕ : T → 0..(r−1))
  C : N[r] // counter array
  for k := 0 to r−1
    C[k] := 0 // init the counter array
  for i := 0 to n−1
    C[ϕ(A[i])]++ // count the items with the given key
  for k := 1 to r−1
    C[k] += C[k − 1] // C[k] := the number of items with key ≤ k
  for i := n − 1 downto 0
    k := ϕ(A[i]) // k := the key of A[i]
    C[k]−− // the next item with key k must be put before the items already placed
    B[C[k]] := A[i] // let A[i] be the last of the unprocessed items with key k
The first loop of the procedure above assigns zero to each element of the counting array C.
The second loop counts the number of occurrences of key k in C[k], for each possible key k.
The third loop sums the number of occurrences of keys ≤ k, considering each possible key k.
The number of keys ≤ 0 is the same as the number of keys = 0, so the value of C[0] is not changed in the third loop. Considering greater keys, we have that the number of keys ≤ k equals the number of keys = k plus the number of keys ≤ k − 1. Thus the new value of C[k] can be computed by adding the new value of C[k−1] to the old value of C[k].
The fourth loop goes over the input array in reverse direction. We put the elements of input array A into output array B: considering any key k in the input array, first we process its last occurrence. The element containing it is put into the last place reserved for keys = k, i.e. into B[C[k] − 1]: first we decrease C[k] by one, and then put this element into B[C[k]]. Because of this reverse direction, the next occurrence of key k is going to be the immediate predecessor of the actual item, etc. Thus the elements with the same key remain in their original order, and we receive a stable sort.
The time complexity is clearly Θ(n + r). Provided that r ∈ O(n), Θ(n + r) =
Θ(n), and so T (n) ∈ Θ(n).
Regarding the space complexity, besides the input array, we have the output array B of n items and the counter array C of r items. As a result, S(n) ∈ Θ(n + r) = Θ(n) since r ∈ O(n).
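The same counting sort in C++ (a sketch; the key-selection function ϕ is passed as a parameter, as in the pseudocode):

#include <cstddef>
#include <functional>
#include <vector>

// Stable counting sort of A into B; phi maps each item to a key in 0..r-1.
template <typename T>
void countingSort(const std::vector<T>& A, std::vector<T>& B,
                  std::size_t r, const std::function<std::size_t(const T&)>& phi) {
    std::vector<std::size_t> C(r, 0);                       // counter array, zero-initialised
    for (const T& a : A) ++C[phi(a)];                       // count the items with each key
    for (std::size_t k = 1; k < r; ++k) C[k] += C[k - 1];   // prefix sums: #items with key <= k
    B.resize(A.size());
    for (std::size_t i = A.size(); i-- > 0; ) {             // reverse order keeps the sort stable
        std::size_t k = phi(A[i]);
        B[--C[k]] = A[i];
    }
}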
0 1 2 3 4 5
The input:
A: 02 32 30 13 10 12
The changes of counter array C [the first column reflects the first loop initializing counter array C to zero; the next six columns reflect the second loop counting the items with key k, for each possible key; the column labeled by Σ reflects the third loop, which sums up the number of items with keys ≤ k; and the last six columns reflect the fourth loop placing each item of the input array into its place in the output array]:
array into its place in the output array]:
P
C 02 32 30 13 10 12 12 10 13 30 32 02
0 0 1 2 2 1 0
1 0 2
2 0 1 2 3 5 4 3 2
3 0 1 6 5
0 1 2 3 4 5
The output:
B: 30 10 02 32 12 13
Now we suppose that the result of the previous counting sort is to be sorted
according to the left-side digits of the numbers (of number base 4), i.e. func-
tion ϕ selects the leftmost digits of the numbers.
0 1 2 3 4 5
The input:
B: 30 10 02 32 12 13
0 1 2 3 4 5
The output:
A: 02 10 12 13 30 32
The first counting sort ordered the input according to the right-side digits of the numbers. The second counting sort ordered the result of the first sort according to the left-side digits of the numbers using a stable sort. Thus in the final result the numbers with the same left-side digits remained in order according to their right-side digits. Consequently in the final result the numbers are already sorted according to both of their digits.
Therefore the two counting sorts illustrated above form the first and second passes of a radix sort. And our numbers have just two digits now, so we have performed a complete radix sort in this example.
Provided that the stable sort is counting sort, the time complexity of Radix sort is Θ(d(n + r)). If d is a constant and r ∈ O(n), Θ(d(n + r)) = Θ(n), i.e. T(n) ∈ Θ(n).
The space complexity of Radix sort based on Counting sort is clearly determined by that of the latter one. Thus S(n) ∈ Θ(n + r) = Θ(n) since r ∈ O(n).
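Building on such a stable counting-sort pass, a least-significant-digit radix sort might look as follows in C++ (an illustrative sketch; the parameters d and r and the digit extraction (x / weight) % r are assumptions matching the base-4, two-digit example above):

    #include <vector>
    #include <cstdint>

    // LSD radix sort of non-negative integers with d digits in number base r,
    // using one stable counting-sort pass per digit (least significant digit first).
    void radix_sort(std::vector<uint32_t>& A, int d, int r) {
        std::vector<uint32_t> B(A.size());
        uint32_t weight = 1;                                // r^j selects the j-th digit
        for (int j = 0; j < d; ++j, weight *= r) {
            std::vector<int> C(r, 0);
            for (uint32_t x : A) ++C[(x / weight) % r];     // count the j-th digits
            for (int k = 1; k < r; ++k) C[k] += C[k - 1];   // prefix sums
            for (int i = (int)A.size() - 1; i >= 0; --i)    // stable placement
                B[--C[(A[i] / weight) % r]] = A[i];
            A.swap(B);                                      // the pass is stable, so earlier digits stay ordered
        }
    }
    // E.g. on {2, 14, 12, 7, 4, 6} (the numbers 02, 32, 30, 13, 10, 12 written in base 4)
    // with d = 2 and r = 4 it reproduces the two passes of the example above.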
8.6 Bucket sort
We suppose that the items to be sorted are elements of the real interval [0; 1). This algorithm is efficient if the keys of the input are uniformly distributed on [0; 1). We sort the buckets with some known sorting method like insertion sort or merge sort.
bucket_sort( L : list )
    n := the length of L
    B : list[n]                       // create the buckets B[0..(n−1)]
    for j := 0 to (n−1)
        Let B[j] be an empty list
    while L ≠ ⟨⟩
        Remove the first element of list L
        Insert this element according to its key k into list B[⌊n ∗ k⌋]
    for j := 0 to (n−1)
        Sort list B[j] nondecreasingly
Clearly mT(n) ∈ Θ(n). If the keys of the input are uniformly distributed on [0; 1), AT(n) ∈ Θ(n). MT(n) depends on the sorting method we use when sorting lists B[j] nondecreasingly. For example, using Insertion sort MT(n) ∈ Θ(n²); using Merge sort MT(n) ∈ Θ(n log n).
Considering the space complexity of Bucket sort, we have temporary array B of n elements. If we suppose that our lists are linked lists, we do not need any other extra data structure. Provided that we use Merge sort as the subroutine of Bucket sort, the depth of recursion is O(log n) in any case. And Insertion sort sorts in place. Consequently mS(n), MS(n) ∈ Θ(n), if we sort a linked list, and we use Insertion sort or Merge sort as the subroutine of Bucket sort.
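A possible C++ rendering of this bucket sort (a sketch, not the notes' structogram; std::list stands for the linked list, and std::list::sort, a merge sort, plays the role of the bucket-sorting subroutine):

    #include <list>
    #include <vector>
    #include <cstddef>

    // Bucket sort of keys in the real interval [0, 1).
    void bucket_sort(std::list<double>& L) {
        const std::size_t n = L.size();
        if (n == 0) return;
        std::vector<std::list<double>> B(n);              // the buckets B[0..n-1], initially empty
        while (!L.empty()) {                              // distribute: key k goes into bucket floor(n*k)
            double k = L.front();
            L.pop_front();
            B[static_cast<std::size_t>(n * k)].push_back(k);
        }
        for (auto& bucket : B) bucket.sort();             // sort each bucket nondecreasingly (merge sort)
        for (auto& bucket : B) L.splice(L.end(), bucket); // concatenate the buckets back into L
    }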
9 Hash Tables
Notations:
m : the size of the hash table
T [0..(m − 1)] : the hash table
T [0], T [1], . . . , T [m − 1] : the slots of the hash table
∅ : empty slot in the hash table, if we use direct-address tables or key collisions are resolved by chaining
E : the key of empty slots in case of open addressing
D : the key of deleted slots in case of open addressing
n : the number of records stored in the hash table
α = n/m : load factor
U : the universe of keys; k, k′, ki ∈ U
h : U → 0..(m − 1) : hash function
We suppose that the hash table does not contain two or more records with
the same key, and that h(k) can be calculated in Θ(1) time.
D
+ k : U // k is the key
+ . . . // satellite data
init( T : D*[m] )
    for i := 0 to m−1
        T[i] := ∅

search( T : D*[] ; k : U ) : D*
    return T[k]

insert( T : D*[] ; p : D* ) : B
    if T[p→k] = ∅
        T[p→k] := p ; return true
    return false

remove( T : D*[] ; k : U ) : D*
    p := T[k] ; T[k] := ∅
    return p
Clearly Tinit (m) ∈ Θ(m). And for the other three operations we have T ∈
Θ(1).
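As an illustration, a direct-address table could be written in C++ roughly as follows (a sketch under the assumption that keys are integers in 0..m−1 and that nullptr marks an empty slot; the names are illustrative, not the notes'):

    #include <vector>

    struct D { int k; /* ... satellite data ... */ };        // k is the key, k in 0..m-1

    struct DirectAddressTable {
        std::vector<D*> T;                                   // m slots, nullptr = empty slot
        explicit DirectAddressTable(int m) : T(m, nullptr) {}   // init: Theta(m)

        D* search(int k) const { return T[k]; }              // Theta(1)

        bool insert(D* p) {                                  // Theta(1)
            if (T[p->k] != nullptr) return false;            // the slot is already occupied
            T[p->k] = p;
            return true;
        }

        D* remove(int k) {                                   // Theta(1); returns the removed record or nullptr
            D* p = T[k];
            T[k] = nullptr;
            return p;
        }
    };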
9.3 Collision resolution by chaining
When the hash function maps two or more keys to the same slot, the corresponding records are stored in the list identified by this slot.
E1
+key : U
. . . // satellite data may come here
+next : E1*
+E1() { next := ∅ }
init( T : E1*[m] )
    for i := 0 to m−1
        T[i] := ∅

search( T : E1*[] ; k : U ) : E1*
    return searchS1L(T[h(k)], k)

searchS1L( q : E1* ; k : U ) : E1*
    while q ≠ ∅ ∧ q→key ≠ k
        q := q→next
    return q

insert( T : E1*[] ; p : E1* ) : B
    k := p→key ; s := h(k)
    if searchS1L(T[s], k) = ∅          // key k is not in the table yet
        p→next := T[s] ; T[s] := p
        return true
    return false

remove( T : E1*[] ; k : U ) : E1*
    s := h(k) ; p := ∅ ; q := T[s]
    while q ≠ ∅ ∧ q→key ≠ k
        p := q ; q := q→next
    if q ≠ ∅                           // the record has been found: unlink it
        if p = ∅
            T[s] := q→next
        else
            p→next := q→next
        q→next := ∅
    return q
Clearly Tinit(m) ∈ Θ(m). For the other three operations mT ∈ Θ(1), MT(n) ∈ Θ(n), and AT(n, m) ∈ Θ(1 + n/m).
AT(n, m) ∈ Θ(1 + n/m) is satisfied, if function h : U → 0..(m−1) is simple uniform hashing, because the average length of the lists of the slots is equal to n/m = α.
n
Usually
m
∈ O(1) is required. In this case AT (n, m) ∈ Θ(1) is also
satised for insertion, search, and removal.
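A C++ sketch of these chained hash table operations (illustrative only; it assumes integer keys, uses nullptr for the empty list, and hand-rolled E1 nodes as above):

    #include <vector>

    struct E1 { int key; /* satellite data */ E1* next = nullptr; };

    struct ChainedHashTable {
        std::vector<E1*> T;                                  // T[s] is the head of the list of slot s
        explicit ChainedHashTable(int m) : T(m, nullptr) {}  // init: Theta(m)
        int h(int k) const { int m = (int)T.size(); return ((k % m) + m) % m; }

        static E1* searchS1L(E1* q, int k) {                 // search in one slot's list
            while (q != nullptr && q->key != k) q = q->next;
            return q;
        }
        E1* search(int k) const { return searchS1L(T[h(k)], k); }

        bool insert(E1* p) {                                 // push to the front of the slot's list
            int s = h(p->key);
            if (searchS1L(T[s], p->key) != nullptr) return false;  // key already present
            p->next = T[s]; T[s] = p;
            return true;
        }

        E1* remove(int k) {                                  // unlink and return the record, or nullptr
            int s = h(k); E1* prev = nullptr; E1* q = T[s];
            while (q != nullptr && q->key != k) { prev = q; q = q->next; }
            if (q != nullptr) {
                if (prev == nullptr) T[s] = q->next; else prev->next = q->next;
                q->next = nullptr;
            }
            return q;
        }
    };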
9.4 Good hash functions
The division method:
h(k) = k mod m
is often a good choice, because it can be calculated simply and efficiently. And if m is a prime number not too close to a power of two, it usually distributes the keys evenly among the slots, i.e. on the integer interval 0..(m−1).
For example, if we want to resolve key collisions by chaining, and we would like to store approximately 2000 records with maximum load factor α ≈ 3, then m = 701 is a good choice: 701 is a prime number which is close to 2000/3, and it is far enough from the neighboring powers of two, i.e. from 512 and 1024.
Keys in interval [ 0 ; 1): Provided that the keys are evenly distributed on
[ 0 ; 1), function
h(k) = ⌊k ∗ m⌋
is also simple uniform hashing.
Multiplication method: Provided that the keys are real numbers, and
A ∈ (0; 1) is a constant,
h(k) = ⌊{k ∗ A} ∗ m⌋
is a hash function. ({x} is the fraction part of x.) It does not distribute the keys equally well with all the possible values of A.
Knuth proposes A = (√5 − 1)/2 ≈ 0.618, because it is likely to work reasonably well. Compared to the division method it has the advantage that the value of m is not critical.
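A small C++ illustration of the multiplication method (a sketch; the default A = 0.618... follows Knuth's proposal above, everything else is an assumption):

    #include <cmath>

    // h(k) = floor( {k * A} * m ), where {x} is the fraction part of x.
    unsigned multiplicative_hash(double k, unsigned m, double A = 0.6180339887) {
        double prod = k * A;
        double frac = prod - std::floor(prod);      // {k * A}
        return static_cast<unsigned>(frac * m);     // a slot index in 0..m-1
    }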
Each method above supposes that the keys are numbers. If the keys are
strings, the characters can be considered digits of unsigned integers with the
appropriate number base. Thus the strings can be interpreted as big natural
numbers.
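One usual way to realize this idea together with the division method is Horner's rule; the following C++ sketch assumes base-256 digits and some table size m (the names are illustrative):

    #include <string>

    // Interpret the string as a big natural number in base 256 and reduce it
    // modulo m digit by digit (Horner's rule), so the value never overflows.
    unsigned string_hash(const std::string& s, unsigned m) {
        unsigned long long h = 0;
        for (unsigned char c : s)
            h = (h * 256 + c) % m;        // division method applied on the fly
        return static_cast<unsigned>(h);
    }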
9.5 Open addressing
The hash table is T : R[m]. The records of type R are directly in the slots. Each record has a key field k : U ∪ {E, D}, where E ≠ D and E, D ∉ U are global constants in order to indicate empty (E) and deleted (D) slots.
R
+ k : U ∪ {E, D} // k is a key or it is Empty or Deleted
+ . . . // satellite data
init( T : R[m] )
    for i := 0 to m−1
        T[i].k := E
(The empty and deleted slots together are free slots. The other slots are occupied.) Instead of a single hash function we have m hash functions now: in open addressing we try these in this order, one after the other, if needed.
Let us suppose that we want to insert record r with key k into the hash table. First we probe slot h(k, 0). If it is occupied and its key is not k, we try h(k, 1), and so on, along ⟨h(k, 0), h(k, 1), . . . , h(k, m − 1)⟩ until
- we find an empty slot, or
- we find an occupied slot with key k, or
- all the slots of the potential probing sequence have been considered, but we have found neither an empty slot nor an occupied slot with key k.
+ If we find an empty slot, we put r into it. Otherwise the insertion fails.
⟨h(k, 0), h(k, 1), . . . , h(k, m − 1)⟩ is called the potential probing sequence because during insertion or search (or deletion) only a prefix of it is actually generated. This prefix is called the actual probing sequence.
The potential probing sequence must be a permutation of ⟨0, 1, . . . , (m − 1)⟩, which means that it covers the whole hash table, i.e. it does not refer twice to the same slot.
The length of the actual probing sequence of an insertion/search/deletion is i, iff this operation stops at probe h(k, i − 1).
When we search for the record with key k, we again follow the potential probing sequence ⟨h(k, 0), h(k, 1), . . . , h(k, m − 1)⟩.
- We stop successfully when we find an occupied slot with key k.
- The search fails if we find an empty slot, or we use up the potential probing sequence unsuccessfully.
In the ideal case we have uniform hashing: the potential probe sequence of each key is equally likely to be any of the m! permutations of ⟨0, 1, . . . , (m − 1)⟩.
Provided that
- the hash table does not contain deleted slots,
- its load factor α = n/m satisfies 0 < α < 1, and
- we have uniform hashing,
+ the expected length of an unsuccessful search / successful insertion is at most 1/(1 − α),
+ and the expected length of a successful search / unsuccessful insertion is at most (1/α) · ln(1/(1 − α)).
For example, the first result above implies that with uniform hashing, if the hash table is half full, the expected number of probes in an unsuccessful search (or in a successful insertion) is less than 2. If the hash table is 90% full, the expected number of probes is less than 10.
Similarly, the second result above implies that with uniform hashing, if the hash table is half full, the expected number of probes in a successful search (or in an unsuccessful insertion) is less than 1.387. If the hash table is 90% full, the expected number of probes is less than 2.559. [1]
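As a quick check of these figures, substituting α = 1/2 and α = 0.9 into the two bounds above gives:

    \[
      \frac{1}{1-\frac12}=2, \qquad
      \frac{1}{\frac12}\,\ln\frac{1}{1-\frac12}=2\ln 2\approx 1.386<1.387,
    \]
    \[
      \frac{1}{1-0.9}=10, \qquad
      \frac{1}{0.9}\,\ln\frac{1}{1-0.9}=\frac{\ln 10}{0.9}\approx 2.558<2.559.
    \]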
E (let the slot be empty) is not correct. For example, let us suppose that we inserted record r with key k into the hash table, but it could not be put into T[h(k, 0)] because of a key conflict, and we put it into T[h(k, 1)]. And then we delete the record at T[h(k, 0)]. If this deletion performed T[h(k, 0)].k := E, a subsequent search for key k would stop at the empty slot T[h(k, 0)], and it would not find record r with key k in T[h(k, 1)]. Instead, deletion performs T[h(k, 0)].k := D. Then the subsequent search for key k does not stop at slot T[h(k, 0)] (because it is neither empty nor contains key k), and it finds record r with key k in T[h(k, 1)]. (Clearly the procedures of search and deletion are not changed in spite of the presence of deleted slots.)
If we use a hash table for a long time, there may be many deleted slots and
no empty slot, although the table is far from being full. This means that the
unsuccessful searches will check all the slots, and also the other operations
slow down. So we have to get rid of the deleted slots, for example, by
rebuilding the whole table.
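One possible shape of such a rebuild in C++ (a sketch; the record type, the sentinel keys E and D, and the insert callback are illustrative assumptions, not the notes' types):

    #include <vector>

    // Rebuild sketch: collect the occupied records, set every slot's key to E,
    // then re-insert them, so that all D (deleted) marks disappear.
    template <typename Record, typename Key, typename InsertFn>
    void rebuild(std::vector<Record>& slots, Key E, Key D, InsertFn insert) {
        std::vector<Record> occupied;
        for (const Record& r : slots)
            if (r.k != E && r.k != D) occupied.push_back(r);   // keep the real records
        for (Record& r : slots) r.k = E;                       // now every slot is empty
        for (const Record& r : occupied) insert(r);            // re-insert into the emptied table
    }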
to a well-defined rule, until we find the appropriate slot, or we find that the actual operation is impossible. h1 must be a simple uniform hash function.
The simplest strategy is linear probing:
h(k, i) = (h1(k) + i) mod m      (i ∈ 0..m − 1)
9.5.4 Quadratic probing
With c1 = c2 = 1/2, quadratic probing uses
h(k, i) = (h1(k) + (i + i²)/2) mod m      (i ∈ 0..m − 1)
Thus
(h(k, i + 1) − h(k, i)) mod m = ( ((i + 1) + (i + 1)²)/2 − (i + i²)/2 ) mod m = (i + 1) mod m
So it is easy to compute the slots of the probing sequences recursively: h(k, 0) = h1(k), and h(k, i + 1) = (h(k, i) + i + 1) mod m for i ∈ 0..m − 2.
Exercise 9.1 Write the structograms of the operations of hash tables with
quadratic probing (c1 = c2 = 1/2) applying the previous recursive formula.
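As an illustration of the recursive formula (not the solution of the exercise; the primary hash h1 used here is an assumption), the probe sequence can be generated in C++ as follows:

    #include <vector>

    // Quadratic probing with c1 = c2 = 1/2:
    // h(k, 0) = h1(k),  h(k, i+1) = (h(k, i) + i + 1) mod m.
    std::vector<int> quadratic_probe_sequence(int k, int m) {
        auto h1 = [m](int key) { return ((key % m) + m) % m; };  // an assumed primary hash
        std::vector<int> seq(m);
        seq[0] = h1(k);
        for (int i = 0; i + 1 < m; ++i)
            seq[i + 1] = (seq[i] + i + 1) % m;    // the cheap recursive step, no i*i is needed
        return seq;
    }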
9.5.5 Double hashing
In double hashing the probing sequence of key k is h(k, i) = (h1(k) + i · h2(k)) mod m (i ∈ 0..m − 1). For example,
h1(k) = k mod m
h2(k) = 1 + (k mod m′)
is an eligible choice.
In case of double hashing, for each different pair (h1(k), h2(k)) there is a different probing sequence, and so we have Θ(m²) different probing sequences.
Although the number of probing sequences of double hashing is far from the ideal number m! of probing sequences, its performance appears to be very close to that of the ideal scheme of uniform hashing.
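The probing of double hashing can be sketched in C++ as follows (an illustration; the concrete values m = 11 and m′ = 9 in the usage line are assumptions, not taken from the notes):

    // Double hashing: the i-th probe for key k is (h1(k) + i * h2(k)) mod m.
    // m and mp (m') are assumed to be chosen so that every probing sequence is
    // a permutation of 0..m-1 (e.g. m prime); this is a sketch only.
    struct DoubleHash {
        int m, mp;                                       // table size m, and m' used by h2
        int h1(int k) const { return ((k % m) + m) % m; }
        int h2(int k) const { return 1 + ((k % mp) + mp) % mp; }
        int probe(int k, int i) const {
            return static_cast<int>((h1(k) + static_cast<long long>(i) * h2(k)) % m);
        }
    };
    // Usage: DoubleHash dh{11, 9}; dh.probe(k, 0), dh.probe(k, 1), ... are the slots to try.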
of this slot. In column s we have a + sign for a successful, and a − sign for an unsuccessful operation. In the table we do not handle satellite data, we process only the keys. (See the details in section 9.5.2.)
In the last 11 columns of the table we represent the actual state of the hash table. The cells representing empty slots are simply left empty. We wrote the reserving key into each cell of the occupied slots, while the cells of the deleted slots contain the letter D.
op    key  h2  probes        s |   0   1   2   3   4   5   6   7   8   9  10
init                         + |
ins    32  10                + |                                           32
ins    40   7                + |                               40          32
ins    37   4                + |                   37          40          32
ins    15   6  4; 10; 5      + |                   37  15      40          32
ins    70   1  4; 5; 6       + |                   37  15  70  40          32
src    15   6  4; 10; 5      + |                   37  15  70  40          32
src   104   5  5; 10; 4; 9   − |                   37  15  70  40          32
del    15   6  4; 10; 5      + |                   37   D  70  40          32
src    70   1  4; 5; 6       + |                   37   D  70  40          32
ins    70   1  4; 5; 6       − |                   37   D  70  40          32
del    37   4                + |                    D   D  70  40          32
ins   104   5  5; 10; 4; 9   + |                    D 104  70  40          32
src    15   6  4; 10; 5; 0   − |                    D 104  70  40          32
Solution:
insert( T : R[m] ; x : R ) : Z
    k := x.k ; j := h1(k)
    i := 0 ; d := h2(k)
    while i < m ∧ T[j].k ∉ {E, D}
        if T[j].k = k
            return −1                  // key k is already in the table: the insertion fails
        j := (j + d) mod m ; i++
    if i < m
        T[j] := x ; return j           // x has been put into the free slot j
    return −1                          // the table is full: the insertion fails

search( T : R[m] ; k : U ) : Z
    j := h1(k) ; i := 0 ; d := h2(k)
    while i < m ∧ T[j].k ≠ E ∧ T[j].k ≠ k
        j := (j + d) mod m ; i++
    if i < m ∧ T[j].k = k
        return j
    return −1                          // the search fails

remove( T : R[m] ; k : U ) : Z
    j := search(T, k)
    if j ≥ 0
        T[j].k := D                    // lazy deletion: the slot becomes deleted
    return j