Data Structures and Algorithms
Data Structures and Algorithms
DATA TYPE
2
SIMPLE DATA TYPES
3
Abstract Data Type
4
Data Structure
6
Linear Data Structures
7
Non-Linear Data Structures
8
Data Structures and Programmes
9
ARRAYS
10
• Before any array is used in the computer, some memory
locations have to be created for storage of the elements. This is
often done by using the DIM instruction of BASIC programming
language or DIMENSION instruction of FORTRAN programming
language.
• For example, the instruction: DIM LAGOS (45) will create 45
memory locations for storage of the elements of the array
called LAGOS.
• In most programming languages, each element has the same
data type
• Array occupies a contiguous area of storage.
• Most programming languages have a built-in array data type.
11
Declaration of Arrays
• Variables normally only store a single value but, in some situations, it is useful to
have a variable that can store a series of related values – using an array. For example,
suppose a programme is required that will calculate the average age among a group
of six students. The ages of the students could be stored in six integer variables in C:
• int age1;
• int age2;
• int age3;
• However, a better solution would be to declare a six-element array:
• int age[6];
• This creates a six element array; the elements can be accessed as age[0] through
age[5] in C.
• A two-dimensional array (in which the elements are arranged into rows and columns)
declared by say DIM X(3,4) can be stored as linear arrays in the computer memory by
determining the product of the subscripts.
• The above can thus be expressed as DIM X (3 * 4) or DIM X (12).
12
Multi-dimensional Arrays
13
Suppose we want to represent this simple two-
dimensional array:
14
Classification of Arrays
15
Applications of Arrays
16
LIST DATA STRUCTURE
• Lists are the most fundamental data structure upon which most other data
structures are built and many more algorithms must operate. It’s not hard to
find examples of lists in the real world: shopping lists, to-do lists, train
timetables, order forms, even this “list of lists.”
• A list data structure is a sequential data structure, i.e. a collection of items
accessible one after the other, beginning at the head and ending at the tail.
• It is a widely used data structure for applications which do not need random
access.
• Lists differ from the stacks and queues data structures in that additions and
removals can be made at any position in the list.
• Lists also preserve insertion order so that, assuming there are no intervening
modifications, a given list will always return the same value for the same
position.
• Like arrays, lists make no attempt to preserve the uniqueness of values,
meaning a list may contain duplicate values.
17
Elements of a List
18
Operations
20
Array Lists
23
Properties of Linked-list
25
header
a1 a2 an
tail
26
iii. Doubly Linked Lists
28
• A frequently used metaphor is the idea of a stack of
plates in a spring loaded cafeteria stack.
• In such a stack, only the top plate is visible and
accessible to the user, all other plates remain hidden.
• As new plates are added, each new plate becomes
the top of the stack, hiding each plate below, pushing
the stack of plates down.
• As the top plate is removed from the stack, the plates
pop back up, and the second plate becomes the top
of the stack.
29
Application of Stacks
31
Operations on a Stack Contd.
32
Stack Storage Modes
35
Application of Queues
36
Operations on a Queue
37
Storing a Queue in a Static Data Structure
40
Adding a Node (Add)
• The new node is to be added at the tail of the queue. The reference
Queue.Tail should point to the new node, and the NextNode reference of
the node previously at the tail of the queue should point to the DataItem
of the new node.
41
Blocking Queues
42
TREES DATA STRUCTURE
• A tree is often used to represent a hierarchy. This is because
the relationships between the items in the hierarchy suggest
the branches of a botanical tree.
• In computer science, a tree is a widely-used data structure
that emulates a hierarchical tree structure with a set of linked
nodes.
43
• Each node in a tree has zero or more child nodes, which are below it
in the tree
• By convention, trees are drawn growing downwards.
• A node that has a child is called the child's parent node (or ancestor
node, or superior).
• A node has at most one parent. Nodes that do not have any children
are called leaf nodes. They are also referred to as terminal nodes.
• The height of a node is the length of the longest downward path to
a leaf from that node. The height of the root is the height of the
tree.
• The depth of a node is the length of the path to its root (i.e., its root
path)
44
• The topmost node in a tree is called the root node. Being the
topmost node, the root node will not have parents.
• It is the node at which operations on the tree commonly begin
(although some algorithms begin with the leaf nodes and work
up ending at the root).
• All other nodes can be reached from it by following edges or
links. (In the formal definition, each such path is also unique). In
diagrams, it is typically drawn at the top.
• An internal node or inner node is any node of a tree that has
child nodes and is thus not a leaf node.
• Similarly, an external node or outer node is any node that does
not have child nodes and is thus a leaf.
45
Binary Trees
The simplest form of tree is a binary tree. A binary tree consists of
• a node (called the root node) and
• left and right sub-trees.
• Both the sub-trees are themselves binary trees.
46
Properties of Binary Tree
47
Traversal methods
50
Postorder traversal
51
Inorder traversal
Inorder traversal visits the root in between
visiting the left and right subtrees that is,
1. Traverse the left subtree; and then
2. Visit the root; and then
3. Traverse the right subtree.
52
Common operations on Trees
53
Common uses of Trees
54
Key Terms
56
COMPARISON BETWEEN B-TREE AND B + TREE
B-TREE B+TREE
Has Lower fan-out compared with B+Tress Has very high fan-out (number of pointers
to child nodes in a node)
Leaf nodes has no linkage (i.e. Leaf Nodes pointing to Leaf nodes are Linked with each other
another leaf Node)
B trees contain data with each key B+ trees don't have data associated with
interior nodes
57
Storage Management
58
Static Memory Management
60
Phases of Storage Management
62
HASH FUNCTIONS
63
HASH TABLES
Scorpion: s + c + o + r + p + i+ o + n = 19 + 3 + 15 + 18 + 16 + 9 + 15 + 14 = 109
Snake: s + n + a + k + e = 19 + 14 + 1 + 11 + 5 = 50
Elvis: e + l + v + i + s = 5 + 12 + 22 + 9 + 19 = 67
Lives: l + i + v + e + s = 12 + 9 +22 +5 + 19 = 67
Looking at the generated values, you can see that the string “scorpion” would
be placed into an array at position 109, “snake” at position 50, “elvis” at
position 67 and “lives” also at position 67.
Note that the strings are not stored in any particular order
66
• There are two major problems with this approach.
• Take another look at the generated values. If these were used as index
positions within an array, then it would need too big enough to
accommodate the largest position, 109. Having filled only 4 of the 109
positions available—that is, 0 to 109—you would still have 105 empty
ones.
• One way to solve this problem is to modify the hash function to produce
only values within a certain range.
• If the size of the hash table was restricted to, for example, ten positions,
then the hash function could be modified to take the original result and
use a modulus, that is, the remainder after division (to find the remainder
after division by 10), as shown below:
67
• Scorpion: s + c + o + r + p + i+ o + n = 19 + 3 + 15 + 18 + 16 + 9 + 15 +
14 = 109/10 = 9
• Snake: s + n + a + k + e = 19 + 14 + 1 + 11 + 5 = 50/10 = 0
• Elvis: e + l + v + i + s = 5 + 12 + 22 + 9 + 19 = 67/10 = 7
• Lives: l + i + v + e + s = 12 + 9 +22 +5 + 19 = 67/10 = 7
• Now the addresses fall within the range 0 to 9 and can be stored in a
hash table of size 10.
• Unfortunately, there is still one more problem with the hash function
as described as it suffers from a high rate of collisions—different
values hashing to the same address. This is because “elvis” and “lives”
produced the same address. This is called collision.
68
• The first solution to the problem of collision
resolution is known as linear probing. Linear
probing is a very simple technique that, on
detecting a collision, searches linearly for the next
available slot.