0% found this document useful (0 votes)
2 views32 pages

Unit-i Notes

The document provides an overview of algorithm analysis, including concepts of space and time complexity, asymptotic notations, and specific data structures like AVL and B-Trees. It defines algorithms, their qualities, and performance analysis, emphasizing the importance of selecting the best algorithm based on resource requirements. Additionally, it covers AVL tree operations, rotations, and their complexities, illustrating how to maintain balance in the tree structure.
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
2 views32 pages

Unit-i Notes

The document provides an overview of algorithm analysis, including concepts of space and time complexity, asymptotic notations, and specific data structures like AVL and B-Trees. It defines algorithms, their qualities, and performance analysis, emphasizing the importance of selecting the best algorithm based on resource requirements. Additionally, it covers AVL tree operations, rotations, and their complexities, illustrating how to maintain balance in the tree structure.
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 32

UNIT-I

Introduction to Algorithm Analysis, Space and Time Complexity analysis,

Asymptotic

Notations.

AVL Trees – Creation, Insertion, Deletion operations and Applications

B-Trees – Creation, Insertion, Deletion operations and Applications

Algorithm:
An algorithm is a set of well-defined instructions to solve a particular problem. It
takes a set of input(s) and produces the desired output.

For example,

An algorithm to add two numbers:

1. Take two number inputs


2. Add numbers using the + operator
3. Display the result

Qualities of a Good Algorithm


1. Input and output should be defined accurately.
2. Each step in the algorithm should be clear and unambiguous.

3. Algorithms should be most effective among many different ways to solve a

problem.

4. An algorithm should not include computer code. Instead, the algorithm

should be written in such a way that it can be used in different programming

languages.

Algorithm 1: Add two numbers entered by the user

Step 1: Start

Step 2: Declare variables num1, num2 and sum.

Step 3: Read values num1 and num2.

Step 4: Add num1 and num2 and assign the result to sum.

Sum= num1+num2

Step 5: Display sum


Step 6: Stop

Algorithm 2: Find the largest number among three numbers

Step 1: Start

Step 2: Declare variables a,b and c.

Step 3: Read variables a,b and c.

Step 4: If a > b

If a > c

Display a is the largest number.

Else

Display c is the largest number.

Else

If b > c

Display b is the largest number.

Else

Display c is the greatest number.

Step 5: Stop

Algorithm 3: Find the factorial of a number

Step 1: Start

Step 2: Declare variables n, factorial and i.

Step 3: Initialize variables

factorial = 1, i = 1

Step 4: Read value of n

Step 5: Repeat the steps until i = n

5.1: factorial = factorial*i

5.2: i = i+1

Step 6: Display factorial

Step 7: Stop
Performance Analysis

Performance Analysis of an algorithm:


If we want to go from city "A" to city "B", there can be many ways of doing this. We

can go by flight, by bus, by train and also by bicycle. Depending on the availability

and convenience, we choose the one which suits us. Similarly, in computer science,

there are multiple algorithms to solve a problem. When we have more than one

algorithm to solve a problem, we need to select the best one. Performance analysis

helps us to select the best algorithm from multiple algorithms to solve a problem.

When there are multiple alternative algorithms to solve a problem, we analyse them

and pick the one, which is best suitable for our requirements. The formal definition is

as follows...

Performance of an algorithm is a process of making evaluative judgement about

algorithms.

It can also be defined as follows...

Performance of an algorithm means predicting the resources, which are required

to an algorithm to perform its task.

That means when we have multiple algorithms to solve a problem, we need to select

a suitable algorithm to solve that problem.

We compare algorithms with each other which are solving the same problem, to select

the best algorithm. To compare algorithms, we use a set of parameters or set of

elements like memory required by that algorithm, the execution speed of that

algorithm, easy to understand, easy to implement, etc.,

Generally, the performance of an algorithm depends on the following elements...

1. Whether that algorithm is providing the exact solution for the problem?

2. Whether it is easy to understand?


3. Whether it is easy to implement?

4. How much space (memory) it requires to solve the problem?

5. How much time it takes to solve the problem? Etc.,

When we want to analyse an algorithm, we consider only the space and time required

by that particular algorithm and we ignore all the remaining elements.

Based on this information, performance analysis of an algorithm can also be defined

as follows...

Performance analysis of an algorithm is the process of calculating space and time

required by that algorithm.

Performance analysis of an algorithm is performed by using the following measures...

1. Space required to complete the task of that algorithm (Space Complexity). It

includes program space and data space

2. Time required to complete the task of that algorithm (Time Complexity)

Space Complexity:

When we design an algorithm to solve a problem, it needs some computer memory to

complete its execution. For any algorithm, memory is required for the following

purposes...

1. To store program instructions.

2. To store constant values.

3. To store variable values.

4. And for few other things like function calls, jumping statements etc,.

Space complexity of an algorithm can be defined as follows...

Total amount of computer memory required by an algorithm to complete its

execution is called as space complexity of that algorithm.


Generally, when a program is under execution it uses the computer memory for

THREE reasons. They are as follows...

1. Instruction Space: It is the amount of memory used to store compiled version

of instructions.

2. Environmental Stack: It is the amount of memory used to store information of

partially executed functions at the time of function call.

3. Data Space: It is the amount of memory used to store all the variables and

constants.

To calculate the space complexity, we must know the memory required to store

different datatype values (according to the compiler). For example, the C

Programming Language compiler requires the following...

1. 2 bytes to store Integer value.

2. 4 bytes to store Floating Point value.

3. 1 byte to store Character value.

4. 6 (OR) 8 bytes to store double value.

Consider the following piece of code...

Example 1

int square(int a)

return a*a;

}
In the above sample code, it requires 2 bytes of memory to store variable 'a' and

another 2 bytes of memory is used for return value.

That means, totally it requires 4 bytes of memory to complete its execution. And this

4 bytes of memory is fixed for any input value of 'a'.

Time Complexity:

What is Time complexity?

Every algorithm requires some amount of computer time to execute its instruction to

perform the task. This computer time required is called time complexity.

The time complexity of an algorithm can be defined as follows...

The time complexity of an algorithm is the total amount of time required by an

algorithm to complete its execution.

Generally, the running time of an algorithm depends upon the following...

1. Whether it is running on Single processor machine or Multi processor

machine.

2. Whether it is a 32 bit machine or 64 bit machine.

3. Read and Write speed of the machine.

4. The amount of time required by an algorithm to perform Arithmetic

operations, logical operations, return value and assignment operations etc.,

5. Input data

Calculating Time Complexity of an algorithm based on the system configuration is a

very difficult task because the configuration changes from one system to another

system. To solve this problem, we must assume a model machine with a specific

configuration. So that, we can able to calculate generalized time complexity according

to that model machine.


To calculate the time complexity of an algorithm, we need to define a model machine.

Let us assume a machine with following configuration...

1. It is a Single processor machine

2. It is a 32 bit Operating System machine

3. It performs sequential execution

4. It requires 1 unit of time for Arithmetic and Logical operations

5. It requires 1 unit of time for Assignment and Return value

6. It requires 1 unit of time for Read and Write operations

Now, we calculate the time complexity of following example code by using the above-

defined model machine...

Example 1: The following sample code

int sum (int a, int b)

return a+b;

In the above sample code, it requires 1 unit of time to calculate a+b and 1 unit of time

to return the value. That means, totally it takes 2 units of time to complete its

execution. And it does not change based on the input values of a and b. That means

for all input values, it requires the same amount of time i.e. 2 units.
Asymptotic Analysis:
The efficiency of an algorithm depends on the amount of time, storage and other
resources(Type and size of the processor) required to execute the algorithm. The
efficiency is measured with the help of asymptotic notations.

An algorithm may not have the same performance for different types of inputs. With
the increase in the input size, the performance will change.

The study of change in performance of the algorithm with the change in the order of
the input size is defined as asymptotic analysis.

Asymptotic Notations:
Asymptotic notations are the mathematical notations used to describe the running
time of an algorithm when the input tends towards a particular value or a limiting
value.

For Example: In bubble sort, when the input array is already sorted, the time taken by
the algorithm is linear i.e. the best case.

But, when the input array is in reverse condition, the algorithm takes the maximum
time (quadratic) to sort the elements i.e. the worst case.

When the input array is neither sorted nor in reverse order, then it takes average time.
These durations are denoted using asymptotic notations.

There are mainly three asymptotic notations:

1. Big-O notation
2. Omega notation
3. Theta notation

Big-O Notation (O-notation)

Big-O notation represents the upper bound of the running time of an algorithm.

Thus, it gives the worst-case complexity of an algorithm.


Big-O gives the upper bound of a function

O(g(n)) = { f(n): there exist positive constants c and n0

such that 0 ≤ f(n) ≤ cg(n) for all n ≥ n0 }

The above expression can be described as a function f(n) belongs to the set

O(g(n)) if there exists a positive constant c such that it lies between 0 and

cg(n), for sufficiently large n.

For any value of n, the running time of an algorithm does not cross the time

provided by O(g(n)).Meanwhile it gives the worst-case running time of an

algorithm.

Omega Notation (Ω-notation)

Omega notation represents the lower bound of the running time of an

algorithm. Thus, it provides the best case complexity of an algorithm.


Omega gives the lower bound of a function

Ω(g(n)) = { f(n): there exist positive constants c and n0

such that 0 ≤ cg(n) ≤ f(n) for all n ≥ n0 }

The above expression can be described as a function f(n) belongs to the set

Ω(g(n)) if there exists a positive constant c such that it lies above cg(n), for

sufficiently large n.

For any value of n, the minimum time required by the algorithm is given by

Omega Ω(g(n)).

Theta Notation (Θ-notation)

Theta notation encloses the function from above and below. Since it represents

the upper and the lower bound of the running time of an algorithm, it is used

for analysing the average-case complexity of an algorithm.


Theta bounds the function within constants factors

Θ(g(n)) = { f(n): there exist positive constants c1, c2 and n0

such that 0 ≤ c1g(n) ≤ f(n) ≤ c2g(n) for all n ≥ n0 }

The above expression can be described as a function f(n) belongs to the set

Θ(g(n)) if there exist positive constants c1 and c2 such that it can be fitted in

between c1g(n) and c2g(n), for sufficiently large n.

If a function f(n) lies anywhere in between c1g(n) and c2g(n) for all n ≥ n0,

then f(n) is said to be asymptotically tight bound.


AVL Tree Data structure
AVL tree is a height-balanced binary search tree. That means, an AVL tree is also a
binary search tree but it is a balanced tree. A binary tree is said to be balanced if, the

difference between the heights of left and right subtrees of every node in the tree is

either -1, 0 or +1. In other words, a binary tree is said to be balanced if the height of

left and right children of every node differ by either -1, 0 or +1. In an AVL tree, every

node maintains an extra information known as balance factor. The AVL tree was

introduced in the year 1962 by G.M. Adelson-Velsky and E.M. Landis.

An AVL tree is defined as follows...

An AVL tree is a balanced binary search tree. In an AVL tree, balance factor of

every node is either -1, 0 or +1.

Balance factor of a node is the difference between the heights of the left and right

subtrees of that node. The balance factor of a node is calculated either height of left

subtree - height of right subtree (OR) height of right subtree - height of left subtree.

In the following explanation, we calculate as follows...

Balance factor = height of Left Subtree – height of Right Subtree

Example1 of AVL Tree


The above tree is a binary search tree and every node is satisfying balance factor
condition. So this tree is said to be an AVL tree.

Example2 of AVL Tree

Note: Every AVL Tree is a binary search tree but every Binary Search Tree need

not be AVL tree.


Complexity

Algorithm Average case Worst case

Space o(n) o(n)

Search o(log n) o(log n)

Insert o(log n) o(log n)

Delete o(log n) o(log n)


AVL Tree Rotations

In AVL tree, after performing operations like insertion and deletion we need to
check the balance factor of every node in the tree. If every node satisfies the
balance factor condition then we conclude the operation otherwise, we must
make it balanced. Whenever the tree becomes imbalanced due to any operation
we use rotation operations to make the tree balanced.

Rotation operations are used to make the tree balanced.

Rotation is the process of moving nodes either to left or to right to make the tree

balanced.

There are four rotations and they are classified into two types.

Single Left Rotation (LL Rotation)

In LL Rotation, every node moves one position to left from the current position.
To understand LL Rotation, let us consider the following insertion operation in
AVL Tree...
Single Right Rotation (RR Rotation)

In RR Rotation, every node moves one position to right from the current position. To
understand RR Rotation, let us consider the following insertion operation in AVL
Tree...

Left Right Rotation (LR Rotation)

The LR Rotation is a sequence of single left rotation followed by a single right


rotation. In LR Rotation, at first, every node moves one position to the left and
one position to right from the current position. To understand LR Rotation, let
us consider the following insertion operation in AVL Tree...
Right Left Rotation (RL Rotation)

The RL Rotation is sequence of single right rotation followed by single left rotation. In

RL Rotation, at first every node moves one position to right and one position to left

from the current position. To understand RL Rotation, let us consider the following

insertion operation in AVL Tree...

Operations on an AVL Tree

The following operations are performed on AVL tree...

1. Search

2. Insertion

3. Deletion
Search Operation in AVL Tree

In an AVL tree, the search operation is performed with O (log n) time


complexity. The search operation in the AVL tree is similar to the search
operation in a Binary search tree. We use the following steps to search an
element in AVL tree...

• Step 1 - Read the search element from the user. n


• Step 2 - Compare the search element with the value of root node in the
tree.
• Step 3 - If both are matched, then display "Given node is found!!!" and
terminate the function
• Step 4 - If both are not matched, then check whether search element is
smaller or larger than that node value.
• Step 5 - If search element is smaller, then continue the search process in
left subtree.
• Step 6 - If search element is larger, then continue the search process in
right subtree.
• Step 7 - Repeat the same until we find the exact element or until the
search element is compared with the leaf node.
• Step 8 - If we reach to the node having the value equal to the search value,
then display "Element is found" and terminate the function.
• Step 9 - If we reach to the leaf node and if it is also not matched with the
search element, then display "Element is not found" and terminate the
function.
Insertion Operation in AVL Tree:

In an AVL tree, the insertion operation is performed with O (log n) time


complexity. In AVL Tree, a new node is always inserted as a leaf node. The
insertion operation is performed as follows...

• Step 1 - Insert the new element into the tree using Binary Search Tree
insertion logic.
• Step 2 - After insertion, check the Balance Factor of every node.
• Step 3 - If the Balance Factor of every node is 0 or 1 or -1 then go for next
operation.
• Step 4 - If the Balance Factor of any node is other than 0 or 1 or -1 then
that tree is said to be imbalanced. In this case, perform
suitable Rotation to make it balanced and go for next operation.

Example: Construct an AVL Tree by inserting numbers from 1


to 8.
Deletion Operation in AVL Tree:

The deletion operation in AVL Tree is similar to deletion operation in BST. But
after every deletion operation, we need to check with the Balance Factor
condition. If the tree is balanced after deletion go for next operation otherwise
perform suitable rotation to make the tree Balanced.
Applications of AVL Trees
1. Most in-memory sets and dictionaries are stored using AVL trees.
2. Database applications, where insertions and deletions are less common
but frequent data lookups are necessary, also frequently employ AVL
trees.
3. In addition to database applications, it is employed in other applications
that call for better searching.
4. A balanced binary search tree called an AVL tree uses rotation to keep
things balanced.
5. It may be used in games with plotlines as well.
6. It is mostly utilized in business sectors where it is necessary to keep
records on the employees that work there and their shift changes.
B - Tree Data structure
In search trees like binary search tree, AVL Tree, Red-Black tree, etc., every node

contains only one value (key) and a maximum of two children. But there is a special

type of search tree called B-Tree in which a node contains more than one value (key)

and more than two children. B-Tree was developed in the year 1972 by Bayer and

McCreight with the name Height Balanced m-way Search Tree. Later it was named

as B-Tree.
B-Tree can be defined as follows...

B-Tree is a self-balanced search tree in which every node contains multiple


keys and has more than two children.

Note1: The number of keys in a node and number of children for a node depends on
the order of B-Tree.
Note2: Every B-Tree has an order.

B-Tree of Order m has the following properties...

1. Property 1 - All leaf nodes must be at same level.

2. Property 2 - All nodes except root must have at least [m/2]-1 keys and

maximum of m-1 keys.

3. Property 3 - All non leaf nodes except root (i.e. all internal nodes) must have

at least m/2 children.

4. Property 4 - If the root node is a non leaf node, then it must have atleast

2 children.

5. Property 5 - A non leaf node with n-1 keys must have n number of children.

6. Property 6 - All the key values in a node must be in Ascending Order.

For example, B-Tree of Order 4 contains a maximum of 3 key values in a node

and maximum of 4 children for a node.

Example
Operations on a B-Tree

The following operations are performed on a B-Tree...

1. Search

2. Insertion

3. Deletion

Search Operation in B-Tree

The search operation in B-Tree is similar to the search operation in Binary Search

Tree. In a Binary search tree, the search process starts from the root node and we make

a 2-way decision every time (we go to either left subtree or right subtree). In B-Tree

also search process starts from the root node but here we make an n-way decision

every time. Where 'n' is the total number of children the node has. In a B-Tree, the

search operation is performed with O(log n) time complexity. The search operation is

performed as follows...

Step 1 - Read the search element from the user.

Step 2 - Compare the search element with first key value of root node in the tree.

Step 3 - If both are matched, then display "Given node is found!!!" and terminate

the function

Step 4 - If both are not matched, then check whether search element is smaller or

larger than that key value.


Step 5 - If search element is smaller, then continue the search process in left

subtree.

Step 6 - If search element is larger, then compare the search element with next key

value in the same node and repeat steps 3, 4, 5 and 6 until we find the exact match

or until the search element is compared with last key value in the leaf node.

Step 7 - If the last key value in the leaf node is also not matched then display

"Element is not found" and terminate the function.

Insertion Operation in B-Tree

In a B-Tree, a new element must be added only at the leaf node. That means, the new key Value is

always attached to the leaf node only. The insertion operation is performed as follows...

Step 1 - Check whether tree is Empty.

Step 2 - If tree is Empty, then create a new node with new key value and insert it into the tree
as a root node.

Step 3 - If tree is Not Empty, then find the suitable leaf node to which the new key value is
added using Binary Search Tree logic.

Step 4 - If that leaf node has empty position, add the new key value to that leaf node in
ascending order of key value within the node.

Step 5 - If that leaf node is already full, split that leaf node by sending middle value to its
parent node. Repeat the same until the sending value is fixed into a node.

Step 6 - If the spilting is performed at root node then the middle value becomes new root node
for the tree and the height of the tree is increased by one.

Example
Construct a B-Tree of Order 3 by inserting numbers from 1 to 10.

The Order of B Tree is m=3

Min. Number of keys= m/2-1=3/2-1=1.5-1=0.5=0

Max. Number of keys=m-1=3-1=2

Min. Number of children’s=m/2=3/2=1

Max. Number of Children’s=m=3


Example2:
Insertion operation

The insertion operation for a B Tree is done similar to the Binary Search Tree but the

elements are inserted into the same node until the maximum keys are reached. The

insertion is done using the following procedure −

Step 1 − Calculate the maximum (m−1) and, minimum (⌈m/2⌉−1) number of keys a

node can hold, where m is denoted by the order of the B Tree.

Step 2 − The data is inserted into the tree using the binary search insertion and once

the keys reach the maximum number, the node is split into half and the median key

becomes the internal node while the left and right keys become its children .

Step 3 − All the leaf nodes must be on the same level.


The keys, 5, 3, 21, 9, 13 are all added into the node according to the binary search

property but if we add the key 22, it will violate the maximum key property. Hence,

the node is split in half, the median key is shifted to the parent node and the insertion

is then continued.

Another interruption occurs during the insertion of 11, so the node is split and
median is shifted to the parent.
While inserting 16, even if the node is split in two parts, the parent node also overflows

as it reached the maximum keys. Hence, the parent node is split first and the median

key becomes the root. Then, the leaf node is split in half the median of leaf node is

shifted to its parent.

The final B tree after inserting all the elements is achieved.


Deletion operation of a B Tree
The Deletion operation in a B Tree is slightly different from the deletion operation of
a Binary Search Tree.

Let us say the node to be deleted is called the target key. The target key can either be at
the leaf node or an internal node

The deletion of node in a B-Tree can be broadly classified into two cases:

1. Deletion At Leaf Node.

2. Deletion At Internal Node.

1. If the target key is at the leaf node:

If the target key is at the leaf node, we further study the given data to check if any of

the following cases exist:

Case 1: If the leaf node consists of the min number of keys according to the given

degree/order, then the key is simply deleted from the node.

Case 2: If the leaf contains the minimum number of keys, then:

Case 2a: The node can borrow a key from the immediate left sibling node, if it has

more than the minimum number of keys. The transfer of the keys take place through

the parent node, i.e, the maximum key of the left sibling moves upwards and replaces

the parent; while the parent key moves down to the target node from where the target

key is simply deleted.

Case 2b: The node can borrow a key from the immediate right sibling node, if it has

more than the minimum number of keys. The transfer of the keys take place through

the parent node, i.e, the minimum key of the right sibling moves upwards and replaces

the parent; while the parent key moves down to the target node from where the target

key is simply deleted.

Case 2c: If neither of the siblings have keys more than the minimum number of keys

required then, merge the target node with either the left or the right sibling along with

the parent key of respective node.


2.If the target key is at the internal node:

If the target key is at an internal node, we further study the given data to check if

any of the following cases exist:

Case 1: If the left child has more than the minimum number of keys, the target key in

the internal node is replaced by its inorder predecessor ,i.e, the largest element of the

left child node.

Case 2: If the right child has more than the minimum number of keys, the target key

in the internal node is replaced by it’s inorder successor ,i.e, the smallest element of

the right child node.

Applications of B Tree:
1. Large databases employ it to access information stored on discs.
2. Using the B-Tree, finding data in a data set can be done in a great deal less time.
3. Multilevel indexing is possible with the indexing feature.
4. The B-tree method is also used by the majority of servers.
5. In CAD systems, B-Trees are used to catalogue and search geometric data.
6. Other applications of B-Trees include encryption, computer networks, and
natural language processing.
7. Since accessing values stored in a large database that is stored on a disc takes a
long time, B trees are used to index the data and provide quick access to the
actual data stored on the disks.
8. In the worst case, it takes O (n) running time to search a database with n key
values that is not sorted or indexed. However, if we use B Tree to index this
database, it will be searched in O (log n) time in worst case.

You might also like