
UNIT IV SEARCHING AND SORTING TECHNIQUES

4.1 SEARCHING

In computer science, a search algorithm is an algorithm that retrieves information stored within some data structure. The main searching techniques are:
1. Linear search
2. Binary search
3. Fibonacci search

1. Linear Search

Linear search is a very simple search algorithm. In this type of search, a sequential search is made over all items one by one. Every item is checked; if a match is found, that particular item is returned, otherwise the search continues till the end of the data collection.

• Takes an array and searches each element from the beginning of the array until it finds a match
• The array does not need to be sorted

Linear Search Analysis


• Linear search looks at every element until it finds a match
• The worst-case would be
– the element is not found or it is the last element in the array
– for an array of length n, n comparisons would be required
• Therefore, time complexity of O(n)
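As an illustrative sketch of the steps above (the function name linearSearch is our own choice, not from the notes):

```c
/* Returns the index of the first element equal to x, or -1 if x is
   not present. The array does not need to be sorted. */
int linearSearch(const int arr[], int n, int x)
{
    for (int i = 0; i < n; i++) {  /* check every item one by one */
        if (arr[i] == x)
            return i;              /* match found: return its position */
    }
    return -1;                     /* reached the end without a match */
}
```

In the worst case (x absent, or sitting in the last slot) the loop performs all n comparisons, which is the O(n) bound stated above.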

Click the link to view animation of linear search


http://www.cs.armstrong.edu/liang/animation/web/LinearSearch.html

2. Binary Search

Given a sorted array arr[] of n elements, write a function to search a given element x in arr[]. A simple approach is to do a linear search; the time complexity of that algorithm is O(n). Another approach to perform the same task is to use Binary Search.

Binary Search:
Binary search takes a sorted array of length n:
– It looks at the middle element of the array; if it equals the search key:
  ● return the array position
– Else, if the middle element is greater than the search key:
  ● change the upper bound of the search area to [middle] – 1
  ● search again, looking at the middle element within the new bounds
– If the middle element is less than the search key:
  ● change the lower bound of the search area to [middle] + 1
  ● search again, looking at the middle element within the new bounds
– If the element is not found, return a value outside of the range, or NIL.

The idea of binary search is to use the information that the array is sorted and reduce the time complexity to O(log n).

We basically ignore half of the elements just after one comparison.


1. Compare x with the middle element.
2. If x matches with middle element, we return the mid index.
3. Else If x is greater than the mid element, then x can only lie in right half subarray
after the mid element. So we recur for right half.
4. Else (x is smaller) recur for the left half.
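A minimal iterative C sketch of these four steps (the function name binarySearch is our own):

```c
/* Returns the index of x in the sorted array arr[0..n-1], or -1. */
int binarySearch(const int arr[], int n, int x)
{
    int low = 0, high = n - 1;
    while (low <= high) {
        int mid = low + (high - low) / 2;  /* middle element; this form
                                              avoids overflow of low+high */
        if (arr[mid] == x)
            return mid;        /* step 2: x matches the middle element */
        else if (arr[mid] < x)
            low = mid + 1;     /* step 3: x can only lie in the right half */
        else
            high = mid - 1;    /* step 4: x can only lie in the left half */
    }
    return -1;                 /* search area is empty: x is not present */
}
```

Each iteration halves the search area, which gives the O(log n) bound.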

Click the link to view animation of binary search


http://www.cs.armstrong.edu/liang/animation/web/BinarySearch.html

• Linear search is good:

– for arrays with a small number of elements

– when you will not search the array many times

– when the values in the array will be changed

• Binary search is good:

– for arrays with a large number of elements

– when the values in the array are less likely to change (because of the cost of maintaining sorted order)

3. Fibonacci search

Fibonacci Search is a comparison-based technique that uses Fibonacci numbers to search an element in
a sorted array.

Similarities with Binary Search:

1. Works for sorted arrays
2. A Divide and Conquer algorithm
3. Has O(log n) time complexity

Differences with Binary Search:

1. Fibonacci Search divides the given array into unequal parts.
2. Binary Search uses the division operator to divide the range. Fibonacci Search doesn't use /, but
uses + and -. The division operator may be costly on some CPUs.
3. Fibonacci Search examines relatively closer elements in subsequent steps. So when the input
array is so big that it cannot fit in CPU cache or even in RAM, Fibonacci Search can be useful.

Background:

Fibonacci Numbers are recursively defined as F(n) = F(n-1) + F(n-2), F(0) = 0, F(1) = 1. First few Fibonacci
Numbers are 0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144, ..
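A hedged C sketch of Fibonacci search (identifier names such as fibonacciSearch and offset are our own; this follows the common formulation, not necessarily the one in the class notes):

```c
/* Returns the index of x in the sorted array arr[0..n-1], or -1.
   Uses only + and - to shrink the search range. */
int fibonacciSearch(const int arr[], int n, int x)
{
    int fibMm2 = 0;              /* (m-2)'th Fibonacci number */
    int fibMm1 = 1;              /* (m-1)'th Fibonacci number */
    int fibM = fibMm2 + fibMm1;  /* m'th Fibonacci number */

    /* Find the smallest Fibonacci number >= n */
    while (fibM < n) {
        fibMm2 = fibMm1;
        fibMm1 = fibM;
        fibM = fibMm2 + fibMm1;
    }

    int offset = -1;  /* front of the range already eliminated */

    while (fibM > 1) {
        /* Probe a valid index: min(offset + fibMm2, n - 1) */
        int i = (offset + fibMm2 < n - 1) ? offset + fibMm2 : n - 1;

        if (arr[i] < x) {         /* discard the part up to and including i */
            fibM = fibMm1;
            fibMm1 = fibMm2;
            fibMm2 = fibM - fibMm1;
            offset = i;
        } else if (arr[i] > x) {  /* discard the part after i */
            fibM = fibMm2;
            fibMm1 = fibMm1 - fibMm2;
            fibMm2 = fibM - fibMm1;
        } else {
            return i;             /* found */
        }
    }

    /* One element may remain to be compared */
    if (fibMm1 && offset + 1 < n && arr[offset + 1] == x)
        return offset + 1;

    return -1;
}
```

Note how each probe splits the range into two unequal parts of Fibonacci sizes, in contrast to binary search's even split.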

Note: for complete programs for these searching techniques, refer to the class notes or record.

4.2 INTRODUCTION TO SORTING

Definition. In general, sorting means rearrangement of data according to a defined pattern. The task of a sorting algorithm is to transform the original unsorted sequence into the sorted sequence.
Basic terminologies

Throughout this chapter, some terms related to sorting will be used. These terms are defined in this section.

Internal sort: when the set of data to be sorted is small enough that the entire sorting can be performed in the computer's internal storage (primary memory), the sorting is called internal sort.

External sort: sorting of a large set of data (which can't fit in primary memory), stored in the computer's slower external memory (such as hard disk, magnetic tape, etc.), is called external sort.

1. Processing of large files that are unable to fit into the main memory
2. Restrictions on access, depending on the external storage medium
3. Primary costs are for input/output
4. Main concern: minimize the number of times each piece of data is moved between the external storage and the main memory

Ascending order: non-decreasing order of the data set, e.g. [1,2,3,4,5,6,7,8,9]

Descending order: this type of order satisfies the >= relation between any two consecutive data items, e.g. [9,8,7,6,5,4,3,2,1]

Lexicographic order: If data are in the form of characters or strings of characters and are arranged in the same order as in a dictionary, then it is called lexicographic order.
Ex: [ada, bat, cat, mat, max, may, min]

Swap: Swap between two data storages implies the interchange of their contents.
Before swap : A[1] = 11 A[5] = 99
After swap: A[1] = 99 A[5] = 11
Inplace sorting: In-place sorting is a form of sorting in which a small amount of extra space is
used to manipulate the input set.
An in-place sorting algorithm directly modifies the list that it receives as input instead of creating a new list that
is then modified.

Stable sorting: A stable sorting algorithm keeps elements with equal sorting keys in the same relative order in the output as they had in the input.

Before sorting: 1 2 1 4 4 5
After sorting:  1 1 2 4 4 5
Bubble sort is a stable algorithm, whereas e.g. quick sort isn't.

4.3 SORTING TECHNIQUES:

Though many sorting techniques have been developed, all of them can be classified into two broad categories: 1. Internal sorting 2. External sorting

Classification of sorting techniques:

sort
  - internal
      - comparison-based: insertion, selection, exchange
      - distribution-based
  - external
      - two-way merge

1. Bubble Sort
Bubble Sort is the simplest sorting algorithm. It works by repeatedly swapping adjacent elements if they are in the wrong order.
It is called Bubble Sort because with each iteration the smaller elements in the list bubble up towards the front, just like a water bubble rises up to the water surface.
Sorting takes place by stepping through all the data items one by one in pairs, comparing adjacent data items and swapping each pair that is out of order.
Example:
First Pass:
( 5 1 4 2 8 ) –> ( 1 5 4 2 8 ),
Here, algorithm compares the first two elements, and swaps since 5 > 1.
( 1 5 4 2 8 ) –> ( 1 4 5 2 8 ), Swap since 5 > 4
( 1 4 5 2 8 ) –> ( 1 4 2 5 8 ), Swap since 5 > 2
( 1 4 2 5 8 ) –> ( 1 4 2 5 8 ), Now, since these elements are already in order (8 > 5), algorithm does
not swap them.
Second Pass:
( 1 4 2 5 8 ) –> ( 1 4 2 5 8 )
( 1 4 2 5 8 ) –> ( 1 2 4 5 8 ), Swap since 4 > 2
( 1 2 4 5 8 ) –> ( 1 2 4 5 8 )
( 1 2 4 5 8 ) –> ( 1 2 4 5 8 )
Now, the array is already sorted, but our algorithm does not know if it is completed. The algorithm
needs one whole pass without any swap to know it is sorted.
Third Pass:
( 1 2 4 5 8 ) –> ( 1 2 4 5 8 )
( 1 2 4 5 8 ) –> ( 1 2 4 5 8 )
( 1 2 4 5 8 ) –> ( 1 2 4 5 8 )
( 1 2 4 5 8 ) –> ( 1 2 4 5 8 )

Sorting using Bubble Sort Algorithm

Let's consider an array with values {5, 1, 6, 2, 4, 3}


int a[6] = {5, 1, 6, 2, 4, 3};
int i, j, temp;
for (i = 0; i < 6; i++)
{
    for (j = 0; j < 6 - i - 1; j++)
    {
        if (a[j] > a[j+1])
        {
            temp = a[j];        // swap a[j] and a[j+1]
            a[j] = a[j+1];
            a[j+1] = temp;
        }
    }
}
// now you can print the sorted array after this
Above is the algorithm to sort an array using Bubble Sort. Although the above logic will sort an unsorted array, the algorithm isn't efficient and can be enhanced further: as per the above logic, the for loop will keep going for six iterations even if the array gets sorted after the second iteration.
Hence we can insert a flag and keep checking whether swapping of elements is taking place or not. If no swapping is taking place, that means the array is sorted and we can jump out of the for loop.
int a[6] = {5, 1, 6, 2, 4, 3};
int i, j, temp;
for (i = 0; i < 6; i++)
{
    int flag = 0;               // taking a flag variable
    for (j = 0; j < 6 - i - 1; j++)
    {
        if (a[j] > a[j+1])
        {
            temp = a[j];
            a[j] = a[j+1];
            a[j+1] = temp;
            flag = 1;           // setting flag as 1, if swapping occurs
        }
    }
    if (!flag)                  // breaking out of the for loop if no swapping takes place
    {
        break;
    }
}
In the above code, if in a complete single cycle of j iteration(inner for loop), no swapping takes
place, and flag remains 0, then we will break out of the for loops, because the array has already
been sorted.
Complexity Analysis of Bubble Sort
In Bubble Sort, n-1 comparisons are done in the 1st pass, n-2 in the 2nd pass, n-3 in the 3rd pass, and so on. So the total number of comparisons is
(n-1) + (n-2) + (n-3) + ... + 3 + 2 + 1 = n(n-1)/2
i.e. O(n^2).
Hence the complexity of Bubble Sort is O(n^2).
The main advantage of Bubble Sort is the simplicity of the algorithm. The space complexity of Bubble Sort is O(1), because only a single additional memory location is required, for the temp variable.
Worst and Average Case Time Complexity: O(n^2). The worst case occurs when the array is reverse sorted.
Best Case Time Complexity: O(n). The best case occurs when the array is already sorted (with the flag version above).
Auxiliary Space: O(1)
Boundary Cases: Bubble sort takes minimum time (order of n) when elements are already sorted.
Sorting In Place: Yes
Stable: Yes
For logic explanation click on video link https://youtu.be/xWBP4lzkoyM
For animation bubble sort please click the link:
http://www.cs.armstrong.edu/liang/animation/web/BubbleSort.html

2. Selection Sort
The selection sort algorithm sorts an array by repeatedly finding the minimum element
(considering ascending order) from unsorted part and putting it at the beginning. The algorithm
maintains two subarrays in a given array.
1) The subarray which is already sorted.
2) Remaining subarray which is unsorted.
In every iteration of selection sort, the minimum element (considering ascending order) from
the unsorted subarray is picked and moved to the sorted subarray.
Following example explains the above steps:
Sorting using Selection Sort Algorithm
void selectionSort(int a[], int size)
{
    int i, j, min, temp;
    for (i = 0; i < size - 1; i++)
    {
        min = i;                // setting min as i
        for (j = i + 1; j < size; j++)
        {
            if (a[j] < a[min])  // if element at j is less than element at min position
            {
                min = j;        // then set min as j
            }
        }
        temp = a[i];            // swap a[i] and a[min]
        a[i] = a[min];
        a[min] = temp;
    }
}
Complexity Analysis of Selection Sorting

Selection sort is not difficult to analyze compared to other sorting algorithms since none
of the loops depend on the data in the array.
Selecting the lowest element requires scanning all n elements (this takes n − 1 comparisons) and
then swapping it into the first position. Finding the next lowest element requires scanning the
remaining n − 1 elements and so on,
for a total of (n − 1) + (n − 2) + ... + 2 + 1 = n(n − 1)/2 ∈ O(n^2) comparisons.

Worst Case Time Complexity: O(n^2)
Best Case Time Complexity: O(n^2)
Average Time Complexity: O(n^2)

Time Complexity: O(n^2) in every case, as there are two nested loops whose bounds do not depend on the data.

Space Complexity / Auxiliary Space: O(1)

The good thing about selection sort is it never makes more than O(n) swaps and can be useful
when memory write is a costly operation.

For logic explanation click the video link


https://youtu.be/xWBP4lzkoyM
For animation click link:
http://cs.armstrong.edu/liang/animation/web/SelectionSortNew.html

3. Insertion Sort
Insertion sort algorithm arranges a list of elements in a particular order. In insertion sort
algorithm, every iteration moves an element from unsorted portion to sorted portion until all
the elements are sorted in the list.
Step by Step Process
The insertion sort algorithm is performed using the following steps...
● Step 1: Assume that the first element in the list is in the sorted portion of the list and all remaining elements are in the unsorted portion.
● Step 2: Take the first element from the unsorted portion and insert it into the sorted portion at the position that keeps the sorted portion in order.
● Step 3: Repeat the above process until all the elements from the unsorted portion are moved into the sorted portion.


Sorting using Insertion Sort Algorithm


int a[6] = {5, 1, 6, 2, 4, 3};
int i, j, key;
for (i = 1; i < 6; i++)
{
    key = a[i];            // element to be inserted into the sorted part
    j = i - 1;
    while (j >= 0 && key < a[j])
    {
        a[j+1] = a[j];     // shift greater elements one slot right
        j--;
    }
    a[j+1] = key;          // insert key at its correct position
}

Now let's understand the above simple insertion sort algorithm. We took an array with 6 integers. We take a variable key, into which we put each element of the array in each pass, starting from the second element, that is a[1].
Then, using the while loop, we shift elements right until j goes below zero or we find an element that is less than or equal to the key, and then we insert the key just after that position.
In the above array, first we pick 1 as the key and compare it with 5 (the element before 1); since 1 is smaller than 5, we shift 1 before 5. Then we pick 6 and compare it with 5 and 1; no shifting this time. Then 2 becomes the key and is compared with 6 and 5, and 2 is placed after 1. And this goes on until the complete array gets sorted.
Time Complexity: O(n^2)

Auxiliary Space: O(1)

Boundary Cases: Insertion sort takes maximum time to sort if elements are sorted in reverse order.
And it takes minimum time (Order of n) when elements are already sorted.

Algorithmic Paradigm: Incremental Approach

Sorting In Place: Yes

Stable: Yes

Uses: Insertion sort is used when number of elements is small. It can also be useful when input
array is almost sorted, only few elements are misplaced in complete big array.
For logic explanation click on video link:
https://youtu.be/OGzPmgsI-pQ
Insert sort animation click link below:
http://cs.armstrong.edu/liang/animation/web/InsertionSortNew.html

4. QuickSort
Like Merge Sort, QuickSort is a Divide and Conquer algorithm. It picks an element as pivot and
partitions the given array around the picked pivot. There are many different versions of
quickSort that pick pivot in different ways.
1. Always pick first element as pivot.
2. Always pick last element as pivot (implemented below)
3. Pick a random element as pivot.
4. Pick median as pivot.

The key process in quickSort is partition(). Target of partitions is, given an array and an element
x of array as pivot, put x at its correct position in sorted array and put all smaller elements (smaller
than x) before x, and put all greater elements (greater than x) after x. All this should be done in
linear time.
Pseudocode for the recursive QuickSort function:

/* low --> starting index, high --> ending index */
quickSort(arr[], low, high)
{
    if (low < high)
    {
        /* pi is the partitioning index; arr[pi] is now at the right place */
        pi = partition(arr, low, high);

        quickSort(arr, low, pi - 1);   // before pi
        quickSort(arr, pi + 1, high);  // after pi
    }
}
Partition Algorithm
The logic is simple, we start from the leftmost element and keep track of index of smaller (or equal
to) elements as i. While traversing, if we find a smaller element, we swap current element with
arr[i]. Otherwise we ignore current element.
Pseudocode for partition():

/* This function takes the last element as pivot, places the pivot element
   at its correct position in the sorted array, and places all smaller
   elements (smaller than the pivot) to the left of the pivot and all
   greater elements to its right */
partition(arr[], low, high)
{
    pivot = arr[high];  // pivot (element to be placed at its final position)
    i = (low - 1)       // index of the last smaller element

    for (j = low; j <= high - 1; j++)
    {
        // if the current element is smaller than or equal to the pivot
        if (arr[j] <= pivot)
        {
            i++;  // increment index of smaller element
            swap arr[i] and arr[j]
        }
    }
    swap arr[i + 1] and arr[high]
    return (i + 1)
}
Illustration of partition() :

arr[] = {10, 80, 30, 90, 40, 50, 70}
Indexes:   0   1   2   3   4   5   6

low = 0, high = 6, pivot = arr[high] = 70
Initialize index of smaller element, i = -1

Traverse elements from j = low to high-1:

j = 0: Since arr[j] <= pivot, do i++ and swap(arr[i], arr[j])
       i = 0
       arr[] = {10, 80, 30, 90, 40, 50, 70} // no change, as i and j are the same
j = 1: Since arr[j] > pivot, do nothing // no change in i or arr[]
j = 2: Since arr[j] <= pivot, do i++ and swap(arr[i], arr[j])
       i = 1
       arr[] = {10, 30, 80, 90, 40, 50, 70} // we swap 80 and 30
j = 3: Since arr[j] > pivot, do nothing // no change in i or arr[]
j = 4: Since arr[j] <= pivot, do i++ and swap(arr[i], arr[j])
       i = 2
       arr[] = {10, 30, 40, 90, 80, 50, 70} // 80 and 40 swapped
j = 5: Since arr[j] <= pivot, do i++ and swap(arr[i], arr[j])
       i = 3
       arr[] = {10, 30, 40, 50, 80, 90, 70} // 90 and 50 swapped

We come out of the loop because j is now equal to high. Finally, we place the pivot at its correct position by swapping arr[i+1] and arr[high] (the pivot):
arr[] = {10, 30, 40, 50, 70, 90, 80} // 80 and 70 swapped

Now 70 is at its correct place. All elements smaller than 70 are before it and all elements greater than 70 are after it.
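The pseudocode can be turned into a runnable C sketch as follows (the helper name swap is our own; this is the Lomuto scheme with the last element as pivot, as described above):

```c
/* Exchanges the contents of two int locations. */
void swap(int *a, int *b) { int t = *a; *a = *b; *b = t; }

/* Places arr[high] (the pivot) at its final sorted position and
   returns that position; smaller elements end up to its left. */
int partition(int arr[], int low, int high)
{
    int pivot = arr[high];   /* pivot: last element */
    int i = low - 1;         /* index of the last "smaller" element */
    for (int j = low; j <= high - 1; j++) {
        if (arr[j] <= pivot) {
            i++;
            swap(&arr[i], &arr[j]);
        }
    }
    swap(&arr[i + 1], &arr[high]);  /* place pivot in its final slot */
    return i + 1;
}

void quickSort(int arr[], int low, int high)
{
    if (low < high) {
        int pi = partition(arr, low, high);
        quickSort(arr, low, pi - 1);   /* left of pivot */
        quickSort(arr, pi + 1, high);  /* right of pivot */
    }
}
```

Calling quickSort(arr, 0, 6) on the example array {10, 80, 30, 90, 40, 50, 70} reproduces the partition steps traced above.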
Analysis of QuickSort
Time taken by QuickSort in general can be written as following.
T(n) = T(k) + T(n-k-1) + O(n)
The first two terms are for two recursive calls, the last term is for the partition process. k is the
number of elements which are smaller than pivot.
The time taken by QuickSort depends upon the input array and partition strategy. Following are
three cases.
Worst Case: The worst case occurs when the partition process always picks greatest or smallest
element as pivot. If we consider above partition strategy where last element is always picked as
pivot, the worst case would occur when the array is already sorted in increasing or decreasing
order. Following is recurrence for worst case.
T(n) = T(0) + T(n-1) + O(n)
which is equivalent to
T(n) = T(n-1) + O(n)
The solution of the above recurrence is O(n^2).
Best Case: The best case occurs when the partition process always picks the middle element as
pivot. Following is recurrence for best case.
T(n) = 2T(n/2) + O(n)
The solution of above recurrence is O(nLogn).
Worst Case Time Complexity: O(n^2)
Best Case Time Complexity: O(n log n)
Average Time Complexity: O(n log n)
Space Complexity: O(log n) on average for the recursion stack; O(n) in the worst case
● The extra space required by quicksort is small: apart from the array itself, only the recursion stack is needed, which is O(log n) on average.
● Quick sort is not a stable sorting technique, so it might change the relative order of two equal elements in the list while sorting.
For Partition in quicksort animation click link;

http://cs.armstrong.edu/liang/animation/web/QuickSortNew.html

For logic click video link

https://youtu.be/PgBzjlCcFvc

5. Merge Sort
Like QuickSort, Merge Sort is a Divide and Conquer algorithm.
It divides the input array into two halves, calls itself for the two halves, and then merges the two sorted halves. The merge() function is used for merging two halves: merge(arr, l, m, r) is the key process, which assumes that arr[l..m] and arr[m+1..r] are sorted and merges the two sorted sub-arrays into one.

MergeSort(arr[], l, r)
If r > l
    1. Find the middle point to divide the array into two halves:
           middle m = (l + r)/2
    2. Call mergeSort for the first half:
           Call mergeSort(arr, l, m)
    3. Call mergeSort for the second half:
           Call mergeSort(arr, m+1, r)
    4. Merge the two halves sorted in steps 2 and 3:
           Call merge(arr, l, m, r)
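A hedged C sketch of this pseudocode (the temporary-buffer approach; the function names follow the pseudocode, but the details are our own):

```c
#include <stdlib.h>

/* Merges the sorted halves arr[l..m] and arr[m+1..r] into one sorted run. */
void merge(int arr[], int l, int m, int r)
{
    int n1 = m - l + 1, n2 = r - m;
    int *L = malloc(n1 * sizeof(int));   /* copy of the left half */
    int *R = malloc(n2 * sizeof(int));   /* copy of the right half */
    for (int p = 0; p < n1; p++) L[p] = arr[l + p];
    for (int q = 0; q < n2; q++) R[q] = arr[m + 1 + q];

    int i = 0, j = 0, k = l;
    while (i < n1 && j < n2)             /* pick the smaller head element; */
        arr[k++] = (L[i] <= R[j]) ? L[i++] : R[j++];  /* <= keeps it stable */
    while (i < n1) arr[k++] = L[i++];    /* copy any leftovers */
    while (j < n2) arr[k++] = R[j++];

    free(L);
    free(R);
}

void mergeSort(int arr[], int l, int r)
{
    if (l < r) {
        int m = (l + r) / 2;      /* middle point */
        mergeSort(arr, l, m);     /* sort the first half */
        mergeSort(arr, m + 1, r); /* sort the second half */
        merge(arr, l, m, r);      /* merge the two sorted halves */
    }
}
```

The O(n) auxiliary space noted below comes from the temporary L and R buffers used during merging.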
The following diagram from wikipedia shows the complete merge sort process for an example
array {38, 27, 43, 3, 9, 82, 10}. If we take a closer look at the diagram, we can see that the array
is recursively divided in two halves till the size becomes 1. Once the size becomes 1, the merge
processes comes into action and starts merging arrays back till the complete array is merged.
Time Complexity: Merge Sort is a recursive algorithm, and its time complexity can be expressed as the following recurrence relation.
T(n) = 2T(n/2) + O(n)
The above recurrence can be solved either using the Recurrence Tree method or the Master method. It falls in Case II of the Master method, and the solution of the recurrence is O(n log n).
The time complexity of Merge Sort is O(n log n) in all 3 cases (worst, average and best), as merge sort always divides the array into two halves and takes linear time to merge the two halves.
Auxiliary Space: O(n)
Algorithmic Paradigm: Divide and Conquer
Sorting In Place: No in a typical implementation
Stable: Yes

For logic explanation click video link:


https://youtu.be/JSceec-wEyw
For animation of merging of two sorted arrays click link
http://cs.armstrong.edu/liang/animation/web/MergeSortNew.html

6. Radix Sort
The lower bound for comparison-based sorting algorithms (Merge Sort, Heap Sort, Quick Sort, etc.) is Ω(n log n), i.e., they cannot do better than n log n. Can we sort faster when the keys are numbers with a bounded number of digits? Radix Sort is the answer. The idea of Radix Sort is to sort digit by digit, starting from the least significant digit and moving to the most significant digit. Radix Sort uses counting sort as a subroutine.
The Radix Sort Algorithm
1) Do the following for each digit i, where i varies from the least significant digit to the most significant digit:
    a) Sort the input array using counting sort (or any stable sort) according to the i'th digit.
Example:
Original, unsorted list:
170, 45, 75, 90, 802, 24, 2, 66

Sorting by least significant digit (1s place) gives: [*Notice that we keep 802 before 2, because 802
occurred before 2 in the original list, and similarly for pairs 170 & 90 and 45 & 75.]
170, 90, 802, 2, 24, 45, 75, 66

Sorting by next digit (10s place) gives: [*Notice that 802 again comes before 2 as 802 comes
before 2 in the previous list.]
802, 2, 24, 45, 66, 170, 75, 90

Sorting by most significant digit (100s place) gives:


2, 24, 45, 66, 75, 90, 170, 802
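A C sketch of this procedure (the names getMax, countSortByDigit and radixSort are our own; it assumes non-negative integers and uses counting sort on each decimal digit as the stable subroutine):

```c
/* Returns the largest element, which determines how many digit passes we need. */
int getMax(const int arr[], int n)
{
    int mx = arr[0];
    for (int i = 1; i < n; i++)
        if (arr[i] > mx) mx = arr[i];
    return mx;
}

/* Stable counting sort of arr by the digit at place value exp (1, 10, 100, ...). */
void countSortByDigit(int arr[], int n, int exp)
{
    int output[n];
    int count[10] = {0};

    for (int i = 0; i < n; i++)            /* count occurrences of each digit */
        count[(arr[i] / exp) % 10]++;
    for (int d = 1; d < 10; d++)           /* prefix sums give final positions */
        count[d] += count[d - 1];
    for (int i = n - 1; i >= 0; i--) {     /* walking backwards keeps it stable */
        output[count[(arr[i] / exp) % 10] - 1] = arr[i];
        count[(arr[i] / exp) % 10]--;
    }
    for (int i = 0; i < n; i++)
        arr[i] = output[i];
}

void radixSort(int arr[], int n)
{
    int m = getMax(arr, n);
    for (int exp = 1; m / exp > 0; exp *= 10)  /* one pass per digit, LSD first */
        countSortByDigit(arr, n, exp);
}
```

Running radixSort on {170, 45, 75, 90, 802, 24, 2, 66} performs exactly the three digit passes traced above.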

Click for video link


https://youtu.be/nu4gDuFabIM
Click for animation example
http://cs.armstrong.edu/liang/animation/web/RadixSort.html

7. ShellSort
ShellSort is mainly a variation of Insertion Sort. In insertion sort, we move elements only one position ahead; when an element has to be moved far ahead, many movements are involved.

The idea of ShellSort is to allow the exchange of far-apart items:

In ShellSort, we make the array h-sorted for a large value of h.
We keep reducing the value of h until it becomes 1.
An array is said to be h-sorted if every sublist formed by taking every h'th element (starting from any position) is sorted.

/* function to sort arr using shellSort */
int shellSort(int arr[], int n)
{
    // Start with a big gap, then reduce the gap
    for (int gap = n/2; gap > 0; gap /= 2)
    {
        // Do a gapped insertion sort for this gap size.
        // The first gap elements a[0..gap-1] are already in gapped order;
        // keep adding one more element until the entire array is gap sorted.
        for (int i = gap; i < n; i += 1)
        {
            // add a[i] to the elements that have been gap sorted:
            // save a[i] in temp and make a hole at position i
            int temp = arr[i];

            // shift earlier gap-sorted elements up until the correct
            // location for a[i] is found
            int j;
            for (j = i; j >= gap && arr[j - gap] > temp; j -= gap)
                arr[j] = arr[j - gap];

            // put temp (the original a[i]) in its correct location
            arr[j] = temp;
        }
    }
    return 0;
}
Shell's original sequence: N/2, N/4, ..., 1 (repeatedly divide by 2).

Shellsort's worst-case performance using Hibbard's increments is Θ(n^(3/2)).
The average performance is thought to be about O(n^(5/4)).

Click for video link


https://youtu.be/SHcPqUe2GZM

Comparison of sorting algorithms

Sort            | Average    | Best       | Worst      | Space    | Stable | Remarks
----------------|------------|------------|------------|----------|--------|--------
Bubble sort     | O(n^2)     | O(n^2)     | O(n^2)     | Constant | Yes    | Always use the modified bubble sort instead
Modified        | O(n^2)     | O(n)       | O(n^2)     | Constant | Yes    | Stops after reaching a sorted array
bubble sort     |            |            |            |          |        |
Selection sort  | O(n^2)     | O(n^2)     | O(n^2)     | Constant | No     | Even a perfectly sorted input requires scanning the entire array
Insertion sort  | O(n^2)     | O(n)       | O(n^2)     | Constant | Yes    | In the best case (already sorted), every insert requires constant time
Shell sort      | O(n log n) | O(n)       | O(n^2)     | Constant | No     | Depends on the gap sequence between two compared elements
Merge sort      | O(n log n) | O(n log n) | O(n log n) | Depends  | Yes    | On arrays, merge sort requires O(n) space; on linked lists, merge sort requires constant space
Quicksort       | O(n log n) | O(n log n) | O(n^2)     | O(log n) | No     | Randomly picking a pivot value (or shuffling the array prior to sorting) can help avoid worst-case scenarios such as a perfectly sorted array

Algorithm      | Data Structure | In place | Worst-Case Auxiliary Space
---------------|----------------|----------|---------------------------
Quicksort      | Array          | Yes      | O(n)
Mergesort      | Array          | No       | O(n)
Bubble sort    | Array          | Yes      | O(1)
Insertion sort | Array          | Yes      | O(1)
Shell sort     | Array          | Yes      | O(1)
Selection sort | Array          | Yes      | O(1)
Radix sort     | Array          | No       | O(n+k)
