UNIT IV - Searching and Sorting
4.1 SEARCHING
In computer science, a search algorithm is an algorithm that retrieves information stored within
some data structure. The various searching techniques are:
1. Linear search
2. Binary search
3. Fibonacci search
1.Linear search
Linear search is a very simple search algorithm. In this type of search, a sequential pass is
made over the items one by one: every item is checked, and if a match is found then that particular
item is returned; otherwise the search continues till the end of the data collection.
● Takes an array and searches each element from the beginning of the array until it finds a match.
● The array does not need to be sorted.
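The sequential scan described above can be sketched as a short C function (a sketch; the name linearSearch and the convention of returning -1 for "not found" are illustrative choices, not from the notes):

```c
/* Returns the index of key in a[0..n-1], or -1 if it is not present.
   Checks every item one by one from the beginning of the array. */
int linearSearch(const int a[], int n, int key)
{
    for (int i = 0; i < n; i++)
        if (a[i] == key)
            return i;   /* match found: return its position */
    return -1;          /* reached the end without a match */
}
```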
2. Binary Search
Given a sorted array arr[] of n elements, write a function to search a given element x in arr[].
A simple approach is to do a linear search; its time complexity is O(n). Another approach that
performs the same task faster on sorted data is Binary Search.
Binary Search:
Binary search takes a sorted array of length n:
– Looks at the middle element of the array; if it equals the search key,
● returns that array position.
– If the middle element is greater than the search key,
● changes the upper bound of the search area to middle − 1, and
● searches again, looking at the middle element within the new bounds.
– If the middle element is less than the search key,
● changes the lower bound of the search area to middle + 1, and
● searches again, looking at the middle element within the new bounds.
– If the element is not found, returns a value outside of the valid range, or NIL.
The idea of binary search is to use the information that the array is sorted to reduce the
time complexity to O(log n). It is best suited when the values in the array are less likely to
change, since there is a cost to maintaining the sorted order.
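The steps above can be sketched in C (a sketch; the iterative form and the -1 "not found" return value are assumptions, matching the linear-search convention):

```c
/* Iterative binary search on a sorted array; returns the index of key
   or -1 if it is not present. */
int binarySearch(const int a[], int n, int key)
{
    int low = 0, high = n - 1;
    while (low <= high) {
        int mid = low + (high - low) / 2;  /* middle of the current bounds */
        if (a[mid] == key)
            return mid;        /* key found at the middle element */
        else if (a[mid] > key)
            high = mid - 1;    /* change upper bound: search left half */
        else
            low = mid + 1;     /* change lower bound: search right half */
    }
    return -1;                 /* element is not in the array */
}
```

Note the `low + (high - low) / 2` form: it computes the same midpoint as `(low + high) / 2` without risking integer overflow on very large arrays.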
3. Fibonacci search
Fibonacci Search is a comparison-based technique that uses Fibonacci numbers to search an element in
a sorted array.
Background:
Fibonacci numbers are recursively defined as F(n) = F(n-1) + F(n-2), with F(0) = 0 and F(1) = 1.
The first few Fibonacci numbers are 0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144, ...
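One common way to implement Fibonacci search in C is the offset-based formulation below (a sketch; the helper name and -1 "not found" convention are illustrative). It keeps three consecutive Fibonacci numbers and uses them to decide how much of the array to eliminate at each step:

```c
static int minIdx(int x, int y) { return (x <= y) ? x : y; }

/* Fibonacci search on a sorted array; returns index of key or -1. */
int fibonacciSearch(const int a[], int n, int key)
{
    int fib2 = 0;            /* F(m-2) */
    int fib1 = 1;            /* F(m-1) */
    int fib = fib2 + fib1;   /* F(m): smallest Fibonacci number >= n */
    while (fib < n) {
        fib2 = fib1;
        fib1 = fib;
        fib = fib2 + fib1;
    }
    int offset = -1;         /* front part of the array already eliminated */
    while (fib > 1) {
        int i = minIdx(offset + fib2, n - 1);  /* probe position */
        if (a[i] < key) {                /* key lies after i: drop the front */
            fib = fib1; fib1 = fib2; fib2 = fib - fib1;
            offset = i;
        } else if (a[i] > key) {         /* key lies before i */
            fib = fib2; fib1 = fib1 - fib2; fib2 = fib - fib1;
        } else {
            return i;                    /* key found */
        }
    }
    /* one element may remain to be checked */
    if (fib1 && offset + 1 < n && a[offset + 1] == key)
        return offset + 1;
    return -1;
}
```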
Note: for programs for the searching techniques, look into the class notes or record.
4.2 SORTING
Definition: In general, sorting means rearrangement of data according to a defined pattern. The
task of a sorting algorithm is to transform the original unsorted sequence into the sorted sequence.
Basic terminologies
Throughout this chapter, some terms related to sorting will be used. These terms are defined in
this section.
Internal sort: When the set of data to be sorted is small enough that the entire sorting can be
performed in the computer's internal storage (primary memory), the sorting is called an internal sort.
External sort: Sorting of a large set of data (which cannot fit in primary memory) that is stored
in the computer's slower external memory (such as a hard disk or magnetic tape) is called an
external sort.
Ascending order: This type of order satisfies the <= relation between any two consecutive data items.
Ex: [1,2,3,4,5,6,7,8,9]
Descending order: This type of order satisfies the >= relation between any two consecutive data items.
Ex: [9,8,7,6,5,4,3,2,1]
Lexicographic order: If the data are characters or strings of characters and are arranged in the
same order as in a dictionary, the arrangement is called lexicographic order.
Ex: [ada, bat, cat, mat, max, may, min]
Swap: A swap between two data storages implies the interchange of their contents.
Before swap: A[1] = 11, A[5] = 99
After swap: A[1] = 99, A[5] = 11
Inplace sorting: In-place sorting is a form of sorting in which a small amount of extra space is
used to manipulate the input set.
An in-place sorting algorithm directly modifies the list that it receives as input instead of creating a new list that
is then modified.
Stable sorting: A stable sorting algorithm keeps elements with equal sorting keys in the same
relative order that they had in the input.
Before sorting: 1 2 1 4 4 5
After sorting: 1 1 2 4 4 5
Bubble sort is a stable algorithm, whereas e.g. quick sort isn't.
Though many sorting techniques have been developed, all of them can be classified into two broad
categories:
1. Internal sorting
2. External sorting
1.Bubble Sort
Bubble Sort is the simplest sorting algorithm. It works by repeatedly swapping adjacent
elements if they are in the wrong order.
It is called bubble sort because, with each iteration, the smaller elements in the list bubble up
towards the first place, just like a water bubble rises up to the water surface.
Sorting takes place by stepping through all the data items one by one in pairs, comparing
adjacent data items and swapping each pair that is out of order.
Example:
First Pass:
( 5 1 4 2 8 ) –> ( 1 5 4 2 8 ),
Here, the algorithm compares the first two elements and swaps them, since 5 > 1.
( 1 5 4 2 8 ) –> ( 1 4 5 2 8 ), Swap since 5 > 4
( 1 4 5 2 8 ) –> ( 1 4 2 5 8 ), Swap since 5 > 2
( 1 4 2 5 8 ) –> ( 1 4 2 5 8 ), Now, since these elements are already in order (8 > 5), the
algorithm does not swap them.
Second Pass:
( 1 4 2 5 8 ) –> ( 1 4 2 5 8 )
( 1 4 2 5 8 ) –> ( 1 2 4 5 8 ), Swap since 4 > 2
( 1 2 4 5 8 ) –> ( 1 2 4 5 8 )
( 1 2 4 5 8 ) –> ( 1 2 4 5 8 )
Now, the array is already sorted, but our algorithm does not know if it is completed. The algorithm
needs one whole pass without any swap to know it is sorted.
Third Pass:
( 1 2 4 5 8 ) –> ( 1 2 4 5 8 )
( 1 2 4 5 8 ) –> ( 1 2 4 5 8 )
( 1 2 4 5 8 ) –> ( 1 2 4 5 8 )
( 1 2 4 5 8 ) –> ( 1 2 4 5 8 )
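The passes above, including the "one whole pass without any swap" stopping rule, can be sketched in C (the function name and the early-exit flag are illustrative):

```c
/* Bubble sort with early exit: a full pass with no swaps means the
   array is already sorted, so the algorithm stops. */
void bubbleSort(int a[], int n)
{
    for (int i = 0; i < n - 1; i++) {
        int swapped = 0;
        /* after pass i, the last i elements are already in place */
        for (int j = 0; j < n - 1 - i; j++) {
            if (a[j] > a[j + 1]) {        /* adjacent pair out of order */
                int temp = a[j];
                a[j] = a[j + 1];
                a[j + 1] = temp;
                swapped = 1;
            }
        }
        if (!swapped)   /* whole pass without any swap: sorted */
            break;
    }
}
```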
2.Selection Sort
The selection sort algorithm sorts an array by repeatedly finding the minimum element
(considering ascending order) from unsorted part and putting it at the beginning. The algorithm
maintains two subarrays in a given array.
1) The subarray which is already sorted.
2) Remaining subarray which is unsorted.
In every iteration of selection sort, the minimum element (considering ascending order) from
the unsorted subarray is picked and moved to the sorted subarray.
The following C function sorts an array using the selection sort algorithm:
void selectionSort(int a[], int size)
{
    int i, j, min, temp;
    for (i = 0; i < size - 1; i++) {
        min = i;                  // setting min as i
        for (j = i + 1; j < size; j++) {
            if (a[j] < a[min])    // if element at j is less than element at min position
                min = j;          // then set min as j
        }
        temp = a[i];              // swap a[i] and a[min]
        a[i] = a[min];
        a[min] = temp;
    }
}
Complexity Analysis of Selection Sorting
Selection sort is not difficult to analyze compared to other sorting algorithms since none
of the loops depend on the data in the array.
Selecting the lowest element requires scanning all n elements (this takes n − 1 comparisons) and
then swapping it into the first position. Finding the next lowest element requires scanning the
remaining n − 1 elements, and so on, for
(n − 1) + (n − 2) + ... + 2 + 1 = n(n − 1)/2 ∈ O(n^2) comparisons.
The good thing about selection sort is that it never makes more than O(n) swaps, which can be
useful when a memory write is a costly operation.
3.Insertion sort
Insertion sort algorithm arranges a list of elements in a particular order. In insertion sort
algorithm, every iteration moves an element from unsorted portion to sorted portion until all
the elements are sorted in the list.
Step by Step Process
The insertion sort algorithm is performed using following steps...
● Step 1: Assume that first element in the list is in sorted portion of the list and remaining
all elements are in unsorted portion.
● Step 2: Take the first element from the unsorted portion and insert it into the sorted
portion at the correct position.
● Step 3: Repeat the above process until all the elements from the unsorted list are moved
into the sorted list.
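The steps above can be sketched as a C function (a sketch; the names key and j follow the usual textbook presentation of insertion sort):

```c
/* Insertion sort: grows a sorted prefix one element at a time.
   In each pass, a[i] is saved in key and inserted into the sorted
   prefix a[0..i-1] at its correct position. */
void insertionSort(int a[], int n)
{
    for (int i = 1; i < n; i++) {
        int key = a[i];          /* element taken from the unsorted portion */
        int j = i - 1;
        while (j >= 0 && a[j] > key) {
            a[j + 1] = a[j];     /* shift larger elements one place right */
            j--;
        }
        a[j + 1] = key;          /* insert key at its correct position */
    }
}
```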
Now let's understand the insertion sort algorithm, using an array with 6 integers.
We take a variable key, into which we put each element of the array in each pass, starting from
the second element, that is a[1].
Then, using a while loop, we iterate until j becomes equal to zero or we find an element that is
greater than key, and then we insert the key at that position.
In the array, first we pick 1 as the key and compare it with 5 (the element before 1); since 1 is
smaller than 5, we shift 5 right and place 1 before it. Then we pick 6 and compare it with 5 and 1;
no shifting this time. Then 2 becomes the key and is compared with 6 and 5, and 2 is placed after 1.
And this goes on, until the complete array gets sorted.
Time Complexity: O(n^2)
Boundary Cases: Insertion sort takes maximum time to sort when the elements are sorted in reverse
order, and minimum time (order of n) when the elements are already sorted.
Stable: Yes
Uses: Insertion sort is used when the number of elements is small. It is also useful when the
input array is almost sorted, with only a few elements misplaced in a big array.
For a logic explanation, see the video link:
https://youtu.be/OGzPmgsI-pQ
For an insertion sort animation, see the link below:
http://cs.armstrong.edu/liang/animation/web/InsertionSortNew.html
4.QuickSort
Like Merge Sort, QuickSort is a Divide and Conquer algorithm. It picks an element as pivot and
partitions the given array around the picked pivot. There are many different versions of
quickSort that pick pivot in different ways.
1. Always pick the first element as pivot.
2. Always pick the last element as pivot (the strategy discussed below).
3. Pick a random element as pivot.
4. Pick the median as pivot.
The key process in quickSort is partition(). The target of partition() is, given an array and an
element x of the array as pivot, to put x at its correct position in the sorted array, with all
smaller elements (smaller than x) before x and all greater elements (greater than x) after x.
All this should be done in linear time.
The partition() function takes the last element as pivot, places the pivot element at its correct
position in the sorted array, and places all smaller elements (smaller than the pivot) to the left
of the pivot and all greater elements to the right of the pivot. The scan index j runs from low up
to high − 1; when we come out of the loop, we finally place the pivot at its correct position by
swapping arr[i+1] and arr[high] (the pivot).
For example, the last step of a partition with pivot 70 gives:
arr[] = {10, 30, 40, 50, 70, 90, 80} // 80 and 70 swapped
Now 70 is at its correct place: all elements smaller than 70 are before it, and all elements
greater than 70 are after it.
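The partition scheme described above (last element as pivot, known as the Lomuto scheme) and the recursive quickSort can be sketched in C (the helper name swapInt is illustrative):

```c
static void swapInt(int *x, int *y) { int t = *x; *x = *y; *y = t; }

/* Lomuto partition: takes the last element as pivot and returns the
   pivot's final index; smaller elements end up to its left. */
int partition(int arr[], int low, int high)
{
    int pivot = arr[high];
    int i = low - 1;                     /* boundary of the "< pivot" region */
    for (int j = low; j < high; j++) {   /* j runs up to high - 1 */
        if (arr[j] < pivot)
            swapInt(&arr[++i], &arr[j]); /* grow the "< pivot" region */
    }
    swapInt(&arr[i + 1], &arr[high]);    /* place pivot at its correct position */
    return i + 1;
}

void quickSort(int arr[], int low, int high)
{
    if (low < high) {
        int p = partition(arr, low, high);
        quickSort(arr, low, p - 1);      /* sort elements left of pivot */
        quickSort(arr, p + 1, high);     /* sort elements right of pivot */
    }
}
```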
Analysis of QuickSort
Time taken by QuickSort in general can be written as following.
T(n) = T(k) + T(n-k-1) + O(n)
The first two terms are for two recursive calls, the last term is for the partition process. k is the
number of elements which are smaller than pivot.
The time taken by QuickSort depends upon the input array and partition strategy. Following are
three cases.
Worst Case: The worst case occurs when the partition process always picks greatest or smallest
element as pivot. If we consider above partition strategy where last element is always picked as
pivot, the worst case would occur when the array is already sorted in increasing or decreasing
order. Following is recurrence for worst case.
T(n) = T(0) + T(n-1) + O(n)
which is equivalent to
T(n) = T(n-1) + O(n)
The solution of the above recurrence is O(n^2).
Best Case: The best case occurs when the partition process always picks the middle element as
pivot. Following is recurrence for best case.
T(n) = 2T(n/2) + O(n)
The solution of above recurrence is O(nLogn).
Worst Case Time Complexity: O(n^2)
Best Case Time Complexity: O(n log n)
Average Time Complexity: O(n log n)
Space Complexity: O(log n)
● The extra space required by quicksort is very small: only O(log n) additional space, on
average, for the recursion stack.
● Quick sort is not a stable sorting technique, so it might change the relative order of two
equal elements in the list while sorting.
For Partition in quicksort animation click link;
http://cs.armstrong.edu/liang/animation/web/QuickSortNew.html
https://youtu.be/PgBzjlCcFvc
5.Merge Sort
Like QuickSort, Merge Sort is a Divide and Conquer algorithm.
It divides the input array in two halves, calls itself for the two halves, and then merges the two
sorted halves. The merge() function is used for merging the two halves: merge(arr, l, m, r) is the
key process, which assumes that arr[l..m] and arr[m+1..r] are sorted and merges the two sorted
sub-arrays into one.
MergeSort(arr[], l, r)
If r > l
1. Find the middle point to divide the array into two halves:
middle m = (l+r)/2
2. Call mergeSort for first half:
Call mergeSort(arr, l, m)
3. Call mergeSort for second half:
Call mergeSort(arr, m+1, r)
4. Merge the two halves sorted in step 2 and 3:
Call merge(arr, l, m, r)
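A C implementation matching the pseudocode above might look like the following sketch (the per-call temporary buffers keep the example simple, at the cost of repeated malloc/free):

```c
#include <stdlib.h>

/* Merges the sorted halves arr[l..m] and arr[m+1..r] into one. */
void merge(int arr[], int l, int m, int r)
{
    int n1 = m - l + 1, n2 = r - m;
    int *L = malloc(n1 * sizeof *L);     /* copy of the left half  */
    int *R = malloc(n2 * sizeof *R);     /* copy of the right half */
    for (int i = 0; i < n1; i++) L[i] = arr[l + i];
    for (int j = 0; j < n2; j++) R[j] = arr[m + 1 + j];

    int i = 0, j = 0, k = l;
    while (i < n1 && j < n2)             /* take the smaller head each time */
        arr[k++] = (L[i] <= R[j]) ? L[i++] : R[j++];   /* <= keeps it stable */
    while (i < n1) arr[k++] = L[i++];    /* copy any leftovers */
    while (j < n2) arr[k++] = R[j++];
    free(L);
    free(R);
}

void mergeSort(int arr[], int l, int r)
{
    if (l < r) {
        int m = l + (r - l) / 2;         /* middle point */
        mergeSort(arr, l, m);            /* sort first half  */
        mergeSort(arr, m + 1, r);        /* sort second half */
        merge(arr, l, m, r);             /* merge the two halves */
    }
}
```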
As an example, consider the array {38, 27, 43, 3, 9, 82, 10}. The array is recursively divided
in two halves till the size becomes 1. Once the size becomes 1, the merge process comes into
action and starts merging the arrays back till the complete array is merged.
Time Complexity: Merge Sort is a recursive algorithm, and its time complexity can be expressed
as the following recurrence relation:
T(n) = 2T(n/2) + O(n)
The above recurrence can be solved either using the Recurrence Tree method or the Master method.
It falls in case II of the Master method, and the solution of the recurrence is O(n log n).
The time complexity of Merge Sort is O(n log n) in all 3 cases (worst, average and best), as
merge sort always divides the array in two halves and takes linear time to merge the two halves.
Auxiliary Space: O(n)
Algorithmic Paradigm: Divide and Conquer
Sorting In Place: No in a typical implementation
Stable: Yes
6.Radix Sort
The lower bound for comparison-based sorting algorithms (Merge Sort, Heap Sort, Quick Sort,
etc.) is Ω(n log n), i.e., they cannot do better than n log n comparisons. Can we sort faster
when the keys are integers with a bounded number of digits? Radix Sort is the answer. The idea
of Radix Sort is to do a digit-by-digit sort, starting from the least significant digit and
ending with the most significant digit. Radix sort uses counting sort as a subroutine to sort.
The Radix Sort Algorithm
1) Do the following for each digit i, where i varies from the least significant digit to the
most significant digit:
a) Sort the input array using counting sort (or any stable sort) according to the i'th digit.
Example:
Original, unsorted list:
170, 45, 75, 90, 802, 24, 2, 66
Sorting by least significant digit (1s place) gives: [*Notice that we keep 802 before 2, because 802
occurred before 2 in the original list, and similarly for pairs 170 & 90 and 45 & 75.]
170, 90, 802, 2, 24, 45, 75, 66
Sorting by the next digit (10s place) gives: [*Notice that 802 again comes before 2, as 802 comes
before 2 in the previous list.]
802, 2, 24, 45, 66, 170, 75, 90
Finally, sorting by the most significant digit (100s place) gives the sorted list:
2, 24, 45, 66, 75, 90, 170, 802
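The algorithm can be sketched in C for non-negative integers (a sketch; countingSortByDigit is an illustrative helper name for the stable counting-sort subroutine):

```c
#include <stdlib.h>

/* Stable counting sort of a[0..n-1] by the digit at place value exp
   (exp = 1 for the 1s place, 10 for the 10s place, and so on). */
static void countingSortByDigit(int a[], int n, int exp)
{
    int *out = malloc(n * sizeof *out);
    int count[10] = {0};
    for (int i = 0; i < n; i++)
        count[(a[i] / exp) % 10]++;          /* histogram of the digit */
    for (int d = 1; d < 10; d++)
        count[d] += count[d - 1];            /* prefix sums: final positions */
    for (int i = n - 1; i >= 0; i--)         /* backwards pass keeps it stable */
        out[--count[(a[i] / exp) % 10]] = a[i];
    for (int i = 0; i < n; i++)
        a[i] = out[i];
    free(out);
}

/* LSD radix sort for non-negative integers. */
void radixSort(int a[], int n)
{
    int max = a[0];
    for (int i = 1; i < n; i++)              /* find the maximum to know */
        if (a[i] > max) max = a[i];          /* how many digits to process */
    for (int exp = 1; max / exp > 0; exp *= 10)
        countingSortByDigit(a, n, exp);
}
```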
7.ShellSort
ShellSort is mainly a variation of insertion sort. In insertion sort, we move elements only one
position ahead; when an element has to be moved far ahead, many movements are involved. The idea
of ShellSort is to allow the exchange of far-apart items: the array is gap-sorted for a decreasing
sequence of gap values, ending with a gap of 1 (plain insertion sort).
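A gap-based C sketch follows (using Shell's original n/2, n/4, ..., 1 gap sequence, which is an assumption; other gap sequences give different time bounds):

```c
/* Shell sort: gapped insertion sort with gaps n/2, n/4, ..., 1.
   Each pass keeps elements that are `gap` apart in sorted order;
   the final pass with gap = 1 is a plain insertion sort. */
void shellSort(int a[], int n)
{
    for (int gap = n / 2; gap > 0; gap /= 2) {
        for (int i = gap; i < n; i++) {
            int key = a[i];
            int j = i;
            while (j >= gap && a[j - gap] > key) {
                a[j] = a[j - gap];   /* shift larger element `gap` places right */
                j -= gap;
            }
            a[j] = key;
        }
    }
}
```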
Comparison of sorting algorithms:

Algorithm        Average time   Best time     Worst time    Space      Stable    Remarks
Insertion Sort   O(n^2)         O(n)          O(n^2)        Constant   Stable    In the best case (already sorted), every insert requires constant time
Shell Sort       O(n log n)     O(n)          O(n^2)        Constant   Unstable  Depends on the gap between two elements
Merge Sort       O(n log n)     O(n log n)    O(n log n)    Depends    Stable    On arrays, merge sort requires O(n) space; on linked lists, merge sort requires constant space