
Analysis of Algorithms
CS 1401

UNIT 1:
Introduction, review of basic concepts, advanced data structures such as
Binomial Heaps and Fibonacci Heaps. 5(L)

UNIT 2:
Divide and conquer with examples such as sorting, matrix multiplication,
convex hull, etc. 6(L)

UNIT 3:
Dynamic programming with examples such as Knapsack, all-pairs shortest
paths, etc. 4(L)

UNIT 4:
Backtracking, branch and bound with examples such as the Travelling
Salesman Problem, etc. 6(L)

UNIT 5:
Algorithms involving computational geometry. 4(L)

UNIT 6:
Selected topics such as NP-completeness, approximation algorithms,
randomized algorithms, string matching. 5(L)
Text/Reference Books

Introduction to Algorithms by Thomas H. Cormen, Charles E. Leiserson,
and Ronald L. Rivest

Fundamentals of Computer Algorithms by E. Horowitz and S. Sahni

The Design and Analysis of Computer Algorithms by Aho, Hopcroft, and Ullman
ALGORITHM

A finite set of instructions which, if followed, accomplishes a
particular task.
In addition, every algorithm must satisfy the following criteria:
1. Input: zero or more quantities externally supplied.
2. Output: at least one quantity is produced.
3. Definiteness: each instruction must be clear and unambiguous.
4. Finiteness: in all cases the algorithm must terminate after a finite
   number of steps.
5. Effectiveness: each instruction must be sufficiently basic.
• Two algorithms on two systems
• Algorithm A1: 50 n lg n
• Algorithm A2: 2 n²

A2 runs on a supercomputer: 10⁸ instructions/sec
A1 runs on a PC: 10⁶ instructions/sec

For n = 10⁶:

Time taken by the supercomputer
  = 2 · (10⁶)² / 10⁸ = 20,000 sec

Time taken by the PC
  = 50 · 10⁶ · lg 10⁶ / 10⁶ ≈ 1,000 sec

Thus, by using a fast algorithm, the personal computer gives results
20 times faster than the supercomputer running a slow algorithm.

Thus a good algorithm is like a sharp knife: it does exactly what it is
supposed to do with a minimum amount of effort.
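
As a quick check (not part of the original slides), the arithmetic above
can be reproduced in a few lines of Python:

import math

n = 10**6
super_time = 2 * n**2 / 10**8            # A2: 2n^2 ops at 10^8 ops/sec
pc_time = 50 * n * math.log2(n) / 10**6  # A1: 50 n lg n ops at 10^6 ops/sec

print(f"Supercomputer (A2): {super_time:,.0f} sec")  # 20,000 sec
print(f"PC (A1): {pc_time:,.0f} sec")                # about 1,000 sec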
Complexity
Some questions to answer:
• How fast can we solve a problem?
• There may be many algorithms for a given problem. Which algorithm
  should we use?
• What are the classical algorithm design techniques?
• Are there problems that are inherently difficult to solve?

How do we express the complexity of an algorithm?
• Resources: time and space
• Complexity lower bounds for problems
• Complexity classes P, NP, etc.


Pseudocode

• Pseudocode is an English-like representation of the code required for
  an algorithm.
• It is partly English, partly structured code.
• The English part provides a relaxed syntax that is easy to read.
• The code part consists of an extended version of the basic algorithmic
  constructs: sequence, selection, and iteration.
Sequence, selection, loop
• A sequence is a series of statements that do not alter the execution
  path within an algorithm.
• Statements such as assign and add are sequence statements.
• A call to another algorithm is also considered a sequence statement.
• Selection statements evaluate one or more alternatives; paths are
  followed based on the result.
• The typical selection statement is the two-way selection:
  if (condition) action 1 else action 2.
• The parts of the statement are identified by indentation.
• A loop iterates a block of code. The loop used here closely resembles
  the while loop; it is a pretest loop.
Example
Algorithm deviation
It finds deviations from the average.

Pre: nothing
Post: numbers are read and each deviation from the average is printed
1 i = 0
2 Loop (all data are read)
    1 i = i + 1
    2 read number into array[i]
    3 sum = sum + number
3 average = sum / i
4 print (average)
5 j = 0
6 Loop (j < i)
    1 j = j + 1
    2 dev = array[j] - average
    3 print (array[j], dev)
7 Return
8 End deviation
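
A minimal runnable Python sketch of the deviation algorithm above; the
"all data are read" loop is modeled here by iterating over a list
supplied by the caller, which is an assumption about the input source:

def deviation(numbers):
    """Print the average of numbers and each value's deviation from it."""
    total = 0.0
    data = []
    for x in numbers:          # Loop (all data are read)
        data.append(x)         # read number into array[i]
        total += x             # sum = sum + number
    average = total / len(data)
    print("average:", average)
    for x in data:             # Loop (j < i)
        print(x, x - average)  # print(array[j], dev)

deviation([4, 8, 15, 16, 23, 42])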
Analysis vs. Design

• Analysis: predict the cost of an algorithm in terms of resources and
  performance.
• Design: design algorithms which minimize the cost.
Time Complexity
Real time:

To analyze the real-time complexity of a program we need to determine
two numbers for each statement in it:
• the amount of time a single execution of the statement will take;
• the number of times it is executed.
The product of these two is the total time taken by the statement.
The first number depends upon the machine and compiler used; hence
real-time complexity is machine dependent.
Machine model: Generic Random Access Machine (RAM)

• Executes operations sequentially.
• Set of primitive operations:
  arithmetic, logical, comparisons, function calls.
• Simplifying assumption: all operations cost 1 unit.
• This eliminates dependence on the speed of our computer, which would
  otherwise be impossible to verify and to compare.
Frequency count

• To make the analysis machine independent, it is assumed that every
  instruction takes the same constant amount of time to execute.
• Hence determining the time complexity of a program is a matter of
  summing the frequency counts of all its statements.
Linear loop
1 i = 1
2 Loop (i <= 1000)
    1 application code
    2 i = i + 1

The body of the loop is repeated 1000 times.

1 i = 1
2 Loop (i <= n)
    1 application code
    2 i = i + 2

For this loop the time is proportional to n.
Logarithmic loops

Multiply loop:
1 i = 1
2 Loop (i < n)
    1 application code
    2 i = i * 2

Divide loop:
1 i = n
2 Loop (i >= 1)
    1 application code
    2 i = i / 2

In both cases F(n) = [log n].
Nested loop: linear logarithmic

1 i = 1
2 Loop (i <= n)
    1 j = 1
    2 Loop (j <= n)
        1 application code
        2 j = j * 2
    3 i = i + 1

F(n) = [n log n]
Dependent Quadratic
1 i = 1
2 Loop (i <= n)
    1 j = 1
    2 Loop (j <= i)
        1 application code
        2 j = j + 1
    3 i = i + 1

The number of iterations of the body of the inner loop is
1 + 2 + 3 + ... + n, i.e. n(n+1)/2.

On average the inner loop runs (n+1)/2 times,
so the total number of iterations is n(n+1)/2.
Quadratic
1 i = 1
2 Loop (i <= n)
    1 j = 1
    2 Loop (j <= n)
        1 application code
        2 j = j + 1
    3 i = i + 1

F(n) = n²
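
The frequency counts above can be confirmed empirically. The following
Python sketch (an illustration, not from the slides) instruments the
linear, logarithmic, and dependent-quadratic loop patterns:

def linear(n):
    i, count = 1, 0
    while i <= n:
        count += 1
        i += 1
    return count                      # n iterations

def logarithmic(n):
    i, count = 1, 0
    while i < n:
        count += 1
        i *= 2
    return count                      # about log2(n) iterations

def dependent_quadratic(n):
    count = 0
    for i in range(1, n + 1):
        for j in range(1, i + 1):     # inner loop depends on i
            count += 1
    return count                      # n(n+1)/2 iterations

n = 1024
print(linear(n), logarithmic(n), dependent_quadratic(n))
# 1024 10 524800   (524800 == 1024*1025/2)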
Insertion sort

INSERTION-SORT (A, n)    ⊳ A[1 . . n]
for j ← 2 to n
    do key ← A[j]
       i ← j - 1
       while i > 0 and A[i] > key
           do A[i+1] ← A[i]
              i ← i - 1
       A[i+1] ← key

[Figure: array A, with the sorted prefix A[1 . . j-1], the key A[j],
and the unsorted remainder up to n.]
Insertion Sort
// At start, the singleton sequence A[1] is trivially sorted
                                                     Freq.
1 for j ← 2 to length[A]                             n
2     do key ← A[j]                                  n-1
      // insert A[j] into the sorted sequence A[1 .. j-1]
3     i ← j - 1                                      n-1
      // let tj be the number of times the following
      // while-loop test is executed for this value of j
4     while i > 0 and A[i] > key                     Σ tj
5         do A[i+1] ← A[i]                           Σ (tj - 1)
6            i ← i - 1                               Σ (tj - 1)
7     A[i+1] ← key                                   n-1
T(n) = c1·n + c2·(n-1) + c3·(n-1) + c4·Σ tj + c5·Σ (tj - 1)
       + c6·Σ (tj - 1) + c7·(n-1)        (sums over j = 2 to n)

• If every cj = 1:
• T(n) = n + (n-1) + (n-1) + Σ tj + 2 Σ (tj - 1) + (n-1)
•      = n + 3(n-1) + 3 Σ tj - 2(n-1)
• T(n) = 2n - 1 + 3 Σ tj        (sum over j = 2 to n)
•      = O(n²), since tj can be as large as j
• Average case:
  xj is being inserted into the sorted sequence x1 x2 ... x(j-1).
• The probability that xj is the largest is 1/j; this takes 2 data
  movements.
• The probability that xj is the second largest is 1/j; this takes 3
  data movements, and so on.
• Expected number of movements for inserting xj:

  2/j + 3/j + ... + (j+1)/j = (j+3)/2

• Total expected movements:

  M = Σ_{j=2}^{n} (j+3)/2 = (n+8)(n-1)/4 = O(n²)
Worst case and best case
Best case: the file is already sorted.
In this case tj = 1, so the total time is a linear function of n.

Worst case: the file is in reverse order.
In this case tj = j (every element of the sorted prefix is shifted),
so the total time is a quadratic function of n.
Binary search

• Sorted sequence (search for 9):

  1  4  5  7  9  10  12  15
  step 1: probe 7
  step 2: probe 10
  step 3: probe 9 — found

• best case: 1 step = O(1)
• worst case: (log₂ n + 1) steps = O(log n)
• average case: O(log n) steps
  n cases for successful search
  n+1 cases for unsuccessful search

Average number of comparisons done in the binary decision tree:

A(n) = [ Σ_{i=1}^{k} i·2^(i-1) + k(n+1) ] / (2n+1),
       where k = ⌊log n⌋ + 1

Assume n = 2^k. By induction on k:

Σ_{i=1}^{k} i·2^(i-1) = 2^k (k-1) + 1

A(n) = [ 2^k (k-1) + 1 + k(2^k + 1) ] / (2n+1)
     ≈ k            as n becomes very large
     = log n
     = O(log n)
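
A minimal Python sketch of the binary search traced above (assuming a
sorted list and 0-based indexing):

def binary_search(a, target):
    lo, hi = 0, len(a) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        if a[mid] == target:
            return mid            # best case: found on the first probe
        elif a[mid] < target:
            lo = mid + 1          # discard the left half
        else:
            hi = mid - 1          # discard the right half
    return -1                     # unsuccessful: about log2(n)+1 probes

print(binary_search([1, 4, 5, 7, 9, 10, 12, 15], 9))  # 4 (found in 3 steps)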
Quick sort

11  5  24  2  31  7  8  26  10  15    (splitter 11; swap 24 and 10)
11  5  10  2  31  7  8  26  24  15    (swap 31 and 8)
11  5  10  2   8  7  31  26  24  15   (pointers cross; swap 11 and 7)
 7  5  10  2   8  11  31  26  24  15
|←    < 11    →|   |←    > 11     →|

• Recursively apply the same procedure to each sublist.
Best case: O(n log n)

The list is split into two sublists of almost equal size, so log n
rounds are needed. In each round, n comparisons (ignoring the element
used to split) are required.

Worst case: O(n²)

In each round, the element used to split is either the smallest or the
largest:

n + (n-1) + ... + 1 = n(n+1)/2 = O(n²)
Average case: O(n log n)

The splitter divides the list into parts of size s and n-s; averaging
over all positions of the splitter:

T(n) = avg_{1<=s<=n} (T(s) + T(n-s)) + cn
     = (1/n) Σ_{s=1}^{n} (T(s) + T(n-s)) + cn
     = (1/n) (T(1)+T(n-1) + T(2)+T(n-2) + ... + T(n)+T(0)) + cn,  T(0) = 0
     = (1/n) (2T(1) + 2T(2) + ... + 2T(n-1) + T(n)) + cn

(n-1) T(n) = 2T(1) + 2T(2) + ... + 2T(n-1) + cn² ......(1)
(n-2) T(n-1) = 2T(1) + 2T(2) + ... + 2T(n-2) + c(n-1)² ...(2)

(1) - (2):
(n-1) T(n) - (n-2) T(n-1) = 2T(n-1) + c(2n-1)
(n-1) T(n) - n T(n-1) = c(2n-1)

T(n)/n = T(n-1)/(n-1) + c (1/n + 1/(n-1))
       = c (1/n + 1/(n-1)) + c (1/(n-1) + 1/(n-2)) + ... + c (1/2 + 1/1)
         + T(1),   with T(1) = 0
       = c (1/n + 1/(n-1) + ... + 1/2) + c (1/(n-1) + 1/(n-2) + ... + 1)
• <= 2c log n + constant

• Thus
• T(n) <= c' n log n, i.e. T(n) = O(n log n)
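
For illustration, here is a simple (not in-place) quicksort sketch in
Python following the spirit of the trace above, using the first element
as the splitter:

def quicksort(a):
    if len(a) <= 1:
        return a                       # a short list is already sorted
    splitter = a[0]
    smaller = [x for x in a[1:] if x < splitter]
    larger = [x for x in a[1:] if x >= splitter]
    # recursively apply the same procedure to each sublist
    return quicksort(smaller) + [splitter] + quicksort(larger)

print(quicksort([11, 5, 24, 2, 31, 7, 8, 26, 10, 15]))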
Example of insertion sort

8 2 4 9 3 6
2 8 4 9 3 6
2 4 8 9 3 6
2 4 8 9 3 6
2 3 4 8 9 6
2 3 4 6 8 9 done
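
The trace above corresponds directly to the INSERTION-SORT pseudocode.
A runnable 0-indexed Python rendering (a sketch, not the slides' own
code):

def insertion_sort(a):
    for j in range(1, len(a)):        # for j <- 2 to n (0-indexed)
        key = a[j]
        i = j - 1
        while i >= 0 and a[i] > key:  # shift larger elements right
            a[i + 1] = a[i]
            i -= 1
        a[i + 1] = key                # insert key into the sorted prefix
    return a

print(insertion_sort([8, 2, 4, 9, 3, 6]))  # [2, 3, 4, 6, 8, 9]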
Worst / best / average cases

• Worst case is the longest running time for any input of size n.
• O-notation represents an upper bound, i.e. an upper bound on the
  worst case.
• Best case is the time taken for some input data set that results in
  the best possible performance; you cannot do better. This is a lower
  bound.
• Average case is the average performance.
Asymptotic Notation
• Goal: to simplify analysis by getting rid of unneeded information
  (like "rounding" 1,000,001 ≈ 1,000,000).
• We want to say in a formal way that 3n² ≈ n².

The "Big-Oh" notation:

Given functions f(n) and g(n), we say that f(n) is O(g(n)) if and only
if there are positive constants c and n0 such that f(n) <= c g(n) for
n >= n0.

Equivalently (when the limit exists), f(n) is O(g(n)) if

lim_{n→∞} f(n)/g(n) is finite.
Example
For the functions f(n) = 2n + 6 and g(n) = n, there are positive
constants c and n0 such that f(n) <= c g(n) for n >= n0
(for instance c = 4, n0 = 3).
Conclusion: 2n + 6 is O(n).
Another Example

On the other hand, n² is not O(n), because there are no c and n0 such
that n² <= c n for all n >= n0.

No matter how large a c is chosen, there is an n big enough that
n² > c n (namely any n > c).
Asymptotic Notation (cont.)

• Note: even though it is correct to say "7n - 3 is O(n³)", a better
  statement is "7n - 3 is O(n)"; that is, one should make the
  approximation as tight as possible.

• Simple rule: drop lower-order terms and constant factors.
  7n - 3 is O(n)
  8n² log n + 5n² + n is O(n² log n)
Asymptotic Notation (terminology)
• Special classes of algorithms:

  constant:     O(1)
  logarithmic:  O(log n)
  linear:       O(n)
  quadratic:    O(n²)
  polynomial:   O(n^k), k >= 1
  exponential:  O(a^n), a > 1
EXAMPLE

Consider (1/3)n² - 5n.

The dominating term is n², therefore it should be O(n²).

Given a positive constant c, a positive integer n0 is to be found such
that

(1/3)n² - 5n <= c n²

Dividing the inequality throughout by n²:

1/3 - 5/n <= c

Therefore, choosing n0 >= 1 and c >= 1/3, the inequality is never
violated. Hence the expression is indeed O(n²).
A polynomial of degree k is O(n^k)

T(n) = a_k n^k + a_{k-1} n^{k-1} + ... + a_0 is O(n^k):

T(n) <= |a_k| n^k + |a_{k-1}| n^{k-1} + ... + |a_0|
     <= n^k (|a_k| + |a_{k-1}| n^{-1} + ... + |a_0| n^{-k})   for n > 1

Since n > 1, n^{i-k} < 1 for 0 <= i <= k. Therefore

T(n) <= n^k (|a_0| + |a_1| + ... + |a_k|)

Let c = |a_0| + |a_1| + ... + |a_k| and f(n) = n^k; then
T(n) <= c f(n) for n > 1. Thus T(n) = O(n^k).
"Relatives" of the Big-Oh

• Ω(f(n)): Big Omega — asymptotic lower bound
• Θ(f(n)): Big Theta — asymptotic tight bound
Little o
• Informally, f(n) = o(g(n)) means f(n) becomes insignificant relative
  to g(n) as n approaches infinity. The notation is read "f of n is
  little-oh of g of n".

• Formal definition: f(n) = o(g(n)) means for all c > 0 there exists
  some k > 0 such that 0 <= f(n) < c g(n) for all n >= k. The value of
  k must not depend on n, but may depend on c.

Equivalently:

lim_{n→∞} f(n)/g(n) = 0

3n + 2 = o(n²), since lim_{n→∞} (3n + 2)/n² = 0.

3n + 2 ≠ o(n), while 3n + 2 = O(n).
Asymptotic Analysis of the Running Time
• Use the Big-Oh notation to express the number of primitive operations
  executed as a function of the input size.
• Comparing asymptotic running times:
  - an algorithm that runs in O(n) time is better than one that runs in
    O(n²) time
  - similarly, O(log n) is better than O(n)
  - hierarchy of functions: log n << n << n² << n³ << 2^n

Caution! Beware of very large constant factors.
An algorithm running in time 1,000,000 n is still O(n), but it might be
less efficient on your data set than one running in time 2n², which is
O(n²).
Properties of the O notation

• Constant factors may be ignored:
  ∀ k > 0, kf is O(f).

• Higher powers grow faster:
  n^r is O(n^s) if 0 <= r <= s.

• The fastest growing term dominates a sum:
  if f is O(g), then f + g is O(g).
  e.g. a n⁴ + b n³ is O(n⁴).

• A polynomial's growth rate is determined by its leading term:
  if f is a polynomial of degree d, then f is O(n^d).

• O is transitive:
  if f is O(g) and g is O(h), then f is O(h).

• A product of upper bounds is an upper bound for the product:
  if f is O(g) and h is O(r), then fh is O(gr).

• Exponential functions grow faster than powers:
  n^k is O(b^n) ∀ b > 1 and k >= 0.
  e.g. n²⁰ is O(1.05^n).

• Logarithms grow more slowly than powers (important!):
  log_b n is O(n^k) ∀ b > 1 and k > 0.
  e.g. log₂ n is O(n^0.5).
  Thus (log₂ n)² is O(n^k) for any k > 0.

• All logarithms grow at the same rate:
  log_b n is O(log_d n) ∀ b, d > 1.

• The sum of the first n r-th powers grows as the (r+1)-th power:

  Σ_{k=1}^{n} k^r is Θ(n^{r+1})

  e.g. Σ_{i=1}^{n} i = n(n+1)/2 is Θ(n²).
Polynomial and Intractable Algorithms
• Polynomial time complexity
  • An algorithm is said to be polynomial if it is O(n^d) for some
    integer d.
  • Polynomial algorithms are said to be efficient: they solve problems
    in reasonable times!

• Intractable problems
  • Problems for which there is no known polynomial-time algorithm.
  • We will come back to this important class later in the course.
Categories of algorithm efficiency

Efficiency          Big O
Constant            O(1)
Logarithmic         O(log n)
Linear              O(n)
Linear logarithmic  O(n log n)
Quadratic           O(n²)
Polynomial          O(n^k)
Exponential         O(c^n)
Factorial           O(n!)
Θ-notation
For a function g(n), Θ(g(n)) is given by:

Θ(g(n)) = { f(n) : there exist positive constants c1, c2, and n0 such
            that 0 <= c1 g(n) <= f(n) <= c2 g(n) for all n >= n0 }

Intuitively: the set of all functions that have the same rate of growth
as g(n).

g(n) is an asymptotically tight bound for f(n).
O-notation
For a function g(n), O(g(n)) is given by:

O(g(n)) = { f(n) : there exist positive constants c and n0 such that
            0 <= f(n) <= c g(n) for all n >= n0 }

Intuitively: the set of all functions whose rate of growth is the same
as or lower than that of g(n).

g(n) is an asymptotic upper bound for f(n).
f(n) = Θ(g(n)) ⇒ f(n) = O(g(n)).
Θ(g(n)) ⊆ O(g(n)).
Ω-notation
For a function g(n), Ω(g(n)) is given by:

Ω(g(n)) = { f(n) : there exist positive constants c and n0 such that
            0 <= c g(n) <= f(n) for all n >= n0 }

Intuitively: the set of all functions whose rate of growth is the same
as or higher than that of g(n).

g(n) is an asymptotic lower bound for f(n).
f(n) = Θ(g(n)) ⇒ f(n) = Ω(g(n)).
Θ(g(n)) ⊆ Ω(g(n)).

Relations between Θ, O, Ω
Practical Complexity

[Figure: growth of f(n) = log n, n, n log n, n², n³, and 2^n for
problem sizes n = 1 to 20; 2^n dominates all the others well before
n = 20.]
Problem size:

           n = 10     n = 10²    n = 10³    n = 10⁴

log₂n      3.3        6.6        10         13.3
n          10         10²        10³        10⁴
n log₂n    0.33x10²   0.7x10³    10⁴        1.3x10⁵
n²         10²        10⁴        10⁶        10⁸
2ⁿ         1024       1.3x10³⁰   >10¹⁰⁰     >10¹⁰⁰
n!         3x10⁶      >10¹⁰⁰     >10¹⁰⁰     >10¹⁰⁰


• Q1: Find functions f and g such that
  (i) f is O(g),
  (ii) f is not Θ(g),
  (iii) f(n) > g(n) for infinitely many n.

• Let us take g(n) = 1/2 for all n, and let f be a bounded function
  that comes arbitrarily close to 0 yet exceeds 1/2 infinitely often:
• f(n) = |sin n|, or
• f(n) = 1/n if n is a perfect square, and 1 otherwise.
Exercise 1
If the complexity of the algorithm DOIT can be expressed as O(n²),
calculate the complexity of the following program segment:

i = 1
loop (i < n)
    DOIT(...)
    i = i * 2

Answer: the loop body executes about log₂ n times, so the segment is
O(n² log₂ n).
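
A sketch checking the answer empirically, with DOIT stubbed as an O(n²)
loop nest (the stub is an assumption made for illustration):

def doit(n):
    ops = 0
    for _ in range(n):
        for _ in range(n):
            ops += 1
    return ops                    # n^2 operations

def segment(n):
    total, i = 0, 1
    while i < n:                  # runs log2(n) times when n is a power of 2
        total += doit(n)
        i *= 2
    return total

n = 64
print(segment(n), n * n * 6)      # both 24576: n^2 * log2(n)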
Exercise
Reorder the following complexities from smallest to largest:
(a) n log₂(n)
(b) n + n² + n³
(c) 24
(d) n
Answer: (c) 24, (d) n, (a) n log₂(n), (b) n + n² + n³.
Find two non-negative functions f(n) and g(n) such that neither
f(n) = O(g(n)) nor g(n) = O(f(n)).

Let f(n) = 1 + cos(n) and g(n) = 1 + sin(n).

There are infinitely many n at which f(n) is arbitrarily close to 0
while g(n) stays near 1, and infinitely many n at which g(n) is
arbitrarily close to 0 while f(n) stays near 1. So no positive constant
c can satisfy f(n) <= c g(n) for all large n, and none can satisfy
g(n) <= c f(n) either.
Is 2^(n+1) = O(2^n)? Is 2^(2n) = O(2^n)?
• Is 2^(n+1) = O(2^n)? Yes!
• 2^(n+1) = 2 · 2^n, so for any c >= 2,
  2^(n+1) <= c · 2^n for all n >= 1.

• However:
• Is 2^(2n) = O(2^n)? No.
• 2^(2n) / 2^n = 2^n, so lim_{n→∞} 2^(2n) / 2^n is not finite.
