
Chapter One:

Lecture One: Analysis of Algorithms

College of Computing Science,
Department of Software Engineering, DBU
Instructor: Bekele M. (MSc in Software Eng.)
Email: [email protected]
June 2024 E.C
Problems, Algorithms and Programs
• A problem is a task to be performed.
  • It is best thought of as inputs and matching outputs.
  • Example: given an ID, find the details of a student.
• A problem definition should include constraints on the resources that may be consumed by any acceptable solution.
  • Example: the search algorithm shall return in less than 2 seconds.
Algorithm
• An algorithm is a sequence of steps that must be followed to solve a problem.
  • A recipe.
• An algorithm takes the input to a problem and transforms it into the output.
  • A mapping of inputs to outputs:

    input → [algorithm] → output
Algorithm Design
• You have a problem to solve.
• Analyze the problem and identify the requirements.
• Design an efficient algorithm.
  • Use good data structures.
• Show that your algorithm works!
  • Prove its correctness.
• Study the efficiency of your algorithm.
  • Run time
  • Storage required
• For a given problem we might come up with different algorithms.
  • Exercise: write an algorithm for computing the factorial of a number.
Example
• Two algorithms for computing the factorial.
• Which one is better?

    1) Recursive:
       int factorial(int n)
       {
           if (n <= 1) return 1;
           else return n * factorial(n - 1);
       }

    2) Iterative:
       int factorial(int n)
       {
           if (n <= 1) return 1;
           int fact = 1;
           for (int k = 2; k <= n; k++)
               fact *= k;
           return fact;
       }
Algorithm vs Program

    Algorithm                   Program
    ---------                   -------
    Design                      Implementation
    Domain knowledge            Programmer
    Any language                Programming language
    H/W and OS independent      H/W and OS dependent
    Analyze                     Testing

A Priori Analysis vs A Posteriori Testing

    A Priori Analysis (Algorithm)    A Posteriori Testing (Program)
    -----------------------------    ------------------------------
    Independent of language          Language dependent
    Hardware independent             Hardware dependent
    Time and space as functions      Watch time and bytes
Properties of Algorithms
1. Finiteness:
• An algorithm must complete after a finite number of steps.
• An algorithm should have a finite number of steps.

    Example of finite steps:          Example of infinite steps:
    int i = 0;                        while (true)
    while (i < 10)                    {
    {                                     cout << "Hello";
        cout << i;                    }
        i++;
    }
Cont'd
2. Definiteness (absence of ambiguity):
• Each step must be clearly defined, having one and only one interpretation.
• At each point in the computation, one should be able to tell exactly what happens next.
3. Sequential:
• Each step must have a uniquely defined preceding and succeeding step.
• The first step (start step) and last step (halt step) must be clearly noted.
4. Feasibility:
• It must be possible to perform each instruction.
• Each instruction should have the possibility of being executed.

    a) for (int i = 0; i < 0; i++)
       {
           cout << i;      // there is no possibility that this
       }                   // statement will be executed
    b) if (5 > 7)
       {
           cout << "hello";   // never executed
       }
5. Correctness:
• It must compute the correct answer for all possible legal inputs.
• The output should be as expected, required and correct.
6. Language independence:
• It must not depend on any one programming language.
7. Completeness:
• It must solve the problem completely.
8. Effectiveness:
• Doing the right thing. It should yield the correct result all the time, for all of the possible cases.
9. Efficiency:
• It must solve the problem with the least amount of computational resources, such as time and space.
• Producing an output as per the requirement, within the given resources (constraints).

    Example: write a program that takes a number and displays the square of the number.

    1) int x;             2) int x, y;
       cin >> x;             cin >> x;
       cout << x * x;        y = x * x;
                             cout << y;
Example: write a program that takes two numbers and displays the sum of the two.

    Program a         Program b         Program c (the most efficient)
    cin >> a;         cin >> a;         cin >> a;
    cin >> b;         cin >> b;         cin >> b;
    sum = a + b;      a = a + b;        cout << a + b;
    cout << sum;      cout << a;

All are effective, but with different efficiencies.
10. Input/output:
• There must be a specified number of input values, and one or more result values.
• Zero or more inputs and one or more outputs.
11. Precision:
• The result should always be the same if the algorithm is given identical input.
12. Simplicity:
• A good general rule is that each step should carry out one logical step.
• What is simple to one processor may not be simple to another.
13. Levels of abstraction:
• Used to organize the ideas expressed in algorithms.
• Used to hide the details of a given activity and refer to just a name for those details.
• The simple (detailed) instructions are hidden inside modules.
• Well-designed algorithms are organized in terms of levels of abstraction.
Algorithm Analysis
• Studies the computing resource requirements of different algorithms.
• Computing resources:
  • Running time (most precious)
  • Memory usage
  • Communication bandwidth, etc.
• Why do we need algorithm analysis?
  • Writing a working program is not good enough.
  • The program may be inefficient!
  • If the program is run on a large data set, then the running time becomes an issue.
• The goal is to pick an efficient algorithm for the problem at hand.
Reasons to Analyze Algorithms
• It enables us to:
  • Predict the performance of algorithms.
  • Compare algorithms.
  • Provide guarantees on the running time/space of algorithms.
  • Understand the theoretical basis.
• Primary practical reason: avoid performance bugs.
  • The client gets poor performance because the programmer did not understand the performance characteristics.
How to Measure Efficiency/Performance?
• There are two approaches to measuring an algorithm's efficiency/performance:
• Empirical:
  • Implement the algorithms,
  • try them on different instances of input, and
  • use/plot actual clock time to pick one.
• Theoretical/asymptotic analysis:
  • Mathematically determine the quantity of resources needed by each algorithm.
Example: Empirical
[Figure: plot of actual clock time against input size]
Drawbacks of Empirical Methods
• It is difficult to use actual clock time, because clock time varies based on:
• Specific processor speed
  • 1.78 GHz: 10 s; 2.12 GHz: < 10 s
• Current processor load
  • Only the work: 10 s
  • With printing: 15 s
  • With printing and browsing the internet: > 15 s
• Specific data for a particular run of the program
  • Input size
  • Input properties
• Programming language (C++, Java, Python, ...)
• The programmer (you, me, Bill Gates, ...)
• Operating environment/platform (PC, Sun, smartphone, etc.)
• Therefore, it is quite machine dependent.
Example
• Problem: we need to multiply two positive integers a and b.
• Algorithm 1: multiply a and b.
• Algorithm 2:
  1. V = a, W = b
  2. While W > 1:
       V = V + a; W = W - 1
  3. Output V
• The first algorithm uses 1 multiplication.
• The second uses b - 1 additions and b - 1 subtractions.
• For some computer architectures, 1 multiplication is more expensive than b additions and subtractions.
• Ideally, we would like to program all choices, run all of them on the machine we are going to use, and find which is efficient. Very difficult!
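Algorithm 2 above can be written as a runnable function (a minimal C++ sketch; the function name is illustrative). Starting from V = a, it adds a to the accumulator b - 1 times:

```cpp
// Multiplication by repeated addition (algorithm 2 above).
// Assumes b >= 1.
int multiplyByAddition(int a, int b)
{
    int V = a, W = b;
    while (W > 1)
    {
        V = V + a;   // one addition
        W = W - 1;   // one subtraction
    }
    return V;
}
```

For b = 1 the loop never runs and V = a is returned directly.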
Machine-Independent Analysis
• Critical resources:
  • Time, space (disk, RAM), programmer's effort, ease of use (user's effort).
• Factors affecting running time:
  • System-dependent effects:
    • Hardware: CPU, memory, cache, ...
    • Software: compiler, interpreter, garbage collector, ...
    • System: operating system, network, other apps, ...
  • System-independent effects:
    • The algorithm.
    • Input data / problem size.
Cont'd
• For most algorithms, running time depends on the "size" of the input.
• Size is often the number of inputs processed.
  • Example: in a searching problem, the size is the number of items to be searched.
• Running time is expressed as T(n) for some function T on input size n.
Machine-Independent Analysis
• The efficiency of an algorithm is measured in terms of the number of basic operations it performs.
  • Not based on the actual clock time.
• We assume that every basic operation takes constant time.
  • An arbitrary time unit.
• Examples of basic operations:
  • Single arithmetic operation (addition, subtraction, multiplication)
  • Assignment operation
  • Single input/output operation
  • Single Boolean operation
  • Function return
• We do not distinguish between the basic operations.
• Examples of non-basic operations:
  • Sorting, searching.
Machine-Independent Analysis
• Problem: we need to multiply two positive integers a and b.
• Algorithm 1: multiply a and b.
• Algorithm 2:
  1. V = a, W = b
  2. While W > 1:
  3.   V = V + a; W = W - 1
  4. Output V
• Algorithm 1 uses 1 basic operation.
• Algorithm 2 uses about 4b basic operations (each iteration performs an addition, an assignment, a subtraction and another assignment).
• Therefore, Algorithm 1 is more efficient.
Rules for Computing Complexity
• The complexity function T(n) of a given algorithm is determined by counting the number of operations of the algorithm. There are two ways of counting the operations:
  o Informal method: tries to count all the operations of the algorithm.
  o Formal method: counts operations while ignoring variable initializations, loop controls, etc.
Analysis Rules:
1. We assume an arbitrary time unit.
2. Execution of one of the following operations takes time 1:
  • Assignment operation
  • Single input/output operation
  • Single Boolean operation
  • Single arithmetic operation
  • Function return
3. The running time of a selection statement (if, switch) is the time for the condition evaluation + the maximum of the running times of the individual clauses in the selection.
4. Loops: the running time of a loop is equal to the running time of the statements inside the loop * the number of iterations.
  • The total running time of a statement inside a group of nested loops is the running time of the statement multiplied by the product of the sizes of all the loops.
  • For nested loops, analyze inside out.
  • Always assume that the loop executes the maximum number of iterations possible.
5. The running time of a function call is 1 for setup + the time for any parameter calculations + the time required for the execution of the function body.
Examples: Count of Basic Operations T(n)

Sample code:
    int count()
    {
        int k = 0;
        cout << "Enter an integer";
        cin >> n;
        for (i = 0; i < n; i++)
            k = k + 1;
        return 0;
    }
Examples: Count of Basic Operations T(n)

    Sample code                      Count of basic operations (time units)
    int count()
    {
        int k = 0;                   1 for the assignment statement: int k = 0
        cout << "Enter an integer";  1 for the output statement
        cin >> n;                    1 for the input statement
        for (i = 0; i < n; i++)      In the for loop: 1 assignment, n + 1 tests, and n increments
            k = k + 1;               n loops of 2 units, for an assignment and an addition
        return 0;                    1 for the return statement
    }

    T(n) = 1 + 1 + 1 + (1 + n + 1 + n) + 2n + 1 = 4n + 6
Examples: Count of Basic Operations T(n)

    int total(int n)
    {
        int sum = 0;
        for (int i = 1; i <= n; i++)
            sum = sum + i;
        return sum;
    }
Examples: Count of Basic Operations T(n)

    Sample code                       Count of basic operations (time units)
    int total(int n)
    {
        int sum = 0;                  1 for the assignment statement: int sum = 0
        for (int i = 1; i <= n; i++)  In the for loop: 1 assignment, n + 1 tests, and n increments
            sum = sum + i;            n loops of 2 units, for an assignment and an addition
        return sum;                   1 for the return statement
    }

    T(n) = 1 + (1 + n + 1 + n) + 2n + 1 = 4n + 4
Examples: Count of Basic Operations T(n)

    void func()
    {
        int x = 0;
        int i = 0;
        int j = 1;
        cout << "Enter an integer value";
        cin >> n;
        while (i < n)
        {
            x++;
            i++;
        }
        while (j < n)
        {
            j++;
        }
    }
Examples: Count of Basic Operations T(n)

    Sample code                           Count of basic operations (time units)
    void func()
    {
        int x = 0;                        1 for the first assignment statement: x = 0
        int i = 0;                        1 for the second assignment statement: i = 0
        int j = 1;                        1 for the third assignment statement: j = 1
        cout << "Enter an integer value"; 1 for the output statement
        cin >> n;                         1 for the input statement
        while (i < n) {                   In the first while loop: n + 1 tests, and
            x++;                          n loops of 2 units for the two increment
            i++;                          (addition) operations
        }
        while (j < n)                     In the second while loop: n tests, and
        {                                 n - 1 increments
            j++;
        }
    }

    T(n) = 1 + 1 + 1 + 1 + 1 + (n + 1) + 2n + n + (n - 1) = 5n + 5
Examples: Count of Basic Operations T(n)

Sample code:
    int sum(int n)
    {
        int partial_sum = 0;
        for (int i = 1; i <= n; i++)
            partial_sum = partial_sum + (i * i * i);
        return partial_sum;
    }
Examples: Count of Basic Operations T(n)

    Sample code                                   Count of basic operations (time units)
    int sum(int n)
    {
        int partial_sum = 0;                      1 for the assignment
        for (int i = 1; i <= n; i++)              1 assignment, n + 1 tests, and n increments
            partial_sum = partial_sum + (i*i*i);  n loops of 4 units, for an assignment, an
                                                  addition, and two multiplications
        return partial_sum;                       1 for the return statement
    }

    T(n) = 1 + (1 + n + 1 + n) + 4n + 1 = 6n + 4
Simplified Rules to Compute Time Units
• for loops, formally:
  • In general, a for loop translates to a summation. The index and bounds of the summation are the same as the index and bounds of the for loop.
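For instance, a loop whose body costs a constant number of units becomes a summation over the loop index (a worked illustration, matching the counting used in the earlier examples):

```latex
\texttt{for (int i = 1; i <= n; i++) sum = sum + i;}
\;\longrightarrow\;
\sum_{i=1}^{n} 2 = 2n
```

The 2 inside the summation is the cost of the body (one assignment plus one addition), and summing it n times gives the 2n term seen in T(n) = 4n + 4 above.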
Simplified Rules to Compute Time Units
• Nested loops, formally:
  • Nested for loops translate into multiple summations, one for each for loop.
Simplified Rules to Compute Time Units
• Consecutive statements, formally:
  • Add the running times of the separate blocks of your code.
Simplified Rules to Compute Time Units
• Conditionals:
  • if (test) s1 else s2: compute the maximum of the running times of s1 and s2.

    if (test == 1) {
        for (int i = 1; i <= N; i++) {
            sum = sum + i;
        }
    }
    else
    {
        for (int i = 1; i <= N; i++) {
            for (int j = 1; j <= N; j++) {
                sum = sum + i + j;
            }
        }
    }

  Here the else branch dominates, so the conditional is counted as the N × N nested loop.
Example: Computation of Run-Time
• Suppose we have hardware capable of executing 10^6 instructions per second. How long would it take to execute an algorithm whose complexity function is T(n) = 2n^2 on an input size of n = 10^8?

The total number of operations to be performed would be T(10^8):

    T(10^8) = 2 * (10^8)^2 = 2 * 10^16

The required number of seconds is given by T(10^8) / 10^6, so:

    Running time = 2 * 10^16 / 10^6 = 2 * 10^10 seconds

The number of seconds per day is 86,400, so this is about 231,481 days (about 634 years).
Types of Algorithm Complexity Analysis
• Best case:
  • T(n) = minimum time of the algorithm on any input of size n.
  • Lower bound on cost.
  • Determined by the "easiest" input.
  • Provides a goal for all inputs.
• Worst case:
  • T(n) = maximum time of the algorithm on any input of size n.
  • Upper bound on cost.
  • Determined by the "most difficult" input.
  • Provides a guarantee for all inputs.
• Average case:
  • Expected cost for random input.
  • Need a model for "random" input.
  • Provides a way to predict performance.

Example (the slide's search routine, reconstructed from the garbled listing):

    int search(int a[], int &x, int n)
    {
        for (int i = 0; i < n; i++)
        {
            if (a[i] == x)
                return i;    // found: return the index
        }
        return -1;           // not found
    }
Best, Worst and Average Cases
• Not all inputs of a given size take the same time.
• Sequential search for K in an array of n integers:
  • Begin at the first element in the array and look at each element in turn until K is found.
• Best case: find it at the first position, 1 compare.
• Worst case: find it at the last position, n compares.
• Average case: (n + 1)/2 compares.
• While average time seems to be the fairest measure, it may be difficult to determine.
  • It depends on the input distribution. Assumption for the above analysis: K is equally likely to be at any position.
• When is worst-case time important?
  • Algorithms for time-critical systems.
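The best/worst/average behaviour above can be observed directly by counting comparisons (a minimal C++ sketch; the function and counter names are illustrative):

```cpp
#include <vector>

// Sequential search that reports how many element comparisons it made.
// Returns the index of key, or -1 if absent; 'compares' is an out-parameter.
int seqSearch(const std::vector<int>& a, int key, int& compares)
{
    compares = 0;
    for (int i = 0; i < (int)a.size(); i++)
    {
        compares++;            // one compare per element examined
        if (a[i] == key)
            return i;
    }
    return -1;
}
```

Searching {1, 2, 3, 4, 5} for 1 costs 1 compare (best case); for 5, or for a missing key, it costs 5 compares (worst case).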
Order of Growth and Asymptotic Analysis
• Suppose an algorithm for processing a retail store's inventory takes:
  • 10,000 milliseconds to read the initial inventory from disk, and then
  • 10 milliseconds to process each transaction (items acquired or sold).
• Processing n transactions takes (10,000 + 10n) milliseconds.
• Even though 10,000 >> 10, the "10n" term will be more important if the number of transactions is very large.
• We also know that these coefficients will change if we buy a faster computer or disk drive, or use a different language or compiler.
  • We want to ignore constant factors (which get smaller and smaller as technology improves).
• In fact, we will not worry about the exact values, but will look at "broad classes" of values.
Growth Rates
• The growth rate of an algorithm is the rate at which the cost of the algorithm grows as the size of its input grows.
Rate of Growth
• Consider the example of buying elephants and goldfish:
    Cost = cost_of_elephants + cost_of_goldfish
    Cost ≈ cost_of_elephants (approximation)
  since the cost of the goldfish is insignificant compared with the cost of the elephants.
• Similarly, the low-order terms in a function are relatively insignificant for large n:
    n^4 + 100n^2 + 10n + 50 ~ n^4
  i.e., we say that n^4 + 100n^2 + 10n + 50 and n^4 have the same rate of growth.
• More examples:
    fB(n) = n^2 + 1 ~ n^2
    fA(n) = 30n + 8 ~ n
Visualizing Orders of Growth
• On a graph, as you go to the right, a faster-growing function eventually becomes larger.
[Figure: plot of fA(n) = 30n + 8 and fB(n) = n^2 + 1 against increasing n; fB eventually overtakes fA.]
Asymptotic Analysis
• Refers to the study of an algorithm as the input size "gets big" or reaches a limit.
• To compare two algorithms with running times f(n) and g(n), we need a rough measure that characterizes how fast each function grows: the growth rate.
• Ignore constants (especially when the input size is very large).
  • But constants may have an impact for small input sizes.
• Several notations are used to describe the running-time equation for an algorithm:
  • Big-Oh (O), Little-oh (o)
  • Big-Omega (Ω), Little-omega (ω)
  • Theta (Θ)
Big-Oh Notation
• Definition:
  • For f(n) a non-negatively valued function, f(n) is in the set O(g(n)) if there exist two positive constants c and n0 such that f(n) ≤ c·g(n) for all n > n0.
• Usage: the algorithm is in O(n^2) in the [best, average, worst] case.
• Meaning: for all data sets big enough (i.e., n > n0), the algorithm always executes in fewer than c·g(n) steps [in the best, average or worst case].
Big-Oh Notation, Visually
[Figure: f(n) bounded above by c·g(n) for all n > n0.]
Big-O Visualization
• O(g(n)) is the set of functions with a smaller or the same order of growth as g(n).
• We want the tightest upper bound:
  • While T(n) = 3n^2 is in O(n^3), we prefer O(n^2),
  • because saying O(n^2) provides more information than saying O(n^3).
Big-O
• Demonstrating that a function f(n) is in Big-O of a function g(n) requires that we find specific constants c and n0 for which the inequality holds.
• The following facts can be used for Big-Oh problems:
  • 1 ≤ n for all n ≥ 1
  • n ≤ n^2 for all n ≥ 1
  • 2^n ≤ n! for all n ≥ 4
  • log2 n ≤ n for all n ≥ 2
  • n ≤ n·log2 n for all n ≥ 2
Examples
• f(n) = 10n + 5 and g(n) = n. Show that f(n) is in O(g(n)).
• To show that f(n) is O(g(n)) we must find constants c and n0 such that f(n) ≤ c·g(n) for all n ≥ n0:

    10n + 5 ≤ 10n + 5n   for all n ≥ 1
            ≤ 15n        for all n ≥ 1
            ≤ c·n        for c = 15

• Therefore f(n) is in O(g(n)) for c = 15 and n0 = 1.
Examples
• 2n^2 = O(n^3): 2n^2 ≤ cn^3 ⟹ 2 ≤ cn ⟹ c = 1 and n0 = 2
• n^2 = O(n^2): n^2 ≤ cn^2 ⟹ c ≥ 1 ⟹ c = 1 and n0 = 1
• 1000n^2 + 1000n = O(n^2):
    1000n^2 + 1000n ≤ 1000n^2 + n^2 = 1001n^2 for n ≥ 1000 ⟹ c = 1001 and n0 = 1000
• n = O(n^2): n ≤ cn^2 ⟹ cn ≥ 1 ⟹ c = 1 and n0 = 1
No Uniqueness
• There is no unique set of values for n0 and c in proving asymptotic bounds.
• Prove that 100n + 5 = O(n^2):
  • 100n + 5 ≤ 100n + n = 101n ≤ 101n^2 for all n ≥ 5,
    so n0 = 5 and c = 101 is a solution.
  • 100n + 5 ≤ 100n + 5n = 105n ≤ 105n^2 for all n ≥ 1,
    so n0 = 1 and c = 105 is also a solution.
• We must find SOME constants c and n0 that satisfy the asymptotic notation relation.
Exercises:
1. f(n) = 3n + 2 and g(n) = n. Show that f(n) = O(g(n)).
2. f(n) = 5n^2 + 4n. Show that f(n) = O(n^2).
3. f(n) = 3n^2 + 4n + 1. Show that f(n) = O(n^2).
Order of Common Functions

    Notation   | Name         | Example
    -----------|--------------|---------------------------------------------------------
    O(1)       | Constant     | Adding two numbers, c = a + b
    O(log n)   | Logarithmic  | Finding an item in a sorted array with binary search, or
               |              | in a search tree (best case)
    O(n)       | Linear       | Finding an item in an unsorted list or a malformed tree
               |              | (worst case); adding two n-digit numbers
    O(n log n) | Linearithmic | Performing a fast Fourier transform; heap sort, quick
               |              | sort (best case), or merge sort
    O(n^2)     | Quadratic    | Multiplying two n-digit numbers by a simple algorithm;
               |              | adding two n×n matrices; bubble sort (worst case or naive
               |              | implementation), shell sort, quick sort (worst case), or
               |              | insertion sort
Some Properties of Big-O
• Constant factors may be ignored:
  • For all k > 0, k·f is O(f).
• The growth rate of a sum of terms is the growth rate of its fastest-growing term:
  • E.g., an^3 + bn^2 is O(n^3).
• The growth rate of a polynomial is given by the growth rate of its leading term:
  • If f is a polynomial of degree d, then f is O(n^d).
Implication of Big-Oh Notation
• We use Big-Oh notation to say how slowly code might run as its input grows.
• Suppose we know that our algorithm uses at most O(f(n)) basic steps for any n inputs, and n is sufficiently large; then we know that our algorithm will terminate after executing at most a constant times f(n) basic steps.
• We know that a basic step takes constant time on a machine.
• Hence, our algorithm will terminate in a constant times f(n) units of time, for all large n.
Lower Bounds: Omega (Ω)
• We say that "f(n) is omega of g(n)", which we write f(n) = Ω(g(n)), if there exists an integer n0 and a constant C > 0 such that for all integers n ≥ n0, f(n) ≥ C·g(n).
• The definition of omega is almost identical to that of Big-O.
  • The only difference is in the comparison:
    • for Big-O it is f(n) ≤ C·g(n);
    • for omega, it is f(n) ≥ C·g(n).
• All of the same conventions and caveats apply to omega as they do to Big-O.
Omega ()- Visually

 Just as Big O notation


provides an asymptotic
upper bound on a function,
Ω notation provides an
asymptotic lower bound.

61
Example
• Consider the function f(n) = 5n^2 - 64n + 256. Clearly, f(n) is non-negative for all integers n ≥ 0.
• We wish to show that f(n) = Ω(n^2). Therefore, we need to find an integer n0 and a constant C > 0 such that for all integers n ≥ n0, f(n) ≥ C·n^2.
• Suppose we choose C = 1. Then:

    f(n) ≥ C·n^2  ⟺  5n^2 - 64n + 256 ≥ n^2
                  ⟺  4n^2 - 64n + 256 ≥ 0
                  ⟺  4(n - 8)^2 ≥ 0

• Since (n - 8)^2 ≥ 0 for all values of n ≥ 0, we conclude that n0 = 0.
• So, for C = 1 and n0 = 0, f(n) ≥ C·n^2 for all integers n ≥ n0. Hence, f(n) = Ω(n^2).
Theta ()
 For a given function g(n), we denote by (g(n)) the
set of functions.


(g(n))
f (n):thereexistpositive
constantsc1,c2,andn0 s.t.
0c1g(n)  f (n) c2g(n)forallnn0  

A function f (n) belongs to the set (g(n)) if there
exist positive constants c1 and c2 such that it can be
“sand- wiched” between c1g(n) and c2g(n) or sufficienly
large n.
 f (n) (g(n))means that there exists some constant c1
and c2 s.t. c1g(n)  f (n) c2g(n) for large enough n.
63
f(n
)

f(n) =
0 (g(n))
Little-o
• We say that "f(n) is little-o of g(n)", which we write f(n) = o(g(n)), if and only if f(n) is O(g(n)) but f(n) is not Ω(g(n)).
• Little-o notation represents a kind of loose asymptotic bound: if we are given that f(n) = o(g(n)), then we know that g(n) is an asymptotic upper bound, since f(n) = O(g(n)), but g(n) is not an asymptotic lower bound, since f(n) ∉ Ω(g(n)).
• For example, consider the function f(n) = n + 1. Clearly, f(n) = O(n^2). Clearly too, f(n) ∉ Ω(n^2), since no matter what C we choose, for large enough n, C·n^2 > n + 1. Thus, we may write f(n) = n + 1 = o(n^2).
Review Questions
• You are given this set of growth functions: n!, 2^n, 2n^2, 5n·log n, 20n, log n.
  • For each growth function, find a value (a positive integer) for which the function is the most efficient of the six. If there is no integer value for which it is most efficient, answer "none".
  • Put the functions into their appropriate positions in a list so that the list contains all the functions in ascending order of their growth rates.
• 2-SUM: given N distinct integers, how many pairs sum to exactly zero?
  • What is the running time of your algorithm?
• 3-SUM: given N distinct integers, how many triples sum to exactly zero?
  • What is the running time of your algorithm?
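As a starting point for the 3-SUM question, the obvious brute-force solution checks every triple (a hedged C++ sketch; the name threeSum is mine, and its O(N^3) running time is exactly what the question asks you to analyze and, ideally, improve on):

```cpp
#include <vector>

// Counts triples (i < j < k) with a[i] + a[j] + a[k] == 0.
// Three nested loops => O(N^3) running time.
int threeSum(const std::vector<int>& a)
{
    int count = 0;
    int n = (int)a.size();
    for (int i = 0; i < n; i++)
        for (int j = i + 1; j < n; j++)
            for (int k = j + 1; k < n; k++)
                if (a[i] + a[j] + a[k] == 0)
                    count++;
    return count;
}
```

By the nested-loop rule from earlier in the lecture, the triple loop translates into a triple summation, giving roughly N^3/6 tests of the if condition.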
Reading Assignment
• Review of elementary data structures
• Heap and heap sort
• Hashing
• Set representations
• UNION and FIND operations
End of Lecture One