Design and Analysis of Algorithm: Unit 1
Algorithm
Unit 1
Asymptotic Notations:
methods for solving recurrence relations:
Brief Review of Graphs:
Sets and Disjoint Sets, Union:
Union-Find in Disjoint-Set Union (DSU):
Sorting Algorithms:
Divide and conquer
Strassen's matrix multiplication algorithm and its analysis:
Unit 2
Greedy
Knapsack Problem
Huffman Coding:
Job Sequencing with Deadlines:
Minimum Spanning Tree (MST)
Properties of Minimum Spanning Tree:
Algorithms for Finding Minimum Spanning Tree:
Steps for Finding Minimum Spanning Tree:
Importance and Applications:
Time Complexity:
Single Source Shortest Paths (SSSP) Problem
Bellman-Ford Algorithm: Shortest Paths with Negative Weights
Backtracking: Exploring Solutions One Path at a Time
The 8 Queens Problem: Backtracking in Action
Graph Coloring with Backtracking: Exploring Colors One Vertex at a Time
Hamiltonian Cycles and analysis using backtracking:
Unit 3
DP
Ingredients of Dynamic Programming
Unit 1
Notation: f(n) = O(g(n)) means f(n) grows no faster than g(n) asymptotically.
Example: Linear search has a time complexity of O(n), meaning its worst-case
time grows linearly with input size.
Notation: f(n) = Ω(g(n)) means f(n) grows at least as fast as g(n) asymptotically.
Example: Insertion sort on an already-sorted array runs in O(n) time; its best-case complexity is Ω(n), since it must still examine every element at least once.
Describes the tight bound of an algorithm's complexity, indicating both the upper
and lower bounds are the same.
Notation: f(n) = Θ(g(n)) means f(n) grows at the same rate as g(n)
asymptotically.
Example: Merge sort has a time complexity of Θ(n log n), meaning its average,
best, and worst-case time complexities all grow at this rate.
Time Complexity:
Key Points:
Asymptotic notations focus on long-term behavior, not exact execution times for
specific inputs.
Additional Notes:
Example:
Consider the following algorithm to find the maximum value in an array:
Algorithm:
1. Initialize max_value to the first element of the array.
2. Iterate through the rest of the array, comparing each element to max_value.
3. If the current element is larger, update max_value.
4. Return max_value.
The algorithm performs a constant amount of work for each element in the
array, leading to a linear relationship between time and input size.
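A minimal Python sketch of this algorithm (the function name is illustrative):

```python
def find_max(arr):
    max_value = arr[0]            # step 1: start with the first element
    for x in arr[1:]:             # step 2: scan the remaining elements once
        if x > max_value:         # step 3: keep the largest value seen so far
            max_value = x
    return max_value              # step 4

print(find_max([3, 7, 1, 9, 4]))  # 9
```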
Steps:
Substitute the guess into the recurrence and try to solve for the constant
factors.
Example:
Steps:
Example:
Steps:
Label each node with the cost of the corresponding function call.
Sum the costs at each level of the tree to derive a recurrence for the total
cost.
4. Master Theorem:
A powerful tool for solving recurrences of the form T(n) = aT(n/b) + f(n), where a
≥ 1, b > 1, and f(n) is an asymptotically positive function.
Provides a direct solution for common cases based on the relationship between
a, b, and f(n).
Cases:
If f(n) = Ω(n^(log_b(a) + ε)) for some ε > 0, and af(n/b) ≤ cf(n) for some
constant c < 1 and all sufficiently large n, then T(n) = Θ(f(n)).
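Worked example (a standard application, added for illustration): merge sort's recurrence is T(n) = 2T(n/2) + Θ(n). Here a = 2, b = 2, so n^(log_b a) = n; since f(n) = Θ(n) matches this, the balanced case gives T(n) = Θ(n log n). By contrast, for T(n) = 2T(n/2) + n², f(n) = n² is polynomially larger than n and satisfies the regularity condition 2(n/2)² = n²/2 ≤ c·n² with c = 1/2, so the case above yields T(n) = Θ(n²).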
5. Generating Functions:
Types of graphs:
Graph Representations:
Traversing a graph: Visiting each vertex once, using techniques like Depth-
First Search (DFS) or Breadth-First Search (BFS).
Finding shortest paths: Determining the path with the least cost (weight)
between two vertices. Dijkstra's algorithm is a popular example.
Topological sorting: Ordering vertices in a directed acyclic graph such that for
every directed edge (u, v), u comes before v in the ordering.
Applications of Graphs:
Further Exploration:
Explore different types of graphs like bipartite graphs, complete graphs, etc.
Disjoint Sets:
Example: {1, 3, 5} and {2, 4, 6} are disjoint, but {1, 3} and {2, 3} are not.
Union of Sets:
The union of two sets A and B is a new set containing all elements that are in A
or B or in both A and B (no duplicates).
Represented by A ∪ B.
Example: {1, 3, 5} ∪ {2, 4, 6} = {1, 2, 3, 4, 5, 6}.
Standard union (as described above) would result in duplicate elements if any
exist in both sets.
Implementation:
Parent-Pointer Representation:
The representative element of a set is the one whose parent points to itself.
Path compression is crucial for efficiency: During Find, update each node's
parent to point directly to the root, flattening the tree.
Union: O(α(n))
Path Compression:
During Find, updates each node's parent to point directly to the root, resulting in
flatter trees and faster subsequent Find operations.
Optimizations:
Union by Rank: Merges the smaller tree into the larger tree, keeping trees
balanced.
Applications:
Many more
Example in Python:
```python
class UnionFind:
    def __init__(self, n):
        self.parent = list(range(n))      # each element starts as its own root

    def find(self, x):
        if self.parent[x] != x:
            self.parent[x] = self.find(self.parent[x])   # path compression
        return self.parent[x]

    def union(self, x, y):
        self.parent[self.find(x)] = self.find(y)         # link the two roots
```
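A short usage sketch for the class above (element count and operations are illustrative):

```python
uf = UnionFind(5)      # elements 0..4, each in its own set
uf.union(0, 1)
uf.union(1, 2)
print(uf.find(2) == uf.find(0))  # True: 0, 1, 2 are now in one set
print(uf.find(3) == uf.find(0))  # False: 3 is still in its own set
```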
Key Points:
Sorting Algorithms:
Bubble Sort:
Time Complexity: O(n^2) (worst and average case), O(n) (best case)
Insertion Sort:
Time Complexity: O(n^2) (worst and average case), O(n) (best case)
Selection Sort:
Merge Sort:
Quick Sort:
Heap Sort:
Searching Algorithms:
Linear Search:
Binary Search:
Key Points:
For large datasets, algorithms with O(n log n) or better time complexity are
generally preferred.
Algorithm choice depends on factors like input size, data structure, desired
performance, and memory constraints.
Additional Notes:
Sorting algorithms are often used to prepare data for efficient searching.
Specialized sorting algorithms exist for particular data types (e.g., Counting Sort
for integers with a limited range).
Bubble Sort:
Insertion Sort:
Iterates through the unsorted part, inserting each element into its correct
position in the sorted part.
Selection Sort:
Repeatedly finds the minimum element in the unsorted part of the array and
swaps it with the element at the beginning of the unsorted part.
Merge Sort:
Quick Sort:
Chooses a pivot element and partitions the array into elements smaller and
larger than the pivot.
Heap Sort:
Builds a max-heap from the array, ensuring that each parent node is greater
than or equal to its children.
Repeatedly extracts the maximum element (root of the heap) and places it
at the end of the array, rebuilding the heap after each extraction.
Searching Algorithms:
Linear Search:
Iterates through the array, comparing each element with the target value
until a match is found or the end of the array is reached.
Binary Search:
Repeatedly divides the search interval in half and compares the middle
element with the target value, narrowing down the search until the target is
found or the interval becomes empty.
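As a concrete illustration, a minimal iterative binary search sketch in Python (assumes the input list is already sorted; returns the index of the target or -1):

```python
def binary_search(arr, target):
    lo, hi = 0, len(arr) - 1
    while lo <= hi:
        mid = (lo + hi) // 2        # middle of the current interval
        if arr[mid] == target:
            return mid              # target found
        elif arr[mid] < target:
            lo = mid + 1            # search the right half
        else:
            hi = mid - 1            # search the left half
    return -1                       # interval became empty: not found

print(binary_search([1, 3, 5, 7, 9, 11], 7))  # 3
```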
3. Combine: Combine the solutions to the subproblems to get the solution to the
original problem. This may involve merging or aggregating the results from each
subproblem.
The power of divide and conquer lies in its ability to handle complex problems by
breaking them down into manageable pieces. This can lead to improved efficiency,
particularly for problems where the cost of solving a subproblem is significantly less
than the cost of solving the original problem directly.
Here are some common examples of divide and conquer algorithms:
Merge Sort: Sorts a list by dividing it in half, recursively sorting each half, and
then merging the sorted halves.
Quick Sort: Chooses a pivot element and partitions the list into elements
smaller and larger than the pivot, then recursively sorts each partition.
Binary Search: Repeatedly divides the search space in half, comparing the
middle element with the target value, until the target is found or the search
space is empty.
Not all problems: Not all problems are suitable for a divide and conquer
approach.
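To make the divide-and-conquer pattern concrete, here is a minimal merge sort sketch in Python (a teaching version, not an optimized implementation):

```python
def merge_sort(arr):
    if len(arr) <= 1:                  # base case: already sorted
        return arr
    mid = len(arr) // 2
    left = merge_sort(arr[:mid])       # divide: sort each half recursively
    right = merge_sort(arr[mid:])
    merged, i, j = [], 0, 0            # combine: merge the two sorted halves
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i]); i += 1
        else:
            merged.append(right[j]); j += 1
    return merged + left[i:] + right[j:]

print(merge_sort([5, 2, 9, 1, 7]))     # [1, 2, 5, 7, 9]
```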
Divide and Conquer Approach: Breaks down matrix multiplication into smaller
subproblems, reducing the number of multiplications required compared to the
standard algorithm.
Algorithm:
1. Divide: Split matrices A and B into submatrices A11, A12, A21, A22, B11, B12,
B21, B22.
2. Conquer: Recursively compute seven products of submatrix combinations (in the standard numbering consistent with the formulas below): P1 = (A11 + A22)(B11 + B22), P2 = (A21 + A22)B11, P3 = A11(B12 - B22), P4 = A22(B21 - B11), P5 = (A11 + A12)B22, P6 = (A21 - A11)(B11 + B12), P7 = (A12 - A22)(B21 + B22).
3. Combine: Compute the final result matrix C using the seven products:
C11 = P1 + P4 - P5 + P7
C12 = P3 + P5
C21 = P2 + P4
C22 = P1 - P2 + P3 + P6
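Because only seven half-size multiplications are needed (plus a constant number of additions), the running time satisfies the standard recurrence T(n) = 7T(n/2) + Θ(n²), which solves to T(n) = Θ(n^(log₂ 7)) ≈ Θ(n^2.81), compared with Θ(n³) for the conventional algorithm.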
Complexity Analysis:
Key Points:
Additional Notes:
Strassen's algorithm is suitable for matrices with dimensions that are powers of
2.
Conclusion:
Strassen's algorithm demonstrates the power of divide and conquer to achieve
faster algorithms for complex problems, even though its practical application is
limited due to overhead and constant factors. It's a significant milestone in
theoretical computer science and has inspired further research into faster matrix
multiplication algorithms.
Unit 2
Greedy
The Greedy Method is a powerful algorithmic paradigm used to solve optimization
problems by making locally optimal choices at each step, hoping they will
cumulatively lead to the globally optimal solution. It's like navigating a maze by
always choosing the path that seems closest to the exit at each intersection, aiming
to reach the finish line as quickly as possible.
Here's a breakdown of the key features of the Greedy Method:
Key characteristics:
Simple and efficient: Often easy to implement and understand, with relatively
low computational cost.
Application areas:
Dijkstra's Algorithm: Finds the shortest path from a source node to all other
nodes in a weighted graph by iteratively relaxing edges and exploring the most
promising paths first.
Not applicable to all problems: Works best for problems with well-defined
local optimality criteria and no dependencies between choices.
Knapsack Problem
Theory:
The knapsack problem is a classic optimization problem where you must pack items
into a knapsack with a limited capacity (weight or volume) to maximize the total
There are different types of knapsack problems, each with slightly different
variations and solution strategies:
0/1 Knapsack Problem: You can only include an item once (either 0 or 1 times)
in the knapsack.
Algorithm:
This section will focus on the Dynamic Programming approach for the 0/1
Knapsack Problem. It's a powerful and efficient technique for solving knapsack
problems of any size.
1. Dynamic Programming Table:
Create a 2D table dp with n rows (number of items) and W columns (knapsack
capacity). Each cell dp[i][j] represents the maximum value achievable using only
the first i items with a weight limit of j .
If j < w_i (the current item's weight exceeds the remaining capacity), dp[i][j] = dp[i-1][j]; otherwise dp[i][j] = max(dp[i-1][j], dp[i-1][j - w_i] + v_i), i.e. the better of skipping or including item i.
The final value in the table, dp[n][W] , represents the maximum achievable value
within the knapsack capacity.
To find the actual items included in the optimal solution, backtrack through the
table. Start at dp[n][W] , and for each previous item i :
Trace back this path to identify the items in the optimal solution.
Example:
Consider a knapsack with a capacity of 5 and three items:
| Item | Weight | Value |
| --- | --- | --- |
| A | 3 | 6 |
| B | 2 | 5 |
| C | 1 | 6 |
Using the dynamic programming table approach, you'll find the optimal solution is to
include items A and C for a total value of 12.
Complexity Analysis:
Time Complexity: O(nW) due to iterating through the table with n rows and W
columns.
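A minimal bottom-up sketch of the DP table described above (function and variable names are illustrative):

```python
def knapsack_01(weights, values, W):
    n = len(weights)
    # dp[i][j] = best value using the first i items with capacity j
    dp = [[0] * (W + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        w, v = weights[i - 1], values[i - 1]
        for j in range(W + 1):
            dp[i][j] = dp[i - 1][j]                       # skip item i
            if j >= w:                                    # item fits: try taking it
                dp[i][j] = max(dp[i][j], dp[i - 1][j - w] + v)
    return dp[n][W]

# Items A, B, C from the example above: weights 3, 2, 1 and values 6, 5, 6
print(knapsack_01([3, 2, 1], [6, 5, 6], 5))   # 12 (items A and C)
```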
Additional Resources:
Huffman Coding:
Theory:
Huffman coding is a lossless data compression technique that assigns variable-
length codes to symbols based on their frequency in a message. Frequent symbols
are assigned shorter codes, while less frequent symbols get longer codes. This
reduces the overall size of the encoded message compared to fixed-length coding
schemes.
Key Principles:
Tree construction: Build a binary tree where each node represents a symbol
and its frequency.
Algorithm:
1. Frequency Table: Create a table listing each symbol and its corresponding
frequency in the message.
2. Priority Queue: Build a priority queue with nodes representing each symbol,
with frequency as the priority (lower frequency = higher priority).
3. Tree Building: Repeatedly extract the two highest-priority nodes from the
queue, combine them into a new parent node whose frequency is the sum of the
two, and insert the parent back into the queue; repeat until a single root remains.
Example:
Consider a message with the following symbol frequencies:
A: 3
B: 2
C: 1
D: 4
1. Frequency Table:
| Symbol | Frequency |
| --- | --- |
| A | 3 |
| B | 2 |
| C | 1 |
| D | 4 |
2. Tree Building (repeatedly combine the two lowest-frequency nodes):

```
        A+B+C+D (10)
         /        \
     D (4)     A+B+C (6)
                /      \
            A (3)     B+C (3)
                      /     \
                  B (2)     C (1)
```

Resulting codes (0 for the left branch, 1 for the right branch):

| Symbol | Code |
| --- | --- |
| D | 0 |
| A | 10 |
| B | 110 |
| C | 111 |

3. Encoding:
Time Complexity: O(n log n) due to priority queue operations and tree building.
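A minimal sketch of the tree-building and code-assignment steps using Python's heapq module; tie-breaking (and therefore the exact codes) may differ from the hand-worked example above, but the code lengths come out the same:

```python
import heapq

def huffman_codes(freq):
    # heap entries: (frequency, tie-breaker, node); a node is a symbol or a (left, right) pair
    heap = [(f, i, sym) for i, (sym, f) in enumerate(freq.items())]
    heapq.heapify(heap)
    counter = len(heap)
    while len(heap) > 1:
        f1, _, left = heapq.heappop(heap)      # two lowest-frequency nodes
        f2, _, right = heapq.heappop(heap)
        heapq.heappush(heap, (f1 + f2, counter, (left, right)))
        counter += 1
    codes = {}
    def walk(node, prefix):
        if isinstance(node, tuple):            # internal node: recurse into children
            walk(node[0], prefix + "0")
            walk(node[1], prefix + "1")
        else:
            codes[node] = prefix or "0"        # leaf (single-symbol edge case)
    walk(heap[0][2], "")
    return codes

print(huffman_codes({"A": 3, "B": 2, "C": 1, "D": 4}))
# e.g. {'D': '0', 'A': '10', 'C': '110', 'B': '111'}
```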
Benefits:
Limitations:
Further Resources:
NP-hardness: For problems with large numbers of jobs and tight deadlines,
finding the optimal solution becomes computationally difficult.
Algorithms:
Several algorithms can be used to solve the job sequencing with deadlines
problem. Here are two common ones:
1. Greedy Algorithm:
Schedule jobs in the order they appear in the sorted list, as long as they don't
exceed the current processor availability.
This approach is simple and efficient but may not always lead to the optimal
solution.
2. Dynamic Programming:
Build a table that stores the maximum number of jobs that can be completed
within each time slot while respecting deadlines.
This approach is more complex than the greedy algorithm but guarantees an
optimal solution for smaller problems.
Complexity Analysis:
Greedy Algorithm: O(n log n) for sorting jobs and O(n) for scheduling, leading to
a total of O(n log n) time complexity.
Dynamic Programming: O(n^2) time complexity due to iterating over the table
and considering all possible job combinations.
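A minimal sketch of the standard greedy variant that maximizes total profit, assuming each job takes one unit of time and is given as an (id, deadline, profit) triple (the input format is an assumption):

```python
def job_sequencing(jobs):
    """jobs: list of (job_id, deadline, profit); each job takes one time unit."""
    jobs = sorted(jobs, key=lambda j: j[2], reverse=True)    # highest profit first
    max_deadline = max(d for _, d, _ in jobs)
    slot = [None] * (max_deadline + 1)                       # slot[t] = job scheduled at time t
    for job_id, deadline, profit in jobs:
        for t in range(min(deadline, max_deadline), 0, -1):  # latest free slot before deadline
            if slot[t] is None:
                slot[t] = job_id
                break
    return [j for j in slot if j is not None]

# (id, deadline, profit)
print(job_sequencing([("J1", 2, 100), ("J2", 1, 19), ("J3", 2, 27), ("J4", 1, 25)]))
# ['J3', 'J1'] — J3 runs in slot 1, J1 in slot 2
```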
Additional Considerations:
Further Resources:
2. Minimality: The total weight of the edges in the MST is minimized, ensuring the
sum of edge weights is as small as possible.
It's a greedy algorithm that builds the MST by iteratively adding the shortest
edge that does not form a cycle.
It sorts all edges by weight and adds them one by one, ensuring that no
cycle is formed.
2. Prim's Algorithm:
2. Edge Selection:
Select the next edge that can be added to the MST without forming a cycle
and has the minimum weight.
3. Adding Edges:
They help optimize costs and resources by finding the most efficient way to
connect all nodes in a network with minimum total edge weight.
Time Complexity:
Kruskal's Algorithm has a time complexity of O(E log E) using efficient data
structures for edge sorting.
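A minimal Kruskal sketch in Python, reusing the union-find idea from Unit 1 (the (weight, u, v) edge-list format is an assumption):

```python
def kruskal(n, edges):
    """n vertices labelled 0..n-1; edges are (weight, u, v) tuples."""
    parent = list(range(n))
    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving
            x = parent[x]
        return x
    mst, total = [], 0
    for w, u, v in sorted(edges):           # consider edges in increasing weight
        ru, rv = find(u), find(v)
        if ru != rv:                        # different components: no cycle formed
            parent[ru] = rv
            mst.append((u, v, w))
            total += w
    return mst, total

edges = [(1, 0, 1), (4, 0, 2), (3, 1, 2), (2, 1, 3), (5, 2, 3)]
print(kruskal(4, edges))   # ([(0, 1, 1), (1, 3, 2), (1, 2, 3)], 6)
```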
Bellman-Ford Algorithm: For graphs with possibly negative edge weights (but
no negative-weight cycles).
1. Initialization:
Create a priority queue Q and insert the source vertex s with priority 0.
2. Main Loop:
Mark u as visited.
Example:
Complexity Analysis:
Time Complexity:
Further Classification:
Variants:
A* Search (heuristic-based)
Applications:
The Bellman-Ford algorithm finds the shortest paths from a single source vertex
s to all other vertices in a directed graph with potential negative edge
weights.
Unlike Dijkstra's algorithm, it can handle graphs with negative weights, but it
comes at the cost of being slower.
Algorithm:
1. Initialization:
Create a distance array dist[] initialized to infinity for all vertices except
the source s (set dist[s] = 0 ).
2. Relaxation Loop:
Example:
min(∞, 0 + 3) = 3 .
Therefore, the shortest paths are: A-B (5), A-C (6), A-D (15).
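A minimal Bellman-Ford sketch in Python (edge-list input; the sample graph below is illustrative, not the worked example above):

```python
def bellman_ford(n, edges, src):
    """n vertices 0..n-1, edges = list of (u, v, w). Returns dist[] or None on a negative cycle."""
    INF = float("inf")
    dist = [INF] * n
    dist[src] = 0
    for _ in range(n - 1):                    # relax every edge n-1 times
        for u, v, w in edges:
            if dist[u] + w < dist[v]:
                dist[v] = dist[u] + w
    for u, v, w in edges:                     # one more pass: any improvement => negative cycle
        if dist[u] + w < dist[v]:
            return None
    return dist

edges = [(0, 1, 4), (0, 2, 5), (1, 2, -3), (2, 3, 4)]
print(bellman_ford(4, edges, 0))              # [0, 4, 1, 5]
```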
Complexity Analysis:
Further Classification:
Variants:
Applications:
Game theory
Additional Notes:
The Bellman-Ford algorithm can detect negative weight cycles in the graph.
However, it is the only viable option for graphs with negative weights (assuming
no negative weight cycles).
General Method:
1. Start with a candidate solution: Begin with an initial partial solution (or no
solution at all).
2. Make a choice: Explore one possible option to extend the current solution
further.
Key Aspects:
Choices and constraints: The algorithm needs clearly defined choices (e.g.,
placing a queen on a chessboard) and constraints (e.g., queens cannot attack
Dead ends: Recognizing dead ends early helps avoid unnecessary exploration
and improves efficiency.
Data structures: Stacks and queues are commonly used to manage the
exploration state and backtrack when needed.
Examples:
Maze solving: Finding a path from the start to the end of a maze without hitting
walls.
Sudoku: Filling in the missing numbers in a Sudoku puzzle such that each row,
column, and 3x3 subgrid contains all digits from 1 to 9 exactly once.
Breadth-first search: Explores all options at the current level before going
deeper.
The Problem:
Place 8 queens on an 8x8 chessboard so that no two queens can attack each other
(diagonally, horizontally, or vertically).
2. Place the first queen: Pick any square on the board (e.g., row 1, column 1).
3. Recursively explore:
Place the second queen on a valid square: Any square that doesn't attack
the first queen (no row, column, or diagonal overlap).
Example:
For the second queen, consider (2,3) (valid), (3,1) (invalid - attacks first queen),
etc. until you find a valid placement.
Continue placing queens recursively, backtracking when you hit dead ends.
If you successfully place all 8 queens without conflicts, you've found a solution!
Implementation Tips:
Variations:
Try finding all possible solutions for the 8 queens problem (there are 92 distinct solutions).
The backtracking algorithm for the 8 queens problem exhibits both strengths and
weaknesses in terms of its time and space complexity:
Time Complexity:
The worst-case scenario involves exhaustive exploration of all invalid and valid
placements before finding a solution. On an n x n chessboard with one queen per
row, the number of candidate placements grows exponentially, on the order of n!
even after skipping attacked columns.
Space Complexity:
In the worst case, the entire solution path up to the final queen needs to be
stored, leading to a space complexity of O(n).
However, as solutions are usually found earlier, the actual space usage is often
less and depends on the specific exploration path.
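A minimal n-queens backtracking sketch that counts solutions using column and diagonal sets (an illustrative implementation, not the only way to code it):

```python
def solve_n_queens(n):
    count = 0
    cols, diag1, diag2 = set(), set(), set()
    def place(row):
        nonlocal count
        if row == n:                       # all queens placed: one full solution
            count += 1
            return
        for col in range(n):
            if col in cols or row - col in diag1 or row + col in diag2:
                continue                   # square is attacked: skip it
            cols.add(col); diag1.add(row - col); diag2.add(row + col)
            place(row + 1)                 # recurse on the next row
            cols.discard(col); diag1.discard(row - col); diag2.discard(row + col)  # backtrack
    place(0)
    return count

print(solve_n_queens(8))   # 92
```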
Algorithm:
1. Prepare: Define the graph with its vertices and edges. Choose a set of
available colors (m colors).
Check if assigning the chosen color is valid (no adjacent vertices with
the same color).
If no valid color can be assigned to the current vertex, backtrack and try
different colorings for the previous vertex.
3. Success or Failure:
If all options have been exhausted without finding a valid coloring, the graph
is not m-colorable.
Example:
Consider a graph with 4 vertices and 3 available colors (red, green, blue). You start
with the first vertex and try each color, checking if its adjacent vertices already have
that color. If not, you recursively color the rest of the graph with this chosen color. If
everything works out, you've found a solution. Otherwise, backtrack and try a different color.
Space Complexity: O(V) due to the stack used to store the current state (color
assignments) during exploration.
Further Notes:
Additional constraints like specific color preferences for certain vertices can be
incorporated into the coloring process.
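A minimal m-coloring backtracking sketch (the adjacency-list input format and the example graph are assumptions):

```python
def m_coloring(adj, m):
    """adj: dict vertex -> list of neighbours. Returns a colour assignment or None."""
    vertices = list(adj)
    color = {}
    def assign(i):
        if i == len(vertices):                           # every vertex coloured
            return True
        v = vertices[i]
        for c in range(m):
            if all(color.get(u) != c for u in adj[v]):   # no neighbour already uses c
                color[v] = c
                if assign(i + 1):
                    return True
                del color[v]                             # backtrack
        return False
    return color if assign(0) else None

adj = {"A": ["B", "C"], "B": ["A", "C", "D"], "C": ["A", "B", "D"], "D": ["B", "C"]}
print(m_coloring(adj, 3))   # e.g. {'A': 0, 'B': 1, 'C': 2, 'D': 0}
```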
A Hamiltonian cycle in a graph is a closed loop that visits every vertex exactly
once and returns to the starting vertex.
Algorithm (Backtracking):
1. Initialization:
2. Recursive Exploration:
If all vertices are visited and the current vertex is adjacent to the starting
vertex, a Hamiltonian cycle is found.
Backtrack:
Example:
Consider a graph with vertices {A, B, C, D} and edges {(A, B), (A, C), (B, C), (B, D),
(C, D)}.
Complexity Analysis:
Time Complexity: O(n!), where n is the number of vertices. In the worst case,
the algorithm tries all possible permutations of vertices.
Space Complexity: O(n), due to the recursive calls and the path array.
Because backtracking explores the search space exhaustively, it will eventually find a
Hamiltonian cycle if one exists, but it can spend an enormous amount of time stuck in
parts of the graph that don't lead to a solution.
For large graphs, it's often impractical to find Hamiltonian cycles using
backtracking. Heuristic or approximation algorithms might be more suitable for
real-world applications.
Further Notes:
Unit 3
DP
Dynamic programming (DP) is a powerful technique for solving optimization
problems that exhibit optimal substructure and overlapping subproblems.
Here's a breakdown of the key concepts:
What is Dynamic Programming?
A technique for solving complex problems by breaking them down into smaller,
overlapping subproblems and storing the solutions to these subproblems to
avoid redundant computation.
Imagine climbing stairs: instead of calculating how many ways to reach each
step from the bottom each time, DP stores the solutions for each step, making
computations for higher steps efficient.
Key Characteristics:
Benefits of DP:
Examples of DP Algorithms:
Shortest path problems: Finding the shortest path between two points in a
graph by storing minimum distances to intermediate nodes.
3. Build the table: Fill the table bottom-up or top-down, solving subproblems and
utilizing values already stored.
4. Construct the solution: Use the stored subproblem solutions to build the
optimal solution for the original problem.
This is the bedrock of DP. The problem must be able to be decomposed into
smaller subproblems, where the optimal solution for the whole can be
assembled by combining the optimal solutions of its parts. Think of a jigsaw
puzzle; each piece contributes to the complete picture.
2. Overlapping Subproblems:
Imagine making many batches of cookies - you wouldn't re-invent the wheel for
each batch. Similarly, subproblems in DP often appear repeatedly throughout
the solution process. Recognizing and efficiently storing their solutions avoids
redundant calculations, saving time and effort.
3. Memoization/Tabulation:
These are the cooking methods of DP. Recursion breaks down the problem into
smaller calls of itself, eventually reaching the base case and building up the
solution. Iteration, on the other hand, systematically fills the memoization table
or calculates solutions for all subproblems in a loop.
5. Base Cases:
6. Efficiency Trade-Offs:
Given a sequence of matrices A₁ through Aₙ, where A_i has dimensions p_(i-1) × p_i
(so consecutive matrices are compatible for multiplication), find the order in which to
multiply them such that the total number of scalar multiplications is minimized.
Naive Approach:
3. Recurrence relation: For i < j, consider all possible split points k (i ≤ k < j) and
define:
M[i,j] = min over i ≤ k < j of ( M[i,k] + M[k+1,j] + p_(i-1) · p_k · p_j )
This checks, for every split point k, the cost of multiplying A_i…A_k and A_(k+1)…A_j,
plus the cost of multiplying the two resulting matrices (which have dimensions
p_(i-1) × p_k and p_k × p_j).
2. Construct the optimal order: Backtrack through the table from M[1,n] to
identify the split points k that contributed to the minimum cost. This reveals the
optimal order of multiplication.
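A minimal sketch of the table computation, taking a dimension array p where matrix A_i is p[i-1] x p[i] (the example dimensions are illustrative):

```python
def matrix_chain_order(p):
    n = len(p) - 1                       # number of matrices
    INF = float("inf")
    M = [[0] * (n + 1) for _ in range(n + 1)]
    for length in range(2, n + 1):       # chain length
        for i in range(1, n - length + 2):
            j = i + length - 1
            M[i][j] = INF
            for k in range(i, j):        # try every split point
                cost = M[i][k] + M[k + 1][j] + p[i - 1] * p[k] * p[j]
                M[i][j] = min(M[i][j], cost)
    return M[1][n]

# A1: 10x30, A2: 30x5, A3: 5x60
print(matrix_chain_order([10, 30, 5, 60]))   # 4500
```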
Complexity Analysis:
Benefits:
Extensions:
Given two strings X and Y, find the longest subsequence that is common to both.
Naive Approach:
Brute-forcing all possible subsequences of X and comparing them to Y is inefficient
and impractical for long strings. This approach leads to exponential time complexity.
DP Solution:
1. Define subproblems: Let L(i,j) represent the length of the LCS for the prefixes
of X up to index i and Y up to index j.
If X[i] == Y[j]: L(i,j) = L(i-1,j-1) + 1 (the matching character extends the common
subsequence).
If X[i] != Y[j]: L(i,j) = max(L(i-1,j), L(i,j-1)) (take the longer subsequence from
either neglecting X[i] or Y[j]).
2. Construct the LCS: Backtrack through the table from L(m,n) (where m and n
are the lengths of X and Y, respectively) to identify the matching characters that
contribute to the LCS. This reveals the longest common sequence.
Time complexity: O(mn) due to iterating through all subproblems in the table.
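A minimal sketch of the LCS length computation following the recurrence above:

```python
def lcs_length(X, Y):
    m, n = len(X), len(Y)
    L = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if X[i - 1] == Y[j - 1]:              # characters match: extend the LCS
                L[i][j] = L[i - 1][j - 1] + 1
            else:                                  # otherwise drop one character
                L[i][j] = max(L[i - 1][j], L[i][j - 1])
    return L[m][n]

print(lcs_length("ABCBDAB", "BDCABA"))   # 4 (e.g. "BCBA")
```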
Benefits:
Finds the optimal LCS length and sequence for any pair of strings.
Extensions:
Overall, DP offers a fast and effective solution for the LCS problem. Its ability
to dynamically solve subproblems and reuse their solutions makes it a powerful
tool for string comparison and related optimization tasks.
Given a set of keys with associated search probabilities, construct a binary search
tree (BST) that minimizes the average search cost (expected number of
comparisons needed to find a key).
Dynamic Programming Approach:
1. Define Subproblems:
Let E[i, j] represent the minimum expected search cost for a BST
containing keys i to j (inclusive).
2. Base Cases:
3. Recurrence Relation:
For i ≤ j:
E[i, j] = min over i ≤ r ≤ j of { E[i, r-1] + E[r+1, j] + W(i, j) }
where W(i, j) is the sum of the search probabilities of keys i through j. The formula
tries every key r as the root of the subtree on keys i..j and chooses the one
minimizing the overall cost.
4. Memoization Table:
5. Construction:
Backtrack through the table to determine the optimal root choices for each
subproblem, constructing the OBST.
Complexity Analysis:
Additional Considerations:
Weighted External Path Length: Sum of path lengths from the root to each
external node, weighted by their search probabilities.
Variations:
Efficiently finds the optimal tree structure for any set of keys and probabilities.
Problem:
Given a set of N items with weights w[i] and values v[i] , and a knapsack capacity
of W , find the subset of items that maximizes the total value within the weight
constraint.
2. Base Cases:
3. Recurrence Relation:
Consider item i :
4. Memoization Table:
5. Optimal Solution:
Backtracking (Optional):
Traverse the table backwards to identify the chosen items in the optimal
solution.
Complexity Analysis:
Benefits:
Applications:
Overall, dynamic programming proves its power in tackling the 0-1 knapsack
problem. Its ability to optimize subproblems and reuse their solutions makes
it a valuable tool for a wide range of optimization challenges.
Analysis:
Approximation: While not always optimal, DP-based algorithms can offer near-
optimal solutions within an acceptable time frame.
Insights: Studying how DP algorithms build solutions can reveal patterns and
properties of the TSP landscape, aiding in the development of heuristic or
approximation algorithms.
Computational complexity: Even with DP, solving TSP for large graphs
remains computationally expensive.
Floyd-Warshall Algorithm:
Problem: Finds all-pairs shortest paths in a weighted graph, meaning the shortest
path between every pair of vertices.
Key Features:
Algorithm:
1. Initialization:
2. Iterative Updates:
Analysis:
Key Points:
It can detect negative cycles if any D[i][i] becomes negative during the
process.
It's often a preferred choice for dense graphs with many edges.
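A minimal Floyd-Warshall sketch over an adjacency matrix, with float('inf') marking missing edges (the sample matrix is illustrative):

```python
def floyd_warshall(D):
    """D: n x n matrix of edge weights (inf where no edge, 0 on the diagonal). Updated in place."""
    n = len(D)
    for k in range(n):                  # allow vertex k as an intermediate stop
        for i in range(n):
            for j in range(n):
                if D[i][k] + D[k][j] < D[i][j]:
                    D[i][j] = D[i][k] + D[k][j]
    return D

INF = float("inf")
D = [[0, 3, INF, 7],
     [8, 0, 2, INF],
     [5, INF, 0, 1],
     [2, INF, INF, 0]]
print(floyd_warshall(D))   # prints the all-pairs shortest-path distance matrix
```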
Applications:
Strengths:
Limitations:
Key Features:
Bounds and pruning: Utilizes upper and lower bounds on the objective
function to estimate the potential solutions of subproblems and discard those
unlikely to yield the optimal result.
Method:
1. Start with the original problem: Define the search space, objective function,
and initial bounds.
3. Calculate bounds: Evaluate the objective function or estimate upper and lower
bounds for each subproblem.
7. Stop when there are no more promising branches: The current best solution
is the optimal solution (within the accuracy of the bounds).
Complexity Analysis:
The complexity of B&B depends heavily on the problem, the effectiveness of its
branching and bounding strategies, and the size of the search space.
Best-case: O(n log n) if effective pruning eliminates most of the search space.
Strengths:
Flexibility: Applicable to various problem types with defined search spaces and
objective functions.
Heuristics can improve efficiency: Combining B&B with good heuristics can
significantly reduce the search space and improve efficiency.
Limitations:
May not find feasible solutions: For problems with no feasible solutions, B&B
can get stuck exploring infeasible branches.
Branching Strategy:
Each level of the branch and bound tree represents a decision point: include or
exclude an item.
The left branch corresponds to including the item and the right branch to
excluding it.
Bounding Function:
A simple bound can be the sum of the values of the remaining items that
haven't been included yet.
We can also use more sophisticated bounds like fractional knapsack values to
tighten the bound and prune more branches.
Pruning Criteria:
A branch can be pruned if its upper bound (the best total value any completion of
that branch could achieve) is already less than the current best solution (maximum
value found so far).
This guarantees that no further exploration down that branch can lead to a
better solution.
Algorithm Outline:
2. Explore the topmost level of the branch and bound tree, creating two branches
for each item.
If the branch's upper bound is no better than the current best solution, prune the branch.
5. The current best solution at the end of the exploration is the optimal solution for
the knapsack problem.
Advantages:
Disadvantages:
Overall, Branch and Bound offers a powerful approach to solving the 0/1
knapsack problem, especially for smaller instances or with good bounding
functions. Its ability to systematically explore and prune the search space
helps guarantee optimality, making it a valuable tool for discrete optimization
problems.
Each level of the B&B tree represents a partial tour visited so far.
Branches are created by choosing the next city to visit from the unvisited ones.
Bounding Function:
Estimating the remaining distance to complete the tour is crucial for pruning
non-promising branches.
MST-based (Christofides-style) bound: the weight of a minimum spanning tree over
the unvisited cities is a lower bound on the cost of completing the tour; Christofides'
heuristic itself (MST plus minimum-weight matching) yields a tour within 1.5x of
optimal on metric instances.
Pruning Criteria:
This avoids exploring potentially worse paths and speeds up the search.
Algorithm Outline:
2. Explore the topmost level of the B&B tree, creating branches for each remaining
city.
If the lower bound is greater than the current best solution, prune the
branch.
5. The current best solution at the end of the exploration is the optimal solution
found using B&B.
Advantages:
Can guarantee finding a good, though not necessarily the absolute shortest,
solution.
Disadvantages:
Not guaranteed to find the optimal solution for all TSP instances.
Overall, Branch and Bound, coupled with effective bounding functions and
pruning strategies, can be a valuable tool for tackling the TSP, especially for
smaller instances or finding good approximate solutions. For larger
problems, alternative algorithms like heuristics or metaheuristics might be
more efficient for finding acceptable solutions within a reasonable time frame.
Finds all occurrences of a pattern string (P) within a text string (T).
Algorithm:
2. Compare characters:
3. Match Found:
4. Repeat:
Analysis:
Space Complexity: O(1), as only a constant amount of extra space is used for
variables and pointers.
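A minimal sketch of the naive matcher, returning every starting index of P in T:

```python
def naive_search(T, P):
    matches = []
    for i in range(len(T) - len(P) + 1):     # each possible alignment of P in T
        if T[i:i + len(P)] == P:             # compare characters at this shift
            matches.append(i)
    return matches

print(naive_search("ababcabab", "abab"))     # [0, 5]
```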
Key Points:
Simple and easy to understand, but can be inefficient for large texts and
patterns.
Strengths:
Simple implementation.
Limitations:
Inefficient for large texts and patterns due to repeatedly comparing characters.
Doesn't leverage any knowledge about the pattern or text structure to optimize
comparisons.
Algorithm:
3. Rolling hash and comparison: If the hashes match, compare the characters
individually to confirm an actual match. Otherwise, shift the window by one
character in T, recalculate the hash for the new window (using the rolling hash
property), and compare it to the pattern's hash.
Analysis:
Time Complexity: O(n + m) on average, where m is the length of P and n is the length
of T; the worst case degrades to O(nm) when many spurious hash matches occur.
The rolling hash computation significantly reduces redundant comparisons
compared to the naive algorithm.
Space Complexity: O(1), as only a constant amount of extra space is used for
variables and temporary values.
Key Points:
Offers significant efficiency improvement over the naive algorithm for large texts
and patterns.
Strengths:
Simple implementation.
Limitations:
The DFA transitions between states based on the characters it reads in the text
string.
A match is found when the DFA reaches an accepting state, indicating the
pattern has been found within the text.
Construction:
1. States: Create states for each possible prefix of the pattern, including an initial
state and an accepting state for the full pattern.
Algorithm:
Follow the corresponding transition in the DFA based on the current state
and the character.
3. Match Found: If the DFA reaches the accepting state, a match is found at the
current position in the text.
4. Repeat: Continue reading characters and following transitions until the end of
the text is reached.
Analysis:
Time Complexity: O(n), where n is the length of the text string. Each character
is processed only once.
Space Complexity: O(m), where m is the length of the pattern string, to store
the DFA.
Strengths:
Limitations:
Key Idea: Preprocesses the pattern to create a "partial match" table (also called
"failure function"), guiding efficient shifts during the search to avoid redundant
comparisons.
Algorithm:
Create an array lps (longest proper prefix which is also a suffix) of length
m (pattern length).
Initialize lps[0] = 0.
For i = 1 to m-1:
If P[i] extends the previous border (P[i] == P[lps[i-1]]), set lps[i] = lps[i-1] + 1.
Else (mismatch): fall back to progressively shorter prefixes using the lps values
already computed, until a match is found or the prefix length reaches 0.
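A minimal sketch of the lps construction and the search that uses it (variable names are illustrative):

```python
def build_lps(P):
    lps = [0] * len(P)
    length = 0                        # length of the current matched prefix
    for i in range(1, len(P)):
        while length > 0 and P[i] != P[length]:
            length = lps[length - 1]  # fall back to the next shorter prefix
        if P[i] == P[length]:
            length += 1
        lps[i] = length
    return lps

def kmp_search(T, P):
    lps, matches, q = build_lps(P), [], 0
    for i, ch in enumerate(T):
        while q > 0 and ch != P[q]:
            q = lps[q - 1]            # shift the pattern using the table
        if ch == P[q]:
            q += 1
        if q == len(P):               # full match ending at position i
            matches.append(i - len(P) + 1)
            q = lps[q - 1]
    return matches

print(build_lps("ababaca"))             # [0, 0, 1, 2, 3, 0, 1]
print(kmp_search("ababcabab", "abab"))  # [0, 5]
```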
Analysis:
Time Complexity: O(n + m) in the worst case, where n is the text length and m
is the pattern length. Preprocessing takes O(m), and searching takes O(n) because
the lps-guided shifts never re-examine characters that have already matched.
Key Advantages:
Avoids redundant comparisons by using the partial match table to guide shifts.
Strengths:
Limitations:
In conclusion, the KMP algorithm stands out as a highly efficient and elegant
approach to string matching, offering optimal worst-case time complexity and
clever strategies to avoid redundant comparisons. Its ability to leverage
pattern structure makes it a valuable tool for various text search tasks,
especially when time efficiency is critical.
O(n): Linear time - increases proportionally with input size (e.g., traversing
a list).
O(log n): Logarithmic time - increases logarithmically with input size (e.g.,
binary search).
3. Space Complexity:
Just like time, we also analyze the memory requirements of algorithms using
Big O notation.
O(n): Linear space - increases proportionally with input size (e.g., storing
every element in a list).
O(log n): Logarithmic space - increases logarithmically with input size (e.g.,
recursion with constant additional memory per level).
For these classes, no known efficient algorithms exist for all instances, although
specialized methods may work for specific cases.
Polynomial Complexity:
Algorithms with a time or space complexity that grows as a polynomial function
of the input size (e.g., O(n), O(n^2), O(log n)).
Key Characteristics:
Sorting algorithms like merge sort and quicksort (O(n log n)).
Non-Polynomial Complexity:
Algorithms with a time or space complexity that grows faster than any
polynomial function of the input size (e.g., O(2^n), O(n!), O(n^n)).
Key Characteristics:
Examples:
Key Differences:
Practicality: polynomial algorithms are tractable for most inputs, whereas non-polynomial ones are often intractable for larger inputs.
Implications:
Selecting appropriate algorithms for specific tasks based on input size and
resource constraints.
NP-Completeness:
Imagine problems where verifying a proposed solution is easy in polynomial
time, but finding that solution in the first place seems much harder. NP-
complete problems fall into this category.
Key Characteristics:
Examples:
Key Characteristics:
Examples:
3-SAT problem.
The big question is: are there any NP-complete problems that are also easy
to solve (P = NP)? This remains an unsolved problem in theoretical computer
science, with significant implications for the field.
Research in this area focuses on proving whether specific problems are NP-
complete, finding efficient approximations, or even proving P = NP (which would
overturn current complexity assumptions).
Approximation Algorithms
When Optimal Solutions Are Elusive:
Key Idea: Accept a slight deviation from the optimal solution in exchange for
significantly faster computation time.
Common Techniques:
Applications:
This process continues until all necessary comparisons are made, guaranteeing the
correct sorting of the input sequence.
Flow networks are crucial in modeling and solving flow-related problems such
as network flow optimization, circulation, and transportation problems.
Understanding the flow and operations within sorting networks aids in hardware
optimization, while comprehending flow networks is essential for solving various
flow-related optimization problems in different domains.
The Ford-Fulkerson algorithm and its connection to sorting networks are two
intriguing topics in algorithm design and analysis. Here's a breakdown of each:
Efficiency: Runs in O(mf) time, where m is the number of edges and f is the
maximum flow value.
Limitations:
Efficient for small data: Can be faster than software sorting algorithms for small
datasets.
Challenges:
Scalability: Designing efficient networks for large data sizes can be complex.
Examples:
Algorithms:
Greedy / augmenting-path algorithms: Start from an empty matching and repeatedly
add promising edges or augmenting paths to enlarge it; the Hungarian algorithm
extends these ideas to the weighted (assignment) version of the problem.
Network flow algorithms: Convert the matching problem into a flow network
problem and utilize flow optimization techniques. Ford-Fulkerson algorithm can
be adapted for this purpose.
Properties:
Kőnig's Theorem: The size of the maximum matching equals the minimum
size of a vertex cover in the graph (a set of vertices that includes at least one
endpoint of every edge). This provides a valuable connection between matching
and covering problems.
Hall's Marriage Theorem: A bipartite graph has a matching that saturates one side
(and hence a perfect matching when both sides have equal size) if and only if every
subset S of that side has at least |S| neighbors on the other side. This theorem allows
for efficient checks for the existence of perfect matchings.
Applications:
Complexity:
Finding the maximum matching in a bipartite graph can be done in O(E √V) time
with the Hopcroft–Karp algorithm, where V is the number of vertices and E is the
number of edges.
Conclusion:
Maximum bipartite matching is a powerful tool for solving real-world problems
involving assignments and pairings. With diverse algorithms, theoretical
connections to other graph problems, and numerous applications, it's a valuable
concept in theoretical and practical computer science.
Comparison Network:
A comparison network is a network of comparators used to sort elements. It
represents a sorting algorithm with a fixed topology of comparators that compare
pairs of elements. These comparators perform comparisons and swaps to achieve
the sorted order.
Properties:
Zero-One Principle:
The zero-one principle is a concept used in sorting networks. It states that for a
sorting network to correctly sort all inputs, it’s sufficient to test it with all possible
combinations of 0s and 1s (binary inputs) within a fixed size. If it correctly sorts all
possible binary inputs, it will sort all inputs, confirming its correctness.
Merging Network:
A merging network is a network designed to merge or combine two sorted
sequences into a single sorted sequence. It uses a series of comparators to merge
the sequences by comparing elements and arranging them in sorted order.
Comparators: Elements that compare two inputs and swap them if they are in
the wrong order.
Functionality: Inputs enter at the top, flow through comparators, and exit
sorted at the bottom.
Properties:
Zero-One Principle:
Key Idea: A comparison network works correctly for all inputs if and only if it
works correctly for all inputs consisting of only 0s and 1s.
Proof: Relies on the fact that comparators only consider the relative order of
inputs, not their specific values.
Steps:
Merging Network:
Building Blocks: Merging networks are often used as building blocks for larger
sorting networks, including bitonic sorting networks.
Key Applications:
Parallel Sorting: Bitonic sorting networks and other comparison networks can
be efficiently parallelized for sorting on multi-processor systems.
Additional Insights:
NOTE
made with Gemini Pro, so content may be wrong :)
updated:
https://yashnote.notion.site/DAA-1082ac510a41474dbcac15d4923aef46?pvs=4