DSA - Unit V
• More formally, a graph is composed of a set of vertices (V) and a set of edges (E). The graph is denoted
by G(V, E).
Directed edge: an edge with an orientation, going from one vertex to the other (drawn as an arrow).
Undirected edge: an edge with no orientation; it can be traversed in either direction.
○ Undirected graph:
■ all the edges are undirected
■ Example: flight network
• When an edge connects two vertices, the vertices are said to be adjacent to each other and the
edge is incident on both vertices.
A cycle is a path where the first and last vertices are the same. A simple cycle is a cycle with no repeated
vertices or edges (except the first and last vertices).
We say that one vertex is connected to another if there is a path that contains both of them.
• A graph is connected if there is a path from every vertex to every other vertex.
• A spanning tree of a connected graph is a subgraph that contains all of that graph’s vertices and is a
single tree. A spanning forest of a graph is the union of spanning trees of its connected components.
• A bipartite graph is a graph whose vertices can be divided into two sets such that all edges connect a
vertex in one set with a vertex in the other set.
• In weighted graphs, a number (weight) is assigned to each edge to represent a distance or cost.
• Graphs with relatively few edges (generally, if |E| < |V| log |V|) are called
sparse graphs.
• Graphs with relatively few of the possible edges missing are called dense.
Applications of Graphs
Graph Representation
As in other ADTs, to manipulate graphs we need to represent them in some useful form. Basically, there
are three ways of doing this:
• Adjacency Matrix
• Adjacency List
• Adjacency Set
Adjacency Matrix
• Graph Declaration for Adjacency Matrix To represent graphs, we need the number of vertices,
the number of edges and also their interconnections. So, the graph can be declared as:
• In this method, we use a matrix of size V × V whose values are boolean. Let us assume
the matrix is Adj. The value Adj[u][v] is set to 1 if there is an edge from vertex u to vertex v, and
to 0 otherwise.
• In the matrix, each edge is represented by two entries for undirected graphs. That means, an edge
between u and v is represented by a 1 value in both Adj[u][v] and Adj[v][u]. To save time, we can process
only half of this symmetric matrix. Also, we can assume that there is an “edge” from each vertex
to itself, so Adj[u][u] is set to 1 for all vertices.
The adjacency matrix for this graph can be given as:
• Now, let us concentrate on the implementation. To read a graph, one way is to first read the
vertex names and then read pairs of vertex names (edges). The code below reads an undirected
graph.
• The adjacency matrix representation is good if the graphs are dense. The matrix requires O(V²)
bits of storage and O(V²) time for initialization.
• If the number of edges is proportional to V², then there is no problem, because V² steps are
required to read the edges anyway. If the graph is sparse, the initialization of the matrix dominates the
running time of the algorithm, as it takes O(V²).
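The declaration and edge-reading steps described above can be sketched as follows; the class and field names are illustrative choices, not taken from the text:

```java
import java.util.Scanner;

// Adjacency-matrix graph: declaration plus a loop that reads an
// undirected graph as pairs of vertex indices.
class MatrixGraph {
    int vertexCount;
    int edgeCount;
    int[][] adjMatrix;   // adjMatrix[u][v] == 1 iff there is an edge u-v

    MatrixGraph(int vertices) {
        vertexCount = vertices;
        adjMatrix = new int[vertices][vertices];   // O(V^2) storage, zero-initialised
        for (int u = 0; u < vertices; u++)
            adjMatrix[u][u] = 1;   // assume an "edge" from each vertex to itself
    }

    // Undirected edge: set both symmetric entries.
    void addEdge(int u, int v) {
        adjMatrix[u][v] = 1;
        adjMatrix[v][u] = 1;
        edgeCount++;
    }

    // Read vertex count, edge count, then the edge pairs.
    static MatrixGraph read(Scanner in) {
        MatrixGraph g = new MatrixGraph(in.nextInt());
        int edges = in.nextInt();
        for (int i = 0; i < edges; i++)
            g.addEdge(in.nextInt(), in.nextInt());
        return g;
    }

    public static void main(String[] args) {
        MatrixGraph g = MatrixGraph.read(new Scanner("3 2  0 1  1 2"));
        System.out.println(g.adjMatrix[0][1] + " " + g.adjMatrix[1][0]); // prints 1 1
    }
}
```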
Adjacency List
• Graph Declaration for Adjacency List In this representation all the vertices connected to a vertex
v are listed on an adjacency list for that vertex v. This can be easily implemented with linked
lists.
• That means, for each vertex v we use a linked list, and the list nodes represent the connections
between v and the other vertices to which v has an edge. The total number of linked lists is equal to
the number of vertices in the graph.
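The adjacency-list idea can be sketched as follows; the class name `AdjListGraph` and the use of `ArrayList` (in place of hand-written linked lists) are illustrative choices:

```java
import java.util.ArrayList;
import java.util.List;

// Minimal adjacency-list graph: one list of neighbours per vertex.
class AdjListGraph {
    private final List<List<Integer>> adj;

    AdjListGraph(int vertices) {
        adj = new ArrayList<>();
        for (int v = 0; v < vertices; v++)
            adj.add(new ArrayList<>());   // one list per vertex
    }

    // Undirected edge: record it on both endpoints' lists.
    void addEdge(int u, int v) {
        adj.get(u).add(v);
        adj.get(v).add(u);
    }

    List<Integer> neighbours(int v) {
        return adj.get(v);
    }

    public static void main(String[] args) {
        AdjListGraph g = new AdjListGraph(4);
        g.addEdge(0, 1);
        g.addEdge(0, 2);
        g.addEdge(2, 3);
        System.out.println(g.neighbours(0)); // prints [1, 2]
    }
}
```

Storage here is proportional to |V| + |E|, which is why this representation suits sparse graphs.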
Comparison of Graph Representations
• Directed and undirected graphs are represented with the same structures. For directed graphs,
everything is the same, except that each edge is represented just once: an edge from x to y is
represented by a 1 value in Adj[x][y] in the adjacency matrix, or by adding y to x’s adjacency
list. For weighted graphs, everything is the same, except that the adjacency matrix is filled with weights
instead of boolean values.
Graph Traversals
• To solve problems on graphs, we need a mechanism for traversing the graphs. Graph traversal
algorithms are also called graph search algorithms. Like tree traversal algorithms (Inorder,
Preorder, Postorder and Level-Order traversals), graph search algorithms can be thought of as
starting at some source vertex in a graph and “searching” the graph by going through the edges
and marking the vertices. Now, we will discuss two such algorithms for traversing the graphs.
• Depth First Search [DFS]
• Breadth First Search [BFS]
class Vertex {
    public char label;
    public boolean visited;

    public Vertex(char lab) {
        label = lab;
        visited = false;
    }
}

class Graph {
    private final int maxVertices = 20;
    private Vertex[] vertexList;
    private int[][] adjMatrix;
    private int vertexCount;
    private Queue theQueue;   // the queue class used for BFS

    public Graph() {
        vertexList = new Vertex[maxVertices];
        adjMatrix = new int[maxVertices][maxVertices];
        vertexCount = 0;
        for (int y = 0; y < maxVertices; y++)
            for (int x = 0; x < maxVertices; x++)
                adjMatrix[x][y] = 0;
        theQueue = new Queue();
    }

    public void addVertex(char lab) {
        vertexList[vertexCount++] = new Vertex(lab);
    }

    public void addEdge(int start, int end) {
        adjMatrix[start][end] = 1;
        adjMatrix[end][start] = 1;
    }

    public void displayVertex(int v) {
        System.out.print(vertexList[v].label);
    }
}
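A BFS routine in the spirit of the class above can be sketched as a self-contained, adjacency-matrix version; the class and method names below are illustrative:

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

// Breadth-first search over an adjacency matrix: visit the start vertex,
// then repeatedly dequeue a vertex and enqueue its unvisited neighbours.
class BfsDemo {
    static List<Integer> bfs(int[][] adj, int start) {
        int n = adj.length;
        boolean[] visited = new boolean[n];
        List<Integer> order = new ArrayList<>();
        Queue<Integer> queue = new ArrayDeque<>();
        visited[start] = true;
        queue.add(start);
        while (!queue.isEmpty()) {
            int u = queue.remove();
            order.add(u);
            for (int v = 0; v < n; v++) {   // O(V) scan per vertex -> O(V^2) total
                if (adj[u][v] == 1 && !visited[v]) {
                    visited[v] = true;
                    queue.add(v);
                }
            }
        }
        return order;
    }

    public static void main(String[] args) {
        int[][] adj = {
            {0, 1, 1, 0},
            {1, 0, 0, 1},
            {1, 0, 0, 1},
            {0, 1, 1, 0}
        };
        System.out.println(bfs(adj, 0)); // prints [0, 1, 2, 3]
    }
}
```

With adjacency lists, the inner scan touches only actual neighbours, giving the O(V + E) bound noted below.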
As an example, let us consider the same graph as that of the DFS example. The BFS traversal can be
shown as:
Time complexity of BFS is O(V + E) if we use adjacency lists for representing the graph, and O(V²)
for the adjacency matrix representation.
Applications of BFS
• Finding all connected components in a graph
• Finding all nodes within one connected component
• Finding the shortest path between two nodes
• Testing a graph for bipartiteness
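The bipartiteness test in the list above can be sketched via BFS two-colouring; the class name and adjacency-matrix encoding are illustrative:

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Queue;

// BFS two-colouring: a graph is bipartite iff no edge ever joins two
// vertices that received the same colour during the traversal.
class BipartiteCheck {
    static boolean isBipartite(int[][] adj) {
        int n = adj.length;
        int[] colour = new int[n];
        Arrays.fill(colour, -1);          // -1 = uncoloured
        for (int s = 0; s < n; s++) {     // loop handles disconnected graphs
            if (colour[s] != -1) continue;
            colour[s] = 0;
            Queue<Integer> queue = new ArrayDeque<>();
            queue.add(s);
            while (!queue.isEmpty()) {
                int u = queue.remove();
                for (int v = 0; v < n; v++) {
                    if (adj[u][v] == 0) continue;
                    if (colour[v] == -1) {
                        colour[v] = 1 - colour[u];   // opposite colour
                        queue.add(v);
                    } else if (colour[v] == colour[u]) {
                        return false;     // same colour on both ends of an edge
                    }
                }
            }
        }
        return true;
    }

    public static void main(String[] args) {
        int[][] evenCycle = {{0,1,0,1},{1,0,1,0},{0,1,0,1},{1,0,1,0}};
        int[][] triangle  = {{0,1,1},{1,0,1},{1,1,0}};
        System.out.println(isBipartite(evenCycle)); // prints true
        System.out.println(isBipartite(triangle));  // prints false
    }
}
```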
Comparing DFS and BFS
• Comparing BFS and DFS, the big advantage of DFS is that it has much lower memory
requirements than BFS because it’s not required to store all of the child pointers at each level.
Depending on the data and what we are looking for, either DFS or BFS can be advantageous. For
example, in a family tree if we are looking for someone who’s still alive and if we assume that
person would be at the bottom of the tree, then DFS is a better choice. BFS would take a very
long time to reach that last level.
Applications of Shortest Path Algorithms
Let s = C. The distance from C to C is 0. Initially, the distances to all other nodes are not computed, and we
initialize the second column in the distance table for all vertices (except C) with -1, as shown below.
Algorithm
➢ Running time: O(|E| + |V|), if adjacency lists are used. In the for loop, we check the outgoing
edges of a given vertex, and the sum of all edges examined across iterations of the while loop equals the
number of edges, which gives O(|E|).
➢ If we use the matrix representation, the complexity is O(|V|²), because we need to read an entire row
of the matrix, of length |V|, in order to find the adjacent vertices of a given vertex.
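The unweighted shortest path computation described above (distance table initialized to -1, vertices processed in BFS order) can be sketched as:

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Queue;

// Unweighted shortest path: BFS from the source, filling a distance
// table that starts at -1 ("not yet computed") for every other vertex.
class UnweightedShortestPath {
    static int[] shortestPaths(int[][] adj, int source) {
        int n = adj.length;
        int[] distance = new int[n];
        Arrays.fill(distance, -1);   // -1 marks "distance not computed"
        distance[source] = 0;        // distance from the source to itself is 0
        Queue<Integer> queue = new ArrayDeque<>();
        queue.add(source);
        while (!queue.isEmpty()) {
            int u = queue.remove();
            for (int v = 0; v < n; v++) {
                if (adj[u][v] == 1 && distance[v] == -1) {
                    distance[v] = distance[u] + 1;   // one edge further than u
                    queue.add(v);
                }
            }
        }
        return distance;
    }

    public static void main(String[] args) {
        // Path graph 0 - 1 - 2 - 3
        int[][] adj = {{0,1,0,0},{1,0,1,0},{0,1,0,1},{0,0,1,0}};
        System.out.println(Arrays.toString(shortestPaths(adj, 0))); // prints [0, 1, 2, 3]
    }
}
```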
➢ A famous solution for the shortest path problem was developed by Dijkstra. Dijkstra’s algorithm
is a generalization of the BFS algorithm. The regular BFS algorithm cannot solve the shortest
path problem as it cannot guarantee that the vertex at the front of the queue is the vertex closest
to source s.
➢ Before going to the code, let us understand how the algorithm works. As in the unweighted shortest path
algorithm, here too we use the distance table. The algorithm works by keeping the shortest
distance of vertex v from the source in the Distance table. The value Distance[v] holds the
distance from s to v. The shortest distance of the source to itself is zero. The Distance table entries for
all other vertices are set to -1 to indicate that those vertices are not yet processed.
Dijkstra’s Algorithm
• It uses the greedy method: always pick the next closest vertex to the source.
• It uses a priority queue to store the unvisited vertices, ordered by distance from s.
• It does not work with negative weights.
Difference between Unweighted Shortest Path and Dijkstra’s Algorithm
1) To represent weights in the adjacency list, each vertex contains the weights of the edges (in
addition to their identifier).
2) Instead of ordinary queue we use priority queue [distances are the priorities] and the vertex
with the smallest distance is selected for processing.
3) The distance to a vertex is calculated by the sum of the weights of the edges on the path from
the source to that vertex.
4) We update the distances in case the newly computed distance is smaller than the old distance
which we have already computed.
➢ The above algorithm can be better understood through an example, which will explain each
step that is taken and how Distance is calculated. The weighted graph below has 5 vertices,
from A to E. The value between two vertices is known as the edge cost between those vertices.
➢ For example, the edge cost between A and C is 1. Dijkstra’s algorithm can be used to find the
shortest path from source A to the remaining vertices in the graph.
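Since the figure's full edge weights are not reproduced here, the sketch below uses an illustrative 5-vertex weight matrix (only the A-C cost of 1 matches the text's example; the rest are assumptions), with A..E mapped to indices 0..4:

```java
import java.util.Arrays;
import java.util.PriorityQueue;

// Dijkstra's algorithm with a priority queue: repeatedly settle the
// unvisited vertex closest to the source and relax its outgoing edges.
class Dijkstra {
    static final int INF = Integer.MAX_VALUE;

    // weight[u][v] = edge cost, or 0 for "no edge" (weights assumed positive)
    static int[] shortestDistances(int[][] weight, int source) {
        int n = weight.length;
        int[] distance = new int[n];
        boolean[] visited = new boolean[n];
        Arrays.fill(distance, INF);
        distance[source] = 0;
        // queue entries: {distance, vertex}, ordered by distance
        PriorityQueue<int[]> pq = new PriorityQueue<>((a, b) -> a[0] - b[0]);
        pq.add(new int[]{0, source});
        while (!pq.isEmpty()) {
            int[] top = pq.remove();
            int u = top[1];
            if (visited[u]) continue;   // skip stale queue entries
            visited[u] = true;
            for (int v = 0; v < n; v++) {
                if (weight[u][v] > 0 && distance[u] + weight[u][v] < distance[v]) {
                    distance[v] = distance[u] + weight[u][v];   // relaxation step
                    pq.add(new int[]{distance[v], v});
                }
            }
        }
        return distance;
    }

    public static void main(String[] args) {
        // Illustrative undirected 5-vertex graph; A-C has cost 1 as in the text
        int[][] w = {
            {0, 4, 1, 0, 0},
            {4, 0, 2, 5, 0},
            {1, 2, 0, 8, 10},
            {0, 5, 8, 0, 2},
            {0, 0, 10, 2, 0}
        };
        System.out.println(Arrays.toString(shortestDistances(w, 0))); // prints [0, 3, 1, 8, 10]
    }
}
```

Note how B's distance becomes 3 (A-C-B = 1 + 2) rather than the direct A-B cost of 4: this is exactly the update of rule 4 above.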
For this simple graph, we can have multiple spanning trees as shown below.
The algorithm we will discuss now is minimum spanning tree in an undirected graph. We assume that
the given graphs are weighted graphs. If the graphs are unweighted graphs then we can still use the
weighted graph algorithms by treating all weights as equal. A minimum spanning tree of an
undirected graph G is a tree formed from graph edges that connect all the vertices of G with
minimum total cost (weights). A minimum spanning tree exists only if the graph is connected. There
are two famous algorithms for this problem:
• Prim’s Algorithm
• Kruskal’s Algorithm
Prim’s Algorithm
➢ Prim’s algorithm is almost the same as Dijkstra’s algorithm. As in Dijkstra’s algorithm, in
Prim’s algorithm we keep the values distance and paths in the distance table.
➢ The only exception is that since the definition of distance is different, the updating statement
also changes a little. The update statement is simpler than before.
The entire implementation of this algorithm is identical to that of Dijkstra’s algorithm. The running
time is O(|V|²) without heaps [good for dense graphs], and O(E log V) using binary heaps [good for
sparse graphs].
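A minimal O(|V|²) sketch of Prim's algorithm, highlighting the changed update statement described above; the class name and example graph are illustrative, and the graph is assumed connected:

```java
import java.util.Arrays;

// Prim's algorithm, O(V^2) version: like Dijkstra, but distance[v] holds
// the weight of the cheapest single edge linking v to the tree built so far.
class Prim {
    static final int INF = Integer.MAX_VALUE;

    // weight[u][v] = edge cost, 0 = no edge; returns total MST cost
    static int mstCost(int[][] weight) {
        int n = weight.length;
        int[] distance = new int[n];
        boolean[] inTree = new boolean[n];
        Arrays.fill(distance, INF);
        distance[0] = 0;                  // grow the tree from vertex 0
        int total = 0;
        for (int i = 0; i < n; i++) {
            int u = -1;
            for (int v = 0; v < n; v++)   // pick the closest vertex not yet in the tree
                if (!inTree[v] && (u == -1 || distance[v] < distance[u]))
                    u = v;
            inTree[u] = true;
            total += distance[u];
            for (int v = 0; v < n; v++)
                // the update differs from Dijkstra: edge weight alone, not a path sum
                if (weight[u][v] > 0 && !inTree[v] && weight[u][v] < distance[v])
                    distance[v] = weight[u][v];
        }
        return total;
    }

    public static void main(String[] args) {
        int[][] w = {
            {0, 2, 3, 0},
            {2, 0, 1, 4},
            {3, 1, 0, 5},
            {0, 4, 5, 0}
        };
        System.out.println(mstCost(w)); // prints 7 (edges 0-1, 1-2, 1-3)
    }
}
```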
Kruskal’s Algorithm
➢ The algorithm starts with V different trees (V is the number of vertices in the graph). While constructing
the minimum spanning tree, Kruskal’s algorithm repeatedly selects an edge that has minimum
weight and then adds that edge if it doesn’t create a cycle.
➢ So, initially, there are | V | single-node trees in the forest. Adding an edge merges two trees
into one. When the algorithm is completed, there will be only one tree, and that is the minimum
spanning tree. There are two ways of implementing Kruskal’s algorithm:
➢ By using Disjoint Sets: Using UNION and FIND operations
➢ By using Priority Queues: Maintains weights in priority queue
➢ The appropriate data structure is the UNION/FIND algorithm [for implementing forests]. Two
vertices belong to the same set if and only if they are connected in the current spanning forest.
Each vertex is initially in its own set.
➢ If u and v are in the same set, the edge is rejected because it forms a cycle. Otherwise, the edge
is accepted, and a UNION is performed on the two sets containing u and v. As an example,
consider the following graph (the edges show the weights).
Now let us perform Kruskal’s algorithm on this graph. We always select the edge which has minimum
weight.
From the above graph, the edges which have minimum weight (cost) are: AD and BE. From
these two we can select one of them and let us assume that we select AD (dotted line).
BE now has the lowest cost and we select it (dotted lines indicate selected edges).
The next lowest cost is 9 (BD and EG). Selecting BD forms a cycle, so we discard it. Adding
EG will not form a cycle, and therefore with this edge we cover all vertices of the graph.
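The accept/reject test with UNION and FIND described above can be sketched as follows; the disjoint-set details (path halving, naive union) and the example edge list are illustrative choices, not the figure's graph:

```java
import java.util.Arrays;

// Kruskal's algorithm: sort edges by weight, then accept an edge only if
// FIND shows its endpoints lie in different trees, merging them with UNION.
class Kruskal {
    // FIND with path halving: follow parent links to the set's root.
    static int find(int[] parent, int x) {
        while (parent[x] != x) {
            parent[x] = parent[parent[x]];
            x = parent[x];
        }
        return x;
    }

    // edges given as {weight, u, v}; returns total MST weight
    static int mstCost(int vertexCount, int[][] edges) {
        int[] parent = new int[vertexCount];
        for (int v = 0; v < vertexCount; v++)
            parent[v] = v;                        // V single-node trees initially
        Arrays.sort(edges, (a, b) -> a[0] - b[0]); // cheapest edge first
        int total = 0;
        for (int[] e : edges) {
            int ru = find(parent, e[1]);
            int rv = find(parent, e[2]);
            if (ru == rv) continue;   // same set: edge would form a cycle, reject
            parent[ru] = rv;          // UNION: merge the two trees
            total += e[0];
        }
        return total;
    }

    public static void main(String[] args) {
        int[][] edges = {
            {4, 0, 1}, {1, 0, 2}, {2, 1, 2}, {5, 1, 3}, {3, 2, 3}
        };
        System.out.println(mstCost(4, edges)); // prints 6
    }
}
```

In this small example, the weight-4 edge 0-1 is rejected exactly as BD is above: by the time it is considered, FIND returns the same root for both endpoints.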