
UNIT V : GRAPHS

What is Graph in Data Structure and Algorithms?


• A Graph is a non-linear data structure consisting of vertices and edges. The vertices are sometimes also
referred to as nodes and the edges are lines or arcs that connect any two nodes in the graph.

• More formally, a Graph is composed of a set of vertices (V) and a set of edges (E). The graph is denoted by G = (V, E).

Graph: Vertices and edges are positions and store elements

Definitions that we use:

Directed edge:

■ ordered pair of vertices (u, v)

■ first vertex u is the origin

■ second vertex v is the destination

■ Example: one-way road traffic

Undirected edge:

■ unordered pair of vertices (u, v)

Example: railway line


○ Directed graph:
■ all the edges are directed
■ Example: route network

○ Undirected graph:
■ all the edges are undirected
■ Example: flight network

• When an edge connects two vertices, the vertices are said to be adjacent to each other and the
edge is incident on both vertices.

• A connected graph with no cycles is called a tree; that is, a tree is an acyclic connected graph.

• A self loop is an edge that connects a vertex to itself.


• Two edges are parallel if they connect the same pair of vertices.

• The Degree of a vertex is the number of edges incident on it.


• A subgraph is a subset of a graph’s edges (with associated vertices) that form a graph.
• A path in a graph is a sequence of adjacent vertices. Simple path is a path with no repeated
vertices. In the graph below, the dotted lines represent a path from G to E.

A cycle is a path where the first and last vertices are the same. A simple cycle is a cycle with no repeated
vertices or edges (except the first and last vertices).

We say that one vertex is connected to another if there is a path that contains both of them.

• A graph is connected if there is a path from every vertex to every other vertex.

• If a graph is not connected then it consists of a set of connected components.


• A directed acyclic graph [DAG] is a directed graph with no cycles.

• A forest is a disjoint set of trees.

• A spanning tree of a connected graph is a subgraph that contains all of that graph’s vertices and is a
single tree. A spanning forest of a graph is the union of spanning trees of its connected components.

• A bipartite graph is a graph whose vertices can be divided into two sets such that all edges connect a
vertex in one set with a vertex in the other set.

• In weighted graphs, a number (weight) is assigned to each edge to represent, for example, a distance or a cost.

• Graphs with all edges present are called complete graphs.

• Graphs with relatively few edges (generally, if |E| < |V| log |V|) are called sparse graphs.

• Graphs with relatively few of the possible edges missing are called dense graphs.

• Directed weighted graphs are sometimes called networks.


• We will denote the number of vertices in a given graph by |V|, and the number of edges by |E|. Note that |E| can range anywhere from 0 to |V|(|V| – 1)/2 (in an undirected graph). This is because each node can connect to every other node.

Applications of Graphs

• Representing relationships between components in electronic circuits

• Transportation networks: Highway network, Flight network

• Computer networks: Local area network, Internet, Web

• Databases: For representing ER (Entity Relationship) diagrams in databases, for representing dependency of tables in databases

Graph Representation

As in other ADTs, to manipulate graphs we need to represent them in some useful form. Basically, there
are three ways of doing this:

• Adjacency Matrix

• Adjacency List

• Adjacency Set

Adjacency Matrix

• Graph Declaration for Adjacency Matrix: To represent graphs, we need the number of vertices, the number of edges and also their interconnections. So, the graph can be declared as shown in the sketch at the end of this subsection.

• In this method, we use a matrix of size V × V. The values of the matrix are boolean. Let us assume the matrix is Adj. The value Adj[u, v] is set to 1 if there is an edge from vertex u to vertex v, and 0 otherwise.

• In the matrix, each edge of an undirected graph is represented by two bits. That means, an edge from u to v is represented by a 1 value in both Adj[u, v] and Adj[v, u]. To save time, we can process only half of this symmetric matrix. Also, we can assume that there is an “edge” from each vertex to itself, so Adj[u, u] is set to 1 for all vertices.
The adjacency matrix for this graph can be given as:

• Now, let us concentrate on the implementation. To read a graph, one way is to first read the vertex names and then read pairs of vertex names (edges). The code below reads an undirected graph.
• The adjacency matrix representation is good if the graphs are dense. The matrix requires O(V²) bits of storage and O(V²) time for initialization.
• If the number of edges is proportional to V², then there is no problem, because V² steps are required to read the edges anyway. If the graph is sparse, the initialization of the matrix dominates the running time of the algorithm, as it takes O(V²).
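A minimal sketch of such a declaration and reading routine in Java, assuming vertices are identified by integer indices 0 to V-1 and edges are read as pairs of indices (the names AdjMatrixGraph and readGraph are illustrative, not from the original text):

import java.util.Scanner;

// Minimal adjacency-matrix graph; vertex names are assumed to be integer indices.
class AdjMatrixGraph {
    private final int vertexCount;
    private final int[][] adj;          // adj[u][v] == 1 if there is an edge u-v

    public AdjMatrixGraph(int vertexCount) {
        this.vertexCount = vertexCount;
        this.adj = new int[vertexCount][vertexCount];   // O(V^2) space, initialized to 0
    }

    // Undirected edge: set both symmetric entries.
    public void addEdge(int u, int v) {
        adj[u][v] = 1;
        adj[v][u] = 1;
    }

    public boolean hasEdge(int u, int v) {
        return adj[u][v] == 1;
    }

    // Read an undirected graph: first the number of vertices and edges,
    // then one pair of vertex indices per edge.
    public static AdjMatrixGraph readGraph(Scanner in) {
        int vertices = in.nextInt();
        int edges = in.nextInt();
        AdjMatrixGraph g = new AdjMatrixGraph(vertices);
        for (int i = 0; i < edges; i++)
            g.addEdge(in.nextInt(), in.nextInt());
        return g;
    }
}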
Adjacency List
• Graph Declaration for Adjacency List: In this representation, all the vertices connected to a vertex v are listed on an adjacency list for that vertex v. This can be easily implemented with linked lists, as sketched below.
• That means, for each vertex v we use a linked list, and the list nodes represent the connections between v and the other vertices to which v has an edge. The total number of linked lists is equal to the number of vertices in the graph.
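A minimal adjacency-list sketch in Java, assuming integer vertex indices and using java.util.LinkedList for each vertex's list (the class name ListGraph is illustrative):

import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;

// Adjacency-list graph: one linked list of neighbours per vertex.
class ListGraph {
    private final int vertexCount;
    private final List<List<Integer>> adj;   // adj.get(v) holds the vertices adjacent to v

    public ListGraph(int vertexCount) {
        this.vertexCount = vertexCount;
        adj = new ArrayList<>(vertexCount);
        for (int v = 0; v < vertexCount; v++)
            adj.add(new LinkedList<>());
    }

    // Undirected edge: add each endpoint to the other's list.
    public void addEdge(int u, int v) {
        adj.get(u).add(v);
        adj.get(v).add(u);
    }

    public List<Integer> neighbours(int v) {
        return adj.get(v);
    }
}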
Comparison of Graph Representations

• Directed and undirected graphs are represented with the same structures. For directed graphs, everything is the same, except that each edge is represented just once: an edge from x to y is represented by a 1 value in Adj[x][y] in the adjacency matrix, or by adding y to x’s adjacency list. For weighted graphs, everything is the same, except that we fill the adjacency matrix with weights instead of boolean values.

Graph Traversals
• To solve problems on graphs, we need a mechanism for traversing the graphs. Graph traversal algorithms are also called graph search algorithms. Like tree traversal algorithms (Inorder, Preorder, Postorder and Level-Order traversals), graph search algorithms can be thought of as starting at some source vertex in a graph and “searching” the graph by going through the edges and marking the vertices. Now, we will discuss two such algorithms for traversing the graphs.
• Depth First Search [DFS]
• Breadth First Search [BFS]

Depth First Search [DFS]


• The DFS algorithm works in a manner similar to preorder traversal of trees. Like preorder traversal, internally this algorithm also uses a stack.
• Tree edge: leads to a newly encountered vertex
• Back edge: from a descendant to an ancestor
• Forward edge: from an ancestor to a descendant
• Cross edge: between two different subtrees (or trees)
For most algorithms a boolean classification, unvisited/visited, is enough (a three-color classification is used for some problems). That means, for some problems we need to use three colors, but for our discussion two colors are enough.
• Initially all vertices are marked unvisited (false). The DFS algorithm starts at a vertex u in the
graph. By starting at vertex u it considers the edges from u to other vertices.
• If the edge leads to an already visited vertex, then backtrack to current vertex u. If an edge
leads to an unvisited vertex, then go to that vertex and start processing from that vertex.
• That means the new vertex becomes the current vertex. Follow this process until we reach a dead end. At this point start backtracking.
• The process terminates when backtracking leads back to the start vertex. The algorithm based
on this mechanism is given below: assume Visited[] is a global array.
• As an example, consider the following graph. We can see that sometimes an edge leads to an
already discovered vertex. These edges are called back edges, and the other edges are called
tree edges because deleting the back edges from the graph generates a tree.
• The final generated tree is called the DFS tree, and the order in which the vertices are processed gives the DFS numbers of the vertices. In the graph below, the gray color indicates that a vertex is visited (there is no other significance). We need to observe when each vertex’s entry in the Visited array is updated as the traversal proceeds.
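A minimal DFS sketch in Java over an adjacency matrix, in the style of the Graph class used for BFS below. Here the visited array is an instance field rather than the global Visited[] array assumed above, and recursion plays the role of the stack; the names DFSTraversal and dfs are illustrative:

// Recursive DFS over an adjacency matrix.
class DFSTraversal {
    private final int[][] adjMatrix;     // adjMatrix[u][v] == 1 if edge u-v exists
    private final boolean[] visited;

    public DFSTraversal(int[][] adjMatrix) {
        this.adjMatrix = adjMatrix;
        this.visited = new boolean[adjMatrix.length];   // all vertices start unvisited
    }

    // Start a DFS from vertex u.
    public void dfs(int u) {
        visited[u] = true;                // mark the current vertex
        System.out.print(u + " ");        // process (here: print) the vertex
        for (int v = 0; v < adjMatrix.length; v++)
            if (adjMatrix[u][v] == 1 && !visited[v])
                dfs(v);                   // an unvisited neighbour becomes the current vertex
        // returning from the recursive call corresponds to backtracking
    }
}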

Breadth First Search [BFS]


• The BFS algorithm works similarly to level-order traversal of trees. Like level-order traversal, BFS also uses a queue. In fact, level-order traversal was inspired by BFS.
BFS works level by level. Initially, BFS starts at a given vertex, which is at level 0. In the first stage it visits all vertices at level 1 (that means, vertices whose distance from the start vertex is 1). In the second stage, it visits all vertices at the second level. These new vertices are the ones which are adjacent to the level 1 vertices. BFS continues this process until all the levels of the graph are completed. Generally, a queue data structure is used for storing the vertices of a level.

class Vertex {
    public char label;
    public boolean visited;

    public Vertex(char lab) {
        label = lab;
        visited = false;
    }
}

class Graph {
    private final int maxVertices = 20;
    private Vertex[] vertexList;                 // list of vertices
    private int[][] adjMatrix;                   // adjacency matrix
    private int vertexCount;
    private java.util.Queue<Integer> theQueue;   // java.util.ArrayDeque stands in for the custom Queue class assumed by the original

    public Graph() {
        vertexList = new Vertex[maxVertices];
        adjMatrix = new int[maxVertices][maxVertices];
        vertexCount = 0;
        for (int y = 0; y < maxVertices; y++)
            for (int x = 0; x < maxVertices; x++)
                adjMatrix[x][y] = 0;
        theQueue = new java.util.ArrayDeque<>();
    }

    public void addVertex(char lab) {
        vertexList[vertexCount++] = new Vertex(lab);
    }

    public void addEdge(int start, int end) {    // undirected edge: set both entries
        adjMatrix[start][end] = 1;
        adjMatrix[end][start] = 1;
    }

    public void displayVertex(int v) {
        System.out.print(vertexList[v].label);
    }

    public void bfs() {
        vertexList[0].visited = true;            // start at vertex 0 (level 0)
        displayVertex(0);
        theQueue.add(0);
        int v2;
        while (!theQueue.isEmpty()) {
            int v1 = theQueue.remove();
            // visit every unvisited vertex adjacent to v1 (the next level)
            while ((v2 = getAdjUnvisitedVertex(v1)) != -1) {
                vertexList[v2].visited = true;
                displayVertex(v2);
                theQueue.add(v2);
            }
        }
        for (int j = 0; j < vertexCount; j++)    // reset the flags for the next traversal
            vertexList[j].visited = false;
    }

    public int getAdjUnvisitedVertex(int v) {
        for (int j = 0; j < vertexCount; j++)
            if (adjMatrix[v][j] == 1 && vertexList[j].visited == false)
                return j;
        return -1;
    }
}
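A short usage sketch for the Graph class above; the vertex labels and edges are illustrative:

public class BFSDemo {
    public static void main(String[] args) {
        Graph g = new Graph();
        g.addVertex('A');   // index 0
        g.addVertex('B');   // index 1
        g.addVertex('C');   // index 2
        g.addEdge(0, 1);    // A - B
        g.addEdge(0, 2);    // A - C
        g.bfs();            // prints ABC: A first, then its neighbours B and C
    }
}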

As an example, let us consider the same graph as that of the DFS example. The BFS traversal can be
shown as:

Time complexity of BFS is O(V + E), if we use adjacency lists for representing the graphs, and O(V²) for the adjacency matrix representation.
Applications of BFS
• Finding all connected components in a graph
• Finding all nodes within one connected component
• Finding the shortest path between two nodes (in an unweighted graph)
• Testing a graph for bipartiteness
Comparing DFS and BFS
• Comparing BFS and DFS, the big advantage of DFS is that it has much lower memory
requirements than BFS because it’s not required to store all of the child pointers at each level.
Depending on the data and what we are looking for, either DFS or BFS can be advantageous. For
example, in a family tree if we are looking for someone who’s still alive and if we assume that
person would be at the bottom of the tree, then DFS is a better choice. BFS would take a very
long time to reach that last level.
Applications of Shortest Path Algorithms

• Finding the fastest way to go from one place to another

• Finding the cheapest way to fly/send data from one city to another

Shortest Path in Unweighted Graph
• Let s be the input vertex from which we want to find the shortest path to all other vertices. An unweighted graph is a special case of the weighted shortest-path problem, with all edges having a weight of 1. The algorithm is similar to BFS and we need to use the following data structures:
• A distance table in which each row corresponds to a vertex and records:
• Distance – the distance from the source vertex.
• Path – the name of the vertex through which we get the shortest distance.
• A queue used to implement the breadth-first search. It contains vertices whose distance from the source node has been computed and whose adjacent vertices are still to be examined.
• As an example, consider the following graph and its adjacency list representation.

The adjacency list for this graph is:

Let s = C. The distance from C to C is 0. Initially, distances to all other nodes are not computed, and we initialize the second column in the distance table for all vertices (except C) with -1, as below.
Algorithm
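A minimal sketch of the algorithm in Java, assuming an adjacency-list graph with integer vertex indices (the class name UnweightedShortestPath and its fields are illustrative):

import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.List;
import java.util.Queue;

// BFS-based single-source shortest paths for an unweighted graph.
// distance[v] = number of edges from source s to v; path[v] = predecessor of v.
class UnweightedShortestPath {
    int[] distance;
    int[] path;

    void shortestPath(List<List<Integer>> adj, int s) {
        int n = adj.size();
        distance = new int[n];
        path = new int[n];
        Arrays.fill(distance, -1);              // -1 marks "not yet computed"
        Arrays.fill(path, -1);

        Queue<Integer> queue = new ArrayDeque<>();
        distance[s] = 0;                        // distance from the source to itself is 0
        queue.add(s);

        while (!queue.isEmpty()) {
            int v = queue.remove();
            for (int w : adj.get(v)) {          // examine the outgoing edges of v
                if (distance[w] == -1) {        // w is seen for the first time
                    distance[w] = distance[v] + 1;
                    path[w] = v;                // the shortest path to w goes through v
                    queue.add(w);
                }
            }
        }
    }
}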

➢ Running time: O(|E| + |V|), if adjacency lists are used. For each vertex we check its outgoing edges, and the total number of edges examined over the whole while loop is equal to the number of edges, which gives O(|E|).

➢ If we use matrix representation the complexity is O(|V|²), because we need to read an entire row in the matrix of length |V| in order to find the adjacent vertices for a given vertex.

Shortest path in Weighted Graph [Dijkstra’s]

➢ A famous solution for the shortest path problem was developed by Dijkstra. Dijkstra’s algorithm
is a generalization of the BFS algorithm. The regular BFS algorithm cannot solve the shortest
path problem as it cannot guarantee that the vertex at the front of the queue is the vertex closest
to source s.

➢ Before going to the code let us understand how the algorithm works. As in the unweighted shortest path algorithm, here too we use the distance table. The algorithm works by keeping the shortest distance of each vertex v from the source in the Distance table. The value Distance[v] holds the distance from s to v. The shortest distance of the source to itself is zero. The Distance table entries for all other vertices are set to –1 to indicate that those vertices have not yet been processed.
Dijkstra’s Algorithm
• It uses the greedy method: always pick the next closest vertex to the source.
• It uses a priority queue to store the unvisited vertices, ordered by their distance from s.
• It does not work with negative edge weights.
Difference between Unweighted Shortest Path and Dijkstra’s Algorithm
1) To represent weights in the adjacency list, each vertex contains the weights of the edges (in
addition to their identifier).
2) Instead of an ordinary queue we use a priority queue [the distances are the priorities], and the vertex with the smallest distance is selected for processing.
3) The distance to a vertex is calculated by the sum of the weights of the edges on the path from
the source to that vertex.
4) We update the distances in case the newly computed distance is smaller than the old distance
which we have already computed.
➢ The above algorithm can be better understood through an example, which will explain each step that is taken and how Distance is calculated. The weighted graph below has 5 vertices, A to E. The value on an edge is known as the edge cost between its two vertices.
➢ For example, the edge cost between A and C is 1. Dijkstra’s algorithm can be used to find the shortest path from source A to the remaining vertices in the graph.
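A minimal sketch of Dijkstra's algorithm in Java using java.util.PriorityQueue over an adjacency list of weighted edges; the names Dijkstra, Edge and shortestPaths are illustrative:

import java.util.Arrays;
import java.util.List;
import java.util.PriorityQueue;

class Dijkstra {
    // Weighted edge stored in an adjacency list.
    static class Edge {
        int to, weight;
        Edge(int to, int weight) { this.to = to; this.weight = weight; }
    }

    // Returns distance[], the shortest distance from s to every vertex (non-negative weights only).
    static int[] shortestPaths(List<List<Edge>> adj, int s) {
        int n = adj.size();
        int[] distance = new int[n];
        Arrays.fill(distance, Integer.MAX_VALUE);   // "not yet reached"
        distance[s] = 0;

        // Priority queue ordered by current distance: always pick the closest unprocessed vertex.
        PriorityQueue<int[]> pq = new PriorityQueue<>((a, b) -> Integer.compare(a[1], b[1]));
        pq.add(new int[]{s, 0});

        while (!pq.isEmpty()) {
            int[] top = pq.poll();
            int u = top[0], d = top[1];
            if (d > distance[u]) continue;          // stale entry: a shorter distance was already found
            for (Edge e : adj.get(u)) {
                int newDist = distance[u] + e.weight;
                if (newDist < distance[e.to]) {     // relax: update only if the new distance is smaller
                    distance[e.to] = newDist;
                    pq.add(new int[]{e.to, newDist});
                }
            }
        }
        return distance;
    }
}

The stale-entry check is needed because java.util.PriorityQueue has no decrease-key operation, so a vertex may appear in the queue more than once with different distances.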

Disadvantages of Dijkstra’s Algorithm


• As discussed above, the major disadvantage of the algorithm is that it does a blind search, thereby wasting time and resources.
• Another disadvantage is that it cannot handle negative edge weights; with negative edges it can fail to produce the correct shortest paths.
Relatives of Dijkstra’s Algorithm
• The Bellman-Ford algorithm computes single-source shortest paths in a weighted digraph. It uses the same concept as Dijkstra’s algorithm but can handle negative edges as well. It has a higher running time than Dijkstra’s algorithm.
• Prim’s algorithm finds a minimum spanning tree for a connected weighted graph: it finds a subset of the edges that forms a tree in which the total weight of all the edges is minimized.
Minimal Spanning Tree
➢ A spanning tree of a graph is a subgraph that contains all the vertices and is also a tree. A graph may have many spanning trees. As an example, consider a graph with 4 vertices as shown below. Let us assume that the corners of the graph are vertices.

For this simple graph, we can have multiple spanning trees as shown below.
The problem we will discuss now is finding a minimum spanning tree of an undirected graph. We assume that the given graphs are weighted graphs. If the graphs are unweighted, we can still use the weighted graph algorithms by treating all weights as equal. A minimum spanning tree of an undirected graph G is a tree formed from graph edges that connect all the vertices of G with minimum total cost (weight). A minimum spanning tree exists only if the graph is connected. There are two famous algorithms for this problem:

• Prim’s Algorithm
• Kruskal’s Algorithm

Prim’s Algorithm
➢ Prim’s algorithm is almost the same as Dijkstra’s algorithm. As in Dijkstra’s algorithm, in
Prim’s algorithm we keep the values distance and paths in the distance table.
➢ The only exception is that since the definition of distance is different, the updating statement
also changes a little. The update statement is simpler than before.
The entire implementation of this algorithm is identical to that of Dijkstra’s algorithm. The running time is O(|V|²) without heaps [good for dense graphs], and O(|E| log |V|) using binary heaps [good for sparse graphs].
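A minimal sketch of the O(|V|²) adjacency-matrix form of Prim's algorithm described above, assuming a connected graph in which a weight of 0 means "no edge"; the names Prim and minimumSpanningTree are illustrative:

import java.util.Arrays;

class Prim {
    // Adjacency-matrix Prim's algorithm: weight[u][v] > 0 is the edge cost, 0 means no edge.
    // Returns parent[], where parent[v] is v's neighbour in the minimum spanning tree.
    static int[] minimumSpanningTree(int[][] weight) {
        int n = weight.length;
        int[] distance = new int[n];      // cheapest single edge connecting v to the tree so far
        int[] parent = new int[n];
        boolean[] inTree = new boolean[n];
        Arrays.fill(distance, Integer.MAX_VALUE);
        Arrays.fill(parent, -1);
        distance[0] = 0;                  // start growing the tree from vertex 0

        for (int i = 0; i < n; i++) {
            // greedy step: pick the closest vertex not yet in the tree
            int u = -1;
            for (int v = 0; v < n; v++)
                if (!inTree[v] && (u == -1 || distance[v] < distance[u]))
                    u = v;
            inTree[u] = true;

            // update step: distance is the weight of a single edge into the tree,
            // not a path length as in Dijkstra's algorithm
            for (int v = 0; v < n; v++)
                if (!inTree[v] && weight[u][v] > 0 && weight[u][v] < distance[v]) {
                    distance[v] = weight[u][v];
                    parent[v] = u;
                }
        }
        return parent;
    }
}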

Kruskal’s Algorithm
➢ The algorithm starts with |V| different trees (where |V| is the number of vertices in the graph). While constructing the minimum spanning tree, Kruskal’s algorithm repeatedly selects an edge that has minimum weight and adds that edge if it doesn’t create a cycle.
➢ So, initially, there are |V| single-node trees in the forest. Adding an edge merges two trees into one. When the algorithm is completed, there will be only one tree, and that is the minimum spanning tree. There are two ways of implementing Kruskal’s algorithm:
➢ By using Disjoint Sets: Using UNION and FIND operations
➢ By using Priority Queues: Maintains weights in priority queue
➢ The appropriate data structure is the UNION/FIND algorithm [for implementing forests]. Two
vertices belong to the same set if and only if they are connected in the current spanning forest.
Each vertex is initially in its own set.
➢ If u and v are in the same set, the edge is rejected because it forms a cycle. Otherwise, the edge
is accepted, and a UNION is performed on the two sets containing u and v. As an example,
consider the following graph (the edges show the weights).

Now let us perform Kruskal’s algorithm on this graph. We always select the edge which has minimum
weight.
From the above graph, the edges which have minimum weight (cost) are: AD and BE. From
these two we can select one of them and let us assume that we select AD (dotted line).

DF is the next edge that has the lowest cost (6).

BE now has the lowest cost and we select it (dotted lines indicate selected edges).

Next, AC and CE have the lowest cost of 7 and we select AC.

Then we select CE as its cost is 7 and it does not form a cycle.


The next lowest cost edges are CB and EF. But selecting either of them would form a cycle, so we discard both.

And the next lowest cost is 9 (BD and EG). Selecting BD forms a cycle, so we discard it. Adding EG does not form a cycle, and with this edge all the vertices of the graph are covered, completing the minimum spanning tree.
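A minimal sketch of Kruskal's algorithm in Java with UNION/FIND, matching the approach described above; the names Kruskal, Edge and UnionFind are illustrative:

import java.util.ArrayList;
import java.util.List;

class Kruskal {
    static class Edge {
        int u, v, weight;
        Edge(int u, int v, int weight) { this.u = u; this.v = v; this.weight = weight; }
    }

    // Union-find over vertex indices; each vertex starts in its own set.
    static class UnionFind {
        int[] parent;
        UnionFind(int n) {
            parent = new int[n];
            for (int i = 0; i < n; i++) parent[i] = i;
        }
        int find(int x) {                        // follow parents to the set representative
            while (parent[x] != x) {
                parent[x] = parent[parent[x]];   // path halving
                x = parent[x];
            }
            return x;
        }
        boolean union(int a, int b) {            // merge the two sets; false if already the same set
            int ra = find(a), rb = find(b);
            if (ra == rb) return false;
            parent[ra] = rb;
            return true;
        }
    }

    // Returns the edges of a minimum spanning tree of a connected graph.
    static List<Edge> mst(int vertexCount, List<Edge> edges) {
        edges.sort((a, b) -> Integer.compare(a.weight, b.weight));   // consider edges by increasing weight
        UnionFind uf = new UnionFind(vertexCount);
        List<Edge> tree = new ArrayList<>();
        for (Edge e : edges)
            if (uf.union(e.u, e.v))              // accept the edge only if it joins two different trees
                tree.add(e);
        return tree;
    }
}

Sorting the edges dominates the running time, giving O(|E| log |E|) overall, since the union and find operations are nearly constant time in practice.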
