Single source shortest path for undirected graph is basically the breadth first traversal of the graph. Here the graph we consider is unweighted and hence the shortest path would be the number of edges it takes to go from source to destination. You can find posts on the same topic for weighted graphs, and that is solved using Dijkstra’s or Bellman Ford algorithms. This post is written from the competitive programming perspective. Here I want to focus on the details of simplified implementations.
Problem Statement – Shortest Path for Undirected Graph
Most of the times, the problem is defined as follows:
Given a graph with V vertices (numbered from 1 to V) and E edges. Find the shortest path from a source vertex to all other vertices. If a path is not possible (which happens in case of disconnected graphs), then return -1 for those destination vertices.
Here is a sample problem:
Consider V =5 and E = 3 and the list of edges be [ (1, 2), (1,3) , (4,5)]. Also, the source vertex is 1. We need to find the shortest distance to each of the remaining vertices (2, 3, 4, 5).
Diagrammatically the graph can be represented as below:
This is a disconnected graph.For source vertex 1, the shortest path to vertex 2 and 3 is 1, and the shortest path to destination 4 and 5 is -1.
In a competitive programming scenario, you code mostly to the present requirement. Here, we need to think, if we really need the complete graph representation and if we need to declare the Edge, Vertex and Graph classes. We prefer to represent the given data in a compact manner with minimal memory footprint.
Approaching the problem
There are three parts to solving any problem like this
- Determine the representation of graph which makes processing easier for the problem
- Determine the basic algorithm to solve the problem
- Find out the appropriate data structure to store the output
How to store such graphs?
Storage is a sensitive issue, the way we want to store the graph depends on the type of processing we need to do and the density of data. Here we see that the graph is sparse and we can easily choose a adjacency list structure.
The storage would look like below:
1 -> 2 -> 3
2 -> null
3 -> null
4 -> 5
5 -> null
Some part of the problem is already solved if you have correctly identified the representation mechanism. The above representation is basically an array of sets. We would also need an array of length |V| to store distances to each vertex from source.
The algorithm to solve this problem
To know the list L of vertices directly reachable from a given vertex v, we must process the vertex v exactly once. Hence, each vertex in the list L is at a distance one from the vertex v. Once vertex v is processed, we can process all the direct connections of v (i.e. the contents of list L) . So, all the vertices directly reachable from the vertices in L are at a distance of two from the vertex v and so on.
As you noticed that we are possibly doing a batch like processing, where all the vertices directly connected to any vertex is processed at once. We can store these in a queue, where it is easy to process them without losing track.
- Start from the source vertex, assign it a distance 0 and add it in the queue
- Initialize a distance variable to 1 and a distance array to -1
- Process the queue till its not empty
- Remove the first element, fetch all vertices in its adjacency list. if the distance is -1, assign it the value from the distance variable.
- Push the vertices in another helper queue.
- Repeat the above process till all the vertices from the queue is processed.
- The distance array gives you the respective shortest distances of all the vertices from the source
Space is required for three things:
- Adjacency List and it is equal to O(|E|+|V|)
- Both the queues – it is at most equal to O(|V|)
- An array for storing the distances, which is of size O(|V|)
Running time calculations
- Both the while loops combined together run till the queue is not empty and each vertex can be added to the queue exactly once. Hence, it is O(|V|)
- The inner for loop runs for K times for each vertex where K is the length of the adjacency list of the vertex at max.
- The sum of all the adjacency lists would be |E|
- Hence, the total time would be proportional to O(|E|+|V|)
Shortest path for undirected Graph can be calculated using Breadth First Search, this is one version of the same. The take away from this post should be to understand where to cut down on storage and running time and how to conveniently store the graph such that we can skip the unnecessary edges and vertices. In the above graph, we didn’t even process the disconnect component which contains the vertices 4 and 5.