## Introduction

What is the minimum length of the network cable we require if we have to connect 100 computing machines in a building distributed across multiple floors? How do I guarantee that there can be no other minimum length possible than what I derive? Why is it even important to find the minimum length?

##### The history of the problem

You can read the classic problem solved by MST Applications of Minimum Spanning Tree

##### Defining Spanning Trees

- A spanning tree is always defined for a weighted Graph G(V,E) where the weights are positive. This means that all the edges carry some positive weight.
- A spanning tree is a sub graph G'(V, E’) where E’ is a subset of E. This means a spanning tree contains all the vertices of the graph and a set of edges which makes sure that there is no cycles in the spanning tree.
- A graph can have multiple Spanning Trees.

##### Defining Minimum Spanning Trees

- A Minimum Spanning Tree in reality is a minimum weight spanning tree where the weight is the sum of weights of all its edges.
- As we know that there are more than one spanning tree, if we know all the spanning trees and then find the minimum of them, we will get the MST.

Let me put first the algorithm and the source code for finding minimum spanning tree for the visitors who came looking for that. But I strongly recommend you to read the Further Reading section.

## Idea – Minimum Spanning Tree Prim’s Algorithm

- Consider Graph G(V,E), and let U be a subset of V. In the beginning U is an empty set.
- Maintain a Priority Queue of V-U such that the key for each element in the Queue is the weight of the least-weighted edge connecting it to a vertex in U.
- So, in the beginning all the keys would be infinite because there is nothing in U and any edge in V-U which is connecting to U can be considered of infinite length.
- Choose one of the vertex from the Queue and assign the key a value zero. It is purely random because we need to start from it.
- While the queue is not empty, extract the element (u) with minimum key from the queue and for each vertex v in the adjacency list of u , if v is still in the Priority Queue check if the weight(u, v) is less that the key of v assign w(u,v) to key of v and make a note of, which vertex is responsible for v’s minimum key.

When the algorithm is complete, we get the minimum spanning tree by knowing all the responsible vertices. Below pseudo code and dry run might make it more clear.

##### Psuedo Code

1 2 3 4 5 6 7 8 9 |
Queue <- V Key[v] <- infinite for each vertex v in V key[s] <- 0, \\ a random vertex to start with. while Q is NOT EMPTY u <- Q.exractMIN for each v IN adjacency list of u if v is still in Q and key[v] > weight(u,v) key[v] <- weight(u,v) P[v] <- u |

The array P will give the minimum spanning tree after the algorithm is over.

##### Dry Run

Consider the below weighted graph for dry run.

Initialize the priority Queue with all the vertices and assign infinite to each key. Assign zero to A’s key.

##### Step 1 :

Extract the minimum (i.e. A) from the Queue and evaluate its adjacent vertices which are still in the queue (B, D, F). For B, 2 is less than its current key (i.e. infinity), so decrease B’s key to 2 and make a note that A is the parent of B. Similarly for D, 6 is less than its current key(i.e. infinity), so decrease D’s key to 6 and make a note that A is the parent of D. Also for F, 5 is less than its current key(i.e. infinity), so decrease F’s key to 5 and make a note that A is the parent of D.

Here is the state of the array and the queue after step 1.

##### Step 2 :

Extract the minimum (i.e. B) from the Queue and evaluate its adjacent vertices which are still in the queue (C). For c, 7 is less than its current key (i.e. infinity), so decrease C’s key to 7 and make a note that B is the parent of C.

Here is the state of the array and the queue after step 2.

##### Step 3 :

Extract the minimum (i.e. F) from the Queue and evaluate its adjacent vertices which are still in the queue (C, E). For C, 1 is less than its current key (i.e. 7), so decrease C’s key to 1 and make a note that F is the parent of C. Similarly for E, 3 is less than its current key (i.e. infinity), so decrease E’s key to 3 and make a note that F is the parent of E.

Here is the state of the array and the queue after step 3.

##### Step 4 :

Extract the minimum (i.e. C) from the Queue and evaluate its adjacent vertices which are still in the queue (D). For D, 9 is greater than its current key (i.e. 6),hence no need to decrease the key or change the state of array.

Here is the state of the array and the queue after step 4.

##### Step 5 :

Extract the minimum (i.e. E) from the Queue and evaluate its adjacent vertices which are still in the queue (D). For D, 4 is less than its current key (i.e. 6), so decrease D’s key to 4 and make a note that E is the parent of D.

Here is the state of the array and the queue after step 5.

##### Step 6 :

Extract the minimum (i.e. D) from the Queue and evaluate its adjacent vertices which are still in the queue (G). For G, 8 is less than its current key (i.e. infinity), so decrease G’s key to 8 and make a note that D is the parent of G.

Here is the state of the array and the queue after step 6.

##### Step 7 :

Extract the minimum (i.e. G) from the Queue and evaluate its adjacent vertices which are still in the queue (None). So the algorithm completes.

The minimum spanning tree can be given by looking at the parent array. It should contain the edges { (A,B), (A,F), (F,C), (F,E), (E,D), (D,G)}.

The minimum spanning tree is shown below:

##### Source Code

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 |
private void primsMST(List graph) { PriorityQueue priorityQueue = new PriorityQueue(); for (Vertex v : graph) { v.key = Integer.MAX_VALUE; priorityQueue.add(v); } Vertex peek = priorityQueue.peek(); peek.key = 0; while (!priorityQueue.isEmpty()) { Vertex minVertex = priorityQueue.poll(); if (minVertex.incidentEdges != null && minVertex.incidentEdges.size() > 0) { for (Edge edge : minVertex.incidentEdges) { if (priorityQueue.contains(edge.end) && edge.weight < edge.end.key) { priorityQueue.remove(edge.end); edge.end.key = edge.weight; edge.end.previous = minVertex; priorityQueue.add(edge.end); } } } } } |

Here is the git hub link to download the complete source code.

## Analysis of the Algorithm

**Note:** As I have not written any post for Priority Queues, you might face some issues understanding the below analysis. I will make sure that I write some post to discuss priority queues in detail ASAP.

If we see the pseudo code the following things are happening:

**Initialization**– The first three lines, it takes time proportional to the number of vertices O(|V|)**Extract Min**– This happens for another |V| times because the size of the queue is |V| at max and in each iteration we invoke the extractMin once.**The For Loop**– This happens for k number of times for each vertex where k is the degree of the current vertex. Now considering all the iterations the total degree of all the vertices will the sum of degrees of each vertex. Also, if you followed Graph Theory posts, we know that the sum of all the degrees is two times the number of edges or we can say that it is of the order |E|.- Hence the if condition is executed a max of O(|E|) times. Closely looking into the if condition we realize that the code is trying to decrease the key of the element in priority queue. And this triggers a re-arrangement of elements in the priority queue.

##### Copying the pseudo code for quick reference.

1 2 3 4 5 6 7 8 9 |
Queue <- V Key[v] <- infinite for each vertex v in V key[s] <- 0, \\ a random vertex to start with. while Q is NOT EMPTY u <- Q.exractMIN for each v IN adjacency list of u if v is still in Q and key[v] > weight(u,v) key[v] <- weight(u,v) P[v] <- u |

So the total running time is the time taken by all the above mentioned steps. Assume that the time taken to extractMin is T_{ExtractMin} and the time taken to decrease the key is T_{DecreaseKey}. Then the total time taken would be:

##### T = O(|V| * T_{ExtractMin} + |E| * T_{DecreaseKey})

.

The interesting stuff is that the running time depends on the underlying data structure by which we implement the priority queue.

##### For an array based implementation of Priority Queue

The T_{ExtractMin} will take O(|V|) time because it is similar to finding the minimum in an array.

The T_{DecreaseKey} will take constant time as we can access the element in constant time and just decrease its key.

Hence T = O(|V|*|V| + |E|) , which is O(|V|^{2}). Also, |E| is always less than or equal to |V|^{2}.

##### For a binary heap based implementation of Priority Queue

The T_{ExtractMin} will take O(log|V|) time because it is similar to finding the minimum in a binary tree which will have height log|V|.

The T_{DecreaseKey} will take O(log|V|) time for the same reason as above.

Hence T = O(|V|*log|V| + |E|*log|V|) , which is O(|E|*log|V|). Also, |E| is always less than or equal to |V|^{2}.

##### For a fibonacci heap based implementation of Priority Queue

The T_{ExtractMin} will take O(log|V|) amortized cost. *will discuss about this in another post when we discuss Heaps*.

The T_{DecreaseKey} will take O(1) amortized cost.

Hence T = O(|V|*log|V| + |E|) , which is O(|E|+ |V|*log|V|).

Hope you enjoyed reading. Don’t forget to subscribe to TechieMe to get updates on latest posts.