Topological Ordering for Graphs

Introduction

The topological ordering for graphs get many applications where the nature of the problem is ordered sequential processing. Few of the problems in this category are as follows:

  • Dynamic Linking/Loading of programs while building and execution.
  • Deciding the pre-requisites in a course structure.
  • Job Scheduling in Processors or Assembly Line Processing.
  • Building an Alien Dictionary from given words.

There are many more but I am sure you got the idea. When we say topological ordering of graph, we are necessarily talking about the ordering of the vertices of the graph.

Understanding the Problem of Topological Ordering for Graphs

Consider a scenario where someone wants to learn all about data structures and algorithms and they want to seek some advice from you so that it is a good learning experience and they understand everything they read and develop interest in the topic. What they are looking for is a ordered list of topics such that if they read it in the order prescribed by you, they will surely be able to understand the subject. Basically redefining the Table of Contents of the Data Structure and Algorithms book.

There is a saying “You only discover what to write first after writing the complete book.”

Topological Ordering for Graphs

.
The ordering basically outputs a sequentially ordered list such that the element at the end of the list is not a prerequisite for completion to any element in the list. In a more general notion, an element at index i in the list is not a prerequisite for any element at index 1 .. i-1.

To rephrase it, if we want to execute a task Ti which is dependent on Ta and Tb. Then the topological ordering will be Ta, Tb, Ti.

Can all graphs be topologically ordered or sorted?

The answer is NO. Not all the graphs can have a topological ordering. The ones which can have are discussed below:

  • The above section talked about dependencies and pre-requisites, which means that the nature of the problem consists of a directed graph (because, it is not possible to talk about dependencies in a graph which is not directed).
  • Also, we talked about maintaining an order which is ever increasing or ever decreasing. That means there always exists a task T which is dependent on a series of tasks. But it is highly unlikely or may I say impossible that the dependency chain of this task T will contain itself. In more formal terminology the graph must be a DAG (Directed Acyclic Graph).

The second point derives a corollary that there must be one Sink Vertex in the graph. Which means there must not be any outgoing edge for at least one vertex. Similarly there needs to be at least on Source Vertex in the Graph, which means there is no incoming edges to this vertex.

If these properties exist in a Graph then they can be topologically ordered.
For e.g. see the below graph which meets both the properties. The topological ordering of the graph would be : A, D, E, B, F, G, C, H
Topological Ordering for Graphs

Approach to Solve the problem

You can solve this problem in multiple ways, here are few of them.

Starting from the Source Vertex

Let us consider the above graph for demonstration purpose. Here is the steps we need to follow.

  • First we need to choose one of the source vertices and the obvious choice is among A and D, we choose A.
  • Delete all the outgoing edges from A, so we will delete (A,E) and (A,B).
  • Now choose among the available vertices with no incoming edges, we have D and E, we choose D here.
  • Delete all the outgoing edges from D, so we will delete (D,B).
  • Now choose among the available vertices with no incoming edges, we have E and B, we choose E here.
  • Delete all the outgoing edges from E, so we will delete (E,F) AND (E,G).
  • Now choose among the available vertices with no incoming edges, we have B and F, we choose B here.
  • Delete all the outgoing edges from B, so we will delete (B,C).
  • Now choose among the available vertices with no incoming edges, we have F, we choose F here.
  • Delete all the outgoing edges from F, so we will delete (F,G).
  • Now choose among the available vertices with no incoming edges, we have G, we choose G here.
  • Delete all the outgoing edges from G, so we will delete (G,C).
  • Now choose among the available vertices with no incoming edges, we have C, we choose C here.
  • Delete all the outgoing edges from C, so we will delete (C,H).
  • Now choose among the available vertices with no incoming edges, we have H, we choose H here.
  • There is no more connected vertices, hence we are done with the algorithm.

Here is an animation of the above steps which will help you understand the steps pictorially
ANIMATION

Starting from the Sink Vertex

You can perform the above set of operations from the sink vertex and arrive to one of the topological ordering by deleting the incoming edges to the sink vertex.
The approach and number of steps mostly remains the same.
Remember that the best way to delete the edges incoming or outgoing from a vertex is to delete the vertex itself.

Pseudo Code

For the purpose of the Pseudo code and the source code, we would prefer that the graph is given to us as a adjacency list, where each list contains all the nodes coming out of a given vertex. For more on Adjacency List, please refer to my tutorial on Basics of Graph Theory

Also, I would like to start from the sink as it is easier to identify all the sink vertices in an adjacency list representation.

 

This is just to explain the idea behind the topological sort. We can definitely improve on the running time and get rid of inner loops to a great extent by choosing the underlying data structure carefully. May be if each vertex contains the list of vertices predecessor vertices, it will reduce the running time significantly.

Source Code

Please download the complete source code from the Github repository of Techieme

Analysis

Talking about Analysis, we can do much better than what is shown above.
Running Time : In the above code, the first loop runs for |V| times for identifying the initial sinks. The second looping construct is a nested loop.
The outer loop runs for |V| times as the queue will only have each vertex at max one time.
The inner loop runs for |V| times as well to iterate over all the existing vertices in G. If the data structure used for Graph allowed concurrent modification this would not have run for |V| times for each iteration of the outer loop.
The contains check on adjacency list depends on the implementation of the list. If Java offers a O(1) contains check on the list then good.

Finally it would be O(|V|*|V|) for this code. In case we could have stored more information in a better data structure, it would have been O(|V|+|E|)

Auxiliary Space : We use an extra queue so it would be O(|V|).

Conclusion

This post is just an introduction to topological ordering, we can really achieve better results in terms of faster running time. Stay connected for other posts on the topic. This concept will solve the problems of dynamic linking and loading given a set of program files and their dependencies, we need to identify an order in which it can be loaded so that we do not pollute the container.

May be the Spring framework uses the same for Autowiring or Dependency Injection.

Don’t forget to subscribe to TechieMe to get updates on latest posts.

  • For vertices in order of 10^7, it is possible in O(|E| + |V|) using just Queue+Array. I was interested in the approach for larger number of vertices and in particular for complete graph.

    I would also like to know if you have posted something for "Hashmap implementation in java" and it's capacity which provides query in amortized O(1) as far as I have heard.

    Also I would suggest you to put an image explaining the forward and backward edges concept for possibilities of topological ordering for better visualization.

    • Out of the three things you mentioned. I think the hashing thing is pretty much covered in detail on this blog. Also, you can use the search bar, it helps 🙂

      http://techieme.in/hashing-in-detail-part-one/

      Will cater the other ones in upcoming posts.

      • Thanks for the beautiful response. On a quick overview of the 3 posts on hashing I found different techniques of hash implementation. I was particularly interested in :

        Class HashMap<K,V> used in java. I would like to know what technique exactly java uses for this class.

        • Sorry, I have started a series on Java Language Specification but haven't reached till HashMap. I would suggest you reading the JLS