Tree Diameter

Introduction

This is the fifth article in the Tree Traversals – Online Classes.

A very interesting problem is to find the diameter of a tree. The challenge is not in coding the algorithm but in understanding the meaning of the problem. We will try to define the problem so that visualizing the solution becomes easy.

Defining the problem – Tree Diameter

Few confusions about the diameter must be avoided before we start.

  • The tree diameter is always the distance between two leaf nodes
  • The diameter may or may not pass through the root.
  • The diameter can also lie on one of the two sides, the left sub tree or the right sub tree.
  • There can be multiple diameters in a tree
  • The diameter is the longest path between two nodes, which means we need to know the distance between every two node and the max distance is the diameter.

tree diameter

In the above tree, the diameter passes through the root and the distance between all the leaf nodes are mentioned in the below table.

tree diameter table

Please do not pay attention to the numbers written inside the parenthesis of each node for now. We will discuss them in detail later in the article.

The maximum value in any cell of the table is 8 which means that the diameter of the tree is 8. Using this table we can also identify the nodes for which we have the diameter.
Any of the below eight pairs can make a diameter whose size is 8. Which means if you travel between the nodes in any pair, you will get the diameter.

N11 – N13
N11 – N14
N11 – N15
N11 – N16
N12 – N13
N13 – N14
N14 – N15
N15 – N16

As any other tree traversal problems, even this can be solved in an iterative as well as recursive way. We will discuss them one by one.

Approach – Tree Diameter

By looking at the table and tree we can understand that what exactly can be called as a diameter. So how are we going to solve this?
The easiest way to solve this problem is to find the maximum distance between leaf nodes of each internal node and then get the maximum of all the distances and term it as diameter.

tree diameter table for max distance

For e.g.: The above table contains the max distance between the farthest leaf nodes for each of the internal node. It shows that the node N01 has the max value, so the diameter of the tree is 8.

Recursive Approach – Tree Diameter

Another way of approaching this problem is as follows. As we mentioned above that the diameter can
1) completely lie in the left sub tree or
2) completely lie in the right sub tree or
3) may span across the root

Which means that the diameter can be ideally derived by
1) the diameter of left tree or
2) the diameter of right tree or
3) the height of left sub tree + the height of right sub tree + 1 ( 1 to add the root node when the diameter spans across the root node)

And we know that the diameter is the lengthiest path, so we take the maximum of 1 and 2 in case it lies in either of the side or wee take 3 if it spans through the root.

This seems to be a good recursive definition.

Pseudo Code Recursive – Tree Diameter

Source Code Recursive – Tree Diameter

Source Code – Height of a node

Analysing the running time of Recursive Code for Tree Diameter

We know that Calculating the height of the node takes O(NlogN) in average case.
Lets say that T(N) is the running time then we can defined
T(N) = 2 * (Time to calculate Height of a node) + T(N/2) + T(N/2) + constant;
T(N) = 2 * NlogN + 2 * T(N/2) + cN
T(N) = 2 * T(N/2) + N * (2 * logN + c)

This is a similar recurrence as Merge Sort so, it will result into
T(N) = NlogN + 2 * NlogN + cN , we can drop the lower order terms and get the running time as N(logN)

It also requires auxiliary space which is for the recursion stack frame. At every recursion it needs frames equal to the height. So the amount of space would be O(logN).

Iterative Approach – Tree Diameter

Now we have a tree, we need a meta information with each of the node so that each node knows following:
1) The height of its left child,
2) The height of its right child and
3) The farthest distance between its leaf nodes.

Once each node has this information, we need a temporary variable to keep track of the maximum path. By the time the algorithm finishes, we have the value of diameter in the temp variable.

Now, we need to solve this problem in a bottom up approach, because we have no idea about the three values for the root. But we know these values for the leaves.

Steps to solve
1) Initialize all the leaves with leftHeight and rightHeight as 1.
2) Initialize all the leaves with maxDistance as 0, we make it a point that if either of the leftHeight or rightHeight is 1 we make the maxDistance = 0
3) Move upward one at a time and calculate the values for the immediate parent. It would be easy because now we know these values for the children.
4) At a given node,
a) assign leftHeight as maximum of (leftHeight or rightHeight of its left child).
b) assign the rightHeight as maximum of (leftHeight or rightHeight of its right child).
c) if any of these values (leftHeight or rightHeight) is 1 make the maxDistance as zero.
d) if both the values are greater than zero make the maxDistance as leftHeight + rightHeight – 1
5) Maintain the maxDistance in a temp variable and if in step 4 the maxDistance is more than the current value of the variable, replace it with the new maxDistance value.
6) At the end of the algorithm the value in maxDistance is the diameter.

Now as we solve it using a bottom up approach and visit the leaves first, that means we have to traverse the tree in a Post Order way because that visit the leaves first.

Source Code Recursive – Tree Diameter

Main Driver method

The source code for BinaryNode, CustomBinaryNode and Tree creation is attached. Please download if you need the complete code.

Analysing the running time of Iterative Code for Tree Diameter

1) The first while loop runs for N times where N is the number of nodes in the tree. As each node is pushed in the Stack. Inside the loop we do a fixed amount of work that is popping one element from the stack and pushing two elements in the stack. So the running time for first loop is O(N)

2) The second loop also runs for N times as the number of elements in the stack is N. Inside the loop we still do a constant work, may be the constant factor is bigger as we are doing a lot of work compared to the previous loop.

So the work done in running this algorithm may reduce to O(N). And all this by choice of an intelligent data structure. There is auxiliary space required which O(N) because we need two stacks of size N and we also want the tree nodes to store three extra values per node.

Conclusion

We learnt the meaning of diameter of a tree and also understood how to find it out with or without recursion. Tree Diameter is a wonderful problem and with a choice of a good structure we can improve the running time.

If you liked the article, please subscribe through email and like us on Facebook

  • empo

    I am sorry, but i couldnt understand the first approach. How did you fill that table? I might not have applied much time and mind, but it would be better if this could be better without putting that much mind into it. Anyways, i liked the site. Keep doing.

    • @Mangal_Dev Hey, the tables are just to understand the meaning of a diameter. All the values are manually populated by counting the number of nodes from one leaf node to another. I didn't apply any mathematical formula or programming fact. It is not related to any of the two approaches mentioned later in the article

  • @Mangal_Dev:disqus Hey, the tables are just to understand the meaning of a diameter. All the values are manually populated by counting the number of nodes from one leaf node to another. I didn't apply any mathematical formula or programming fact. It is not related to any of the two approaches mentioned later in the article.