Easy way to understand quick sort

Introduction

This article touches in detail all the aspects of a quick sort algorithm. I tried to write it as simple as possible to make it an easy way to understand quick sort. This is really an interesting algorithm to learn and I have found that most of the students or professionals feel it hard to understand or re-produce in code.

Problem Statement

You are given an array A[a1,a2,a3, . . . . . . , an] having elements in a random order and we need to re-produce the array A[b1, b2, b3, . . . . , bn] where b1 < b2 < b3 < b4 < . . . < bn .

Basic idea behind quick sort

The basic idea behind implementing quick sort is to divide the array in small parts and then sort the individual parts. The quick sort algorithm uses partitioning as its major routine.

The important thing about quick sort is that its in-place sort, which precisely means that it will not require any extra space. It just re-arranges the elements in the same array. Another in-place sort is Insertion Sort.

So now that it is clear that quick sort does partitions, it is obvious to conclude that the strategy used is Divide and Conquer.

The Divide Step
Partition a given array into two sub-arrays around a pivot ‘x’ such that elements in lower subarray <= x <= elements in upper subarray.
quicksort-invariant

The Conquer Step
Recursively sort the two subarrays.

At any point of time when an iteration is over, during the run of this algorithm the state of the array will be like below:

invariant_1

The image above shows that we maintain two pointers i and j, where the elements at indices less than i is always less than a chosen pivot element x and the elements at indices greater than i and less than j is always greater than the pivot element. Also, we do not have any idea about the elements lying at indices on the right of j.

Let’s try to understand what is happening in a given iteration of quick sort

Step1:
step1

We start with an array of 8 elements, initialize the pointers p=0, i=0 and j=i+1. p points to the pivot element (pivot need not be the first element of the array but for simplicity purpose let us suppose it is the first element).

Keep incrementing j to the right till we get something less than pivot element. j reaches at the number 5 which is lesser than the pivot element (6). At this point, value of i=0, j=3 and p=0 and whenever we find such an element in the right we swap the element at j with the element at i+1. We increment i by 1 and j remains at its same position.

Step2:
tep2

We start with the new location of j and repeat the same step (move right till we don’t find something smaller than pivot element or we reach the end of the array). j reaches at the number 3 which is lesser than the pivot element (6). At this point value of i=1, j=5 and p=0 and we swap the element at j with the element at i+1. We increment i by 1 and j remains at the same position.

Step3:
step3

We start with the new location of j and repeat the same step (move right till we don’t find something smaller than pivot element or we reach the end of the array). j reaches at the number 2 which is lesser than the pivot element (6). At this point value of i=2, j=6 and p=0 and we swap the element at j with the element at i+1. We increment i by 1 and j remains at the same position.

Step4:

tep4

We start with the new location of j and repeat the same step (move right till we don’t find something smaller than pivot element or we reach the end of the array). j reaches at the end and the loop terminates. When the loop terminates that is considered being the end of iteration. At this point we have to swap the pivot element with the element at i to maintain the variant we talked in the beginning, such that everything at the left of pivot element is less than the pivot element and everything at the right of pivot element is greater than the pivot element.

The array at the end of first iteration should look like the below:

step5

At this point we are sure that the element 6 is at its right position, this also tells that when the array will be completely sorted, the index of 6 will be 3.

Also, everything at the left of 6 is left sub-array which contains element less than 6 and everything to the right of 6 is the right sub-array which will contain all the elements greater than 6.

The same thing has to be repeated on the left and right sub-arrays individually and so on. Eventually we end up with the sorted array.

Pseudo Code

Below we try to write the above explained steps in code like structure.
Partition Routine
Below is the partition routing for the algorithm, it returns the index at which the pivot is placed after completion of each iteration. In the above illustration it is the blue oval, the value of i for which is 3. So the partition routine will return 3 after the end of iteration one.

partition routine

Algorithm
Below is the complete algorithm which uses the partition routine in a recursive manner. In the below algorithm we call the QuickSort(A,p,q) routine with the initial values of p=0 and q=n as we pass the complete array.

When the QuickSort routine is invoked, it calls the Partition(A, p ,q) routine and returns the index r where the pivot is to be placed or we can say it the index which divides the whole array into two parts, to the left of which are elements which are always smaller than pivot and the to the right are elements always greater than the pivot.

qucksort

After getting the value of r it calls the QuickSort routine recursively two times with different parameters :
Quicksort(A, p, r-1) – to sort the left sub-array
and
Quicksort(A, r+1, q) – to sort the right sub-array

Code and Implementation

Analysis

 

The partition routine iterates through the array once, hence it runs in linear time O(N). The swap routine is a constant time operation, hence its run time complexity is O(1). The quickSort routine is the recursive step, hence we can write the recurrence relation for it.

T(N) = 2T(N/2) + O(N)

This is for the case, where our pivot divides the array in two equal parts. In worst cases, it might end up dividing into skewed arrays, but again that can be normalized by a random pivot selection.

The solution to this recurrence is derived by Master method as this recurrence is of the form T(N) = aT(N/b) + f(N). The solution for such recurrences is given by the second case of the Master method because the work done on each level is roughly equal. Hence the solution is T(N) = N log N , this is because the values of b and a is 2 and log2 to the base 2 is 1.

For the complete explanation of the master method please visit my previous blog post