Skip to the content of the web site.

Project G.2: Merge sort

Up to this point, we have written a plethora of different versions of merging two lists, so we will now introduce but one more; however, in this case, we will author the code for you.

1. Merging adjacent sorted sub-arrays

Suppose you have a single array T array[] and there are two adjacent sub-arrays, each of which are known to be sorted:

In this array, indices $a$ through $b - 1$ are sorted, and indices $b$ through $c - 1$ are sorted. Notice that we choose our indices to match the standard understanding that the indices of an array go from $0$ to $n - 1$. The number of entries in the first array is $b - a$ and the number of entries in the second array is $c - b$. For example,

0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
34 15 65 40 42 50 59 68 80 3 22 23 29 46 57 65 73 99 44 2 48

where the entries from 3 to 8 are sorted and the entries from 9 to 17 are sorted, so $a = 3$, $b = 9$ and $c = 18$.

We would like to merge these two sub-arrays into a single sorted array with $c - a$ entries:

In our example, we would end up with

0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
34 15 65 3 22 23 29 40 42 46 50 57 59 65 68 73 80 99 44 2 48

The declaration of this function would be

template <typename T>
void merge( T array[], size_t a, size_t b, size_t c ) {

We cannot easily merge in place, and therefore this function will:

  1. allocate a new array of capacity c - a,
  2. merge the two arrays into the newly allocated array,
  3. copy the entries back to the original array, and
  4. delete the new array.
template <typename T>
void merge( T array[], size_t a, size_t b, size_t c ) {
        if ( ((b - a) == 0) || ((c - b) == 0) ) {
                // there is nothing to merge...
void merge( T array[], size_t a, size_t b, size_t c ) {
        if ( ((b - a) == 0) || ((c - b) == 0) ) {
                // there is nothing to merge...
                return;
        } else {
                // Create a new array of sufficient capacity
                T *tmp_array{ new T[c - a] };
        
                // Merge the two sub-arrays so long as both are not empty
                size_t k0{a};
                size_t k1{b};
                
                size_t m{0};
        
                while ( (k0 < b) && (k1 < c) ) {
                        if ( array[k0] < array[k1] ) {
                                tmp_array[m] = array[k0];
                                ++k0;
                        } else {
                                tmp_array[m] = array[k1];
                                ++k1;
                        }
                
                        ++m;
                }

                // Copy the remaining entries over
                while ( k0 < b ) {
                        tmp_array[m] = array[k0];
                        ++k0;
                        ++m;
                }

                while ( k1 < c ) {
                        tmp_array[m] = array[k1];
                        ++k1;
                        ++m;
                }

                // Copy the entries back to the original array
                for ( size_t k{0}; k < c - a; ++k ) {
                        array[a + k] = tmp_array[k];
                }

                // Deallocate the temporary array
                delete[] tmp_array;
        }
}

We will now use this to sort an array.

2. Merge sort

Suppose we have an array of capacity n. If we want to sort this array, we have already looked at selection sort, insertion sort and bucket sort. Both selection sort and insertion sort are very slow for large arrays, and bucket sort is not useful if the number of possible values is too large.

Suppose we want to write a function with the declaration

template <typename T>
void merge_sort( T array[], size_t a, size_t c ) {

that will sort a list as follows:

  1. If the capacity of the array is $0$ or $1$, we are done: the array is already sorted.
  2. Otherwise,
    1. find the midpoint between the two indices,
    2. call merge sort recursively on both sub-arrays, and
    3. merge the two sorted sub-arrays into a single array.
template <typename T>
void merge( T array[], size_t a, size_t b, size_t c );

template <typename T>
void merge_sort( T array[], size_t a, size_t c ) {

template <typename T>
void merge_sort( T array[], size_t a, size_t c ) {
        if ( (c - a) <= 1 ) {
                // An array of capacity 0 or 1 is already sorted
                return;
        } else {
		// Do not use (a + c)/2, as this may overflow
                size_t b{a + (c - a)/2};

                merge_sort( array, a, b );
                merge_sort( array, b, c );
                merge( array, a, b, c );
        }
}

In order to experiment with the randomly-ordered arrays this author has run the merge sort we implemented above on various capacities a million to one-hundred-and-sixty million entries and calculated the run times by running each test ten times and averaging them. The data is in the data directory, and a plot of the time it takes to sort an array of a specific capacity is shown in this figure:

Extrapolating, if we had sufficient memory, we could sort an array of capacity one billion in 142 seconds, an array of capacity one trillion in 45 minutes, and an array of capacity one quadrillion in five years and ten months.

If you compare these run times with those of selection sort and insertion sort, you will see that these times are significantly faster (extrapolating from the data we collected on selection sort, that algorithm would take five years and five months just sort an array of capacity one billion); however, bucket sort is still significantly faster than merge sort; the only difference is that merge sort does not require that all the entries are from a fixed set of points. For example, bucket sort could never sort an array of double, while merge sort can sort an array of double.

3. Why does merge sort work?

When you take a look at the merge sort algorithm, there are only three lines of code, two of which are recursive calls, that the third is simply merging two sub-arrays.

Let us consider this from an inductive proof:

Suppose the array capacity is $n = 1$. In this case, the array is already sorted, so merge sort works correctly, as it simply returns. This is the base case.

Suppose that merge sort works correctly for all capacities less than or equal to $n$. This is our inductive hypothesis.

Now, suppose that the array capacity is now $n + 1$. If we divide that array into two approximately equal sub-arrays, then if $n \ge 1$ and $n + 1$ is even, then $1 \le \frac{n + 1}{2} \le n$; and if $n \ge 1$ and $n + 1$ is odd, then $1 \le \frac{n}{2} < \frac{n}{2} + 1 \le n$. Therefore, both sub-arrays will always have a capacity that is less than or equal to $n$. By our inductive hypothesis, merge sort will successfully sort both of these arrays.

Once we have both sub-arrays correctly sorted, it should be clear that the merging process must produce a sorted array of capacity $n + 1$, and therefore if merge sort did sort arrays of capacity less than or equal to $n$, it must also be capable of sorting an array of capacity $n + 1$.

Therefore, by the process of mathematical induction, merge sort must be able to sort an array of an arbitrary size $n \ge 1$.

Visually speaking, the following image shows the process of applying merge sort on an array of capacity 8. The backgrounds indicate function calls from the same function body. The first and second calls to merge sort and the third call to merge are indicated by red, cyan and purple. The order in which the functions are called is indicated by a number in a circle.

4. Optimizations

We will look at two optimizations:

  1. reducing the number of memory allocations and deallocations to one each, and
  2. calling insertion sort if the capacity of the array being sorted is smaller.

4.1 Reducing the memory allocations and deallocations

One optimization is that we must call new and delete many times. It would be significantly better if a single temporary array was allocated, and this temporary array is used for merging. This optimization speeds up the algorithm by over 30%. For this optimization, there are now three functions:

template <typename T>
void merge_sort( T array[], size_t n );

template <typename T>
void merge_sort( T array[], T tmp_array[], size_t a, size_t c );

template <typename T>
void merge( T array[], T tmp_array[], size_t a, size_t b, size_t c );

Here the user would call the first function. The first function would allocate an array of equal size and call the second function. The second function would recursively call itself and merge with the temporary array until the array is finally sorted. Then, the first function would deallocate the temporary array.

There are other optimizations you could use that would speed up the process even more; however, we will leave those for your course on algorithms and data structures.

4.2 Calling insertion_sort for smaller arrays

There is a significant amount of overhead with merge sort, for there are a significant number of recursive function calls being made, and while this recursion makes it much faster to sort very large arrays also makes it rather painfully slow to sort smaller arrays; however, even when we are sorting larger arrays, through this process of sub-dividing arrays, ultimately we are sorting many smaller arrays. To solve this problem, when we get to a sufficiently small array, we call insertion sort. The question is, what is a sufficiently small array: while running numerous tests, it seems that 60 is a reasonable number at which to transition between merge sort and insertion sort. This reduces the run time by another 15%.

Note that this now requires an insertion sort that only sorts a sub-array of a given array, from a to c - 1, as opposed from 0 to c - 1 (we use c instead of n to consistent with the use above:

template 
void merge_sort( T array[], T tmp_array[], size_t a, size_t c );

template <typename T>
void insertion_sort( T array[], size_t a, size_t c );

template <typename T>
void insertion_sort( T array[], size_t a, size_t c ) {
	for ( size_t n{a + 1}; n < c; ++n ) {
		T const tmp{array[n]};
		size_t k{n};

		for ( ; k > a; --k ) {
			if ( array[k - 1] > tmp ) {
				array[k] = array[k - 1];
			} else {
				break;
			}
		}

		array[k] = tmp;
	}
}

template <typename T>
void merge_sort( T array[], T tmp_array[], size_t a, size_t c ) {
	if ( (c - a) <= 60 ) {
		insertion_sort( array, a, c );
	} else {
		size_t b{a + (c - a)/2};

		merge_sort( array, tmp_array, a, b );
		merge_sort( array, tmp_array, b, c );
		merge( array, tmp_array, a, b, c );
	}
}