Merge Sort Algorithm

Quick reference

	Complexity
Worst case time
Best case time
Average case time
Space

Strengths:

Fast. Merge sort runs in , which scales well as n grows.
Parallelizable. Merge sort breaks the input into chunks, each of which can be sorted at the same time in parallel.

Weaknesses:

Space. Merge sort takes up extra space, including space for the recursive call stack.

The High-Level Idea

Merge sort is a recursive algorithm that works like this:

split the input in half
sort each half by recursively using this same process
merge the sorted halves back together

Like all recursive algorithms, merge sort needs a base case. Here, the base case is an input array with one item. A one-element array is already sorted.

Splitting the Input

Here's a visual representation of merge sort repeatedly breaking the input array in half:

[8, 3, 2, 7, 9, 1, 4] is broken in half ([8, 3, 2, 7] and [9, 1, 4]). Those halves are broken in half again ([8, 3], [2, 7]; [9], [1, 4]) and again ([8], [3], [2], [7], [9], [1], [4]) until they're just one item each.

At the bottom, we've got a bunch of "sorted" one-item arrays that we want to combine into one large sorted array.

Merging sorted arrays

Combining two one-item arrays is easy—the smaller item goes first and the larger item goes second.

Single-element lists are merged together to make two-element sorted lists. ([8] + [3] -> [3, 8]; [2] + [7] -> [2, 7]; [1] + [4] -> [1, 4]

Now, how do we merge two of these two-item arrays?

That's a bit trickier, and it definitely requires more than one comparison.

Let's come up with a more general algorithm that can merge two sorted arrays of any size.

Take this example:

Two smaller sorted lists ([1, 3, 5] and [2, 6, 8]) will be merged into a sorted list with 6 elements ([_, _, _, _, _, _])

The smallest item has to be the first item in either the first array or the second array. So, we compare the first element in both arrays and place the smaller of the two at the front of the merged array.

The first element in one of the lists ([1, 3, 5]) is 1. The first element in the other list ([2, 6, 8]) is 2.

The first element in one of the lists ([1, 3, 5]) is 1. The first element in the other list ([2, 6, 8]) is 2. 1 is smaller, so it goes at the beginning of our sorted list ([1, _, _, _, _, _])

We can keep repeating this process, skipping over any items that we've already added to the merged array. At each step, we'll just compare the earliest unmerged items from both arrays and add the smaller one.

The merged list is [1, _, _, _, _, _]. Now, we're comparing 3 (from [1, 3, 5]) and 2 (from [2, 6, 8]).

The merged list is [1, _, _, _, _, _]. Now, we're comparing 3 (from [1, 3, 5]) and 2 (from [2, 6, 8]). 2 is smaller than 3, so it goes next in our sorted list, which is now [1, 2, _, _, _, _].

The merged list is [1, 2, _, _, _, _]. Now, we're comparing 3 (from [1, 3, 5]) and 6 (from [2, 6, 8]).

The merged list is [1, 2, _, _, _, _]. Now, we're comparing 3 (from [1, 3, 5]) and 6 (from [2, 6, 8]). 3 is smaller than 6, so it goes next in our sorted list, which is now [1, 2, 3, _, _, _].

The merged list is [1, 2, 3, _, _, _]. Now we're comparing 5 (from [1, 3, 5]) and 6 (from [2, 6, 8]).

The merged list is [1, 2, 3, _, _, _]. Now we're comparing 5 (from [1, 3, 5]) and 6 (from [2, 6, 8]). 5 is smaller than 6, so it goes next in our sorted list, which is now [1, 2, 3, 5, _, _].

Once we exhaust one array, we can just append any remaining items from the other one to the end of the merged array.

All of the items from [1, 3, 5] are in the sorted list. 6 and 8 from [2, 6, 8] still need to be added to the merged list.

6 and 8 are appended to the merged, sorted list, which is now [1, 2, 3, 5, 6, 8]

Using this more general approach, we can easily combine our two-element arrays into four-element arrays, combine those into eight-element arrays, and so on, until all the items are in a single, sorted array.

Small lists ([3, 8]; [2, 7]; [9]; [1, 4]) are merged to form larger sorted lists ([2, 3, 7, 8]; [1, 4, 9]). Those larger lists are merged again into one sorted list with all the items: [1, 2, 3, 4, 7, 8, 9].

Implementation

Here's how we'd code it up:

Execution Order

We described merge sort as two steps: a splitting step and a merge step.

In reality, these steps are interleaved through recursive calls.

When we split the input into two arrays, we'll actually finish sorting the left half (splitting it all the way down to one-element arrays and merging them back together) before we even touch the right half.

The input list [8, 3, 2, 7, 9, 1, 4] is split in half, with the left half containing [8, 3, 2, 7]. That half gets broken into 1 element lists ([8, 3]; [2, 7] -> [8], [3], [2], [7]), which get merged together ([8] + [3] -> [3, 8]; [2] + [7] -> [2, 7]; [3, 8] + [2, 7] -> [2, 3, 7, 8]). Meanwhile, the right half of the input ([9, 1, 4]) is untouched.

Then, we'll do the same thing with the right half before combining both sorted halves together to get the full sorted array.

The right half of [8, 3, 2, 7, 9, 1, 4] is [9, 1, 4]. That half gets split into single-element lists ([9]; [1, 4] -> [9], [1], [4]), which are merged into sorted lists ([1] + [4] -> [1, 4]; [9] + [1, 4] -> [1, 4, 9]

Understanding the actual execution order is important for thinking through the space cost on the call stack.

Time Complexity

Look at each row in the execution diagram:

The input list ([8, 3, 2, 7, 9, 1, 4]) is broken in half ([8, 3, 2, 7] and [9, 1, 4]), broken in half again ([8, 3]; [2, 7]; [9]; [1, 4]), and again ([8], [3], [2], [7], [9], [1], [4]). The single-element lists are merged together ([8] + [3] -> [3, 8]; [2] + [7] -> [2, 7]; [1] + [4] -> [1, 4]), and merged again ([3, 8] + [2, 7] -> [2, 3, 7, 8]; [9] + [1, 4] -> [1, 4, 9]) and again ([2, 3, 7, 8] + [1, 4, 9] -> [1, 2, 3, 4, 7, 8, 9]) to form the final sorted list.

Each row in the diagram is time.

In the upper half, we do some constant-time work for each item by copying the item from the input array into a smaller array.
In the bottom half, we do some constant-time work for each item by copying the item from a smaller, sorted array into the larger, merged array.

How many rows are there in the diagram? In the top half, we're dividing n in half until we get down to 1. Then, in the bottom half we do the reverse—multiplying by two until we get to n. That's divisions followed by multiplications.. Adding the two together, we've got levels.

So overall, merge sort takes time.

Space Complexity

When combining two sorted arrays into a single bigger one, we need scratch space to hold the merged result.

Since the arrays we'll be combining have items, we'll need space in total.

But, don't we need less than space for some of the merges?

Yes. For instance, when we're merging two arrays with one element each, that requires space.

But, by the time we get to the final merge, we'll have two halves that each have n/2 items, so we'll need space for the combined result.

Do we really need that scratch space? Can't we just be more clever about how we merge the sorted arrays?

Try for yourself! Turns out you can't really do the merge in space without bumping up the time complexity.

If you look carefully, our implementation uses more space () than is strictly necessary.

In the arguments for each recursive call, we're passing in a copy of the input. And, since we might have recursive calls before we hit the base case, that's copies of the input.

We did this to keep the implementation nice and tidy (and not distract from the key intuition behind merge sort).

But with some cleverness, you can get it down to extra space. Try it. :)

Interview coming up?

Get the free 7-day email crash course. You'll learn how to think algorithmically, so you can break down tricky coding interview questions.

No prior computer science training necessary—we'll get you up to speed quickly, skipping all the overly academic stuff.

No spam. One-click unsubscribe whenever.

You're in! Head over to your email inbox right now to read day one!