Merge Sort Algorithm

Quick reference

	Complexity
Worst case time
Best case time
Average case time
Space

Strengths:

Fast. Merge sort runs in , which scales well as n grows.
Parallelizable. Merge sort breaks the input into chunks, each of which can be sorted at the same time in parallel.

Weaknesses:

Space. Merge sort takes up extra space, including space for the recursive call stack.

The High-Level Idea

Merge sort is a recursive algorithm that works like this:

split the input in half
sort each half by recursively using this same process
merge the sorted halves back together

Like all recursive algorithms, merge sort needs a base case. Here, the base case is an input list with one item. A one-element list is already sorted.

Splitting the Input

Here's a visual representation of merge sort repeatedly breaking the input list in half:

[8, 3, 2, 7, 9, 1, 4] is broken in half ([8, 3, 2, 7] and [9, 1, 4]). Those halves are broken in half again ([8, 3], [2, 7]; [9], [1, 4]) and again ([8], [3], [2], [7], [9], [1], [4]) until they're just one item each.

At the bottom, we've got a bunch of "sorted" one-item lists that we want to combine into one large sorted list.

Merging sorted lists

Combining two one-item lists is easy—the smaller item goes first and the larger item goes second.

Single-element lists are merged together to make two-element sorted lists. ([8] + [3] -> [3, 8]; [2] + [7] -> [2, 7]; [1] + [4] -> [1, 4]

Now, how do we merge two of these two-item lists?

That's a bit trickier, and it definitely requires more than one comparison.

Let's come up with a more general algorithm that can merge two sorted lists of any size.

Take this example:

Two smaller sorted lists ([1, 3, 5] and [2, 6, 8]) will be merged into a sorted list with 6 elements ([_, _, _, _, _, _])

The smallest item has to be the first item in either the first list or the second list. So, we compare the first element in both lists and place the smaller of the two at the front of the merged list.

The first element in one of the lists ([1, 3, 5]) is 1. The first element in the other list ([2, 6, 8]) is 2.

The first element in one of the lists ([1, 3, 5]) is 1. The first element in the other list ([2, 6, 8]) is 2. 1 is smaller, so it goes at the beginning of our sorted list ([1, _, _, _, _, _])

We can keep repeating this process, skipping over any items that we've already added to the merged list. At each step, we'll just compare the earliest unmerged items from both lists and add the smaller one.

The merged list is [1, _, _, _, _, _]. Now, we're comparing 3 (from [1, 3, 5]) and 2 (from [2, 6, 8]).

The merged list is [1, _, _, _, _, _]. Now, we're comparing 3 (from [1, 3, 5]) and 2 (from [2, 6, 8]). 2 is smaller than 3, so it goes next in our sorted list, which is now [1, 2, _, _, _, _].

The merged list is [1, 2, _, _, _, _]. Now, we're comparing 3 (from [1, 3, 5]) and 6 (from [2, 6, 8]).

The merged list is [1, 2, _, _, _, _]. Now, we're comparing 3 (from [1, 3, 5]) and 6 (from [2, 6, 8]). 3 is smaller than 6, so it goes next in our sorted list, which is now [1, 2, 3, _, _, _].

The merged list is [1, 2, 3, _, _, _]. Now we're comparing 5 (from [1, 3, 5]) and 6 (from [2, 6, 8]).

The merged list is [1, 2, 3, _, _, _]. Now we're comparing 5 (from [1, 3, 5]) and 6 (from [2, 6, 8]). 5 is smaller than 6, so it goes next in our sorted list, which is now [1, 2, 3, 5, _, _].

Once we exhaust one list, we can just append any remaining items from the other one to the end of the merged list.

All of the items from [1, 3, 5] are in the sorted list. 6 and 8 from [2, 6, 8] still need to be added to the merged list.

6 and 8 are appended to the merged, sorted list, which is now [1, 2, 3, 5, 6, 8]

Using this more general approach, we can easily combine our two-element lists into four-element lists, combine those into eight-element lists, and so on, until all the items are in a single, sorted list.

Small lists ([3, 8]; [2, 7]; [9]; [1, 4]) are merged to form larger sorted lists ([2, 3, 7, 8]; [1, 4, 9]). Those larger lists are merged again into one sorted list with all the items: [1, 2, 3, 4, 7, 8, 9].

Implementation

Here's how we'd code it up:

def combine_sorted_lists(list_one, list_two): list_one_index = 0 list_two_index = 0 merged_list = [] # Both lists have some items left in them. while list_one_index < len(list_one) and list_two_index < len(list_two): # Choose the smaller of the two items and add it to the # merged list. if list_one[list_one_index] <= list_two[list_two_index]: merged_list.append(list_one[list_one_index]) list_one_index += 1 else: merged_list.append(list_two[list_two_index]) list_two_index += 1 # Grab any lingering items in the first list after we've # exhausted the second list while list_one_index < len(list_one): merged_list.append(list_one[list_one_index]) list_one_index += 1 # Grab any lingering items in the second list after we've # exhausted the first list while list_two_index < len(list_two): merged_list.append(list_two[list_two_index]) list_two_index += 1 return merged_list def merge_sort(the_list): # Base case: single element list if len(the_list) <= 1: return the_list # Split the input in half middle_index = len(the_list) / 2 left = the_list[:middle_index] right = the_list[middle_index:] # Sort each half left_sorted = merge_sort(left) right_sorted = merge_sort(right) # Merge the sorted halves return combine_sorted_lists(left_sorted, right_sorted)

Execution Order

We described merge sort as two steps: a splitting step and a merge step.

In reality, these steps are interleaved through recursive calls.

When we split the input into two lists, we'll actually finish sorting the left half (splitting it all the way down to one-element lists and merging them back together) before we even touch the right half.

The input list [8, 3, 2, 7, 9, 1, 4] is split in half, with the left half containing [8, 3, 2, 7]. That half gets broken into 1 element lists ([8, 3]; [2, 7] -> [8], [3], [2], [7]), which get merged together ([8] + [3] -> [3, 8]; [2] + [7] -> [2, 7]; [3, 8] + [2, 7] -> [2, 3, 7, 8]). Meanwhile, the right half of the input ([9, 1, 4]) is untouched.

Then, we'll do the same thing with the right half before combining both sorted halves together to get the full sorted list.

The right half of [8, 3, 2, 7, 9, 1, 4] is [9, 1, 4]. That half gets split into single-element lists ([9]; [1, 4] -> [9], [1], [4]), which are merged into sorted lists ([1] + [4] -> [1, 4]; [9] + [1, 4] -> [1, 4, 9]

Understanding the actual execution order is important for thinking through the space cost on the call stack.

Time Complexity

Look at each row in the execution diagram:

The input list ([8, 3, 2, 7, 9, 1, 4]) is broken in half ([8, 3, 2, 7] and [9, 1, 4]), broken in half again ([8, 3]; [2, 7]; [9]; [1, 4]), and again ([8], [3], [2], [7], [9], [1], [4]). The single-element lists are merged together ([8] + [3] -> [3, 8]; [2] + [7] -> [2, 7]; [1] + [4] -> [1, 4]), and merged again ([3, 8] + [2, 7] -> [2, 3, 7, 8]; [9] + [1, 4] -> [1, 4, 9]) and again ([2, 3, 7, 8] + [1, 4, 9] -> [1, 2, 3, 4, 7, 8, 9]) to form the final sorted list.

Each row in the diagram is time.

In the upper half, we do some constant-time work for each item by copying the item from the input list into a smaller list.
In the bottom half, we do some constant-time work for each item by copying the item from a smaller, sorted list into the larger, merged list.

How many rows are there in the diagram? In the top half, we're dividing n in half until we get down to 1. Then, in the bottom half we do the reverse—multiplying by two until we get to n. That's divisions followed by multiplications.. Adding the two together, we've got levels.

So overall, merge sort takes time.

Space Complexity

When combining two sorted lists into a single bigger one, we need scratch space to hold the merged result.

Since the lists we'll be combining have items, we'll need space in total.

But, don't we need less than space for some of the merges?

Yes. For instance, when we're merging two lists with one element each, that requires space.

But, by the time we get to the final merge, we'll have two halves that each have n/2 items, so we'll need space for the combined result.

Do we really need that scratch space? Can't we just be more clever about how we merge the sorted lists?

Try for yourself! Turns out you can't really do the merge in space without bumping up the time complexity.

If you look carefully, our implementation uses more space () than is strictly necessary.

In the arguments for each recursive call, we're passing in a copy of the input. And, since we might have recursive calls before we hit the base case, that's copies of the input.

We did this to keep the implementation nice and tidy (and not distract from the key intuition behind merge sort).

But with some cleverness, you can get it down to extra space. Try it. :)

Interview coming up?

Get the free 7-day email crash course. You'll learn how to think algorithmically, so you can break down tricky coding interview questions.

No prior computer science training necessary—we'll get you up to speed quickly, skipping all the overly academic stuff.

No spam. One-click unsubscribe whenever.

You're in! Head over to your email inbox right now to read day one!