But it doesn't have to end here! Sign up for the 7-day coding interview crash course and you'll get a free Interview Cake problem every week.
In order to win the prize for most cookies sold, my friend Alice and I are going to merge our Girl Scout Cookies orders and enter as one unit.
Each order is represented by an "order id" (an integer).
We have our lists of orders sorted numerically already, in lists. Write a function to merge our lists of orders into one sorted list.
We can do this in time and space.
If you're running a built-in sorting function, your algorithm probably takes time for that sort.
Think about edge cases! What happens when we've merged in all of the elements from one of our lists but we still have elements to merge in from our other list?
We could simply concatenate (join together) the two lists into one, then sort the result:
What would the time cost be?
, where n is the total length of our output list (the sum of the lengths of our inputs).
We can do better. With this algorithm, we're not really taking advantage of the fact that the input lists are themselves already sorted. How can we save time by using this fact?
A good general strategy for thinking about an algorithm is to try writing out a sample input and performing the operation by hand. If you're stuck, try that!
Since our lists are sorted, we know they each have their smallest item in the 0th index. So the smallest item overall is in the 0th index of one of our input lists!
Which 0th element is it? Whichever is smaller!
To start, let's just write a function that chooses the 0th element for our sorted list.
Okay, good start! That works for finding the 0th element. Now how do we choose the next element?
Let's look at a sample input:
To start we took the 0th element from alices_list and put it in the 0th slot in the output list:
We need to make sure we don't try to put that 1 in merged_list again. We should mark it as "already merged" somehow. For now, we can just cross it out:
Or we could even imagine it's removed from the list:
Now to get our next element we can use the same approach we used to get the 0th element—it's the smallest of the earliest unmerged elements in either list! In other words, it's the smaller of the leftmost elements in either list, assuming we've removed the elements we've already merged in.
So in general we could say something like:
Can you implement this in code?
Okay, this algorithm makes sense. To wrap up, we should think about edge cases and check for bugs. What edge cases should we worry about?
Here are some edge cases:
Actually, (3) will always happen. In the process of merging our lists, we'll certainly exhaust one before we exhaust the other.
Does our function handle these cases correctly?
If both lists are empty, we're fine. But for all other edge cases, we'll get an IndexError.
How can we fix this?
We can probably solve these cases at the same time. They're not so different—they just have to do with indexing past the end of lists.
To start, we could treat each of our lists being out of elements as a separate case to handle, in addition to the 2 cases we already have. So we have 4 cases total. Can you code that up?
Be sure you check the cases in the right order!
Cool. This'll work, but it's a bit repetitive. We have these two lines twice:
Same for these two lines:
That's not DRY. Maybe we can avoid repeating ourselves by bringing our code back down to just 2 cases.
See if you can do this in just one "if else" by combining the conditionals.
You might try to simply squish the middle cases together:
But what happens when my_list is exhausted?
We'll get an IndexError when we try to access my_list[current_index_mine]!
How can we fix this?
First, we allocate our answer list, getting its size by adding the size of my_list and alices_list.
We keep track of a current index in my_list, a current index in alices_list, and a current index in merged_list. So at each step, there's a "current item" in alices_list and in my_list. The smaller of those is the next one we add to the merged_list!
But careful: we also need to account for the case where we exhaust one of our lists and there are still elements in the other. To handle this, we say that the current item in my_list is the next item to add to merged_list only if my_list is not exhausted AND, either:
The if statement is carefully constructed to avoid an IndexError from indexing past the end of a list. We take advantage of Python 2.7's short circuit evaluation and check first if the lists are exhausted.
time and additional space, where n is the number of items in the merged list.
The added space comes from allocating the merged_list. There's no way to do this "in place" because neither of our input lists are necessarily big enough to hold the merged list.
But if our inputs were linked lists, we could avoid allocating a new structure and do the merge by simply adjusting the next pointers in the list nodes!
In our implementation above, we could avoid tracking current_index_merged and just compute it on the fly by adding current_index_mine and current_index_alices. This would only save us one integer of space though, which is hardly anything. It's probably not worth the added code complexity.
Trivia! Python's native sorting algorithm is called Timsort. It's actually optimized for sorting lists where subsections of the lists are already sorted. For this reason, a more naive algorithm:
is actually faster until n gets pretty big. Like 1,000,000.
Also, in Python 2.6+, there's a built-in function for merging sorted lists into one sorted list: heapq.merge.
What if we wanted to merge several sorted lists? Write a function that takes as an input a list of sorted lists and outputs a single sorted list with all the items from each list.
Do we absolutely have to allocate a new list to use for the merged output? Where else could we store our merged list? How would our function need to change?
We spent a lot of time figuring out how to cleanly handle edge cases.
Sometimes it's easy to lose steam at the end of a coding interview when you're debugging. But keep sprinting through to the finish! Think about edge cases. Look for off-by-one errors.
Powered by qualified.io