Counting sort is a very time-efficient (and somewhat space-inefficient) algorithm for sorting that avoids comparisons and exploits the time insertions and lookups in a list.
The idea is simple: if you're sorting integers and you know they all fall in the range 1..100, you can generate a sorted list this way:
- Allocate a list num_counts where the indices represent numbers from our input list and the values represent how many times the index number appears. Start each value at 0.
- In one pass of the input list, update num_counts as you go, so that at the end the values in num_counts are correct.
- Allocate a list sorted_list where we'll store our sorted numbers.
- In one in-order pass of num_counts put each number, the correct number of times, into sorted_list.
Counting sort takes time and additional space (for the new list that we end up returning).
Wait, aren't we nesting two loops towards the bottom? So shouldn't it be time? Notice what those loops iterate over. The outer loop runs once for each unique number in the list. The inner loop runs once for each time that number occurred.
So in essence we're just looping through the n numbers from our input list, except we're splitting it into two steps: (1) each unique number, and (2) each time that number appeared.
Here's another way to think about it: in each iteration of our two nested loops, we append one item to sorted_list. How many numbers end up in sorted_list in the end? Exactly how many were in our input list! n!
There are some rare cases where even though our input items aren't integers bound by constants, we can write a function that maps our items to integers from 0 to some constant such that different items will always map to different integers. This allows us to use counting sort.