Dijkstra's Algorithm

Dijkstra's algorithm finds the shortest path from one node to all other nodes in a weighted graph.

Say we had the following graph, which represents the travel cost between different cities in the southeast US:

The diagram represents the travel cost between cities: Atlanta, Memphis, Mobile, Nashville, New Orleans, Savannah.

Traveling from Memphis to Nashville? The cheapest route isn't to go straight from one to the other! (You'll actually want to go via New Orleans, Mobile, and Atlanta for the cheapest path.)

This sort of thing can happen in real life. Lots of airlines like to route through large hubs and might not have much service directly between two smaller cities.

Dijkstra's algorithm lets us find the cheapest route from one city to all other cities. So for Memphis:

The diagram represents the cheapest route from Memphis to all other cities.

What if we only want the cheapest path between one pair of nodes? Just pick one of them as the starting point and run Dijkstra's algorithm. Finding the cheapest path to all nodes includes finding the cheapest path to the other node in the pair.

But isn't Dijkstra's algorithm overkill if we only care about one pair of nodes? Actually no, because we'll still need to consider other nodes in the graph to make sure we've found the lowest-cost weighted path. (Take a look at going from Memphis to Nashville, where the cheapest path includes over half the nodes in the graph.)

The Algorithm

Dijkstra's algorithm is like breadth-first search (BFS), except we use a priority queue instead of a normal first-in-first-out queue. Each item's priority is the cost of reaching it.

Let's work through an example before coding it up. We'll use our graph of cities from before, starting at Memphis.

Since we're already at Memphis, the cost to get there is 0. We don't know how to get anywhere else though, so those costs are \infty. We'll store all these costs in a table and initialize our priority queue.

City Cost
Atlanta \infty
Memphis 0
Mobile \infty
Nashville \infty
New Orleans \infty
Savannah \infty

First up in our queue, we've got Memphis. As with BFS, we'll start by looking at its neighbors.

The diagram represents initialized values of travel cost between cities. The cost to Memphis is equal to zero. Other values are unknown.

Looking at its neighbors, we can get to Atlanta (at a cost of 10), Mobile (cost is 7), Nashville (cost is 15) and New Orleans (cost is 3). So we'll update those travel costs in our table and our priority queue.

Here's what our table and priority queue look like now. To help keep track of things, we'll put a check mark next to nodes we've dequeued. So far, it's just Memphis.

City Cost
Atlanta 10
Memphis 0
Mobile 7
Nashville 15
New Orleans 3
Savannah \infty
The diagram represents the travel graph where Memphis is dequeued.

What next? Repeat!

Next in our priority queue we've got New Orleans, at a cost of 3. Let's remove it from the priority queue and examine its neighbors:

  • We've already dequeued Memphis, so we can skip it.
  • From New Orleans, we can get to Mobile at a cost of 3. Since it takes 3 to get from Memphis to New Orleans, the total cost of this path is 6. That's cheaper than the path we found earlier (where we went straight from Memphis to Mobile at a cost of 7), so we need to update its cost.

We'll also record the fact that we got to Mobile "via" New Orleans. This'll be important for figuring out which cities our cheapest route passes through at the end. (We could skip this step if we just wanted to calculate the lowest price, not the actual path.)

City Cost Via
Atlanta 10 Memphis
Memphis 0
Mobile 6 New Orleans
Nashville 15 Memphis
New Orleans 3 Memphis
Savannah \infty
The diagram represents the travel graph without Memphis and New Orleans.

Now we'll repeat the process again. The next city in our priority queue is Mobile at a cost of 6.

Looking at its neighbors:

  • Memphis and New Orleans have already been dequeued, so we can skip them.
  • We can get to Atlanta (2) via Mobile (6), for a total cost of 8. That's cheaper than our current path to Atlanta (10, via Memphis).
  • We can get to Savannah (6) via Mobile (6), for a total cost of 12. That's cheaper than our current cost (\infty, since we don't have a path yet).

Updating our priority queue and table:

City Cost Via
Atlanta 8 Mobile
Memphis 0
Mobile 6 New Orleans
Nashville 15 Memphis
New Orleans 3 Memphis
Savannah 12 Mobile
The diagram represents the travel graph without Memphis, New Orleans and Mobile.

Next up, Atlanta, at a cost of 8:

  • We can get to Savannah (1) via Atlanta (8), for a total cost of 9. That's cheaper than our current route via Mobile, which cost 12.
  • We can get to Nashville (2) via Atlanta (8), for a total cost of 10. That's cheaper than our current route via Memphis, which cost 15.
  • Memphis and Mobile have already been dequeued, so we can skip them.

Once we dequeue Atlanta, we know that we won't find a cheaper path later on. So, even though there are a few different ways to get to Atlanta, we don't have to consider the cost of all those paths; we just care about the cost of the shortest one.

Here's how things look now:

City Cost Via
Atlanta 8 Mobile
Memphis 0
Mobile 6 New Orleans
Nashville 10 Atlanta
New Orleans 3 Memphis
Savannah 9 Atlanta
The diagram represents the travel graph without Memphis, New Orleans, Mobile and Atlanta.

Almost there. Next city is Savannah, at a cost of 9.

We've already dequeued both of Savannah's neighbors, so we don't need to do anything.

City Cost Via
Atlanta 8 Mobile
Memphis 0
Mobile 6 New Orleans
Nashville 10 Atlanta
New Orleans 3 Memphis
Savannah 9 Atlanta
The diagram represents the travel graph without Memphis, New Orleans, Mobile, Atlanta and Savannah.

Finally, our last city is Nashville, at a cost of 10.

Again, both its neighbors have already been dequeued. We're done!

Here's what our final table looks like:

City Cost Via
Atlanta 8 Mobile
Memphis 0
Mobile 6 New Orleans
Nashville 10 Atlanta
New Orleans 3 Memphis
Savannah 9 Atlanta

Backtracking

Our table above stores the cost of getting to any city from Memphis, but it doesn't store the actual path. To find it, we'd start at that city and use the Via column to backtrack through the graph until we get to Memphis. This is the same way you recover paths after doing a breadth-first search.

Implementation

Here's an implementation in code. We'll split our table of data across a few different hash maps and a hash set.

import heapq def dijkstra(graph_adjacency_list, start_node): # Initially, it costs an unknown amount to get anywhere # other than the start_node (which we can get to for free) cost_to_get_to = { node : float('inf') for node in graph_adjacency_list } cost_to_get_to[start_node] = 0 # Track which nodes we've already dequeued -- we know # we've already found the shortest path to each of them # (This is the "checkmark" from our table) dequeued_nodes = set() # Priority queue ordering cities by the cost to get to them priority_queue = [] for node in graph_adjacency_list: heapq.heappush(priority_queue, (cost_to_get_to[node], node)) while len(priority_queue) > 0: # Dequeue the next node from our priority queue cheapest_cost, cheapest_node = heapq.heappop(priority_queue) dequeued_nodes.add(cheapest_node) # Update cost_to_get_to for neighboring nodes for neighbor, edge_cost in graph_adjacency_list[cheapest_node]: # Nodes we dequeued earlier can be skipped if neighbor in dequeued_nodes: continue # Can we get there more cheaply via this neighbor node? if cost_to_get_to[cheapest_node] + edge_cost < cost_to_get_to[neighbor]: # Update the cost to reach the node cost_to_get_to[neighbor] = cost_to_get_to[cheapest_node] + edge_cost # Update the cost to reach this node in the priority queue heapq.heapupdate(priority_queue, neighbor, cost_to_get_to[neighbor]) return cost_to_get_to

Our implementation doesn't keep track of how we reached each node, and we don't recover the cheapest paths. Take a stab at adding that feature to the implementation!

Time and Space Complexity

What's the time complexity?

  • Our initialization involves a constant amount of work per node, plus inserting nodes into priorityQueue will take time. (Priority queues are built on heaps, which have insertions) That's time overall for all the initialization work.
  • We'll go through the loop times, and each time we call removeMin, taking time (assuming a heap-based priority queue). Over the entire algorithm, that's time.
  • We'll update costInQueue once per edge, or times. Each priority queue update costs time. That's time overall.

Putting all the steps together, the time complexity for Dijkstra's algorithm is . Sometimes, this complexity is written .

What about space complexity? All our data structures hold a constant amount of data for each node in the graph. That's space in total.

What's next?

If you're ready to start applying these concepts to some problems, check out our mock coding interview questions.

They mimic a real interview by offering hints when you're stuck or you're missing an optimization.

Try some questions now

. . .