You have a linked list and want to find the kth to last node.
Write a function kthToLastNode that takes an integer k and the headNode of a singly-linked list, and returns the kth to last node in the list.
We can do this in time.
We can do this in space. If you're recursing, you're probably taking space on the call stack!
It might be tempting to iterate through the list until we reach the end and then walk backward k nodes.
But we have a singly linked list! We can’t go backward. What else can we do?
What if we had the length of the list?
Then we could calculate how far to walk, starting from the head, to reach the kth to last node.
If the list has n nodes:
And our target is the kth to last node:
The distance from the head to the target is n-k:
Well, we don't know the length of the list (n). But can we figure it out?
Yes, we could iterate from the head to the tail and count the nodes!
Can you implement this approach in code?
What are our time and space costs?
time and space, where n is the length of the list.
More precisely, it takes n steps to get the length of the list, and another n-k steps to reach the target node. In the worst case k=1, so we have to walk all the way from head to tail again to reach the target node. This is a total of 2n steps, which is .
Can we do better?
The fact that we walk through our whole list once just to get the length, then walk through the list again to get to the kth to last element sounds like a lot of work. Perhaps we can do this in just one pass?
What if we had like a "stick" that was k nodes wide. We could start it at the beginning of the list so that the left side of the stick was on the head and the right side was on the kth node.
Then we could slide the stick down the list...
until the right side hit the end. At that point, the left side of the stick would be on the kth to last node!
How can we implement this? Maybe it'll help to keep two pointers?
We can allocate two variables that'll be references to the nodes at the left and right sides of the "stick"!
This'll work, but does it actually save us any time?
We can think of this two ways.
First approach: use the length of the list.
Second approach: maintain a k-wide "stick" in one walk down the list.
In either approach, make sure to check if k is greater than the length of the linked list! That's bad input, and we'll want to raise an error.
Both approaches use time and space.
But the second approach is fewer steps since it gets the answer "in one pass," right? Wrong.
In the first approach, we walk one pointer from head to tail (to get the list's length), then walk another pointer from the head node to the target node (the kth to last node).
In the second approach, rightNode also walks all the way from head to tail, and leftNode also walks from the head to the target node.
So in both cases, we have two pointers taking the same steps through our list. The only difference is the order in which the steps are taken. The number of steps is the same either way.
However, the second approach might still be slightly faster, due to some caching and other optimizations that modern processors and memory have.
Let's focus on caching. Usually when we grab some data from memory (for example, info about a linked list node), we also store that data in a small cache right on the processor. If we need to use that same data again soon after, we can quickly grab it from the cache. But if we don't use that data for a while, we're likely to replace it with other stuff we've used more recently (this is called a "least recently used" replacement policy).
Both of our algorithms access a lot of nodes in our list twice, so they could exploit this caching. But notice that in our second algorithm there's a much shorter time between the first and second times that we access a given node (this is sometimes called "temporal locality of reference"). Thus it seems more likely that our second algorithm will save time by using the processor's cache! But this assumes our processor's cache uses something like a "least recently used" replacement policy—it might use something else. Ultimately the best way to really know which algorithm is faster is to implement both and time them on a few different inputs!
Can we do better? What if we expect n to be huge and k to be pretty small? In this case, our target node will be close to the end of the list...so it seems a waste that we have to walk all the way from the beginning twice.
Can we trim down the number of steps in the "second trip"? One pointer will certainly have to travel all the way from head to tail in the list to get the total length...but can we store some "checkpoints" as we go so that the second pointer doesn't have to start all the way at the beginning? Can we store these "checkpoints" in constant space? Note: this approach only saves time if we know that our target node is towards the end of the list (in other words, n is much larger than k).
We listed two good solutions. One seemed to solve the problem in one pass, while the other took two passes. But the single-pass approach didn't take half as many steps, it just took the same steps in a different order.
So don't be fooled: "one pass" isn't always fewer steps than "two passes." Always ask yourself, "Have I actually changed the number of steps?"
Powered by qualified.io