The Tortoise and the Hare

Section 3.5 The Tortoise and the Hare

In this section, we discuss an algorithm that checks if a linked list ends in a cycle. Linked lists are a another way of representing lists in a computer and differ from arrays in the following way: A linked list is a list in which each element contains the address of the next element of a list.

If a linked list data structure is set up incorrectly, it may exhibit a cycle, i.e. a later element points to an earlier element in the list. In that case, the list is no list anymore but looks more like a “lasso”.

A straightforward algorithm to check if a linked list ends in a lasso is to traverse the list and keep a set of elements on the side to record the elements that have already been visited. If an element is visited that is already contained in the set, we have a cycle. However, this technique needs an additional data structure to implement this set (another list, for example). Furthermore, its runtime depends on the insert and lookup operations of the set.

The tortoise and the hare technique achieves the same goal without an additional data structure but just two pointers: The tortoise and the hare. The technique simply traverses the list. In each step, the tortoise advances one list element and the hare advances two elements. Now, the claim is that the tortoise and the hare meet (both pointers are equal) if and only if the list ends in a cycle. It is straightforward that they never meet if there is no cycle. But why do they meet if there is a cycle?

Assume that the list has \(n\) elements and ends in a cycle of \(l\) nodes. So, after \(n-l\) steps, the tortoise reaches element \(0\) of the cycle. This element is the only element to which a pointer from outside the cycle points to. Assume that the hare is at element \(d\) of the cycle. Now, in each step the distance from hare to tortoise will decrease by one. (The hare is in fact moving one element away from the tortoise in each step, but because they run on the cycle, it is actually moving closer.) So, after \(l-d\) steps they stand on the same element.

The following RISC-V code implements this algorithm. It assumes that each list element contains some payload data of four bytes (maybe a word) and a pointer to the next element at offset 4.

     .text
    .globl tortoise_and_hare

    # a0: address of first element
tortoise_and_hare:
    # a0 Hase
    # a1 Igel
    mv    a1, a0
loop:
    beqz  a0, not_cyclic
    lw    a0, 4(a0)
    beqz  a0, not_cyclic
    lw    a0, 4(a0)
    lw    a1, 4(a1)
    beq   a0, a1, cyclic
    j     loop
cyclic:
    li    a1, 1
    jr    ra
not_cyclic:
    li    a1, 0
    ret

An Introduction to Imperative Programming

Search Results:

Section 3.5 The Tortoise and the Hare