Skip to main content

Section 3.2 The Sieve of Eratosthenes

The sieve of Eratosthenes is an algorithm to enumerate all prime numbers smaller than a given number \(N\text{.}\) It uses a table with entries for each number from 2 to \(N-1\) and successively marks numbers in that table that are not prime. In the beginning, no table entry is marked.

To implement the table, we will use an array. An array is a data structure that stores a list of elements by putting them one after another in memory. Here, the list is the table of entries of 0 and 1 to indicate that a number is marked. Arrays have the appealing property that we can address an element in constant time by simply computing its address: If \(a\) is the address of element 0, we can simply compute the address element \(i\) by \(a+i\cdot s\) where \(s\) is the size of an element in bytes (of course all elements have to be of the same “type” and therefore have to have the same size in order for this to work).

Note that we will use an array from \(0\) to \(N-1\) because it makes indexing simpler at the expense of two unused entries (0 and 1).

The idea of the algorithm is simple: We look at each number from 2 to \(N-1\) successively. If number \(q\) is not marked yet, we mark all its multiples \(2q, 3q, \dots, \) smaller than \(N\text{.}\) For example, assume \(N=14\text{.}\) When looking at table entry 3, we need to mark 6, 9, 12. All numbers not marked are prime.

Let's think about why this algorithm is correct. If we arrive at \(q\) we know, because we considered all numbers smaller than \(q\) before, that the multiples of each number \(i\lt q\) have been marked. So if \(q\) is unmarked, it means that it does not divisible by any number bigger than 1. Therefore, \(q\) is prime.

The marking process can be sped up by the following observation: All multiples of \(q\) that contain a prime factor \(p\) less than \(q\) have already been marked when we considered \(p\text{.}\) So the smallest unmarked number is \(q^2\) and it is sufficient to start marking there. Consequently, there is no sense in marking multiples of a number \(q'\) with \(q'\cdot q'>N\text{.}\)

Running time.

Analyzing the running time of the optimized procedure is interesting. Let \(M\) be the greatest integer such that \(M^2\lt N\text{.}\) For each \(1\lt i\lt M\) we are marking all multiples greater or equal to \(i^2\) and smaller than \(N\text{.}\) So the overall number of markings performed is

\begin{align*} \sum_{i=2}^M \lfloor (N-i^2)/i\rfloor \amp \lt \sum_{i=1}^M(N-i^2)/i\\ \amp = N\sum_{i=1}^M \frac 1i-\sum_{i=1}^M i\\ \amp \lt N\ln M-\frac{M(M+1)}2\lt N\ln N \end{align*}

In the last step, we have exploited that \(1+\ln n\) is an upper bound to the \(n\)-th harmonic number \(1+\frac 12+\cdots+\frac 1n\text{.}\)

Let us finally give the implementation of the sieve of Erastothenes in MIPS assembly

 	.text
eratosthenes:
	.globl eratosthenes
	subu   $a2 $a1 $a0
	move   $t0 $a0

	# initialize tables with 0s
	b      zero_check
zero_loop:
	sb     $0 0($t0)
	addiu  $t0 $t0 1
zero_check:
	bne    $t0 $a1 zero_loop
	
	# we start with q=2
	li     $t0 2
era_loop:
	mul    $t9 $t0 $t0  # marking starts at q*q
	bge    $t9 $a2 end  # we don't need to mark
	                    # if q*q > N
	addu   $t1 $a0 $t0  # compute address of entry q
	lb     $t1 ($t1)    # load entry
	bnez   $t1 no_prime # if marked, no prime, no marking
	
	# here, q (in $t0) is a prime
	# mark all multiples of q beginning from q*q
	addu   $t1 $a0 $t9  # compute address of entry of q*q
	li     $t8 1        # put a 1 into a register
	b      mark_check
mark_loop:	          # loop to mark all multiples
	sb     $t8 ($t1)
	addu   $t1 $t1 $t0
mark_check:
	blt    $t1 $a1 mark_loop

no_prime:
	addiu  $t0 $t0 1    # set $t0 to q + 1
	b      era_loop
end:
	jr     $ra
Run