The Assembler

Section 2.3 The Assembler

We have seen in the last section that the instructions that a processor executes are just bit strings that reside in memory. Because we do not want to encode these bit strings by hand, we use an assembler. An assembler translates a textual representation of machine code into binary code, i.e. the bit strings of the instructions. To this end, the assembler reads a file that contains machine code instructions in their textual form together with directives that steer the assembly process and influence the shape of the output. The output of the assembler is a binary file (sometimes called an object file). This file contains the binary-encoded machine code instructions, data for the data segment, and meta-data for the linker, which binds together multiple object files into an executable program. Figure 2.3.1 shows this process schematically. As indicated in Figure 2.3.1, a program can consist of multiple assembly files. Each such file is called a translation unit because each one is “translated” separately.

Figure 2.3.1. Translation process for assembly code. The assembler translates each source file to object code and the linker binds the object files together to form an executable.

Subsection 2.3.1 Overview

Let us explore the ingredients of an assembly code file by means of the example in Listing 2.3.2. The listing shows seven instruction words that are located somewhere in the memory of the computer. The left column shows the instruction words as hexadecimal 32-bit numbers. The remaining columns show the textual representation of the respective instruction word.

0x00100593		addi x11, x0,  1
0x00C0006F		jal  x0,  12
0x02A585B3		mul  x11, x11, x10
0xFFF50513		addi x10, x10, -1
0xFEA04CE3		blt  x0,  x10, -8
0x00058513		addi x10, x11, 0
0x00008067		jalr x0,  x1,  0

Listing 2.3.2. Instruction words in hex and assembler instructions of a function that computes the factorial of a number given in register 10.
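To see how the hexadecimal words in Listing 2.3.2 relate to the text next to them, the following Python sketch takes apart the three addi words. It assumes the standard RISC-V I-type field layout (opcode in bits 6..0, destination register in bits 11..7, source register in bits 19..15, immediate in bits 31..20), which is not restated in this section. The function name decode_i_type is only chosen for this illustration.

```python
def decode_i_type(word):
    """Split a 32-bit I-type instruction word into its fields."""
    opcode = word & 0x7F            # bits 6..0
    rd = (word >> 7) & 0x1F         # bits 11..7: destination register
    funct3 = (word >> 12) & 0x7     # bits 14..12: selects the operation
    rs1 = (word >> 15) & 0x1F       # bits 19..15: source register
    imm = (word >> 20) & 0xFFF      # bits 31..20: 12-bit immediate
    if imm & 0x800:                 # sign-extend the immediate
        imm -= 0x1000
    return opcode, rd, funct3, rs1, imm


# The three addi words of Listing 2.3.2.
for word in (0x00100593, 0xFFF50513, 0x00058513):
    opcode, rd, funct3, rs1, imm = decode_i_type(word)
    assert opcode == 0x13 and funct3 == 0   # this combination encodes addi
    print(f"0x{word:08X}  addi x{rd}, x{rs1}, {imm}")
```

The printed lines should reproduce the right-hand side of the corresponding rows of the listing.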

The second column contains the so-called mnemonic, a short textual name for the opcode, i.e. the operation the instruction stands for, such as mul for multiplication or addi for “add an immediate to a register”. The remaining columns give the operands of the instruction. Registers are prefixed by an x, e.g. x10 refers to register 10. Numbers without an x represent immediates. So, addi x11, x0, 1 adds the value 1 to the contents of register 0 and stores the result in register 11. Effectively, this instruction places the value 1 into register 11 because register 0 always reads as zero.

The instructions jal, jalr and blt are control flow instructions that alter the value of the program counter. For example, blt x0, x10, -8 tests whether the value in register 0 (which is always 0) is less than the value in register 10, i.e. whether the value in register 10 is greater than zero. If this is the case, it adds -8 to the program counter, which moves the program counter two instructions back in the instruction stream to the mul instruction. (Note that each instruction occupies four bytes.) If this is not the case, the program counter advances normally to the next instruction and the branch has no other effect.

A value that is added to an address (like the -8 that is added to the program counter in this example) is called an offset. Branches that take offsets are called relative branches whereas branches that take addresses as operands are called absolute branches. jalr is an absolute branch: it sets the program counter to the sum of the contents of its second operand register and its immediate operand. A branch instruction that branches on a condition, like blt in this example, is called a conditional branch. Sometimes the term branch is reserved for conditional branches, and unconditional branches are called jumps. The instruction jal x0, 12 is an unconditional branch: it adds 12 to the program counter.
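The effect of the offsets on the program counter can be traced with a small simulation. The following Python sketch is a simplified model, not a full RISC-V simulator: it knows only the five mnemonics of Listing 2.3.2, places the first instruction at address 0, ignores 32-bit overflow, and uses a made-up caller address (CALLER) as the value of x1. All names in it are chosen for this illustration.

```python
# The seven instructions of Listing 2.3.2; list index * 4 is the address.
PROGRAM = [
    ("addi", 11, 0, 1),
    ("jal", 0, 12),
    ("mul", 11, 11, 10),
    ("addi", 10, 10, -1),
    ("blt", 0, 10, -8),
    ("addi", 10, 11, 0),
    ("jalr", 0, 1, 0),
]

CALLER = 1000  # hypothetical return address held in x1


def run_fact(n):
    """Run the listing with x10 = n; return x10 and the visited addresses."""
    x = [0] * 32
    x[10] = n
    x[1] = CALLER
    pc = 0
    trace = []
    while pc != CALLER:
        trace.append(pc)
        op, *args = PROGRAM[pc // 4]
        next_pc = pc + 4            # default: advance to the next instruction
        rd, value = 0, 0
        if op == "addi":
            rd, rs1, imm = args
            value = x[rs1] + imm
        elif op == "mul":
            rd, rs1, rs2 = args
            value = x[rs1] * x[rs2]
        elif op == "blt":
            rs1, rs2, offset = args
            if x[rs1] < x[rs2]:     # relative branch: add offset to pc
                next_pc = pc + offset
        elif op == "jal":
            rd, offset = args
            value, next_pc = pc + 4, pc + offset
        elif op == "jalr":
            rd, rs1, imm = args     # absolute: target comes from a register
            value, next_pc = pc + 4, x[rs1] + imm
        if rd != 0:                 # writes to x0 have no effect
            x[rd] = value
        pc = next_pc
    return x[10], trace


result, trace = run_fact(2)
print(result, trace)
```

In the trace, address 16 (the blt) is followed by address 8 (the mul) whenever the branch is taken, and by address 20 otherwise.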

Remark 2.3.3. Pseudo instructions.

It may seem like a crude “hack” to use an addition with a register that always reads zero just to place a constant into a register. However, it saves us from introducing a dedicated instruction for that purpose. This way, the instruction set stays small and clean, which is part of the RISC (reduced instruction set computer) paradigm. The assembler, however, offers pseudo instructions such as li (load immediate = put an immediate into a register) or j (jump) that are just abbreviations for commonly used operations that can be expressed with the actual instruction set only in a less straightforward way.

Remark 2.3.4. Jumps and Branches.

Despite all the pseudo instructions, RISC-V has only two real jump instructions: jal and jalr. Both are unconditional jumps. jal stands for “jump and link” and jalr stands for “jump and link register”. jal takes a register and an offset as arguments. It stores the address of the instruction that directly follows it into the register and adds the offset to the program counter. jalr takes two registers and an immediate. Like jal, it stores the address of the instruction that directly follows it into the first register. Then, it sets the program counter to the sum of the contents of the second register and the immediate. Because jalr takes its jump target from a register, it is called an indirect jump. The purpose of storing the address of the next instruction is that both instructions can then be used to realise function calls (see Section 2.8). If the address of the next instruction is not needed, we use x0 as the first operand because writes to x0 have no effect.
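The description of jal and jalr can be written down as two small Python functions. This is a simplified sketch under our own assumptions: the register file is a plain list, the addresses 0x100 and 0x40 are made up, and details of the real instructions that this remark does not mention are left out.

```python
def jal(pc, regs, rd, offset):
    """Jump and link: store the next address in rd, then add offset to pc."""
    if rd != 0:                 # writes to x0 have no effect
        regs[rd] = pc + 4
    return pc + offset


def jalr(pc, regs, rd, rs1, imm):
    """Jump and link register: the target is regs[rs1] + imm."""
    target = regs[rs1] + imm    # read rs1 first; rd may be the same register
    if rd != 0:
        regs[rd] = pc + 4
    return target


regs = [0] * 32
pc = 0x100                                   # hypothetical address of a jal
pc = jal(pc, regs, rd=1, offset=0x40)        # "call": x1 receives 0x104
callee = pc
pc = jalr(pc, regs, rd=0, rs1=1, imm=0)      # "return": jump back to x1 + 0
print(hex(callee), hex(regs[1]), hex(pc))
```

After the jalr, the program counter holds the address of the instruction right after the original jal, which is what a function return needs.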

In addition to the unconditional jumps, RISC-V has a number of conditional branches. All of them take two registers and an offset. They compare the registers according to some criterion and if the criterion is met, they add the offset to the program counter. The conditional branch instructions are beq (branch if equal), bne (branch if not equal), blt (branch if less than), bge (branch if greater or equal), bltu (branch if less than unsigned) and bgeu (branch if greater or equal unsigned).
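The six branch criteria differ only in the comparison they perform. The following Python sketch models them as a table of predicates, assuming 32-bit registers. The helper names to_signed, to_unsigned and branch are chosen for this illustration. The signed and unsigned variants interpret the same bit pattern differently, which the example at the end makes visible.

```python
MASK = 0xFFFFFFFF  # we assume 32-bit registers


def to_unsigned(v):
    return v & MASK


def to_signed(v):
    v &= MASK
    return v - (1 << 32) if v & 0x80000000 else v


BRANCH_TAKEN = {
    "beq":  lambda a, b: to_unsigned(a) == to_unsigned(b),
    "bne":  lambda a, b: to_unsigned(a) != to_unsigned(b),
    "blt":  lambda a, b: to_signed(a) < to_signed(b),
    "bge":  lambda a, b: to_signed(a) >= to_signed(b),
    "bltu": lambda a, b: to_unsigned(a) < to_unsigned(b),
    "bgeu": lambda a, b: to_unsigned(a) >= to_unsigned(b),
}


def branch(op, pc, a, b, offset):
    """Return the next program counter for a conditional branch."""
    return pc + offset if BRANCH_TAKEN[op](a, b) else pc + 4


# 0xFFFFFFFF is -1 when read as signed, but the largest value when unsigned.
print(branch("blt", 16, 0xFFFFFFFF, 1, -8))   # taken: -1 < 1
print(branch("bltu", 16, 0xFFFFFFFF, 1, -8))  # not taken
```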

Subsection 2.3.2 An Example

Computing offsets for relative branches by hand is tedious and error-prone. Even worse, if one inserts new code between the branch and its target, the offset has to be recalculated. One of the prime tasks of an assembler is therefore to provide labels. Every entity in an assembler file that will be put somewhere in memory (such as instructions but also static data, see Section 2.5), and in general every address in a segment, can be marked with a label that stands for the address of the entity. One can refer to these labels at various places (such as in the operand list of relative branch instructions) and the assembler automatically computes the appropriate offsets for us. Listing 2.3.5 shows the assembly file from which the assembler produced the binary code shown in Listing 2.3.2.

    .text
    .globl fact
fact:
    li    a1, 1      # load 1 into register a1
    j     check      # jump to label check
loop:
    mul   a1, a1, a0 # multiply a1 with a0 and put
                     # result into a1
    addi  a0, a0, -1 # Subtract 1 from a0
check:
    bgtz  a0, loop   # branch to loop, if a0 > 0
    mv    a0, a1     # copy the result from a1 to a0
    ret              # return to caller

Listing 2.3.5. A function that calculates the factorial of a number. This number is expected to be in register a0 when the function is invoked.

A label is defined by writing its name followed by a colon. fact, loop, and check are all labels. Every other occurrence of one of these names refers to the corresponding label. The address a label stands for is the address at which the instruction that follows the label will be placed in memory when the program is loaded.
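How an assembler might turn labels into offsets can be sketched in two passes: the first pass records the address of every label, the second replaces each label operand by the difference between the label's address and the address of the instruction that uses it. The following Python sketch is a toy model, not the algorithm of any particular assembler. It assumes that every line is either a label or a single four-byte instruction, that the code starts at address 0, and that a label only appears as the last operand.

```python
SOURCE = """
fact:
    li    a1, 1
    j     check
loop:
    mul   a1, a1, a0
    addi  a0, a0, -1
check:
    bgtz  a0, loop
    mv    a0, a1
    ret
"""


def resolve_labels(source):
    # Pass 1: assign an address to every instruction and every label.
    labels = {}
    instructions = []
    address = 0
    for line in source.splitlines():
        line = line.split("#")[0].strip()   # drop comments and whitespace
        if not line:
            continue
        if line.endswith(":"):
            labels[line[:-1]] = address     # label = address of what follows
        else:
            instructions.append((address, line))
            address += 4                    # assumption: 4 bytes each

    # Pass 2: replace label operands by offsets relative to the instruction.
    resolved = []
    for address, text in instructions:
        mnemonic, _, operands = text.partition(" ")
        ops = [o.strip() for o in operands.split(",")] if operands.strip() else []
        if ops and ops[-1] in labels:
            ops[-1] = str(labels[ops[-1]] - address)
        resolved.append((mnemonic + " " + ", ".join(ops)).strip())
    return labels, resolved


labels, resolved = resolve_labels(SOURCE)
print(labels)
for line in resolved:
    print(line)
```

The offsets computed for j and bgtz are the 12 and -8 that appear in Listing 2.3.2.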

Another difference between the assembly code in Listing 2.3.2 and Listing 2.3.5 is that the latter uses other names for registers, as is customary. Instead of referring to register 10 as x10, we write a0 here. These other names come from the so-called calling convention, a set of rules that we use to assign specific roles to registers when calling functions, which we will discuss in Section 2.8.

Additionally, the code above uses a few new instructions, all of which are pseudo-instructions, i.e. practical shorthands for more complex instructions. See Section A.1 for a detailed list of pseudo-instructions.
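The pseudo-instructions of Listing 2.3.5 and the instructions they stand for in Listing 2.3.2 can be put side by side. The following Python sketch covers only the five pseudo-instructions used here and only the four register names that occur. The function expand and the table ABI_TO_X are names chosen for this illustration. The expansion of li shown here is only valid for small immediates.

```python
# Register names used in Listing 2.3.5 and their numeric counterparts.
ABI_TO_X = {"zero": "x0", "ra": "x1", "a0": "x10", "a1": "x11"}


def expand(mnemonic, *ops):
    """Rewrite one pseudo-instruction as a real instruction (toy subset)."""
    ops = [ABI_TO_X.get(o, o) for o in ops]
    if mnemonic == "li":        # li rd, imm (small immediates only)
        return ("addi", ops[0], "x0", ops[1])
    if mnemonic == "mv":        # mv rd, rs
        return ("addi", ops[0], ops[1], "0")
    if mnemonic == "j":         # j offset
        return ("jal", "x0", ops[0])
    if mnemonic == "bgtz":      # bgtz rs, offset: branch if 0 < rs
        return ("blt", "x0", ops[0], ops[1])
    if mnemonic == "ret":       # return to the address in x1
        return ("jalr", "x0", "x1", "0")
    return (mnemonic, *ops)     # real instructions pass through unchanged


for instr in [("li", "a1", "1"), ("j", "12"), ("mul", "a1", "a1", "a0"),
              ("bgtz", "a0", "-8"), ("mv", "a0", "a1"), ("ret",)]:
    print(" ".join(instr), "->", " ".join(expand(*instr)))
```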

Furthermore, Listing 2.3.5 shows some examples of assembler directives. .text is a directive that indicates that everything that follows is code. It switches the assembler into the mode that allows us to write mnemonics and register names instead of hand-coding instructions. .text also tells the linker later on that everything that follows has to be placed into the code (sometimes called text) segment. We will discuss segments in Section 2.6. .globl fact makes the label fact visible to other translation units (see Figure 2.3.1). Labels that are not declared global are local to a translation unit and cannot be referred to from other translation units. This prevents the clashes that would occur if all label names were global and programmers accidentally used the same label name in different files. Note also the use of pseudo instructions as discussed in Remark 2.3.3.

Finally, let us give a main program that calls our factorial function. By convention, program execution starts at the instruction with the label main. Our main program shall compute the factorial of 10, display the result on the console, and terminate the program.

.globl main
main:
    li   a0, 10      # load 10 into register a0
    call fact        # call function fact
    mv   a1, a0      # result is in a0; move to a1
    li   a0, 1       # load ecall number 1 into a0
    ecall            # call system to output
    li   a0, 10      # load ecall number 10 into a0
    ecall            # call system to end program

Listing 2.3.6. A main program for our factorial function

Our main program calls the operating system to perform input and output operations. Here, we do not go into details about operating systems. For our purposes, it is sufficient to accept that there is some system that we can interact with using the ecall instruction. We specify what exactly we want the operating system to do by putting a certain number into the a0 register. Here, 1 means “print the number stored in a1 as a decimal number on the console” and 10 means “terminate the program”.
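The way ecall selects a service by the number in a0 can be modelled as a simple dispatch. The following Python sketch is only an illustration of the two services described above and not a description of any real operating system interface. The function name ecall, the dictionary of registers and the list that stands in for the console are all our own assumptions.

```python
def ecall(regs, console):
    """Model of the two services used in Listing 2.3.6.

    Returns True if the program keeps running, False if it terminates.
    """
    service = regs["a0"]
    if service == 1:                      # print a1 as a decimal number
        console.append(str(regs["a1"]))
        return True
    if service == 10:                     # terminate the program
        return False
    raise ValueError(f"service {service} is not modelled in this sketch")


console = []
regs = {"a0": 1, "a1": 3628800}           # a1 holds the factorial of 10
running = ecall(regs, console)            # first ecall: output
regs["a0"] = 10
stopped = not ecall(regs, console)        # second ecall: end of program
print(console, running, stopped)
```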

Remark 2.3.7. Program Termination.

Why do we explicitly need to terminate the program with an ecall? Listing 2.3.6 suggests that it is clear that the program ends after the last instruction. However, the instructions of our program are just some bytes in memory and behind these bytes there are other bytes that don't belong to our program. So our processor doesn't “see” the end of our program like we do in the listing above. Therefore, we have to hand back control to the operating system when our program has ended.

There is one other new instruction in Listing 2.3.6: call. call is a pseudo-instruction for jal (jump and link). A jal instruction takes two arguments: a register and an immediate. jal adds the immediate to the program counter and sets the program counter to the resulting address. Before doing so, it stores the address of the instruction that follows the jal into the register given as the first argument. In the case of the call pseudo-instruction, the first argument is always register x1, which is typically used to hold the return address upon a function call. The return address is the address to which the called function will want to return. This is why register x1 is also called ra (= return address).

An Introduction to Imperative Programming