Syntax-Directed Code Generation

Section 9.3 Syntax-Directed Code Generation

Syntax-directed code generation is a very simple way of generating machine code from an abstract syntax tree. We traverse the abstract syntax tree recursively, generate code for each child of a node individually and combine these codes with code that comes from the node itself. This technique is very easy to implement because it can be done in a simple recursive AST traversal. However, the quality of the code is usually very low because the code generator lacks a “global view” of the code. To generate high-quality code, more advanced techniques are needed which are out of the scope of this text. Here, we only want to get a first impression of how higher-level languages are implemented with machine code.

To keep our code generator simple, we keep all local variables of a function in its stack frame and will use registers only when evaluating expressions. This is of course detrimental to performance because the code will have a lot of loads and stores to the memory but it will keep our code generator simple.

In principle, the code generator works in a similar way as the type elaboration routine that we discussed in the previous section: In the following, we define the following functions:

$\mathit{codeP}$ generates code for programs.
$\mathit{codeS}$ generates code for statements.
$\mathit{codeR}$ generates code the R-evaluation of an expression.
$\mathit{codeL}$ generates code the L-evaluation of an expression.

We define functions inductively over the abstract syntax of C0.

Subsection 9.3.1 Expressions

The functions $\mathit{codeL}$ and $\mathit{codeR}$ generate code for expressions. According to Section 6.4, only two elements of the abstract syntax of C0 are L-evaluable: the occurrence of a variable and the indirection expression. In our compiler, the L-evaluation of an expression yields a memory address. According to Definition 6.3.2 we model the R-evaluation of L-evaluable expression by L-evaluating them and then loading from the resulting address.

The function

\begin{equation*} \mathit{codeR}:(\Var\pto\Nat)\to\mathit{Regs}^\ast\to\Expr\to\Ty\to\mathit{Code} \end{equation*}

(and $\mathit{codeL}$ analogously) takes the following arguments:

A mapping that maps each variable's name to its offset in the stack frame.
A list of registers that can be used for evaluating the expression.
The expression to generate the code for.
The type of the expression as determined by the type checker in the last section.

It returns a piece of code the implements the evaluation of the expression. We set up the code such that the resulting value of the expression is always stored in the first register of the register list that we pass to $\mathit{codeR/L}\text{.}$ In the following, we are defining both functions recursively over the different elements of the abstract expression syntax of C0.

Constants.

The R-evaluation of a constant just places the constant in the first register of the register list.

\begin{equation*} \begin{prooftree} \AxiomC{} \LeftLabel{[CConst]} \UnaryInfC{$\codeR{(\rlist)}{\denot{c}\enspace\CInt}{c_\mathsf{const}}$} \end{prooftree} \end{equation*}

# c_const
li r, c

Variables.

The L-evaluation of a local variable produces the address of the variable in the stack frame. To this end, we use the function $\mathit{offs}$ to lookup the offset $d$ of the variable in the stack frame and add it to the stack pointer. This assumes that the stack frame is sufficiently large. We will ensure this in the code generation for statements by tracking the memory consumption and generating code that allocates a sufficiently large stack frame.

\begin{equation*} \begin{prooftree} \AxiomC{$\mathit{offs}\,x=d$} \LeftLabel{[CVar]} \UnaryInfC{$\codeL{(\rlist)}{\denot{x}\enspace k}{c_\mathsf{var}}$} \end{prooftree} \end{equation*}

# c_var
addi r, sp, d

L- to R-evaluation.

If an expression is L-evaluable, the R-evaluation of that expression consists of L-evaluating it, which yields the corresponding address and then loading from it. Here, we need the type information to select the appropriate load instruction: When loading a character, we need to load a byte and sign-extend it. When loading an integer or a pointer, we have to load a word. The following rule makes it possible to omit individual R-evaluation rules for L-evaluable expressions and use this rule to handle the R-evaluation for them.

\begin{equation*} \begin{prooftree} \AxiomC {$\codeL{(\rlist)}{\denot{l}\enspace k}{c}$} \noLine \UnaryInfC{$L = \text{load instruction for type }k$} \LeftLabel{[CLtoR]} \UnaryInfC{$\codeR{(\rlist)}{\denot{l}\enspace k}{c_{LtoR}}$} \end{prooftree} \end{equation*}

# c_LtoR
c
L r, r

Address-of and Indirection.

The operators $\CAddrOf{}$ and $\CIndir{}$ toggle between L- and R-evaluation. They do however not generate any code themselves!

\begin{equation*} \begin{prooftree} \AxiomC {$\codeR{(\rlist)}{\denot{e}\enspace\CPtr k}{c}$} \LeftLabel{[CIndir]} \UnaryInfC{$\codeL{(\rlist)}{\denot{\CIndir e}\enspace k}{c}$} \end{prooftree} \end{equation*}

\begin{equation*} \begin{prooftree} \AxiomC {$\codeL{(\rlist)}{\denot{l}\enspace k}{c}$} \LeftLabel{[CAddrOf]} \UnaryInfC{$\codeR{(\rlist)}{\denot{\CAddrOf l}\enspace\CPtr k}{c}$} \end{prooftree} \end{equation*}

Binary Expressions.

For binary operators, we show the case for arithmetic operators here, one of the two sub-expressions needs to be evaluated first. The register that was used to keep the value of that sub-expression cannot be used to evaluate the second sub-expression because the register has to hold the computed value throughout the evaluation of the second sub-expression. So, $\mathit{codeR}$ needs at least two registers to generate code that evaluates the expression. If the registers are exhausted, the compiler is allowed to report an error and abort compilation.

\begin{equation*} \begin{prooftree} \AxiomC {$\codeR{(r_1\lcc r_2\lcc rs)}{\denot{e_1}\enspace k_1}{c_1}$} \noLine \UnaryInfC{$\codeR{(r_2\lcc rs)}{\denot{e_2}\enspace k_2}{c_2}$} \LeftLabel{[CBinary]} \UnaryInfC{$\codeR{(r_1\lcc r_2\lcc rs)}{\denot{e_1\mathbin{o}e_2}\enspace k}{c_{Binary}}$} \end{prooftree} \end{equation*}

In principle, one could define an additional rule

\begin{equation*} \begin{prooftree} \AxiomC {$\codeR{(r_1\lcc r_2\lcc rs)}{\denot{e_2}\enspace k_1}{c_2}$} \noLine \UnaryInfC{$\codeR{(r_2\lcc rs)}{\denot{e_1}\enspace k_2}{c_1}$} \LeftLabel{[CBinary2]} \UnaryInfC{$\codeR{(r_1\lcc r_2\lcc rs)}{\denot{e_1\mathbin{o}e_2}\enspace k}{c_{Binary2}}$} \end{prooftree} \end{equation*}

which just swaps the order in which the sub-expressions are evaluated as can be seen in the code that they generate:

# c_Binary
c1
c2
o  r1, r1, r2

# c_Binary2
c2
c1
o  r1, r1, r2

In principle, having both rules [CBinary] and [CBinary2] would make our code generator non-determinstic: For each binary expression we could apply either rule. So, it would be nice to have a deterministic policy that tells us when to use which rule.

Looking at this closer, it turns out that the order in which the expressions are evaluated has an impact on the register consumption. Consider the following expression tree:

When evaluating the right sub-expression first in each expression, the code looks as follows

G r1
F r2
E r3
D r4
C r3 r4 r3
B r2 r3 r2
A r1 r2 r1

and will use four registers. When generating code for the left sub-expression first in each expression, we get code that looks like this

D r1
E r2
C r1 r1 r2
F r2
B r1 r1 r2
G r2
A r1 r1 r2

which uses only two registers. Now, fixing the order in which code for the expressions is generated a priori, can yield a sub-optimal register consumption. The question is if there is a way of determining the order in which to generate code for the expressions such that register consumption is minimized?

It turns out there is. The following observation is key: Consider a binary expression $e\text{.}$ Assume we knew the register consumption of both sub-expressions of $e\text{:}$ Say the code for the left sub-expression $e_1$ uses $\mathit{regs}(e_1)$ registers and the code for the right sub-expression uses $\mathit{regs}(e_2)$ registers. If $\mathit{regs}(e_1)=\mathit{regs}(e_2)\text{,}$ it doesn't matter which sub-expression we evaluate first because we need one register to keep the result of one sub-expression while evaluating the other. So, in total, we need $\mathit{regs}(e_1)+1$ registers to evaluate $e$ itself. However, if $\mathit{regs}(e_1)\ne\mathit{regs}(e_2)\text{,}$ then we evaluate the one with the higher register demand first, and then we can use one of the registers we used to evaluate that sub-expression that won't be used by the other in which we can keep the value. So, in conclusion: We should always generate code for the sub-expression first that has the higher register demand. We can determine the register demand recursively by the following function that implements the considerations above:

\begin{align*} \mathit{regs}(x) \amp = 1\\ \mathit{regs}(c) \amp = 1\\ \mathit{regs}(e_1\mathbin{o}e_2) \amp = \begin{cases} \mathit{regs}(e_1)+1\amp \text{if }\mathit{regs}(e_1)=\mathit{regs}(e_2) \\ \max(\mathit{regs}(e_1),\mathit{regs}(e_2)) \amp \text{otherwise} \end{cases} \end{align*}

One can show [17] that generating code this way results in minimal register consumption. In summary, our compiler can use the function $\mathit{regs}$ to break the non-determinism of [CBinary] and [CBinary2] and always select the one that leads to minimal register consumption.

Subsection 9.3.2 Statements

Let us consider code generation for statements which is handled by the function

\begin{equation*} \mathit{codeS}:(\Var\pto\Nat)\to\Stmt\to\mathit{Code} \end{equation*}

which takes the following parameters:

A mapping that maps each variable's name to its offset in the stack frame.
A C0 statement.

$\mathit{codeS}$ returns a piece of RISC-V code that implements the statement.

While and If.

The code of the while loop and the if-then-else is composed from the code that implements the bodies of the statements and the code that implements the R-evaluation of the conditional. Note that the labels in the machine code have to be unique, i.e. for each instance of a while loop or an if-then-else in the program, we have to use fresh label names.

\begin{equation*} \begin{prooftree} \AxiomC{$\codeR{(r\lcc rs)}{\denot e\enspace k}{c_1}$} \noLine \UnaryInfC{$\codeS{\denot{s}}{c_2}$} \LeftLabel{[CWhile]} \UnaryInfC{$\codeS{\denot{\CWhile es}}{c_\mathsf{While}}$} \end{prooftree} \end{equation*}

    # c_while
    j T
L:
    c2
T:
    c1
    bnez r, L

\begin{equation*} \begin{prooftree} \AxiomC{$\codeR{(r\lcc rs)}{\denot e\enspace k}{c}$} \noLine \UnaryInfC{$\codeS{\denot{s_1}}{c_1}$} \noLine \UnaryInfC{$\codeS{\denot{s_2}}{c_2}$} \LeftLabel{[CIf]} \UnaryInfC{$\codeS{\denot{\CIf e{s_1}{s_2}}}{c_\mathsf{If}}$} \end{prooftree} \end{equation*}

    # c_if
    c
    beqz r, F
    c1
    j N
F:
    c2
N:

Assignment.

For the assignment, we have to evaluate the left-hand and right-hand side expressions. The evaluation of the left-hand side yields an address and the right-hand side a value. After both expressions have been evaluated, we need to store the value into the stack slot at the address we got. To this end, we need the appropriate store that stores a value of type $k_1$ which is the type of $l\text{.}$ Note also that we have freedom in which sub-expression (left or right) we evaluate first and the same reasoning as for the binary arithmetic expression above applies here. For the sake of brevity we are only giving one rule here.

\begin{equation*} \begin{prooftree} \AxiomC{$\codeL{(r_1\lcc r_2\lcc rs)}{\denot l\enspace k_1}{c_1}$} \noLine \UnaryInfC{$\codeR{(r_2\lcc rs)}{\denot e\enspace k_2}{c_2}$} \noLine \UnaryInfC{$S=\text{store that stores value of type }k_1$} \LeftLabel{[CAssign]} \UnaryInfC{$\codeS{\denot{\CAssign le}}{c_\mathsf{Assign}}$} \end{prooftree} \end{equation*}

# c_assign
c1
c2
S r2, (r1)

Blocks.

The code for a block is the code for the constituent program. We discuss code generation for programs in Subsection 9.3.3. In addition, the block rule takes care of allocating stack slots for the local variables declared in the block. The number $\delta$ is the maximum offset assigned in the current stack slot assignment $\mathit{offs}\text{.}$ The code for the constituent statements is generated with an extended stack slot assignment $\mathit{offs}'$ that has room for the local variables declared in the block.

\begin{equation*} \begin{prooftree} \AxiomC{$\delta=\max\{\mathit{offs}(x)\mid x\in\domain\mathit{offs}\}$} \noLine \UnaryInfC{$\mathit{offs}'=\mathit{offs}[x_i\mapsto\delta+4i \text{ for }1\le i\le m]$} \noLine \UnaryInfC{$\codePO{\mathit{offs'}}{\denot{p}}{c}$} \LeftLabel{[CBlock]} \UnaryInfC{$\codeS{\denot{\CBlock{k_1\ x_1;\dots;k_m\ x_m;\ p}}}{c_\mathsf{Block}}$} \end{prooftree} \end{equation*}

# c_block
c

Subsection 9.3.3 Programs

Generating code for the program part of a C0 program is simple. The empty program has no associated code. The sequence operator just prepends the code generated for the statement to the code generated for the program.

\begin{equation*} \begin{prooftree} \AxiomC{} \LeftLabel{[CTerm]} \UnaryInfC{$\codeP{\denot{\CTerm}}{c_\mathsf{Term}}$} \end{prooftree} \end{equation*}

# c_term

\begin{equation*} \begin{prooftree} \AxiomC{$\codeS{\denot{s}}{c_s}$} \AxiomC{$\codeP{\denot{p}}{c_p}$} \LeftLabel{[CSeq]} \BinaryInfC{$\codeP{\denot{\CSeq sp}}{c_\mathsf{Seq}}$} \end{prooftree} \end{equation*}

# c_seq
c_s
c_p

Subsection 9.3.4 The Function Prologue and Epilogue

We'll use the code generation functions for expressions and statements that we have devised in the previous two subsections to generate code for an entire function. Our formalization of C0 does not know functions but we are discussing a small extension of the language here nevertheless. The code $c$ of the body $p$ of a function $f$ with parameters $k_1\,x_1;\dots;k_m\,x_m;$ can be generated by considering the function's body wrapped in a block where parameters are declared as local variables:

\begin{equation*} c = \mathit{codeS}\enspace\emptyset\enspace\denot{{\CBlock{k_1\,x_1;\dots;k_m\,x_m;\,p}}} \end{equation*}

The body code $c$ now has to extended by a prologue and epilogue that sets up the stack frame and saves the return register:

f:
    .globl
    addi  sp, sp, -(S+4)
    sw    ra, S(sp)
    sw    a0, 0(sp)
    ...
    sw    an, 4n(sp)
    c     # body code
f_end:
    lw    ra, S(sp)
    addi  sp, sp, S+4
    ret

For the sake of simplicity, we are only considering functions that have no more than four parameters which is the maximum number of parameters passed in registers. Functions with more parameters use the stack to pass the additional arguments which would complicate our code slightly. $S$ is the maximum size of the stack frame, i.e. the maximum offset used in rule [CVar] plus 4. Here, the stack frame is four more bytes large because we also have to save the return address. Before placing the body code $c\text{,}$ we save the argument registers into the stack frame so that they can be accessed like local variables.

An Introduction to Imperative Programming

Search Results: