Section 2.5 The Data Segment
In addition to the code, programs also contain static data. The term “static” means that the extent (the size) of the data is known statically, that is, prior to the program's execution, and does not vary during execution. The memory for static data can therefore be provisioned during assembly and linking and allocated directly when the program is loaded into memory. As we will see later in Section 2.6, the memory the program uses is organized into segments that each serve a distinct purpose. The data segment is the segment that holds the static data.
Static data is commonly initialized, i.e., it is set to certain values before the program starts. A typical example of such data is strings (character sequences, i.e., text) that the program uses. If a program prints out a message, that message has to be kept somewhere.
In the assembler file, the .data directive starts the declaration of static data. Use .space n to reserve \(n\) uninitialized bytes. To reference this memory later on, you can put a label in front of the directive like so:
.data
some_bytes:
.space 1000
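The label is what makes the reserved memory usable from code. The following is a minimal sketch, not taken from the original text: the label buffer and the register choices are hypothetical, and it assumes an assembler that supports the la and li pseudo-instructions (as MARS and SPIM do).

```mips
        .data
buffer:                     # hypothetical label for the reserved block
        .space 16           # reserve 16 bytes

        .text
        .globl main
main:
        la   $t0, buffer    # $t0 = address of the first reserved byte
        li   $t1, 42        # value to store
        sb   $t1, 0($t0)    # write 42 into byte 0 of the block
        lbu  $t2, 0($t0)    # read the same byte back into $t2
```

Byte accesses are used here on purpose: they work at any address, so the sketch does not depend on where the assembler places buffer.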
To make allocating static data more comfortable, there are a couple of directives that create static data of different sizes and initialize it at the same time. The directives .byte, .half, and .word allocate bytes, half-words (MIPS slang for 2 bytes), and words (4 bytes). .ascii \(s\) and .asciiz \(s\) allocate memory that is initialized to the ASCII codes of the string \(s\text{.}\) .asciiz also appends a byte with value 0. This is a common way of marking the end of a string and is used, among others, by the language C. For example:
.data
hello:
.asciiz "Hello World"
corresponds to the directives
.data
hello:
.byte 0x48, 0x65, 0x6c, 0x6c, 0x6f, 0x20
.byte 0x57, 0x6f, 0x72, 0x6c, 0x64, 0x00
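To see the initializing directives side by side, here is a small sketch that declares one item of each size and then reads them back. The labels and values are made up for illustration. Printing and exiting rely on the syscall services 4 and 10 of the MARS and SPIM simulators, which is an assumption about the environment and not something the text above specifies.

```mips
        .data
count:
        .word  1000         # 4 bytes at offset 0 of the data segment
year:
        .half  2024         # 2 bytes at offset 4
flag:
        .byte  1            # 1 byte at offset 6
msg:
        .asciiz "Hello World"   # 11 characters plus the terminating 0 byte

        .text
        .globl main
main:
        lw   $t0, count     # load the 4-byte word
        lh   $t1, year      # load the 2-byte half-word
        lb   $t2, flag      # load the single byte
        li   $v0, 4         # MARS/SPIM service 4: print zero-terminated string
        la   $a0, msg       # $a0 = address of the first character
        syscall
        li   $v0, 10        # MARS/SPIM service 10: exit
        syscall
```

The items are ordered from largest to smallest so that each one starts at an address divisible by its own size.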
One peculiar aspect is that data must be aligned according to its size. If the processor accesses \(n\) bytes of memory using a load or store instruction, the address of the access must be divisible by \(n\text{;}\) otherwise the processor triggers an exception and stops executing. For example, the following program
.data
.ascii "Hallo"
x:
.byte 8, 0, 0, 0
.text
.globl main
main:
lw $t0 x
would cause an exception: the address of label x is 0x10000005, which is not divisible by 4, but the lw instruction loads a word (4 bytes) into a register and therefore requires an address divisible by 4. We can force a specific alignment for the next label using the .align directive. .align \(n\) places the next item at an address divisible by \(2^n\text{,}\) so .align 2 gives the 4-byte alignment that lw needs:
.data
.ascii "Hallo"
.align 2
x:
.byte 8, 0, 0, 0
.text
.globl main
main:
lw $t0 x
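The effect of .align 2 can be checked by looking at the address of x modulo 4. The sketch below assumes, as in the text, that the data segment starts at 0x10000000: the five bytes of "Hallo" then occupy 0x10000000 to 0x10000004, and .align 2 moves x up to the next multiple of 4, which is 0x10000008. The register choices and the use of la and andi are illustrative.

```mips
        .data
        .ascii "Hallo"      # 5 bytes, so the next free address ends in 5
        .align 2            # pad to the next address divisible by 2^2 = 4
x:
        .byte 8, 0, 0, 0

        .text
        .globl main
main:
        la   $t0, x         # $t0 = address of x
        andi $t1, $t0, 3    # $t1 = address mod 4; 0 means word-aligned
        lw   $t2, 0($t0)    # word load, valid because x is now aligned
```

After this runs, $t1 should be 0. On a little-endian simulator such as MARS, the four bytes 8, 0, 0, 0 form the word value 8 in $t2. On a big-endian machine the same bytes would be read as 0x08000000.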