Computations are done with values. For example, 2 is a value. A type is a set of values. Aside from the void type, which represents the empty set of values, we distinguish scalar and aggregate types. The values of scalar types, in contrast to the values of aggregate types, are not composed of other values. For example, 2 is a value of the scalar type int and {3, 4} is a value of the aggregate type struct {int a, b;}, which represents pairs of ints.
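The distinction can be sketched in a few lines of C. The struct tag pair and the function name sum_of_pair are hypothetical names chosen only for this illustration.

```c
/* A scalar value and an aggregate value (all names are illustrative). */
struct pair { int a, b; };      /* aggregate type: pairs of ints */

int sum_of_pair(void) {
    int s = 2;                  /* 2 is a value of the scalar type int */
    struct pair p = {3, 4};     /* {3, 4} is composed of the int values 3 and 4 */
    return s + p.a + p.b;       /* the components are themselves scalar values */
}
```

Here s holds a value that has no components, whereas the value of p can be taken apart into p.a and p.b.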
While the program is running, values are stored in containers.
Remark 4.3.1.
The C language specification [1] (or C Standard) calls these “objects”. Since this term is very generic and can easily lead to confusion, we use the term “container”.
Every container has a size (in bytes), an address, and a lifetime. The address uniquely identifies a container at every point in time: at any given time, no two different containers have the same address.
The addresses of containers are also values. Consequently, there may be containers whose values are addresses of other containers. Such containers are called pointers. We discuss them in more detail in Section 4.10.
Variables are identifiers that occur in a variable declaration. Such a declaration consists of the variable itself and a type. For example, int x; declares a variable x of type int. When a variable declaration is executed, two actions are performed. First, a new container is allocated with a size derived from the type in the declaration. If, for example, an int value has a size of 4 bytes, the container bound to x needs to have a size of at least 4 bytes. Second, the variable is bound to the container's address; it references the container. This binding persists until the end of the variable's scope, where the container is deallocated.
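The following minimal sketch makes the size and the address of a container visible. The function name container_demo and the variable names are assumptions made for this illustration; the operators & and sizeof are standard C.

```c
#include <stddef.h>

/* Returns 1 if the container bound to x is large enough for an int
   and the container bound to p holds the address of x's container. */
int container_demo(void) {
    int x = 5;                /* allocates a container and binds x to its address */
    int *p = &x;              /* p's container holds the address of x's container */
    size_t size = sizeof x;   /* size of x's container in bytes */
    return size >= sizeof(int) && p == &x && *p == 5;
}
```

The container bound to p is a pointer in the sense described above: its value is the address of another container.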
Remark 4.3.2. Nomenclature.
In practice, the term “variable” is commonly used for both the identifier and the container. When talking about a concrete execution, for example, one can say “the variable x has the value 5” and mean that the container that is bound to the variable x contains the value 5.
Remark 4.3.3. Imperative variables compared to mathematical variables.
In functional programming languages as well as in mathematics, variables have a different meaning than in imperative programming languages. Essentially, both paradigms, functional and imperative, agree that variables are immutably bound when they are declared. In functional languages, they are bound to a value, whereas, in imperative programming, the binding is to the address of a container whose content is mutable. Consequently, there are two ways to evaluate a variable in an imperative programming language: To the address of its bound container (L-evaluation) and to its contents (R-evaluation). We discuss these in detail in Subsection 4.8.2 and Chapter 6.
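The two ways of evaluating a variable can be seen in a single assignment. The sketch below uses a hypothetical function name, l_and_r_evaluation, and is meant as an illustration rather than a complete account of the evaluation rules.

```c
/* In an assignment, the variable on the left is L-evaluated (to the address
   of its container) and variables on the right are R-evaluated (to contents). */
int l_and_r_evaluation(void) {
    int x = 1;
    x = x + 1;            /* right x: contents 1; left x: where to store 2 */
    int *addr = &x;       /* & yields the address that x is bound to */
    *addr = *addr * 10;   /* the same container, accessed through its address */
    return x;             /* R-evaluation: the contents, now 20 */
}
```

The binding of x to its container never changes in this function; only the contents of the container do.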
A container's lifetime determines when it is allocated and deallocated during program execution. There are several kinds of lifetimes:
Subsection 4.3.1 Local Variables
Containers for local variables are allocated and bound to their identifier when the variable declaration is executed. They are deallocated after the last statement of the innermost block that contains the declaration. Therefore, we can refer to a local variable anywhere after its declaration within that innermost enclosing block.
Reconsider the example in Remark 4.1.4. When the declaration in line 2 is executed, a container large enough for an int is allocated for the local variable y and the value 0 is stored in it. Assume x has a value greater than 0 so that the if condition in line 3 is true. When line 4 is executed, a new container for an int is allocated and bound to y. Note that the container allocated by line 2 is now no longer accessible through the identifier y: the inner declaration of y shadows the outer one. After the then-block has finished executing in line 5, the inner container for y is deallocated and the old binding for y is restored. Once the block is left, the container allocated in line 2 is again accessible through the identifier y.
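The listing from Remark 4.1.4 is not repeated here. The following is a separate minimal sketch of the same shadowing behavior; the function name shadowing and its body are assumptions for illustration, and the line numbers mentioned in the text refer to the original listing, not to this sketch.

```c
/* Sketch of shadowing; this is not the listing from Remark 4.1.4. */
int shadowing(int x) {
    int y = 0;          /* outer container for y, holds 0 */
    if (x > 0) {
        int y = x;      /* new container; this y shadows the outer y */
        y = y + 1;      /* modifies only the inner container */
    }                   /* the inner container is deallocated here */
    return y;           /* the outer y is accessible again and still holds 0 */
}
```

Whatever value x has, the function returns 0, because the assignment inside the block never touches the outer container.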
Subsection 4.3.2 Global Variables
Global variables are declared outside of function bodies. Containers for them are sized and bound to their identifiers just as for local variables. The difference is that a global variable's lifetime spans from program start until program end. It is therefore available at all times, in all subsequently declared functions.
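A small sketch of this lifetime: the container of a global variable keeps its contents between function calls. The names counter, count_call, and calls_so_far are hypothetical and serve only as an illustration.

```c
int counter;    /* global: its container exists from program start to program end
                   and, like every global in C, initially holds 0 */

void count_call(void) {
    counter = counter + 1;   /* accesses the same container on every call */
}

int calls_so_far(void) {
    return counter;          /* visible in every subsequently declared function */
}
```

After two calls to count_call, calls_so_far returns 2, since both functions access the one container bound to counter.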
Using global variables is bad practice and should be avoided whenever possible. The motivation for this guideline is that global variables make it very easy to introduce hard-to-trace dependencies between different functions. When a function uses global variables, we can no longer understand and verify it individually, outside of its calling context.
Example 4.3.4. Global Variable.
Consider the following function:
int data[1024];
void sort() {
    /* sort the values in data */
}
This sorting function can only operate if the data to be sorted has been copied into the array data beforehand. This reduces its flexibility and makes the function difficult to comprehend: its parameter list does not say what is to be sorted. A better design would give sort two parameters: a pointer to the array that is to be sorted, and the length of that array.
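One possible shape of the better design is sketched below. The parameter names and the choice of insertion sort are assumptions made for this illustration; any sorting algorithm would fit the same signature.

```c
#include <stddef.h>

/* Sorts the n ints starting at data in ascending order (insertion sort).
   The caller decides what is sorted; no global state is involved. */
void sort(int *data, size_t n) {
    for (size_t i = 1; i < n; i++) {
        int key = data[i];
        size_t j = i;
        while (j > 0 && data[j - 1] > key) {   /* shift larger elements right */
            data[j] = data[j - 1];
            j--;
        }
        data[j] = key;                          /* insert key at its position */
    }
}
```

A caller can now sort any array, for example with int a[] = {3, 1, 2}; sort(a, 3);, and the signature documents what the function operates on.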
Subsection 4.3.3 Dynamically Allocated Containers
Commonly, the number and size of the containers a program requires depend on its input and are not known statically. We need to allocate such containers dynamically. The C function malloc accepts a number of bytes as an argument, allocates a new container of this size, and returns its address. The container can be deallocated with a call to free with its address as the argument.
Failing to eventually deallocate a dynamically allocated container is a programming error and causes a memory leak. Especially for long-running programs, leaks are a serious problem because they can progressively consume all available memory of the system.
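The pairing of malloc and free can be sketched as follows. The function name sum_first is hypothetical. The check for NULL is an addition not discussed in the text: malloc returns a null pointer when it cannot provide the requested container.

```c
#include <stdlib.h>

/* Allocates a container for n ints, uses it, and deallocates it again.
   Returns 0 + 1 + ... + (n-1), or -1 if the allocation failed. */
long sum_first(size_t n) {
    int *buf = malloc(n * sizeof buf[0]);   /* request n * sizeof(int) bytes */
    if (buf == NULL)
        return -1;                          /* no container was allocated */
    long sum = 0;
    for (size_t i = 0; i < n; i++) {
        buf[i] = (int)i;
        sum += buf[i];
    }
    free(buf);      /* without this call, the container would leak */
    return sum;
}
```

Every path that successfully allocates the container also reaches the call to free, so repeated calls to sum_first do not accumulate memory.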
Example 4.3.5. Dynamically Allocated Containers.
Consider the following function:
#include <stdlib.h> /* declares malloc */
int *unit_vector(int dimension, int n_dimensions) {
    int *res = malloc(sizeof(res[0]) * n_dimensions);
    for (int i = 0; i < n_dimensions; i++)
        res[i] = 0;
    res[dimension] = 1;
    return res;
}
The first statement of the function body allocates two new containers. The first one can carry an address; its address is bound to the variable res. The second one originates from the call to malloc. The argument to the malloc call determines the size of this container: sizeof(res[0]) is the size of an int (see Section 4.10), and the container should have n_dimensions times this size. So, this call to malloc requests a container that is large enough to carry n_dimensions ints. malloc returns the address of such a container, which is then stored in the container bound to res.
The subsequent loop writes a 0 into each of the int-sized slots of the container. The following statement writes a 1 at index dimension, that is, into the dimension-th slot when counting from 0. Lastly, the function returns the address of the allocated container to its caller.
It is noteworthy that the lifetime of the container allocated dynamically with malloc does not end with the return to the calling function, in contrast to that of the container bound to the local variable res, which carries its address.
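A caller of unit_vector therefore becomes responsible for deallocating the container. The sketch below repeats unit_vector so that it is self-contained; the NULL checks and the caller second_component_of_e1 are hypothetical additions for this illustration.

```c
#include <stdlib.h>

/* unit_vector as in the example, with an added check of malloc's result */
int *unit_vector(int dimension, int n_dimensions) {
    int *res = malloc(sizeof(res[0]) * n_dimensions);
    if (res == NULL)
        return NULL;              /* addition: allocation failed */
    for (int i = 0; i < n_dimensions; i++)
        res[i] = 0;
    res[dimension] = 1;
    return res;
}

/* Hypothetical caller: uses the container and then ends its lifetime. */
int second_component_of_e1(void) {
    int *v = unit_vector(1, 3);   /* the container outlives the call */
    if (v == NULL)
        return -1;
    int value = v[1];             /* still valid here: holds 1 */
    free(v);                      /* the caller deallocates the container */
    return value;
}
```

The container bound to res is gone once unit_vector returns, but the address it carried has been copied into v, which is what lets the caller access and later free the dynamically allocated container.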
We will discuss dynamic memory allocation in more detail in Section 4.12.