Using Object Orientation Properly

Section 8.10 Using Object Orientation Properly

Object-oriented programming languages help to solve an important problem for developing large software systems: easy extensibility. “Easy” here means that extending the system only requires changes at a small number of confined program locations. This is important since modifications that address many program locations are very prone to mistakes: A developer that performs such modifications needs to understand the entire code affected by the changes to ensure that the modification is correct and complete. Object-oriented languages facilitate this by providing means that, if used properly, confine the required changes for an extension of the program. The following properties of object-oriented languages are central for this goal:

The separation of interface and implementation is helpful for implementing abstract data types. This supports extensibility since code can refer to the interface rather than a concrete implementation. Subtyping ensures that all extensions provide the functionality required by the interface.
The concentration of data and corresponding code in a class prevents implementation details from being scattered among large parts of the code and from being used by code parts that actually do not need them.

We want to discuss these aspects now by means of a simple example that we implement in functional (OCaml), imperative (C), and an object-oriented (Java) language. The goal of looking at three different languages from three different “paradigms” is to contrast the support each language offers for implementing our small example and to compare and relate object-oriented programming to other programming paradigms.

Our program shall represent the abstract syntax trees of a simple arithmetic expression language. This language consists only of variables and the addition of expressions:

\begin{equation} \begin{array}{rclll} \text{Category} \amp \amp \text{Abstract Syntax} \amp \text{Concrete Syntax} \amp \text{Description} \\ \hline \mathit{Exp}\ni e \amp \syndef \amp \mathtt{Add}[e_1,e_2] \amp e_1+e_2 \amp \text{Addition} \\ \amp | \amp \mathtt{Var}_x \amp x \amp \text{Variable} \\ \end{array}\tag{8.1} \end{equation}

This example is representative for many situations in programming: There is a category of “things” and different variants of them. Here the category is the syntactical category of expressions and the variants are addition and occurrence of a variable. But you can also think of the category “element of a graphical user interface” and the variants would be “button”, “checkbox field”, “text entry field” and so on. Or, just to give another example, the category could be “input/output stream” and the variants could be “file”, “inter-process pipe”, “network socket”, and so on.

Coming back to our example, we want to provide two operations for these expressions: Creating a textual representation of an expression as a string and evaluating an expression. Listing 8.10.1, Listing 8.10.2, and Listing 8.10.3 show the implementation of this data structure and the operations in OCaml, C, and Java.

OCaml.

Let us first consider the implementation in OCaml. Functional languages traditionally have good support for implementing data types like the one we need for the AST of our expression language. They provide Algebraic Data Types (not to be confused with abstract data types) which allow the programmer to specify the variants of which the data type is composed. Here Add and Var. The operations that operate on values of that data type typically use match constructs to discriminate the individual cases.

type exp =
  | Var of string
  | Add of exp * exp

let rec to_string e = 
  match e with
  | Var v -> v
  | Add (l, r) -> (to_string l) ^ " + " ^ (to_string r)

let rec eval e st =
  match e with
  | Var v -> st v 
  | Add (l, r) -> (eval l st) + (eval r st)

Listing 8.10.1. Implementation of the abstract syntax of the simple expression language in OCaml.

C.

Let us now consider the implementation in C (Listing 8.10.2). The C implementation exposes the low-level details of algebraic data types and shows how they might be implemented in a functional language “under the hood”. The struct exp_t contains a union containing the data for the two variants of our ADT. The member field op determines whether the expression is an addition or the occurrence of a variable. Which variant is represented is set by the corresponding constructor (exp_init_var and exp_init_add).

struct env_t;
typedef struct env_t env_t;
int env_get(env_t const *e, char const* name);

typedef enum {
    ADD, VAR,
} operator_t;

typedef struct exp_t {
    operator_t op;
    union {
        struct exp_t *opnds[2];
        char const* var_name;
    };
} exp_t;

exp_t* exp_init_add(exp_t* t,
                    exp_t* l, exp_t* r) {
    t->op = ADD;
    t->opnds[0] = l;
    t->opnds[1] = r;
    return t;
}

exp_t* exp_init_var(exp_t* t, char const* n) {
    t->op = VAR;
    t->var_name = n;
    return t;
}

int exp_eval(exp_t const* t, env_t const* e) {
    switch(t->op) {
    case ADD:  return exp_eval(t->opnds[0], e)
                    + exp_eval(t->opnds[1], e);
    case VAR:  return env_get(e, t->var_name);
    }
}

void exp_print(exp_t const* t, FILE* f) {
    switch(t->op) {
    case ADD:  exp_print(t->opnds[0], f);
               fprintf(f, " + ");
               exp_print(t->opnds[1], f);
               break;
    case VAR:  fputs(t->var_name, f);
               break;
    }
}

Listing 8.10.2. Implementation of the abstract syntax of the simple expression language in C.

The implementations for expression evaluation (in exp_eval) and printing (in exp_print) use this field to execute the corresponding variant of the code. These two functions thus contain the corresponding code for all expression variants. A variant's implementation is therefore scattered over many places in the C code: The enum that lists all variants, a member field in the union for the data and a branch in every function implementing an operation of the data type.

Since C does not have inheritance, the exp_t struct collects fields for all subclasses. The lack of classes requires branches in case distinctions for the method implementations (see exp_eval and exp_print). These functions can access all fields, meaning that there is no encapsulation between the Add and Const classes.

Java.

public interface Env {
    int get(String varName);
}

public interface Exp {
    public int eval(Env e);
    public String toString();
}

public class Add implements Exp {
    private Exp left, right;

    public Add(Exp left, Exp right) {
        this.left = left;
        this.right = right;
    }

    public int eval(Env e) {
        return left.eval(e) + right.eval(e);
    }

    public String toString() {
        return "" + left + " + " + right;
    }
}

public class Var implements Exp {
    private String name;

    public Var(String name) {
        this.name = name;
    }

    public int eval(Env e) {
        return e.get(this.name);
    }

    public toString() {
        return this.name;
    }
}

Listing 8.10.3. Implementation of the abstract syntax of the simple expression language in Java.

The Java implementation uses the object-oriented way of implementing algebraic data types. An interface declares all operations that can be applied to expressions. Every class implementing this interface corresponds to a variant. Every aspect of this variant (implementation of operations, initialization, data) are gathered in the class. Subtyping ensures that such a class can be used whenever an object of type Exp is expected. The value of the field op in the C implementation corresponds to the concrete type of the object in the Java implementation. The resolution of the overridden methods replaces the case distinctions in the C functions exp_eval and exp_print. The code that handles this internally is, without further ado by the programmer, inserted automatically by the compiler.

An Introduction to Imperative Programming

Search Results:

Section 8.10 Using Object Orientation Properly

OCaml.

C.

Java.