03. Interpreter: Variables and Scopes

3.1 Introduction

In the last chapter we implemented a simple calculator. To make it look like a programming language we need to add 3 more aspects:

  1. Variables. For manipulating states.
  2. Control flows. Like conditionals and loops.
  3. Functions. For reusing code.

Here are some sample programs that demonstrate the syntax of our language:

;; define the function `fib` with an argument `n`
(def fib (n)
    ;; if-then-else
    (if (le n 0)
        (then 0)
        (else (+ n (call fib (- n 1))))))   ;; function call

(call fib 5)
;; another version of the `fib` function
(def fib (n) (do
    (var r 0)           ;; variable declaration
    (loop (gt n 0) (do  ;; loop
        (set r (+ r n)) ;; variable assignment
        (set n (- n 1))
    ))
    (return r)          ;; return from a function call
))

(call fib 5)

3.2 New Commands for Variables

A program is a sequence of operations that manipulate states, so we’ll add a new construct that performs a sequence of operations. The do command evaluates its arguments in order and returns the last argument. It also creates a new “scope” for variables.

(do a b c ...)

And we add 2 new commands for variable declaration and assignment.

(var a init)    ;; create a new variable with a initial value
(set a value)   ;; assign a new value to a variable

I’ve also added comments to the language. A semicolon will skip the rest of the line. Comments are handled by augmenting the skip_space function:

def skip_space(s, idx):
    while True:
        save = idx
        # try to skip spaces
        while idx < len(s) and s[idx].isspace():
            idx += 1
        # try to skip a line comment
        if idx < len(s) and s[idx] == ';':
            idx += 1
            while idx < len(s) and s[idx] != '\n':
                idx += 1
        # no more spaces or comments
        if idx == save:
            break
    return idx

3.3 Variables and Scopes

Variables are scoped — they can only be accessed by their sibling expressions. We’ll use a map to store variables while evaluating an expression.

The problem is that subexpressions can define variables whose names collide with a parent scope. This is solved by using per-scope mappings instead of a global mapping.

The pl_eval function got a new argument (env) for storing variables.

def pl_eval(env, node):
    # read a variable
    if not isinstance(node, list):
        assert isinstance(node, str)
        return name_loopup(env, node)[node]
    ...

The env argument is a linked list, it contains the map for the current scope and the link to the parent scope. Entering and leaving a scope is just adding or removing the list head.

The variable lookup function traverses the list upwards until the name is found.

def name_loopup(env, key):
    while env:  # linked list traversal
        current, env = env
        if key in current:
            return current
    raise ValueError('undefined name')

The code for evaluating a new scope: create a new map and link it to the current scope. The then and else commands are aliases to the do command. They are just syntactic sugar so that you can write (if (then xxx) (else yyy)) instead of (if xxx yyy).

def pl_eval(env, node):
    ...
    # new scope
    if node[0] in ('do', 'then', 'else') and len(node) > 1:
        new_env = (dict(), env) # add the map as the linked list head
        for val in node[1:]:
            val = pl_eval(new_env, val)
        return val  # the last item

The code for the var and set commands is now straightforward.

def pl_eval(env, node):
    ...
    # new variable
    if node[0] == 'var' and len(node) == 3:
        _, name, val = node
        scope, _ = env
        if name in scope:
            raise ValueError('duplicated name')
        val = pl_eval(env, val)
        scope[name] = val
        return val
    # update a variable
    if node[0] == 'set' and len(node) == 3:
        _, name, val = node
        scope = name_loopup(env, name)
        val = pl_eval(env, val)
        scope[name] = val
        return val

3.4 Testing

The interpreter accepts a sequence of expressions instead of a single expression. This is done by wrapping the input with a do command.

def pl_parse_prog(s):
    return pl_parse('(do ' + s + ')')

A sample program demonstrating variables and scopes:

def test_eval():
    def f(s):
        return pl_eval(None, pl_parse_prog(s))

    assert f('''
        ;; first scope
        (var a 1)
        (var b (+ a 1))
        ;; a=1, b=2
        (do
            ;; new scope
            (var a (+ b 5))     ;; name collision
            (set b (+ a 10))
        )
        ;; a=1, b=17
        (* a b)
    ''') == 17

We’ll add control flows and functions in the next chapter.