From Source Code To Machine Code
build-your-own.org eBook·Paperback
⟵ prev Contents next ⟶

🔥 My other Book: Build Your Own Redis

04. Interpreter: Control Flows and Functions

4.1 If-Then-Else

A new command is added for the if-then-else control flow:

(if condition yes)
(if condition yes no)

The if command is similar to the (? yes no) operator in the calculator chapter. Except that the “else” part is optional. So the code is modified to handle both of them.

def pl_eval(env, node):
    ...
    # conditional
    if len(node) in (3, 4) and node[0] in ('?', 'if'):
        _, cond, yes, *no = node
        no = no[0] if no else ['val', None] # the `else` part is optional
        new_env = (dict(), env) # new scope
        if pl_eval(new_env, cond):
            return pl_eval(new_env, yes)
        else:
            return pl_eval(new_env, no)

Note that I created a new scope before evaluating the condition, this allows a variable declaration in the condition, for example: (if (var aaa bbb) (then use aaa here)). This is just syntactic sugar.

4.2 Loops

The syntax of the loop command:

(loop condition body)
(break)
(continue)

The code for handling the loop command has one thing in common with the if command: it translates to a Python loop, just like the if command translates to a Python if.

def pl_eval(env, node):
    ...
    # loop
    if node[0] == 'loop' and len(node) == 3:
        _, cond, body = node
        ret = None
        while True:
            new_env = (dict(), env)
            if not pl_eval(new_env, cond):
                break
            ret = pl_eval(new_env, body)
        return ret

We have also added the break and continue commands. They are implemented by (ab)using Python exceptions. You can also propagate them explicitly via the return value of the pl_eval if you don’t like such hacks or you can’t use exceptions.

def pl_eval(env, node):
    ...
    # loop
    if node[0] == 'loop' and len(node) == 3:
        _, cond, body = node
        ret = None
        while True:
            new_env = (dict(), env)
            if not pl_eval(new_env, cond):
                break
            try:
                ret = pl_eval(new_env, body)
            except LoopBreak:
                break
            except LoopContinue:
                continue
        return ret
    # break & continue
    if node[0] == 'break' and len(node) == 1:
        raise LoopBreak
    if node[0] == 'continue' and len(node) == 1:
        raise LoopContinue
class LoopBreak(Exception):
    def __init__(self):
        super().__init__('`break` outside a loop')

class LoopContinue(Exception):
    def __init__(self):
        super().__init__('`continue` outside a loop')

4.3 Functions

The syntax:

(def func-name (arg-names...) body)
(call func-name args...)

The code for function definition does nothing significant. It just performs some sanity checks and puts the function name in the map.

def pl_eval(env, node):
    ...
    # function definition
    if node[0] == 'def' and len(node) == 4:
        _, name, args, body = node
        # sanity checks
        for arg_name in args:
            if not isinstance(arg_name, str):
                raise ValueError('bad argument name')
        if len(args) != len(set(args)):
            raise ValueError('duplicated arguments')
        # add the function to the scope
        dct, _ = env
        key = (name, len(args))
        if key in dct:
            raise ValueError('duplicated function')
        dct[key] = (args, body, env)
        return

Note that I added the number of arguments to the key, this allows a form of “overloading” — functions with the same name but different numbers of arguments can coexisit. It also distinguishes function names from variable names.

Now the call command is handled. Function arguments are treated like new variables. Just create a new scope, put the arguments in it, and evaluate the body.

def pl_eval(env, node):
    ...
    # function call
    if node[0] == 'call' and len(node) >= 2:
        _, name, *args = node
        key = (name, len(args))
        fargs, fbody, fenv = name_loopup(env, key)[key]
        # args
        new_env = dict()
        for arg_name, arg_val in zip(fargs, args):
            new_env[arg_name] = pl_eval(env, arg_val)
        # call
        try:
            return pl_eval((new_env, fenv), fbody)
        except FuncReturn as ret:
            return ret.val

Special care: the parent scope is not the scope of the caller, but the scope in which the function was defined. (The scope is saved when defining a function).

The (return) command is handled like the (break) or (continue).

def pl_eval(env, node):
    ...
    # return
    if node[0] == 'return' and len(node) == 1:
        raise FuncReturn(None)
    if node[0] == 'return' and len(node) == 2:
        _, val = node
        raise FuncReturn(pl_eval(env, val))
class FuncReturn(Exception):
    def __init__(self, val):
        super().__init__('`return` outside a function')
        self.val = val

4.4 Done

Congratulations. We have built an interpreter for a mini programming language.

def test_eval():
    def f(s):
        return pl_eval(None, pl_parse_prog(s))
    assert f('''
        (def fib (n)
            (if (le n 0)
                (then 0)
                (else (+ n (call fib (- n 1))))))
        (call fib 5)
    ''') == 5 + 4 + 3 + 2 + 1
    assert f('''
        (def fib (n) (do
            (var r 0)
            (loop (gt n 0) (do
                (set r (+ r n))
                (set n (- n 1))
            ))
            (return r)
        ))
        (call fib 5)
    ''') == 5 + 4 + 3 + 2 + 1

Our interpreter is not much more difficult than the calculator in the previous chapter. Variables are solved by extra states, control flows are translated into Python control flows — it’s pretty much still simple recursion.

The interpreter still depends on an existing language for execution. How do we compile our language into machine code and run it natively on the CPU? That’s the next challenge.

( Report an Error | Ask a Question) @ build-your-own.org

See also:
codecrafters.io offers “Build Your Own X” courses in many programming languages.
Including Redis, Git, SQLite, Docker, and more.
Check it out

⟵ prev Contents next ⟶