🆕 This chapter is part of the WIP book:
5 Cool Coding Challenges from Beginner to Advanced

to get notified of new chapters and the book's release.
🔥 My other Book: Build Your Own Redis

# P2. Variables, Control Flows, and Functions (Intermediate)

### 01. Introduction

In the last chapter we implemented a simple calculator. To make it look like a programming language we need to add 3 more aspects:

1. Variables. For manipulating states.
2. Control flows. Like conditionals and loops.
3. Functions. For reusing code.

Here are some sample programs that demonstrate the syntax of our language:

``````;; define the function `fib` with an argument `n`
(def fib (n)
;; if-then-else
(if (le n 0)
(then 0)
(else (+ n (call fib (- n 1))))))   ;; function call

(call fib 5)``````
``````;; another version of the `fib` function
(def fib (n) (do
(var r 0)           ;; variable declaration
(loop (gt n 0) (do  ;; loop
(set r (+ r n)) ;; variable assignment
(set n (- n 1))
))
(return r)          ;; return from a function call
))

(call fib 5)``````

I’ve also added comments to the language. Comments handled by augmenting the `skip_space` function:

``````def skip_space(s, idx):
while True:
save = idx
# try to skip spaces
while idx < len(s) and s[idx].isspace():
idx += 1
# try to skip a line comment
if idx < len(s) and s[idx] == ';':
idx += 1
while idx < len(s) and s[idx] != '\n':
idx += 1
# no more spaces or comments
if idx == save:
break
return idx``````

### 02. Variables

A program is a sequence of operations that manipulate states; thus, we’ll add a new construct that performs a sequence of operations. The `do` command evaluates its arguments in order and returns the last argument.

``(do a b c ...)``

And we add 2 new commands for variable declaration and assignment.

``````(var a init)    ;; create a new variable with a initial value
(set a value)   ;; assign a new value to a variable``````

Variables are scoped — they can only be accessed by their sibling expressions. We’ll use a map to store variables while evaluating an expression.

The problem is subexpressions can define variables whose names collide with a parent scope. This is solved by using per-scope mappings rather than a global mapping.

The `pl_eval` function got a new argument (`env`) for storing variables.

``````def pl_eval(env, node):
if not isinstance(node, list):
assert isinstance(node, str)
return name_loopup(env, node)[node]
...``````

The `env` argument is a linked list, it contains the map for the current scope and the link to the parent scope. The variable lookup function traverses the list upwards until the name is found.

``````def name_loopup(env, key):
while env:
current, env = env
if key in current:
return current
raise ValueError('undefined name')``````

The code for evaluating a new scope: create a new map and link it to the current scope. The `then` and `else` commands are just syntactic sugars.

``````def pl_eval(env, node):
...
# new scope
if node in ('do', 'then', 'else') and len(node) > 1:
new_env = (dict(), env)
for val in node[1:]:
val = pl_eval(new_env, val)
return val``````

The code for the `var` and `set` commands is now straightforward.

``````def pl_eval(env, node):
...
# new variable
if node == 'var' and len(node) == 3:
_, name, val = node
scope, _ = env
if name in scope:
raise ValueError('duplicated name')
val = pl_eval(env, val)
scope[name] = val
return val
# write a variable
if node == 'set' and len(node) == 3:
_, name, val = node
scope = name_loopup(env, name)
val = pl_eval(env, val)
scope[name] = val
return val``````

### 03. Control Flows

The handling of if-then-else is almost the same as in the last chapter. Except that I created a new scope before evaluating the condition, this allows a variable declaration in the condition, for example: `(if (var aaa bbb) (then use aaa here))`. This is also just syntactic sugar.

``````def pl_eval(env, node):
...
# conditional
if len(node) in (3, 4) and node in ('?', 'if'):
_, cond, yes, *no = node
no = no if no else ['val', None]
new_env = (dict(), env)
if pl_eval(new_env, cond):
return pl_eval(new_env, yes)
else:
return pl_eval(new_env, no)``````

The `loop` is a new command. Its evaluation translates directly into a Python loop. Again, the new scope is for syntactic sugar.

``````def pl_eval(env, node):
...
# loop
if node == 'loop' and len(node) == 3:
_, cond, body = node
ret = None
while True:
new_env = (dict(), env)
if not pl_eval(new_env, cond):
break
try:
ret = pl_eval(new_env, body)
except LoopBreak:
break
except LoopContinue:
continue
return ret
# break & continue
if node == 'break' and len(node) == 1:
raise LoopBreak
if node == 'continue' and len(node) == 1:
raise LoopContinue``````

We are (ab)using Python exceptions for the `(break)` and `(continue)` control flows. You can also propagate them explicitly via the return value of the `pl_eval` if you don’t like such hacks.

``````class LoopBreak(Exception):
def __init__(self):
super().__init__('`break` outside a loop')

class LoopContinue(Exception):
def __init__(self):
super().__init__('`continue` outside a loop')``````

### 04. Functions

The code for function definition does nothing significant. It just performs some sanity checks and puts the function name into the map. Note that I added the number of arguments to the key, this distinguishes function names from variable names and allows a form of “overloading”.

``````def pl_eval(env, node):
...
# function definition
if node == 'def' and len(node) == 4:
_, name, args, body = node
for arg_name in args:
if not isinstance(arg_name, str):
if len(args) != len(set(args)):
raise ValueError('duplicated arguments')
dct, _ = env
key = (name, len(args))
if key in dct:
raise ValueError('duplicated function')
dct[key] = (args, body, env)
return``````

Now the `call` command is handled. Function arguments are treated as new variables. Just create a new scope, put the arguments in it, and evaluate the body.

``````def pl_eval(env, node):
...
# function call
if node == 'call' and len(node) >= 2:
_, name, *args = node
key = (name, len(args))
fargs, fbody, fenv = name_loopup(env, key)[key]
# args
new_env = dict()
for arg_name, arg_val in zip(fargs, args):
new_env[arg_name] = pl_eval(env, arg_val)
# call
try:
return pl_eval((new_env, fenv), fbody)
except FuncReturn as ret:
return ret.val
# return
if node == 'return' and len(node) == 1:
raise FuncReturn(None)
if node == 'return' and len(node) == 2:
_, val = node
raise FuncReturn(pl_eval(env, val))``````

Special care: the parent scope is not the scope of the caller, but the scope in which the function was defined.

The `(return)` command is handled like the `(break)` or `(continue)`.

``````class FuncReturn(Exception):
def __init__(self, val):
super().__init__('`return` outside a function')
self.val = val``````

### 05. Done

Congratulations. We have built an interpreter for a mini programming language.

``````def test_eval():
def f(s):
return pl_eval(None, pl_parse_prog(s))
assert f('''
(def fib (n)
(if (le n 0)
(then 0)
(else (+ n (call fib (- n 1))))))
(call fib 5)
''') == 5 + 4 + 3 + 2 + 1
assert f('''
(def fib (n) (do
(var r 0)
(loop (gt n 0) (do
(set r (+ r n))
(set n (- n 1))
))
(return r)
))
(call fib 5)
''') == 5 + 4 + 3 + 2 + 1``````

### 06. Closing Remarks

Our interpreter is not much more difficult than the calculator in the last chapter. Variables are solved by extra states, control flows are interpreted by Python control flows — it’s pretty much still simple recursion.

The interpreter still depends on an existing language for execution. How do we compile our language into machine code and run it natively on the CPU? That is the challenge of the next chapter.