The While Loop
A while loop allows to execute a block (the body of the loop) multiple times, for as long as a condition expression evaluates to a truthy value. The syntax of the expression is while <expr> <block>
, where <block>
. Its evaluation works as follows:
- Evaluate the condition. If the result is truthy go to 2., else go to 3.
- Evaluate the body, then go to 1.
- Evaluation of the while loop is over, the result is the result of the last evaluation of the body, or
nil
if the body has not been evaluated at all.
A small example program:
let mut a = true;
let mut b = true
while a {
if b {
b = false;
} else {
a = false;
42;
}
}
In this program, the body of the loop is executed two times, the condition is evaluated three times, and the final result is the value 42
.
Using the while loop, it is possible to write programs that never stop evaluating, for example while true {}
. Before we introduced the while loop, evaluation always mapped a program to a value. Now, evaluation only maps some programs to a value while others don't have a well-defined result. Even worse, it is now impossible to define an algorithm that can determine whether a given pavo program terminates or not. There are good reasons for having the while loop in the language though: It allows us to express arbitrary computations. Going into further details would go beyond the scope of this text, the branch of computer science that deals with these things (computability theory) is one of the oldest and most fundamental ones. The main takeaway is that there are some problems that cannot be solved by computer programs, and unfortunately, categorizing arbitrary programs based on some runtime criteria is among them.
There are two special expressions that can appear in the body of a while loop (but that can not appear in other places of a program). When a continue
expression is evaluated, the loop jumps to the next iteration: the condition gets evaluated again, and iteration either continues or is finished (evaluating to nil
). The break
expression direcly finishes the evaluation of the loop, the loop expression evaluates to nil
. Both break
and continue
may optionally be followed by an addition expression. This expression is evaluated as part of evaluating the break/continue expression. In case of continue
, it yields the result of the loop iteration (so the loop evaluates to it if the loop condition evaluates to false
or nil
after the continue expression). In case of break
, it directly yields the result of evaluating the loop.
let mut a = true;
while true {
if a {
a = false;
continue;
break 42; # never evaluated
} else {
break 0;
break 1; # never evaluated
}
} # evaluates to 0, takes two iterations
Exceptions
Remember how evaluation was supposed to turn an expression (or sequence thereof) into a value? That was only a simplification. The actual mechanism is a tiny bit more complicated: Evaluation either successfully yields a value (as is the case with all expressions covered so far), or it might result in an exception. We refer to an expression that yields an exception as throwing the exception. An exception is just an arbitrary value. Whenever the evaluation of a subexpression results in an exception, evaluation of the outer expression (or sequence of expressions) yields that same exception - without evaluating any further subexpressions. So in the programm <expr-1>; <expr-2>
, if evaluating <expr-1>
throws 42
, then the whole program throws 42
and <expr-2>
is nevere evaluated. By extension, this means that running a pavo program has one of three possible outcomes: It might successfully evaluate to a value, it might throw a value, or it might end up in an infinite loop and never terminate.
The expression throw <expr>
evaluates <expr>
and then throws the resulting value.
(throw 42) == (throw 43) # throws 42, the `throw 43` expression is never evaluated
At this point, exceptions don't seem very useful, they are just a mechanism for terminating a program early. Their usefulness stems from another expression that allows us to react to exceptions: try <block-1> catch <name> <block-2>
. The try expression is evaluated by first evaluating the expressions in <block-1>
. If none of them throws, then the whole expresion evaluates to the value of the block. But if the block throws, evaluation of the try expression resumes by executing <block-2>
in an environment in which name
is bound to the exception. The name can optionally be preceded by mut
to make the binding mutable. A few examples:
try {42} catch n { 17 }; # evaluates to 42, the catch block is never executed
try {
0;
throw 1; # will be caught, execution resumes at the catch block
throw 2; # never executed
} catch n {
n
} # evaluates to 1
try { throw 0 } catch mut n {
n = 1;
1
} # evaluates to 1
try { throw 0 } catch n { throw 1 } # throws 1
The whole point of adding exceptions to a programming language is to make it such that every action (i.e. evaluation of ane expression) can either succeed or fail. We expect things to succeed most of the time, so the way that programs are written reflects this: as long as nothing goes wrong, they are essentially executed in sequence. In the exceptional case, we leave this linear control flow and the try/catch mechanism kicks in to handle things. This is another design choice in the language that helps with readability: We don't need to pollute the "happy path" of program execution with error handling all the time, instead, we can isolate the error handling logic in specific places (the catch blocks).
Chapter Recap
This chapter covered:
- the final shape of the evaluation process: An expression or sequence of expressions is evaluated in a environment and either yields a value, throws an exception, or diverges)
- scoping rules for name binding within blocks
- the if expression for conditional execution
- the while expression for repeated execution
- the throw and try/catch expressions for working with exceptions
- that there are problems that no computer program can solve
Chapter 3: Go With the Flow
So far, the evaluation of pavo programs has always been straightforward: Evaluate all expressions of the program in order, recursively including their subexpressions. In this chapter we will learn about expressions that allow to change this control flow, allowing to skip over some expressions or to otherwise deviate from linear execution order.
Conditional Evaluation
An if expression allows to evaluate some sequence of expressions only if a condition expression evaluates to a value other than false
or nil
. Such a value is called truthy.
if <expr> { <exprs> }
is an expression, where<exprs>
is a semicolon-separated sequence of expressions (just like a program).- it is evaluated by first evaluating
<expr>
. If it evaluates tofalse
ornil
, the whole expression evaluates tonil
. Otherwise, it evaluates to the result of evaluating the chained<exprs>
.
A few examples:
if true { 42 }; # 42
if 42 {0; 1}; # 1
if 42 {}; # nil
if false {42}; # nil, the expression 42 is never evaluated
if nil {42}; # nil, the expression 42 is never evaluated
There is also a second form of the if expression: the if-else expression. An if expression may continue with else { <exprs> }
, where <exprs>
is again again a sequence of chained expressions. These are only evaluated if the condition evaluated to false
or nil
(falsey for short).
if false {
42 # never evaluated
} else {
0;
1
} # 1
Finally, instead of following the else
with braces, it may be followed by exactly one blocky expression. A blocky expression is an expression beginning with if
, or one beginning with while
, for
, case
or loop
(all expressions we haven't encountered yet). This is purely to make the code more readable by saving on curly braces, as shown in the following example:
if nil {
0
} else if false {
1
} else {
2
} # 2
Blocks and Scope
A sequence of semicolon-chained expressions within curly braces is called a block. Blocks have an impact on names and the environment: they restrict the scope) of names. Name bindings only extend to the end of their block. A concrete examle:
let a = 42; # a comes into scope, can be used until the end of the program
if true {
let b = 43; # b comes into scope, can be used until the end of the block
a == b; #
} # b goes out of scope, any occurences of b beyond this point would be free
a;
This is necessary to prevent situations where a name could be used even though it was never bound:
if false {
let a = 42
};
a # This is not a valid pavo program, because a is free. There would be no sensible semantics if this was valid.
The interaction between scopes and shadowing can be a bit tricky:
let a = 0;
if true {
let a = 1;
a; # evaluates to 1
};
a # evaluates to 0
To get a clearer grasp of how this works, you can imagine the inner name was completely different:
# this example is equivalent to the prior one
let a = 0;
if true {
let inner_a = 1;
inner_a; # evaluates to 1
};
a # evaluates to 0
The same goes for shadowing where mutability is involved:
let a = 0;
let mut b = 0;
if true {
let mut a = 1;
a = 2; # this is ok because the immutable binding is shadowed by a mutable one
let mut b = 1;
b = 2;
a; # evaluates to 2
b; # evaluates to 2
};
a; # evaluates to 0, the outer binding has never been touched
b; # evaluates to 0, the outer binding has never been touched
Renaming the above program for clarity yields the following, equivalent program:
let a = 0;
let mut b = 0;
if true {
let mut inner_a = 1;
inner_a = 2;
let mut inner_b = 1;
inner_b = 2;
inner_a; # evaluates to 2
inner_b; # evaluates to 2
};
a; # evaluates to 0, the outer binding has never been touched
b; # evaluates to 0, the outer binding has never been touched
Mutable Bindings
Bindings as introduced so far are immutable: Once a name is bound to a value, this binding does not change. Sometimes we may want to change the value a name is bound to over time. To do so, we must first mark the binding as mutable by introducing it as let mut <name> = <expr>
. This works just like a regular let expression. Next, we introduce the assignment expression for changing the value to which a mutable binding is bound: <name> = <expr>
evaluates <expr>
and rebinds the name <name>
to the resulting value. The assignment expression itself evaluates to nil
.
let mut a = 42;
a = 43;
a; # evaluates to 43
Only bindings that have been declared as mutable can be assigned to. A program that tries to assign to an immutable binding is invalid and cannot be evaluated at all. This is the third and final kind of static error that pavo programs can have (the other two were invalid syntax and occurence of free names).
In principle, immutable bindings are strictly less powerful than mutable bindings. So why aren't mutable bindings the default (and why are there immutable bindings at all)? This is primarily so that programmers can protect themselves from writing incorrect programs by accident. If you don't intend to mutate a binding, then the language will stop you from accidentally assigning to it. This can be very helpful, especially as programs become large. Additionally, it makes reading the program much simpler: The reader knows they don't have to mentally keep track of a bindings evolution over time it isn't exlpicitly marked as immutable.
Writing programs such that other humans can easily read them is one of the harder aspects of programming - but also one of the most crucial ones. Nobody will use your program or build upon your code if they cannot discern what it does. And they cannot perform changes (e.g. to make it more efficient, add new features, or fix errors) if they cannot understand it in the first place.
Chapter Checklist
This chapter covered:
- programs as sequences of expressions
- the
nil
value for conveying absence of meaningful information - the concept of an environment as a partial mapping from names to values, and its role in the evaluation process
- the let expression to bind names to values
- the assignment expression that can manipulate mutable bindings
- the importance of writing programs that are easy to read
Chapter 2: Name of the Game
Suppose you wrote a lengthy, sophisticated expression, for example 42 == 43
. Suppose further you wanted to use the resulting value at multiple points in your program, e.g. to check whether it was equal to itself. You would have to duplicate the expression, resulting in an awkwardly long program: (42 == 43) == (42 == 43)
. If you wanted to change the original expression later, you would need to manually find all places where you used it and then update them. This process is error-prone and inefficient.
Pavo offers a solution: The result of an evaluation step can be given a name. At later points in the program, the name can be used and its expression evaluates to the value to which the name was bound earlier. This chapter explains the mechanisms that are involved to make this work.
Syntax and Semantics of Names
In order to use names in the programming language, we need a new kind of expression:
- A name is an expression. A name is a sequence of at least one and at most 255 characters that meets the following criteria:
- it only consists of the characters
a
toz
,A
toZ
,0
to9
and_
- it does not begin with a digit (
0
to9
) - it is not one of the following reserved words:
nil
,true
,false
,if
,else
,return
,break
,while
,mut
,loop
,case
,throw
,try
,catch
,for
,let
,rec
,_
- it only consists of the characters
The semantics of names present a problem: We want to assign names dynamically, and we might even want the same name to refer to different values at different expressions in the same program. These requirements cannot be met with an evaluation function that only depends on the expression that is evaluated. So we need to introduce a new concept: An environment. We look at a simplified definition first, and will later refine it.
For now, an environment simply associates a set of names with one correponding value per name. A name that appears in an environment is called bound (to the corresponing value and within the environment), and a name that is not part of an environment is called free. The pair of a name and its corresponding value is called a binding.
Evaluation of an expression always occurs in the context of an environment. There is a default environment that is used for evaluating pavo programs. Implementations may provide mechanisms to evaluate programs in the non-default environment, but unless explicitly stated otherwise, we always assume programs/expressions to be evaluated in the default environment. The default environment includes many useful bindings, for now we only need to concern ourselves with the name int_val_max
that is bound to the value 9223372036854775807
.
With environments defined, the semantics of bound names are straightforward: A bound name expression evaluates to the value to which the name is bound in the evaluation environment. For example, in the default environment, int_val_max == 42
evaluates to false
.
What about free names? What would kjhkjhjk
(which is not bound in the default environment) evaluate to? This is a trick question: such an expression cannot be evaluated at all, similar to how *&[ z$ $
cannot be evaluated. An expression can only be evaluated in an environment if it contains no free names. Just like a syntax error, this is a static error: It is detected before evaluation of the program starts at all. This means that defining the semantics of bound names only is sufficient to get a well-defined semantics overall.
Binding Names
So far, names and environments are not very useful, we are stuck with the same environment throughout the whole evaluation of a program. The next expression changes that: let <name> = <expr>
is an expression that evaluates <expr>
and binds the <name>
to that value. Even if the name had been bound to a value previously, the new binding is used when evaluating further expressions. This process of "overwriting" a binding is called shadowing.
Before we can further explore these concepts, we need to introduce a couple of things. First, what does the let expression itself evaluate to? We don't really care, the expression is used to modify the environment, not to compute a value. To this end, pavo provides the nil
value. The value nil
is used whenever we don't actually care about the result of evaluating an expression. There is a corresponding nil
expression, it evaluates to the value nil
(just like e.g. the expression true
evaluates to the value true
).
The second problem we need to address is that a program consists of only a single expression, so once we wrote a let expression, there's no further expression left that could use the new binding. We fix this by expanding the definition of a program:
- A program consists of zero or more expressions, each separated by any amount of whitespace, a semicolon, and again any amount of whitespace. There may be up to one trailing semicolon (wrapped with any amount of whitespace).
- A program of zero expressions evaluates to
nil
. A program of one or more expressions is evaluated by evaluating the expressions in sequence and yielding the result of evaluating the last expression.
So now we can write programs like the following:
let a = 42;
let b = 43;
a == b; # This expression (and thus the whole program) evaluates to `false`.
Show whole feed