|
| MATH0069: Programming Language Implementation TechniquesDepartment of Mathematical SciencesA Normal Form |
|---|
An obvious awkwardness about the CEK rules given earlier is the way
in which we have to define multiple rules for any term that needs to
know the values of sub-terms in order to proceed, as is the case with
function application, conditions and let binding.
The rules can be simplified by translating terms into A normal
form (ANF) and then using a ANF version of the CEK machine. In
summary, the purpose of rewriting the program into ANF is that any
sub-term is guaranteed to be one of an integer constant, a string
constant, a variable or a rec term, each of which is
trivially evaluable - a task we delegate to the gamma
function. The consequence is that every source term can then be
evaluated in a single step, since there are no complex sub-terms on
which we have to call the interpreter recursively.
Simplification of the rules is not the only issue in question however:
the first rule set also led to the creation of very many activiation
records in order to handle the "recursive call" of the interpreter so
that we could recognize when we had finished evaluating a sub-term.
Yet, in effect they are only administrative, in the sense that they
are carrying out the simplification of the sub-terms at run-time, when
that reorganization can in fact be done statically (at compile-time,
if you wish). Thus it is that the ANF rule set only creates
activation records when it is essential, in order to implement a
function call (in non-tail position).
Thus a new rule set for ANF/TLL is:
Leaving us with the question of how to translate terms into ANF. We can specify what we want to achieve by means of a recursive procedure:
Note: this is not a particularly efficient way of doing the job, since the way in which recursion is used leads to each constructed term being analysed over again. On the other hand it is relatively clear what is happening. A much more efficient (linear cost) implementation uses a form of continuation passing to remember "what to do next". This comes from a paper about ANF (Flanagan et al, 1993) and is written in Scheme:
And a simple example of what it can do...given the input:
A familiar and slightly bigger example illustrates the earlier reference to tail-calling. First the traditional recursive factorial:
which translates to:
in which we observe that the recursive call is in the initializer of a let expression, the rule for which we see constructs an activiation record to enable the return to evaluate the body. It should be obvious that this definition will construct a number of activation records equal to the magnitude of the argument to g/f. Now look at the traditional accumulating parameter recursive factorial:
which translates to:
Here we observe that the recursive call is in the body of the let and the rule for evaluating a call does not construct an activation record, but instead passes on the current activation record. The net effect of this is that this definition does not construct any activiation records at all, despite being a recursive function. The reason is that the ANF translation formalizes the notion that the last function call in a function can be replaced by a goto. In consequence, recursive definitions having this property run in bounded space, just as if they were loops.