The structure of a concatenative interpreter

Slava Pestov — 2004-06-16 21:27:33

Hi everybody,

I wanted to discuss different implementation strategies for the core of
a stack interpreter.

So I'm curious, how many other ways of implementing a stack interpreter
are there?

The Factor interpreter consists of three key components:

- datastack
- callstack - holds callframes to 'return to'
- callframe - a pointer inside currently-executing quotation

The interpreter is basically an infinite loop whose one iteration, in
imperitive pseudo-code is as follows:

if callframe == empty list
if callstack empty
top level has returned -- exit interpreter
else
callframe <- callstack pop
else
operand <- callframe.car
callframe <- callframe.cdr

if operand instanceof FactorWord

if operand is a compound definition
callstack push callframe
callframe <- word definition

else if operand is a primitive definition
execute operand

else
datastack push operand

Slava

John Cowan — 2004-06-17 01:58:18

Slava Pestov scripsit:

> So I'm curious, how many other ways of implementing a stack interpreter
> are there?

The Joy interpreter essentially uses the same mechanism you describe,
but uses recursion in the implementation language (C or Scheme).

Each call to the interpreter procedure steps through the elements of
a list/quotation: if the current element is a literal, it is pushed
on the stack; if it is a builtin, it is invoked (which may involve
invoking the interpreter recursively); if it is a defined word, the
interpreter invokes itself recursively to evaluate the definition.

--
Work hard, John Cowan
play hard, cowan@...
die young, http://www.reutershealth.com
rot quickly. http://www.ccil.org/~cowan

sa@dfa.com — 2004-06-17 15:05:38

my current model is http://www.nsl.com/k/tck/tck2.k

the core of the interpreter is the function 'a', which takes two
arguments: x, the result stack, and y, the input stack. 'a'
executes a single step and returns x' and y'.

o == first y
y == 1 drop y
if o is a function return the result of apply o to x and y
else return x append o and y

the evaluator applies 'a' repeatedly until y is empty.

a function o take x and y and return x' y'.

for example, the 'dip' combinator expects the result stack to have
the following form:

.. a [p]

if the input stack has the form:

..

then 'dip' will return:

result stack: ..
input stack: p a ..

that is, it will push a onto the input stack, then (unquoted) p.

the evaluator will then find p at the top of the input stack,
evaluate it, and then push a onto the result stack.

as it turns out, those functions which modify the input stack are
exactly the ones we call "combinators".

here's a trace of the execution steps of a typical evaluation:

t(10;20;30;40;50;(+;*);`dip)
((() / empty result stack
(10 / initial input stack
20
30
40
50
(+;*)
`dip))
(,10 / push 10
(20
30
40
50
(+;*)
`dip))
(10 20 / push 20
(30
40
50
(+;*)
`dip))
(10 20 30 / push 30
(40
50
(+;*)
`dip))
(10 20 30 40 / push 40
(50
(+;*)
`dip))
(10 20 30 40 50 / push 50
((+;*)
`dip))
((10 / push quotation (+;*)
20
30
40
50
(+;*))
,`dip)
(10 20 30 40 / dip: push 50 onto input stack, then push unquoted (+;*)
(+;*;50))
(10 20 70 / execute +
(*;50))
(10 1400 / execute *
,50)
(10 1400 50 / push 50
!0)
(10 1400 50 / empty input stack = done.
()))


Slava Pestov <slava@...> wrote on 06/16/2004 05:27:33 PM:

> Hi everybody,
>
> I wanted to discuss different implementation strategies for the core of
> a stack interpreter.
>
> So I'm curious, how many other ways of implementing a stack interpreter
> are there?
>
> The Factor interpreter consists of three key components:
>
> - datastack
> - callstack - holds callframes to 'return to'
> - callframe - a pointer inside currently-executing quotation
>
> The interpreter is basically an infinite loop whose one iteration, in
> imperitive pseudo-code is as follows:
>
> if callframe == empty list
> if callstack empty
> top level has returned -- exit interpreter
> else
> callframe <- callstack pop
> else
> operand <- callframe.car
> callframe <- callframe.cdr
>
> if operand instanceof FactorWord
>
> if operand is a compound definition
> callstack push callframe
> callframe <- word definition
>
> else if operand is a primitive definition
> execute operand
>
> else
> datastack push operand
>
> Slava
>
>
>
>
>
> Yahoo! Groups Links
>
>
>
>
>

Slava Pestov — 2004-06-22 19:57:40

sa@... wrote:
> for example, the 'dip' combinator expects the result stack to have
> the following form:
>
> .. a [p]
>
> if the input stack has the form:
>
> ..
>
> then 'dip' will return:
>
> result stack: ..
> input stack: p a ..
>
> that is, it will push a onto the input stack, then (unquoted) p.

I see a potential problem with this. Suppose you execute the equivalent
of this Factor code:

[ + ] car [ ... ] dip

Note that the symbol for + is pushed on the stack!

Now, the input stack will contain:

... +

However after executing ..., how will the interpreter know to simply
push back the +, and not execute it too?

Or does your language get around this by not allowing pushing symbols on
the stack? I admit its a weird case to do this but it comes in handy
when writing reflective code such as code browsers etc.

Slava

stevan apter — 2004-06-22 21:44:36

----- Original Message -----
From: "Slava Pestov" <slava@...>
To: <concatenative@yahoogroups.com>
Sent: Tuesday, June 22, 2004 3:57 PM
Subject: Re: [stack] The structure of a concatenative interpreter


> sa@... wrote:
> > for example, the 'dip' combinator expects the result stack to have
> > the following form:
> >
> > .. a [p]
> >
> > if the input stack has the form:
> >
> > ..
> >
> > then 'dip' will return:
> >
> > result stack: ..
> > input stack: p a ..
> >
> > that is, it will push a onto the input stack, then (unquoted) p.
>
> I see a potential problem with this. Suppose you execute the equivalent
> of this Factor code:
>
> [ + ] car [ ... ] dip
>
> Note that the symbol for + is pushed on the stack!
>
> Now, the input stack will contain:
>
> ... +
>
> However after executing ..., how will the interpreter know to simply
> push back the +, and not execute it too?
>
> Or does your language get around this by not allowing pushing symbols on
> the stack? I admit its a weird case to do this but it comes in handy
> when writing reflective code such as code browsers etc.

good point.

e(10;20;30;,(+);`first;,(*);`dip)
,610

one way to block this behavior is to implement the special syntactic
form of function quotation \+, and redefine dip to push \+ onto the
input stack. (cK already supports this. there was a long discussion
on the topic last year - search the archives for "[+]first".)

i've put a variant of tck2 which does this in

http://www.nsl.com/k/tck/tck2a.k

that is:

e(10;20;30;,(+);`first;,(*);`dip)
(10;600;+)

in particular,

e(10;20;`"+") <= note quoted function
(10;20;+) <= pushes the function

thanks.

>
> Slava
>
>
>
>
> Yahoo! Groups Links
>
>
>
>
>

phimvt@lurac.latrobe.edu.au — 2004-07-01 05:43:42

On Tue, 22 Jun 2004, stevan apter wrote:

>
> ----- Original Message -----
> From: "Slava Pestov" <slava@...>
> To: <concatenative@yahoogroups.com>
> Sent: Tuesday, June 22, 2004 3:57 PM
> Subject: Re: [stack] The structure of a concatenative interpreter

[..]

> > I see a potential problem with this. Suppose you execute the equivalent
> > of this Factor code:
> >
> > [ + ] car [ ... ] dip
> >
> > Note that the symbol for + is pushed on the stack!
> >
> > Now, the input stack will contain:
> >
> > ... +
> >
> > However after executing ..., how will the interpreter know to simply
> > push back the +, and not execute it too?

In Joy there is no problem. After + has been pushed and then [...],
the dip combinator shifts the + from the Joy stack to an auxiliary
structure called the dump (this was supposed to be in analogy with
the D (dump) in Landin's SECD machine). The dip combinator does this
shift for whatever is 2nd on the stack (number, quotation, symbol ..),
and after executing [...] shifts it back.

I cannot claim that when I designed the interpreter I foresaw the
problem you outlined. It was sheer luck that I chose a solution
that worked uniformly.

[..]

> one way to block this behavior is to implement the special syntactic
> form of function quotation \+, and redefine dip to push \+ onto the
> input stack. (cK already supports this. there was a long discussion
> on the topic last year - search the archives for "[+]first".)

You might want to consider a cK analogue of the Joy dump. I think
it keeps things regular.

Joy users don't normally use the dump explicitly, but the _help
command shows some special rarely used commands, including _dump
to push the dump onto the stack for debugging the Joy interpreter.
There are a few other things called dump1, dump2 .. for temporarily
saving all sorts of other things, especially for the more conplex
combinators. Because garbage collection can occur any time, nothing
can be safely saved in local variables (on the C-stack). All this
would not be necessary if Joy were rewritten to rely exclusively
on the BDW collector.

Occasionally I have toyed with the idea of implementing two new
commands, called perhaps peek and poke, to be used in conjunction
the dip combinator. The idea would be that whatever the most recent
saving by dip can be made visible by peek, and can be modified
by poke:

1 2 3 4 + [.. peek ..] dip
^^^^ This will push a copy of 7
onto Joy stack, for whatever use.
^^^ This will restore the original 7
after ..] finishes.
1 2 3 4 + [.. 9 poke ..] dip
^^^^ Pops 9 off stack, replaces
the original saved 7 by 9.
^^^ Now the 7 is no longer there,
so the 9 will be pushed.

Peek would be quite harmless, but poke might at least look like
procedural programming with assignment. But actually I don't
think it changes the functional nature of Joy. On the other hand,
I have not found any uses for peek and poke.

- Manfred

Slava Pestov — 2004-07-01 09:26:51

Hi,

Your dump sounds a lot like the FORTH and Factor return stack. The
return stack is visible and can be used for temporary storage:

1 3 7 >r + r>
==> 4 7

>r moves the top of the data stack to the return stack, and r> moves
the top of the return stack to the data stack.

In fact I implement dip as follows:

: dip ( a [ quot ] -- quot a )
swap >r call r> ;

'call' is equivalent to Joy's 'i' combinator.

Since the return stack is also used for activation records, misplacing
>r/r> can send the interpreter into hyperspace.

phimvt@... wrote:
> On Tue, 22 Jun 2004, stevan apter wrote:
>
>
>>----- Original Message -----
>>From: "Slava Pestov" <slava@...>
>>To: <concatenative@yahoogroups.com>
>>Sent: Tuesday, June 22, 2004 3:57 PM
>>Subject: Re: [stack] The structure of a concatenative interpreter
>
>
> [..]
>
>
>>>I see a potential problem with this. Suppose you execute the equivalent
>>>of this Factor code:
>>>
>>>[ + ] car [ ... ] dip
>>>
>>>Note that the symbol for + is pushed on the stack!
>>>
>>>Now, the input stack will contain:
>>>
>>>... +
>>>
>>>However after executing ..., how will the interpreter know to simply
>>>push back the +, and not execute it too?
>
>
> In Joy there is no problem. After + has been pushed and then [...],
> the dip combinator shifts the + from the Joy stack to an auxiliary
> structure called the dump (this was supposed to be in analogy with
> the D (dump) in Landin's SECD machine). The dip combinator does this
> shift for whatever is 2nd on the stack (number, quotation, symbol ..),
> and after executing [...] shifts it back.
>
> I cannot claim that when I designed the interpreter I foresaw the
> problem you outlined. It was sheer luck that I chose a solution
> that worked uniformly.
>
> [..]
>
>
>>one way to block this behavior is to implement the special syntactic
>>form of function quotation \+, and redefine dip to push \+ onto the
>>input stack. (cK already supports this. there was a long discussion
>>on the topic last year - search the archives for "[+]first".)
>
>
> You might want to consider a cK analogue of the Joy dump. I think
> it keeps things regular.
>
> Joy users don't normally use the dump explicitly, but the _help
> command shows some special rarely used commands, including _dump
> to push the dump onto the stack for debugging the Joy interpreter.
> There are a few other things called dump1, dump2 .. for temporarily
> saving all sorts of other things, especially for the more conplex
> combinators. Because garbage collection can occur any time, nothing
> can be safely saved in local variables (on the C-stack). All this
> would not be necessary if Joy were rewritten to rely exclusively
> on the BDW collector.
>
> Occasionally I have toyed with the idea of implementing two new
> commands, called perhaps peek and poke, to be used in conjunction
> the dip combinator. The idea would be that whatever the most recent
> saving by dip can be made visible by peek, and can be modified
> by poke:
>
> 1 2 3 4 + [.. peek ..] dip
> ^^^^ This will push a copy of 7
> onto Joy stack, for whatever use.
> ^^^ This will restore the original 7
> after ..] finishes.
> 1 2 3 4 + [.. 9 poke ..] dip
> ^^^^ Pops 9 off stack, replaces
> the original saved 7 by 9.
> ^^^ Now the 7 is no longer there,
> so the 9 will be pushed.
>
> Peek would be quite harmless, but poke might at least look like
> procedural programming with assignment. But actually I don't
> think it changes the functional nature of Joy. On the other hand,
> I have not found any uses for peek and poke.
>
> - Manfred
>
>
>
>
>
> Yahoo! Groups Links
>
>
>
>
>

John Cowan — 2004-07-01 12:34:55

Slava Pestov scripsit:

> Your dump sounds a lot like the FORTH and Factor return stack. The
> return stack is visible and can be used for temporary storage:

In essence, yes. Actual return addresses in the Joy0 and Joy1
interpreters are stored on the C stack; the dump is used because
the precise garbage collector used with Joy0 and some builds of Joy1
cannot understand the C stack, and so no local pointer variables can be
allocated there. This restriction is a great pain when writing code for
the Joy interpreter, since one must carefully maintain the dump by hand
in synchrony with the C stack, pushing at the beginning of C routines
and popping at the end. The alternative dumps are provided because it
would be painful to maintain the equivalent of multiple local variables
otherwise.

With a more tractable implementation language like C-with-conservative-
garbage-collection, Java, or Scheme, the dump would be unnecessary;
the implementation language stack could be used throughout.

--
LEAR: Dost thou call me fool, boy? John Cowan
FOOL: All thy other titles http://www.ccil.org/~cowan
thou hast given away: cowan@...
That thou wast born with. http://www.reutershealth.com

Slava Pestov — 2004-07-01 17:32:28

John Cowan wrote:
> This restriction is a great pain when writing code for
> the Joy interpreter, since one must carefully maintain the dump by hand
> in synchrony with the C stack, pushing at the beginning of C routines
> and popping at the end.

I get around this problem in the new C rewrite of Factor by keeping as
little as possible in C! Right now the C core is 1,400 lines. The rest
-- control structures, parser, dictionary lookup, high level I/O, etc --
is written in Factor.

Slava

John Cowan — 2004-07-01 17:47:38

Slava Pestov scripsit:

> I get around this problem in the new C rewrite of Factor by keeping as
> little as possible in C! Right now the C core is 1,400 lines. The rest
> -- control structures, parser, dictionary lookup, high level I/O, etc --
> is written in Factor.

I urge you again, when you add garbage collection, to use the Boehm conservative
collector. Writing bulletproof garbage collectors is *hard*. I've written
three of them, and they invariably lead to mysterious bugs in other parts of
the code because the GC changed something that should not have been changed
or failed to change something it should have. Knowing from the beginning
that you have a robust garbage collector improves your confidence in the
behavior of your program immensely, particularly as few test cases
exercise the GC properly on modern machines.

--
I suggest you call for help, John Cowan
or learn the difficult art of mud-breathing. jcowan@...
--Great-Souled Sam http://www.ccil.org/~cowan

stevan apter — 2004-07-01 22:09:38

----- Original Message -----
From: "John Cowan" <cowan@...>
[:]
> With a more tractable implementation language like C-with-conservative-
> garbage-collection, Java, or Scheme, the dump would be unnecessary;
> the implementation language stack could be used throughout.


my first cut at cK relies on K's stack, which is the C stack. but
this imposes a hard limit on the depth of recursion, which is very
much less than available memory. for example, factorial in terms
of linrec, which is

[null] [succ] [dup pred] [*] linrec

fails with a stack error at < 10000 fact. not a very realistic
example, i admit, but i'm sure there are common cases where you'd
want to recur past the limit set by the C stack.

in cK, linrec is implemented as the following function:

{[s;i;t;e;f]:[E_[s;i];E[s;t];E[linrec[E[s;e];i;t;e;f];f]]}

it takes as input the stack with the top four elements exposed and
named:

s i t e f

and returns a new stack.

E[x;y] is (roughly) 'i', or slava's 'call', where E takes a stack
x and a program y and returns a new stack. E_ returns just the
top element of that computed stack. so linrec is

if E_[s;i] is true
then return E[s;t]
else return E[linrec[E[s;e];i;t;e;f];f]

to avoid using the K stack -- to allow recursions to expand to take
available memory -- i implemented a two-stack design, where e.g. linrec
takes

x y i t e f

again, the result stack is expected to have the form

.. i t e f

and linrec is called with .. as x, the program stack y, and the top
four elements of the result stack i, t, e, f:

linrec:{[x;y;i;t;e;f](x;i,(i;t;e;f;x;linrec_),y)

it now returns *two* things: the result stack x, and a new *program stack*

unquoted[i] i t e f x linrec_ y

the program-stack is then evaluated up to linrec_, which is the function

linrec_:{[x;y;b;i;t;e;f;s](s;:[b;t;e,(i;t;e;f;linrec),f],y)}

linrec_ returns two things: the old result stack s and a new program-stack.
the new program stack is computed by pushing onto the program stack either:
(i) unquoted t if b is true (where b is the result of evaluating unquoted i
left on the stack by linrec), or, if b is false, the "recursion" step (ii):

unquoted[e] i t e f linrec y

in the two-stack approach, every combinator has to be unpacked to set up
the result stack so the intended "recursion" just "happens" in the right way.
so far, i've only analyzed linrec, i, dip, and cond, and the k combinators
over, converge, and right. implementing the "advanced" combinators of joy
(e.g. treegenrec, condnestrec, &c.) in cK gave me a headache which lasted
days, so i keep finding excuses to defer the task of writing two-stack versions.

phimvt@lurac.latrobe.edu.au — 2004-07-05 06:26:48

On Thu, 1 Jul 2004, John Cowan wrote:

[..]

> With a more tractable implementation language like C-with-conservative-
> garbage-collection, Java, or Scheme, the dump would be unnecessary;
> the implementation language stack could be used throughout.

Yes, and for most cases local C variables could replace the dumps.
But there are a few troublesome cases for some of the more
complex combinators, say "comb", where the code looks like
this:

void comb-aux()
{ if (some-condition) do something (reading dumps);
else recursively call comb-aux; }

void comb()
{ check parameters and types;
push stuff on dumps to save it;
call comb-aux;
pop the used dumps; }

If the dumps were replaced by local variables in comb(), then
comb-aux() could not access them. If this were Pascal, one would
make comb-aux() local to comb(). Then the static chain (or display)
could access the local variables in the enclosing function. In
Gnu-C such nesting is also allowed, using a "tramboline" mechanism
which I do not understand. I do not know if the BDW garbage collector
is compatible with trampolines (John ?).

If plain ol' C is used, then there are at least two possibilities
which avoid nesting: 1: give comb-aux some reference parameters to
those local variables of comb() which replace dumps. This is mildly
painful and slightly inefficient for the recursive calls of com-aux.
2: instead of local variables, use global ones just here, as a global
array serving as a stack (in partial synchrony with the C stack).
Then as for the current dumps, comb() starts by pushing on this
stack and ends by popping this stack. All this only for those
combinators which need a subsidiary -aux() function.

Any other tips for whoever re-implements Joy?

- Manfred

Slava Pestov — 2004-07-05 06:35:44

I think the simplest solution is to write these combinators in Joy
itself, with a minimal set in C (ifte and such). Why was this approach
not chosen for the current implementation?

phimvt@... wrote:
> On Thu, 1 Jul 2004, John Cowan wrote:
>
> [..]
>
>
>>With a more tractable implementation language like C-with-conservative-
>>garbage-collection, Java, or Scheme, the dump would be unnecessary;
>>the implementation language stack could be used throughout.
>
>
> Yes, and for most cases local C variables could replace the dumps.
> But there are a few troublesome cases for some of the more
> complex combinators, say "comb", where the code looks like
> this:
>
> void comb-aux()
> { if (some-condition) do something (reading dumps);
> else recursively call comb-aux; }
>
> void comb()
> { check parameters and types;
> push stuff on dumps to save it;
> call comb-aux;
> pop the used dumps; }
>
> If the dumps were replaced by local variables in comb(), then
> comb-aux() could not access them. If this were Pascal, one would
> make comb-aux() local to comb(). Then the static chain (or display)
> could access the local variables in the enclosing function. In
> Gnu-C such nesting is also allowed, using a "tramboline" mechanism
> which I do not understand. I do not know if the BDW garbage collector
> is compatible with trampolines (John ?).
>
> If plain ol' C is used, then there are at least two possibilities
> which avoid nesting: 1: give comb-aux some reference parameters to
> those local variables of comb() which replace dumps. This is mildly
> painful and slightly inefficient for the recursive calls of com-aux.
> 2: instead of local variables, use global ones just here, as a global
> array serving as a stack (in partial synchrony with the C stack).
> Then as for the current dumps, comb() starts by pushing on this
> stack and ends by popping this stack. All this only for those
> combinators which need a subsidiary -aux() function.
>
> Any other tips for whoever re-implements Joy?

John Cowan — 2004-07-05 17:22:15

phimvt@... scripsit:

> In Gnu-C such nesting is also allowed, using a "trampoline" mechanism
> which I do not understand.

"The trampoline that doesn't bend your brain is not the true trampoline."

This is the best brief explanation of GNU C trampolines I know,
from http://www.uwsg.iu.edu/hypermail/linux/kernel/9704.1/0547.html :

# Normal C functions are not nested.
#
# GNU C adds nested functions (ala Pascal) that can modify variables in
# the outer functions:
#
# int outer(){
# int i = 0;
#
# void inner () {
# i = 1;
# }
#
# inner ();
# return i;
# }
#
# The compiler, when calling a nested function, passes in a register
# its own stack frame, which is known as the static chain.
#
# In addition, you can pass the address of a nested function:
#
# int outer() {
# int i = 0;
#
# void inner () {
# i++;
# }
#
# other_func (inner);
# return i;
# }
#
# However, you have a problem, because the static chain is not passed,
# and so inner could clobber random memory or segfault. So what the
# compiler does is construct what it calls a trampoline on the stack
# that has executable code in it to load the static chain and jump to
# the real function. You can't use static storage for the trampoline,
# since the outer function might be called recursively. You can't use
# malloc'ed storage for the trampoline, since a longjmp would not free
# the allocated blocks.

The term "trampoline" is also used in a somewhat different sense
for a driver loop that invokes one of a group of procedures, each of
which returns a pointer to a different procedure to be called next.
This technique allows one to emulate non-local control flow in C,
and has nothing to do with nested procedures.

> I do not know if the BDW garbage collector
> is compatible with trampolines (John ?).

No reason why not.

--
"Well, I'm back." --Sam John Cowan <jcowan@...>

phimvt@lurac.latrobe.edu.au — 2004-07-12 06:52:50

On Mon, 5 Jul 2004, John Cowan wrote:

> phimvt@... scripsit:
>
> > In Gnu-C such nesting is also allowed, using a "trampoline" mechanism
> > which I do not understand.
>
> "The trampoline that doesn't bend your brain is not the true trampoline."
>
> This is the best brief explanation of GNU C trampolines I know,
> from http://www.uwsg.iu.edu/hypermail/linux/kernel/9704.1/0547.html :

[.. long quote from the above ..]

> > I do not know if the BDW garbage collector
> > is compatible with trampolines (John ?).
>
> No reason why not.

Thanks a lot for all that, John. It was very helpful.
To summarise, for a future re-implementation of Joy:

The current implementation of Joy has its own garbage collector,
but it cannot collect strings. John Cowan had modified it so
that (miraculously) it can be compiled either using its own
garbage collector, or the BDW collector which will also collect
strings.

A future implementation might not bother about its own collector
but always relies on the BDW collector. This will would make
the program simpler. Moreover, the inbuilt collector needs
a fair bit of machinery, the various dumps. So, eliminating that
will probably make it more efficient.

The dumps are used by the combinators, and in most cases they
could be eliminated by using local variables instead, since
these will be freed upon exit from the combinator function.
But a small number of combinator functions also use an auxiliary
function which needs to access what the main combinator function
has saved on various dumps. For these few combinators there are
three possibilities:
1. As for all other combinators, replace the dumps by local
variables. The variables are then passed by reference to
the auxiliary functions.
2. Again, replace dumps by locals, but nest the auxiliary
function inside the main function (as in Pascal). Use
GNU-C as the compiler, since it allows such nesting
with access to the variables of the enclosing function.
3. Use a global stack in place of the locals, push what
would have been saved in a local, and upon exit pop that
stack. This stack could be implemented by an array.

I do not have strong opinions as to which of the three methods
would be best. It might well depend on the BDW collector.

(Incidentally, re 3., in the current implementation the dumps are
essentially stacks implemented as linked lists. I chose that
in part for uniformity, in part because I thought it would be
useful for debugging to be able to push a dump onto the Joy
stack so that it can be printed. I am no longer certain whether
this was necessary.)

- Manfred