Re: [stack] SECD-like (explicit stack) machine for executing Joy?

stevan apter — 2003-06-28 13:25:18

have you considered implementing joy using continuation-passing?

john? have you thought about this for the functional rewrite of
your scheme implementation?

----- Original Message -----
From: "symstym" <symstym@...>
To: <concatenative@yahoogroups.com>
Sent: Saturday, June 28, 2003 3:34 AM
Subject: [stack] SECD-like (explicit stack) machine for executing Joy?

> The problem I ran into is that in the current C implementation of Joy, evaluation of
> quotations (by combinators) is done using recursion in C. For example, the
> implementation of dip 'recursively' calls the exeterm function to execute the
> quotation it popped off the stack. So, the current state of execution (dare I say, the
> continuation?) is stored implicitly on the C stack.
>
> For my (C++) application, I want a Joy interpreter that does one small (O(1)?) unit of
> work each time I invoke it. This seems to imply that I need to do more explicit
> bookeeping on some kind of stack. I looked into SECD machines a bit, and I think I'm
> looking for something exactly like that: a simple machine where the evaluation of
> each instruction affects a simple transformation of the data structures, so there is no
> recursive invocation of the evaluation function.

Brent Kerby — 2003-06-28 18:25:12

> The problem I ran into is that in the current C implementation of Joy,
evaluation of
> quotations (by combinators) is done using recursion in C. For example,
the
> implementation of dip 'recursively' calls the exeterm function to execute
the
> quotation it popped off the stack. So, the current state of execution
(dare I say, the
> continuation?) is stored implicitly on the C stack.
>
> For my (C++) application, I want a Joy interpreter that does one small
(O(1)?) unit of
> work each time I invoke it.

I've rewritten this kind of Joy interpreter before (granted, it was only a
slow toy for the obfuscated C coding contest :-)). The easy way to do it is
to use two stacks: the ordinary data stack, and a code stack. Then, "i",
"map", "dip", "ifte", etc. can be implemented by doing a few manipulations
to the code stack (i.e., they _change_ what the upcoming instructions are).
For example, to execute

3 [foo] dip more-instructions

When executing the thing on the left, after the interpreter pushed 3 and
"foo" onto the stack, it would implement "dip" by modifying the upcoming
code from this:

more-instructions

to this:

foo 3 more-instructions

It's also possible to merge the code and data stacks together, by thinking
of an instruction pointer moving through the program and applying rewrite
rules, such as (to use the same example)

3 [foo] dip more-instructions --> foo 3 more-instructions

But that's probably not terribly useful ...

Anyhow, back to task ... In implementing "map", you could use this approach:
If we just did

[1 2 3 4] [foo]

then, "map" could be implemented by pushing the following instructions onto
the code stack:

1 foo 2 foo 3 foo 4 foo [] cons cons cons cons

Or, alternatively,

[] 1 foo tack 2 foo tack 3 foo tack 4 foo tack

where "tack" is a combinator that "tacks" the top item onto the list
beneath:

[B] [A] tack == [B [A]]

Since you want really O(1) time, map could also be implemented in a
bite-size manner:

[1 2 3] [foo] map 4 foo tack

or

1 foo [2 3 4] [foo] map cons

> Russ
> symstym@...

Of course, there are some efficiency issues to worry about with this method.
It would be nice, for example, if all those copies of "foo" in the above
example were only really copies of pointers. It seems feasible, but would
probably require some work.

- Brent

Russel Simmons — 2003-06-29 03:38:50

> I've rewritten this kind of Joy interpreter before (granted, it was only a
> slow toy for the obfuscated C coding contest :-)). The easy way to do it is
> to use two stacks: the ordinary data stack, and a code stack. Then, "i",
> "map", "dip", "ifte", etc. can be implemented by doing a few manipulations
> to the code stack (i.e., they _change_ what the upcoming instructions are).

Yeah, I actually thought of something like this a little while after I posted :) As far as
the follow-up work to be done for 'ifte' after the evaluation of the IF part, I discovered
the 'branch' word, which does the trick. I guess if there are other words like 'ifte' that
_don't_ already have a corresponding word like 'branch' (and it can't be built out of
other words) then the word set would have to be augmented.. no big deal. But there
is a detail about map and other combinators that needs to be considered (see below).

> It's also possible to merge the code and data stacks together, by thinking
> of an instruction pointer moving through the program and applying rewrite
> rules, such as (to use the same example)

It's interesting to think about the different possible equivalent implementations. It
seems like the two stack method is the simplest one to implement, but its not
obvious to me how the performance would compare with other implementations. As
you mentioned, it would probably come down to management of lists. Copying
pointers where possible, copy on write, or perhaps garbage collection.. although it
seems like it would be possible to avoid GC if done carefully. Well hopefully the
simplicity of the two stack design should make thinking about the memory
management easier.

> Anyhow, back to task ... In implementing "map", you could use this approach:
> If we just did
>
> [1 2 3 4] [foo]
>
> then, "map" could be implemented by pushing the following instructions onto
> the code stack:
>
> 1 foo 2 foo 3 foo 4 foo [] cons cons cons cons

Actually this and the other variations don't exactly match the behavior of Joy's map.
If foo eats down into the stack (e.g. foo = +) or leaves more than one value on top,
then the behavior will not generally be the same, because Joy's map basically saves
and restores the stack before the evaluation of each item in the list. I'm not sure that
map should necessarily have this behavior, but at least that's how it is in Joy. I think
that Martin said in his response that Monkey's behavious matches that of your
suggested map, in that it doesn't use "local" stacks. It is certainly easier to implement
that way, but I'm leaning towards the leniency of Joy's style. That would mean
pushing onto the code stack sequences to restore the stack.. not a big deal, just
perhaps slow. Other combinators that have the same problem would be ifte, infra,
etc.

Thanks for the help! I will let you guys know if I get it working happily.

now let's see if I can get the line wrapping right this time

Russ

Russel Simmons — 2003-06-29 08:23:55

> For example, to execute
>
> 3 [foo] dip more-instructions
>
> When executing the thing on the left, after the interpreter pushed
3 and
> "foo" onto the stack, it would implement "dip" by modifying the
upcoming
> code from this:
>
> more-instructions
>
> to this:
>
> foo 3 more-instructions

Actually I noticed a small problem with this.
If instead of '3', we had a sequence like '[+] first', leaving an unquoted operator or
combinator on the stack, the above will not work. After executing foo, when we get
to the '+' in the code stack, it will be executed rather than just pushed back on the
stack as we intended. I guess the solution would be to 'escape' non-literals with
something like '[non-literal] first' when we stick them in the code stack.

Russ

Brent Kerby — 2003-06-30 05:45:04

> > For example, to execute
> >
> > 3 [foo] dip more-instructions
> >
> > When executing the thing on the left, after the interpreter pushed 3 and
> > "foo" onto the stack, it would implement "dip" by modifying the upcoming
> > code from this:
> >
> > more-instructions
> >
> > to this:
> >
> > foo 3 more-instructions
>
> Actually I noticed a small problem with this.
> If instead of '3', we had a sequence like '[+] first', leaving an unquoted
operator or
> combinator on the stack, the above will not work. After executing foo,
when we get
> to the '+' in the code stack, it will be executed rather than just pushed
back on the
> stack as we intended. I guess the solution would be to 'escape'
non-literals with
> something like '[non-literal] first' when we stick them in the code stack.

Good point. Joy's semantics with these unquoted operators has always baffled
me a bit. As I see it, operators like 'first' that pull lists apart fall
into a different class from more well-behaved combinators like 'swap',
'dup', 'cons', 'cat', 'tack', etc. (In particular, these well-behaved
combinators could be implemented even if quotations were compiled programs,
whereas 'first' definitely requires that the program exist in its original
list-of-operators form). But, of course, first and unquoted operators are
useful, so by no means should we throw them out.

So then, here's another possible way of solving your problem, other than
using an escape ... When we do "[+] first", then there is something on the
stack, call it +' . (Perhaps it would be handy for Joy to have syntax
directly for unquoted operators so they don't have to be constructed
sneakily using 'first'). Anyhow, then when we do "[foo] dip", we can simply
push "foo +'" onto the instruction stream, like normal. This presumes then
that the interpreter can handle +' , so we have to augment the language a
bit (well, I suppose it could all be internal, if desired), but that doesn't
seems too bad.

> Russ

- Brent

phimvt@lurac.latrobe.edu.au — 2003-07-02 02:42:02

On Sun, 29 Jun 2003, Brent Kerby wrote:

[...]

> Good point. Joy's semantics with these unquoted operators has
always baffled
> me a bit. As I see it, operators like 'first' that pull lists apart fall
> into a different class from more well-behaved combinators like 'swap',
> 'dup', 'cons', 'cat', 'tack', etc.

Example: In a particular family there are children [peter paul mary]
and pets [dup dip pop] and kitchen tables [square round] and for
the next month guests will be expected on the dates [3 8 17 22 29].
In each case, to get the first item of the respective lists you
use the 'first' operator. This will always work, even though 'dup'
is a Joy builtin and 'square' is defined in a library. It will
continue to work even if in the future 'peter' becomes a builtin
or is defined in a library. All pretty regular, I think.

[...]

> So then, here's another possible way of solving your problem, other
than
> using an escape ... When we do "[+] first", then there is something on the
> stack, call it +' .

This postfix quote may be OK for the program, but what ends up on the
stack should not be this quoted thing. One would want to be able to
cons the top item into another list/quotation, maybe to be executed,
maybe to be written. If the quoted thing ends up on the stack, then
you need another dequote operation to be able to cons it into something
else.

One may very well want a unary quotation operator, say your postfix
quote or a prefix Q. Note that this is a different sense of 'unary',
not the sense in which 'rest', 'not', 'square' are (semantically)
unary. Whatever the exact notation used, "+'" or "Q+" or "Q +" will
be an atom which pushes the (binary) addition operator, or whatever
the argument of the "'" or "Q" is. So in general,
[foo] first == foo' == Q foo
But the "'" or "Q" cannot stand on their own, they need an atom
as an argument to become syntactically correct. In the grammar
there would need to be another production:
factor ::= "Q" factor
This raises the question of what 'Q Q Q foo' should mean.
Perhaps the same as 'Q foo'. The implementation of Q would not
be quite as trivial as it may look, even just for single Q's,
because Q is not just another atom. Is it worth it ?

[...]

> (Perhaps it would be handy for Joy to have syntax
> directly for unquoted operators so they don't have to be constructed
> sneakily using 'first').

You are not persuaded? You don't like sneaky Joy? You want Yacl :

YACL = Yet Another Concatenative Language

Every syntactically correct Joy program is also a syntactically
correct Yacl program. The difference is in the semantics. What
in Joy are called operators and combinators are exactly like
literals in Yacl: when executed they result in a push operation.
So after the execution of
5 dup *
the stack will contain three items: '*' on top, 'dup' second,
and '5' third. (So execution is a bit like reversal, if you think
of the stack as a list). Quotations are left untouched, so
[1 2] [foo bar] concat
result in a stack of three items, 'concat' on top, '[foo bar]'
second, '[1 2]' third.

What is the point, you ask? Well, everything is treated as if it
were quoted, either as '[foo] first' or as "foo'" or as "Q foo"
So there is no need for the unary quotaion operator in Yacl.

But how does Yacl do any computation, like the square or 5 ?
Easy: There is another atom, not a unary operator, called 'E'
for 'execute'. The square of 5 is computed by either
5 dup E * E
or 5 E dup E * E
Semantically the E executes the top single item on the stack in
the same way that Joy would. If that item is an operator or combinator
E will execute it as such. If that item is a literal such as 5,
then E will push it on the stack. So for literals x, x E == x.
One could define such an E in Joy as E == [] cons i, of course,
and the little joy-in-joy uses something like that. But E in Joy
would be rarely used, in Yacl it is essential. One more example:
[1 2 3 4] [dup E * E] map E ==> [1 4 9 16]
In the now very exceptional case where you actually want the E
on top of the stack you would need '[E] first E', unless you
want to introduce a unary quotation operator: "E'" or "Q E".

End-of-Yacl-manual

Thank you all, I really enjoy (!) these discussions.

- Manfred

Nick Forde — 2003-07-02 09:09:47

Hi Russel,

Russel Simmons writes:
> > I've rewritten this kind of Joy interpreter before (granted, it was only a
> > slow toy for the obfuscated C coding contest :-)). The easy way to do it is
> > to use two stacks: the ordinary data stack, and a code stack. Then, "i",
> > "map", "dip", "ifte", etc. can be implemented by doing a few manipulations
> > to the code stack (i.e., they _change_ what the upcoming instructions are).
>
> Yeah, I actually thought of something like this a little while after I posted :) As far as
> the follow-up work to be done for 'ifte' after the evaluation of the IF part, I discovered
> the 'branch' word, which does the trick. I guess if there are other words like 'ifte' that
> _don't_ already have a corresponding word like 'branch' (and it can't be built out of
> other words) then the word set would have to be augmented.. no big deal. But there
> is a detail about map and other combinators that needs to be considered (see below).

Some time ago I did a quick and dirty port of the C Joy interpreter to
PalmOS. The interpreter's use of the C stack was the first thing I had
to change due to both OS stack size limits and the need to respond to
UI events from the user, e.g. to stop running a program.

I also used a "code" stack but didn't get much further than updating a
few of the recurisive primitives. At the same time I did some
performance profiling and found that often instruction dispatch time
was significant. I think this could be significantly reduced by
implementing a few common VM optimisation techniques. In particular
converting the main execution loop to threaded code, caching the
top-of-stack in a register, and using superinstructions for common
bytecode sequences.

> Thanks for the help! I will let you guys know if I get it working happily.

I'd love to hear how you get on. Given a little free time I'd like to
resurrect the PalmOS port I started and am sure I could learn from
your experiences in reimplementing the recursive primitives.

Good luck.

Regards,

Nick.

John Cowan — 2003-07-03 16:46:52

phimvt@... scripsit:

> One may very well want a unary quotation operator, say your postfix
> quote or a prefix Q. Note that this is a different sense of 'unary',
> not the sense in which 'rest', 'not', 'square' are (semantically)
> unary. Whatever the exact notation used, "+'" or "Q+" or "Q +" will
> be an atom which pushes the (binary) addition operator, or whatever
> the argument of the "'" or "Q" is. So in general,

We already can say "+" intern to push a raw + on the stack. You can
even build up operator names this way: "p" "op" concat intern [] i
will do a pop.

--
Do what you will, John Cowan
this Life's a Fiction jcowan@...
And is made up of http://www.reutershealth.com
Contradiction. --William Blake http://www.ccil.org/~cowan