Another implemenatation of Joy.

Martin Young — 2000-08-07 15:08:10

Hi Folks,

As per the subject you might be interested to know that I've produced a Joy
interpreter in Prolog.

At the moment it's a pretty close to Joy as described in the documentation but
without the large library (but I'm working on that...). I've added a system of
modules which (I hope) makes programs easier to break up. I've changed the
comment notation too. Missing are strings (which I intend to render as
syntactic sugar for a list of small integers), sets (which I don't like, and
won't be adding), and input (which I don't intend to add directly from Joy).

For example:

---- snip -- snip ----

<< The interpreter always runs word "main" from module "main", with an
empty stack, on startup. >>
module main provides main .
uses io control .

main ==
{{
[ 45 emit ] 10 repeat nl
10 factorial . nl
[ 45 emit ] 10 repeat nl
}}

---- snip -- snip ----

The module definition denotes which word are exported from the current module,
and which modules' exported words are imported. Unlike C, modules are not
chained - in the above, module "io" might include a module "foo" but the words
exported by "foo" would not be visible in "main".

In the spirit of Joy a word can always be lexically replaced by its definition.
A word actually consists of the ascii string the programmer sees and modules
where that instance of the word originated. When a user defined word is
encountered, the definition is taken from the originating module *not* the
module of the word currently being executed.

For example:

module main provides main foo .
uses an_external_module
foo ==
{{
42
}}

main ==
{{
[ foo ] run
}}

module an_external_module provides run .
foo ==
{{
84
}}

run ==
{{
i . nl
}}


Here, the output will be 42 because the instance of "foo" executed in "run"
originates in module "main" so that definition is used. As the user is not
allowed to construct new word names at runtime, every word is forced to have an
originating module.

The upshot of this is that lexical substitution is retained and so ease
reasoning about a program is also retained. Most importantly, the meaning of a
particular instance of a word is constant throughout the execution of a
program.

Once the main program has been loaded, the interpreted tries to import modules
which are used by looking in the current directory then in a directory
specified by an environment variable. The ordering of modules is not
significant which implies that there is no problem with forward references.

The source code is a little under 300 lines (with liberal use of whitespace and
comments). It's not terribly efficient or quick though. I'll make it
available once I work out where my ISP has hidden my "free" web space. I'm
using SWI Prolog as it's extrememly debuggable but I haven't used anything
outside of the ISO predicates.

My intention in writing yet another implementation is that I want to
experiement with extending the langauge but found the reference implementation
to be rather opaque and flakey.

I hope to look as adding a type system (I like the parallel data and type
stacks idiom), concurrency, programs with proofs, and a better IO system.
Obviously I don't have time to do all of this right now (who does?).

On another topic, I was browsing over the mailing list this morning. Somebody
suggested that when one "dup"s a list either the list is copied
element-by-element, and just reference is taken and so altering one copy will
side-effect the other. I don't believe this is the case.

If we have a list L1 where the name "L1" is a symbolic name for a handle on the
head of the list.

L1 -> [Item1]->[Item2]->[Item3]->[Item4]->[]

We push L1 onto a stack which then looks like this: [L1]. A "dup" gives us a
stack of the form [L1 L1], or write it out in full:

[
[Item1]->[Item2]->[Item3]->[Item4]->[]
[Item1]->[Item2]->[Item3]->[Item4]->[]
]

If we then add an item to the first list we get:

[
[Item1]->[Item2]->[Item3]->[Item4]->[]
[NewItem1]->[Item1]->[Item2]->[Item3]->[Item4]->[]
]

...and then remove the head of the second list, we get.

[
[Item2]->[Item3]->[Item4]->[]
[NewItem1]->[Item1]->[Item2]->[Item3]->[Item4]->[]
]

Finally, we swap the top two items on the second list. We can't just make
[Item3] point to [Item2] and the value on the stack point to [Item3]. In this
case we need to create two new items thusly:

[
[Item5]->[Item6]->[Item4]->[]
[NewItem1]->[Item1]->[Item2]->[Item3]->[Item4]->[]
]

Where items 5 and 6 carry the same values as 4 and 3 respectively.

My point is that we have non-interfering lists but without the overhead of a
full list copy. This is important for Joy programs like:

[ 1 2 3 4 5 ] dup uncons . drop

Where we often make copies of lists. AFAIK most list intensive languages do
this under the covers so my Prolog version (and presumably the j and k
versions) get it for free.

OKay. I thinks that's long enough. I'll shut up now.


--
Martin Young working for STMicroelectronics at `(o)_(o)' The fat wise /
1000 Aztec West, Almondsbury, Bristol, BS32 4SQ. ( V ) owl eats only >
+44 1454 462 523 `v' Martin.Young@... `.___,' clean mice. /
_(_)_ -="==="=============='

Juho Snellman — 2000-08-07 16:44:45

On Mon, Aug 07, 2000 at 04:08:10PM +0100, Martin Young wrote:
> On another topic, I was browsing over the mailing list this morning.
> Somebody
> suggested that when one "dup"s a list either the list is copied
> element-by-element, and just reference is taken and so altering one
> copy will
> side-effect the other. I don't believe this is the case.

In all of the current Joy implementations that's true. The
question is whether only doing shallow copies would be useful.
The problems with shallow copies are (IMHO):

* Doing shallow copies on collections and deep copies on
scalars would be too inconsistent. (And doing shallow
copies on scalars too would make a mess...)
* A deep copy dup would have to be provided, since deep
copies are clearly needed.
* There isn't any application for shallow copies!

In fact, the interesting question is whether adding a
shallow-dup would have *any* applications at all.

> Finally, we swap the top two items on the second list. We can't just make
> [Item3] point to [Item2] and the value on the stack point to [Item3].
> In this
> case we need to create two new items thusly:
>
> [
> [Item5]->[Item6]->[Item4]->[]
> [NewItem1]->[Item1]->[Item2]->[Item3]->[Item4]->[]
> ]
>
> Where items 5 and 6 carry the same values as 4 and 3 respectively.

Why in the world would you need to do that? Those are still the
same objects, and certainly don't need to be copied until their
values change. Since we already need refcounting for the
objects that are contained in lists, it would be silly to not
take full advantage of that.

Here's a dump from my Joy-like interpreter, 'zoe'.

zoe> [1 2 3] dup
STACK (3):
>> [ 1 (0x805bef8 / 1) 2 (0x805bf18 / 1) 3 (0x805bf38 / 1) ] (0x805beb8 / 2)
>> [ 1 (0x805bef8 / 1) 2 (0x805bf18 / 1) 3 (0x805bf38 / 1) ] (0x805beb8 / 2)

(Here the two lists on the stack point to the same internal list, so
all elements have a refcount of 1.)

zoe> uncons
STACK (7):
>> [ 2 (0x805bf18 / 2) 3 (0x805bf38 / 2) ] (0x805bf68 / 1)
>> 1 (0x805bee8 / 2)
>> [ 1 (0x805bef8 / 2) 2 (0x805bf18 / 2) 3 (0x805bf38 / 2) ] (0x805beb8 / 1)

(We modified one of the lists, so they no longer point to the
same internal list. Therefore the refcounts of the objects increase.)

zoe> uncons [swap] dip cons cons
STACK (7):
>> [ 2 (0x805bf18 / 2) 1 (0x805bef8 / 2) 3 (0x805bf38 / 2) ] (0x805bf68 / 1)
>> [ 1 (0x805bef8 / 2) 2 (0x805bf18 / 2) 3 (0x805bf38 / 2) ] (0x805beb8 / 1)

(We put the objects back into the list in a different order. Since
their values didn't change, there was no need to make copies of them.)

--
Juho Snellman
"C:stä on kehitetty Massachusettsin teknillisessä korkeakoulussa kieli
nimeltä BCPL."

Martin Young — 2000-08-07 17:27:12

On Aug 7, 7:44pm, jsnell@... wrote:
> In fact, the interesting question is whether adding a
> shallow-dup would have *any* applications at all.

Indeed. My opinion is that even it has an application it shouldn't be provided
because of the side-effects it allows.

> > Finally, we swap the top two items on the second list. We can't just make
> > [Item3] point to [Item2] and the value on the stack point to [Item3].
[snip]

> Why in the world would you need to do that? Those are still the
> same objects, and certainly don't need to be copied until their
> values change.

[snip example]

Yes, I see how that works now.

I was assuming that every object in the system is a node in exactly one list
(from the implemenation's point of view - the running program may have it in
several lists) i.e. it's either on the stack (which I consider to be a special
list) as a simple value or in a list, on the stack. With the "exactly one"
clause each item has a "next" reference which leads to the next item or to the
empty list, and a "data" reference which specifies the value.

This means that the list structure is implicit in the objects. Upon a "dup"
both the list structure and the data are automatically and inseperably shared.

In "my" system if you change the data or metadata of a list, you must create
new items so that anybody else who is also has reference not see your changes.
All parts that don't change can still be shared. Hence the new items in my
example.

Thinking further about it, your scheme doesn't require a deep copy either. You
could share the list structures in exactly the same way.

I'm away from by bookshelf at the moment but I 99% sure this is why Prolog
difference lists (as described in Sterling and Shaperio, "The Art Of Prolog")
are so quick. Sterling and Shaperio note that there's a Lisp construct which
takes advantage of the same property, but I'm not a Lisp programmer and I can't
remember what it's called.

--
Martin Young working for STMicroelectronics at `(o)_(o)' The fat wise /
1000 Aztec West, Almondsbury, Bristol, BS32 4SQ. ( V ) owl eats only >
+44 1454 462 523 `v' Martin.Young@... `.___,' clean mice. /
_(_)_ -="==="=============='

Juho Snellman — 2000-08-07 18:33:27

On Mon, Aug 07, 2000 at 06:27:12PM +0100, Martin Young wrote:
> On Aug 7, 7:44pm, jsnell@... wrote:
> > In fact, the interesting question is whether adding a
> > shallow-dup would have *any* applications at all.
>
> Indeed. My opinion is that even it has an application it shouldn't be
> provided because of the side-effects it allows.

Talking about side-effects in a language where all data is
global and mutable by any function seems a bit futile...
(If we implement the infra-combinator, this is not true. But
infra is a bad idea anyway.)

I'm assuming that shallow-dup would be optional, with deep-dup
as the default. Shallow-dup would be only used if we want a
method for changing objects deep in the stack, which is transparent
to all subsequent functions. (Something that they could do
themselves, having full access to the stack.)

No, I can't think of a reason to do that. But side-effects
aren't a reason for not doing it. :-)

> In "my" system if you change the data or metadata of a list, you must create
> new items so that anybody else who is also has reference not see your
> changes. All parts that don't change can still be shared. Hence the new
> items in my example.
>
> Thinking further about it, your scheme doesn't require a deep copy either.
> You could share the list structures in exactly the same way.

Hmm... Well, I share the list metadata until either list
is modified in any way. Since all of the metadata is in the
list object (implemented as an array), it'd be difficult to
share any metadata between them after that.

I wonder whether sharing the metadata actually has any
performance advantages? Wouldn't the excessive creation of
new objects would cause a worse slowdown than making copies
of the metadata?

--
Juho Snellman
"C:stä on kehitetty Massachusettsin teknillisessä korkeakoulussa kieli
nimeltä BCPL."

Martin Young — 2000-08-07 19:28:44

On Aug 7, 9:33pm, jsnell@... wrote:
> On Mon, Aug 07, 2000 at 06:27:12PM +0100, Martin Young wrote:
> > Indeed. My opinion is that even it has an application it shouldn't be
> > provided because of the side-effects it allows.
> Talking about side-effects in a language where all data is
> global and mutable by any function seems a bit futile...

OTOH, I'm interested in concurrency and proofs. Both of these are more
difficult in the presence of side-effects, even explicit ones.

> (If we implement the infra-combinator, this is not true. But
> infra is a bad idea anyway.)

Why so? It seems to me to be a useful basis for concurrency.

> I wonder whether sharing the metadata actually has any
> performance advantages? Wouldn't the excessive creation of
> new objects would cause a worse slowdown than making copies
> of the metadata?

I don't know. I expect it's possible to build programs which demonstrate the
advantages of both approaches. AFAIK there aren't any large real-world
programs to do benchmarking with.

I wonder if a collection of common homework exercises implemented in Joy would
be a useful resource. I've seen several of these on this list already.


--
Martin Young working for STMicroelectronics at `(o)_(o)' The fat wise /
1000 Aztec West, Almondsbury, Bristol, BS32 4SQ. ( V ) owl eats only >
+44 1454 462 523 `v' Martin.Young@... `.___,' clean mice. /
_(_)_ -="==="=============='

Louis Madon — 2000-08-07 21:49:31

Martin Young wrote:
>
> On Aug 7, 9:33pm, jsnell@... wrote:
> > On Mon, Aug 07, 2000 at 06:27:12PM +0100, Martin Young wrote:
> > > Indeed. My opinion is that even it has an application it shouldn't be
> > > provided because of the side-effects it allows.
> > Talking about side-effects in a language where all data is
> > global and mutable by any function seems a bit futile...
>
> OTOH, I'm interested in concurrency and proofs. Both of these are
> more difficult in the presence of side-effects, even explicit ones.
>
> > (If we implement the infra-combinator, this is not true. But
> > infra is a bad idea anyway.)
>
> Why so? It seems to me to be a useful basis for concurrency.

Also, infra might be useful as a basis for encapsulation, ie. code +
local stack. The client code can treat that stack as an opaque value.

Speaking of concurrency, I'm interested in this too. One way it might be
added to Joy is as a generalisation of ifte (I'll call it 'fork'):

f [P] [Q] fork

fork expects three values on the stack, a flag and two quotations. The
flag can have one of three values:

Left
Right
Both

If Left, the left quotation P is executed. If Right, the right
quotation Q is executed. If Both, then both P and Q are executed
simultaenously. In this case, execution resumes after the fork only
when P and Q have both completed. Stack results will be presented as if
P and then Q had been executed sequentially on the same input stack.

This is just a start, practical concurrency would also need to provide a
means of communication between threads.

>
> > I wonder whether sharing the metadata actually has any
> > performance advantages? Wouldn't the excessive creation of
> > new objects would cause a worse slowdown than making copies
> > of the metadata?
>
> I don't know. I expect it's possible to build programs which
> demonstrate the advantages of both approaches. AFAIK there aren't any large
> real-world programs to do benchmarking with.

Wouldn't your approach to lists be described as "copy-on-write"? 
Another approach I've heard of for reducing copying is to maintain a
history of changes, ie. a list of deltas. The idea is that whenever a
change is made that must not be visible to other references, you simple
insert a delta (eg. (delete 2) -> [3] -> [2] -> [1] -> []) and then take
those into account when accessing the structure. I think it can be done
quite efficiently for lists, but I can't remember where I read it.


--
Louis.

Juho Snellman — 2000-08-08 19:11:00

On Mon, Aug 07, 2000 at 08:28:44PM +0100, Martin Young wrote:
> On Aug 7, 9:33pm, jsnell@... wrote:
> > (If we implement the infra-combinator, this is not true. But
> > infra is a bad idea anyway.)
>
> Why so? It seems to me to be a useful basis for concurrency.

Combinators that require saving/restoring the whole stack
kill performance for many implementation strategies. This
includes infra, nullary/unary/etc, stack and most of Joy's
conditionals. I believe that they are unneccessary, and
that their removal wouldn't be a loss to application writers.
(That's just guessing, since nothing larger than 100 lines
has been written in Joy.)

The other problem is that infra encourages writing words
that have a "random" stack effect, or that iterate through
all the contents of the stack. Clearly bad style, and
probably rarely useful.

Concurrency is a good point though, and my problems don't
apply to using something infra-like in that case. That
doesn't meant that ifte shouldn't be removed in favour
of branch, though ;-)

--
Juho Snellman
"C:stä on kehitetty Massachusettsin teknillisessä korkeakoulussa kieli
nimeltä BCPL."

Mark van Gulik — 2000-08-09 03:32:27

Louis Madon <madonl@...> wrote:
> Also, infra might be useful as a basis for encapsulation, ie. code +
> local stack. The client code can treat that stack as an opaque value.
>
> Speaking of concurrency, I'm interested in this too. One way it might be
> added to Joy is as a generalisation of ifte (I'll call it 'fork'):
>
> f [P] [Q] fork
>
> fork expects three values on the stack, a flag and two quotations. The
> flag can have one of three values:
>
> Left
> Right
> Both
>
> If Left, the left quotation P is executed. If Right, the right
> quotation Q is executed. If Both, then both P and Q are executed
> simultaenously. In this case, execution resumes after the fork only
> when P and Q have both completed. Stack results will be presented as if
> P and then Q had been executed sequentially on the same input stack.
>
> This is just a start, practical concurrency would also need to provide a
> means of communication between threads.

It seems to me the simplest mechanism of concurrency is futures. Say you
could write:

[100 factorial] future [50 factorial] future +

This mechanism would push two futures on the stack, and arrange for
processes to be created to start computing their results. The futures turn
into their results when the computation completes (indirection objects work
well for this).

An alternative variant of futures would make conversion from a future to its
value more explicit:

[100 factorial] future [50 factorial] future force swap force +

Again, as soon as the future is constructed a process starts computing its
result. The force operation waits for the computation to complete (if it
hasn't already) and then extracts the result.

Consider a function "race" that takes a list of futures and returns the
future that completed first (or maybe its result). A general rendezvous
mechanism can probably be constructed from this.

One can also imagine such operations as finished (to test for completion)
and lazy-future (the work process is only created when forced).



> Another approach I've heard of for reducing copying is to maintain a
> history of changes, ie. a list of deltas. The idea is that whenever a
> change is made that must not be visible to other references, you simple
> insert a delta (eg. (delete 2) -> [3] -> [2] -> [1] -> []) and then take
> those into account when accessing the structure. I think it can be done
> quite efficiently for lists, but I can't remember where I read it.

You may want to look at a really incredible book titled "Purely Functional
Data Structures" by Chris Okasaki (Cambridge, ISBN 0 521 63124 6). He
presents an interesting collection of data structures. At first the data
structures have amortized bounds on their operations. He then refines this
for precise worst-case costs on the operations.

The main "trick" to narrowing worst-case bounds is to use lazy evaluation to
do things like gradually producing the reverse of a list or some other
rebalancing activity. As "pseudo-destructive" changes are made to the main
data structure (e.g., cons), a small increment of work is performed, forcing
one or more suspended lazy computations to be converted to non-lazy objects.
This is vaguely similar to some schemes for incremental garbage collection.
Note that converting the lazy-evaluation nodes into "real" nodes isn't
really a side-effect, other than its influence on performance of later
operations.

It's a wonderful read, and highly recommended.

Martin Young — 2000-08-09 10:39:47

On Aug 8, 10:11pm, jsnell@... wrote:
> Subject: Re: [stack] Another implemenatation of Joy.
>
> On Mon, Aug 07, 2000 at 08:28:44PM +0100, Martin Young wrote:
> > On Aug 7, 9:33pm, jsnell@... wrote:
> > > (If we implement the infra-combinator, this is not true. But
> > > infra is a bad idea anyway.)
> >
> > Why so? It seems to me to be a useful basis for concurrency.
>
> Combinators that require saving/restoring the whole stack
> kill performance for many implementation strategies. This
> includes infra, nullary/unary/etc, stack and most of Joy's
> conditionals. I believe that they are unneccessary, and
> that their removal wouldn't be a loss to application writers.
> (That's just guessing, since nothing larger than 100 lines
> has been written in Joy.)

I agreee exactly 50%. It is possible to implement a system where duplicating
the stack is actually quite quick to execute. Whether or not it's good style
is open to question. Personally I don't like the way Joy conditional always
have a null effect on the stack, but I can see how it's useful.

The Joy interpreter I've written has only one conditional (if) and one
itterative primitive (pcon - parameterised concatenate). These execute the
condition and the program on the same stack. I /think/ it's possible to build
the Joy style primitives from these two and from "stack".

OTOH, any primitive which *replaces* the whole stack should immediately be
taken out and shot. IMHO :-)

> Concurrency is a good point though, and my problems don't
> apply to using something infra-like in that case. That
> doesn't meant that ifte shouldn't be removed in favour
> of branch, though ;-)

How would you have "branch" in Joy? It doesn't have labels. I can see how you
might have a "break" word which breaks out from executing the word "above" the
break word itself.


--
Martin Young working for STMicroelectronics at `(o)_(o)' The fat wise /
1000 Aztec West, Almondsbury, Bristol, BS32 4SQ. ( V ) owl eats only >
+44 1454 462 523 `v' Martin.Young@... `.___,' clean mice. /
_(_)_ -="==="=============='

Juho Snellman — 2000-08-09 11:32:51

On Wed, Aug 09, 2000 at 11:39:47AM +0100, Martin Young wrote:
> I agreee exactly 50%. It is possible to implement a system where duplicating
> the stack is actually quite quick to execute.

Yup, but that system is going to be slower than the one that
doesn't need to worry about making the data structures easy
to save.

In the reference Joy implementation 'nullary' and 'i' take the
same amount of time to execute, while zoe executes 'i' a third
faster than Joy, but 'nullary' four times slower... Not a good
tradeoff considering the amount of words that have a null-effect
on the stack in Joy.

> How would you have "branch" in Joy? It doesn't have labels. I can see how you
> might have a "break" word which breaks out from executing the word "above" the
> break word itself.

By 'branch' I meant the Joy combinator :
branch : B [T] [F] -> ...
If B is true, then executes T else executes F.

(And yes, a 'break' would be extremely useful. It's just damn hard
to fit into the language. I assume that by the 'above' word you mean
the innermost loop/recursive combinator? )

--
Juho Snellman
"C:stä on kehitetty Massachusettsin teknillisessä korkeakoulussa kieli
nimeltä BCPL."

Martin Young — 2000-08-09 11:58:41

On Aug 9, 2:32pm, jsnell@... wrote:
> Yup, but that system is going to be slower than the one that
> doesn't need to worry about making the data structures easy
> to save.

This is true. From my own point of view I'm more interested of playing with
the language than making it go quickly so I'm biased.

> > How would you have "branch" in Joy?
> By 'branch' I meant the Joy combinator :
> branch : B [T] [F] -> ...
> If B is true, then executes T else executes F.

Ah. Right. I called that 'if', and implemented 'ifte' on top of it.

> (And yes, a 'break' would be extremely useful. It's just damn hard
> to fit into the language. I assume that by the 'above' word you mean
> the innermost loop/recursive combinator? )

I /think/ that's what I meant. I'm not sure what semantics it would have or
how it would be implemented. I suspect it would make programs very hard to
reason about though. I'm guessing that continuations would be a useful tool
for implementing a break word.


--
Martin Young working for STMicroelectronics at `(o)_(o)' The fat wise /
1000 Aztec West, Almondsbury, Bristol, BS32 4SQ. ( V ) owl eats only >
+44 1454 462 523 `v' Martin.Young@... `.___,' clean mice. /
_(_)_ -="==="=============='

wtanksley@bigfoot.com — 2000-08-09 16:40:45

From: Martin Young [mailto:martin.young@...]
>How would you have "branch" in Joy? It doesn't have labels.
>I can see how you
>might have a "break" word which breaks out from executing the
>word "above" the break word itself.

It's easy to branch backwards. It's harder to branch forwards, but still
possible.

0BRANCH == compile a JMP instruction with a zero where the offset should
be; return the address of the offset. (Do all this at compile-time.)

0RESOLVE == store the offset into the address on TOS. (Do at
compile-time.)

?0BRANCH == like 0BRANCH, but conditional JMP, usually based on top of
stack.

IF == ?0BRANCH;
ELSE == 0RESOLVE 0BRANCH;
THEN == 0RESOLVE;

>Martin Young working for STMicroelectronics at `(o)_(o)'

-Billy