ASM Rules in Joy [was Re: [stack] conk - concatened k]

Louis Madon — 2000-06-15 00:50:01

wtanksley@... wrote:
>
> From: Louis Madon [mailto:madonl@...]
> >wtanksley@... wrote:
>
> >> ASMs accurately model execution in the presence of in-place
> updates,
> >> so I
> >> look forward to your language.
>
> >> Interestingly, ASM rules fire in parallel while concatenative words
> >> execute in serial. How do you plan to resolve the distinction?
>
> >Not quite true. ASM rules fire when their guard conditions are
> >met. This
> >is just a way to do sequencing without having to introduce a
> >full-fledged language. In Jel, the guards are unneccesary
> >since Joy code does the sequencing.
>
> Ah, true. That's what was confusing me.
>
> >The only thing that is missing is a way to execute
> >a bunch of updates in parallel; One possibility is that a special
> >combinator is provided (like i) except that it takes a quotation
> >containing only updates and executes them "simultaneously". Note
> that,
> >so far, I've been concentrating on the type system and so haven't
> >thought these issues through fully.
>
> An interesting point is that niladic functions (fns with no stack
> effect)
> can be executed in any order. Because of this, you can model
> parallelism by
> using functions which are niladic with respect to the system stack --
> for
> example, functions which act on some other stack.

Yes, this is a good insight. In effect, create a pair (inputs,update) to
create a niladic update (I hesitate to use the word "function" here,
since an operation without outputs can't really be called such). In
Joy, as you illustrated in your example, such a pair can be represented
by a quotation and we can then execute these in any order (I guess in
mathematical terms, we could say that niladic updates are commutative).
This in turn implies that the "parallel" combinator could be defined as:

parallel == [i] step

This almost seems like a complete solution, but I can see at least one
outstanding issue: inputs that have dependencies on prior updates within
the same rule (see more below).

>
> In Forth, this would be cone using some kind of BEGIN word, which set
> up a
> state; a task-loader word, which placed the tasks ready to go; and an
> END
> word, which executed the tasks in parallel and unloaded the state when
> they
> were done. Thus, only the ordering of the BEGIN and END would be
> relevant
> to the Forthlike language.
>
> BEGIN Task: 1 oneUpdate Task: 0 anotherUpdate END
>
> In Joy, I believe we'd want to use quotations, as you've said: since
> the
> language supplies some syntax, we might as well use it.
>
> [ [1 oneUpdate] [0 anotherUpdate] ] Parallel

An important requirement that must be satisified is that the effect of
the updates must not become visible until the entire rule has executed.
In your example above, this requirement holds, since the inputs to each
update are just constants. However, consider the case where the input
involves a dependancy on a dynamic function:


[ [ 1 one:= ] [ one another:= ] ] Parallel


In other words, the input value for the 'another:=' update is specified
as the value of the 'one' dynamic function. But that DF was just
updated, so the second update will "see" the wrong value. I think fixing
this requires:

- niladic update construction rule - All occurences of DFs in the inputs
to any update must first be dereferenced (that is replaced by their
value).

Ideally, the compiler could check that this rule isn't violated.

So the example above would become:


one [ another:= ] cons [ [ 1 one:= ] ] cons Parallel


This would now work correctly.

Or perhaps, (yes this is better - I don't know why I didn't think of it
in the first place), implement parallel as:

parallel == [opt] map [i] step

'opt' is a partial evaluation combinator (same semantics as 'id' except
that the result is in a maximally evaluated form). This eliminates any
special restrictions on the programmer and yet guarantees that there are
no DF interdependancies between updates.

It will be interesting to see just what its like reasoning with Joy
programs containing these extended features, I'm quite optimistic that
the ASM approach will work well.


--
Louis.

wtanksley@bigfoot.com — 2000-06-20 21:43:09

From: Louis Madon [mailto:madonl@...]
>wtanksley@... wrote:
>> From: Louis Madon [mailto:madonl@...]
>> >wtanksley@... wrote:

>> An interesting point is that niladic functions (fns with no stack
>> effect)
>> can be executed in any order. Because of this, you can model
>> parallelism by
>> using functions which are niladic with respect to the system stack --
>> for example, functions which act on some other stack.

>Yes, this is a good insight. In effect, create a pair
>(inputs,update) to
>create a niladic update (I hesitate to use the word "function" here,
>since an operation without outputs can't really be called such).

I don't quite remember, but it seems that the ASM literature does in fact
refer to niladic functions. Thus you would be quite justified in calling
them such.

>In
>Joy, as you illustrated in your example, such a pair can be represented
>by a quotation and we can then execute these in any order (I guess in
>mathematical terms, we could say that niladic updates are commutative).
>This in turn implies that the "parallel" combinator could be
>defined as:

>parallel == [i] step

>This almost seems like a complete solution, but I can see at least one
>outstanding issue: inputs that have dependencies on prior
>updates within the same rule (see more below).

Exactly. In other words, your definition is incorrect -- you really do have
to do it right.

I would suggest for efficiency's sake that you start (at compile time) by
doing a topological sort, and generating code for any independant updates
normally. Then make copies of dependant updates and generate code to update
and copy.

But you don't have to be this fancy; it's sufficient to copy, update, and
copy back, without first checking for independant updates.

>An important requirement that must be satisified is that the effect of
>the updates must not become visible until the entire rule has
>executed.
>In your example above, this requirement holds, since the inputs to each
>update are just constants. However, consider the case where the input
>involves a dependancy on a dynamic function:

You're assuming that I hadn't already thought of that. :-)

>Or perhaps, (yes this is better - I don't know why I didn't think of it
>in the first place), implement parallel as:

>parallel == [opt] map [i] step

>'opt' is a partial evaluation combinator (same semantics as 'id' except
>that the result is in a maximally evaluated form). This eliminates any
>special restrictions on the programmer and yet guarantees that
>there are no DF interdependancies between updates.

Yes, that would be interesting as well. It makes sense to not be able to
specify unordered updates.

>It will be interesting to see just what its like reasoning with Joy
>programs containing these extended features, I'm quite optimistic that
>the ASM approach will work well.

Same here.

>Louis.

-Billy

Louis Madon — 2000-06-21 00:31:33

wtanksley@... wrote:
>
> From: Louis Madon [mailto:madonl@...]
> >Yes, this is a good insight. In effect, create a pair
> >(inputs,update) to
> >create a niladic update (I hesitate to use the word "function" here,
> >since an operation without outputs can't really be called such).
>
> I don't quite remember, but it seems that the ASM literature does in
> fact
> refer to niladic functions. Thus you would be quite justified in
> calling
> them such.

I'm pretty sure that what they mean by niladic function is different
though; its a function without inputs but at least one output (eg. a
variable access is modelled as a niladic function). For example, in
Haskell I could write:

x = 2

which defines a niladic function 'x' that when invoked returns '2'. I
have no problem with considering x a function. However, if a function
has neither inputs nor outputs, thats when I think its strange to call
it such. This is just terminology ofcourse, but I'd prefer to use the
terms 'update' and 'niladic update' to emphasise the fact that they are
pure state change commands.

>
> >In
> >Joy, as you illustrated in your example, such a pair can be
> represented
> >by a quotation and we can then execute these in any order (I guess in
> >mathematical terms, we could say that niladic updates are
> commutative).
> >This in turn implies that the "parallel" combinator could be
> >defined as:
>
> >parallel == [i] step
>
> >This almost seems like a complete solution, but I can see at least
> one
> >outstanding issue: inputs that have dependencies on prior
> >updates within the same rule (see more below).
>
> Exactly. In other words, your definition is incorrect -- you really
> do have
> to do it right.

Yes. My last message was more brainstorming than well thought out
principles.

>
> I would suggest for efficiency's sake that you start (at compile time)
> by
> doing a topological sort, and generating code for any independant
> updates
> normally. Then make copies of dependant updates and generate code to
> update
> and copy.

This could work but the topo-sort would need to be a part of the
definition of parallel since the inputs may not be knowable until
runtime (as a result of Joys' metaprogramming support). Partial
evaluation can still be used to move the overhead to the earliest
possible stage (which may well be compile-time in most cases).

>
> But you don't have to be this fancy; it's sufficient to copy, update,
> and
> copy back, without first checking for independant updates.

I'm not sure what you mean by "copy back". The way I'm interpreting this
solution is:

parallel == [copy] map [i] step

where copy is defined as

copy == unswonsr [i] dip swonsr

(here unswonsr is the same as unswons except it returns the
(initial,last) instead of (rest,first) and likewise swonsr adds an
element to the right of a list instead of the left).

Is that kind of what you had in mind? (I think its a reasonable
approach).

>
> >An important requirement that must be satisified is that the effect
> of
> >the updates must not become visible until the entire rule has
> >executed.
> >In your example above, this requirement holds, since the inputs to
> each
> >update are just constants. However, consider the case where the
> input
> >involves a dependancy on a dynamic function:
>
> You're assuming that I hadn't already thought of that. :-)

Sorry, I didn't mean to imply that (although it obviously reads that
way).

>
> >Or perhaps, (yes this is better - I don't know why I didn't think of
> it
> >in the first place), implement parallel as:
>
> >parallel == [opt] map [i] step
>
> >'opt' is a partial evaluation combinator (same semantics as 'id'
> except
> >that the result is in a maximally evaluated form). This eliminates
> any
> >special restrictions on the programmer and yet guarantees that
> >there are no DF interdependancies between updates.
>
> Yes, that would be interesting as well.

Actually, there's a bug in it. opt will try to invoke the update if all
its inputs are available. The copy solution above seems ok though. Like
I said, I was just brainstorming ...

> It makes sense to not be able to
> specify unordered updates.

I agree. There is still the issue of niladic updates being well formed;
ie. they must be niladic and the update must be the very last item in
the quotation. The type system will need to take care of checking this.

[BTW, what stage are you at for the type system in your implementation?
Is it still wide open or have you already settled on an overall
approach?]


--
Louis.

wtanksley@bigfoot.com — 2000-06-21 16:25:22

From: Louis Madon [mailto:madonl@...]
>wtanksley@... wrote:
>> From: Louis Madon [mailto:madonl@...]

>I'm pretty sure that what they mean by niladic function is different
>though; its a function without inputs but at least one output (eg. a
>variable access is modelled as a niladic function). For example, in
>Haskell I could write:

Whoops! You're right. 'Update' is a better term.

Note that updating is a very nice term to have -- input and output are also
'updates', although not niladic.

>> I would suggest for efficiency's sake that you start (at
>> compile time)
>> by
>> doing a topological sort, and generating code for any independant
>> updates
>> normally. Then make copies of dependant updates and generate code to
>> update
>> and copy.

>This could work but the topo-sort would need to be a part of the
>definition of parallel since the inputs may not be knowable until
>runtime (as a result of Joys' metaprogramming support). Partial
>evaluation can still be used to move the overhead to the earliest
>possible stage (which may well be compile-time in most cases).

Honestly, I'd rather require them to be known at compile-time. If you want
to play around with runtime, it's very easy to do compilation tasks at
runtime. This is one very confusing thing about Forth which I'm now
completely used to. :-)

>> But you don't have to be this fancy; it's sufficient to copy, update,
>> and copy back, without first checking for independant updates.

>I'm not sure what you mean by "copy back". The way I'm
>interpreting this solution is:

I'm sorry, I don't know of any other way to express it, aside from the
statement "It's the only way to solve the problem you're looking at."

>Is that kind of what you had in mind? (I think its a reasonable
>approach).

I honestly couldn't read your code. Must be all the sugar today. Sorry.

However, to the best of my knowledge there's only one possible way to handle
concurrent updates: make copies and work from the copies.

>Actually, there's a bug in it. opt will try to invoke the
>update if all
>its inputs are available. The copy solution above seems ok
>though. Like I said, I was just brainstorming ...

Hmm. Perhaps opt needs to recognise updates as such -- when you
destructively modify something you can't do that at compile-time (unless
you're instructed to).

>> It makes sense to not be able to
>> specify unordered updates.

>I agree. There is still the issue of niladic updates being well formed;
>ie. they must be niladic and the update must be the very last item in
>the quotation. The type system will need to take care of checking this.

It's not hard to make it niladic: simply have it operate on its own stack.
I don't see the need for making the update be the last item, either.

>[BTW, what stage are you at for the type system in your implementation?
>Is it still wide open or have you already settled on an overall
>approach?]

It's open, but I'm tending to lean towards Dr. Becher's system: my first
system is intended to be very Forthlike, in the sense of simplicity and
transparency. Your system sounds very powerful, but it does things for the
programmer without always asking permission; it also sounds like GC is going
to be essential to it, and my system won't have that (except _perhaps_ as a
user-callable routine).

>Louis.

-Billy

Louis Madon — 2000-06-22 06:42:04

> From: Louis Madon [mailto:madonl@...]
> >wtanksley@... wrote:
> >> I would suggest for efficiency's sake that you start (at
> >> compile time)
> >> by
> >> doing a topological sort, and generating code for any independant
> >> updates
> >> normally. Then make copies of dependant updates and generate code to
> >> update
> >> and copy.
>
> >This could work but the topo-sort would need to be a part of the
> >definition of parallel since the inputs may not be knowable until
> >runtime (as a result of Joys' metaprogramming support). Partial
> >evaluation can still be used to move the overhead to the earliest
> >possible stage (which may well be compile-time in most cases).
>
> Honestly, I'd rather require them to be known at compile-time.

Ok, its always easier to compile a language by only supporting some
(less dynamic) subset of it. I'm interested in preserving the full
dynamism of Joy though.


> If you want
> to play around with runtime, it's very easy to do compilation tasks at
> runtime.

"doing compilation tasks at runtime" is also called "interpreting" :-)
And you're right, it is easy, but its also inefficient.

This touches on what I see as a key issue with the efficiency of Joy
programs: namely how do you handle quotations?
I see them more as a way of expressing semantics to the compiler rather
than being fully explicit at runtime (similar to the way that the stack
can be seen as a logical entity that does not have to be fully realised
in the executable code).

Its a key issue because, in Joy, its not only possible but quite natural
to have programs constructing programs constructing programs ... to an
arbitrary number of levels (eg. when using program construction in lieu
of closures). If you do the construction at runtime, the total overhead
will be the _product_ (not the sum) of the overheads for each level.
For example, say a constructed program effectively runs at half the
speed of a pre-generated equivalent. If there are 5 levels of
construction, then the overall program will run 2^5 or 32 times slower
(this is a very simplistic and rather unrealistic calculation ofcourse,
but I'm just trying to emphasise the multiplicative nature of the total
overhead).

This issue is not related to Joy being a concatenative language but
stems from its pervasive programs-as-data model (even lisp doesn't go as
far; macros must be fully expandable at compile-time and 'eval' is not
defined as a language primitive).

I want to eliminate the in-efficiency incurred in a straight-forward
interpretative implementation without restricting the language. To that
end I see partial evaluation playing a very critical role. Conversely,
it seems that having an efficient implementation of Joy without PE is
only possible if you restrict or at least encourage less use of its meta
capabilities.


> However, to the best of my knowledge there's only one possible way to handle
> concurrent updates: make copies and work from the copies.

Right. Its just a matter of deciding on the most efficient way to do
that. I think all the solutions presented so far are just expressing
this basic strategy in different forms.

>
> >[BTW, what stage are you at for the type system in your implementation?
> >Is it still wide open or have you already settled on an overall
> >approach?]
>
> It's open, but I'm tending to lean towards Dr. Becher's system: my first
> system is intended to be very Forthlike, in the sense of simplicity and
> transparency. Your system sounds very powerful, but it does things for the
> programmer without always asking permission; it also sounds like GC is going
> to be essential to it, and my system won't have that (except _perhaps_ as a
> user-callable routine).

Sounds good. Our implementations will be exploring two quite different
points in the design space of Joy-like systems. In that sense they're
quite complementary.

I'd like to expand a bit on the issue of "not asking permission" though.
My system will have well defined semantics and so it wont do anything
mysteriously (if it does, thats a bug). These semantics are more
abstract than what a low level system offers thus giving the compiler
more implementational freedom. It follows that the programmer has less
control over very low level details (though in principle you could still
inspect the low level stuff with a browse tool of some sort). It really
just boils down to delegation of responsibility: giving up some control
in order that you can concentrate on bigger things (and let the compiler
do a good job on the little things for you).
Indeed, my number-one goal (I guess its a holy grail ...) is that you
can delegate to the compiler without sacrificing efficiency.



--
Louis.