stackless fixed-arity concatenative languages
John Nowak — 2008-05-20 06:15:43
In some languages, such as Haskell, all functions consume a fixed
number of arguments and produce a fixed number of arguments.* For
example, the 'foldr' function in Haskell always consumes three values
and produces one. In fact, all functions in Haskell produce one value.
Concatenative languages like Joy expand on this; functions are not
restricted to returning only one value.** For example, 'dup' consumes
one value and produces two. We'll say 'dup' has a 1/2 arity; 'pop'
would have a 1/0 arity, '+' would have a 2/1 arity, 'swap' would have
a 2/2 arity, and so on.
However, some functions in Joy go beyond this. For example, 'i'
essentially takes the entire state of the program as an argument and
yields a new state. Because 'i' simply invokes the quotation on the
top of the stack, there's no way to tell how many values 'i' will
consume or produce without knowing what's on top of the stack. Other
examples of such functions include 'dip' and 'while'.
One downside of such functions is that it becomes impossible to
efficiently compile a language like Joy to C; you have no choice but
to make use of a stack (or some equivalent mechanism) in the
implementation. In contrast, a concatenative language with only fixed-
arity functions can be efficiently compiled to C without any need for
a stack. An example of such a language is Forth without n-ary words
like 'roll' and with the modest restriction that both branches of a
conditional must have the same stack effect. [1][2]
It seems the way to do this for a language like Joy would be to find
alternatives for functions like 'i' and add a type system to ensure
conditional branches are balanced. Ignoring the type system for now,
it seems one possibility would be to restrict 'i' to simply dequoting
the quotation on the top of the stack without allowing it access to
the stack itself. If one wanted the quotation to have access to
arguments on the stack, they could be partially applied to the
quotation first. Several versions of 'i' would need to be available to
indicate how many values will be returned from its use. For example:
# 'p' is partial application
# 'i1' indicates that the quotation produces 1 value
# the result is '7'
2 [5 +] p i1
Rather than dealing with partial application, a convenient and
efficient implementation would simply offer a suite of 'i' combinators
from 0i0 up until 4i4 or so, where the numbers indicate the number of
values to be consumed and produced.
While this all likely seems to be overly burdensome at first, most
functions that make use of quotations would not need so many versions.
For example, 'fold' would only be valid for quotations of arity 2/1,
and hence only one version would be needed.
Does anyone think such a language would be feasible? What might the
benefits be? What useful features would be lost?
- John
[1]
http://www.complang.tuwien.ac.at/papers/ertl%26maierhofer95.ps.gz
[2]
http://www.complang.tuwien.ac.at/papers/ertl96diss.ps.gz
* Yes, all functions in Haskell /really/ take one argument and are
curried, and yes, there are tricky ways of implementing variadic
functions, but we'll ignore that for now.
** Yes, all functions are /really/ unary functions of stacks to
stacks, but we'll ignore that for now.
Daniel Ehrenberg — 2008-05-20 06:39:35
This is a reasonable idea for some cases, but there's some generality
that's lost. Look at how reduce is defined in Factor:
: reduce ( seq identity quot -- result )
swapd each ;
When each runs, it has the stack underneath as an implicit
accumulator. That means that we can just use the code of each when
writing reduce, but only because the quotation can be not just 1/0 but
also 2/1. The genericness here allows a reduction in code duplication
in certain cases which are completely impossible in Haskell (unless
you start by implementing reduce, then base a each off of it
(pretending Haskell has side effects), and this is more complicated),
though arguably unnecessary.
Anyway, can't a type signature for each of something like ( A ( A x --
A ) [x] -- A ) handle this adequately? Is it difficult to infer or
deal with such a type?
Dan
On Tue, May 20, 2008 at 1:15 AM, John Nowak <john@...> wrote:
> In some languages, such as Haskell, all functions consume a fixed
> number of arguments and produce a fixed number of arguments.* For
> example, the 'foldr' function in Haskell always consumes three values
> and produces one. In fact, all functions in Haskell produce one value.
>
> Concatenative languages like Joy expand on this; functions are not
> restricted to returning only one value.** For example, 'dup' consumes
> one value and produces two. We'll say 'dup' has a 1/2 arity; 'pop'
> would have a 1/0 arity, '+' would have a 2/1 arity, 'swap' would have
> a 2/2 arity, and so on.
>
> However, some functions in Joy go beyond this. For example, 'i'
> essentially takes the entire state of the program as an argument and
> yields a new state. Because 'i' simply invokes the quotation on the
> top of the stack, there's no way to tell how many values 'i' will
> consume or produce without knowing what's on top of the stack. Other
> examples of such functions include 'dip' and 'while'.
>
> One downside of such functions is that it becomes impossible to
> efficiently compile a language like Joy to C; you have no choice but
> to make use of a stack (or some equivalent mechanism) in the
> implementation. In contrast, a concatenative language with only fixed-
> arity functions can be efficiently compiled to C without any need for
> a stack. An example of such a language is Forth without n-ary words
> like 'roll' and with the modest restriction that both branches of a
> conditional must have the same stack effect. [1][2]
>
> It seems the way to do this for a language like Joy would be to find
> alternatives for functions like 'i' and add a type system to ensure
> conditional branches are balanced. Ignoring the type system for now,
> it seems one possibility would be to restrict 'i' to simply dequoting
> the quotation on the top of the stack without allowing it access to
> the stack itself. If one wanted the quotation to have access to
> arguments on the stack, they could be partially applied to the
> quotation first. Several versions of 'i' would need to be available to
> indicate how many values will be returned from its use. For example:
>
> # 'p' is partial application
> # 'i1' indicates that the quotation produces 1 value
> # the result is '7'
> 2 [5 +] p i1
>
> Rather than dealing with partial application, a convenient and
> efficient implementation would simply offer a suite of 'i' combinators
> from 0i0 up until 4i4 or so, where the numbers indicate the number of
> values to be consumed and produced.
>
> While this all likely seems to be overly burdensome at first, most
> functions that make use of quotations would not need so many versions.
> For example, 'fold' would only be valid for quotations of arity 2/1,
> and hence only one version would be needed.
>
> Does anyone think such a language would be feasible? What might the
> benefits be? What useful features would be lost?
>
> - John
>
> [1] http://www.complang.tuwien.ac.at/papers/ertl%26maierhofer95.ps.gz
> [2] http://www.complang.tuwien.ac.at/papers/ertl96diss.ps.gz
>
> * Yes, all functions in Haskell /really/ take one argument and are
> curried, and yes, there are tricky ways of implementing variadic
> functions, but we'll ignore that for now.
>
> ** Yes, all functions are /really/ unary functions of stacks to
> stacks, but we'll ignore that for now.
>
>
John Nowak — 2008-05-20 07:21:28
On May 20, 2008, at 2:39 AM, Daniel Ehrenberg wrote:
> This is a reasonable idea for some cases, but there's some generality
> that's lost.
Certainly.
> Look at how reduce is defined in Factor:
>
> : reduce ( seq identity quot -- result )
> swapd each ;
Aye. It makes sense as 'each' is a generalized version of 'fold' that
allows you to access the stack and not just deal with two values. It's
not clear to me though if something like 'each' is too powerful for
its own good. It probably isn't.
> When each runs, it has the stack underneath as an implicit
> accumulator. That means that we can just use the code of each when
> writing reduce, but only because the quotation can be not just 1/0 but
> also 2/1.
I suppose I just restated this above for no good reason.
> Anyway, can't a type signature for each of something like ( A ( A x --
> A ) [x] -- A ) handle this adequately? Is it difficult to infer or
> deal with such a type?
The type for 'each' in Cat is exactly that and it works fine in
practice. My main interest here is in allowing a direct and efficient
translation to C. My guess is that the sacrifices necessary will be
too severe, although it may possibly allow easier reasoning about
program behavior that might make the loss of generality worth it.
Given the lack of problems n-ary functions have presented though in
terms of types and formalizing a term rewriting semantics, I'm
skeptical of that being the case.
- John
John Nowak — 2008-05-20 09:48:43
Here is a visualization of the function 'swap dup dip swap' where we
want the quotation supplied to 'dip' to consume two arguments and
produce one (as '+' or 'cons' does); note that 'R' and 'S' are row
variables that represent some "rest of the stack":
http://johnnowak.com/heap/stackless-dataflow.png
As you can (maybe?) see, the restrictions necessary to allow a
stackless implementation of a concatenative language, namely that the
arities of all functions be locally knowable, simplifies things quite
a bit. At the very least, it gives us a nice mapping to a visual
dataflow language where each "transformer" has a fixed number of
"ports" and the column position of each "port" directly corresponds to
an absolute stack position (for lack of a better term). I find this
easier to follow than the alternative version and would certainly find
it easier to explain to someone else.
I suppose all there is to do at this point is implement an interpreter
for such a language and see how painful the "fixed arity" restriction
is relative to its benefits. I suspect a simple macro system would go
a long way towards making the restriction tolerable by allowing
"generalized" versions of combinators to be written. Alternatively,
syntactic support for specifying arities could be built into the
language. Ultimately, that would be really no different than the
information you give the compiler when you write something like (foo a
b c) in Scheme.
- John
Chris Double — 2008-05-20 11:52:56
On Tue, May 20, 2008 at 9:48 PM, John Nowak <
john@...> wrote:
> I suppose all there is to do at this point is implement an interpreter
> for such a language and see how painful the "fixed arity" restriction
> is relative to its benefits.
Based on the things I've written in Factor a 'fixed arity' restriction
would have little to no effect on what I've written. It did in the
past when 'curry' was discouraged and quotations passed to 'map' and
the like would often access items deeper in the stack. For example:
5 { 1 2 3 } [ over + ] map
=> { 6 7 8 }
A map-with operation was eventually provided to make this easier:
5 { 1 2 3 } [ + ] map-with
Now with the composition of quotations being more acceptable in factor
(previously composing quotations caused performance problems) words
like curry can be used:
5 { 1 2 3 } swap [ + ] curry map
(There's actually a word 'with' that makes this more readable).
Would your restriction still allow curry, compose, etc?
Chris.
--
http://www.bluishcoder.co.nz
William Tanksley, Jr — 2008-05-20 13:42:22
John Nowak <
john@...> wrote:
> One downside of such functions is that it becomes impossible to
> efficiently compile a language like Joy to C; you have no choice but
> to make use of a stack (or some equivalent mechanism) in the
> implementation.
"Impossible" is an overly strong word (especially in conjunction with
a shades-of-grey word like "efficiently"). It's quite possible to use
the stack in the same way that C itself does; but it's very hard to
work together with your C compiler's optimizer, and I'd say that is
the major difficulty, not anything having to do with stacks.
Even in the presence of statically indeterminate stack effects, MOST
of the program can be efficiently compiled.
BUT, I do agree that it's simple to statically determine stack
effects, so your point is otherwise well-taken. I think that would be
a good variant of Joy.
> It seems the way to do this for a language like Joy would be to find
> alternatives for functions like 'i' and add a type system to ensure
> conditional branches are balanced. Ignoring the type system for now,
> it seems one possibility would be to restrict 'i' to simply dequoting
> the quotation on the top of the stack without allowing it access to
> the stack itself. If one wanted the quotation to have access to
> arguments on the stack, they could be partially applied to the
> quotation first. Several versions of 'i' would need to be available to
> indicate how many values will be returned from its use. For example:
Indeed, Joy does this in many places, with all its functions that
contain a numeral in their name.
> # 'p' is partial application
> # 'i1' indicates that the quotation produces 1 value
> # the result is '7'
> 2 [5 +] p i1
>
> Rather than dealing with partial application, a convenient and
> efficient implementation would simply offer a suite of 'i' combinators
> from 0i0 up until 4i4 or so, where the numbers indicate the number of
> values to be consumed and produced.
At this point, it would seem convenient to add some syntax for
staticly-known (and required) parameters. Perhaps (i 4 4) would be an
example of such syntax. strongForth does something like this with its
EXECUTE function (equivalent to 'i'), which is always followed by a
parenthetical type declaration.
> Does anyone think such a language would be feasible? What might the
> benefits be? What useful features would be lost?
Well, you'd also need to remove Joy's stack-preservation semantics. As
long as that's present, you simply have to implement a stack, no way
around it.
But yes, I see no reason why it'd be hard at all. You'd lose the
ability to create code with dynamic stack effects, but such code is
generally frowned upon anyhow -- it's very hard to read.
> - John
-Wm
John Nowak — 2008-05-20 22:13:23
On May 20, 2008, at 7:52 AM, Chris Double wrote:
> Based on the things I've written in Factor a 'fixed arity' restriction
> would have little to no effect on what I've written.
Very good to hear! I was half-expecting it to be shot down for being
overly restrictive.
> It did in the past when 'curry' was discouraged and quotations passed
> to 'map' and the like would often access items deeper in the stack...
>
> 5 { 1 2 3 } [ over + ] map
> 5 { 1 2 3 } [ + ] map-with
The version using map-with is certainly clearer. Perhaps there really
is a readability benefit in gently forcing the programmer to use
combinators with fixed stack effects.
> Would your restriction still allow curry, compose, etc?
Yes, and you wouldn't need different versions of 'compose' for each
possible arity combination of the functions being composed. In other
words, '[0] [+] compose' and '[swap] [over] compose' would both be
fine, even though the arities of the input quotations and the arities
of the resulting composed quotations do not match at all.
- John
John Nowak — 2008-05-20 22:37:28
On May 20, 2008, at 9:42 AM, William Tanksley, Jr wrote:
> John Nowak <john@...> wrote:
>
>> One downside of such functions is that it becomes impossible to
>> efficiently compile a language like Joy to C; you have no choice but
>> to make use of a stack (or some equivalent mechanism) in the
>> implementation.
>
> "Impossible" is an overly strong word (especially in conjunction with
> a shades-of-grey word like "efficiently"). It's quite possible to use
> the stack in the same way that C itself does; but it's very hard to
> work together with your C compiler's optimizer, and I'd say that is
> the major difficulty, not anything having to do with stacks.
That makes sense. Agreed.
>> Rather than dealing with partial application, a convenient and
>> efficient implementation would simply offer a suite of 'i'
>> combinators
>> from 0i0 up until 4i4 or so, where the numbers indicate the number of
>> values to be consumed and produced.
>
> At this point, it would seem convenient to add some syntax for
> staticly-known (and required) parameters. Perhaps (i 4 4) would be an
> example of such syntax.
That's certainly possible. I already have that syntax for special
forms anyway. It might be nice to not require the numbers to be given
when the effect is inferable though. For example, I often do things
like '[+] dip', and it would be rather annoying to have to do '[+]
(dip 2 1)' instead.
Once you start inferring that sort of thing though, you have a
situation (again) where you can't split definitions cleanly because
there might not be enough information to infer stack effects. Perhaps
instead, you could have a 'dip-with' form that takes a quotation
directly. You could then do something like '(dip-with +)'. This way,
it is at least clear that you can't split between '+' and 'dip-with'.
In any case, it should be possible to figure out some nice way of
doing all this.
>> Does anyone think such a language would be feasible? What might the
>> benefits be? What useful features would be lost?
>
> Well, you'd also need to remove Joy's stack-preservation semantics. As
> long as that's present, you simply have to implement a stack, no way
> around it.
Aye. I've already decided against that as Joy's semantics are just too
hard to compile efficiently. The one thing I really wanted Joy's stack-
preservation for however was assertions and contracts. It's important
to be able to remove assertions from a program without changing the
program's meaning, and preserving the stack takes care of that.
However, a fixed arity restriction does that as well, as it is known
at compile time how many elements of the stack need to be saved and
restored for the call to assert. Since we're not longer dealing with a
stack though, you just call assertion with the arguments it needs and
ignore the result. Very easy.
> But yes, I see no reason why it'd be hard at all. You'd lose the
> ability to create code with dynamic stack effects, but such code is
> generally frowned upon anyhow -- it's very hard to read.
Very good. Thank you William and Chris for giving me the go-ahead.
- John
William Tanksley, Jr — 2008-05-20 23:17:59
John Nowak <
john@...> wrote:
> William Tanksley, Jr wrote:
>> At this point, it would seem convenient to add some syntax for
>> staticly-known (and required) parameters. Perhaps (i 4 4) would be an
>> example of such syntax.
> That's certainly possible. I already have that syntax for special
> forms anyway. It might be nice to not require the numbers to be given
> when the effect is inferable though. For example, I often do things
> like '[+] dip', and it would be rather annoying to have to do '[+]
> (dip 2 1)' instead.
Agreed; this should tie in with type inference and annotation. This
stack effect notation is really an abbreviated form of type notation
and static polymorphism / function overloading, so perhaps it should
be clearly annotated as such (rather than simply looking like a macro,
as I suggested).
> Once you start inferring that sort of thing though, you have a
> situation (again) where you can't split definitions cleanly because
> there might not be enough information to infer stack effects. Perhaps
> instead, you could have a 'dip-with' form that takes a quotation
> directly. You could then do something like '(dip-with +)'. This way,
> it is at least clear that you can't split between '+' and 'dip-with'.
For that to work consistently, the type system would have to produce
an error whenever annotation was missing, even if the inference could
theoretically be made.
> In any case, it should be possible to figure out some nice way of
> doing all this.
Perhaps :-).
>>> Does anyone think such a language would be feasible? What might the
>>> benefits be? What useful features would be lost?
>> Well, you'd also need to remove Joy's stack-preservation semantics. As
>> long as that's present, you simply have to implement a stack, no way
>> around it.
> It's important
> to be able to remove assertions from a program without changing the
> program's meaning, and preserving the stack takes care of that.
> However, a fixed arity restriction does that as well, as it is known
> at compile time how many elements of the stack need to be saved and
> restored for the call to assert. Since we're not longer dealing with a
> stack though, you just call assertion with the arguments it needs and
> ignore the result. Very easy.
Yes, it seems like fixing the arity is the correct action (really, you
don't ignore the result; you make sure the result is always 'true').
> - John
-Wm
William Tanksley, Jr — 2008-05-20 23:20:14
John Nowak <
john@...> wrote:
> The version using map-with is certainly clearer. Perhaps there really
> is a readability benefit in gently forcing the programmer to use
> combinators with fixed stack effects.
The Forth community's experience strongly suggests that there is.
Words with externally visible variable stack effects are frowned upon.
> - John
-Wm
John Nowak — 2008-05-21 02:00:23
On May 20, 2008, at 7:17 PM, William Tanksley, Jr wrote:
>>> At this point, it would seem convenient to add some syntax for
>>> staticly-known (and required) parameters. Perhaps (i 4 4) would be
>>> an
>>> example of such syntax.
>>
>> it would be rather annoying to have to do '[+] (dip 2 1)' instead.
>
> Agreed; this should tie in with type inference and annotation. This
> stack effect notation is really an abbreviated form of type notation
> and static polymorphism / function overloading, so perhaps it should
> be clearly annotated as such (rather than simply looking like a macro,
> as I suggested).
It seems to me that the way to handle this is to simply require a
normal type annotation that provides enough information to figure
these things out. There is a simple rule for determining if all
arities for all function instances within a function are known by
looking at the row variables, so this wouldn't introduce too much
complication. The rule is that the row variable for the outermost
function must be the same in the consumption and production. (There
are ways to break this with stack-saving/restoring primitives, so
they'd need to be special forms if they were to exist at all.) In
other words, 'A (B -> C) d -> A d (B -> C)' is fine, but 'A (A -> B) d
-> B d' isn't.
Specifying arities directly seems like it would be too much of a pain.
The reason is that I use normal functions to deconstruct sum types
where each quotation provided to the function indicates what to do if
the sum type was constructed via a particular constructor. It would be
way too inconvenient to have to specify arities for all of these.
They'd also no longer be "normal functions" if they somehow were
allowed to get around the arity rule. This applies to something like
'dip' as well. If you require all user-written functions to have a
fixed arity, something harmless like 'foo = dip' would not be valid.
It seems what is required is two classes of functions, ones with fixed
arities and ones without. For functions with known arities, you could
do something like '(define foo [+] dip)'. For functions without known
arities, you could do something like '(define-nary bar [dip] dip)'.
Functions that are defined as n-ary functions would act like macros;
code would not be generated for them until they are instantiated in
such a way that the arity is known for that particular instance. This
seems like a reasonable requirement, makes rules for separate
compilation simple, and encourages fixed arity functions while
allowing n-ary functions where necessary.
>> You could then do something like '(dip-with +)'. This way,
>> it is at least clear that you can't split between '+' and 'dip-with'.
>
> For that to work consistently, the type system would have to produce
> an error whenever annotation was missing, even if the inference could
> theoretically be made.
Yes, this is the general problem with ensuring that you can break
functions apart: You either deal with unnecessary restrictions or
resort to some form of whole program analysis. The type system I'm
working on now is essentially a clever way of doing whole program
analysis by delaying certain things when doing local inference, but it
doesn't interact well with type annotations which will be necessary
for restricting n-ary functions to fixed arities so that code can be
generated for them when doing separate compilation.
I'm possibly just talking to myself at this point, but it seems like I
should just give up on the idea of preserving cut-and-paste factoring
in all cases, use a simpler yet highly expressive type system that
occasionally requires annotations like Leijen's HMF or HML, drop
macros (since we have no binding forms to deal with, we have a
lightweight syntax for quotations, we don't build abstractions around
things like SET!, postfix syntax is already quite good for DSLs, etc),
and just get on with writing an implementation already.
- John
P.S. Apologies if this message ends up on the list twice. My first
attempt was rejected.
John Nowak — 2008-05-21 07:02:29
On May 20, 2008, at 10:00 PM, John Nowak wrote:
> There is a simple rule for determining if all
> arities for all function instances within a function are known by
> looking at the row variables, so this wouldn't introduce too much
> complication. The rule is that the row variable for the outermost
> function must be the same in the consumption and production.
After a brief review, this is clearly bogus. For example, we might
have a 'loop' combinator that simply executes a procedure repeatedly:
loop :: A (A -> A) -> A
Here, the row variables match, but we have no idea of how many
elements of the stack 'loop' will actually use. There are, however,
functions with the same signature as 'loop' that would work just fine,
such as those that simply require a function that would not alter the
type of the stack but do not actually call it. Therefore, the type of
a function alone (at least with a Cat-like type system) can give no
guarantee either way regarding if the fixed arity requirement has been
satisfied. Some other solution will be necessary.
- John
John Nowak — 2008-05-21 08:08:15
After doing some thinking, I've come up with an example of where an n-
ary 'dip' combinator makes a complete mess of things. The example is
Factor's 'bi@' combinator. The 'bi@' combinator takes some x, some y,
and a quotation which it applies to both. As an example, '4 5 [ dup
* ] bi@' yields '16 25'.
Here's one way we might think to implement 'bi@' in Factor:
: bi@ ( x y quot -- x' y' ) dup dip dip ;
The thinking here is that we duplicate the quotation, apply it below
the original quotation to 'y', then apply the original quotation to
'x'. Just to be clear, I'll trace the execution of an expression as if
Factor used term rewriting (which I find is the easiest way to show
this sort of thing):
4 5 [ dup * ] bi@
4 5 [ dup * ] dup dip dip
4 5 [ dup * ] [ dup * ] dip dip
4 5 dup * [ dup * ] dip
4 5 5 * [ dup * ] dip
4 25 [ dup * ] dip
4 dup * 25
4 4 * 25
16 25
Great. It worked as we wanted it to. There is, however, a bug. The bug
is that the first call to the quotation has both 4 and 5 on the stack,
while the second call only has 4 on the stack. In other words, the
quotations are exposed to stacks of different sizes. If the quotation
supplied uses two elements on the stack rather than one, very weird
things will happen.
Surely a type system like Cat's would catch this, yes? Well, sort of.
Unfortunately, it does it in the worst way possible; it gives bi@ the
type 'A [A -> A b] -> A b b'. What this means is that the quotation
must keep the type of the stack the same and add another element. This
means doing '4 5 [dup *] bi@' would be a type error since '[dup *]' is
not an instance of '[A -> A b]'; it's an instance of '[A b -> A b]'.
Furthermore, it doesn't even indicate that two elements must be below
the quotation on the stack! As a result, all you can do are things
like '4 5 [6] bi@' to yield '4 5 6 6' which makes the function
completely useless.
One way to solve this problem is to use a version of 'dip' that calls
a quotation with only one element and requires the quotation to yield
one element. Cat's type system cannot enforce this, but Fifth's type
system can. Below are the types of the unrestricted 'dip' combinator
and the restricted version:
dip :: A b [A -> C] -> C b
dip1-1 :: a b [a -> c] -> c b
As you can see, the types are basically the same, except the quotation
can only take one element 'a' and transform it to one element 'c'
rather than the stack 'A' to the stack 'C'. Using the restricted
combinator, our function would look like this:
bi@ = dup dip1-1 dip1-1
We could then assign it the following type:
bi@ :: a a [a -> b] -> b b
This type is much more like what we wanted. Indeed, it'll let us do '4
5 [dup *] bi@'. It does have the restriction however that both
elements below the quotation must be of the same type. To solve this
we need higher rank types which is another discussion altogether.
Now the nasty bit: 'bi@ = dup dip dip' can be assigned the type above
in Fifth provided that you give it a signature. The signature 'a a [a -
> b] -> b b', however, is not an instance of the type that is
inferred (which is the same as Cat's: 'A [A -> A b] -> A b b'). This
is very unfortunate as it means Fifth does not have principal types in
the presence of n-ary combinators! This problem would hold for Cat as
well if it had a way of enforcing similar arity restrictions (which it
will need to be able to eventually for things like callbacks if
nothing else).
This is rather damning, and it may mean that n-ary combinators have to
be completely eliminated unless a more expressive system can be found.
One way to do this would be fixed arity versions of 'dip' and 'i',
along with built-in mechanisms for pattern matching and deconstruction
(a la Haskell, et al) rather than simply using normal functions like
'unlist' as I've talked about previously. There would still need to be
some way to address the need to allow the user to write new
combinators like '2dip' that apply a quotation below the top two
elements of the stack rather than just one. Introducing something like
lambda expressions would be one way to do this (similar to how
Backus's Formal FP allowed lambda expressions for creating new
combinators).
It's important to remember that 'dup dip dip' works fine for our
expected use case, but has some very strange behaviors lurking about!
The new type system I'm working on does capture this by giving 'dup
dip dip' the type '[$A] => $A [$A] dip'. Essentially, it types 'dup
dip', realizes there's no most general way to proceed, and stops and
waits for more information about $A (the quotation passed in).
Unfortunately, this type system is undecidable, expensive, and yields
ugly types. What fun!
So what's the correct way to write 'bi@' so that Cat gives it the
correct type without placing an arity restriction on the quotation? It
turns out there isn't one. I thought maybe something like 'under apply
slip' would work, but I get the rather insane type 'A b (A self b -> C
(C -> D) e) -> D e'. Amusing. Another attempt was 'under [[apply dip]
dip apply', but this gets the type 'A b b -> (A b -> A) -> A' which
means you can only do things like '3 4 [pop] bi@' and hence it's
rather useless. If you think about it, there's really no way it could
possibly give a useful type without an arity restriction. Well,
actually it could if you had first-class stacks. That's something else
I have to consider as a possible solution...
Hey, I thought concatenative languages were supposed to be simpler!
- John
John Nowak — 2008-05-21 08:15:32
On May 21, 2008, at 4:08 AM, John Nowak wrote:
> Another attempt was 'under [[apply dip]
> dip apply', but this gets the type 'A b b -> (A b -> A) -> A' which
> means you can only do things like '3 4 [pop] bi@' and hence it's
> rather useless.
Minor correction: The type is 'A b b (A b -> A) -> A'.
Too much curry.
- John
Christopher Diggins — 2008-05-21 15:19:11
> It's important to remember that 'dup dip dip' works fine for our
> expected use case, but has some very strange behaviors lurking about!
> The new type system I'm working on does capture this by giving 'dup
> dip dip' the type '[$A] => $A [$A] dip'. Essentially, it types 'dup
> dip', realizes there's no most general way to proceed, and stops and
> waits for more information about $A (the quotation passed in).
> Unfortunately, this type system is undecidable, expensive, and yields
> ugly types. What fun!
On the other hand this looks like a turing-complete programming
language at the type level. Perhaps if you go that route, and not
worry about decidability, you will arrive at something interesting.
That Cat type issues you have identified are appreciated. They are
related to the inability to properly assign the correct level of
nesting of nested forall qualifiers, and to do renaming properly. This
is still something that I have on my to-do list, to try and improve.
Oh, and for the record, I still want variable arity functions in
concatenative langauges, where the arity depends on the arity of
function parameters. However, some of the Joy primitives have a little
too much flexibility with stack access: I don't want all of my list
processing functions to have access to the stack, because it seriously
impedes performance. What I am doing in Cat is making multiple
versions of the list processing functions (like Map and Fold): ones
that allows unfettered access to the stack (and thus is order
dependent) and others that don't.
- Christopher
William Tanksley, Jr — 2008-05-21 16:15:00
John Nowak <
john@...> wrote:
> William Tanksley, Jr wrote:
>>>> At this point, it would seem convenient to add some syntax for
>>>> staticly-known (and required) parameters. Perhaps (i 4 4) would be
>>>> an example of such syntax.
>>> it would be rather annoying to have to do '[+] (dip 2 1)' instead.
>> Agreed; this should tie in with type inference and annotation. This
>> stack effect notation is really an abbreviated form of type notation
>> and static polymorphism / function overloading, so perhaps it should
>> be clearly annotated as such (rather than simply looking like a macro,
>> as I suggested).
> It seems to me that the way to handle this is to simply require a
> normal type annotation that provides enough information to figure
> these things out.
I agree; for a language with static typechecking, this makes complete
sense. If you don't have static typechecking but you want to do static
stack depths, you have to check and annotate arities.
> Specifying arities directly seems like it would be too much of a pain.
Sure, if you also have to specify types it's a waste of time. If you
don't have types it's easier to add arities than it is to add types.
> I'm possibly just talking to myself at this point, but it seems like I
> should just give up on the idea of preserving cut-and-paste factoring
> in all cases, use a simpler yet highly expressive type system that
> occasionally requires annotations like Leijen's HMF or HML,
In some cases disallowing cut-and-paste is the right thing to do
(specifically, when the associative property doesn't hold due to side
effects)... Unfortunately, our type theory isn't sophisticated enough
to differentiate between those legitimate cases and other less
legitimate ones.
> drop macros (since we have no binding forms to deal with, we have a
> lightweight syntax for quotations, we don't build abstractions around
> things like SET!, postfix syntax is already quite good for DSLs, etc),
I suppose that makes sense. Perhaps macros can be a design goal... Or
perhaps they'll have to be a feature for Sixth.
> and just get on with writing an implementation already.
I got no complaints. I still haven't published my 'tworing' language.
> - John
-Wm
William Tanksley, Jr — 2008-05-21 17:39:37
John Nowak <
john@...> wrote:
> Factor's 'bi@' combinator. The 'bi@' combinator takes some x, some y,
> and a quotation which it applies to both. As an example, '4 5 [ dup
> * ] bi@' yields '16 25'.
> Here's one way we might think to implement 'bi@' in Factor:
> : bi@ ( x y quot -- x' y' ) dup dip dip ;
Great point. Here we seem to actually *need* type annotations on the
function types... I wonder whether it might be a fundamental property
and not just an accident of our lack of understanding.
> This is rather damning, and it may mean that n-ary combinators have to
> be completely eliminated unless a more expressive system can be found.
Or such combinators may have to be accommodated via specialized syntax.
> One way to do this would be fixed arity versions of 'dip' and 'i',
> along with built-in mechanisms for pattern matching and deconstruction
> (a la Haskell, et al) rather than simply using normal functions like
> 'unlist' as I've talked about previously.
Seems like a smart idea.
> There would still need to be
> some way to address the need to allow the user to write new
> combinators like '2dip' that apply a quotation below the top two
> elements of the stack rather than just one. Introducing something like
> lambda expressions would be one way to do this (similar to how
> Backus's Formal FP allowed lambda expressions for creating new
> combinators).
> Unfortunately, this type system is undecidable, expensive, and yields
> ugly types. What fun!
Undecidable isn't always bad. In general, types are undecidable. The
question is whether we're drawing the line at the right location, or
whether we're throwing away too much info.
> Hey, I thought concatenative languages were supposed to be simpler!
They appear to be, but the fundamental research for them hasn't been
done yet. You're doing the research at the same time as you're
generating a practical application. This is a recipe for high risk.
Hope you don't have any large amounts of money invested in this!
> - John
-Wm
John Nowak — 2008-05-22 06:03:04
On May 21, 2008, at 11:19 AM, Christopher Diggins wrote:
> That Cat type issues you have identified are appreciated. They are
> related to the inability to properly assign the correct level of
> nesting of nested forall qualifiers, and to do renaming properly. This
> is still something that I have on my to-do list, to try and improve.
I think you may be underestimating the problem here. The issue is
that, for 'dup dip dip', all of these are valid and (somewhat) useful
types:
// Call a quotation yielding a single value on some stack twice
A (A -> A b) -> A b b
// Call a quotation on two separate values (the intended use)
A b b (b -> c) -> A c c
// Destroy the top two values of the stack in some special
// way, such as displaying them or specially deconstructing
A b b (b -> ) -> A
// Apply a binary function such as equality to two pairs on top
// of the stack
A b c b c (b c -> d) -> A d d
There are probably others as well. The huge issue here is that none of
these types are instances of another! If you believe Cat could
eventually type 'dup dip dip' in a satisfactory manner, what type
would you want to give it? As far as I can tell, without a system that
delays typing as necessary such as the one I've hinted at recently, it
isn't possible to give a most general type even with annotations.
Maybe a system with intersection types could do it.
I'm guessing here, but I think a system similar to Cat's cannot have
principal types if you allow non-row polymorphic functions alongside
row polymorphic functions. Given that you do need them for typing
things like 'bi@' in a satisfactory manner and for typing things like
callbacks (where you save a function that uses no more than N stack
elements somewhere), this seems to be a real problem.
It does seem though that row polymorphic functions or non-row
polymorphic functions alone do not cause an issue. For example, if you
have a version of 'dip' for every arity combination, you don't run
into this issue. If you only have row polymorphic functions, then 'A
(A -> A b) -> A b b' is a most general type. It's the combination of
an n-ary dip with non-row polymorphic constraints that causes the
issue (because you can make many more uses of a function type check).
Even though the new system I'm working on can handle 'dup dip dip'
properly, it's not at all clear if that's even a desirable property!
After all, 'dup dip dip' can do quite a few different things depending
on how you constrain it. It may very well be better to have the
programmer give different types for each intended use and document the
possible uses separately. From an implementation point of view, 'A b b
(b -> c) -> A c c' and 'A b c b c (b c -> d) -> A d d' are very
different combinators anyway.
My issue then is that Fifth, as it is now, infers a type which is not
most general (as indeed no most general type exists for 'dup dip dip'
in my system). This is likely to confuse the programmer as their
function is (sort of) correct and type checks but cannot be used in
the intended manner. The only way I can think of right now to avoid
this problem is to get rid of n-ary combinators like 'dip' and require
the programmer to use a more specific combinator.
I'm also looking at taking an approach similar to FP or FL where you
have certain combining forms that take functions directly. For
example, I could use the 'dip' combining form to do something like
'(dip +)' which would add two values below the top of the stack. The
difference here is that the '+' function is supplied directly to the
combining form and hence is known statically. Because of this, there's
no need for any sort of arity specification or multiple versions of
'dip'. If you wanted to 'dip' a quotation only known at runtime, you
could combine the 'dip' combining form with the 'i' combining form
that requires an arity specification. For example, to dip any binary
function on top of the stack, you could do '(dip (i 2 1))'. What I
like about this approach is that the static uses avoid redundant arity
annotations and dynamic uses make their intent explicit.
- John
John Nowak — 2008-05-22 06:28:09
On May 21, 2008, at 1:39 PM, William Tanksley, Jr wrote:
> John Nowak <john@...> wrote:
>
>> Factor's 'bi@' combinator. The 'bi@' combinator takes some x, some y,
>> and a quotation which it applies to both. As an example, '4 5 [ dup
>> * ] bi@' yields '16 25'.
>> Here's one way we might think to implement 'bi@' in Factor:
>> : bi@ ( x y quot -- x' y' ) dup dip dip ;
>
> Great point. Here we seem to actually *need* type annotations on the
> function types... I wonder whether it might be a fundamental property
> and not just an accident of our lack of understanding.
It's a fundamental property, at least for any type system resembling
the one in Fifth as it stands now. This comes about because we're not
"hiding" the result of the first 'dip' from the second. One way around
this would be to have 'bi@' bundle the two values into their own sub-
stacks before applying the quotation. The implementation and type
would then be something like this:
-- assume this:
stack :: A b -> A (Stack b)
infra :: A (Stack B) (B -> C) -> A (Stack C)
-- so we can define this:
bi@ :: A b b (b -> C) -> A (Stack C) (Stack C)
bi = [stack [stack] dip] dip dup [infra] dip [infra] dip
The above code already type checks and works in Fifth as expected. As
you can see though, it's a huge pain in the ass; 'dup dip dip' is much
nicer. It's also difficult to compile to efficient code.
The version of 'bi@' with type 'a a (a -> b) -> b b' is certainly
better. This type is quite similar to the type 'bi@' in Haskell:
bi@ :: (a -> b) -> a -> a -> (b, b)
bi f x y = (f x, f y)
It may seem limiting to require the quotation be of arity 1->1, but
the quotation can always create its own sub-stacks or tuples if it
wants to return multiple values for each use. I think encouraging
separate computations to be returned in their own bundles rather than
in one pile on the stack is a good idea anyway. If you do this, you
can get away with far fewer n-ary combinators.
>> This is rather damning, and it may mean that n-ary combinators have
>> to
>> be completely eliminated unless a more expressive system can be
>> found.
>
> Or such combinators may have to be accommodated via specialized
> syntax.
Indeed. See the end of the email I sent in response to Chris right
before this one talking about the possibility for using something like
FP's combining forms.
- John
John Nowak — 2008-05-22 07:38:58
On May 22, 2008, at 2:03 AM, John Nowak wrote:
> There are probably others as well. The huge issue here is that none of
> these types are instances of another! If you believe Cat could
> eventually type 'dup dip dip' in a satisfactory manner, what type
> would you want to give it? As far as I can tell, without a system that
> delays typing as necessary such as the one I've hinted at recently, it
> isn't possible to give a most general type even with annotations.
> Maybe a system with intersection types could do it.
Slava has suggested this intersection type for 'bi@':
bi@ :: A b (A -> C /\ C b -> D) -> D
This is a rather interesting type. It only requires one element on the
stack below the quotation; it applies the quotation to whatever is
below the top element, then applies it again to whatever the first
quotation did with 'b' added to the top. When using two (or more)
elements below the quotation, it is not required that they be of the
same type. This is a very general type that would cover all uses I can
think of.
The downside, of course, is that the purpose of 'bi@' is not clear
from the type. Additionally, many misuses of 'bi@' would not be caught
until possibly much later. Still, the generality is appealing.
The kind people on #concatenative also pointed out that 'dup dip dip'
behaves differently than the version of 'bi@' in Factor because it
applies the quotation to 'y' first, not 'x'. In an impure language,
these are obviously not equivalent. A definition that matches the
semantics of 'bi@' in Factor is 'dup [ dip ] dip call', or '[ dip ]
keep call'.
So far, I've only done a small about of work with intersection types
in a concatenative setting. Perhaps it is worth more attention.
- John
John Nowak — 2008-05-22 08:14:58
On May 22, 2008, at 3:38 AM, John Nowak wrote:
> The kind people on #concatenative also pointed out that 'dup dip dip'
> behaves differently than the version of 'bi@' in Factor because it
> applies the quotation to 'y' first, not 'x'. In an impure language,
> these are obviously not equivalent. A definition that matches the
> semantics of 'bi@' in Factor is 'dup [ dip ] dip call', or '[ dip ]
> keep call'.
Last thought on this for tonight. I wanted to note that 'dup dip dip'
is actually just plain wrong if you want to use it for quotations with
arities beyond 1->1. For example, here's the Factor version of 'bi@'
at work with a quotation of arity 1->2, 'dup':
1 2 [ dup ] bi@
1 2 [ dup ] dup [ dip ] dip call
1 2 [ dup ] [ dup ] [ dip ] dip call
1 2 [ dup ] dip [ dup ] call
1 dup 2 [ dup ] call
1 1 2 [ dup ] call
1 1 2 dup
1 1 2 2
As you can see, it applies 'dup' to both 1 and 2 as expected. Here's
my broken version:
1 2 [ dup ] bi@_broken
1 2 [ dup ] dup dip dip
1 2 [ dup ] [ dup ] dip dip
1 2 dup [ dup ] dip
1 2 2 [ dup ] dip
1 2 dup 2
1 2 2 2
Not so good. Anyway, if we use the /correct/ version of 'bi@' from
Factor, my new system-in-progress can give it this rather nice looking
type that very clearly shows what it does:
$a [$B] -> $B $a $B
Maybe the trick to making n-ary combinators understandable is a more
powerful system, not a more restricted one.
- John
William Tanksley, Jr — 2008-05-22 14:27:50
John Nowak <
john@...> wrote:
>> A definition that matches the
>> semantics of 'bi@' in Factor is 'dup [ dip ] dip call', or '[ dip ]
>> keep call'.
> Last thought on this for tonight. I wanted to note that 'dup dip dip'
> is actually just plain wrong if you want to use it for quotations with
> arities beyond 1->1. For example, here's the Factor version of 'bi@'
> at work with a quotation of arity 1->2, 'dup':
Ah, that does make sense.
> Not so good. Anyway, if we use the /correct/ version of 'bi@' from
> Factor, my new system-in-progress can give it this rather nice looking
> type that very clearly shows what it does:
> $a [$B] -> $B $a $B
> Maybe the trick to making n-ary combinators understandable is a more
> powerful system, not a more restricted one.
Hmm. I certainly like that idea. That's what I meant by asking whether
the problem was fundamental -- it appears that it wasn't; our problem
was due to a foggy specification of bi@ that caused us to accept an
incorrect implementation. We can't fix foggy specifications :-), but
at least this problem isn't caused by some fundamental property of
concatenative languages.
Nonetheless, it seems obvious that some programs will have equivocal
typings (two or more entirely different type inferences which imply
different capabilities) and only one of those typings will be intended
by the author. Equally obviously, requiring handwritten type
signatures is not a general solution (unless we require type
signatures on ALL methods as does Scala).
One idea might be to tie the inferencing system into a
design-by-contract and behavior-driven-development system. DBC
expresses universal properties (i.e. for-all properties), while BDD in
this sense expresses particular examples (i.e. there-exists
properties). If the inference system had direct access to both types
of specification, it could make better decisions and print clearer
error messages (and even, perhaps, eventually make guesses that the
programmer needs to write a given type of test in order to clarify the
types his program is using). Both types of specification are, of
course, valuable in their own right, so this seems like it could be a
win-win solution.
> - John
-Wm
John Nowak — 2008-05-22 14:50:18
On May 22, 2008, at 10:27 AM, William Tanksley, Jr wrote:
> That's what I meant by asking whether the problem was fundamental
> -- it appears that it wasn't; our problem was due to a foggy
> specification of bi@ that caused us to accept an incorrect
> implementation.
Well, to be clear, Cat and Fifth can't infer a good type for the
correct version either. You end up with 'A b b (A b -> A) -> A'. It's
the same general problem where the first application gets in the way
of the second. This is directly related to the fact that you're doing
all your computation on a single stack and the second application (in
this case) is allowed access to the entire state of the program at the
limit. This is what I meant by it being a fundamental problem for a
system like Fifth's when n-ary combinators are involved.
With the newer (and not yet fully implemented) system, you can infer a
good type with the clearer specification, but only because the type
'$a [$B] -> $B $a $B' *is* the specification. It's somewhat concerning
however that I need a system that will do whole program analysis if
necessary in order to avoid making assumptions. Only more experience
will tell if it's feasible in practice. I haven't been able to try it
with anything but programs of a few lines yet.
- John
William Tanksley, Jr — 2008-05-22 15:32:28
John Nowak <
john@...> wrote:
> William Tanksley, Jr wrote:
>> That's what I meant by asking whether the problem was fundamental
>> -- it appears that it wasn't; our problem was due to a foggy
>> specification of bi@ that caused us to accept an incorrect
>> implementation.
> Well, to be clear, Cat and Fifth can't infer a good type for the
> correct version either. You end up with 'A b b (A b -> A) -> A'. It's
Sorry, I was talking about your new implementation -- you said that
one would handle it correctly.
> the same general problem where the first application gets in the way
> of the second. This is directly related to the fact that you're doing
> all your computation on a single stack and the second application (in
> this case) is allowed access to the entire state of the program at the
> limit. This is what I meant by it being a fundamental problem for a
> system like Fifth's when n-ary combinators are involved.
If it's a problem for "a system like Fifth" it's not a fundamental
problem with concatenative languages in general :-). Sorry I'm selfish
that way.
> With the newer (and not yet fully implemented) system, you can infer a
> good type with the clearer specification, but only because the type
> '$a [$B] -> $B $a $B' *is* the specification. It's somewhat concerning
> however that I need a system that will do whole program analysis if
> necessary in order to avoid making assumptions. Only more experience
> will tell if it's feasible in practice. I haven't been able to try it
> with anything but programs of a few lines yet.
Yes, whole-program can become bad... I imagine that HM's exponential
worst-case performance could manifest itself unpleasantly.
Like I said, I'd like to see a system that tied static
typing/inference, BDD, and DBC. Such a system could (in theory) limit
its static analysis to just the function, its contract, and its BDD
specifications, and then require a unique concrete type (no $ type
variables).
> - John
-Wm
Daniel Ehrenberg — 2008-05-22 15:48:56
On Thu, May 22, 2008 at 2:38 AM, John Nowak <
john@...> wrote:
>
> On May 22, 2008, at 2:03 AM, John Nowak wrote:
>
>> There are probably others as well. The huge issue here is that none of
>> these types are instances of another! If you believe Cat could
>> eventually type 'dup dip dip' in a satisfactory manner, what type
>> would you want to give it? As far as I can tell, without a system that
>> delays typing as necessary such as the one I've hinted at recently, it
>> isn't possible to give a most general type even with annotations.
>> Maybe a system with intersection types could do it.
>
> Slava has suggested this intersection type for 'bi@':
>
> bi@ :: A b (A -> C /\ C b -> D) -> D
>
> This is a rather interesting type. It only requires one element on the
> stack below the quotation; it applies the quotation to whatever is
> below the top element, then applies it again to whatever the first
> quotation did with 'b' added to the top. When using two (or more)
> elements below the quotation, it is not required that they be of the
> same type. This is a very general type that would cover all uses I can
> think of.
>
> The downside, of course, is that the purpose of 'bi@' is not clear
> from the type. Additionally, many misuses of 'bi@' would not be caught
> until possibly much later. Still, the generality is appealing.
>
> The kind people on #concatenative also pointed out that 'dup dip dip'
> behaves differently than the version of 'bi@' in Factor because it
> applies the quotation to 'y' first, not 'x'. In an impure language,
> these are obviously not equivalent. A definition that matches the
> semantics of 'bi@' in Factor is 'dup [ dip ] dip call', or '[ dip ]
> keep call'.
>
Well, there actually is a difference in semantics in a pure language,
in the arities involved. Your bi@ can be used with a quotation of the
type ( b c -- d ) for an overall type of ( b c b c -- d d ) whereas
Factor's cannot. On the other hand, Factor's bi@ can be used for ( b
-- c d ) for an overall type of ( b b -- c d c d ) whereas yours
cannot (though it could be used for ( b -- b c ) as ( b b -- b c c ).
The type of dup dup dip might be something more like A b (A -> C e /\
C -> D) -> D e
Dan
John Nowak — 2008-05-22 17:58:40
On May 22, 2008, at 11:48 AM, Daniel Ehrenberg wrote:
> Well, there actually is a difference in semantics in a pure language,
> in the arities involved. Your bi@ can be used with a quotation of the
> type ( b c -- d ) for an overall type of ( b c b c -- d d ) whereas
> Factor's cannot. On the other hand, Factor's bi@ can be used for ( b
> -- c d ) for an overall type of ( b b -- c d c d ) whereas yours
> cannot (though it could be used for ( b -- b c ) as ( b b -- b c c ).
> The type of dup dup dip might be something more like A b (A -> C e /\
> C -> D) -> D e
Interesting. It seems the question going forward is if we want the
language and type system to do away with all this confusion and over-
generality or do its best to deal with it.
- John
John Nowak — 2008-05-22 18:01:35
On May 22, 2008, at 11:32 AM, William Tanksley, Jr wrote:
> If it's a problem for "a system like Fifth" it's not a fundamental
> problem with concatenative languages in general :-). Sorry I'm selfish
> that way.
I can't argue with that. It's just unfortunate that HM + row types
isn't enough.
> Like I said, I'd like to see a system that tied static
> typing/inference, BDD, and DBC. Such a system could (in theory) limit
> its static analysis to just the function, its contract, and its BDD
> specifications, and then require a unique concrete type (no $ type
> variables).
This is an interesting idea, although it might be a bit heavy in
practice. I'll give it a look.
- John