Notation for typed arrays in Cat
Christopher Diggins — 2008-04-04 15:26:07
I am again playing with the idea of how to introduce typed arrays in to Cat.
The following is the notation that I prefer, but I am concerned about
the confusion of ['a] meaning a list of 'a in types, and [a] meaning a
quotation in the code:
uncons : (['a] -> ['a] 'a)
cons : (['a] 'a -> ['a])
empty : (['a] -> ['a] bool)
count : (['a] -> ['a] int)
Another syntax I am seriously considering is:
uncons : ('a* -> 'a* 'a)
cons : ('a* 'a -> 'a*)
empty : ('a* -> 'a* bool)
count : ('a* -> 'a* int)
And yet another contender is:
uncons : (list('a) -> list('a) 'a)
cons : (list('a) 'a -> list('a))
empty : (list('a) -> list('a) bool)
count : (list('a) -> list('a) int)
This last one is interesting because it opens the door to the
possibility of full-blown meta-programming at the type-level. There
are hints of how the type-system could be used as a programming
language. Of course if I could also keep it postfix:
uncons : ('a list -> 'a list 'a)
cons : ('a list 'a -> 'a list)
empty : ('a list -> 'a list bool)
count : ('a list -> 'a list int)
This might be more consistent, but for some reason I feel more
comfortable with the prefix notation. It delineates the world of types
from the world of values more clearly for me.
Any thoughts or opinions on the subject would be appreciated.
- Christopher
Stevan Apter — 2008-04-04 17:43:22
i find quine's argument in _set theory and its logic_ for reversing domain -> range
to match our use of y=f x persuasive. in a sense, we're first interested in the
result, and then in the argument needed to produce it.
Y <- X
appeals to me, and in fact i use that ordering consistently in my own coding
practice. if you adopt that convention, then postfix makes more sense: you read
left-to-right and evaluate right-to-left.
----- Original Message -----
From: "Christopher Diggins" <cdiggins@...>
To: "concatenative" <concatenative@yahoogroups.com>; <catlanguage@...>
Sent: Friday, April 04, 2008 10:26 AM
Subject: [stack] Notation for typed arrays in Cat
>I am again playing with the idea of how to introduce typed arrays in to Cat.
>
> The following is the notation that I prefer, but I am concerned about
> the confusion of ['a] meaning a list of 'a in types, and [a] meaning a
> quotation in the code:
>
> uncons : (['a] -> ['a] 'a)
> cons : (['a] 'a -> ['a])
> empty : (['a] -> ['a] bool)
> count : (['a] -> ['a] int)
>
> Another syntax I am seriously considering is:
>
> uncons : ('a* -> 'a* 'a)
> cons : ('a* 'a -> 'a*)
> empty : ('a* -> 'a* bool)
> count : ('a* -> 'a* int)
>
> And yet another contender is:
>
> uncons : (list('a) -> list('a) 'a)
> cons : (list('a) 'a -> list('a))
> empty : (list('a) -> list('a) bool)
> count : (list('a) -> list('a) int)
>
> This last one is interesting because it opens the door to the
> possibility of full-blown meta-programming at the type-level. There
> are hints of how the type-system could be used as a programming
> language. Of course if I could also keep it postfix:
>
> uncons : ('a list -> 'a list 'a)
> cons : ('a list 'a -> 'a list)
> empty : ('a list -> 'a list bool)
> count : ('a list -> 'a list int)
>
> This might be more consistent, but for some reason I feel more
> comfortable with the prefix notation. It delineates the world of types
> from the world of values more clearly for me.
>
> Any thoughts or opinions on the subject would be appreciated.
>
> - Christopher
>
William Tanksley, Jr — 2008-04-04 17:25:09
Christopher Diggins <
cdiggins@...> wrote:
> I am again playing with the idea of how to introduce typed arrays in to Cat.
> The following is the notation that I prefer, but I am concerned about
> the confusion of ['a] meaning a list of 'a in types, and [a] meaning a
> quotation in the code:
Factor uses {} for arrays and [] for quotations (and lists). It does
seem to me that there should be a way to denote different type
treatment, especially if the two entities behave differently in other
ways (for example, lists could be conceptually sequential, while
arrays could be conceptually parallel; thus, "i" would work on lists,
while the combinators we're talking about would be expected to execute
an array).
> Another syntax I am seriously considering is:
> uncons : ('a* -> 'a* 'a)
> cons : ('a* 'a -> 'a*)
> empty : ('a* -> 'a* bool)
> count : ('a* -> 'a* int)
This nicely expresses "an unknown number of elements of this single
type", which is (I believe) what you mean by "array". What would an
array literal look like in Cat, though? It _should_ (one would think)
be possible to tell the difference between an untyped list and a typed
array reasonably quickly.
> And yet another contender is:
> uncons : (list('a) -> list('a) 'a)
> cons : (list('a) 'a -> list('a))
> empty : (list('a) -> list('a) bool)
> count : (list('a) -> list('a) int)
> This last one is interesting because it opens the door to the
> possibility of full-blown meta-programming at the type-level. There
> are hints of how the type-system could be used as a programming
> language.
Indeed true!
> Of course if I could also keep it postfix:
> uncons : ('a list -> 'a list 'a)
> cons : ('a list 'a -> 'a list)
> empty : ('a list -> 'a list bool)
> count : ('a list -> 'a list int)
> This might be more consistent, but for some reason I feel more
> comfortable with the prefix notation. It delineates the world of types
> from the world of values more clearly for me.
I think the reason you feel more comfortable with prefix is simple:
your type notation is not a concatenative language. Making "list"
pretend to be a concatenative operator would require the reader to
"know" in advance that "list" isn't a valid typename.
Suppose the user defined a typename which worked like 'list', but
which was named 'a'. All of a sudden, all functions which used the
symbol a for one of their types would become syntax errors.
So... I agree with you.
> Any thoughts or opinions on the subject would be appreciated.
I'm just an echo.
> - Christopher
-Wm
Christopher Diggins — 2008-04-04 17:42:15
On Fri, Apr 4, 2008 at 1:25 PM, William Tanksley, Jr
<
wtanksleyjr@...> wrote:
> Christopher Diggins <cdiggins@...> wrote:
> > I am again playing with the idea of how to introduce typed arrays in to
> Cat.
> > The following is the notation that I prefer, but I am concerned about
> > the confusion of ['a] meaning a list of 'a in types, and [a] meaning a
> > quotation in the code:
>
> Factor uses {} for arrays and [] for quotations (and lists). It does
> seem to me that there should be a way to denote different type
> treatment, especially if the two entities behave differently in other
> ways (for example, lists could be conceptually sequential, while
> arrays could be conceptually parallel; thus, "i" would work on lists,
> while the combinators we're talking about would be expected to execute
> an array).
> > Another syntax I am seriously considering is:
> > uncons : ('a* -> 'a* 'a)
> > cons : ('a* 'a -> 'a*)
> > empty : ('a* -> 'a* bool)
> > count : ('a* -> 'a* int)
>
> This nicely expresses "an unknown number of elements of this single
> type", which is (I believe) what you mean by "array". What would an
> array literal look like in Cat, though? It _should_ (one would think)
> be possible to tell the difference between an untyped list and a typed
> array reasonably quickly.
Sorry for the confusion I am using the terms "array" and "list"
interchangeably. In Cat a quotation is not a list (or array or
whatever). There is also no built-in syntax for writing list literals
in Cat. A quotation is [1 2 3]. A list is generated using the "list"
instruction. E.g. "[1 2 3] list". We can overload "(", ")" and "," to
construct lists like (1, 2, 3) but this is just syntactic sugar.
What I am proposing that all lists in Cat will become either lists of
"any" (e.g. dynamically typed) or can be lists of some specific type
(e.g. int*, or ('A -> 'B)*)
I hope this clarifies things
Thanks for your feedback!
- Christopher
John Nowak — 2008-04-04 20:40:31
On Apr 4, 2008, at 11:26 AM, Christopher Diggins wrote:
> I am again playing with the idea of how to introduce typed arrays in
> to Cat.
> The following is the notation that I prefer
Here's what I'm doing in Fifth:
Row variable: A B XY -- uppercase
Scalar variable: a b xy -- lowercase
Atomic type: Num Char File -- titlecase, 2+ chars
Parameterized type: (Tree a) (Ref Int) (Table a b)
List: {a} {Int} {String}
Quotation: [A b c -> A c b] [A b -> A b b] [A [A -> B] -> B]
And an example:
unlist :: A {b} [A -> C] [A b {b} -> C] -> C
Note that the quotation syntax is a bit more complex in practice due
to effects. The version above is what programmers use when giving
annotations if they don't care to be more explicit.
- John
Daniel Ehrenberg — 2008-04-04 22:48:10
A quick question: what's the rationale for using {a} as the type for a
list instead of (List a)? There's no reason for lists to be built-in,
right? Overall, though, that syntax is pretty clean-looking, I think.
Dan
On Fri, Apr 4, 2008 at 3:40 PM, John Nowak <john@...> wrote:
>
> Here's what I'm doing in Fifth:
>
> Row variable: A B XY -- uppercase
> Scalar variable: a b xy -- lowercase
> Atomic type: Num Char File -- titlecase, 2+ chars
> Parameterized type: (Tree a) (Ref Int) (Table a b)
> List: {a} {Int} {String}
> Quotation: [A b c -> A c b] [A b -> A b b] [A [A -> B] -> B]
>
> And an example:
>
> unlist :: A {b} [A -> C] [A b {b} -> C] -> C
>
> Note that the quotation syntax is a bit more complex in practice due
> to effects. The version above is what programmers use when giving
> annotations if they don't care to be more explicit.
>
> - John
>
John Nowak — 2008-04-04 23:37:46
On Apr 4, 2008, at 6:48 PM, Daniel Ehrenberg wrote:
> A quick question: what's the rationale for using {a} as the type for a
> list instead of (List a)? There's no reason for lists to be built-in,
> right? Overall, though, that syntax is pretty clean-looking, I think.
Curly braces are used for list literals so the type syntax simply
matches the expression syntax. This is the same reason that square
brackets are used for the types of quotations. Using {a} instead of
(List a) also helps keep type sizes a bit more manageable.
None of this is set in stone of course. In particular, I still have no
good way for notating everything the effect system tracks. For
example, if you pass a function two arrays and it mutates one of them,
the effect system can tell you which of the two arrays is the one that
gets changed. Unfortunately, I've not come up with a nice way to
notate this yet.
I should also note that types are just normal s-expressions; {a}
parses to (:list a) and [a b c] parses to (:quote a b c). As such,
normal macros can give you parameterized type synonyms and possibly
other useful things.
- John
John Nowak — 2008-04-04 23:53:42
On Apr 4, 2008, at 1:25 PM, William Tanksley, Jr wrote:
> Christopher Diggins <cdiggins@...> wrote:
>
>> ...
>>
>
> I think the reason you feel more comfortable with prefix is simple:
> your type notation is not a concatenative language.
I don't suppose anyone has an idea for concatenative type annotations?
It seems like a terrible idea, but here's a first stab anyway:
-- normal
swap :: A b c -> A c b
i :: A [A -> B] -> B
unlist :: A {b} [A -> C] [A b {b} -> C] -> C
-- concatenative?
swap :: [A b c] [A c b] 'fn
i :: [A [A] [B] 'fn] [B] 'fn
unlist :: [A b 'list [A] [C] 'fn [A b b list] [C] 'fn] [C] 'fn
Well that's sort of horrible, isn't it. I don't suppose anyone has a
magical set of type-level combinators to make this beautiful? Is it
possible to imagine doing this without variable names?
- John
John Nowak — 2008-04-05 00:21:30
On Apr 4, 2008, at 7:53 PM, John Nowak wrote:
> -- concatenative?
> swap :: [A b c] [A c b] 'fn
> i :: [A [A] [B] 'fn] [B] 'fn
> unlist :: [A b 'list [A] [C] 'fn [A b b list] [C] 'fn] [C] 'fn
Here's another attempt at a concatenative type notation. We assume
some function 'r' that pushes a fresh row variable onto the stack and
some function 's' that pushes a fresh scalar variable onto the stack.
You can then use *any normal function* to manipulate these on the type
level where computation is done using the *types* of those normal
functions. We also have some binary function 'fn' for making functions
where the second argument describes how to *change* the first, some
unary function 'list' for making a parameterized list, and a nullary
function 'num' for making a number. Oh, and a 'prod' function that
returns the type of the production of a function. And I use my tick
notation as well as otherwise you need 'swap' to define the type of
'swap', etc...
Now that I've totally confused you, here you go:
swap :: [r s s] [`ba] fn
dup :: [r s] [`aa] fn
drop :: [r s] [`b] fn
nip :: [r s s] [swap drop] fn
add :: [r num num] [drop] fn
cons :: [r s dup list] [nip] fn
And now a bit more complicated:
i :: [v dup quote [drop v] fn] [nip prod] fn
Here's a trace of 'i' to see how it might be working:
v: A
dup: A A
quote: A [A]
[drop v]: A [A] [drop B]
fn: A [A] [A drop B]
A [A] [B]
A [A -> B]
- - -
nip: [A -> B]
prod: B
This is obviously very, very loose. It seems you'd really need more
appropriate type-level combinators. Then there's the whole issue of
giving back coherent concatenative types after performing inference...
To be clear, this is not a serious proposal. If anyone thinks there's
a shred of feasibility however, I'd be interested to hear about it...
- John
John Nowak — 2008-04-05 00:23:33
On Apr 4, 2008, at 8:21 PM, John Nowak wrote:
> And now a bit more complicated:
>
> i :: [v dup quote [drop v] fn] [nip prod] fn
By 'v', I meant 'r'.
I swear I do proofread these before I send them...
- John
William Tanksley, Jr — 2008-04-05 15:42:32
John Nowak <
john@...> wrote:
> Here's another attempt at a concatenative type notation. We assume
Technically, strongForth's type notation is stack based. (The
exceptions are his compound types, and even those could be made so.)
I'm not sure what it buys you, though -- types don't have the same
factorization properties as words, do they?
> - John
-Wm
john@johnnowak.com — 2008-04-05 19:57:20
> John Nowak <john@...> wrote:
>> Here's another attempt at a concatenative type notation. We assume
>
> Technically, strongForth's type notation is stack based. (The
> exceptions are his compound types, and even those could be made so.)
>
> I'm not sure what it buys you, though -- types don't have the same
> factorization properties as words, do they?
Indeed; I don't think it buys you much. It's very likely a terrible idea.
- John
Manfred Von Thun — 2008-04-09 07:44:31
On 5/4/08 1:26 AM, "Christopher Diggins" <
cdiggins@...> wrote:
> I am again playing with the idea of how to introduce typed arrays in to Cat.
>
Before you do that, consider the various uses of typing annotations. Note
that this list is sort of cumulative:
1. To help the implementer/maintainer to understand the implementation
2. To help the user (as part of the manual or help facility). This is how I
did it in Joy, the typing information is part of the help message. (A very
primitive way of doing it.) But the typechecking is done independently in
the interpreter.
3. To become part of the implementation for an interpreter: before an
operator is executed, the current stack is checked against the typing
requirement for that operator. This guarantees that the manual and the
implementation actually agree (what an extraordinary thought!).
>
4. To become part of a compiler which checks whether the output-stack from
one operator is suitable as the input-stack for the next operator. For a
definition it should check the body, from the input-stack for the first
operator it should determine the input-stack for the whole body, and from
the output-stack of the last operator it should determine the output stack
of the whole body.
Clearly (4.) is the most desirable, it encompasses all others before. The
way to do it is with pattern matching available in Prolog, ML(?) and
Haskell(?), and probably in some versions of Lisp. But you need a good
pattern notation to apply it to. This is how I tried to do it with my
experiments on Joy in Prolog. a year ago. If I did it again, I would change
quite a few things, but the central idea was essentially right. I give an
outline:
Stacks, written as Prolog lists:
[A, B, C | D] is a stack of at least 3 elements A is an empty stack
(suitable only suitable for push) [A, B, C] would be a stack of exactly 3
elements (hardly useful)
swap turns [A, B | C) into [B, A | C). # This is the Prolog code !
Stack elements, have typing information when the type matters: Assume that
we have a general numeric type, with two subtypes
A:num:B is a numeric stack element, subtype unspecified:
+ turns [A:num:B, C:num:D | E) into [F:num:G | E] :- F is A + C. # Prolog !
A:num:int is an integer numeric stack element.
A:num:float a a float numeric stack element.
Note that addition and several other operators work on both subtypes, but
not all do:
succ turns [A:num:int | B] into [C:num:int | B] :- C is A +1. #Prolog
Assume that we have a similar notation for lists, possibly with subtypes:
A:list:B is a list of elements of unspecified type A:list:num:float is a
list of floats. A:list:list:B is a list of lists of unspecified type.
cons turns [A:list:B, C:B | D] into [[C:B, A]:list:B | D].
And so on for all operators. Now the important bit, the concatenative
principle: output stack from one operator must match input stack for next.
In the first line below, E is the intermediate stack pattern. Here is the
interpreter ( = exterm() in Joy, John Cowan might recognise).
[A | B] turns C into D :- A turns C into E, B turns E into D. [] turns A
into A.
Now the marvellous bit: given the definition foo == bar baz zot, where bar
baz zot are primitives or have been previously defined, Prolog automatically
works out the input and output stacks for foo. I did not have to write any
code for this, in fact it took me some time to notice that Prolog had done
it. Such is the power of unification.
So, even if your implementation language is not Prolog, you should consider
whether similar techniques could be written in your implmentation language.
I have to confess that I have never written a unifier, but I have seen very
short (half a page) versions. It may even be that for the above purposes
only a very simple version is needed.
Best wishes with the project.
- Manfred
[Non-text portions of this message have been removed]
john@johnnowak.com — 2008-04-10 02:46:55
On April 9th, 2008 3:44 AM, Manfred Von Thun wrote:
> Now the important bit, the concatenative
> principle: output stack from one operator must match input stack for
> next. Now the marvellous bit: given the definition foo == bar baz zot,
> where bar baz zot are primitives or have been previously defined, Prolog
> automatically. works out the input and output stacks for foo. I did not
> have to write any code for this, in fact it took me some time to notice
> that Prolog had done it. Such is the power of unification.
>
> So, even if your implementation language is not Prolog, you should
> consider whether similar techniques could be written in your
> implmentation language. I have to confess that I have never written a
> unifier, but I have seen very short (half a page) versions.
This is essentially the approach that Fifth takes. Indeed, the core of the
type system (essentially Cat's system plus parameterized types and part of
the effect system) is roughly 50 lines of Scheme.
- John
Christopher Diggins — 2008-04-10 03:00:29
On Wed, Apr 9, 2008 at 3:44 AM, Manfred Von Thun
<
m.vonthun@...> wrote:
> On 5/4/08 1:26 AM, "Christopher Diggins" <cdiggins@...> wrote:
>
> > I am again playing with the idea of how to introduce typed arrays in to
> Cat.
> >
> Before you do that, consider the various uses of typing annotations. Note
> that this list is sort of cumulative:
>
> 1. To help the implementer/maintainer to understand the implementation
>
> 2. To help the user (as part of the manual or help facility). This is how I
> did it in Joy, the typing information is part of the help message. (A very
> primitive way of doing it.) But the typechecking is done independently in
> the interpreter.
>
> 3. To become part of the implementation for an interpreter: before an
> operator is executed, the current stack is checked against the typing
> requirement for that operator. This guarantees that the manual and the
> implementation actually agree (what an extraordinary thought!).
>
> 4. To become part of a compiler which checks whether the output-stack from
> one operator is suitable as the input-stack for the next operator. For a
> definition it should check the body, from the input-stack for the first
> operator it should determine the input-stack for the whole body, and from
> the output-stack of the last operator it should determine the output stack
> of the whole body.
>
> Clearly (4.) is the most desirable, it encompasses all others before.
FWIW I think your list is accurate and useful.
> The
> way to do it is with pattern matching available in Prolog, ML(?) and
> Haskell(?), and probably in some versions of Lisp. But you need a good
> pattern notation to apply it to. This is how I tried to do it with my
> experiments on Joy in Prolog. a year ago. If I did it again, I would change
> quite a few things, but the central idea was essentially right. I give an
> outline:
>
> Stacks, written as Prolog lists:
>
> [A, B, C | D] is a stack of at least 3 elements A is an empty stack
> (suitable only suitable for push) [A, B, C] would be a stack of exactly 3
> elements (hardly useful)
>
> swap turns [A, B | C) into [B, A | C). # This is the Prolog code !
That is very cool.
> Stack elements, have typing information when the type matters: Assume that
> we have a general numeric type, with two subtypes
>
> A:num:B is a numeric stack element, subtype unspecified:
>
> + turns [A:num:B, C:num:D | E) into [F:num:G | E] :- F is A + C. # Prolog !
>
> A:num:int is an integer numeric stack element.
>
> A:num:float a a float numeric stack element.
>
> Note that addition and several other operators work on both subtypes, but
> not all do:
>
> succ turns [A:num:int | B] into [C:num:int | B] :- C is A +1. #Prolog
>
> Assume that we have a similar notation for lists, possibly with subtypes:
> A:list:B is a list of elements of unspecified type A:list:num:float is a
> list of floats. A:list:list:B is a list of lists of unspecified type.
>
> cons turns [A:list:B, C:B | D] into [[C:B, A]:list:B | D].
>
> And so on for all operators. Now the important bit, the concatenative
> principle: output stack from one operator must match input stack for next.
> In the first line below, E is the intermediate stack pattern. Here is the
> interpreter ( = exterm() in Joy, John Cowan might recognise).
>
> [A | B] turns C into D :- A turns C into E, B turns E into D. [] turns A
> into A.
>
> Now the marvellous bit: given the definition foo == bar baz zot, where bar
> baz zot are primitives or have been previously defined, Prolog automatically
> works out the input and output stacks for foo. I did not have to write any
> code for this, in fact it took me some time to notice that Prolog had done
> it. Such is the power of unification.
Your approach to directly expose Prolog as the language for expressing
type is elegant and very powerful. My only hesitation is the syntax of
prolog, which may be accessible to a logician, are not great for
someone with only some collegial level math. In general the problem I
am having is with constructing an accessible syntax for people.
Currently I am now heavily leaning towards:
cons : (list['a] 'a -> list['a])
uncons : (list['a] -> list['a] 'a)
etc.
> So, even if your implementation language is not Prolog, you should consider
> whether similar techniques could be written in your implmentation language.
> I have to confess that I have never written a unifier, but I have seen very
> short (half a page) versions. It may even be that for the above purposes
> only a very simple version is needed.
Unifiers can be quite simple, given the right abstractions. I have
seen incredibly short versions in Scheme for instance. My unification
code is horribly naive. ;-)
> Best wishes with the project.
Thank you very much! And thank you for the little Prolog tutorial, it
was quite informative.
> - Manfred
- Christopher
Manfred Von Thun — 2008-04-15 05:23:42
On 10/4/08 1:00 PM, "Christopher Diggins" <
cdiggins@...> wrote:
>
> On Wed, Apr 9, 2008 at 3:44 AM, Manfred Von Thun
> <m.vonthun@... <mailto:m.vonthun%40latrobe.edu.au> > wrote:
>
>> > Now the marvellous bit: given the definition foo == bar baz zot, where bar
>> > baz zot are primitives or have been previously defined, Prolog
>> automatically
>> > works out the input and output stacks for foo. I did not have to write any
>> > code for this, in fact it took me some time to notice that Prolog had done
>> > it. Such is the power of unification.
>
> Your approach to directly expose Prolog as the language for expressing
> type is elegant and very powerful. My only hesitation is the syntax of
> prolog, which may be accessible to a logician, are not great for
> someone with only some collegial level math. In general the problem I
> am having is with constructing an accessible syntax for people.
>
That would not a problem at all.
The idea is that the implementation uses this notation internally, either
because the whole implementation is in Prolog or because it
uses a subset of Prolog to define the primitives. But the end-users
never see the Prolog at all, they just see a concatenative language.
When they define foo == bar baz zot, the unifier works out the input
and the output stacks, and possibly provide an error message when
there is a mismatch. So the end-users should never see the Prolog.
- Manfred
[Non-text portions of this message have been removed]