Notation for typed arrays in Cat

Christopher Diggins — 2008-04-04 15:26:07

I am again playing with the idea of how to introduce typed arrays in to Cat.



The following is the notation that I prefer, but I am concerned about

the confusion of ['a] meaning a list of 'a in types, and [a] meaning a

quotation in the code:



uncons : (['a] -> ['a] 'a)

cons : (['a] 'a -> ['a])

empty : (['a] -> ['a] bool)

count : (['a] -> ['a] int)



Another syntax I am seriously considering is:



uncons : ('a* -> 'a* 'a)

cons : ('a* 'a -> 'a*)

empty : ('a* -> 'a* bool)

count : ('a* -> 'a* int)



And yet another contender is:



uncons : (list('a) -> list('a) 'a)

cons : (list('a) 'a -> list('a))

empty : (list('a) -> list('a) bool)

count : (list('a) -> list('a) int)



This last one is interesting because it opens the door to the

possibility of full-blown meta-programming at the type-level. There

are hints of how the type-system could be used as a programming

language. Of course if I could also keep it postfix:



uncons : ('a list -> 'a list 'a)

cons : ('a list 'a -> 'a list)

empty : ('a list -> 'a list bool)

count : ('a list -> 'a list int)



This might be more consistent, but for some reason I feel more

comfortable with the prefix notation. It delineates the world of types

from the world of values more clearly for me.



Any thoughts or opinions on the subject would be appreciated.



- Christopher

Stevan Apter — 2008-04-04 17:43:22

i find quine's argument in _set theory and its logic_ for reversing domain -> range

to match our use of y=f x persuasive.  in a sense, we're first interested in the

result, and then in the argument needed to produce it.



    Y <- X



appeals to me, and in fact i use that ordering consistently in my own coding

practice.  if you adopt that convention, then postfix makes more sense:  you read

left-to-right and evaluate right-to-left.





 ----- Original Message ----- 

From: "Christopher Diggins" <cdiggins@...>

To: "concatenative" <concatenative@yahoogroups.com>; <catlanguage@...>

Sent: Friday, April 04, 2008 10:26 AM

Subject: [stack] Notation for typed arrays in Cat





>I am again playing with the idea of how to introduce typed arrays in to Cat.

> 

> The following is the notation that I prefer, but I am concerned about

> the confusion of ['a] meaning a list of 'a in types, and [a] meaning a

> quotation in the code:

> 

> uncons : (['a] -> ['a] 'a)

> cons : (['a] 'a -> ['a])

> empty : (['a] -> ['a] bool)

> count : (['a] -> ['a] int)

> 

> Another syntax I am seriously considering is:

> 

> uncons : ('a* -> 'a* 'a)

> cons : ('a* 'a -> 'a*)

> empty : ('a* -> 'a* bool)

> count : ('a* -> 'a* int)

> 

> And yet another contender is:

> 

> uncons : (list('a) -> list('a) 'a)

> cons : (list('a) 'a -> list('a))

> empty : (list('a) -> list('a) bool)

> count : (list('a) -> list('a) int)

> 

> This last one is interesting because it opens the door to the

> possibility of full-blown meta-programming at the type-level. There

> are hints of how the type-system could be used as a programming

> language. Of course if I could also keep it postfix:

> 

> uncons : ('a list -> 'a list 'a)

> cons : ('a list 'a -> 'a list)

> empty : ('a list -> 'a list bool)

> count : ('a list -> 'a list int)

> 

> This might be more consistent, but for some reason I feel more

> comfortable with the prefix notation. It delineates the world of types

> from the world of values more clearly for me.

> 

> Any thoughts or opinions on the subject would be appreciated.

> 

> - Christopher

>

William Tanksley, Jr — 2008-04-04 17:25:09

Christopher Diggins <cdiggins@...> wrote:

 > I am again playing with the idea of how to introduce typed arrays in to Cat.

>  The following is the notation that I prefer, but I am concerned about

>  the confusion of ['a] meaning a list of 'a in types, and [a] meaning a

>  quotation in the code:



 
Factor uses {} for arrays and [] for quotations (and lists). It does

seem to me that there should be a way to denote different type

treatment, especially if the two entities behave differently in other

ways (for example, lists could be conceptually sequential, while

arrays could be conceptually parallel; thus, "i" would work on lists,

while the combinators we're talking about would be expected to execute

an array).



 >  Another syntax I am seriously considering is:

>  uncons : ('a* -> 'a* 'a)

>  cons : ('a* 'a -> 'a*)

>  empty : ('a* -> 'a* bool)

>  count : ('a* -> 'a* int)



 
This nicely expresses "an unknown number of elements of this single

type", which is (I believe) what you mean by "array". What would an

array literal look like in Cat, though? It _should_ (one would think)

be possible to tell the difference between an untyped list and a typed

array reasonably quickly.



 >  And yet another contender is:

>  uncons : (list('a) -> list('a) 'a)

>  cons : (list('a) 'a -> list('a))

>  empty : (list('a) -> list('a) bool)

>  count : (list('a) -> list('a) int)

>  This last one is interesting because it opens the door to the

>  possibility of full-blown meta-programming at the type-level. There

>  are hints of how the type-system could be used as a programming

>  language.



 
Indeed true!



 > Of course if I could also keep it postfix:

>  uncons : ('a list -> 'a list 'a)

>  cons : ('a list 'a -> 'a list)

>  empty : ('a list -> 'a list bool)

>  count : ('a list -> 'a list int)

>  This might be more consistent, but for some reason I feel more

>  comfortable with the prefix notation. It delineates the world of types

>  from the world of values more clearly for me.



 
I think the reason you feel more comfortable with prefix is simple:

your type notation is not a concatenative language. Making "list"

pretend to be a concatenative operator would require the reader to

"know" in advance that "list" isn't a valid typename.



Suppose the user defined a typename which worked like 'list', but

which was named 'a'. All of a sudden, all functions which used the

symbol a for one of their types would become syntax errors.



So... I agree with you.



 >  Any thoughts or opinions on the subject would be appreciated.



 
I'm just an echo.



 >  - Christopher



 
-Wm

Christopher Diggins — 2008-04-04 17:42:15

On Fri, Apr 4, 2008 at 1:25 PM, William Tanksley, Jr

<wtanksleyjr@...> wrote:

 > Christopher Diggins <cdiggins@...> wrote:

> > I am again playing with the idea of how to introduce typed arrays in to

> Cat.

> > The following is the notation that I prefer, but I am concerned about

> > the confusion of ['a] meaning a list of 'a in types, and [a] meaning a

> > quotation in the code:

>

> Factor uses {} for arrays and [] for quotations (and lists). It does

> seem to me that there should be a way to denote different type

> treatment, especially if the two entities behave differently in other

> ways (for example, lists could be conceptually sequential, while

> arrays could be conceptually parallel; thus, "i" would work on lists,

> while the combinators we're talking about would be expected to execute

> an array).

> > Another syntax I am seriously considering is:

> > uncons : ('a* -> 'a* 'a)

> > cons : ('a* 'a -> 'a*)

> > empty : ('a* -> 'a* bool)

> > count : ('a* -> 'a* int)

>

> This nicely expresses "an unknown number of elements of this single

> type", which is (I believe) what you mean by "array". What would an

> array literal look like in Cat, though? It _should_ (one would think)

> be possible to tell the difference between an untyped list and a typed

> array reasonably quickly.



 
Sorry for the confusion I am using the terms "array" and "list"

interchangeably. In Cat a quotation is not a list (or array or

whatever). There is also no built-in syntax for writing list literals

in Cat. A quotation is [1 2 3]. A list is generated using the "list"

instruction. E.g. "[1 2 3] list". We can overload "(", ")" and "," to

construct lists like (1, 2, 3) but this is just syntactic sugar.



What I am proposing that all lists in Cat will become either lists of

"any" (e.g. dynamically typed) or can be lists of some specific type

(e.g. int*, or ('A -> 'B)*)



I hope this clarifies things



Thanks for your feedback!

- Christopher

John Nowak — 2008-04-04 20:40:31

On Apr 4, 2008, at 11:26 AM, Christopher Diggins wrote:



 > I am again playing with the idea of how to introduce typed arrays in

> to Cat.

> The following is the notation that I prefer



 
Here's what I'm doing in Fifth:



Row variable:       A B XY        -- uppercase

Scalar variable:    a b xy        -- lowercase

Atomic type:        Num Char File -- titlecase, 2+ chars

Parameterized type: (Tree a) (Ref Int) (Table a b)

List:               {a} {Int} {String}

Quotation:          [A b c -> A c b] [A b -> A b b] [A [A -> B] -> B]



And an example:



   unlist :: A {b} [A -> C] [A b {b} -> C] -> C



Note that the quotation syntax is a bit more complex in practice due

to effects. The version above is what programmers use when giving

annotations if they don't care to be more explicit.



- John

Daniel Ehrenberg — 2008-04-04 22:48:10

A quick question: what's the rationale for using {a} as the type for a

list instead of (List a)? There's no reason for lists to be built-in,

right? Overall, though, that syntax is pretty clean-looking, I think.



Dan



 On Fri, Apr 4, 2008 at 3:40 PM, John Nowak <john@...> wrote:

>

>  Here's what I'm doing in Fifth:

>

>  Row variable: A B XY -- uppercase

>  Scalar variable: a b xy -- lowercase

>  Atomic type: Num Char File -- titlecase, 2+ chars

>  Parameterized type: (Tree a) (Ref Int) (Table a b)

>  List: {a} {Int} {String}

>  Quotation: [A b c -> A c b] [A b -> A b b] [A [A -> B] -> B]

>

>  And an example:

>

>  unlist :: A {b} [A -> C] [A b {b} -> C] -> C

>

>  Note that the quotation syntax is a bit more complex in practice due

>  to effects. The version above is what programmers use when giving

>  annotations if they don't care to be more explicit.

>

>  - John

>

John Nowak — 2008-04-04 23:37:46

On Apr 4, 2008, at 6:48 PM, Daniel Ehrenberg wrote:

 > A quick question: what's the rationale for using {a} as the type for a

> list instead of (List a)? There's no reason for lists to be built-in,

> right? Overall, though, that syntax is pretty clean-looking, I think.

Curly braces are used for list literals so the type syntax simply  

matches the expression syntax. This is the same reason that square  

brackets are used for the types of quotations. Using {a} instead of  

(List a) also helps keep type sizes a bit more manageable.

None of this is set in stone of course. In particular, I still have no  

good way for notating everything the effect system tracks. For  

example, if you pass a function two arrays and it mutates one of them,  

the effect system can tell you which of the two arrays is the one that  

gets changed. Unfortunately, I've not come up with a nice way to  

notate this yet.

I should also note that types are just normal s-expressions; {a}  

parses to (:list a) and [a b c] parses to (:quote a b c). As such,  

normal macros can give you parameterized type synonyms and possibly  

other useful things.

- John

John Nowak — 2008-04-04 23:53:42

On Apr 4, 2008, at 1:25 PM, William Tanksley, Jr wrote:

 > Christopher Diggins <cdiggins@...> wrote:

>

>> ...

>>

>

> I think the reason you feel more comfortable with prefix is simple:

> your type notation is not a concatenative language.

I don't suppose anyone has an idea for concatenative type annotations?  

It seems like a terrible idea, but here's a first stab anyway:

    -- normal

    swap   :: A b c -> A c b

    i      :: A [A -> B] -> B

    unlist :: A {b} [A -> C] [A b {b} -> C] -> C

    -- concatenative?

    swap   :: [A b c] [A c b] 'fn

    i      :: [A [A] [B] 'fn] [B] 'fn

    unlist :: [A b 'list [A] [C] 'fn [A b b list] [C] 'fn] [C] 'fn

Well that's sort of horrible, isn't it. I don't suppose anyone has a  

magical set of type-level combinators to make this beautiful? Is it  

possible to imagine doing this without variable names?

- John

John Nowak — 2008-04-05 00:21:30

On Apr 4, 2008, at 7:53 PM, John Nowak wrote:

 >    -- concatenative?

>    swap   :: [A b c] [A c b] 'fn

>    i      :: [A [A] [B] 'fn] [B] 'fn

>    unlist :: [A b 'list [A] [C] 'fn [A b b list] [C] 'fn] [C] 'fn

Here's another attempt at a concatenative type notation. We assume  

some function 'r' that pushes a fresh row variable onto the stack and  

some function 's' that pushes a fresh scalar variable onto the stack.  

You can then use *any normal function* to manipulate these on the type  

level where computation is done using the *types* of those normal  

functions. We also have some binary function 'fn' for making functions  

where the second argument describes how to *change* the first, some  

unary function 'list' for making a parameterized list, and a nullary  

function 'num' for making a number. Oh, and a 'prod' function that  

returns the type of the production of a function. And I use my tick  

notation as well as otherwise you need 'swap' to define the type of  

'swap', etc...

Now that I've totally confused you, here you go:

    swap :: [r s s]        [`ba]       fn

    dup  :: [r s]          [`aa]       fn

    drop :: [r s]          [`b]        fn

    nip  :: [r s s]        [swap drop] fn

    add  :: [r num num]    [drop]      fn

    cons :: [r s dup list] [nip]       fn

And now a bit more complicated:

    i    :: [v dup quote [drop v] fn] [nip prod] fn

Here's a trace of 'i' to see how it might be working:

    v:        A

    dup:      A A

    quote:    A [A]

    [drop v]: A [A] [drop B]

    fn:       A [A] [A drop B]

              A [A] [B]

              A [A -> B]

    - - -

    nip:      [A -> B]

    prod:     B

This is obviously very, very loose. It seems you'd really need more  

appropriate type-level combinators. Then there's the whole issue of  

giving back coherent concatenative types after performing inference...

To be clear, this is not a serious proposal. If anyone thinks there's  

a shred of feasibility however, I'd be interested to hear about it...

- John

John Nowak — 2008-04-05 00:23:33

On Apr 4, 2008, at 8:21 PM, John Nowak wrote:



 > And now a bit more complicated:

>

>    i    :: [v dup quote [drop v] fn] [nip prod] fn



 
By 'v', I meant 'r'.



I swear I do proofread these before I send them...



- John

William Tanksley, Jr — 2008-04-05 15:42:32

John Nowak <john@...> wrote:

 >  Here's another attempt at a concatenative type notation. We assume

Technically, strongForth's type notation is stack based. (The

exceptions are his compound types, and even those could be made so.)

I'm not sure what it buys you, though -- types don't have the same

factorization properties as words, do they?

 >  - John

-Wm

john@johnnowak.com — 2008-04-05 19:57:20

> John Nowak <john@...> wrote:

>>  Here's another attempt at a concatenative type notation. We assume

>

> Technically, strongForth's type notation is stack based. (The

> exceptions are his compound types, and even those could be made so.)

>

> I'm not sure what it buys you, though -- types don't have the same

> factorization properties as words, do they?



 
Indeed; I don't think it buys you much. It's very likely a terrible idea.



- John

Manfred Von Thun — 2008-04-09 07:44:31

On 5/4/08 1:26 AM, "Christopher Diggins" <cdiggins@...> wrote:

 > I am again playing with the idea of how to introduce typed arrays in to Cat.

> 

Before you do that, consider the various uses of typing annotations. Note

that this list is sort of cumulative:

1. To help the implementer/maintainer to understand the implementation

2. To help the user (as part of the manual or help facility). This is how I

did it in Joy, the typing information is part of the help message. (A very

primitive way of doing it.) But the typechecking is done independently in

the interpreter.

3. To become part of the implementation for an interpreter: before an

operator is executed, the current stack is checked against the typing

requirement for that operator. This guarantees that the manual and the

implementation actually agree (what an extraordinary thought!).

 > 

4. To become part of a compiler which checks whether the output-stack from

one operator is suitable as the input-stack for the next operator. For a

definition it should check the body, from the input-stack for the first

operator it should determine the input-stack for the whole body, and from

the output-stack of the last operator it should determine the output stack

of the whole body.

Clearly (4.) is the most desirable, it encompasses all others before. The

way to do it is with pattern matching  available in Prolog, ML(?) and

Haskell(?), and probably in some versions of Lisp. But you need a good

pattern notation to apply it to. This is how I tried to do it with my

experiments on Joy in Prolog. a year ago. If I did it again, I would change

quite a few things, but the central idea was essentially right. I give an

outline:

Stacks, written as Prolog lists:

[A, B, C | D] is a stack of at least 3 elements A is an empty stack

(suitable only suitable for push) [A, B, C] would be a stack of exactly 3

elements (hardly useful)

swap turns [A, B | C) into [B, A | C). # This is the Prolog code !

Stack elements, have typing information when the type matters: Assume that

we have a general numeric type, with two subtypes

A:num:B   is a numeric stack element, subtype unspecified:

+  turns [A:num:B, C:num:D | E) into [F:num:G | E] :- F is A + C. # Prolog !

A:num:int is an integer numeric stack element.

A:num:float a a float numeric stack element.

Note that addition and several other operators work on both subtypes, but

not all do:

succ turns [A:num:int | B] into [C:num:int | B] :- C is A +1. #Prolog

Assume that we have a similar notation for lists, possibly with subtypes:

A:list:B is a list of elements of unspecified type A:list:num:float is a

list of floats. A:list:list:B  is a list of lists of unspecified type.

cons turns [A:list:B, C:B | D] into [[C:B, A]:list:B | D].

And so on for all operators. Now the important bit, the concatenative

principle: output stack from one operator must match input stack for next.

In the first line below, E is the intermediate stack pattern. Here is the

interpreter ( = exterm() in Joy, John Cowan might recognise).

[A | B] turns C into D :- A turns C into E, B turns E into D. [] turns A

into A.

Now the marvellous bit: given the definition foo == bar baz zot, where bar

baz zot are primitives or have been previously defined, Prolog automatically

works out the input and output stacks for foo. I did not have to write any

code for this, in fact it took me some time to notice that Prolog had done

it. Such is the power of unification.

So, even if your implementation language is not Prolog, you should consider

whether similar techniques could be written in your implmentation language.

I have to confess that I have never written a unifier, but I have seen very

short (half a page) versions. It may even be that for the above purposes

only a very simple version is needed.

Best wishes with the project.

- Manfred

[Non-text portions of this message have been removed]

john@johnnowak.com — 2008-04-10 02:46:55

On April 9th, 2008 3:44 AM, Manfred Von Thun wrote:



 > Now the important bit, the concatenative

> principle: output stack from one operator must match input stack for

> next. Now the marvellous bit: given the definition foo == bar baz zot,

> where bar baz zot are primitives or have been previously defined, Prolog

> automatically. works out the input and output stacks for foo. I did not

> have to write any code for this, in fact it took me some time to notice

> that Prolog had done it. Such is the power of unification.

>

> So, even if your implementation language is not Prolog, you should

> consider whether similar techniques could be written in your

> implmentation language. I have to confess that I have never written a

> unifier, but I have seen very short (half a page) versions.



 
This is essentially the approach that Fifth takes. Indeed, the core of the

type system (essentially Cat's system plus parameterized types and part of

the effect system) is roughly 50 lines of Scheme.



- John

Christopher Diggins — 2008-04-10 03:00:29

On Wed, Apr 9, 2008 at 3:44 AM, Manfred Von Thun

<m.vonthun@...> wrote:

 > On 5/4/08 1:26 AM, "Christopher Diggins" <cdiggins@...> wrote:

>

> > I am again playing with the idea of how to introduce typed arrays in to

> Cat.

> >

> Before you do that, consider the various uses of typing annotations. Note

> that this list is sort of cumulative:

>

> 1. To help the implementer/maintainer to understand the implementation

>

> 2. To help the user (as part of the manual or help facility). This is how I

> did it in Joy, the typing information is part of the help message. (A very

> primitive way of doing it.) But the typechecking is done independently in

> the interpreter.

>

> 3. To become part of the implementation for an interpreter: before an

> operator is executed, the current stack is checked against the typing

> requirement for that operator. This guarantees that the manual and the

> implementation actually agree (what an extraordinary thought!).

>

> 4. To become part of a compiler which checks whether the output-stack from

> one operator is suitable as the input-stack for the next operator. For a

> definition it should check the body, from the input-stack for the first

> operator it should determine the input-stack for the whole body, and from

> the output-stack of the last operator it should determine the output stack

> of the whole body.

>

> Clearly (4.) is the most desirable, it encompasses all others before.



 
FWIW I think your list is accurate and useful.



 > The

> way to do it is with pattern matching  available in Prolog, ML(?) and

> Haskell(?), and probably in some versions of Lisp. But you need a good

> pattern notation to apply it to. This is how I tried to do it with my

> experiments on Joy in Prolog. a year ago. If I did it again, I would change

> quite a few things, but the central idea was essentially right. I give an

> outline:

>

> Stacks, written as Prolog lists:

>

> [A, B, C | D] is a stack of at least 3 elements A is an empty stack

> (suitable only suitable for push) [A, B, C] would be a stack of exactly 3

> elements (hardly useful)

>

> swap turns [A, B | C) into [B, A | C). # This is the Prolog code !



 
That is very cool.



 > Stack elements, have typing information when the type matters: Assume that

> we have a general numeric type, with two subtypes

>

> A:num:B is a numeric stack element, subtype unspecified:

>

> + turns [A:num:B, C:num:D | E) into [F:num:G | E] :- F is A + C. # Prolog !

>

> A:num:int is an integer numeric stack element.

>

> A:num:float a a float numeric stack element.

>

> Note that addition and several other operators work on both subtypes, but

> not all do:

>

> succ turns [A:num:int | B] into [C:num:int | B] :- C is A +1. #Prolog

>

> Assume that we have a similar notation for lists, possibly with subtypes:

> A:list:B is a list of elements of unspecified type A:list:num:float is a

> list of floats. A:list:list:B is a list of lists of unspecified type.

>

> cons turns [A:list:B, C:B | D] into [[C:B, A]:list:B | D].

>

> And so on for all operators. Now the important bit, the concatenative

> principle: output stack from one operator must match input stack for next.

> In the first line below, E is the intermediate stack pattern. Here is the

> interpreter ( = exterm() in Joy, John Cowan might recognise).

>

> [A | B] turns C into D :- A turns C into E, B turns E into D. [] turns A

> into A.

>

> Now the marvellous bit: given the definition foo == bar baz zot, where bar

> baz zot are primitives or have been previously defined, Prolog automatically

> works out the input and output stacks for foo. I did not have to write any

> code for this, in fact it took me some time to notice that Prolog had done

> it. Such is the power of unification.



 
Your approach to directly expose Prolog as the language for expressing

type is elegant and very powerful. My only hesitation is the syntax of

prolog, which may be accessible to a logician, are not great for

someone with only some collegial level math. In general the problem I

am having is with constructing an accessible syntax for people.



Currently I am now heavily leaning towards:



cons : (list['a] 'a -> list['a])

uncons : (list['a] -> list['a] 'a)

etc.



 > So, even if your implementation language is not Prolog, you should consider

> whether similar techniques could be written in your implmentation language.

> I have to confess that I have never written a unifier, but I have seen very

> short (half a page) versions. It may even be that for the above purposes

> only a very simple version is needed.



 
Unifiers can be quite simple, given the right abstractions. I have

seen incredibly short versions in Scheme for instance. My unification

code is horribly naive. ;-)



 > Best wishes with the project.



 
Thank you very much! And thank you for the little Prolog tutorial, it

was quite informative.



 > - Manfred



 
- Christopher

Manfred Von Thun — 2008-04-15 05:23:42

On 10/4/08 1:00 PM, "Christopher Diggins" <cdiggins@...> wrote:

 > 

> On Wed, Apr 9, 2008 at 3:44 AM, Manfred Von Thun

> <m.vonthun@... <mailto:m.vonthun%40latrobe.edu.au> > wrote:

> 

>> > Now the marvellous bit: given the definition foo == bar baz zot, where bar

>> > baz zot are primitives or have been previously defined, Prolog

>> automatically

>> > works out the input and output stacks for foo. I did not have to write any

>> > code for this, in fact it took me some time to notice that Prolog had done

>> > it. Such is the power of unification.

> 

> Your approach to directly expose Prolog as the language for expressing

> type is elegant and very powerful. My only hesitation is the syntax of

> prolog, which may be accessible to a logician, are not great for

> someone with only some collegial level math. In general the problem I

> am having is with constructing an accessible syntax for people.

> 

 
That would not a problem at all.

The idea is that the implementation uses this notation internally, either

because the whole implementation is in Prolog or because it

uses a subset of Prolog to define the primitives. But the end-users

never see the Prolog at all, they just see a concatenative language.

When they define foo == bar baz zot, the unifier works out the input

and the output stacks, and possibly provide an error message when

there is a mismatch.  So the end-users should never see the Prolog.



  - Manfred







[Non-text portions of this message have been removed]