Some notes on stack effect inference
Slava Pestov — 2004-10-29 02:43:41
Hi everybody,
I have written up some preliminary notes on stack effect algebra in my
weblog:
http://www.jroller.com/page/slava/20041028#type_inference
It is the first entry in a series of entries I will write about type
inference in Factor. This is a work-in-progress and I don't know where
my work will lead. When I conclude the series, I will have developed a
concrete type inference algorithm and program.
Slava
William Tanksley, Jr — 2004-10-29 05:27:27
sa@dfa.com — 2004-10-29 17:08:43
there are circumstances in which we want to turn a list into
an atom, to hide its structure. for example, we might want
to build a nested list whose ultimate elements are programs.
if we could turn programs (= quotations = lists) into atoms,
then we could expect our generalized recursive tools to
bottom-out correctly. otherwise, if the programs are lists,
those tools would just continue drilling down past the
intended level.
in XY, then, there is a new primitive, ` (backtick). given
a quotation, ` returns a quotation-atom. given a quotation-
atom, it returns a list. given anything else, it does nothing.
[1 2 3] ` @: / is it atomic?
1
[1 2 3] ` ` [1 2 3] ~ / is it self-dual?
1
in addition, i've tweaked the parser so that ` immediately
preceding [ is equivalent to the above:
`[1 2 3] [1 2 3] ` ~ / do they match
1
[1 2 3] `
`[1 2 3]
similarly for eager lists: `(..) is equivalent to (..) `.
William Tanksley, Jr — 2004-10-29 17:08:24
On Fri, 29 Oct 2004 13:08:43 -0400,
sa@... <
sa@...> wrote:
> in XY, then, there is a new primitive, ` (backtick). given
> a quotation, ` returns a quotation-atom. given a quotation-
> atom, it returns a list. given anything else, it does nothing.
A J programmer would no doubt call this "boxing".
-Billy
sa@dfa.com — 2004-10-29 17:17:30
back when the world was young, APLers were divided over whether
'enclose' had any effect on scalars. i.e. if x is an atom, does
x match the enclose of x. in APL2, it did, and in Sharp APL
(the predecessor of J) it didn't. i suspect that this is still
true in J: the enclose of a scalar is a box. in XY, it isn't:
the enclose of an atom has no effect.
"William Tanksley, Jr" <
wtanksleyjr@...> wrote on 10/29/2004 01:08:24
PM:
>
> On Fri, 29 Oct 2004 13:08:43 -0400, sa@... <sa@...> wrote:
> > in XY, then, there is a new primitive, ` (backtick). given
> > a quotation, ` returns a quotation-atom. given a quotation-
> > atom, it returns a list. given anything else, it does nothing.
>
> A J programmer would no doubt call this "boxing".
>
> -Billy
>
>
>
>
> Yahoo! Groups Links
>
>
>
>
>
>
>
John Carter — 2004-11-01 01:46:45
On Thu, 28 Oct 2004, Slava Pestov wrote:
> http://www.jroller.com/page/slava/20041028#type_inference
Ah, I see you fairly rapidly hit that brick wall that all the
concatenative languages I have seen so far hit....
Conditional Stack Effects.
How do you infer past an "if" statement.
The "then" branch can do arbitarily nastily different things to the
stack compare to the "else" branch. So what is the 'net stack effect?
It gets really messy.
Yet compare that to more traditional lambda calculus / call / function
environment.
All stack effects get rolled up and undone as soon as you as you hit
return. Much simpler to analyze. A Lambda calcalus _always_ has a
simply functional mapping of [input1, input2, input3,...] => [single
output value]
Is this a hard trade off? Stack languages are easier than lambda
calculus to analyze in that argument binding is not present, but
harder to analyze the the return values.
One fix may be a stack language where the compiler enforces that
the stack effect of the "then" branch must be exactly the same as the
"else" branch.
Or to enforce that every name-able word leaves exactly one value on
the stack.
I don't know the answer.
John Carter Phone : (64)(3) 358 6639
Tait Electronics Fax : (64)(3) 359 4632
PO Box 1645 Christchurch Email :
john.carter@...
New Zealand
The universe is absolutely plastered with the dashed lines exactly one
space long.
William Tanksley, Jr — 2004-11-01 16:57:34
On Mon, 01 Nov 2004 14:46:45 +1300 (NZDT), John Carter
<
john.carter@...> wrote:
> Ah, I see you fairly rapidly hit that brick wall that all the
> concatenative languages I have seen so far hit....
> Conditional Stack Effects.
> How do you infer past an "if" statement.
Yup, that's a problem. Loops are possibly worse.
The solution is to have two new anonymous types: iterated and
conditional. The "iterated" type encloses a single stack signature and
indicates that it could be repeated any number of times; the
"conditional" type encloses two stack signatures, and indicates that
either one may be valid at runtime.
Then, depending on the strength of your type inference engine, you can
allow those types to propagate to various levels. The easiest choice,
of course, is to provide no support for them; this means that all
loops must be balanced, and all IFs must have the same type effect as
their THENs. The next easiest choice is to not allow those types to be
returned from functions; this means that every word must clean up its
own mess.
If you allow more than this, I don't know of a static solution; to the
best of my knowledge, you need to provide runtime facilities. But this
makes sense, because the problem at hand IS a dynamic stack.
-Billy
sa@dfa.com — 2004-11-01 18:06:08
John Carter <
john.carter@...> wrote on 10/31/2004 08:46:45 PM:
[:]
>
> One fix may be a stack language where the compiler enforces that
> the stack effect of the "then" branch must be exactly the same as the
> "else" branch.
>
as some of you know, i've been advocating the project of integrating
vector programming techniques into the concatenative language framework.
in the ideal world, we never loop. computation is just the flow of data
through array primitives.
over the years, APLers have accumulated hundreds, if not thousands of
techniques for just this purpose. some of these results are wonderfully
counterintuitive.
here is a simple example.
given a string y of space-delimited words and an integer x, cut y into
a list of strings [ .. z .. ] where the count of each z is less than or
equal to x, and every word in y is wholly contained in just one z.
wrap[20;"this is a string consisting of words separated by blanks"]
("this is a string "
"consisting of "
"words separated by "
"blanks ")
the standard k implementation:
wrap:{(b -1_(-1+b _binl x+b:0,1+&" "=y)\0)_ y,:" "}
broken out into bite-sized chunks:
y:"this is a string consisting of words separated by blanks"
x:20
y,:" "
a:" "=y
b:&a
c:1+b
d:0,c
e:x+d
f:d _binl e
g:-1+f
h:g\0
i:-1_ h
j:d i
k:j _ y
prefixing each step with a colon dumps the value of the variable into
the console. with explanations:
:y:"this is a string consisting of words separated by blanks"
"this is a string consisting of words separated by blanks"
:x:20
20
/ append a blank to the input string:
:y,:" "
"this is a string consisting of words separated by blanks "
/ mark every blank. note that = takes an atom " " and a vector and
/ returns a (boolean) vector result:
:a:" "=y
0 0 0 0 1 0 0 1 0 1 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 1 0 0 1 0
0 0 0 0 1 0 0 0 0 0 0 0 0 0 1 0 0 1 0 0 0 0 0 0 1
/ find indices of the blanks:
:b:&a
4 7 9 16 27 30 36 46 49 56 / e.g. y 4 is blank
/ add 1 to the index vector:
:c:1+b
5 8 10 17 28 31 37 47 50 57 / i.e. start of each word
/ prepend a 0:
:d:0,c / i.e. word 0 starts at 0
0 5 8 10 17 28 31 37 47 50 57
/ add x to that:
:e:x+d
20 25 28 30 37 48 51 57 67 70 77
/ find buckets. e.g. 17 < 20 <= 28:
:f:d _binl e
5 5 5 6 7 9 10 10 11 11 11
/ subtract 1:
:g:-1+f
4 4 4 5 6 8 9 9 10 10 10
/ chase pointers: g[0] -> 4, g[4] -> 6, g[6] -> 9, &c.
:h:g\0
0 4 6 9 10
/ drop the last pointer:
:i:-1_ h
0 4 6 9
/ get the indices in y of the words to cut on:
:j:d i
0 17 31 50
/ cut y into substrings at those points:
:k:j _ y
("this is a string "
"consisting of "
"words separated by "
"blanks ")
how fast is this algorithm?
on my pentium 4, i use wrap to chop the text of the king james bible:
a:6:"kjv/bible.txt" / read the bible
#a / byte-count
4378288 / 4.37 meg
\t b:wrap[80;a] / time the wrap
203 / 203 milliseconds
10#b / first ten lines
("The First Book of Moses, called Genesis {1:1} In the beginning God
created "
"the heaven and the earth. {1:2}And the earth was without form, and void;
and "
"darkness [was] upon theface of the deep. And the Spirit of God moved upon
the "
"face of thewaters. {1:3} And God said, Let there be light: and there
was "
"light. {1:4}And God saw the light, that [it was] good: and God divided
the "
"lightfrom the darkness. {1:5} And God called the light Day, and the
darknesshe "
"called Night. And the evening and the morning were the first day. {1:6}
And "
"God said, Let there be a firmament in the midst of thewaters, and let it
"
"divide the waters from the waters. {1:7} And Godmade the firmament, and "
"divided the waters which [were] under thefirmament from the waters which
")
translated into XY:
; wrap [' , dup] dip swap ' = &: 1 + 0 ,.
dup [+] dip swap [dup] dip _binl
-1 + 0 swap .v! -1 _. @ _. ;
[ to be continued ]
sa@dfa.com — 2004-11-01 18:55:38
k:
wrap:{(b -1_(-1+b _binl x+b:0,1+&" "=y)\0)_ y,:" "}
XY:
; wrap [' , dup] dip swap ' = &: 1 + 0 ,.
dup [+] dip swap [dup] dip _binl
-1 + 0 swap .v! -1 _. @ _. ;
'wrap' requires nine (!) stack shuffle words: three dips,
three dups, and three swaps.
using pattern-notation, we can eliminate them all:
; wrap { [s w] s ' , ' = &: 1 + 0 ,. s w wrap. } ;
; wrap. { [b s w] 0 b b w + _binl -1 + .v! -1 _. b @. s _ } ;
sa@dfa.com — 2004-11-05 18:05:46
phimvt@lurac.latrobe.edu.au — 2004-11-08 04:29:17
On Mon, 1 Nov 2004, John Carter wrote:
[..]
> Conditional Stack Effects.
>
> How do you infer past an "if" statement.
>
> The "then" branch can do arbitarily nastily different things to the
> stack compare to the "else" branch. So what is the 'net stack effect?
>
> It gets really messy.
But the same kind of problems arise with lists in other languages.
First, take homogeneous lists such as lists of integers, [11 22 33]
or [100 200 300 400]. There will be a possible program which,
expressed procedurally, says: if the first item in the list is
less than 20, then delete it from the list else leave it. Or in
Joy: test1 == [first 20 <] [rest] [] ifte. Or in a Lisp-like lambda
calculus language: (define test1(L) (if (< (car L) 20) (cdr L) (L))
So, if the type checker keeps track of the length of lists, then any
such branching function requires the checker to handle exactly the
same "problem" (if indeed it is a problem) as you mentioned. As far
as I know, strongly typed functional languages would not see it as
a problem at all, they treat the type simply as "list-of-integer"
and do not bother about the length.
In languages where lists can be heterogenous one might have the lists
[123 peter 'A "foo"], and a program which says: if the first item of
the list is an integer, delete it else leave it. That is rather analogous
to a program in a stack language which says: if the top item of the
stack is an integer, pop it off else leave it. In both cases any kind
of static type checker would need to keep track of the two possibilities.
As far as I know all languages that allow heterogenous lists do not
even try to do this, and instead leave all type checking to run time.
This is also the solution adopted by Joy. But I do not think it is
specifically a problem for concatenative languages.
So, the two kinds of "problems" (stack size and type of elements)
also arise for other languages that allow lists. The problems are
not specific to stack languages or to concatennative languages.
> Yet compare that to more traditional lambda calculus / call / function
> environment.
>
> All stack effects get rolled up and undone as soon as you as you hit
> return. Much simpler to analyze. A Lambda calcalus _always_ has a
> simply functional mapping of [input1, input2, input3,...] => [single
> output value]
And concatenative languages are simpler still: _always_ a simple
functional mapping of [input] => [single output value].
Both input and output are lists, but they are called stacks.
(Sorry if this looks like a smart answer.)
> Is this a hard trade off? Stack languages are easier than lambda
> calculus to analyze in that argument binding is not present, but
> harder to analyze the the return values.
See my earlier response about homegeneous and heterogeneous lists.
> One fix may be a stack language where the compiler enforces that
> the stack effect of the "then" branch must be exactly the same as the
> "else" branch.
Much will depend on what counts as a stack effect. If you are proposing
that "stack effect" and "then/else effect" should be defined with
the same level of detail, then I agree. But that same level should
be applied to lists. In weakly typed languages (such as all the Lisps
and the concatenative languages) that level is very low indeed.
But there may well be room for languages in which there is much less
flexibility, where all types (and sizes?) are known at compile time.
I suppose they would be closer to APL, and lend themselves being
implemented using consecutive memory locations (arrays) both for
"the stack" and for linear and multidimensional structures (vectors
and matrices). Here some very strict equivalence of then/else effects
might be unavoidable.
- Manfred
sa@dfa.com — 2004-11-08 16:40:35
John Carter — 2004-11-09 21:36:42
On Mon, 8 Nov 2004
phimvt@... wrote:
> And concatenative languages are simpler still: _always_ a simple
> functional mapping of [input] => [single output value].
> Both input and output are lists, but they are called stacks.
> (Sorry if this looks like a smart answer.)
As usual, a very smart and wise answer.
Unfortunately my growth has been stunted by Niklaus Wirth. I have this
deep seated belief in things like strict typing, modularization,
lexical scoping and little green men. I program daily in Ruby now, so
I know that strict typing is a false belief, but it is in my soul
anyway.
You see, the "Strict Typing" meme is fighting for it's existence in my
brain space, and it's last gasp is this...
If, instead of specifying the types in a program, the compiler _told_
you what the types were, you would recover the correctness /
optimizability of Pascal with the flexibility Lisp.
Couple that to the Rubyish notion of "Duck Typing", and the
analyzability of Joy and it would be unstoppable.
I'm not an academic. I write, maintain and design processes for
creating multi million line monster codes.
I feel when I program in Joy like I did when I programmed in machine
code. (Note not assembler, machine code). I was too free, too
unconstrained. I did silly things like creating loops that modified
the immediate values within the code.
Cute.
Unmaintainable, unscalable.
Or when I first met Common Lisp. There were just so _many_ ways of
doing any one thing, and the idiomatic right way was simply not
obvious.
Or the way I feel when maintaining very badly written body of C filled
with hundreds of globals variables. The effects of what I do in any
small region can explode throughout the code via the million causal worm
holes created by the globals.
The fact that Joy functions map input stack to output stack is like
that. If I have a Joy function written by some third party, I cannot
without substantial analysis say what it does or even define a "region
of damage". It could, depending on inputs, for all I know, wallop the
bottommost element of the stack, or add a million new elements to the
stack.
Word by Word, Joy is supremely analyzable compared to something like C
or Pascal. Kilo-word by Kilo-word, it is hopeless.
The current incarnation is hopeless for large scale programming, but
the basic idea is not. (Anyone else note the irony that gcc has moved to a
Static Single Assignment intermediate representation?)
I feel if I was just a wee bit smarter, I could create a language that....
a) Looked very much like Joy.
b) Could do the sort type inference Slava is doing from end to end
of even large scale commercial/industrial programs.
That is why I watch Slava's efforts with much hope, and sigh with
disappointment when he glides past the problem of the conditional if
with a shrug.
> But the same kind of problems arise with lists in other languages.
I admit it. I'm a back sliding strict typer.
I have decided I quite like Duck Typing and all my recent programs use
tests like "if obj.respond_to?( :this_func)" instead of
"if obj.kind_of?( ThisClass)"
The notion of Duck Typing fits really well with the notion of type
inference.
Type inference tells you what type the parameters are when all it
should be doing is enumerating all operations that can potentially be
applied to an object.
> Much will depend on what counts as a stack effect. If you are proposing
> that "stack effect" and "then/else effect" should be defined with
> the same level of detail, then I agree. But that same level should
> be applied to lists. In weakly typed languages (such as all the Lisps
> and the concatenative languages) that level is very low indeed.
Part of the problem are the list to stack and stack to list operators
in Joy. They enable some really cute stuff, but they really scrap any
hope of type inference. They change the type inference problem from
being a relatively static a low order thing to being of the same
complexity as the halting problem. ie. You have to infer what is in
the list prior to it getting splattered all over the stack.
Of course, the end answer may be that static typing was just one of
those curious wrong turns in history.
John Carter Phone : (64)(3) 358 6639
Tait Electronics Fax : (64)(3) 359 4632
PO Box 1645 Christchurch Email :
john.carter@...
New Zealand
The universe is absolutely plastered with the dashed lines exactly one
space long.
William Tanksley, Jr — 2004-11-09 22:15:58
For those who don't know: "Duck Typing" is the concept that an object
is the correct type for an operation if the object can be used in the
operation. For example, consider a function:
def MyFunction(obj):
obj.doSomething()
MyFunction will work with ANY object containing a method "doSomething"
with no parameters. Any other object will cause a runtime failure. The
name "duck typing" was invented for the language Ruby; the typing
system wasn't so much invented as discovered in Smalltalk, and perhaps
even BCPL.
On Wed, 10 Nov 2004 10:36:42 +1300 (NZDT), John Carter
<
john.carter@...> wrote:
> Part of the problem are the list to stack and stack to list operators
> in Joy. They enable some really cute stuff, but they really scrap any
> hope of type inference. They change the type inference problem from
> being a relatively static a low order thing to being of the same
> complexity as the halting problem. ie. You have to infer what is in
> the list prior to it getting splattered all over the stack.
This is true. Let me add, though, that the typing problem IS
equivalent to the halting problem. Think about it -- the purpose of
static typing is to determine whether your program is correct.
Certainly that includes the concept of halting :-), and much more.
The trick is to define an interesting and useful subset of the typing
problem that CAN be handled staticly, hopefully without making
programming too hard or limiting the capabilities of the language.
Some language authors have made some neat explorations on this topic;
for example, ML added type inference, Simula added classes (which are
different from types, although subtly so; some modern languages
distunguish sharply between the two), BETA added patterns, Io adds
delegation (not for the first time, but I think it's significant), and
so on.
> Of course, the end answer may be that static typing was just one of
> those curious wrong turns in history.
I don't think so -- I think that people have just gotten hung up on
wrong ideas about static typing, and wrong implementations, usually
based on historical examples rather than any understanding. We're
slowly recovering from the idea that subclasses are subtypes, for
example; and we still don't know whether maintaining the distinction
helps or hurts us. Hey, in this group I can say it: the typed lambda
calculus may have been a wrong move. (Of course, I can't prove that,
and in fact I don't want to try; I can't deny that lots of useful work
is being done using it.)
Anyhow.
There's a little bit of an idea for how to staticly check a language
with capabilities very much like "Duck Typing" in the page
http://javalab.cs.uni-bonn.de/research/darwin/. Only one idea, and it
does require you to rethink a little of what you know about object
orientation.
> John Carter Phone : (64)(3) 358 6639
-Billy
Slava Pestov — 2004-11-10 00:39:39
John Carter wrote:
> Unfortunately my growth has been stunted by Niklaus Wirth. I have this
> deep seated belief in things like strict typing, modularization,
> lexical scoping and little green men. I program daily in Ruby now, so
> I know that strict typing is a false belief, but it is in my soul
> anyway.
I think most programmers believe in modularization and strict typing. Or
do you mean 'static typing' when you say 'strict typing'?
> I feel when I program in Joy like I did when I programmed in machine
> code. (Note not assembler, machine code). I was too free, too
> unconstrained. I did silly things like creating loops that modified
> the immediate values within the code.
Any language allows you to write poor code.
> If I have a Joy function written by some third party, I cannot
> without substantial analysis say what it does or even define a "region
> of damage". It could, depending on inputs, for all I know, wallop the
> bottommost element of the stack, or add a million new elements to the
> stack.
Which is why you must document your library for it to be of any use.
Each definition should have a stack comment given.
> Word by Word, Joy is supremely analyzable compared to something like C
> or Pascal. Kilo-word by Kilo-word, it is hopeless.
Why?
> That is why I watch Slava's efforts with much hope, and sigh with
> disappointment when he glides past the problem of the conditional if
> with a shrug.
What is the problem of the conditional if?
> Part of the problem are the list to stack and stack to list operators
> in Joy. They enable some really cute stuff, but they really scrap any
> hope of type inference. They change the type inference problem from
> being a relatively static a low order thing to being of the same
> complexity as the halting problem. ie. You have to infer what is in
> the list prior to it getting splattered all over the stack.
The fact that these operators exist does not imply that your code should
use them. If you don't like them, don't use them.
Slava
phimvt@lurac.latrobe.edu.au — 2004-11-22 05:13:54
On Wed, 10 Nov 2004, John Carter wrote:
[..]
> Unfortunately my growth has been stunted by Niklaus Wirth. I have this
> deep seated belief in things like strict typing, modularization,
> lexical scoping and little green men. I program daily in Ruby now, so
> I know that strict typing is a false belief, but it is in my soul
> anyway.
My growth has been similarly affected by Wirth, but I never thought
of it as being stunted by that. After my very short flirtation with
Basic, Focal, Algol 60 and a few minor things, I was very happy with
the very (too?) strict typing of the original versions of Pascal.
Maybe if I had read the critique by one of the authors of the book
"Software Tools" (?) I might have thought differently. But when I
started programming in Pascal there were just Wirth's book and the
manual/report. So I never missed the freedom (and danger) offered by
the more leniently typed languages. When I started writing the Joy
interpreter in C I shot myself in the foot a few times, but I soon
learnt to do the strict typing in my head.
> You see, the "Strict Typing" meme is fighting for it's existence in my
> brain space, and it's last gasp is this...
>
> If, instead of specifying the types in a program, the compiler _told_
> you what the types were, you would recover the correctness /
> optimizability of Pascal with the flexibility Lisp.
Sound rather like the Milner type inference, ass used in ML and Miranda.
I have no experience in polymorphic programming, and no idea on how
to implement it. But I have often wondered about a polymorphic version
of Joy.
> Couple that to the Rubyish notion of "Duck Typing", and the
> analyzability of Joy and it would be unstoppable.
Sounds like a winning combination. Any polymorphic programmers/
implementors out there?
[..]
> Word by Word, Joy is supremely analyzable compared to something like
> C or Pascal. Kilo-word by Kilo-word, it is hopeless.
Maybe yes, maybe no. Joy differs from the usual lambda calculus
languages in just one essential respect, one that concerns the
basic structure, and one that is already apparent in the smallest
one-line program. It does not attempt to address problems arising
for large scale programs, and I do not think it should. But that
does not preclude an implementation providing all the facilities
for large scale programming which other languages provide.
> The current incarnation is hopeless for large scale programming, but
> the basic idea is not. (Anyone else note the irony that gcc has moved to a
> Static Single Assignment intermediate representation?)
I must investigate this.
> I feel if I was just a wee bit smarter, I could create a language that....
> a) Looked very much like Joy.
> b) Could do the sort type inference Slava is doing from end to end
> of even large scale commercial/industrial programs.
Please do create !!!
[..]
> > Much will depend on what counts as a stack effect. If you are proposing
> > that "stack effect" and "then/else effect" should be defined with
> > the same level of detail, then I agree. But that same level should
> > be applied to lists. In weakly typed languages (such as all the Lisps
> > and the concatenative languages) that level is very low indeed.
>
> Part of the problem are the list to stack and stack to list operators
> in Joy. They enable some really cute stuff, but they really scrap any
> hope of type inference. They change the type inference problem from
> being a relatively static a low order thing to being of the same
> complexity as the halting problem. ie. You have to infer what is in
> the list prior to it getting splattered all over the stack.
Your response surprises me. Maybe I expressed myself badly. But the
list/stack and stack/list operators are irrelevant to what I meant.
Consider a non-concatenative list processing language such as Lisp/Scheme,
or an array processing languagge such as APL/J/K. For most applications
it is desirable not to make the size of the lists/arrays part of their
type (one of the criticism of the original Pascal was that it did this).
One could easily write programs that have the semantics of Joy but not
their syntax. For example, to implement addition of the top two elements
of the stack one might write a function which takes a list(stack) as
argument and produces a list(stack) as result:
(define add(l) (cons (+ (car l) (cadr l)) (cddr l)))
and similarly for any other functions. But do you expect the language
to be able to keep trackof the number of elements in the lists(stacks)?
Anyhow, must fly now. Thanks for the discussion.
- Manfred
sa@dfa.com — 2004-11-22 18:52:35
this is a first draft of the X language summary. the code and the summary
can be found at:
http://www.nsl.com/k/x/x.k
http://www.nsl.com/k/x/x.x
http://www.nsl.com/k/x/x.txt
i've made one major change in the last day, which is to allow names of
length
greater than 1.
unlike XY, X adds zero performance overhead to K.
over the next few weeks i'll be working on examples and a manual, and, of
course, debugging.
-----------------------------------------------------------------------------
THE CONCATENATIVE LANGUAGE X (Draft)
* Introduction
X is a new installment in concatenative vector language project.
X borrows code and ideas from the language XY, but departs in certain
respects
from that earlier language.
Whereas in XY, the programmer has access to the entirety of the result
stack
and the input queue, in X, explicit access to the queue is blocked, and
stack
manipulation is limited to what can be expressed by the use of finite
patterns.
Otherwise, the most striking difference between the two languages is that
all
combinators in X are treated as adverbs. An adverb is a function which
takes
one or two quotations off the stack and derives from them a function which
is
then prepended to the queue. X functions are not first-class. That is, in
X
there is no way to construct an arbitrary function, nor is there any way to
leave a function on the stack.
* Notation
"abc" a string
'a a character
10 integer
10.2 float
10: negative integer
10.2: negative float
[a 10] lazy list (quotation)
A name begins with a-zA-Z, followed by characters drawn from a-zA-Z, dot,
and
colon. For example, the following are valid names:
a
a:
a.b.c:
a..foo:bar..
Note that
a12b
is parsed as
a 12 b
That is, a name, followed by an integer, followed by a name.
* Datatypes
The elementary datatypes of X are integer, float, character, symbol, and
list.
Note that "abc" is a list of characters, that is
['a 'b 'c]
is identical to
"abc"
* Verbs
A verb is a function from elementary datatypes to elementary datatypes. 18
of the 20 K verbs are verbs of X:
~ | @ # $ % ^ & * - _ = + | < , > ?
Each verb has three forms:
@ dyad
@: monad
@. commuted dyad
All but one of the X verbs retain their K meanings. The exception is that
dyadic '@' has been mapped to dyadic '.'. That is, x y @ has the K
meaning:
x . y
X doesn't require the following K verbs:
.: execute, make/unmake dictionary, &c.
@ at
: x:y -> y
:: ::x -> x
* Definition
Inline definition in X is the same as it is in XY:
;x A; define x as A
;x; undefine x
* Trains
One of the principal motivations for X is to explore the consequences of
introducing into K the concept of a verb-train, or "train" for short.
In K, it is possible to construct a limited subset of compositions. A
valid
K composition has one of the following two forms:
m..mm
m..md
where 'm' is a monad and 'd' is a dyad. For example, the following are
valid K compositions:
*|: first (of the) reverse (of)
*| first (of the) minimum (of)
Only the final verb of the composition requires the monadic qualifier ":".
All verbs preceding the final verb are assumed to be monads.
In X, any quotation consisting entirely of verbs is a valid train:
[^:+*]
* Adverbs
X has six adverbs. These have been mapped to the "mated" characters as
follows:
X K name
- - ----
( ': prior
) ' each
{ \: each-left
} /: each-right
\ \ scan
/ / over
For example, polyadic 'each':
[1 2 3][4 5 6][+])
) takes the quotation at the top of the stack, derives a function which
adds
corresponding elements of [1 2 3] and [4 5 6], and returns [5 7 9].
In X, the valence of the derived function determines how many arguments it
will consume from the stack. For example, [+] has valence 2, so [+]) will
consume two lists. Since [+*] has valence 3, it expects three arguments on
the stack:
[1 2 3][4 5 6][7 8 9][+*])
The monadic form of an adverb derives a function which expects to find its
arguments packaged in a single list:
[[1 2 3][4 5 6][7 8 9]][+*]):
* Evaluation
X has integers, floats, characters, symbols, and lists, but it does not
have
derived functions as a first-class datatype. This implies that at every
point in the course of evaluation, every object on the stack is one of the
primitive datatypes enumerated above.
Although the X programmer has no access to the queue (the list of elements
yet to be evaluated), the X interpreter does. Think of the state of X at
any
point in time as consisting of
X z^Y
where X is the stack, z^Y is the queue, and z is the next element on the
queue to evaluate. The evaluator A examines z:
If z is defined symbol, A applies it to X and Y, which yields a new stack
X'
and a new queue Y'. There are two cases:
- If z is a verb, for example +, then Y' = Y and X' = the
result of applying + to X.
- If z is an adverb, for example ), then X' = X minus the
quotation Q on top and Y' = the result of prepending the
derived function Q) onto the queue.
If z is not a defined symbol, there are two cases:
- If z is not a derived function, then X' = X^z.
- If z is a derived function, then A applies it to the top
element of the stack. There are two cases:
- The result r is a derived function, in which case
Y' = r^Y.
- The result r is not a derived function, in which
case X' = X^r.
In the course of repeated passes of the evaluator, a derived function on
the
queue keeps consuming arguments from the stack until its valence is
reached.
For example, [+*]) has valence 3, so it will take three passes before it
finally returns something which is not a function, at which point that
result
is pushed onto the stack.
* Combinators
The valence of a function is the number of arguments it takes. The charge
of a function is the number of results it returns.
K verbs have valence 1 or 2, and charge 1. K adverbs derive functions
which
have valence v > 0 and charge 1. but stack-based languages require
operations
which, implicitly or explicitly, permute elements of the stack. For
example,
'dup' has valence 1 and charge 2, 'swap' has valence 2 and charge 2, and
the
valence and charge of 'i' vary, depending on the quotation 'i' finds at the
top of the stack.
X has two combinators:
` pattern
`: i
`: is the familiar Joy disquotation combinator:
10 20 30[+*]`:
500
'pattern' is similar to the { combinator found in XY:
10 20 30[a b c][c a]`
[30 10]
It expects a pair of quotations on the stack. The first quotation is a
list,
possibly nested, of undotted names. The second quotation is a list,
possibly
nested, containing some or all of the names in the first quotation,
possibly
repeated, intermixed with other elements. ` takes the two quotations and
derives a function.
The derived function has the following behavior: Values are drawn in order
from the stack and mapped to names in the first quotation. Those values
are
then mapped to corresponding names in the second quotation.
When the first first quotation is non-empty, the result is a list:
10 20 30[a b][b a]
10 30 20
When the first quotation is empty, the function returned by ` is the 'dip'
combinator:
10 20 30 100[][+*]`
500 100
In combination with `:, ` is powerful enough to derive the standard swap-
and
list-manipulation words. For example,
;dup[a][a a]`i;
;swap[a b][b a]`i;
;zap[a][]`i;
;uncons[[a A]][a A]`i;
* Examples