Joyful CLIP: early version

iepos@tunes.org — 2000-05-08 04:11:18

Hello, folks....

Well, a bit ago I decided to follow Billy's advice and just make a
simple interactive environment, and not worry about compilation to
ELF and strange things for a while...

I've come up with something half-usable and thought I'd let you guys
know about it. It's at:

http://www.tunes.org/~iepos/clip-joyful-0.2.tar.gz

It is an interpreter for a concatenative langauge similar to Joy.
However, it's not a normal interpreter, since it compiles to
native x86 code at runtime. But, understand that currently it generates
rather sorry code; it does no register allocation and often juggles
things on the stack even when they are statically known.

Okay, here are the words implemented now...

Arithmetic:
+ : adds top two stack items
- : subtracts top stack item from the next
* : multiplies top two stack items
/ : divides top stack item into the next
% : like "/", but leaves remainder instead of quotient
= : compares top two stack items (destructively, like always),
and does one of the literals "true" or "false".

Combinator:
dup : [x] dup -> [x] [x]
pop : [x] pop ->
swap : [y] [x] swap -> [x] [y]
call : [x] call -> x
dip : [y] [x] dip -> x [y]
cons : [y] [x] cons -> [[y] x]

Boolean:
true : a literal
false: a literal
if : true [x] [y] -> y
false [x] [y] -> x

I/O:
puts : prints top string on stack (followed by newline)
puti : prints top int on stack (followed by newline)

And, here is the confusing example program included with the tar.gz :-)

(Count upward, beginning at 0, skipping multiples of five)
0 [[dup 5 % 0 = [true] [false] if [] [dup puti] if 1 +] dip dup call] dup call

One mysterious problem with the system is that it doesn't work when I
turn on gcc's optimizations (p.s, this is because I forgot to use
"volatile" on the inline assembly; it works now).

Another problem is that the system never attempts to free string and
program literals, even when they are long past executed or printed for
their last time. Of course, this might be considered minor, since it
happens in ordinary C systems too.

There's another memory-management problem that's worse. In this system,
"cons" gives one the ability to construct dynamically-allocated
(by malloc(), currently) programs at runtime, but they will never
be freed; the programmer has no way to do it manually, and there is
no automatic garbage collector either.

Another problem is that the stack has a statically limited size of
200000 bytes; this may be changed by a #define at the top of the .c file :-)
Presumably, this could be fixed by somehow guaranteeing that a segfault
occurs upon overflow and arranging things so that the segfault is caught
and the stack extended a ways.

Currently, there are no primitives for dealing with arrays, buffers,
or things of that kind. Also, there is no support for lambdas or definitions.

I think that soon I will rewrite the whole system so that the compiler
keeps track of information about things on the stack; it would keep
track of the types of stack items, and their exact value, if possible;
otherwise, it would keep track of where they are stored (possibly in
registers, on the stack, or perhaps in a static memory location).
Then, many operations could be implemented by just changing the
bookkeeping, as Billy has said, without emitting any code.

Still, I'm a little puzzled about how exactly to go about register
allocation. I suppose we'd want to reserve the registers mainly
for the top 7 or so stack items (although not necessarily in a fixed
order). The main attack to register allocation schemes happens, I think,
when we need to compile a quotation; when we see it, it might not be known
which registers the parameters are going to be stored in, since we don't
know yet who's going to call it. Ultimately, in compiling a quotation,
one would have to decide what interface it will use, and then it would
be up to the callers to abide by it.

Perhaps a good technique would be to postpone compiling a quotation
until its first caller is known; then the quotation could be compiled
to take the parameters from the most convenient storage locations,
the one they are already in (but, future callers might have to move things
around to get things into the chosen locations). This technique would
turn out especially well if there is only one caller; this would actually
be a very common situation, I think; it would happen all the time
with branching and looping constructs, in which the program to
branch or loop has just been pushed as a literal.

Comments welcome...

- "iepos" (Brent Kerby)