wtanksley@bigfoot.com — 2000-05-08 18:05:58

From: iepos@... [mailto:iepos@...]

>Hello, folks....

>Well, a bit ago I decided to follow Billy's advice and just make a
>simple interactive environment, and not worry about compilation to
>ELF and strange things for a while...
>I've come up with something half-usable and thought I'd let you guys
>know about it. It's at:
> http://www.tunes.org/~iepos/clip-joyful-0.2.tar.gz

Wow, that was quick! In that time I managed to fix a single bug in Omega.

>It is an interpreter for a concatenative langauge similar to Joy.
>However, it's not a normal interpreter, since it compiles to
>native x86 code at runtime.

So it's not an interpreter, then. It's an interactive compiler which emits
native code.

>But, understand that currently it generates
>rather sorry code; it does no register allocation and often juggles
>things on the stack even when they are statically known.

This is a common problem with early compilers. I don't think it's worrisome
:-).

A couple of points to consider:

1. Is it really worthwhile worrying about optimising juggling stack
elements? It's something that's _easy_ for the programmer to fix, and
unneeded stack juggles make the code look terrible. So the programmer has
good reason to want to NOT write unneeded stack juggles.

2. Register allocation is very good, but there's another option. Why don't
you do some benchmarks, first with and then without allocation; I think you
might be suprised. Register allocation makes code run faster, but when you
don't register allocate you only use a couple of registers, and task
switching is faster (because you only have to save and restore those
registers).

I'm working on getting my friend's register allocater. Hopefully it'll be
useful to you.

> % : like "/", but leaves remainder instead of quotient

It's common to name this MOD, since it's rarely used. This way, the word
"/MOD" makes sense: it returns both the quotient and remainder. This is
fast almost everywhere.

>I/O:
> puts : prints top string on stack (followed by newline)
> puti : prints top int on stack (followed by newline)

Skip the newline. A trailing space is okay, although it gets annoying when
you don't need it.

>And, here is the confusing example program included with the tar.gz :-)

> (Count upward, beginning at 0, skipping multiples of five)
> 0 [[dup 5 % 0 = [true] [false] if [] [dup puti] if 1 +] dip
>dup call] dup call

0 [[dup 5 % 0 = [dup puti] [] if 1 +] dip dup call] dup call

?

Why did you write "[true] [false] if" in there? That's only a NOT, and
since your input is boolean and your output simply does an IF, you can
simply reverse the clauses on the following IF.

>Another problem is that the system never attempts to free string and
>program literals, even when they are long past executed or printed for
>their last time. Of course, this might be considered minor, since it
>happens in ordinary C systems too.

Another choice here would be to retreat a bit and implement a Forth-style
dictionary, and let the programmer manage it. A Forth-style dictionary is
managed soley by growing and shrinking; shrinking happens when you run
FORGET <word>, and growing occurs whenever you define a word or APPEND a
cell.

("Dictionary" is the wrong word; Forth uses it for a dictionary, but you
don't have to. The important thing isn't that it's a place to store and
lookup words; the important thing is that it's a region of memory which
starts at some address and can grow or shrink. Often the dictionary is
overallocated to make growing faster, and the spare space at the end is
called the PAD, and is used for temporary storage.)

>There's another memory-management problem that's worse. In this system,
>"cons" gives one the ability to construct dynamically-allocated
>(by malloc(), currently) programs at runtime, but they will never
>be freed; the programmer has no way to do it manually, and there is
>no automatic garbage collector either.

Grin. That's a problem.

The solution there is the same -- don't provide CONS yet. Instead, think of
what you'd do instead which doesn't require seperate memory management. The
answer's pretty simple -- use APPEND. APPEND should take one argument, and
append it to the end of the current definition (which has the effect of
enlarging the dictionary by four bytes).

Of course, there is another solution, and long-term it's what you want. You
want full garbage collection. It's your call, but I recommend you keep
following the path of least resistance, and make things simple now. It's
not going to make things any harder later.

>Another problem is that the stack has a statically limited size of
>200000 bytes; this may be changed by a #define at the top of
>the .c file :-)
>Presumably, this could be fixed by somehow guaranteeing that a segfault
>occurs upon overflow and arranging things so that the segfault
>is caught and the stack extended a ways.

That's a good solution. Another solution would be to make the stack
circular, so that there _is_ no overflow. This would cause HUGE surprise to
anyone not expecting it ;-), but would remove a _lot_ of potential errors.

Us Forth programmers are widely known for being very wierd. This reputation
is not without cause.

>Currently, there are no primitives for dealing with arrays, buffers,
>or things of that kind. Also, there is no support for lambdas
>or definitions.

Add the dictionary, and you'll have support for buffers, and a buffer is an
array. Later on, you'll be able to use the dictionary as its name
indicates, to store names and addresses, which will be useful for
definitions and Billy-style lambdas.

Another thing that's useful for buffers are Forth-style "blocks". A block
is a 1024-byte (or any other size that's convenient) array which is stored
in a set of buffers; there can be as many buffers as the system needs or
wants. Blocks are used as follows:

1024 read ( activate block # 1024 )
3 block ! ( write whatever values you want to the current block, using
standard memory operations )
set-modified ( tag it as having changed; this is optional, but it's wise not
to lie )
293934 read ( play with another block )
( ... there are at least 10 blocks cached, and it would be wise to have a
variable which let the programmer know how many are available... )

( ...after modifying all the blocks the programmer wanted to, he can then
run )
flush
( to save all the modified blocks to disk, or )
dump-modified
( to mark all active blocks as unmodified. )

Yes, block access is used for disk access.

>I think that soon I will rewrite the whole system so that the compiler
>keeps track of information about things on the stack; it would keep
>track of the types of stack items, and their exact value, if possible;
>otherwise, it would keep track of where they are stored (possibly in
>registers, on the stack, or perhaps in a static memory location).
>Then, many operations could be implemented by just changing the
>bookkeeping, as Billy has said, without emitting any code.

Right.

>Perhaps a good technique would be to postpone compiling a quotation
>until its first caller is known;

Another effective trick, and the one I use, is to never implicitly compile a
quotation; let quotations be just that, quotations. They don't have to
remain in text form, because that's very slow for all purposes (except
printing, which is slow anyhow); it's better to store them in wordcode form.
In other words, a quotation is an array of word addresses.

When it comes time to _use_ a quotation, you have two choices: interpret it,
or compile it and then call the result.

>then the quotation could be compiled
>to take the parameters from the most convenient storage locations,
>the one they are already in (but, future callers might have to
>move things
>around to get things into the chosen locations). This technique would
>turn out especially well if there is only one caller; this
>would actually
>be a very common situation, I think; it would happen all the time
>with branching and looping constructs, in which the program to
>branch or loop has just been pushed as a literal.

This is why Forth doesn't use quotations to implement branching and
conditionals. I think that using quotation for that is a bad idea; it
forces the computer to do extra work, moving those blocks of code all over
the place, when almost always the programmer doesn't need that.

It's easier and faster to have your IF word compile a "CMP TopOfStack, 0 /
DROP TopOfStack / JNZ *0000000*", and then (at compile time) push the
address of the starred zero on the stack (this is called an "unresolved
jump"). Then ELSE is similar: it compiles a "JMP *00000000*", stores the
address right after the JMP into the unresolved jump IF had left on the
stack, and leaves on the stack its own unresolved jump. THEN simply
resolves a single unresolved jump.

The primitive words here are called "?0BRANCH,", "0BRANCH,", and "BRANCH!".
So in Forth, these words look like this:

: IF ?0BRANCH, ; IMMEDIATE
: ELSE 0BRANCH, ; IMMEDIATE
: THEN HERE BRANCH! ; IMMEDIATE

Does this make sense?

Oh, by the way, Forth's word THEN is called ENDIF in some languages. It's
not the same as BASIC's THEN, which is used to indicate the end of the IF
clause; in Forth, IF does that job. Forth programmers think of THEN as
being the sequential THEN; a synonym for it would be ANYHOW, since it marks
the start of words which are performed regardless of the result of the
conditional.

>- "iepos" (Brent Kerby)

-Billy

iepos@tunes.org — 2000-05-09 20:49:24

> >It is an interpreter for a concatenative langauge similar to Joy.
> >However, it's not a normal interpreter, since it compiles to
> >native x86 code at runtime.
>
> So it's not an interpreter, then. It's an interactive compiler which emits
> native code.

Actually, it's not really even interactive. It was going to be, with
each word being read and executed one at a time (a quotation being treated
as a single word), but for some reason I ended up not doing it that way :-)

> A couple of points to consider:
>
> 1. Is it really worthwhile worrying about optimising juggling stack
> elements? It's something that's _easy_ for the programmer to fix, and
> unneeded stack juggles make the code look terrible. So the programmer has
> good reason to want to NOT write unneeded stack juggles.

I guess I'd agree with this; there's no point in making an optimizer spend
a bunch of time fixing programmers' mistakes (well, it could be useful
in some cases, but I don't think I'll attempt it in mine) ... however,
sometimes a SWAP really is necessary, and there is nothing the programmer
can do about it; in these cases, it would be good if the system implemented
it efficiently.

> 2. Register allocation is very good, but there's another option. Why don't
> you do some benchmarks, first with and then without allocation; I think you
> might be suprised. Register allocation makes code run faster, but when you
> don't register allocate you only use a couple of registers, and task
> switching is faster (because you only have to save and restore those
> registers).

Yeah... I'll save the old version, so I can compare to see how big of
an impact register allocation has ...

> > % : like "/", but leaves remainder instead of quotient
>
> It's common to name this MOD, since it's rarely used. This way, the word
> "/MOD" makes sense: it returns both the quotient and remainder. This is
> fast almost everywhere.

Yes... for efficiency, it would be good to have a "/MOD" word that
computes both quotient and remainder, since x86 processors (perhaps others)
always computes both.

> >I/O:
> > puts : prints top string on stack (followed by newline)
> > puti : prints top int on stack (followed by newline)
>
> Skip the newline. A trailing space is okay, although it gets annoying when
> you don't need it.

You're right... automatic newline is no good. I wouldn't really like
trailing space either, though.

> >And, here is the confusing example program included with the tar.gz :-)
>
> > (Count upward, beginning at 0, skipping multiples of five)
> > 0 [[dup 5 % 0 = [true] [false] if [] [dup puti] if 1 +] dip
> >dup call] dup call
>
> 0 [[dup 5 % 0 = [dup puti] [] if 1 +] dip dup call] dup call
>
> ?
>
> Why did you write "[true] [false] if" in there? That's only a NOT, and
> since your input is boolean and your output simply does an IF, you can
> simply reverse the clauses on the following IF.

Ahh. You're right. That was silly. The way you wrote it is better.
It could be written even more concisely if we had a "spin" primitive
that executes a program over and over again:

[f] spin -> f f f f f ...

We can construct this as:

spin == [dip dup call] cons dup call

Here's how it would work:

[f] [dip dup call] cons dup call
[[f] dip dup call] dup call (cons)
[[f] dip dup call] [[f] dip dup call] call (dup)
[[f] dip dup call] [f] dip dup call (call)
f [[f] dip dup call] dup call (dip)
f [[f] dip dup call] [[f] dip dup call] call (dup)
f [[f] dip dup call] [f] dip dup call (call)
f f [[f] dip dup call] dup call (dip)
f f [[f] dip dup call] [[f] dip dup call] call (dup)
f f [[f] dip dup call] [f] dip dup call (call)
f f f [[f] dip dup call] dup call (dip)
(...)

Then, we can rewrite the example program as:

0 [dup 5 % 0 = [dup puti] [] if 1 +] spin

Anyway... real infinite loops such as this would probably only rarely
be needed in practical situations; we'd probably like to have a bunch
of other looping primitives like these "loop", "while", "rep", "for":

[f] loop -> [[f] loop] f
[f] [c] while -> c f c f ... c (do "f" while "c" yields true)
[f] n rep -> f f ... f (do "f" "n" times)
[f] n for -> 0 f 1 f .. n 1 - f (do "f" "n" times, with index pushed)

The first one, "loop", is the most general and is referred to in the
Joy manual as the "Y combinator". It can be constructed as:

loop == [dup cons] swap concat dup call

> >There's another memory-management problem that's worse. In this system,
> >"cons" gives one the ability to construct dynamically-allocated
> >(by malloc(), currently) programs at runtime, but they will never
> >be freed; the programmer has no way to do it manually, and there is
> >no automatic garbage collector either.
>
> Grin. That's a problem.
>
> The solution there is the same -- don't provide CONS yet.

:-( ... but I like CONS ...

> Instead, think of what you'd do instead which doesn't require seperate
> memory management. The answer's pretty simple -- use APPEND.
> APPEND should take one argument, and append it to the end of the
> current definition (which has the effect of enlarging the dictionary
> by four bytes).

Hmm... I don't see how this would really be a replacement for CONS.
CONS allows you to generate programs at runtime, based on other programs
except with a parameter "hard-wired" in. I don't see how APPEND fills
this role...

> >Another problem is that the stack has a statically limited size of
> >200000 bytes; this may be changed by a #define at the top of
> >the .c file :-)
> >Presumably, this could be fixed by somehow guaranteeing that a segfault
> >occurs upon overflow and arranging things so that the segfault
> >is caught and the stack extended a ways.
>
> That's a good solution. Another solution would be to make the stack
> circular, so that there _is_ no overflow. This would cause HUGE surprise to
> anyone not expecting it ;-), but would remove a _lot_ of potential errors.

Hmm... I don't think I quite understand what you're suggesting
(or is this just a joke :-)) ... you say to make it circular ...
do you mean, set it up so that if stack space is exhausted the stack
pointer just wraps around and begins eating up the bottom of the stack?

> >Currently, there are no primitives for dealing with arrays, buffers,
> >or things of that kind. Also, there is no support for lambdas
> >or definitions.
>
> Add the dictionary, and you'll have support for buffers, and a buffer is an
> array.

Yeah... I might end up using this sort of solution. In the long term,
though, I'd like to find some way to unify quoted programs with arrays,
so that primitives like "cons" and "concat" could work on both. Joy
actually did something like this, in that one could use something
like this

[1 2 4 5]

to push something onto the stack that is treated by the system as an array,
although more literally is a program that pushes "1", "2", "4", and "5"
onto the stack. Actually, I think I may like this metaphor, that
data-pushing programs are just as good as lists. The main problem
I see is that with lists we often like to have destructing operators
like "uncons" and other operators that would allow one to peek and poke
various an item in a list, given an index ... If we don't have these
extra kind of primitives, then the only way to modify or inspect something
in the middle of the list would be to totally unwrap onto the stack
with "call" (i), do whatever, and then wrap it back up with "[]" and lots
of "cons"; this doesn't sound good. I don't know ...

> Another thing that's useful for buffers are Forth-style "blocks". A block
> is a 1024-byte (or any other size that's convenient) array which is stored
> in a set of buffers; there can be as many buffers as the system needs or
> wants.

Hmm... I'm a bit confused about the distinction between buffers and blocks.
Blocks are always some fixed size, usually 1024-byte, you say; can buffers
be any size, then? Are they usually smaller, or usually larger?

> Blocks are used as follows:
>
> 1024 read ( activate block # 1024 )
> 3 block ! ( write whatever values you want to the current block, using
> standard memory operations )
> [...]

Hmm... That could work, I suppose. But it seems like it'd be awful messy
to the programmer, to have to manually mess with the boundaries between
blocks, if all one really wanted was a large, flat buffer.

> Another effective trick, and the one I use, is to never implicitly compile a
> quotation; let quotations be just that, quotations. They don't have to
> remain in text form, because that's very slow for all purposes (except
> printing, which is slow anyhow); it's better to store them in wordcode form.
> In other words, a quotation is an array of word addresses.

Yes... if one wants to do dynamic optimization, it is almost essential
to do it this sort of way, keeping around a high-level version of
the code.

> When it comes time to _use_ a quotation, you have two choices: interpret it,
> or compile it and then call the result.

Right... the way my system works now, though, everything is always kept
compiled. But I could see how it would be better the other way...

> >[situation where there is only one caller of a quotation]
> >this would actually be a very common situation, I think;
> >it would happen all the time with branching and looping constructs,
> >in which the program to branch or loop has just been pushed as a literal.
>
> This is why Forth doesn't use quotations to implement branching and
> conditionals. I think that using quotation for that is a bad idea; it
> forces the computer to do extra work, moving those blocks of code all over
> the place, when almost always the programmer doesn't need that.

Hmm... well, it would probably really only be pointers to the code that
would be moved around (rather than the large blocks themselves) ...
That's how it works in my system, at least: a program literal (quotation)
just pushes a function pointer onto the stack. And actually, if the system
implemented some simple optimization, even these pointers wouldn't have
to be shuffled on the stack, since they're statically known; the system
would generate the ordinary high-quality code for branches and loops
that one would expect.

> It's easier and faster to have your IF word compile a "CMP TopOfStack, 0 /
> DROP TopOfStack / JNZ *0000000*", and then (at compile time) push the
> address of the starred zero on the stack (this is called an "unresolved
> jump"). Then ELSE is similar: it compiles a "JMP *00000000*", stores the
> address right after the JMP into the unresolved jump IF had left on the
> stack, and leaves on the stack its own unresolved jump. THEN simply
> resolves a single unresolved jump.

Well... this style of IF-ELSE-THEN is tempting, because it makes
compilation simpler; however, there is a problem with it I'd like to
point out... there seems to be no way that a system could implement
something like

[x IF]

where the location of the ELSE and THEN is not statically known,
but where the programmer is planning on later cat'ing that program
with something like

[y ELSE z THEN]

Basically, if IF-ELSE-THEN is going to be a sensible construct, then
we must require that they always come as a group. This is bad.
It would make it difficult to construct new interesting control
constructs out of IF-ELSE-THEN, since they could not be manipulated
as first-class citizens. But, the good news would be that a
quotation-taking "if" could be constructed from IF-ELSE-THEN:

if == dig2 IF popd call ELSE pop call THEN

Anyway; IF-ELSE-THEN is also bad because each one is impure;
i.e., they have effects other on the stack; this would make
it more difficult for systems to implement laziness or concurrency,
and would also make automated reasoning on programs more difficult.

> The primitive words here are called "?0BRANCH,", "0BRANCH,", and "BRANCH!".
> So in Forth, these words look like this:
>
> : IF ?0BRANCH, ; IMMEDIATE
> : ELSE 0BRANCH, ; IMMEDIATE
> : THEN HERE BRANCH! ; IMMEDIATE
>
> Does this make sense?

I guess so... but, for the reasons mentioned above, I still don't really
like IF-ELSE-THEN, or its buddies like DO-LOOP and FOR-NEXT.

> >- "iepos" (Brent Kerby)
>
> -Billy

- "iepos" (Brent Kerby)

wtanksley@bigfoot.com — 2000-05-10 22:39:03

From: iepos@... [mailto:iepos@...]

>> A couple of points to consider:

>> 1. Is it really worthwhile worrying about optimising juggling stack
>> elements? It's something that's _easy_ for the programmer
>> to fix, and
>> unneeded stack juggles make the code look terrible. So the
>> programmer has
>> good reason to want to NOT write unneeded stack juggles.

>I guess I'd agree with this; there's no point in making an
>optimizer spend
>a bunch of time fixing programmers' mistakes (well, it could be useful
>in some cases, but I don't think I'll attempt it in mine) ... however,
>sometimes a SWAP really is necessary, and there is nothing the
>programmer
>can do about it; in these cases, it would be good if the
>system implemented it efficiently.

Well, if it's needed then there's nothing the system can do about it, is
there? The point is that the programmer has far more ability than an
optimiser to change the efficiency of the code, so long as he can see
inefficiency. In this particular case, he can see it clearly.

There are tons of other reasons to want a register scheduler, of course.
I'm being deliberately minimalistic.

>Ahh. You're right. That was silly. The way you wrote it is better.
>It could be written even more concisely if we had a "spin" primitive
>that executes a program over and over again:

> [f] spin -> f f f f f ...

Or even more efficient, a lazy evaluator (continuation-based, perhaps) which
only generates the needed executions. But I won't be able to do that until
I understand continuations well enough.

>> >There's another memory-management problem that's worse. In
>> >this system,
>> >"cons" gives one the ability to construct dynamically-allocated
>> >(by malloc(), currently) programs at runtime, but they will never
>> >be freed; the programmer has no way to do it manually, and there is
>> >no automatic garbage collector either.

>> Grin. That's a problem.

>> The solution there is the same -- don't provide CONS yet.

>:-( ... but I like CONS ...

Most people don't; it's considered a benchamrk of inefficiency in Lisp
programs.

>> Instead, think of what you'd do instead which doesn't
>> require seperate
>> memory management. The answer's pretty simple -- use APPEND.
>> APPEND should take one argument, and append it to the end of the
>> current definition (which has the effect of enlarging the dictionary
>> by four bytes).

>Hmm... I don't see how this would really be a replacement for CONS.
>CONS allows you to generate programs at runtime, based on
>other programs
>except with a parameter "hard-wired" in. I don't see how APPEND fills
>this role...

I don't have any idea what you mean by "hard-wired". Let me assume you
simply said, "CONS allows you to generate programs at runtime; I don't see
how APPEND fills this role" and I'll continue from there.

Programs in LISP are trees, right? That's why a tree structure, the s-exp,
is used to represent them, and that's why a tree-building program, CONS, is
used to create them. Programs in concatenative languages are lists.
Therefore, an array-building program is sufficient. CONS is a quite easy
way to build a program, but it's simply overkill.

The only reason CONS comes into building programs in Joy is that Joy adds
the ability to build quotations as though they were a part of the program,
rather than simply being literal data which is appended to the program.

>> >Another problem is that the stack has a statically limited size of
>> >200000 bytes; this may be changed by a #define at the top of
>> >the .c file :-)
>> >Presumably, this could be fixed by somehow guaranteeing
>> >that a segfault
>> >occurs upon overflow and arranging things so that the segfault
>> >is caught and the stack extended a ways.

>> That's a good solution. Another solution would be to make the stack
>> circular, so that there _is_ no overflow. This would cause
>> HUGE surprise to
>> anyone not expecting it ;-), but would remove a _lot_ of
>> potential errors.

>Hmm... I don't think I quite understand what you're suggesting
>(or is this just a joke :-)) ... you say to make it circular ...
>do you mean, set it up so that if stack space is exhausted the stack
>pointer just wraps around and begins eating up the bottom of the stack?

It's not a joke, and that's exactly what I meant. Mathematically, it makes
absolutely no sense; but programmatically, it's very logical, and highly
controllable.

It's also very minimalist. :-)

>> >Currently, there are no primitives for dealing with arrays, buffers,
>> >or things of that kind. Also, there is no support for lambdas
>> >or definitions.

>> Add the dictionary, and you'll have support for buffers, and
>> a buffer is an array.

>Yeah... I might end up using this sort of solution. In the long term,
>though, I'd like to find some way to unify quoted programs with arrays,
>so that primitives like "cons" and "concat" could work on both.

cons will never work with true arrays; arrays respond only to indexing and
concatenation.

And since quotations are completely and correctly modelled by arrays, and
arrays can be held in buffers, if you add support for buffers and
compilation into those buffers you have all of the essentials for handling
quotations.

>The main problem
>I see is that with lists we often like to have destructing operators
>like "uncons" and other operators that would allow one to peek and poke
>various an item in a list, given an index ... If we don't have these
>extra kind of primitives, then the only way to modify or
>inspect something
>in the middle of the list would be to totally unwrap onto the stack
>with "call" (i), do whatever, and then wrap it back up with
>"[]" and lots of "cons"; this doesn't sound good. I don't know ...

This is a really good question. It's easy to define our quotation-arrays so
that exactly one 32-bit cell is taken up by each word and each literal in
the quotation so that we can use indexing to access the values, but it's
hard to solve the fundamental problem you're describing.

>> Another thing that's useful for buffers are Forth-style
>> "blocks". A block
>> is a 1024-byte (or any other size that's convenient) array
>> which is stored
>> in a set of buffers; there can be as many buffers as the
>> system needs or wants.

>Hmm... I'm a bit confused about the distinction between
>buffers and blocks.
>Blocks are always some fixed size, usually 1024-byte, you say;
>can buffers
>be any size, then? Are they usually smaller, or usually larger?

There are a huge number of blocks; not all of them actually exist in memory,
and it's possible that some of them don't exist on disk (for example, if you
implement your block system as a sparse file). Each block which exists
holds the data you place in it, and doesn't change unless you change it by
loading it into a buffer, modifying that buffer, marking it modified, and
flushing the buffers.

There is a precisely known number of block-buffers, and they all exist in
memory. They provide the data structures needed to implement the operations
I've mentioned, such as mark-modified, clear-modification, and flush.

The nice thing about the block-buffers is that they can also be used for
other purposes.

>> Blocks are used as follows:

>> 1024 read ( activate block # 1024 )
>> 3 block ! ( write whatever values you want to the current
>> block, using
>> standard memory operations )
>> [...]

>Hmm... That could work, I suppose. But it seems like it'd be
>awful messy
>to the programmer, to have to manually mess with the boundaries between
>blocks, if all one really wanted was a large, flat buffer.

Certainly. The programmer would have to load more blocks and so on,
literally handling paging. In that case there's a good likelihood that
there are better ways of doing that: either using ALLOT or APPEND, which
enlarge the dictionary by the space needed, or by using ALLOCATE, which
allocates memory like malloc.

These are, of course, primitives. But you're building a primitive system
here; don't get ahead of yourself by wanting to build advanced structures
before you finish the primitives.

>> >[situation where there is only one caller of a quotation]
>> >this would actually be a very common situation, I think;
>> >it would happen all the time with branching and looping constructs,
>> >in which the program to branch or loop has just been pushed
>> >as a literal.

>> This is why Forth doesn't use quotations to implement branching and
>> conditionals. I think that using quotation for that is a
>> bad idea; it
>> forces the computer to do extra work, moving those blocks of
>> code all over
>> the place, when almost always the programmer doesn't need that.

>Hmm... well, it would probably really only be pointers to the code that
>would be moved around (rather than the large blocks themselves) ...
>That's how it works in my system, at least: a program literal
>(quotation) just pushes a function pointer onto the stack.

I was talking about during compilation, not during runtime. During
compilation you have to set the compilation pointer somewhere else, compile
the program, and then reset the pointer. If you're doing any form of
optimization, you'll then have to copy the quotation into the program.

>And actually, if the system
>implemented some simple optimization, even these pointers wouldn't have
>to be shuffled on the stack, since they're statically known; the system
>would generate the ordinary high-quality code for branches and loops
>that one would expect.

Peephole optimization for this isn't too hard, but type-based optimization
is much more descriptive and powerful. And, once you have the type support
in, it follows naturally.

So you're right; using quotations for conditionals isn't a terrible idea.
It just isn't simple.

>> It's easier and faster to have your IF word compile a "CMP
>> TopOfStack, 0 /
>> DROP TopOfStack / JNZ *0000000*", and then (at compile time) push the
>> address of the starred zero on the stack (this is called an
>> "unresolved
>> jump"). Then ELSE is similar: it compiles a "JMP
>> *00000000*", stores the
>> address right after the JMP into the unresolved jump IF had
>> left on the
>> stack, and leaves on the stack its own unresolved jump. THEN simply
>> resolves a single unresolved jump.

>Well... this style of IF-ELSE-THEN is tempting, because it makes
>compilation simpler; however, there is a problem with it I'd like to
>point out... there seems to be no way that a system could implement
>something like

> [x IF]

>where the location of the ELSE and THEN is not statically known,
>but where the programmer is planning on later cat'ing that program
>with something like

> [y ELSE z THEN]

>Basically, if IF-ELSE-THEN is going to be a sensible construct, then
>we must require that they always come as a group. This is bad.

Your perception is excellent! That's very cool that you should notice so
quickly a very subtle issue in Forth which is only really aggrivated by the
presence of quotations.

There is at least one solution; I leave it to you as to whether that
solution is a good one, or sufficient for your purposes.

Do you remember how I suggested compiling quotations to wordcode, with one
cell per word/literal? Well, IF and THEN may be immediate, but when they're
quoted treat them just like any other word -- their wordcode is a single
cell long. In other words, they aren't executed.

It's only when you're compiling the quotation that you actually execute the
IF.

Thus, "[x IF]" is valid. "[x IF] i" is not.

>Anyway; IF-ELSE-THEN is also bad because each one is impure;
>i.e., they have effects other on the stack; this would make
>it more difficult for systems to implement laziness or concurrency,
>and would also make automated reasoning on programs more difficult.

They behave no differently from any other language's IF at runtime. It's
only at compile time that they affect the dictionary -- and so does every
other language.

However, APPEND is certainly impure, and any word which calls it at runtime
is tainted. So is PUT and PUTI. None of this stops us from defining those
words or using them. That's one really nice thing about concatenative
languages; the effects of impure words are confined to the impure words
themselves, and they occur in predictable form.

Laziness might be hard in concatenative languages, though. The problem is
that programs are assumed to be laid out in terms of dataflow, and are
assumed to act sequentially on a single object, the stack. This single
object is a bottleneck to parallelism (and laziness). Fortunately, there's
a solution -- the stack can be usually treated as groups of items, and the
dataflow-nature of stack based programming makes those groups easier to
discover than they otherwise would be. A good optimiser, lazy or otherwise,
could treat those groups seperately.

Remember how I mentioned that is we didn't use register allocation we would
be able to use far fewer registers? Let's count.

The registers needed for a reasonable Forth are:

IP -- instruction pointer
DS -- data stack pointer
RS -- return stack pointer
TOS -- top of stack

4 registers.

In many architectures there are enough registers to hold at least two Forth
tasks onboard at the same time. That is FAST context switching. (In the
x86, you can discard the TOS register by using direct memory references, at
a speed penalty, so you can still fit two tasks onboard if it's worth the
loss.)

>- "iepos" (Brent Kerby)

-Billy

iepos@tunes.org — 2000-05-11 15:39:14

> There are tons of other reasons to want a register scheduler, of course.
> I'm being deliberately minimalistic.

Okay... so we agree.

> >Ahh. You're right. That was silly. The way you wrote it is better.
> >It could be written even more concisely if we had a "spin" primitive
> >that executes a program over and over again:
>
> > [f] spin -> f f f f f ...
>
> Or even more efficient, a lazy evaluator (continuation-based, perhaps) which
> only generates the needed executions. But I won't be able to do that until
> I understand continuations well enough.

It's interesting that you bring up continuations again. I had left the
idea behind with applicative systems. I don't see what use they'd have
in a Joy-like system. But ... maybe they would be useful ...

> >CONS allows you to generate programs at runtime, based on
> >other programs
> >except with a parameter "hard-wired" in. I don't see how APPEND fills
> >this role...
>
> I don't have any idea what you mean by "hard-wired".

Okay... I'll attempt to use a practical example to explain what I meant.
Suppose you have a program "render" that takes an "x" and "y" screen
size and various 3D data-structures beneath that on the stack; and,
suppose that this program upon executing will yield an array of
pixels of what should be put on the screeen.

Now, we don't know at compile-time the size of the screen that we'll
be rendering to; however, once we figure it out the size from the
window manager or something, we know it will mostly stay the same.
Since we want the renderer to go very fast, we would like it if
the system could somehow generate a fine-tuned compilation of our
renderer at runtime, taking into account the screen size. CONS
is a marvelous way for a system to provide this feature.

For, suppose we had just obtained our screen size from the window manager
and our "x" and "y" are on the stack. Then all we have to do is

[render] cons cons

and *POOF*, we have a new fine-tuned compiled version of "render" on
the stack that takes into account the fixed screen size. Of course,
the system might not bother to optimize the new program until it
realizes that it is going to be duplicated and called MANY times,
in a loop.

In case this still is not clear, let me point out the way Joy's "cons"
works:

5 [foo] cons -> [5 foo]

As you can see, "cons" has the effect of hard-wiring the parameter
"5" into the program "foo" that had been pushed onto the stack.

> Programs in LISP are trees, right? That's why a tree structure, the s-exp,
> is used to represent them, and that's why a tree-building program, CONS, is
> used to create them.

Hmm... I'm not sure if Joy's CONS is really the same thing as LISP's
construct that runs by the same name ...

> Programs in concatenative languages are lists.

Well, not any more than LISP's programs are lists. You could say
that programs in concatenative languages are also trees. They're
lists, but members of the list may also be lists, in the case of
a quotation. Joy's syntax is really very similar to LISP ...
the main difference is that we use []s instead of ()s, and omit
the outermost []. Of course, LISP is different in many other ways,
for instance, in that it does not use an explicit stack and
relies heavily on bound variables.

> Therefore, an array-building program is sufficient. CONS is a quite easy
> way to build a program, but it's simply overkill.

To me, CONS seems like the perfect way to build programs in Joy.
But, I agree that it's not appropriate in a system that does not
provide automatic garbage collection. In such a system, it would
probably be better to have something like a COMPILE that takes a
pointer to a buffer of execution tokens and compiles the whole thing.

But, even if one goes the automatic-memory-management route, there
are alternatives to CONS. For instance, there is CAT (concat) and UNIT:

[x] [y] cat -> [x y]
[x] unit -> [[x]]

But, CAT and UNIT can be constructed from CONS as:

cat == [[i] dip i] cons cons
unit == [] cons

Of course, conversely, CONS can be constructed as:

cons == [unit] dip cat

> >Hmm... I don't think I quite understand what you're suggesting
> >(or is this just a joke :-)) ... you say to make it circular ...
> >do you mean, set it up so that if stack space is exhausted the stack
> >pointer just wraps around and begins eating up the bottom of the stack?
>
> It's not a joke, and that's exactly what I meant. Mathematically, it makes
> absolutely no sense; but programmatically, it's very logical, and highly
> controllable.
>
> It's also very minimalist. :-)

Well. In this kind of system, it wouldn't be possible to reliably do
some kinds of recursion. Hard to say how serious that would be ...

> >> >Currently, there are no primitives for dealing with arrays, buffers,
> >> >or things of that kind. Also, there is no support for lambdas
> >> >or definitions.
>
> >> Add the dictionary, and you'll have support for buffers, and
> >> a buffer is an array.
>
> >Yeah... I might end up using this sort of solution. In the long term,
> >though, I'd like to find some way to unify quoted programs with arrays,
> >so that primitives like "cons" and "concat" could work on both.
>
> cons will never work with true arrays; arrays respond only to indexing and
> concatenation.

What about insertion and deletion (of chunks into and out of the middle
of arrays)? These are things I'd want to be able to do with arrays too.
Of course, I'd use them sparingly unless there's some automatic memory
management in place to improve their performance.

> And since quotations are completely and correctly modelled by arrays, and
> arrays can be held in buffers, if you add support for buffers and
> compilation into those buffers you have all of the essentials for handling
> quotations.

Yes, that probably is not a bad way to do it for now. But, in the
long-term, I'd like to have some automatic memory management, so
that programmers do not have to worry about manually allocating,
resizing, and freeing buffers.

> >Hmm... I'm a bit confused about the distinction between
> >buffers and blocks.
> >Blocks are always some fixed size, usually 1024-byte, you say;
> >can buffers
> >be any size, then? Are they usually smaller, or usually larger?
>
> There are a huge number of blocks; not all of them actually exist in memory,
> and it's possible that some of them don't exist on disk (for example, if you
> implement your block system as a sparse file). Each block which exists
> holds the data you place in it, and doesn't change unless you change it by
> loading it into a buffer, modifying that buffer, marking it modified, and
> flushing the buffers.
>
> There is a precisely known number of block-buffers, and they all exist in
> memory. They provide the data structures needed to implement the operations
> I've mentioned, such as mark-modified, clear-modification, and flush.

Okay... Thanks... I think I understand now.

> >> This is why Forth doesn't use quotations to implement branching and
> >> conditionals. I think that using quotation for that is a
> >> bad idea; it
> >> forces the computer to do extra work, moving those blocks of
> >> code all over
> >> the place, when almost always the programmer doesn't need that.
>
> >Hmm... well, it would probably really only be pointers to the code that
> >would be moved around (rather than the large blocks themselves) ...
> >That's how it works in my system, at least: a program literal
> >(quotation) just pushes a function pointer onto the stack.
>
> I was talking about during compilation, not during runtime.

Oops. Okay, I'm on the same page now.

> During compilation you have to set the compilation pointer somewhere
> else, compile the program, and then reset the pointer. If you're
> doing any form of optimization, you'll then have to copy the quotation
> into the program.

You're right. I don't deny that using quotations for branching/looping
constructs would probably make compilation somewhat slower.

> So you're right; using quotations for conditionals isn't a terrible idea.
> It just isn't simple.

Right. There are disadvantages to using quotations for conditionals,
but despite these I still like the idea.

> >Well... this style of IF-ELSE-THEN is tempting, because it makes
> >compilation simpler; however, there is a problem with it I'd like to
> >point out... there seems to be no way that a system could implement
> >something like
>
> > [x IF]
>
> >where the location of the ELSE and THEN is not statically known,
> >but where the programmer is planning on later cat'ing that program
> >with something like
>
> > [y ELSE z THEN]
>
> >Basically, if IF-ELSE-THEN is going to be a sensible construct, then
> >we must require that they always come as a group. This is bad.
>
> Your perception is excellent! That's very cool that you should notice so
> quickly a very subtle issue in Forth which is only really aggrivated by the
> presence of quotations.
>
> There is at least one solution; I leave it to you as to whether that
> solution is a good one, or sufficient for your purposes.
>
> Do you remember how I suggested compiling quotations to wordcode, with one
> cell per word/literal? Well, IF and THEN may be immediate, but when they're
> quoted treat them just like any other word -- their wordcode is a single
> cell long. In other words, they aren't executed.
>
> It's only when you're compiling the quotation that you actually execute the
> IF.

And you can only compile a quotation that has "IF" in it if the "THEN"
is in sight. Yes, this could be a solution. However, it seems kind of
strange that "immediate" words would then not always be executed
immediately, but may be executed at various times while the program
is running. Nevertheless, it _could_ work (although I think I'm
more inclined to just stick with the quotation-taking conditionals).

> Laziness might be hard in concatenative languages, though. The problem is
> that programs are assumed to be laid out in terms of dataflow, and are
> assumed to act sequentially on a single object, the stack. This single
> object is a bottleneck to parallelism (and laziness). Fortunately, there's
> a solution -- the stack can be usually treated as groups of items, and the
> dataflow-nature of stack based programming makes those groups easier to
> discover than they otherwise would be. A good optimiser, lazy or otherwise,
> could treat those groups seperately.

Yes... something like this would make automatic laziness and concurrency
not too hard. However, laziness and concurrency can only safely be
done if the programs in question are pure, i.e. they have no effect
except on the stack. If a group of programs are impure, then they must
be executed in the order specified; attempting to be lazy and postpone
one or trying to execute both at once may be bad.

> Remember how I mentioned that is we didn't use register allocation we would
> be able to use far fewer registers? Let's count.

Oh no... not this topic again :-)

> The registers needed for a reasonable Forth are:
>
> IP -- instruction pointer
> DS -- data stack pointer
> RS -- return stack pointer
> TOS -- top of stack
>
> 4 registers.
>
> In many architectures there are enough registers to hold at least two Forth
> tasks onboard at the same time. That is FAST context switching. (In the
> x86, you can discard the TOS register by using direct memory references, at
> a speed penalty, so you can still fit two tasks onboard if it's worth the
> loss.)

This is a nice idea, but it wouldn't work on x86. Many instructions,
such as DIV, IN, and OUT require that the specific registers
EAX and EDX be used. And, CALL and RET require that ESP be used for
the return stack. Thus, if you were to switch tasks safely, you
would at least have to save EAX, EDX, ESP, and EIP (you _might_ be able
to get away with having the data stack pointers for two tasks be kept
in separate registers or in static memory addresses so that they don't
have to be saved). So, you can't really keep two tasks "on board" at once;
it will at least be necessary to swap out those four registers.

But, since we're going to be executing hundreds of thousands of
instructions in between context switches, the time needed to save
these four registers is insignificant. And, the monumental gain
of going ahead and allocating the other four registers will FAR outweigh
the tiny loss of having to save them. At least, that's my suspicion;
I haven't done any benchmarks, so I can't really say.

> >- "iepos" (Brent Kerby)
>
> -Billy

By the way, I have to apologize for not yet responding to your
reply to that other post which I think was on Lambdas and Definitions.
I think I may eventually get around to it :-)

- "iepos" (Brent Kerby)

RE: [stack] Joyful CLIP: early version

wtanksley@bigfoot.com — 2000-05-08 18:05:58

iepos@tunes.org — 2000-05-09 20:49:24

wtanksley@bigfoot.com — 2000-05-10 22:39:03

iepos@tunes.org — 2000-05-11 15:39:14