RE: [stack] CLIP, CONS, Registers, Blocks, IF-ELSE-THEN (was: Joy ful CLIP: early version)
wtanksley@bigfoot.com — 2000-05-11 21:07:14
From:
iepos@... [mailto:
iepos@...]
>> There are tons of other reasons to want a register
>> scheduler, of course.
>> I'm being deliberately minimalistic.
>Okay... so we agree.
Probably not ;-), but I'll certainly agree that your choices and priorities
are respectable.
>> >Ahh. You're right. That was silly. The way you wrote it is better.
>> >It could be written even more concisely if we had a "spin" primitive
>> >that executes a program over and over again:
>> > [f] spin -> f f f f f ...
>> Or even more efficient, a lazy evaluator
>> (continuation-based, perhaps) which
>> only generates the needed executions. But I won't be able
>> to do that until
>> I understand continuations well enough.
>It's interesting that you bring up continuations again. I had left the
>idea behind with applicative systems. I don't see what use they'd have
>in a Joy-like system. But ... maybe they would be useful ...
I posted a message about backtracking just a while ago. Clearly,
backtracking is useful, and it works using continuations.
>> >CONS allows you to generate programs at runtime, based on
>> >other programs
>> >except with a parameter "hard-wired" in. I don't see how
>> >APPEND fills
>> >this role...
>> I don't have any idea what you mean by "hard-wired".
>For, suppose we had just obtained our screen size from the
>window manager
>and our "x" and "y" are on the stack. Then all we have to do is
> [render] cons cons
>and *POOF*, we have a new fine-tuned compiled version of "render" on
>the stack that takes into account the fixed screen size. Of course,
>the system might not bother to optimize the new program until it
>realizes that it is going to be duplicated and called MANY times,
>in a loop.
Okay. You're asking for a way to compile parameters into a procedure. No
problem. You do it, as you note, the same way you build procedures. Let's
suppose x and y are on the stack:
NAME: myRender
swap append append ( add the parameters )
' render append ( add the function )
myRender compile: myRender ( compile the quotation we just built into a
function which has the same name. )
>In case this still is not clear, let me point out the way Joy's "cons"
>works:
> 5 [foo] cons -> [5 foo]
>As you can see, "cons" has the effect of hard-wiring the parameter
>"5" into the program "foo" that had been pushed onto the stack.
Right.
As I discussed, APPEND operates in exactly the same way as cons in this
respect, except that APPEND adds to the end of the list/array rather than
the beginning. (And it can only add atoms, not lists; in order to add a
list, you have to express it as an atom first.)
>> Programs in LISP are trees, right? That's why a tree
>> structure, the s-exp,
>> is used to represent them, and that's why a tree-building
>> program, CONS, is
>> used to create them.
>Hmm... I'm not sure if Joy's CONS is really the same thing as LISP's
>construct that runs by the same name ...
It is.
>> Programs in concatenative languages are lists.
>Well, not any more than LISP's programs are lists.
You're going to have to really slow down and read my message. You're
assuming a LOT of false stuff, and it's really throwing you off.
s-exps are trees, by definition. Concatenative programs are better
expressed as lists. Think about it: concatenation makes sense on a list,
but it makes no sense on a tree.
>You could say
>that programs in concatenative languages are also trees.
Yes, you could. This is what Joy does. However:
>They're
>lists, but members of the list may also be lists, in the case of
>a quotation.
What's special about a quotation, I ask rhetorically? Simply this: a
quotation is _data_, not a part of the program. It's a literal in the same
sense that an integer is a literal. If Joy manipulated our integers in tree
form but didn't provide quotations, would Joy be a tree-structured language?
No. It would be a list-structured language with tree-structured data.
>Joy's syntax is really very similar to LISP ...
>the main difference is that we use []s instead of ()s, and omit
>the outermost []. Of course, LISP is different in many other ways,
>for instance, in that it does not use an explicit stack and
>relies heavily on bound variables.
Whew! That's a heavy statement, and very untrue. Joy doesn't have a
distinguished position for functions versus parameters, and Lisp does. Joy
doesn't use [] to indicate function calls, but rather to indicate data
boundaries.
>> Therefore, an array-building program is sufficient. CONS is
>> a quite easy
>> way to build a program, but it's simply overkill.
>To me, CONS seems like the perfect way to build programs in Joy.
It may very well be -- but only because, thanks to its implementation of IF,
Joy uses data as code as part of its regular execution. In other words, in
order to write a simple Joy function which happens to use IF, you _have_ to
construct some tree-based data.
Forth programmers get around that by not using tree-structured data as parts
of programs, and when such is needed, by using function calls (since a
function call provides something very like a tree structure, although
without the inefficiency).
>But, I agree that it's not appropriate in a system that does not
>provide automatic garbage collection. In such a system, it would
>probably be better to have something like a COMPILE that takes a
>pointer to a buffer of execution tokens and compiles the whole thing.
If there are two ways of doing things, one requiring GC and one not
requiring GC, the one not requiring GC is more efficient. This isn't just
opinion; it's established fact. See Baker's papers on GC and
thermodynamics.
>But, even if one goes the automatic-memory-management route, there
>are alternatives to CONS. For instance, there is CAT (concat) and UNIT:
> [x] [y] cat -> [x y]
> [x] unit -> [[x]]
This is essentially what happens when you use APPEND instead of CONS.
(Although APPEND takes a scalar and a list, rather than two lists.) Unit,
in this context, is the act of allocating a new, empty buffer.
>> >Hmm... I don't think I quite understand what you're suggesting
>> >(or is this just a joke :-)) ... you say to make it circular ...
>> >do you mean, set it up so that if stack space is exhausted the stack
>> >pointer just wraps around and begins eating up the bottom
>> >of the stack?
>> It's not a joke, and that's exactly what I meant.
>> Mathematically, it makes
>> absolutely no sense; but programmatically, it's very
>> logical, and highly
>> controllable.
>> It's also very minimalist. :-)
>Well. In this kind of system, it wouldn't be possible to reliably do
>some kinds of recursion. Hard to say how serious that would be ...
Yes, it is. Especially when it's so vague.
More specifically, it would be hard to express extremely deep recursion.
But realistically, it's already hard to express that; the programmer wishing
to do that type of job usually finds some other way to do it.
>> >> >Currently, there are no primitives for dealing with
>> >> >arrays, buffers,
>> >> >or things of that kind. Also, there is no support for lambdas
>> >> >or definitions.
>> >> Add the dictionary, and you'll have support for buffers, and
>> >> a buffer is an array.
>> >Yeah... I might end up using this sort of solution. In the
>> >long term,
>> >though, I'd like to find some way to unify quoted programs
>> >with arrays,
>> >so that primitives like "cons" and "concat" could work on both.
>> cons will never work with true arrays; arrays respond only
>> to indexing and concatenation.
>What about insertion and deletion (of chunks into and out of the middle
>of arrays)? These are things I'd want to be able to do with arrays too.
Those are nothing more or less than indexing, concatenation, and array
creation.
Insertion creates a new array the size of both arrays together, then appends
the elements of the first array up to the insertion point, then appends all
of the elements of the second array, then appends the rest of the elements
of the first array. It then destroys the original first array.
Deletion operates in the same basic way -- it creates a new array which
contains some of the stuff contained in the old one, then treats the new
array as though it were the old one.
>Of course, I'd use them sparingly unless there's some automatic memory
>management in place to improve their performance.
Memory management would only hurt their performance.
Of course, these things never happen in a vacuum; there's always some code
_calling_ these array mutators. It's quite possible that this code needs
memory management of the arrays because it's not written linearly. There's
nothing wrong with that.
At a low level, though, you want to start implementing without garbage
collection, and only start using GC after you've implemented it. If you
think that you can't implement any more of your design until you add GC,
that's your call. I would disagree; I don't think it's possible to have GC
until you have operations can use it.
>> And since quotations are completely and correctly modelled
>> by arrays, and
>> arrays can be held in buffers, if you add support for buffers and
>> compilation into those buffers you have all of the
>> essentials for handling quotations.
>Yes, that probably is not a bad way to do it for now. But, in the
>long-term, I'd like to have some automatic memory management, so
>that programmers do not have to worry about manually allocating,
>resizing, and freeing buffers.
Sure. But you start by adding buffers to your system, and making your
system use those buffers, and as your system gets more complete it can do
more of the work of managing those buffers.
>> So you're right; using quotations for conditionals isn't a
>> terrible idea. It just isn't simple.
>Right. There are disadvantages to using quotations for conditionals,
>but despite these I still like the idea.
Understandably.
Personally, I dislike the whole idea of complex conditionals. They make
your definitions much larger, and make it harder to break them apart into
smaller definitions. I think the Joy 'if' is too complex; not only does it
take three parameters, it also requires both 'if' and 'else' blocks. Chuck
Moore doesn't even use ELSE blocks anymore; he simply returns at the end of
his IF block, and performs the ELSE action after the endif. This clearly
would have the effect of keeping definitions short.
>> Do you remember how I suggested compiling quotations to
>> wordcode, with one
>> cell per word/literal? Well, IF and THEN may be immediate,
>> but when they're
>> quoted treat them just like any other word -- their wordcode
>> is a single cell long. In other words, they aren't executed.
>> It's only when you're compiling the quotation that you
>> actually execute the IF.
>And you can only compile a quotation that has "IF" in it if the "THEN"
>is in sight. Yes, this could be a solution. However, it seems kind of
>strange that "immediate" words would then not always be executed
>immediately, but may be executed at various times while the program
>is running.
They execute only when the system is generating code. That's their job.
There's a problem with my solution, though. Some words in Forth look ahead
in their source code, but if a quotation has already parsed the source for
them they can't do that. I think this is a message: quotations embedded in
the source and source parsing don't ever mix.
Of the two, I choose to keep source parsing, because it's a quick, simple,
powerful operation, adding no overhead to the runtime.
>Nevertheless, it _could_ work (although I think I'm
>more inclined to just stick with the quotation-taking conditionals).
I'm inclined to get rid of the idea of quotations as a special data
structure. I want to keep them, but I don't want to leave them as something
which can be entered as a literal at anytime, nor do I wish to encourage
their use where they aren't needed.
If you need a quotation, you should be able to build it, just the same as
when you need a list you can build it. You can also write a microlanguage
which will build either one for you -- but I won't do it for you, lest I get
in your way when you're doing something else.
>> Laziness might be hard in concatenative languages, though.
>> The problem is
>> that programs are assumed to be laid out in terms of
>> dataflow, and are
>> assumed to act sequentially on a single object, the stack.
>> This single
>> object is a bottleneck to parallelism (and laziness).
>> Fortunately, there's
>> a solution -- the stack can be usually treated as groups of
>> items, and the
>> dataflow-nature of stack based programming makes those
>> groups easier to
>> discover than they otherwise would be. A good optimiser,
>> lazy or otherwise,
>> could treat those groups seperately.
>Yes... something like this would make automatic laziness and
>concurrency
>not too hard. However, laziness and concurrency can only safely be
>done if the programs in question are pure, i.e. they have no effect
>except on the stack. If a group of programs are impure, then they must
>be executed in the order specified; attempting to be lazy and postpone
>one or trying to execute both at once may be bad.
But this depends on what you mean by "pure". Every procedure is in a sense
pure if it performs only what it's supposed to do. Displaying to the screen
is perfectly fine, so long as you're supposed to do that. A lazy/parallel
compiler would have to track all of the different resources which are being
modified and used, and could thus reasonably easily handle an impure
functions in a pure language. Assuming the language in question is
concatenative, of course -- if it's not, the order the language is written
in is not the order it's supposed to execute in, and the compiler has to
make assumptions about instruction ordering.
>> In many architectures there are enough registers to hold at
>> least two Forth
>> tasks onboard at the same time. That is FAST context
>> switching. (In the
>> x86, you can discard the TOS register by using direct memory
>> references, at
>> a speed penalty, so you can still fit two tasks onboard if
>> it's worth the loss.)
>This is a nice idea, but it wouldn't work on x86. Many instructions,
>such as DIV, IN, and OUT require that the specific registers
>EAX and EDX be used. And, CALL and RET require that ESP be used for
>the return stack.
This is why you can only fit two processes on an x86, even after discarding
the TOS register.
>Thus, if you were to switch tasks safely, you
>would at least have to save EAX, EDX, ESP, and EIP (you _might_ be able
>to get away with having the data stack pointers for two tasks be kept
>in separate registers or in static memory addresses so that they don't
>have to be saved). So, you can't really keep two tasks "on
>board" at once;
>it will at least be necessary to swap out those four registers.
Ah, but if both tasks are truly on board you're going to save and restore
all the registers anyhow when you task switch.
And if only one task is on board, it's not a problem to store the DS in EDX,
the TOS in EAX, and so on.
>But, since we're going to be executing hundreds of thousands of
>instructions in between context switches, the time needed to save
>these four registers is insignificant. And, the monumental gain
>of going ahead and allocating the other four registers will
>FAR outweigh
>the tiny loss of having to save them. At least, that's my suspicion;
>I haven't done any benchmarks, so I can't really say.
That's possible. I'm only bringing up the options. As I said, "if it's
worth the loss."
>- "iepos" (Brent Kerby)
-Billy
iepos@tunes.org — 2000-05-12 18:30:38
> >It's interesting that you bring up continuations again. I had left the
> >idea behind with applicative systems. I don't see what use they'd have
> >in a Joy-like system. But ... maybe they would be useful ...
>
> I posted a message about backtracking just a while ago. Clearly,
> backtracking is useful, and it works using continuations.
Ahh... I'll have to go back sometime and follow some of those
links you gave...
> >For, suppose we had just obtained our screen size from the
> >window manager
> >and our "x" and "y" are on the stack. Then all we have to do is
>
> > [render] cons cons
>
> >and *POOF*, we have a new fine-tuned compiled version of "render" on
> >the stack that takes into account the fixed screen size. Of course,
> >the system might not bother to optimize the new program until it
> >realizes that it is going to be duplicated and called MANY times,
> >in a loop.
>
> Okay. You're asking for a way to compile parameters into a procedure. No
> problem. You do it, as you note, the same way you build procedures. Let's
> suppose x and y are on the stack:
>
> NAME: myRender
Okay. Let me see if I've got this straight... That line was supposed
to create a new empty buffer and name it "myRender", right?
But wait, a pointer to it is getter pushed on the stack, too, right?
> swap append append ( add the parameters )
Hmm. I'm not quite following this. Okay, "swap" gets things in order
so that the "x" parameter is on top and the empty buffer beneath.
Then we append. But wait, wouldn't we need to swap again before
we append the "y"? Oh, no, because the "y" is prepended, since
we want it to get pushed before the "x" in the resulting program.
But wait. That wouldn't work... doesn't append care what order the buffer
comes in? The buffer either has to be on top and the other item beneath
or the other way around, right? So which way is it?
> ' render append ( add the function )
Hmm. Okay, "' render" pushes a pointer to the "render" function onto
the stack, right? And this pointer can be treated like a buffer.
So we append it, completing the buffer we've been constructing.
But, there's a problem here, I think. We have the buffer that contains
our two parameters "x" and "y", and we have a buffer for the render
function; what we need to is _concatenate_ them together. If I
understand, this is not what your "append" does. "append" only adds
a single item, right?
> myRender compile: myRender ( compile the quotation we just built into a
> function which has the same name. )
Not sure I understand this. Isn't a pointer to our buffer on the stack.
If so, then why are we pushing another one with "myRender"?
> >> Programs in LISP are trees, right? That's why a tree
> >> structure, the s-exp,
> >> is used to represent them, and that's why a tree-building
> >> program, CONS, is
> >> used to create them.
> >> [...]
> >> Programs in concatenative languages are lists.
>
> >Well, not any more than LISP's programs are lists.
>
> You're going to have to really slow down and read my message. You're
> assuming a LOT of false stuff, and it's really throwing you off.
>
> s-exps are trees, by definition. Concatenative programs are better
> expressed as lists. Think about it: concatenation makes sense on a list,
> but it makes no sense on a tree.
Hmm... I think I might be see what I was missing. When you see the s-exp
(3 4 5)
you are thinking of it as the tree:
*
/ \
3 *
/ \
4 *
/ \
5 nil
Right? I was thinking that LISP lists were not necessarily always
implemented with pointer chains in this way, that they could sometimes
be implemented with arrays. But perhaps this is not true; I'm not
very knowledgable about LISP and have never done any practical
programming in it, so I'll just retreat and take back my above
statement about LISP programs :-)
> >You could say
> >that programs in concatenative languages are also trees.
>
> Yes, you could. This is what Joy does. However:
One thing I should clarify... When I said that Joy programs could
be thought of as trees, I meant something like this: that this program
0 [[dup . 1 +] dup call] dup call
could be thought of as the tree:
0 * dup call
|
----------
* dup call
|
---------
dup . 1 +
That is what I meant, in case it was not clear.
> >They're
> >lists, but members of the list may also be lists, in the case of
> >a quotation.
>
> What's special about a quotation, I ask rhetorically?
Quotations are very special, a very important part of Joy.
At least, that's my perspective.
There's an interesting class of programs can be constructed from
just "i", "dip", "cons", "dup", and "drop" using concatenation and
quotation. Surprisingly, if you supplement these with some primitives
for I/O, you can do just about anything, even without extra primitives
for arithmetic, branching, and looping. This was demonstrated somewhat
by my little interpreter that emulates fixed-point arithmetic
(at
http://www.tunes.org/~iepos/prog.c with 'remarks' file in same
directory).
In fact, I believe that the Joy base {call, dip, cons, dup, drop} corresponds
closely to the {I, B, C, W, K} base of good old applicative combinatory
logic.
Since "i", "dip", "cons" require quotations in order to work, this
may explain a bit why I find quotations so special.
> Simply this: a quotation is _data_, not a part of the program.
> It's a literal in the same sense that an integer is a literal.
Well. Yes. I guess you could say that quotations are data.
> >Joy's syntax is really very similar to LISP ...
> >the main difference is that we use []s instead of ()s, and omit
> >the outermost []. Of course, LISP is different in many other ways,
> >for instance, in that it does not use an explicit stack and
> >relies heavily on bound variables.
>
> Whew! That's a heavy statement, and very untrue. Joy doesn't have a
> distinguished position for functions versus parameters, and Lisp does. Joy
> doesn't use [] to indicate function calls, but rather to indicate data
> boundaries.
Yes, there are many many differences. All I meant to say was that LISP
and Joy are both based on lists. But then again, my experience with LISP
is very limited ...
> >But, even if one goes the automatic-memory-management route, there
> >are alternatives to CONS. For instance, there is CAT (concat) and UNIT:
> > [x] [y] cat -> [x y]
> > [x] unit -> [[x]]
>
> This is essentially what happens when you use APPEND instead of CONS.
> (Although APPEND takes a scalar and a list, rather than two lists.) Unit,
> in this context, is the act of allocating a new, empty buffer.
Hmm... based on what I saw earlier, it looks to me like your APPEND is
more similar to "cons" than to "cat". UNIT does not need to be taken
as an extra primitive, since its effect can be emulated by creating
an empty buffer and then APPEND'ing a single thing to it.
By the way, earlier you seemed to have some construct "NAME" for making
an empty buffer. It seemed to involve (gasp) binding a variable name.
Ideally, it seems like we'd just use "[]" to push an empty buffer
onto the stack.
> >> cons will never work with true arrays; arrays respond only
> >> to indexing and concatenation.
>
> >What about insertion and deletion (of chunks into and out of the middle
> >of arrays)? These are things I'd want to be able to do with arrays too.
>
> Those are nothing more or less than indexing, concatenation, and array
> creation.
>
> Insertion creates a new array the size of both arrays together, then appends
> the elements of the first array up to the insertion point, then appends all
> of the elements of the second array, then appends the rest of the elements
> of the first array. It then destroys the original first array.
>
> Deletion operates in the same basic way -- it creates a new array which
> contains some of the stuff contained in the old one, then treats the new
> array as though it were the old one.
Oh, I see... Yes you could construct insertion and deletion from
your favorite primitives: "indexing" (meaning "!" (fetch) and "@" (store)
I suppose) and "array creation".. You really don't even need concatenation.
It could be constructed in a similar way to insertion, by appending
all the elements of the second array one at a time.
> >Of course, I'd use them sparingly unless there's some automatic memory
> >management in place to improve their performance.
>
> Memory management would only hurt their performance.
Whoa... now what if somebody tries to insert a 64KB chunk into a
big 2MB buffer? An automatic memory management system would certainly
not hurt the performance if it decided not to actually insert the
chunk, but instead just keep track and pretend that it inserted
it. Of course, a programmer could do that himself, although it
could get messy.
> Personally, I dislike the whole idea of complex conditionals. They make
> your definitions much larger,
Hmm... do they really? Let's compare:
[x] [y] if
to:
IF x ELSE y THEN
Looks like the FORTH one is larger by a bit.
> and make it harder to break them apart into smaller definitions.
Hmm... that's interesting. Do some FORTH programmers write definitions
that include incomplete IFs? If so, then this may be a valid point.
> I think the Joy 'if' is too complex; not only does it
> take three parameters, it also requires both 'if' and 'else' blocks.
True. I've considered adding another primitive "opt" that would take
just an "if" block.
> There's a problem with my solution, though. Some words in Forth look ahead
> in their source code, but if a quotation has already parsed the source for
> them they can't do that. I think this is a message: quotations embedded in
> the source and source parsing don't ever mix.
Yes. I think that may be true.
> Of the two, I choose to keep source parsing, because it's a quick, simple,
> powerful operation, adding no overhead to the runtime.
And, I choose to keep quotations because quotations are cool.
> I'm inclined to get rid of the idea of quotations as a special data
> structure. I want to keep them, but I don't want to leave them as something
> which can be entered as a literal at anytime, nor do I wish to encourage
> their use where they aren't needed.
Hmm... this style is opposite from the one I'm considering. But it
sounds like it would probably work out fine. There's no harm in
trying both approaches.
> If you need a quotation, you should be able to build it, just the same as
> when you need a list you can build it.
Yes. By the way, I mentioned earlier that I wanted some way to unify
quotations with lists and buffers; I think I've come up with a pretty
good solution, similar to Joy's. I'm going to have introspecting
primitives "!" and "@" (similar to FORTH's words of the same names)...
"@" would take an index and a quotation/list beneath, and would push
the such-and-such element of the list onto the stack. "!" would
act similarly, taking the item to store on top of the stack, and then
the index and the quotation/list. There will also be a "size" primitive
for inspecting the size of a quotation/list. A quotation would be
formally identified with a list of execution tokens. Execution tokens
could be fetched, stored, and compared as a means of reflection.
It feels kind of funny advocating this technique, since it was only
a couple months ago you were advocating it, and I was arguing about
how impure it was.
Anyway, there are many details to work out...
> >> In many architectures there are enough registers to hold at
> >> least two Forth
> >> tasks onboard at the same time. That is FAST context
> >> switching. (In the
> >> x86, you can discard the TOS register by using direct memory
> >> references, at
> >> a speed penalty, so you can still fit two tasks onboard if
> >> it's worth the loss.)
>
> >This is a nice idea, but it wouldn't work on x86. Many instructions,
> >such as DIV, IN, and OUT require that the specific registers
> >EAX and EDX be used. And, CALL and RET require that ESP be used for
> >the return stack.
>
> This is why you can only fit two processes on an x86, even after discarding
> the TOS register.
Oh, I see now. At a task-switch, you can just exchange EAX, EDX, ESP, and
the data stack pointer with the four remaining registers. I thought
by "fit two tasks onboard" that you meant that you could switch
tasks without _any_ exchanging of registers or memory. You can't do that on
x86, but you can switch pretty fast, by doing the four exchanges,
more-or-less (some other stuff would have to be done to switch EIP).
> >- "iepos" (Brent Kerby)
>
> -Billy
- "iepos" (Brent Kerby)
wtanksley@bigfoot.com — 2000-05-12 19:36:32
From:
iepos@... [mailto:
iepos@...]
>> Okay. You're asking for a way to compile parameters into a
>> procedure. No
>> problem. You do it, as you note, the same way you build
>> procedures. Let's
>> suppose x and y are on the stack:
>> NAME: myRender
>Okay. Let me see if I've got this straight... That line was supposed
>to create a new empty buffer and name it "myRender", right?
>But wait, a pointer to it is getter pushed on the stack, too, right?
Oops. Sorry, I forgot to document that.
The word "NAME:" modifies the dictionary by adding a name to it, and setting
the "current definition" to the address of the name. The name's default
behavior when executed will be to push its address on the stack.
APPEND adds its parameter to the end of the buffer for the "current
definition."
I was assuming Forth-like behavior there, not Joy-like behavior.
>> swap append append ( add the parameters )
>Hmm. I'm not quite following this. Okay, "swap" gets things in order
>so that the "x" parameter is on top and the empty buffer beneath.
Swap simply exchanges x and y, so that when we append them they go in the
right order. Then we append twice, thereby compiling both x and y into the
buffer.
>But wait. That wouldn't work... doesn't append care what order
>the buffer
>comes in? The buffer either has to be on top and the other item beneath
>or the other way around, right? So which way is it?
Neither -- the "current buffer" is a dictionary concept, and APPEND uses it
by default. In other words, APPEND is a method of the singleton object
'dictionary'.
Sorry, my mistake. I should have either chosen semantics which matched what
I knew you were expecting, or I should have clearly documented mine (and
probably chosen a different name).
>> ' render append ( add the function )
>Hmm. Okay, "' render" pushes a pointer to the "render" function onto
>the stack, right?
Yes.
>And this pointer can be treated like a buffer.
>So we append it, completing the buffer we've been constructing.
Almost. I forgot to append one more thing; I need to append an EXIT, so
that the word terminates when it reaches its end. To do that, I can either
"' EXIT APPEND", or I can ";" (since semicolon's job is to append an EXIT).
>But, there's a problem here, I think. We have the buffer that contains
>our two parameters "x" and "y", and we have a buffer for the render
>function; what we need to is _concatenate_ them together. If I
>understand, this is not what your "append" does. "append" only adds
>a single item, right?
We certainly do NOT want to concatenate them together; we want to
concatenate a _call_ to render into our buffer. If we wanted to concatenate
them together, we would use a function named something like "' render
APPEND-INLINE". APPEND-INLINE would probably call APPEND repeatedly.
>> myRender compile: myRender ( compile the quotation we just
>> built into a function which has the same name. )
>Not sure I understand this. Isn't a pointer to our buffer on the stack.
>If so, then why are we pushing another one with "myRender"?
No, there's not a pointer on the stack. But there could be, and even if
there wasn't I could have been consistant by making the compile: word
compile the "current buffer" by default.
>> >> Programs in LISP are trees, right? That's why a tree
>> >> structure, the s-exp,
>> >> is used to represent them, and that's why a tree-building
>> >> program, CONS, is
>> >> used to create them.
>> >> [...]
>> >> Programs in concatenative languages are lists.
>> >Well, not any more than LISP's programs are lists.
>> s-exps are trees, by definition. Concatenative programs are better
>> expressed as lists. Think about it: concatenation makes
>> sense on a list, but it makes no sense on a tree.
>Hmm... I think I might be see what I was missing. When you see
>the s-exp
> (3 4 5)
>you are thinking of it as the tree:
> *
> / \
> 3 *
> / \
> 4 *
> / \
> 5 nil
>Right? I was thinking that LISP lists were not necessarily always
>implemented with pointer chains in this way, that they could sometimes
>be implemented with arrays. But perhaps this is not true; I'm not
>very knowledgable about LISP and have never done any practical
>programming in it, so I'll just retreat and take back my above
>statement about LISP programs :-)
Grin.
Just FYI, then, Lisp programs *must* be expressed as trees; this is why you
have to use parens in them. Even if you imagine that (3 4 5) is an array,
Lisp programs are NEVER written as a single function call; they're always at
least one layer of nesting deeper. (add 5 4) may be valid Lisp, but it's
not a program; (print (add 4 5)) is more likely, and (defun func (x) (print
(add x 5))) is still more likely.
The branches of this tree are required because they're parameters being
applied to functions. Conversely, take away the ability to apply parameters
to functions, and you also remove the need for tree-structured code.
This is fundamental.
A while ago -- before I'd heard of Joy, and therefore before I'd heard of
concatenative languages -- someone on the Lisp newsgroup quoted one of the
inventors of Lisp as saying, "the strength of Lisp is how it's able to treat
data as code. Lisp will be dead only when someone comes up with a language
which does this better."
Need I spell out the next conclusion I logically make?
>> >You could say
>> >that programs in concatenative languages are also trees.
>> Yes, you could. This is what Joy does. However:
>One thing I should clarify... When I said that Joy programs could
>be thought of as trees, I meant something like this: that this program
> 0 [[dup . 1 +] dup call] dup call
>could be thought of as the tree:
> 0 * dup call
> |
> ----------
> * dup call
> |
> ---------
> dup . 1 +
>That is what I meant, in case it was not clear.
It's absolutely clear. This is also what I meant. However, it's notable
that the _code_ does not itself have a tree structure; rather, some data
which happens to be literally included in the code has a tree structure.
>> >They're
>> >lists, but members of the list may also be lists, in the case of
>> >a quotation.
>> What's special about a quotation, I ask rhetorically?
>Quotations are very special, a very important part of Joy.
>At least, that's my perspective.
I agree. They're not a vital part of any concatenative language, but
they're such an important part of the theory and notation that I agree with
you that any concatenative language without them is incomplete.
>Since "i", "dip", "cons" require quotations in order to work, this
>may explain a bit why I find quotations so special.
Ah, but they don't require inline quotations.
>> >But, even if one goes the automatic-memory-management route, there
>> >are alternatives to CONS. For instance, there is CAT
>> >(concat) and UNIT:
>> > [x] [y] cat -> [x y]
>> > [x] unit -> [[x]]
>> This is essentially what happens when you use APPEND instead of CONS.
>> (Although APPEND takes a scalar and a list, rather than two
>> lists.) Unit,
>> in this context, is the act of allocating a new, empty buffer.
>Hmm... based on what I saw earlier, it looks to me like your APPEND is
>more similar to "cons" than to "cat". UNIT does not need to be taken
>as an extra primitive, since its effect can be emulated by creating
>an empty buffer and then APPEND'ing a single thing to it.
>By the way, earlier you seemed to have some construct "NAME" for making
>an empty buffer. It seemed to involve (gasp) binding a variable name.
>Ideally, it seems like we'd just use "[]" to push an empty buffer
>onto the stack.
It didn't involve binding a variable name; its entire effect was to bind a
variable name. You're correct that it was the wrong way to solve the
problem, though. Your suggestion is wiser, although I would go a step
further by not using a _new_ empty buffer, but rather borrowing a temporary
buffer.
>> >Of course, I'd use them sparingly unless there's some
>> >automatic memory
>> >management in place to improve their performance.
>> Memory management would only hurt their performance.
>Whoa... now what if somebody tries to insert a 64KB chunk into a
>big 2MB buffer? An automatic memory management system would certainly
>not hurt the performance if it decided not to actually insert the
>chunk, but instead just keep track and pretend that it inserted
>it. Of course, a programmer could do that himself, although it
>could get messy.
It would *certainly* not hurt performance? That's a big assertion. I can
clearly see that there are cases when it would not hurt too badly, but all
in all I believe it would damage both performance and code size (and
therefore bugginess) greatly, because every bit of code which operated on
that array would have to consider not only the end cases of the array, but
also the end cases of pointers inserted into the so-called array.
It's true that you can manage that all automatically; add an extra layer to
generate code for those cases, add another layer to watch for code which
will need that, add more layers to ...
And so on.
Why? Why not just go back to simplicity, and get the same job done in the
same amount of time in all cases using technology which exists now? Your
solution at best keeps the problem solution linear; it's not like the simple
solution is O(2^n) and yours is O(n).
>> Personally, I dislike the whole idea of complex
>> conditionals. They make
>> your definitions much larger,
>Hmm... do they really? Let's compare:
> [x] [y] if
>to:
> IF x ELSE y THEN
>Looks like the FORTH one is larger by a bit.
Trivial cases prove nothing. This case is so trivial that the overhead of
the if overwhelms the overhead of your own code.
Suppose that x and y were four or five words. Suppose that you're reading
the word they're included in for the first time. You don't even know what
those quotations are _for_ until you finally reach the 'if', which is almost
invisible it's so small.
Compare the Moore-style code:
: one IF x OTHERWISE y ;
...unlike ELSE, "OTHERWISE" compiles a return from the current word.
>> and make it harder to break them apart into smaller definitions.
>Hmm... that's interesting. Do some FORTH programmers write definitions
>that include incomplete IFs? If so, then this may be a valid point.
No Forth programmer would consider writing a definition with an incomplete
IF, assuming that you mean an IF without a THEN. However, every Forth
programmer writes IFs without ELSEs.
>> I think the Joy 'if' is too complex; not only does it
>> take three parameters, it also requires both 'if' and 'else' blocks.
>True. I've considered adding another primitive "opt" that would take
>just an "if" block.
This isn't a bad idea, aside from the abbreviation. C is evil mainly
because it abbreviates. It's better to use a meaningless symbol than it is
to use an abbreviation for a meaningful word.
>> There's a problem with my solution, though. Some words in
>> Forth look ahead
>> in their source code, but if a quotation has already parsed
>> the source for
>> them they can't do that. I think this is a message:
>> quotations embedded in
>> the source and source parsing don't ever mix.
>Yes. I think that may be true.
Actually, I thought of another solution: the current Forth scheme has only
two levels, normal and immediate. Perhaps it needs three levels: normal,
parsing, and immediate.
>> Of the two, I choose to keep source parsing, because it's a
>> quick, simple,
>> powerful operation, adding no overhead to the runtime.
>And, I choose to keep quotations because quotations are cool.
And I respect that. I believe quotations are essential.
>Yes. By the way, I mentioned earlier that I wanted some way to unify
>quotations with lists and buffers; I think I've come up with a pretty
>good solution, similar to Joy's. I'm going to have introspecting
>primitives "!" and "@" (similar to FORTH's words of the same names)...
>"@" would take an index and a quotation/list beneath, and would push
>the such-and-such element of the list onto the stack. "!" would
>act similarly, taking the item to store on top of the stack, and then
>the index and the quotation/list. There will also be a "size" primitive
>for inspecting the size of a quotation/list. A quotation would be
>formally identified with a list of execution tokens. Execution tokens
>could be fetched, stored, and compared as a means of reflection.
This looks exactly like what I was thinking, although I personally wouldn't
use the names ! and @. Perhaps I should, though.
I also have another primitive, which in Forth is named "," (comma). It
appends a single execution token onto the current dictionary definition,
growing the dictionary if needed. I belive I would use it more than @ and
!; those two would mainly be useful for patching existing code or writing
quotations into fixed-sized buffers outside of the dictionary.
Uh, by the way, Chuck Moore came up with a cool refinement for Forth:
instead of having a single word ! which takes an address and a value, he has
a set of words. A! takes an address and stores it to the "current address
register"; !A stores a value to the current address reg, and !A+ stores
something and increments the address register. This dramatically reduces
stack manipulations, since most of the time you're using ! and @ you want to
do multiple manipulations with a single pointer.
In software, it might be prudent to have an A stack rather than just an A
register (since words you call could overwrite the value in a register).
Natually, there are also A@ and @A instructions.
>> >> In many architectures there are enough registers to hold at
>> >> least two Forth
>> >> tasks onboard at the same time. That is FAST context
>> >> switching. (In the
>> >> x86, you can discard the TOS register by using direct memory
>> >> references, at
>> >> a speed penalty, so you can still fit two tasks onboard if
>> >> it's worth the loss.)
>> >This is a nice idea, but it wouldn't work on x86. Many instructions,
>> >such as DIV, IN, and OUT require that the specific registers
>> >EAX and EDX be used. And, CALL and RET require that ESP be used for
>> >the return stack.
>> This is why you can only fit two processes on an x86, even
>> after discarding the TOS register.
>Oh, I see now. At a task-switch, you can just exchange EAX,
>EDX, ESP, and
>the data stack pointer with the four remaining registers. I thought
>by "fit two tasks onboard" that you meant that you could switch
>tasks without _any_ exchanging of registers or memory. You
>can't do that on
>x86, but you can switch pretty fast, by doing the four exchanges,
>more-or-less (some other stuff would have to be done to switch EIP).
Actually, I wouldn't want to try to fit two tasks onboard at the same time
unless the compiler was interleaving the tasks at compile-time. Obviously,
this is only good for certain types of task -- but it does have interesting
implications for superscalar processors, since interleaving different
registers reduces pipeline stalls.
>- "iepos" (Brent Kerby)
-Billy
iepos@tunes.org — 2000-05-12 23:09:36
> >> Okay. You're asking for a way to compile parameters into a
> >> procedure. No
> >> problem. You do it, as you note, the same way you build
> >> procedures. Let's
> >> suppose x and y are on the stack:
>
> >> NAME: myRender
>
> >Okay. Let me see if I've got this straight... That line was supposed
> >to create a new empty buffer and name it "myRender", right?
> >But wait, a pointer to it is getting pushed on the stack, too, right?
>
> Oops. Sorry, I forgot to document that.
>
> The word "NAME:" modifies the dictionary by adding a name to it, and setting
> the "current definition" to the address of the name. The name's default
> behavior when executed will be to push its address on the stack.
But its address won't be pushed in this case, since this is the initial
use of the name, which gets gobbled up by "NAME:" in a look-ahead
parsing manner, right? Instead, the address is kept in the
"current definition" register or static memory address. Okay, I think
I get it now.
> >> swap append append ( add the parameters )
>
> >Hmm. I'm not quite following this. Okay, "swap" gets things in order
> >so that the "x" parameter is on top and the empty buffer beneath.
>
> Swap simply exchanges x and y, so that when we append them they go in the
> right order. Then we append twice, thereby compiling both x and y into the
> buffer.
Yes, that makes sense now. This certainly is an interesting way to do it.
If I understand now, this is the complete program for hard-wiring in
the "x" and "y":
NAME: myRender
swap append append
' render append ;
myRender compile: myRender
But it sure seems like it would be easier if we had CONS so we could
just do:
[render] cons cons
But this is probably not compatible with a manual memory management style...
> A while ago -- before I'd heard of Joy, and therefore before I'd heard of
> concatenative languages -- someone on the Lisp newsgroup quoted one of the
> inventors of Lisp as saying, "the strength of Lisp is how it's able to treat
> data as code. Lisp will be dead only when someone comes up with a language
> which does this better."
>
> Need I spell out the next conclusion I logically make?
You conclusion is: it is possible that Lisp is now dead, because
Mr. Thun has come up with a language which does a better job of
allowing one to treat data as code.
> However, it's notable
> that the _code_ does not itself have a tree structure; rather, some data
> which happens to be literally included in the code has a tree structure.
Yes. You could think of it that way.
> >> > [operations like insertion and deletion]
> >> >Of course, I'd use them sparingly unless there's some
> >> >automatic memory
> >> >management in place to improve their performance.
>
> >> Memory management would only hurt their performance.
>
> >Whoa... now what if somebody tries to insert a 64KB chunk into a
> >big 2MB buffer? An automatic memory management system would certainly
> >not hurt the performance if it decided not to actually insert the
> >chunk, but instead just keep track and pretend that it inserted
> >it. Of course, a programmer could do that himself, although it
> >could get messy.
>
> It would *certainly* not hurt performance? That's a big assertion.
> I can clearly see that there are cases when it would not hurt too badly,
> but all in all I believe it would damage both performance and code size (and
> therefore bugginess) greatly, because every bit of code which operated on
> that array would have to consider not only the end cases of the array, but
> also the end cases of pointers inserted into the so-called array.
All right. I blew it. You're right. There are many cases where the
physical insertion would be faster. However, now I'm going to modify
the example. We still have a 2MB buffer. However, now instead of
inserting 64KB we are only inserting one byte. And, very soon,
we know that we are going to need to insert some more bytes, probably
near the same location. We may also need to delete bytes.
This situation happens in real life. In happens in text editors
(although 2MB is stretching it). Anyway, I think that now you'd
agree that physical insertions would be lame in this case; one
solution would be to keep some padding in areas that are likely
to have insertions soon, or that have just had deletions.
Wouldn't it be nice if when you were writing a text editor you didn't
have to worry about that, and could just freely insert and delete bytes
here and there without fear of ridiculous performance? But then again,
there is the argument that memory is very important, and not easy to
automatically manage, and that a programmer should play a role in deciding
how it is managed.
To me, the issue boils down to this question: is automatic memory
management really a hard problem? Or is it just tedious one? If it
turns out that there exists a memory management algorithm that
gives nearly optimal results in nearly all situations, then I'd say:
Automatic memory management wins hands-down. Why write a system in
which the programmer has to write relatively bloated programs to
manually manage something that the system could do just as well?
On the other hand, if it turns out that memory management is truly
a hard problem, and that every time someone comes out with a new
supposedly good algorithm for it, someone else comes out with an
example program on which it performs horribly, and that even if one
patches the algorithm to take into account the example, someone else
can find yet another example on which in performs horribly, then I'd say:
All right. I guess manual memory management is a good thing after all.
My guess is that there exist algorithms that in nearly all situations
do the job as well or better than nearly all programmers would.
But this is only a guess; I could be wrong.
> Suppose that x and y were four or five words. Suppose that you're reading
> the word they're included in for the first time. You don't even know what
> those quotations are _for_ until you finally reach the 'if', which is almost
> invisible it's so small.
That's a good point. However, this kind of thing can happens with nearly
everything in both Joy and Forth, since they are postfix. It seems
consistent with the spirit of postfix to go ahead and give the programs
to branch to first, before you get to the "if".
> >> and make it harder to break them apart into smaller definitions.
>
> >Hmm... that's interesting. Do some FORTH programmers write definitions
> >that include incomplete IFs? If so, then this may be a valid point.
>
> No Forth programmer would consider writing a definition with an incomplete
> IF, assuming that you mean an IF without a THEN.
Yes, that is what I meant, yes.
> However, every Forth programmer writes IFs without ELSEs.
Makes sense...
> >> I think the Joy 'if' is too complex; not only does it
> >> take three parameters, it also requires both 'if' and 'else' blocks.
>
> >True. I've considered adding another primitive "opt" that would take
> >just an "if" block.
>
> This isn't a bad idea, aside from the abbreviation. C is evil mainly
> because it abbreviates. It's better to use a meaningless symbol than it is
> to use an abbreviation for a meaningful word.
I agree; abbreviations are evil, especially when they strip away all
the vowels; this makes them hard to pronounce, and hard to type,
when using a keyboard like Dvorak. However, in this case, one need not
think of "opt" as an abbreviation for "option". I was thinking of it
as a verb, as in, "He opted to sell his computer".
> >Yes. By the way, I mentioned earlier that I wanted some way to unify
> >quotations with lists and buffers; I think I've come up with a pretty
> >good solution, similar to Joy's. I'm going to have introspecting
> >primitives "!" and "@" (similar to FORTH's words of the same names)...
> >
> >[...]
>
> This looks exactly like what I was thinking, although I personally wouldn't
> use the names ! and @. Perhaps I should, though.
There's one thing I might should clarify... The primary use of
"!" and "@" that I had in mind was in dealing with structures
other than quotations, such as arrays of pixels, sound samples,
or characters. It just seems natural, and possibly useful in some
situations, to extend their use to quotations as well.
> I also have another primitive, which in Forth is named "," (comma). It
> appends a single execution token onto the current dictionary definition,
> growing the dictionary if needed. I belive I would use it more than @ and
> !; those two would mainly be useful for patching existing code or writing
> quotations into fixed-sized buffers outside of the dictionary.
Yes, such a "," would be quite useful.
> Uh, by the way, Chuck Moore came up with a cool refinement for Forth:
> instead of having a single word ! which takes an address and a value, he has
> a set of words. A! takes an address and stores it to the "current address
> register"; !A stores a value to the current address reg, and !A+ stores
> something and increments the address register. This dramatically reduces
> stack manipulations, since most of the time you're using ! and @ you want to
> do multiple manipulations with a single pointer.
Yes. That seems like a useful technique. However, I'd be somewhat hesitant
to use it, because of the way that it uses an address register instead
of the stack. Here's a similar solution that I'd like better: just
have "!" and "@" look at the pointer on the stack but not remove it.
You could have "!+" and "@+" if you wanted, that would increment the
pointer. When you were through the pointer, you'd "drop" it off.
However, that technique would probably not be appropriate for FORTH
(at least, you'd still want to keep the original "!" and "@" in some form).
Because of the common use of the single slot variables that are fetched
and stored with "!" and "@", it would be really annoying if
one constantly had to drop the pointer off afterwards.
> >- "iepos" (Brent Kerby)
>
> -Billy
By the way, there's one thing I'd like to bring up. While fiddling
with my interpreter, I've been getting a lot of mysterious
Segmentation Faults because of accidentally running things like
"call", "dip", and "if" with integers on the stack instead of programs.
Thus, I've been thinking about adding some type-checking. But,
I've run into a brick wall. I realized that if we have branching,
it is sometimes impossible to tell at compile-time what the types
of things on the stack will be.
I'm just curious to know what you suggest to remedy this. Here are
some possible policies the system could have, if it tries to
find out the type of a thing on the stack but can't, yet would
like to know because it is getting ready to execute, say, a "+" and
wants to be sure it's two ints on the stack:
1) Emit an error message that the programmer is too sneaky.
2) Install a runtime check to make sure the types are right
(this would involve carrying around some type tags).
3) Don't do any checks, and just hope we're okay.
Right now, I'm leaning toward #1. However, I'm pretty this means
that there will be some cases that the programmer will be absolutely
sure is program is safe, yet the system rejects it because it is
not smart enough. My guess is that this could happen regardless
of how smart the system is.
- "iepos" (Brent Kerby)
wtanksley@bigfoot.com — 2000-05-13 01:03:06
From:
iepos@... [mailto:
iepos@...]
>> >> swap append append ( add the parameters )
>> >Hmm. I'm not quite following this. Okay, "swap" gets things in order
>> >so that the "x" parameter is on top and the empty buffer beneath.
>> Swap simply exchanges x and y, so that when we append them
>> they go in the
>> right order. Then we append twice, thereby compiling both x
>> and y into the buffer.
>Yes, that makes sense now. This certainly is an interesting
>way to do it.
>If I understand now, this is the complete program for hard-wiring in
>the "x" and "y":
> NAME: myRender
> swap append append
> ' render append ;
> myRender compile: myRender
>But it sure seems like it would be easier if we had CONS so we could
>just do:
> [render] cons cons
>But this is probably not compatible with a manual memory
>management style...
No, it's not. It's also wasteful of memory, and it produces a program which
runs _slower_ than the original. But that's okay; include a single word at
the end, for example, "compile-optimised", and you've got it as good as
mine.
Let me rewrite mine a little, using a better style:
swap [] append append ' renderer append function: myRenderer
compile-optimised
Here's the same code, only without the naming, so it's equivalent to yours:
swap [] append append ' renderer append
Better? "[]" is a word (not syntax) which prepares a static buffer to hold
a definition, and returns the buffer address. append works as before, but
this time takes the address of a buffer. compile-optimised takes the
address of a buffer and compiles it into the dictionary.
Remove the words I used at the end to give this new word a name, and you're
ready to compare. The two take almost the same amount of words, but the
append version requires no memory management.
>> A while ago -- before I'd heard of Joy, and therefore before
>> I'd heard of
>> concatenative languages -- someone on the Lisp newsgroup
>> quoted one of the
>> inventors of Lisp as saying, "the strength of Lisp is how
>> it's able to treat
>> data as code. Lisp will be dead only when someone comes up
>> with a language which does this better."
>> Need I spell out the next conclusion I logically make?
>You conclusion is: it is possible that Lisp is now dead, because
>Mr. Thun has come up with a language which does a better job of
>allowing one to treat data as code.
Actually, I'm more of the opinion that whoever said that should eat his
words, but yes. ;-)
>> >> > [operations like insertion and deletion]
>> >> >Of course, I'd use them sparingly unless there's some
>> >> >automatic memory
>> >> >management in place to improve their performance.
>> >> Memory management would only hurt their performance.
>> >Whoa... now what if somebody tries to insert a 64KB chunk into a
>> >big 2MB buffer? An automatic memory management system would
>> >certainly
>> >not hurt the performance if it decided not to actually insert the
>> >chunk, but instead just keep track and pretend that it inserted
>> >it. Of course, a programmer could do that himself, although it
>> >could get messy.
>> It would *certainly* not hurt performance? That's a big assertion.
>> I can clearly see that there are cases when it would not
>> hurt too badly,
>> but all in all I believe it would damage both performance
>> and code size (and
>> therefore bugginess) greatly, because every bit of code
>> which operated on
>> that array would have to consider not only the end cases of
>> the array, but
>> also the end cases of pointers inserted into the so-called array.
>All right. I blew it. You're right. There are many cases where the
>physical insertion would be faster. However, now I'm going to modify
>the example. We still have a 2MB buffer. However, now instead of
>inserting 64KB we are only inserting one byte. And, very soon,
>we know that we are going to need to insert some more bytes, probably
>near the same location. We may also need to delete bytes.
>This situation happens in real life.
Agreed. Furthermore, even if this situation didn't actually exist, there
are other situations where we really DO need automatic memory management.
For example, it's VERY handy in building parsers (eww). And so on.
So I argued wrongly when I tried to point out that automatic memory
management is slower. It sometimes is -- but sometimes it's the only
possible way to manage memory, and even when it's not the only possibility
it's often easier and not all that much slower.
Fortunately, that wasn't the point I was trying to make originally. I was
originally trying to demonstrate that you should start in your compiler by
implimenting the primitive operations (which are always manual), and testing
them manually. Later on you can write automatic control programs which use
those primitive components, but for now you want to start small.
The point is that at every point you have a completely working language.
It's more fun that way! Plus, if you really think your language is good,
you'd want to write your GC algorithm in it!!!! (Okay, I'm kidding. Nobody
WANTS to write a GC algorithm. Except for an insane freak. Hey, me too!
Okay, two insane freaks.)
>In happens in text editors
>(although 2MB is stretching it).
Well, in THAT case there's an easier method -- don't keep the entire text in
a single buffer; instead, keep it in a ragged array (a linked list of
lines). That doesn't require memory management either. But that's a
special case.
>To me, the issue boils down to this question: is automatic memory
>management really a hard problem? Or is it just tedious one?
It's both. But I think that the current GC solutions are sufficient.
>My guess is that there exist algorithms that in nearly all situations
>do the job as well or better than nearly all programmers would.
>But this is only a guess; I could be wrong.
You'd certainly be wrong, until we have computers that can write programs in
general as well as humans.
>> Suppose that x and y were four or five words. Suppose that
>> you're reading
>> the word they're included in for the first time. You don't
>> even know what
>> those quotations are _for_ until you finally reach the 'if',
>> which is almost invisible it's so small.
>That's a good point. However, this kind of thing can happens
>with nearly everything in both Joy and Forth, since they are
>postfix.
That's true, but nowhere else in either Forth or Joy do you have data
structures which span more than a few characters. With everything else, you
get to see the next operator right away, and it's usually easy to tell the
operator apart from the data. Not so with quotations.
>It seems
>consistent with the spirit of postfix to go ahead and give
>the programs to branch to first, before you get to the "if".
This is true. It's also purely functional, which has some theoretical
advantages. However, we can see that it has some negative effects, and it
certainly is harder to compile for.
All in all, though, Forth programmers have tried that before, and it's never
stuck.
>> >Yes. By the way, I mentioned earlier that I wanted some way to unify
>> >quotations with lists and buffers; I think I've come up
>> >with a pretty
>> >good solution, similar to Joy's. I'm going to have introspecting
>> >primitives "!" and "@" (similar to FORTH's words of the
>> >same names)...
>> This looks exactly like what I was thinking, although I
>> personally wouldn't
>> use the names ! and @. Perhaps I should, though.
>There's one thing I might should clarify... The primary use of
>"!" and "@" that I had in mind was in dealing with structures
>other than quotations, such as arrays of pixels, sound samples,
>or characters. It just seems natural, and possibly useful in some
>situations, to extend their use to quotations as well.
Oh! I had no idea that's what you meant. Yes, that makes especial sense in
a primitive system, such as you're building. No doubt you plan to get rid
of them later, but for now they're essential.
>> I also have another primitive, which in Forth is named ","
>> (comma). It
>> appends a single execution token onto the current dictionary
>> definition,
>> growing the dictionary if needed. I belive I would use it
>> more than @ and
>> !; those two would mainly be useful for patching existing
>> code or writing
>> quotations into fixed-sized buffers outside of the dictionary.
>Yes, such a "," would be quite useful.
Oh, it does require allocation which can smoothly grow -- the easiest way to
do that is by what OTA calls "rubberband allocation", like the Forth
dictionary or Unix's sbrk.
>> Uh, by the way, Chuck Moore came up with a cool refinement for Forth:
>> instead of having a single word ! which takes an address and
>> a value, he has
>> a set of words. A! takes an address and stores it to the
>> "current address
>> register"; !A stores a value to the current address reg, and
>> !A+ stores
>> something and increments the address register. This
>> dramatically reduces
>> stack manipulations, since most of the time you're using !
>> and @ you want to
>> do multiple manipulations with a single pointer.
>Yes. That seems like a useful technique. However, I'd be
>somewhat hesitant
>to use it, because of the way that it uses an address register instead
>of the stack.
As I mentioned, in hardware you _want_ to use a register, but in software a
stack would be just as efficient. Not _the_ stack, but rather an additional
stack. Yes, I know what you're about to say: "But that makes those words
impure, because they affect more than the stack!" Well, think about it --
those words are memory access words, and as such they already affect more
than the stack. :-)
>Here's a similar solution that I'd like better: just
>have "!" and "@" look at the pointer on the stack but not remove it.
>You could have "!+" and "@+" if you wanted, that would increment the
>pointer. When you were through the pointer, you'd "drop" it off.
This is fine if you've only got one pointer and you never need to do any
data manipulations.
>However, that technique would probably not be appropriate for FORTH
>(at least, you'd still want to keep the original "!" and "@"
>in some form).
>Because of the common use of the single slot variables that are fetched
>and stored with "!" and "@", it would be really annoying if
>one constantly had to drop the pointer off afterwards.
Right. That's why any technique shouldn't steal the names ! and @, since
they're already used. In general, avoid reusing names for functions with
similar-but-different semantics.
>- "iepos" (Brent Kerby)
-Billy