comments - 2

phimvt@lurac.latrobe.edu.au — 2000-07-07 04:38:41

*** and I'm only in the middle of May ! ***

1. "CLIP joyful", which I must read properly some time, has a fixed
size array as a stack. There seems to have been fruitful interplay
between Forth and Joy concepts. I would like to see more such cross-
fertilisation.
2. Optimising combinators. In a program such as
[1 2 3 4 5] [dup *] map
the [dup *] is first pushed and then immediately popped by the map.
This will occur in most uses of combinators: the program parameters
are actually known at compile time, and the action of the [..] push
(or several pushes) and the combinator could be combined and optimised.
One solution is to have combinators which _syntactically_ take arguments:
[1 2 3 4 5] MAP([dup *])
written in, say, capitals to distinguish them from the ordinary combinators.
Similarly, the ifte combinator
. . . [if-part] [then-part] [else-part] ifte
could allow the optimisation
. . . IFTE([if-part],[then-part],[else-part])
(The parentheses and the commas could be omitted, at some risk to
cause confusions.)
The capitalised variants would never have to do the redundant push(es)
and pop(s). One could even have ifte, IFTE, IfTe etc. which have some of
the program parameters attached at compile time and others taken
from the Joy stack at runtime.
The programmer should not have to be bothered by this, a good compiler
should figure out the possible optimisations.
But figuring this out is not as straightforward as it might seem.
What looks like a program to be executed might also end up being
used as data somewhere, and of course the semantics has to ensure
that the program behaviour is independent of space/time optimisations
- except of course for space/time usage.
My has been, and will be for quite some time, not to worry about
premature optimisation. I prefer to get the design of the language
right in the first place, then to get a reasonable implementation
right, and only then start to think about optimisations seriously.
3. Type checking
In static languages all or almost all typechecking can be done
at compile time (= as the program is being read). In dynamic languages
this is not possible, and hence either programs are not checked at
all, or are checked at run time. In Joy this means, in the case
of the addition operator + :
1. Do we have two parameters?
2. Is the top an integer?
3. Is the second an integer?
4. Add them!
Tests 2. and 3. are done by looking at a (one byte) field of the
stack nodes. These fields are in addition to the value fields of the
nodes (and the "next-link" fields of the nodes).
Often the checks are unnecessary. Consider
foo == + 10 *
This will add two (hopefully) numbers and multiply the sum by 10.
At compile time it will not be known whether the two parameters
really are numbers, so the usual dynamically typechecked version
of + has to be compiled. On the other hand, the sum is an integer,
10 is an integer, and so the compiler knows that the multiplication
will receive correct arguments, and a stripped down version of
multiplication is all that is needed.
Again, you already know my views on premature optimisation.
It would be of considerable interest to examine what kind of
subset of Joy could be (easily) statically typechecked. From Prolog
we know that all kinds of interesting and useful dynamic features
cannot be (easily) made available for a compiled subset.
4. "Baker papers"
The relationship between Joy and linear logic was raised. I have
been interested in this topic for some time, but not been able to
research it properly.
The question was raised whether Joy has references or copies (for
e.g. dup). That is actually a bad question, and it cannot be settled
from the definition of the language - because it is not a language
property. Joy could be implemented either way. The prototype implemen-
tation uses references (hence dup simply pushes a single new node
onto the stack, which has the same type (and value) as the old top.
The type can be "list", and the value a pointer to a huge list.
But that list is not copied at all.
There is another way of implementation: have everything as an array,
copy them fully where necessary. This is more "Forthy" as far as I can
tell, and it might be the best way in many cases. To cons a new item
to the front of a long sequence then requires copying the item and then
the old sequence into a new area. Similar with concatenation.
In purely functional languages the linked method is possible
because there are no assignment statements which can change the
value of a list (although you can have functions that return a
different list than their argument.
Consider a function "swap-last":
[1 2 3 4 5] swap-last --> [1 2 3 5 4]
but now use it as follows:
[1 2 3 4 5] dup swap-last pop --> [1 2 3 4 5]
If the implementation of swap-last interchanges the values 4 and 5 in
the list (which is a state change), and dup was implemented without
copying, then the original list had 4 and 5 interchanged and the
result will be
--> [1 2 3 5 4]
But that is not functional. The state change ruined it.
In a functional language there is no state, no state change,
no assignment statements. It is then always _safe_ to use references
for doing things like dup and swap and the like, and it is efficient in some ways. Linked lists become
the preferred implementation method.
On the other hand, linked lists need garbage collection or
somesuch, and they are slow for indexed access. This is where
array implementations do better. Books on datastructures deal
with all this in detail. The implementation issues are again
similar to those in Prolog (I cannot remember the references, sorry).


Well, friends, this will have to do for today.
Manfred