Re: [stack] Digest Number 26

Mark van Gulik — 2000-05-31 03:33:51

Massimo Dentico <m.dentico@...> wrote:
> Mark van Gulik on Concatenative mailing-list wrote:
>> [.. discussing about OLAP ..]
>> I myself have worked on a smaller scale project with an Objectivity database
>> containing information for about 50 million customers (about 800 Gig, if I
>> recall). We did most of the development in *Smalltalk*. No regrets -- it
>> was a *lot* of fun, and a complete success.
>>
>> Languages like Forth would be completely inappropriate for such constructs.
>> The amount of semantic information in the models easily exceeds Forth's
>> complexity limits (in my estimation). *Parts* of the database could be
>> dealt with by Forth, but its entirety is too intrinsically complex (e.g.,
>> all the indexing structures, keyed by a new kind of spherical coordinate
>> system and the spectra, must allow well-choreographed storage clustering on
>> disk and tape without overcomplicating the high-level code requiring the
>> access).
>
> [Well, I'm not here to sell Forth but to highlight its qualities because
> I'm interested about the design of a better concatenative language of the kind
> that Bill Tanksley has illustrated previously. I'm convinced that it needs
> improvement, in particular about its safety, thus my interest in a
hypothetical
> type system that fits well in the general framework of Forth: IMHO the
approach
> "the language is the type system", that you also seem to embrace with Avail,
> looks very promising.]

I'm positive that Forth is applicable over an enormous range of domains,
mostly due to its core simplicity and extreme extensibility. I'm mostly
concerned, however, with the ability to build *restrictions* in the
language. I've long felt that the power of a programming language depends
in two nearly opposite was on how restrictive it is. A language should
never restrict one to only bad solutions that are mired in useless detail
as, say, Java does, due to its half-assed implementation of almost every
concept it bastardizes from C++, where the implementation is more
whole-assed. In the opposite direction, the thing I hate most about
Smalltalk is its lack of type declarations. This leads me to spend whole
*seconds* staring at a method, trying to figure out the kinds of things it
expects as arguments, in instance variables, etc. Seriously, that's a lot
of time for simple reverse engineering, compared to what's possible.

It really has been a long, long time since I wrote a Forth interpreter, or
even used the language, so my previous post was in part a mild (I hope it
was mild) flame to find out what people *really* like about it, and even
what it looks like these days.

And, of course I'm really on this list to learn about concatenative
languages. My first few posts in this group must have shown my level of
ignorance (see my attempt at tree-rewriting in Joy for a good laugh).

As for "the language is the type system" in Avail... that's only because I
got to the implementation of the type system before most of the other things
in my stacks of notes. Hm, maybe you meant it in the sense of "the type
declarations of Avail are written in Avail code". I think I was
interpreting it more like "everything interesting in Avail has to do with
the type system" when I first read it.


> You miss completely my point. Forth is a meta-tool, this is its point of
> strength. You know better than me (you say that you have written one) that
> Forth is extensible, at the syntactic/semantic level thanks to its
> open interpreter/compiler, in whatever direction: toward low level (direct
> access to hardware, embedding of assemblers) and toward high level (domain
> or even application specific languages). Smalltalk, to a certain extent,
> shares these characteristics of uniformity and extensibility.

Smalltalk and Forth are very similar in initial "world-view". The biggest
difference I see is that Smalltalk *hides* layers better than what I recall
of Forth. In other words, Smalltalk works much, much better when building
upwards than when building downwards. That's fine for the kinds of
applications I deal with, and the rest of them I can always squeeze data
through a carefully shaped API to the lower levels (usually C,
unfortunately). Using Smalltalk as a temporary implementation language for
Avail was quite instructive to me. It was especially constructive when I
started to hybridize the VM into a Smalltalk/C++ blend. My productivity
dropped by more than a factor of 10 (maybe 100 is closer) just by bringing
C++ into the picture, even though only 1% of the C++ code is hand-coded
(20KB out of 2.1MB). Still, if I hadn't done it, I'd be spending far too
long waiting for the VM to run my tests.


> Numerous Object Oriented extensions, with very different characteristics,
> and a Garbage Collector are available in Forth, written in it. I have seen
> also subsets of other programming languages embedded in it, like Lisp and
> Prolog or a tagging language like HTML.

I can understand *implementing* another language in Forth, but embedding it
seems kind of awkward. GC in Forth is a big surprise to me. In Smalltalk I
only know of one or two exploits that can actually corrupt memory (easily
fixed in the VM, of course). Aside from exploiting old bytecode-verifier
bugs in Java, I don't know how to corrupt memory with it at all. C++ with
GC, on the other hand, is trivial to corrupt. Simply convert a pointer to
integer, xor it with 0x55555555, store it somewhere, then later decode and
try to use it. Even with a non-moving collector, this will corrupt memory.
Many C++ compilers will do certain kinds of optimizations that are
incompatible with GC (the main example being pre-adding the address of the
start of an array and a constant offset that puts the address out of range
of the array, even though all uses of it would offset it back into range).
I'm sure Forth has better control over these sorts of incompatible
optimizations, but GC? I'm skeptical of how. I could see it might be
possible if one sandboxed everything and made addresses opaque objects that
were not convertible to integers. A rogue program would still crash, but
the crash would only be in the context of the sandbox (not the
OS/Process/what-have-you).

I just re-read your comment about your concern for more safety in Forth. I
guess that's a way of phrasing my major concern with it, too. It's not just
pointer-safety and type-safety, though. Is it common practice (yet) to use
assertions in the language?


> All that in a coherent, uniform framework, *without* leaving the language
> (or better, the environment) and on-the-fly (load, use and then discard, the
> extensions you need). Note that this is different in comparison to statically
> compiled modules, it's a dynamic precess: every time you load a screen (a
little
> chunk of code) you interpret/compile it perhaps in a different context; this
> opens the door to adaptive interpretation/compilation/execution which is
specific
> to actual context, so any sort of specialization/optimization which exploit
> this dynamic information is potentially possible (depending of what make
sense).
> Note even that only most frequent executed parts of the program need to be
> heavily optimized and these parts could vary during the execution itself or
> between different executions.

I agree that a near-instantaneous compiler allows one to get on with the
work without fussing with figuring out how to reduce dependencies with
headers and make files. Forth must be terrific for this on today's
hardware. My Forth platform was the Commodore-64, which can now run >10x
original speed in *precise* emulation on my 233MHz Mac. By precise I mean
the CPU, the video interrupts, the CIAs, VIAs, sound chip, etc. are all
perfect virtual replicas, down to the cycle.

In Smalltalk, there is no type system to slow down compilations with.
Unfortunately, it misses many optimization opportunities, but projects like
Self have helped show what can be done without sacrifice (other than the VM
developer's time).


> Traditional batch compilers don't know much about the dynamic execution
context
> of the code, they compile alla and blindly apply every possible optimization
> technique statically. This means long compilation times and calls for separate
> module compilation. But separate compilation introduces complexity and
constrains
> the semantics of the language: every feature that is not statically known,
> at compile time *and* at the module boundary, is discarded or restricted (for
> example, separate compilation in Standard ML, which have a powerful type
system,
> is quite difficult and limits the language). Obviously, moving this to
link-time
> or load-time (of compiled code, a.k.a. DLL) complicates the process and don't
> solve anything: the problem here is too much pre-compilation. Forth avoids
this
> issue entirely and Smalltalk too (I know that, for some implementation, is
possible
> to compile to C/C++ but usally this feature is used at the end of a
development
> cycle).

That's why Avail was designed from the ground up to support dynamic
optimization. The current implementation has a stack-oriented nybblecode
representation for space, which is dynamically translated, only when used
heavily, into a wordcode representation that's RTL-like. All optimizations
are reversible, even for existing stack frames. Basically, the wordcode
interpreter is *completely* hidden from application developers (it can be
omitted entirely from the VM to save space on tiny platforms).


>> Clearly these projects need the cooperation of many software developers.
>> What kind of tools are available for Forth in this regard? As far as I'm
>> aware, Forth doesn't even have namespaces. It also lacks (if memory serves)
>> any ability to specify interfaces separately from implementations. High
>> complexity parallel development is simply not possible (at any price) in
>> such an environment.
>>
>> Your mileage may vary.
>
> Experienced Forth programmers assert that their productivity is greater of
> an order of magitude when they use Forth rather than other programming
> languages. I think that, thanks to the characteristics discussed above, they
> tend and succeed to avoid complexity, thus the necessity of big teams of
> programmers and complex management.

Actually, I don't know of a language for with 10x productivity is *not*
claimed by its advocates! Well, ok, FORTRAN, COBOL, and other CAPSLOCK
languages. My personal output peak was 5600 lines of fully tested,
documented code, sustained over a 4.6 day period (~35 hours). That's about
1200 lines a day. The complexity level was fairly high, as this was a
metacircular Smalltalk compiler, interpreter, debugger, disassembler, and
decompiler. I did write 1000 lines of Pascal in a (very short) day, but
that simply implemented a couple of variants of linked list. I only
remember it, because when I compiled it I thought the compiler was broken -
the errors weren't being output to the console. I checked again, and the
executable was just sitting there. I ran it, and every test ran perfectly.
Definitely a strange feeling, but I get that quality level regularly now
with Smalltalk (well, there are usually a few typos). Over all, I'd guess
my productivity is close to 100x higher in Smalltalk than C, or about 20x
higher than in C++.


> But forthers are no more alone in their commitment toward simplicty.
> "Do the simplest thing that could possibly work" [1] is also the motto of
> Extreme Programming [2], a lightweight, humanistic discipline of software
> development, grown in a part of the Smalltalk community:
[...]
> Don't Complexify
[...]
>
> Simplify the problem you've got or rather don't complexify it. I've done it
> myself, it's fun to do. You have a boring problem and hiding behind it is a
much
> more interesting problem. So you code the more interesting problem and the one
> you've got is a subset of it and it falls out trivial. But of course you wrote
> ten times as much code as you needed to solve the problem that you actually
had.

Ah, but that assumes you will never be asked to solve the general problem.
I disagree with the level of "fix it up later" that XP advocates. At first,
when I read the Charles Moore quote I thought "so, writing ten times as much
code doesn't mean it took 10x longer to do". Usually testing, correction,
and general maintenance will be the dominating costs. But 10x as much code
means approximately 10x as many bugs (assuming the problems were of equal
difficulty, and the solutions of similar complexity, neither of which was
probably true). Bug density seems fairly narrow, and depends more on the
person writing the code than anything else, including language. My
father-in-law produce something like 50,000 lines of COBOL code, mostly a
rewrite of a small piece of a moderate scale Smalltalk system I worked on.
It took one month, and as of this date I believe there were two bugs every
found in it that weren't compiler bugs. And this was significantly *above*
his average released-code bug rate.

Clearly the language doesn't matter as much as the discipline. However,
that month could have been a week in another language.


> This seems to contradict what I have written in another e-mail, my concern
with
> generality, but it's not so: abstraction, taking out of context, is a process
that
> necessarily needs more concrete "objects" by which to abstract.

Wow, that phrase "abstraction [is] taking out of context" really sticks in
my mind. I've never heard it described in exactly those words, but they
*really* fit. Anyhow, *finding* the appropriate abstraction level is most
of my effort when working with Smalltalk, both in reading it and writing it,
but in different ways that I'm not sure I can describe well.


> In fact in Forth
> and Smalltalk communities (and probably other, like Self community) this
process
> of abtraction is called factoring or refactoring and moves from more specific/
> concrete pattern of code/data to more abstract one, in an utilitarian way
because
> only what is really general, in the actual context, is generalized. This
> contributes
> to the compactness and then tractability of code (no redundances). I think
that
> generalization and specialization are not necessarily in contrast.

Yes, XP is an interesting phenomenon. I've been using many of the core
concepts for a long time (as have many other Smalltalkers and others). My
current practice is quite light on regression test suites, but I'm working
on it. As the saying goes, "Every day, in every way, I get better and
better, as shown by the chart on my left, showing the results of regression
tests on two consecutive days..."


My apologies for such an off-topic post. Hopefully there are enough
Forthers out there that this isn't as off-topic as I fear it is.

Samuel A. Falvo II — 2000-05-31 04:52:06

On Tue, 30 May 2000, Mark van Gulik wrote:

> I can understand *implementing* another language in Forth, but embedding it
> seems kind of awkward. GC in Forth is a big surprise to me. In Smalltalk I

From the point of view of Forth, there is no difference between embedding
and implementing, especially since to embed you must first implement.

> bugs in Java, I don't know how to corrupt memory with it at all. C++ with
> GC, on the other hand, is trivial to corrupt. Simply convert a pointer to
> integer, xor it with 0x55555555, store it somewhere, then later decode and
> try to use it. Even with a non-moving collector, this will corrupt memory.

This requires more coding time than just using a pointer directly; it's
not economical to pull such tricks. However...

> Many C++ compilers will do certain kinds of optimizations that are
> incompatible with GC (the main example being pre-adding the address of the
> start of an array and a constant offset that puts the address out of range
> of the array, even though all uses of it would offset it back into range).

...this is a very good point to keep in mind and is a much more valid
concern.

> I'm sure Forth has better control over these sorts of incompatible
> optimizations, but GC? I'm skeptical of how. I could see it might be

Conservative GC, the same used in C and C++ systems.

> possible if one sandboxed everything and made addresses opaque objects that
> were not convertible to integers. A rogue program would still crash, but
> the crash would only be in the context of the sandbox (not the
> OS/Process/what-have-you).

Garbage collection and sandboxing have absolutely zero to do with each
other. GC is the examination of running state, while sandboxing is the
cordoning off of execution environment. Two radically different concepts
that are 100% orthogonal to each other.

> pointer-safety and type-safety, though. Is it common practice (yet) to use
> assertions in the language?

In much the same way that it's common practice to use assertions in any
other programming language. Gforth, for example, uses them and exception
catching (yes, Forth now supports exceptions) extensively. I believe that
bigForth and win32Forth does as well.

The tools are there; but if they're not used, don't blame the language.
Blame the programmer.

> Unfortunately, it misses many optimization opportunities, but projects like
> Self have helped show what can be done without sacrifice (other than the VM
> developer's time).

Polymorphic inline caching is very intriguing in this area, and offers
much promise for other types of typeless execution environments, such as
object oriented Forths and even Smalltalk. I would *love* to see it
applied to Python.

> with Smalltalk (well, there are usually a few typos). Over all, I'd guess
> my productivity is close to 100x higher in Smalltalk than C, or about 20x
> higher than in C++.

I find that I can develop my software significantly faster in Python than
in C at some cost in execution speed. However, it's great for performing
on-the-fly proof-of-concept software implementations which, after fully
fleshed out, can then be transliterated into C.

> on it. As the saying goes, "Every day, in every way, I get better and
> better, as shown by the chart on my left, showing the results of regression
> tests on two consecutive days..."

I find that it is very, very, very difficult for me to write any kind
regression tests. I tend to build software by mathematical induction:
design top-down, but then implement bottom up. Basically, the goal is
simple: make each individual component simple enough that it *has* to
work. Then, as you add additional layers to the software, using only the
code you wrote, you therefore know that the higher layers work as well.

Obviously, this doesn't eliminate the need for testing; I do test my code
(with very high success rates in most cases). I just don't do an all-out
regression test on it. In fact, I wrote the bulk of my operating system
kernel this way, pretty much out of necessity; I didn't have a debugger,
and I had very limited amounts of time to write it in (specifically, my
spare time). Astonishingly enough, the overall architecture of the kernel
has remained the same for the past six years.

wtanksley@bigfoot.com — 2000-05-31 18:18:45

From: Mark van Gulik [mailto:ghoul6@...]
>Massimo Dentico <m.dentico@...> wrote:
>> Mark van Gulik on Concatenative mailing-list wrote:

>>> Languages like Forth would be completely inappropriate for
>>> such constructs.
>>> The amount of semantic information in the models easily
>>> exceeds Forth's
>>> complexity limits (in my estimation). *Parts* of the
>>> database could be
>>> dealt with by Forth, but its entirety is too intrinsically
>>> complex (e.g.,
>>> all the indexing structures, keyed by a new kind of
>>> spherical coordinate
>>> system and the spectra, must allow well-choreographed
>>> storage clustering on
>>> disk and tape without overcomplicating the high-level code
>>> requiring the access).

>> [Well, I'm not here to sell Forth but to highlight its
>> qualities because
>> I'm interested about the design of a better concatenative
>> language of the kind
>> that Bill Tanksley has illustrated previously. I'm convinced
>> that it needs
>> improvement, in particular about its safety, thus my interest in a
>> hypothetical
>> type system that fits well in the general framework of
>> Forth: IMHO the approach
>> "the language is the type system", that you also seem to
>> embrace with Avail, looks very promising.]

BTW, I've been talking with the guy who implemented the strong typesystem
for Forth, and it turns out that the language IS the typesystem for him. He
is using the Forth way of doing things after all. You can include any Forth
code in the type declarations.

>It really has been a long, long time since I wrote a Forth
>interpreter, or
>even used the language, so my previous post was in part a mild
>(I hope it
>was mild) flame to find out what people *really* like about
>it, and even what it looks like these days.

Best way is to go to comp.lang.forth; not many people here are that well
aquainted with Forth. Oh, and skip the mild flame; just ask :-). Forth is
certainly a part of our discussion here, but it's not the main point.

>Smalltalk and Forth are very similar in initial "world-view".

Really? I wouldn't have thought that. The Forth machine (including the
compiler and editor) is very simple; the Smalltalk one is more complex.

>The biggest
>difference I see is that Smalltalk *hides* layers better than
>what I recall of Forth.

Smalltalk has many, many layers to hide, even just with nothing but the VM
running. Forth doesn't. This doesn't mean that Forth is worse at hiding
layers; however, it does mean that it doesn't bother hiding the language
layers.

As part of writing a system or application in Forth, you're supposed to
write words which effectively hide their implementation (and the language
layers) and expose an interface which is appropriate to the nature of the
problem. It's recommended that you don't try to hide the Forthness of the
solution (for example, by writing a C parser which outputs Forth code), but
it's certainly possible to do that.

>In other words, Smalltalk works much, much better
>when building upwards than when building downwards.

I'm not sure how to take this. Do you mean that Smalltalk is good at
bottom-up design? Forth is also renowned for that.

>> Numerous Object Oriented extensions, with very different
>> characteristics,
>> and a Garbage Collector are available in Forth, written in
>> it. I have seen
>> also subsets of other programming languages embedded in it,
>> like Lisp and
>> Prolog or a tagging language like HTML.

>I can understand *implementing* another language in Forth, but
>embedding it seems kind of awkward.

That's the huge advantage of concatenative languages -- because the language
itself has no parser (or almost none), the words can act as parsers without
interference from the language. For example, someone implemented a FORTRAN
embedded in Forth (which later was reduced to just a formula translator, but
originally had an almost complete FORTRAN). In use, it looked like this:

VARIABLE x
: myWord i" 3*4/x+sin(x)" ;

As you see, it just fits right into the language, and allows some things to
be expressed in a manner more appropriate to certain problem domains.

In short, i-quote (i") is a Forth word which is set to execute immediately,
regardless of whether Forth is compiling. Its job is to search ahead in the
input stream until it finds the quote which marks the end of its input, and
then it has to parse the expression and emit Forth code for it. It's
actually pleasantly simple to implement, but who cares about implementation
when it's so easy to use.

>GC in Forth is a big surprise to me.

Why? It's natural.

>In Smalltalk I
>only know of one or two exploits that can actually corrupt
>memory (easily fixed in the VM, of course).

>Aside from exploiting old bytecode-verifier
>bugs in Java, I don't know how to corrupt memory with it at
>all. C++ with
>GC, on the other hand, is trivial to corrupt. Simply convert
>a pointer to
>integer, xor it with 0x55555555, store it somewhere, then
>later decode and
>try to use it. Even with a non-moving collector, this will
>corrupt memory.
>Many C++ compilers will do certain kinds of optimizations that are
>incompatible with GC (the main example being pre-adding the
>address of the
>start of an array and a constant offset that puts the address
>out of range
>of the array, even though all uses of it would offset it back
>into range).
>I'm sure Forth has better control over these sorts of incompatible
>optimizations, but GC? I'm skeptical of how. I could see it might be
>possible if one sandboxed everything and made addresses opaque
>objects that
>were not convertible to integers. A rogue program would still
>crash, but
>the crash would only be in the context of the sandbox (not the
>OS/Process/what-have-you).

You seem to be confusing GC with memory protection. In a language which
allows stores to arbitary addresses, memory corruption is always possible.
You don't need to add GC to allow it, and adding GC doesn't change anything
about it.

If you're coding at a low level with Forth and GC is enabled, you can do
things that will kill the GC. They will hurt. Don't do them. If you're
coding at a high level, though, such things are impossible. That "high
level" may appear very much like the low level Forth code you're used to
writing, but if it has memory protections, it's a higher level.

>I just re-read your comment about your concern for more safety
>in Forth. I
>guess that's a way of phrasing my major concern with it, too.
>It's not just
>pointer-safety and type-safety, though. Is it common practice
>(yet) to use assertions in the language?

Not really. It's far more common to use compile-time unit tests/regression
tests. The problem with assertions is that they _require_ application: you
have to be able to remove them when you're about to distribute. It's
possible to do that by building a special sublanguage which handles the
assertions applicatively, but most people don't bother.

As an example of what I mean, consider a Forth word ABORT" (abort-quote),
which takes a flag on the stack and dies if it's true (printing out a
message in the process). You could put an assert of the form:

: myWord dup 0 > ABORT" my param must be bigger than zero."
3 swap /
;

...but you can clearly see that this adds considerable bulk to the word, and
there's no way to optimise it out (without using dataflow and assuming that
anything having to do with ABORT" is a waste of time).

An example of the right way to do this is provided by a fictional word,
"ASSERT|".

: myWord ASSERT| dup 0 > " my param must be bigger than zero." |
3 swap /
;

Now we can define ASSERT to skip over everything up to the next | when we
don't want the asserts in the code. This is very simple to implement -- but
few Forth programmers use it.

There are two replacements. First, compile-time unit tests are a
phenominally powerful means for documenting code and discovering errors.
Second, interactivity allows for considerable examination of code in ways
not appropriate for unit tests. The two together allow for expressivity at
_least_ equal (if not better than) the expressivity given by Design By
Contract (assertions).

>That's why Avail was designed from the ground up to support dynamic
>optimization. The current implementation has a stack-oriented
>nybblecode representation for space, which is dynamically
>translated, only when used heavily, into a wordcode representation
>that's RTL-like. All optimizations are reversible, even for
>existing stack frames. Basically, the wordcode interpreter is
>*completely* hidden from application developers (it can be
>omitted entirely from the VM to save space on tiny
>platforms).

This is very similar to what I plan for the high level of my Joylike
languages, except that I'll use wordcode where you use nibblecode, and
native code where you use wordcode. At a low level, though, no optimization
will EVER be secret; you have to ask for it in order to make it happen
(there will be words which carry out the optimizations). In other words,
the compiler itself will be written in the low level code, and it'll be very
simple indeed.

>Actually, I don't know of a language for with 10x productivity is *not*
>claimed by its advocates!

Correct. Hardly suprising -- the difference between a good programmer and a
bad one is 25x. In other words, 10x is noise. Chuck Moore claims 1000x for
Forth -- not 1000x productivity, but rather 1000x quicker speed and smaller
size without the need for any more budget. I don't speak for him; check his
work at http://www.ultratechnology.com

>> Don't Complexify
>[...]

>> Simplify the problem you've got or rather don't complexify
>> it. I've done it
>> myself, it's fun to do. You have a boring problem and hiding
>> behind it is a much
>> more interesting problem. So you code the more interesting
>> problem and the one
>> you've got is a subset of it and it falls out trivial. But
>> of course you wrote
>> ten times as much code as you needed to solve the problem
>> that you actually had.

>Ah, but that assumes you will never be asked to solve the
>general problem.

No, it assumes _nothing_. That's the beauty of it. It only solves the
problem at hand -- and it leaves you with _real_ experience of what the
specific problem is actually like. If you are later asked to solve the
general problem (and that does sometimes happen), you'll be able to
confidently write the general solution, knowing that you have a specific
solution to test it against (or perhaps even to base it on). If you aren't
asked for that, you haven't spent any extra time or effort on it.

But consider the best case for your argument: the times when you actually
_are_ asked for the general solution. Did you _really_ test your generic
code for all of the cases in the general solution? Are you so certain that
it's the best solution? I've never seen first-draft generic code which
actually worked. I have seen first-draft specific code, though. And the
experience granted by writing that working specific code will certainly lead
the programmer to choose wisely when it comes to the choice between
rewriting and refactoring.

>I disagree with the level of "fix it up later" that XP
>advocates.

And I strongly agree -- with the caveat that "fix it" implies that it's
broken now, which is the opposite of what XP advocates.

>At first,
>when I read the Charles Moore quote I thought "so, writing ten
>times as much code doesn't mean it took 10x longer to do".

Correct. You can spend 10x more time per line, so you can quite possibly
wind up in a tie, in terms of time spent, with a person writing bulkier
code. The difference is that your code will be able to be written and
maintained by a single person instead of requiring a team; and THAT can
suddenly start saving a _lot_ of time and grief.

(The professional Forth companies claim massive time savings instead of
space savings; they claim that they used to lose jobs by bidding
"unrealistically" low times for their jobs (a quarter of the lowest C time),
so they've since tried to simply underbid everyone else, and use the extra
time to polish their code and documentation and still get the project in on
time. They claim they could and did easily meet the so-called "unrealistic"
estimates they made.)

>My apologies for such an off-topic post. Hopefully there are enough
>Forthers out there that this isn't as off-topic as I fear it is.

I mentioned "concatenative" a couple of times, so I think we're okay. ;-)

At any rate, it's clear that a concatenative language _can_ be vastly
simpler and more transparent than an applicative one. At this point we have
to ask whether it's neccessary or good. I believe that it is good, and I'm
designing the first of my two concatenative languages on the basis of that
assumption.

-Billy

Massimo Dentico — 2000-06-01 00:36:50

Mark van Gulik wrote:
>
> Massimo Dentico <m.dentico@...> wrote:
> > Mark van Gulik on Concatenative mailing-list wrote:
> >> [.. discussing about OLAP ..]
> >> I myself have worked on a smaller scale project with an Objectivity database
> >> containing information for about 50 million customers (about 800 Gig, if I
> >> recall). We did most of the development in *Smalltalk*. No regrets -- it
> >> was a *lot* of fun, and a complete success.
> >>
> >> Languages like Forth would be completely inappropriate for such constructs.
> >> The amount of semantic information in the models easily exceeds Forth's
> >> complexity limits (in my estimation). *Parts* of the database could be
> >> dealt with by Forth, but its entirety is too intrinsically complex (e.g.,
> >> all the indexing structures, keyed by a new kind of spherical coordinate
> >> system and the spectra, must allow well-choreographed storage clustering on
> >> disk and tape without overcomplicating the high-level code requiring the
> >> access).
> >
> > [Well, I'm not here to sell Forth but to highlight its qualities because
> > I'm interested about the design of a better concatenative language of the kind
> > that Bill Tanksley has illustrated previously. I'm convinced that it needs
> > improvement, in particular about its safety, thus my interest in a hypothetical
> > type system that fits well in the general framework of Forth: IMHO the approach
> > "the language is the type system", that you also seem to embrace with Avail,
> > looks very promising.]
>
> I'm positive that Forth is applicable over an enormous range of domains,
> mostly due to its core simplicity and extreme extensibility. I'm mostly
> concerned, however, with the ability to build *restrictions* in the
> language.

I completely agree with this. In particular, the key concept is
".. the ability to build *restrictions* in the language". This means
that *you*, the programmer, build the restrictions when you really
need it and it's not the language designer that in arbitrary way
restricts you.

The ability to restrict a language is crucial to reconcile
extensibility and safety. Plan-P is such an example:

- http://www.irisa.fr/compose/plan-p/

---------------------------------------------------------------------------

PLAN-P : a Programming Language for Active Networks and Protocols

Active networks are aimed at incorporating programmability into the
network to achieve extensibility. An approach to obtaining extensibility
is based on downloading router programs into network nodes. Although
promising, this approach raises several critical issues: expressiveness
to enable programmability at all levels of networking, safety and security
to protect shared resources, and efficiency to maximize usage of bandwidth.

PLAN-P is a domain-specific language for implementing application-specific
protocols. It allows applications to program network routers. [...] The key
characteritics of PLAN-P are:

[...]

Safety and security. Because the language is restricted, many properties
are automatically verifiable. For exemple PLAN-P ensures global termination
and guarantees no packet loss or exponential duplication.

[...]

---------------------------------------------------------------------------

> I've long felt that the power of a programming language depends
> in two nearly opposite was on how restrictive it is. A language should
> never restrict one to only bad solutions that are mired in useless detail
> as, say, Java does, due to its half-assed implementation of almost every
> concept it bastardizes from C++, where the implementation is more
> whole-assed. In the opposite direction, the thing I hate most about
> Smalltalk is its lack of type declarations. This leads me to spend whole
> *seconds* staring at a method, trying to figure out the kinds of things it
> expects as arguments, in instance variables, etc. Seriously, that's a lot
> of time for simple reverse engineering, compared to what's possible.

A concatenative language (but I thik this is true for other laguages too)
should suggest, in any case, to use 1 or 2 and not more than 3 parameters,
like Moore recommend for Forth, so that the name of the word (function)
define the meaning of the parameter(s) in an implicit way. However, simple
checking mechanisms are possible via annotations: you annotate the expression
tree nodes during definition and you check it during actual use. These annotations
could be types but not only. Pratically this means the introduction of auxiliary
stack(s). I think other on this list can express this idea better than me.

> It really has been a long, long time since I wrote a Forth interpreter, or
> even used the language, so my previous post was in part a mild (I hope it
> was mild) flame to find out what people *really* like about it, and even
> what it looks like these days.
>
> And, of course I'm really on this list to learn about concatenative
> languages. My first few posts in this group must have shown my level of
> ignorance (see my attempt at tree-rewriting in Joy for a good laugh).
>
> As for "the language is the type system" in Avail... that's only because I
> got to the implementation of the type system before most of the other things
> in my stacks of notes. Hm, maybe you meant it in the sense of "the type
> declarations of Avail are written in Avail code". I think I was
> interpreting it more like "everything interesting in Avail has to do with
> the type system" when I first read it.

This is what I meant: "the type declarations of Avail are written in Avail code".
Another way to express it is: "you blur the distinction between expression
languages for values and types". This means that you reduce the number of
different syntactic *and* semantic elements in the language. This is closely
related to staged computations in my mind:

a type system in a language (statically or dynamically checked) is useful
for restricting run-time values of expressions (and in case variables);
a static typing ensures at compile-time that the program will satisfy these
restrictions; but, in a language where type declarations are written in
the language itself, the only element that differentiates these (type
declarations) and other code is the stage (compile-time vs run-time).
In a language which offers explicit stage annotations (as in Forth with
words like IMMDEDIATE) the distinction between the language and the type
checker is completely blur (the code for the checker is not different
from other IMMEDIATE code).

> > You miss completely my point. Forth is a meta-tool, this is its point of
> > strength. You know better than me (you say that you have written one) that
> > Forth is extensible, at the syntactic/semantic level thanks to its
> > open interpreter/compiler, in whatever direction: toward low level (direct
> > access to hardware, embedding of assemblers) and toward high level (domain
> > or even application specific languages). Smalltalk, to a certain extent,
> > shares these characteristics of uniformity and extensibility.
>
> Smalltalk and Forth are very similar in initial "world-view". The biggest
> difference I see is that Smalltalk *hides* layers better than what I recall
> of Forth. In other words, Smalltalk works much, much better when building
> upwards than when building downwards. That's fine for the kinds of
> applications I deal with, and the rest of them I can always squeeze data
> through a carefully shaped API to the lower levels (usually C,
> unfortunately). Using Smalltalk as a temporary implementation language for
> Avail was quite instructive to me. It was especially constructive when I
> started to hybridize the VM into a Smalltalk/C++ blend. My productivity
> dropped by more than a factor of 10 (maybe 100 is closer) just by bringing
> C++ into the picture, even though only 1% of the C++ code is hand-coded
> (20KB out of 2.1MB). Still, if I hadn't done it, I'd be spending far too
> long waiting for the VM to run my tests.
>
> > Numerous Object Oriented extensions, with very different characteristics,
> > and a Garbage Collector are available in Forth, written in it. I have seen
> > also subsets of other programming languages embedded in it, like Lisp and
> > Prolog or a tagging language like HTML.
>
> I can understand *implementing* another language in Forth, but embedding it
> seems kind of awkward.

Bill Tanskley discusses that in this message:

- http://www.egroups.com/message/concatenative/165?&start=136

I want only to add that Forth quotation seems less intrusive in comparison
to the quotation mechanism of Joy and more appropriate to absorb other
syntaxes. I would like hear your (list participants) opinion regard this subject.

> GC in Forth is a big surprise to me. In Smalltalk I only know of one or
> two exploits that can actually corrupt memory (easily fixed in the VM, of
> course). Aside from exploiting old bytecode-verifier bugs in Java, I don't
> know how to corrupt memory with it at all. C++ with GC, on the other hand,
> is trivial to corrupt. Simply convert a pointer to integer, xor it with
> 0x55555555, store it somewhere, then later decode and try to use it.
> Even with a non-moving collector, this will corrupt memory.
> Many C++ compilers will do certain kinds of optimizations that are
> incompatible with GC (the main example being pre-adding the address of the
> start of an array and a constant offset that puts the address out of range
> of the array, even though all uses of it would offset it back into range).
> I'm sure Forth has better control over these sorts of incompatible
> optimizations, but GC? I'm skeptical of how. I could see it might be
> possible if one sandboxed everything and made addresses opaque objects that
> were not convertible to integers. A rogue program would still crash, but
> the crash would only be in the context of the sandbox (not the
> OS/Process/what-have-you).
>
> I just re-read your comment about your concern for more safety in Forth. I
> guess that's a way of phrasing my major concern with it, too. It's not just
> pointer-safety and type-safety, though. Is it common practice (yet) to use
> assertions in the language?

I don't have nothing to add on the subject of GC which it has not been written
already by Billy and Samuel A. Falvo II.

I don't like assertion: they are usually intended as dynamically checked
invariants instead I prefer statically checked invariants when possible.

> > All that in a coherent, uniform framework, *without* leaving the language
> > (or better, the environment) and on-the-fly (load, use and then discard, the
> > extensions you need). Note that this is different in comparison to statically
> > compiled modules, it's a dynamic precess: every time you load a screen (a little
> > chunk of code) you interpret/compile it perhaps in a different context; this
> > opens the door to adaptive interpretation/compilation/execution which is specific
> > to actual context, so any sort of specialization/optimization which exploit
> > this dynamic information is potentially possible (depending of what make sense).
> > Note even that only most frequent executed parts of the program need to be
> > heavily optimized and these parts could vary during the execution itself or
> > between different executions.
>
> I agree that a near-instantaneous compiler allows one to get on with the
> work without fussing with figuring out how to reduce dependencies with
> headers and make files. Forth must be terrific for this on today's
> hardware. My Forth platform was the Commodore-64, which can now run >10x
> original speed in *precise* emulation on my 233MHz Mac. By precise I mean
> the CPU, the video interrupts, the CIAs, VIAs, sound chip, etc. are all
> perfect virtual replicas, down to the cycle.
>
> In Smalltalk, there is no type system to slow down compilations with.
> Unfortunately, it misses many optimization opportunities, but projects like
> Self have helped show what can be done without sacrifice (other than the VM
> developer's time).

I agree with you.

> > Traditional batch compilers don't know much about the dynamic execution context
> > of the code, they compile alla and blindly apply every possible optimization
> > technique statically. This means long compilation times and calls for separate
> > module compilation. But separate compilation introduces complexity and constrains
> > the semantics of the language: every feature that is not statically known,
> > at compile time *and* at the module boundary, is discarded or restricted (for
> > example, separate compilation in Standard ML, which have a powerful type system,
> > is quite difficult and limits the language). Obviously, moving this to link-time
> > or load-time (of compiled code, a.k.a. DLL) complicates the process and don't
> > solve anything: the problem here is too much pre-compilation. Forth avoids this
> > issue entirely and Smalltalk too (I know that, for some implementation, is possible
> > to compile to C/C++ but usally this feature is used at the end of a development
> > cycle).
>
> That's why Avail was designed from the ground up to support dynamic
> optimization. The current implementation has a stack-oriented nybblecode
> representation for space, which is dynamically translated, only when used
> heavily, into a wordcode representation that's RTL-like. All optimizations
> are reversible, even for existing stack frames. Basically, the wordcode
> interpreter is *completely* hidden from application developers (it can be
> omitted entirely from the VM to save space on tiny platforms).
>
> >> Clearly these projects need the cooperation of many software developers.
> >> What kind of tools are available for Forth in this regard? As far as I'm
> >> aware, Forth doesn't even have namespaces. It also lacks (if memory serves)
> >> any ability to specify interfaces separately from implementations. High
> >> complexity parallel development is simply not possible (at any price) in
> >> such an environment.
> >>
> >> Your mileage may vary.
> >
> > Experienced Forth programmers assert that their productivity is greater of
> > an order of magitude when they use Forth rather than other programming
> > languages. I think that, thanks to the characteristics discussed above, they
> > tend and succeed to avoid complexity, thus the necessity of big teams of
> > programmers and complex management.
>
> Actually, I don't know of a language for with 10x productivity is *not*
> claimed by its advocates! Well, ok, FORTRAN, COBOL, and other CAPSLOCK
> languages. My personal output peak was 5600 lines of fully tested,
> documented code, sustained over a 4.6 day period (~35 hours). That's about
> 1200 lines a day. The complexity level was fairly high, as this was a
> metacircular Smalltalk compiler, interpreter, debugger, disassembler, and
> decompiler. I did write 1000 lines of Pascal in a (very short) day, but
> that simply implemented a couple of variants of linked list. I only
> remember it, because when I compiled it I thought the compiler was broken -
> the errors weren't being output to the console. I checked again, and the
> executable was just sitting there. I ran it, and every test ran perfectly.
> Definitely a strange feeling, but I get that quality level regularly now
> with Smalltalk (well, there are usually a few typos). Over all, I'd guess
> my productivity is close to 100x higher in Smalltalk than C, or about 20x
> higher than in C++.

I don't intend productivity as LOCs (Line Of Codes) but what is possible
to accomplish with the same resources. Give a look to this example:

---------------------------------------------------------------------------

on comp.ai.genetic Markus Mottl <mottl@...-wien.ac.at> wrote:

- http://x33.deja.com/%5bST_rn=ps]/getdoc.xp?AN=573110275&search=thread&CONTEXT=955486708.1962147859&hitnum=6

or as pure text:

- http://www.deja.com/%5bST_rn=ps]/getdoc.xp?AN=573110275&fmt=text

[...]

Consider the Horus project (written in C - ca. half a million LOCs),
conducted at Cornell University. Then consider its successor project
"Ensemble" in which they switched to OCaml and rewrote the whole system
in about fifty thousand LOCs. Result (read their page below): a *speedup*
by an order of magnitude! So the system was not only considerably less
complex but also much faster...

http://www.cs.cornell.edu/Info/Projects/Ensemble/index.html

Well, you could argue now that they are lousy C-programmers but excellent
OCaml-hackers. However, implementing a 500 kLOC program in C is not peanuts...

[...]

---------------------------------------------------------------------------

Contrary to popular belief *size matters* and *the language make the difference*.


> > But forthers are no more alone in their commitment toward simplicty.
> > "Do the simplest thing that could possibly work" [1] is also the motto of
> > Extreme Programming [2], a lightweight, humanistic discipline of software
> > development, grown in a part of the Smalltalk community:
> [...]
> > Don't Complexify
> [...]
> >
> > Simplify the problem you've got or rather don't complexify it. I've done it
> > myself, it's fun to do. You have a boring problem and hiding behind it is a much
> > more interesting problem. So you code the more interesting problem and the one
> > you've got is a subset of it and it falls out trivial. But of course you wrote
> > ten times as much code as you needed to solve the problem that you actually had.
>
> Ah, but that assumes you will never be asked to solve the general problem.
> I disagree with the level of "fix it up later" that XP advocates. At first,
> when I read the Charles Moore quote I thought "so, writing ten times as much
> code doesn't mean it took 10x longer to do". Usually testing, correction,
> and general maintenance will be the dominating costs. But 10x as much code
> means approximately 10x as many bugs (assuming the problems were of equal
> difficulty, and the solutions of similar complexity, neither of which was
> probably true). Bug density seems fairly narrow, and depends more on the
> person writing the code than anything else, including language. My
> father-in-law produce something like 50,000 lines of COBOL code, mostly a
> rewrite of a small piece of a moderate scale Smalltalk system I worked on.
> It took one month, and as of this date I believe there were two bugs every
> found in it that weren't compiler bugs. And this was significantly *above*
> his average released-code bug rate.
>
> Clearly the language doesn't matter as much as the discipline. However,
> that month could have been a week in another language.

See the replay of Bill and above (I agree that the discipline matter
but the language is important too).

> > This seems to contradict what I have written in another e-mail, my concern with
> > generality, but it's not so: abstraction, taking out of context, is a process that
> > necessarily needs more concrete "objects" by which to abstract.
>
> Wow, that phrase "abstraction [is] taking out of context" really sticks in
> my mind. I've never heard it described in exactly those words, but they
> *really* fit.

Yes, it really fit. I was impressed also, the first time I have read it.
This is not by me but François-René Rideau in the Tunes glossary:

- http://tunes.org/papers/Glossary/index.html#abstraction

> Anyhow, *finding* the appropriate abstraction level is most of my effort when
> working with Smalltalk, both in reading it and writing it, but in different ways
> that I'm not sure I can describe well.

Probably this is related to the fact that a single notation is not enough,
see my post on Tunes mailing list:

"Meta-programming important for Psychology of Programming"
- http://lists.tunes.org/list/tunes/0005/msg00000.html

Beacuse metaprogramming seems more simple with a concatenative
language than with applicative languages, I think they are a valuable
option as a minimal canonical notation on which construct different
views (abstractions) via metaprograms.

> > In fact in Forth and Smalltalk communities (and probably other, like Self
> > community) this process of abtraction is called factoring or refactoring
> > and moves from more specific/concrete pattern of code/data to more abstract one,
> > in an utilitarian way because only what is really general, in the actual context,
> > is generalized. This contributes to the compactness and then tractability of code
> > (no redundances). I think that generalization and specialization are not necessarily
> > in contrast.
>
> Yes, XP is an interesting phenomenon. I've been using many of the core
> concepts for a long time (as have many other Smalltalkers and others). My
> current practice is quite light on regression test suites, but I'm working
> on it. As the saying goes, "Every day, in every way, I get better and
> better, as shown by the chart on my left, showing the results of regression
> tests on two consecutive days..."
>
> My apologies for such an off-topic post. Hopefully there are enough
> Forthers out there that this isn't as off-topic as I fear it is.


--
Massimo Dentico

Massimo Dentico — 2000-06-01 01:13:42

Massimo Dentico wrote:
> [...]
>
> This is what I meant: "the type declarations of Avail are written in Avail code".
> Another way to express it is: "you blur the distinction between expression
> languages for values and types". This means that you reduce the number of
> different syntactic *and* semantic elements in the language. This is closely
> related to staged computations in my mind:
>
> a type system in a language (statically or dynamically checked) is useful
> for restricting run-time values of expressions (and in case variables);
> a static typing ensures at compile-time that the program will satisfy these
> restrictions; but, in a language where type declarations are written in
> the language itself, the only element that differentiates these (type
> declarations) and other code is the stage (compile-time vs run-time).
> In a language which offers explicit stage annotations (as in Forth with
> words like IMMDEDIATE) the distinction between the language and the type
> checker is completely blur (the code for the checker is not different
> from other IMMEDIATE code).
>
> [...]

Of course, such a language needs to be carefully designed. In fact is
simple to see problems of circularity: "how to type check the type
checker?". Note that this is usually hidden but *always* present:
a compiler designer could write (and usually do) the compiler, and so
the type checker, in a statically typed language.

--
Massimo Dentico

Massimo Dentico — 2000-06-01 01:53:22

wtanksley@... wrote:
> [...]
> BTW, I've been talking with the guy who implemented the strong typesystem
> for Forth, and it turns out that the language IS the typesystem for him. He
> is using the Forth way of doing things after all. You can include any Forth
> code in the type declarations.
> [...]

It's interesting. I have missed various messages on comp.lang.forth
because of my move to another city. Is he Stephan Becher? Is he
publishing his work on the net?

Thanks.

--
Massimo Dentico

Dr. Stephan Becher — 2000-06-01 13:12:20

Yes, that's me. :-)

I already promised to supply the Forth community with an
as-far-as-possible ANS compatible version of my strongly typed
Forth, including documentation, by the end of July. An older
version that produces optimized native code, is already
finished, but it is non-standard and would require a lot of
documentation, which only partly exists (in german, not in
english). Therefore I do not intend to publish this.

Following are some details on the type mechanism in advance.

There exists a hierarchical type structure with two types,
SINGLE and DOUBLE at the top level. The second level inludes
types like INTEGER, ADDRESS and LOGICAL as subtypes of SINGLE,
and for example INTEGER-DOUBLE and TOKEN as subtypes of DOUBLE.
Most second-level types have further subtypes. The type
hierarchy can certainly be extended by user-defined types
without any limits.

Each Forth word has a stack diagram assigned to it, which is
stored as part of the dictionary. When defining a new word, one
has to specify the stack diagram in the usual Forth style, for
example:

: SPACES ( UNSIGNED -- ) ... ;

If the stack diagram is left out, as in

: ." ... ;

the word is assumed to have neither input nor output
parameters.

( is an immediate word, which puts a status code of data type
STACK-DIAGRAM onto the stack and enteres interpreter mode.
Thus, the following words up to ) are normal interpreted Forth
words. UNSIGNED adds an input parameter of type UNSIGNED to the
definition of SPACES into the dictionary and modifies the
status code. Further input parameters can follow. -- does some
checks on the status code and modifies it in a way so that from
now on output parameters are generated. Finally, ) does
additional consistency checks, initializes the data type stack
with the just defined inpur parameters and re-enters compiler
mode. Since the whole stack diagram is just interpreted Forth
code, words like

: 2*INTEGER ( STACK-DIAGRAM -- STACK-DIAGRAM ) INTEGER INTEGER
;

or even

: N*INTEGER ( STACK-DIAGRAM UNSIGNED -- STACK-DIAGRAM ) 0 DO
INTEGER LOOP ;

can be defined. Also very interesting is

: STRING ( STACK-DIAGRAM -- STACK-DIAGRAM ) CDATA -> CHARACTER
UNSIGNED ;

The latter example requires some explanation. Since pointers
are strongly typed, it has to be specified where they point to.
CDATA is a subtype of DATA, which is itself a subtype of
ADDRESS. To be able to use more than 64k of memory, I have
divided the address space, thus DATA is an address in the RAM
part of the data space. CDATA is the address of an object of
character size, while DATA usually addresses cell sized objects
(16 bit on a 16-bit machine). -> is a Forth word that connects
CDATA and CHARACTER, meaning that this is only one parameter,
namely a data space address of a character.

There are certainly much more details to this concept. I've
tried to keep everything as simple as possible and as
Forth-like as possible. Please accept that I can't provide too
many details right now, because I would rather keep my promise
and have a real version finished by the end of July.

Stephan

----------
From: Massimo Dentico <m.dentico@...>
To: concatenative@egroups.com
Subject: Re: [stack] Digest Number 26
Date: Donnerstag, 1. Juni 2000 03:53

wtanksley@... wrote:
> [...]
> BTW, I've been talking with the guy who implemented the
strong typesystem
> for Forth, and it turns out that the language IS the
typesystem for him. He
> is using the Forth way of doing things after all. You can
include any Forth
> code in the type declarations.
> [...]

It's interesting. I have missed various messages on
comp.lang.forth
because of my move to another city. Is he Stephan Becher? Is he
publishing his work on the net?

Thanks.

--
Massimo Dentico

----------------------------------------------------------------
--------
Now there’s a website! Pay half & get free shipping!
....................................................
http://click.egroups.com/1/4736/6/_/_/_/959824125/
----------------------------------------------------------------
--------

To unsubscribe from this group, send an email to:
concatenative-unsubscribe@egroups.com



----------

wtanksley@bigfoot.com — 2000-06-01 16:49:33

From: Massimo Dentico [mailto:m.dentico@...]

>Of course, such a language needs to be carefully designed. In fact is
>simple to see problems of circularity: "how to type check the type
>checker?". Note that this is usually hidden but *always* present:
>a compiler designer could write (and usually do) the compiler, and so
>the type checker, in a statically typed language.

As a matter of fact, I don't see a problem. The structures and data types
used to check the currently-being-defined types are not the same as the
currently-being-defined type (in other words, we shouldn't have to run into
a problem with recursion).

>Massimo Dentico

-Billy

Massimo Dentico — 2000-06-01 17:26:02

"Dr. Stephan Becher" wrote:
>
> [...]
> There are certainly much more details to this concept. I've
> tried to keep everything as simple as possible and as
> Forth-like as possible. Please accept that I can't provide too
> many details right now, because I would rather keep my promise
> and have a real version finished by the end of July.

Thanks Stephan. Ok, I'll try to postpone every question (and
perhaps some contribution) when you'll go on in public with your
project.

--
Massimo Dentico

Massimo Dentico — 2000-06-01 18:59:10

wtanksley@... wrote:
>
> From: Massimo Dentico [mailto:m.dentico@...]
>
> >Of course, such a language needs to be carefully designed. In fact is
> >simple to see problems of circularity: "how to type check the type
> >checker?". Note that this is usually hidden but *always* present:
> >a compiler designer could write (and usually do) the compiler, and so
> >the type checker, in a statically typed language.
>
> As a matter of fact, I don't see a problem. The structures and data types
> used to check the currently-being-defined types are not the same as the
> currently-being-defined type (in other words, we shouldn't have to run into
> a problem with recursion).

This is a possible simple (maybe the best and useful) approach.
What I was thinking here is to turn a problem in an advantage,
specifically I was thinking about coinductive definitions. But
it's only an intuition, perhaps wrong. Unfortunately I have not
sufficient mathematical skills to explore this subject, but
I can try. Sorry for my hazard. :-)

Brian Rice should propose something in this field (co-inductive
types) with his Slate and Mobius. Again, this thread on the Tunes
mailing list (Tml from now) can help to understand what I'm speaking
about:

"Induction, Co-induction, and Types in Slate"
- http://lists.tunes.org/list/tunes/0003/msg00053.html

-------------------------------------------------------------------------

[...] I originally came up with this idea to deal with the oddities of
co-inductive types, such as streams of values or even the rational number
system. Both of these ideas don't admit to a numeration via the natural
number system, and in programming languages are naturally
lazily-implemented. However, a robust general method (using equational
expressions) exists for specifying these types beyond simple algorithms.
However, I'm still working out the mechanics of this type of framework in
Slate, and I invite your comments. [...]

-------------------------------------------------------------------------

Co-induction, as I understand it, means mutually recursive definitions
without a base case.

Brian speaks about equational expressions, in other words a declarative
style of programming. This recalls to my mind what you have said, Billy,
in a long thread about Joy on the Tml:

"RE: Joy, Lambdas, Combinators, Procedures"
- http://lists.tunes.org/list/tunes/0002/msg00004.html

-------------------------------------------------------------------------

[...] An applicative system only makes sense as a declarative system. A
concatenative system also makes sense as an imperative one. [...]

-------------------------------------------------------------------------

Joy shows well this point of view. In fact a definition really looks
like an equation:

square == dup *

but you can immediatly interpret it operationally (imperative style).

Now, in the case of co-induction (mutually recursive definitions w/o base
case) the imperative interpretation produces an infinite loop. If you are
not interested in a perpetual process this is not very useful. With a
declarative interpretation instead you can deal with infinite data
structures, like streams.

Note that I have not distinguished much between processes and data
because it's not relevant here.

Returning to my original question: "how to type check the type checker?"
The circular character of this definition is evident. I don't know if
it's useful and pratical to deal with the problem. I want to hear yours
opinions: this subject merits further investigations?


--
Massimo Dentico