[in which we discuss how to build programs in concatenative languages -- and
in general, in any language]
From: Mark van Gulik [mailto:
ghoul6@...]
>From: wtanksley@...
>> From: Mark van Gulik [mailto:ghoul6@...]
>>>wtanksley@... wrote:
>>>> From: Mark van Gulik [mailto:ghoul6@...]
>>>>>Massimo Dentico <m.dentico@...> wrote:
>>>>>In other words, Smalltalk works much, much better
>>>>>when building upwards than when building downwards.
>>>> I'm not sure how to take this. Do you mean that Smalltalk
>>>> is good at bottom-up design? Forth is also renowned for that.
>>>No, I was referring to building higher level constructs on top
>>>of existing
>>>ones, versus rebuilding lower level constructs to better
>>>support existing ones.
>> How does that differ from bottom-up design? It sounds identical.
>I think bottom-up usually refers to designing the components that are
>predicted to be needed before building the higher level
>constructs that will ultimately require them.
That's one way of doing design -- I've usually heard it called things like
"meet-in-middle", since you have to start by designing top-down in order to
know what's needed.
Forth is usually used in true bottom-up design, where you write (and design)
the components which do work before you write and/or design the components
which use them.
>I'm talking about reworking an existing system.
Refactoring in Forth is usually done in a bit-by-bit way: rather than
"reworking a system" you refactor parts of it as needed.
>Maybe it's my Smalltalk mentality showing through. You get a
>system working
>(you aim for getting end-to-end completeness even if lots of
>functionality
>is totally faked), and then (A) fill in the pieces of the
>implementation
>that you faked, and simultaneously (B) rearchitect the high
>level pieces now
>that you know what the lower level pieces must look like (even
>if they're
>just "hacked" sketches). This allows inside-outward design and
>implementation to occur simultaneously. This really, really
>works well in practice.
That's definitely NOT bottom-up; it's almost pure top-down design. The
reason it works so well is that you're permitting bottom-up concerns to
affect your already-written code by means of refactoring.
It's also used reasonably often in Forth -- one guy (whose name I've
forgotten) writes certified software by a similar method.
>>>>>I can understand *implementing* another language in Forth, but
>>>>>embedding it seems kind of awkward.
>...
>>>In Avail syntax I would probably be able to embed the syntax
>>>directly,
>>>without resorting to quoting (even though the quotes have effectively
>>>dissolved by runtime in your example).
>> I'm mildly confused. First, you seem to have confused the
>> presence of
>> quotes with the action of quoting. There's no quoting going
>> on; I could
>> have just as easily named the word "I(" and had it parse all
>> the input, and
>> stop when it reached an unbalanced ")" (which would appear
>> to be a matching
>> parenthesis to the reader). The syntax IS directly
>> embedded; Forth knows nothing of it.
>> : myWord i( 3*4/x+sin(x)) ;
>Yes, but you couldn't write it as:
> : myWord 3*4/x+sin x ;
>I think that for Forth the quoted version is more appropriate,
>but for Avail I strive for the unquoted form (without the
>delimiters).
Unless I'm seeing something wrong, you're boasting that Avail can read its
own syntax without using quotes. That's so obvious it's not even
commendable -- why would anyone want a language which couldn't handle its
own syntax? The impressive thing about Forth is that it can read languages
which have no relation to its own syntax, and which are even contradictory
to its syntax. Infix is natural for Avail. Infix is NOT natural for Forth.
Let me give you an example of a language which is natural for neither
language: Befunge. Check out
http://www.catseye.mb.ca/esoteric/befunge/index.html for info.
Here's a Forth program (dependant on the 'Befunge:' library word) to print
out the complete lyrics to the immortal ode '99 bottles of beer on the
wall.'
Befunge: ================================
9::*\2*+00p0v"."0<
>310p0"," >"llaw eht no "v >#v_ ^
^_210p0"--:" v ,
: v " of beer" < :
- >"selttob"00g.^ < <
1 >00g1-#^_$" elttob erom enO" ^
>00g#^_$" selttob erom oN" ^
^_110p0",dnuora ti ssap ,nwod eno ekaT"^
^:-1_010p00g1-00pvv:-1g01_@#g00,*25<
^ <
=========================================
>When you said
>there was no quoting going on, I think what you meant was there was no
>*string manipulation* (other than token scanning).
No, I meant that there's no feature of the language being invoked which
tells the language parser to treat the text specially in any way.
You seem to mean that there should be no symbols surrounding the text which
represents a chunk of code from a foreign language. I don't like your
definition, because it doesn't seem to be useful. However, Forth could also
handle such a word -- consider the word 'expr:', which uses the same parsing
engine as 'i"' above. Unlike i", expr: doesn't search for a closing quote;
instead, it simply parses exactly one expression from the following source.
Usage:
: myWord expr: 3 ;
: myWord2 expr: sin ;
: myWord3 expr: (sin x) ;
As you can see, the problem here is that almost any legal lexeme is also a
legal expression.
>>>[...trivial code that will fail with GC but succeed without...]
>> Fortunately, it's easy enough to fix -- simply make MALLOC
>> and FREE the only
>> words which can't be used in GC-able Forth programs. This
>> fits perfectly
>> with what all other GC-able languages do, and it harms
>> almost no Forth programs.
>> You can get more sophisticated, but as you point out, getting more
>> sophisticated adds edge effects. This solution is certain to work.
>Maybe I'm missing something, but isn't MALLOC the only thing
>that GC should
>have any effect whatsoever on (i.e., making sure a ton of
>unreachable data
>isn't preventing MALLOC from succeeding)? Or do you mean you
>would provide
>an alternative like GC-MALLOC or something. If so, that sounds like a
>perfectly reasonble place to draw the line of compromise.
>You'll still have
>to worry about reference policing, I think. Does Forth
>support casting,
>union types, tagged variant records, reinterpretation of pointers as
>integers, etc? If so, your line might not be as straight as
>you thought.
No to all your questions, because malloc is the only Forth function which is
capable of returning a conventional pointer. Anything else in the Forth
language returns not a pointer, but rather an index into the dictionary.
Yes, we'd have to have a GC-compatible malloc equivalent. This way most
malloc programs will be easy to translate: just change 'alloc' to
'gc-alloc'. Some will give you grief; you'll have to remove pointer games
(but the reason for their incompatibility wasn't the games, it was the fact
that they used malloc at all).
>>>Avail doesn't necessarily keep secret which optimizations have been
>>>performed.
>> Yes, but it does them without asking you. My initial
>> language won't do any
>> optimizations except for ones which cause semantics changes;
>> if you want to
>> apply space or time optimizations, you have to apply them yourself.
>I think (hope) you phrased that backwards ("...won't do any
>optimizations
>except for ones which cause semantics changes").
>Optimizations are *never*
>supposed to change the semantics of a program. If a
>transformation causes
>the semantics to change, it's not an optimization, it's a *required*
>transformation. Optimizations are always supposed to be optional.
I don't see anything in the morphism of the word "optimization" which says
anything about optionality or semantics changes. However, I have no problem
with using the phrase "required transformation", so long as I don't have to
use it too often :-).
My compiler will by default do NO "optimizations" (in your sense); it will
only do optimising "required transformations".
If you want to have your program optimised, you have to ask for it
explicitly. Fortunately, I plan on making that easy: since the code
generator is part of the runtime, you'll be able to write your own
optimisers (or use prewritten ones) with complete impunity.
>>>BTW, I don't recommend
>>>leaving it up to the application-level programmer to decide
>>>which areas to apply optimizations.
>> I'm actually not -- it's up to the systems programmer. If
>> anyone writes an
>> application in my initial low-level language, they'll
>> actually be writing a
>> system, and deserve what they get :-); for example, they
>> won't get any GC.
>You may be surprised, but in your context I agree 100% with
>this division of
>responsibility. In Avail I intend that optimizations are provided by a
>systems-level programmer (or theoretician!). The exact places
>at which to
>apply the optimizations is Avail's decision, but an
>optimization may provide
>hints about where to expect the optimization to be profitable.
I do suspect that my high-level language will try to be clever about
optimizations, by the way. I'll have to see how much progress is made on
the concatenative languages front while I'm finishing my low-level language;
if someone else has already made a good enough concatenative language which
handles that sort of problem, I'll skip it.
>>>For a terrifically bad example, look at the way
>>>inlining is specified in C++. An inlining hint actually alters the
>>>semantics!
>> C++ is, as usual, a spectacularly bad example. The problems
>> you're looking
>> at there involve the fact that the programmer can apply
>> semantics-changing
>> optimizations *inconsistently*. In my low-level language,
>> I'm going to have
>> a set of semantics-changing optimizations which will be applied
>> automatically and consistently -- for example, tail calls
>> will be optimised
>> to jumps. Because it happens every time, it'll be perfectly
>> predictable.
>The tail call is atypical.
What does that mean? Are you saying that few functions end with a call to
another function??
>I've been considering almost from the start
>whether to support tail-call elimination in Avail. My current
>choice is no,
>but certain patterns of calls can still be tail-call optimized
>if they are
>able to reconstruct the "effective" stack on demand. I may
>simply change
>the execution model so that tail calls happen every time. I
>call this a
>transformation (or just a different definition of the semantic
>model for Avail), but not an optimization.
But it clearly is an optimization -- it makes the program execute faster.
It's also clearly a semantic transformation as well, since it makes
debugging totally different.
>>>In my current work (large scale databases) it sure would be
>>>nice, but this
>>>is a very rare exception. I doubt, however, that Forth would
>>>significantly
>>>improve disk latency within OODB requests (certainly not 1000x).
>> It certainly wouldn't -- but Chuck would try to find
>> something to replace
>> the entire OODB with which would solve the specific problem
>> 1000x faster.
>> That's the point of his work.
>Ah, but the real bottom line is the total cost to our clients.
> If 1000x
>doesn't significantly improve what they can do with the
>system, then finding
>a way to rewrite it might not be worthwhile. Deciding when to upgrade
>hardware has the same kind of trade-off.
Of COURSE. This is blatantly obvious. But it's also blindingly obvious
that it's better to be able to do something than to not be able to do it.
>>>>>Ah, but that assumes you will never be asked to solve the
>>>>>general problem.
>>>> No, it assumes _nothing_. That's the beauty of it. It only
>>>> solves the
>>>> problem at hand -- and it leaves you with _real_ experience
>>>> of what the
>>>> specific problem is actually like. If you are later asked
>>>> to solve the
>>>> general problem (and that does sometimes happen), you'll be able to
>>>> confidently write the general solution, knowing that you
>>>> have a specific
>>>> solution to test it against (or perhaps even to base it on).
>>>> If you aren't
>>>> asked for that, you haven't spent any extra time or effort on it.
>>>Ok. One test case (or two, as XP implies) will help make sure of the
>>>applicability of a general solution.
>> Specific, not general. You can't possibly test the general
>> case without
>> first solving *all* of the specific cases.
>You mean you can't *cover* the general case without testing
>the specific ones. Testing isn't always about coverage.
I'm not sure why this matters. I mean that you can't test the general
solution in any way without at least inventing some specific problem under
which the solution can be tested.
>>>But the specificity (and complexity)
>>>of these cases might be much higher than similarly "contrived"
>>>test cases
>>>would have. Also, after the general solution is implemented,
>>>the specific
>>>solutions would have to be completely rewritten within the
>>>context of the general solution to be useful as test cases.
>> I may be missing something, but why would they have to be
>> rewritten? On the
>> contrary, you don't want to rewrite them -- they're known to
>> work, so you
>> can run them, and then run your general solution, and
>> compare the outputs.
>Ah, as a cross-check. It seems kind of strange to keep that
>much redundancy
>in the system just for testing, but a cross-check is
>impossible without it. Quite the dilemma.
Dilemma? Not at all. It's the difference between code that's tested and
code that isn't. Not much of a question, especially when the "redundant"
code is already written.
>>>>>I disagree with the level of "fix it up later" that XP
>>>>>advocates.
>>>> And I strongly agree -- with the caveat that "fix it"
>>>> implies that it's
>>>> broken now, which is the opposite of what XP advocates.
>>>I did intentionally avoid saying "fix it later", and chose to insert
>>>"...up...", trying to give it a connotation of something like
>>>cleanliness,
>>>luxury, or maybe attention to detail. Yes, make sure everything is
>>>maximally tested (and working) early on, keeping it that way.
>> XP doesn't recommend skimping on attention to detail, or
>> pretty code, or
>> anything like that. Frankly, I don't see how you can
>> justify implying --
>> no, not implying, but rather explicitly stating -- that about XP.
>Hm. By cleanliness I mean semantic cleanliness, which is kind
>of related to
>the ability to reuse something rapidly and safely. In my
>experience that
>usually involves *planned* genericity, something XP seems (to me) to
>outright avoid. The ruthless refactoring for genericity would
>only happen
>when duplication of functionality would be the alternative. Am I
>misinformed about this aspect of XP?
I believe you are. Reuse is one of the primary rules of XP, but they also
assume that no code will be useful for reuse the first time it's written.
No matter how hard you work on the design-for-reuse, you're going to be
designing for the wrong thing; and when it comes time to actually do the
reuse, you'll have to do all that work over again anyhow, and the code you
wrote earlier to "facilitate reuse" will almost certainly just get in the
way, and will at LEAST be a distraction and a temptation.
It's FAR better to just design for USE, and then when it comes time to
reuse, be ready to refactor that code. By the second time you reuse the
code, you'll probably find that it's much easier to refactor. By the third
time, you probably won't have to do any refactoring at all. And there you
are -- you've reached your goal of reuse, AND you've made money on contracts
without spending time on reuse.
>>>No, I don't
>>>think one should do that to the exclusion of elegant design from
>>>generalization.
>> And I think design from generalization (elegant or
>> otherwise) is the worst
>> mistake you can make, and you should go to almost any
>> lengths to avoid it.
>> Design from generalization solves many problems which you
>> don't have, almost
>> always adds bloat, and makes claims which are inherently untestable.
>> It's much better to design a solution which solves the
>> problem you have.
>My own coding experience strongly indicates otherwise, except
>that more than
>half of the "generic" code I've seen happens to be total crap.
Fits my theory.
> Maybe the
>problem is that anyone can *try* to write generic code, but
>few have the
>knack for it. When I see it done well (which is fairly common
>in Smalltalk code) it saves me a *lot* of effort (i.e., cost).
But Smalltalk has been built up over *years*. They've had time to refactor
all of their solutions. The solutions could easily have started as specific
ones, and then been generalised during use.
>Why would I want some
>almost useless specific solution in the library I'm using?
Specific solutions don't belong in libraries. Generic solutions don't
belong in applications.
>Maybe that's the
>root of our difference in experience: Smalltalk is mostly
>about reusing and
>building libraries (in that order), whereas most other
>languages are about
>writing code. Outside of Smalltalk, I concede that your claims are
>perfectly realistic. Pretty much *all* C++ code is crap,
>*especially* the
>stuff that tries to be generic. Java doesn't have a single feature to
>support genericity, which makes it even worse than C++ for reuseable
>solutions. And these are the "big OO" languages! Scary. I'm
>glad I don't
>have to stick my hands in *those* cages very often.
Smalltalk is so good at that *because* it has good libraries. Why does it
have good libraries? Simply because Smalltalkers were willing to modify bad
libraries.
>>>Shorter code does not imply fewer people are required for
>>>maintenance.
>> Yes, it does.
>No, there is a moderate correlation, not a strict implication.
> Ten lines of
>APL is not easier to maintain than eleven lines of Smalltalk (probably
>ever).
Whoah! Do you have ANY idea how much more those ten lines of APL code are
doing than the ten lines of Smalltalk code? I'd make a guess that APL is at
LEAST ten times more dense than Smalltalk. At least. In fact, I'd guess
that a displayed page of Smalltalk code (80x25) would fit in a line of APL.
That may or may not affect maintainability (it makes readability harder, but
because APL isn't really harder to work with than Smalltalk, I would judge
that it makes maintainability easier; at least by enough to balance things
out).
So you're just comparing apples and oranges. Ten lines of Smalltalk is
easier to maintain than eleven lines of Smalltalk. (I don't care about
lines of code; you're the one who implied that shorter code == fewer lines
of code.)
There's not merely a strong correlation, but an absolute certainty. Smaller
code is easier to reason about, harder to introduce non-obvious errors into,
and easier to locate mistakes in.
>>>With
>>>any refactoring tools (even a multi-file search & replace) one
>>>person can
>>>maintain denormalized code within 10x the cost of normalized
>>>code.
>> I'm sorry, I've never heard that word used that way.
>> Normalized code?
>The "Once-and-only-once" pattern. Programming as compression.
Ah. That's not what I'm talking about, of course.
> This theme
>has shown up in many forms over the years. Really, it just
>means that to
>make most changes, only local edits are required (see work
>circa 1995 on "change cases").
Unrelated, but there's some current work going on in "aspect oriented
programming". See www.aspectj.com for more info.
>Over the last few years I've changed my
>viewpoint from
>striving for "maximally compressed" to "minimal cost of
>change".
Reasonable. Once you have a good knowledge of refactoring, making these
kinds of choices is natural.
>Since the
>likely directions of changes have to be predicted up-front,
>this implies (to
>me) that a good factoring is one in which most of the code is
>as generic as
>possible. YMMV in other languages.
You actually want the top-level code to be completely problem-specific, in
syntax as well as terms. Ideally, a domain expert with only minimal
knowledge of the language could correct your top-level code.
Anyhow.
I don't see how this at all implies generic code. You want code that's easy
to refactor, but for other reasons you don't want that code to also be
generic, since generic code requires extra work which you don't have time to
test (and therefore will be wrong).
Refactoring has some interesting dimensions in concatenative languages.
It's VERY easy to perform the "Extract Function" refactoring, since there
are no local variables; in fact, it's a trivial cut-and-paste. "Inline
Method" is also trivial, and splitting a method down the middle and putting
one half in one function and another half elsewhere becomes quite reasonable
as a single-step refactoring.
-Billy