Quoting versus embedding [Was Re: [stack] Digest Number 28]

Mark van Gulik — 2000-06-07 04:31:08

From: wtanksley@...
> From: Mark van Gulik [mailto:ghoul6@...]
>>wtanksley@... wrote:
>>> From: Mark van Gulik [mailto:ghoul6@...]
>>>>Massimo Dentico <m.dentico@...> wrote:
>
>>>>In other words, Smalltalk works much, much better
>>>>when building upwards than when building downwards.
>
>>> I'm not sure how to take this. Do you mean that Smalltalk is good at
>>> bottom-up design? Forth is also renowned for that.
>
>>No, I was referring to building higher level constructs on top
>>of existing
>>ones, versus rebuilding lower level constructs to better
>>support existing ones.
>
> How does that differ from bottom-up design? It sounds identical.

I think bottom-up usually refers to designing the components that are
predicted to be needed before building the higher level constructs that will
ultimately require them. I'm talking about reworking an existing system.
Maybe it's my Smalltalk mentality showing through. You get a system working
(you aim for getting end-to-end completeness even if lots of functionality
is totally faked), and then (A) fill in the pieces of the implementation
that you faked, and simultaneously (B) rearchitect the high level pieces now
that you know what the lower level pieces must look like (even if they're
just "hacked" sketches). This allows inside-outward design and
implementation to occur simultaneously. This really, really works well in
practice.

I guess bottom-up is close enough to the "building upwards" that I was
talking about.

>>>>I can understand *implementing* another language in Forth, but
>>>>embedding it seems kind of awkward.
...
>>In Avail syntax I would probably be able to embed the syntax directly,
>>without resorting to quoting (even though the quotes have effectively
>>dissolved by runtime in your example).
>
> I'm mildly confused. First, you seem to have confused the presence of
> quotes with the action of quoting. There's no quoting going on; I could
> have just as easily named the word "I(" and had it parse all the input, and
> stop when it reached an unbalanced ")" (which would appear to be a matching
> parenthesis to the reader). The syntax IS directly embedded; Forth knows
> nothing of it.
>
> : myWord i( 3*4/x+sin(x)) ;
>
> I could also write a C compiler which simply parsed the entire rest of the
> file as a C program.

Yes, but you couldn't write it as:

: myWord 3*4/x+sin x ;

I think that for Forth the quoted version is more appropriate, but for Avail
I strive for the unquoted form (without the delimiters). When you said
there was no quoting going on, I think what you meant was there was no
*string manipulation* (other than token scanning).


>>[...trivial code that will fail with GC but succeed without...]
>>> You seem to be confusing GC with memory protection. In a
>>> language which
>>> allows stores to arbitary addresses, memory corruption is
>>> always possible.
>>> You don't need to add GC to allow it, and adding GC doesn't
>>> change anything about it.
>
>>Ah, but if GC turns a working program into a non-working one,
>>then GC is the
>>*cause* of the failure -- i.e., the GC is not working. If you
>>restrict the
>>language features to allow GC to function in all cases, you've
>>effectively
>>implemented GC on a subset of the language, which is different
>>from (but not
>>necessarily worse than) supporting GC in the full language.
>
> I think I see what you're saying. Yes, that makes sense.
>
> Fortunately, it's easy enough to fix -- simply make MALLOC and FREE the only
> words which can't be used in GC-able Forth programs. This fits perfectly
> with what all other GC-able languages do, and it harms almost no Forth
> programs.
>
> You can get more sophisticated, but as you point out, getting more
> sophisticated adds edge effects. This solution is certain to work.

Maybe I'm missing something, but isn't MALLOC the only thing that GC should
have any effect whatsoever on (i.e., making sure a ton of unreachable data
isn't preventing MALLOC from succeeding)? Or do you mean you would provide
an alternative like GC-MALLOC or something. If so, that sounds like a
perfectly reasonble place to draw the line of compromise. You'll still have
to worry about reference policing, I think. Does Forth support casting,
union types, tagged variant records, reinterpretation of pointers as
integers, etc? If so, your line might not be as straight as you thought.


>>Personally, I never disable assertions except one-at-a-time,
>>and only when looking at actual times with a profiler.
>
> I can't imagine distributing assertion-loaded code to users. The assertions
> are of no use to them. However, as I mentioned, assertions have their
> problems as well, and a big one is disabling them, as you mention.

If all that the assertions can do is to pop up some annoying little dialog
box indicating that all is not well, then the assertions are probably worse
than useless. On the other hand, in a deployed system a failed assertion
should cause a stack trace to be dumped and the program should *stop
immediately* if there's a chance of corrupting data outside its direct
control. For example, what would you do if an assertion just detected that
your pragram is about to try to index an array with an out-of-bounds
subscript? Since in practice a program is never complete (a new version in
the works is the normal situation, not the exception), one should try to
capture failure cases like this from the deployed sites, together with the
context of failure (as much as possible, hence the stack trace). These
failure cases are probably the top priority for the next release, and if
possible they should be added to regression tests.

When a programming error makes it past testing and is detected in a deployed
system, aborting the program is almost always preferrable to continuing to
run with known incorrect data. Even for ultra-high reliability situations
the best solution tends to be abort-and-restart.


>>Avail doesn't necessarily keep secret which optimizations have been
>>performed.
>
> Yes, but it does them without asking you. My initial language won't do any
> optimizations except for ones which cause semantics changes; if you want to
> apply space or time optimizations, you have to apply them yourself.

I think (hope) you phrased that backwards ("...won't do any optimizations
except for ones which cause semantics changes"). Optimizations are *never*
supposed to change the semantics of a program. If a transformation causes
the semantics to change, it's not an optimization, it's a *required*
transformation. Optimizations are always supposed to be optional.


>>BTW, I don't recommend
>>leaving it up to the application-level programmer to decide
>>which areas to apply optimizations.
>
> I'm actually not -- it's up to the systems programmer. If anyone writes an
> application in my initial low-level language, they'll actually be writing a
> system, and deserve what they get :-); for example, they won't get any GC.

You may be surprised, but in your context I agree 100% with this division of
responsibility. In Avail I intend that optimizations are provided by a
systems-level programmer (or theoretician!). The exact places at which to
apply the optimizations is Avail's decision, but an optimization may provide
hints about where to expect the optimization to be profitable.

It's interesting that you consider GC an optimization. In theory it is,
although in practice it usually makes such a big difference that it isn't.
Decades ago someone said something to the effect that "GC can simply be
avoided entirely by using virtual memory. Objects that are garbage won't be
accessed (by definition), so they will move to disk and stay there. At a
100KB per second, a 100MB disk will take 20 minutes to fill up, at which
point other techniques can be used." I think those are the approximate
numbers that appeared in the original quotation. Nowadays, I would guess
that 10MB per second of allocations would need 100GB of disk to adequately
postpone GC (for 3 hours). Unfortunately, 10MB per second of continuous
disk activity is probably not appropriate, even on its own spindles (and
everyone would agree that 100GB is still too expensive inside a Coke
machine).


>>For a terrifically bad example, look at the way
>>inlining is specified in C++. An inlining hint actually alters the
>>semantics!
>
> C++ is, as usual, a spectacularly bad example. The problems you're looking
> at there involve the fact that the programmer can apply semantics-changing
> optimizations *inconsistently*. In my low-level language, I'm going to have
> a set of semantics-changing optimizations which will be applied
> automatically and consistently -- for example, tail calls will be optimised
> to jumps. Because it happens every time, it'll be perfectly predictable.

The tail call is atypical. I've been considering almost from the start
whether to support tail-call elimination in Avail. My current choice is no,
but certain patterns of calls can still be tail-call optimized if they are
able to reconstruct the "effective" stack on demand. I may simply change
the execution model so that tail calls happen every time. I call this a
transformation (or just a different definition of the semantic model for
Avail), but not an optimization.


>>In my current work (large scale databases) it sure would be
>>nice, but this
>>is a very rare exception. I doubt, however, that Forth would
>>significantly
>>improve disk latency within OODB requests (certainly not 1000x).
>
> It certainly wouldn't -- but Chuck would try to find something to replace
> the entire OODB with which would solve the specific problem 1000x faster.
> That's the point of his work.

Ah, but the real bottom line is the total cost to our clients. If 1000x
doesn't significantly improve what they can do with the system, then finding
a way to rewrite it might not be worthwhile. Deciding when to upgrade
hardware has the same kind of trade-off.


>>>>Ah, but that assumes you will never be asked to solve the
>>>>general problem.
>
>>> No, it assumes _nothing_. That's the beauty of it. It only
>>> solves the
>>> problem at hand -- and it leaves you with _real_ experience
>>> of what the
>>> specific problem is actually like. If you are later asked
>>> to solve the
>>> general problem (and that does sometimes happen), you'll be able to
>>> confidently write the general solution, knowing that you
>>> have a specific
>>> solution to test it against (or perhaps even to base it on).
>>> If you aren't
>>> asked for that, you haven't spent any extra time or effort on it.
>
>>Ok. One test case (or two, as XP implies) will help make sure of the
>>applicability of a general solution.
>
> Specific, not general. You can't possibly test the general case without
> first solving *all* of the specific cases.

You mean you can't *cover* the general case without testing the specific
ones. Testing isn't always about coverage.


>>But the specificity (and complexity)
>>of these cases might be much higher than similarly "contrived"
>>test cases
>>would have. Also, after the general solution is implemented,
>>the specific
>>solutions would have to be completely rewritten within the
>>context of the
>>general solution to be useful as test cases.
>
> I may be missing something, but why would they have to be rewritten? On the
> contrary, you don't want to rewrite them -- they're known to work, so you
> can run them, and then run your general solution, and compare the outputs.

Ah, as a cross-check. It seems kind of strange to keep that much redundancy
in the system just for testing, but a cross-check is impossible without it.
Quite the dilemma.


>>Getting it right the first time (or soon thereafter) is what
>>professionals do.
>
> Which professionals would that be? I haven't met any of them. I thought
> professionals were simply people who were willing to be identified by what
> they do.

You got me. Actually, "mastery" is the correct term for getting it right
most of the time.


>>>>I disagree with the level of "fix it up later" that XP
>>>>advocates.
>
>>> And I strongly agree -- with the caveat that "fix it"
>>> implies that it's
>>> broken now, which is the opposite of what XP advocates.
>
>>I did intentionally avoid saying "fix it later", and chose to insert
>>"...up...", trying to give it a connotation of something like
>>cleanliness,
>>luxury, or maybe attention to detail. Yes, make sure everything is
>>maximally tested (and working) early on, keeping it that way.
>
> XP doesn't recommend skimping on attention to detail, or pretty code, or
> anything like that. Frankly, I don't see how you can justify implying --
> no, not implying, but rather explicitly stating -- that about XP.

Hm. By cleanliness I mean semantic cleanliness, which is kind of related to
the ability to reuse something rapidly and safely. In my experience that
usually involves *planned* genericity, something XP seems (to me) to
outright avoid. The ruthless refactoring for genericity would only happen
when duplication of functionality would be the alternative. Am I
misinformed about this aspect of XP?


>>No, I don't
>>think one should do that to the exclusion of elegant design from
>>generalization.
>
> And I think design from generalization (elegant or otherwise) is the worst
> mistake you can make, and you should go to almost any lengths to avoid it.
> Design from generalization solves many problems which you don't have, almost
> always adds bloat, and makes claims which are inherently untestable.
>
> It's much better to design a solution which solves the problem you have.

My own coding experience strongly indicates otherwise, except that more than
half of the "generic" code I've seen happens to be total crap. Maybe the
problem is that anyone can *try* to write generic code, but few have the
knack for it. When I see it done well (which is fairly common in Smalltalk
code) it saves me a *lot* of effort (i.e., cost). Why would I want some
almost useless specific solution in the library I'm using? Maybe that's the
root of our difference in experience: Smalltalk is mostly about reusing and
building libraries (in that order), whereas most other languages are about
writing code. Outside of Smalltalk, I concede that your claims are
perfectly realistic. Pretty much *all* C++ code is crap, *especially* the
stuff that tries to be generic. Java doesn't have a single feature to
support genericity, which makes it even worse than C++ for reuseable
solutions. And these are the "big OO" languages! Scary. I'm glad I don't
have to stick my hands in *those* cages very often.


>>Shorter code does not imply fewer people are required for
>>maintenance.
>
> Yes, it does.

No, there is a moderate correlation, not a strict implication. Ten lines of
APL is not easier to maintain than eleven lines of Smalltalk (probably
ever). I'm not even certain that there is a causal relation, although my
experience seems to imply one. Ruthless refactoring usually shrinks code,
makes it more maintainable, and reduces the need to do any maintenance at
all on it. If my goal at the outset is to refactor to shrink the code, both
maintenance-related effects are normally observed afterwards.


>>With
>>any refactoring tools (even a multi-file search & replace) one
>>person can
>>maintain denormalized code within 10x the cost of normalized
>>code.
>
> I'm sorry, I've never heard that word used that way. Normalized code?

The "Once-and-only-once" pattern. Programming as compression. This theme
has shown up in many forms over the years. Really, it just means that to
make most changes, only local edits are required (see work circa 1995 on
"change cases"). Over the last few years I've changed my viewpoint from
striving for "maximally compressed" to "minimal cost of change". Since the
likely directions of changes have to be predicted up-front, this implies (to
me) that a good factoring is one in which most of the code is as generic as
possible. YMMV in other languages.


>>There
>>are much bigger issues to the maintenance cost, e.g., how much early
>>generalization was present (also called Frameworks), etc.
>
> Ah, well, yes. If your code is a nest of cut'n'paste design it's quite
> possible that you'll be able to spend time writing tools to help you
> maintain your code -- and if the cut'n'paste code is done in a standard way,
> you can even buy tools to do it for you.

Many GUI builders have exactly this nasty effect. I like to call them
"manual screen painters". As you can tell, I prefer algorithmic layout.
Much more normalized :-).

stevan apter — 2000-06-07 13:23:49

>
> >>Shorter code does not imply fewer people are required for
> >>maintenance.
> >
> > Yes, it does.
>
> No, there is a moderate correlation, not a strict implication. Ten lines of
> APL is not easier to maintain than eleven lines of Smalltalk (probably
> ever).

maintainability is holistic and context-dependent. even such seemingly
irrelvant factors as whether developers eat lunch at their desks or vacate
to the cafeteria may have an impact. nothing concentrates the mind so much
as programming on a busy trading floor, elbow to elbow with screaming
traders and their nubile assistants. ;-)

wtanksley@bigfoot.com — 2000-06-20 23:50:36

[in which we discuss how to build programs in concatenative languages -- and
in general, in any language]

From: Mark van Gulik [mailto:ghoul6@...]
>From: wtanksley@...
>> From: Mark van Gulik [mailto:ghoul6@...]
>>>wtanksley@... wrote:
>>>> From: Mark van Gulik [mailto:ghoul6@...]
>>>>>Massimo Dentico <m.dentico@...> wrote:

>>>>>In other words, Smalltalk works much, much better
>>>>>when building upwards than when building downwards.

>>>> I'm not sure how to take this. Do you mean that Smalltalk
>>>> is good at bottom-up design? Forth is also renowned for that.

>>>No, I was referring to building higher level constructs on top
>>>of existing
>>>ones, versus rebuilding lower level constructs to better
>>>support existing ones.

>> How does that differ from bottom-up design? It sounds identical.

>I think bottom-up usually refers to designing the components that are
>predicted to be needed before building the higher level
>constructs that will ultimately require them.

That's one way of doing design -- I've usually heard it called things like
"meet-in-middle", since you have to start by designing top-down in order to
know what's needed.

Forth is usually used in true bottom-up design, where you write (and design)
the components which do work before you write and/or design the components
which use them.

>I'm talking about reworking an existing system.

Refactoring in Forth is usually done in a bit-by-bit way: rather than
"reworking a system" you refactor parts of it as needed.

>Maybe it's my Smalltalk mentality showing through. You get a
>system working
>(you aim for getting end-to-end completeness even if lots of
>functionality
>is totally faked), and then (A) fill in the pieces of the
>implementation
>that you faked, and simultaneously (B) rearchitect the high
>level pieces now
>that you know what the lower level pieces must look like (even
>if they're
>just "hacked" sketches). This allows inside-outward design and
>implementation to occur simultaneously. This really, really
>works well in practice.

That's definitely NOT bottom-up; it's almost pure top-down design. The
reason it works so well is that you're permitting bottom-up concerns to
affect your already-written code by means of refactoring.

It's also used reasonably often in Forth -- one guy (whose name I've
forgotten) writes certified software by a similar method.

>>>>>I can understand *implementing* another language in Forth, but
>>>>>embedding it seems kind of awkward.
>...
>>>In Avail syntax I would probably be able to embed the syntax
>>>directly,
>>>without resorting to quoting (even though the quotes have effectively
>>>dissolved by runtime in your example).

>> I'm mildly confused. First, you seem to have confused the
>> presence of
>> quotes with the action of quoting. There's no quoting going
>> on; I could
>> have just as easily named the word "I(" and had it parse all
>> the input, and
>> stop when it reached an unbalanced ")" (which would appear
>> to be a matching
>> parenthesis to the reader). The syntax IS directly
>> embedded; Forth knows nothing of it.

>> : myWord i( 3*4/x+sin(x)) ;

>Yes, but you couldn't write it as:

> : myWord 3*4/x+sin x ;

>I think that for Forth the quoted version is more appropriate,
>but for Avail I strive for the unquoted form (without the
>delimiters).

Unless I'm seeing something wrong, you're boasting that Avail can read its
own syntax without using quotes. That's so obvious it's not even
commendable -- why would anyone want a language which couldn't handle its
own syntax? The impressive thing about Forth is that it can read languages
which have no relation to its own syntax, and which are even contradictory
to its syntax. Infix is natural for Avail. Infix is NOT natural for Forth.

Let me give you an example of a language which is natural for neither
language: Befunge. Check out
http://www.catseye.mb.ca/esoteric/befunge/index.html for info.

Here's a Forth program (dependant on the 'Befunge:' library word) to print
out the complete lyrics to the immortal ode '99 bottles of beer on the
wall.'

Befunge: ================================
9::*\2*+00p0v"."0<
>310p0"," >"llaw eht no "v >#v_ ^
^_210p0"--:" v ,
: v " of beer" < :
- >"selttob"00g.^ < <
1 >00g1-#^_$" elttob erom enO" ^
>00g#^_$" selttob erom oN" ^
^_110p0",dnuora ti ssap ,nwod eno ekaT"^
^:-1_010p00g1-00pvv:-1g01_@#g00,*25<
^ <
=========================================


>When you said
>there was no quoting going on, I think what you meant was there was no
>*string manipulation* (other than token scanning).

No, I meant that there's no feature of the language being invoked which
tells the language parser to treat the text specially in any way.

You seem to mean that there should be no symbols surrounding the text which
represents a chunk of code from a foreign language. I don't like your
definition, because it doesn't seem to be useful. However, Forth could also
handle such a word -- consider the word 'expr:', which uses the same parsing
engine as 'i"' above. Unlike i", expr: doesn't search for a closing quote;
instead, it simply parses exactly one expression from the following source.
Usage:

: myWord expr: 3 ;
: myWord2 expr: sin ;
: myWord3 expr: (sin x) ;

As you can see, the problem here is that almost any legal lexeme is also a
legal expression.

>>>[...trivial code that will fail with GC but succeed without...]

>> Fortunately, it's easy enough to fix -- simply make MALLOC
>> and FREE the only
>> words which can't be used in GC-able Forth programs. This
>> fits perfectly
>> with what all other GC-able languages do, and it harms
>> almost no Forth programs.

>> You can get more sophisticated, but as you point out, getting more
>> sophisticated adds edge effects. This solution is certain to work.

>Maybe I'm missing something, but isn't MALLOC the only thing
>that GC should
>have any effect whatsoever on (i.e., making sure a ton of
>unreachable data
>isn't preventing MALLOC from succeeding)? Or do you mean you
>would provide
>an alternative like GC-MALLOC or something. If so, that sounds like a
>perfectly reasonble place to draw the line of compromise.
>You'll still have
>to worry about reference policing, I think. Does Forth
>support casting,
>union types, tagged variant records, reinterpretation of pointers as
>integers, etc? If so, your line might not be as straight as
>you thought.

No to all your questions, because malloc is the only Forth function which is
capable of returning a conventional pointer. Anything else in the Forth
language returns not a pointer, but rather an index into the dictionary.

Yes, we'd have to have a GC-compatible malloc equivalent. This way most
malloc programs will be easy to translate: just change 'alloc' to
'gc-alloc'. Some will give you grief; you'll have to remove pointer games
(but the reason for their incompatibility wasn't the games, it was the fact
that they used malloc at all).

>>>Avail doesn't necessarily keep secret which optimizations have been
>>>performed.

>> Yes, but it does them without asking you. My initial
>> language won't do any
>> optimizations except for ones which cause semantics changes;
>> if you want to
>> apply space or time optimizations, you have to apply them yourself.

>I think (hope) you phrased that backwards ("...won't do any
>optimizations
>except for ones which cause semantics changes").
>Optimizations are *never*
>supposed to change the semantics of a program. If a
>transformation causes
>the semantics to change, it's not an optimization, it's a *required*
>transformation. Optimizations are always supposed to be optional.

I don't see anything in the morphism of the word "optimization" which says
anything about optionality or semantics changes. However, I have no problem
with using the phrase "required transformation", so long as I don't have to
use it too often :-).

My compiler will by default do NO "optimizations" (in your sense); it will
only do optimising "required transformations".

If you want to have your program optimised, you have to ask for it
explicitly. Fortunately, I plan on making that easy: since the code
generator is part of the runtime, you'll be able to write your own
optimisers (or use prewritten ones) with complete impunity.

>>>BTW, I don't recommend
>>>leaving it up to the application-level programmer to decide
>>>which areas to apply optimizations.

>> I'm actually not -- it's up to the systems programmer. If
>> anyone writes an
>> application in my initial low-level language, they'll
>> actually be writing a
>> system, and deserve what they get :-); for example, they
>> won't get any GC.

>You may be surprised, but in your context I agree 100% with
>this division of
>responsibility. In Avail I intend that optimizations are provided by a
>systems-level programmer (or theoretician!). The exact places
>at which to
>apply the optimizations is Avail's decision, but an
>optimization may provide
>hints about where to expect the optimization to be profitable.

I do suspect that my high-level language will try to be clever about
optimizations, by the way. I'll have to see how much progress is made on
the concatenative languages front while I'm finishing my low-level language;
if someone else has already made a good enough concatenative language which
handles that sort of problem, I'll skip it.

>>>For a terrifically bad example, look at the way
>>>inlining is specified in C++. An inlining hint actually alters the
>>>semantics!

>> C++ is, as usual, a spectacularly bad example. The problems
>> you're looking
>> at there involve the fact that the programmer can apply
>> semantics-changing
>> optimizations *inconsistently*. In my low-level language,
>> I'm going to have
>> a set of semantics-changing optimizations which will be applied
>> automatically and consistently -- for example, tail calls
>> will be optimised
>> to jumps. Because it happens every time, it'll be perfectly
>> predictable.

>The tail call is atypical.

What does that mean? Are you saying that few functions end with a call to
another function??

>I've been considering almost from the start
>whether to support tail-call elimination in Avail. My current
>choice is no,
>but certain patterns of calls can still be tail-call optimized
>if they are
>able to reconstruct the "effective" stack on demand. I may
>simply change
>the execution model so that tail calls happen every time. I
>call this a
>transformation (or just a different definition of the semantic
>model for Avail), but not an optimization.

But it clearly is an optimization -- it makes the program execute faster.
It's also clearly a semantic transformation as well, since it makes
debugging totally different.

>>>In my current work (large scale databases) it sure would be
>>>nice, but this
>>>is a very rare exception. I doubt, however, that Forth would
>>>significantly
>>>improve disk latency within OODB requests (certainly not 1000x).

>> It certainly wouldn't -- but Chuck would try to find
>> something to replace
>> the entire OODB with which would solve the specific problem
>> 1000x faster.
>> That's the point of his work.

>Ah, but the real bottom line is the total cost to our clients.
> If 1000x
>doesn't significantly improve what they can do with the
>system, then finding
>a way to rewrite it might not be worthwhile. Deciding when to upgrade
>hardware has the same kind of trade-off.

Of COURSE. This is blatantly obvious. But it's also blindingly obvious
that it's better to be able to do something than to not be able to do it.

>>>>>Ah, but that assumes you will never be asked to solve the
>>>>>general problem.

>>>> No, it assumes _nothing_. That's the beauty of it. It only
>>>> solves the
>>>> problem at hand -- and it leaves you with _real_ experience
>>>> of what the
>>>> specific problem is actually like. If you are later asked
>>>> to solve the
>>>> general problem (and that does sometimes happen), you'll be able to
>>>> confidently write the general solution, knowing that you
>>>> have a specific
>>>> solution to test it against (or perhaps even to base it on).
>>>> If you aren't
>>>> asked for that, you haven't spent any extra time or effort on it.

>>>Ok. One test case (or two, as XP implies) will help make sure of the
>>>applicability of a general solution.

>> Specific, not general. You can't possibly test the general
>> case without
>> first solving *all* of the specific cases.

>You mean you can't *cover* the general case without testing
>the specific ones. Testing isn't always about coverage.

I'm not sure why this matters. I mean that you can't test the general
solution in any way without at least inventing some specific problem under
which the solution can be tested.

>>>But the specificity (and complexity)
>>>of these cases might be much higher than similarly "contrived"
>>>test cases
>>>would have. Also, after the general solution is implemented,
>>>the specific
>>>solutions would have to be completely rewritten within the
>>>context of the general solution to be useful as test cases.

>> I may be missing something, but why would they have to be
>> rewritten? On the
>> contrary, you don't want to rewrite them -- they're known to
>> work, so you
>> can run them, and then run your general solution, and
>> compare the outputs.

>Ah, as a cross-check. It seems kind of strange to keep that
>much redundancy
>in the system just for testing, but a cross-check is
>impossible without it. Quite the dilemma.

Dilemma? Not at all. It's the difference between code that's tested and
code that isn't. Not much of a question, especially when the "redundant"
code is already written.

>>>>>I disagree with the level of "fix it up later" that XP
>>>>>advocates.

>>>> And I strongly agree -- with the caveat that "fix it"
>>>> implies that it's
>>>> broken now, which is the opposite of what XP advocates.

>>>I did intentionally avoid saying "fix it later", and chose to insert
>>>"...up...", trying to give it a connotation of something like
>>>cleanliness,
>>>luxury, or maybe attention to detail. Yes, make sure everything is
>>>maximally tested (and working) early on, keeping it that way.

>> XP doesn't recommend skimping on attention to detail, or
>> pretty code, or
>> anything like that. Frankly, I don't see how you can
>> justify implying --
>> no, not implying, but rather explicitly stating -- that about XP.

>Hm. By cleanliness I mean semantic cleanliness, which is kind
>of related to
>the ability to reuse something rapidly and safely. In my
>experience that
>usually involves *planned* genericity, something XP seems (to me) to
>outright avoid. The ruthless refactoring for genericity would
>only happen
>when duplication of functionality would be the alternative. Am I
>misinformed about this aspect of XP?

I believe you are. Reuse is one of the primary rules of XP, but they also
assume that no code will be useful for reuse the first time it's written.
No matter how hard you work on the design-for-reuse, you're going to be
designing for the wrong thing; and when it comes time to actually do the
reuse, you'll have to do all that work over again anyhow, and the code you
wrote earlier to "facilitate reuse" will almost certainly just get in the
way, and will at LEAST be a distraction and a temptation.

It's FAR better to just design for USE, and then when it comes time to
reuse, be ready to refactor that code. By the second time you reuse the
code, you'll probably find that it's much easier to refactor. By the third
time, you probably won't have to do any refactoring at all. And there you
are -- you've reached your goal of reuse, AND you've made money on contracts
without spending time on reuse.

>>>No, I don't
>>>think one should do that to the exclusion of elegant design from
>>>generalization.

>> And I think design from generalization (elegant or
>> otherwise) is the worst
>> mistake you can make, and you should go to almost any
>> lengths to avoid it.
>> Design from generalization solves many problems which you
>> don't have, almost
>> always adds bloat, and makes claims which are inherently untestable.

>> It's much better to design a solution which solves the
>> problem you have.

>My own coding experience strongly indicates otherwise, except
>that more than
>half of the "generic" code I've seen happens to be total crap.

Fits my theory.

> Maybe the
>problem is that anyone can *try* to write generic code, but
>few have the
>knack for it. When I see it done well (which is fairly common
>in Smalltalk code) it saves me a *lot* of effort (i.e., cost).

But Smalltalk has been built up over *years*. They've had time to refactor
all of their solutions. The solutions could easily have started as specific
ones, and then been generalised during use.

>Why would I want some
>almost useless specific solution in the library I'm using?

Specific solutions don't belong in libraries. Generic solutions don't
belong in applications.

>Maybe that's the
>root of our difference in experience: Smalltalk is mostly
>about reusing and
>building libraries (in that order), whereas most other
>languages are about
>writing code. Outside of Smalltalk, I concede that your claims are
>perfectly realistic. Pretty much *all* C++ code is crap,
>*especially* the
>stuff that tries to be generic. Java doesn't have a single feature to
>support genericity, which makes it even worse than C++ for reuseable
>solutions. And these are the "big OO" languages! Scary. I'm
>glad I don't
>have to stick my hands in *those* cages very often.

Smalltalk is so good at that *because* it has good libraries. Why does it
have good libraries? Simply because Smalltalkers were willing to modify bad
libraries.

>>>Shorter code does not imply fewer people are required for
>>>maintenance.

>> Yes, it does.

>No, there is a moderate correlation, not a strict implication.
> Ten lines of
>APL is not easier to maintain than eleven lines of Smalltalk (probably
>ever).

Whoah! Do you have ANY idea how much more those ten lines of APL code are
doing than the ten lines of Smalltalk code? I'd make a guess that APL is at
LEAST ten times more dense than Smalltalk. At least. In fact, I'd guess
that a displayed page of Smalltalk code (80x25) would fit in a line of APL.

That may or may not affect maintainability (it makes readability harder, but
because APL isn't really harder to work with than Smalltalk, I would judge
that it makes maintainability easier; at least by enough to balance things
out).

So you're just comparing apples and oranges. Ten lines of Smalltalk is
easier to maintain than eleven lines of Smalltalk. (I don't care about
lines of code; you're the one who implied that shorter code == fewer lines
of code.)

There's not merely a strong correlation, but an absolute certainty. Smaller
code is easier to reason about, harder to introduce non-obvious errors into,
and easier to locate mistakes in.

>>>With
>>>any refactoring tools (even a multi-file search & replace) one
>>>person can
>>>maintain denormalized code within 10x the cost of normalized
>>>code.

>> I'm sorry, I've never heard that word used that way.
>> Normalized code?

>The "Once-and-only-once" pattern. Programming as compression.

Ah. That's not what I'm talking about, of course.

> This theme
>has shown up in many forms over the years. Really, it just
>means that to
>make most changes, only local edits are required (see work
>circa 1995 on "change cases").

Unrelated, but there's some current work going on in "aspect oriented
programming". See www.aspectj.com for more info.

>Over the last few years I've changed my
>viewpoint from
>striving for "maximally compressed" to "minimal cost of
>change".

Reasonable. Once you have a good knowledge of refactoring, making these
kinds of choices is natural.

>Since the
>likely directions of changes have to be predicted up-front,
>this implies (to
>me) that a good factoring is one in which most of the code is
>as generic as
>possible. YMMV in other languages.

You actually want the top-level code to be completely problem-specific, in
syntax as well as terms. Ideally, a domain expert with only minimal
knowledge of the language could correct your top-level code.

Anyhow.

I don't see how this at all implies generic code. You want code that's easy
to refactor, but for other reasons you don't want that code to also be
generic, since generic code requires extra work which you don't have time to
test (and therefore will be wrong).

Refactoring has some interesting dimensions in concatenative languages.
It's VERY easy to perform the "Extract Function" refactoring, since there
are no local variables; in fact, it's a trivial cut-and-paste. "Inline
Method" is also trivial, and splitting a method down the middle and putting
one half in one function and another half elsewhere becomes quite reasonable
as a single-step refactoring.

-Billy