Local Variables, including a rationale for their use.

cpcogan — 2007-01-07 22:14:49

[Skip or just skim the following if you are tired of discussing local
variables; there's probably nothing new here. I'm offering this just
in case there is (though I plan to go back and read the stuff on this
topic, I haven't done so yet).]

[Also, let me know if I've gotten something just plain wrong; my
inexperience with actually using Joy is a real threat to my confidence
in some of the things I say below.]

I'm implementing something like Salvatore Sanfilippo's local
variables, to help reduce the tendency of Joy code to be full of stack
manipulations. I'm not sure how I will restrict their use to protect
the logical integrity of Joy, but I think that Salvatore Sanfilippo's
restrictions would definitely do this. I'm hoping for more-relaxed
restrictions to allow more use of such "variables," but I'm not sure
how far I can go without causing integrity problems in the language.
I have experimented with the factorial function to see how much stack
manipulation local variables save. The savings are pretty substantial.
Ideally, one would be able to write the code for this function in a
more-readable form than this (view this in a monospace font, like
Courier New):

fact == [1 1] dip [dup [*] dip succ] times pop

[ 1 1 ] dip    (* 1 1 N *)
[ dup          (* 1 1 1 (N has been removed by times) *)
  [*]          (* 1 1 1 [*] -- put multiply on stack *)
  dip          (* 1 1 -- remove multiply, remove top number,
                  multiply the two remaining *)
  succ         (* 1 2 -- after putting top back, get
                  successor *)
]
times          (* above list executed N times by times *)
pop            (* N+1 was left after last succ, so pop it *)
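
For anyone who wants to check that trace mechanically, here is a
minimal sketch in Python rather than Joy. It assumes the stack is a
plain Python list with the top at the end; the helper names (dip,
times, multiply, body) are made up for this sketch and only imitate
the Joy words of the same name.

```python
# Toy model of a Joy stack: a Python list whose last element is the top.
# These helpers are assumptions of the sketch, not part of any Joy system.

def dip(stack, quotation):
    """Joy `dip`: set the top item aside, run the quotation, put the item back."""
    saved = stack.pop()
    quotation(stack)
    stack.append(saved)

def times(stack, n, quotation):
    """Joy `times`: run the quotation n times."""
    for _ in range(n):
        quotation(stack)

def multiply(stack):
    """Joy `*`: replace the top two items with their product."""
    b = stack.pop()
    a = stack.pop()
    stack.append(a * b)

def body(stack):
    """The quotation [dup [*] dip succ]."""
    stack.append(stack[-1])   # dup:     P C   -> P C C
    dip(stack, multiply)      # [*] dip: P C C -> P*C C
    stack[-1] += 1            # succ:    P*C C -> P*C C+1

def fact(n):
    stack = [n]
    dip(stack, lambda s: s.extend([1, 1]))  # [1 1] dip: N -> 1 1 N
    count = stack.pop()                     # times takes N off the stack
    times(stack, count, body)
    stack.pop()                             # pop: discard the leftover counter
    return stack

print(fact(5))  # prints [120]
```

For N = 5 the counter ends at 6 (that is, N+1) and the final pop
discards it, leaving just 120, which matches the trace.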

Now, let's rewrite it with local variables, indicated by parentheses
(reserving "(*" and "*)" for comments):

fact == 1 1 ( N P C ) P N [ C * C succ ( C ) ] times
(* N = Number, P = Product, C = Counter *)

Here, the names in parens are bound to values taken off the stack
(without leaving a copy on the stack). Here's the narrative:

1 1                   (* push 1 twice *)
( N P C )             (* "capture" the top three items from the
                         stack and put them into variables named
                         N, P, and C, with C being loaded first,
                         and N last *)
P N                   (* 1 N -- copy each of these two back to the
                         stack in the order we need them *)

[ C * C succ ( C ) ]  (* push a quotation to do the work of
                         each iteration, including re-assigning
                         the incremented counter to C *)
times                 (* execute the quotation N times *)
                      (* no final pop is needed: ( C ) takes the
                         incremented counter off the stack on every
                         pass, so only the final result is left *)
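
The capture form can be modelled in Python as well. This is only a
sketch of the semantics described above, not an implementation of the
proposal: the stack is a Python list (top at the end), the locals
live in a dict, and the names capture and fact_locals are invented
for the illustration.

```python
# Toy model of the proposed capture form. Keeping the local variables in a
# dict is an assumption of this sketch; the post does not say how they are stored.

def capture(stack, env, names):
    """( A B C ): pop one value per name, rightmost name first, into env."""
    for name in reversed(names):
        env[name] = stack.pop()

def fact_locals(n):
    stack, env = [n], {}
    stack.extend([1, 1])                     # 1 1       : N 1 1
    capture(stack, env, ["N", "P", "C"])     # ( N P C ) : stack is now empty
    stack.extend([env["P"], env["N"]])       # P N       : 1 N
    count = stack.pop()                      # times takes N off the stack
    for _ in range(count):
        stack.append(env["C"])               # C     : P C
        b = stack.pop()
        a = stack.pop()
        stack.append(a * b)                  # *     : P*C
        stack.append(env["C"])               # C     : P*C C
        stack[-1] += 1                       # succ  : P*C C+1
        capture(stack, env, ["C"])           # ( C ) : P*C, with C updated
    return stack, env

stack, env = fact_locals(5)
print(stack, env["C"])  # prints [120] 6
```

Under this reading, ( C ) takes the incremented counter off the stack
on every pass, so after the loop the stack holds only the product and
C holds N+1.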

I think this is noticeably easier to read. In particular, notice the
relative lack of stack manipulation. In this case, the improvement is
not huge, but it is still worth having. Note that, in keystrokes, my
version is actually longer than the "standard" version, because of the
coding overhead of capturing the values into variables. However, I
still think it's an improvement, because I don't have to maintain
nearly as detailed an "image" of the stack in my mind.

Further, the improvement would be more noticeable in a longer function
where initial values were captured and then used many times in the
rest of the function. I think the "qroots" function might be
significantly shortened this way, because of the many re-uses of the
initial values from the stack. The variables would mean eliminating
some of the complicated stack manipulation used to recover items that
have been buried to allow other items to be worked on. Somewhere, I
have a qroots implementation that shows further reduction in size from
using local variables. If I find it, I'll pass it along for those
still interested.

Notice that the counter C is reassigned in the code passed to times. I
do worry that this may cause problems with the purity of the Joy
theory, but does it? I don't know. It would seem that, even though the
code is executed within times, the values assigned to C within times
stay there ("What happens in Vegas stays in Vegas"), so it still seems
to protect against errant side-effects and such. Or, possibly, I'm just
so naive about this sort of thing that I'm missing something obvious
that real Joy programmers would see immediately. If so, what am I
missing?

Rationale for Local Variables, from the Programmer's Point of View:

Why does this make programming easier? Because it takes what Joy wants
to treat as a one-dimensional array (the stack) and splits off parts
of it for direct access, so they no longer have to be found in the stack.

Here's a metaphor: Imagine having a stack of papers, each with some
important information on it, but you can only see the top item of the
stack. To do anything, you take the top item or two off the stack and
operate on them immediately, after which you must put them back on the
stack if you are going to need them again.

Now, to get at something further down, you have to do the equivalent
of the rollup function. To go deeper still, you have to bunch up the
topmost items into a folder (i.e., a Joy list) just to save them while
you retrieve the deeper item.

But, if you are finally given a new-fangled thing called a "desk," you
can spread out at least some of these sheets of paper and get to them
directly when you need them. Now, you don't have to keep everything in
one pile. If you have one that you frequently need to look at, you can
put it in a specific location so it'll be handy when you need it, and
you won't have to calculate where it is in the stack (or search for it
in the stack).

Basically, local variables let part of a one-dimensional structure
(the stack) be laid out in a second dimension (a set of named slots)
that is kept separate from the stack itself.
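
In code terms, the pile-versus-desk difference looks roughly like
this. It is a Python sketch with the stack as a list (top at the
end); rolldown follows Joy's X Y Z -> Y Z X, a sibling of rollup, and
the dict standing in for the desk is purely an assumption of the
sketch.

```python
# Toy comparison of positional access (the pile) and named access (the desk).

def rolldown(stack):
    """Joy `rolldown`: X Y Z -> Y Z X (brings the third item to the top)."""
    z = stack.pop()
    y = stack.pop()
    x = stack.pop()
    stack.extend([y, z, x])

# Pile: the value we want is third from the top, so we must reorder to reach it.
pile = ["a", "b", "c"]
rolldown(pile)
wanted_from_pile = pile[-1]    # "a" is on top now, but the pile has been reordered

# Desk: name the values once, then read any of them in any order.
desk = {"A": "a", "B": "b", "C": "c"}
wanted_from_desk = desk["A"]   # no reordering, no position to keep track of
```

Both routes reach the same value; the difference is that the pile
version changes the order of everything above it, which is exactly
the bookkeeping the names avoid.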

True, theoretically we don't need variables, even local variables (or
stinking loops), but local variables, at least (if not the stinking
loops), can make programming considerably easier, thereby allowing us
to tackle more-interesting projects without getting lost in stack
manipulations. In principle, we can do everything with machine code,
too, but we don't. There are reasons for giving names to things: We
don't have to remember numeric or other equally-arbitrary and
accidental "addresses" (or stack locations, etc.). And there are
cognitive reasons for moving stuff off the stack (as well as reasons
of efficiency, since doing so reduces stack manipulations).