Concatenative Languages vs Static Single Assignment

John Carter — 2006-08-21 22:56:01

With regards my earlier post about the "Killer App" for concatenative
languages.

Well, there are somethings sort of (almost, not quite) in that slot
already. LLVM and Gimple.

The bottom line is real world optimizing / transforming / rewriting
systems are using SSA.

A bundle of questions arise from this...

1) Is SSA === Concatenative. ie. Is there an isomorphism from an SSA
representation to a concatenative representation?

2) What are the pro's and con's of SSA vs Concatenative w.r.t. rewriting
and transforming. (One obvious pro is SSA can keep the source programs
variable names around.)


LLVM

http://llvm.org/

LLVM is a low-level object code representation that uses simple
RISC-like instructions, but provides rich, language-independent, type
information and dataflow (SSA) information about operands. This
combination enables sophisticated transformations on object code, while
remaining light-weight enough to be attached to the executable. This
combination is key to allowing link-time, run-time, and offline
transformations.

Gimple

http://gcc.gnu.org/onlinedocs/gcc-4.1.1/gccint/Tree-SSA.html
http://gcc.gnu.org/onlinedocs/gcc-4.1.1/gccint/SSA.htm

The SSA form is based on the premise that program variables are assigned
in exactly one location in the program. Multiple assignments to the same
variable create new versions of that variable. Naturally, actual
programs are seldom in SSA form initially because variables tend to be
assigned multiple times. The compiler modifies the program
representation so that every time a variable is assigned in the code, a
new version of the variable is created. Different versions of the same
variable are distinguished by subscripting the variable name with its
version number. Variables used in the right-hand side of expressions are
renamed so that their version number matches that of the most recent
assignment.

We represent variable versions using SSA_NAME nodes. The renaming
process in tree-ssa.c wraps every real and virtual operand with an
SSA_NAME node which contains the version number and the statement that
created the SSA_NAME. Only definitions and virtual definitions may
create new SSA_NAME nodes.

Sometimes, flow of control makes it impossible to determine what is the
most recent version of a variable. In these cases, the compiler inserts
an artificial definition for that variable called PHI function or PHI
node. This new definition merges all the incoming versions of the
variable to create a new name for it. For instance,

if (...)
a_1 = 5;
else if (...)
a_2 = 2;
else
a_3 = 13;

# a_4 = PHI <a_1, a_2, a_3>
return a_4;

Since it is not possible to determine which of the three branches will
be taken at runtime, we don't know which of a_1, a_2 or a_3 to use at
the return statement. So, the SSA renamer creates a new version a_4
which is assigned the result of \u201cmerging\u201d a_1, a_2 and a_3.
Hence, PHI nodes mean \u201cone of these operands. I don't know
which\u201d.



John Carter Phone : (64)(3) 358 6639
Tait Electronics Fax : (64)(3) 359 4632
PO Box 1645 Christchurch Email : john.carter@...
New Zealand

Carter's Clarification of Murphy's Law.

"Things only ever go right so that they may go more spectacularly wrong later."

From this principle, all of life and physics may be deduced.

William Tanksley, Jr — 2006-08-22 01:58:38

John Carter <john.carter@...> wrote:
> With regards my earlier post about the "Killer App" for concatenative
> languages.
> Well, there are somethings sort of (almost, not quite) in that slot
> already. LLVM and Gimple.

I think I recall your killer app being an intermediate language for
optimization, right? If so, yes; that's what SSA is used for now.

> The bottom line is real world optimizing / transforming / rewriting
> systems are using SSA.

Yup.

> A bundle of questions arise from this...
> 1) Is SSA === Concatenative. ie. Is there an isomorphism from an SSA
> representation to a concatenative representation?

Programs written in one can be transformed to the other. More
importantly, is there an isomorphism between the fancy optimizations
that you can express using SSA and the fancy optimizations that you
can express using a concatenative representation? I don't think so. I
think that the two are largely independant.

The problem is that SSA received a TON of research; I don't see how
it'd be possible to beat it on its own ground, even if concatenative
is entirely better on every front (which I don't believe).

> 2) What are the pro's and con's of SSA vs Concatenative w.r.t. rewriting
> and transforming. (One obvious pro is SSA can keep the source programs
> variable names around.)

No, it can't -- SSA requires a total variable rewrite.

This (pros and cons) is an incredibly technical issue that I lack the
knowledge to really properly discuss -- I've played with it only
enough to assure my understanding of my lack of knowledge. It takes a
lot more math than I have.

I do know one thing: hand written concatenative is fast and good if it
looks fast and good. That's an advantage, but it only applies to
hand-written code.

Here's another fact: concatenative code must contain _explicit_
knowledge of dataflow (how information gets from one storage place to
another). SSA doesn't need that. OTOH, SSA must contain explicit
information about storage places; concatenative code doesn't need
that. In that sense the two seem to be almost exact inverses.

> John Carter

-Billy