Stack analysis (was Another implementation...)

Martin Young — 2000-08-10 13:43:44

On Aug 10, 7:40am, madonl@... wrote:
> Martin Young wrote:
> > The "analysis if P to determine arity" is provably non-computable, btw.
>
> That actually depends on how you define arity. I didn't want to
> complicate my original explanation too much, but all that the compiler
> really needs to know is the "shape" of the arity, not neccesarily the
> concrete number. For example,
>
> [pop] [[pop] times] ifte
>
> has "shape" pop (pop | pop*)
>
> | means alternation
> * means zero or more
>
> * indicates the presence of a "region" - a part of the stack that can
> grow to an arbitrary size, but for the purpose of something like fork,
> you treat the whole region as a single "value".
>
> This is just control flow analysis and it is definetly computable.

I'm still a little confused. Given a programs P1 and P2:

P1 == [ [ pop ] times ] ] (* pop N values off the stack *)
P2 == [ dup [ dup ] times ] (* push N, N+1 times *)

I /think/ the arity shape of these is:

P1 : pop*
P2 : push push*

If I run these concurrently on the stack [ 7 8 9 2 ] (where 2 is the most
recently pushed item), I will see these two stacks (S1 and S2 from P1 and P2
respectively).

S1: [ 7 ]
S2: [ 7 8 9 2 2 2 ]

What would the final combined stack look like? All we know is that P1 has
removed 0 or more items and that P2 has added zero or more. How can you tell
through static analysis how many in each case? For simple examples you could
decide at run-time by stack marking, I suppose, but this might be inefficient.

Consider the programs P3 and P4:

P3 == [ [ [ [ [ pop ] dip ] dip ] dip ] ]
p4 == 4

These have arity shapes

P4: push

The shape of P3 I find harder to form. What about:

pop pop pop pop push push push

Which describes which item was removed. One might simplify it to:

pop

Which states only that the stack has one less element. This is probably not
useful.

Running P3 and P4 on [ 1 2 3 4 5 6 7 ] we see:

S3: [ 1 2 4 5 6 7 ]
S4: [ 1 2 3 4 5 6 7 4 ]

What does the resultant stack look like?

[ 1 2 4 5 6 7 4 ] Maybe? This is [1 2] which were untouched, followed by the
the parts of P3 and P4's stacks above [1 2].

How does one achive it automatically?

I could describe the above algorithm like this: find the highest item which
neither program touched, remove everything about it. Added the segment of P1's
result stack about the untouched item, then do the same for P2. This might be
useful, but (a) it can only be done at runtime, and (b) it doesn't require you
to know anything about the shape of P1 and P2's arities.

I assume that you have an algorithm other than this. What is it?

> No, the regexp scheme above is quite precise. "num-outputs" can include
> some hierarchy of regions.

Hrm, but how do you match items on the stack to parts of the regexp? The
expression pop* push* pop* can be satified numerous ways by the stack
[ 1 2 3 4 ].

> Not at all, again you need to think in terms of "shape" and you'll see
> that the stack can always be delineated into reachable and unreachable
> parts. Moreover, most words have a very small dependance.

But surely a shape of pop* push* can reach any part of the stack? Any word
which performs a number of stack operations determined by a number on the stack
will have a shape of this form, and it's actualy effect on the stack is
undecidable.

I'm obviously missing something here.



--
Martin Young working for STMicroelectronics at `(o)_(o)' The fat wise /
1000 Aztec West, Almondsbury, Bristol, BS32 4SQ. ( V ) owl eats only >
+44 1454 462 523 `v' Martin.Young@... `.___,' clean mice. /
_(_)_ -="==="=============='