Constant folding algorithm
Daniel Ehrenberg — 2007-12-06 02:33:56
Hi all,
Right now, I'm working on a module called inverse for Factor. The
basic goal is to provide an inverse (or, technically, a section) of a
quotation, for a generalized form of pattern matching. I've noticed
that this can work both more efficiently and more generally if I first
perform constant folding on the quotation, for example transforming [
1 2 + * ] into [ 3 * ], which can more easily be inverted. Right now,
I have a very simple constant folding algorithm which looks for a word
in a certain predefined set of words, preceded by two constants. This
fails, though, if I use something like [ 1 2 + 3 4 + / ]. I'm
wondering if anyone has any idea of how to make an algorithm which,
given a quotation and a list of safe-to-fold words with their arities,
will perform as much constant folding as is possible on a function.
The Factor compiler implements constant folding, but it operates on an
intermediate dataflow representation rather than quotations. This
looks like one area where it's a little easier to manipulate an
applicative programming language than a concatenative one, though I'm
sure there's a good solution this problem. The first thing that comes
to mind is to use a slightly modified metacircular interpreter, but
that would be very complicated to implement. Ideas?
Daniel Ehrenberg
Christopher Diggins — 2007-12-06 03:11:04
On Dec 5, 2007 9:33 PM, Daniel Ehrenberg <
microdan@...> wrote:
>
> Hi all,
>
> Right now, I'm working on a module called inverse for Factor. The
> basic goal is to provide an inverse (or, technically, a section) of a
> quotation, for a generalized form of pattern matching. I've noticed
> that this can work both more efficiently and more generally if I first
> perform constant folding on the quotation, for example transforming [
> 1 2 + * ] into [ 3 * ], which can more easily be inverted. Right now,
> I have a very simple constant folding algorithm which looks for a word
> in a certain predefined set of words, preceded by two constants. This
> fails, though, if I use something like [ 1 2 + 3 4 + / ]. I'm
> wondering if anyone has any idea of how to make an algorithm which,
> given a quotation and a list of safe-to-fold words with their arities,
> will perform as much constant folding as is possible on a function.
Have you considered the brute force approach? Try to evaluate a
sub-expression: if it throws an error it doesn't work, otherwise go
ahead and replace the subexpression with its evaluateion. Pretty
naive, but very easy to code. This is how I implemented partial
evaluation in an earlier version of Cat (currently broken, while I
focus on other issues).
> The Factor compiler implements constant folding, but it operates on an
> intermediate dataflow representation rather than quotations.
What does the representation look like?
> This
> looks like one area where it's a little easier to manipulate an
> applicative programming language than a concatenative one,
Can you tell me more why you feel this is? I am skeptical about this claim.
> though I'm
> sure there's a good solution this problem. The first thing that comes
> to mind is to use a slightly modified metacircular interpreter, but
> that would be very complicated to implement. Ideas?
- Christopher
Daniel Ehrenberg — 2007-12-06 05:49:44
Thanks for your response, Chris,
> Have you considered the brute force approach? Try to evaluate a
> sub-expression: if it throws an error it doesn't work, otherwise go
> ahead and replace the subexpression with its evaluateion. Pretty
> naive, but very easy to code. This is how I implemented partial
> evaluation in an earlier version of Cat (currently broken, while I
> focus on other issues).
>
That might work, but using that approach, how can I tell where a
successful partial evaluation begins and ends? Is it like you take
each possible starting point in the code? Maybe something like that,
though not necessarily relying on the exception system, is possible.
>
> > The Factor compiler implements constant folding, but it operates on an
> > intermediate dataflow representation rather than quotations.
>
> What does the representation look like?
>
I've never looked into it, but Slava has said it's basically Single
Static Assignment. It can't easily be converted back into a quotation.
>
> > This
> > looks like one area where it's a little easier to manipulate an
> > applicative programming language than a concatenative one,
>
> Can you tell me more why you feel this is? I am skeptical about this claim.
In an applicative language, or at least one with a readily inspectable
AST like Lisp, you could just traverse the code in postorder,
evaluating all code which doesn't refer to variables and calls a
function on a known list. For example, you could tell that (+ x (* (-
4 5) 2)) can reduce, from the inside, to (+ x (+ -1 2)) and then to (+
x 1), though no further, since x is a (free) variable. But I hope I'm
wrong and that a similarly simple, explicit algorithm is possible. The
brute force approach sounds a little complicated to get right, but
I'll see if I can get it to work...
Daniel Ehrenberg
Christopher Diggins — 2007-12-06 14:25:31
On Dec 6, 2007 12:49 AM, Daniel Ehrenberg <
microdan@...> wrote:
> Thanks for your response, Chris,
> > Have you considered the brute force approach? Try to evaluate a
> > sub-expression: if it throws an error it doesn't work, otherwise go
> > ahead and replace the subexpression with its evaluateion. Pretty
> > naive, but very easy to code. This is how I implemented partial
> > evaluation in an earlier version of Cat (currently broken, while I
> > focus on other issues).
> >
> That might work, but using that approach, how can I tell where a
> successful partial evaluation begins and ends? Is it like you take
> each possible starting point in the code?
That's what I did. I would take n consecutive terms (where n was
bounded by some constant to keep the algorithm linear) apply these
terms to the empty stack, and if it doesn't underflow the stack, use
the resulting stack as the new expression.
> Maybe something like that,
> though not necessarily relying on the exception system, is possible.
So, I'm assuming you are hacking the compiler source code? You can
modify it slightly to detect underflow without waiting for exceptions,
which is admittedly quite ugly.
> > > The Factor compiler implements constant folding, but it operates on an
> > > intermediate dataflow representation rather than quotations.
> >
> > What does the representation look like?
> >
> I've never looked into it, but Slava has said it's basically Single
> Static Assignment. It can't easily be converted back into a quotation.
That really surprise me, because I always thought that a language like
Joy or Cat (where there is only one explicit stack) was as easy to
work with as SSA. I suspect it is because Slava does his own register
allocations?
> >
> > > This
> > > looks like one area where it's a little easier to manipulate an
> > > applicative programming language than a concatenative one,
> >
> > Can you tell me more why you feel this is? I am skeptical about this
> claim.
>
> In an applicative language, or at least one with a readily inspectable
> AST like Lisp, you could just traverse the code in postorder,
> evaluating all code which doesn't refer to variables and calls a
> function on a known list.
That makes sense. In the Cat optimizer I use type inference to
reconstruct an AST.
> For example, you could tell that (+ x (* (-
> 4 5) 2)) can reduce, from the inside, to (+ x (+ -1 2)) and then to (+
> x 1), though no further, since x is a (free) variable. But I hope I'm
> wrong and that a similarly simple, explicit algorithm is possible.
Why not remove free variables (e.g. pass them as parameters)? It
simplifies a lot of analyses.
> The
> brute force approach sounds a little complicated to get right, but
> I'll see if I can get it to work...
You'll have to be careful to identify code with side effects for it to
work. You don't want to write to the console for example while
performing partial evaluation (i.e. constant folding)
> Daniel Ehrenberg
Christopher
http://www.cdiggins.com
John Cowan — 2007-12-06 14:41:44
Christopher Diggins scripsit:
> That's what I did. I would take n consecutive terms (where n was
> bounded by some constant to keep the algorithm linear) apply these
> terms to the empty stack, and if it doesn't underflow the stack, use
> the resulting stack as the new expression.
What is more, when you see a non-primitive word whose definition you have,
you can inline it (provided, of course, that you are not already inlining it!)
--
A: "Spiro conjectures Ex-Lax." John Cowan
Q: "What does Pat Nixon frost her cakes with?"
cowan@...
--"Jeopardy" for generative semanticists
http://www.ccil.org/~cowan
Daniel Ehrenberg — 2007-12-06 18:38:35
> So, I'm assuming you are hacking the compiler source code? You can
> modify it slightly to detect underflow without waiting for exceptions,
> which is admittedly quite ugly.
No, I'm not doing anything with the compiler source code. This is a
separate project and is implemented through quotation introspection.
But the way that the Factor error handling system works, it is
possible to trap stack underflow at runtime. Also, each word has it
marked whether it's suitable for constant folding (for the compiler's
use, but I can take advantage...)
> That really surprise me, because I always thought that a language like
> Joy or Cat (where there is only one explicit stack) was as easy to
> work with as SSA. I suspect it is because Slava does his own register
> allocations?
>
It might not be difficult to convert a stack language to SSA, but I
have no idea how to change it back into a quotation. Anyway, I don't
think such a transformation would work for this, since it would change
the quotation in other ways.
> > For example, you could tell that (+ x (* (-
> > 4 5) 2)) can reduce, from the inside, to (+ x (+ -1 2)) and then to (+
> > x 1), though no further, since x is a (free) variable. But I hope I'm
> > wrong and that a similarly simple, explicit algorithm is possible.
>
> Why not remove free variables (e.g. pass them as parameters)? It
> simplifies a lot of analyses.
>
What do you mean?
Daniel Ehrenberg
Christopher Diggins — 2007-12-06 19:38:28
On Dec 6, 2007 1:38 PM, Daniel Ehrenberg <
microdan@...> wrote:
> > So, I'm assuming you are hacking the compiler source code? You can
> > modify it slightly to detect underflow without waiting for exceptions,
> > which is admittedly quite ugly.
>
> No, I'm not doing anything with the compiler source code. This is a
> separate project and is implemented through quotation introspection.
> But the way that the Factor error handling system works, it is
> possible to trap stack underflow at runtime. Also, each word has it
> marked whether it's suitable for constant folding (for the compiler's
> use, but I can take advantage...)
Oh, then I misunderstood. You don't have access to a Factor evaluator,
or do you? Essentially I imagined you were performing a partial
evaluation of Factor step by step. If so then
I think you can detect the underflow by counting the number of
arguments that are about to be consumed, during each stage of the
evalution.
> > That really surprise me, because I always thought that a language like
> > Joy or Cat (where there is only one explicit stack) was as easy to
> > work with as SSA. I suspect it is because Slava does his own register
> > allocations?
> >
> It might not be difficult to convert a stack language to SSA, but I
> have no idea how to change it back into a quotation. Anyway, I don't
> think such a transformation would work for this, since it would change
> the quotation in other ways.
So when introspecting the code, you are getting the SSA form?
> > > For example, you could tell that (+ x (* (-
> > > 4 5) 2)) can reduce, from the inside, to (+ x (+ -1 2)) and then to (+
> > > x 1), though no further, since x is a (free) variable. But I hope I'm
> > > wrong and that a similarly simple, explicit algorithm is possible.
> >
> > Why not remove free variables (e.g. pass them as parameters)? It
> > simplifies a lot of analyses.
> >
>
> What do you mean?
I misunderstood the problem to be that "x" was free. To resolve this,
a standard compilation technique of functional languages is to remove
free variables (e.g. variables that are not arguments or locally
declared) and pass them as arguments. This process is known as lambda
lifting (
http://en.wikipedia.org/wiki/Lambda_lifting). However, after
looking at your problem statement the fact that "x" is free or bound
is irrelevant, the problem is that it is a variable, so forget what I
am talking about.
- Christopher
Daniel Ehrenberg — 2007-12-06 20:26:17
> Oh, then I misunderstood. You don't have access to a Factor evaluator,
> or do you? Essentially I imagined you were performing a partial
> evaluation of Factor step by step. If so then
> I think you can detect the underflow by counting the number of
> arguments that are about to be consumed, during each stage of the
> evalution.
>
That makes sense. We kept misunderstanding each other, though. I have
access to the Factor evaluator easily. (I also have access to the
compiler, though it's not particularly useful.)
>
> > > That really surprise me, because I always thought that a language like
> > > Joy or Cat (where there is only one explicit stack) was as easy to
> > > work with as SSA. I suspect it is because Slava does his own register
> > > allocations?
> > >
> > It might not be difficult to convert a stack language to SSA, but I
> > have no idea how to change it back into a quotation. Anyway, I don't
> > think such a transformation would work for this, since it would change
> > the quotation in other ways.
>
> So when introspecting the code, you are getting the SSA form?
>
No, I get a quotation, which is manipulable like quotations in Joy,
except that it's a random-access array rather than a linked list. The
SSA form is only for compilation.
Thanks for your help, Chris (and John, though I was already doing that)!
Daniel Ehrenberg
William Tanksley, Jr — 2007-12-09 20:14:41
Daniel Ehrenberg <
microdan@...> wrote:
> Right now, I'm working on a module called inverse for Factor. The
> basic goal is to provide an inverse (or, technically, a section) of a
> quotation, for a generalized form of pattern matching. I've noticed
Wow, that's impressive.
> that this can work both more efficiently and more generally if I first
> perform constant folding on the quotation, for example transforming [
> 1 2 + * ] into [ 3 * ], which can more easily be inverted. Right now,
That makes sense. A friend of mine has an algorithm for Forth which
seems to catch a lot of opportunities for optimization (including
stack manipulation and constant folding); I'll see if I can convince
him to either write it up or join this list. In essence, he uses a
stack at compile time to track the type and knownness of values
encountered and/or produced. It's linear time, so it's not hard to
understand.
> looks like one area where it's a little easier to manipulate an
> applicative programming language than a concatenative one, though I'm
Heh. I doubt it.
> Daniel Ehrenberg
-Wm