God's Own Algorithms.
John Carter — 2002-04-03 00:46:44
Up to now, I have been rather reluctant to accept recursion via combinator
as anything but an academic curiosity. But the idea of GOA's (God's Own
Algorithm's) suddenly makes me very interested in them indeed. Especially
in the light of the thread on minimal bases.
Given any program one can always concieve of it implemented by a GOA. ie.
The shortest program that does exactly the same thing.
Now there are two features about Joy that make this concept very
interesting.
One is the "recursion" free nature. Traditionally one of the ways of
hunting a GOA is to break everything into smaller and smaller subroutines
which call each other in fantastical ways. Now in recursion free Joy a
"word" can always be replaced by its definition without it ever appearing
recursively in the expansion.
Now add in the idea of a minimal base, then any Joy program can be
expanded to a single string involving nothing but the minimal base
operators.
Thus we have a remarkably clean concept for a GOA. That program
that provably behaves the same as the defining program, yet is the
shortest string when expanded all the way down to the chosen minimal base.
Thus a GOA is defined by the defining program and the chosen minimal base.
It would be interesting to create a program that, given a program and a
base, searches for the GOA.
It would then be very interesting to study the GOA's and look for nifty
idioms.
This also allows to define the concept of power of a base.
Given a set of defining programs, the most powerful base is that for which
the corresponding GOA's, in some sense eg. minimax, average,... are the
shortest.
John Carter Phone : (64)(3) 358 6639
Tait Electronics Fax : (64)(3) 359 4632
PO Box 1645 Christchurch Email :
john.carter@...
New Zealand
Zeitgeist shutting down for corrective maintenance, please log off.
Manfred von Thun — 2002-04-03 09:43:07
On Wed, 3 Apr 2002, John Carter wrote:
> Up to now, I have been rather reluctant to accept recursion via combinator
> as anything but an academic curiosity.
The idea was to capture the most common recursion patterns in a few
combinators. It works well for linear and binary recursion, and for
folding (and hence concatenating, filtering and mapping) lists.
Many programming languages do the same. In Joy one has the additional
advantage that static typing does not get in the way, and no fancy
moves are needed to circumvent it.
But recursion can express the weirdest recursion patterns, and I doubt
whether there are any nice _clean_ combinators that capture all patterns.
Of course there is the y combinator, and any of its cousins. They can
express any recursion pattern, but the resulting programs can look
rather ugly. (Just try to rewrite the Lisp interpreter with its
eval and apply - but do not use recursion.)
> But the idea of GOA's (God's Own
> Algorithm's) suddenly makes me very interested in them indeed. Especially
> in the light of the thread on minimal bases.
>
> Given any program one can always concieve of it implemented by a GOA. ie.
> The shortest program that does exactly the same thing.
^^^^^^^^
If "shortest" means "shorter than all others", then there may not be
such a program because two or more may have the same length.
If "shortest" means "no longer than any others", then there may be
such a program, and possibly several others.
BUT: there will not be an algorithm for finding it.
This is a consequence of an interesting theorem in
computability theory. Whereas just about all other bad-news
theorems (Halting, Undecidability, Imcompleteness) are based on
diagonalisation and hence indirectly on the ol' Liar paradox,
this theorem is based - of all things - on Berry's paradox:
Let k be "the least positive integer which is not denoted by any
expression in the English language containing fewer a hundred words".
(There cannot be such an integer, because if there were some such k,
then the quoted expression does denote it after all.)
In
Machtey, M. and Young, P.,
_An Introduction to the General Theory of Algorithms_
Elsevier North Holland 1979
you will find on pp 5-7 a surprisingly simple proof of the impossibility
of a certain optimisation problem: For any integer n, find a shortest
(in the second sense) program that prints that integer. There is no
program that can do that. (Of course, for any candidate program one
can prove that indeed it is the shortest - but that is a different
matter.)
If I remember correctly, the theorem can be extended to similar
optimisation problems, including finding shortest programs to do anything.
I hope this will stop somebody from trying to find what cannot be found
because it does not exist.
> Now there are two features about Joy that make this concept very
> interesting.
>
> One is the "recursion" free nature. Traditionally one of the ways of
> hunting a GOA is to break everything into smaller and smaller subroutines
> which call each other in fantastical ways. Now in recursion free Joy a
> "word" can always be replaced by its definition without it ever appearing
> recursively in the expansion.
>
> Now add in the idea of a minimal base, then any Joy program can be
> expanded to a single string involving nothing but the minimal base
> operators.
>
> Thus we have a remarkably clean concept for a GOA. That program
> that provably behaves the same as the defining program, yet is the
> shortest string when expanded all the way down to the chosen minimal base.
>
> Thus a GOA is defined by the defining program and the chosen minimal base.
>
> It would be interesting to create a program that, given a program and a
> base, searches for the GOA.
Impossible, by the above. But one might well be satisfied with what
one knows to be second-best - just a reasonably short one would do.
At the same time, even finding a reasonably short one may well be
computationally unfeasible.
> It would then be very interesting to study the GOA's and look for nifty
> idioms.
>
> This also allows to define the concept of power of a base.
>
> Given a set of defining programs, the most powerful base is that for which
> the corresponding GOA's, in some sense eg. minimax, average,... are the
> shortest.
Yes, that is a useful notion. But do keep in mind that all sorts of
very mature optimising compilers (for other languages) exist which
only aim for what is third-best - and know that even second-best
takes just far too long to find.
I hope this is of some help.
- Manfred
John Carter — 2002-04-04 05:30:05
On Wed, 3 Apr 2002, Manfred von Thun wrote:
> > But the idea of GOA's (God's Own
> > Algorithm's) suddenly makes me very interested in them indeed. Especially
> > in the light of the thread on minimal bases.
> >
> > Given any program one can always concieve of it implemented by a GOA. ie.
> > The shortest program that does exactly the same thing.
> ^^^^^^^^
> If "shortest" means "shorter than all others", then there may not be
> such a program because two or more may have the same length.
> If "shortest" means "no longer than any others", then there may be
> such a program, and possibly several others.
True, GOA's would not necessarily be unique. One could have a
mini-pantheon of GOA's, but the length of the GOA's would be
unique for any program.
> BUT: there will not be an algorithm for finding it.
> This is a consequence of an interesting theorem in
> computability theory. Whereas just about all other bad-news
> theorems (Halting, Undecidability, Imcompleteness) are based on
> diagonalisation and hence indirectly on the ol' Liar paradox,
Last night I was contemplating this and I saw that I would hit up against
the "bad-news" in general. But that need not cause alarm and despondency,
in that non-halting programs with an infinite input range are not very
practical anyway. I curious to know what techniques could I learn from
GOA's for practical programs.
So if we constrain the set of programs we deal with to those, which given
a finite N, halt on all inputs within N steps.
A fierce constraint, especially if N is smallish, but enough to include a
fair number of nifty practical programming idioms.
> of a certain optimisation problem: For any integer n, find a shortest
> (in the second sense) program that prints that integer. There is no
> program that can do that.
It's the _any_ that is not relevant here. There are two inputs to the
GOA(p,b) function. b the base and p the program. I'm quite sure that
GOA(p,b) is non-computable in general. However for a small but perhaps
useful and interesting subset of programs, it is.
Whilst I would love to feed the 150000 line C++ and Java monster I'm
working on into the GOA function, I would be content with merely
uncovering more expressive one or two line idioms.
> > This also allows to define the concept of power of a base.
> >
> > Given a set of defining programs, the most powerful base is that for which
> > the corresponding GOA's, in some sense eg. minimax, average,... are the
> > shortest.
>
> Yes, that is a useful notion. But do keep in mind that all sorts of
> very mature optimising compilers (for other languages) exist which
> only aim for what is third-best - and know that even second-best
> takes just far too long to find.
Ah, but what do optimising compilers compile into. Machine instructions.
There are two views of machine instructions - CISC and RISC.
CISC means have lots of powerful instructions.
RISC means have a few instructions you really need and make them fast.
A powerful base would be the best of both worlds.
John Carter Phone : (64)(3) 358 6639
Tait Electronics Fax : (64)(3) 359 4632
PO Box 1645 Christchurch Email :
john.carter@...
New Zealand
Zeitgeist shutting down for corrective maintenance, please log off.
Brent L Kerby — 2002-04-04 06:25:56
> [impossibility of finding the shortest program that does a certain thing]
It's interesting that this problem is said to be impossible to solve, whereas my
brute-force searcher is an apparent solution. Of course, the theorem is correct,
and the problem is truely impossible.
It's just that the brute-force searcher is not a true solution in all cases,
although it works well for very small program lengths (actually, only for
ridiculuously small program lengths, because of the inefficiency of my
algorithm, sadly). The thing that makes the problem impossible, even by
brute-force search (starting with the smallest programs and working upward) is
that in the course of the brute search, we are bound to try a non-terminating
program, on which we will get stuck forever. In my searcher, I use a heuristic
that if a program takes more than 100 steps and has not terminated, then it is
probably non-terminating, and so the searcher moves on to the next candidate
program. Of course, with larger programs, which can take a long time to
terminate, but nonetheless still terminate (and possibly even giving the desired
result in the shortest possible way), this heuristic would be useless; the
problem is that it is impossible, in general, to tell if a program is
non-terminating, except by trying it out and running it, in which case if it is
non-terminating, we'll never know for sure until the end of eternity.
However, although we can't in general find the _shortest_ program, it is
possible, theoretically at least, to find the _fastest_ program, by brute-force
search, at least in the context of Joy combinators (but, since a Joy
combinator-reduction machine is Turing-complete, this generalizes, really). The
reason is that we can use a basic abstraction algorithm on our desired program
(using i, cons, dip, sip, and zap), which will give a worst-case scenario for
how long a program will take. For example, say we wanted to find a fast way of
doing bury2:
[C] [B] [A] bury2 == [A] [C] [B]
we could find a worst-case by using an abstraction algorithm:
A\ B\ C\ [A] [C] [B]
== A\ B\ [[A]] dip [B]
== A\ [[[A]] dip] dip
== [A\ [[A]] dip] cons dip
== [[A\ [A]] cons dip] cons dip
== [[] cons dip] cons dip
And then, and we try this program out and count how many steps it takes:
[C] [B] [A] [[] cons dip] cons dip
== [C] [B] [[A] [] cons dip] dip
== [C] [A] [] cons dip [B]
== [C] [[A]] dip [B]
== [A] [C] [B]
Depending on the exact way in which steps are counted, you might this consider
this to take 7 steps. Then, because of this, we don't have to waste any more
than 7 steps on a non-terminating program, because after that we know it could
not be the fastest, even if it did eventually terminate. Also, our search will
be over once we finish programs larger than size 7, because all programs bigger
than that will take longer than 7 steps to finish (at least in a certain
formalization of "steps"), assuming each word is executed at least once (and, if
some words are unused, then that program obviously could not be the fastest).
Of course, that's all just theoretical. Really, a brute-force search is totally
impractial for large programs, due to the exponential slowness (perhaps this
could change someday though, with quantum computing, but if I understand right,
even quantum computing cannot solve an exponential problem in linear time).
- Brent
Brent L Kerby — 2002-04-04 06:29:22
> [impossibility of finding the shortest program that does a certain thing]
It's interesting that this problem is said to be impossible to solve, whereas my
brute-force searcher is an apparent solution. Of course, the theorem is correct,
and the problem is truely impossible.
It's just that the brute-force searcher is not a true solution in all cases,
although it works well for very small program lengths (actually, only for
ridiculuously small program lengths, because of the inefficiency of my
algorithm, sadly). The thing that makes the problem impossible, even by
brute-force search (starting with the smallest programs and working upward) is
that in the course of the brute search, we are bound to try a non-terminating
program, on which we will get stuck forever. In my searcher, I use a heuristic
that if a program takes more than 100 steps and has not terminated, then it is
probably non-terminating, and so the searcher moves on to the next candidate
program. Of course, with larger programs, which can take a long time to
terminate, but nonetheless still terminate (and possibly even giving the desired
result in the shortest possible way), this heuristic would be useless; the
problem is that it is impossible, in general, to tell if a program is
non-terminating, except by trying it out and running it, in which case if it is
non-terminating, we'll never know for sure until the end of eternity.
However, although we can't in general find the _shortest_ program, it is
possible, theoretically at least, to find the _fastest_ program, by brute-force
search, at least in the context of Joy combinators (but, since a Joy
combinator-reduction machine is Turing-complete, this generalizes, really). The
reason is that we can use a basic abstraction algorithm on our desired program
(using i, cons, dip, sip, and zap), which will give a worst-case scenario for
how long a program will take. For example, say we wanted to find a fast way of
doing bury2:
[C] [B] [A] bury2 == [A] [C] [B]
we could find a worst-case by using an abstraction algorithm:
A\ B\ C\ [A] [C] [B]
== A\ B\ [[A]] dip [B]
== A\ [[[A]] dip] dip
== [A\ [[A]] dip] cons dip
== [[A\ [A]] cons dip] cons dip
== [[] cons dip] cons dip
And then, and we try this program out and count how many steps it takes:
[C] [B] [A] [[] cons dip] cons dip
== [C] [B] [[A] [] cons dip] dip
== [C] [A] [] cons dip [B]
== [C] [[A]] dip [B]
== [A] [C] [B]
Depending on the exact way in which steps are counted, you might this consider
this to take 7 steps. Then, because of this, we don't have to waste any more
than 7 steps on a non-terminating program, because after that we know it could
not be the fastest, even if it did eventually terminate. Also, our search will
be over once we finish programs larger than size 7, because all programs bigger
than that will take longer than 7 steps to finish (at least in a certain
formalization of "steps"), assuming each word is executed at least once (and, if
some words are unused, then that program obviously could not be the fastest).
Of course, that's all just theoretical. Really, a brute-force search is totally
impractial for large programs, due to the exponential slowness (perhaps this
could change someday though, with quantum computing, but if I understand right,
even quantum computing cannot solve an exponential problem in linear time).
- Brent
e1_t — 2002-04-05 00:04:29
--- In concatenative@y..., John Carter <john.carter@t...> wrote:
> > > This also allows to define the concept of power of a base.
> > >
> > > Given a set of defining programs, the most powerful base is
that for which
> > > the corresponding GOA's, in some sense eg. minimax, average,...
are the
> > > shortest.
> >
> > Yes, that is a useful notion. But do keep in mind that all sorts
of
> > very mature optimising compilers (for other languages) exist which
> > only aim for what is third-best - and know that even second-best
> > takes just far too long to find.
>
> Ah, but what do optimising compilers compile into. Machine
instructions.
> There are two views of machine instructions - CISC and RISC.
>
> CISC means have lots of powerful instructions.
> RISC means have a few instructions you really need and make them
fast.
>
> A powerful base would be the best of both worlds.
>
I don't completely follow. What does that have to do with the fact
that compilers compile into machine code? Unless you're planning on
creating your own virtual machine your compiler will have to do the
same. Having said that I always had a view that hardware should not
be made to run software but the other way around. There are some
exceptions though. Chuck Moore's processors are a good example but
again he didn't make his processors suited to Forth more so than C.
He simply made a decision to use stack instead of registers. Doing
that doesn't cripple C compilers or any other compilers - it does
make Forth interpreters shine though (note that I said interpreters -
a Forth compiler is really no different to any other compiler - it
can be made equally efficient on any platform).
The fact that compilers target a certain architecture (CISC or RISC,
it doesn't matter really) has nothing to do with them aiming for
third-best. Having a CISC/RISC combo architecture (I guess these days
x86 processors would qualify into that category since their
instruction set is CISC but internally they're RISC processors)
wouldn't improve compilers ability to generate better code. So at
least at the lowest level it doesn't matter what base you use.
Compilers will continue aiming for third-best.
It's up to the compilers for the new generation of functional
languages to (try to) look for second-best.
One final note is that many of today's RISC processors actually have
more instructions than CISC processors. PowerPC instruction set is
bigger than x86 instruction set (I didn't verify this but I've been
told that's the case - or at least it was the case before MMX,
3DNow!, SSE, SSE2,...).
>
> John Carter Phone : (64)(3) 358 6639
> Tait Electronics Fax : (64)(3) 359 4632
> PO Box 1645 Christchurch Email : john.carter@t...
> New Zealand
>
> Zeitgeist shutting down for corrective maintenance, please log off.
John Carter — 2002-04-05 06:55:45
On Fri, 5 Apr 2002, e1_t wrote:
> > There are two views of machine instructions - CISC and RISC.
> >
> > CISC means have lots of powerful instructions.
> > RISC means have a few instructions you really need and make them
> fast.
> >
> > A powerful base would be the best of both worlds.
> >
>
> I don't completely follow. What does that have to do with the fact
> that compilers compile into machine code?
>snip<
> Chuck Moore's processors are a good example but
> again he didn't make his processors suited to Forth more so than C.
> He simply made a decision to use stack instead of registers. Doing
> that doesn't cripple C compilers or any other compilers - it does
> make Forth interpreters shine though (note that I said interpreters -
> a Forth compiler is really no different to any other compiler - it
> can be made equally efficient on any platform).
OK, this of "powerful base" idea would only apply in the...
* very unlikely event that you were designing a new CPU and instruction
set or...
* the much more likely event you were designing a byte code for a
compiled to byte code language + byte code interpreter language.
How does one decide what instructions to implement? Well, last time I did
that, I wrote the parser and then compiled it into smallish chunks. Some
chunks were actually very very large, but then that was what that the
language was about. It made access to some very bizarre semantics easy.
The large chunks were just a low level interface to the bizarre semantics.
All very top-down and completely and utterly one-shot single purpose.
Now if one were to design a general purpose byte code language (parrot,
c--, gcc RTL, etc) wouldn't it be better to choose the most powerful and
expressive byte codes to implement.
There is a trade-off. The more byte-codes the less work for the compiler
writer and the more work for the interpreter writer.
In fact, if you are using the byte code as a portable intermediate
representation (think UCSD pascal pcode) the effort is proportional to
number of byte codes * number of platforms ported to.
The GNU gcc project has a brute force tool that seeks the most time wise
efficient implementation of a code fragment. Typically for converting their RTL
intermendiate language into the machine language.
Now wouldn't it be nice to have a brute force search for the most powerful
base.
John Carter Phone : (64)(3) 358 6639
Tait Electronics Fax : (64)(3) 359 4632
PO Box 1645 Christchurch Email :
john.carter@...
New Zealand
Zeitgeist shutting down for corrective maintenance, please log off.
Manfred von Thun — 2002-04-08 03:26:45
On -1 xxx -1, Brent L Kerby wrote: [probably 4-APR-2002]
> > [impossibility of finding the shortest program that does a certain thing]
[..]
> algorithm, sadly). The thing that makes the problem impossible, even by
> brute-force search (starting with the smallest programs and working upward) is
> that in the course of the brute search, we are bound to try a non-terminating
> program, on which we will get stuck forever.
I think there is a quite separate reason which has to do with the
rewriting based on the equivalence of various programs.
Suppose that we write all rewriting explicitly:
LHS ==> RHS
If all the rewriting rules are like definitions, and are being used in the
direction of expanding the RHS, then there is no problem:
a ==> b c d or whatever
If there are some non-definitional rewrite rules in which the RHS is
at least as long as the LHS, then again no problem:
a b ==> c d e
But if there are rules in which the RHS is shorter than the LHS,
like this:
a b c ==> d
then an exhaustive search for short programs will have to consider
longer and longer programs which are in fact longer than the shortest
already found.
(I'll mention one such rule which I stumbled upon a while back:
swap [i] dip == dip)
This is a perennial problem with rewriting systems, and is - as far
as I know - the main source of algorithmic unsolvability.
Something similar explains why propositional logic is decidable but
predicate logic is not: for the latter any proof might have to contain
formulas that are not subformulas of other ones already occurring
but are actually longer than any already occurring.
Sorry if this was not too clear. It has been years since I read about
rewriting systems.
[.. more good things]
> - Brent
- Manfred