map fusion

John Nowak — 2008-08-01 08:52:43

In a functional language, the following optimization is valid
(assuming F and G are pure functions that terminate):

map G (map F xs) -> map (G . F) xs

This is not only an optimization: It is a law that can be used for
understanding and proving program properties.

For a concatenative language, we might assume the following rule
(using Joy's syntax):

[F] map [G] map -> [F G] map

Unfortunately, this is not valid because 'F' and 'G' can access more
than one value on the stack. Here is a counterexample to the above rule:

9 [1 2 3] [drop dup] map [[1 -] dip] map -- yields 6 [9 9 9]
9 [1 2 3] [drop dup [1 -] dip] map -- yields 6 [9 8 7]

This seems to be yet another argument against n-ary combinators where
the quotation is not called a fixed number of times. (Combinators that
call their quotation a fixed number of times like 'dip', 'twice', or
'@bi' are fine.) The fact that 'map' in a concatenative language is
still "purely functional" doesn't change the fact that the
unrestricted stack access complicates things the same way mutable
local variables do.

It seems the solution is to offer two versions of certain combinators.
The restricted versions ('map, 'each', etc) will be easier to reason
about and optimize, while the unrestricted versions ('map*', 'each*',
etc) will remain useful in a way only possible in stack-based
languages. I think this would be worthwhile to adopt in both typed and
untyped languages.

- John

William Tanksley, Jr — 2008-08-01 14:01:46

John Nowak <john@...> wrote:
> Unfortunately, this is not valid because 'F' and 'G' can access more
> than one value on the stack. Here is a counterexample to the above rule:

Okay, I think I get this. (I certainly agree.)

> This seems to be yet another argument against n-ary combinators where
> the quotation is not called a fixed number of times.

_This_ I don't understand. What does that mean?

I would have said "this is an argument against combinators which
repeatedly call function inputs that do not have a statically known
stack effect." Am I being too restrictive? Are you being too
restrictive?

> It seems the solution is to offer two versions of certain combinators.
> The restricted versions ('map, 'each', etc) will be easier to reason
> about and optimize, while the unrestricted versions ('map*', 'each*',
> etc) will remain useful in a way only possible in stack-based
> languages. I think this would be worthwhile to adopt in both typed and
> untyped languages.

I definitely like this; although of course in Factor the asterisk
prefix is already taken.

> - John

-Wm

Christopher Diggins — 2008-08-01 16:29:00

On Fri, Aug 1, 2008 at 4:52 AM, John Nowak <john@...> wrote:
> In a functional language, the following optimization is valid
> (assuming F and G are pure functions that terminate):
>
> map G (map F xs) -> map (G . F) xs
>
> This is not only an optimization: It is a law that can be used for
> understanding and proving program properties.
>
> For a concatenative language, we might assume the following rule
> (using Joy's syntax):
>
> [F] map [G] map -> [F G] map
>
> Unfortunately, this is not valid because 'F' and 'G' can access more
> than one value on the stack. Here is a counterexample to the above rule:
>
> 9 [1 2 3] [drop dup] map [[1 -] dip] map -- yields 6 [9 9 9]
> 9 [1 2 3] [drop dup [1 -] dip] map -- yields 6 [9 8 7]
>
> This seems to be yet another argument against n-ary combinators where
> the quotation is not called a fixed number of times. (Combinators that
> call their quotation a fixed number of times like 'dip', 'twice', or
> '@bi' are fine.) The fact that 'map' in a concatenative language is
> still "purely functional" doesn't change the fact that the
> unrestricted stack access complicates things the same way mutable
> local variables do.
>
> It seems the solution is to offer two versions of certain combinators.
> The restricted versions ('map, 'each', etc) will be easier to reason
> about and optimize, while the unrestricted versions ('map*', 'each*',
> etc) will remain useful in a way only possible in stack-based
> languages. I think this would be worthwhile to adopt in both typed and
> untyped languages.

I think I understand. To summarize with Cat type notation:

map : (list ('a -> 'a) -> list)
map* : (list ('A 'b -> 'A 'b) -> list)

In other words, "map" accepts only 1->1 functions, while map* accepts
n->n functions where n >= 1.

FWIW I think this is a very good idea, and I like the notation.

In Cat I simply disallowed "map" and similar combinators from
modifying the stack. It made type analysis easier, it preserved the
fusion properties, and it didn't really cause any huge loss of
expressiveness AFAICT. However: there was definitely some scenarios,
where Joy would be more succinct.

- Christopher

- Christopher

Daniel Ehrenberg — 2008-08-01 16:33:35

On Fri, Aug 1, 2008 at 7:01 AM, William Tanksley, Jr
<wtanksleyjr@...> wrote:
>> This seems to be yet another argument against n-ary combinators where
>> the quotation is not called a fixed number of times.
>
> _This_ I don't understand. What does that mean?
>
> I would have said "this is an argument against combinators which
> repeatedly call function inputs that do not have a statically known
> stack effect." Am I being too restrictive? Are you being too
> restrictive?

I agree with William. In a high level language, it'd generally be
better if the compiler can figure out its optimizations itself without
needing changes in the language design. Here, it should be easy to
check whether the quotation given affects values lower down on the
stack. In its fusion optimization, the compiler can just use this
information, rather than change the kind of program that the user
writes.

(Maybe map* would be good for documentation purposes or something, but
this is not a good justification for it.)

Dan

Christopher Diggins — 2008-08-01 16:44:08

On Fri, Aug 1, 2008 at 12:33 PM, Daniel Ehrenberg <microdan@...> wrote:
> On Fri, Aug 1, 2008 at 7:01 AM, William Tanksley, Jr
> <wtanksleyjr@...> wrote:
>>> This seems to be yet another argument against n-ary combinators where
>>> the quotation is not called a fixed number of times.
>>
>> _This_ I don't understand. What does that mean?
>>
>> I would have said "this is an argument against combinators which
>> repeatedly call function inputs that do not have a statically known
>> stack effect."

However, they *do* have a statically known stack effect.

> Am I being too restrictive? Are you being too
>> restrictive?
>
> I agree with William. In a high level language, it'd generally be
> better if the compiler can figure out its optimizations itself without
> needing changes in the language design.

> Here, it should be easy to
> check whether the quotation given affects values lower down on the
> stack.

Yes it is (in most scenarios).

> In its fusion optimization, the compiler can just use this
> information, rather than change the kind of program that the user
> writes.

That is a valid point. However, one of the benefits of being able to
manipulate the concatenative program syntactically while preserving
meaning is now lost.

> (Maybe map* would be good for documentation purposes or something, but
> this is not a good justification for it.)

There is a precedence for such a naming system in Scheme. Effectively
"map*" is a version with side-effects. We could perhaps call it
"map!". For Cat which was intended for use as an intermediate language
at least, the goal of being able to apply simply term rewriting to the
syntax of a program (e.g. identify "[f] map [g] map" and rewrite it
without analysis) was really quite appealing.

- Christopher

John Nowak — 2008-08-01 21:06:17

On Aug 1, 2008, at 10:01 AM, William Tanksley, Jr wrote:

>> This seems to be yet another argument against n-ary combinators where
>> the quotation is not called a fixed number of times.
>
> _This_ I don't understand. What does that mean?

I mean that in languages like Factor, allowing the quotation given to
'map' to use any amount of the stack can make things difficult -- even
if that quotation is appropriately balanced (e.g. 2 -> 2, 3 -> 3,
etc). Ideally, it would be nice to have '[F] map [G] map -> [F G] map'
be applicable in the general case, but it isn't unless you restrict
'F' and 'G' to only using one element. Of course, given that Factor
allows unrestricted side effects, there are other issues to deal with
anyway.

At least from an optimization standpoint, this is not an issue in a
language like 5th because "quotations" are supplied directly, stack
effects are always statically determinable, and side effects are
tracked on the type level. Still, it may be useful to offer both 'map'
and 'map*' so that certain optimizations (like map fusion) can be
guaranteed to the programmer more simply.

- John

John Nowak — 2008-08-01 21:21:31

On Aug 1, 2008, at 12:29 PM, Christopher Diggins wrote:

> I think I understand. To summarize with Cat type notation:
>
> map : (list ('a -> 'a) -> list)
> map* : (list ('A 'b -> 'A 'b) -> list)

I don't understand when you're allowed to omit row variables in Cat's
notation. I also think you mean "'a -> 'b" rather than "'a -> 'a".

In any case, here are the (simplified) type signatures as they'd
appear in 5th:

map[a -> b] : R (a) -> R (b).
map*[R a -> R b] : R (a) -> R (b).

> In other words, "map" accepts only 1->1 functions, while map*
> accepts n->n functions where n >= 1.

Correct.

> In Cat I simply disallowed "map" and similar combinators from
> modifying the stack. It made type analysis easier,

Why would it make type analysis easier?

> it preserved the fusion properties, and it didn't really cause any
> huge loss of
> expressiveness AFAICT.

This is arguably a reasonable solution in Cat because you have
'papply' (aka 'curry' in Factor). In 5th however, you don't, so you'd
have to use variables to write things like 'map-with':

x map-with[F] -> map[x F].

The only way to write 'map-with' without variables is to use a version
of 'map' that can call a function that uses more than one element:

map-with[F] -> map*[over dip[swap F]] nip.

Of the two, I prefer the first version for (obvious) reasons, so
perhaps this is a bad example.

Still, there are cases where 'map*', 'each*', etc, are quite handy,
even in a language like Cat. Slava has given me a few examples of
where they're used in Factor (although obviously not with those names).

- John

John Nowak — 2008-08-01 21:37:10

On Aug 1, 2008, at 12:44 PM, Christopher Diggins wrote:

> On Fri, Aug 1, 2008 at 12:33 PM, Daniel Ehrenberg
> <microdan@...> wrote:
>
>> In its fusion optimization, the compiler can just use this
>> information, rather than change the kind of program that the user
>> writes.
>
> That is a valid point. However, one of the benefits of being able to
> manipulate the concatenative program syntactically while preserving
> meaning is now lost.

This is essentially the point I was trying to make. How strong of a
point it is I am not sure.

> There is a precedence for such a naming system in Scheme. Effectively
> "map*" is a version with side-effects. We could perhaps call it
> "map!".

I think you need to be careful here. 'map*' acts strangely because it
is really some sort of map/fold combination. (I've not checked my
dictionary to figure out exactly what sort of morphism it is.)

I had a related discussion with Slava last night where he took issue
with the characterization of 'map*' as having side effects. Instead,
he preferred to talk more meaningfully about the additional stack
elements as loop-carried dependencies. You can find the discussion
here (luckily, it began right around midnight):

http://bespin.org/~nef/logs/concatenative/08.08.01

How come you never make it onto IRC chris? You're leaving me all alone
in my type system and rewriting interests...

> For Cat which was intended for use as an intermediate language
> at least, the goal of being able to apply simply term rewriting to the
> syntax of a program (e.g. identify "[f] map [g] map" and rewrite it
> without analysis) was really quite appealing.

How far along have you gotten on this? At least in my opinion, it is
an open question as to if function-level rewrite rules are a
worthwhile addition to other means of optimization. Their biggest
appeal to me isn't in their power (which likely isn't that great
compared to more advanced methods), but rather in their simplicity
which enables guarantees of optimization to be simply conveyed to
users of a language. Of course, such rules are useful for other things
like as proving program equality.

- John

Chris Cogan — 2008-11-10 01:59:51

It's been a long time since I've participated in this forum, but it hasn't
been because of loss of interest. Mostly, I've been preoccupied with other
things. I'm very glad to see that this forum is still active.

I thought I'd report on an evolutionary programming project using Joy as the
language of the evolved programs. Several months ago, I started this project
but only worked on it for an hour or two, and then set it aside until the
day before yesterday.

General Description

The idea is to write a program that writes Joy programs, tests them for
fitness against samples of the task data, where the task is do something
like handwriting recognition. Those that produce the most nearly correct
results are saved and used as parents for the next generation, which will
consist mostly of variations on each of the parent programs. Because each
one will be tested in isolation against the same test data, there will be no
direct competition for resources (as there would be in Tom Ray's Tierra).

1. A program generator takes as input an existing program and returns a
small modification of it, such as adding, deleting, or replacing a single
atom or item in a list, etc. (I'll talk in Joy terms, since I'm not
sufficiently knowledgeable about any of the other languages discussed in
this forum.) Use this generator to generate a slew of offspring from each of
the programs, if any, that made it through fitness testing for the previous
generation.

3. Execute each generated program on the test data and measure fitness based
on the results. Save the programs with the highest fitness rating and
discard all the rest.

4. When completely successful code is produced, stop.

5. Otherwise go back to step, using the most successful programs so far as
parents of the next generation.

Why Use Joy?

I considered using Prolog (before I ever heard of Joy) for this kind of
thing, because prolog programs can effectively modify themselves in useful
ways (by adding and deleting rules, mainly), and because it has an automatic
solution-tree search as the core of the way Prolog programs work.

However, a modified Joy is a lot easier to work with because Joy has no
syntax to speak of (although Prolog's syntax is simple compared to that of,
say, C++), except at the lowest level (lists must have a bracket at each
end, numbers can't have two decimal points, etc.). This makes generation and
modification trivial, because the code doing either doesn't have to worry
about complex control constructs, and it doesn't have to worry about it in
making modifications. Once you have code for properly recognizing aggregate
data types (lists, strings, etc.), your work is largely done.

There is a problem, but it is relatively minor: Exceptions, such as dividing
by zero. Normally, these can crash a program, and code generated
unintelligently can easily lead to such errors.

The solution is a modified Joy that basically handles all errors by ignoring
them. In Visual Basic, this can be done with a "Try" block, which can be
used to cause execution to resume after the block if an error occurs in it.
It can also be done with two versions of an "On Error" statement: "On Error
Resume Next" and "On Error Goto <label>" (where <label> is a label at a
location in the code where you want execution to resume if an error occurs).
Once one has a version of Joy that won't crash on errors (simply by
modifying a bunch of the primitives), one can then generate truly arbitrary
string of code, as long as it is syntactically correct, and they will
"execute" (perhaps doing nothing much at all, but still terminating without
error (infinite loops are still possible in combinators that repeat chunks
of code, but even this can be counteracted by putting a fixed limit of, say,
one million, on the number of iterations).

Sticking closely to the evolutionary paradigm can mean that one must start
with truly simple versions of the task, such as distinguishing between two
values of a variable. If we start with samples of the full task, such as
recognizing full documents of text correctly, we won't be able to provide
any fitness tests that are both meaningful and useful, and it could be
billions of years before anything works. Narrowing the task down to
something completely simple allows for some fitness to be discovered early
on (then we increase the difficulty of the task a little).

So, I'm planning to start my own evolver on a single Boolean variable. When
the evolver finally produces one that can handle this task reliably, I can
switch to having it try to make code that correctly reports the values of
two Booleans. At some point, I should be able to add code for creating
simple images in memory, and start over with one-pixel images, and only one
distinction (white vs. non-white, perhaps).

Eventually, I should be able to use images with enough pixels to represent
symbols such as plus-signs and minus-signs. From there, I should be able to
start using actual image files, which will require scanning samples of my
handwritten material (over two hundred pounds of the stuff, written over the
past twenty-odd years) and cutting it into small, one-character image files.
This will also require that I provide the correct text for each sample. At
first, I will probably just include the text in the names of the files (and
make sure that the generated code has no access to this information, to
avoid evolving code that only reports text from the file names!).

For now, I'll be happy just to get my evolver working well enough to evolve
code to make some simple distinction reliably, even if I have to hand-tweak
mutation-rates and types of mutations allowed. I'm using what will be my
third implementation of Joy (more of a mini-Joy, in this case). (I've got
one implementation in Visual Basic (Visual Studio) and another in Visual
Basic for Applications (the macro language for the Microsoft Office programs
-- Microsoft Word in this case).

My Visual Studio version of Joy "compiles" to lists of what I call "code
objects," and this makes for a very generalized implementation and allows
for easy de-compilation, etc., but it's also slow, so my new version will
not use "objects," and this will eliminate at least one level of
indirection, in that it will use "delegates" (which are basically pointers
to functions, but with type-checking for parameters and return types). It
will also not "compile" code but will only process code into arrays of
strings (even lists with many elements will each just be a single string
until it is "executed" (pushed onto the stack), at which point it will be
made into an array of strings (one string for each element).

This version of Joy will also be the least faithful to Manfred's version.
I'm more interested, this time, in making a language that simply has a
suitable set of primitives for my evolutionary programming purposes. One
difference will be that executing code can define new functions (and
re-define them, thus making them into variables). This violates Joy's
variable-free nature, especially if I allow non-local variables (but I
haven't decided on this yet).

It's interesting that Joy code is more like the genetic code of DNA than
ordinary programming languages are, in that DNA doesn't have much in the way
of syntax, and it doesn't have any explicit looping or other typical control
constructs. DNA is constructed in a concatenative way, as long chains,
without loops or other interconnections. Of course, this parallelism is very
limited, but it suggests why I think Joy is a good programming language for
evolutionary programming: The only structuring is done by defining functions
and making chunks of code into lists. Both of these are forms of
modularization (and defining functions is basically just naming a list
(quoted program) so it can be used by reference rather than being included
bodily (which also requires an extra step to execute its contents as code).

I had hoped to have some evolving going on by now (Sunday evening), but I
took time out to write this post so I may not get to actual evolution for a
day or so, depending on available time. But, I'm close.

I'm working on the Joy source code preprocessor now. Since I've done it
twice before and since I'm using a simplified version of Joy, this shouldn't
take long.

I'll report on results, soon, I hope.

Chris Cogan

chris glur — 2008-11-15 07:32:10

> The idea is to write a program that writes Joy programs,
> tests them for fitness against samples of the task data,
> where the task is do something like handwriting recognition.
--snip--
>3. Execute each generated program on the test data and measure
>fitness based on the results. Save the programs with the highest
>fitness rating and discard all the rest.

Having just read 75% of 'The collapse of chaos' by Cohen &
Ian Stew[au]rt I'm less believing than when I read of some
claimed successes in this scheme previously.

I see now that my previous extention of neural-nets
concepts, where the state space searched is continuous
and 'hill climbing' is aimed at; is completely misplaced.

The author starts by looking at Mandelbrot fractals ..etc.
where even with infinitesimal errors the output is still
unpredictable. Ie. effectively discontinuity.

>Sticking closely to the evolutionary paradigm can mean that
>one must start with truly simple versions of the task, such
>as distinguishing between two values of a variable. If we
>start with samples of the full task, such as recognizing full
>documents of text correctly, we won't be able to provide any
>fitness tests that are both meaningful and useful, and it could
>be billions of years before anything works. Narrowing the task
>down to something completely simple allows for some fitness to
>be discovered early on (then we increase the difficulty of the
>task a little).

OK, this sound plausible.
But then only the micro-modules evolve, and the intelligent
human uses them to build the complete algorithm.
Or you can automate the combinations of the modules, but
the human has designed the initial intelligent module
partitioning first.

>It's interesting that Joy code is more like the genetic code
>of DNA than ordinary programming languages are, in that DNA
>doesn't have much in the way of syntax, and it doesn't have
>any explicit looping or other typical control constructs.
I suspect these claims are mostly intuitive ?

IMO 'looping' can only apply in a digital computer.
Whereas DNA is digital, I think the functions which it
'controls' are analog-like. Ie. "DO x UNTIL enough".
So you could say 'the molecules of water evaporated
until the cup was dry", but it that really 'looping'.

I suspect that the 'divide by zero' is only one of [possibly
infinite] many discontinuities in the scheme, as I understand
your description of it.

You can simulate an analog-system in your output, but
your input is discontinuous/chaotic ?
OTOH you could simulate all the analog-processes, with
your digital computer ?

Can you give any mathematical insight into why:
run M*N should be closer to a goal than run N ?

It seems like one of these schemes which sound
plausible on a first superficial consideration, but
fails on deeper analysis.


An fascinating can-O-worms ?

It would be nice to be able to consult biologist-Cohen
& mathematician-Stewart on this ?

== Chris Glur.

janvanlent — 2008-11-16 19:06:15

--- In concatenative@yahoogroups.com, "Chris Cogan" <ccogan@...> wrote:
> I thought I'd report on an evolutionary programming project using
Joy as the
> language of the evolved programs.

You might be interested in the work by Lee Spector on the Push, PushGP
and Pushpop systems [1].

jan

[1] http://hampshire.edu/lspector/push.html

Stevan Apter — 2008-11-16 19:25:06

or search back in the concatenative archives and read billy
tanksley's remarks about his languages 01 and 10 and their
suitability for darwinian evolutionary experiments.

i did an implementation in k (www.nsl.com/k/10.k) and stopped
at the point where i needed to adapt dennis shasha's tree-
distance algorithm. it still seems to me that 10 is a perfect
candidate for this kind of work.

----- Original Message -----
From: "janvanlent" <jan.vanlent+concatenative@...>
To: <concatenative@yahoogroups.com>
Sent: Sunday, November 16, 2008 2:06 PM
Subject: [stack] Re: Evolutionary Programming


> --- In concatenative@yahoogroups.com, "Chris Cogan" <ccogan@...> wrote:
>> I thought I'd report on an evolutionary programming project using
> Joy as the
>> language of the evolved programs.
>
> You might be interested in the work by Lee Spector on the Push, PushGP
> and Pushpop systems [1].
>
> jan
>
> [1] http://hampshire.edu/lspector/push.html
>
>
>

William Tanksley, Jr — 2008-11-16 20:46:14

Stevan Apter <sa@...> wrote:
> or search back in the concatenative archives and read billy
> tanksley's remarks about his languages 01 and 10 and their
> suitability for darwinian evolutionary experiments.

The nice thing about 01 (or 10, it's just a logical NOT away) is that
you can cut and splice anything anywhere and it's still a valid
program. The bad thing is that I still haven't decided how to express
input/output... It's merely a matter of convention, but it's a pretty
important convention.

I've not touched "tworing", my 10 lab program, in a while; one of its
unit tests fail, and I'm not sure why. I want to use it to design a
more effective base ... I've got some theories about what might make a
base better than another one, I just need to do some testing.

BTW, while you're thinking about evolutionary algorithms, check this
page out: http://williamtozier.com/slurry/2008/04/02/search-algorithms.

-Wm

Stevan Apter — 2008-11-16 22:00:35

----- Original Message -----
From: "William Tanksley, Jr" <wtanksleyjr@...>
>
> I've not touched "tworing", my 10 lab program, in a while; one of its
> unit tests fail, and I'm not sure why.

perhaps you can post your 10 unit tests. i'd like to try running them in
my implementation.

John Meacham — 2008-11-17 04:21:34

Hi, you would probably be really interested in the work of Jurgen
Schmidhuber, in particular the 'OOPS' optimal ordered problem solver[1].

It is an algorithm to optimally find the solution to any problem, via
a brute force search of the program space. Now, it has been known for a
while that there exists a simple provably optimal solution to all
definable problems (see LSEARCH and HSEARCH), however, it was of purely
theoretical interest since the constant factor was on the order of
2^500.

OOPS has actually made this sort of algorithm practical via the clever
use of.. a concatinative language, in particular a varient of forth. (it
is described in the paper). The novel use of forth in OOPS was taking
advantage of the fact that a program that solves a subset of a certain
problem is more likely to be of use to a program that solves the general
problem, and that the concatination of valid forth programs is a valid
program. In particular, it attempts to solve problems in order from
smallest to larger, incrementally expanding the forth programs biased
towards using succesful runs of smaller functions.

As to a couple other points mentioned, when it comes to exceptions, you
actually, perhaps counterintuitivly, want to abort programs that come
across an exception rather than trying to handle it. in fact, you should
be ultra paranoid. You want to weed out the bad programs as soon as
possible so they don't take up time when you could be testing other
candidates, in particular, you want the most 'dense' encoding possible,
in that you don't want to spend time testing algorithms that have
already been tested, by ignoring exceptions, you have increased your
search space with a lot of 'duplicate' entries. a program that has an
ignored exception and one that never attempted the invalid op in the
first place behave identically, and you will end up running both to
completion needlessly. Of course, there is no way to remove all
redundancy, but you certainly can make it less likely by allowing
exceptions to kill your program.

Also, when it comes to automatcally generating joy programs, you can
avoid having to deal with the bit of syntax joy has by using the floy[2]
variant of joy. floy is joy where are quotations are restricted to a
single symbol. it turns out to be just as expressive as joy, but you
don't need to worry about matching quotations, your primitive 'base
pairs' are simple the joy primitives and the quoted versions of each
and nothing else.

When it comes to recursion, no need to fear, you actually can express
arbitrary recursion without bindings by use of a y combinator[3], which
is easy enough to provide as a primitive in joy, if you include it as a
choice in your genetic code, you are not limited by your lack of
definitions in any way.

I think the following works... but someone check my work.

fix == dup [fix] dip i;


John



[1] http://www.idsia.ch/~juergen/oops.html
[2] http://www.latrobe.edu.au/philosophy/phimvt/joy/jp-flatjoy.html
[3] http://en.wikipedia.org/wiki/Fixed_point_combinator

--
John Meacham - ⑆repetae.net⑆john⑈

chris glur — 2008-11-18 16:37:11

If it's to run as an auto-evolving machine, then decoding,
analysing & reverse-engineering the results is not realistic.
And not a goal.

What is the advantage of having a less syntaxy-language?
Well what does syntax mean ?
In this context it's a human aid to grouping and classifying.
Ie. imposing order. Which is exactly what you DON'T want.
You want the finest possible granularity.
At the same time you want to use the cost effectivness and
speed of a PC.

In fact why bother with joy ?
Just go as close as managable [without crashes] to the
native instruction set of the machine/PC.

For those who know PCs down to the register level,
"divide by zero" is a completely artifical human concept.
In fact "divide" is not know at the register level.
[ It might be interesting to 'evolve' a divide machine
instruction sequence, from the basic add, subtr shifts and
various logical-operations and test/branch instructions ?!]

Similarly the concept of "loop" is just a human construct,
invented for manageability purposes.

When the stone fell from PizaTower it didn't think in
terms of a loop: REPEAT decr(1 mm.) UNTIL (crash).
-------------

So yes, joy is much more appropriate than most
other "languages", but a more general purpose,
finely granulated instruction set, as close as possible
to the hardware's instruction set is best.

Of course the finer the granualarity the closer to a
continuous/analog-computer and the less 'problems'
with chaotic discontinuity.

Alternatively use standard languages to simulate a
continuous analog machine, which would be orders of
magnitude slower, and impose a whole extra layer of
complexity in between.

IMO the divide-by-zero crash is merely one example
of a more general problem. The 'problem' is stability:
a monotomical increase - towards infinity.

Similarly I guess, any infinite-loop [by definition a
stability] is a failure in the-game-of-life algorithm.

So how does natural evolution handle this ?
1. a random mutation breaks the stability;
2. the species goes off towards infinity before a mutation
'turns it', and a failed evolutionary branch results.

== Chris Glur.


On 11/17/08, John Meacham <john@...> wrote:
> Hi, you would probably be really interested in the work of Jurgen
> Schmidhuber, in particular the 'OOPS' optimal ordered problem solver[1].
>
> It is an algorithm to optimally find the solution to any problem, via
> a brute force search of the program space. Now, it has been known for a
> while that there exists a simple provably optimal solution to all
> definable problems (see LSEARCH and HSEARCH), however, it was of purely
> theoretical interest since the constant factor was on the order of
> 2^500.
>
> OOPS has actually made this sort of algorithm practical via the clever
> use of.. a concatinative language, in particular a varient of forth. (it
> is described in the paper). The novel use of forth in OOPS was taking
> advantage of the fact that a program that solves a subset of a certain
> problem is more likely to be of use to a program that solves the general
> problem, and that the concatination of valid forth programs is a valid
> program. In particular, it attempts to solve problems in order from
> smallest to larger, incrementally expanding the forth programs biased
> towards using succesful runs of smaller functions.
>
> As to a couple other points mentioned, when it comes to exceptions, you
> actually, perhaps counterintuitivly, want to abort programs that come
> across an exception rather than trying to handle it. in fact, you should
> be ultra paranoid. You want to weed out the bad programs as soon as
> possible so they don't take up time when you could be testing other
> candidates, in particular, you want the most 'dense' encoding possible,
> in that you don't want to spend time testing algorithms that have
> already been tested, by ignoring exceptions, you have increased your
> search space with a lot of 'duplicate' entries. a program that has an
> ignored exception and one that never attempted the invalid op in the
> first place behave identically, and you will end up running both to
> completion needlessly. Of course, there is no way to remove all
> redundancy, but you certainly can make it less likely by allowing
> exceptions to kill your program.
>
> Also, when it comes to automatcally generating joy programs, you can
> avoid having to deal with the bit of syntax joy has by using the floy[2]
> variant of joy. floy is joy where are quotations are restricted to a
> single symbol. it turns out to be just as expressive as joy, but you
> don't need to worry about matching quotations, your primitive 'base
> pairs' are simple the joy primitives and the quoted versions of each
> and nothing else.
>
> When it comes to recursion, no need to fear, you actually can express
> arbitrary recursion without bindings by use of a y combinator[3], which
> is easy enough to provide as a primitive in joy, if you include it as a
> choice in your genetic code, you are not limited by your lack of
> definitions in any way.
>
> I think the following works... but someone check my work.
>
> fix == dup [fix] dip i;
>
>
> John
>
>
>
> [1] http://www.idsia.ch/~juergen/oops.html
> [2] http://www.latrobe.edu.au/philosophy/phimvt/joy/jp-flatjoy.html
> [3] http://en.wikipedia.org/wiki/Fixed_point_combinator
>
> --
> John Meacham - ⑆repetae.net⑆john⑈
>

Chris Glur —

Did anybody else follow up on the good links which
John Meacham gave ?

Jürgen Schmidhuber starts laying out the algorithms in:
http://www.idsia.ch/~juergen/mljssalevin/node5.html

His drastic claims seem plausible.

So far I haven't seen what language/s he proposes.

== Chris Glur

ET.Do CatGmailSET

Chris Glur —

Did anybody else follow up on the good links which
ementation 2 gave ?

> Jürgen Schmidhuber starts laying out the algorithms in:
> http://www.idsia.ch/~juergen/mljssalevin/node5.html
>
> His drastic claims seem plausible.
>
> So far I haven't seen what language/s he proposes.

== Chris Glur

PS. I'm testing the hack: post via gmail without bogging
down in web-click-mechanism.

My 'proper' mailing-list gives auto-confirms, but perhaps
this one is moderated ?
--------
In: http://www.idsia.ch/~juergen/mljssalevin/node9.html
he tells: ---------
Implementation 2: Incremental Self-Improvement (IS)
..snip...
= Overview. We will use a simple, assembler-like programming language
which allows for writing many kinds of (learning) algorithms.
-------- and in node10.html he tells:-------
$b_0$ :
Add( $w_1,w_2,w_3$ ) : $c_{w_3}\leftarrow c_{w_1} + c_{w_2}$
(add the contents of work cell $w_1$ and work cell $w_2$ ,
write the result into work cell $w_3$ ).
$b_1$ :
Sub( $w_1,w_2,w_3$ ) : $c_{w_3}\leftarrow c_{w_1} - c_{w_2}$ .
$b_2$ :
Mul( $w_1,w_2,w_3$ ) : $c_{w_3}\leftarrow c_{w_1} * c_{w_2}$ .
$b_3$ :
Mov( $w_1, w_2$ ) : $c_{w_2}\leftarrow c_{w_1}$ .
$b_4$ :
JumpHome: IP $\leftarrow 0$ (jump back to 1st program cell).

Instruction probabilities / Current policy. For each program cell $i$
there is a variable probability distribution $P_i$ on $I$ .
----
That's apparently Latex format, where $b_1$ looks like
"b <subscript 1>". But you get the flavour ?

IMO this all looks promising.

BTW I think: c1 + c2 -> c3 is a better syntax.
Pop-11 uses it and I just can't see why "c3 <- c1 + c2", has
prevailed.

== Chris Glur.


ET.Do CatGmailSET

William Tanksley, Jr — 2008-11-22 22:35:05

chris glur <crglur@...> wrote:
> If it's to run as an auto-evolving machine, then decoding,
> analysing & reverse-engineering the results is not realistic.
> And not a goal.

You do have to decide the relative fitness of the results, so some
amount of decoding results is needed.

> At the same time you want to use the cost effectivness and
> speed of a PC.
> In fact why bother with joy ?
> Just go as close as managable [without crashes] to the
> native instruction set of the machine/PC.

There's a reason why this hasn't been done -- CPU instructions sets
and architectures aren't amenable to mathematical modeling.

> For those who know PCs down to the register level,
> "divide by zero" is a completely artifical human concept.
> In fact "divide" is not know at the register level.
> [ It might be interesting to 'evolve' a divide machine
> instruction sequence, from the basic add, subtr shifts and
> various logical-operations and test/branch instructions ?!]

Are you thinking of some specific chip? All PCs have a division
instruction in their CPU, and ways to handle division by zero. Intel
and PowerPC both. Some embedded processors don't (I like the SeaForth
S40C18, for obvious reasons to people in this newsgroup), but those
aren't PCs.

> Similarly the concept of "loop" is just a human construct,
> invented for manageability purposes.

No, it's a machine concept.

> Of course the finer the granualarity the closer to a
> continuous/analog-computer and the less 'problems'
> with chaotic discontinuity.

Now I do like the idea of an analog computer; but "fine granularity"
won't give you that. Technically, proteins/DNA are analog, but they
have ENORMOUS chaotic discontinuity. In fact, I suspect you MUST have
chaotic discontinuity in order to make effective computations;
otherwise you have to expend a lot of energy in order to get results.

> == Chris Glur.

-Wm

John Meacham — 2008-11-23 01:02:44

On Sat, Nov 22, 2008 at 03:59:07PM +0000, Chris Glur wrote:
> Did anybody else follow up on the good links which
> John Meacham gave ?
>
> Jürgen Schmidhuber starts laying out the algorithms in:
> http://www.idsia.ch/~juergen/mljssalevin/node5.html
>
> His drastic claims seem plausible.
>
> So far I haven't seen what language/s he proposes.

There are a wide variety of languages that would be suitable for the
techniques he describes, Appendix A of this paper gives the
specification for the specific forth-derived one used for his
implementation of the OOPS (optimal ordered problem solver), which
should also be usable for the adaptive levin search referenced in your
link above.

ftp://ftp.idsia.ch/pub/juergen/oopsmlj.pdf

John

--
John Meacham - ⑆repetae.net⑆john⑈

John Meacham — 2008-11-24 22:44:36

On Sat, Nov 22, 2008 at 02:35:05PM -0800, William Tanksley, Jr wrote:
> chris glur <crglur@...> wrote:
> > If it's to run as an auto-evolving machine, then decoding,
> > analysing & reverse-engineering the results is not realistic.
> > And not a goal.
>
> You do have to decide the relative fitness of the results, so some
> amount of decoding results is needed.

I should say, the main reason I recommended 'floy' instead of joy was
not for easy interpretation of the results, but rather to have a more
convinient base to use as a genetic code. Imagine your language has 16
basic primitives, when using floy, you need only consider those 16
primitives and the quoted versions of each. You have a straight up
linear language of 32 primitives you can cut-up and paste all you want
and still have a valid program. you can consider a length n program as a
base 32 number of length n if that suits you. If you allowed arbitrary
quoting, then you have to make sure quotes are balanced, you no longer
have a linear reppresentation of the language, but a tree-like one,
where each node is a primitive or a quote of an arbitrary list of
primitives or quotes themselves. Not that this is fatal, but it might
make things less convinient depending on how your algorithm works. for
instance, OOPS likes to extend its programs a single instruction at a
time, which is easier for the linear style.

Note that when talking about 'linear genetic programming' many will
assume you mean a first order imperative language, but a higher order
concatenative language like the floy variant of joy is just as linear.
but when communicating with others it is good to keep in mind. perhaps a
better term is in order 'concatenative linear genetic programming'
'functional linear genetic programing' maybe?


> > Similarly the concept of "loop" is just a human construct,
> > invented for manageability purposes.
>
> No, it's a machine concept.

I would go a bit further and say it is a fundamental mathematical
concept if we are talking about what computational "power" loops give
you over not having them. by loops I am including equivalent
(computationally) constructions such as recursion or fixed point
combinators.

http://en.wikipedia.org/wiki/Recursion_theory is the study of this.

John

--
John Meacham - ⑆repetae.net⑆john⑈

chris glur — 2008-11-29 17:53:58

I know little of joy and none of floy.
> OOPS likes to extend its programs a single >instruction at a time, which is easier for the >linear style.

Please define 'linear style'.
From your description, machine-code is linear,
because all 'constructs' are acheived by goto/s.

I suspected the 'extend by 1 instr scheme', but
couldn't find concrete reference -- yet.

Re. OOPS' probability-matrix: does each instruction
aquire a probability ?
And does it's initial probability also change/evolve?

== Chris Glur.