Factor 0.70

Slava Pestov — 2004-12-18 06:15:25

Hi everybody,

Factor 0.70 is out. Highlights of this release include:

- New compiler.
- Win32 port, now including the foreign function interface. Check out
http://red-stars.net/eiz/mandel.png.
- Object system. I'll have more to say about this in my weblog and in
the developer's guide.
- New example program in examples/irc.factor -- its an IRC client!

Get it from http://factor.sf.net/.

Slava

Chris Cogan — 2004-12-30 04:56:36

Yipes! It’s slow.

No, I didn’t die or fall off the Earth or get lost in a jungle for six
months.

For various largely unpleasant reasons I haven't been participating on
this list lately. The main reason was that my work time was taken up
with other things, and my home life made doing much at home almost
impossible because of having temporary custody of two step-grandchildren
for about eight months, leaving me pretty much exhausted most of the time.

But, now the kids have gone back to their parents, some of my
near-sanity is starting to return, and my home computer is working well
(after a two reinstalls of Windows XP and much other software in a
period of three weeks) and even networking well with the new machine my
wife has for her freelance desktop publishing.

The "Yipes!" at the beginning is a bemoanal of the lack of speed of my
current partial implementation of a Joy-Like Language (which I'm
currently calling JLL).

I modified my JLL implementation to be like XY, but then modified *that*
so that the XY "input queue" is now handled like a stack, and it is not
intended to be used as temporary storage (I may implement another queue
for that aspect of the XY input queue). I'm calling it the "code queue"
instead of "input queue" because it's where the continuations go.

However, they are not simply prepended to the queue (as I think they are
in XY). Instead, each function's code is added to it as a *single*
element which is itself a one-dimensional array. This makes the code
queue a stack/queue/array/tree, all at once.

This gives me a means of easily doing things like tracing and such. It
also means I can do multi-level returns easily, by simply decrementing
the nesting level variable once for each level I want to jump out of.
This means I can finally implement "abort" and other exception-handling
more or less cleanly and efficiently (although I’ve also made simple
exception-handling into a primitive combinator; you pass the code you
want to try to execute and the code to be executed if an error occurs,
etc.).

Not using the input queue for code from calls means that the input queue
(if used, as I expect it will be) becomes a stable entity that does not
have unpredictable things happening to it (mainly at the front end, as,
in XY, its size jumps up and down from adding and using code off of it).
Thus, it becomes free to use cleanly for constructing code and data
structures, and for getting stuff off the stack temporarily so the stack
is not so hard to manage.

I haven't implemented anything like XY's pattern programming yet (though
I’ve partially implemented a variant of Sanfilippo’s local “variables,”
which provides some of the functionality of pattern-programming). This
is partly because I haven't gotten that far, and partly because I’m not
quite satisfied with XY's methods of doing this nor with my own
alternatives.

Part of the slowness of my present implementation is due to the fact
that *every* primitive calls a trace function (even when tracing is
turned off; the testing of the trace flag is internal to it, to reduce
code clutter in the primitives (since I can’t use C-style macros to hide
the details of bare code in Visual Basic .NET)).

Some of the slowness is due to the copying of a function's code each
time it's called. I could speed *this* part up by not (pseudo-)
compiling, except to the token level (I'm currently "compiling" to what
I call "code elements," which are objects each of which has a delegation
to some other Sub (a Sub is like a void function in C). For the
primitives, this delegate executes to the primitive function itself. For
the non-primitives, this sub executes a Sub increments the nesting level
and puts a copy of the function's code as the first element of the code
queue. It also saves the function's name into an "array" indexed by the
same nesting level (this is used in tracing).

Even literals are "compiled" into "code elements." The difference is
that the code element is created on the fly during compilation instead
of being obtained from the dictionary of defined symbols. The Sub it
executes just pushes the literal’s value onto the stack.

Another part of the slowness is probably due to using hashtables instead
of arrays in two cases. One is the name array mentioned above (visual
Basic .NET has no automatically-expanding arrays; you can "ReDim"
arrays, but this involves making a copy of the array into a new one, and
it makes for yet more code-clutter). The other is the code queue itself.
Now that I've got things more or less working, I will probably convert
both of these to one-dimensional arrays and just add the bounds-checking
and ReDimming where necessary.

This will make debugging easier, in that Visual Studio's Watch windows
can display array-element values easily, but not hashtable entries (it
displays the "buckets" it uses internally, but they are filled in
“random” order (using a hashing algorithm, of course), so the data is
typically scattered throughout the buckets).

However, even with all of these improvements, I expect performance not
to be great. Worse, I don't know of any way to make it much better
without giving up too much else (or without switching to C (or C++?)
where function pointers can be used instead of "delegates"). Delegates
provide "signature" matching: parameter count and type checking, and
return value type checking, but I suspect that this also is what makes a
call-by-delegation take *four* times as long as a direct function call.
A plain C-like function pointer should be noticeably faster, and, in
this case, safe enough, because *all* of the JLL primitives have the
same signature (no return value, no parameters).

I've also made some progress on a GUI development environment for JLL,
but it's still pretty spare. It has a window for JLL output, one for
general debugging out put (including nesting-level and function-name
tracing), and one for displaying stack information. And one for editing
code. Unfortunately, it's nothing to write home about, but I plan to add
the usual assortment of drop-down menus (I made myself a wonderful
menuing system, starting from the rather skimpy tools supplied by Visual
Studio), toolbars, and such that a development environment typically
has, along with more debugging tools and improvements on the ones I have.

Merry Christmas, and happy everything.


-Chris Cogan

cpcogan — 2004-12-30 15:48:45

I wrote the previous post by "replying" to another post, and forgot
the change the subject.

Sorry.

In other news, Visual Basic .NET does in fact have a self-expanding
array class, which I'm about to try. It's called an ArrayList
(probably a sparse array internally, I'm guessing). I'll try it for
the two hashtables-as-arrays mentioned in the previous post to see if
I get a significant performance improvement. I'll try to read
everyone's recent posts to bring myself up to date on what's been
going on lately (I did notice that Stevan has come up with another
language).

--Chris

sa@dfa.com — 2004-12-30 16:08:35

"cpcogan" <ccogan@...> wrote on 12/30/2004 10:48:45 AM:

(I did notice that Stevan has come up with another language).

come up with, and subsequently abandoned. X was a bad idea to
begin with, i now think.

welcome back.


>
> --Chris
>
>
>
>
>
>
>
>
>
>
> Yahoo! Groups Links
>
>
>
>
>
>
>

William Tanksley, Jr — 2004-12-30 17:58:29

On Wed, 29 Dec 2004 22:56:36 -0600, Chris Cogan <ccogan@...> wrote:
> I modified my JLL implementation to be like XY, but then modified *that*
> so that the XY "input queue" is now handled like a stack, and it is not
> intended to be used as temporary storage (I may implement another queue
> for that aspect of the XY input queue). I'm calling it the "code queue"
> instead of "input queue" because it's where the continuations go.

Smart.

> However, they are not simply prepended to the queue (as I think they are
> in XY). Instead, each function's code is added to it as a *single*
> element which is itself a one-dimensional array. This makes the code
> queue a stack/queue/array/tree, all at once.

Yes, so it does. That has some interesting implications I'd like to
see explored.

> This gives me a means of easily doing things like tracing and such. It
> also means I can do multi-level returns easily, by simply decrementing
> the nesting level variable once for each level I want to jump out of.

This is a classic Forth trick, too. I really, really like the idea of
using arrays to represent functions; it fits with what I would expect
to be good concatenative implementation. The problem is that it
doesn't fit our current understanding of the theory. (Obviously, the
theory is what needs to be changed.) The problem is that if you
refactor a function, it will leave a different tree signature, while
still having the same basic list signature.

The theory of concatenativity has once again fallen behind the
practice. The major symptom is that things like the above can no
longer be accounted for by the simple textual rules that the theory is
constructed from. In addition, some of the newer implementations use
things other than a single stack; those data structures give huge
advantages in analysis, but full use of them again invalidates the
simple textual analysis that we're used to.

What can we do?

> -Chris Cogan

-Billy

William Tanksley, Jr — 2004-12-30 18:06:36

On Thu, 30 Dec 2004 11:08:35 -0500, sa@... <sa@...> wrote:
[stevan's new language]
> come up with, and subsequently abandoned. X was a bad idea to
> begin with, i now think.

Interesting! Tell us more, please. I found your post on X to be
interesting; X seems like a clever exercise in reduction. I wouldn't
be surprised to find it didn't work; I am surprised to hear that it
was bad to begin with.

-Billy

cpcogan — 2004-12-30 21:27:22

I've discovered that performance of JLL is not nearly as bad if the
system is run as a standalone, without Visual Studio debugging
activated on it.

Precise measurements not made yet, but speed seems about ten times
faster. This is still slow compared to equivalent plain VB code doing
the same calculations, but it gives me hope of having more than just
a washout on my hands. Now that I'm not sick with fear of having
(partially) wasted my time, I figure further optimization efforts can
probably wait until I get the whole core implementation working
smoothly.

I did find that there are some very bizarre performance issues with
arrays, arraylists, and hashtables in VB .NET, including odd
situations in which a hashtable is faster than an ordinary one-
dimensional array (even with ReDims!). Some of these oddities may be
due to unknown behavior of the compiler's optimizations (if any), but
it's still a mystery to me without further research.

I have switched from hashtables to arraylists for the "code queue"
and the name list that I use for tracing, though I no longer suspect
them of being primary performance problems.

Slava Pestov — 2004-12-31 01:44:58

Chris Cogan wrote:
> I haven't implemented anything like XY's pattern programming yet (though
> I’ve partially implemented a variant of Sanfilippo’s local “variables,”
> which provides some of the functionality of pattern-programming). This
> is partly because I haven't gotten that far, and partly because I’m not
> quite satisfied with XY's methods of doing this nor with my own
> alternatives.

What advantage do you see in allowing stack patterns and local variables?

> However, even with all of these improvements, I expect performance not
> to be great. Worse, I don't know of any way to make it much better
> without giving up too much else (or without switching to C (or C++?)
> where function pointers can be used instead of "delegates").

You could compile to .NET bytecode. The obsolete Java implementation of
Factor generated JVM bytecode, for a nice speed boost (not as nice as
x86 machine code generation though).

> I've also made some progress on a GUI development environment for JLL,
> but it's still pretty spare. It has a window for JLL output, one for
> general debugging out put (including nesting-level and function-name
> tracing), and one for displaying stack information.

Is this environment written in JLL?

Slava

Chris Cogan — 2005-01-01 06:05:10

Chris:

>>I've also made some progress on a GUI development environment for JLL,
>>but it's still pretty spare. It has a window for JLL output, one for
>>general debugging out put (including nesting-level and function-name
>>tracing), and one for displaying stack information.
>>
>>
>Slava:
>Is this environment written in JLL?
>
>
Chris:

Are you kidding? I wish.

No; JLL is nowhere near that level yet. It's just a Visual Studio .NET Windows form with a little Visual Basic code behind it to do things like passing code to the JLL processor.


The Visual Studio GUI tools are so advanced that all I have spcific
hopes for is enabling JLL to work with Visual Studio forms (perhaps even
to the point of being able to use the Visual Studo libraries directly
and thus create forms and such that way. But, using it to actually write
a GUI package directly is not something I think I would even want to
attempt. If I had to leave Visual Studio, I'd probably still use some
Windows GUI package of some sort and write some wrappers for JLL use of
it, if necessary. I prefer to use such tools as black boxes that do nice
things visually, but which I don't have to develop on my own (in case
you're wondering, my "fabulous" menuing system was written all in Visual
Basic .NET, but I wouln't try to re-write it in JLL (at least not yet)).

The infrastructure and the system of libraries of Visual Studio .NET is
*huge.* Having access to its facilities, even if they are not ideal,
means not having to develop a truly large amount of functionality in JLL
(or whatever). I have many complaints about the Visual Studio .NET (the
lack of a language like Joy being one of them, of course), but standing
on Microsoft's shoulders is better than starting over, even though, for
any applications I'd want to build, I wouldn't need vast amounts of what
Microsoft provides; the parts I *do* want are still too much for me to
re-write in JLL, even if it were a sufficiently developed language and
development package.

I feel flattered that you may have thought I could have written a real
GUI system in JLL, though.

--Chris