Re: Forthski Vs. Joy.

Brent Kerby — 2002-06-18 21:14:58

Hey Alan, and readers on concatenative ...

> Water told me to ask you about the debate between Forth and Joy
> syntaxes...
>
> Please gimme the lowdown. =)

Hmm... I have little real experience with FORTH, but my understanding is
this: there are 3 main differences between FORTH syntax and Joy syntax:

1) in FORTH (except the recent strongFORTH), everything is untyped (you can
treat pointers as integers, etc., without even needing to use casts) whereas
Joy is typed.
2) Joy allows pushing programs onto the stack, using the "[]" syntax; as far
as I know, FORTH does not have anything like this.
3) In FORTH, normally everything is UPPERCASE, whereas in Joy we usually
prefer lowercase :-)

There's really not much of a conflict between FORTH and Joy, except for the
type-checking issue. It seems that FORTH stems from a bare-bones,
do-it-yourself approach, and type-checking is just seen as an unneccessary
complexity that gets in the programmer's way by "creating errors" for him to
fix.

However, my opinion is that, overall, type-checking not only prevents
mistakes by programmers but also reduces the complexity of the complete
system, as long as the type-checking is provided with polymorphism (which it
naturally is). For example, in a typed system, you can use the same "+"
operator for integer and floating point data; in FORTH, you must use "F+" to
add floats. Similarly, in a typed system, you could define your own type for
fixed-point number data and also use the same "+" and "*" operators. And, in
a typed system, you could write generic numeric routines (e.g., averaging,
sorting, fourier transform, ...) that would work regardless of whether the
numbers are integers, floats, or fixed-points; this leads to a simpler
system.

Nevertheless, FORTH's untyped approach is quite respectable; it is
impressive that some FORTH programmers are able to successfully write large,
complex programs in a system without type-checking. But, in my opinion,
type-checking is generally an advantage.

By the way, I've almost finished writing the article that Manfred requested,
summing up the results about minimal bases for concatenative combinator
systems and about how to construct various series of combinators ("reverse",
"dig", "bury", "dip", etc.); hopefully it will be ready within the next
couple days.

- Brent

e1_t — 2002-06-19 01:42:57

--- In concatenative@y..., "Brent Kerby" <iepos@t...> wrote:
> Hmm... I have little real experience with FORTH, but my
understanding is
> this: there are 3 main differences between FORTH syntax and Joy
syntax:
>
> 1) in FORTH (except the recent strongFORTH), everything is untyped
(you can
> treat pointers as integers, etc., without even needing to use
casts) whereas
> Joy is typed.
> 2) Joy allows pushing programs onto the stack, using the "[]"
syntax; as far
> as I know, FORTH does not have anything like this.

To an extent. Forth allows you to push the execution token of a word
onto the stack (by using the ' word) where it can be executed with
EXECUTE (which is to some extent similar to Joy's i combinator).
Here's an example:

S" blah" ' TYPE EXECUTE

S" blah" creates a counted string "blah", ' TYPE gets the execution
token of the word TYPE and puts it on stack and EXECUTE executes
TYPE's execution token.

> 3) In FORTH, normally everything is UPPERCASE, whereas in Joy we
usually
> prefer lowercase :-)
>

Other significant differences are:

4) Forth doesn't have a garbage collector and considering it's
untyped adding a decent garbage collector is tough unless you're
prepared to give up typelessness (the problem of adding a garbage
collector to Forth is comparable to adding one to C. I know of one
garbage collector for Forth out there but it's fairly restrictive). I
feel that adding a garbage collector to a functional Forth-like
language is a bit of an overkill considering that free-ing of
allocated memory is explicit (i.e. you have to DROP/pop a block from
the stack before it can be freed);
5) Forth has IMMEDIATE words which allow you to change or extend the
syntax of the language (they've been dropped from colorForth however);
6) Forth gives you the ability to create infix and prefix operators
by giving you access to parsing words - in the example above when S"
is executed it parses everything up to the first " character from
Forth's input buffer and compiles that into a counted string. So
while S" blah" may look like a string constant S" is actually a Forth
word (hence the reason why space is needed between S" and blah".

> There's really not much of a conflict between FORTH and Joy, except
for the
> type-checking issue. It seems that FORTH stems from a bare-bones,
> do-it-yourself approach, and type-checking is just seen as an
unneccessary
> complexity that gets in the programmer's way by "creating errors"
for him to
> fix.
>
> However, my opinion is that, overall, type-checking not only
prevents
> mistakes by programmers but also reduces the complexity of the
complete
> system, as long as the type-checking is provided with polymorphism
(which it
> naturally is). For example, in a typed system, you can use the
same "+"
> operator for integer and floating point data; in FORTH, you must
use "F+" to
> add floats. Similarly, in a typed system, you could define your own
type for
> fixed-point number data and also use the same "+" and "*"
operators. And, in
> a typed system, you could write generic numeric routines (e.g.,
averaging,
> sorting, fourier transform, ...) that would work regardless of
whether the
> numbers are integers, floats, or fixed-points; this leads to a
simpler
> system.
>

I don't completely agree. Depending on your needs, there are
arguments for and against types.

Types make it easier to debug programs if you write big chunks of
code at once and then do the debugging.
However considering most Forth-like systems are interactive and
promote writing of small and simple word definitions that can be
tested as soon as they're written, types often get in the way.
Honestly I'm still not completely used to that practise myself but I
can see it's benefits (doing this would probably easier if there was
an easy way to save files from Forth - blocks are an old and
simplistic way of doing this and they're pretty much useless if
you're running Forth on top of an operating system).

On the other hand many modern garbage collectors are impossible to
write in a typeless system.

Other benefits of types is that they can provide info for a compiler
to further optimize code but then again similar optimizations can be
performed on typeless systems through range analysis:

http://users.otenet.gr/~qfp10/thesis.html

For low-level access of hardware types just about always get in the
way. They also make interpreters/compilers more complex.

> Nevertheless, FORTH's untyped approach is quite respectable; it is
> impressive that some FORTH programmers are able to successfully
write large,
> complex programs in a system without type-checking. But, in my
opinion,
> type-checking is generally an advantage.
>
> By the way, I've almost finished writing the article that Manfred
requested,
> summing up the results about minimal bases for concatenative
combinator
> systems and about how to construct various series of combinators
("reverse",
> "dig", "bury", "dip", etc.); hopefully it will be ready within the
next
> couple days.
>

On another unrelated note the first implementation of my programming
language is getting closer to being released (how soon... I can't say
considering that I don't have as much free time as I used to). I
settled for the name Meta/4.

Meta/4 has features taken from Forth, Joy and Scheme. It's virtual
machine is mostly identical to Forth but includes another stack that
contains indexes into the heap (needed for reallocating memory
blocks) and is properly tail recursive. Memory allocator is based on
the next fit algorithm and heap is 4Mb in size by default (once heap
is exhausted malloc is called for allocation).

The basic data structure in Meta/4 is a stack. Because of that
internal memory fragmentation isn't a big problem considering that
every stack has the potential to grow in size and completely fill up
it's block - if it still needs more memory at that point it gets
reallocated (future implementation might include some simple
statistics in the header of memory blocks to help allocator in
deciding how much additional memory to allocate for a stack that
grows quickly).
Meta/4's stacks are similar to Joy's quotations however Meta/4 does
differentiate between code and data stacks (code stacks are created
with [ and ] while data stacks with ( and ) - strings are just a
variation of data stacks). Being typeless it doesn't stop you from
using a code stack as a data stack though.

IMMEDIATE words have been dropped in favor of something similar to
colorForth. Meta/4 has macros that are available only at compile time.
Unlike Forth, Meta/4's interpreter and compiler are split into two
parts.
I did my best to eliminate as many prefix/infix words as possible
while maintaining Forth's flexibility.
As a consequence Meta/4 word definitions are of following format:

[ code ] " word-name" ;

[ code ] places the code stack onto the top of the main data stack, "
acts pretty much like S" in Forth and ; takes a code stack and a
string as parameters and creates a word.

Current implementation is written in C and Netwide Assembler - not
very portable but it can be compiled on just about any x86 OS with a
C compiler. Right now the executable is 5Kb in size (this includes
the memory manager but excludes C run time library) and I expect the
final version to be between 8 and 12Kb (small size is mostly due to
the fact that the implementation is token-threaded).
I expect Meta/4's performance to be comparable to that of Forth.

> - Brent

Ivan

John Cowan — 2002-06-19 02:57:07

e1_t scripsit:

> 4) Forth doesn't have a garbage collector and considering it's
> untyped adding a decent garbage collector is tough unless you're
> prepared to give up typelessness (the problem of adding a garbage
> collector to Forth is comparable to adding one to C.

Well, *that's* a solved problem; the Joy1 implementation uses a
general gc, and is written in C.

--
John Cowan <jcowan@...> http://www.reutershealth.com
I amar prestar aen, han mathon ne nen, http://www.ccil.org/~cowan
han mathon ne chae, a han noston ne 'wilith. --Galadriel, _LOTR:FOTR_

e1_t — 2002-06-19 03:28:29

--- In concatenative@y..., John Cowan <cowan@c...> wrote:
> e1_t scripsit:
>
> > 4) Forth doesn't have a garbage collector and considering it's
> > untyped adding a decent garbage collector is tough unless you're
> > prepared to give up typelessness (the problem of adding a garbage
> > collector to Forth is comparable to adding one to C.
>
> Well, *that's* a solved problem; the Joy1 implementation uses a
> general gc, and is written in C.
>

Not entirely - because C is typeless only conservative garbage
collectors work with C (and same goes for Forth).
They can only free a block when it's absolutely certain there are no
pointers to that block (that's how that.
In Forth an integer and a memory address both look the same so if
some integer on the stack has the same value as an allocated block
that is not used anymore the garbage collector can't free that block.
The only way around this is adding types to Forth.

Ivan

e1_t — 2002-06-19 03:30:21

> Not entirely - because C is typeless only conservative garbage

Correction... because C is weakly-typed..

Ivan

Brent Kerby — 2002-06-19 03:24:52

>> [Advantages of types]
>
> I don't completely agree. Depending on your needs, there are
> arguments for and against types.

True ... Maybe I was being a bit simplistic. I appreciate your interesting
comments.

> On another unrelated note the first implementation of my programming
> language is getting closer to being released (how soon... I can't say
> considering that I don't have as much free time as I used to). I
> settled for the name Meta/4.
> [...]

Interesting work; I look forward to seeing it. I hope someday to work again
on a toy programming language again also.

> Ivan

- Brent

Stephen J Bevan — 2002-06-24 05:49:51

e1_t writes:
> On another unrelated note the first implementation of my programming
> language is getting closer to being released (how soon... I can't say
> considering that I don't have as much free time as I used to). I
> settled for the name Meta/4.

Were you aware of the other "Meta X" languages when you chose the name?
That is, Meta II and Meta III are languages used in (old) compiler
writing systems and Meta-IV is the specification language in the
Vienna Definition Method (VDM)?

Nick Forde — 2002-06-24 14:00:35

On a related topic, given the simple stack based nature of the Joy
language, and that every node on the stack can be assigned a type, can
anyone tell me if there is any real need to use a general purpose
garbage collector? Is it simply a convenience for the interpreter
writer or is there some language construct that is particularly
difficult or impossible to implement without GC?

Obviously the current implementation requires the GC for strings and
any other malloc'ed data types, but I'm curious whether there is a
more efficient means of implementing Joy specific allocation around
the stack. Does separating memory management from stack manipulation
offer any performance gains? Are there accepted practices in the
Forth community for dynamic memory management that could be adopted in
the Joy interpreter?

Nick.

e1_t writes:
> --- In concatenative@y..., John Cowan <cowan@c...> wrote:
> > e1_t scripsit:
> >
> > > 4) Forth doesn't have a garbage collector and considering it's
> > > untyped adding a decent garbage collector is tough unless you're
> > > prepared to give up typelessness (the problem of adding a garbage
> > > collector to Forth is comparable to adding one to C.
> >
> > Well, *that's* a solved problem; the Joy1 implementation uses a
> > general gc, and is written in C.
> >
>
> Not entirely - because C is typeless only conservative garbage
> collectors work with C (and same goes for Forth).
> They can only free a block when it's absolutely certain there are no
> pointers to that block (that's how that.
> In Forth an integer and a memory address both look the same so if
> some integer on the stack has the same value as an allocated block
> that is not used anymore the garbage collector can't free that block.
> The only way around this is adding types to Forth.
>
> Ivan

John Cowan — 2002-06-24 14:08:35

Nick Forde scripsit:
>
> On a related topic, given the simple stack based nature of the Joy
> language, and that every node on the stack can be assigned a type, can
> anyone tell me if there is any real need to use a general purpose
> garbage collector? Is it simply a convenience for the interpreter
> writer or is there some language construct that is particularly
> difficult or impossible to implement without GC?

The stack is a stack any time that user code looks at it, but it is common
within the interpreter to pop things off the stack, invoke interpreted
code, and then push them back on again. This is done by keeping a pointer
to the old TOS and restoring the elements to the new TOS by link-bending.

> Obviously the current implementation requires the GC for strings and
> any other malloc'ed data types, but I'm curious whether there is a
> more efficient means of implementing Joy specific allocation around
> the stack.

The non-GC version of Joy, BTW, simply mallocs string space freely
and blows up when it runs out of space.

--
John Cowan <jcowan@...> http://www.reutershealth.com
I amar prestar aen, han mathon ne nen, http://www.ccil.org/~cowan
han mathon ne chae, a han noston ne 'wilith. --Galadriel, _LOTR:FOTR_

Martin Young — 2002-06-24 15:05:09

Hi,

On Mon, 2002-06-24 at 15:00, Nick Forde wrote:
> Obviously the current implementation requires the GC for strings and
> any other malloc'ed data types, but I'm curious whether there is a
> more efficient means of implementing Joy specific allocation around
> the stack.

The following relates not to Joy but to a similar langauge of my own
("Monkey"). It's mostly the same but with some different types,
modified syntax and a splattering of semantic changes.

One of the semantic changes is that the language is linear i.e. there
are no references: all dups produce complete deep copies. It's a while
since I last looked but it's my impression that Joy sometimes allows
dupped things to be mutual aliases so effectively there are implicit
references. I'm sure somebody will correct me here. Because of this
(if true) the following cannot be applied to Joy.

[I picked up the term "linear" so this from a paper by (I think) Henry
Baker. He applied it to a LISP/Scheme varient, I recall. I hope I'm
using it in the same sense.]

In the absence of references GC is trivial: on duplication you allocate
memory for the object, when you destroy it you free the memory. Simple!

Of course actually doing all the allocation and freeing would be
monumentously inefficient. Internally, Monkey makes copious use of
references with a copy-on-modify policy to maintain the illusion of
lineality. It also does in-place stack modification when this is
invisible to the executing program.

Time for a leap of faith: because the language semantics don't emcompass
references it is impossible for a program to create cycles in the data
tree. Thus a pure reference-counting garbage collecter can be used.

I'm sure there's a hefty body of literature about doing ref-counting GC
well and/or quickly. Monkey does it in O(1) space as cells' ref-counts
drop to zero (i.e. it's kind of weakly concurrent).

--
Martin Young, working for: | Phone: +44(0) 1454 615151
Siroyan Limited, Bristol Design Centre, | Mobile: +44(0) 7855 758771
West Point Court, Great Park Road, | web: www.siroyan.com
Bradley Stoke, Bristol BS32 4QG. UK | email: my@...



**********************************************************************
This email and any files transmitted with it are confidential and
intended solely for the use of the individual or entity to whom they
are addressed. If you have received this email in error please notify
the system manager.

This footnote also confirms that this email message has been swept by
MIMEsweeper for the presence of computer viruses.

www.mimesweeper.com
**********************************************************************

e1_t — 2002-06-25 03:17:00

--- In concatenative@y..., Stephen J Bevan <stephen@e...> wrote:
> Were you aware of the other "Meta X" languages when you chose the
name?
> That is, Meta II and Meta III are languages used in (old) compiler
> writing systems and Meta-IV is the specification language in the
> Vienna Definition Method (VDM)?

No, I wasn't aware of those.

Ivan

e1_t — 2002-06-25 03:17:31

--- In concatenative@y..., Nick Forde <nickf@c...> wrote:
> On a related topic, given the simple stack based nature of the Joy
> language, and that every node on the stack can be assigned a type,
can
> anyone tell me if there is any real need to use a general purpose
> garbage collector? Is it simply a convenience for the interpreter
> writer or is there some language construct that is particularly
> difficult or impossible to implement without GC?
>
> Obviously the current implementation requires the GC for strings and
> any other malloc'ed data types, but I'm curious whether there is a
> more efficient means of implementing Joy specific allocation around
> the stack. Does separating memory management from stack manipulation
> offer any performance gains? Are there accepted practices in the
> Forth community for dynamic memory management that could be adopted
in
> the Joy interpreter?

As I was saying I feel that garbage collector in a functional Forth-
like language (like Joy) isn't neccessary considering that
destruction isn't implicit (you can't just forget about things you
put on stack - you have to destroy them eventually using pop - it
therefore makes sense if freeing operation is embedded into pop which
would remove the need for a garbage collector).

>
> Nick.
>

Ivan

wtanksle — 2002-06-25 20:07:02

--- In concatenative@y..., "Brent Kerby" <iepos@t...> wrote:
> Hey Alan, and readers on concatenative ...

> > Water told me to ask you about the debate between Forth and Joy
> > syntaxes...

> 1) in FORTH (except the recent strongFORTH), everything is untyped
> (you can
> treat pointers as integers, etc., without even needing to use
> casts) whereas Joy is typed.

Right. Note that strongForth is compile-time typed and polymorphic,
whereas Joy is runtime typed and not polymorphic (although it has
RTTI).

> 2) Joy allows pushing programs onto the stack, using the "[]"
> syntax; as far as I know, FORTH does not have anything like this.

Of course Forth allows you to push programs onto the stack; they're
pushed in the form of 'xt's, short for execution tokens. The only
catch versus Joy is that there's no such thing as a program literal.

Quotations in Forth are plain text, not lists; they don't act like
programs (i.e. you use a different word to EVALUATE a quotation than
you use to EXECUTE a program on the stack).

> 3) In FORTH, normally everything is UPPERCASE, whereas in Joy we
> usually prefer lowercase :-)

No, not really. The ANSI standard doesn't require case sensitivity,
but it states that if the system is case sensitive, all ANSI words
shall be in uppercase. This opposes common practice for many Forths,
so most people use non case-sensitive Forths.

> There's really not much of a conflict between FORTH and Joy, except
> for the type-checking issue.

There are many other points as well -- GC, for example.

> It seems that FORTH stems from a bare-bones,
> do-it-yourself approach, and type-checking is just seen as an
> unneccessary
> complexity that gets in the programmer's way by "creating errors"
> for him to fix.

That's a reasonable summary of the philosophy. Substitute any of the
other things Forth has chosen to neglect.

The biggest _language_ difference between the languages (GC and
dynamic typechecking are mere library differences) is almost
certainly Forth's ability to look ahead into its own source code.

Interestingly, the most modern Forth variant, colorForth, although
driven by the same goals as Forth, has abandoned this distinctive,
and replaces it with a very interesting one: synergy with the editor.
The colorForth editor is part of the system, and uses MVC to support
new 'syntactic' constructs. The editor itself is responsible for
precompiling the syntactic constructs into a format ready for the
compiler.

colorForth enjoys the distinction of being the most purely
concatenative language I've seen: in colorForth not only are all
programs formed by composition, but all subprograms can be formed by
decomposition. That is, you can textually factor any colorForth
program into two programs at any blank; no need to check whether the
blank is in the middle of a comment or quotation.

> - Brent

-Billy

John Cowan — 2002-06-25 20:49:28

wtanksle scripsit:

> Right. Note that strongForth is compile-time typed and polymorphic,
> whereas Joy is runtime typed and not polymorphic (although it has
> RTTI).

How so? Joy has many polymorphic words, such as "size", which returns
the size of a string, list, or set.

--
John Cowan <jcowan@...> http://www.reutershealth.com
I amar prestar aen, han mathon ne nen, http://www.ccil.org/~cowan
han mathon ne chae, a han noston ne 'wilith. --Galadriel, _LOTR:FOTR_

wtanksle — 2002-06-26 15:25:41

John Cowan <cowan@c...> wrote:
> wtanksle scripsit:

> > Right. Note that strongForth is compile-time typed and
> > polymorphic,
> > whereas Joy is runtime typed and not polymorphic
> > (although it has RTTI).

> How so? Joy has many polymorphic words, such as "size", which
> returns the size of a string, list, or set.

Correct -- Joy has polymorphic words. Joy itself isn't polymorphic,
but rather provides RTTI and the ability to branch on that
information. It's not possible to define a new morph of a word
without editing the word's source.

This doesn't seem like a trivial distinction to me; strongForth lacks
RTTI, and so is unable to do some of the things that Joy can.

> John Cowan <jcowan@r...> http://www.reutershealth.com

-Billy