Conca: a small interpreter

Claude Marinier — 2014-04-29 17:24:59

Bonjour,

I was inspired by reading "The joy of Joy" and other things about concatenative languages. Conca is an interpreter for a language like Joy and Cat. It is still young and is missing file output. It can do a few things. Here is a quick example.

C:\Util\conca>conca
conca 0.5, built on 2014-04-23
define built-in words
define parser and evaluator

>> [dup *] "sq" define
>> 12 sq .
144
>> [1 2 3 4 5 6 7 8 9] [sq] map .
[ 1 4 9 16 25 36 49 64 81 ]
>> "fibonacci.conca" load
>> 18 fibonacci .
[ 1 1 2 3 5 8 13 21 34 55 89 144 233 377 610 987 1597 2584 ]
>> quit

You will notice that Conca definitions are postfix like the rest of the language. I could not think of a good reason to deviate from the postfix syntax to handle definitions. Is there a technical reason Cat and Joy use something else?

  • Joy
    square == dup *
  • Cat
    define square { dup * }
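
To make the comparison concrete, here is a minimal Python sketch of a postfix evaluator in which "define" is just another word that pops a name and a quotation from the stack. It is not Conca's actual code; the function names (tokenize, parse, run, make_words) and the data layout are assumptions made for illustration, and string literals may not contain spaces.

```python
# Sketch only: a tiny postfix evaluator in which "define" is an ordinary word.
# All names here (tokenize, parse, run, make_words) are illustrative assumptions.

def tokenize(source):
    # Pad brackets so that "[dup" splits into "[" and "dup".
    # Simplification: string literals may not contain spaces or brackets.
    return source.replace("[", " [ ").replace("]", " ] ").split()

def parse(tokens):
    """Build nested lists for quotations; word names stay as str."""
    levels = [[]]
    for tok in tokens:
        if tok == "[":
            levels.append([])
        elif tok == "]":
            if len(levels) == 1:
                raise SyntaxError("unexpected ]")
            quotation = levels.pop()
            levels[-1].append(quotation)
        elif len(tok) >= 2 and tok[0] == '"' and tok[-1] == '"':
            levels[-1].append(("str", tok[1:-1]))   # string literal
        else:
            try:
                levels[-1].append(int(tok))          # number
            except ValueError:
                levels[-1].append(tok)               # word name
    if len(levels) != 1:
        raise SyntaxError("missing ]")
    return levels[0]

def run(program, data, words):
    """Execute parsed items against the data stack."""
    for item in program:
        if isinstance(item, str):            # a word: look it up
            action = words[item]
            if callable(action):
                action(data, words)          # built-in word
            else:
                run(action, data, words)     # user-defined quotation
        elif isinstance(item, tuple):        # string literal: push its text
            data.append(item[1])
        else:                                # number or quotation: push
            data.append(item)

def make_words():
    def w_dup(data, words):
        data.append(data[-1])

    def w_mul(data, words):
        b = data.pop()
        a = data.pop()
        data.append(a * b)

    def w_define(data, words):
        # Postfix definition: quotation first, then the name, then "define".
        name = data.pop()
        body = data.pop()
        words[name] = body

    def w_map(data, words):
        quotation = data.pop()
        items = data.pop()
        result = []
        for value in items:
            data.append(value)
            run(quotation, data, words)
            result.append(data.pop())
        data.append(result)

    return {"dup": w_dup, "*": w_mul, "define": w_define, "map": w_map}

words = make_words()
data = []
run(parse(tokenize('[dup *] "sq" define 12 sq')), data, words)
print(data)
```

Because "define" runs like any other word, the dictionary changes only when the evaluator reaches it, so nothing about a definition is known before the program is executed.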

I am not sure how to deal with errors. Currently, Conca prints a message and skips the rest of the line. Should errors in a script be treated differently? So many choices to make. This is more complicated than I first thought.  :-)
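
One possible answer, sketched in Python under stated assumptions: keep the "print a message and skip the rest of the line" behaviour at the interactive prompt, but stop a script at its first error. The toy evaluator, the ConcaError class and the run_lines function are hypothetical and exist only to show the two policies side by side.

```python
# Sketch only: two error policies for the same toy evaluator.
# ConcaError, eval_line and run_lines are hypothetical names.

class ConcaError(Exception):
    """Raised by a word that cannot do its job."""

def eval_line(line, stack):
    """Toy evaluator: integers are pushed, '+' adds, anything else fails."""
    for tok in line.split():
        if tok == "+":
            if len(stack) < 2:
                raise ConcaError("'+' needs two values")
            b = stack.pop()
            a = stack.pop()
            stack.append(a + b)
            continue
        try:
            stack.append(int(tok))
        except ValueError:
            raise ConcaError(f"unknown word '{tok}'")

def run_lines(lines, stack, interactive):
    """Evaluate lines in order and return the error messages reported."""
    errors = []
    for number, line in enumerate(lines, start=1):
        try:
            eval_line(line, stack)
        except ConcaError as err:
            errors.append(f"line {number}: {err}")
            if not interactive:
                break   # script: give up at the first error
            # interactive: the rest of this line is skipped, then carry on
    return errors

stack = []
for message in run_lines(["1 2 +", "oops 5", "3 +"], stack, interactive=True):
    print(message)
print(stack)
```

Note that in both modes the words executed before the failure have already changed the stack. Whether to roll that back is one more of the choices to make.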

You can get a Win32 binary distribution here. It includes a Fibonacci definition script and a test script.

   http://sourceforge.net/projects/conca/

I will read and consider comments. If there is interest, I may develop it further.

Merci.


P.S. Has interest in concatenative languages waned in recent years?


--

Claude Marinier

Jon Purdy — 2014-04-29 18:20:31

> I was inspired by reading "The joy of Joy" and other things about concatenative languages. Conca is an interpreter for a language like Joy and Cat.

This looks like a good project for learning about how concatenative
languages are put together.

> You will notice that Conca definitions are postfix like the rest of the language. I could not think of a good reason to deviate from the postfix syntax to handle definitions. Is there a technical reason Cat and Joy use something else?

Cat is statically typed, and it needs to know ahead of time what
definitions are available for type checking. I don’t know much about
Joy, but it doesn’t seem to have the same facility for mixing
compile-time and runtime evaluation that Forth does, so it seems
simpler to make “==” just part of the syntax.

Since Conca is heavily dynamic, it makes sense to keep it consistently postfix.

> I am not sure how to deal with errors. Currently, Conca prints a message and skips the rest of the line. Should errors in a script be treated differently? So many choices to make. This is more complicated than I first thought. :-)

Error handling is a hard problem! You might implement an exception
system of some kind; it’s entirely up to you.

> I will read and consider comments. If there is interest, I may develop it further.

You should develop it further because of your own interest, not
because of other people’s. The question is: what problem do you want
Conca to solve? If you can answer that clearly, even if the answer is
“I just want to write a programming language for fun”, you will know
exactly what to do. :)

> P.S. Has interest in concatenative languages waned in recent years?

The community is rather quiet, but fairly active. You should join
#concatenative on Freenode, where we talk about Factor and I post
updates about my own statically typed concatenative language project
called Kitten.

William Tanksley, Jr — 2014-04-29 19:33:53

Jon's advice is good.

Claude Marinier <claudem223@...> wrote:
> You will notice that Conca definitions are postfix like the rest of the language. I could not think of a good reason to deviate from the postfix syntax to handle definitions. Is there a technical reason Cat and Joy use something else?

Making definitions "postfix" in a concatenative language actually
means that they're executed at runtime, which means it's possible to
build definitions at runtime. That's a major problem in a typechecked
language, or one that's trying to be theoretically pure in some
defined way.

At least, so far as we know...

> P.S. Has interest in concatenative languages waned in recent years?

It comes and goes. There are some unsolved questions about the idea,
and some people think there's no solution. There are some reasonable
success stories, but all of them have shortcomings.

> Claude Marinier

-Wm

Jon Purdy — 2014-04-29 19:58:06

> Making definitions "postfix" in a concatenative language actually
> means that they're executed at runtime, which means it's possible to
> build definitions at runtime.

Not necessarily. “define” could be a compile-time word that adds a
definition to the dictionary. So the language would need forward
declarations like C, but would still be statically checkable.
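
A rough Python sketch of that idea, under assumptions of my own: the program is already parsed into nested lists (quotations), ("str", name) tuples (string literals), plain strings (words) and integers, and "define" is only accepted in the literal form quotation, name, define. The checker walks the source in order, so a word must be defined before it is used, which is the forward-declaration flavour mentioned above. The function name check_program is hypothetical.

```python
# Sketch only: treat `[ ... ] "name" define` as a compile-time form and
# report words that are used before they are known. Nothing is executed.

def check_program(items, builtins):
    """Return the unknown words in source order."""
    known = set(builtins) | {"define"}
    unknown = []

    def check_body(body):
        for item in body:
            if isinstance(item, list):
                check_body(item)                 # nested quotation
            elif isinstance(item, str) and item not in known:
                unknown.append(item)

    i = 0
    while i < len(items):
        item = items[i]
        is_definition = (
            isinstance(item, list)
            and i + 2 < len(items)
            and isinstance(items[i + 1], tuple)
            and items[i + 2] == "define"
        )
        if is_definition:
            known.add(items[i + 1][1])           # register first: allows recursion
            check_body(item)
            i += 3
        else:
            check_body([item])
            i += 1
    return unknown

program = [["dup", "*"], ("str", "sq"), "define", 12, "sq", "cube"]
print(check_program(program, {"dup", "*"}))
```

This only checks that names resolve. A real static checker would also attach a stack effect or type to each name at the point where it is registered.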

Claude Marinier — 2014-04-29 22:39:07

On Tue, 29 Apr 2014, Jon Purdy wrote:
> > You will notice that Conca definitions are postfix like the rest of
> > the language. I could not think of a good reason to deviate from the
> > postfix syntax to handle definitions. Is there a technical reason Cat
> > and Joy use something else?
>
> Cat is statically typed, and it needs to know ahead of time what
> definitions are available for type checking. I don’t know much about
> Joy, but it doesn’t seem to have the same facility for mixing
> compile-time and runtime evaluation that Forth does, so it seems
> simpler to make “==” just part of the syntax.
>
> Since Conca is heavily dynamic, it makes sense to keep it consistently
> postfix.

Ah. OK.

> > I am not sure how to deal with errors. Currently, Conca prints a
> > message and skips the rest of the line. Should errors in a script be
> > treated differently? So many choices to make. This is more complicated
> > than I first thought. :-)
>
> Error handling is a hard problem! You might implement an exception
> system of some kind; it’s entirely up to you.

I will consider this.

> > I will read and consider comments. If there is interest, I may develop
> > it further.
>
> You should develop it further because of your own interest, not
> because of other people’s. The question is: what problem do you want
> Conca to solve? If you can answer that clearly, even if the answer is
> “I just want to write a programming language for fun”, you will know
> exactly what to do. :)

It is mostly the fun of developing it. :-)

--
Claude Marinier




Claude Marinier — 2014-04-29 22:49:28

On Tue, 29 Apr 2014, William Tanksley, Jr wrote:
>
> Jon's advice is good.
>
> Claude Marinier <claudem223@...> wrote:
>
> > You will notice that Conca definitions are postfix like the rest of
> > the language. I could not think of a good reason to deviate from the
> > postfix syntax to handle definitions. Is there a technical reason Cat
> > and Joy use something else?
>
> Making definitions "postfix" in a concatenative language actually
> means that they're executed at runtime, which means it's possible to
> build definitions at runtime. That's a major problem in a typechecked
> language, or one that's trying to be theoretically pure in some
> defined way.

I think I see the problem: it is nice to have assurances at the time the
function is defined rather than later that the code will work as expected.

--
Claude Marinier

Claude Marinier — 2014-04-29 22:53:54

On Tue, 29 Apr 2014, Jon Purdy wrote:
>
> > Making definitions "postfix" in a concatenative language actually
> > means that they're executed at runtime, which means it's possible to
> > build definitions at runtime.
>
> Not necessarily. “define” could be a compile-time word that adds a
> definition to the dictionary. So the language would need forward
> declarations like C, but would still be statically checkable.

The code in a quotation could be type checked as it is read; this could be
done for all cases: a quotation before a conditional or a loop as well as
code for a function definition.
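
As a hedged illustration of checking a quotation when it is read, the Python sketch below infers only how many values a flat quotation consumes and produces. It does not check types. The EFFECTS table and the function name infer_effect are assumptions, not part of Conca.

```python
# Sketch only: infer (consumed, produced) counts for a flat quotation.
# EFFECTS is a hypothetical table of (inputs, outputs) per built-in word.

EFFECTS = {
    "dup": (1, 2),
    "drop": (1, 0),
    "swap": (2, 2),
    "+": (2, 1),
    "*": (2, 1),
}

def infer_effect(quotation):
    """Return (consumed, produced) for a list of ints and known words."""
    needed = 0   # values the caller must supply
    depth = 0    # values currently available, counting those supplied
    for item in quotation:
        if isinstance(item, int):
            inputs, outputs = 0, 1          # a literal pushes one value
        else:
            inputs, outputs = EFFECTS[item]  # KeyError means unknown word
        if depth < inputs:
            needed += inputs - depth        # borrow more from the caller
            depth = inputs
        depth += outputs - inputs
    return needed, depth

print(infer_effect(["dup", "*"]))
```

With such a summary available for every quotation, a word like map could reject a quotation that does not take one value and leave one value before any element is processed.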

Conca does all its type checking at run time. I see that this will delay
error detection to a less convenient time.

--
Claude Marinier



— 2014-04-30 10:11:54

No, my own Furphy (see http://users.beagle.com.au/peterl/furphy.html) uses postfix (Reverse Polish) naming during compilation, and it doesn't have any distinct run time stuff during compilation apart from what is implemented by immediate words. In fact, that was the most natural way to work it with a simple compile-and-go compiler, so as not to need any special construct to be actioned during compilation to carry out a definition - and, without a defining word at all, there isn't a defining word available later at run time. With that approach, the compiler starts a new word and then compiles tokens and numbers from the source into a new word until it reaches the end or an unrecognised token, either of which terminates the current word with a return or tail call optimisation. An unrecognised token is assigned as the current word's name and then the compiler starts a new word and continues as before; at the end of the source there is one final anonymous word, and that is just run (it can save an executable and halt before falling through into application words that the executable will start at, if you want compile-and-save behaviour).
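
The naming scheme described above can be sketched in a few lines of Python. This is only my reading of the description, not Furphy's implementation: it ignores immediate words, return and tail-call handling, and saving an executable, and the names compile_source, execute and make_primitives are hypothetical.

```python
# Sketch only: an unrecognised token names the word compiled so far.

def make_primitives():
    def dup(stack):
        stack.append(stack[-1])

    def mul(stack):
        b = stack.pop()
        a = stack.pop()
        stack.append(a * b)

    return {"dup": dup, "*": mul}

def compile_source(source, dictionary):
    """Compile whitespace-separated tokens; return the final anonymous word."""
    current = []
    for tok in source.split():
        try:
            current.append(int(tok))          # number: compile a literal
            continue
        except ValueError:
            pass
        if tok in dictionary:
            current.append(tok)               # known word: compile a call
        else:
            dictionary[tok] = current         # unknown token: name the word
            current = []                      # and start a new one
    return current

def execute(body, stack, dictionary):
    for item in body:
        if isinstance(item, int):
            stack.append(item)
        else:
            target = dictionary[item]
            if callable(target):
                target(stack)                 # primitive
            else:
                execute(target, stack, dictionary)

dictionary = make_primitives()
main = compile_source("dup * sq 12 sq", dictionary)
stack = []
execute(main, stack, dictionary)
print(stack)
```

One consequence is visible in the sketch: a misspelled word is never a compile error, it simply becomes the name of whatever was compiled before it.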

— 2014-04-30 10:34:21

Interestingly, from the point of view of error checking, the simplified version of my own Furphy (see http://users.beagle.com.au/peterl/furphy.html) doesn't have any possible compile time errors apart from insufficient memory. ALL possible source is syntactically valid, and run time checks are the only meaningful ones! But I also have immediate words to run during compilation to provide syntactic sugar, which not only makes some syntax like matching square brackets and quotation marks necessary, it also provides natural places to test for errors in them. Ideally, the compiler builds up a pretty printed listing as it goes, and syntax failures simply drop error annotations into that while keeping track of the error level reached and making safe-ish default assumptions to try to fill in mismatches; at the end, just before the compiler triggers its compile-and-go run time execution, it should check the error level and only run if an immediate word has specified a safe higher error level than a default of no errors (and there should be a default output of the listing at that stage whether the program runs or not, unless another immediate word has stopped that).

A good all round guide to this whole area is P.J.Brown's "Writing Interactive Compilers and Interpreters".

— 2014-04-30 10:37:17

Testing... I'm not seeing confirmation that my replies to WT Jr. and CM have gone through.

John Cowan — 2014-04-30 18:26:02

Claude Marinier scripsit:

> You will notice that Conca definitions are postfix like the rest of the
> language. I could not think of a good reason to deviate from the postfix
> syntax to handle definitions. Is there a technical reason Cat and Joy use
> something else?

Joy programs rely on the fact that you can't redefine Joy words at runtime.
So yes, the way definitions are written is just syntax.

--
John Cowan http://www.ccil.org/~cowan cowan@...
Female celebrity stalker, on a hot morning in Cairo:
"Imagine, Colonel Lawrence, ninety-two already!"
El Auruns's reply: "Many happy returns of the day!"

Claude Marinier — 2014-04-30 22:53:53

On Wed, 30 Apr 2014, pml540114@... wrote:
>
> Interestingly, from the point of view of error checking, the simplified
> version of my own Furphy (see
> http://users.beagle.com.au/peterl/furphy.html) doesn't have any possible
> compile time errors apart from insufficient memory. ALL possible source
> is syntactically valid, and run time checks are the only meaningful
> ones! But I also have immediate words to run during compilation to
> provide syntactic sugar, which not only makes some syntax like matching
> square brackets and quotation marks necessary, it also provides natural
> places to test for errors in them. Ideally, the compiler builds up a
> pretty printed listing as it goes, and syntax failures simply drop error
> annotations into that while keeping track of the error level reached and
> making safe-ish default assumptions to try to fill in mismatches; at the
> end, just before the compiler triggers its compile-and-go run time
> execution, it should check the error level and only run if an immediate
> word has specified a safe higher error level than a default of no errors
> (and there should be a default output of the listing at that stage
> whether the program runs or not, unless another immediate word has
> stopped that).
>
> A good all round guide to this whole area is P.J.Brown's "Writing
> Interactive Compilers and Interpreters".

I tend to read e-mail from this account in the evening (North America,
EDT).

I am reading and digesting what you say. That's a lot to think about.

Yes, there is so little syntax that all programs are valid. The issue here
is type checking. At run-time, built-in words check the type of the data
they use. The dynamic nature of the language makes this the default
behaviour. Static type checking requires knowledge of what the caller
expects (usually easy) and of what the caller is providing (much more
difficult).

This is starting to look like "more than I can chew"(1).

Thanks for the comments.

(1) American English has an abundant supply of colourful idiomatic
expressions. :-)

--
Claude Marinier

eas lab — 2014-05-01 01:36:57

Thanks for the good high-level overview of Furphy, in these F/B Twitter days,
where there's nothing but repetitive eye-candy and pointers to ...
I.e. no meat!

== Chris Glur.

— 2014-05-01 10:17:34

>Wed Apr 30, 2014 3:53 pm (PDT) . Posted by:
>"Claude Marinier"
>
>On Wed, 30 Apr 2014, pml540114@... wrote:
>>
>> Interestingly, from the point of view of error checking, the simplified
>> version of my own Furphy (see
>> http://users.beagle.com.au/peterl/furphy.html) doesn't have any possible
>> compile time errors apart from insufficient memory. ALL possible source
>> is syntactically valid, and run time checks are the only meaningful
>> ones! But I also have immediate words to run during compilation to
>> provide syntactic sugar, which not only makes some syntax like matching
>> square brackets and quotation marks necessary, it also provides natural
>> places to test for errors in them. Ideally, the compiler builds up a
>> pretty printed listing as it goes, and syntax failures simply drop error
>> annotations into that while keeping track of the error level reached and
>> making safe-ish default assumptions to try to fill in mismatches; at the
>> end, just before the compiler triggers its compile-and-go run time
>> execution, it should check the error level and only run if an immediate
>> word has specified a safe higher error level than a default of no errors
>> (and there should be a default output of the listing at that stage
>> whether the program runs or not, unless another immediate word has
>> stopped that).
>>
>> A good all round guide to this whole area is P.J.Brown's "Writing
>> Interactive Compilers and Interpreters".
>
>I tend to read e-mail from this account in the evening (North America,
>EDT).

I am in Australia; it's around 8 p.m. right now.

>
>I am reading and digesting what you say. That's a lot to think about.
>
>Yes, there is so little syntax that all programs are valid. The issue here
>is type checking. At run-time, built-in words check the type of the data
>they use. The dynamic nature of the language makes this the default
>behaviour. Static type checking requires knowledge of what the caller
>expects (usually easy) and of what the caller is providing (much more
>difficult).

The issue is partly what the syntax features offer a programmer by holding his hand and making him do the right thing, and partly what functionality can be provided most easily that way. For instance, you can do object oriented programming in C even though it doesn't have the features for coding it directly, but the features for that in C++ make it easier.

Since I come from a Forth-ish orientation, I don't want to constrain the programmer too much. Rather, I have found that careful choice of meaningful word names helps a lot here by making him think through what he is doing. So called "Hungarian notation" - naming guidelines to reflect conceptual types (i.e. what the programmer has in mind as a type, not what the virtual machine will insist on) - may help too.

Over and above that, I have found Forth stack commenting helpful, e.g. ( true/false val1 val2 --- val ) is a comment for Furphy's IF that should make it clear that it expects three parameters, of which the first should be conceptually Boolean but the others are unrestricted, and leaves just one unrestricted value. I have thought about using a variant of that comment formalism to set up "assertions", i.e. pieces of run time code that are only triggered during testing and return error messages if parameters and results don't match the assertions. That might perhaps be extended to static testing during compilation.

For me, types look less promising as a coding guideline than as a way of implementing object oriented stuff; Wirth's Oberon family of languages implements object orientation by using type extension, so it should be possible. If I ever add it to Furphy, first I will look hard at Oberon and FICL (an object oriented Forth).

>
>This is starting to look like "more than I can chew"(1).
>
>Thanks for the comments.
>
>(1) American English has an abundant supply of colourful idiomatic
>expressions. :-)

Although I am in Australia, I am in fact British, of a third generation of world travellers. "Biting off more than you can chew" isn't U.S. English in particular, just English. Then again, French has not a few such expressions of its own; "faire suer le burnous" (my parents lived and worked in Algiers between 1944 and 1948, and met there) and "aller aux fraises" come to mind, just off the top of my head (which is itself an expression). PML.

>
>--
>Claude Marinier

John Cowan — 2014-05-01 14:04:03

pml540114@... scripsit:

> Although I am in Australia, I am in fact British, of a third
> generation of world travellers. "Biting off more than you can chew"
> isn't U.S. English in particular, just English.

As I said in connection with "till the cows come home" once, the
Sundering Sea has not sundered our homely metaphors. (Alas, however,
"homely" means "ugly" in North America.)

--
John Cowan http://www.ccil.org/~cowan cowan@...
That you can cover for the plentiful and often gaping errors, misconstruals
and disinformation in your posts through sheer volume -- that is another
misconception. --Mike to Peter

— 2014-05-01 14:23:39

Hello Claude, hello *,

The decision of Joy to allow definitions only at compile time may force development in a direction like term rewriting or something like that; I don't know.

What I think is: introducing a define primitive is a door-opener to a Scheme-like style of programming, maybe a bit more than Scheme style, and a more natural way than Scheme macros.

Scheme and Joy both say: Program == Data == List
So put something on the stack, do all the data transformation you want until you get the definition you need, and define.

Like you, I am one of those with a self-made Joy implementation.
This is how my Joy does it:

zip before transformation:
zip:    {l l  -- l}
    ( nil?          | nip
    | nild?         | zap
    others        | 2uncons pair^^  loop cons    )dsplit cond*

and afterwards:
(zip) {l l -- l} (((nil? )(nip )(nild? )(zap )(else )(2uncons (pair )dip2 zip cons ))cond* ) define


About type checking at compile time:

Programming in Haskell or OCaml, when coming from Scheme or Lisp, is a cool experience:
when it runs, it is correct.

But there are very successful languages without this property, such as Scheme/Lisp or Python.

best regards
Heiko

Claude Marinier — 2014-05-01 22:59:14

On Thu, 1 May 2014, pml540114@... wrote:
>
> So called "Hungarian notation" - naming guidelines to reflect conceptual
> types (i.e. what the programmer has in mind as a type, not what the
> virtual machine will insist on) - may help too.

This is a controversial topic in some circles but in this case it is a
valuable practice.

> Over and above that, I have found Forth stack commenting helpful, e.g. (
> true/false val1 val2 --- val ) is a comment for Furphy's IF that should
> make it clear that it expects three parameters, of which the first
> should be conceptually Boolean but the others are unrestricted, and
> leaves just one unrestricted value. I have thought about using a variant
> of that comment formalism to set up "assertions", i.e. pieces of run
> time code that are only triggered during testing and return error
> messages if parameters and results don't match the assertions. That
> might perhaps be extended to static testing during compilation.

That is a good idea: type assertions which can be turned off for
production code. It could be handled (internally) by one function which
compares the assertions with the contents of the stack. It might
not be too hard. Hum ...
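
A minimal Python sketch of that single checking function, with loudly stated assumptions: the type names ("bool", "int", "list", "any"), the SETTINGS switch and the word_if example are all hypothetical, and the choice that IF keeps val1 when the flag is true is my guess at the intended meaning of ( true/false val1 val2 --- val ).

```python
# Sketch only: stack assertions that can be switched off for production.

SETTINGS = {"assertions": True}   # set to False to skip all checks

def matches(value, type_name):
    if type_name == "any":
        return True
    if type_name == "bool":
        return isinstance(value, bool)
    if type_name == "int":
        # bool is a subclass of int in Python, so exclude it explicitly
        return isinstance(value, int) and not isinstance(value, bool)
    if type_name == "list":
        return isinstance(value, list)
    raise ValueError(f"unknown type name {type_name!r}")

def check_stack(stack, expected, where):
    """Compare the top of the stack with type names (rightmost is the top)."""
    if not SETTINGS["assertions"]:
        return
    if len(stack) < len(expected):
        raise TypeError(
            f"{where}: expected {len(expected)} values, found {len(stack)}")
    top = stack[len(stack) - len(expected):]
    for value, type_name in zip(top, expected):
        if not matches(value, type_name):
            raise TypeError(f"{where}: expected {type_name}, found {value!r}")

def word_if(stack):
    # Stack comment: ( true/false val1 val2 --- val )
    check_stack(stack, ["bool", "any", "any"], "if, on entry")
    val2 = stack.pop()
    val1 = stack.pop()
    flag = stack.pop()
    stack.append(val1 if flag else val2)   # assumption: val1 when true
    check_stack(stack, ["any"], "if, on exit")

stack = [True, 1, 2]
word_if(stack)
print(stack)
```

The same expected-type lists could later feed a static checker, since they are plain data rather than code.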

--
Claude Marinier

William Tanksley, Jr — 2014-05-01 23:36:45

Claude Marinier <claudem223@...> wrote:
> That is a good idea: type assertions which can be turned off for
> production code. It could be handled (internally) by one function which
> compares the assertions with the contents of the stack. It might
> not be too hard. Hum ...

Another solution would involve type assertions that are handled
statically when possible, and dynamically when not.

Factor, by the way, is a concatenative language that performs static
typechecking. It's a very impressive project.

> Claude Marinier

-Wm