Christopher Diggins — 2007-04-05 21:37:59

I've written a short article about how the Cat interpreter works on
the Cat wiki http://code.google.com/p/cat-language/wiki/HowTheInterpreterWorks.
I am hoping that it will help other people who are interested in
learning how to construct interpreters and compilers.

Any comments or suggestions would be much appreciated!

Christopher Diggins
http://www.cdiggins.com

chris glur — 2007-04-07 07:18:51

From a superficial initial read it seems to be a well structured
top-down presentation. I've avoided C#, because I don't want
to be screwed by M$, but I was able to understand what I
examined of the text.

I know a bit about compilers/interpreters and it's not clear
why I would want to analyse another one. I was hoping that
'concatinative methods' would lead towards a formal
[mathematical like] way of anlysing algorithms and/or
proving code correctness. Does it ?

== Chris Glur.

On 4/5/07, Christopher Diggins <cdiggins@...> wrote:
> I've written a short article about how the Cat interpreter works on
> the Cat wiki
> http://code.google.com/p/cat-language/wiki/HowTheInterpreterWorks.
> I am hoping that it will help other people who are interested in
> learning how to construct interpreters and compilers.
>
> Any comments or suggestions would be much appreciated!
>
> Christopher Diggins
> http://www.cdiggins.com
>

Christopher Diggins — 2007-04-07 07:46:23

>
> On 4/5/07, Christopher Diggins <cdiggins@...> wrote:
> > I've written a short article about how the Cat interpreter works on
> > the Cat wiki
> > http://code.google.com/p/cat-language/wiki/HowTheInterpreterWorks.
> > I am hoping that it will help other people who are interested in
> > learning how to construct interpreters and compilers.
> >
> > Any comments or suggestions would be much appreciated!
> >
> > Christopher Diggins
> > http://www.cdiggins.com
> >
>On 4/7/07, chris glur <crglur@...> wrote:
>
> From a superficial initial read it seems to be a well structured
> top-down presentation. I've avoided C#, because I don't want
> to be screwed by M$, but I was able to understand what I
> examined of the text.
>
> I know a bit about compilers/interpreters and it's not clear
> why I would want to analyse another one. I was hoping that
> 'concatinative methods' would lead towards a formal
> [mathematical like] way of anlysing algorithms and/or
> proving code correctness. Does it ?
>

Thanks for compliment Chris.

In general the concatenative approach greatly simplifies reasoning
about algorithms and code. It also simplifies automated translation,
analysis, compilation, and optimization. Check out Manfred von Thun's
writing about algebraic manipulation of concatenative code (
http://www.latrobe.edu.au/philosophy/phimvt/joy/j04alg.html ).

It is common for most compilers to reduce code to a point-free form
(e.g. nameless) before conducting further analysis, which I think
testifies to the usefulness of point-free forms (concatenative or
otherwise). The fact that call-by-push-value semantics (
http://citeseer.ist.psu.edu/234600.html ) subsumes call by name and
call by value semantics may also indicate that stack-based approaches
are perhaps a more general formalism than either of them.

There are of course some who disagree on the value of concatenative
approaches ( http://lambda-the-ultimate.org/node/2099 )

By the way, the Mono implementation of C# (
http://www.mono-project.com/ ) is quite good, and the Cat
implementation has been reported to compile on implementations of
Mono. So you aren't completely bound to MS, but I understand the
reluctance.

Hope this helps,
Christopher Diggins

chris glur — 2007-04-08 07:15:36

I HAD previously fetched several of that site's pages.
But not this one which seems to contain the guts of the applicable theory.
I've started analysing it, but I fear I might find the same
problem, as indicated by my below pAste of a complaint/suggestion
to Usenet on this same topic. As a student I remember that ususally
when I completed the course and saw the BIG-picture, I always
thought that the material had not been sequenced properly.

j04alg.html too seems to be just a set of bottom-up constructs,
where I can't see what his final aim is.

If/when I finally DO see the big picture, I'll try to structure the
explanation, in a top-down way, instead of just complaining.

Thanks,

== Chris Glur.
--------- extract of previous post to Usenet-groups:-
So far the best fetched material, suggests that what I'm 'looking for'
does exists. Eg.
http://www.mactech.com/articles/mactech/Vol.07/07.05/
LambdaCalculus/

> "A Calculus for the Algebraic-like Manipulation of Computer Code"
> Some of his [optimistic for me] statements are:

> Wouldn't it be nice to be able to manipulate computer code in the
> same sort of clean mathematical way that one does algebra?

His 1'000 line article handles [nicely] sequentially:
Intro to l-calculus, LISP 10, Currying, Free Variables and
Higher-Order Functions, Intro to Combinators, Programming with
Combinators, Church Numerals, Arithmetic Functions, The Y Combinator,
Aftermath; which are each individually understandable sections.

But I can't see how they form an inference chain to the hoped-for
conclusion.

IMO the correct way to explain a complex topic is by backward
chaining.
Ie. top-down decomposition: the same way as programming is best done.
Eg. we want to acheive X,
which requires X1, X2...Xn,
which requires X11, X12....X21, X22...

Ie. the goal must be stated up front and be kept in sight and it must
be always clear that/how each step will lead to the goal.
Which is the principle of "Hello world". Ie. start with a minimal but
complete [up to the conclusion] explanation/statement. And proceed
by successive refinement.

His article seems to be suggesting the conclusions which I hope for,
namely eg. :
* any algorithm can be expressed in some minimum/cannonical syntax,
* which can be transformed to equivalent syntax[s]: S;
* which satisfy 'mathemetical' rule-set M;
* and allows manipulations P;
* which can arrive at conclusions C, for the original algorithm/code.

On 4/7/07, Christopher Diggins <cdiggins@...> wrote:
> >
> > On 4/5/07, Christopher Diggins <cdiggins@...> wrote:
> > > I've written a short article about how the Cat interpreter works on
> > > the Cat wiki
> > > http://code.google.com/p/cat-language/wiki/HowTheInterpreterWorks.
> > > I am hoping that it will help other people who are interested in
> > > learning how to construct interpreters and compilers.
> > >
> > > Any comments or suggestions would be much appreciated!
> > >
> > > Christopher Diggins
> > > http://www.cdiggins.com
> > >
> >On 4/7/07, chris glur <crglur@...> wrote:
> >
> > From a superficial initial read it seems to be a well structured
> > top-down presentation. I've avoided C#, because I don't want
> > to be screwed by M$, but I was able to understand what I
> > examined of the text.
> >
> > I know a bit about compilers/interpreters and it's not clear
> > why I would want to analyse another one. I was hoping that
> > 'concatinative methods' would lead towards a formal
> > [mathematical like] way of anlysing algorithms and/or
> > proving code correctness. Does it ?
> >
>
> Thanks for compliment Chris.
>
> In general the concatenative approach greatly simplifies reasoning
> about algorithms and code. It also simplifies automated translation,
> analysis, compilation, and optimization. Check out Manfred von Thun's
> writing about algebraic manipulation of concatenative code (
> http://www.latrobe.edu.au/philosophy/phimvt/joy/j04alg.html ).
>
> It is common for most compilers to reduce code to a point-free form
> (e.g. nameless) before conducting further analysis, which I think
> testifies to the usefulness of point-free forms (concatenative or
> otherwise). The fact that call-by-push-value semantics (
> http://citeseer.ist.psu.edu/234600.html ) subsumes call by name and
> call by value semantics may also indicate that stack-based approaches
> are perhaps a more general formalism than either of them.
>
> There are of course some who disagree on the value of concatenative
> approaches ( http://lambda-the-ultimate.org/node/2099 )
>
> By the way, the Mono implementation of C# (
> http://www.mono-project.com/ ) is quite good, and the Cat
> implementation has been reported to compile on implementations of
> Mono. So you aren't completely bound to MS, but I understand the
> reluctance.
>
> Hope this helps,
> Christopher Diggins
>

chris glur — 2008-04-15 02:38:56

> > I know a bit about compilers/interpreters and it's not clear
> > why I would want to analyse another one. I was hoping that
> > 'concatinative methods' would lead towards a formal
> > [mathematical like] way of anlysing algorithms and/or
> > proving code correctness. Does it ?

Christopher Diggins wrote;

> In general the concatenative approach greatly simplifies reasoning
> about algorithms and code. It also simplifies automated translation,
> analysis, compilation, and optimization. Check out Manfred von Thun's
> writing about algebraic manipulation of concatenative code
>( http://www.latrobe.edu.au/philosophy/phimvt/joy/j04alg.html ).
>It is common for most compilers to reduce code to a point-free form
> (e.g. nameless) before conducting further analysis, which I think
> testifies to the usefulness of point-free forms (concatenative or
> otherwise). The fact that call-by-push-value semantics
> ( http://citeseer.ist.psu.edu/234600.html )
> subsumes call by name and call by value semantics may also
> indicate that stack-based approaches are perhaps a more
> general formalism than either of them.

As previously stated I'm aware of Backus' and McCarthy's
promotions of functional programming for facilitating formal methods
of correctness proof. But that's decades ago, and I've never seen that
anything usable came of it. It reminds me of these blokes that spend
every weekend polishing their vehicles, but have never made a 'journey'!

The first part of joy/j04alg.html looks like interesting manipulations,
but no indication for me that it will achieve 'formal methods'.

What I need to see is a 'hello world', as a proof of concept.
My related rant below, hopefully further explains why I'm reluctant to
start a journey which won't reach a profitable destination.

Thanks,

== Chris Glur.

PS. rant will be enlarged in later post.
WebBased gmail is difficult to manage ?
And spammers drove me from normal mailer.

William Tanksley, Jr — 2008-04-15 16:47:45

chris glur <crglur@...> wrote:

> As previously stated I'm aware of Backus' and McCarthy's
> promotions of functional programming for facilitating formal methods
> of correctness proof. But that's decades ago, and I've never seen that
> anything usable came of it. It reminds me of these blokes that spend
> every weekend polishing their vehicles, but have never made a 'journey'!

There are quite a few production languages based on that type of work:
OCaml (fast, a bit quick-and-dirty rather than pure-and-theoretical
but easy to use because of it); Erlang (multiprocessor); J (array
based, like APL); Haskell (pure and theoretical); and Scala (a bit
experimental, but being proven).

Lisp, of course, is the best-known example, but it's much more-so
quick and dirty than any of the above.

> The first part of joy/j04alg.html looks like interesting manipulations,
> but no indication for me that it will achieve 'formal methods'.
> What I need to see is a 'hello world', as a proof of concept.
> My related rant below, hopefully further explains why I'm reluctant to
> start a journey which won't reach a profitable destination.

Cool. I look forward to hearing it!

> == Chris Glur.

-Wm

Rodney D Price — 2008-04-15 17:52:53

Chris,

Perhaps a comparison to another functional language, Haskell,
would be useful.

Haskell offers many opportunities for the use of formal
methods, since it is referentially transparent. This property
allows you to manipulate Haskell code in the same way you
can manipulate a set of equations.

An example of Haskell's use in this manner is in the paper
"The Design of a Pretty Printing Library", by John Hughes.
(http://citeseer.ist.psu.edu/hughes95design.html) He begins
with a simple design and elaborates it through algebraic
transformations of the Haskell source code until he arrives at
a sophisticated, efficient design. The final product is related
to the initial product through these transformations.

But perhaps that's not precisely what you mean by formal
methods. Another Haskell-based approach is embodied in
the library QuickCheck. Here you write specifications for
your code in the embedded domain-specific language
QuickCheck, which is itself a Haskell combinator library.
Then QuickCheck automatically generates test cases to check
that your Haskell code obeys your specifications. This is more
than unit testing -- it's a way to develop formal specifications
and working code simultaneously. See the paper "Specification-
based testing with QuickCheck" at
http://www.math.chalmers.se/~koen/pubs/entry-fop-quickcheck.html

But QuickCheck is still not a proof. It's just looking for
counter-examples to your specifications. And the "places" it
looks are random, at that. The Alloy tool (not related to Haskell)
is like QuickCheck, in that it looks for counter-examples, but it
does an exhaustive check all the way up to a maximum size.
See http://alloy.mit.edu.

But that's still not a proof of correctness. Once you begin asking
for real proofs, you're right, it's very difficult. Progress has been
slow. The so-called lightweight formal methods above give
confidence that your code is correct, but they are not proofs.

However... proofs using Haskell are far easier than proofs using,
say, C. Referential transparency is the key. And proofs have
been done on practical Haskell libraries. See Graham Hutton,
http://www.cs.nott.ac.uk/~gmh/bib.html, for a few examples.

So, back to concatenative programming. Most if not all proofs
done in Haskell rely on the type system. The Haskell type
system enforces referential transparency. Cat is the only stack
language I know of that has a type *system* that might be up
to the task. Yet I hear people on this list claiming that the
property of concatenativity alone will get you to a place where
formal methods become practical. What about it?

-Rod

> > I know a bit about compilers/interpret

ers and it's not clear

> > why I would want to analyse another one. I was hoping that

> > 'concatinative methods' would lead towards a formal

> > [mathematical like] way of anlysing algorithms and/or

> > proving code correctness. Does it ?

Christopher Diggins wrote;

> In general the concatenative approach greatly simplifies reasoning

> about algorithms and code. It also simplifies automated translation,

> analysis, compilation, and optimization. Check out Manfred von Thun's

> writing about algebraic manipulation of concatenative code

>(

http://www.latrobe.
edu.au/philosoph
y/phimvt/
joy/j04alg.
html
).

>It is common for most compilers to reduce code to a point-free form

> (e.g. nameless) before conducting further analysis, which I think

> testifies to the usefulness of point-free forms (concatenative or

> otherwise). The fact that call-by-push-

value semantics

> (

http://citeseer.
ist.psu.edu/
234600.html
)

> subsumes call by name and call by value semantics may also

> indicate that stack-based approaches are perhaps a more

> general formalism than either of them.

As previously stated I'm aware of Backus' and McCarthy's

promotions of functional programming for facilitating formal methods

of correctness proof. But that's decades ago, and I've never seen that

anything usable came of it. It reminds me of these blokes that spend

every weekend polishing their vehicles, but have never made a 'journey'!

The first part of joy/j04alg.html looks like interesting manipulations,

but no indication for me that it will achieve 'formal methods'.

What I need to see is a 'hello world', as a proof of concept.

My related rant below, hopefully further explains why I'm reluctant to

start a journey which won't reach a profitable destination.

Thanks,

== Chris Glur.

PS. rant will be enlarged in later post.

WebBased gmail is difficult to manage ?

And spammers drove me from normal mailer.

William Tanksley, Jr — 2008-04-15 18:04:16

Rodney D Price <rodprice@...> wrote:

> Perhaps a comparison to another functional language, Haskell,
> would be useful.

Great examples -- thanks.

> But that's still not a proof of correctness. Once you begin asking
> for real proofs, you're right, it's very difficult. Progress has been
> slow. The so-called lightweight formal methods above give
> confidence that your code is correct, but they are not proofs.

There are systems designed to allow formal proofs to be embedded with
regular code. Interestingly, the only one I'm familiar is for a
language that isn't referentially transparent, Ada. (SPARK Ada,
specifically.)

> So, back to concatenative programming. Most if not all proofs
> done in Haskell rely on the type system. The Haskell type
> system enforces referential transparency. Cat is the only stack
> language I know of that has a type *system* that might be up
> to the task. Yet I hear people on this list claiming that the
> property of concatenativity alone will get you to a place where
> formal methods become practical. What about it?

Well, a type system is itself a formal method; it's just a subset of
more extensive automated proofs. Fifth is another concatenative
language that, in addition to an inferenced type system, has an
effects system. I don't know whether its effects system is influenced
or benefitted by its concatenativity.

I do know of one concatenative language, strongForth, that added a
complete strong type system (excluding type inference) that runs at
the same time as the compiler's linear-time parse. That (linear time
typechecking) was made simpler by concatenativity (although I don't
know that it's impossible without it).

> -Rod

-Wm

John Nowak — 2008-04-16 05:35:18

On Apr 15, 2008, at 2:04 PM, William Tanksley, Jr wrote:

>> So, back to concatenative programming. Most if not all proofs
>> done in Haskell rely on the type system. The Haskell type
>> system enforces referential transparency. Cat is the only stack
>> language I know of that has a type *system* that might be up
>> to the task. Yet I hear people on this list claiming that the
>> property of concatenativity alone will get you to a place where
>> formal methods become practical. What about it?
>
> Well, a type system is itself a formal method; it's just a subset of
> more extensive automated proofs. Fifth is another concatenative
> language that, in addition to an inferenced type system, has an
> effects system. I don't know whether its effects system is influenced
> or benefitted by its concatenativity.

The concatenative nature of Fifth *hugely* simplifies things. Let's
assume for the moment that all of the data structures in Fifth are
persistent/immutable. The effect system therefore only needs to track
IO (reading from a file, generating random numbers, etc). We can
actually add this effect tracking to Fifth without any modifications
to the type system.

(This probably gets a bit confusing as I wrote this very quickly; skip
to the end for the summary if you want.)

As has been mentioned here previously, we can conceptualize functions
that perform IO as reading/writing some "world" value that gets
threaded through the entire program on the top of the stack. This
threading is easy to accomplish because of the monoidal nature of
concatenative language. To be more clear, the 'dup' function normally
has the following type:

dup :: R a -> R a a

However, if we were to actually include the "world" in the type, it
would look like this:

dup :: R a b -> R a a b

The reason we just use a variable for the world is that 'dup' doesn't
access it; it just passes it along. However, some functions do modify
the world. For example, the type of the 'display' function that prints
a string to the screen would be as such:

display :: R String a -> R Impure

This type says that display takes some stack 'R' with a 'String' as
the second element and a world value of any type on top. The display
function then yields the stack 'R' with an "impure" world on the top.
In other words, display takes a string and any world and yields a
string with a modified world.

Now if we look back at the type of 'dup', we can see that composing
'display' and 'dup' would result in the impure world type being
propagated forward as 'dup' is simply "a -> a" as far as the world is
concerned. Essentially, if 'display' is impure, then so is 'display
dup'.

Finally, we get to the point: If we mandate a world that is "pure",
then we can enforce that a given function has no side effects. For
example, here's the full type of 'map' (a simpler type is shown to the
user):

map :: R (List a) [R a Pure -> R b Pure] c -> R (List b) c

So what does this say? Well, it's clear that map needs a list. It also
needs a function. However, the presence of 'Pure' in this function
says that if given a pure world, the purity of that world must be
preserved. Accordingly, any function provided to 'map' is guaranteed
not to have side effects. Note however that 'map' itself can be
composed in between functions that do have side effects.

Here's one more example of why this system is very nice. The 'o'
function (called 'compose' in Cat and '.' and Haskell) composes two
functions on the top of the stack to yield a new function. Including
the world (again, this is presented more concisely to the user), it
has the following type:

o :: R [S a -> T a] [T a -> U a] b -> R [S a -> U a] b

As you can (maybe) see from this type, the manner in which the world
is handled by the resulting composed function depends on how the two
functions provided to 'o' handle the world. However, even if these two
functions have effects and modify the world, this will not make the
call to 'o' itself impure because its world variable is 'b', not 'a'
as it is for the functions to be composed.

- - -

I've likely lost you by now, as my explanation isn't very clear, but
the point is that you get a very useful system that can be used to
enforce purity -- without the composability headache of monads -- for
free! If you add another variable to the top of the stack and disallow
recursive definitions and non-terminating self-application (the
"occurs check" that prevents infinite types does the latter for you),
you can track and ensure termination as well!

You might be wondering if you could add yet another variable to track
mutations to data structures. The answer is that you *can* but the
results will be too conservative; it becomes impossible to write a
pure function that is implemented using impure means. To handle this
properly, we can again do so through adding variables, but these need
to be attached to the individual parameterized types for the mutable
values rather than passed on top of the stack. A small addition to the
type system is then required to handle the enforcement of purity
properly. However, this is just a 10 line addition to the ~80 line
type system core type system. (The "non-core" features include
implicit parameters, aka "globals done right", as well as tracking of
exceptions.)

- John

How the Interpreter Works