Preliminary Notes on a Joy-like VM written in Oberon

srenner@mail.ru — 2000-05-04 16:37:00

In this implementation, the fundamental data structure is a box chain.
A box is just a record with a pointer to another box. A box chain is
just a linked list. Ignore the asterisks in the declarations below.
They specify visibility outside the module.

TYPE Box* = POINTER TO RECORD;
VAR
next*: Box;
PROCEDURE Exec*(bstack: BoxStack; istack: IntStack; dict: Dictionary;
VAR box: Box);
END Exec;
PROCEDURE Print*;
END Print;
END Box;

"Box" is the basic type of box. It doesn't contain anything. You could
call it an abstract type. Here is an extension of "Box" which holds an
integer. Notice that "Exec" doesn't operate on global variables but
takes arguments. This corresponds nicely to Joy: all literals are
functions of type stack -> stack.

TYPE Int* = POINTER TO RECORD(Box);
VAR
n: LONGINT;
PROCEDURE Exec*(bstack: BoxStack; istack: IntStack; dict: Dictionary;
VAR box: Box);
BEGIN
istack.Push(n);
box := box.next;
END Exec;
PROCEDURE Print*;
BEGIN
Out.String(' Int Box containing: '); Out.Int(n,10);
Out.Ln;
IF next # NIL THEN next.Print END
END Print;
END Int;

Here is another extension of "Box". It holds a string which specifies
an operation. It would probably be better if every primitive op had
its own box type. Then Oberon's polymorphic dispatch on "exec" would
take the place of the IF .. ELSIF ... statement.

TYPE Op* = POINTER TO RECORD(Box);
VAR
op: Word;
PROCEDURE Exec*(bstack: BoxStack; istack: IntStack; dict:
Dictionary);
BEGIN
IF op = 'POP' THEN
istack.Pop;
ELSIF op = 'SHOW' THEN
istack.Show;
ELSIF op = 'DSHOW' THEN
dict.Print;
ELSIF op = 'PLUS' THEN
istack.Plus;
ELSIF op = 'MUL' THEN
istack.Mul;
ELSIF op = 'DUP' THEN
istack.Dup;
ELSIF op = 'IF' THEN
IF istack.stack[istack.top] > 0 THEN
next := bstack.stack[bstack.top];
ELSE
box.next :=bstack.stack[bstack.top-1];
END;
bstack.top := bstack.top - 2;
ELSE (* op unprim *)
Out.String('unprim '); Out.String(op); Out.Ln;
IF next # NIL THEN bstack.Push(next) END;
next := dict.get(op);
END;
END Exec;

PROCEDURE Print*;
BEGIN
Out.String(' Op Box containing: ');Out.String(op);
Out.Ln;
IF next # NIL THEN next.Print END
END Print;
END Op;

*****************************************

OK. Now I want to talk about stacks.

TYPE BoxStack* = POINTER TO RECORD;
VAR
stack: ARRAY 10 OF Box;
top: INTEGER;

PROCEDURE & new;
BEGIN
top := 0;
END new;

PROCEDURE Show*;
VAR i: INTEGER;
BEGIN
Out.String('BSTACK:');
FOR i := 0 TO top DO
IF stack[i] # NIL THEN stack[i].Print; Out.Ln
END;
END;
END Show;

PROCEDURE GetTop*(VAR box: Box);
BEGIN
box := stack[top];
END GetTop;

PROCEDURE Push*(b:Box);
BEGIN
stack[top+1] := b;
top := top + 1;
END Push;

PROCEDURE Dup*;
BEGIN
stack[top+1] := stack[top];
top := top + 1;
END Dup;

PROCEDURE Pop*;
BEGIN
top := top - 1;
END Pop;

END BoxStack;

This stack is implemented as an array. Of course a depth of 10 is
impractical. Should the stack be a linked list instead of an array? It
would require double boxing. Consider a stack implemented as a linked
list with a box B on top of the stack. B.next points to the next box
in the chain, or NIL if the chain is just B (chain length 1). B.next
cannot be used to point to the next element of the stack. So B itself
must be placed in a new kind of box. Then the stack can be a special
kind of box chain. I don't like this idea very much.

*************************************************

You will have noticed that besides box chains and stack arrays there
is also a dictionary. The dictionary maps strings to definitions (box
chains.) However, "define" isn't functional. It could be argued that
"define" in Joy is a meta-level operation which changes the language.
If ops were tied to types as I suggested above, "define" could be
implemented without a dictionary by using the Oberon compiler.

square [dup times] def

==> Modify the source module in which the box types are defined to
include

TYPE Square* = POINTER TO RECORD(Box);
PROCEDURE Exec*(bstack: BoxStack; istack: IntStack);
BEGIN
dup(bstack: BoxStack; istack: IntStack);
times(bstack: BoxStack; istack: IntStack)
END Exec
END Square;

(Notice that the dictionary is gone from the parameters. )
Now call the Oberon compiler (which is VERY fast) on the modified
module. The signature of the module now exports a new procedure.

The VM itself would have to die and be reborn, but maybe that's all
right. We could make the stack persistent.

sr

iepos@tunes.org — 2000-05-05 16:19:55

> In this implementation, the fundamental data structure is a box chain.
> A box is just a record with a pointer to another box. A box chain is
> just a linked list. Ignore the asterisks in the declarations below.
> They specify visibility outside the module.
>
> [...]
>
> PROCEDURE Exec*(bstack: BoxStack; istack: IntStack; dict: Dictionary;
> VAR box: Box);

I'm a bit puzzled about one thing... Why are there two separate stacks
(a box stack and an int stack)?

Your "IF" primitive seems to pop things off the box stack, but I haven't
seen any primitives that push things on... Does you system have a
quotation construct like Joy's "[]"s?

> This stack is implemented as an array. Of course a depth of 10 is
> impractical. Should the stack be a linked list instead of an array?

An array sounds fine to me... but, does Oberon support dynamic allocation
of arrays, so that if you exceeded the bounds you could just reallocate
a larger one?

Of course, an array might not really be acceptable, if you're going to
be dealing with lots of data on the stack and need to be able to quickly
insert, delete, and move things around. If that were the case, then
maybe a linked list would be a good option (although you'd probably
want to group several items under the same node).

> sr

- Brent Kerby ("iepos")

wtanksley@bigfoot.com — 2000-05-05 19:04:23

From: iepos@... [mailto:iepos@...]

>> In this implementation, the fundamental data structure is a
>> box chain.
>> A box is just a record with a pointer to another box. A box chain is
>> just a linked list. Ignore the asterisks in the declarations below.
>> They specify visibility outside the module.

>> PROCEDURE Exec*(bstack: BoxStack; istack: IntStack; dict: Dictionary;
>> VAR box: Box);

>I'm a bit puzzled about one thing... Why are there two separate stacks
>(a box stack and an int stack)?

For the same reason that Forth has two stacks: one stack holds the
continuation to which the current function will return, and the other stack
holds the data.

>Your "IF" primitive seems to pop things off the box stack, but
>I haven't
>seen any primitives that push things on... Does you system have a
>quotation construct like Joy's "[]"s?

Yes, that's what a box is. I'm not sure whether or not he's invented a
syntax for boxing yet.

>> This stack is implemented as an array. Of course a depth of 10 is
>> impractical. Should the stack be a linked list instead of an array?

>An array sounds fine to me... but, does Oberon support dynamic
>allocation
>of arrays, so that if you exceeded the bounds you could just reallocate
>a larger one?

It does, but as his words imply, he hasn't implemented that.

It turns out, though, that 32 stack items is virtually infinite (in the
words of a prominent Forth programmer). In general, it's easy to use more;
however, in real use, it's very hard.

>Of course, an array might not really be acceptable, if you're going to
>be dealing with lots of data on the stack and need to be able
>to quickly
>insert, delete, and move things around. If that were the case, then
>maybe a linked list would be a good option (although you'd probably
>want to group several items under the same node).

Arrays are generally good for this kind of thing, really. Especially if all
your data items are represented by pointers, in which case it takes less
effort to swap the data items than it would have taken to rebuild a list.

>- Brent Kerby ("iepos")

-Billy