a concatenative language for the Semantic Web
Joshua Shinavier — 2007-05-21 01:44:59
Hi all,
I'd like to introduce a new RDF query language, called Ripple, which
applies the basic ideas of concatenative languages to the Semantic
Web. The Semantic Web is a network of machine-understandable data
which is to a software agent what the hypertext Web is to a person.
It replaces the unconstrained, natural language text of HTML web pages
with the software-friendly subject-predicate-object triples of RDF,
enabling specialized applications to make more intelligent use of the
vast amounts of information on the Web (or at least, that's the
vision). Personally, I'm very interested in the procedural aspect of
the Semantic Web, or the logic and algorithms which drive the
crawlers, reasoners, and innovative mashups that consume and do useful
things with the data. I also like functional programming, so
expressing Semantic Web programs as Semantic Web data seemed to me
like a natural thing to do: Ripple programs are both expressed in RDF
and operate upon RDF metadata. I decided to make Ripple a
concatenative language because of the simplicity of concatenative
programs and because of their resemblance to path expressions, which
are very appropriate for the labeled graph model of RDF. There's some
documentation on Ripple here:
http://ripple.projects.semwebcentral.org/
The Java implementation is available here:
http://projects.semwebcentral.org/frs/?group_id=125
Ripple's syntax and query model borrow heavily from Joy, with a few
key differences:
1) There is only one "operator" (a dispatch function called op) in the
language; everything else is treated as a constant. For instance,
whereas in Joy you would write
2 dup
In Ripple, you might write
2 dup op
In Ripple, dup is a "passive" stack item just like the literal value
2. We need an operator, op, to consume dup and push a dup "filter" to
the stack, which then consumes the 2. Ripple draws in programs and
data from the Semantic Web on the fly, so the op symbol potentially
saves on a lot of comparison operations by telling the evaluator
exactly when it needs to examine a resource and treat it as a program.
Since op is so common, Ripple's notation allows for its abbreviation
as a prefix (the slash character). Thus, the above could (and usually
would) be written:
2 /dup
2) Quotations (indicated with parentheses rather than square
brackets), primitive functions, and RDF properties are
interchangeable. In Ripple, combinators such as i and dip have a
slightly different mode of operation than they do in Joy, in that they
push op to the stack instead of applying functions directly. So, for
instance,
3 (1 2) /dip -- or: 3 (1 2) dip op
is equivalent to
/(1 2) 3 -- or: (1 2) op 3
And this is where evaluation stops. If anything beneath the topmost
item on the stack is needed for the subsequent application of a
function, then the stack will be further reduced to
1 2 3
I called op a dispatch function because, having consumed an argument,
it applies a function to the remainder of the stack which depends on
the type of that argument. If op is applied to a primitive function,
it in turn applies the function to the rest of the stack. A
construction like the following is legal in Ripple (think of dup as
[dup] in Joy):
2 dup/i -- or: 2 dup i op
which becomes
2/dup -- or: 2 dup op
which becomes
2 2
When applied to an RDF property, op yields an atomic RDF path
operation. For example:
:arthur/foaf:name -- or: :arthur foaf:name op
becomes
"Arthur Dent"
This is a program which takes us from the resource identified by
:arthur to a value associated with :arthur by a foaf:name edge in the
global RDF graph we're querying. The foaf:name operation behaves like
a partial function over all of the resources in the graph.
3) There is no single global stack in Ripple. Instead, Ripple is
geared towards "streams" of stacks in order to deal with RDF's
multivalued relations. Filters may consume more than one stack, and
produce any number of stacks. E.g.
(1 2 3)/each -- or: (1 2 3) each op
This program yields not one, but three values, each in its own stack:
1
2
3
The three stacks continue down the evaluation "pipeline" individually,
and further operations are applied to each of them in turn.
(1 2 3)/each 100/add -- or: (1 2 3) each op 100 add op
becomes
101
102
103
Here's the definition of a simple program in Ripple:
@define fib:
0 1 /rolldown # push initial value pair and put n on top
(/swap/dupd/add) # push the step function
/swap/times # execute the step function n times
/pop. # select the low value
Here's another which uses "stream" primitives and RDF properties:
@define foafStep: # iterator for a FOAF crawler
( id # include foaf1 itself
owl:sameAs # include nodes identified with foaf1
foaf:knows # include those foaf:known by foaf1
)/each/i # apply all three patterns at once
/unique. # eliminate duplicate results
That's all for now :-) I hope this was interesting. Comments and
criticism are very welcome.
Josh
Joe Bowbeer — 2007-05-21 03:59:35
On 5/20/07, Joshua Shinavier wrote:
>
>
> [...] I hope this was interesting. Comments and
> criticism are very welcome.
>
>
I'm interested. A couple questions:
I don't understand how the concatenative aspect works to your advantage.
Can you elaborate?
As a veteran of Lisp and Scheme, I think I would naturally turn to them when
presented with this kind of problem. If you have reasons why Ripple is more
suitable than Lisp or Scheme, it might help me better understand what you're
doing.
--Joe
[Non-text portions of this message have been removed]
William Tanksley, Jr — 2007-05-21 15:39:57
Joshua Shinavier <
parcour@...> wrote:
> 1) There is only one "operator" (a dispatch function called op) in the
> language; everything else is treated as a constant. For instance,
> whereas in Joy you would write
I think one of Apter's experimental languages used something like
this; a bare word simply pushes the function on the stack, while an
exclamation mark executes the function at the TOS.
> 3) There is no single global stack in Ripple. Instead, Ripple is
> geared towards "streams" of stacks in order to deal with RDF's
> multivalued relations. Filters may consume more than one stack, and
> produce any number of stacks. E.g.
Wow. This seems very significant.
> The three stacks continue down the evaluation "pipeline" individually,
> and further operations are applied to each of them in turn.
In turn, or in parallel?
I'll be reading more about this.
> Josh
-Billy
Joseph Huang — 2007-05-22 01:21:52
What about making / postfix? It seems misleading to have blah op == /blah,
especially in a concatenative language.
How about shortening op to 1 character? Maybe | because it's so narrow. Or
just use / instead of op.
How good is Ripple for general purpose computing?
--
-JSH
[Non-text portions of this message have been removed]
Joshua Shinavier — 2007-05-22 01:46:39
On 5/20/07, Joe Bowbeer <joe.bowbeer@...> wrote:
Hi Joe,
Thanks for the feedback!
> I'm interested. A couple questions:
>
> I don't understand how the concatenative aspect works to your advantage.
> Can you elaborate?
Ripple programs are meant to be embedded in an RDF graph, and it was a
design goal to make their RDF representation as simple as possible.
Ripple uses very few symbols, above and beyond the RDF and RDFS
vocabularies, which are relevant to its evaluator, and Ripple
expressions are meant to be easy to read even in a generic RDF
representation language like Notation3. The presence of bound
variables and other abstractions (i.e. beyond concatenation and
quotation) would complicate its syntax. In this sense, I'm exploring
the idea that extreme simplicity, in the form of "point free"
programs, will pay off. Most importantly, a stack language lends
itself very well to path expressions, and Ripple's query model is
path-based.
> As a veteran of Lisp and Scheme, I think I would naturally turn to them when
> presented with this kind of problem.
Others have, e.g. Wilbur [1] or SWCLOS [2]. However, it is not my
understanding that these toolkits actually express program code in
RDF.
If you have reasons why Ripple is more
> suitable than Lisp or Scheme, it might help me better understand what you're
> doing.
See above. While it's possible to express arbitrarily complex data
structures in RDF, simplicity is preferred. Ripple is closer to the
typical, simple data representation languages of the Semantic Web than
the rather complex grammars of typical programming languages. I've
just been made aware of a fascinating object-oriented language called
Neno [3] which is also represented in RDF, and its programs are rather
more complex than Ripple's, so the path paradigm is certainly not the
only option. I chose it because it makes for particularly simple,
modular, path-like programs which lend themselves very well to a
"networked programs" model. Also, I think Ripple's evaluator is
rather efficient at converting raw RDF data to executable code, which
in the networked environment is often more critical than actual
efficiency of computation. If I had chosen to integrate Scheme with
RDF instead, I doubt I'd have had my proof of concept up and running
in twice the time it has taken me to implement Ripple, and I don't
think I would prefer the end result.
> --Joe
Best,
Josh
[1] http://wilbur-rdf.sourceforge.net/
[2] http://iswc2004.semanticweb.org/demos/32/
[3] http://neno.lanl.gov/Home.html
Joshua Shinavier — 2007-05-22 01:55:09
On 5/21/07, William Tanksley, Jr <wtanksleyjr@...> wrote:
Hi Billy,
> > 1) There is only one "operator" (a dispatch function called op) in the
> > language; everything else is treated as a constant. For instance,
> > whereas in Joy you would write
>
> I think one of Apter's experimental languages used something like
> this; a bare word simply pushes the function on the stack, while an
> exclamation mark executes the function at the TOS.
I'd like to take a look at it... that does sound like the same idea.
Earlier on in the design of Ripple, I thought I would use op as the
actual application operator, where each instance of op applies a
function to a single argument. So instead of instead of (1 2 /add) or
(1 2 add op) you would have written (1 2 //add), or (1 2 add op op),
which makes explicit how many arguments the 'add' function is to
consume, and is nicely symmetrical with the fully-parenthesized
applicative expression ((add 2) 1) -- written in reverse and with the
opening parens replaced with op. Ripple would have been an
applicative language, then, but still nominally concatenative (I
think), and in any case still a stack language. I chose the current
semantics of op so as to make nullary functions less of a pain and to
make the code more compact. I still like the applicative style,
though, and I think at some point I'll write an evaluator which uses
it, just to see what it's like to program in.
> > 3) There is no single global stack in Ripple. Instead, Ripple is
> > geared towards "streams" of stacks in order to deal with RDF's
> > multivalued relations. Filters may consume more than one stack, and
> > produce any number of stacks. E.g.
>
> Wow. This seems very significant.
I was wondering if it was new. I'm not aware another stack language
which resembles mine in that respect, but then again I have yet to
take a closer look at some of the more exotic languages discussed on
this list (as well as very many non- functional stack languages like
Forth).
> > The three stacks continue down the evaluation "pipeline" individually,
> > and further operations are applied to each of them in turn.
>
> In turn, or in parallel?
For now, in turn. The order of output of stacks in a stream is
technically undefined, but evaluation occurs in a single thread, so
it's actually completely predictable. Concurrent and distributed
computing with Ripple is definitely on the TODO list.
> I'll be reading more about this.
Your critique couldn't be more welcome :-)
> -Billy
Josh
Joshua Shinavier — 2007-05-22 02:38:17
On 5/21/07, Joseph Huang <josephshuang@...> wrote:
Hi Joseph!
> What about making / postfix? It seems misleading to have blah op == /blah,
> especially in a concatenative language.
>
> How about shortening op to 1 character? Maybe | because it's so narrow. Or
> just use / instead of op.
I've deliberated a bit about op as a prefix. It's potentially a
little confusing when you have to match up the text representation
with the RDF representation, but a postfix symbol would not be as
handy for path expressions, e.g.
foo/bar/quux
...has a pretty obvious path-like quality, whereas
foo bar op quux op
IMO does not, even if op is replaced by a special character:
foo bar! quux!
Maybe I'm too attached to infix operators for path expressions, but
the first version looks better to me, and is probably closer to what
someone familiar with Notation3 would expect.
> How good is Ripple for general purpose computing?
The Java implementation of Ripple is best suited to applications
involving multivalued relations, such as RDF's. It uses some rather
expensive compositional plumbing to distribute each operation over
arbitrarily many values, so if you don't need that functionality,
you're better off using Joy, Factor, etc. However, as a
representation language, Ripple can just as well express single-input,
single-output programs, so executing those programs efficiently would
be a matter of writing a specialized evaluator or compiler (also on
the TODO list). The advantage of using Ripple in that case is that
you could more easily share and re-use programs; rather than
downloading a program manually and then running it, all you need to
execute a Ripple program is its URI.
> -JSH
All this feedback is great. Many thanks!
Josh
Joe Bowbeer — 2007-05-22 02:45:26
Thanks for the explanation.
This comment about syntax tripped me up at first:
"The presence of bound variables and other abstractions (i.e. beyond
concatenation and
quotation) would complicate its syntax."
because it didn't apply to Lisp's simple s-expression syntax, but then I
realized this statement was only addressing my concatenative question.
Good point about the stack. Do your code fragments stand alone, or do they
communicate via the stack(s)? (If they stand alone, then an enclosing "let"
expression could be used in Lisp to define temporary variables.)
--Joe
On 5/21/07, Joshua Shinavier wrote:
>
> On 5/20/07, Joe Bowbeer wrote:
>
> Hi Joe,
>
> Thanks for the feedback!
>
>
> > I'm interested. A couple questions:
> >
> > I don't understand how the concatenative aspect works to your
> advantage.
> > Can you elaborate?
>
> Ripple programs are meant to be embedded in an RDF graph, and it was a
> design goal to make their RDF representation as simple as possible.
> Ripple uses very few symbols, above and beyond the RDF and RDFS
> vocabularies, which are relevant to its evaluator, and Ripple
> expressions are meant to be easy to read even in a generic RDF
> representation language like Notation3. The presence of bound
> variables and other abstractions (i.e. beyond concatenation and
> quotation) would complicate its syntax. In this sense, I'm exploring
> the idea that extreme simplicity, in the form of "point free"
> programs, will pay off. Most importantly, a stack language lends
> itself very well to path expressions, and Ripple's query model is
> path-based.
>
>
> > As a veteran of Lisp and Scheme, I think I would naturally turn to them
> when
> > presented with this kind of problem.
>
> Others have, e.g. Wilbur [1] or SWCLOS [2]. However, it is not my
> understanding that these toolkits actually express program code in
> RDF.
>
>
> If you have reasons why Ripple is more
> > suitable than Lisp or Scheme, it might help me better understand what
> you're
> > doing.
>
> See above. While it's possible to express arbitrarily complex data
> structures in RDF, simplicity is preferred. Ripple is closer to the
> typical, simple data representation languages of the Semantic Web than
> the rather complex grammars of typical programming languages. I've
> just been made aware of a fascinating object-oriented language called
> Neno [3] which is also represented in RDF, and its programs are rather
> more complex than Ripple's, so the path paradigm is certainly not the
> only option. I chose it because it makes for particularly simple,
> modular, path-like programs which lend themselves very well to a
> "networked programs" model. Also, I think Ripple's evaluator is
> rather efficient at converting raw RDF data to executable code, which
> in the networked environment is often more critical than actual
> efficiency of computation. If I had chosen to integrate Scheme with
> RDF instead, I doubt I'd have had my proof of concept up and running
> in twice the time it has taken me to implement Ripple, and I don't
> think I would prefer the end result.
>
>
> > --Joe
>
> Best,
>
> Josh
>
>
> [1] http://wilbur-rdf.sourceforge.net/
> [2] http://iswc2004.semanticweb.org/demos/32/
> [3] http://neno.lanl.gov/Home.html
>
>
[Non-text portions of this message have been removed]
William Tanksley, Jr — 2007-05-22 14:04:11
Joe Bowbeer <
joe.bowbeer@...> wrote:
> This comment about syntax tripped me up at first:
> "The presence of bound variables and other abstractions (i.e. beyond
> concatenation and quotation) would complicate its syntax."
> because it didn't apply to Lisp's simple s-expression syntax, but then I
> realized this statement was only addressing my concatenative question.
Lisp's syntax includes bound variables, which was his point.
> Good point about the stack. Do your code fragments stand alone, or do they
> communicate via the stack(s)? (If they stand alone, then an enclosing "let"
> expression could be used in Lisp to define temporary variables.)
"let" defines bound variables; he's working point-free, which means he
doesn't use them.
> --Joe
-Billy
Joe Bowbeer — 2007-05-22 15:06:23
On 5/22/07, William Tanksley, Jr <
wtanksleyjr@...> wrote:
>
> > Good point about the stack. Do your code fragments stand alone, or do
> they
> > communicate via the stack(s)? (If they stand alone, then an enclosing
> "let"
> > expression could be used in Lisp to define temporary variables.)
>
> "let" defines bound variables; he's working point-free, which means he
> doesn't use them.
>
>
Thanks. I get that. It's as clear to me as if someone wrote "A hammer
drives nails, but he's using a screw driver, so he doesn't need nails."
My question "Do the code fragments stand alone?" is more at "Why use a
screwdriver?"
--Joe
[Non-text portions of this message have been removed]
William Tanksley, Jr — 2007-05-22 15:45:46
Joe Bowbeer <
joe.bowbeer@...> wrote:
> William Tanksley, Jr <wtanksleyjr@...> wrote:
> > > Good point about the stack. Do your code fragments stand alone, or do
> > > they
> > > communicate via the stack(s)? (If they stand alone, then an enclosing
> > > "let" expression could be used in Lisp to define temporary variables.)
> > "let" defines bound variables; he's working point-free, which means he
> > doesn't use them.
> Thanks. I get that. It's as clear to me as if someone wrote "A hammer
> drives nails, but he's using a screw driver, so he doesn't need nails."
> My question "Do the code fragments stand alone?" is more at "Why use a
> screwdriver?"
Let me quote from his paper:
"Broadly speaking, the goal of this project is to demonstrate that:
1. linked programs have all of the advantages of generic linked data
2. a stack language is like a path language, only better
3. for most linked data purposes, forward traversal is all you need"
It could be stated that his goal is to prove by example that a
screwdriver is better than a hammer for driving screws :-). But enough
of that analogy. On to specifics.
His design is, from the ground up (and in the first paragraph of his
paper, which I recommend in entirety, BTW) described as taking a
"multivalued, pipeline" approach. Scheme, for all its excellent
qualities, is not a dataflow language, and is thus not suited for a
pipeline approach. Adding in the idea of multiple values (really a
clever innovation -- I don't know of anyone else who's done that)
makes bound variables even more unlikely to work (one could use
generators, like Icon, but that adds a deep nest of unusual
semantics).
So his idea is basically and fundamentally different from anything
Scheme could possibly support. Not a slam against Scheme. He just
needed something different in order to prove his point.
And his result is indeed very interesting, wouldn't you say?
I find Ripple to be in the top 3 interesting concatenative languages.
I don't know anything about RDF/Semantic Web, but you bet I'll be
learning about it now.
> --Joe
-Billy
Manfred Von Thun — 2007-05-24 07:54:14
On 22/5/07 12:38 PM, "Joshua Shinavier" <
parcour@...> wrote:
>
> [..]
> The Java implementation of Ripple is best suited to applications
> involving multivalued relations, such as RDF's. It uses some rather
> expensive compositional plumbing to distribute each operation over
> arbitrarily many values, so if you don't need that functionality,
> you're better off using Joy, Factor, etc.
It is the word ³multivalued² that prompts me to respond. Relations instead
of functions. Just about all of the discussion here has implicitly equated:
concatenative = functional = stackbased. That need not be so. Consider
the classical declarative languages: (the purely functional subset of) Lisp
and (the purely ³logical² subset of) Prolog. Take Lisp, remove abstraction,
remove application, replace by concatenation: you get one or the other
concatenative language which uses a stack. Now take Prolog: remove
the (³logical²) variables, remove application, use a stack. (I did not
at this point say: use only concatenation). Programs now compute
binary relations between stacks. Relations can be composed (and since
unary functions are binary relations, unary (stack) functions can be
composed. We could use program concatenation to denote the relation
which is the composition of the relations denoted by the programs
that are being concatenated. Hence a relational stack language.
Many of the primitives of such a language would be indistinguishable
from those in a functional stack language. But there may well be
useful primitives that are relational. One example: the two square
roots (positive and negative) of a positive number. Another: the
data relations in the RDF that you speak about.
More likely than not one would want to construct multivalued relations
from others, including ones that are single-valued. One needs some
kinds of operators that can do this. In Henderson¹s wonderful Lispkit
book, Functional Programming, Interpretation and Implementation,
there is a chapter in which discusses how to turn a functional language
into a relational language. Two novel operators are needed: some
kind of OR that splits execution into two paths, and a NONE operator
which (supposedly) is never executed. (OR and NONE correspond to
what in Prolog are disjunction (³;²) and fail.) One might do the same
to a functional stack language: add OR and NONE to get a relational
stack language.
As far as the syntax is concerned, NONE is unproblematic: it has to
be a nullary operator. But for OR there are two possibilities:
(1) Write OR as an infix operator, perhaps as ³|², the vertical bar
familiar from regular expressions, or just as ³OR².
(2) Write OR as a combinator which takes two quotations off the stack
and let it initiate two executions.
I am still unclear on the advantages and disadvantages of these two
possible notations. Of course there may be others again.
As an unintentional ³side effect² of my current experiments with Joy
in Prolog, I stumbled upon (2), and it was trivial to implement. But
I still have not thought of any remotely useful programs to benefit
from this relational extension.
Maybe these thoughts are of some value for your Ripple. Best
wishes for the implementation, Joshua.
- Manfred
[Non-text portions of this message have been removed]