Major Quirks of the REBOL Language

This is the third of several articles I’m going to write about the REBOL programming language. To learn more about it, you can visit rebol.com. But my hope is to demystify some of its strengths and weaknesses in a way that their website currently does not, so if you read what I write first then it might help. :)

(Clear Warning: REBOL is not “free as in freedom” software, and no commitment has been laid out for how the commercial scaffolding which supports its development would be phased out. I know of no published statement that RT would not sue ORCA or other open-source efforts to implement the language. Until these issues are resolved, I consider it only an interesting thing to study and do *not* suggest its use in important projects. While REBOL may rebel against complexity, I think the rebellion for freedom is more fundamental—and the infrastructure we build on is too crucial to be left in the hands of one company that decides who may use a tool and how.)


We know from the English language that humans are a bit lazy when it comes to expression. We’re always dropping syllables off of words if we use them often, or taking difficult combinations of letters and turning them into something easier to pronounce. Yet of course, this means we live in an environment ripe for ambiguity:

[Diagram: “Ambiguity in a Sentence”, from Deena Oodles]

(Note: Image via Deena Hyatt)

By contrast, computer languages typically make programmers be redundantly clear in their notations. You’re always dealing with syntax… putting in an opening parenthesis before a list of arguments to a function, putting in a closing parenthesis to say when you’re done. There are semicolons in many languages to tell the computer when you’ve finished a line.

Yet we get a lot done with English without that symbol soup. Somehow, we communicate to each other with little more than a series of words separated by spaces. Essentially, that’s what a REBOL program is… “words” separated by “spaces”. It has conspicuously few parentheses or semicolons. Or equals signs, for that matter! If you strip out some of the incidental uses of symbols in names, it might be mistaken for human writing.

(Note: Some of the stranger notational aspects are just for show, for instance the function named none? could have just as easily been called is_value_none… you can change it to that if you want to. But the question mark convention is nice for boolean functions.)

So keep that interesting aspect in the back of your mind while I go straight for the jugular in terms of things about REBOL that may seem totally insane. I’m just being up front and honest with you about things you will find surprising. I will talk about the curious upsides after we’ve banged our heads against our keyboards a few times.

Whitespace matters in REBOL, a lot

I mentioned that REBOL programs are basically words separated by spaces. The texts “hand book” and “handbook” have different meanings in English. Strangely enough, there is likewise a difference between 1 + 2, 1+ 2, and 1+2 in REBOL:

>> 1 + 2
== 3
 
>> 1+ 2
** Syntax Error: Invalid integer -- 1+
** Near: (line 1) 1+ 2
 
>> 1+2
** Syntax Error: Invalid integer -- 1+2
** Near: (line 1) 1+2

The first is a series of three items… and has the effect you probably meant. The second is a series of two items (1+ and 2). The last we might hypothetically think of as a single token called 1+2. Now…as perverse as this sounds…you can give each of these cases different meanings. Please don’t panic as I show you how to make 1+ an alias for the print command.

(Note: We have to use an API to build such a program, because if we went through the default parser with a token like 1+ we’d get a syntax error. Yet this has nothing to do with REBOL’s runtime, which is quite agnostic about names.)
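To make the three item counts concrete, here is a rough sketch in Python rather than REBOL. It assumes items are separated by whitespace alone, which is a big simplification: REBOL’s real scanner also understands strings, brackets and a lot more. The function name tokens is made up for this illustration.

```python
def tokens(source: str) -> list[str]:
    """Split source text on runs of whitespace (a deliberate oversimplification)."""
    return source.split()


# The same five characters give three, two, or one item depending on the spaces.
for source in ("1 + 2", "1+ 2", "1+2"):
    found = tokens(source)
    print(f"{source!r}: {len(found)} item(s) {found}")
```

The point is only that the spaces decide where one item ends and the next begins, before any meaning is attached to the items.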

First let’s make our crazy program:

>> crazy_program: reduce [(to-word "1+") "Hello world"]
== [1+ "Hello world"]

The first symbol in this series isn’t a string, it’s what REBOL calls a word!. Now we will make the word! an alias for print. Once again we have to write it in an odd way to keep the parser from choking:

>> do [(to-set-word "1+") :print]

Now that we’ve got our new definition for 1+ and our crazy program that uses it, we can run the evaluator to see our totally loony result:

>> do crazy_program
Hello world

I’ll talk more about this and its implications in another post. I just brought it up early to make the point that spaces matter in day-to-day REBOL programming… a lot. One of the few places you don’t really have to worry about them is near brackets and parentheses, so these three things are equivalent:

>> first[print "Hello world"]
== print
 
>> first [print "Hello world"]
== print
 
>> first [ print "Hello world" ]
== print

Yet there’s an oddity in the console when it comes to newlines. If you try running the following:

>> first
   [
   print
   "Hello world"
   ]

You’ll get a weird error:

** Script Error: first expected series argument of type: series pair event money date object port time tuple any-function library struct ...
** Near: first
>>    [
[       print
[       "Hello world"
[       ]
== [
    print 
    "Hello world"
]

Because there was a newline, the “first” command thought you were missing an argument. The following code would work as expected, because the bracket which starts the series parameter is on the same line as the command:

>> first [
   print "Hello World"
]
== print
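Here is a toy model of that console behavior, sketched in Python. It rests on an assumption I can’t confirm from REBOL’s internals: that the console hands its input to the evaluator as soon as every bracket opened so far has been closed. The function console_chunks is hypothetical, and it counts brackets naively (it would be fooled by a bracket inside a string).

```python
def console_chunks(lines):
    """Group typed lines into the chunks a line-based console would evaluate."""
    chunks, pending, depth = [], [], 0
    for line in lines:
        pending.append(line.strip())
        # Track how many brackets are still open after this line.
        depth += line.count("[") - line.count("]")
        if depth <= 0:
            # Nothing is left open, so the console evaluates what it has so far.
            chunks.append(" ".join(pending))
            pending, depth = [], 0
    if pending:
        chunks.append(" ".join(pending))
    return chunks


# "first" alone on a line is evaluated before its argument ever arrives.
print(console_chunks(["first", "[", "print", '"Hello world"', "]"]))

# With the bracket opened on the same line, everything stays in one chunk.
print(console_chunks(["first [", 'print "Hello World"', "]"]))
```

Under that assumption the first input is cut into two chunks (first by itself, then the block), while the second stays whole.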

This particular issue only happens if you are running at the command line, not if you are running a script from a file. But generally, REBOL programmers abide by the convention of not entering newlines in a statement unless there is an open bracket. So the either function (which is equivalent to an if-then-else, and hence takes three arguments) would generally be written as:

>> REBOL-is-very...interesting?: true
== true
 
>> either (REBOL-is-very...interesting? = true) [
    print "Is REBOL very interesting?"
    print "Apparently so!"
] [
    print "Is REBOL very interesting?"
    print "My sources say no."
]
Is REBOL very interesting?
Apparently so!

Note the free usage of hyphens, periods, and question marks in tokens… something that most other languages don’t allow.

Highly permissive evaluator

One property of REBOL is that when a block is evaluated, its “value” is always whatever the last item in the series evaluates to. So if you type three numbers in a row, you get the last one back:

>> 1 2 3
== 3

A frightening implication of this is that there’s nothing checking to ensure you didn’t pass too many arguments to a function. For instance, I’ve been using print in these examples, and it only takes one argument—the object to print, and it returns nothing at all. If you pass more values to it… guess what happens:

>> print "Goodbye" "cruel" "world"
Goodbye
== "world"

It’s easy to see what happened here:

  • print took its one argument, and once it was fulfilled, it knew it could run…so it did, and printed “Goodbye”
  • Next in the left-to-right evaluation was “cruel”, and it was just a string constant with no side effects so it vanished into the void
  • Chugging merrily along, the interpreter saw “world” also evaluated to a string, but had the peculiar property that it was the last item in the series… so the overall expression has the value “world”.

In C++ this would have been:

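#include <iostream>
#include <string>

// Prints the first argument, ignores the second, and returns the third.
std::string MysteriousPrint(
   std::string printMe,
   std::string ignoreMe,
   std::string returnMe)
{
   std::cout << printMe << '\n';
   return returnMe;
}

int main()
{
   MysteriousPrint("Goodbye", "cruel", "world");
}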

This can feel a whole lot like walking a tightrope without a safety net. It leads to some puzzling errors… that are especially puzzling to new REBOL programmers. Yet it does mean you have a rather interesting pipeline of processing on every line of code.
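The three steps listed above can be mimicked with a very small evaluator. This is a Python sketch of the idea, not REBOL’s actual implementation: the names rebol_print, ARITY and evaluate are all invented here, the arity table is an assumption, and arguments are taken as-is rather than being evaluated themselves (so nested calls are out of scope).

```python
def rebol_print(value):
    """Stand-in for REBOL's print: shows its argument and returns nothing."""
    print(value)


# Assumed table for this sketch: how many arguments each function consumes.
ARITY = {rebol_print: 1}


def evaluate(block):
    """Walk the items left to right; return the value of the last expression."""
    result = None
    position = 0
    while position < len(block):
        item = block[position]
        position += 1
        if callable(item):
            # A function takes exactly as many following items as its arity.
            count = ARITY[item]
            args = block[position:position + count]
            position += count
            result = item(*args)
        else:
            # A plain value evaluates to itself and replaces the earlier result.
            result = item
    return result


value = evaluate([rebol_print, "Goodbye", "cruel", "world"])
print("==", repr(value))
```

Nothing in the loop ever asks whether items are left over after print has its one argument, which is exactly why the extra strings slip through without complaint.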

There’s no operator precedence

Look at this little snippet of REBOL code, which does what you’d think:

>> add 1 2
== 3

Similarly, there is multiplication:

>> multiply 2 3
== 6

Despite the existence of these verbose function names, REBOL lets you write the same operations as seemingly infix expressions:

>> 1 + 2
== 3
 
>> 2 * 3
== 6

DO NOT be fooled by this. That’s just a shorthand, and under the hood it’s still a left-to-right expression evaluator. For instance, check this out:

>> 1 + 2 * 3
== 9

Yowza. You have to use parentheses if you want to get the expected result:

>> 1 + ( 2 * 3 )
== 7
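If it helps to see the rule spelled out, here is a minimal Python sketch of strict left-to-right evaluation. It is only a model of the behavior shown in the transcripts: left_to_right and OPS are names I made up, and it handles nothing but space-separated integers with + and * (no parentheses).

```python
import operator

# Only the two operators used in the examples above.
OPS = {"+": operator.add, "*": operator.mul}


def left_to_right(expression: str) -> int:
    """Evaluate integers and operators strictly left to right, no precedence."""
    items = expression.split()
    total = int(items[0])
    # Consume (operator, operand) pairs in the order they appear.
    for op, operand in zip(items[1::2], items[2::2]):
        total = OPS[op](total, int(operand))
    return total


print(left_to_right("1 + 2 * 3"))  # 9, because (1 + 2) happens first
print(left_to_right("2 * 3 + 1"))  # 7, the multiplication just happens to come first
```

Reordering the expression so the multiplication comes first is one way to get 7 without any parentheses at all.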

Further adding to the weirdness, the parentheses are first-class objects within the interpreter. Like blocks, they are part of the abstract structure. You can even make a series of parentheses:

>> first [ ( ) ( ) ( ) ]
== ()

This means every time you use parentheses to disambiguate your code, you are making the program bigger and slower to evaluate. So if you show a REBOL developer any code with parentheses, they’ll encourage you to take as many of them out as you can. That’s going to grate on some people who build formal systems and don’t want to be “taking out parentheses in order to speed up the code”.
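One way to picture a parenthesized group as a value sitting inside the series is to model it as a nested list. The Python sketch below does that; it is an analogy under my own assumptions (evaluate_nested and OPS are invented names) and says nothing about how much time or memory a real REBOL interpreter spends on a paren.

```python
import operator

OPS = {"+": operator.add, "*": operator.mul}


def evaluate_nested(items):
    """Evaluate [value, op, value, ...] left to right; nested lists model parens."""
    def value_of(item):
        # A nested list is one item in its parent, but costs an extra evaluation.
        return evaluate_nested(item) if isinstance(item, list) else item

    total = value_of(items[0])
    for op, operand in zip(items[1::2], items[2::2]):
        total = OPS[op](total, value_of(operand))
    return total


flat = [1, "+", 2, "*", 3]
grouped = [1, "+", [2, "*", 3]]  # the inner list stands in for ( 2 * 3 )

print(len(flat), evaluate_nested(flat))        # 5 9
print(len(grouped), evaluate_nested(grouped))  # 3 7
```

The grouped version holds an extra container and needs an extra trip through the evaluator, which is the flavor of overhead being described.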

“You’re kidding. How can I code under these conditions?”

I really think that to understand why one might use a language with these properties you have to take a serious look at what makes us use English. Why do we like writing a bunch of words in a line and then ending it with a period, instead of something more “formal”? I’m not sure, but I do suspect that if English required you to call out the noun phrases or subclauses in a sentence, then << [ ( people ) | would probably ( use ) >> | { something , else } ].

Every language makes you think differently. In fact, one of my favorite free C++ books online is titled Thinking in C++. Yet REBOL tries to not stray too far from our natural instincts for language just to appease a computer’s thirst for formalism. Instead it encourages fluidity, and fulfills the confusing promise of “no reserved keywords”… as in English, everything can be overridden or given new meaning.

It takes some getting used to, certainly. There is no need to look further into the language if you could not (under any circumstances) work this way. But if you think that you’d be willing to accept these ideas if there was a payoff like the one I described in Is REBOL Actually a Revolution… then you should check out my next articles!

10 Responses to “Major Quirks of the REBOL Language”

  1. Gregg Irwin Says:

    “I’d argue REBOL is more defined by its runtime than its syntax. So whether you like the REBOL parser is only semi-relevant.”

    What you call the REBOL parser there is really REBOL’s console behavior. If you aren’t working in the console, the parser will be happy with FIRST on a line by itself. It’s the way the console converts lines to blocks for evaluation that causes the problem.

    Of course, it can bite you if you paste code into the console as well.

    Nice REBOL articles so far!

  2. Hostile Fork Says:

    Hi Gregg! Good to have an experienced REBOL programmer visiting the site…

    Interesting, I got the idea in my head that REBOL actually didn’t permit you to put a newline in a statement unless there was an open brace from somewhere. Possibly early on working with the command line, and I just never saw code that did otherwise. I’ve updated that section of the article, thanks!

    Please feel free to point any REBOL developers you know to this stuff so it gets more eyeballs auditing it. I’m trying a grassroots campaign to hopefully push REBOL a little more into the public awareness. Even like, by making little icons and such:

    http://hostilefork.com/rebol/favicon.html

    But I’d like to be correct on my technical and philosophical arguments, at least. My hope is to maybe help bridge some of the divide with the communities of influence that have made Ruby and Python so successful. Seems there must be a way…

  3. Graham Says:

    There is an open source clone of the REBOL language .. see http://freshmeat.net/projects/rebol-orca/ and it is still under development.

    BTW, your examples are being done with an old version of REBOL .. forall now resets the series back to the head on completion. Current version is 2.7.6

    Cheers,

  4. Hostile Fork Says:

    Hi Graham - Thanks for the pointer to ORCA! It doesn’t look like it’s under very active development, but it’s an interesting start. The project name is already taken in software by Sun for a screen reader, so maybe FREEBOL would be better. :)

    I’d think that breaking it into two apps (one with a graphics subsystem and one without) would be ideal. Small EXE size shouldn’t be a key goal, just small source code and leveraging existing effort. For instance, it should probably be written in C++ with Boost to handle things like cross platform file paths:

    http://www.boost.org/doc/libs/1_36_0/libs/filesystem/doc/index.htm

    And perhaps the Boehm garbage collector could manage a lot of the GC issues:

    http://www.hpl.hp.com/personal/Hans_Boehm/gc/

    But the real question comes down to REBOL Technologies’ willingness to allow an open source kernel to use their mezzanine functions. If they’re not willing, that would make it almost impossible for anyone to reasonably build an interchangeable interpreter.

    Was there any reaction to ORCA in terms of the legal issues? Should a statement be pushed for?

  5. Graham Says:

    There have been a few open source efforts to produce a REBOL clone and Orca has survived the longest, and is actually in use by the chief developer as the scripting language for Syllable ( http://web.syllable.org/pages/index.html ).

    I’m not aware of any official reaction to Orca … Carl does not tend to comment on these things, except by promising new features each time a clone appears :) I’m not aware that one can copyright a language so no one is seeing any legal issues.

    Having two distributables .. one with graphics and one without is how RT currently distribute REBOL.

    As for whether RT will allow their mezzanine functions to be used by an OS clone … the license header for the SDK says

    ” ; You are free to use, modify, and distribute this file as long as the
    ; above header, copyright, and this entire comment remains intact.
    ; This software is provided “as is” without warranties of any kind.
    ; In no event shall REBOL Technologies or source contributors be liable
    ; for any damages of any kind, even if advised of the possibility of such
    ; damage. See license for more information.

    ; Please help us to improve this software by contributing changes and
    ; fixes. See http://www.rebol.com/support.html for details.

  6. Kaj Says:

    You still have to shed a bit more of your old conceptions. :-) Standard libraries are usually not a good fit for a REBOL implementation. The Boost libraries and others are probably too big to be worth it, and the Boehm garbage collector was not designed for the fast-changing allocation of small amounts of memory that dynamic languages and REBOL in particular do. It’s also difficult to port to various platforms.

    There is no reason that the REBOL mezzanine functions couldn’t be implemented independently just like the native C interpreter itself. As long as it isn’t taking copyrighted REBOL Technologies code, there’s nothing RT can do about that. The language at a conceptual level is a form of expression and not patentable. Actually, RT is releasing quite a bit of code under the BSD license or even as public domain, so it can be shared.

    There have been a series of alternative REBOL implementations over the years, but ORCA is farthest along and the only usable one. The name does have to change someday, though.

  7. Hostile Fork Says:

    Hi Kaj, thanks for reading!

    I’d like to believe that legal pressures wouldn’t be used to shut down an open REBOL implementation. And what you say about languages seems reasonable–that it cannot be patented–but much of copyright and patent law is not based on reason! You may remember the big controversy with Sun and Microsoft over Java…

    http://www.javaworld.com/javaworld/jw-01-2001/jw-0124-iw-mssuncourt.html

    So it’s important to have an official word on the issue. If an open implementation of REBOL that shipped with the mezzanine would be considered a breach of the license or RT’s intellectual property, this must be established up front. Otherwise, open source developers should spend their time working on improving other languages and bringing some of the REBOL goodness to Python or Ruby.

    I understand your feelings about wanting to make the open implementation match REBOL’s efficiency–even so far as to implement the Mezzanine in C! But an Open REBOL effort would almost certainly be running with a skeleton crew. Consequently, if the executable ended up being 2GB for just the core… this would be an acceptable tradeoff IF it meant that the code could keep pace with the evolving standard.

    Attracting developers would rely on it not being a lot of custom code, but rather a minimalist and maintainable reinvention. Those few who are really intense about wanting the super small executable should be willing to pay RT for their proprietary and optimized version–they’ll come out ahead in the end, it’s worth the money. The point of the Open REBOL would be so that developers could not fear that there isn’t a free alternative to turn to in the case of RT taking some kind of unpleasant turn, or if a non-commercial enterprise wished to leverage a REBOL codebase.

    Best,
    Fork

  8. Kaj Says:

    As happens so often when discussing innovative concepts such as REBOL, I have to step on the brakes here to correct a number of errors and misconceptions.

    I will also remark that I find it funny to see your text above the form I’m typing this comment in:

    “Please leave a valid email address, I promise I will not use it for anything evil.”

    In the same way, you promise not to betray our trust by putting ads on your site. Well, no problem, I am willing to believe you. Even if I just learned of your existence this week. - But why shouldn’t I extend a bit of trust to other people as well; particularly someone who has been a public figure for almost a quarter century, who has contributed greatly to computer science, craft and society, who has helped me in my life and career through those products, and who is even occasionally helping me personally in business matters?

    But first the misconceptions. I’d like to clarify what Graham said above. I am not the chief developer of ORCA [http://freshmeat.net/projects/rebol-orca/]; Karl Robillard is. I am not even the chief developer of the Syllable operating system [http://web.syllable.org], although I am its co-leader together with its founder. I am the creator of the Syllable Server Linux-based operating system [http://distrowatch.com/syllable], though. Both these operating systems use ORCA in their core. Syllable Server also ships REBOL/Core and a number of open source frameworks on top of it, but is not dependent on it. We hope to use them to build your flying cars someday. ;-)

    As Richard Stallman tirelessly keeps reminding us, it’s bad to heap patent law, copyright law, trademark law and contract law together (and maybe call it intellectual property), because they’re very different. The Sun vs. Microsoft case you point to was about contract law, supported by copyright law and trademark law. MS simply signed a contract with Sun to get to use their Java code and broke the contract requirements. It even cost them a lot of effort to win, but Sun could never force an independent Java implementation to be compatible, or force them by law in other ways, if they don’t sign a contract with Sun, don’t steal their proprietary code, and don’t violate their Java and Sun trademarks. Even under US law.

    Nobody who worked on ORCA has a contract affecting it with RT.

    No RT C code was used in ORCA’s implementation (nobody outside RT even has it, probably even outside Carl, except for escrow). It’s not always possible to have mezzanine functions that don’t resemble the RT REBOL functions, especially if they’re very small, but that in itself is a proof that such cases are not significant engineering efforts, so they wouldn’t fall under copyright and patent law. I didn’t even know that the mezzanines in the RT SDK seem to have a permissive license, but the goal of the ORCA mezzanines is to be independent implementations (which may in some cases even be necessitated by a different underlying implementation of ORCA).

    Carl has stated that the language at a conceptual level is a form of communication and not patentable in the US. That certainly means it’s not patentable in Europe, where software patents were eventually prevented. This is confirmed by many independent implementations of many other languages.

    Presumably, RT could keep ORCA from explaining that it means “Open source REBOL Can be Achieved”. The working name for my Linux distribution was ReboLinux, but since both REBOL and Linux are trademarked, it would have been asking for trouble to release it under that name. As I said, the name ORCA is overloaded and has to change, anyway.

    On to technical misconceptions. I didn’t say anything about ORCA’s efficiency, and certainly not that I would want to match REBOL’s by implementing mezzanines in C, instead. As far as I am concerned, the goals of ORCA are the same as those of REBOL, except for being open source. The level of implementation of each function, in C or in REBOL/ORCA, must be decided separately for each function in conjunction with the engine it’s running on. If you do these things well and are not oblivious to computer science and engineering craft, there is no reason that ORCA would have to be significantly bigger or slower than REBOL.

    I also don’t see why an open-source REBOL project would be condemned to be run by “a skeletal crew”. Open source language projects range from dead and buried to hundreds of contributors.

    To conclude, I think you know that the only certainties in this life we’re in are death and taxes. We must have a business sense about that. I don’t think it’s reasonable to ask a company to issue a statement with the goal of making it easier for others to compete with them, without getting anything in return. Carl wants us to help, without fracturing his core, which is a semantic platform that would be prevented from fulfilling its promise if everyone would have an incompatible implementation. This problem is very real, as Karl Robillard has already gone in a different direction after ORCA with his Thune language. Things get fishy very quickly. ;-)

    As I said before, apart from the core, RT releases many things under open-source licenses and in the public domain. Legally, it doesn’t get clearer than this. You can use them under copyright law according to their licenses.

    I am writing these comments under an email address I set up particularly for your blog. If you turn capitalistic on us, or what you call “evil”, I can block it. I am including REBOL in Syllable, but I’m also including ORCA and making sure the base system is dependent on that and RT’s open-source code, instead of the REBOL core.

  9. Hostile Fork Says:

    Hi Kaj,

    I knew that there was a special relationship between MS and Sun in the Java case. But I’m not a legal expert, so I do not know precisely what you can and can’t do when it comes to implementing someone’s language down to the letter. Bear in mind that people have filed lawsuits on things like “similar look and feel”.

    Also, it’s important to remember that companies can be sold, and their new owners can have different ideas. Imagine–for instance–RT falling into the hands of Software Conglomerate. This conglomerate immediately sends a letter from a lawyer demanding the code that analyzes for the “REBOL []” header in source files be changed to say “ORCA []”.

    That’s on day 1. Who knows what would be on day 2? Technicalities preoccupy people who haven’t gotten a solid commitment that it’s NOT going to happen.

    “why shouldn’t I extend a bit of trust to other people as well; particularly someone who has been a public figure for almost a quarter century, who has contributed greatly to computer science, craft and society, who has helped me in my life and career through those products, and who is even occasionally helping me personally in business matters?”

    Do not get me wrong, Carl seems like a great guy. If you have a specific quote or article about his disposition toward open source implementations of REBOL, then that’s exactly the kind of thing I’m looking for. My disclaimers are not etched in stone here (nor is the rest of the article). I’m just trying to surface the correct information, as Google didn’t reveal it to me.

    I’m a reasonably trusting individual, but I’ve learned it is best to get things in writing—even when dealing with people you know to be good. Until you’ve done that, everyone might not be on the same page (remember the “when you ass-u-me…” joke). We all have different ideas about what “good” or “evil” are, and even if we agree in 99% of cases the 1% might be the most relevant bit at some point in time.

    So getting a clear statement of DOs and DONTs up front would be good.

    “I also don’t see why an open-source REBOL project would be condemned to be run by ‘a skeletal crew’.”

    Not meaning to condemn anyone’s project! It’s clear ORCA represents a lot of work and I was glad to be shown it. But do note that the main trunk of Python was updated less than two hours ago (as of when I started writing this comment):

    http://svn.python.org/view/python/trunk/

    Ruby Core, also 2 hours ago:

    http://svn.ruby-lang.org/cgi-bin/viewvc.cgi/trunk/

    If I’m reading correctly, the last updates to ORCA were 10 months ago:

    http://trac.geekisp.com/orca/browser/trunk/orca

    That’s just a data point. I poked through the code a bit and had some opinions on how I would do it differently. I’d use as many well-vetted cross-platform C++ libraries as I could. And using the boost file path library is a perfect example of what I would do instead of trying to rewrite the forward/backslash logic, etc.

    If you disagree with that then you would probably disagree with other ways I would do it. But these are my opinions, and we must not mistake disagreements for misconceptions! On the subject of engineering craft, especially hand-optimized C vs C++…this article by Bjarne Stroustrup is food for thought:

    http://www.research.att.com/~bs/new_learning.pdf

    Bjarne’s samples are pretty much what I would have given. There are just too many opportunities for bugs when writing old-style C. Even if someone is a maverick and can strcpy with the best of them, you won’t find many people at that level to collaborate with. The code falls down at inconvenient times, and has vulnerabilities to hacking.

    “I don’t think it’s reasonable to ask a company to issue a statement with the goal of making it easier for others to compete with them, without getting anything in return.”

    REBOL is competing for mindshare among programmers. Right now only a small fringe “gets” REBOL or cares to try to. You may know Carl and RT staff personally, but others aren’t in that situation. They want reassurance and fallbacks, and the point I’m trying to make is that those things aren’t currently available. If they were, it could change some attitudes.

    A viable and active open source version would be a great thing for RT, because commercial enterprises that wanted the “optimized” version *would* pay for it. There’d be a lot more code out there produced by academics and free software hackers for them to be running. In the meantime, you have evangelists flocking to other languages, making them more accessible… volunteering to design pretty websites…

    http://tryruby.hobix.com/

    That could have been REBOL taking the world by storm! My hypothesis is that REBOL has failed to do this for reasons that are not at all based on a lack of technical merit… and I’d like to help hammer a few of them out if I can…

  10. Kaj Says:

    Please note that I’m not trying to sell you either REBOL or ORCA. You found them yourself and are enquiring about their state on your blog. I’m giving you some insight into what that state is. People in the REBOL community know very well that people who find that state dissatisfying are not going to change their minds about it. Yet, we decided that we still want to work with the technology because the benefits far outweigh the disadvantages. As I have tried to make clear, for each of us that is an educated guesstimate and some failsafes that we can put into place in case something goes against our expectations.

    RT is not going to open source REBOL, at least not in the near future, and they are not going to sign legal documents giving legal rights away to competitors. I don’t think I would even want them to, because it doesn’t make business sense and would be a sign that they are not running their business properly. As it stands, they have been in business for eleven years. They are, however, releasing source code of their choice under open-source licenses of their choice. Those are products and legal documents you can work with. Decide for yourself if it’s worth it to you and then take it or leave it. If you insist on having code to the core, build one yourself. If you want the project to be run differently, set one up yourself. The great paradox here is that everyone thinks REBOL is special, yet everyone seems to think that Carl should change the process that created it.

    ORCA is available, it works, it’s not finished, and it’s not being developed quickly at the moment. One of the reasons it’s not being developed is that it works for me. I’m happily using it to support two operating systems. When I find issues that block me, I fix them. Once I want more out of it, I’m going to do more work on it. Further, it’s written in C. The engine is complete. It has its own garbage collector, it evaluates paths. It’s great fun theorising about how it could be done differently, and if you like having such discussions, the REBOL community is a great place for you. But it would amount to writing yet another clone. There are two in C, one in Java, one in JavaScript and even a REBOL interpreter in REBOL. Personally, I would have done it in OCaml. Yet, ORCA is the only usable one, so I use it.

Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported