REBMU: Code Golf Language for Humans
Code Golf is the recreational pursuit of solving programming problems in as few characters of ASCII source code as possible. Rebmu is a language tailored to winning such challenges (hopefully!)—with several twists. One is that it specifically emphasizes the ability for humans to read, absorb and maintain the source code without assistive tools, despite their compression.
Here is the “long way” to express a Rebmu program which reads a string of Roman Numerals from the input console, and then converts it into a decimal equivalent:
r s fe c s [ n: se [x 10 i 1 v 5 l 50 c 100 d 500 m 1000] tw c uz j [a+ k el j n [al sb n j n: 0] 'j j: n] j: n cn ] k + j
When the shorthands above are mapped into the commands they represent, this reads very naturally.
r s means “read in the string s”.
fe c s says “foreach character c in the string s, do the following code block”.
uz j means “unless j is zero, do the following codeblock”. It’s straightforward, but there’s more here than meets the eye in terms of the type system etc. I’ll get to that in a moment, but…
…to put the language on the map for Code Golf, I threw in a gimmick. Since Rebmu isn’t case-sensitive in terms of its identifier lookup, it lets you use the alternation of upper and lower case characters to separate words. The process is called “mushing”—and though purely optional it can empirically save about 40% on the character count:
Separations are in character runs (
ABCdefGHIjkl) instead of using an uppercase character to indicate each break (
AbcDefGhiJkl). This is so that the special choice of when to capitalize the letters of the first word in a group can indicate whether it represents an assignment or not. So
ABCdef gives you
abc: def (an assignment of def to abc) while
abcDEF gives you
abc def (two separate words, abc and def).
Trying to read it for the first time is a little bit like stumbling over Pig Latin before you know the trick. But once your brain “gets it”, the transformation is pretty transparent. Well, at least compared to other character count reduction methods…like converting to base-64! (That may sound outrageous, but people actually submit “winning” entries that are all hexadecimal-encoded machine code… I’m not kidding.)
The language is real and can be run on almost any platform. The source is published on GitHub:
>> do %rebmu.r Script: "Rebmu Dialect" (10-Jan-2010) Script: "The Mu Rebol Library" (none) Script: "Mushing Routines" (none) >> rebmu [pr"HEllO, REBmu!"] HEllO, REBmu! >> rebmu %examples/roman.r Input String: MCCCXXXVII == 1337
I’ve been solving old StackOverflow code golf challenges in Rebmu, to test and refine the design:
The Punch Line
What makes Rebmu somewhat mind-blowing is that it is actually not an independent language in its own right, but rather what is known as a “dialect” of something called Rebol. I didn’t have complete freedom to define the syntax, so Rebmu is trickily dodging the cases which would generate invalid tokens (
A1 B is legal Rebol, but
A 1B will give you an “invalid integer” during the parse).
Because of this sleight of hand, you can slip in and out of using the abbreviated form. That makes a wide range of functions and alternate coding styles available, as with the date arithmetic injected into this example:
So unlike other golfing-specific languages like GolfScript, Rebmu can speak internet protocols, handle Unicode, and draw GUIs. And the interpreter is shockingly small—delegating most all parsing issues to Rebol (which is shockingly small itself). Yet it preserves as much of the original language’s character as possible, so that knowledge transfers easily.
The design is stabilizing, but still in flux as I look at what balances readability with what’s commonly useful to abbreviate in the problems I’m looking at. That does introduce the risk that I might subconsciously tailor the system to narrowly solve only those problems succinctly. Yet just like in real golf, a proper game of code golf is not played in one hole and then a winner declared. The true test of Rebmu’s chutzpah will be when it is used to play several games with no modifications to the language during that tournament.
I scribbled down some notes the night I thought of it, on 8-Jan-2010 in a blog post modest-proposal-for-code-golf-in-rebol. At that point it was only an idea with no implementation, but two days later I made the first version (capable of running the “mushed” Roman numeral example). Since then it has evolved into an implementation which others could reasonably start using.
YOU CAN HELP! If Code Golf is an interesting game to you, or Rebol an interesting language (or both) then I invite your collaboration!
(Note: If you believe Code Golf is of dubious benefit, I know where you’re coming from. But I think the key is that practicing problem solving of any kind under constraints leads one to stretch into new ways of thinking. Some other opinions on this matter are in this StackOverflow question)