USCII: Character Codes With Meaning

UPDATE: Try out the new online Encoder and Decoder for USCII-5×7-ENGLISH-C0!
USCII (“you-ski”) stands for Universal Semiotic Coding for Information Interchange. It is a system for embedding pictures inside the coded numbers agreed upon to represent symbols and signals. I was inspired to create it by the famous Arecibo Message, which attempted to convey humanity’s physics knowledge without assuming a cultural context other than math.
For instance, instead of ASCII’s encoding of 65 for “A” and 66 for “B”…we might consider using the number 15621226033 for “A” and 32801506878 for “B”. To see the bitmaps, you must first convert these values into binary, padding with leading zeros to 35 bits:
- 15621226033 (base 10) = 01110100011000110001111111000110001 (base 2)
- 32801506878 (base 10) = 11110100011000111110100011000111110 (base 2)
When transmitted in a medium which hints at the significance of a 35-bit pattern, the semiprime nature of 35 suggests decomposing it into the factors 7 and 5. Interpreting these bits as a 5×7 rectangle produces a picture—which happens to be a picture of the letter which the code is representing! For 01110100011000110001111111000110001, you would get:
0 1 1 1 0
1 0 0 0 1
1 0 0 0 1
1 0 0 0 1
1 1 1 1 1
1 0 0 0 1
1 0 0 0 1
Do you see the “A” drawn by the ones? Now let’s try putting a break at every fifth bit in 11110100011000111110100011000111110:
1 1 1 1 0
1 0 0 0 1
1 0 0 0 1
1 1 1 1 0
1 0 0 0 1
1 0 0 0 1
1 1 1 1 0
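The two steps above (write the number as 35 bits, then break after every fifth bit) can be sketched in a few lines of Python. This is only an illustration and not the script that generates the specification; the function names `to_rows`, `from_rows` and `draw` are made up for this example.

```python
WIDTH, HEIGHT = 5, 7  # the 5x7 grid used in the examples above


def to_rows(code: int, width: int = WIDTH, height: int = HEIGHT) -> list[str]:
    """Split an integer code into `height` strings of `width` bits each."""
    size = width * height
    if not 0 <= code < 2 ** size:
        raise ValueError(f"code does not fit in {size} bits")
    bits = format(code, f"0{size}b")  # zero-pad so leading blank pixels survive
    return [bits[i:i + width] for i in range(0, size, width)]


def from_rows(rows: list[str]) -> int:
    """Inverse of to_rows: join the rows and read them as one binary number."""
    return int("".join(rows), 2)


def draw(rows: list[str]) -> str:
    """Render ones as '#' and zeros as '.' so the glyph is easier to see."""
    return "\n".join(row.replace("1", "#").replace("0", ".") for row in rows)


if __name__ == "__main__":
    print(draw(to_rows(15621226033)))  # the code given above for "A"
```

The zero-padding matters: the code for “A” starts with a 0 bit, so a plain binary conversion yields only 34 digits and the rows would no longer line up.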
Since USCII is a general methodology, there can be many standards. The only requirement is that the bit length be a product of two primes (a semiprime), so that its factors suggest the dimensions of the picture. You could have a 31×31 USCII which establishes codes for all the characters in Chinese. Or a 7×11 USCII that has all the characters you need for Dutch…as well as a standard set of emoticons!
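To illustrate how a receiver might guess the grid dimensions from nothing but the pattern length, here is a small Python sketch. It assumes the length is meant to be a semiprime, as with 35 = 5×7; the function name `grid_dimensions` is hypothetical and not part of any USCII tooling.

```python
import math


def is_prime(n: int) -> bool:
    """Trial-division primality test; fine for the small lengths used here."""
    if n < 2:
        return False
    return all(n % d for d in range(2, math.isqrt(n) + 1))


def grid_dimensions(bit_length: int) -> tuple[int, int]:
    """Return (p, q) with p <= q, both prime, and p * q == bit_length.

    Raises ValueError when the length is not a product of exactly two primes.
    """
    for p in range(2, math.isqrt(bit_length) + 1):
        if bit_length % p == 0:
            q = bit_length // p
            # The smallest divisor p is always prime, so only q needs checking.
            if is_prime(q):
                return (p, q)
            break
    raise ValueError(f"{bit_length} is not a product of two primes")


if __name__ == "__main__":
    for length in (35, 77, 961):
        print(length, grid_dimensions(length))
```

Note that the factors alone do not settle the orientation: a 35-bit pattern could be read as 5 wide by 7 tall or as 7 wide by 5 tall, and the receiver has to try both to see which one produces a recognizable picture.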
USCII codes have additional interesting properties, some of which I laid out in a blog post from last year.
Arecibo ASCII
I’ve developed a *draft* specification of USCII variation “5×7-ENGLISH-C0”. This uses 35 bits per character, and includes printable characters as well as the “C0 control codes”. You can read the script that generates it, which contains comments on why I picked the bit patterns:
I’ve informally named this variant “Arecibo Ascii”. That’s because it is possible to losslessly convert a stream of conventional ASCII characters into USCII-5×7-ENGLISH-C0 (and back again). It’s still a work in progress, but here’s the table as it currently stands:
| ASCII | Character | Arecibo ASCII (35-bit binary) |
| 0 | Null character | 10101010101010101010101010101010101 |
| 1 | Start of Header | 10101101111010110111101011011110101 |
| 2 | Start of Text | 11011111111101111111110111111111011 |
| 3 | End of Text | 11011110111101111011110111111111011 |
| 4 | End of Transmission | 11111111111111111111111111001110011 |
| 5 | Enquiry | 11111111010000011101101110000010111 |
| 6 | Acknowledgment | 11111101011111111111011101000111111 |
| 7 | Bell | 11011100011000110001000001111111011 |
| 8 | Backspace | 11111110111011100000101111101111111 |
| 9 | Horizontal Tab | 00000000000000111101000010000000000 |
| 10 | Line Feed | 11100001000010000100111110111000100 |
| 11 | Vertical Tab | 00100001000010000100001000000001110 |
| 12 | Form feed | 11111011100010000000111110111000100 |
| 13 | Carriage return | 00001000010010101101111110110000100 |
| 14 | Shift Out | 00100101111101111011110111110111100 |
| 15 | Shift In | 11100111011101111011110111011100100 |
| 16 | Data Link Escape | 11111111110010001110001001111111111 |
| 17 | Device Control 1 | 11111101111001110001100111011111111 |
| 18 | Device Control 2 | 11011110110101001010011100111010001 |
| 19 | Device Control 3 | 11111101011010110101101011010111111 |
| 20 | Device Control 4 | 11111100011000110001100011000111111 |
| 21 | Negative Acknowledgement | 11111101011111111111100010111011111 |
| 22 | Synchronous Idle | 11111111111111111111111110101011111 |
| 23 | End of Trans. Block | 11111000000111001010011100000011111 |
| 24 | Cancel | 10001000000101000100010100000010001 |
| 25 | End of Medium | 11111100010110001010001101000111111 |
| 26 | Substitute | 10001011101111011101110111111111011 |
| 27 | Escape | 00011001100101011110011100111010001 |
| 28 | File Separator | 10101101011010110101101011010110101 |
| 29 | Group Separator | 11011110111101111011110111101111011 |
| 30 | Record Separator | 11110111101101010010000001001111011 |
| 31 | Unit Separator | 11111111111111111111100111101110111 |
| 32 | Space | 00000000000000000000000000000000000 |
| 33 | ! | 00100001000010000100000000000000100 |
| 34 | " | 01010010100101000000000000000000000 |
| 35 | # | 01010010101111101010111110101001010 |
| 36 | $ | 00100011111010001110001011111000100 |
| 37 | % | 11000110010001000100010001001100011 |
| 38 | & | 01100100101010001000101011001001101 |
| 39 | ' | 01100001000100000000000000000000000 |
| 40 | ( | 00010001000100001000010000010000010 |
| 41 | ) | 01000001000001000010000100010001000 |
| 42 | * | 00000001001010101110101010010000000 |
| 43 | + | 00000001000010011111001000010000000 |
| 44 | , | 00000000000000000000011000010001000 |
| 45 | - | 00000000000000011111000000000000000 |
| 46 | . | 00000000000000000000000000110001100 |
| 47 | / | 00000000010001000100010001000000000 |
| 48 | 0 | 01110100011001110101110011000101110 |
| 49 | 1 | 00100011000010000100001000010001110 |
| 50 | 2 | 01110100010000100010001000100011111 |
| 51 | 3 | 11111000100010000010000011000101110 |
| 52 | 4 | 00010001100101010010111110001000010 |
| 53 | 5 | 11111100001111000001000011000101110 |
| 54 | 6 | 00110010001000011110100011000101110 |
| 55 | 7 | 11111000010001000100010000100001000 |
| 56 | 8 | 01110100011000101110100011000101110 |
| 57 | 9 | 01110100011000101111000010001001100 |
| 58 | : | 00000011000110000000011000110000000 |
| 59 | ; | 00000011000110000000011000010001000 |
| 60 | < | 00010001000100010000010000010000010 |
| 61 | = | 00000000001111100000111110000000000 |
| 62 | > | 01000001000001000001000100010001000 |
| 63 | ? | 01110100010000100010001000000000100 |
| 64 | @ | 01110100010000101101101011010101110 |
| 65 | A | 01110100011000110001111111000110001 |
| 66 | B | 11110100011000111110100011000111110 |
| 67 | C | 01110100011000010000100001000101110 |
| 68 | D | 11100100101000110001100011001011100 |
| 69 | E | 11111100001000011110100001000011111 |
| 70 | F | 11111100001000011110100001000010000 |
| 71 | G | 01110100011000010111100011000101111 |
| 72 | H | 10001100011000111111100011000110001 |
| 73 | I | 01110001000010000100001000010001110 |
| 74 | J | 00111000100001000010000101001001100 |
| 75 | K | 10001100101010011000101001001010001 |
| 76 | L | 10000100001000010000100001000011111 |
| 77 | M | 10001110111010110101100011000110001 |
| 78 | N | 10001100011100110101100111000110001 |
| 79 | O | 01110100011000110001100011000101110 |
| 80 | P | 11110100011000111110100001000010000 |
| 81 | Q | 01110100011000110001101011001001101 |
| 82 | R | 11110100011000111110101001001010001 |
| 83 | S | 01111100001000001110000010000111110 |
| 84 | T | 11111001000010000100001000010000100 |
| 85 | U | 10001100011000110001100011000101110 |
| 86 | V | 10001100011000110001100010101000100 |
| 87 | W | 10001100011000110001101011010101010 |
| 88 | X | 10001100010101000100010101000110001 |
| 89 | Y | 10001100011000101010001000010000100 |
| 90 | Z | 11111000010001000100010001000011111 |
| 91 | [ | 01110010000100001000010000100001110 |
| 92 | \ | 00000100000100000100000100000100000 |
| 93 | ] | 01110000100001000010000100001001110 |
| 94 | ^ | 00100010101000100000000000000000000 |
| 95 | _ | 00000000000000000000000000000011111 |
| 96 | ` | 01000001000001000000000000000000000 |
| 97 | a | 00000000000111000001011111000101111 |
| 98 | b | 10000100001000011110100011000111110 |
| 99 | c | 00000000000111110000100001000001111 |
| 100 | d | 00001000010000101111100011000101111 |
| 101 | e | 00000000000111010001111111000001111 |
| 102 | f | 00010001010010001110001000010000100 |
| 103 | g | 00000000000111110001011110000111110 |
| 104 | h | 10000100001000011110100011000110001 |
| 105 | i | 00000001000000000100001000010000100 |
| 106 | j | 00010000000001000010000101001001100 |
| 107 | k | 01000010000100101010011000101001001 |
| 108 | l | 01100001000010000100001000010001110 |
| 109 | m | 00000000001101110101101011010110001 |
| 110 | n | 00000000001011011001100011000110001 |
| 111 | o | 00000000000111010001100011000101110 |
| 112 | p | 00000000001111010001111101000010000 |
| 113 | q | 00000000000111110001011110000100001 |
| 114 | r | 00000000001011011001100001000010000 |
| 115 | s | 00000000000111110000011100000111110 |
| 116 | t | 00100001001111100100001000010100010 |
| 117 | u | 00000000001000110001100011000101110 |
| 118 | v | 00000000001000110001100010101000100 |
| 119 | w | 00000000001000110001101011010101010 |
| 120 | x | 00000000001000101010001000101010001 |
| 121 | y | 00000000001000101010001000010001000 |
| 122 | z | 00000000001111100010001000100011111 |
| 123 | { | 00011001000010001000001000010000011 |
| 124 | | | 00100001000010000000001000010000100 |
| 125 | } | 11000001000010000010001000010011000 |
| 126 | ~ | 00001011101000000000000000000000000 |
| 127 | Delete | 11111110001010100010101011100011111 |
The C0 codes are admittedly rather tricky, especially when it comes to depicting things like “Data Link Escape” or “Device Control 1”! It would be possible to use a larger bit size and get clearer images. But I’d like to see how far the 35-bit standard can go in cueing people who aren’t familiar with ASCII into what the bitmaps signify…
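As a rough illustration of the lossless round trip between ASCII and the table above, here is a Python sketch using just four rows copied from it. It assumes the 35-bit codes are simply concatenated with no framing between them, which the draft does not spell out; `TABLE`, `encode` and `decode` are names invented for this example.

```python
CODE_BITS = 35

# A small excerpt of the table above: ASCII value -> 35-bit pattern.
TABLE = {
    32: "0" * CODE_BITS,                          # Space
    65: "01110100011000110001111111000110001",  # A
    66: "11110100011000111110100011000111110",  # B
    67: "01110100011000010000100001000101110",  # C
}
REVERSE = {bits: code for code, bits in TABLE.items()}


def encode(data: bytes) -> str:
    """Replace each ASCII byte with its 35-bit pattern (KeyError if unlisted)."""
    return "".join(TABLE[byte] for byte in data)


def decode(bits: str) -> bytes:
    """Cut the stream into 35-bit chunks and look each one up again."""
    if len(bits) % CODE_BITS:
        raise ValueError("stream length is not a multiple of 35 bits")
    chunks = (bits[i:i + CODE_BITS] for i in range(0, len(bits), CODE_BITS))
    return bytes(REVERSE[chunk] for chunk in chunks)


if __name__ == "__main__":
    stream = encode(b"AB CAB")
    print(len(stream), decode(stream))
```

The round trip works because every pattern in the table is distinct and has the same fixed length, so the chunk boundaries are never ambiguous.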

October 18th, 2010 at 3:40 pm
Neat idea! In your examples for A and B, may I suggest you insert a space between the digits (for better physical proportions), and reverse the red and blue colors (I kept trying to “read” the blue characters, which are the background). And I think the crossbar for the A should be one row higher.
[I did something like this years ago to punch human-readable messages into paper tape (with an ASR-33 Teletype driven by a Data General Nova minicomputer, using BASIC), but used a 6-dot-high character, leaving an unpunched row along the edges of the tape to make it less easily tearable. Yes, I’m old.]
December 20th, 2010 at 7:35 pm
@RSMilward: You are right, the red/blue was hard to read. I switched to black and gray and it’s much better.
As for changing the symbol for A… well, one issue is that I have sent a few Arecibo Ascii messages out into the ether by now. And if I change the patterns then that means they won’t decode. Which wouldn’t be the end of the world with this standard, especially since people can go down to the bits and judge for themselves what the letter is!
But the font was not my design, it came from the PIC CPU:
http://www.noritake-itron.com/Softview/fontspic.htm
I’m assuming they had a reason for putting the crossbar where they did. In context it may help differentiate the A… I don’t know (?) It’s certainly not the most questionable choice in the standard so far.
What I did do is make an updated check-in so that all the printable characters are now easy to edit in the source:
http://github.com/hostilefork/uscii/blob/master/specifications/uscii-5×7-english-c0.rebol
So if the idea entertains you enough to want to get involved in helping nail down the symbols for a “formal” release…I’ll check the pull requests.
Thanks for the interest, in any case!