USCII: Character Codes With Meaning

UPDATE: Try out the new online Encoder and Decoder for USCII-5×7-ENGLISH-C0!
USCII (“you-ski”) stands for Universal Semiotic Coding for Information Interchange. It is a system for embedding pictures inside the coded numbers agreed upon to represent symbols and signals. I was inspired to create it by the famous Arecibo Message, which attempted to convey humanity’s scientific knowledge without assuming a cultural context other than math.
For instance, instead of ASCII’s encoding of 65 for “A” and 66 for “B”, we might consider using the number 15621226033 for “A” and 32801506878 for “B”. To see the bitmaps, you must first convert these values into binary, padding with leading zeros to 35 bits:
- 15621226033 (base 10) = 01110100011000110001111111000110001 (base 2)
- 32801506878 (base 10) = 11110100011000111110100011000111110 (base 2)
When transmitted in a medium which hints at the significance of a 35-bit pattern, the semiprime nature of 35 suggests decomposing it into the factors 7 and 5. Interpreting these bits as a 5×7 rectangle produces a picture—which happens to be a picture of the letter which the code is representing! For 01110100011000110001111111000110001, you would get:
01110
10001
10001
10001
11111
10001
10001
Do you see the “A” drawn by the ones? Now let’s try putting a break at every fifth bit in 11110100011000111110100011000111110:
11110
10001
10001
11110
10001
10001
11110
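The slicing step described above can be sketched in a few lines of Python. This is only an illustration, not the script behind the specification: the function name `render_glyph` and its parameters are made up for this example, and it assumes the bits are read row by row, most significant bit first, as in the bitmaps shown above.

```python
def render_glyph(code: int, width: int = 5, height: int = 7) -> str:
    """Render an integer code as `height` rows of `width` bits each.

    Hypothetical helper for illustration; assumes row-major order with
    the most significant bit in the top-left corner.
    """
    total_bits = width * height
    if code < 0 or code >= 2 ** total_bits:
        raise ValueError(f"code does not fit in {total_bits} bits")
    # Zero-pad on the left so leading blank pixels are not lost.
    bits = format(code, f"0{total_bits}b")
    rows = [bits[i:i + width] for i in range(0, total_bits, width)]
    return "\n".join(rows)


if __name__ == "__main__":
    # Prints the seven rows of the "A" bitmap shown earlier.
    print(render_glyph(15621226033))
```

Going the other way is a single call: `int("01110100011000110001111111000110001", 2)` turns the 35-bit string back into the decimal code.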
Since USCII is a general methodology, there can be many standards. The only requirement is that the bit length be a semiprime (the product of two primes), so that the pattern factors into a rectangle in essentially one way. You could have a 31×31 USCII which establishes codes for all the characters in Chinese. Or a 7×11 USCII that has all the characters you need for Dutch…as well as a standard set of emoticons!
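To make the factoring idea concrete, here is a rough sketch of how a receiver might recover the rectangle's dimensions from nothing but the bit length. The function name `semiprime_factors` is hypothetical, and trial division is simply the easiest method to read, not something any USCII standard prescribes.

```python
import math
from typing import Optional, Tuple


def is_prime(n: int) -> bool:
    """Trial-division primality test; fine for small bit lengths."""
    if n < 2:
        return False
    return all(n % d for d in range(2, math.isqrt(n) + 1))


def semiprime_factors(n: int) -> Optional[Tuple[int, int]]:
    """Return (p, q) with p <= q, both prime and p * q == n, else None."""
    for p in range(2, math.isqrt(n) + 1):
        if n % p == 0:
            # p is the smallest divisor above 1, so it is prime;
            # n is a semiprime only if the cofactor is prime as well.
            q = n // p
            return (p, q) if is_prime(q) else None
    return None  # n is prime, or too small to factor


if __name__ == "__main__":
    for length in (35, 77, 961, 36):
        print(length, semiprime_factors(length))
```

The factors alone do not say which one is the row width, so a receiver would still need to try both orientations (5×7 and 7×5) and see which one yields a recognizable picture.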
USCII codes have additional interesting properties, some of which I laid out in a blog post from last year.
Arecibo ASCII
I’ve developed a *draft* specification of USCII variation “5×7-ENGLISH-C0”. This uses 35 bits per character, and includes printable characters as well as the “C0 control codes”. You can read the script that generates it, which contains comments on why I picked the bit patterns.
I’ve informally named this variant “Arecibo ASCII”. That’s because it is possible to losslessly convert a stream of conventional ASCII characters into USCII-5×7-ENGLISH-C0 (and back again). It’s still a work in progress, but here’s the table as it currently stands:
| ASCII | Character | Arecibo ASCII (35-bit binary) |
| --- | --- | --- |
| 0 | Null character | 10101010101010101010101010101010101 |
| 1 | Start of Header | 10101101111010110111101011011110101 |
| 2 | Start of Text | 11011111111101111111110111111111011 |
| 3 | End of Text | 11011110111101111011110111111111011 |
| 4 | End of Transmission | 11111111111111111111111111001110011 |
| 5 | Enquiry | 11111111010000011101101110000010111 |
| 6 | Acknowledgment | 11111101011111111111011101000111111 |
| 7 | Bell | 11011100011000110001000001111111011 |
| 8 | Backspace | 11111110111011100000101111101111111 |
| 9 | Horizontal Tab | 00000000000000111101000010000000000 |
| 10 | Line Feed | 11100001000010000100111110111000100 |
| 11 | Vertical Tab | 00100001000010000100001000000001110 |
| 12 | Form feed | 11111011100010000000111110111000100 |
| 13 | Carriage return | 00001000010010101101111110110000100 |
| 14 | Shift Out | 00100101111101111011110111110111100 |
| 15 | Shift In | 11100111011101111011110111011100100 |
| 16 | Data Link Escape | 11111111110010001110001001111111111 |
| 17 | Device Control 1 | 11111101111001110001100111011111111 |
| 18 | Device Control 2 | 11011110110101001010011100111010001 |
| 19 | Device Control 3 | 11111101011010110101101011010111111 |
| 20 | Device Control 4 | 11111100011000110001100011000111111 |
| 21 | Negative Acknowledgement | 11111101011111111111100010111011111 |
| 22 | Synchronous Idle | 11111111111111111111111110101011111 |
| 23 | End of Trans. Block | 11111000000111001010011100000011111 |
| 24 | Cancel | 10001000000101000100010100000010001 |
| 25 | End of Medium | 11111100010110001010001101000111111 |
| 26 | Substitute | 10001011101111011101110111111111011 |
| 27 | Escape | 00011001100101011110011100111010001 |
| 28 | File Separator | 10101101011010110101101011010110101 |
| 29 | Group Separator | 11011110111101111011110111101111011 |
| 30 | Record Separator | 11110111101101010010000001001111011 |
| 31 | Unit Separator | 11111111111111111111100111101110111 |
| 32 | Space | 00000000000000000000000000000000000 |
| 33 | ! | 00100001000010000100000000000000100 |
| 34 | " | 01010010100101000000000000000000000 |
| 35 | # | 01010010101111101010111110101001010 |
| 36 | $ | 00100011111010001110001011111000100 |
| 37 | % | 11000110010001000100010001001100011 |
| 38 | & | 01100100101010001000101011001001101 |
| 39 | ' | 01100001000100000000000000000000000 |
| 40 | ( | 00010001000100001000010000010000010 |
| 41 | ) | 01000001000001000010000100010001000 |
| 42 | * | 00000001001010101110101010010000000 |
| 43 | + | 00000001000010011111001000010000000 |
| 44 | , | 00000000000000000000011000010001000 |
| 45 | - | 00000000000000011111000000000000000 |
| 46 | . | 00000000000000000000000000110001100 |
| 47 | / | 00000000010001000100010001000000000 |
| 48 | 0 | 01110100011001110101110011000101110 |
| 49 | 1 | 00100011000010000100001000010001110 |
| 50 | 2 | 01110100010000100010001000100011111 |
| 51 | 3 | 11111000100010000010000011000101110 |
| 52 | 4 | 00010001100101010010111110001000010 |
| 53 | 5 | 11111100001111000001000011000101110 |
| 54 | 6 | 00110010001000011110100011000101110 |
| 55 | 7 | 11111000010001000100010000100001000 |
| 56 | 8 | 01110100011000101110100011000101110 |
| 57 | 9 | 01110100011000101111000010001001100 |
| 58 | : | 00000011000110000000011000110000000 |
| 59 | ; | 00000011000110000000011000010001000 |
| 60 | < | 00010001000100010000010000010000010 |
| 61 | = | 00000000001111100000111110000000000 |
| 62 | > | 01000001000001000001000100010001000 |
| 63 | ? | 01110100010000100010001000000000100 |
| 64 | @ | 01110100010000101101101011010101110 |
| 65 | A | 01110100011000110001111111000110001 |
| 66 | B | 11110100011000111110100011000111110 |
| 67 | C | 01110100011000010000100001000101110 |
| 68 | D | 11100100101000110001100011001011100 |
| 69 | E | 11111100001000011110100001000011111 |
| 70 | F | 11111100001000011110100001000010000 |
| 71 | G | 01110100011000010111100011000101111 |
| 72 | H | 10001100011000111111100011000110001 |
| 73 | I | 01110001000010000100001000010001110 |
| 74 | J | 00111000100001000010000101001001100 |
| 75 | K | 10001100101010011000101001001010001 |
| 76 | L | 10000100001000010000100001000011111 |
| 77 | M | 10001110111010110101100011000110001 |
| 78 | N | 10001100011100110101100111000110001 |
| 79 | O | 01110100011000110001100011000101110 |
| 80 | P | 11110100011000111110100001000010000 |
| 81 | Q | 01110100011000110001101011001001101 |
| 82 | R | 11110100011000111110101001001010001 |
| 83 | S | 01111100001000001110000010000111110 |
| 84 | T | 11111001000010000100001000010000100 |
| 85 | U | 10001100011000110001100011000101110 |
| 86 | V | 10001100011000110001100010101000100 |
| 87 | W | 10001100011000110001101011010101010 |
| 88 | X | 10001100010101000100010101000110001 |
| 89 | Y | 10001100011000101010001000010000100 |
| 90 | Z | 11111000010001000100010001000011111 |
| 91 | [ | 01110010000100001000010000100001110 |
| 92 | \ | 00000100000100000100000100000100000 |
| 93 | ] | 01110000100001000010000100001001110 |
| 94 | ^ | 00100010101000100000000000000000000 |
| 95 | _ | 00000000000000000000000000000011111 |
| 96 | ` | 01000001000001000000000000000000000 |
| 97 | a | 00000000000111000001011111000101111 |
| 98 | b | 10000100001000011110100011000111110 |
| 99 | c | 00000000000111110000100001000001111 |
| 100 | d | 00001000010000101111100011000101111 |
| 101 | e | 00000000000111010001111111000001111 |
| 102 | f | 00010001010010001110001000010000100 |
| 103 | g | 00000000000111110001011110000111110 |
| 104 | h | 10000100001000011110100011000110001 |
| 105 | i | 00000001000000000100001000010000100 |
| 106 | j | 00010000000001000010000101001001100 |
| 107 | k | 01000010000100101010011000101001001 |
| 108 | l | 01100001000010000100001000010001110 |
| 109 | m | 00000000001101110101101011010110001 |
| 110 | n | 00000000001011011001100011000110001 |
| 111 | o | 00000000000111010001100011000101110 |
| 112 | p | 00000000001111010001111101000010000 |
| 113 | q | 00000000000111110001011110000100001 |
| 114 | r | 00000000001011011001100001000010000 |
| 115 | s | 00000000000111110000011100000111110 |
| 116 | t | 00100001001111100100001000010100010 |
| 117 | u | 00000000001000110001100011000101110 |
| 118 | v | 00000000001000110001100010101000100 |
| 119 | w | 00000000001000110001101011010101010 |
| 120 | x | 00000000001000101010001000101010001 |
| 121 | y | 00000000001000101010001000010001000 |
| 122 | z | 00000000001111100010001000100011111 |
| 123 | { | 00011001000010001000001000010000011 |
| 124 | \| | 00100001000010000000001000010000100 |
| 125 | } | 11000001000010000010001000010011000 |
| 126 | ~ | 00001011101000000000000000000000000 |
| 127 | Delete | 11111110001010100010101011100011111 |
The C0 codes are admittedly rather tricky, especially when it comes to depicting things like “Data Link Escape” or “Device Control 1”! It would be possible to use a larger bit size and get clearer images. But I’d like to see how far the 35-bit standard can go in cueing people who aren’t familiar with ASCII into what the bitmaps signify…
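The lossless round trip between ASCII and USCII-5×7-ENGLISH-C0 mentioned earlier might look something like the sketch below. It is not the actual Encoder or Decoder: the names `TABLE`, `encode`, and `decode` are invented for this example, only five rows of the table are copied in, and it assumes characters are simply concatenated as 35-bit blocks with no framing in between.

```python
BITS_PER_CHAR = 35

# A small excerpt copied from the table above; a full version
# would carry all 128 rows.
TABLE = {
    " ": "0" * BITS_PER_CHAR,
    "A": "01110100011000110001111111000110001",
    "B": "11110100011000111110100011000111110",
    "H": "10001100011000111111100011000110001",
    "I": "01110001000010000100001000010001110",
}

# The conversion is only lossless if every bit pattern is unique.
REVERSE = {bits: char for char, bits in TABLE.items()}
assert len(REVERSE) == len(TABLE), "duplicate bit pattern in table"


def encode(text: str) -> str:
    """Concatenate the 35-bit pattern for each character."""
    return "".join(TABLE[char] for char in text)


def decode(bits: str) -> str:
    """Split the stream into 35-bit blocks and look each one up."""
    if len(bits) % BITS_PER_CHAR != 0:
        raise ValueError("stream length is not a multiple of 35 bits")
    blocks = (bits[i:i + BITS_PER_CHAR]
              for i in range(0, len(bits), BITS_PER_CHAR))
    return "".join(REVERSE[block] for block in blocks)


if __name__ == "__main__":
    message = "HI BAA"
    stream = encode(message)
    print(len(stream), "bits")
    print(decode(stream))
```

Because each character costs 35 bits instead of 7, the encoded stream is five times the size of plain 7-bit ASCII. That is the price of making every code a picture.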
