BLACKHIGHLIGHTER: Protected Public Discussions
NOTE: Blackhighlighter is in the midst of a port from Apache+Django+MySQL on dedicated servers to Node.js+Express+Swig+MongoDB on Cloud Foundry. Until that is complete and documented, this page will be out of sync with the GitHub repo! I’ll try to keep the links working, though.
BlackHighlighter combines modern cryptography and client-side editing to create a new way of communicating on the Internet. You publish text on a network server with the ability to “protect” certain parts of what you have written. These protected parts can only be read by those you give access to (and, of course, by whoever they share them with!).
This may bring some “cloak and dagger” and legalistic scenarios to mind. Yet despite its obscuring nature, this was conceived as a tool to help achieve greater transparency! The hope is to coax completely private conversations into the light so they are mostly public. There’s also an element of accountability in the mix: using a commitment scheme, any missing portions that surface can be checked to ensure they match the text that was originally blacked out at the time of publication.
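To make the commitment idea concrete, here is a minimal Python sketch of a hash-based commitment. It is an illustration only, not the code from the repository: the salted SHA-256 construction and the function names `commit` and `verify` are my assumptions for the example.

```python
import hashlib
import hmac
import secrets


def commit(protected_text: str):
    """Return (salt, commitment) for one protected span.

    Assumption: the commitment is SHA-256 over a random salt followed by the
    text. The salt keeps short or guessable text from being brute-forced
    from the published hash alone.
    """
    salt = secrets.token_hex(16)  # 32 hex characters, fixed length
    digest = hashlib.sha256((salt + protected_text).encode("utf-8")).hexdigest()
    return salt, digest


def verify(salt: str, revealed_text: str, commitment: str) -> bool:
    """Check that revealed text matches the commitment published earlier."""
    digest = hashlib.sha256((salt + revealed_text).encode("utf-8")).hexdigest()
    return hmac.compare_digest(digest, commitment)


# The commitment is published with the public text; the salt and the
# protected text travel only in the certificate given to chosen readers.
salt, commitment = commit("my home phone number")
print(verify(salt, "my home phone number", commitment))
print(verify(salt, "some other text", commitment))
```

Anyone holding the published commitment can run the check once the missing text and its salt surface, which is what gives the scheme its accountability.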
BlackHighlighter takes the guesswork out of forwarding. The sender points out any sensitive information at the time of writing, so anyone receiving certificates with the missing bits understands those should be treated as confidential. Yet everything else was published on a server for anyone to read, so there’s a clear delineation of what’s okay to share. This stops misunderstandings, and removes unnecessary blocks on the flow of information to those who may need it.
The easiest way to understand it is probably to see it in action, so I built a working prototype. The underlying system supports multiple separate “colors” of redaction pens that can be revealed independently, but the UI doesn’t let you pick a color at the moment. Here is a screencast (circa May 2009):
I’ve published the source to my demo in order to get community feedback. Yet just as Ward Cunningham outlined “The Wiki Way” as more of a collaborative mindset than a specific technology, I’m most interested in seeing the “spirit” of the idea spread. The point is to be transparent to the greatest extent possible, while transforming any “unknown unknowns” into “known unknowns” for which you take responsibility.
(UPDATE: I’ve put up a tentative demo server running the Django app at blackhighlighter.org. It’s a bare-bones site which hosts the app and has no other site features, so don’t expect too much. But it shows the workflow of using BlackHighlighter.)
It’s released under the GNU Affero General Public License, Version 3 and you can browse it on GitHub:
An important aspect of how the code is separated is that the protected parts of your text are never sent over the network in an unencrypted form. This way you don’t have to trust the person running the server not to read or reveal the protected parts, because they don’t have them! Yet the system is in its infancy, so if you’re a cryptography expert then please chime in with any ideas. (Here’s what I’ve gotten so far from sci.crypt.)
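A rough Python sketch of that separation follows. It is hypothetical: the `redact` function, the span format, and the dictionary keys are assumptions made for illustration, and the real client does this work in the browser. The point is only that the structure sent to the server carries a commitment hash in place of each protected span, while the spans themselves stay in a locally held certificate.

```python
import hashlib
import secrets


def redact(text, spans):
    """Split text into a public part list and a private certificate.

    spans: sorted, non-overlapping (start, end) index pairs to protect.
    Returns (public_parts, certificate). Only public_parts would be sent
    to the server; the certificate stays with the author.
    """
    public_parts = []
    certificate = []
    cursor = 0
    for start, end in spans:
        if start > cursor:
            public_parts.append(text[cursor:start])
        secret = text[start:end]
        salt = secrets.token_hex(16)
        digest = hashlib.sha256((salt + secret).encode("utf-8")).hexdigest()
        # The placeholder carries only the hash, never the protected text.
        public_parts.append({"commitment": digest})
        certificate.append({"salt": salt, "text": secret})
        cursor = end
    if cursor < len(text):
        public_parts.append(text[cursor:])
    return public_parts, certificate


message = "Call me at 555-1234 tomorrow"
public_parts, certificate = redact(message, [(11, 19)])
print(public_parts)   # safe to publish
print(certificate)    # give only to chosen readers
```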
There is also protection in the protocol to prevent the server from lying about the contents of the message or the commitment hashes. The URL itself contains a hash of the canonicalized commitment JSON. This way, even if you don’t have a certificate, you can verify that what’s on the server is what it should be. You only need the URL itself to verify that the contents are correct! This is a step up from using a random ID for each blackhighlighter text.
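The check a reader performs could be sketched in Python as below. Everything specific here is an assumption for the sake of the example: the canonical form (sorted keys, no extra whitespace, UTF-8), the choice of SHA-256, the URL layout with the hash as the last path segment, and the names `canonical_json`, `commitment_id`, and `matches_url`.

```python
import hashlib
import json


def canonical_json(obj) -> bytes:
    """Serialize with sorted keys and no extra whitespace, so the same
    data always produces the same bytes (an assumed canonical form)."""
    text = json.dumps(obj, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return text.encode("utf-8")


def commitment_id(commitment) -> str:
    """Hash of the canonical commitment, used as the ID in the URL."""
    return hashlib.sha256(canonical_json(commitment)).hexdigest()


def matches_url(url: str, served_commitment) -> bool:
    """Recompute the hash of what the server returned and compare it with
    the last path segment of the URL the reader was given."""
    expected = url.rstrip("/").rsplit("/", 1)[-1]
    return commitment_id(served_commitment) == expected


# Hypothetical commitment and URL, for illustration only.
commitment = {"parts": ["Hello, ", {"commitment": "ab12"}], "date": "2009-05-01"}
url = "https://example.org/read/" + commitment_id(commitment)

print(matches_url(url, commitment))
tampered = {"parts": ["Goodbye, ", {"commitment": "ab12"}], "date": "2009-05-01"}
print(matches_url(url, tampered))
```

Because the ID is derived from the content, a server that alters the public text or swaps a commitment hash would have to serve it under a different URL.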
I’m going to have to put together an INSTALL file to help anyone unfamiliar with Django get started running this on their own server, and maybe zip up the required libraries so you don’t have to hunt them all down yourself. In the meantime, the video above is probably the easiest way to experience and share the idea. (I’m not thrilled about setting up and administering an Apache server to run this for the general public myself, but will probably do so anyway.)
History and Motivation
I started the project after hearing Kimo Crossman describe a problem with getting information from the government. Its conventional use of Microsoft Word or Excel freely mixes sensitive information with matters of public record. Lawyers have to “scrub” the confidential data out of the files (which may include “meta-data” that is hidden in parts of the document one does not typically see!).
Because there is a cost associated with separating the private data from the parts that should be public, it’s an uphill battle in the courts to get the documents. Kimo suggested remedying this with tools for declaring the sensitive information at the time it’s entered. Ideally this would allow everything else to be published instantly to the web.
My variation on his idea was to create something that people outside the government could use, thus building awareness that working in this style is even possible! So I envisioned a service for communicating with U.S. Senators and Representatives. By encouraging people to formulate their correspondence so most of it could be shared on the Internet, it would effectively make the inboxes of elected leaders searchable by anyone. Yet it wouldn’t have the usual problem of anonymous Internet posts, because the protected portions would be sent to the official.
I intended to submit my site to the 2009 Apps for America Competition, but only had two weeks and couldn’t make the deadline. So I took that as a mixed blessing, and decided to modularize the code so that it might be used in blog commenting systems or other parts of a site. I’m still looking to enter this into a web innovation contest or grant program of some sort, so if you have any leads then let me know!