javascript : BLACKHIGHLIGHTER: Protected Public Discussions

Tags: javascript, node.js, cryptography

Blackhighlighter combines novel editing with in-browser cryptography to facilitate new ways of communicating. It was initially designed for a Sunlight Foundation contest to improve transparency in correspondence with elected officials. However, it later evolved into a widget that helps hybridize redaction with accountability in many possible scenarios. It has applications to open government, but also to raising the bar for fairness in communications more generally.

It is packaged as a jQuery widget (with no other dependencies), along with a server library that can be integrated into Node.js projects as an npm module.

Overview Video

Description

The best way to understand the idea is to see a running demo of a site that incorporates it. So please visit the demo deployment:

http://www.blackhighlighter.org

The widget itself is the area where you compose and interact with the content--and its only client-side dependency is jQuery. All the buttons, tabs, and theming you see are specific to that demo and its usage. Alternative blackhighlighter workflows could vary wildly...with different people doing the authorship vs. the redaction, or automatically storing certificates with rules triggering revelation (such as after a certain date).

Yet even that demo shows a very cool usage for blackhighlighter: a letter-writing tool that takes the guesswork out of forwarding! The sender points out any sensitive information at the time of writing--so anyone given a certificate understands that those parts should be treated as confidential. But everything else is published on a server for anyone to read, so it's not necessary to worry about sharing it.

So despite the seemingly "obscuring" nature, blackhighlighter was conceived as a tool to achieve greater transparency! The hope is to coax completely private conversations into the light so they are mostly public. That removes unnecessary blocks on the flow of information to those who may need it. (There are of course legalistic scenarios too...but those people probably won't like the "being held accountable to what you redacted" part! :-/)

The demo and widget are free software--published for community review, usage, and enhancement. Yet just as Ward Cunningham saw "The Wiki Way" as more of a collaborative mindset than a specific technology, I'm most interested in seeing the "spirit" of the idea spread. The point is to be transparent to the greatest extent possible, while transforming any "unknown unknowns" into "known unknowns", for which you take responsibility.

Methodology

The demo at blackhighlighter.org is designed to give a pretty good step-by-step walkthrough. But I'll summarize some of the methodology here.

During the composition process, no information is transmitted to the server where the message is to be posted. The only plaintext the server ever receives is the parts that have not been blacked out. The widget can be switched back and forth between a 'compose' mode and a 'protect' mode by method calls, until a request to publish is made.
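
To make that separation concrete, here is a minimal sketch in plain JavaScript. It is not the widget's actual code: the `splitForPublication` function, the `[start, end)` range format, and the shape of the returned objects are all assumptions made for illustration. It only shows that the publishable structure can be built so the blacked-out text is simply absent from it.

```javascript
// Hypothetical helper (not part of the blackhighlighter API).
// `redactions` is assumed to be a sorted, non-overlapping list of
// [start, end) character ranges marking the blacked-out spans.
function splitForPublication(text, redactions) {
  const publicParts = []; // what would be sent to the server
  const hiddenParts = []; // what stays in the browser
  let cursor = 0;

  redactions.forEach(([start, end], index) => {
    if (start > cursor) {
      publicParts.push({ type: 'text', value: text.slice(cursor, start) });
    }
    // The public side only records that *something* was redacted here.
    publicParts.push({ type: 'redaction', index });
    hiddenParts.push(text.slice(start, end));
    cursor = end;
  });

  if (cursor < text.length) {
    publicParts.push({ type: 'text', value: text.slice(cursor) });
  }
  return { publicParts, hiddenParts };
}

const letter = 'My name is Alice and I oppose the bill.';
const { publicParts, hiddenParts } = splitForPublication(letter, [[11, 16]]);
console.log(hiddenParts); // [ 'Alice' ]
```

In this sketch, `publicParts` is the only thing a publish request would carry, and `hiddenParts` would feed the commitment step.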

On publication, what the server does receive are cryptographic signatures which "commit" to the content of the hidden portions. This constitutes a "commitment scheme", and the signatures are calculated by unobfuscated JavaScript that runs in the user's browser using SHA256. The widget can also generate a textual "certificate" for verifying the missing portion--again with no server communication.
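
A hash-based commitment of this general kind can be sketched as follows. This is an illustration under assumptions, not the project's implementation: the `commit` function and its field names are hypothetical, and the random salt is an addition here (the description above only mentions SHA256) so that short redactions cannot be guessed by hashing likely candidates. The sketch uses the built-in `crypto` module of Node.js for convenience, whereas the widget does its hashing in the browser.

```javascript
const crypto = require('crypto');

// Hypothetical helper: commit to a hidden string by hashing a random
// salt together with the content. The salt is an assumption of this
// sketch, not something the original description specifies.
function commit(hiddenText) {
  const salt = crypto.randomBytes(16).toString('hex');
  const digest = crypto
    .createHash('sha256')
    .update(salt + hiddenText, 'utf8')
    .digest('hex');

  return {
    commitment: digest,                // safe to publish
    certificate: { salt, hiddenText }, // kept by the author
  };
}

const result = commit('Alice');
console.log(result.commitment.length); // 64 hex characters
```

The `commitment` value reveals nothing practical about the hidden text, yet the author cannot later claim different content without the hash failing to match.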

When the posting process has been completed, the client widget receives the timestamp of the post from the server. That timestamp is then included in the message, whose canonical JSON is cryptographically hashed. This becomes the commit ID. Thus, even without a certificate, it's possible to verify that the server returns valid information for a given commit ID.
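
The idea of deriving an ID from canonical JSON can be sketched like this. The canonicalization rule (sorted object keys, no extra whitespace), the `canonicalJson` and `commitId` names, and the example fields are assumptions for illustration; the project's actual canonical form may differ. What matters is that the same data always serializes to the same bytes, so anyone can recompute the hash.

```javascript
const crypto = require('crypto');

// Serialize JSON-compatible data with object keys sorted, so that equal
// data always yields identical text. Assumes no undefined values or
// functions appear in the input.
function canonicalJson(value) {
  if (Array.isArray(value)) {
    return '[' + value.map(canonicalJson).join(',') + ']';
  }
  if (value !== null && typeof value === 'object') {
    const keys = Object.keys(value).sort();
    const members = keys.map(
      (key) => JSON.stringify(key) + ':' + canonicalJson(value[key])
    );
    return '{' + members.join(',') + '}';
  }
  return JSON.stringify(value);
}

// Hypothetical helper: the ID is the SHA-256 of the canonical text.
function commitId(post) {
  return crypto
    .createHash('sha256')
    .update(canonicalJson(post), 'utf8')
    .digest('hex');
}

// Hypothetical post shape, with the server's timestamp included.
const post = {
  timestamp: '2014-01-01T00:00:00Z',
  spans: ['Dear Senator, ', { redaction: 0 }],
};
const reordered = { spans: post.spans, timestamp: post.timestamp };
console.log(commitId(post) === commitId(reordered)); // true
```

Because the timestamp is part of the hashed data, altering either the content or the claimed posting time would produce a different ID.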

Note: The blackhighlighter.org demo sandbox takes this ID and uses it in the URL of the published commitment (which you send to other people). That means it's not possible for the server operator to fake the content at that URL without it being noticeable. Of course--if you're skeptical of the server you shouldn't be running the version of the checking JavaScript client code that they served you! :-/

The blackhighlighter widget has a method call for applying a certificate to protected data it has been loaded with. Showing the missing portions (and verifying they were the ones posted) is done entirely in the browser--the certificate does not need to be shared with the server. But the widget also has a method for letting anyone with a valid certificate reveal it publicly. Putting it in the mode for doing this will "blink red" the parts that would be revealed.
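
Checking a certificate locally might look like the sketch below. The `applyCertificate` function, the certificate fields, and the salt-plus-text hashing rule are hypothetical and chosen only for illustration; they are not the widget's real method names or data format. The point is that verification needs only the published commitment and the certificate, so nothing has to be sent to a server.

```javascript
const crypto = require('crypto');

function sha256Hex(text) {
  return crypto.createHash('sha256').update(text, 'utf8').digest('hex');
}

// Hypothetical helper: recompute the hash from the certificate and
// compare it with the commitment that was published. Returns the
// revealed text on success and throws if the certificate does not match.
function applyCertificate(publishedCommitment, certificate) {
  const recomputed = sha256Hex(certificate.salt + certificate.hiddenText);
  if (recomputed !== publishedCommitment) {
    throw new Error('Certificate does not match the published commitment');
  }
  return certificate.hiddenText;
}

// Example values made up for this sketch.
const cert = { salt: '5f2a9c', hiddenText: 'Alice' };
const published = sha256Hex(cert.salt + cert.hiddenText);
console.log(applyCertificate(published, cert)); // Alice
```

A certificate whose text was altered after publication would fail this check, which is what makes the redaction accountable.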

The Code

Source for the blackhighlighter npm module (which includes the jQuery widget) is available at:

https://github.com/hostilefork/blackhighlighter

Source for the demonstration website running at blackhighlighter.org is available at:

https://github.com/hostilefork/blackhighlighter.org

Originally, the server was written in Python and built on the Django framework. It was then ported to Node.js. This allowed removal of some redundancy between the cryptography used by the client and server--although that was less of a benefit than expected. (See Thoughts on Sharing Client and Server code with Node.JS)

The underlying widget jquery-blackhighlighter.js is separated out so that the only dependency it has is on jQuery. An arbitrary number of widgets can be put on the same page, but no good demo of this exists...yet. So the current development focus is on a blackhighlighter-powered comment service to show that off (something like Disqus). You would write a comment, but be able to use a pen to mark out any information you wanted to be viewable only by the article author--and anyone you shared your certificate with.

History and Motivation

I started the project after hearing Kimo Crossman describe a problem with getting information from a local government housing authority. Their conventional use of Microsoft Word or Excel freely mixed in sensitive data with matters that they were legally obligated to share as public record. Lawyers had to "scrub" the confidential data out of the files (which may include "meta-data" hidden in parts of the document one does not typically see!).

Because there was a cost associated with separating the private data from the parts that were rightfully public, getting the documents became an uphill battle in the courts. Kimo suggested remedying this with tools for declaring the sensitive information at the time it's entered. Ideally this would make it fair to ask for a continual publication process, in which the agency would routinely publish its information.

My variation on his idea was to create something that people outside the government could use, thus building awareness that working in this style is even possible! So I envisioned a service for communicating with U.S. Senators and Representatives. By encouraging people to formulate their correspondence so most of it could be shared on the Internet, it would effectively make the inboxes of elected leaders searchable by anyone. Yet it wouldn't have the usual problem of anonymous Internet posts, because the protected portions would be sent to the official--proving the author was a constituent. Here was a prototype of the homepage:

I had intended to submit my site to the 2009 Apps for America Competition, but only had two weeks from the time I found out about the contest before the deadline. So I couldn't finish it in time. It stayed in storage for years after that, but was revived and given new life as a Node.js project, and made more modular to make it easier to apply to other kinds of problems.

License

I'm a pretty strong believer in the Stallman style of "Software Freedom". It would be a better world if those who adapted free software (and then deployed it to users) would share their adaptations back with the community. His arguments have always seemed pretty solid to me:

http://www.gnu.org/philosophy/shouldbefree.html

When this project was first started in 2009, I tended to err on the side of conservatism in using GPL-style licenses. The AGPL closes the "hole" in the GPL so that giving access to a program running GPL code on a server requires provision of any source adaptations made in that case.

Yet at the same time, being able to borrow and recombine code without worrying about where it comes from is empowering to programmers. Blackhighlighter owes its existence to code that could be freely copied and pasted. So being "unfriendly" about the license, in the sense of imposing concerns on people trying to solve a problem, is not my intention. I would just like to be a little bit of a social agitator, so my own contributions are AGPL for the moment.

But if you are interested in applying Blackhighlighter in your project--and the license is too restrictive a barrier to doing so--please contact me. I'd loosen it if there were a good reason to. Note that, as people like Jack Slocum have learned, it's better to relax a license than to tighten it after you've set expectations in your community:

http://blog.hostilefork.com/extjs-licensing-fiasco/