Cloudy with a Chance of Privacy – Light Blue Touchpaper

Three Paper Thursday is an experimental new feature in which we highlight research that group members find interesting.

When new technologies become popular, we privacy people are sometimes miffed that nobody asked for our opinions during the design phase. Sometimes this leads us to make sweeping generalisations such as “only use the Cloud for things you don’t care about protecting” or “Facebook is only for people who don’t care about privacy.” We have long accused others of assuming that the real world is incompatible with privacy, but are we guilty of assuming the converse?

On this Three Paper Thursday, I’d like to highlight three short papers that challenge these zero-sum assumptions. Each is eight pages long and none requires a degree in mathematics to understand; I hope you enjoy them.

“Reclaiming space from duplicate files in a serverless distributed file system”, JR Douceur et al., International Conference on Distributed Computing Systems (ICDCS), 2002.

This paper, when stripped of implementation details, contains a simple, elegant idea: convergent encryption.

The Cloud is a great place to store data reliably. One useful feature is de-duplication: there is no need to back up everyone’s copy of Papers separately, nor does every conference attendee need to save their own copy of the official photo. This efficient pooling of shared resources is the kind of thing that makes the Cloud so attractive. On the other hand, Cloud providers can make mistakes; just ask Dropbox users. Rather than depending on the Cloud’s security, it’s a good idea to protect sensitive information with cryptography, but conventional encryption negates the shared benefit that comes from de-duplication, because each user’s key turns identical files into different ciphertexts.

Convergent encryption is a deterministic way of encrypting things. You generate a secret key by hashing the content of a file, encrypt the file under that content-derived key, then encrypt that key under your own key. Anyone who encrypts the same plaintext will get the same ciphertext, restoring our ability to de-duplicate storage. Of course, an attacker can decrypt the file if she already knows the plaintext, but then why bother decrypting?
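To make the idea concrete, here is a minimal Python sketch of convergent encryption. It is not the construction from the paper: the "cipher" is a toy SHA-256 keystream that I am using as a stand-in for a real block cipher, and the function names (`keystream`, `xor_cipher`, `convergent_encrypt`, `convergent_decrypt`) are my own inventions. Treat it as an illustration of the data flow, not as something to protect real files with.

```python
import hashlib


def keystream(key: bytes, length: int) -> bytes:
    """Toy keystream (SHA-256 in counter mode); a stand-in for a real cipher."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])


def xor_cipher(data: bytes, key: bytes) -> bytes:
    """XOR data with the keystream; applying it twice restores the input."""
    return bytes(a ^ b for a, b in zip(data, keystream(key, len(data))))


def convergent_encrypt(plaintext: bytes, user_key: bytes) -> tuple[bytes, bytes]:
    # The file key depends only on the content, so identical files
    # always produce identical ciphertexts, whoever encrypts them.
    content_key = hashlib.sha256(plaintext).digest()
    ciphertext = xor_cipher(plaintext, content_key)
    # The content key is then wrapped under the user's own key. Mixing in a
    # hash of the ciphertext gives each file a different wrapping keystream.
    wrap_key = user_key + hashlib.sha256(ciphertext).digest()
    wrapped_key = xor_cipher(content_key, wrap_key)
    return ciphertext, wrapped_key


def convergent_decrypt(ciphertext: bytes, wrapped_key: bytes, user_key: bytes) -> bytes:
    wrap_key = user_key + hashlib.sha256(ciphertext).digest()
    content_key = xor_cipher(wrapped_key, wrap_key)
    return xor_cipher(ciphertext, content_key)
```

If Alice and Bob both call `convergent_encrypt` on the same photo with their own keys, they get the same ciphertext (which the Cloud can store once) but different wrapped keys (which only each of them can unwrap).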

Convergent encryption alone does not provide anonymity: a business (e.g. the MPAA) could ask the Cloud, “have you already seen this content?” then send lawyers to ask “who uploaded it?” If all you want is confidentiality, though, convergent encryption provides an elegant solution to a real-world problem. Confidentiality can co-exist with the benefits of the public Cloud.

“Privacy Protection for Social Networking Platforms”, A Felt and D Evans, Web 2.0 Security & Privacy (W2SP), 2008.

Privacy and performance don’t have to be enemies, even in the oft-villainised realm of online social networking.

In this paper, Felt and Evans studied the top 150 Facebook applications and found that 90% of them didn’t need any of the user data they were able to access, while the other 10% were largely using personal information for trivial things such as displaying it to the user or choosing a horoscope. Of the 14 applications with non-trivial data use, four were contravening Facebook’s Terms of Service.

The paper proposes “privacy-by-proxy”, an extension to the protocols spoken by third-party social applications. Like Facebook’s own FBML (Facebook Markup Language), the privacy-by-proxy system would allow applications to name information without reading it. For instance, an application could tell the Facebook UI to “insert the user’s name and a list of friends here” without knowing that user’s name.
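A rough Python sketch of how such a proxy might behave is below. The tag syntax is only loosely modelled on FBML; the `fb:friend-list` tag, the `PROFILES` store and the function names are hypothetical, and the paper's actual design has details (such as how user IDs are presented to applications) that this sketch ignores.

```python
import html
import re

# Hypothetical profile store held by the platform; applications never see it.
PROFILES = {
    "1001": {"name": "Alice", "friends": ["1002"]},
    "1002": {"name": "Bob", "friends": ["1001"]},
}

NAME_TAG = re.compile(r'<fb:name uid="(\d+)"\s*/>')
FRIENDS_TAG = re.compile(r'<fb:friend-list uid="(\d+)"\s*/>')


def app_render(viewer_id: str) -> str:
    """Third-party application: it names the data it wants but never reads it."""
    return (
        f'<p>Hello, <fb:name uid="{viewer_id}"/>!</p>'
        f'<fb:friend-list uid="{viewer_id}"/>'
    )


def proxy_expand(markup: str) -> str:
    """Platform-side proxy: fills in the placeholders before the page is shown."""

    def expand_friends(match: re.Match) -> str:
        friends = PROFILES[match.group(1)]["friends"]
        items = "".join(f"<li>{html.escape(PROFILES[f]['name'])}</li>" for f in friends)
        return f"<ul>{items}</ul>"

    def expand_name(match: re.Match) -> str:
        return html.escape(PROFILES[match.group(1)]["name"])

    markup = FRIENDS_TAG.sub(expand_friends, markup)
    return NAME_TAG.sub(expand_name, markup)
```

The application's output contains only opaque identifiers; the personal data appears only in the page produced by `proxy_expand(app_render("1001"))`, which the application never sees.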

Facebook provided FBML for performance reasons: inserting the <fb:name> tag could eliminate a round trip between application servers and Facebook. If such identifiers were mandatory, it would greatly improve privacy protection and would also improve performance for overly-communicative applications. Since 2008, spellings have changed (FBML is deprecated in favour of the JavaScript API, etc.), but the core ideas are still valid: proxying access to user data could improve privacy and performance.

“Aligning Security and Usability”, KP Yee, IEEE Security & Privacy Magazine 2(5), 2004.

Good security and good usability are both about inferring the user’s intent.

It has often been assumed that security and usability are intrinsically opposed forces. Security is assumed to mean “procedures that get in the way of getting work done” or even “lots of pop-up dialogs asking for permission”, whereas usability is assumed to be about pretty pixels and forcing programmers to use the mouse more often. In reality, though, we are in the same business: inferring user intent. To use a gross simplification, usability is about helping users do what they want and security is about preventing things that users don’t want.

Yee observes that security and usability come into conflict when software developers disregard the principle of least privilege. If a word processor is able to delete any file on the computer, then pop-ups asking “do you really want to delete this file?” start to look attractive. If, on the other hand, that word processor can only access files which the user explicitly opens, there is no need to second-guess their intent every time a file is modified.
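The pattern can be sketched in a few lines of Python. Here a trusted component (often called a powerbox) holds the authority to open files and hands the application an already-open document, so the user's act of choosing a file is also the act of granting access. The `Powerbox` class, `append_text` and `demo` are hypothetical names, the chooser callback stands in for a real file dialog, and Python itself enforces none of this: a real system needs an operating-system sandbox underneath.

```python
import os
import tempfile
from typing import Callable, TextIO


class Powerbox:
    """Trusted component: the only code that turns a path into an open file."""

    def __init__(self, choose_path: Callable[[], str]):
        # choose_path stands in for a file dialog driven by the user.
        self._choose_path = choose_path

    def open_document(self) -> TextIO:
        # The user's choice is the grant of authority: no extra prompt needed.
        return open(self._choose_path(), "r+", encoding="utf-8")


def append_text(doc: TextIO, text: str) -> None:
    """Application code: it receives a file object, never a path."""
    doc.seek(0, os.SEEK_END)
    doc.write(text)
    doc.flush()


def demo() -> str:
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "letter.txt")
        with open(path, "w", encoding="utf-8") as f:
            f.write("draft")
        # Simulate the user picking letter.txt in the dialog.
        box = Powerbox(choose_path=lambda: path)
        with box.open_document() as doc:
            append_text(doc, " v2")
        with open(path, encoding="utf-8") as f:
            return f.read()
```

Because `append_text` can only touch the document it was handed, there is nothing for a confirmation dialog to ask about.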

This model was explored in the Polaris system for Windows XP and our very own Capsicum for FreeBSD, but Apple brought powerboxes to the mainstream with the Mac OS X App Sandbox. I hope to see more of this in the future: software that treats security and usability as complementary partners rather than conflicting priorities.