Evaluating statistical attacks on personal knowledge questions

What is your mother’s maiden name? How about your pet’s name? Questions like these were a dark corner of security systems for quite some time. Most security researchers instinctively think they aren’t very secure. But they still have gained widespread deployment as a backup to password-based authentication when email-based identification isn’t available. Free webmail providers, for example, may have no other choice. Unfortunately, because most websites rely on email when passwords fail, and email providers rely on personal knowledge questions, most web authentication is no more secure than personal knowledge questions. This risk has gotten more attention recently, with high profile compromises of Paris Hilton’s phone, Sarah Palin’s email, and Twitter’s corporate Google Documents occurring due to guessed personal knowledge questions.

There’s finally been a surge of academic research into the area in the last five years. It’s been shown, for example, that these questions are easy to look up online, often found in public records, and easy for friends and acquaintances to guess. In a joint work with Mike Just and Greg Matthews from the University of Edinburgh published this week in the proceedings of Financial Cryptography 2010, we’ve examined the more basic question of how secure the underlying answer distributions are to statistical guessing. Put another way, if an attacker wants to do no target-specific work, but just guess common answers for a large number of accounts using population-wide statistics, how well can she do?

Answering this question first required developing the right mathematical model for resistance of a question to guessing. Entropy (specifically Shannon entropy H1) is commonly thrown around as the measure of resistance to guessing, but it was never intended for this purpose and is not appropriate for measuring guessing of non-uniform distributions. Guessing entropy G, the expected number of guesses if answers are guessed in decreasing order of likeliness, is better, but still highly skewed by low-probability events which wouldn’t be guessed in practice. We’re concerned with a trawling attacker, who will guess values like “Smith,” “Jones,” and “Johnson” for a target’s mother’s maiden name, and then move on to other accounts if these don’t work. The frequencies of uncommon names like “Zabielskis” are irrelevant because a trawling attacker will never try them, yet they inflate the values of both H1 and G. Entropy can be very misleading for real-world security, and we hope a contribution of our paper is to encourage the use of “marginal” guessing metrics instead. We even provide a few theorems that prove in a strong way that high entropy (H1 or G) can give you no security at all against a trawling attacker in the real world.

Using these new metrics, we examined a range of statistics on answer distributions to common personal knowledge questions. It turns out the majority of personal knowledge questions ask for proper names of people, pets, and places, and the rest are trivially insecure (eg “What is my favourite day of the week?”). We collected government census data, pet registration records, and also completely crawled Facebook’s people directory. Incidentally, we believe this Facebook names corpus, consisting of 269 M full names, is the largest such dataset ever assembled and may have many uses outside of security research, which we are happy to provide it for.

Analysing our data for security, though, shows that essentially all human-generated names provide poor resistance to guessing. For an attacker looking to make three guesses per personal knowledge question (for example, because this triggers an account lock-down), none of the name distributions we looked at gave more than 8 bits of effective security except for full names. That is, about at least 1 in 256 guesses would be successful, and 1 in 84 accounts compromised. For an attacker who can make more than 3 guesses and wants to break into 50% of available accounts, no distributions gave more than about 12 bits of effective security. The actual values vary in some interesting ways-South Korean names are much easier to guess than American ones, female first names are harder than male ones, pet names are slightly harder than human names, and names are getting harder to guess over time.

Still, there is a strong result that anything named by humans is dangerous to use as a secret. Sociologists have known this for years. Most human names follow a power-law distribution fairly close to Zipfian, which we confirmed in our study. This means every name distribution has a few disproportionately common names—”Gonzalez” amongst Chilean surnames, “Guðrún” amongst Icelandic forenames, “Buddy” amongst pets—for attackers to latch on to. Combined with previous results on other attack methods, there should be no doubt that personal knowledge questions are no longer viable for email, which has come to play too critical a role in web security.