A few weeks ago I detailed how Gawker lost a million of their users’ passwords. Soon after this I found an interesting vulnerability in Gawker’s password deployment involving the handling of non-ASCII characters. Specifically, they didn’t handle them at all until two weeks ago, instead they were mapping all non-ASCII characters to the ASCII ‘?’ prior to hashing them. This not only greatly limited the theoretical space of passwords, but meant that passwords consisting of any n non-ASCII characters were equivalent to ‘?’^n. Native Telugu or Korean speakers with passwords like ‘రహస్య సంకేత పదం’ or ‘비밀번호’ were vulnerable to an attacker simply guessing a string of question marks. An attacker may in fact know in advance that some users are from non-Latin countries (for example by looking at their email addresses) potentially making this more easily exploitable.