The Gawker hack: how a million passwords were lost

Almost a year to the day after the landmark RockYou password hack, we have seen another large password breach, this time of Gawker Media. While an order of magnitude smaller, it’s still probably the second largest public compromise of a website’s password file, and in many ways it’s a more interesting case than RockYou. The story quickly made it to the mainstream press, but the reported details are vague and often wrong. I’ve obtained a copy of the data (which remains generally available, though Gawker is attempting to block listing of the torrent files), so I’ll try to clarify the details of the leak and Gawker’s password implementation (gleaned mostly from the readme file provided with the leaked data and from reverse engineering MySQL dumps). I’ll discuss the actual password dataset in a future post.

Background: Gawker Media manages many high-profile blogs including Gawker, Gizmodo, Kotaku, LifeHacker, and Deadspin. Collectively it’s a multi-million dollar blogging empire which attracts hundreds of millions of readers each month. Gawker and its founder Nick Denton are notorious for publishing lurid details of celebrities’ personal lives and leaked corporate documents that mainstream media won’t report. They also apparently have an ongoing feud with the community of 4chan users, whom Denton has publicly criticised. A group calling themselves Gnosis has taken credit for the attacks. They are organised through 4chan but are clearly not a significant portion of the large 4chan community. They don’t appear to have much history outside of attacks on Gawker, and despite false reports they are not directly connected to the DDoS attacks in support of WikiLeaks last week. They are specifically motivated to damage Gawker as much as possible, with statements like “Fuck you gawker, hows this for ‘script kids’? Your empire has been compromised.” The irony of the breach has been widely noted, particularly as Gawker has ridiculed others for similar incidents.

The initial attack: The details provided by Gnosis are sparse, but Nick Denton’s passwords were recovered for his Google Apps-provisioned @gawker email address, Twitter, and the collaboration site Campfire, which is heavily used by Gawker staff. There are no details of the compromise, but brute force was likely involved, as Denton’s password (along with the passwords of 16 other employees) was weak enough for online brute-force to be feasible. All were amongst the 250,000 most common passwords from the RockYou dataset. Each was used at least 10 times on RockYou, whereas over a third of passwords in that dataset were unique. Had the passwords been stolen by a key-logger, we’d expect at least one to be stronger. They may have been brute-forced online from Campfire, which, like most sites, doesn’t appear to implement any restrictions on password guessing, and then re-used for the email accounts. Gnosis did report eventually attracting attention from Campfire sysadmins. Gnosis also reports observing an email thread in which Denton and his IT staff discussed their surprise at seeing an email about Denton’s Campfire account being updated (although he apparently never actually used it), then dismissed it as a false alarm.
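To make “restrictions on password guessing” concrete, here is a minimal sketch of a per-account failure limiter. It is purely illustrative: the class name `GuessLimiter`, its parameters, and the thresholds are hypothetical choices of mine, not anything Campfire or Gawker is known to use.

```python
import time
from collections import defaultdict, deque


class GuessLimiter:
    """Hypothetical limiter: block an account after too many recent failed logins."""

    def __init__(self, max_failures=5, window_seconds=900.0, clock=time.monotonic):
        self.max_failures = max_failures      # failures tolerated per window (assumed value)
        self.window_seconds = window_seconds  # length of the sliding window (assumed value)
        self.clock = clock                    # injectable clock, mainly for testing
        self._failures = defaultdict(deque)   # account -> timestamps of recent failures

    def _prune(self, account):
        """Drop failures that have aged out of the window; return the remaining ones."""
        cutoff = self.clock() - self.window_seconds
        recent = self._failures[account]
        while recent and recent[0] <= cutoff:
            recent.popleft()
        return recent

    def allowed(self, account):
        """True if another login attempt for this account should be accepted."""
        return len(self._prune(account)) < self.max_failures

    def record_failure(self, account):
        """Call after each failed login attempt for this account."""
        self._prune(account).append(self.clock())
```

With a limit like five failures per fifteen minutes, an attacker gets a few hundred guesses per account per day, so only the very most common passwords remain in reach online. A per-account limit alone does not stop an attacker who tries one common password against many accounts, so a real deployment would likely need per-source limits as well.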

Escalation: With so many Campfire accounts compromised, Gnosis had access to 4 GB of company chat logs archived on Campfire, which included login details shared between sysadmins for several Gawker servers. From here Gnosis was able to gain root access (they’ve indicated the servers were not up to date with patches), perhaps brute-forcing the root MySQL password as well (which was far too weak to withstand an offline dictionary attack). A database table of 1.5M users of the site was downloaded, along with all of the source code used to run Gawker’s servers. These were then released via BitTorrent. Gnosis also indicated they dropped all of the tables in one database instance, though it’s not clear whether they managed to seriously disrupt Gawker’s service.

The password data: The MySQL data released included 1,247,894 accounts (as Gnosis notes, an incomplete dump of the more than 1.5M total users). As roughly 499,000 of these had no password hash stored (probably accounts which only logged in via Facebook Connect), the dataset actually contains 748,490 potentially vulnerable passwords. Contrary to many reports, Gawker did hash the stored passwords, and even salted them. Compared to the RockYou hack, this is a major improvement; it means many user passwords aren’t trivially accessible. Unfortunately, neither salting nor hashing was done very well, as the salted hashes were produced using crypt(). Besides being based on long-outdated DES encryption, crypt() has two big problems. The first is that all passwords are truncated to 8 characters due to the small DES key size (and no support is available for non-ASCII characters). The second is that the salts are only 12 bits, so many accounts share each salt value and a single hash computation can be checked against all of them. A proper implementation would use salts twice as large as the logarithm of the number of passwords which could ever be stored, forcing the attacker to break accounts one-by-one (64 bits would be a reasonable size to support the entire population of the Earth). Thus, it’s possible to test fairly large dictionaries against the Gawker data set. Gnosis did so and claimed 300,000 cracked passwords, though their released data includes only 188,281. DuoSecurity has reported 400,000 cracked passwords using John the Ripper (JtR). I’m currently running my own cracking experiment by directly re-using the RockYou password file.
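The salt arithmetic can be checked with a short calculation. The sketch below uses the account count quoted above and assumes salts are spread uniformly. The helper names and the 7 billion world-population figure are my own illustrative assumptions.

```python
import math

# Figures quoted in the post.
HASHED_ACCOUNTS = 748_490  # accounts with a stored crypt() hash
DES_SALT_BITS = 12         # salt size of traditional DES-based crypt()

# Assumption for illustration: a rough world population.
WORLD_POPULATION = 7_000_000_000


def accounts_per_salt(num_hashes: int, salt_bits: int) -> float:
    """Average number of hashes sharing each salt value, assuming uniform salts."""
    return num_hashes / (2 ** salt_bits)


def recommended_salt_bits(max_passwords: int) -> int:
    """The post's rule of thumb: twice log2 of the passwords that could ever be stored."""
    return math.ceil(2 * math.log2(max_passwords))


print(2 ** DES_SALT_BITS)                                        # 4096 possible salts
print(round(accounts_per_salt(HASHED_ACCOUNTS, DES_SALT_BITS)))  # 183 accounts per salt
print(recommended_salt_bits(WORLD_POPULATION))                   # 66 bits
```

So each dictionary word needs at most 4,096 crypt() evaluations to be tested against all 748,490 hashes, an average saving of about 183-fold over unique salts. The rule of thumb gives roughly 66 bits for a world-sized user base, in line with the 64-bit figure suggested above.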

The response: Gawker claims to have initially thought that only corporate email was affected, which delayed any formal response (though this appears embarrassingly refuted by a leaked chat of Gawker employees referring to users as ‘peasants’ and downplaying the consequences of leaking users’ passwords). The company, and Denton in particular, now appear to be showing more contrition. They’ve also put up an FAQ apologising and promising to improve security. However, the technical response has been bungled. A red banner on top of all Gawker blogs warns users that they may want to reset their passwords. There is no excuse for not directly forcing users to update their passwords immediately. All users should be emailed as well, since they may rarely log into Gawker but still use their compromised password elsewhere. Gawker claims to have hit technical difficulties when trying to email all users, meaning most users haven’t actually been informed. Meanwhile, two external sites allow users to test if their account was compromised.

Conclusions: Sadly, Gawker’s security and response were probably both better than average. Going by the results of our password implementation survey, the fact that any hashing was done at all puts Gawker in the upper tier of sites. Additionally, the incident received far more attention than normal due to Gawker’s popularity and notoriety. A key lesson from the attack is that any large password collector must have a plan for responding to a compromised password file: Gawker’s technical inability to force password updates or even email their users is inexcusable. Still, these measures can’t contain the damage. The biggest missed angle on this story is that it’s not just a Gawker hack: accounts on thousands of websites can be compromised, as many users use the same email/password combination everywhere. LinkedIn has taken the unusual step (similar to Twitter last year) of forcing its users contained in the Gawker leak to update their LinkedIn password. Most sites will do nothing though, and Gawker bears no liability to any of them (indeed, the accounts have already been used to spam Twitter). This is a clear case of the market failure we’ve previously described, reinforcing the point that password authentication is losing viability for large, well-connected but carelessly implemented websites.

6 thoughts on “The Gawker hack: how a million passwords were lost”

  1. I guess “all passwords are effectively truncated to 8 bits due the the small DES key size” should read “8 bytes”.

  2. The oauth protocol that Twitter and Facebook are using (it’s the glue behind Facebook Connect) has the merit that you can do third-party authentication with your credentials only going to the second party. That means that other applications can use your Twitter account as an authentication channel whilst not seeing, and therefore not being able to store or divulge, your Twitter credentials. I’m in the midst of a protocol verification exercise (ProVerif) on oauth and it looks OK so far, and I have a ground-up implementation that passes informal scrutiny. Obviously, this approach doesn’t prevent twitter/facebook from leaking your credentials, but they have them already: anything which reduces the number of places your credentials are stored (put all your eggs in one basket, as it makes it easier to watch that basket) is beneficial unless you have the discipline to use independent passwords for each account.

  3. Can you talk about password recovery techniques? That is, if a user forgets the password to an account online, there are numerous techniques by which a password can be reset. Some involve security questions, others involve sending email with a link back to the site allowing a new password to be created.

    What are the vulnerabilities and best practices here?

    Thanks!

  4. Give me something to “gawk” at? Then your troubles will go away in time for the nxt wave. *waves*

    Gnosis didn’t do it – I have news 4 u.
