UPDATE 2012-06-07: LinkedIn has confirmed that the leak is real, that they “recently” switched to salted passwords (so the data is presumably an out-of-date backup), and that they’re resetting the passwords of users involved in the leak. There is still no credible information about whether the hackers involved have the account names or the rest of the site’s passwords. If so, this incident could still have serious security consequences for LinkedIn users. If not, it’s still a major black eye for LinkedIn, though they deserve credit for acting quickly to minimise the damage.
LinkedIn appears to be the latest website to suffer a large-scale password leak. Perhaps due to LinkedIn’s relatively high profile, it’s made major news very quickly even though LinkedIn has neither confirmed nor denied the reports. Unfortunately the news coverage has badly muddled the facts. All I’ve seen is a list of 6,458,020 unsalted SHA-1 hashes floating around. There are no account names associated with the hashes. Most importantly, the leaked file has no repeated hashes. All of the coverage appears to miss this fact. Most likely, the leaker intentionally ran it through ‘uniq’ in addition to removing account info to limit the damage. Also interestingly, 3,521,180 (about 55%) of the hashes have the first 20 bits overwritten with 0. Among these, 670,785 are otherwise equal to another hash, meaning that they are actually repeats of the same password stored in a slightly different format (LinkedIn probably just switched formats at some point in the past). So there are really 5,787,235 unique hashes leaked.
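The counts above can be reproduced with a short script. This is only a sketch of the idea, not the exact procedure used for this post: it assumes the leak is available as one 40-character hex SHA-1 per line with no repeated lines, and the function name `summarize` is made up for illustration. Twenty bits correspond to the first five hex digits. A genuine hash can begin with five zero digits by chance (roughly one in a million), so the figures are approximate.

```python
ZERO_PREFIX = "00000"  # 20 bits = 5 hex digits

def summarize(hashes):
    """Count zero-prefixed hashes and those that repeat an intact hash.

    `hashes` is an iterable of 40-character hex strings (assumed format).
    """
    cleaned = [h.strip().lower() for h in hashes if h.strip()]
    intact = [h for h in cleaned if not h.startswith(ZERO_PREFIX)]
    zeroed = [h for h in cleaned if h.startswith(ZERO_PREFIX)]

    # Index intact hashes by their last 35 hex digits (the low 140 bits),
    # which is all that survives in the zero-prefixed format.
    intact_suffixes = {h[5:] for h in intact}
    overlapping = sum(1 for h in zeroed if h[5:] in intact_suffixes)

    return {
        "total": len(cleaned),
        "zeroed": len(zeroed),
        "overlapping": overlapping,
        # Each overlapping entry is a second copy of a password already counted.
        "unique": len(cleaned) - overlapping,
    }
```

Calling `summarize(open(path))` on the leaked file (the path is whatever you saved it as) should give numbers comparable to those quoted above if the assumptions hold.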
This gives us no idea how many total accounts were affected by the leak. It’s probably much less than the total number at the site (about 161 M) unless LinkedIn users choose implausibly terrible passwords. The RockYou leak, which included passwords for 32M users, had over 14 M unique passwords. A random sample of about 12.5 M RockYou passwords has an expected 5.8 M unique passwords, so we might project that the LinkedIn leak represents closer to 12.5 M users if the password distributions are similar. Any news reporting indicating “6.5 M accounts are affected” has not done basic investigation on the source data here.
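The projection above can be sketched as follows, under stated assumptions: `counts` is a list of per-password frequencies from a reference leak such as RockYou, and the sample is treated as drawn with replacement, which is a simplification of sampling actual users. Both function names are hypothetical.

```python
def expected_unique(counts, sample_size):
    """Expected number of distinct passwords in a random sample.

    A password with probability p is missed by all `sample_size` draws
    with probability (1 - p) ** sample_size (sampling with replacement).
    """
    total = sum(counts)
    return sum(1.0 - (1.0 - c / total) ** sample_size for c in counts)

def sample_size_for(counts, target_unique):
    """Smallest sample size whose expected distinct count reaches the target."""
    if target_unique <= 0:
        return 0
    if target_unique > len(counts):
        raise ValueError("target exceeds the number of distinct passwords")
    lo, hi = 0, 1
    # Double the upper bound until it is large enough, then binary search.
    while expected_unique(counts, hi) < target_unique:
        lo, hi = hi, hi * 2
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if expected_unique(counts, mid) < target_unique:
            lo = mid
        else:
            hi = mid
    return hi
```

With the RockYou frequency table as `counts`, `sample_size_for(counts, 5_800_000)` gives the kind of estimate quoted above, assuming LinkedIn's password distribution resembles RockYou's.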
Here’s the more important thing though: in its current form this leak has minimal security implications for LinkedIn users. No account identifiers were leaked, so this doesn’t allow offline attack against individual accounts. The passwords were hashed, which means the leak reveals no password strong enough to be out of the range of current cracking libraries. The fact that the passwords are uniqued means little information about the most popular passwords at LinkedIn is revealed. In short, while this data might be interesting for research, there’s nothing particularly useful here for attackers, with one exception: given a list of LinkedIn usernames, an attacker might try many variations of them, see which are in the list, and then try those matches with the known usernames at the real login page. This is a minor risk: relatively few users use variations of their username, and those that do can already be attacked online with slightly less efficiency.
This situation could turn out to be much worse. The attacker could be sitting on the whole database and leaked this subset accidentally or to try to find a buyer. For now though, the amount of news coverage is way out of proportion with the real impact. LinkedIn should be criticised heavily if they truly had their password database breached, but the news coverage so far has caused premature panic among LinkedIn users.
A further note on cracking so far: I’ve also seen a list of 163,237 hashes which have been inverted, leading to reports along the lines of “x passwords have already been compromised.” The list I’ve seen doesn’t seem to be a skilled job: it misses basic things like “password” which are in the leak, and it only cracks hashes which didn’t have the high-order bits zeroed out. Surely a better cracking effort will be done, but this doesn’t matter much anyway given the lack of account information in the leak.
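Checking whether a candidate such as “password” is in the leak means testing both formats, which the cracked list apparently did not do. A minimal sketch, assuming the leaked hashes have been loaded into a Python set of lowercase hex strings and that passwords were hashed as UTF-8 bytes (an assumption, since the leak does not say); `in_leak` is a name invented here.

```python
import hashlib

def in_leak(candidate, leaked):
    """Report how a candidate password appears in the leaked set, if at all.

    Returns "intact" for a full SHA-1 match, "zeroed" for a match on the
    variant whose first 20 bits (5 hex digits) are overwritten with 0,
    and None otherwise.
    """
    digest = hashlib.sha1(candidate.encode("utf-8")).hexdigest()
    masked = "00000" + digest[5:]
    if digest in leaked:
        return "intact"
    if masked in leaked:
        return "zeroed"
    return None
```

A cracking effort that only looks up the intact digest will miss every password stored solely in the zeroed format.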
20 thoughts on “On the (alleged) LinkedIn password leak”
“minimal security implications for LinkedIn users.”
Completely false. Attackers who have the full list, without the unique filter, can create a top-100 password list, which would be pretty successful in attacking current accounts.
I’m sorry, I cannot see what you base any of your reassuring conclusions on.
1. We have no knowledge about what was stolen but not released. There is every reason to believe that the thieves have the account-password link, but didn’t release the account names since that’s where the value is.
2. We don’t even know if it was released or if it was stolen from the thieves by other thieves.
3. Experience shows that many of the users who have their passwords brute-forced here, much aided by the lack of salt and a fast hash, will have used the same password or password-method on other sites.
Poul covers most of it; consider also that this is a database of hashes of people who are most likely business-savvy and middle class, so the resulting frequency tables could be very interesting indeed.
Biggest threat is that someone else has the hash->emailaddr mappings.
It still really isn’t an issue. It’s simply a race between the crackers and the average user to change their password (and pws on associated accounts). Time is on the side of the user. Most technical aspects are pretty irrelevant.
“Time is on the side of the user”
When did the compromise take place?
LinkedIn have now confirmed that the passwords are real.
They’re now upgrading to salted hashes and forcing users to change their passwords. Of course, salted hashes can be easily cracked using GPUs now. bcrypt would be a better choice.
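The difference the commenter points to, between a salted fast hash and a deliberately slow one, can be sketched as below. This swaps in scrypt from Python's standard `hashlib` in place of bcrypt, which is not in the standard library; the cost parameters are illustrative assumptions, not a recommendation, and nothing here reflects what LinkedIn actually deployed.

```python
import hashlib
import hmac
import os

# Illustrative scrypt cost parameters (assumed values, tune for your hardware).
SCRYPT_PARAMS = {"n": 2**14, "r": 8, "p": 1, "dklen": 32}

def hash_password(password):
    """Return (salt, digest) using a fresh random salt and a slow hash."""
    salt = os.urandom(16)
    digest = hashlib.scrypt(password.encode("utf-8"), salt=salt, **SCRYPT_PARAMS)
    return salt, digest

def verify_password(password, record):
    """Recompute the digest with the stored salt and compare in constant time."""
    salt, expected = record
    digest = hashlib.scrypt(password.encode("utf-8"), salt=salt, **SCRYPT_PARAMS)
    return hmac.compare_digest(digest, expected)
```

The salt means identical passwords no longer share a stored value, and the slow hash raises the cost of each guess, which is the part a plain salted SHA-1 lacks.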
Aleksey, Poul, and Alec: I agree with your points that this incident could have serious implications if usernames are eventually released. My main point was that what’s available so far isn’t really a problem and the early coverage missed this, along with getting the numbers very wrong. I understated the risk though of more information coming out which was equally a mistake.
Thanks very much for the feedback.
“My main point was that what’s available so far isn’t really a problem”
Available to the public: Yes.
Available to the thieves: How do you know ?
The multiple-zero prefixes mark the passwords already cracked by the community, as revealed by Ars Technica.
@stu: my situation would contradict that theory, as documented at http://dropsafe.crypticide.com/article/7235 – my guess is one of these three:
– some weird attempt at whitening
– side effect of some previous attempt to hash the hashes into buckets, never cleaned up (ie: also corruption)
The claim on http://arstechnica.com/security/2012/06/10-or-so-of-the-worst-passwords-exposed-by-the-linkedin-hack/ that “Hashes that have been cracked were prepended with ‘00000’ by the people who run the site to tell them apart from those not cracked by hackers yet” stinks of bad reporting; contextually it just seems wrong.
“Most likely, the leaker intentionally ran it through ‘uniq’ in addition to removing account info to limit the damage.”
How is this limiting the damage? This article is BS.
Sorry guys, but we don’t appear to “get it” with regard to passwords and attackers…
We know that attackers go for the “low-hanging fruit” first and work upwards. We also know that over fifty years of using passwords we have only made small incremental improvements, which the attackers usually overcome with little difficulty and in fairly short order these days.
So let us assume that a salt is added or some other system such as bcrypt is used…
You are not solving the problem, only raising the bar so that the attacker will go to the next weakest point to get their desired data. For instance, if the DB is made less vulnerable, then once the attacker can get into the target servers they will just install a logger or some such to intercept the username and password prior to the authentication process. If it’s found on the servers, they can move the logging onto a close intermediary node on the internet outside of the target servers’ organisation, or make some other attack to ensure the traffic flows through a node under the attacker’s control.
So what to do? Well, you could ask why the password should be let off the client machine in the first place.
That is, from the target’s point of view they fully externalise the risk onto the client machine, and thus make it the client owner’s issue. At the same time it makes the attackers move their attention away from the target servers, as the desirable information is no longer there (though there may well be other information that is still desirable besides the user’s password).
Can this be done? Yes, and it has been possible one way or another for around a quarter of a century.
Oh, I forgot to add the obligatory Wiki page link for the Secure Remote Password Protocol,
The fact that LinkedIn didn’t salt their passwords from day 1 indicates that they’re clueless about security. And that they couldn’t be bothered to have any decent outside review of their security plan and its implementation. Seriously, guys, salting passwords isn’t exactly a recent discovery, nor is it hard to implement. It’s been part of every decent recommended-security-practices guide for a long time…..
@ Jonathan Thornburg
That’s the conclusion I came to discussing this with Clive Robinson on Schneier’s blog. Nice stuff like SRP aside, salting already makes the crooks work for their take, yet LinkedIn didn’t do even the most basic security. The stuff that was practically free thanks to groups like OWASP. There should be legal liability when a company making so much money with so much data doesn’t do the bare minimum. I’m not asking for perfect or anything from web 2.0. Just standard commercial best practices at a minimum. Linkedin is far from it.
Joseph; Linkedin’s handling of this incident so far has been slow, with little useful information. Today it has also been revealed that they do not have a CIO/CISO. Those responsible for setting policies and doing controls are also responsible for operations.
Separation of duties seems to be missing, and that within a company with 150mill+ users. They could have learned from breaches like Sony last year, apparently they didn’t.
Good thing is there are no reliable reports of account breaches & abuse, neither at Linkedin or at any other sites where one would expect password reuse.
I am very happy to say that I managed to alert national CERTs and others before the story appeared in the media, and I also believe that the massive media coverage has prevented any serious fallout following the initial breach.
As for lots of the BS floating around I would like to see some serious stats of real – economical – losses or gains following such breaches. After all HBGary shares went up and finally the company got purchased – AFTER a truly embarrassing password breach with their CEO in the front seat.
Considering the usual high level of insight and quality of opinion on this blog, this post is truly disappointing.
You clearly have no idea when the theft took place, exactly what was taken, nor who has the complete dataset.
And yet at the same time as blasting mainstream media for hype and misinformation, you yourself, based only on assumption, claim there is minimal risk to users.
A balanced response is required but do not underplay the risk, which is just as dangerous or silly as overplaying it.
To make matters worse you assume “recently” adding salting implies the theft was of an old backup. To me, what that says is that since knowing about the theft they have begun to salt passwords. Not at all the same thing!
In future please try to sound less like you’re in the pocket of big business, and more like an intelligent and independent security researcher.
Not sure where to post this, so I thought posting here would do the trick. Yahoo reports that some passwords have been stolen.
After my LinkedIn password hash was leaked, I had to change 10+ website passwords, and I don’t want to do that again. I have been working on a feasible solution since then.
The solution is called Aladdin and it is an open source USB key(board) to your computer & websites. It types your password so you don’t have to. There is no software to install and it works everywhere because it appears as a USB keyboard to the operating system. All it does is type your password.
I’m trying to raise funds by crowdfunding at http://www.indiegogo.com/aladdin-key so I invite you to take a look.