Monthly Archives: February 2009

Optimised to fail: Card readers for online banking

A number of UK banks are distributing hand-held card readers for authenticating customers, in the hope of stemming the soaring levels of online banking fraud. As the underlying protocol — CAP — is secret, we reverse-engineered the system and discovered a number of security vulnerabilities. Our results have been published as “Optimised to fail: Card readers for online banking”, by Saar Drimer, Steven J. Murdoch, and Ross Anderson.

In the paper, presented today at Financial Cryptography 2009, we discuss the consequences of CAP having been optimised to reduce both the costs to the bank and the amount of typing done by customers. While the principle of CAP — two-factor transaction authentication — is sound, the flawed implementation in the UK puts customers at risk of fraud, or worse.

When Chip & PIN was introduced for point-of-sale, the effective liability for fraud was shifted to customers. While the banking code says that customers are not liable unless they were negligent, it is up to the bank to define negligence. In practice, the mere fact that Chip & PIN was used is considered enough. Now that Chip & PIN is used for online banking, we may see a similar reduction of consumer protection.

Further information can be found in the paper and the talk slides.

Evil Searching

Tyler Moore and I have been looking into how phishing attackers locate insecure websites on which to host their fake webpages, and our paper is being presented this week at the Financial Cryptography conference in Barbados. We found that compromised machines accounted for 75.8% of all the attacks, “free” web hosting accounts for 17.4%, and the rest was down to various specialist gangs. Those gangs should not be ignored, though: they’re sending most of the phishing spam and (probably) scooping most of the money!

Sometimes the same machine gets compromised more than once. Now this could be the same person setting up multiple phishing sites on a machine that they can attack at will… However, we often observe that the new site is in a completely different directory, strongly suggesting that a different attacker has broken into the same machine, but in a different way. We looked at all the recompromises where there was a delay of at least a week before the second attack and found that in 83% of cases a different directory was used. Using this definition of a “recompromise”, we found that around 10% of machines were recompromised within 4 weeks, rising to 20% after six months. Since there are plenty of vulnerable machines out there, this suggests there is something slightly different about the machines that get attacked again and again.
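To make the definition concrete, here is a minimal sketch of how such a recompromise count might be computed. The one-week gap and different-directory test come from the description above; the record format and field names are hypothetical, not the actual dataset used in the paper.

    from datetime import timedelta

    # Hypothetical records: (hostname, directory of the phishing page, date first seen).
    def count_recompromises(incidents, min_gap_days=7):
        """Count hosts hit again at least min_gap_days after an earlier incident,
        where the new phishing page sits in a different directory."""
        by_host = {}
        for host, directory, seen in incidents:
            by_host.setdefault(host, []).append((seen, directory))

        recompromised = set()
        for host, events in by_host.items():
            events.sort()
            for (t1, d1), (t2, d2) in zip(events, events[1:]):
                if t2 - t1 >= timedelta(days=min_gap_days) and d1 != d2:
                    recompromised.add(host)
        return len(recompromised), len(by_host)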

For 2486 sites we also had summary website logging data from The Webalizer, because those sites had left their daily visitor statistics world-readable. One of the items The Webalizer records is which search terms were used to locate the website: these are available in the “Referrer” header, which documents what was typed into search engines such as Google.

We found that some of these searches were “evil” in that they were looking for specific versions of software that contained security vulnerabilities (“If you’re running version 1.024 then I can break in”); or they were looking for existing phishing websites (“if you can break in, then so can I”); or they were seeking the PHP “shells” that phishing attackers often install to help them upload files onto the website (“if you haven’t password protected your shell, then I can upload files as well”).
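As a rough illustration of how such searches can be pulled out of the logs and classified, here is a short Python sketch. The regular expressions are purely illustrative stand-ins; the categories used in the paper were built by hand from the Webalizer summaries, not from patterns like these.

    import re
    from urllib.parse import urlparse, parse_qs

    # Illustrative patterns only, one per category of "evil" search described above.
    EVIL_PATTERNS = [
        re.compile(r"powered by \w+ [0-9.]+", re.I),          # a specific vulnerable version
        re.compile(r"(paypal|bank).*(login|verify)", re.I),   # existing phishing pages
        re.compile(r"(c99|r57|php ?shell)", re.I),            # installed PHP "shells"
    ]

    def search_terms(referrer):
        # Pull the query out of a search-engine Referrer URL (Google and others use 'q').
        return parse_qs(urlparse(referrer).query).get("q", [None])[0]

    def looks_evil(referrer):
        term = search_terms(referrer)
        return bool(term) and any(p.search(term) for p in EVIL_PATTERNS)

    print(looks_evil("http://www.google.com/search?q=powered+by+somecms+1.024"))   # True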

In all, we found “evil searches” on 204 machines that hosted phishing websites, and in the vast majority of cases these searches corresponded in time with when the website was broken into. Furthermore, in 25 cases the website was compromised twice and we were monitoring the daily log summaries after the first break-in: here 4 of the evil searches occurred before the second break-in, 20 on the day of the second break-in, and just one afterwards. Of course, where people didn’t “click through” from Google search results, perhaps because they used an automated tool, we won’t have a record of their searches. Nevertheless, even at the 18% incidence we can be sure of, searches are clearly an important mechanism.

The recompromise rates for sites where we found evil searches were a lot higher: 20% recompromised after 4 weeks, and nearly 50% after six months. There are lots of complicating factors here, not least that sites with world-readable Webalizer data might simply be inherently less secure. However, overall we believe this clearly indicates that phishing attackers are using search to find machines to attack, and that if one attacker can find a site, then it is likely that others will do so independently.

There’s a lot more in the paper itself (which is well worth reading before commenting on this article, since it goes into much more detail than is possible here). In particular, we show that publishing URLs in PhishTank slightly decreases the recompromise rate (getting the sites fixed has a bigger effect than the bad guys locating sites that someone else has compromised); and we also have a detailed discussion of various mitigation strategies that might be employed, now that we have firmly established that “evil searching” is an important way of locating machines to compromise.

When Layers of Abstraction Don’t Get Along: The Difficulty of Fixing Cache Side-Channel Vulnerabilities

(co-authored with Robert Watson)

Recently, our group was treated to a presentation by Ruby Lee of Princeton University, who discussed novel cache architectures which can prevent some cache-based side-channel attacks against AES and RSA. The new architecture was fascinating, in particular because it may actually increase cache performance (though this point was spiritedly debated by several systems researchers in attendance). For the security group, though, it raised two interesting and troubling questions. What is the proper defence against side channels due to the processor cache? And why hasn’t one been implemented, despite these attacks having been known for years?

Continue reading When Layers of Abstraction Don’t Get Along: The Difficulty of Fixing Cache Side-Channel Vulnerabilities

Technical aspects of the censoring of archive.org

Back in December I wrote an article here on the “Technical aspects of the censoring of Wikipedia” in the wake of the Internet Watch Foundation’s decision to add two Wikipedia pages to their list of URLs where child sexual abuse images are to be found. This list is used by most UK ISPs (and blocking systems in other countries) in the filtering systems they deploy that attempt to prevent access to this material.

A further interesting censoring issue was in the news last month, and this article (a little belatedly) explains the technical issues that arose from that.

For some time, the IWF have been adding URLs from The Internet Archive (widely known as “the wayback machine“) to their list. I don’t have access to the list and so I am unable to say how many URLs have been involved, but for several months this blocking also caused some technical problems.
Continue reading Technical aspects of the censoring of archive.org

Missing the Wood for the Trees

I’ve just submitted a (rather critical) public response to an ICANN working group report on fast-flux hosting (read the whole thing here).

Many phishing websites (and other types of wickedness) are hosted on botnets, with the hostname resolving to different machines every few minutes or hours (hence the “fast” in fast-flux). This means that in order to remove the phishing website you either have to shut down the botnet, which could take months, or you must get the domain name suspended.
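The fluxing itself is easy to observe from the DNS. The sketch below, which assumes the third-party dnspython package and a hypothetical hostname, simply resolves a name a few times and reports the addresses and TTLs seen; a fast-flux domain typically shows very low TTLs and a constantly changing set of addresses drawn from the botnet.

    import time
    import dns.resolver   # third-party package: dnspython

    def watch_flux(hostname, rounds=5, pause=300):
        # Resolve the name repeatedly and report the A records and TTLs seen.
        seen = set()
        for _ in range(rounds):
            answer = dns.resolver.resolve(hostname, "A")
            ips = sorted(r.address for r in answer)
            print(f"TTL={answer.rrset.ttl:5d}  {ips}")
            seen.update(ips)
            time.sleep(pause)
        print(f"{len(seen)} distinct addresses seen in {rounds} lookups")

    # watch_flux("phish.example.com")   # hypothetical fast-flux hostname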

ICANN’s report goes into lots of detail about how fast-flux hosting has been operated up to now, and sets out all sorts of characteristics that it currently displays (but of course the criminals could do something different tomorrow). It then makes some rather ill-considered suggestions about how to tackle some of these symptoms — without really understanding how that behaviour might be being used by legitimate companies and individuals.

In all this concentration on the mechanics they’ve lost sight of the key issue, which is that the domain name must be removed — and this is an area where ICANN (who look after domain names) might have something to contribute. However, their report doesn’t even tackle the different roles that registries (eg Nominet who look after the .UK infrastructure) and registrars (eg Gradwell who sell .UK domain names) might have.

From my conclusion:

The bottom line on fast-flux today is that it is almost entirely associated with a handful of particular botnets, and a small number of criminal gangs. Law enforcement action to tackle these would avoid a further need for ICANN consideration, and it would be perfectly rational to treat the whole topic as of minor importance compared with other threats to the Internet.

If ICANN are determined to deal with this issue, then they should leave the technical issues almost entirely alone. There is little evidence that the working group has the competence for considering these. Attention should be paid instead to the process issues involved, and the minimal standards of behaviour to be expected of registries, registrars, and those investigators who are seeking to have domain names suspended.

I strongly recommend adopting my overall approach of an abstract definition of the problem: The specific distinguisher of a fast-flux attack is that the dynamic nature of the DNS is exploited so that if a website is to be suppressed then it is essential to prevent the hostname resolving, rather than attempting to stop the website being hosted. The working group should consider the policy and practice issues that flow from considering how to prevent domain name resolution; rather than worrying about the detail of current attacks.

New Facebook Photo Hacks

Last March, Facebook caught some flak when some hacks circulated showing how to access private photos of any user. These were enabled by egregiously lazy design: viewing somebody’s private photos simply required determining their user ID (which shows up in search results) and then manually fetching a URL of the form:
www.facebook.com/photo.php?pid=1&view=all&subj=[uid]&id=[uid]
This hack was live for a few weeks in February, exposing some photos of Facebook CEO Mark Zuckerberg and (reportedly) Paris Hilton, before the media picked it up in March and Facebook upgraded the site.

Instead of using properly formatted PHP queries as capabilities to view photos, Facebook now verifies the requesting user against the ACL for each photo request. What could possibly go wrong? Well, as I discovered this week, the photos themselves are served from a separate content-delivery domain, leading to some problems which highlight the difficulty of building access control into an enormous, globally distributed website like Facebook.
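To see why the split matters, here is a toy sketch (my own illustration, not Facebook’s actual design): the application server enforces the ACL before handing out an image URL, but the content-delivery domain has no idea who is asking, so anyone who obtains the URL can fetch the photo.

    # Toy illustration only; names, paths and data are made up.
    PHOTO_ACL = {"photo123.jpg": {"alice", "bob"}}
    CDN_PATHS = {"photo123.jpg": "http://cdn.example.net/a1b2c3/photo123.jpg"}

    def app_server_photo_url(requesting_user, photo):
        # The main site: check the ACL, then hand back a URL on the CDN domain.
        if requesting_user not in PHOTO_ACL.get(photo, set()):
            raise PermissionError("not on the ACL for this photo")
        return CDN_PATHS[photo]

    def cdn_serve(url):
        # The CDN: serves the bytes to whoever presents a valid URL.
        return b"...jpeg data..." if url in CDN_PATHS.values() else None

    url = app_server_photo_url("alice", "photo123.jpg")
    print(cdn_serve(url) is not None)   # True: the URL works for anyone who has it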

Continue reading New Facebook Photo Hacks

Variable Length Fields in Cryptographic Protocols

Many crypto protocols contain variable length fields: the names of the participants, different sizes of public key, and so on.

In my previous post, I mentioned how Liqun Chen has (re)discovered that many protocols are broken if you don’t include the field lengths in MAC or signature computations (and, more to the point, that a bunch of ISO standards fail to warn the implementor about this issue).

The problem applies to confidentiality, as well as integrity.

Many protocol verification tools (ProVerif, for example) will assume that the attacker is unable to distinguish enc(m1, k, iv) from enc(m2, k, iv) if they don’t know k.

If m1 and m2 are of different lengths, this may not be true: the length of the ciphertext leaks information about the length of the plaintext. With Cipher Block Chaining, you can tell the length of the plaintext to the nearest block, and with stream ciphers you can tell the exact length. So you can have protocols that are “proved” correct but are still broken, because the idealized protocol doesn’t properly represent what the implementation is really doing.
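A quick way to see the difference, using the pyca/cryptography package (the keys, IV and messages here are just for illustration): the block cipher in CBC mode rounds the ciphertext up to a whole number of blocks, while the stream cipher’s ciphertext length equals the plaintext length exactly.

    import os
    from cryptography.hazmat.primitives import padding
    from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes

    key, iv, nonce = os.urandom(32), os.urandom(16), os.urandom(16)

    def cbc_encrypt(plaintext):
        padder = padding.PKCS7(128).padder()
        padded = padder.update(plaintext) + padder.finalize()
        enc = Cipher(algorithms.AES(key), modes.CBC(iv)).encryptor()
        return enc.update(padded) + enc.finalize()

    def stream_encrypt(plaintext):
        enc = Cipher(algorithms.ChaCha20(key, nonce), mode=None).encryptor()
        return enc.update(plaintext) + enc.finalize()

    for m in (b"Bob", b"Bartholomew", b"a much longer participant name"):
        print(len(m), len(cbc_encrypt(m)), len(stream_encrypt(m)))
    # CBC ciphertexts: 16, 16, 32 bytes; stream-cipher ciphertexts: 3, 11, 30 bytes.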

If you want different plaintexts to be observationally equivalent to the attacker, you can pad the variable-length fields to a fixed length before encrypting. But if there is a great deal of variation in length, this may be a very wasteful thing to do.

The alternative approach is to change your idealization of the protocol to reflect the reality of your encryption primitive. If your implementation sends m encrypted under a stream cipher, you can idealize it as sending an encrypted version of m together with L_m (the length of m) in the clear.

Hidden Assumptions in Cryptographic Protocols

At the end of last week, Microsoft Research hosted a meeting of “Cryptoforma”, a proposed new project (a so-called “network of excellence”) to bring together researchers working on applying formal methods to security. They don’t yet know whether or not this project will get funding from the EPSRC, but I wish them good luck.

There were several very interesting papers presented at the meeting, but today I want to talk about the one by Liqun Chen, “Parsing ambiguities in authentication and key establishment protocols”.

Some of the protocol specifications published by ISO specify how the protocol should be encoded on the wire, in sufficient detail to enable different implementations to interoperate. An example of a standard of this type is the one for the public key certificates that are used in SSL authentication of web sites (and many other applications).

The security standards produced by one group within ISO (SC27) aren’t like that. They specify the abstract protocols, but give the implementor considerable leeway in how they are encoded. This means that you can have different implementations that don’t interoperate. If these implementations are in different application domains, the lack of interoperability doesn’t matter. For example, Tuomas Aura and I recently wrote a paper in which we presented a protocol for privacy-preserving wireless LAN authentication, which we rather boldly claim to be based on the abstract protocol from ISO 9798-4.

You could think of these standards as separating concerns: the SC27 folks get the abstract crypto protocol correct, and then someone else standardises how to encode it in a particular application. But does the choice of concrete encoding affect the protocol’s correctness?

Liqun Chen points out one case where it clearly does. In the abstract protocols in ISO 9798-4 and others, data fields are joined by a double vertical bar operator. If you want to find out what that double vertical bar really means, you have to spend another 66 Swiss Francs and get a copy of ISO 9798-1, which tells you that Y || Z means “the result of the concatenation of the data items Y and Z in that order”.

Oops.

When we specify abstract protocols, it’s generally understood that the concrete encoding that gets signed or MAC’d contains enough information to unambiguously identify the field boundaries: it contains length fields, a closing XML tag, or whatever. A signed message {Payee, Amount} K_A should not allow a payment of $3 to Bob12 to be mutated by the attacker into a payment of $23 to Bob1. But ISO 9798 (and a bunch of others) don’t say that. There’s nothing that says a conforming implementation can’t send the length field without authentication.
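The ambiguity is easy to demonstrate with a few lines of Python (HMAC-SHA256 simply stands in for whatever MAC or signature the protocol actually uses; the key and field values are made up): under bare concatenation the two payments authenticate to the same tag, while a length-prefixed encoding keeps them apart.

    import hmac, hashlib, struct

    key = b"shared MAC key (illustrative)"

    def mac(data):
        return hmac.new(key, data, hashlib.sha256).hexdigest()

    def naive(payee, amount):
        return payee + amount                   # Y || Z as bare concatenation

    def length_prefixed(payee, amount):
        # Each field preceded by a 4-byte big-endian length.
        return b"".join(struct.pack(">I", len(f)) + f for f in (payee, amount))

    # $3 to Bob12 versus $23 to Bob1
    print(mac(naive(b"Bob12", b"3")) == mac(naive(b"Bob1", b"23")))                      # True
    print(mac(length_prefixed(b"Bob12", b"3")) == mac(length_prefixed(b"Bob1", b"23")))  # False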

Now of course, an implementor probably wouldn’t do that. But they might.

More generally: do these abstract protocols make a bunch of implicit, undocumented assumptions about the underlying crypto primitives and encodings that might turn out not to be true?

See also: Boyd, C. Hidden assumptions in cryptographic protocols. Computers and Digital Techniques, volume 137, issue 6, November 1990.