A truly marvellous proof of a transaction

When you transact with an EMV payment card (a “Chip and PIN” card), typical UK operation is for the card and the bank to exchange three authentication “cryptograms”. First comes the request from the card (the ARQC), then comes the response from the bank (the ARPC) and finally the transaction certificate (TC). The idea with the transaction certificate is that the card signs off on the correct completion of the protocol, having received the response from the bank and accepted it. The resulting TC is supposed to be a sort of “proof of transaction”, and because it certifies the completion of the entire transaction at a technical level, one can presume that both the money was deducted and the goods were handed over. In an offline transaction, one can ask the card to produce a TC straight away (no ARQC and ARPC), and depending on various risk management settings, it will normally do so. So, what can you do with a TC?

Well, the TC is sent off to the settlement and clearing system, along with millions of others, and there it sits in the transaction logs. These logs get validated as part of clearing and the MAC is checked (or it should be). In practice the checks may get deferred until there is a dispute over a transaction (i.e. a human complains). The downside is, if you don’t check all TCs, you can’t spot offline fraud in the Chip and PIN system. Should a concerned party dispute a transaction, the investigation team will pull the record, and use a special software tool to validate the TC – objective proof that the card was involved.

Another place the TC gets put is on your receipt. An unscientific poll of my wallet reveals 13 EMV receipts with TCs present (if you see a hex string like D3803D679B33F16E8 you’ve found it) and 7 without – 65% of my receipts have one. The idea is that the receipt can stand as a cryptographic record of the transaction, should the bank’s logs fail or somehow be munged.

But there is a bit of a problem: sometimes there isn’t enough space for all the data. It’s a common problem: Mr. Fermat had a truly marvellous proof of a proposition but it wouldn’t fit in the margin, and unfortunately while the cryptogram fits on the receipt, you can’t check a MAC without knowing the input data, and EMV terminal software developers have a pretty haphazard approach to including this on the receipts… after all, paper costs money!

Look at the following receipts. Cab-Inn Aarhus has the AID, the ATC and various other input goodies to the MAC (sorry if I’m losing you in EMV technicalities); Ted Baker has the TC, the CVMR, the TSI, TVR and IACs (but no ATC), and the Credit Agricole cash withdrawal has the TC and little else. One doesn’t want to be stuck like Andrew Wiles spending seven years guessing the input data! Only at one shop have I seen the entire data required on the receipt (including CID et al) and it was (bizarrely) from a receipt for a Big Mac at McDonalds!

Various POS receipts

Now in case of dispute one can brute force the missing data, guessing up to 40 bits or so of it without undue difficulty. And there is much wrangling (indeed the subject of previous and current legal actions) about exactly what data should be provided, whether the card UDKs should be disclosed to both sides in a dispute, and generally what constitutes adequate proof.
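To make the brute-force idea concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions: real EMV cryptograms are computed with a card-specific key and a block-cipher MAC, whereas this sketch substitutes a truncated HMAC-SHA256 (`toy_mac`) so that it runs with the standard library alone. The key, the field layout and the function names are all hypothetical.

```python
import hashlib
import hmac
from itertools import product


def toy_mac(key: bytes, data: bytes) -> bytes:
    """Stand-in MAC: HMAC-SHA256 truncated to 8 bytes (NOT the EMV algorithm)."""
    return hmac.new(key, data, hashlib.sha256).digest()[:8]


def recover_missing(key, known_prefix, known_suffix, missing_len, target):
    """Try every value of the missing bytes until the MAC matches the target."""
    for candidate in product(range(256), repeat=missing_len):
        guess = bytes(candidate)
        mac = toy_mac(key, known_prefix + guess + known_suffix)
        if hmac.compare_digest(mac, target):
            return guess
    return None  # no value of the missing bytes reproduces the cryptogram


# Hypothetical example: the receipt shows the cryptogram and most inputs,
# but a two-byte counter was left off to save paper.
key = b"hypothetical-card-key"
prefix = b"amount=1250;currency=826;"
missing = bytes([0x01, 0x2A])
suffix = b";tvr=0000000000"
tc = toy_mac(key, prefix + missing + suffix)

found = recover_missing(key, prefix, suffix, 2, tc)
print(found.hex())  # prints 012a
```

Here only 16 bits are missing, so at most 65,536 trials are needed; 40 missing bits would mean up to 2^40 (about 1.1 trillion) trials. Note that the search needs the card key, which is one reason the question of who gets to see the UDKs matters.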

A "fake" transaction certificate

But most worrying is something I observed last week, making a credit card purchase for internet access at an airport: a fake transaction certificate. I can tell it’s a fake because firstly it’s the wrong length, and secondly, it’s online: I only typed in my card number, and never plugged my card into a smartcard reader. Now I can only assume that some aesthetically minded developer decided that the online confirmation needed a transaction certificate, so took any old hex field and just dumped it. It could be an innocent mistake. But whether or not the “margin was too small” to contain this TC, it’s a sad reflection on the state of proof in EMV when cryptographic data fields only appear for decorative purposes. Maybe some knowledgeable reader can shed some light on this?

Facebook Giving a Bit Too Much Away

Facebook has been serving up public listings for over a year now. Unlike most of the site, anybody can view public listings, even non-members. They offer a window into the Facebook world for those who haven’t joined yet, since Facebook doesn’t allow full profiles to be publicly viewable by non-members (unlike MySpace and others). Of course, this window into Facebook comes with a prominent “Sign Up” button, growth still being the main mark of success in the social networking world. The goal is for non-members to stumble across a public listing, see how many friends are already using Facebook, and then join. Economists call this a network effect, and Facebook is shrewdly harnessing it.

Of course, to do this, Facebook is making public every user’s name, photo, and 8 friendship links. Affiliations with organizations, causes, or products are also listed; I just don’t have any on my profile (though my sister does). This is quite a bit of information given away by a feature many active Facebook users are unaware of. Indeed, it’s more information than Facebook’s own privacy policy indicates is given away. When the feature was launched in 2007, every over-18 user was automatically opted in, as have been new users since then. You can opt out, but few people do: out of more than 500 friends of mine, only 3 had taken the time to opt out. It doesn’t help that most users are unaware of the feature, since registered users don’t encounter it.

Making matters worse, public listings aren’t protected from crawling. In fact they are designed to be indexed by search engines. In our own experiments, we were able to download over 250,000 public listings per day using a desktop PC and a fairly crude Python script. For a serious data aggregator, getting every user’s listing is no sweat. So what can one do with 200 million public listings?

I explored this question along with Jonathan Anderson, Frank Stajano, and Ross Anderson in a new paper which we presented today at the ACM Social Network Systems Workshop in Nuremberg. Facebook’s public listings give us a random sample of the social graph, leading to some interesting exercises in graph theory. As we describe in the paper, it turns out that this sampled graph allows us to approximate many properties of the complete network surprisingly well: degree and centrality of nodes, small dominating sets, short paths, and community structure. These are all things marketers and sociologists alike would love to know for the complete Facebook graph.
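As a rough illustration of why a few links per user reveal so much, the following Python sketch builds a synthetic friendship graph, lets every user expose 8 friends, and checks how well the exposed graph identifies the best-connected users. Everything here is an assumption made for the example, not the method of the paper: the synthetic graph is grown by a simple preferential-attachment rule, the 8 links are chosen uniformly at random, and the function names are made up.

```python
import random


def preferential_graph(n, m, rng):
    """Grow an undirected graph in which newcomers prefer well-connected nodes."""
    adj = {v: set() for v in range(n)}
    endpoints = []  # every node appears once per edge it belongs to
    for v in range(n):
        if v <= m:
            targets = set(range(v))  # the first m + 1 nodes form a clique
        else:
            targets = set()
            while len(targets) < m:
                targets.add(rng.choice(endpoints))
        for u in targets:
            adj[v].add(u)
            adj[u].add(v)
            endpoints.extend([u, v])
    return adj


def public_listing_sample(adj, k, rng):
    """Each user reveals at most k friends; return the union as an undirected graph."""
    sampled = {v: set() for v in adj}
    for v, friends in adj.items():
        for u in rng.sample(sorted(friends), min(k, len(friends))):
            sampled[v].add(u)
            sampled[u].add(v)
    return sampled


def top_by_degree(adj, count):
    """Return the `count` nodes with the most neighbours."""
    return set(sorted(adj, key=lambda v: len(adj[v]), reverse=True)[:count])


rng = random.Random(2009)
full = preferential_graph(2000, 10, rng)
sampled = public_listing_sample(full, 8, rng)
overlap = len(top_by_degree(full, 100) & top_by_degree(sampled, 100))
print(f"{overlap} of the 100 best-connected users are also in the sampled top 100")
```

A popular user shows up in many other people's lists of 8, so degree in the sampled graph tends to track true degree even though each listing reveals only a small part of a user's friends.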

This result leads to two interesting conclusions. First, protecting a social graph is hard. Consistent with previous results, we found that giving away a seemingly small amount can allow much information to be inferred. It’s also been shown that anonymising a social graph is almost impossible.

Second, Facebook is developing a track record of releasing features and then being surprised by the privacy implications, from Beacon to NewsFeed and now Public Search. Analogous to security-critical software, where new code is extensively tested and evaluated before being deployed, social networks should have a formal privacy review of all new features before they are rolled out (as, indeed, should other web services which collect personal information). Features like public search listings shouldn’t make it off the drawing board.

The Snooping Dragon

There’s been much interest today in a report that Shishir Nagaraja and I wrote on Chinese surveillance of the Tibetan movement. In September last year, Shishir spent some time cleaning out Chinese malware from the computers of the Dalai Lama’s private office in Dharamsala, and what we learned was somewhat disturbing.

Later, colleagues from the University of Toronto followed through by hacking into one of the control servers Shishir identified (something we couldn’t do here because of the Computer Misuse Act); their report relates how the attackers had controlled malware on hundreds of other PCs, many in government agencies of countries such as India, Vietnam and the Philippines, but also in US firms such as AP and Deloittes.

The story broke today in the New York Times; see also coverage in the Telegraph, the BBC, CNN, the Times of India, AP, InfoWorld, Wired and the Wall Street Journal.

Democracy Theatre on Facebook

You may remember a big PR flap last month about Facebook’s terms of service, followed by Facebook backing down and promising to involve users in a self-governing process of drafting their future terms. This is an interesting step with little precedent amongst commercial web sites. Facebook now has enough users to be the fifth largest nation on earth (recently passing Brazil), and operators of such immense online societies need to define a cyber-government which satisfies their users while operating lawfully within a multitude of jurisdictional boundaries, as well as meeting their legal obligations to the shareholders who own the company.

Democracy is an intriguing approach, and it is encouraging that Facebook is considering this path. Unfortunately, after some review my colleagues and I are left thoroughly disappointed by both the new documents and the specious democratic process surrounding them. We’ve outlined our arguments in a detailed report; the official deadline for commentary is midnight tonight.

The non-legally binding Statement of Principles outlines an admirable set of goals in plain language, which was refreshing. However, these goals are then undermined for a variety of legal and business reasons by the “Statement of Rights and Responsibilities”, which would effectively be the new Terms of Service. For example, Facebook demands that application developers comply with users’ privacy settings which it doesn’t provide access to, states that users should have “programmatic access” and then bans users from interacting with the site via “automated means,” and states that the service will transcend national boundaries while banning users from signing up if they live in a country embargoed by the United States.

The stated goal of fairness and equality is also lost. The Statement of Rights and Responsibilities primarily assigns rights to Facebook and responsibilities to users, developers, and advertisers. Facebook still demands a broad license to all user content, shifts all responsibility for enforcing privacy onto developers, and sneakily disclaims all liability. Yet it imposes an unrealistic set of obligations: a literal reading of the document requires users to get explicit permission from other users before viewing their content. Furthermore, they have applied the banking industry’s well-known trick of shifting liability to customers, binding users to not do anything to “jeopardize the security of their account,” which can be used to dissolve the contract.

The biggest missed opportunity, however, is the utter failure to provide a real democratic process as promised. Users are free to comment on terms, but Facebook is under no obligation to listen. Facebook’s official group for comments contains a disorganised jumble of thousands of comments, some insightful and many inane. It is difficult to extract intelligent analysis here. Under certain conditions a vote can be called, but this is hopelessly weakened: it only applies to certain types of changes, the conditions of the vote are poorly specified and subject to manipulation by Facebook, and in fact they reserve the right to ignore the vote for “administrative reasons.”

With a nod to Bruce Schneier, we call such steps “democracy theatre.” It seems the goal is not to actually turn governance over to users, but to use the appearance of democracy and user involvement to ward off future criticism. Our term may be new, but this trick is not: it has been used by autocratic regimes around the world for decades.

Facebook’s new terms represent a genuine step forward with improved clarity in certain areas, but an even larger step backward in using democracy theatre to cover the fact that Facebook is a business and its ultimate accountability is to its shareholders. The outrage over the previous terms was real and it was justified: social networks mean a great deal to their users, and they want to have a real say. Since Facebook appears unwilling to actually give them one, though, we would be remiss to allow it to deflect users’ anger with flowery language and a sham democratic process. For this reason we cannot support the new terms.

[UPDATE: Our report has been officially backed by the Open Rights Group]

EFF and Tor Project in Google Summer of Code

The EFF and the Tor Project have been accepted into Google Summer of Code. This programme offers students a stipend for contributing to open source software over a three-month period. Google Summer of Code has been running since 2005 and the Tor Project has been a participant since 2007.

We are looking for talented and motivated students to work on a number of projects to improve Tor, and related applications. Students are also welcome to come up with their own ideas. Applications must be submitted by 3 April 2009. For further information, and details on how to apply, see the Tor blog.

National Fraud Strategy

Today the Government “launches” its National Fraud Strategy. I qualify the verb because none of the quality papers seems to be running the story, and the press releases have not yet appeared on the websites of the Attorney General or the Ministry of Justice.

And well might Baroness Scotland be ashamed. The Strategy is a mishmash of things that are being done already with one new initiative – a National Fraud Reporting Centre, to be run by the City of London Police. This is presumably intended to defuse the Lords’ criticisms of the current system whereby fraud must be reported to the banks, not to the police. As our blog has frequently reported, banks dump liability for fraud on customers by making false claims about system security and imposing unreasonable terms and conditions. This is a regulatory failure: the FSA has been just as gullible in accepting the banking industry’s security models as they were about accepting its credit-risk models. (The ombudsman has also been eager to please.)

So what’s wrong with the new arrangements? Quite simply, the National Fraud Reporting Centre will nestle comfortably alongside the City force’s Dedicated Cheque and Plastic Crime Unit, which investigates card fraud but is funded by the banks. Given this disgraceful arrangement, which is more worthy of Uzbekistan than of Britain, you have to ask how eager the City force will be to investigate offences that bankers don’t want investigated, such as the growing number of insider frauds and chip card cloning? And how vigorously will City cops investigate their paymasters for the fraud of claiming that their systems are secure, when they’re not, in order to avoid paying compensation to defrauded accountholders? The purpose of the old system was to keep the fraud figures artificially low while enabling the banks to control such investigations as did take place. And what precisely has changed?

The lessons of the credit crunch just don’t seem to have sunk in yet. The Government just can’t kick the habit of kowtowing to bankers.

Hot Topics in Privacy Enhancing Technologies (HotPETs 2009)

HotPETs – the 2nd Hot Topics in Privacy Enhancing Technologies workshop (co-located with PETS) – will be held in Seattle, 5–7 August 2009.

HotPETs is the forum for new ideas on privacy, anonymity, censorship resistance, and related topics. Work-in-progress is welcomed, and the format of the workshop will be to encourage feedback and discussion. Submissions are especially encouraged on the human side of privacy: what do people believe about privacy? How does privacy work in existing institutions?

Papers (up to 15 pages) are due by 8 May 2009. Further information can be found in the call for papers.

Optimised to fail: Card readers for online banking

A number of UK banks are distributing hand-held card readers for authenticating customers, in the hope of stemming the soaring levels of online banking fraud. As the underlying protocol — CAP — is secret, we reverse-engineered the system and discovered a number of security vulnerabilities. Our results have been published as “Optimised to fail: Card readers for online banking”, by Saar Drimer, Steven J. Murdoch, and Ross Anderson.

In the paper, presented today at Financial Cryptography 2009, we discuss the consequences of CAP having been optimised to reduce both the costs to the bank and the amount of typing done by customers. While the principle of CAP — two factor transaction authentication — is sound, the flawed implementation in the UK puts customers at risk of fraud, or worse.

When Chip & PIN was introduced for point-of-sale, the effective liability for fraud was shifted to customers. While the banking code says that customers are not liable unless they were negligent, it is up to the bank to define negligence. In practice, the mere fact that Chip & PIN was used is considered enough. Now that Chip & PIN is used for online banking, we may see a similar reduction of consumer protection.

Further information can be found in the paper and the talk slides.

Evil Searching

Tyler Moore and I have been looking into how phishing attackers locate insecure websites on which to host their fake webpages, and our paper is being presented this week at the Financial Cryptography conference in Barbados. We found that compromised machines accounted for 75.8% of all the attacks, “free” web hosting accounts for 17.4%, and the rest is various specialist gangs — albeit those gangs should not be ignored; they’re sending most of the phishing spam and (probably) scooping most of the money!

Sometimes the same machine gets compromised more than once. Now this could be the same person setting up multiple phishing sites on a machine that they can attack at will… However, we often observe that the new site is in a completely different directory, strongly suggesting that a different attacker has broken into the same machine, but in a different way. We looked at all the recompromises where there was a delay of at least a week before the second attack and found that in 83% of cases a different directory was used… and using this definition of a “recompromise” we found that around 10% of machines were recompromised within 4 weeks, rising to 20% after six months. Since there are a lot of vulnerable machines out there, there must be something slightly different about the machines that get attacked again and again.
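The recompromise measure described above can be sketched in a few lines of Python. This is a simplified reading of the definition, not the analysis code from the paper: the event records, host names and the `recompromises` helper are invented for illustration, and two phishing sites on one host count as a recompromise only if they appear at least a week apart.

```python
from collections import defaultdict
from datetime import date, timedelta


def recompromises(events, min_gap=timedelta(days=7)):
    """Return (host, first_dir, second_dir) for consecutive sites at least min_gap apart."""
    by_host = defaultdict(list)
    for host, day, directory in events:
        by_host[host].append((day, directory))
    pairs = []
    for host, seen in by_host.items():
        seen.sort()  # order each host's phishing sites by date
        for (d1, dir1), (d2, dir2) in zip(seen, seen[1:]):
            if d2 - d1 >= min_gap:
                pairs.append((host, dir1, dir2))
    return pairs


# Made-up observations: (host, date first seen, directory of the phishing page)
events = [
    ("a.example", date(2008, 1, 1), "/images/"),
    ("a.example", date(2008, 1, 20), "/forum/cache/"),
    ("b.example", date(2008, 2, 1), "/tmp/"),
    ("b.example", date(2008, 2, 3), "/tmp/"),  # too soon: same incident
    ("c.example", date(2008, 3, 1), "/x/"),
    ("c.example", date(2008, 3, 15), "/x/"),
]

pairs = recompromises(events)
different = sum(1 for _, first, second in pairs if first != second)
print(f"{len(pairs)} recompromises, {different} in a different directory")
```

On this toy data the script reports two recompromises, one of which used a different directory; the b.example pair is ignored because the second site appeared only two days after the first.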

For 2486 sites we also had summary website logging data from The Webalizer, where sites had left their daily visitor statistics world-readable. One of the bits of data The Webalizer documents is which search terms were used to locate the website (because these are available in the HTTP “Referer” header, which records what was typed into search engines such as Google).

We found that some of these searches were “evil” in that they were looking for specific versions of software that contained security vulnerabilities (“If you’re running version 1.024 then I can break in”); or they were looking for existing phishing websites (“if you can break in, then so can I”); or they were seeking the PHP “shells” that phishing attackers often install to help them upload files onto the website (“if you haven’t password protected your shell, then I can upload files as well”).
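The three kinds of “evil” search can be illustrated with a short Python sketch that pulls the query out of a search-engine referrer URL and matches it against a few patterns. This is an assumption-laden toy: we worked from Webalizer summaries rather than raw headers, and the patterns, category labels and example URL below are hypothetical, chosen only to show the idea.

```python
import re
from urllib.parse import parse_qs, urlparse

# Illustrative patterns only; a real study would need a far richer set.
EVIL_PATTERNS = {
    "vulnerable-version": re.compile(r"powered by .+ \d+\.\d+", re.I),
    "existing-phish": re.compile(r"inurl:\S*(paypal|signin)\S*\.php", re.I),
    "shell": re.compile(r"(c99|r57)\s*shell", re.I),
}


def search_terms(referrer):
    """Extract the search query from a Google referrer URL, or None."""
    parsed = urlparse(referrer)
    if "google." not in parsed.netloc:
        return None
    values = parse_qs(parsed.query).get("q")
    return values[0] if values else None


def classify(terms):
    """Return the label of the first matching evil pattern, or None."""
    for label, pattern in EVIL_PATTERNS.items():
        if pattern.search(terms):
            return label
    return None


referrer = "http://www.google.com/search?q=powered+by+ExampleCMS+1.024&hl=en"
terms = search_terms(referrer)
print(terms, "->", classify(terms))
```

A query such as “cheap flights barbados” matches none of the patterns and is classified as `None`, so only searches that look like reconnaissance are flagged.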

In all, we found “evil searches” on 204 machines that hosted phishing websites and, in the vast majority of cases, these searches corresponded in time to when the website was broken into. Furthermore, in 25 cases the website was compromised twice and we were monitoring the daily log summaries after the first break-in: here 4 of the evil searches occurred before the second break-in, 20 on the day of the second break-in, and just one after the second break-in. Of course, where people didn’t “click through” from Google search results, perhaps because they had an automated tool, then we won’t have a record of their searches; nevertheless, even at the 18% incidence we can be sure of, searches are an important mechanism.

The recompromise rates for sites where we found evil searches were a lot higher: 20% recompromised after 4 weeks, nearly 50% after six months. There are lots of complicating factors here, not least that sites with world-readable Webalizer data might simply be inherently less secure. However, overall we believe that it clearly indicates that the phishing attackers are using search to find machines to attack; and that if one attacker can find the site, then it is likely that others will do so independently.

There’s a lot more in the paper itself (which is well worth reading before commenting on this article, since it goes into much more detail than is possible here)… In particular, we show that publishing URLs in PhishTank slightly decreases the recompromise rate (getting the sites fixed is a bigger effect than the bad guys locating sites that someone else has compromised); and we also have a detailed discussion of various mitigation strategies that might be employed, now that we have firmly established that “evil searching” is an important way of locating machines to compromise.