All posts by Richard Clayton

Cambridge Cloud Cybercrime Centre

We have recently won a major grant (around £2 million over 5 years) under the EPSRC Contrails call which we will be using to set up the “Cambridge Cloud Cybercrime Centre”:

https://www.cambridgecybercrime.uk/

This will be a multi-disciplinary initiative combining expertise from the University of Cambridge’s Computer Laboratory, Institute of Criminology and Faculty of Law. We will be operational from 1 October 2015.

Our approach will be data-driven. We have already negotiated access to some very substantial datasets relating to cybercrime and we aim to leverage our neutral academic status to obtain more data and build one of the largest and most diverse datasets that any organisation holds.

We will mine and correlate these datasets to extract information about criminal activity. Our analysis will enhance understanding of crime ‘in the cloud’, enable us to devise identifiers of such criminality, allow us to build systems to detect this type of crime when it occurs, and aid us in showing how it is possible to collect extremely reliable evidence of wrongdoing. When it is appropriate, we will work closely with law enforcement so that interventions can be undertaken.

Our overall objective is to create a sustainable and internationally competitive centre for academic research into cybercrime.

Importantly, we will not be keeping all this data to ourselves… a key aim of our Centre is to make data available to other academics for them to apply their own skills to address cybercrime issues.

Academics currently face considerable difficulties in researching cybercrime. It is difficult, and time-consuming, to negotiate access to real data on actual abuse, and then it is necessary to build and deploy data collection tools before the real work can even be started.

We intend to drive a step change in the amount of cybercrime research by making datasets available, not just of URLs but content as well, so that other academics can concentrate on their particular areas of expertise and start being productive immediately. These datasets will be both ‘historic’ and, where appropriate, ‘real-time’.

We will maintain high ethical standards in everything we do and will develop a strong legal framework for our operations. In particular we will always ensure that the data we handle is treated fully in accord with the spirit, and not just the letter, of the agreements we enter into.

We will shortly be hiring for the first few research positions … pointers to the job adverts will appear on this blog.

Phishing that looks like another risk altogether

I came across an unusual DHL branded phish recently…

The user receives an email with the Subject of “DHL delivery to [ xxx ]June ©2015” where xxx is their valid email address. The From is forged as “DHLexpress<noreply@delivery.net>” (the criminal will have used this domain since delivery.net hasn’t yet adopted DMARC whereas dhl.com has a p=reject policy which would have prevented this type of forgery altogether).
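The role DMARC plays here can be illustrated with a minimal sketch. It assumes the domain's DMARC TXT record has already been fetched (no DNS lookup is performed), and the function names and sample records are hypothetical, chosen only for illustration. A receiving mail system that honours DMARC is asked to reject forged mail only when the published policy is `p=reject`.

```python
from __future__ import annotations


def parse_dmarc(txt_record: str) -> dict[str, str]:
    """Split a DMARC TXT record into its tag=value pairs."""
    tags = {}
    for part in txt_record.split(";"):
        if "=" in part:
            key, value = part.split("=", 1)
            tags[key.strip().lower()] = value.strip()
    return tags


def forgery_rejected(txt_record: str | None) -> bool:
    """True only if the domain publishes a DMARC policy of p=reject."""
    if not txt_record:
        return False  # no DMARC record published at all
    tags = parse_dmarc(txt_record)
    return tags.get("v") == "DMARC1" and tags.get("p", "").lower() == "reject"


# Illustrative records only (assumptions, not fetched from any real domain).
strict_record = "v=DMARC1; p=reject; rua=mailto:reports@example.com"
monitor_only = "v=DMARC1; p=none"

print(forgery_rejected(strict_record))  # True
print(forgery_rejected(monitor_only))   # False
print(forgery_rejected(None))           # False
```

A domain with no record, or with a weaker policy, gives the forger nothing to worry about, which is why the choice of From domain matters.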

The email looks like this (I’ve blacked out the valid email address):
[Image: DHL email body]
and so, although we would all wish otherwise, it is predictable that many recipients will have opened the attachment.

BTW: if the image looks in the least bit fuzzy in your browser then click on the image to see the full-size PNG file and appreciate how realistic the email looks.

I imagine many now expect me to explain about some complex 0-day within the PDF that infects the machine with malware, because after all, that’s the main risk from opening unexpected attachments, isn’t it?

But no!

Which Malware Lures Work Best?

Last week at the APWG eCrime Conference in Barcelona I presented some new results about an old Instant Messaging (IM) worm from a paper written by Tyler Moore and myself.

In late April 2010 users of the Yahoo and Microsoft IM systems started to get messages from their buddies which said, for example:
foto ☺ http://www.example.com/image.php?user@email.example.com
where the email address was theirs and the URL was for some malware.

Naturally, since the message was from their buddy a lot of folks clicked on the link and when the Windows warning pop-up said “you cannot see this photo until you press OK” they pressed OK and (since the Windows message was in fact a warning about executing unknown programs downloaded from the Internet) they too became infected with the malware. Hence they sent foto ☺ messages to all their buddies and the worm spread at increasing speed.

By late May 2010 I had determined how the malware was controlled (it resolved hostnames to locate IRC servers then joined particular channels where the topic was the message to be sent to buddies) and built a Perl program to join in and monitor what was going on. I also determined that the criminals were often hosting their malware on hosting sites with world-readable Apache weblogs so we could get exact counts of malware downloads (how many people clicked on the links).
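As a rough illustration of how download counts can be read out of a world-readable web server log, here is a minimal sketch. It assumes the log is in Apache's Common Log Format and treats each distinct client address as one user; both are assumptions of this example, and the sample lines are invented. It is not the Perl tooling used for the actual study.

```python
import re
from collections import defaultdict

# Apache Common Log Format: host ident user [time] "request" status bytes
LOG_LINE = re.compile(r'^(\S+) \S+ \S+ \[[^\]]+\] "GET (\S+)[^"]*" (\d{3})')


def count_downloads(lines):
    """Return {path: number of distinct client addresses that fetched it}."""
    clients = defaultdict(set)
    for line in lines:
        match = LOG_LINE.match(line)
        if match is None:
            continue  # not a GET request, or not in the expected format
        host, url, status = match.groups()
        if status != "200":
            continue  # only successful downloads are counted
        path = url.split("?", 1)[0]  # drop the email address in the query
        clients[path].add(host)  # a set ignores repeat clicks from one host
    return {path: len(hosts) for path, hosts in clients.items()}


# Invented sample lines using documentation-only IP addresses.
sample_log = [
    '203.0.113.5 - - [01/May/2010:10:00:00 +0000] "GET /image.php?a@example.com HTTP/1.1" 200 51200',
    '203.0.113.5 - - [01/May/2010:10:00:09 +0000] "GET /image.php?a@example.com HTTP/1.1" 200 51200',
    '198.51.100.7 - - [01/May/2010:10:01:00 +0000] "GET /image.php?b@example.com HTTP/1.1" 200 51200',
    '192.0.2.9 - - [01/May/2010:10:02:00 +0000] "GET /image.php?c@example.com HTTP/1.1" 404 0',
]

print(count_downloads(sample_log))  # {'/image.php': 2}
```

A real analysis would also need to filter out security monitoring and crawlers, which this sketch does not attempt.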

Full details, and the story of a number of related worms that spread over the next two years, can be found in the academic paper (and are summarised in the slides I used for a very short talk in Barcelona and a longer version I presented a week earlier in Luxembourg).

The key results are:

  • Thanks to some sloppiness by the criminals we had some brief snapshots of activity from an IRC channel used when the spreading phase was complete and infected machines were being forced to download new malware — this showed that 95% of people had clicked OK to dismiss the Microsoft warning message.
  • We had sufficient download data to estimate that around 3 million users were infected by the initial worm and we have records of over 14 million distinct downloads over all of the different worms (having ignored events caused by security monitoring, multiple clicks by the same user, etc.). That is — this was a large scale event.
  • We were able to compare the number of clicks during periods when the criminals switched back and forth between using URL shorteners and using hostnames that (vaguely) resembled brands such as Facebook, MySpace, Orkut and so on. We found that using shorteners reduced the number of clicks by almost half, presumably because it made users more cautious.
  • From early 2011 the worms were mainly affecting Brazil, and the simple “foto ☺” had long been replaced by other textual lures. We found that when the criminals used lures in Portuguese (e.g. “eu acho que é você na”, which has, I was told in Barcelona, a distinctive Brazilian feel to it) they were far more successful in getting people to click than when they used ‘language independent’ lures such as “hahha foto”.

There’s nothing here which is super-surprising, but it is useful to see our preconceptions borne out not in a laboratory experiment (where it is hard to ensure that the experimental subjects are behaving quite the way that they would ‘in the wild’) but by large scale measurements from real events.

A dubious article for a dubious journal

This morning I received a request to review a manuscript for the “Journal of Internet and Information Systems”. That’s standard for academics: you regularly get requests to do some work for the community for free!

However this was a little out of the ordinary in that the title of the manuscript was “THE ASSESSING CYBER CRIME AND IT IMPACT ON INFORMATION TECHNOLOGY IN NIGERIA” which is not, I feel, particularly grammatical English. I’d expect an editor to have done something about that before I was sent the manuscript…

I stared hard at the email headers (after all I’d just been sent some .docx files out of the blue) and it seems that the Journals Review Department of academicjournals.org uses Microsoft’s platform for their email (so no smoking gun from a spear-phishing point of view). So I took some appropriate precautions and opened the manuscript file.

It was dreadful … and read like it had been copied from somewhere else and patched together — indeed one page appeared twice! However, closer examination suggested it had been scanned rather than copy-typed.

For example:

The primary maturation of malicious agents attacking information system has changed over time from pride and prestige to financial again.

Which, some searches will show you, comes from page 22 of Policing Cyber Crime written by Petter Gottschalk in 2010 (a book I haven’t read, so I’ve no idea how good it is). Clearly “maturation” should be “motivation”, “system” should be “systems” and “again” should be “gain”.

Much of the rest of the material (I didn’t spend a long time on it) was from the same source. Since the book is widely available for download in PDF format (though I do wonder how many versions were authorised), it’s pretty odd to have scanned it.

I then looked harder at the Journal itself — which is one of a group of 107 open-access journals. According to this report they were at one time misleadingly indicating an association with Elsevier, although they didn’t do that on the email they sent me.

The journals appear on “Beall’s list”: a compendium of questionable, scholarly open-access publishers and journals. That is, publishing your article in one of these venues is likely to make your CV look worse rather than better.

In traditional academic publishing the author gets their paper published for free and libraries pay (quite substantial amounts) to receive the journal, which the library users can then read for free, but the article may not be available to non-library users. The business model of “open-access” is that the author pays for having their paper published, and then it is freely available to everyone. There is now much pressure to ensure that academic work is widely available and so open-access is very much in vogue.

There are lots of entirely legitimate open-access journals with exceedingly high standards, but also some very dubious journals which are perceived as accepting most anything and just collecting the money to keep the publisher in the style to which they have become accustomed (as an indication of the money involved, the fee charged by the Journal of Internet and Information Systems is $550).

I sent back an email to the Journal saying “Even a journal with your reputation should not accept this item”.

What does puzzle me is why anyone would submit a plagiarised article to an open-access journal with a poor reputation. Paying money to get your ripped-off material published in a dubious journal doesn’t seem to be good tactics for anyone. Perhaps it’s just that the journal wants to list me (enrolling my reputation) as one of their reviewers? Or perhaps I was spear-phished after all? Time will tell!

On the measurement of banking fraud

Kidnapping is not an easy crime to be successful at…

… it is of course easy to grab the heiress from outside the nightclub at 3am. It’s easy to incarcerate her at the remote farmhouse. If you pick the right henchmen then it’s easy to cut off her ear and post it off to the frantic family.

Thereafter it gets very difficult — you must communicate directly several times and you must physically go and pick up the bag of money. These last two tasks are extremely difficult to manage successfully which is why police forces solve kidnap cases so often (in its first 5 years the Metropolitan Police Kidnap Unit solved 100% of their cases).

Theft from online bank accounts also has its difficulties. It remains relatively easy to gain access to a victim’s bank account and to issue instructions on their behalf. Last decade this was all about “phishing” — gathering credentials by creating fake websites; more recently credentials have been compromised by means of “man-in-the-browser” malware: you think you are paying your gas bill and that’s what your browser tells you is occurring. In practice you’re approving a money transfer to a criminal.

However, moving the money to another account does not mean that the criminal has got away with it. If the bank notices a suspicious pattern of transfers then they can investigate, and when they see the tell-tale signs of fraud then the transfers (which were only changes to computer records) can be trivially reversed. It is only when the criminal can extract folding money from an ATM, or can move the money abroad in such a way that it will never be repatriated that they have been truly successful. So like kidnap, theft from bank accounts is somewhat harder to pull off than one might initially think.

This has turned out to be a surprise to the Treasury Select Committee.

Last month I was asked to give oral evidence to them and the very first question related to how much fraud there was relating to online banking. I explained that the banks collated figures showing how much money was actually “lost” (viz: the amount that the banks ended up, usually anyway, reimbursing to the unfortunate customers who had been defrauded).

However, industry insiders say that about twice this amount is moved to another account but — and this is basically Very Good News — it is then transferred back so there is no actual loss to anyone. We don’t know the exact figures here, because they are not collated and published.

Furthermore, the banks should also be measuring “money at risk”, that is, the total amount in the compromised accounts. If their security measures failed and criminals stole every last penny then these would be actual losses, an order of magnitude more, perhaps, than the published figures.

The Select Committee chairman is now writing to the banks to ask if this is all true and what the “true” fraud figures might be. If the banks reply with detailed information then we might finally understand quite how difficult bank fraud is. I fully expect the story will run something along the lines that <n> accounts with 10,000 pounds in them are compromised, that the crooks fraudulently transfer 995 pounds from most, but not all, of these <n>, but that half the time the fraudulent transaction is reversed.

If this analysis is correct then online banking fraud is still, on average, much more lucrative than kidnapping, but we must make up our mind as to whether to measure it using the figures of 10,000 or 995 or “about half of 995 is permanently lost”. There’s justification for every way of measuring the problem, but it’s important to understand the limitations of any single measurement; failure to do so will mean that the banks will not deploy the right level of security measures, and the politicians will fail to give the issue an appropriate level of consideration.
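To make the three ways of measuring concrete, here is a small worked sketch. The 10,000 and 995 pound figures and the "half are reversed" rate come from the scenario above; the number of compromised accounts (1,000) and the number actually hit (900, standing in for "most, but not all") are made-up values for illustration only.

```python
def fraud_measures(accounts, accounts_hit, balance, transfer, reversed_percent):
    """Return (money at risk, money moved, money permanently lost) in pounds."""
    at_risk = accounts * balance          # everything in the compromised accounts
    moved = accounts_hit * transfer       # fraudulent transfers actually made
    lost = moved * (100 - reversed_percent) // 100  # transfers never reversed
    return at_risk, moved, lost


# accounts=1000 and accounts_hit=900 are hypothetical; the rest follows the text.
at_risk, moved, lost = fraud_measures(
    accounts=1000, accounts_hit=900, balance=10_000, transfer=995,
    reversed_percent=50,
)
print(f"money at risk:    {at_risk:>10,}")   # 10,000,000
print(f"money moved:      {moved:>10,}")     # 895,500
print(f"permanently lost: {lost:>10,}")      # 447,750
```

With these assumed inputs the three measures differ by more than a factor of twenty from top to bottom, which is the point: the same incident can be reported as any of them.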

A Study of Whois Privacy and Proxy Service Abuse

Long-time readers will recall that last year ICANN published the draft report of our study into the abuse of privacy and proxy services when registering domain names.
At WEIS 2014 I will present our academic paper summarising what we have found — and the summary (as the slides for the talk indicate) is very straightforward:

  • when criminals register domain names for use in online criminality they don’t provide their names and addresses;
  • we collected substantial data to show that this is generally true;
  • in doing so we found that the way in which contact details are hidden varies somewhat depending upon the criminal activity and this gives new insights;
  • meantime, people calling for changes to domain ‘privacy’ and ‘proxy’ services “because they are used by criminals” must understand:
    • the impact of such a policy change on other registrants
    • the limitations of such a policy change on criminals

To give just one example, the registrants of the domain names used for fake pharmacies are the group that uses privacy and proxy services the most (55%): that’s because a key way in which such pharmacy domains are suppressed is to draw attention to invalid details having been provided when the domain was registered. Privacy and proxy services hide this fakery. In contrast, the registrants of domains that are used to supply child sexual images turn to privacy and proxy services just 29% of the time (only just higher than banks, at 28%)… but drawing attention to false registration details is not the approach that is generally taken for this type of content.

Our work provides considerable amounts of hard data to inform the debates around changing the domain Whois system to significantly improve accuracy and usefulness and to prevent misuse. Abolishing privacy and proxy services, if this was even possible, would affect a substantial amount of lawful activity — while criminals currently using these services might be expected to adopt the methods of their peers and instead provide incomplete and inaccurate data. However, insisting that domain registration data was always complete and accurate would mean a great many lawful registrations would need to be updated.

Ghosts of Banking Past

Bank names are so tricksy — they all have similar words in them… and so it’s common to see phishing feeds with slightly the wrong brand identified as being impersonated.

However, this story is about how something the other way around has happened, in that AnonGhost, a hacker group, believe that they’ve defaced “Yorkshire Bank, one of the largest United Kingdom bank” and there’s some boasting about this to be found at http://www.p0ison.com/ybs-bank-got-hacked-by-team-anonghost/.

However, it rather looks to me as if they’ve hacked an imitation bank instead! A rather less glorious exploit from the point of view of potential admirers.

Don't believe what you read in the papers

Yesterday the heads of “MI5”, “MI6” and GCHQ appeared before the Intelligence and Security Committee of Parliament. The uncorrected transcript of their evidence is now online (or you can watch the video).

One of the questions fielded by Andrew Parker (“MI5”) was how many terrorist plots there had been over the past ten years. According to the uncorrected transcript (and this accords with listening to the video — question starts at 34:40) he said:

I think the number since… if I go back to 2005, rather than ten years… 7/7 is that there have been 34 plots towards terrorism that have been disrupted in this country, at all sizes and stages. I have referred publicly and previously, and my predecessors have, to the fact that one or two of those were major plots aimed at mass casualty that have been attempted each year. Of that 34, most of them, the vast majority, have been disrupted by active detection and intervention by the Agencies and the police. One or two of them, a small number, have failed because they just failed. The plans did not come together. But the vast majority by intervention.

I understand that to mean 34 plots over 8 years, most but not all of which were disrupted, rather than just discovered. Of these, one or two per year were aimed at causing mass casualties (that’s 8 to 16 of them). I find it really quite surprising that such a rough guess of 8 to 16 major plots was not remarked upon by the Committee, but then they were being pretty soft generally in what they asked about.

The journalists who covered the story heard this all slightly differently, both as to how many plots were foiled by the agencies and how many were aimed at causing mass casualties!

A Study of Whois Privacy and Proxy Service Abuse

ICANN have now published a draft for public comment of “A Study of Whois Privacy and Proxy Service Abuse”. I am the primary author of this report; the work was done whilst I was collaborating with the National Physical Laboratory (NPL) under EPSRC Grant EP/H018298/1.

This particular study was originally proposed by ICANN in 2010, one of several that were to examine the impact of domain registrants using privacy services (where the name of a domain registrant is published, but contact details are kept private) and proxy services (where even the domain licensee’s name is not made available on the public database).

ICANN wanted to know if a significant percentage of the domain names used to conduct illegal or harmful Internet activities are registered via privacy or proxy services to obscure the perpetrator’s identity. No surprises in our results: they are!

However, it’s more interesting to ask whether this percentage is somewhat higher than the usage of privacy or proxy services for entirely lawful and harmless Internet activities. This turned out NOT to be the case: for example, banks use privacy and proxy services almost as often as the registrants of domains used in the hosting of child sexual abuse images; and the registrants of domains used to host (legal) adult pornography use privacy and proxy services more often than most (but not all) of the different types of malicious activity that we studied.

It’s also relevant to consider what other methods might be chosen by those involved in criminal activity to obscure their identities, because in the event of changes to privacy and proxy services, it is likely that they will turn to these alternatives.

Accordingly, we determined experimentally whether a significant percentage of the domain names we examined have been registered with incorrect Whois contact information, and specifically whether or not we could reach the domain registrant using a phone number from the Whois information. We asked them a single question, in their native language: “did you register this domain?”

We got somewhat variable results from our phone survey, but the pattern becomes clear if we consider whether there is any a priori hope at all of ringing up the domain registrant.

If we sum up the likelihoods:

  • uses privacy or proxy service
  • no (apparently valid) phone number in whois
  • number is apparently valid, but fails to connect
  • number reaches someone other than the registrant

then we find that for legal and harmless activities the probability of a phone call not being possible ranges between 24% (legal pharmacies on the Legitscript list) and 62% (owners of lawful websites that someone has broken into and installed phishing pages). For malicious activities the probability of failure is 88% or more, with typosquatting (which is a civil matter, rather than a criminal one) sitting at 68% (some of the typosquatters want to hide, some do not).
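The summing of likelihoods described above can be sketched as follows. Each domain is assumed to fall into at most one of the four failure categories, so the counts simply add; the class name, field names and sample counts are all hypothetical and are not figures from the report.

```python
from dataclasses import dataclass


@dataclass
class PhoneSurvey:
    """Outcome counts for one category of domain (all values hypothetical)."""
    total: int               # domains sampled
    privacy_or_proxy: int    # registered via a privacy or proxy service
    no_valid_number: int     # no (apparently valid) phone number in Whois
    failed_to_connect: int   # number looked valid but the call failed
    wrong_person: int        # call reached someone other than the registrant

    def unreachable_fraction(self) -> float:
        """Fraction of sampled domains whose registrant could not be phoned."""
        unreachable = (
            self.privacy_or_proxy
            + self.no_valid_number
            + self.failed_to_connect
            + self.wrong_person
        )
        return unreachable / self.total


# Invented counts, chosen only to show the arithmetic.
example = PhoneSurvey(
    total=200,
    privacy_or_proxy=60,
    no_valid_number=30,
    failed_to_connect=20,
    wrong_person=10,
)
print(f"{example.unreachable_fraction():.0%}")  # 60%
```

Looking only at the first category would report 30% here, which shows why the privacy and proxy figure alone understates how hard registrants are to reach.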

There’s lots of detail and supporting statistics in the report… and an executive summary for the time-challenged. It will provide real data, rather than just speculative anecdotes, to inform the debate around reforming Whois — and the difficulties of doing so.

Traceability in the Queen's Speech

The Queen’s speech at today’s state opening of Parliament includes the prediction:

“In relation to the problem of matching Internet protocol addresses, my Government will bring forward proposals to enable the protection of the public and the investigation of crime in cyberspace”

This is all that remains of the Home Office’s ambition to bring forward a revised version of the Draft Communications Data Bill that two Parliamentary Select Committees were so unimpressed by, and which the Liberal Democrats have declined to support.

The sole issue on which there appears to be political consensus is that “something must be done” about the traceability failure that regularly occurs when the Internet is accessed from a smartphone. The shortage of IPv4 addresses means that the mobile companies cannot give each smartphone a unique IP address — so hundreds of users share the same IP address with only the TCP/UDP source port number distinguishing their traffic. Because this sharing is done very dynamically the mobile phone companies find it problematic to record the source port mapping, and they have argued that the way the EU Data Retention Directive is written they have no obligation to make and keep such records.
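Why the source port matters can be seen in a minimal sketch of a lookup against a carrier's address-sharing log. The record layout, the function name and the sample data are all assumptions made for illustration; real mobile networks allocate ports far more dynamically than the fixed blocks shown here.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class PortMapping:
    """One hypothetical log record: who held which ports on a shared IP."""
    public_ip: str
    port_start: int
    port_end: int
    valid_from: datetime
    valid_to: datetime
    subscriber: str


def find_subscribers(log, public_ip, seen_at, source_port=None):
    """Return subscribers consistent with the IP, time and (if known) port."""
    return [
        m.subscriber
        for m in log
        if m.public_ip == public_ip
        and m.valid_from <= seen_at < m.valid_to
        and (source_port is None or m.port_start <= source_port <= m.port_end)
    ]


noon = datetime(2013, 5, 8, 12, 0)
one_pm = datetime(2013, 5, 8, 13, 0)
log = [
    PortMapping("192.0.2.1", 1024, 2047, noon, one_pm, "subscriber-A"),
    PortMapping("192.0.2.1", 2048, 3071, noon, one_pm, "subscriber-B"),
]
seen = datetime(2013, 5, 8, 12, 5)

print(find_subscribers(log, "192.0.2.1", seen))        # both subscribers
print(find_subscribers(log, "192.0.2.1", seen, 2500))  # ['subscriber-B']
```

Without the port, the IP address and timestamp match every user sharing the address; with it, and with the mapping recorded, a single subscriber can be identified. The lookup also only works if the server that was accessed logged the source port in the first place.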

I wrote about this issue at some length on this blog in January 2010, although until very recently the Home Office considered it to be tantamount to a state secret and were extremely coy about discussing it in public.

The Queen’s “bring forward proposals” phrase appears to cover a range of options:

  • the mobile companies decide that they can manage to log the source port mapping data after all;
  • the Home Office pays for new kit at the mobile companies that will allow source port mapping to be done;
  • there is a short bill (or clause in another bill) that requires the logging to be done (this might avoid any question of payments being ultra vires, or would ensure compliance by companies (possibly broadband suppliers) that looked like becoming stragglers);
  • there are discussions but nothing happens at all — perhaps because the tide turns against Data Retention as being a necessary and proportionate policy. A number of other EU countries have found it to be incompatible with fundamental human rights.

The Open Rights Group (ORG) have recently produced a pamphlet (available online here) setting out how surveillance might be better approached in this century. I contributed the chapter on the technical issues…

… if you don’t have time to read the whole thing then the New Statesman has an edited version of my chapter; and you can watch a short video of myself (and two other contributors) explaining the major issues.