Talking Trojan: Analyzing an Industry-Wide Disclosure

Talking Trojan: Analyzing an Industry-Wide Disclosure tells the story of what happened after we discovered the Trojan Source vulnerability, which broke almost all computer languages, and the Bad Characters vulnerability, which broke almost all large NLP tools. This provided a unique opportunity to measure software maintenance in action. Who patched quickly, reluctantly, or not at all? Who paid bug bounties, and who dodged liability? What parts of the disclosure ecosystem work well, which are limping along, and which are broken?

Security papers typically describe a vulnerability but say little about how it was disclosed and patched. And while disclosing one vulnerability to a single vendor can be hard enough, modern supply chains multiply the number of affected parties leading to an exponential increase in the complexity of the disclosure. One vendor will want an in-house web form, another will use an outsourced bug bounty platform, still others will prefer emails, and *nix OS maintainers will use a very particular PGP mailing list. Governments sort-of want to assist with disclosures but prefer to use yet another platform. Many open-source projects lack an embargoed disclosure process, but it is often in the interest of commercial operating system maintainers to write embargoed patches – if you can get hold of the right people.

A vulnerability that affected many different products at the same time and in similar ways gave us a unique chance to observe the finite-impulse response of this whole complex system. Our observations reveal a number of weaknesses, such as a potentially dangerous misalignment of incentives between commercially sponsored bug bounty programs and multi-vendor coordinated disclosure platforms. We suggest tangible changes that could strengthen coordinated disclosure globally.

We also hope to inspire other researchers to publish the mechanics of individual disclosures, so that we can continue to measure and improve the critical ecosystem on which we rely as our main defense against growing supply chain threats. In the meantime, our paper can be found here, and will appear in SCORED ‘22 this November.

ExtremeBB: Supporting Large-Scale Research into Misogyny and Online Extremism

Online anonymous platforms such as forums enable freedom of speech, but also facilitate misogyny, extremism, and political polarisation. We have collected tens of millions of postings to such forums and created a new tool for social scientists to study how these phenomena are linked.

Far-right extremism has been associated with a growing number of mass killings, overtaking Islamist terrorism in about 2018. Examples include the Wisconsin Sikh temple shooting (2012), the riots in Charlottesville (2017), the Pittsburgh synagogue shooting (2018), the Christchurch mosque shootings (2019), the US Capitol riots (January 2021), and recently the Buffalo shooting (May 2022). Misogyny has been explicitly linked with terror attacks including the Isla Vista killings (2014), the Toronto Van attack (2018), the Hanau shootings (early 2020), and most recently, the Plymouth shooting in the UK (August 2021).

Are extremism and misogyny linked? Joan Smith documented how the great majority of the men who committed terrorist killings in Europe since 9/11, whether far-right or Islamist, display strongly misogynistic attitudes. Most also have a history of physically abusing women — often in their own families — before committing acts of violence against strangers. The Womanstats database, created by Val Hudson and colleagues, has uncovered many statistically significant relationships between the physical security of women and the security of states: authoritarian patriarchal attitudes undermine good government in multiple ways.

Social scientists — who often have limited technical skills to deal with complicated collection techniques to compile a reasonably meaningful database — lack quantitative measurements at a finer granularity. The case studies collected by Smith and the macroeconomic data collected in Womanstats are compelling in their own ways. However, there are not many high-quality datasets that support quantitative analysis at scales in between individuals and whole societies. The existing resources tend to be small, difficult to access, or not well-maintained.

We have therefore created ExtremeBB, a longitudinal structured textual database of nearly 50M posts made by around 400K registered active members on 12 online extremist forums that promote misogyny and far-right extremism (as of September 2022). Its goal is to facilitate both qualitative and quantitative research on historical trends going back two decades. Our data can help researchers trace the evolution of extremist ideology, extremist behaviours, external political movements and relationships between online subcultures; measure hate speech and toxicity; and explore links between misogyny, far-right extremism, and their correlation. A better understanding of extremist subcultures may lead to more effective interventions, while ExtremeBB may also help monitor the effectiveness of any interventions that are undertaken.

This database is being actively maintained and developed with special attention to ensuring data completeness and making it a reliable resource. Academic researchers can request access through the Cambridge Cybercrime Centre, subject to a standard license to ensure lawful and ethical use. Since the database was first opened to external researchers in 2021, access has been granted to 49 researchers from 16 groups in 12 universities. The paper describing this powerful new resource and describing some of the things we have so far discovered using it can be found here.

The Dynamics of Industry-wide Disclosure

Last year, we disclosed two related vulnerabilities that broke a wide range of systems. In our Bad Characters paper, we showed how to use Unicode tricks – such as homoglyphs and bidi characters – to mislead NLP systems. Our Trojan Source paper showed how similar tricks could be used to make source code look one way to a human reviewer, and another way to a compiler, opening up a wide range of supply-chain attacks on critical software. Prior to publication, we disclosed our findings to four suppliers of large NLP systems, and nineteen suppliers of software development tools. So how did industry respond?

We were invited to give the keynote talk this year at LangSec, and the video is now available. In it we describe not just the Bad Characters and Trojan Source vulnerabilities, but the large natural experiment created by their disclosure. The Trojan Source vulnerability affected most compilers, interpreters, code editors and code repositories; this enabled us to compare responses by firms versus nonprofits and by firms that managed their own response versus those who outsourced it. The interaction between bug bounty programs, government disclosure assistance, peer review and press coverage was interesting. Most of the affected development teams took action, though some required a bit of prodding.

The response by the NLP maintainers was much less enthusiastic. By the time we gave this talk, only Google had done anything – though we now hear that Microsoft is now also working on a fix. The reasons for this responsibility gap need to be understood better. They may include differences in culture between C coders and data scientists; the greater costs and delays in the build-test-deploy cycle for large ML models; and the relative lack of press interest in attacks on ML systems. If many of our critical systems start to include ML components that are less maintainable, will the ML end up being the weakest link?

Morello chip on board

Formal CHERI: rigorous engineering and design-time proof of full-scale architecture security properties

Memory safety bugs continue to be a major source of security vulnerabilities, with their root causes ingrained in the industry:

  • the C and C++ systems programming languages that do not enforce memory protection, and the huge legacy codebase written in them that we depend on;
  • the legacy design choices of hardware that provides only coarse-grain protection mechanisms, based on virtual memory; and
  • test-and-debug development methods, in which only a tiny fraction of all possible execution paths can be checked, leaving ample unexplored corners for exploitable bugs.

Over the last twelve years, the CHERI project has been working on addressing the first two of these problems by extending conventional hardware Instruction-Set Architectures (ISAs) with new architectural features to enable fine-grained memory protection and highly scalable software compartmentalisation, prototyped first as CHERI-MIPS and CHERI-RISC-V architecture designs and FPGA implementations, with an extensive software stack ported to run above them.

The academic experimental results are very promising, but achieving widespread adoption of CHERI needs an industry-scale evaluation of a high-performance silicon processor implementation and software stack. To that end, Arm have developed Morello, a CHERI-enabled prototype architecture (extending Armv8.2-A), processor (adapting the high-performance Neoverse N1 design), system-on-chip (SoC), and development board, within the UKRI Digital Security by Design (DSbD) Programme (see our earlier blog post on Morello). Morello is now being evaluated in a range of academic and industry projects.

Morello desktopMorello chip on board

However, how do we ensure that such a new architecture actually provides the security guarantees it aims to provide? This is crucial: any security flaw in the architecture will be present in any conforming hardware implementation, quite likely impossible to fix or work around after deployment.

In this blog post, we describe how we used rigorous engineering methods to provide high assurance of key security properties of CHERI architectures, with machine-checked mathematical proof, as well as to complement and support traditional design and development workflows, e.g. by automatically generating test suites. This is addressing the third problem, showing that, by judicious use of rigorous semantics at design time, we can do much better than test-and-debug development.
Continue reading Formal CHERI: rigorous engineering and design-time proof of full-scale architecture security properties

Text mining is harder than you think

Following last year’s row about Apple’s proposal to scan all the photos on your iPhone camera roll, EU Commissioner Johansson proposed a child sex abuse regulation that would compel providers of end-to-end encrypted messaging services to scan all messages in the client, and not just for historical abuse images but for new abuse images and for text messages containing evidence of grooming.

Now that journalists are distracted by the imminent downfall of our great leader, the Home Office seems to think this is a good time to propose some amendments to the Online Safety Bill that will have a similar effect. And while the EU planned to win the argument against the pedophiles first and then expand the scope to terrorist radicalisation and recruitment too, Priti Patel goes for the terrorists from day one. There’s some press coverage in the Guardian and the BBC.

We explained last year why client-side scanning is a bad idea. However, the shift of focus from historical abuse images to text scanning makes the government story even less plausible.

Detecting online wickedness from text messages alone is hard. Since 2016, we have collected over 99m messages from cybercrime forums and over 49m from extremist forums, and these corpora are used by 179 licensees in 55 groups from 42 universities in 18 countries worldwide. Detecting hate speech is a good proxy for terrorist radicalisation. In 2018, we thought we could detect hate speech with a precision of typically 92%, which would mean a false-alarm rate of 8%. But the more complex models of 2022, based on Google’s BERT, when tested on the better collections we have now, don’t do significantly better; indeed, now that we understand the problem in more detail, they often do worse. Do read that paper if you want to understand why hate-speech detection is an interesting scientific problem. With some specific kinds of hate speech it’s even harder; an example is anti-semitism, thanks to the large number of synonyms for Jewish people. So if we were to scan 10bn messages a day in Europe there would be maybe a billion false alarms for Europol to look at.

We’ve been scanning the Internet for wickedness for over fifteen years now, and looking at various kinds of filters for everything from spam to malware. Filtering requires very low false positive rates to be feasible at Internet scale, which means either looking for very specific things (such as indicators of compromise by a specific piece of malware) or by having rich metadata (such as a big spam run from some IP address space you know to be compromised). Whatever filtering Facebook can do on Messenger given its rich social context, there will be much less that a WhatsApp client can do by scanning each text on its way through.

So if you really wish to believe that either the EU’s CSA Regulation or the UK’s Online Harms Bill is an honest attempt to protect kids or catch terrorists, good luck.

Screenshot of the archived version of the Action Fraud website linked to from the NCA contact us page.

Reporting cybercrime is hard: NCA link to Action Fraud broken for 3 years

Screenshot of the archived version of the Action Fraud website linked to from the NCA contact us page.
Archived version of Action Fraud website

Yesterday I was asked for advice on anonymously reporting a new crypto scam that a potential victim had spotted before they lost money (hint: to a first approximation all cryptocurrencies and cryptoassets are a scam). In the end they got fed up with the difficulty of finding someone they could tell and gave up. However, to give the advice I thought I would check what the National Crime Agency’s National Cyber Crime Unit suggested so I searched “NCA NCCU report scam” and the first result was for the NCA’s Contact us page. Sounds good. It has a “Fraud” section which (as expected) talks about Action Fraud. However, since 2019 this page has linked to the National Archives archive of an old version of the Action Fraud website. So for three years if you followed the NCA’s website’s advice on how to report fraud you would have got very confused until you worked out you were on a (clearly labelled) archive rather than the proper website, which is why none of the forms work.

I reported this problem yesterday and I do not expect it to have been fixed by the time of writing but this problem going unresolved for three years is a clear example of the difficulties faced by victims of cybercrime.

2019 is also the year that Police Scotland declined to pay for Action Fraud as they did not consider it to provide value for money and instead handle fraud reporting internally.

I am PI of a jointly supervised between the University of Strathclyde and the University of Edinburgh PhD project funded by the Scottish Institute for Policing Research and the University of Strathclyde on Improving Cybercrime Reporting. Do get in touch with other stories of the difficulties of reporting cybercrime. The student, Juraj Sikra has published a systematic literature review on Improving Cybercrime Reporting in Scotland. It is clear that there is a long way to go to provide person centred cybercrime reporting for victims and potential victims. However, UK law enforcement in general, and Police Scotland in particular know there is a problem and do want to fix it.

European Commission prefers breaking privacy to protecting kids

Today, May 11, EU Commissioner Ylva Johannson announced a new law to combat online child sex abuse. This has an overt purpose, and a covert purpose.

The overt purpose is to pressure tech companies to take down illegal material, and material that might possibly be illegal, more quickly. A new agency is to be set up in the Hague, modeled on and linked to Europol, to maintain an official database of illegal child sex-abuse images. National authorities will report abuse to this new agency, which will then require hosting providers and others to take suspect material down. The new law goes into great detail about the design of the takedown process, the forms to be used, and the redress that content providers will have if innocuous material is taken down by mistake. There are similar provisions for blocking URLs; censorship orders can be issued to ISPs in Member States.

The first problem is that this approach does not work. In our 2016 paper, Taking Down Websites to Prevent Crime, we analysed the takedown industry and found that private firms are much better at taking down websites than the police. We found that the specialist contractors who take down phishing websites for banks would typically take six hours to remove an offending website, while the Internet Watch Foundation – which has a legal monopoly on taking down child-abuse material in the UK – would often take six weeks.

We have a reasonably good understanding of why this is the case. Taking down websites means interacting with a great variety of registrars and hosting companies worldwide, and they have different ways of working. One firm expects an encrypted email; another wants you to open a ticket; yet another needs you to phone their call centre during Peking business hours and speak Mandarin. The specialist contractors have figured all this out, and have got good at it. However, police forces want to use their own forms, and expect everyone to follow police procedure. Once you’re outside your jurisdiction, this doesn’t work. Police forces also focus on process more than outcome; they have difficulty hiring and retaining staff to do detailed technical clerical work; and they’re not much good at dealing with foreigners.

Our takedown work was funded by the Home Office, and we recommended that they run a randomised controlled trial where they order a subset of UK police forces to use specialist contractors to take down criminal websites. We’re still waiting, six years later. And there’s nothing in UK law that would stop them running such a trial, or that would stop a Chief Constable outsourcing the work.

So it’s really stupid for the European Commission to mandate centralised takedown by a police agency for the whole of Europe. This will be make everything really hard to fix once they find out that it doesn’t work, and it becomes obvious that child abuse websites stay up longer, causing real harm.

Oh, and the covert purpose? That is to enable the new agency to undermine end-to-end encryption by mandating client-side scanning. This is not evident on the face of the bill but is evident in the impact assessment, which praises Apple’s 2021 proposal. Colleagues and I already wrote about that in detail, so I will not repeat the arguments here. I will merely note that Europol coordinates the exploitation of communications systems by law enforcement agencies, and the Dutch National High-Tech Crime Unit has developed world-class skills at exploiting mobile phones and chat services. The most recent case of continent-wide bulk interception was EncroChat; although reporting restrictions prevent me telling the story of that, there have been multiple similar cases in recent years.

So there we have it: an attack on cryptography, designed to circumvent EU laws against bulk surveillance by using a populist appeal to child protection, appears likely to harm children instead.

Hiring for iCrime

A Research Assistant/Associate position is available at the Department of Computer Science and Technology to work on the ERC-funded Interdisciplinary Cybercrime Project (iCrime). We are looking to appoint a computer scientist to join an interdisciplinary team reporting to Dr Alice Hutchings.

iCrime incorporates expertise from criminology and computer science to research cybercrime offenders, their crime type, the place (such as online black markets), and the response. Within iCrime, we sustain robust data collection infrastructure to gather unique, high quality datasets, and design novel methodologies to identify and measure criminal infrastructure at scale. This is particularly important as cybercrime changes dynamically. Overall, our approach is evaluative, critical, and data driven.

Successful applicants will work in a team to collect and analyse data, develop tools, and write research outputs. Desirable technical skills include:

– Familiarity with automated data collection (web crawling and scraping) and techniques to sustain the complex data collection in adversarial environments at scale.
– Excellent software engineering skills, being familiar with Python, Bash scripting, and web development, particularly NodeJS and ReactJS.
– Experience in DevOps to integrate and migrate new tools within the existing ecosystem, and to automate data collection/transmission/backup pipelines.
– Working knowledge of Linux/Unix.
– Familiarity with large-scale databases, including relational databases and ElasticSearch.
– Practical knowledge of security and privacy to keep existing systems secure and protect against data leakage.
– Expertise in cybercrime research and data science/analysis is desirable, but not essential.

Please read the formal advertisement (at https://www.jobs.cam.ac.uk/job/34324/) for the details about exactly who and what we’re looking for and how to apply — and please pay special attention to our request for a covering letter!