Category Archives: Security engineering

Bad security, good security, case studies, lessons learned

ML models must also think about trusting trust

Our latest paper demonstrates how a Trojan or backdoor can be inserted into a machine-learning model by the compiler. In his Turing Award lecture, Ken Thompson explained how this could be done to an operating system, and in previous work we’d shown that you can subvert a model by manipulating the order in which training data are presented. Could these ideas be combined?

The answer is yes. The trick is for the compiler to recognise what sort of model it’s compiling – whether it’s processing images or text, for example – and then to devise trigger mechanisms for such models that are sufficiently covert and general. The takeaway message is that for a machine-learning model to be trustworthy, you need to assure the provenance of the whole chain: the model itself, the software tools used to compile it, the training data, the order in which the data are batched and presented – in short, everything.

The Online Safety Bill: Reboot it, or Shoot it?

Yesterday I took part in a panel discussion organised by the Adam Smith Institute on the Online Safety Bill. This sprawling legislative monster has outlasted not just six Secretaries of State for Culture, Media and Sport, but two Prime Ministers. It’s due to slither back to Parliament in November, so we wrote a Policy Brief that explains what it tries to do and some of the things it gets wrong.

Some of the bill’s many proposals command wide support – for example, that online services should enable users to contact them effectively to report illegal material, which should be removed quickly. At present, only copyright owners and the police seem to be able to get the attention of the major platforms; ordinary people, including young people, should also be able to report unlawful things and have them taken down quickly. Here, the UK government intends to bind only large platforms like Facebook and Twitter. We propose extending the duty to gaming platforms too. Kids just aren’t on Facebook any more.

The Bill also tries to reignite the crypto wars by empowering Ofcom to require services to use “accredited technology” (read: software written by GCHQ contractors) to scan your WhatsApp messages. The idea that you can catch violent criminals such as child abusers and terrorists by bulk text scanning is entirely implausible; the error rates are so high that the police would be swamped with false positives. Quite apart from that, bulk intercept has always been illegal in Britain, and would also contravene the European Convention on Human Rights, to which we are still a signatory despite Brexit. This power to mandate client-side scanning has to be scrapped, a move that quite a few MPs already support.

But what should we do instead about illegal images of minors, and about violent online political extremism? More local policing would be better; we explain why. This is informed by our work on the link between violent extremism and misogyny, as well as our analysis of a similar proposal in the EU. So it is welcome that the government is hiring more police officers. What’s needed now is a greater focus on family violence, which is the root cause of most child abuse, rather than using child abuse as an excuse to increase the central agencies’ surveillance powers and budgets.

In our Policy Brief, we also discuss content moderation, and suggest that it be guided by the principle of minimising cruelty. One of the other panelists, Graham Smith, discussed the legal difficulties of regulating speech and made a strong case that restrictions (such as copyright, libel, incitement and harassment) should be set out in primary legislation rather than farmed out to private firms, as at present, or to a regulator, as the Bill proposes. Given that most of the bad stuff is illegal already, why not make a start by enforcing the laws we already have, as they do in Germany? British policing efforts online range from the pathetic to the outrageous. It looks like Parliament will have some interesting decisions to take when the bill comes back.

Talking Trojan: Analyzing an Industry-Wide Disclosure

Talking Trojan: Analyzing an Industry-Wide Disclosure tells the story of what happened after we discovered the Trojan Source vulnerability, which broke almost all computer languages, and the Bad Characters vulnerability, which broke almost all large NLP tools. This provided a unique opportunity to measure software maintenance in action. Who patched quickly, reluctantly, or not at all? Who paid bug bounties, and who dodged liability? What parts of the disclosure ecosystem work well, which are limping along, and which are broken?

Security papers typically describe a vulnerability but say little about how it was disclosed and patched. And while disclosing one vulnerability to a single vendor can be hard enough, modern supply chains multiply the number of affected parties, leading to an exponential increase in the complexity of the disclosure. One vendor will want an in-house web form, another will use an outsourced bug bounty platform, still others will prefer emails, and *nix OS maintainers will use a very particular PGP mailing list. Governments sort-of want to assist with disclosures but prefer to use yet another platform. Many open-source projects lack an embargoed disclosure process, but it is often in the interest of commercial operating system maintainers to write embargoed patches – if you can get hold of the right people.

A vulnerability that affected many different products at the same time and in similar ways gave us a unique chance to observe the finite-impulse response of this whole complex system. Our observations reveal a number of weaknesses, such as a potentially dangerous misalignment of incentives between commercially sponsored bug bounty programs and multi-vendor coordinated disclosure platforms. We suggest tangible changes that could strengthen coordinated disclosure globally.

We also hope to inspire other researchers to publish the mechanics of individual disclosures, so that we can continue to measure and improve the critical ecosystem on which we rely as our main defense against growing supply chain threats. In the meantime, our paper can be found here, and will appear in SCORED ’22 this November.

The Dynamics of Industry-wide Disclosure

Last year, we disclosed two related vulnerabilities that broke a wide range of systems. In our Bad Characters paper, we showed how to use Unicode tricks – such as homoglyphs and bidi characters – to mislead NLP systems. Our Trojan Source paper showed how similar tricks could be used to make source code look one way to a human reviewer, and another way to a compiler, opening up a wide range of supply-chain attacks on critical software. Prior to publication, we disclosed our findings to four suppliers of large NLP systems, and nineteen suppliers of software development tools. So how did industry respond?
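To give a flavour of the kind of check a code-review tool could run against such tricks, here is a minimal sketch that flags bidirectional control characters and non-ASCII letters in a line of source. The function name `find_suspicious` and the sample line are ours, chosen for illustration; this is not the detection logic from either paper, and a real tool would need a proper confusables table rather than a blanket non-ASCII rule.

```python
import unicodedata

# Unicode bidirectional control characters: embeddings, overrides, isolates.
BIDI_CONTROLS = {
    "\u202a", "\u202b", "\u202c", "\u202d", "\u202e",
    "\u2066", "\u2067", "\u2068", "\u2069",
}

def find_suspicious(line):
    """Return (index, codepoint, reason) for characters worth a second look."""
    findings = []
    for i, ch in enumerate(line):
        if ch in BIDI_CONTROLS:
            findings.append((i, f"U+{ord(ch):04X}", "bidi control"))
        elif ord(ch) > 127 and unicodedata.category(ch).startswith("L"):
            # Crude homoglyph heuristic: any letter outside ASCII.
            findings.append((i, f"U+{ord(ch):04X}", "non-ASCII letter"))
    return findings

# Hypothetical line: a Cyrillic 'a' (U+0430) posing as a Latin 'a',
# followed by a right-to-left override hidden in a comment.
sample = "p\u0430ssword = 1  # \u202e check"
for finding in find_suspicious(sample):
    print(finding)
```

The blanket non-ASCII rule would also flag legitimate identifiers and comments in other scripts, so in practice it is only a prompt for a human to look more closely.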

We were invited to give the keynote talk this year at LangSec, and the video is now available. In it we describe not just the Bad Characters and Trojan Source vulnerabilities, but the large natural experiment created by their disclosure. The Trojan Source vulnerability affected most compilers, interpreters, code editors and code repositories; this enabled us to compare responses by firms versus nonprofits and by firms that managed their own response versus those who outsourced it. The interaction between bug bounty programs, government disclosure assistance, peer review and press coverage was interesting. Most of the affected development teams took action, though some required a bit of prodding.

The response by the NLP maintainers was much less enthusiastic. By the time we gave this talk, only Google had done anything – though we now hear that Microsoft is also working on a fix. The reasons for this responsibility gap need to be understood better. They may include differences in culture between C coders and data scientists; the greater costs and delays in the build-test-deploy cycle for large ML models; and the relative lack of press interest in attacks on ML systems. If many of our critical systems start to include ML components that are less maintainable, will the ML end up being the weakest link?


Formal CHERI: rigorous engineering and design-time proof of full-scale architecture security properties

Memory safety bugs continue to be a major source of security vulnerabilities, with their root causes ingrained in the industry:

  • the C and C++ systems programming languages that do not enforce memory protection, and the huge legacy codebase written in them that we depend on;
  • the legacy design choices of hardware that provides only coarse-grain protection mechanisms, based on virtual memory; and
  • test-and-debug development methods, in which only a tiny fraction of all possible execution paths can be checked, leaving ample unexplored corners for exploitable bugs.

Over the last twelve years, the CHERI project has been working on addressing the first two of these problems by extending conventional hardware Instruction-Set Architectures (ISAs) with new architectural features to enable fine-grained memory protection and highly scalable software compartmentalisation, prototyped first as CHERI-MIPS and CHERI-RISC-V architecture designs and FPGA implementations, with an extensive software stack ported to run above them.

The academic experimental results are very promising, but achieving widespread adoption of CHERI needs an industry-scale evaluation of a high-performance silicon processor implementation and software stack. To that end, Arm have developed Morello, a CHERI-enabled prototype architecture (extending Armv8.2-A), processor (adapting the high-performance Neoverse N1 design), system-on-chip (SoC), and development board, within the UKRI Digital Security by Design (DSbD) Programme (see our earlier blog post on Morello). Morello is now being evaluated in a range of academic and industry projects.

[Images: Morello desktop; Morello chip on board]

However, how do we ensure that such a new architecture actually provides the security guarantees it aims to provide? This is crucial: any security flaw in the architecture will be present in any conforming hardware implementation, quite likely impossible to fix or work around after deployment.

In this blog post, we describe how we used rigorous engineering methods to provide high assurance of key security properties of CHERI architectures, with machine-checked mathematical proof, as well as to complement and support traditional design and development workflows, e.g. by automatically generating test suites. This is addressing the third problem, showing that, by judicious use of rigorous semantics at design time, we can do much better than test-and-debug development.

Text mining is harder than you think

Following last year’s row about Apple’s proposal to scan all the photos on your iPhone camera roll, EU Commissioner Johansson proposed a child sex abuse regulation that would compel providers of end-to-end encrypted messaging services to scan all messages in the client, and not just for historical abuse images but for new abuse images and for text messages containing evidence of grooming.

Now that journalists are distracted by the imminent downfall of our great leader, the Home Office seems to think this is a good time to propose some amendments to the Online Safety Bill that will have a similar effect. And while the EU planned to win the argument against the pedophiles first and then expand the scope to terrorist radicalisation and recruitment too, Priti Patel goes for the terrorists from day one. There’s some press coverage in the Guardian and the BBC.

We explained last year why client-side scanning is a bad idea. However, the shift of focus from historical abuse images to text scanning makes the government story even less plausible.

Detecting online wickedness from text messages alone is hard. Since 2016, we have collected over 99m messages from cybercrime forums and over 49m from extremist forums, and these corpora are used by 179 licensees in 55 groups from 42 universities in 18 countries worldwide. Detecting hate speech is a good proxy for terrorist radicalisation. In 2018, we thought we could detect hate speech with a precision of typically 92%, which would mean a false-alarm rate of 8%. But the more complex models of 2022, based on Google’s BERT, when tested on the better collections we have now, don’t do significantly better; indeed, now that we understand the problem in more detail, they often do worse. Do read that paper if you want to understand why hate-speech detection is an interesting scientific problem. With some specific kinds of hate speech it’s even harder; an example is anti-semitism, thanks to the large number of synonyms for Jewish people. So if we were to scan 10bn messages a day in Europe there would be maybe a billion false alarms for Europol to look at.
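The back-of-envelope arithmetic is easy to reproduce. The sketch below follows the text in treating the 8% figure as the fraction of scanned messages that are wrongly flagged; the two lower rates are hypothetical values we have added for comparison, not measurements.

```python
MESSAGES_PER_DAY = 10_000_000_000  # 10bn messages a day, as in the text

# Fraction of scanned messages wrongly flagged. 0.08 is the figure from the
# text; 0.01 and 0.001 are hypothetical, for comparison only.
rates = [0.08, 0.01, 0.001]

def false_alarms_per_day(messages, rate):
    """Expected number of wrongly flagged messages per day."""
    return round(messages * rate)

for rate in rates:
    alarms = false_alarms_per_day(MESSAGES_PER_DAY, rate)
    print(f"{rate:.1%} of messages flagged in error: {alarms:,} a day")
```

At 8% this gives 800 million false alarms a day, which is the "maybe a billion" above; even the hypothetical 0.1% rate would still leave ten million a day.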

We’ve been scanning the Internet for wickedness for over fifteen years now, and looking at various kinds of filters for everything from spam to malware. Filtering requires very low false positive rates to be feasible at Internet scale, which means either looking for very specific things (such as indicators of compromise by a specific piece of malware) or having rich metadata (such as a big spam run from some IP address space you know to be compromised). Whatever filtering Facebook can do on Messenger given its rich social context, there will be much less that a WhatsApp client can do by scanning each text on its way through.

So if you really wish to believe that either the EU’s CSA Regulation or the UK’s Online Safety Bill is an honest attempt to protect kids or catch terrorists, good luck.

Arm releases experimental CHERI-enabled Morello board as part of £187M UKRI Digital Security by Design programme

Professor Robert N. M. Watson (Cambridge), Professor Simon W. Moore (Cambridge), Professor Peter Sewell (Cambridge), Dr Jonathan Woodruff (Cambridge), Brooks Davis (SRI), and Dr Peter G. Neumann (SRI)

After over a decade of research creating the CHERI protection model, hardware, software, and formal models and proofs, developed over three DARPA research programmes, we are at a truly exciting moment. Today, Arm announced first availability of its experimental CHERI-enabled Morello processor, System-on-Chip, and development board – an industrial quality and industrial scale demonstrator of CHERI merged into a high-performance processor design. Not only does Morello fully incorporate the features described in our CHERI ISAv8 specification to provide fine-grained memory protection and scalable software compartmentalisation, but it also implements an Instruction-Set Architecture (ISA) with formally verified security properties. The Arm Morello Program is supported by the £187M UKRI Digital Security by Design (DSbD) research programme, a UK government and industry-funded effort to transition CHERI towards mainstream use.


Security engineering course

This week sees the start of a course on security engineering that Sam Ainsworth and I are teaching. It’s based on the third edition of my Security Engineering book, and is a first cut at a ‘film of the book’.

Each week we will put two lectures online, and here are the first two. Lecture 1 discusses our adversaries, from nation states through cyber-crooks to personal abuse, and the vulnerability life cycle that underlies the ecosystem of attacks. Lecture 2 abstracts this empirical experience into more formal threat models and security policies.

Although our course is designed for masters students and fourth-year undergrads in Edinburgh, we’re making the lectures available to everyone. I’ll link the rest of the videos in followups here, and eventually on the book’s web page.

Electhical 2021

Electhical is an industry forum whose focus is achieving a low total footprint for electronics. It is being held on Friday December 10th at Churchill College, Cambridge. The speakers are from government, industry and academia; they include executives and experts on technology policy, consumer electronics, manufacturing, security and privacy. It’s sponsored by ARM, IEEE, IEEE CAS and Churchill College; registration is free.

Rollercoaster: Communicating Efficiently and Anonymously in Large Groups

End-to-end (E2E) encryption is now widely deployed in messaging apps such as WhatsApp and Signal, and billions of people around the world have the contents of their messages protected against strong adversaries. However, while the message contents are encrypted, their metadata still leaks sensitive information. For example, it is easy for an infrastructure provider to tell which customers are communicating, with whom and when.

Anonymous communication hides this metadata. This is crucial for the protection of individuals such as whistleblowers who expose criminal wrongdoing, activists organising a protest, or embassies coordinating a response to a diplomatic incident. All these face powerful adversaries for whom the communication metadata alone (without knowing the specific message text) can result in harm for the individuals concerned.

Tor is a popular tool that achieves anonymous communication by forwarding messages through multiple intermediate nodes or relays. At each relay the outermost layer of the message is decrypted and the inner message is forwarded to the next relay. An adversary who wants to figure out where A’s messages are finally delivered can attempt to follow a message as it passes through each relay. Alternatively, an adversary might confirm a suspicion that user A talks to user B by observing traffic patterns at A’s and B’s access points to the network instead. If indeed A and B are talking to each other, there will be a correlation between their traffic patterns. For instance, if an adversary observes that A sends three messages and three messages arrive at B shortly afterwards, this provides some evidence that A talks to B. The adversary can increase their certainty by collecting traffic over a longer period of time.
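The correlation attack described above can be sketched in a few lines. The example counts messages per time window at each access point and compares A's sending pattern with what arrives at B and at an unrelated user C one window later. All of the traffic counts are made up for illustration, and a real adversary would use far longer observations and more robust statistics.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

def lagged_correlation(sent, received, lag=1):
    """Correlate what was sent with what arrived `lag` windows later (lag >= 1)."""
    return pearson(sent[:-lag], received[lag:])

# Hypothetical message counts per time window at each access point.
sent_by_a     = [3, 0, 0, 2, 0, 5, 0, 1]
received_by_b = [0, 3, 0, 0, 2, 0, 5, 0]  # mirrors A, one window later
received_by_c = [1, 1, 2, 0, 1, 1, 0, 2]  # unrelated user

print("A -> B:", lagged_correlation(sent_by_a, received_by_b))
print("A -> C:", lagged_correlation(sent_by_a, received_by_c))
```

A score near 1 for the A and B pair, against a low or negative score for A and C, is the kind of evidence that accumulates as the adversary collects traffic over a longer period.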

Mix networks such as Loopix use a different design, which defends against such traffic analysis attacks by using (i) traffic shaping and (ii) more intermediate nodes, so-called mix nodes. In a simple mix network, each client only sends packets of a fixed length and at predefined intervals (e.g. 1 KiB every 5 seconds). When there is no payload to send, a cover packet is crafted that is indistinguishable to the adversary from a payload packet. If there is more than one payload packet to be sent, packets are queued and sent one by one on the predefined schedule. This traffic shaping ensures that an observer cannot gain any information from observing outgoing network packets. Moreover, mix nodes typically delay each incoming message by a random amount of time before forwarding it (with the delay chosen independently for each message), making it harder for an adversary to correlate a mix node’s incoming and outgoing messages, since they are likely to be reordered. In contrast, Tor relays forward messages as soon as possible in order to minimise latency.
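The client-side traffic shaping can be illustrated with a small simulation. This is a simplified sketch, not Loopix's actual client: the function `shaped_output` is a name we made up, encryption is omitted (in a real system it is what makes cover and payload packets indistinguishable), and payloads are assumed to fit in a single packet.

```python
from collections import deque

PACKET_SIZE = 1024  # bytes; 1 KiB as in the example above

def shaped_output(payload_arrivals, ticks):
    """Simulate a client that emits exactly one fixed-size packet per tick.

    payload_arrivals maps a tick number to the payloads queued at that tick.
    Returns (tick, kind, packet_length) for every packet sent.
    """
    queue = deque()
    sent = []
    for tick in range(ticks):
        queue.extend(payload_arrivals.get(tick, []))
        if queue:
            body, kind = queue.popleft(), "payload"
        else:
            body, kind = b"", "cover"  # nothing to say: send cover traffic
        packet = body.ljust(PACKET_SIZE, b"\x00")  # pad to the fixed length
        sent.append((tick, kind, len(packet)))
    return sent

# Two payloads arrive together at tick 1; they leave on ticks 1 and 2.
for record in shaped_output({1: [b"hi", b"there"]}, ticks=5):
    print(record)
```

An observer sees one 1024-byte packet on every tick regardless of whether the user is active; the `kind` label exists only inside the simulation.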

Mix networks work well for pairwise communication, but we found that group communication creates a unique challenge. Such group communication encompasses both traditional chat groups (e.g. WhatsApp groups or IRC) and collaborative editing (e.g. Google Docs, calendar sync, todo lists) where updates need to be disseminated to all other participants who are viewing or editing the content. There are many scenarios where anonymity requirements meet group communication, such as coordination between activists, diplomatic correspondence between embassies, and organisation of political campaigns.

The traffic shaping of mix networks makes efficient group communication difficult. The limited rate of outgoing messages means that sequentially sending a message to each group member can take a long time. For instance, assuming that the outgoing rate is 1 message every 5 seconds, it will take more than 8 minutes to send the message to all members in a group of size 100. During this process the sender’s output queue is blocked and they cannot send any other messages.
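The figure above is straightforward to check. The helper below is a hypothetical name of ours and assumes one message leaves per sending slot, with the sender not messaging itself.

```python
def sequential_send_seconds(group_size, seconds_per_message):
    """Time to push one message to every other member, one slot at a time."""
    recipients = group_size - 1  # the sender does not message itself
    return recipients * seconds_per_message

seconds = sequential_send_seconds(group_size=100, seconds_per_message=5)
print(seconds, "seconds, or", seconds / 60, "minutes")
```

For a group of 100 at one message every 5 seconds this is 495 seconds, a little over 8 minutes.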

In our paper we propose a scheme named Rollercoaster that greatly improves the latency for group communication in mix networks. The basic idea is that group members who have already received a message can help distribute it to other members of the group. Like a chain reaction, the distribution of the message gains momentum as the number of recipients grows. In an ideal execution of this scheme, the number of users who have received a message doubles with every round, leading to substantially more efficient message delivery across the group.
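The doubling idea can be sketched as a simple schedule in which every member who already holds the message forwards it to one new member per round. This is a generic binary-doubling schedule written for illustration, with a function name of our own; it is not the exact schedule construction from the paper, and it ignores mix-node delays, offline members and the fault tolerance discussed below.

```python
def doubling_schedule(members):
    """Return a list of rounds; each round is a list of (sender, recipient).

    members[0] is the original sender. In each round, every member who
    already has the message forwards it to one member who does not.
    """
    rounds = []
    informed = 1  # only the original sender has the message at first
    while informed < len(members):
        this_round = []
        for i in range(informed):
            j = i + informed
            if j < len(members):
                this_round.append((members[i], members[j]))
        rounds.append(this_round)
        informed = min(len(members), informed * 2)
    return rounds

for number, sends in enumerate(doubling_schedule(list("ABCDEFGH")), start=1):
    print("round", number, sends)

print("rounds for a group of 100:", len(doubling_schedule(list(range(100)))))
```

In this idealised model a group of 100 is covered in 7 rounds rather than 99 sequential sends, and no single member sends more than 7 messages.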

Rollercoaster works well because there is typically plenty of spare capacity in the network. At any given time most clients will not be actively communicating and they are therefore mostly sending cover traffic. As a result, Rollercoaster actually improves the efficiency of the network and reduces the rate of cover traffic, which in turn reduces the overall required network bandwidth. At the same time, Rollercoaster does not require any changes to the existing mix network protocol and can benefit from the existing user base and anonymity set.

The basic idea requires more careful consideration in a realistic environment where clients are offline or do not behave faithfully. A fault-tolerant version of our Rollercoaster scheme addresses these concerns by waiting for acknowledgement messages from recipients. If those acknowledgement messages are not received by the sender in a fixed period of time, forwarding roles are reassigned and another delivery attempt is made via a new route. We also show how a single number can seed the generation of a deterministic forwarding schedule. This allows efficient communication of different forwarding schedules and balances individual workloads within the group.
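To illustrate how a single number can stand in for a whole schedule, the sketch below derives a forwarding order from a seed, so that every member who knows the group and the seed computes the same order. The function `forwarding_order` and the member names are hypothetical, and the use of Python's `random.Random` is our assumption for the sake of a short example; the paper's construction may differ, and a real deployment would need a precisely specified generator so that all clients agree.

```python
import random

def forwarding_order(sender, members, seed):
    """Derive a deterministic ordering of the group from a single seed.

    The sender stays first; the remaining members are put in a canonical
    (sorted) order and then shuffled with a generator seeded by `seed`.
    """
    others = sorted(m for m in members if m != sender)
    random.Random(seed).shuffle(others)
    return [sender] + others

group = ["alice", "bob", "carol", "dave", "erin"]

# Any member can rebuild the same order from the seed alone, whatever
# order their own copy of the member list happens to be in.
print(forwarding_order("alice", group, seed=7))
print(forwarding_order("alice", list(reversed(group)), seed=7))

# A different seed gives another ordering, which shifts who forwards most.
print(forwarding_order("alice", group, seed=8))
```

Only the seed needs to travel with the message, and varying it from message to message spreads the forwarding work across the group.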

We presented our paper at USENIX Security ’21 (paper, slides, and recording). It contains more extensions and optimisations than we can summarise here. There is also an extended version available as a tech report with more detailed security arguments in the appendices. The paper reference is:
Daniel Hugenroth, Martin Kleppmann, and Alastair R. Beresford. Rollercoaster: An Efficient Group-Multicast Scheme for Mix Networks. Proceedings of the 30th USENIX Security Symposium (USENIX Security), 2021.