July | 2022 | Light Blue Touchpaper

Memory safety bugs continue to be a major source of security vulnerabilities, with their root causes ingrained in the industry:

the C and C++ systems programming languages that do not enforce memory protection, and the huge legacy codebase written in them that we depend on;
the legacy design choices of hardware that provides only coarse-grain protection mechanisms, based on virtual memory; and
test-and-debug development methods, in which only a tiny fraction of all possible execution paths can be checked, leaving ample unexplored corners for exploitable bugs.

Over the last twelve years, the CHERI project has been working on addressing the first two of these problems by extending conventional hardware Instruction-Set Architectures (ISAs) with new architectural features to enable fine-grained memory protection and highly scalable software compartmentalisation, prototyped first as CHERI-MIPS and CHERI-RISC-V architecture designs and FPGA implementations, with an extensive software stack ported to run above them.

The academic experimental results are very promising, but achieving widespread adoption of CHERI needs an industry-scale evaluation of a high-performance silicon processor implementation and software stack. To that end, Arm have developed Morello, a CHERI-enabled prototype architecture (extending Armv8.2-A), processor (adapting the high-performance Neoverse N1 design), system-on-chip (SoC), and development board, within the UKRI Digital Security by Design (DSbD) Programme (see our earlier blog post on Morello). Morello is now being evaluated in a range of academic and industry projects.

However, how do we ensure that such a new architecture actually provides the security guarantees it aims to provide? This is crucial: any security flaw in the architecture will be present in any conforming hardware implementation, quite likely impossible to fix or work around after deployment.

In this blog post, we describe how we used rigorous engineering methods to provide high assurance of key security properties of CHERI architectures, with machine-checked mathematical proof, as well as to complement and support traditional design and development workflows, e.g. by automatically generating test suites. This is addressing the third problem, showing that, by judicious use of rigorous semantics at design time, we can do much better than test-and-debug development.
Continue reading Formal CHERI: rigorous engineering and design-time proof of full-scale architecture security properties →

Following last year’s row about Apple’s proposal to scan all the photos on your iPhone camera roll, EU Commissioner Johansson proposed a child sex abuse regulation that would compel providers of end-to-end encrypted messaging services to scan all messages in the client, and not just for historical abuse images but for new abuse images and for text messages containing evidence of grooming.

Now that journalists are distracted by the imminent downfall of our great leader, the Home Office seems to think this is a good time to propose some amendments to the Online Safety Bill that will have a similar effect. And while the EU planned to win the argument against the pedophiles first and then expand the scope to terrorist radicalisation and recruitment too, Priti Patel goes for the terrorists from day one. There’s some press coverage in the Guardian and the BBC.

We explained last year why client-side scanning is a bad idea. However, the shift of focus from historical abuse images to text scanning makes the government story even less plausible.

Detecting online wickedness from text messages alone is hard. Since 2016, we have collected over 99m messages from cybercrime forums and over 49m from extremist forums, and these corpora are used by 179 licensees in 55 groups from 42 universities in 18 countries worldwide. Detecting hate speech is a good proxy for terrorist radicalisation. In 2018, we thought we could detect hate speech with a precision of typically 92%, which would mean a false-alarm rate of 8%. But the more complex models of 2022, based on Google’s BERT, when tested on the better collections we have now, don’t do significantly better; indeed, now that we understand the problem in more detail, they often do worse. Do read that paper if you want to understand why hate-speech detection is an interesting scientific problem. With some specific kinds of hate speech it’s even harder; an example is anti-semitism, thanks to the large number of synonyms for Jewish people. So if we were to scan 10bn messages a day in Europe there would be maybe a billion false alarms for Europol to look at.

We’ve been scanning the Internet for wickedness for over fifteen years now, and looking at various kinds of filters for everything from spam to malware. Filtering requires very low false positive rates to be feasible at Internet scale, which means either looking for very specific things (such as indicators of compromise by a specific piece of malware) or by having rich metadata (such as a big spam run from some IP address space you know to be compromised). Whatever filtering Facebook can do on Messenger given its rich social context, there will be much less that a WhatsApp client can do by scanning each text on its way through.

So if you really wish to believe that either the EU’s CSA Regulation or the UK’s Online Harms Bill is an honest attempt to protect kids or catch terrorists, good luck.

M	T	W	T	F	S	S
				1	2	3
4	5	6	7	8	9	10
11	12	13	14	15	16	17
18	19	20	21	22	23	24
25	26	27	28	29	30	31

Light Blue Touchpaper

Security Research, Computer Laboratory, University of Cambridge

Monthly Archives: July 2022

Formal CHERI: rigorous engineering and design-time proof of full-scale architecture security properties

Text mining is harder than you think