Category Archives: Open-source security

Our Christmas message for troublemakers: how to do anonymity in the real world

On the 5th of December I gave a talk at a journalists’ conference on what tradecraft means in the post-Snowden world. How can a journalist, or for that matter an MP or an academic, protect a whistleblower from being identified even when MI5 and GCHQ start trying to figure out who in Whitehall they’ve been talking to? The video of my talk is now online here. There is also a TV interview I did later, which can be found here, while the other conference talks are here.

Enjoy!

Ross

Design and Implementation of the FreeBSD Operating System, Second Edition now shipping

Kirk McKusick, George Neville-Neil, and I are pleased to announce that The Design and Implementation of the FreeBSD Operating System, Second Edition is now available from Pearson Education (Amazon link for non-US folk). Light Blue Touchpaper readers might be particularly interested in the new chapter on FreeBSD’s kernel security features including:

  • Process Credentials
  • Users and Groups
  • Privilege Model
  • Interprocess Access Control
  • Discretionary Access Control
  • Capsicum Capability Model
  • Jails
  • Mandatory Access-Control Framework
  • Security Event Auditing
  • Cryptographic Services
  • GELI Full-Disk Encryption

There is detailed coverage of the FreeBSD TCB, POSIX.1e and NFSv4 ACLs, OS sandboxing features, the Mandatory Access Control Framework used not just in FreeBSD but also in Junos, Mac OS X, and iOS, the FreeBSD kernel’s Yarrow-based pseudo-random number generator, both confidentiality and integrity cryptographic protection for filesystems, and the kernel’s IPsec implementation. Other new content in this edition of the book includes ZFS, paravirtualised device drivers, DTrace, NFSv4, network-stack virtualisation, and much more.

We will be using this book as one of the core texts for our new masters-level operating-system course at Cambridge, L41, in spring 2015.

The CHERI capability model: Revisiting RISC in an age of risk (ISCA 2014)

Last week, Jonathan Woodruff presented our joint paper on the CHERI memory model, The CHERI capability model: Revisiting RISC in an age of risk, at the 2014 International Symposium on Computer Architecture (ISCA) in Minneapolis (video, slides). This is our first full paper on Capability Hardware Enhanced RISC Instructions (CHERI): collaborative work between Simon Moore’s team and my own, composed of members of the Security, Computer Architecture, and Systems Research Groups at the University of Cambridge Computer Laboratory, Peter G. Neumann’s group at the Computer Science Laboratory at SRI International, and Ben Laurie at Google.

CHERI is an instruction-set extension, prototyped via an FPGA-based soft processor core named BERI, that integrates a capability-system model with a conventional memory-management unit (MMU)-based pipeline. Unlike conventional OS-facing MMU-based protection, the CHERI protection and security models are aimed at compilers and applications. CHERI provides efficient, robust, compiler-driven, hardware-supported, and fine-grained memory protection and software compartmentalisation (sandboxing) within, rather than between, address spaces. We run a version of FreeBSD that has been adapted to support the hardware capability model (CheriBSD), compiled with a CHERI-aware Clang/LLVM that supports C pointer integrity, bounds checking, and capability-based protection and delegation. CheriBSD also supports a higher-level hardware-software security model permitting sandboxing of application components within an address space, based on capabilities and a call/return mechanism supporting mutual distrust.

The approach draws inspiration from Capsicum, our OS-facing hybrid capability-system model now shipping in FreeBSD and available as a patch for Linux courtesy of Google. We found that capability-system approaches match extremely well with least-privilege-oriented software compartmentalisation, in which programs are broken up into sandboxed components to mitigate the effects of exploited vulnerabilities. CHERI similarly merges research capability-system ideas with a conventional RISC processor design, making accessible the security and robustness benefits of the former while retaining software compatibility with the latter. In the paper, we contrast our approach with a number of others, including Intel’s forthcoming Memory Protection Extensions (MPX). We pursue a RISC-oriented design instantiated against the 64-bit MIPS ISA, but the ideas should be portable to other RISC ISAs such as ARMv8 and RISC-V.

Our hardware prototype is implemented in Bluespec System Verilog, a high-level hardware description language (HDL) that makes it easier to perform design-space exploration. To facilitate reproducibility of this work, as well as future hardware-software research, we’ve open-sourced the underlying Bluespec Extensible RISC Implementation (BERI), our CHERI extensions, and a complete software stack: operating system, compiler, and so on. In fact, support for the underlying 64-bit RISC platform, which implements a version of the 64-bit MIPS ISA, was upstreamed to FreeBSD 10.0, which shipped earlier this year. Our capability-enhanced versions of FreeBSD (CheriBSD) and Clang/LLVM are distributed via GitHub.

You can learn more about CHERI, BERI, and our larger clean-slate hardware-software agenda on the CTSRD Project Website. There, you will find copies of our prior workshop papers, full Bluespec source code for the FPGA processor design, hardware build instructions for our FPGA-based tablet, downloadable CheriBSD images, software source code, and also our recent technical report, Capability Hardware Enhanced RISC Instructions: CHERI Instruction-Set Architecture, and Jon Woodruff’s PhD dissertation on CHERI.

Jonathan Woodruff, Robert N. M. Watson, David Chisnall, Simon W. Moore, Jonathan Anderson, Brooks Davis, Ben Laurie, Peter G. Neumann, Robert Norton, and Michael Roe. The CHERI capability model: Revisiting RISC in an age of risk, Proceedings of the 41st International Symposium on Computer Architecture (ISCA 2014), Minneapolis, MN, USA, June 14–16, 2014.

Heartbleed and RSA private keys

UPDATE:

Matthew Arcus pointed out that, in glibc’s malloc implementation, a freed buffer will always have its first 16 bytes (on x86_64) corrupted by two free-list pointers. This means the leak mechanism I describe below only explains the partial leak I exploited, not the complete prime-number leaks others have seen. In fact, the uncorrupted numbers come from the long-lived copy of p and q in the Montgomery context.

——–

By now everyone knows about the OpenSSL Heartbleed vulnerability: a missing bounds check in one of the most popular TLS implementations has made millions of web servers (and others) leak all sorts of sensitive information from memory. This could expose login credentials, authentication cookies and web traffic to attackers. But could it be used to recover the site’s TLS private key? That would enable complete decryption of previously recorded traffic where perfect forward secrecy was not negotiated, and man-in-the-middle attacks on all future TLS sessions otherwise.

Since this would be a much more serious consequence of Heartbleed, I decided to investigate. The results were positive: I was able to extract private keys from a test Nginx server after a few days’ work. Later I applied my techniques to solve the CloudFlare Challenge. A few other security researchers and I thereby independently demonstrated that RSA private keys are indeed at risk. In this blog post, I’ll provide some detail on how to extract the private key and why the attack is possible.

How to extract the private key

Readers not familiar with RSA can read about it here. To simplify things a bit, a large (2048-bit) number N is constructed by multiplying together two large, randomly generated prime numbers p and q. N is made public while p and q are kept secret; finding p or q allows recovery of the private key. A generic attack is simply to factorise N, but this is believed to be difficult. With a vulnerability like Heartbleed, however, the attack is much simpler: because the web server needs the private key in memory to sign the TLS handshake, p and q must live in memory, and we can try to obtain them with Heartbleed packets. The problem then becomes how to identify them in the returned data. This is easy: we know that p and q are 1024 bits (128 bytes) long, and that OpenSSL represents big numbers little-endian in memory, so a brute-force approach of treating every 128 consecutive bytes in the Heartbleed packets as a little-endian number and testing whether it divides N is sufficient to spot potential leaks. This is how most people solved the CloudFlare challenge.
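
For illustration, here is a minimal Python sketch of that brute-force scan; the dump file format and the variable names are my own assumptions, not taken from any published attack code:

    import sys

    # Treat every 128 consecutive bytes of a leaked memory dump as a
    # little-endian integer and test whether it divides the public modulus.
    def find_factors(dump: bytes, n: int, size: int = 128):
        for i in range(len(dump) - size + 1):
            cand = int.from_bytes(dump[i:i + size], "little")
            if 1 < cand < n and n % cand == 0:
                yield i, cand  # window offset, and a copy of p or q

    if __name__ == "__main__":
        n = int(sys.argv[2], 16)  # modulus from the server certificate, in hex
        with open(sys.argv[1], "rb") as f:
            for off, prime in find_factors(f.read(), n):
                print(f"factor found at offset {off:#x}")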

Coppersmith improvement

But wait: isn’t our scenario just like a cold-boot attack? There has been a lot of research on recovering RSA private keys from partial information. One of the most famous papers, by Coppersmith, presents message-recovery attacks on related messages or insufficient padding, as well as factorisation with partial knowledge, using lattice basis reduction. With the Coppersmith attack, N can be efficiently factorised if either the top or the bottom half of the bits of p is known. Put in context: we only need the top or bottom 64 bytes of p to compute the private key, compared to the naive brute force which requires all 128 bytes. In practice, reaching Coppersmith’s limit is computationally expensive (although still much better than factorisation), but if we assume 77 bytes (60%) are known, we can comb the Heartbleed packets for potential private-key segments very quickly.

In retrospect, there were 242 instances of private-key remnants suitable for the Coppersmith attack among the 10,000 packets (64 KB each) that I had collected. Implementing the Coppersmith attack was easy thanks to the comprehensive computer-algebra building blocks in Sage (although I later found out that Sage already has Coppersmith implemented).
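
For the curious, a Sage sketch of the “high bits known” case might look as follows. The split of 77 known top bytes and 51 unknown low bytes matches the threshold above, but the variable names and parameter choices are my own assumptions:

    # Sage: factor N given the top 77 of p's 128 bytes, via the built-in
    # small_roots (Howgrave-Graham's "factoring with high bits known").
    # N: public modulus; top_bytes: leaked top 77 bytes of p, as an integer.
    R.<x> = PolynomialRing(Zmod(N))
    f = top_bytes * 2^(8*51) + x       # p = known high bits plus unknown low part
    roots = f.small_roots(X=2^(8*51), beta=0.5, epsilon=0.02)
    p = gcd(int(top_bytes * 2^(8*51) + roots[0]), N)
    assert N % p == 0                  # p is a non-trivial factor of N

Here beta=0.5 tells small_roots that the sought divisor is around N^0.5, and the 408 unknown bits fall safely inside the resulting N^(0.25-epsilon) bound for a 2048-bit N.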

Can we do even better? If you have ever viewed an RSA private key using openssl rsa -text -in server.key, you will have noticed that there are quite a few numbers other than the two prime factors p and q. In fact, they are precomputed values for RSA’s Chinese Remainder Theorem (CRT) optimisation, and if some of them are leaked they can also be used to deduce p. And what about the Montgomery representations of p and q that OpenSSL uses for fast multiplication? They also admit a variant of Coppersmith, so partial bits are useful as well. With this in mind, I set out to search for them in the packets collected from my test server, where these values are all known. But not a single one of them appears, even partially (> 16 bytes), in the dataset. How is this possible?
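
As an aside, here is a short Python sketch (the names are mine) of why a leaked CRT exponent dp = d mod (p-1) would be just as damaging as p itself, given only the public key (N, e):

    from math import gcd

    # For almost any m: m^(e*dp) = m (mod p), because e*dp = 1 (mod p-1),
    # but generally not (mod q), so the gcd below picks out the factor p.
    def factor_from_dp(n: int, e: int, dp: int, m: int = 2) -> int:
        p = gcd(pow(m, e * dp, n) - m, n)
        assert 1 < p < n and n % p == 0
        return p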

Note: all my experiments and the CloudFlare challenge target Nginx, which is single-threaded. It is possible that more leakage can be observed with a multi-threaded web server.

Why p and only p leaks

When Heartbleed first came out, people argued that RSA private keys should not leak: they are loaded when the web server starts, so they occupy low memory addresses, and since the heap grows upwards, a later-allocated buffer leaked by Heartbleed shouldn’t reach them. This argument is consistent with my inability to find any CRT precomputed values, but with one exception: p is definitely leaked somehow. If we assume the argument is correct, the question becomes: why is p leaked?

To add more to the mystery, OpenSSL apparently scrubs all the temporary BigNums it uses! To reduce the overhead of dynamically allocating temporary values, OpenSSL provides BN_CTX, a pool of BigNums operating in stack (LIFO) fashion. Upon finishing, the context is destroyed and all allocated buffers are scrubbed. This means that by the time the Heartbleed packet is constructed, there shouldn’t be any temporary values left in memory (again assuming a single thread), because the BN_CTX has long since been released.

I won’t bother you with all the pain I went through to identify the cause, so here is the spoiler: when a BigNum is extended to a bigger buffer, its original buffer is not zeroised before being freed. The control-flow path that leads to the leak of p is even more subtle. During the initial TLS handshake, the server key exchange is signed with the private key. The CRT signing performs a modulo p operation, which causes p<<BN_BITS2 to be stored in a temporary variable allocated from the BN_CTX pool. Later, in the CRT fault-injection check, that temporary variable is reused (remember that BN_CTX operates like a stack) as val[0]. An interesting fact is that a reallocated temporary variable only gets its lowest nibble zeroised, and in the case of p<<BN_BITS2 nothing is destroyed. val[0] immediately receives a Montgomery-reduced value, but since the original buffer cannot accommodate the new value, it gets extended and p is released to free heap space, waiting to be grabbed. And because this happens on every TLS handshake, it can be spilled everywhere.
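
The essential bug pattern is easy to model. Here is a toy Python sketch (purely illustrative, not OpenSSL code) of a pool that scrubs buffers on release but leaks their old contents whenever a value outgrows its buffer:

    # Freed chunks linger on the heap; this list stands in for free memory.
    heap: list[bytes] = []

    class ToyBN:
        def __init__(self, data: bytes):
            self.buf = bytearray(data)

        def expand(self, new_size: int):
            heap.append(bytes(self.buf))   # old buffer freed WITHOUT zeroising
            self.buf = bytearray(new_size)

    secret = b"p << BN_BITS2 ........."
    tmp = ToyBN(secret)          # temporary from the pool, holding the shifted p
    tmp.expand(2 * len(secret))  # reused as val[0], it outgrows its buffer
    assert secret in heap[0]     # the prime now sits in reusable heap memory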

Since it is difficult to determine statically which BigNums may be extended and leaked, I instrumented OpenSSL and experimented a bit. It turned out that a shifted version of the Montgomery representation of p is also freed at the leaky point, but that only happens during the first RSA exponentiation, when the Montgomery context is initialised, so it lives at a low memory address and I was unable to locate it in any of the collected packets.

The OpenSSL team has been notified of the above leak. To be fair, this is not exactly a security bug by itself, as OpenSSL was never explicitly designed to prevent sensitive data from leaking to the heap.

I’d like to thank Joseph Bonneau for the useful comments and proofreading.

Research Assistants and Associates in OS, Compiler and CPU Security

We are pleased to announce a job ad for two new research assistants or post-doctoral research associates working on our CTSRD Project, whose target research areas include OS, compiler, and CPU security. This is a joint project between the University of Cambridge’s Security, NetOS, and Computer Architecture research groups and the Computer Science Laboratory at SRI International.

Research Assistants and Associates in OS, Compiler and CPU Security
Fixed-term: The funds for this post are available for 18 months in the first instance.

We are seeking multiple Research Assistants and Post-Doctoral Research Associates to join the CTSRD Project, which is investigating fundamental improvements to CPU-architecture, operating-system (OS), program-analysis, and programming-language structure in support of computer security. The CTSRD Project is a collaboration between the University of Cambridge and SRI International, and part of the DARPA CRASH research programme on clean-slate computer system design for security. More information may be found at:

These positions will form an integral part of an international team of researchers spanning multiple institutions in academia and industry. Successful candidates will contribute to the larger research effort by performing system-software, compiler, and hardware implementation and experimentation, developing and evaluating novel hypotheses about refinements to the vertical hardware-software stack. Possible areas of responsibility include: modifying OS kernels (e.g., FreeBSD); adapting compiler suites (e.g., Clang/LLVM); extending an open-source Bluespec-based research-processor design (CHERI); supporting an early-adopter user community for open-source hardware and software; and improving the quality and performance of hardware-software prototypes. Successful candidates must be willing to travel in the UK and abroad to engage with downstream user communities.

Crypto festival video

We had a crypto festival in London in November, at which a number of cryptographers and crypto-policy folks got together with over 1,000 mostly young attendees to talk about what might be done in response to the Snowden revelations.

Here is a video of the session in which I spoke. The first speaker was Annie Machon (at 02.35) talking of her experience of life on the run from MI5, and on what we might do to protect journalists’ sources in the future. I’m at 23.55 talking about what’s changed for governments, corporates, researchers and others. Nick Pickles of Big Brother Watch follows at 45.45 talking on what can be done in terms of practical politics; it turned out that only two of us in the auditorium had met our MPs over the Comms Data Bill. The final speaker, Smari McCarthy, comes on at 56.45, calling for lots more encryption. The audience discussion starts at 1:12:00.

2013 Capsicum year in review

It’s been a busy year for Capsicum, practical capabilities for UNIX, so a year-end update seemed in order:

The FreeBSD Foundation and Google jointly funded a Capsicum Integration Project that took place throughout 2013 — described by Foundation project technical director Ed Maste in a recent blog article. Pawel Jakub Dawidek refined several Capsicum APIs, improving support for ioctls and increasing the number of supported capability rights for FreeBSD 10. He also developed Casper, a helper daemon that provides services (such as DNS, access to random numbers) to sandboxes — and can, itself, sandbox services. Casper is now in the FreeBSD 11.x development branch, enabled by default, and should appear in FreeBSD 10.1. The Google Open Source Program Office (OSPO) blog also carried a September 2013 article on their support for open-source security, featuring Capsicum.

Capsicum is enabled by default in the forthcoming FreeBSD 10.0 release: capability mode, capabilities, and process descriptors are available in the out-of-the-box GENERIC kernel. A number of system services use Capsicum to sandbox themselves, such as the DHCP client, the high-availability storage daemon, and the audit log distribution daemon, as do command-line tools like kdump and tcpdump that handle risky data. Even more will appear in FreeBSD 10.1 next year, now that Casper is available.

David Drysdale at Google announced Capsicum for Linux, an adaptation of Linux to provide Capsicum’s capability mode and capabilities, in November 2013. David and Ben Laurie visited us in Cambridge multiple times this year to discuss the design and implementation, review newer Capsicum APIs, and talk about future directions. They hope to upstream this work to the Linux community. Joris Giovannangeli also announced an adaptation of Capsicum to DragonFlyBSD in October 2013.

Over the summer, Mariusz Zaborski and Daniel Peryolon were funded by Google Summer of Code to work on a variety of new Capsicum features and services, adapting core UNIX components and third-party applications to support sandboxing. For example, Mariusz looked at sandboxing BSD grep: if a vulnerability arises in grep’s regular-expression matching, why should processing a file of malicious origin yield full rights to your UNIX account?
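
To give a flavour of the programming model, here is a minimal sketch of the open-then-sandbox pattern, written in Python with ctypes for brevity. It assumes a FreeBSD system, where cap_enter(2) is exported by libc; a real grep adaptation would use the C API, including cap_rights_limit(2), directly:

    import ctypes, os, sys

    libc = ctypes.CDLL(None, use_errno=True)  # on FreeBSD, cap_enter(2) is in libc

    # Acquire everything needed from the global namespace up front...
    fd = os.open(sys.argv[1], os.O_RDONLY)

    # ...then enter capability mode: system calls that name global resources
    # (open, socket, ...) now fail with ECAPMODE, so a compromised matcher
    # can only use the descriptors it already holds.
    if libc.cap_enter() != 0:
        raise OSError(ctypes.get_errno(), "cap_enter failed")

    data = os.read(fd, 1 << 20)  # the risky parsing happens inside the sandbox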

In May 2013, our colleagues at the University of Wisconsin-Madison, led by Bill Harris, published a paper at the IEEE Symposium on Security and Privacy (“Oakland”) on “Declarative, Temporal, and Practical Programming with Capabilities”: how to model program behaviour, and automatically transform some classes of applications to use Capsicum sandboxing. We were very pleased to lend a hand with this work, and feel that the art of programming for compartmentalisation is a key research challenge. We also collaborated with folk at SRI and Google on a workshop paper developing our ideas about application compartmentalisation, which appeared at the Security Protocols Workshop here in Cambridge in March 2013.

Google and the FreeBSD Foundation are committed to further work on Capsicum and its integration with applications, and research continues at several institutions, including here at Cambridge, on how to apply Capsicum. We hope to kick off a new batch of application adaptation in the coming months, as well as integration with features such as DNSSEC. However, we also need your help in adapting applications to use Capsicum on systems that support it!

Google funding of open-source security projects

I was pleased to contribute to a recent blog article by Ben Laurie, a frequent collaborator with the Cambridge security group, on the Google Open Source Programs Office blog. We describe open-source security work OSPO has sponsored over the last couple of years, including our joint work on Capsicum and its follow-up projects funded jointly by Google and the FreeBSD Foundation. He also talks about Google support for Certificate Transparency, OpenSSL, Tor, and Libpurple: projects focussed not just on communications security, but also on communications privacy on the Internet.

Capsicum

Over the last decade or so, it has become increasingly (and painfully) apparent that ACLs and MAC, which were originally designed to protect expensive mainframes from their users, and the users from each other, are failing to secure modern cheap machines with single users who need protecting from the software they run.

Instead, we need fine-grained access control and strong sandboxing.