What you get is what you C

We have a new paper on compiler security appearing this morning at EuroS&P.

Until now, we writers of crypto and security software have had to fight not only the bad guys but also compiler writers, who every so often dream up some new optimisation routine that spots the padding instructions we put in to make our crypto algorithms run in constant time, or the tricks we use to ensure that sensitive data will be zeroised when a function returns. All of a sudden some critical code is optimised away, your code is insecure, and you scramble to figure out how to outwit the compiler once more.
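
To make the zeroisation problem concrete, here is a minimal sketch in C. The names `secure_wipe`, `checksum_naive` and `checksum_wiped` are hypothetical and are not taken from the paper. In the first function the final `memset` writes to a buffer that is never read again, so an optimiser is entitled to treat it as a dead store and delete it. The second writes the zeros through a `volatile` pointer, a common workaround that compilers tend to honour in practice, though it is exactly the kind of trick that relies on the compiler's goodwill.

```c
#include <stddef.h>
#include <string.h>

#define KEY_LEN 32

/* Hypothetical helper: zeroes a buffer through a volatile pointer so that
   each store is treated as an access the compiler should not drop. */
void secure_wipe(void *buf, size_t len)
{
    volatile unsigned char *p = (volatile unsigned char *)buf;
    while (len--) {
        *p++ = 0;
    }
}

/* Sums up to KEY_LEN bytes of the secret via a temporary copy.
   The closing memset is a dead store and may be removed when optimising. */
unsigned checksum_naive(const unsigned char *secret, size_t len)
{
    unsigned char key[KEY_LEN] = {0};
    unsigned sum = 0;

    if (len > KEY_LEN) {
        len = KEY_LEN;
    }
    memcpy(key, secret, len);
    for (size_t i = 0; i < len; i++) {
        sum += key[i];
    }
    memset(key, 0, sizeof key); /* key is never read again after this */
    return sum;
}

/* Same computation, but the temporary copy is wiped with secure_wipe. */
unsigned checksum_wiped(const unsigned char *secret, size_t len)
{
    unsigned char key[KEY_LEN] = {0};
    unsigned sum = 0;

    if (len > KEY_LEN) {
        len = KEY_LEN;
    }
    memcpy(key, secret, len);
    for (size_t i = 0; i < len; i++) {
        sum += key[i];
    }
    secure_wipe(key, sizeof key);
    return sum;
}
```

Both functions return the same result; the difference is only in what may be left behind on the stack, which is invisible to the C abstract machine and therefore fair game for the optimiser.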

So while you’re fighting the enemy in front, the compiler writer is a subversive fifth column in your rear.

It’s time that our toolsmiths were our allies rather than our enemies. We have therefore worked out what’s needed for a software writer to tell a compiler that a loop really must be executed in constant time, or that a variable really must be set to zero when a function returns. Languages like C have no way of expressing programmer intent, so we do this by means of code annotations.
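
As an illustration of the kind of loop at stake, here is a sketch in plain C. It does not use the annotation syntax from the paper, and `ct_compare` is a hypothetical name. A constant-time comparison accumulates the differences between two buffers rather than returning at the first mismatch, so that the running time does not reveal where the buffers diverge.

```c
#include <stddef.h>

/* Returns 0 if the two buffers are equal and a nonzero value otherwise.
   Every byte is examined regardless of where the first mismatch occurs. */
int ct_compare(const unsigned char *a, const unsigned char *b, size_t len)
{
    unsigned char diff = 0;

    for (size_t i = 0; i < len; i++) {
        diff |= (unsigned char)(a[i] ^ b[i]);
    }
    return diff;
}
```

Nothing in the C language obliges a compiler to keep this shape: an early exit would produce the same return value, so the intent has to be stated by some other means.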

Doing it properly turns out to be surprisingly tricky, but we now have a working proof of concept in the form of plugins for LLVM. For more details, and links to the code, see the web page of Laurent Simon, the lead author; the talk slides are here. This is the first technical contribution in our research programme on sustainable security.

One thought on “What you get is what you C”

  1. Here’s a very handy GCC trick, which I believe also works in LLVM:

    asm("" : "+Xm" (variable));

    This is a convenient, architecture-independent optimization barrier.

    It tells the compiler that it must explicitly construct the value (no induction variable trickery!), place it in storage somewhere (it may choose between “X”, meaning any register whatsoever including non-general ones, and “m” meaning a memory location), and then assume that the asm() has modified it in some unknown manner, so it must use the constructed value.

    I know the empty asm() does nothing to the value, but the compiler cannot assume that; it must discard all its assumptions.

    I’ve found this very useful for constraining overeager optimizers.
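
A small sketch of this kind of barrier in use, assuming GCC or Clang, since GNU extended asm is not ISO C. The macro name `HIDE_VALUE` and the function `sum_to` are hypothetical. The sketch swaps in the narrower `"+r"` constraint (general-purpose register only) in place of `"+Xm"`, because that is the form most widely used for integer values; the idea is otherwise the same. Without the barrier a compiler could replace the loop with a closed-form expression; with it, the running total has to be materialised on every iteration.

```c
/* Assumption: GCC or Clang. The empty asm emits no instructions, but the
   compiler must assume it may have changed v, so it cannot reason across it. */
#define HIDE_VALUE(v) __asm__("" : "+r"(v))

/* Sums 1..n with an optimization barrier on the running total. */
unsigned sum_to(unsigned n)
{
    unsigned total = 0;

    for (unsigned i = 0; i < n; i++) {
        total += i + 1;
        HIDE_VALUE(total); /* total is treated as unknown after this point */
    }
    return total;
}
```

The barrier changes nothing about the result, only about what the optimiser is allowed to assume, so `sum_to(10)` is still 55.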
