Three Paper Thursday: Adversarial Machine Learning, Humans and everything in between

Recent advancements in Machine Learning (ML) have taught us two main lessons: a large proportion of things that humans do can actually be automated, and that a substantial part of this automation can be done with minimal human supervision. One no longer needs to select features for models to use; in many cases people are moving away from selecting the models themselves and perform a Network Architecture Search. This means non-stop search across billions of dimensions, ever improving different properties of deep neural networks (DNNs).

However, progress in automation has brought a spectre to the feast. Automated systems seem to be very vulnerable to adversarial attacks. Not only is this vulnerability hard to get rid of; worse, we often can’t even define what it means to be vulnerable in the first place.

Furthermore, finding adversarial attacks on ML systems is really easy even if you do not have any access to the models. There are only so many things that make cat a cat, and all the different models that deal with cats will be looking at the same set of features. This has an important implication: learning how to trick one model dealing with cats often transfers over to other models. Transferability is a terrible property for security because it makes adversarial ML attacks cheap and scalable. If there is a camera in the bank running a similar ML model to the camera you can get in Costco for $5, then the cost of developing an attack is $5.

As of now, we do not really have good answers to any of these questions. In the meantime, ML controlled systems are entering the human realm.

In this Three Paper Thursday I want to talk about works from the field of adversarial ML that make it much more understandable.

Continue reading Three Paper Thursday: Adversarial Machine Learning, Humans and everything in between

Contact Tracing in the Real World

There have recently been several proposals for pseudonymous contact tracing, including from Apple and Google. To both cryptographers and privacy advocates, this might seem the obvious way to protect public health and privacy at the same time. Meanwhile other cryptographers have been pointing out some of the flaws.

There are also real systems being built by governments. Singapore has already deployed and open-sourced one that uses contact tracing based on bluetooth beacons. Most of the academic and tech industry proposals follow this strategy, as the “obvious” way to tell who’s been within a few metres of you and for how long. The UK’s National Health Service is working on one too, and I’m one of a group of people being consulted on the privacy and security.

But contact tracing in the real world is not quite as many of the academic and industry proposals assume.

First, it isn’t anonymous. Covid-19 is a notifiable disease so a doctor who diagnoses you must inform the public health authorities, and if they have the bandwidth they call you and ask who you’ve been in contact with. They then call your contacts in turn. It’s not about consent or anonymity, so much as being persuasive and having a good bedside manner.

I’m relaxed about doing all this under emergency public-health powers, since this will make it harder for intrusive systems to persist after the pandemic than if they have some privacy theater that can be used to argue that the whizzy new medi-panopticon is legal enough to be kept running.

Second, contact tracers have access to all sorts of other data such as public transport ticketing and credit-card records. This is how a contact tracer in Singapore is able to phone you and tell you that the taxi driver who took you yesterday from Orchard Road to Raffles has reported sick, so please put on a mask right now and go straight home. This must be controlled; Taiwan lets public-health staff access such material in emergencies only.

Third, you can’t wait for diagnoses. In the UK, you only get a test if you’re a VIP or if you get admitted to hospital. Even so the results take 1–3 days to come back. While the VIPs share their status on twitter or facebook, the other diagnosed patients are often too sick to operate their phones.

Fourth, the public health authorities need geographical data for purposes other than contact tracing – such as to tell the army where to build more field hospitals, and to plan shipments of scarce personal protective equipment. There are already apps that do symptom tracking but more would be better. So the UK app will ask for the first three characters of your postcode, which is about enough to locate which hospital you’d end up in.

Fifth, although the cryptographers – and now Google and Apple – are discussing more anonymous variants of the Singapore app, that’s not the problem. Anyone who’s worked on abuse will instantly realise that a voluntary app operated by anonymous actors is wide open to trolling. The performance art people will tie a phone to a dog and let it run around the park; the Russians will use the app to run service-denial attacks and spread panic; and little Johnny will self-report symptoms to get the whole school sent home.

Sixth, there’s the human aspect. On Friday, when I was coming back from walking the dogs, I stopped to chat for ten minutes to a neighbour. She stood halfway between her gate and her front door, so we were about 3 metres apart, and the wind was blowing from the side. The risk that either of us would infect the other was negligible. If we’d been carrying bluetooth apps, we’d have been flagged as mutual contacts. It would be quite intolerable for the government to prohibit such social interactions, or to deploy technology that would punish them via false alarms. And how will things work with an orderly supermarket queue, where law-abiding people stand patiently six feet apart?

Bluetooth also goes through plasterboard. If undergraduates return to Cambridge in October, I assume there will still be small-group teaching, but with protocols for distancing, self-isolation and quarantine. A supervisor might sit in a teaching room with two or three students, all more than 2m apart and maybe wearing masks, and the window open. The bluetooth app will flag up not just the others in the room but people in the next room too.

How is this to be dealt with? I expect the app developers will have to fit a user interface saying “You’re within range of device 38a5f01e20. Within infection range (y/n)?” But what happens when people get an avalanche of false alarms? They learn to click them away. A better design might be to invite people to add a nickname and a photo so that contacts could see who they are. “You are near to Ross [photo] and have been for five minutes. Are you maintaining physical distance?”

When I discussed this with a family member, the immediate reaction was that she’d refuse to run an anonymous app that might suddenly say “someone you’ve been near in the past four days has reported symptoms, so you must now self-isolate for 14 days.” A call from a public health officer is one thing, but not knowing who it was would just creep her out. It’s important to get the reactions of real people, not just geeks and wonks! And the experience of South Korea and Taiwan suggests that transparency is the key to public acceptance.

Seventh, on the systems front, decentralised systems are all very nice in theory but are a complete pain in practice as they’re too hard to update. We’re still using Internet infrastructure from 30 years ago (BGP, DNS, SMTP…) because it’s just too hard to change. Watch Moxie Marlinspike’s talk at 36C3 if you don’t get this. Relying on cryptography tends to make things even more complex, fragile and hard to change. In the pandemic, the public health folks may have to tweak all sorts of parameters weekly or even daily. You can’t do that with apps on 169 different types of phone and with peer-to-peer communications.

Personally I feel conflicted. I recognise the overwhelming force of the public-health arguments for a centralised system, but I also have 25 years’ experience of the NHS being incompetent at developing systems and repeatedly breaking their privacy promises when they do manage to collect some data of value to somebody else. The Google Deepmind scandal was just the latest of many and by no means the worst. This is why I’m really uneasy about collecting lots of lightly-anonymised data in a system that becomes integrated into a whole-of-government response to the pandemic. We might never get rid of it.

But the real killer is likely to be the interaction between privacy and economics. If the app’s voluntary, nobody has an incentive to use it, except tinkerers and people who religiously comply with whatever the government asks. If uptake remains at 10-15%, as in Singapore, it won’t be much use and we’ll need to hire more contact tracers instead. Apps that involve compulsion, such as those for quarantine geofencing, will face a more adversarial threat model; and the same will be true in spades for any electronic immunity certificate. There the incentive to cheat will be extreme, and we might be better off with paper serology test certificates, like the yellow fever vaccination certificates you needed for the tropics, back in the good old days when you could actually go there.

All that said, I suspect the tracing apps are really just do-something-itis. Most countries now seem past the point where contact tracing is a high priority; even Singapore has had to go into lockdown. If it becomes a priority during the second wave, we will need a lot more contact tracers: last week, 999 calls in Cambridge had a 40-minute wait and it took ambulances six hours to arrive. We cannot field an app that will cause more worried well people to phone 999.

The real trade-off between surveillance and public health is this. For years, a pandemic has been at the top of Britain’s risk register, yet far less was spent preparing for one than on anti-terrorist measures, many of which were ostentatious rather than effective. Worse, the rhetoric of terror puffed up the security agencies at the expense of public health, predisposing the US and UK governments to disregard the lesson of SARS in 2003 and MERS in 2015 — unlike the governments of China, Singapore, Taiwan and South Korea, who paid at least some attention. What we need is a radical redistribution of resources from the surveillance-industrial complex to public health.

Our effort should go into expanding testing, making ventilators, retraining everyone with a clinical background from vet nurses to physiotherapists to use them, and building field hospitals. We must call out bullshit when we see it, and must not give policymakers the false hope that techno-magic might let them avoid the hard decisions. Otherwise we can serve best by keeping out of the way. The response should not be driven by cryptographers but by epidemiologists, and we should learn what we can from the countries that have managed best so far, such as South Korea and Taiwan.

Three Paper Thursday: The role of intermediaries, platforms, and infrastructures in governing crime and abuse

The platforms, providers, and infrastructures which together make up the contemporary Internet play an increasingly central role in the business of governing human societies. Although the software engineers, administrators, business professionals, and other staff working at these organisations may not have the institutional powers of state organisations such as law enforcement or the civil service, they are now in a powerful position of responsibility for the harms and illegal activities which their platforms facilitate. For this Three Paper Thursday, I’ve chosen to highlight papers which address these issues, and which explore the complex networks of different infrastructural actors and perspectives which play a role in the reporting, handling, and defining of abuse and crime online.

Continue reading Three Paper Thursday: The role of intermediaries, platforms, and infrastructures in governing crime and abuse

Three Paper Thursday: Sanitisers and Mitigators

In this reboot of the Three Paper Thursdays, back after a hiatus of almost eight years, I consider the many different ways in which programs can be sanitised to detect, or mitigated to prevent the use of, the many programmer errors that can introduce security vulerabilities in low-level languages such as C and C++. We first look at a new binary translation technique, before covering the many compiler techniques in the literature, and finally finishing off with my own hardware analysis architecture.

Continue reading Three Paper Thursday: Sanitisers and Mitigators

Identifying Unintended Harms of Cybersecurity Countermeasures

In this paper (winner of the eCrime 2019 Best Paper award), we consider the types of things that can go wrong when you intend to make things better and more secure. Consider this scenario. You are browsing through Internet and see a news headline on one of the presidential candidates. You are unsure if the headline is true. What you can do is to navigate to a fact-checking website and type in the headline of interest. Some platforms also have fact-checking bots that would update periodically on false information. You do some research through three fact-checking websites and the results consistently show that the news contains false information. You share the results as a comment on the news article. Within two hours, you receive hundreds of notifications with comments countering your resources with other fact-checking websites. 

Such a scenario is increasingly common as we rely on the Internet and social media platforms for information and news. Although they are meant to increase security, these cybersecurity countermeasures can result in confusion and frustration among users due to the incorporation of additional actions as part of users’ daily online routines. As seen, fact-checking can easily be used as a mechanism for attacks and demonstration of in-group/out-group distinction which can contribute further to group polarisation and fragmentation. We identify these negative effects as unintended consequences and define it as shifts in expected burden and/or effort to a group. 

To understand unintended harms, we begin with five scenarios of cyber aggression and deception. We identify common countermeasures for each scenario, and brainstorm potential unintended harms with each countermeasure. The unintended harms are inductively organized into seven categories: 1) displacement, 2) insecure norms, 3) additional costs, 4) misuse, 5) misclassification, 6) amplification and 7) disruption. Applying this framework to the above scenario, insecure norms, miuse, and amplification are both unintended consequences of fact-checking. Fact-checking can foster a sense of complacency where checked news are automatically seen as true. In addition, fact-checking can be used as tools for attacking groups of different political views. Such misuse facilitates amplification as fact-checking is being used to strengthen in-group status and therefore further exacerbate the issue of group polarisation and fragmentation. 

To allow for a systematic application to existing or new cybersecurity measures by practitioners and stakeholders, we expand the categories into a functional framework by developing prompts for each harm category. During this process, we identify the underlying need to consider vulnerable groups. In other words, practitioners and stakeholders need to take into consideration the impacts of countermeasures on at-risk groups as well as the possible creation of new vulnerable groups as a result of deploying a countermeasure. Vulnerable groups refer to user groups who may suffer while others are unaffected or prosper from the countermeasure. One example is older adult users where their non-familiarity and less frequent interactions with technologies means that they are forgotten or hidden when assessing risks and/or countermeasures within a system. 

It is important to note the framework does not propose measurements for the severity or the likelihood of unintended harm occurring. Rather, the emphasis of the framework is in raising stakeholders’ and practitioners’ awareness of possible unintended consequences. We envision this framework as a common-ground tool for stakeholders, particularly for coordinating approaches in complex, multi-party services and/or technology ecosystems.  We would like to extend a special thank you to Schloss Dagstuhl and the organisers of Seminar #19302 (Cybersafety Threats – from Deception to Aggression). It brought all of the authors together and laid out the core ideas in this paper. A complimentary blog post by co-author Dr Simon Parkin can be found at UCL’s Benthams Gaze blog. The accepted manuscript for this paper is available here.

From Playing Games to Committing Crimes: A Multi-Technique Approach to Predicting Key Actors on an Online Gaming Forum

I recently travelled to Pittsburgh, USA, to present the paper “From Playing Games to Committing Crimes: A Multi-Technique Approach to Predicting Key Actors on an Online Gaming Forum” at eCrime 2019, co-authored with Ben Collier and Alice Hutchings. The accepted version of the paper can be accessed here.

The structure and content of various underground forums have been studied in the literature, from threat detection to the classification of marketplace advertisements. These platforms can provide a mechanism for knowledge sharing and a marketplace between cybercriminals and other members.

However, gaming-related activity on underground hacking forums have been largely unexplored. Meanwhile, UK law enforcement believe there is a potential link between playing online games and committing cybercrime—a possible cybercrime pathway. A small-scale study by the NCA found that users looking for gaming cheats on these types of forums can lead to interactions with users involved in cybercrime, leading to a possible first offences, followed by escalating levels of offending. Also, there has been interest from UK law enforcement in exploring intervention activity which aim to deter gamers from becoming involved in cybercrime activity.

We begin to explore this by presenting a data processing pipeline framework, used to identify potential key actors on a gaming-specific forum, using predictive and clustering methods on an initial set of key actors. We adapt open-source tools created for use in analysis of an underground hacking forum and apply them to this forum. In addition, we add NLP features, machine learning models, and use group-based trajectory modelling.

From this, we can begin to characterise key actors, both by looking at the distributions of predictions, and from inspecting each of the models used. Social network analysis, built using author-replier relationships, shows key actors and predicted key actors are well connected, and group-based trajectory modelling highlights a much higher proportion of key actors are contained in both a high-frequency super-engager trajectory in the gaming category, and in a high-frequency super-engager posting activity in the general category.

This work provides an initial look into a perceived link between playing online games and committing cybercrime by analysing an underground forum focused on cheats for games.

Honware: A Virtual Honeypot Framework for Capturing CPE and IoT Zero Days

Existing defenses are slow to detect zero day exploits and capture attack traffic targeting inadequately secured Customer Premise Equipment (CPE) and Internet of Things (IoT) devices. This means that attackers have considerable periods of time to find and compromise vulnerable devices before the attack vectors are well understood and mitigation is in place.

About a month ago we presented honware at eCrime 2019, a new honeypot framework that enables the rapid construction of honeypots for a wide range of CPE and IoT devices. The framework automatically processes a standard firmware image (as is commonly provided for updates) and runs the system with a special pre-built Linux kernel without needing custom hardware. It then logs attacker traffic and records which of their actions led to a compromise.

We provide an extensive evaluation and show that our framework is scalable and significantly better than existing emulation strategies in emulating the devices’ firmware applications. We were able to successfully process close to 2000 firmware images across a dozen brands (TP-Link, Netgear, D-Link…) and run them as honeypots. Also, as we use the original firmware images, the honeypots are not susceptible to fingerprinting attacks based on protocol deviations or self-revealing properties.

By simplifying the process of deploying realistic honeypots at Internet scale, honware supports the detection of malware types that often go unnoticed by users and manufactures. We hope that honware will be used at Internet scale by manufacturers setting up honeypots for all of their products and firmware versions or by researchers looking for new types of malware.

The paper is available here.

Security Engineering, and Sustainability

Yesterday I got the audience at the 36th Chaos Computer Congress in Leipzig to vote on the cover art for the third edition of my textbook on Security Engineering: you can see the result here.

It was a privilege to give a talk at 36C3; as the theme was sustainability, I spoke on The Sustainability of Safety, Security and Privacy. This is a topic on which I’ve written and spoken several times in recent years, but we now have some progress to report. The EU has changed the rules to require that if you sell goods with digital components (whether embedded software, associated cloud services or smartphone apps) then these have to be maintained for as long as the customer might reasonably expect.

2020 Caspar Bowden Award

You are invited to submit nominations for the 2020 Caspar Bowden Award for Outstanding Research in Privacy Enhancing Technologies. The Caspar Bowden PET award is presented annually to researchers who have made an outstanding contribution to the theory, design, implementation, or deployment of privacy enhancing technology. It is awarded at the annual Privacy Enhancing Technologies Symposium (PETS), and carries a cash prize as well as a physical award monument.

Any paper by any author written in the area of privacy enhancing technologies is eligible for nomination. However, the paper must have appeared in a refereed journal, conference, or workshop with proceedings published in the period from April 1, 2018 until March 31, 2020.

Note that we do not accept nominations for publications in conference proceedings when the dates of the conference fall outside of the nomination window. For example, a IEEE Symposium on Security and Privacy (“Oakland”) paper made available on IEEE Xplore prior to the March 31 deadline would not be eligible, as the conference happens in May. Please note that PETS is associated with a journal publication, PoPETs, so any PoPETs paper published in an issue appearing before the March 31 deadline is eligible (which typically means only Issue 1 of the current year).

Anyone can nominate a paper by sending an email message to award-chairs20@petsymposium.org containing the following:
. Paper title
. Author(s)
. Author(s) contact information
. Publication venue and full reference
. Link to an available online version of the paper
. A nomination statement of no more than 500 words.

All nominations must be submitted by April 5, 2020. The award committee will select one or two winners among the nominations received. Winners must be present at the PET Symposium in order to receive the Award. This requirement can be waived only at the discretion of the PET advisory board. The complete Award rules including eligibility requirements can be found here.

Caspar Bowden PET Award Chairs (award-chairs20@petsymposium.org)

Simone Fischer-Hübner, Karlstad University
Ross Anderson, University of Cambridge

Caspar Bowden PET Award Committee

Erman Ayday, Bilkent University
Nataliia Bielova, Inria
Sonja Buchegger, KTH
Ian Goldberg, University of Waterloo
Rachel Greenstadt, NYU
Marit Hansen, Unabhängiges Datenschutzzentrum Schleswig Holstein -ULD
Dali Kaafar, CSIRO
Eran Toch, Tel Aviv University
Carmela Troncoso, EPFL
Matthew Wright, Rochester Institute of Technology

More information about the Caspar Bowden PET award (including past winners) is available here.

APWG eCrime 2019

Last week the APWG Symposium on Electronic Crime Research was held at Carnegie Mellon University in Pittsburgh. The Cambridge Cybercrime Centre was very well-represented at the symposium. Of the 12 accepted research papers, five were authored or co-authored by scholars from the Centre. The topics of the research papers addressed a wide range of cybercrime issues, ranging from honeypots to gaming as pathways to cybercrime. One of the papers with a Cambridge author, “Identifying Unintended Harms of Cybersecurity Countermeasures”, received the Best Paper award. The Honorable Mention award went to “Mapping the Underground: Supervised Discovery of Cybercrime Supply Chains”, which was a collaboration between NYU, ICSI and the Centre.

In this post, we will provide a brief description for each paper in this post. The final versions aren’t yet available, we will blog them in more detail as they appear.

Best Paper

Identifying Unintended Harms of Cybersecurity Countermeasures

Yi Ting Chua, Simon Parkin, Matthew Edwards, Daniela Oliveira, Stefan Schiffner, Gareth Tyson, and Alice Hutchings

In this paper, the authors consider that well-intentioned cybersecurity risk management activities can create not only unintended consequences, but also unintended harms to user behaviours, system users, or the infrastructure itself. Through reviewing countermeasures and associated unintended harms for five cyber deception and aggression scenarios (including tech-abuse, disinformation campaigns, and dating fraud), the authors identified categorizations of unintended harms. These categories were further developed into a framework of questions to prompt risk managers to consider harms in a structured manner, and introduce the discussion of vulnerable groups across all harms. The authors envision that this framework can act as a common-ground and a tool bringing together stakeholders towards a coordinated approach to cybersecurity risk management in a complex, multi-party service and/or technology ecosystem.

Honorable Mention

Mapping the Underground: Supervised Discovery of Cybercrime Supply Chains

Rasika Bhalerao, Maxwell Aliapoulios, Ilia Shumailov, Sadia Afroz, and Damon McCoy

Cybercrime forums enable modern criminal entrepreneurs to collaborate with other criminals into increasingly efficient and sophisticated criminal endeavors.
Understanding the connections between different products and services is currently very expensive and requires a lot of time-consuming manual effort. In this paper, we propose a language-agnostic method to automatically extract supply chains from cybercrime forum posts and replies. Our analysis of generated supply chains highlights unique differences in the lifecycle of products and services on offer in Russian and English cybercrime forums.

Honware: A Virtual Honeypot Framework for Capturing CPE and IoT Zero Day

Alexander Vetterl and Richard Clayton

We presented honware, a new honeypot framework which can rapidly emulate a wide range of CPE and IoT devices without any access to the manufacturers’ hardware.

The framework processes a standard firmware image and will help to detect real attacks and associated vulnerabilities that might otherwise be exploited for considerable periods of time without anyone noticing.

From Playing Games to Committing Crimes: A Multi-Technique Approach to Predicting Key Actors on an Online Gaming Forum

Jack Hughes , Ben Collier, and Alice Hutchings

This paper proposes a systematic framework for analysing forum datasets, which contain minimal structure and are non-trivial to analyse at scale. The paper takes a multi-technique approach drawing on a combination of features relating to content and metadata, to predict potential key actors. From these predictions and trained models, the paper begins to look at characteristics of the group of potential key actors, which may benefit more from targeted intervention activities.

Fighting the “Blackheart Airports”: Internal Policing in the Chinese Censorship Circumvention Ecosystem

Yi Ting Chua and Ben Collier

In this paper, the authors provide an overview of the self-policing mechanisms present in the ecosystem of services used in China to circumvent online censorship. We conducted an in-depth netnographic study of four Telegram channels which were used to co-ordinate various kinds of attacks on groups and individuals offering fake or scam services. More specifically, these actors utilized cybercrime tools such as denial of service attack and doxxing to punish scammers. The motivations behind this self-policing appear to be genuinely altruistic, with individuals largely concerned with maintaining a stable ecosystem of services to allow Chinese citizens to bypass the Great Firewall. Although this is an emerging phenomenon, it appears to be developing into an important and novel kind of trust mechanism within this market