Category Archives: Privacy technology

Anonymous communication, data protection

Bugs in our pockets?

In August, Apple announced a system to check all our iPhones for illegal images, then delayed its launch after widespread pushback. Yet some governments continue to press for just such a surveillance system, and the EU is due to announce a new child protection law at the start of December.

Now, in Bugs in our Pockets: The Risks of Client-Side Scanning, colleagues and I take a long hard look at the options for mass surveillance via software embedded in people’s devices, as opposed to the current practice of monitoring our communications. Client-side scanning, as the agencies’ new wet dream is called, has a range of possible missions. While Apple and the FBI talked about finding still images of sex abuse, the EU was talking last year about videos and text too, and of targeting terrorism once the argument had been won on child protection. It can also use a number of possible technologies; in addition to the perceptual hash functions in the Apple proposal, there’s talk of machine-learning models. And, as a leaked EU internal report made clear, the preferred outcome for governments may be a mix of client-side and server-side scanning.

In our report, we provide a detailed analysis of scanning capabilities at both the client and the server, the trade-offs between false positives and false negatives, and the side effects – such as the ways in which adding scanning systems to citizens’ devices will open them up to new types of attack.
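To make the false-positive versus false-negative trade-off concrete, here is a minimal sketch of a perceptual hash with a match threshold. It assumes a toy "average hash" over tiny greyscale grids, which is not NeuralHash or any deployed scanner; the function names, the sample images and the thresholds are all made up for illustration.

```python
def average_hash(pixels):
    """Toy perceptual hash: one bit per pixel, set if brighter than the mean."""
    flat = [p for row in pixels for p in row]
    mean = sum(flat) / len(flat)
    return [1 if p > mean else 0 for p in flat]

def hamming(a, b):
    """Number of bit positions in which two hashes differ."""
    return sum(x != y for x, y in zip(a, b))

def matches(candidate, target, threshold):
    """Flag the candidate if its hash is within `threshold` bits of the target."""
    return hamming(average_hash(candidate), average_hash(target)) <= threshold

# Hypothetical 4x4 greyscale "images".
target = [[10, 10, 200, 200] for _ in range(4)]        # the listed image
brightened = [[p + 20 for p in row] for row in target]  # same picture, brighter
edited = [[10, 200, 200, 200] for _ in range(4)]        # same picture, one column altered
unrelated = [[200, 10, 200, 10],                        # a checkerboard, nothing to do
             [10, 200, 10, 200],                        # with the listed image
             [200, 10, 200, 10],
             [10, 200, 10, 200]]

for threshold in (2, 4, 8):
    flagged = [name for name, img in
               (("brightened", brightened), ("edited", edited), ("unrelated", unrelated))
               if matches(img, target, threshold)]
    print(threshold, flagged)
```

In this toy, a tight threshold misses the edited copy (a false negative), while a loose one also flags the unrelated checkerboard (a false positive). Real systems face the same tension at vastly larger scale, and against adversaries who craft images deliberately.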

We did not set out to praise Apple’s proposal, but we ended up concluding that it was probably about the best that could be done. Even so, it did not come close to providing a system that a rational person might consider trustworthy.

Even if the engineering on the phone were perfect, a scanner brings within the user’s trust perimeter all those involved in targeting it – in deciding which photos go on the naughty list, or how to train any machine-learning models that riffle through your texts or watch your videos. Even if it starts out trained on images of child abuse that all agree are illegal, it’s easy for both insiders and outsiders to manipulate images to create both false negatives and false positives. The more we look at the detail, the less attractive such a system becomes. The measures required to limit the obvious abuses so constrain the design space that you end up with something that could not be very effective as a policing tool; and if the European institutions were to mandate its use – and there have already been some legislative skirmishes – they would open up their citizens to quite a range of avoidable harms. And that’s before you stop to remember that the European Court of Justice struck down the Data Retention Directive on the grounds that such bulk surveillance, without warrant or suspicion, was a grossly disproportionate infringement on privacy, even in the fight against terrorism. A client-side scanning mandate would invite the same fate.

But ‘if you build it, they will come’. If device vendors are compelled to install remote surveillance, the demands will start to roll in. Who could possibly be so cold-hearted as to argue against the system being extended to search for missing children? Then President Xi will want to know who has photos of the Dalai Lama, or of men standing in front of tanks; and copyright lawyers will get court orders blocking whatever they claim infringes their clients’ rights. Our phones, which have grown into extensions of our intimate private space, will be ours no more; they will be private no more; and we will all be less secure.

Is Apple’s NeuralMatch searching for abuse, or for people?

Apple stunned the tech industry on Thursday by announcing that the next version of iOS and macOS will contain a neural network to scan photos for sex abuse. Each photo will get an encrypted ‘safety voucher’ saying whether or not it’s suspect, and if more than about ten suspect photos are backed up to iCloud, then a clever cryptographic scheme will unlock the keys used to encrypt them. Apple staff or contractors can then look at the suspect photos and report them.
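One textbook way to get "nothing unlocks until about ten matches exist" is threshold secret sharing. The sketch below uses Shamir's scheme purely to illustrate the idea; it is my assumption about the general technique, not a description of Apple's design, and the names `make_shares` and `recover`, the field prime and the figure of thirty photos are hypothetical.

```python
import secrets

PRIME = 2**127 - 1  # a Mersenne prime; fine for a toy secret below this size

def make_shares(secret, threshold, count):
    """Split `secret` into `count` shares; any `threshold` of them recover it."""
    # Random polynomial of degree threshold-1 whose constant term is the secret.
    coeffs = [secret] + [secrets.randbelow(PRIME) for _ in range(threshold - 1)]
    shares = []
    for x in range(1, count + 1):
        y = 0
        for c in reversed(coeffs):  # Horner's rule, modulo PRIME
            y = (y * x + c) % PRIME
        shares.append((x, y))
    return shares

def recover(shares):
    """Lagrange interpolation at x = 0 over the field of integers mod PRIME."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = (num * -xj) % PRIME
                den = (den * (xi - xj)) % PRIME
        # pow(den, -1, PRIME) is the modular inverse (Python 3.8+).
        secret = (secret + yi * num * pow(den, -1, PRIME)) % PRIME
    return secret

# Illustrative use: imagine one share travelling with each flagged photo.
key = 123456789
shares = make_shares(key, threshold=10, count=30)
print(recover(shares[:10]) == key)  # ten shares are enough
print(recover(shares[:9]) == key)   # nine reveal nothing useful
```

The point of such a design is that the server learns nothing about the key from fewer shares than the threshold. It says nothing about who decides which photos generate a share, which is where the trust problem lies.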

We’re told that the neural network was trained on 200,000 images of child sex abuse provided by the US National Center for Missing and Exploited Children. Neural networks are good at spotting images “similar” to those in their training set, and people unfamiliar with machine learning may assume that Apple’s network will recognise criminal acts. The police might even be happy if it recognises a sofa on which a number of acts took place. (You might be less happy, if you own a similar sofa.) Then again, it might learn to recognise naked children, and flag up a snap of your three-year-old child on the beach. So what the new software in your iPhone actually recognises is really important.

Now the neural network described in Apple’s documentation appears very similar to the networks used in face recognition (hat tip to Nicko van Someren for spotting this). So it seems a fair bet that the new software will recognise people whose faces appear in the abuse dataset on which it was trained.

So what will happen when someone’s iPhone flags ten pictures as suspect, and the Apple contractor who looks at them sees an adult with their clothes on? There’s a real chance that they’re either a criminal or a witness, so they’ll have to be reported to the police. In the case of a survivor who was victimised ten or twenty years ago, and whose pictures still circulate in the underground, this could mean traumatic secondary victimisation. It might even be their twin sibling, or a genuine false positive in the form of someone who just looks very much like them. What processes will Apple use to manage this? Not all US police forces are known for their sensitivity, particularly towards minority suspects.

But that’s just the beginning. Apple’s algorithm, NeuralMatch, stores a fingerprint of each image in its training set as a short string called a NeuralHash, so new pictures can easily be added to the list. Once the tech is built into your iPhone, your MacBook and your Apple Watch, and can scan billions of photos a day, there will be pressure to use it for other purposes. The other part of NCMEC’s mission is missing children. Can Apple resist demands to help find runaways? Could Tim Cook possibly be so cold-hearted as to refuse to add Madeleine McCann to the watch list?

After that, your guess is as good as mine. Depending on where you are, you might find your photos scanned for dissidents, religious leaders or the FBI’s most wanted. It also reminds me of the Rasterfahndung in 1970s Germany – the dragnet search of all digital data in the country for clues to the Baader-Meinhof gang. Only now it can be done at scale, and not just for the most serious crimes either.

Finally, there’s adversarial machine learning. Neural networks are fairly easy to fool in that an adversary can tweak images so they’re misclassified. Expect to see pictures of cats (and of Tim Cook) that get flagged as abuse, and gangs finding ways to get real abuse past the system. Apple’s new tech may end up being a distributed person-search machine, rather than a sex-abuse prevention machine.

Such a technology requires public scrutiny, and as the possession of child sex abuse images is a strict-liability offence, academics cannot work with them. While the crooks will dig out NeuralMatch from their devices and play with it, we cannot. It is possible in theory for Apple to get NeuralMatch to ignore faces; for example, it could blur all the faces in the training data, as Google does for photos in Street View. But they haven’t claimed they did that, and if they did, how could we check? Apple should therefore publish full details of NeuralMatch plus a set of NeuralHash values trained on a public dataset with which we can legally work. It also needs to explain how the system it deploys was tuned and tested; and how dragnet searches of people’s photo libraries will be restricted to those conducted by court order so that they are proportionate, necessary and in accordance with the law. If that cannot be done, the technology must be abandoned.

Patient confidentiality in remote consultations

During the lockdown last year, I was asked by the International Psychoanalytic Association (IPA) to help them update their guidance on remote consultations. I spoke to a range of GPs, surgeons, psychologists and psychoanalysts about what they’d learned during the first lockdown about working over the phone, or over Skype or Zoom. The IPA has now published my report, on a web page that also has their guidance to members both before and after the exercise.

Before the pandemic, remote consultation did happen, but not all therapists offered it; and confidentiality concerns tended to focus on technical security measures such as whether the call was encrypted end-to-end. After everyone was forced online in March and April 2020, clinicians learned rapidly to focus on the endpoints. Patients often have problems finding a private space to talk; there may be a family member in earshot, whether by accident, or because they’re cooped up in a tiny apartment, or because they have a controlling partner or parent. A clinician may return a patient’s call and catch them in a supermarket queue. And the clinic too can be interrupted, if the clinician is practising from home.

Technical endpoint compromise is occasionally an issue; a controlling family member could inspect a patient’s device and discover a therapeutic relationship that had not been disclosed. By far the worst endpoint compromise that happened during the study period was when the Vastaamo chain of clinics in Finland was hit by ransomware; 45,000 patients’ records were stolen, and some were put online by extortionists demanding bitcoin payments. (And now we face an even larger-scale issue in the UK as the government plans to hoover up all our GP records for sale to drug companies unless we opt out by June 25; see here for how to do that.)

Such horrors aside, the core problem is to establish a therapeutic space where both patient and clinician can interact effectively, which means being able to concentrate and also to relax. There’s more to this than just being comfortable trusting the endpoint environments, the devices, the communications medium and any record-keeping mechanism. Interaction matters too. Many clinician communities discovered independently that the plain old telephone system often works better than new-fangled stuff such as Skype and Zoom. Video calls add maybe half a second of latency for buffering, which destroys conversational turn-taking. A further advantage of the phone is that you’re not staring at someone’s face at an unnatural distance. You can walk around the room, or even walk around the park.

Since doing this work I’ve started to avoid Zoom and Teams in favour of phone calls when I can, and use end-to-end encrypted voice calls on WhatsApp or Signal where call costs or client confidentiality make it sensible.

Pushing the limits: acoustic side channels

How far can we go with acoustic snooping on data?

Seven years ago we showed that you could use a phone camera to measure the phone’s motion while typing and use that to recover PINs. Four years ago we showed that you could use interrupt timing to recover text entered using gesture typing. Last year we showed how a gaming app can steal your banking PIN by listening to the vibration of the screen as your finger taps it. In that attack we used the on-phone microphones, as they are conveniently located next to the screen and can hear the reverberations of the screen glass.

This year we wondered whether voice assistants can hear the same taps on a nearby phone as the on-phone microphones could. We knew that voice assistants could do acoustic snooping on nearby physical keyboards, but everyone had assumed that virtual keyboards were so quiet as to be invulnerable.

Almos Zarandy, Ilia Shumailov and I discovered that attacks are indeed possible. In Hey Alexa what did I just type? we show that when sitting up to half a metre away, a voice assistant can still hear the taps you make on your phone, even in the presence of noise. Modern voice assistants have two to seven microphones, so they can do directional localisation, just as human ears do, but with greater sensitivity. We assess the risk and show that a lot more work is needed to understand the privacy implications of the always-on microphones that are increasingly infesting our work spaces and our homes.
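For readers wondering how several microphones give direction, here is a minimal sketch of the standard time-difference-of-arrival idea with just two microphones. It is a generic illustration under a far-field assumption (which is only rough at half a metre), not the method used in the paper; the function names, the 48 kHz sample rate and the 5 cm spacing are assumptions.

```python
import math

SPEED_OF_SOUND = 343.0  # metres per second in air at roughly 20 degrees C

def best_lag(left, right, max_lag):
    """Lag in samples by which `right` trails `left`, found by cross-correlation."""
    def score(lag):
        return sum(left[i] * right[i + lag]
                   for i in range(len(left))
                   if 0 <= i + lag < len(right))
    return max(range(-max_lag, max_lag + 1), key=score)

def bearing_degrees(lag_samples, sample_rate, mic_spacing):
    """Angle of arrival from the array's broadside, assuming a distant source."""
    delay = lag_samples / sample_rate
    # Clamp so that measurement noise cannot push asin() out of its domain.
    ratio = max(-1.0, min(1.0, SPEED_OF_SOUND * delay / mic_spacing))
    return math.degrees(math.asin(ratio))

# A synthetic "tap": a short impulse that reaches the right microphone
# three samples after it reaches the left one.
left = [0.0] * 20
right = [0.0] * 20
left[5], left[6] = 1.0, -0.5
right[8], right[9] = 1.0, -0.5

lag = best_lag(left, right, max_lag=6)
angle = bearing_degrees(lag, sample_rate=48_000, mic_spacing=0.05)
print(lag, round(angle, 1))
```

With more microphones the same delay estimates can be combined to narrow down position as well as bearing, which is what makes a tap on a nearby screen easier to pick out from background noise.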

SHB Seminar

The SHB seminar on November 5th was kicked off by Tom Holt, who’s discovered a robust underground market in identity documents that are counterfeit or fraudulently obtained. He’s been scraping both websites and darkweb sites for data and analysing how people go about finding, procuring and using such credentials. Most vendors were single-person operators although many operated within affiliate programs; many transactions involved cryptocurrency; many involved generating PDFs that people can print at home and that are good enough for young people to drink alcohol. Curiously, open web products seem to cost twice as much as dark web products.

Next was Jack Hughes, who has been studying the contract system introduced by hackforums in 2018 and made mandatory the following year. This enabled him to analyse crime forum behaviour before and during the covid-19 era. How do new users become active, and build up trust? How does it evolve? He collected 200,000 transactions and analysed them. The contract mandate stifled growth quickly, leading to a first peak; covid caused a second. The market was already centralised, and became more so with the pandemic. However contracts are getting done faster, and the main activity is currency exchange: it seems to be working as a cash-out market.

Anita Lavorgna has been studying the discourse of groups who oppose public mask mandates. Like the antivaxx movement, this can draw in fringe groups and become a public-health issue. She collected 23654 tweets from February to June 2020. There’s a diverse range of voices from different places on the political spectrum but with a transversal theme of freedom from government interference. Groups seek strength in numbers and seek to ally into movements, leading to the mask becoming a symbol of political identity construction. Anita found very little interaction between the different groups: only 144 messages in total.

Simon Parkin has been working on how we can push back on bad behaviours online while they are linked with good behaviours that we wish to promote. Precision is hard as many of the desirable behaviours are not explicitly recognised as such, and as many behaviours arise as a combination of personal incentives and context. The best way forward is around usability engineering – making the desired behaviours easier.

Bruce Schneier was the final initial speaker, and his topic was covid apps. The initial rush of apps that arrived in March through June has known issues around false positives and false negatives. We’ve also used all sorts of other tools, such as analysis of Google maps to measure lockdown compliance. The third thing is the idea of an immunity passport, saying you’ve had the disease, or a vaccine. That will have the same issues as the fake IDs that Tom talked about. Finally, there’s compliance tracking, where your phone monitors you. The usual countermeasures apply: consent, minimisation, infosec, etc., though the trade-offs might be different for a while. A further bunch of issues concern home working and the larger attack surface that many firms have as a result of unfamiliar tools, less resistance to being told to do things etc.

The discussion started on fake ID; Tom hasn’t yet done test purchases, and might look at fraudulently obtained documents in the future, as opposed to completely counterfeit ones. Is hackforums helping drug gangs turn paper into coin? This is not clear; more of the activity is around cashing out cybercrime rather than street crime. There followed discussion by Anita of how to analyse corpora of tweets, and the implications for policy in real life. Things are made more difficult by the fact that discussions drift off into other platforms we don’t monitor. Another topic was the interaction of fashion: where some people wear masks or not as a political statement, many more buy masks that get across a more targeted statement. Fashion is really powerful, and tends to be overlooked by people in our field. Usability research perhaps focuses too much on the utilitarian economics, and is a bit of a blunt instrument.

Another example related to covid is the growing push for monitoring software on employees’ home computers. Unfortunately Uber and Lyft bought a referendum result that enables them to not treat their staff in California as employees, so the regulation of working hours at home will probably fall to the EU. Can we perhaps make some input into what that should look like? Another issue with the pandemic is the effect on information security markets: why should people buy corporate firewalls when their staff are all over the place? And to what extent will some of these changes be permanent, if people work from home more?

Another thread of discussion was how the privacy properties of covid apps make it hard for people to make risk-management decisions. The apps appear ineffective because they were designed to do privacy rather than to do public health, in various subtle ways; giving people low-grade warnings which do not require any action appears to be an attempt to raise public awareness, like mask mandates, rather than an effective attempt to get exposed individuals to isolate. Apps that check people into venues have their own issues and appear to be largely security theatre. Security theatre comes into its own where the perceived risk is much greater than the actual risk; covid is the opposite. What can be done in this case? Targeted warnings? Humour? What might happen when fatigue sets in? People will compromise compliance to make their lives bearable. That can be managed to some extent in institutions like universities, but in society it will be harder.

We ended up with the suggestion that the next SHB seminar should be in February, which should be the low point; after that we can look forward to things getting better, and hopefully to a meeting in person in Cambridge on June 3-4 2021.

Security and Human Behaviour 2020

I’ll be liveblogging the workshop on security and human behaviour, which is online this year. My liveblogs will appear as followups to this post. This year my program co-chair is Alice Hutchings and we have invited a number of eminent criminologists to join us. Edited to add: here are the videos of the sessions.

Three Paper Thursday – GDPR anniversary edition

This is a guest contribution from Daniel Woods.

This coming Monday will mark two years since the General Data Protection Regulation (GDPR) came into effect. It prompted an initial wave of cookie banners that drowned users in assertions like “We value your privacy”. Website owners hoped that collecting user consent would ensure compliance and ward off the lofty fines.

Article 6 of the GDPR describes how organisations can establish a legal basis for processing personal data. Putting aside a selection of ‘necessary’ reasons for doing so, data processing can only be justified by collecting the user’s consent to “the processing of his or her personal data for one or more specific purposes”. Consequently, obtaining user consent could be the difference between suffering a dizzying fine or not.

The law changed the face of the web and this post considers one aspect of the transition. Consent Management Providers (CMPs) emerged offering solutions for websites to embed. Many of these use a technical standard described in the Transparency and Consent Framework. The standard was developed by the Interactive Advertising Bureau (IAB), who proudly claim it is “the only GDPR consent solution built by the industry for the industry”.

All of the following studies either directly measure websites implementing this standard or explore the theoretical implications of standardising consent. The first paper looks at how the design of consent dialogues shapes the consent signal sent by users. The second paper identifies disparities between the privacy preferences communicated via cookie banners and the consent signals stored by the website. The third paper uses coalitional game theory to explore which firms extract the value from consent coalitions in which websites share consent signals.


Contact Tracing in the Real World

There have recently been several proposals for pseudonymous contact tracing, including from Apple and Google. To both cryptographers and privacy advocates, this might seem the obvious way to protect public health and privacy at the same time. Meanwhile other cryptographers have been pointing out some of the flaws.

There are also real systems being built by governments. Singapore has already deployed and open-sourced one that uses contact tracing based on Bluetooth beacons. Most of the academic and tech industry proposals follow this strategy, as the “obvious” way to tell who’s been within a few metres of you and for how long. The UK’s National Health Service is working on one too, and I’m one of a group of people being consulted on the privacy and security.

But contact tracing in the real world is not quite as many of the academic and industry proposals assume.

First, it isn’t anonymous. Covid-19 is a notifiable disease so a doctor who diagnoses you must inform the public health authorities, and if they have the bandwidth they call you and ask who you’ve been in contact with. They then call your contacts in turn. It’s not about consent or anonymity, so much as being persuasive and having a good bedside manner.

I’m relaxed about doing all this under emergency public-health powers, since this will make it harder for intrusive systems to persist after the pandemic than if they have some privacy theater that can be used to argue that the whizzy new medi-panopticon is legal enough to be kept running.

Second, contact tracers have access to all sorts of other data such as public transport ticketing and credit-card records. This is how a contact tracer in Singapore is able to phone you and tell you that the taxi driver who took you yesterday from Orchard Road to Raffles has reported sick, so please put on a mask right now and go straight home. This must be controlled; Taiwan lets public-health staff access such material in emergencies only.

Third, you can’t wait for diagnoses. In the UK, you only get a test if you’re a VIP or if you get admitted to hospital. Even so the results take 1–3 days to come back. While the VIPs share their status on Twitter or Facebook, the other diagnosed patients are often too sick to operate their phones.

Fourth, the public health authorities need geographical data for purposes other than contact tracing – such as to tell the army where to build more field hospitals, and to plan shipments of scarce personal protective equipment. There are already apps that do symptom tracking but more would be better. So the UK app will ask for the first three characters of your postcode, which is about enough to locate which hospital you’d end up in.

Fifth, although the cryptographers – and now Google and Apple – are discussing more anonymous variants of the Singapore app, that’s not the problem. Anyone who’s worked on abuse will instantly realise that a voluntary app operated by anonymous actors is wide open to trolling. The performance art people will tie a phone to a dog and let it run around the park; the Russians will use the app to run service-denial attacks and spread panic; and little Johnny will self-report symptoms to get the whole school sent home.

Sixth, there’s the human aspect. On Friday, when I was coming back from walking the dogs, I stopped to chat for ten minutes to a neighbour. She stood halfway between her gate and her front door, so we were about 3 metres apart, and the wind was blowing from the side. The risk that either of us would infect the other was negligible. If we’d been carrying Bluetooth apps, we’d have been flagged as mutual contacts. It would be quite intolerable for the government to prohibit such social interactions, or to deploy technology that would punish them via false alarms. And how will things work with an orderly supermarket queue, where law-abiding people stand patiently six feet apart?

Bluetooth also goes through plasterboard. If undergraduates return to Cambridge in October, I assume there will still be small-group teaching, but with protocols for distancing, self-isolation and quarantine. A supervisor might sit in a teaching room with two or three students, all more than 2m apart and maybe wearing masks, and the window open. The Bluetooth app will flag up not just the others in the room but people in the next room too.

How is this to be dealt with? I expect the app developers will have to fit a user interface saying “You’re within range of device 38a5f01e20. Within infection range (y/n)?” But what happens when people get an avalanche of false alarms? They learn to click them away. A better design might be to invite people to add a nickname and a photo so that contacts could see who they are. “You are near to Ross [photo] and have been for five minutes. Are you maintaining physical distance?”

When I discussed this with a family member, the immediate reaction was that she’d refuse to run an anonymous app that might suddenly say “someone you’ve been near in the past four days has reported symptoms, so you must now self-isolate for 14 days.” A call from a public health officer is one thing, but not knowing who it was would just creep her out. It’s important to get the reactions of real people, not just geeks and wonks! And the experience of South Korea and Taiwan suggests that transparency is the key to public acceptance.

Seventh, on the systems front, decentralised systems are all very nice in theory but are a complete pain in practice as they’re too hard to update. We’re still using Internet infrastructure from 30 years ago (BGP, DNS, SMTP…) because it’s just too hard to change. Watch Moxie Marlinspike’s talk at 36C3 if you don’t get this. Relying on cryptography tends to make things even more complex, fragile and hard to change. In the pandemic, the public health folks may have to tweak all sorts of parameters weekly or even daily. You can’t do that with apps on 169 different types of phone and with peer-to-peer communications.

Personally I feel conflicted. I recognise the overwhelming force of the public-health arguments for a centralised system, but I also have 25 years’ experience of the NHS being incompetent at developing systems and repeatedly breaking their privacy promises when they do manage to collect some data of value to somebody else. The Google Deepmind scandal was just the latest of many and by no means the worst. This is why I’m really uneasy about collecting lots of lightly-anonymised data in a system that becomes integrated into a whole-of-government response to the pandemic. We might never get rid of it.

But the real killer is likely to be the interaction between privacy and economics. If the app’s voluntary, nobody has an incentive to use it, except tinkerers and people who religiously comply with whatever the government asks. If uptake remains at 10-15%, as in Singapore, it won’t be much use and we’ll need to hire more contact tracers instead. Apps that involve compulsion, such as those for quarantine geofencing, will face a more adversarial threat model; and the same will be true in spades for any electronic immunity certificate. There the incentive to cheat will be extreme, and we might be better off with paper serology test certificates, like the yellow fever vaccination certificates you needed for the tropics, back in the good old days when you could actually go there.
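A rough back-of-envelope shows why low uptake is so damaging. For a contact to be logged, both people must be running the app; if we assume uptake is uniform and independent across the population (it won’t be, so treat this as a sketch), the share of contacts covered is the square of the uptake. The function name is made up for illustration.

```python
def detectable_fraction(uptake):
    """Share of person-to-person contacts where both parties run the app,
    assuming each person independently runs it with probability `uptake`."""
    return uptake * uptake

# The two uptake figures quoted above for Singapore.
for uptake in (0.10, 0.15):
    print(f"{uptake:.0%} uptake -> {detectable_fraction(uptake):.2%} of contacts covered")
```

So at 10-15% uptake the app would see only about one or two contacts in every hundred, before allowing for any of the false alarms and missed detections discussed earlier.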

All that said, I suspect the tracing apps are really just do-something-itis. Most countries now seem past the point where contact tracing is a high priority; even Singapore has had to go into lockdown. If it becomes a priority during the second wave, we will need a lot more contact tracers: last week, 999 calls in Cambridge had a 40-minute wait and it took ambulances six hours to arrive. We cannot field an app that will cause more worried well people to phone 999.

The real trade-off between surveillance and public health is this. For years, a pandemic has been at the top of Britain’s risk register, yet far less was spent preparing for one than on anti-terrorist measures, many of which were ostentatious rather than effective. Worse, the rhetoric of terror puffed up the security agencies at the expense of public health, predisposing the US and UK governments to disregard the lesson of SARS in 2003 and MERS in 2015 — unlike the governments of China, Singapore, Taiwan and South Korea, who paid at least some attention. What we need is a radical redistribution of resources from the surveillance-industrial complex to public health.

Our effort should go into expanding testing, making ventilators, retraining everyone with a clinical background from vet nurses to physiotherapists to use them, and building field hospitals. We must call out bullshit when we see it, and must not give policymakers the false hope that techno-magic might let them avoid the hard decisions. Otherwise we can serve best by keeping out of the way. The response should not be driven by cryptographers but by epidemiologists, and we should learn what we can from the countries that have managed best so far, such as South Korea and Taiwan.

2020 Caspar Bowden Award

You are invited to submit nominations for the 2020 Caspar Bowden Award for Outstanding Research in Privacy Enhancing Technologies. The Caspar Bowden PET award is presented annually to researchers who have made an outstanding contribution to the theory, design, implementation, or deployment of privacy enhancing technology. It is awarded at the annual Privacy Enhancing Technologies Symposium (PETS), and carries a cash prize as well as a physical award monument.

Any paper by any author written in the area of privacy enhancing technologies is eligible for nomination. However, the paper must have appeared in a refereed journal, conference, or workshop with proceedings published in the period from April 1, 2018 until March 31, 2020.

Note that we do not accept nominations for publications in conference proceedings when the dates of the conference fall outside of the nomination window. For example, an IEEE Symposium on Security and Privacy (“Oakland”) paper made available on IEEE Xplore prior to the March 31 deadline would not be eligible, as the conference happens in May. Please note that PETS is associated with a journal publication, PoPETs, so any PoPETs paper published in an issue appearing before the March 31 deadline is eligible (which typically means only Issue 1 of the current year).

Anyone can nominate a paper by sending an email message to award-chairs20@petsymposium.org containing the following:
. Paper title
. Author(s)
. Author(s) contact information
. Publication venue and full reference
. Link to an available online version of the paper
. A nomination statement of no more than 500 words.

All nominations must be submitted by April 5, 2020. The award committee will select one or two winners among the nominations received. Winners must be present at the PET Symposium in order to receive the Award. This requirement can be waived only at the discretion of the PET advisory board. The complete Award rules including eligibility requirements can be found here.

Caspar Bowden PET Award Chairs (award-chairs20@petsymposium.org)

Simone Fischer-Hübner, Karlstad University
Ross Anderson, University of Cambridge

Caspar Bowden PET Award Committee

Erman Ayday, Bilkent University
Nataliia Bielova, Inria
Sonja Buchegger, KTH
Ian Goldberg, University of Waterloo
Rachel Greenstadt, NYU
Marit Hansen, Unabhängiges Datenschutzzentrum Schleswig Holstein -ULD
Dali Kaafar, CSIRO
Eran Toch, Tel Aviv University
Carmela Troncoso, EPFL
Matthew Wright, Rochester Institute of Technology

More information about the Caspar Bowden PET award (including past winners) is available here.