Towards greater ecological validity in security usability

When you are a medical doctor, friends and family invariably ask you about their aches and pains. When you are a computer specialist, they ask you to fix their computer. About ten years ago, most of the questions I was getting from friends and family as a security techie had to do with frustration over passwords. I observed that what techies had done to the rest of humanity was not just wrong but fundamentally unethical: asking people to do something impossible and then, if they got hacked, blaming them for not doing it.

So in 2011, years before the FIDO Alliance was formed (2013) and Apple announced its smartwatch (2014), I published my detailed design for a clean-slate password replacement I called Pico, an alternative system intended to be easier to use and more secure than passwords. The European Research Council was generous enough to fund my vision with a grant that allowed me to recruit and lead a team of brilliant researchers over a period of five years. We built a number of prototypes, wrote a bunch of papers, offered projects to a number of students and even launched a start-up, thereby learning a few first-hand lessons about business, venture capital, markets, sales and the difficult process of transitioning from academic research to a profitable commercial product. During all those years we changed our minds a few times about what ought to be done, and we came to understand much better both the problem space and the mindset of the users.

Members of the public were invariably thrilled when hearing about the topic of our research: “At last!” was their feeling, immediately followed by personal tales of password-induced frustration. And yet, when we offered our prototypes for testing, not merely at no charge but at negative cost (with a small monetary reward for the participants’ time, as is common in user trials), the uptake was minimal. It turns out that, while people hate passwords with a passion, they hate changing their habits even more. If you develop a solution that works on a certain browser, or on a certain brand of smartphone, nobody except the keenest early adopters will even bother trying it, regardless of how good it might be, if they’re not already using that browser or that brand of phone. If you are an academic researcher trying to come up with a novel solution, it is reasonable to develop just for one platform (usually the most open one, the one that gives you greatest control) and leave the porting to other more commercial platforms as an intellectually unrewarding “exercise for the reader”. If however you care about your solution being used in practice (for example because you are an entrepreneur trying to make money from your invention, or because you want it to solve a problem rather than just make an intellectual point) then you must support the platforms that real people actually use, even if it’s a lot more work.

In a 2012 paper called “The Quest to Replace Passwords”, which grew out of the “related work” section of my 2011 Pico paper and then became one of the most cited reference works in the field with over 800 citations on Google Scholar, we examined and compared in detail several dozen web authentication schemes. The web is largely responsible for the proliferation of passwords that we experienced over the past quarter century and most of the current password usability literature still focuses on web passwords. However, the problem with web passwords is sufficiently annoying and sufficiently pervasive that most users already have some coping strategy that allows them to survive, such as having the same basic password for many sites, sometimes with minor variations—or, for the more sophisticated users, a password manager, perhaps the one already embedded in the browser. Users still grumble, but they get along, and they prefer continuing to use their admittedly imperfect workaround rather than changing their habits. Commercial websites, conversely, as originally noted by Florêncio and Herley in 2010, have also evolved their own coping strategies, after realising that annoying their users too frequently will cause those users to take their custom elsewhere; hence the long-lived login cookies, thanks to which users no longer have to type their passwords every time. As a consequence of these two trends (ad-hoc coping strategies from both users and websites) we witness the security usability paradox that, while users continue to be very vocal about their hatred of passwords, they are generally reluctant to try new solutions to get rid of them.

In a new paper recently accepted by the Journal of Cybersecurity, “Deploying authentication in the wild: towards greater ecological validity in security usability studies” by Seb Aebischer, Claudio Dettoni, Graeme Jenkinson, Kat Krol, David Llewellyn-Jones, Toshiyuki Masui and Frank Stajano, we report on a user study on web authentication we performed in collaboration with Gyazo, an Alexa Top 1000 website, and how it led our Pico team to pivot from the web to the desktop in our quest to alleviate password problems. We tell the edifying story of our ill-fated trial with a friendly government organisation, Innovate UK. We distill a number of instructive lessons about usability of authentication and about the path travelled by the Pico team over the years, some of which might now seem obvious in hindsight even though we had to learn them the hard way. We also offer our open source code (as is—no maintenance or support) for others who might wish to improve on it.

At a higher level, we note how the publishing incentives in academia are stacked against validating security usability research with realistic experiments. Early-career researchers get to publish more papers, and with more respectable-looking statistical results, if they validate their hypotheses with simulated experiments on Mechanical Turk that reach hundreds of experimental subjects, as opposed to actually building working prototypes, finding a few people who will agree to use them as part of their daily tasks, deploying the imperfect prototypes to them, fixing the inevitable problems as they come up, and observing and debriefing those users after extensive practice in their normal environment. For a given investment of time and effort, the latter strategy will lead to substantially greater engineering effort, fewer data points, less convincing statistical evidence and many fewer papers overall. It is easy to see how researchers seeking academic promotion would be deterred from following this approach. But our thesis is that, while there is scope for mTurk surveys during the initial exploratory stages, the true validation of a security usability design is only offered by the more laborious iterated methodology of design – build – deploy – measure – fix – repeat that is successfully adopted in other academic areas such as Systems research. Security is almost never the user’s goal, but rather something that gets in the way of honest users who were trying to achieve their goal; this is why simulated experiments (which focus on using the security mechanism itself), as opposed to live deployments (which focus on getting on with one’s daily routine despite the security mechanism), only tell a small part of the story. We hope the security usability community will recognise the value of failure and of realistic deployments, and will therefore change its incentives to reward this higher standard of ecological validity.

@article{2020-AebischerETAL-ecological,
author = {Seb Aebischer and Claudio Dettoni and Graeme Jenkinson and Kat Krol and David Llewellyn-Jones and Toshiyuki Masui and Frank Stajano},
title = {Deploying authentication in the wild: Towards greater ecological validity in security usability studies},
year = {2020},
journal = {Journal of Cybersecurity},
doi = {10.1093/cybsec/tyaa010},
publisher = {Oxford University Press},
url = {http://www.cl.cam.ac.uk/~fms27/papers/2020-AebischerETAL-ecological.pdf},
}

Three paper Thursday: Ethics in security research

Good security and cybercrime research often creates an impact, and we want to ensure that impact is positive. This week I will discuss three papers on ethics in computer security research in the run up to next week’s Security and Human Behaviour workshop (SHB): Ethical issues in research using datasets of illicit origin (Thomas, Pastrana, Hutchings, Clayton, Beresford) from IMC 2017; Measuring eWhoring (Pastrana, Hutchings, Thomas, Tapiador) from IMC 2019; and An Ethics Framework for Research into Heterogeneous Systems (Happa, Nurse, Goldsmith, Creese, Williams).

Ethical issues in research using datasets of illicit origin (blog post) came about because in prior work we had noticed that there were ethical complexities to take care of when using data that had “fallen off the back of a lorry” such as the backend databases of hacked booter services that we had used. We took a broad look at existing published guidance to synthesise those issues which particularly apply to using data of illicit origin and we expected to see discussed by researchers:

Continue reading Three paper Thursday: Ethics in security research

How to jam neural networks

Deep neural networks (DNNs) have been a very active field of research for eight years now, and for the last five we’ve seen a steady stream of adversarial examples – inputs that will bamboozle a DNN so that it thinks a 30mph speed limit sign is a 60 instead, and even magic spectacles to make a DNN get the wearer’s gender wrong.

So far, these attacks have targeted the integrity or confidentiality of machine-learning systems. Can we do anything about availability?

Sponge Examples: Energy-Latency Attacks on Neural Networks shows how to find adversarial examples that cause a DNN to burn more energy, take more time, or both. They affect a wide range of DNN applications, from image recognition to natural language processing (NLP). Adversaries might use these examples for all sorts of mischief – from draining mobile phone batteries, through degrading the machine-vision systems on which self-driving cars rely, to jamming cognitive radar.

So far, our most spectacular results are against NLP systems. By feeding them confusing inputs we can slow them down by a factor of more than 100. There are already examples in the real world where people pause or stumble when asked hard questions, but we now have a dependable method for generating such examples automatically and at scale. We can also neutralise the performance improvements of accelerators for computer vision tasks, forcing them to operate at their worst-case performance.
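The search for sponge examples can be sketched in miniature. The snippet below is an illustration only, not the paper’s actual genetic algorithm: the “model” is an invented stand-in whose cost function counts operations (here, quadratic in the number of out-of-vocabulary characters), and a greedy black-box search mutates an input and keeps any mutation that increases the measured cost. A real attack would measure energy or wall-clock latency of an actual DNN instead.

```python
import random

# Stand-in "model": a toy whose cost grows with input pathology. Real
# sponge attacks measure the energy or latency of an actual network.
VOCAB = set("abcdefghijklmnopqrstuvwxyz ")

def inference_cost(text: str) -> int:
    # Cost is dominated by out-of-vocabulary (OOV) characters.
    oov = sum(1 for ch in text if ch not in VOCAB)
    return len(text) + oov * oov

def find_sponge(seed: str, rounds: int = 200, rng=random.Random(0)) -> str:
    """Greedy black-box search: mutate one character at a time and keep
    mutations that increase the measured cost."""
    best, best_cost = seed, inference_cost(seed)
    alphabet = "abc@#%ße"  # mix of in-vocabulary and OOV characters
    for _ in range(rounds):
        cand = list(best)
        cand[rng.randrange(len(cand))] = rng.choice(alphabet)
        cand = "".join(cand)
        c = inference_cost(cand)
        if c > best_cost:
            best, best_cost = cand, c
    return best

seed = "a normal looking sentence"
sponge = find_sponge(seed)
# The sponge input is the same length as the seed but costlier to process.
```

The key property is that the attacker needs no access to the model’s internals, only the ability to measure how long (or how much energy) each query takes.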

One implication is that engineers designing real-time systems that use machine learning will have to pay more attention to worst-case behaviour; another is that when custom chips used to accelerate neural network computations use optimisations that increase the gap between worst-case and average-case outcomes, you’d better pay even more attention.

Three Paper Thursday – Analysing social networks within underground forums

One would be hard pressed to find an aspect of life where networks are not present. Interconnections are at the core of complex systems – such as society, or the world economy – allowing us to study and understand their dynamics. Some of the most transformative technologies are based on networks, be they hypertext documents making up the World Wide Web, interconnected networking devices forming the Internet, or the various neural network architectures used in deep learning. Social networks that are formed based on our interactions play a central role in our everyday lives; they determine how ideas and knowledge spread and they affect behaviour. This is also true for cybercriminal networks present on underground forums, and social network analysis provides valuable insights into how these communities operate, whether on the dark web or the surface web.

For today’s post in the series ‘Three Paper Thursday’, I’ve selected three papers that highlight the valuable information we can learn from studying underground forums if we model them as networks. Network topology and large-scale structure provide insights into information flow and interaction patterns. These properties, along with discovering central nodes and the roles they play in a given community, are useful not only for understanding the dynamics of these networks but for various purposes, such as devising disruption strategies.
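The basic idea of finding central nodes can be shown in a few lines. The sketch below uses only the standard library and an invented toy reply graph (all member names are made up): an edge (a, b) means member a replied to a post by member b, and in-degree centrality surfaces the members who attract the most replies, which is one simple way to identify candidate targets for disruption.

```python
from collections import Counter

# Toy reply graph for a fictional underground forum.
# Edge (a, b): member `a` replied to a post by member `b`.
replies = [
    ("mule1", "vendor_x"), ("mule2", "vendor_x"), ("newbie", "vendor_x"),
    ("vendor_x", "admin"), ("mule1", "admin"), ("newbie", "mule1"),
    ("mule2", "admin"), ("admin", "vendor_x"),
]

def in_degree_centrality(edges):
    """Fraction of other members who replied to each member.
    Members who receive many replies are the community's focal points."""
    indeg = Counter(dst for _, dst in edges)
    nodes = {n for edge in edges for n in edge}
    norm = len(nodes) - 1
    return {n: indeg.get(n, 0) / norm for n in nodes}

cent = in_degree_centrality(replies)
central = max(cent, key=cent.get)  # the most replied-to member
```

Real studies use richer measures (betweenness, community detection, role inference), but the workflow is the same: model interactions as a graph, then rank nodes by a structural property.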

Continue reading Three Paper Thursday – Analysing social networks within underground forums

Hiring for the Cambridge Cybercrime Centre

We have just advertised some short-term “post-doc” positions in the Cambridge Cybercrime Centre: https://www.cambridgecybercrime.uk.

We are specifically interested in extending our data collection to better record how cybercrime has changed in response to the COVID-19 pandemic, and we wish to mine our datasets in order to understand whether cybercrime has increased, decreased or been displaced during 2020.

There are a lot of theories being proposed as to what may or may not have changed, often based on handfuls of anecdotes — we are looking for researchers who will help us provide data-driven descriptions of what is (now) going on — which will feed into policy debates as to the future importance of cybercrime and how best to respond to it.

We are not necessarily looking for existing experience in researching cybercrime, although this would be a bonus. However, we are looking for strong programming skills — and experience with scripting languages and databases would be much preferred. Good knowledge of English and communication skills are important.

Since these posts are only guaranteed to be funded until the end of September, we will be shortlisting candidates for (online) interview as soon as possible (NOTE the application deadline is less than ONE WEEK AWAY) and will be giving preference to people who can take up a post without undue delay. The rapid timescale of the hiring process means that we will only be able to offer positions to candidates who already have permission to work in the UK (which, as a rough guide, means UK or EU citizens or those with existing appropriate visas).

We do not realistically expect to be permitted to return to our desks in the Computer Laboratory before the end of September, so it will be necessary for successful candidates to be able to successfully “work from home” … not necessarily within the UK.

Please follow this link to the advert to read the formal advertisement for the details about exactly who and what we’re looking for and how to apply.

Cybercrime is (often) boring

Much has been made in the cybersecurity literature of the transition of cybercrime to a service-based economy, with specialised services providing Denial of Service attacks, cash-out services, escrow, forum administration, botnet management, or ransomware configuration to less-skilled users. Despite this acknowledgement of the ‘industrialisation’ of much of the cybercrime economy, the picture of cybercrime painted by law enforcement and media reports is often one of ‘sophisticated’ attacks, highly-skilled offenders, and massive payouts. In fact, as we argue in a recent paper accepted to the Workshop on the Economics of Information Security this year (and covered in KrebsOnSecurity last week), cybercrime-as-a-service relies on a great deal of tedious, low-income, and low-skilled manual administrative work.

Continue reading Cybercrime is (often) boring

Is science being set up to take the blame?

Yesterday’s publication of the minutes of the government’s Scientific Advisory Group for Emergencies (SAGE) raises some interesting questions. An initial summary in yesterday’s Guardian has a timeline suggesting that it was the distinguished medics on SAGE rather than the Prime Minister who went from complacency in January and February to panic in March, and who ignored the risk to care homes until it was too late.

Is this a Machiavellian conspiracy by Dominic Cummings to blame the scientists, or is it business as usual? Having spent a dozen years on the university’s governing body and various of its subcommittees, I can absolutely get how this happened. Once a committee gets going, it can become very reluctant to change its opinion on anything. Committees can become sociopathic, worrying about their status, ducking liability, and finding reasons why problems are either somebody else’s or not practically soluble.

So I spent a couple of hours yesterday reading the minutes, and indeed we see the group worried about its power: on February 13th it wants the messaging to emphasise that official advice is both efficacious and sufficient, to “reduce the likelihood of the public adopting unnecessary or contradictory behaviours”. Turf is defended: Public Health England (PHE) ruled on February 18th that it can cope with 5 new cases a week (meaning tracing 800 contacts) and hoped this might be increased to 50; they’d already decided the previous week that it wasn’t possible to accelerate diagnostic capacity. So far, so much as one might expect.

The big question, though, is why nobody thought of protecting people in care homes. The answer seems to be that SAGE dismissed the problem early on as “too hard” or “not our problem”. On March 5th they note that social distancing for over-65s could save a lot of lives and would be most effective for those living independently: but it would be “a challenge to implement this measure in communal settings such as care homes”. They appear more concerned that “Many of the proposed measures will be easier to implement for those on higher incomes” and the focus is on getting PHE to draft guidance. (This is the meeting at which Dominic Cummings makes his first appearance, so he cannot dump all the blame on the scientists.)

Continue reading Is science being set up to take the blame?

Three Paper Thursday: Vulnerabilities! We’ve got vulnerabilities here! … See? Nobody cares.

Jurassic Park is often (mistakenly) left out of the hacker movie canon. It clearly demonstrated the risk of an insider attack on control systems (Velociraptor rampage, amongst other tragedies…) nearly a decade ahead of the Maroochy sewage incident, it’s the first film I know of with a digital troll (“ah, ah, ah, you didn’t say the magic word!”), and Samuel L. Jackson correctly assesses the possible consequence of a hard reset (namely, everyone dying), resulting in his legendary “Hold on to your butts”. The quotable mayhem is seeded early in the film, when biotech spy Lewis Dodgson gives a sack of money to InGen’s Dennis Nedry to steal some dino DNA. Dodgson’s caricatured OPSEC (complete with trilby and dark glasses) is mocked by Nedry shouting, “Dodgson! Dodgson! We’ve got Dodgson here! See, nobody cares…” Three decades later, this quote still comes to mind* whenever conventional wisdom doesn’t seem to square with observed reality, and today we’re going to apply it to the oft-maligned world of Industrial Control System (ICS) security.

There is plenty of literature on ICS security pre-2010, but people really sat up and started paying attention when we learned about Stuxnet. Possibly the most upsetting thing about Stuxnet (for security-complacent control system designers like me) was the apparent ease with which the “air gap” was bridged over and over again. Any remaining faith in the air gap was killed by Éireann Leverett’s demonstration (thesis and S4 presentation) that thousands of industrial systems were directly connected to the Internet — no air gap jumping required. Since then, we’ve observed a steady growth in Internet-connected ICS devices, due both to improved search techniques and increasingly-connectable ICS devices. On any given day you can find about 100,000 unique devices speaking industrial protocols on Censys and Shodan. These protocols are largely unauthenticated and unencrypted, allowing an attacker that can speak the protocol to remotely read state, issue commands, and even modify programmable logic without using an actual exploit.
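To make the “unauthenticated and unencrypted” point concrete, here is a sketch of building a Modbus/TCP request frame using only the standard library. (The framing follows the published Modbus/TCP layout: MBAP header followed by the function PDU; actually sending it over a socket is deliberately omitted.) Note what the frame does not contain: no credentials, no session token, no signature.

```python
import struct

def read_holding_registers(tx_id: int, unit: int, addr: int, count: int) -> bytes:
    """Build a Modbus/TCP 'Read Holding Registers' (function 0x03) request.
    Anyone who can reach the device's TCP port can send this frame and
    read back live process state -- there is no authentication step."""
    # PDU: function code, starting register address, number of registers.
    pdu = struct.pack(">BHH", 0x03, addr, count)
    # MBAP header: transaction id, protocol id (0 = Modbus),
    # length (unit id + PDU), unit id.
    mbap = struct.pack(">HHHB", tx_id, 0x0000, len(pdu) + 1, unit)
    return mbap + pdu

frame = read_holding_registers(tx_id=1, unit=1, addr=0, count=10)
# 12-byte request: 7-byte MBAP header plus a 5-byte read PDU.
```

Writing coils or modifying logic uses other function codes in exactly the same unauthenticated envelope, which is why reachability alone is enough for an attacker who speaks the protocol.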

This sounds (and is) bad, and people have (correctly) highlighted its badness on many occasions. The attacks, however, appear to be missing: we are not aware of a single instance of industrial damage initiated via an Internet-connected ICS device. In this Three Paper Thursday we’ll look at papers showing how easy it is to find and contextualise Internet-connected ICS devices, some evidence for lack of malicious interest, and some leading indicators that this happy conclusion (for which we don’t really deserve any credit) may be changing.

*Perhaps because guys of a certain age still laugh and say “Dodson! We’ve got Dodson here!” when they learn my surname. I try to explain that it’s spelt differently, but…
Continue reading Three Paper Thursday: Vulnerabilities! We’ve got vulnerabilities here! … See? Nobody cares.

Three Paper Thursday – GDPR anniversary edition

This is a guest contribution from Daniel Woods.

This coming Monday will mark two years since the General Data Protection Regulation (GDPR) came into effect. It prompted an initial wave of cookie banners that drowned users in assertions like “We value your privacy”. Website owners hoped that collecting user consent would ensure compliance and ward off the lofty fines.

Article 6 of the GDPR describes how organisations can establish a legal basis for processing personal data. Putting aside a selection of ‘necessary’ reasons for doing so, data processing can only be justified by collecting the user’s consent to “the processing of his or her personal data for one or more specific purposes”. Consequently, obtaining user consent could be the difference between suffering a dizzying fine or not.

The law changed the face of the web, and this post considers one aspect of the transition. Consent Management Providers (CMPs) emerged offering solutions for websites to embed. Many of these use a technical standard described in the Transparency and Consent Framework. The standard was developed by the Interactive Advertising Bureau (IAB), who proudly claim it is “the only GDPR consent solution built by the industry for the industry”.

All of the following studies either directly measure websites implementing this standard or explore the theoretical implications of standardising consent. The first paper looks at how the design of consent dialogues shapes the consent signal sent by users. The second paper identifies disparities between the privacy preferences communicated via cookie banners and the consent signals stored by the website. The third paper uses coalitional game theory to explore which firms extract the value from consent coalitions in which websites share consent signals.

Continue reading Three Paper Thursday – GDPR anniversary edition