The current coronavirus pandemic has significantly disrupted all our societies and, we believe, it has also significantly disrupted cybercrime.
In the Cambridge Cybercrime Centre we collect crime-related datasets of many types and we expect, in due course, to be able to identify, measure and document this disruption. We will not be alone in doing so — a key aim of our centre is to make datasets available to other academic researchers so that they too can identify, measure and document. What’s more, we make this data available in a timely manner — sometimes before we have even looked at it ourselves!
When we have looked at the data and identified what might be changing (or where the criminals are exploiting new opportunities) then we shall of course be taking the traditional academic path of preparing papers, getting them peer reviewed, and then presenting them at conferences or publishing them in academic journals. However, that process is extremely slow — so we have decided to provide a faster route for getting out the message about what we find to be going on.
Our new “COVID Briefing Papers” are an ongoing series of short-form, open access reports for academics, policymakers, and practitioners, providing an accessible summary of our ongoing research into the effects which the coronavirus pandemic (and government responses) are having on cybercrime. We’re hoping, at least for a while, to produce a new briefing paper each week … and you can now read the very first, where Ben Collier explains what has happened to illegal online drug markets… just click here!
We all know that learning a new craft is hard. We spend a large part of our lives learning how to operate in everyday physics. A large part of this learning comes from observing others, and when others can’t help we learn through trial and error.
In machine learning the process of learning how to deal with the environment is called Reinforcement Learning (RL). By continuous interaction with its environment, an agent learns a policy that enables it to perform better. Observational learning in RL is referred to as Imitation Learning. Both trial and error and imitation learning are hard: environments are not trivial, often you can’t tell the ramifications of an action until far in the future, environments are full of non-determinism and there is no such thing as a correct policy.
So, unlike in supervised and unsupervised learning, it is hard to tell if your decisions are correct. Episodes usually consist of thousands of decisions, and you will only know if you perform well after exploring other options. But experimenting is itself a hard decision: do you exploit the skill you already have, or try something new and explore the unknown?
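The exploit-or-explore dilemma can be made concrete with a small sketch. It is a deliberately simplified setting, a multi-armed bandit with no states or long-term consequences, and the epsilon-greedy rule is just one common way of trading off the two choices. The function name, the reward means and the parameter values are all assumptions made for illustration.

```python
import random

def epsilon_greedy_bandit(true_means, steps, epsilon, rng):
    """Run an epsilon-greedy agent on a Gaussian multi-armed bandit.

    With probability `epsilon` the agent explores (pulls a random arm);
    otherwise it exploits the arm with the highest estimated reward so far.
    """
    n_arms = len(true_means)
    counts = [0] * n_arms        # how often each arm was pulled
    estimates = [0.0] * n_arms   # running mean reward per arm
    total_reward = 0.0
    for _ in range(steps):
        if rng.random() < epsilon:
            arm = rng.randrange(n_arms)                            # explore
        else:
            arm = max(range(n_arms), key=lambda a: estimates[a])   # exploit
        reward = rng.gauss(true_means[arm], 1.0)  # noisy feedback
        counts[arm] += 1
        # Incremental update of the running mean for the chosen arm.
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
        total_reward += reward
    return estimates, counts, total_reward

rng = random.Random(0)  # fixed seed so runs are repeatable
estimates, counts, total = epsilon_greedy_bandit(
    [0.1, 0.5, 0.9], steps=5000, epsilon=0.1, rng=rng
)
print("pulls per arm:", counts)
print("estimated means:", [round(e, 2) for e in estimates])
```

Setting `epsilon` to 0 makes the agent purely greedy, so it can lock on to whichever arm looked good early; raising it spends more pulls on arms already known to be poor.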
Despite all these complexities, RL has managed to achieve incredible performance in a wide variety of tasks from robotics through recommender systems to trading. More impressively, RL agents have achieved superhuman performance in Go and other games, tasks previously believed to be impossible for computers.
When you are a medical doctor, friends and family invariably ask you about their aches and pains. When you are a computer specialist, they ask you to fix their computer. About ten years ago, most of the questions I was getting from friends and family as a security techie had to do with frustration over passwords. I observed that what techies had done to the rest of humanity was not just wrong but fundamentally unethical: asking people to do something impossible and then, if they got hacked, blaming them for not doing it.
So in 2011, years before the Fido Alliance was formed (2013) and Apple announced its smartwatch (2014), I published my detailed design for a clean-slate password replacement I called Pico, an alternative system intended to be easier to use and more secure than passwords. The European Research Council was generous enough to fund my vision with a grant that allowed me to recruit and lead a team of brilliant researchers over a period of five years. We built a number of prototypes, wrote a bunch of papers, offered projects to a number of students and even launched a start-up and thereby learnt a few first-hand lessons about business, venture capital, markets, sales and the difficult process of transitioning from academic research to a profitable commercial product. During all those years we changed our minds a few times about what ought to be done and we came to understand a lot better both the problem space and the mindset of the users.
Ethical issues in research using datasets of illicit origin (blog post) came about because in prior work we had noticed that there were ethical complexities to take care of when using data that had “fallen off the back of a lorry” such as the backend databases of hacked booter services that we had used. We took a broad look at existing published guidance to synthesise those issues which particularly apply to using data of illicit origin and which we expected to see discussed by researchers:
Deep neural networks (DNNs) have been a very active field of research for eight years now, and for the last five we’ve seen a steady stream of adversarial examples – inputs that will bamboozle a DNN so that it thinks a 30mph speed limit sign is a 60 instead, and even magic spectacles to make a DNN get the wearer’s gender wrong.
So far, these attacks have targeted the integrity or confidentiality of machine-learning systems. Can we do anything about availability?
Sponge Examples: Energy-Latency Attacks on Neural Networks shows how to find adversarial examples that cause a DNN to burn more energy, take more time, or both. They affect a wide range of DNN applications, from image recognition to natural language processing (NLP). Adversaries might use these examples for all sorts of mischief – from draining mobile phone batteries, through degrading the machine-vision systems on which self-driving cars rely, to jamming cognitive radar.
So far, our most spectacular results are against NLP systems. By feeding them confusing inputs we can slow them down by a factor of over 100. There are already examples in the real world where people pause or stumble when asked hard questions, but we now have a dependable method for generating such examples automatically and at scale. We can also neutralize the performance improvements of accelerators for computer vision tasks, and make them operate at their worst-case performance.
One implication is that engineers designing real-time systems that use machine learning will have to pay more attention to worst-case behaviour; another is that when custom chips used to accelerate neural network computations use optimisations that increase the gap between worst-case and average-case outcomes, you’d better pay even more attention.
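To see why the gap between average-case and worst-case cost matters, here is a toy sketch. It is not the method from the paper: it uses plain random search on a single random ReLU layer, and it assumes a hypothetical accelerator whose cost is simply the number of non-zero activations it has to process. All names, sizes and the cost model are illustrative assumptions.

```python
import random

def relu_layer(weights, x):
    """Dense layer followed by ReLU; `weights` is a list of rows."""
    return [max(0.0, sum(w * xi for w, xi in zip(row, x))) for row in weights]

def zero_skipping_cost(activations):
    """Assumed cost model: work is spent only on non-zero activations."""
    return sum(1 for a in activations if a > 0.0)

def sample_costs(weights, dim, trials, rng):
    """Cost of `trials` random inputs drawn uniformly from [-1, 1]^dim."""
    costs = []
    for _ in range(trials):
        x = [rng.uniform(-1.0, 1.0) for _ in range(dim)]
        costs.append(zero_skipping_cost(relu_layer(weights, x)))
    return costs

rng = random.Random(0)  # fixed seed so runs are repeatable
dim, hidden = 8, 64
weights = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(hidden)]

costs = sample_costs(weights, dim, trials=2000, rng=rng)
average_cost = sum(costs) / len(costs)
worst_cost = max(costs)  # the most expensive input found by random search
print(f"average cost: {average_cost:.1f}, worst found: {worst_cost} of {hidden}")
```

A system budgeted around `average_cost` has no headroom for an input that reaches `worst_cost`, and an adversary who searches deliberately rather than at random can be expected to do at least as well as this sketch.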
One would be hard pressed to find an aspect of life where networks are not present. Interconnections are at the core of complex systems – such as society, or the world economy – allowing us to study and understand their dynamics. Some of the most transformative technologies are based on networks, be they hypertext documents making up the World Wide Web, interconnected networking devices forming the Internet, or the various neural network architectures used in deep learning. Social networks that are formed based on our interactions play a central role in our everyday lives; they determine how ideas and knowledge spread and they affect behaviour. This is also true for cybercriminal networks present on underground forums, and social network analysis provides valuable insights into how these communities operate either on the dark web or the surface web.
For today’s post in the series ‘Three Paper Thursday’, I’ve selected three papers that highlight the valuable information we can learn from studying underground forums if we model them as networks. Network topology and large scale structure provide insights into information flow and interaction patterns. These properties, along with discovering central nodes and the roles they play in a given community, are useful not only for understanding the dynamics of these networks but for various purposes, such as devising disruption strategies.
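As a rough illustration of what modelling a forum as a network can look like, the sketch below builds a graph from reply relationships and ranks members by degree centrality, one of the simplest measures of how central a node is. The usernames and reply pairs are invented for the example, and real analyses would typically use richer measures and a dedicated graph library.

```python
from collections import defaultdict

# Hypothetical reply edges: (author of the reply, author being replied to).
replies = [
    ("alice", "bob"), ("carol", "bob"), ("dave", "bob"),
    ("bob", "alice"), ("erin", "alice"), ("dave", "carol"),
]

def degree_centrality(edges):
    """Normalised degree centrality on the undirected version of the graph:
    number of distinct neighbours divided by (n - 1)."""
    neighbours = defaultdict(set)
    for src, dst in edges:
        if src != dst:  # ignore self-replies
            neighbours[src].add(dst)
            neighbours[dst].add(src)
    n = len(neighbours)
    if n < 2:
        return {node: 0.0 for node in neighbours}
    return {node: len(adj) / (n - 1) for node, adj in neighbours.items()}

centrality = degree_centrality(replies)
most_central = max(centrality, key=centrality.get)
print(most_central, centrality[most_central])  # bob 0.75
```

In this toy graph `bob` interacts with three of the four other members, so removing that node would disconnect more conversations than removing any other, which is the intuition behind using centrality to plan disruption.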
We are specifically interested in extending our data collection to better record how cybercrime has changed in response to the COVID-19 pandemic and we wish to mine our datasets in order to understand whether cybercrime has increased, decreased or been displaced during 2020.
There are a lot of theories being proposed as to what may or may not have changed, often based on handfuls of anecdotes. We are looking for researchers who will help us provide data-driven descriptions of what is (now) going on, which will feed into policy debates as to the future importance of cybercrime and how best to respond to it.
We are not necessarily looking for existing experience in researching cybercrime, although this would be a bonus. However, we are looking for strong programming skills — and experience with scripting languages and databases would be much preferred. Good knowledge of English and communication skills are important.
Since these posts are only guaranteed to be funded until the end of September, we will be shortlisting candidates for (online) interview as soon as possible (NOTE the application deadline is less than ONE WEEK AWAY) and will be giving preference to people who can take up a post without undue delay. The rapid timescale of the hiring process means that we will only be able to offer positions to candidates who already have permission to work in the UK (which, as a rough guide, means UK or EU citizens or those with existing appropriate visas).
We do not realistically expect to be permitted to return to our desks in the Computer Laboratory before the end of September, so it will be necessary for successful candidates to be able to “work from home” … not necessarily within the UK.
Please follow this link to read the formal advertisement, which gives the details of exactly who and what we’re looking for and how to apply.
Much has been made in the cybersecurity literature of the transition of cybercrime to a service-based economy, with specialised services providing Denial of Service attacks, cash-out services, escrow, forum administration, botnet management, or ransomware configuration to less-skilled users. Despite this acknowledgement of the ‘industrialisation’ of much of the cybercrime economy, the picture of cybercrime painted by law enforcement and media reports is often one of ‘sophisticated’ attacks, highly-skilled offenders, and massive payouts. In fact, as we argue in a recent paper accepted to the Workshop on the Economics of Information Security this year (and covered in KrebsOnSecurity last week), cybercrime-as-a-service relies on a great deal of tedious, low-income, and low-skilled manual administrative work.
Yesterday’s publication of the minutes of the government’s Scientific Advisory Group for Emergencies (SAGE) raises some interesting questions. An initial summary in yesterday’s Guardian has a timeline suggesting that it was the distinguished medics on SAGE rather than the Prime Minister who went from complacency in January and February to panic in March, and who ignored the risk to care homes until it was too late.
Is this a Machiavellian conspiracy by Dominic Cummings to blame the scientists, or is it business as usual? Having spent a dozen years on the university’s governing body and various of its subcommittees, I can absolutely get how this happened. Once a committee gets going, it can become very reluctant to change its opinion on anything. Committees can become sociopathic, worrying about their status, ducking liability, and finding reasons why problems are either somebody else’s or not practically soluble.
So I spent a couple of hours yesterday reading the minutes, and indeed we see the group worried about its power: on February 13th it wants the messaging to emphasise that official advice is both efficacious and sufficient, to “reduce the likelihood of the public adopting unnecessary or contradictory behaviours”. Turf is defended: Public Health England (PHE) ruled on February 18th that it can cope with 5 new cases a week (meaning tracing 800 contacts) and hoped this might be increased to 50; they’d already decided the previous week that it wasn’t possible to accelerate diagnostic capacity. So far, so much as one might expect.
The big question, though, is why nobody thought of protecting people in care homes. The answer seems to be that SAGE dismissed the problem early on as “too hard” or “not our problem”. On March 5th they note that social distancing for over-65s could save a lot of lives and would be most effective for those living independently: but it would be “a challenge to implement this measure in communal settings such as care homes”. They appear more concerned that “Many of the proposed measures will be easier to implement for those on higher incomes” and the focus is on getting PHE to draft guidance. (This is the meeting at which Dominic Cummings makes his first appearance, so he cannot dump all the blame on the scientists.)