Monthly Archives: July 2020

Cambridge Cybercrime Centre: COVID briefing papers

The current coronavirus pandemic has significantly disrupted all our societies and, we believe, it has also significantly disrupted cybercrime.

In the Cambridge Cybercrime Centre we collect crime-related datasets of many types and we expect, in due course, to be able to identify, measure and document this disruption. We will not be alone in doing so — a key aim of our centre is to make datasets available to other academic researchers so that they too can identify, measure and document. What’s more, we make this data available in a timely manner — sometimes before we have even looked at it ourselves!

When we have looked at the data and identified what might be changing (or where the criminals are exploiting new opportunities) then we shall of course be taking the traditional academic path of preparing papers, getting them peer reviewed, and then presenting them at conferences or publishing them in academic journals. However, that process is extremely slow — so we have decided to provide a faster route for getting out the message about what we find to be going on.

Our new series of “COVID Briefing Papers” is an ongoing set of short-form, open access reports aimed at academics, policymakers, and practitioners. Each paper provides an accessible summary of our ongoing research into the effects that the coronavirus pandemic (and government responses to it) are having on cybercrime. We’re hoping, at least for a while, to produce a new briefing paper each week … and you can now read the very first, in which Ben Collier explains what has happened to illegal online drug markets… just click here!

Reinforcement Learning and Adversarial thinking

We all know that learning a new craft is hard. We spend a large part of our lives learning how to operate in the everyday physical world. Much of this learning comes from observing others, and when others can’t help, we learn through trial and error.

In machine learning, the process of learning how to deal with an environment is called Reinforcement Learning (RL). Through continuous interaction with its environment, an agent learns a policy that enables it to perform better. Learning by observation in RL is referred to as Imitation Learning. Both trial-and-error learning and imitation learning are hard: environments are not trivial, the ramifications of an action often cannot be judged until far in the future, environments are full of non-determinism, and there is no such thing as a single correct policy.
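To make the interaction loop concrete, here is a minimal sketch of tabular Q-learning on a made-up one-dimensional “corridor” environment. The environment, its rewards and all the names are invented for illustration and are not taken from any particular RL library; the agent simply behaves randomly while learning, and the value estimates it accumulates define the policy it would follow afterwards.

```python
import random
from collections import defaultdict

# Hypothetical toy environment: walk right along a short corridor to reach
# a goal. Reaching the goal pays +1; every other step costs a little.
class Corridor:
    def __init__(self, length=5):
        self.length = length
        self.pos = 0

    def reset(self):
        self.pos = 0
        return self.pos

    def step(self, action):  # action: 0 = left, 1 = right
        self.pos = max(0, min(self.length, self.pos + (1 if action == 1 else -1)))
        done = self.pos == self.length
        return self.pos, (1.0 if done else -0.01), done

# Tabular Q-learning driven by purely random behaviour: the agent interacts,
# observes rewards, and gradually improves its estimate of each action's value.
q = defaultdict(lambda: [0.0, 0.0])   # state -> estimated value of [left, right]
alpha, gamma = 0.1, 0.99
env = Corridor()

for episode in range(2000):
    state, done = env.reset(), False
    while not done:
        action = random.randrange(2)              # behave randomly while learning
        next_state, reward, done = env.step(action)
        target = reward + (0.0 if done else gamma * max(q[next_state]))
        q[state][action] += alpha * (target - q[state][action])
        state = next_state

# The learned policy: in every state, prefer the action with the higher value.
policy = {s: ("right" if q[s][1] > q[s][0] else "left") for s in range(env.length)}
print(policy)   # after enough episodes this should say "right" everywhere
```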

So, unlike in supervised and unsupervised learning, it is hard to tell whether your decisions are correct. An episode usually consists of thousands of decisions, and you only find out whether you performed well after exploring the alternatives. But whether to experiment is itself a hard decision: do you exploit the skill you already have, or try something new and explore the unknown?
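This exploit-or-explore dilemma is commonly handled with an epsilon-greedy rule: with a small probability the agent tries a random action, and otherwise it picks the one that currently looks best. The sketch below is illustrative only; the value estimates are made up and choose_action is a hypothetical helper, not part of any specific system.

```python
import random

def choose_action(q_values, epsilon):
    """Epsilon-greedy selection over a list of estimated action values:
    explore with probability epsilon, otherwise exploit the best estimate."""
    if random.random() < epsilon:
        return random.randrange(len(q_values))                        # explore
    return max(range(len(q_values)), key=lambda a: q_values[a])       # exploit

# Illustrative use with made-up value estimates for three actions.
estimates = [0.2, 0.7, 0.4]
picks = [choose_action(estimates, epsilon=0.1) for _ in range(1000)]
print(picks.count(1) / len(picks))   # mostly action 1, with occasional exploration
```

A common refinement is to decay epsilon over time, so the agent explores heavily while its estimates are unreliable and exploits more as they improve.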

Despite all these complexities, RL has managed to achieve incredible performance in a wide variety of tasks from robotics through recommender systems to trading. More impressively, RL agents have achieved superhuman performance in Go and other games, tasks previously believed to be impossible for computers. 
