LDAP based UDP reflection attacks increase throughout 2017

There have been reports that UDP reflection DDoS attacks based on LDAP (aka CLDAP) have been increasing in recent months. Our network of UDP honeypots (described previously) confirms that this is the case. We estimate there are around 6000 attacks per day using this method. Our estimated number of attacks has risen fairly linearly from almost none at the beginning of 2017 to 5000-7000 per day at the beginning of 2018.
[Figure: number of attacks per day, rising roughly linearly from 0 at the beginning of 2017 to 5000-7000 at the beginning of 2018]

Over the period in which Netlab observed 304,146 attacks (the 365 days up to 2017-11-01), we observed 596,534 attacks. Our higher figure may be because we detect smaller attacks, or because we overcount when an attack targets a whole IP prefix.
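To illustrate how attacks on IP prefixes could inflate a count, here is a minimal sketch. It is not the method used by our honeypots or by Netlab. The input format (timestamp, victim IP), the 900-second gap used to split attacks, and the /24 grouping are all assumptions made for the example.

```python
import ipaddress
from collections import defaultdict


def count_attacks(events, prefix_len=None, gap=900):
    """Count attacks in a list of (timestamp_seconds, victim_ip) observations.

    Observations for the same victim key that are no more than `gap` seconds
    apart are merged into one attack (the gap value is an assumption).
    If prefix_len is None the key is the victim IP itself; otherwise it is
    the enclosing prefix of that length (e.g. 24 for a /24).
    """
    by_victim = defaultdict(list)
    for ts, ip in events:
        if prefix_len is None:
            key = ip
        else:
            # strict=False lets us pass a host address and get its network
            key = str(ipaddress.ip_network(f"{ip}/{prefix_len}", strict=False))
        by_victim[key].append(ts)

    attacks = 0
    for times in by_victim.values():
        times.sort()
        attacks += 1  # the first observation always starts an attack
        for prev, cur in zip(times, times[1:]):
            if cur - prev > gap:
                attacks += 1  # a long silence starts a new attack
    return attacks


# Hypothetical observations using documentation-only address ranges.
events = [
    (0, "192.0.2.1"),
    (10, "192.0.2.2"),
    (20, "192.0.2.3"),
    (5000, "192.0.2.1"),
    (30, "198.51.100.7"),
]

per_ip = count_attacks(events)
per_prefix = count_attacks(events, prefix_len=24)
print(per_ip, per_prefix)
```

Counting per victim IP treats an attack that sweeps across a prefix as several separate attacks, while counting per prefix merges them, so the two totals can differ substantially on the same data.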

The data behind this analysis is part of the Cambridge Cybercrime Centre’s catalogue of data available to academic researchers.

What Goes Around Comes Around

What Goes Around Comes Around is a chapter I wrote for a book by EPIC. What are America’s long-term national policy interests (and ours for that matter) in surveillance and privacy? The election of a president with a very short-term view makes this ever more important.

While Britain was top dog in the 19th century, we gave the world both technology (steamships, railways, telegraphs) and values (the abolition of slavery and child labour, not to mention universal education). America has given us the motor car, the Internet, and a rules-based international trading system – and may have perhaps one generation left in which to make a difference.

Lessig taught us that code is law. Similarly, architecture is policy. The architecture of the Internet, and the moral norms embedded in it, will be a huge part of America’s legacy, and the network effects that dominate the information industries could give that architecture great longevity.

So if America re-engineers the Internet so that US firms can microtarget foreign customers cheaply, so that US telcos can extract rents from foreign firms via service quality, and so that the NSA can more easily spy on people in places like Pakistan and Yemen, then in 50 years’ time the Chinese will use it to manipulate, tax and snoop on Americans. In 100 years’ time it might be India in pole position, and in 200 years the United States of Africa.

My book chapter explores this topic. What do the architecture of the Internet, and the network effects of the information industries, mean for politics in the longer term, and for human rights? Although the chapter appeared in 2015, I forgot to put it online at the time. So here it is now.

End of privacy rights in the UK public sector?

There has already been serious controversy about the “Henry VIII” powers in the Brexit Bill, which will enable ministers to rewrite laws at their discretion as we leave the EU. Now Theresa May’s government has sneaked a new “Framework for data processing in government” into the Lords committee stage of the new Data Protection Bill (see pages 99-101, which are pp 111–3 of the pdf). It will enable ministers to promulgate a Henry VIII privacy regulation with quite extraordinary properties.

It will cover all data held by any public body including the NHS (175(1)), be outside of the ICO’s jurisdiction (178(5)) and that of any tribunal (178(2)) including Judicial Review (175(4), 176(7)), wider human-rights law (178(2,3,4)), and international jurisdictions – although ministers are supposed to change it if they notice that it breaks any international treaty obligation (177(4)).

In fact it will be changeable on a whim by Ministers (175(4)), have no effective Parliamentary oversight (175(6)), and apply retroactively (178(3)). It will also provide an automatic statutory defence for any data processing in any Government decision taken to any tribunal/court (178(4)).

Ministers have had frequent fights in the past over personal data in the public sector, most frequently over medical records which they have sold, again and again, to drug companies and others in defiance not just of UK law, EU law and human-rights law, but of the express wishes of patients, articulated by opting out of data “sharing”. In fact, we have to thank MedConfidential for being the first to notice the latest data grab. Their briefing gives more details and sets out the amendments we need to press for in Parliament. This is not the only awful thing about the bill by any means; its section 164 will be terrible news for journalists. This is one of those times when you need to write to your MP. Please do it now!

Ethical issues in research using datasets of illicit origin

On Friday at IMC I presented our paper “Ethical issues in research using datasets of illicit origin” by Daniel R. Thomas, Sergio Pastrana, Alice Hutchings, Richard Clayton, and Alastair R. Beresford. We conducted this research after thinking about some of these issues in the context of our previous work on UDP reflection DDoS attacks.

Data of illicit origin is data obtained by illicit means, such as exploiting a vulnerability or unauthorized disclosure; in our previous work this was leaked databases from booter services. We analysed existing guidance on ethics, and papers that used data of illicit origin, to see what issues researchers are encouraged to discuss and what issues they did discuss. We find wide variation in current practice. We encourage researchers using data of illicit origin to include an ethics section in their paper, explaining why the work was ethical so that the research community can learn from the work. At present, in many cases the positive benefits as well as the potential harms of research remain entirely unidentified. Few papers record explicit Research Ethics Board (REB) (aka IRB/Ethics Committee) approval for the activity that is described, and the justifications given for exemption from REB approval suggest deficiencies in the REB process. It is also important to focus on the “human participants” of research rather than the narrower “human subjects” definition, as not all the humans that might be harmed by research are its direct subjects.

The paper and the slides are available.

Is this research ethical?

The Economist features face recognition on its front page, reporting that deep neural networks can now tell whether you’re straight or gay better than humans can just by looking at your face. The research they cite is a preprint, available here.

Its authors Kosinski and Wang downloaded thousands of photos from a dating site, ran them through a standard feature-extraction program, then classified gay vs straight using a standard statistical classifier, which they found could tell the men seeking men from the men seeking women. My students pretty well instantly called this out as selection bias; if gay men consider boyish faces to be cuter, then they will upload their most boyish photo. The paper authors suggest their finding may support a theory that sexuality is influenced by fetal testosterone levels, but when you don’t control for such biases your results may say more about social norms than about phenotypes.

Quite apart from the scientific value of the research, which is perhaps best assessed by specialists, I’m concerned with the ethics and privacy aspects. I am surprised that the paper doesn’t report having been through ethical review; the authors consider that photos on a dating website are public information and appear to assume that privacy issues simply do not arise.

Yet UK courts decided, in Campbell v Mirror, that privacy could be violated even by photos taken on the public street, and European courts have come to similar conclusions in I v Finland and elsewhere. For example, a Catholic woman is entitled to object to the use of her medical record in research on abortifacients and contraceptives even if the proposed use is fully anonymised and presents no privacy risk whatsoever. The dating site users would be similarly entitled to object to their photos being used in research to which they might have an ethical objection, even if they could not be identified from their photos. There are surely going to be people who object to research in any nature vs nurture debate, especially on a charged topic such as sexuality. And the whole point of the Economist’s coverage is that face-recognition technology is now good enough to work at population scale.

What do LBT readers think?

Is the City force corrupt, or just clueless?

This week brought an announcement from a banking association that “identity fraud” is soaring to new levels, with 89,000 cases reported in the first six months of 2017 and 56% of all fraud reported by its members now classed as “identity fraud”.

So what is “identity fraud”? The announcement helpfully clarifies the concept:

“The vast majority of identity fraud happens when a fraudster pretends to be an innocent individual to buy a product or take out a loan in their name. Often victims do not even realise that they have been targeted until a bill arrives for something they did not buy or they experience problems with their credit rating. To carry out this kind of fraud successfully, fraudsters need access to their victim’s personal information such as name, date of birth, address, their bank and who they hold accounts with. Fraudsters get hold of this in a variety of ways, from stealing mail through to hacking; obtaining data on the ‘dark web’; exploiting personal information on social media, or through ‘social engineering’ where innocent parties are persuaded to give up personal information to someone pretending to be from their bank, the police or a trusted retailer.”

Now back when I worked in banking, if someone went to Barclays, pretended to be me, borrowed £10,000 and legged it, that was “impersonation”, and it was the bank’s money that had been stolen, not my identity. How did things change?

The members of this association are banks and credit card issuers. In their narrative, those impersonated are treated as targets, when the targets are actually those banks on whom the impersonation is practised. This is a precursor to refusing bank customers a “remedy” for “their loss” because “they failed to protect themselves.”

Now “dishonestly making a false representation” is an offence under s2 Fraud Act 2006. Yet what is the police response?

The Head of the City of London Police’s Economic Crime Directorate does not see the banks’ narrative as dishonest. Instead he goes along with it: “It has become normal for people to publish personal details about themselves on social media and on other online platforms which makes it easier than ever for a fraudster to steal someone’s identity.” He continues: “Be careful who you give your information to, always consider whether it is necessary to part with those details.” This is reinforced with a link to a police website with supposedly scary statistics: 55% of people use open public wifi and 40% of people don’t have antivirus software (like many security researchers, I’m guilty on both counts). This police website has a quote from the Head’s own boss, a Commander who is the National Police Coordinator for Economic Crime.

How are we to rate their conduct? Given that the costs of the City force’s Dedicated Card and Payment Crime Unit are borne by the banks, perhaps they feel obliged to sing from the banks’ hymn sheet. Just as the Macpherson report criticised the Met for being institutionally racist, we might perhaps describe the City force as institutionally corrupt. There is a wide literature on regulatory capture, and many other examples of regulators keen to do the banks’ bidding. And it’s not just the City force. There are disgraceful examples of the Metropolitan Police Commissioner and GCHQ endorsing the banks’ false narrative. However, people are starting to notice, including the National Audit Office.

Or perhaps the police are just clueless?

History of the Crypto Wars in Britain

Back in March I gave an invited talk to the Cambridge University Ethics in Mathematics Society on the Crypto Wars. They have just put the video online here.

We spent much of the 1990s pushing back against attempts by the intelligence agencies to seize control of cryptography. From the Clipper Chip through the regulation of trusted third parties to export control, the agencies tried one trick after another to make us all less secure online, claiming that thanks to cryptography the world of intelligence was “going dark”. Quite the opposite was true; with communications moving online, with people starting to carry mobile phones everywhere, and with our communications and traffic data mostly handled by big firms who respond to warrants, law enforcement has never had it so good. Twenty years ago it cost over a thousand pounds a day to follow a suspect around, and weeks of work to map his contacts; Ed Snowden told us how nowadays an officer can get your location history with one click and your address book with another. In fact, searches through the contact patterns of whole populations are now routine.

The checks and balances that we thought had been built in to the RIP Act in 2000 after all our lobbying during the 1990s turned out to be ineffective. GCHQ simply broke the law and, after Snowden exposed them, Parliament passed the IP Act to declare that what they did was all right now. The Act allows the Home Secretary to give secret orders to tech companies to do anything they physically can to facilitate surveillance, thereby delighting our foreign competitors. And Brexit means the government thinks it can ignore the European Court of Justice, which has already ruled against some of the Act’s provisions. (Or perhaps Theresa May chose a hard Brexit because she doesn’t want the pesky court in the way.)

Yet we now see the Home Secretary, along with law enforcement officials on both sides of the Atlantic, repeating the old nonsense about decent people not needing privacy. Why doesn’t she just sign the technical capability notices she deems necessary and serve them?

In these fraught times it might be useful to recall how we got here. My talk to the Ethics in Mathematics Society was a personal memoir; there are many links on my web page to relevant documents.

Compartmentation is hard, but the Big Data playbook makes it harder still

A new study of Palantir’s systems and business methods makes sobering reading for people interested in what big data means for privacy.

Privacy scales badly. It’s OK for the twenty staff at a medical practice to have access to the records of the ten thousand patients registered there, but when you build a centralised system that lets every doctor and nurse in the country see every patient’s record, things go wrong. There are even sharper concerns in the world of intelligence, which agencies try to manage using compartmentation: really sensitive information is often put in a compartment that’s restricted to a handful of staff. But such systems are hard to build and maintain. Readers of my book chapter on the subject will recall that while US Naval Intelligence struggled to manage millions of compartments, the CIA let more of their staff see more stuff – whereupon Aldrich Ames betrayed their agents to the Russians.

After 9/11, the intelligence community moved towards the CIA model, in the hope that with fewer compartments they’d be better able to prevent future attacks. We predicted trouble, and Snowden duly came along. As for civilian agencies such as Britain’s NHS and police, no serious effort was made to protect personal privacy by compartmentation, with multiple consequences.

Palantir’s systems were developed to help the intelligence community link, fuse and visualise data from multiple sources, and are now sold to police forces too. It should surprise no-one to learn that they do not compartment information properly, whether within a single force or even between forces. The organised crime squad’s secret informants can thus become visible to traffic cops, and even to cops in other forces, with tragically predictable consequences. Fixing this is hard, as Palantir’s market advantage comes from network effects and the resulting scale. The more police forces they sign up the more data they have, and the larger they grow the more third-party databases they integrate, leaving private-sector competitors even further behind.

This much we could have predicted from first principles, but the details of how Palantir operates, and what police forces dislike about it, are worth studying.

What might be the appropriate public-policy response? Well, the best analysis of competition policy in the presence of network effects is probably Lina Khan’s, and her analysis would suggest in this case that police intelligence should be a regulated utility. We should develop those capabilities that are actually needed, and the right place for them is the Police National Database. The public sector is better placed to commit the engineering effort to do compartmentation properly, both there and in other applications where it’s needed, such as the NHS. Good engineering is expensive – but as the Los Angeles Police Department found, engaging Palantir can be more expensive still.