Is Apple’s NeuralMatch searching for abuse, or for people?

Apple stunned the tech industry on Thursday by announcing that the next version of iOS and macOS will contain a neural network to scan photos for sex abuse. Each photo will get an encrypted ‘safety voucher’ saying whether or not it’s suspect, and if more than about ten suspect photos are backed up to iCloud, then a clever cryptographic scheme will unlock the keys used to encrypt them. Apple staff or contractors can then look at the suspect photos and report them.
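The "clever cryptographic scheme" is a threshold construction: no single safety voucher reveals anything, but enough of them together unlock the decryption key. As a minimal sketch of that idea, here is a toy Shamir-style secret-sharing scheme (illustrative parameters only; Apple's actual protocol is considerably more elaborate):

```python
# Toy Shamir threshold secret sharing: the key is split so it can only be
# reconstructed once at least `threshold` shares (think: one per flagged
# photo) are available. Illustrative only -- not Apple's actual scheme.
import random

PRIME = 2**61 - 1  # a Mersenne prime, large enough for a toy demo

def make_shares(secret, threshold, n):
    # Random polynomial of degree threshold-1 with constant term = secret.
    coeffs = [secret] + [random.randrange(PRIME) for _ in range(threshold - 1)]
    def f(x):
        return sum(c * pow(x, i, PRIME) for i, c in enumerate(coeffs)) % PRIME
    return [(x, f(x)) for x in range(1, n + 1)]

def recover(shares):
    # Lagrange interpolation at x = 0 recovers the constant term.
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = num * (-xj) % PRIME
                den = den * (xi - xj) % PRIME
        secret = (secret + yi * num * pow(den, PRIME - 2, PRIME)) % PRIME
    return secret

key = 123456789
shares = make_shares(key, threshold=10, n=30)  # one share per photo voucher
assert recover(shares[:10]) == key   # 10 shares suffice
assert recover(shares[:9]) != key    # 9 reveal nothing (with overwhelming probability)
```

With fewer shares than the threshold, the interpolated constant term is statistically independent of the key, which is what lets the vouchers sit on Apple's servers without revealing anything below the threshold.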

We’re told that the neural network was trained on 200,000 images of child sex abuse provided by the US National Center for Missing and Exploited Children. Neural networks are good at spotting images “similar” to those in their training set, and people unfamiliar with machine learning may assume that Apple’s network will recognise criminal acts. The police might even be happy if it recognises a sofa on which a number of acts took place. (You might be less happy, if you own a similar sofa.) Then again, it might learn to recognise naked children, and flag up a snap of your three-year-old child on the beach. So what the new software in your iPhone actually recognises is really important.

Now the neural network described in Apple’s documentation appears very similar to the networks used in face recognition (hat tip to Nicko van Someren for spotting this). So it seems a fair bet that the new software will recognise people whose faces appear in the abuse dataset on which it was trained.

So what will happen when someone’s iPhone flags ten pictures as suspect, and the Apple contractor who looks at them sees an adult with their clothes on? There’s a real chance that they’re either a criminal or a witness, so they’ll have to be reported to the police. In the case of a survivor who was victimised ten or twenty years ago, and whose pictures still circulate in the underground, this could mean traumatic secondary victimisation. It might even be their twin sibling, or a genuine false positive in the form of someone who just looks very much like them. What processes will Apple use to manage this? Not all US police forces are known for their sensitivity, particularly towards minority suspects.

But that’s just the beginning. Apple’s algorithm, NeuralMatch, stores a fingerprint of each image in its training set as a short string called a NeuralHash, so new pictures can easily be added to the list. Once the tech is built into your iPhone, your MacBook and your Apple Watch, and can scan billions of photos a day, there will be pressure to use it for other purposes. The other part of NCMEC’s mission is missing children. Can Apple resist demands to help find runaways? Could Tim Cook possibly be so cold-hearted as to refuse to add Madeleine McCann to the watch list?
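To see why adding new pictures to the list is so easy, consider what a perceptual fingerprint is. As a rough analogy (far simpler than Apple's CNN-based NeuralHash), here is a toy "average hash" on a grid of grayscale pixels:

```python
# Toy perceptual "average hash": a much simpler stand-in for NeuralHash.
# Each pixel becomes one bit: 1 if brighter than the image mean, else 0.
# Similar images yield the same short bit string, so adding a new target
# image to a watch list is just adding one more string to a database.

def average_hash(pixels):
    mean = sum(pixels) / len(pixels)
    return ''.join('1' if p > mean else '0' for p in pixels)

image        = [10, 200, 30, 220, 15, 240, 25, 210, 20]     # 3x3 grayscale
recompressed = [12, 198, 28, 222, 14, 238, 27, 208, 19]     # minor distortion

# The fingerprint survives small value-level changes such as recompression:
assert average_hash(image) == average_hash(recompressed)
```

The policy point follows directly: once the matching machinery is deployed, extending the target list is a database update, not an engineering project.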

After that, your guess is as good as mine. Depending on where you are, you might find your photos scanned for dissidents, religious leaders or the FBI’s most wanted. It also reminds me of the Rasterfahndung in 1970s Germany – the dragnet search of all digital data in the country for clues to the Baader-Meinhof gang. Only now it can be done at scale, and not just for the most serious crimes either.

Finally, there’s adversarial machine learning. Neural networks are fairly easy to fool in that an adversary can tweak images so they’re misclassified. Expect to see pictures of cats (and of Tim Cook) that get flagged as abuse, and gangs finding ways to get real abuse past the system. Apple’s new tech may end up being a distributed person-search machine, rather than a sex-abuse prevention machine.

Such a technology requires public scrutiny, and as the possession of child sex abuse images is a strict-liability offence, academics cannot work with them. While the crooks will dig out NeuralMatch from their devices and play with it, we cannot. It is possible in theory for Apple to get NeuralMatch to ignore faces; for example, it could blur all the faces in the training data, as Google does for photos in Street View. But they haven’t claimed they did that, and if they did, how could we check? Apple should therefore publish full details of NeuralMatch plus a set of NeuralHash values trained on a public dataset with which we can legally work. It also needs to explain how the system it deploys was tuned and tested; and how dragnet searches of people’s photo libraries will be restricted to those conducted by court order so that they are proportionate, necessary and in accordance with the law. If that cannot be done, the technology must be abandoned.

12 thoughts on “Is Apple’s NeuralMatch searching for abuse, or for people?”

  1. “ Apple stunned the tech industry on Thursday by announcing that the next version of iOS and macOS will contain a neural network to scan photos for sex abuse.”

    I believe this is a misinterpretation, but the whole of the post relies on this, and if this sentence is wrong then the whole post is.

    Apple is introducing two very different things. One is a NN system which will be applied to iMessage messages (sent and received) *if* the phone’s owner is designated a minor (under 13) in a “family” system by a parent, *and* *if* the parent chooses to turn it on. If the minor sends or receives content that the NN determines to be sexual and inappropriate, it will alert the minor and also their “parent/s”.

    The other thing, entirely separate, that Apple is introducing is a system for scanning photos that are about to be uploaded to iCloud Photo Library. (If they aren’t going to be uploaded, they won’t be scanned.) The fingerprint of the photo is compared against the fingerprints of *known* CSAM: “ Instead of scanning images in the cloud, the system performs on-device matching using a database of known CSAM image hashes provided by NCMEC and other child safety organizations. Apple further transforms this database into an unreadable set of hashes that is securely stored on users’ devices.
    “Before an image is stored in iCloud Photos, an on-device matching process is performed for that image against the known CSAM hashes. This matching process is powered by a cryptographic technology called private set intersection, which determines if there is a match without revealing the result.”
    There is *no* reference to machine learning in this. It is about finding *known* CSAM, and we don’t know what the trigger threshold is: the figure of ten does not appear anywhere in Apple’s literature or briefings.
    I would welcome a reexamination of this based on a careful reading of the Apple page (at apple.com/child-safety). And I’d hope that Madeleine McCann’s picture would be in the NCMEC database – she is, or was, a missing child.

    1. Convolutional neural networks are a form of deep learning. On page 6 of https://www.apple.com/child-safety/pdf/CSAM_Detection_Technical_Summary.pdf it is stated that “Before the [PSI] protocol begins, Apple and the user’s device have distinct sets of image hashes that each system computed using the NeuralHash algorithm” – and the description given on page 5 is clear that the first step in generating a NeuralHash is to pass a (user) image into a “convolutional neural network” that has been “trained through a self-supervised training scheme”.

      The diagram and explanation on page 7 confirm that the user’s device computes the image NeuralHash before any matching takes place.

      I read this as several explicit references to (a form of) machine learning being used in the CSAM matching process, but of course I stand to be corrected.

      1. The details of the neural-network part remain unclear from the information released so far; however, “self-supervised training scheme” does sound like this is not a domain-specific network, i.e. nobody has taught the network to look for anything in particular using any classified training set, such as typical elements of CSAM images. It sounds more like the network was merely trained to produce outputs that remain invariant under minor image transforms, such as compression artifacts, rescaling or minor cropping.

  2. From Apple’s description of how the output of the CNN-hash is used, namely by directly using just the most-significant bits of a normalised output-layer vector (points on a hypersphere), it seems pretty clear that all they are doing is applying a general-purpose perceptual compression step (i.e., no neural network is trained on a CSAM library!) followed by a hash-based key-derivation function, essentially matching normalised thumbnails of images almost exactly against a library of given images. There isn’t any distance threshold or Hamming-distance search involved: the key derived from the neural network must match the quantised hypersphere coordinates bit for bit, or the image will not result in a secret share of the voucher key being revealed. This system specifically does not involve any kind of AI-based image classification or categorisation or identification of persons or activities. The phone has no way of finding out if it found anything, because all it does is compress the image and then offer it in a privacy-preserving exact bit-string comparison.
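    Under that reading, matching reduces to quantising a normalised vector to its sign bits and testing exact set membership. A toy sketch of the idea (the vectors and dimensions here are made up for illustration):

```python
# Sketch of the reading above: the network's output vector is normalised
# (a point on a hypersphere), quantised to one bit per axis, and the
# result must match a database entry bit for bit -- no Hamming distance,
# no nearest-neighbour search.
import math

def quantise(vec):
    norm = math.sqrt(sum(v * v for v in vec))
    unit = [v / norm for v in vec]
    return ''.join('1' if u >= 0 else '0' for u in unit)  # sign bit per axis

db = {quantise([0.9, -0.2, 0.4])}   # fingerprint of a "known image"

assert quantise([0.85, -0.25, 0.38]) in db    # small perturbation: same bits
assert quantise([-0.9, -0.2, 0.4]) not in db  # flipped axis: no match at all
```

Note the all-or-nothing behaviour: a perturbation that crosses no quantisation boundary matches exactly, while one that crosses any boundary matches not at all.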

    Having now read the actual description of the system, I am deeply impressed by how much thought and effort has gone into what appears to be an excellent piece of privacy-preserving technology, namely the ftPSI-ad protocol used. The designers went to great lengths to reveal absolutely no information to any party, other than telling the iCloud Photo servers of the presence on those servers of a minimum number of images found in a known-size library of CSAM material. I can’t find anything objectionable in that. It appears to be a rather well-designed system, and is probably going to be the most sophisticated privacy-preserving technology to be commercially deployed so far.

  3. Use of adversarial machine learning will be hindered by the fact that the client only holds a blinded list of the hash values of the CSAM-library images. In other words, reverse engineering the content of the phone will not tell you anything about what characteristics an input image needs in order to produce a match. The phone does not know if it found anything, because it doesn’t perform any search, scan or matching. It just hashes and participates in a private-set-intersection protocol for an exact string comparison. So you have no target function to optimise, unless you source CSAM images from elsewhere and speculate about which of them may or may not currently be in Apple’s library of CSAM hashes.

  4. Some other problems:
    – it makes it easy to disable a target’s Apple account (by weaponizing such content).
    – like it or not, sexting is an increasingly common behaviour among teenagers that is unrelated to abuse, and now this private content would probably be exposed to authorities (and worse, to Apple staff or contractors).

  5. The best piece I’ve read so far, of the many on Apple’s new data initiative, is here. Its author runs a photo forensics service, regularly reports CSAM to NCMEC, and understands Microsoft’s PhotoDNA. Contrary to Microsoft’s claims, PhotoDNA is reversible; anyone who understands it can take a “hash” and reverse it to a thumbnail. It follows that anyone with access to the hashes can use them to generate low-resolution copies of the original sex-abuse images. What’s more, about 20% of the NCMEC hashes are false positives to begin with.

    1. While it is quite likely that one could train a Generative Adversarial Network that could reconstruct an image from one of Apple’s neural network hashes, crucially, end users DO NOT have access to the list of hashes. Apple have implemented a threshold Private Set Intersection algorithm that allows Apple to check if the set of hashes match a threshold number of images in their database, in such a way that neither side knows anything about the other side’s image hashes until that threshold is met. Users don’t ever see Apple’s hash database, so even if they had a hash reversal GAN it wouldn’t do them any good.

      As for the false positive rate, the operator of FotoForensics notes that one of the five hashes that matched the PhotoDNA hash list was a picture of a man holding a monkey, not CSAM. Firstly, while a one-out-of-five score lets us estimate the error rate to be 20%, that estimate comes with a much wider error bar than a 100-out-of-500 score, or even a 2-out-of-10 score. Also note that this is an error rate on the categorisation of the matching hashes, not on the categorisation of the images being checked. FotoForensics handled nearly 1 million images last year and just one image was a false positive match against the hash list. Even if the 20% estimate of the hash list categorisation accuracy is close to correct, we have a 0.0001% false positive rate when scanning images. We don’t know the error rate on Apple’s NN image hashes, but we do know that they are then applying a threshold scheme, so that it takes several matches to trigger closer inspection. Even if Apple’s scheme doesn’t make any improvements over what has gone before, someone would need to upload many millions of images to iCloud before they could expect to be falsely flagged.
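      The arithmetic in this comment can be checked with a quick binomial calculation. This assumes matches are independent (a simplification) and uses the threshold of 10 mentioned in the post, which Apple has not confirmed:

```python
# Back-of-the-envelope check of the comment's numbers.
from math import isclose

p = 1 / 1_000_000  # per-image false-positive rate suggested by FotoForensics' data
assert isclose(p * 100, 0.0001)  # i.e. 0.0001% per scanned image

def prob_at_least(n, k, p):
    # P(X >= k) for X ~ Binomial(n, p), computed term by term in floats
    # to avoid the huge integers exact binomial coefficients would need.
    term = (1 - p) ** n          # P(X = 0)
    tail = 0.0
    for i in range(1, n + 1):
        term *= (n - i + 1) * p / (i * (1 - p))
        if i >= k:
            tail += term
    return tail

# With a hypothetical threshold of 10 matches, even a 100,000-photo
# library is astronomically unlikely to be flagged by chance alone:
assert prob_at_least(100_000, 10, p) < 1e-15
```

The expected number of false matches in a 100,000-photo library is 0.1, so reaching 10 by chance is roughly a 17-orders-of-magnitude event under these assumptions; the threshold, not the per-image accuracy, does the heavy lifting.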

  6. I am concerned about unintended harms.

    The major risks in the suppression of content have traditionally included:
    – keeping information about mental health and staying safe away from girls and LGBT people
    – keeping health care and safety information away from women and girls
    – preventing the transmission of information about pregnancy, childbirth, and breastfeeding.

    This would be avoided if the training data included a corpus of
    – body positive images about transitioning
    – educational information including information on sexual health for LGBT
    – images of childbirth and post-natal care for birthing mothers so these are not flagged
    – breastfeeding, childcare, and other positive parenting images

    Training data is not the only safeguard, though: reviews of how the system handles such content after it was built could also provide some protection.

    I know of no dataset that would be used to minimize harm against these populations.

    I know of no company that engages in ML design with intent to minimize harm against minoritized populations, but it may be happening.

    If it is happening, only Apple would know, and they would only know if they bothered to track the false positives with an eye to detecting these situations and improving them. But there is no transparency in this process, perhaps because transparency might itself result in quite terrible privacy violations. We know Facebook has a tradition of censoring all of these populations.

    This seems to pose a high risk of censorship to other populations about whom we are also rightly concerned.

  7. Here is an op-ed I wrote for the Guardian.

    After the paper accepted it on Friday – after several days of to-and-fro with lawyers, who may have been talking to Apple lawyers and who insisted that I cut comments about Apple’s proposed system being cheaper than the scanning systems already in use by other tech majors – Apple released more information hinting that the threshold would be 30 suspect images rather than 10, and sort-of-denying that the NeuralMatch algorithm would look for faces. In their newly released threat model they claim an image-level false-positive rate of 3 in 100m but don’t expect it to perform that well in the field. One might ask: if the algorithm is tuned to alarm only on very close matches to file photos, then what’s to stop the gangs circumventing it by minor image edits? Twenty-five years ago, when the policy issue was copyright, we released a test suite called Stirmark which does just that. How well does NeuralHash stand up to Stirmark? And if the algorithm has precision greater than 95%, why a threshold of 30 rather than 10?
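    The Stirmark point can be illustrated with a toy perceptual hash (a stand-in far simpler than NeuralHash, so this claims nothing about NeuralHash's actual robustness): value-level noise such as recompression may be survived, yet a tiny geometric edit flips the fingerprint entirely.

```python
# Sketch of Stirmark-style robustness testing: apply a small geometric
# edit (a one-pixel shift of a toy 1-D "image") and see whether a toy
# perceptual hash survives it. Illustrative only.

def average_hash(pixels):
    # One bit per pixel: 1 if brighter than the image mean, else 0.
    mean = sum(pixels) / len(pixels)
    return ''.join('1' if p > mean else '0' for p in pixels)

image   = [10, 200, 30, 220, 15, 240, 25, 210]
shifted = image[1:] + [10]          # shift left by one pixel

# The geometric edit flips the bits even though the content is unchanged:
assert average_hash(shifted) != average_hash(image)
```

This is exactly the kind of transform a Stirmark-style test suite automates; how many such edits NeuralHash tolerates is the open question the op-ed raises.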
