There are many different ways to represent the same text in Unicode. We’ve previously exploited this encoding-visualization gap to craft imperceptible adversarial examples against text-based machine learning systems and invisible vulnerabilities in source code.
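As a minimal illustration of this gap (not taken from the paper), the word "café" can be encoded either with the precomposed character U+00E9 or with a plain "e" followed by a combining acute accent; the two strings render identically but compare as unequal:

```python
import unicodedata

# "café" written two ways: precomposed é (U+00E9) vs. "e" + combining acute (U+0301)
precomposed = "caf\u00e9"
decomposed = "cafe\u0301"

print(precomposed == decomposed)   # False: same rendering, different code points
print(unicodedata.normalize("NFC", decomposed) == precomposed)  # True after normalization
```

Systems that compare or index text without normalizing it will treat these as entirely different strings, which is the gap the attacks below exploit.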
In our latest paper, we demonstrate another attack that exploits the same technique to target Google Search, Bing’s GPT-4-powered chatbot, and other text-based information retrieval systems.
Consider a snake-oil salesman trying to promote a bogus drug on social media. Sensible users would do a search on the alleged remedy before ordering it, and sites containing false information would normally be drowned out by genuine medical sources in modern search engine rankings.
But what if our huckster uses a rare Unicode encoding to replace one character in the drug’s name on social media? If a user pastes this string into a search engine, it will return pages containing the same rare encoding, which in practice means pages the huckster has planted. What’s more, these pages are very unlikely to appear in response to innocent queries.
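A sketch of this substitution, using a made-up drug name and a Cyrillic homoglyph as one example of such a rare encoding, might look like:

```python
# Hypothetical drug name (made up for illustration). Swapping one Latin
# letter for a visually identical Cyrillic homoglyph yields a string that
# renders the same but never matches the ordinary spelling in an index.
genuine = "Miraculon"                          # all Latin letters
poisoned = genuine.replace("o", "\u043e", 1)   # Cyrillic Small Letter O (U+043E)

print(poisoned)             # renders like "Miraculon"
print(genuine == poisoned)  # False: the strings differ at the code-point level
```

A search for the poisoned string therefore retrieves only documents that contain that exact byte sequence, while a search for the genuine spelling never surfaces them.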
The upshot is that an adversary who can manipulate a user into copying and pasting a string into a search engine can control the results seen by that user. They can hide such poisoned pages from regulators and others who are unaware of the magic encoding. These techniques can empower propagandists to convince victims that search engines validate their disinformation.