Today we are releasing Trojan Source: Invisible Vulnerabilities, a paper describing cool new tricks for crafting targeted vulnerabilities that are invisible to human code reviewers.
Until now, an adversary wanting to smuggle a vulnerability into software could try inserting an unobtrusive bug in an obscure piece of code. Critical open-source projects such as operating systems depend on human review of all new code to detect malicious contributions by volunteers. So how might wicked code evade human eyes?
This potentially devastating attack is tracked as CVE-2021-42574, while a related attack that uses homoglyphs – visually similar characters – is tracked as CVE-2021-42694. This work has been under embargo for a 99-day period, giving time for a major coordinated disclosure effort in which many compilers, interpreters, code editors, and repositories have implemented defenses.
This attack was inspired by our recent work on Imperceptible Perturbations, where we use directionality overrides, homoglyphs, and other Unicode features to break the text-based machine learning systems used for toxic content filtering, machine translation, and many other NLP tasks.