Researchers have discovered a new method for spreading malware through generative AI (GenAI) apps like ChatGPT by utilizing clever prompt engineering and injection techniques. This worm, named “Morris II,” is capable of tricking AI models into replicating malicious prompts as output, allowing the malware to propagate to further AI agents.
In a controlled experiment, a team of Israeli researchers demonstrated how attackers could design “adversarial self-replicating prompts” to deceive generative AI models. These prompts can be used for various malicious activities such as stealing information, spreading spam, and poisoning models. The researchers created an email system that operates using generative AI and sent a prompt-laden email to showcase how the malware could contaminate the system’s database and force it to exfiltrate sensitive data.
Additionally, the researchers showed how an adversarial prompt could be encoded in an image to coerce the email assistant into forwarding the poisoned image to other hosts. This method enables attackers to automatically propagate spam, propaganda, malware payloads, and other malicious instructions through a chain of AI-integrated systems.
The emergence of malware targeting AI models reflects the continuation of traditional security threats in a new technological landscape. Andrew Bolster, senior R&D manager for data science at Synopsys, compares these attacks to well-known injection methods like SQL injection, emphasizing the importance of safeguarding AI systems against malicious inputs.
The researchers drew parallels between the Morris worm, a self-propagating malware from 1988, and modern AI malware. Just as the Morris worm exploited vulnerabilities to influence computer functions, hackers today leverage GenAI prompts to manipulate AI systems. Bolster suggests that developers may need to break up AI models into smaller components to separate data and control aspects, mitigating the risk of malicious inputs affecting system behavior.
Moving forward, the shift towards a distributed multiple agent approach in AI development could enhance security by implementing runtime content gateways and constraints on AI functionalities. By restructuring AI systems and incorporating strict protocols to distinguish between user input and machine output, developers aim to fortify AI models against exploitation and propagation of malware. This strategic approach mirrors the evolution from monolithic software architectures to microservices, enabling better control and monitoring of AI functionalities.