
Package hallucination: LLMs may deliver malicious code to negligent developers


Large language models (LLMs) are known to recommend code packages that do not actually exist, opening the door to a potential new type of supply chain attack referred to as "slopsquatting" by Seth Larson, the Security Developer-in-Residence at the Python Software Foundation.

Developers who rely on LLMs for programming assistance have encountered instances where these models present fabricated information as fact. The problem extends to coding, where LLMs have been observed to suggest software libraries and packages that do not exist. Researchers have raised concerns that attackers could exploit this behavior by publishing malicious packages under the hallucinated names on popular package registries such as PyPI and npm, where unsuspecting developers would then download them.

To explore the extent of the issue, a team of researchers from the University of Texas at San Antonio, the University of Oklahoma, and Virginia Tech tested 16 code-generation AI models against two prompt datasets. The LLMs produced 576,000 Python and JavaScript code samples, and nearly 20% of the packages they recommended turned out not to exist. Further analysis revealed that some hallucinated package names recurred consistently across repeated queries, indicating a persistent pattern rather than random error.

Dr. Murtuza Jadliwala, an Associate Professor in the Department of Computer Science at the University of Texas at San Antonio, highlighted the risk these hallucinated packages pose. If an LLM recommends a package that does not exist, a malicious actor can exploit the gap by publishing a package under that same name and injecting harmful code into it; an unsuspecting user who then installs and executes the recommended code pulls in the attacker's payload.
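
To make that opening concrete, here is a minimal sketch of the risky workflow; "quickcsv_tools" is a name invented for this example, not a real or recommended library:

    # Hypothetical LLM-suggested import; the package name is a placeholder.
    try:
        import quickcsv_tools  # fails because no such package is installed
    except ModuleNotFoundError:
        # The common reflex is to run `pip install quickcsv-tools` without
        # checking who published it. If an attacker has registered the
        # hallucinated name, that install fetches and runs their code.
        print("Module not found; verify the package before installing it.")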

The researchers also investigated the origins of these hallucinated packages and found that recently deleted packages were not a significant source of the problem. They did observe cross-language hallucinations, where a hallucinated package name in one programming language matched an existing package in another. While the majority of hallucinated package names were distinct from existing ones, they were often plausible and contextually relevant.

To address the issue, the researchers proposed mitigations that LLM creators can apply to reduce package hallucinations during code generation. They also advised individual coders to verify that any package an LLM recommends actually exists, and is trustworthy, before incorporating it into their code.
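
As one possible pre-install check (a minimal sketch that assumes PyPI's public JSON API at https://pypi.org/pypi/<name>/json is reachable, with placeholder package names), a developer can at least confirm that a suggested name corresponds to a registered project:

    import json
    import urllib.error
    import urllib.request

    def exists_on_pypi(package: str) -> bool:
        """Return True if `package` is a registered project on PyPI."""
        url = f"https://pypi.org/pypi/{package}/json"
        try:
            with urllib.request.urlopen(url, timeout=10) as resp:
                json.load(resp)  # a registered project returns a JSON record
            return True
        except urllib.error.HTTPError as err:
            if err.code == 404:  # unknown project name
                return False
            raise

    # Placeholder names standing in for whatever an LLM suggested.
    for name in ("requests", "quickcsv-tools"):
        verdict = "exists on PyPI" if exists_on_pypi(name) else "NOT on PyPI"
        print(f"{name}: {verdict}")

Existence alone is not proof of safety: a slopsquatted package passes this check by design, so unfamiliar names that do resolve still deserve a look at the project's maintainers, release history, and download statistics before installation.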

Overall, the prevalence of package hallucination among LLMs underscores a real risk to the software supply chain. By understanding and addressing the issue, developers can better protect their systems against this emerging class of attack.
