Commercial AI Models Demonstrate Swift Advancements in Vulnerability Research

How well artificial intelligence (AI) can find security vulnerabilities has become a focal point of cybersecurity research. Recent findings from Forescout’s Vedere Labs show that non-public frontier AI models, such as Anthropic’s Claude Mythos, have uncovered thousands of zero-day vulnerabilities across major operating systems, and that commercial AI models are also getting markedly better at proactively identifying software bugs.

A year ago, Forescout reported that 55% of the AI models it tested failed to carry out basic vulnerability research adequately, and that 93% failed at exploit development tasks. The picture has changed sharply since then. As of 2026, Forescout says that all tested models completed their vulnerability research tasks, and that half of them could autonomously generate functional exploits.

In its study, Vedere Labs evaluated 50 AI models, spanning commercial models, open-source models, and models from the underground sphere. Of those assessed, Forescout identified Claude Opus 4.6 and Kimi K2.5 as the most capable. These models can now locate and exploit vulnerabilities without requiring complex prompts, which drastically lowers the entry barrier for novice attackers who lack specialized cybersecurity skills.

Rik Ferguson, VP of Security Intelligence at Forescout, underscored the implications of these advancements: “These are widely available AI models exceeding human capability.” He qualified the claim, however, by acknowledging that these models may not yet match the scale, speed, and quality of Anthropic’s Claude Mythos.

During testing, Forescout used several methods to identify vulnerabilities, including single prompts and the RAPTOR agentic framework. In the process, it discovered four new zero-day vulnerabilities in OpenNDS, a broadly deployed piece of software. RAPTOR, an open-source agentic AI tool designed for both offensive and defensive cybersecurity operations, played a crucial role in this discovery. Ferguson highlighted that one of the vulnerabilities was found in code that Vedere Labs had already analyzed manually, which illustrates AI’s capacity to reveal flaws that human analysts had missed.

AI Lowers the Barrier to Discovering Unknown Vulnerabilities

In the tests, commercial AI models performed best at vulnerability detection. Forescout acknowledged, however, that their high cost could be a barrier to widespread adoption: Claude Opus 4.6, for instance, is priced as high as $25 per million output tokens. Open-source alternatives such as DeepSeek 3.2, by contrast, can manage basic tasks at minimal cost, with all test tasks averaging less than $0.70. This disparity raises practical questions about operational feasibility for organizations looking to bolster their cybersecurity defenses.

Set against Claude Mythos, which is priced at $25 to $125 per million input/output tokens, these figures show that choosing the right tool is not straightforward. As a result, both defenders and attackers are increasingly selecting models by weighing the complexity of the task against cost-efficiency.
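The per-token prices quoted above only become comparable once they are multiplied by a workload. The following is a minimal sketch of that arithmetic, not Forescout’s methodology. It assumes the Claude Mythos range means $25 per million input tokens and $125 per million output tokens, and the token counts for the task are purely hypothetical. Because the article gives only an output price for Claude Opus 4.6, the sketch computes just the output portion of that model’s cost.

```python
def token_cost_usd(input_tokens, output_tokens,
                   input_price_per_million, output_price_per_million):
    """Return the cost in USD of a task billed per million tokens."""
    input_cost = input_tokens * input_price_per_million
    output_cost = output_tokens * output_price_per_million
    return (input_cost + output_cost) / 1_000_000


# Hypothetical workload (an assumption, not a figure from the study):
# the model reads 200,000 tokens of code and writes 40,000 tokens.
TASK_INPUT_TOKENS = 200_000
TASK_OUTPUT_TOKENS = 40_000

# Assumed reading of "$25 to $125 per million input/output tokens".
mythos_cost = token_cost_usd(TASK_INPUT_TOKENS, TASK_OUTPUT_TOKENS, 25.0, 125.0)

# Output portion only: the article quotes no input price for Claude Opus 4.6,
# so no input tokens are billed in this call.
opus_output_cost = token_cost_usd(0, TASK_OUTPUT_TOKENS, 0.0, 25.0)

print(f"Claude Mythos, full task:         ${mythos_cost:.2f}")
print(f"Claude Opus 4.6, output portion:  ${opus_output_cost:.2f}")
```

Under these assumptions, the same hypothetical task costs $10.00 on Claude Mythos, while the output portion alone on Claude Opus 4.6 comes to $1.00. Both sit well above the sub-$0.70 average reported for DeepSeek 3.2, which is the kind of gap that pushes teams to match model choice to task complexity.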

Forescout’s findings suggest that its own discovery of new vulnerabilities, combined with efforts such as Project Glasswing, which is designed to unearth thousands of zero-days in critical software, means that organizations cannot dismiss the likelihood of hidden vulnerabilities in their environments. The underlying message is clear: whether deployed by defenders or malicious actors, AI will continue to reveal unknown vulnerabilities that organizations must proactively address to secure their systems.

The rapid evolution of AI model capabilities underscores an urgent need for businesses and security professionals to adapt their strategies. As these technologies advance, they will both enable attackers and empower defenders, forcing organizations to rethink their overall approach to cybersecurity.
