
WormGPT first emerged in 2023 as a black-hat tool built on GPT-J, offering threat actors a way to generate malicious content through an uncensored AI interface. It quickly gained traction in underground forums, where it was used for everything from phishing emails to malware scripts. Yet the original version was short-lived. Following exposure and increased scrutiny, WormGPT was shut down, leaving a gap in the illicit market for AI-driven tools.
That gap has since been filled, not with a single product, but with a growing category of rebranded and retooled models still operating under the WormGPT name. What began as one project has evolved into a recognizable cybercriminal label. Today, WormGPT is less of a single tool and more a marketing asset used to attract buyers seeking AI systems customized for offensive use.
“Hundreds of uncensored LLMs exist in Dark Web communities,” said Dave Tyson, Chief Intelligence Officer at Apollo Information Systems. “Many of them are labeled WormGPT as a means of convenience, just like Americans say Kleenex for a facial tissue, even though the first is actually a brand versus the true item. Some of them have distinctive names, like EvilGPT, but most criminal AI are glommed under the word ‘WormGPT’ as a catch-all term. Given the source code for WormGPT leaked and spread widely, it’s not especially surprising to see its dominance as a term or as backend code.”
New Variants, Familiar Tactics
The new generation of WormGPT tools includes variants like xzin0vich-WormGPT and keanu-WormGPT, both actively promoted on BreachForums. These models are pitched as “uncensored” and free from the ethical constraints that govern commercial platforms. Their appeal lies in what they lack: built-in safeguards, refusal mechanisms, or content filters that might block malicious prompts.
Descriptions of these tools now mirror the marketing language used for legitimate large language models (LLMs). Posts tout fast response times, adaptability, and ease of use. But instead of boosting productivity or creativity, the goal is to enable scams, social engineering, and exploit development. The positioning is deliberate, intended to show that these tools are not watered-down replicas but responsive, capable systems built for threat actors.
Mainstream AI Models Hijacked
According to new research from Cato CTRL, some of the latest WormGPT variants are powered not by obscure or custom-built models, but by high-profile systems developed by major players in the AI space. Among them are versions based on xAI’s Grok and Mixtral, a leading release from Mistral AI. These platforms were not created for malicious use, but cybercriminals are finding ways to bend them to their purposes.
Instead of rewriting the models from scratch, attackers are using new jailbreaking techniques to bypass built-in protections. Some go further by fine-tuning the models with illicit training data, while others remove guardrails through prompt manipulation alone. In most cases, the core architecture remains intact.
Implications for AI Governance and Security
The rise of WormGPT variants exposes a central dilemma in generative AI: dual-use capability. The same systems that power innovation in business, science, and communication can also be manipulated for fraud, deception, and exploitation. This isn’t a flaw in the technology itself. It’s a reflection of how open or semi-open ecosystems can be turned against their creators.
Threat actors are taking advantage of models designed to support transparency and collaboration. Open-weight systems like Mixtral allow researchers to build on shared work, but they also allow malicious users to strip away safety features or inject harmful training data. Even models that are not fully open can be jailbroken through prompt engineering, revealing just how thin the line is between accessibility and risk.
These developments raise difficult questions for AI providers:
- How resilient are their safeguards?
- How easily can guardrails be removed?
- And when those controls fail, who bears responsibility?
The WormGPT evolution suggests that preventing misuse is not just a technical challenge – it is a governance problem that demands urgent attention.
Identifying the Attackers
WormGPT may be the most recognizable name in malicious AI today, but it is unlikely to remain the only one. As threat actors become more sophisticated, branded offensive tools are poised to become a permanent fixture in the cybercrime landscape. The evolution of WormGPT shows how easily high-performance models can be co-opted, rebranded, and deployed at scale. This trend points to the emerging need for new layers of defense that extend beyond traditional network and endpoint protections.
Security researchers are already calling for stronger safeguards at the model level. Techniques like watermarking, fingerprinting, and usage monitoring can help identify when models are being abused or redistributed. These tools are still in early development, but they represent a critical step toward accountability in AI use.
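To make the watermarking idea concrete, here is a minimal, hypothetical sketch of the detection side of a “green list” statistical watermark, in the spirit of published research rather than any vendor’s production scheme. The hashing rule, the GREEN_FRACTION constant, and the whitespace tokenization are illustrative assumptions; a real detector would use the model’s own tokenizer and a secret key held by the provider.

```python
import hashlib

# Illustrative sketch only: a "green list" statistical watermark detector.
# The seeding rule, GREEN_FRACTION, and tokenization below are assumptions
# for demonstration, not any provider's actual watermarking scheme.

GREEN_FRACTION = 0.5  # fraction of the vocabulary treated as "green" at each step

def is_green(prev_token: str, token: str) -> bool:
    """Pseudo-randomly assign `token` to the green list, seeded by the previous token."""
    digest = hashlib.sha256(f"{prev_token}|{token}".encode()).digest()
    return (digest[0] / 255.0) < GREEN_FRACTION

def green_hit_zscore(tokens: list[str]) -> float:
    """z-score of observed green-token hits against the null hypothesis of unwatermarked text."""
    n = len(tokens) - 1
    if n <= 0:
        return 0.0
    hits = sum(is_green(tokens[i - 1], tokens[i]) for i in range(1, len(tokens)))
    expected = GREEN_FRACTION * n
    variance = n * GREEN_FRACTION * (1 - GREEN_FRACTION)
    return (hits - expected) / (variance ** 0.5)

# A high z-score suggests the text was generated by a model embedding this watermark.
sample = "the quick brown fox jumps over the lazy dog".split()
print(f"z = {green_hit_zscore(sample):.2f}")
```

The catch, as the research above implies, is that a signal like this only exists if the generating model embeds it, which is precisely what is lost when attackers strip down and redistribute open-weight models.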
“The proliferation of WormGPT variants underscores a growing reality in the AI threat landscape – the fact that LLM guardrails are not perfect,” said Margaret Cunningham, Director, Security & AI Strategy at Darktrace. “As we've seen with WormGPT and similar findings like HiddenLayer’s universal jailbreak technique, threat actors will continue to find creative ways to skirt safeguards, expose system prompts, and remove censorship.”
At the same time, AI threat intelligence is becoming a strategic necessity. Organizations will need to track how generative models are being exploited and build defenses that account for both technical and behavioral risk. In this arms race, understanding the tools is as important as defending against them.
WormGPT’s evolution from a single tool to a growing family of rebranded variants reflects a shift in how malicious AI is developed, marketed, and deployed. As generative models become more accessible, so do the risks. Combating this threat will require more than better models. It will demand stronger oversight, smarter defenses, and a deeper understanding of how these technologies are being weaponized in the wild.