Claude Mythos: The AI Model That Slipped Through the Internet Before Anyone Could Stop It
A certain silence descends on a tech story when no one is quite sure what to say about it. That silence surrounds the Claude Mythos story. On a Tuesday in early April, Anthropic unveiled the model, described it as a turning point for cybersecurity, and then, almost in the same breath, announced that it would not be releasing it. About a week later came the awkward follow-up: the model appeared to have been obtained by a small group of users gathered in a private online forum that no one was willing to name.
The scene is easy enough to picture. Engineers refreshing dashboards somewhere in San Francisco. Somewhere else, in a Discord server of a few dozen users, a model that can quietly point out the gaps in decades-old code. It is the kind of scene that seems lifted from a thriller, except that in a thriller the antagonist is usually more obvious.
| Field | Detail |
| --- | --- |
| Model name | Claude Mythos (Preview) |
| Developer | Anthropic, San Francisco |
| Announced | 7 April 2026 |
| Status | Restricted release; not publicly available |
| Access program | Project Glasswing (roughly 40 partner firms) |
| Notable partners | Apple, Google, Goldman Sachs, JP Morgan, AWS, Nvidia, Broadcom |
| Headline capability | Identifies and chains zero-day vulnerabilities across major OSes and browsers |
| Independent assessment | UK AI Security Institute: a "step up" on prior models |
| Reported incident | Unauthorised access by a "handful" of users via a private forum, April 2026 |
| Estimated company valuation | ~$800bn |
What sets Mythos apart is not that it writes code; these days, every model writes code. It is that it appears to comprehend code in a way that makes its testers uneasy. Anthropic says the model has discovered thousands of zero-day vulnerabilities, one of which appears to have sat in a system for 27 years. Twenty-seven years. Consider that. That line was written in the late 1990s by someone who most likely went home and forgot about it. In 2026, a machine looked at it and said, in effect: this is broken.
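For a sense of what a 27-year-old flaw can look like, here is a purely hypothetical sketch in C, not drawn from Anthropic's disclosure: a fixed-size stack buffer filled by strcpy with no length check, an idiom that compiled cleanly in 1999 and still does today.

```c
/* Hypothetical illustration only -- not from any real disclosure.
 * A classic late-1990s idiom: copy untrusted input into a
 * fixed-size stack buffer with no bounds check. It works fine
 * until someone sends more than 64 bytes. */
#include <stdio.h>
#include <string.h>

#define NAME_LEN 64

static void greet(const char *input) {
    char name[NAME_LEN];
    strcpy(name, input);  /* overflow: input may exceed NAME_LEN */
    printf("Hello, %s\n", name);
}

/* The fix a reviewer, human or machine, would suggest: a bounded copy. */
static void greet_safe(const char *input) {
    char name[NAME_LEN];
    snprintf(name, sizeof name, "%s", input);  /* truncates safely */
    printf("Hello, %s\n", name);
}

int main(void) {
    greet("world");       /* harmless here; the danger is in what callers pass */
    greet_safe("world");
    return 0;
}
```

The unsafe version is the point: it looks harmless, it ran for decades, and nothing about it announces itself as broken.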
The model was tested by the UK's AI Security Institute, which came away impressed but cautious. The institute reported that Mythos successfully completed a 32-step simulated cyberattack, a first. The AISI's write-up uses careful, almost reluctant language, the way British institutions sound when they are trying to head off a panic. Reading it, you get the sense that the people closest to the technology are weighing every word.

Not everyone believes the alarm matches the threat. Aisle, a company specializing in AI cybersecurity, ran the same vulnerability claims through cheaper, less well-known models and found that they could identify many of the same flaws. That does not make Mythos less impressive, but it does complicate Anthropic's narrative. In its own analysis, Bain & Company put it plainly: Mythos is a signal, not the threat itself. Other frontier models, such as Google's Big Sleep and OpenAI's cyber-tuned GPT, already do this. The era had arrived. Mythos is simply the part of it loud enough to make headlines.
It tells you something that the banks noticed first. US Treasury Secretary Scott Bessent summoned the heads of Goldman and Citi to Washington. UK regulators put Mythos on the agenda of the Cross Market Operational Resilience Group, whose name alone suggests its members are not easily alarmed. Before Mythos even existed, government modeling had sketched the worst case of a bank hack: failed direct debits, frozen cash machines, buses refusing payments, and a slow drift toward the kind of run that ends careers. Reading that document now, with Mythos in the background, is a different experience than it was a year ago.
It is hard to ignore how quickly the conversation shifted from capability to containment. Through Project Glasswing, Anthropic gave early access to about forty companies, including Apple, Google, JP Morgan and, of all companies, CrowdStrike, and asked them to share their findings. They have said little in public. Whether that is discomfort or discipline is unclear.
The leak itself remains only partly explained. A handful of users. A private forum. Anthropic confirming, investigating, and saying little. The company's own alignment update, published in early April, gestured at the broader concern: AI models that can carry out intricate technical tasks without much human involvement could reshape the threat landscape in ways no one has fully mapped. Watching this unfold, the silence around the leak seems to say more than the announcement did. Big claims arrive with press kits. The truly awkward stuff arrives in bits and pieces.
What remains is a smaller question, the kind that doesn't make headlines. If a model this capable can slip, even slightly, even momentarily, through the gaps of a company that built its brand on caution, what happens when the next model is built by someone who isn't trying as hard?