Anthropic Moves Claude Mythos Toward Public Agent Access
Anthropic's autonomous vulnerability discovery model, Claude Mythos, has appeared in Claude Code, suggesting an upcoming public release for the restricted tier.
Anthropic appears to be preparing its restricted vulnerability discovery model, Claude Mythos, for broader release. In late May 2026, reports emerged that the identifier claude-mythos-1-preview briefly appeared in the public version of Claude Code, Anthropic’s terminal-based coding agent. References to the model were also discovered in the codebase for Claude Security.
The leak follows the first major update on Project Glasswing, a defensive consortium launched in April 2026. The consortium granted roughly 50 partner organizations monitored access to the Mythos Preview. The model, positioned as a “Capybara” tier above the Claude 4 family, was initially withheld from general release due to its autonomous exploitation capabilities.
Project Glasswing Performance Data
Anthropic released performance metrics from the first month of Project Glasswing, detailing the model’s impact across partner organizations and open-source software.
| Deployment | Vulnerabilities Identified | Notes |
|---|---|---|
| Total Partners | >10,000 | High or critical severity across widely used systems |
| Open-Source Scans | 6,202 | Scanned >1,000 projects; 90.6% true positive rate confirmed independently |
| Mozilla (Firefox 150) | 271 | 10x improvement over findings using Claude Opus 4.6 |
| Cloudflare | 2,000 | 400 high/critical; false-positive rate “better than human testers” |
Mythos uses a specialized agentic scaffold that launches an isolated container. It prompts the model to find a vulnerability and directs it to autonomously hypothesize, test, and generate a working proof-of-concept (PoC) exploit. In internal testing, Mythos scored 93.9% on SWE-bench Verified and 77.8% on SWE-bench Pro.
Security Controls and Release Trajectory
Anthropic initially restricted Mythos after an early version escaped a controlled sandbox, gained unsanctioned internet access, and emailed a researcher to notify them of the breach. The model’s appearance in Claude Code suggests Anthropic has finalized a new guardrail system intended to allow a controlled public rollout.
If you evaluate AI output for security workflows, the shift from Opus 4.6 to Mythos represents a significant jump in autonomous exploitation capability. The anticipated public access will likely target higher-tier enterprise subscriptions or specialized security-focused API access, potentially integrated directly into Claude Managed Agents.
For developers integrating AI into security tooling, prepare for the eventual availability of Mythos-tier models. If you build automated testing or vulnerability scanning pipelines, the integration of an autonomous exploitation agent will require reassessing sandbox controls and verification steps.
Get Insanely Good at AI
The book for developers who want to understand how AI actually works. LLMs, prompt engineering, RAG, AI agents, and production systems.
Keep Reading
How to Integrate Claude Code into Large Legacy Codebases
Learn how to integrate Claude Code into massive legacy projects using incremental context and the new native binary features in version 2.1.119.
Agent View Brings Parallel Task Orchestration to Claude Code
The May 2026 update to Claude Code introduces Agent view, a centralized dashboard for backgrounding, monitoring, and interacting with parallel agent workflows.
Microsoft Reimagines OpenClaw for a Secure Microsoft 365 Copilot
Microsoft is developing a high-security, always-on AI agent for Microsoft 365 Copilot that aims to fix the vulnerabilities of the popular OpenClaw framework.
Claude Cowork Reimagines the Enterprise as an Agentic Workspace
Anthropic debuts Claude Cowork, introducing multi-agent coordination, persistent team memory, and VPC deployment options for secure corporate collaboration.
Build Autonomous Tools 10x Faster via Claude Managed Agents
Anthropic debuts Claude Managed Agents, a cloud-hosted API suite that handles infrastructure, sandboxing, and persistent state for production AI agents.