On April 7, Anthropic announced that its Claude Mythos Preview model autonomously discovered thousands of zero-day vulnerabilities across major operating systems and browsers. This milestone closed the margin of safety that existed when AI could only exploit known vulnerabilities. While GPT-4 in 2024 required a CVE description to exploit 87% of a dataset, Mythos now discovers flaws without any prior knowledge.
The safety margin has vanished
Mythos scored 83.1% on the CyberGym vulnerability reproduction benchmark. In a single campaign targeting OpenBSD across 1,000 scaffold runs, the total compute cost was under 20,000 dollars. Exploitation timelines are collapsing. CVE-2026-33017 (Langflow, CVSS 9.8) was exploited 20 hours after disclosure with no public proof-of-concept. CVE-2026-39987 (Marimo, CVSS 9.3) was hit in under 10 hours. Google's M-Trends 2026 report confirms that exploitation now occurs before a patch is even released. The median time from CVE publication to CISA KEV listing is five days, which is no longer safe.
Why traditional prioritization fails
Most vulnerability management programs rely solely on CVSS scores, ignoring whether a vulnerability is actively exploited. A three-layer prioritization filter offers a concrete replacement: check CISA KEV status, use EPSS (Exploit Prediction Scoring System) scores from FIRST.org, and then apply CVSS. Validated against 28,377 real-world vulnerabilities, this filter delivers an 18x efficiency gain, 85.6% coverage of exploited vulnerabilities, and a 95% reduction in urgent remediation workload.
Immediate actions for security teams
Because exploit windows now shrink to hours, organizations must adopt event-driven patching for Tier 0 services. The goal is to deploy a patch to canary within four hours of a CVE being declared critical. If a patch cannot be applied, compensating controls such as removing internet exposure, rotating credentials, and disabling affected functionality must be enforced immediately. Additionally, security teams must test authorization boundaries for AI agents, especially for oversized requests (over 1 MB) and burst rates. Mapping the credential blast radius for every AI builder host is essential. For a deeper look at offensive security frameworks, see our article on Ethical Hacking in Italy: Operational Methodology and Legal Framework for Penetration Testing.
The bottom line: adversaries now operate in under 20 hours. Calendar-based patch cycles are obsolete. For more details, read the original analysis on VentureBeat: Claude Mythos exposed a hard truth.
Sponsored Protocol