Anthropic’s Most Dangerous Model Was Accessed Without Authorization on Day One — and It’s Still Not Going Public

A six-day Apple M5 exploit and a third-party credential breach expose the real stakes of Claude Mythos: the threat is not a public launch, it's that access controls are harder to hold than they look

(Image: Claude Mythos | TechTimes)

On April 25, researchers at Palo Alto-based security firm Calif found a pair of bugs in macOS. By May 1 — six days later — they had a working kernel exploit bypassing Apple's Memory Integrity Enforcement, the hardware-backed protection that the company spent five years and billions of dollars building for its M5 chip. The tool that helped them get there was Claude Mythos Preview, Anthropic's frontier AI model. The team disclosed the finding to Apple in person at Cupertino and published a blog post on May 14 that landed like a depth charge in the security community.

Apple has not yet shipped a fix.

This is the news peg that matters. Not the claim, circulating on social media and across parts of the AI press, that Mythos is about to be unleashed on the general public. Anthropic has stated plainly — in its system card, on its Project Glasswing page, and in repeated public statements — that Claude Mythos Preview will not be made generally available. Prediction markets assign roughly 7% probability to a public release by June 30. The "Preview label has been removed, public launch is imminent" narrative is not supported by any credible reporting and is directly contradicted by the company that built the model.

The accurate version of this story is, if anything, more alarming than the embellished one.

What Mythos Can Actually Do: The ExploitBench Numbers Are Real

Claude Mythos Preview is a real, unreleased frontier model. Anthropic announced it on April 7 alongside Project Glasswing, an initiative that grants access to twelve major launch partners — including Amazon Web Services, Apple, Cisco, CrowdStrike, Google, JPMorganChase, the Linux Foundation, and Microsoft — and over 40 additional vetted organizations, all for defensive cybersecurity work only.

The benchmark driving most of the coverage is also real and peer-reviewed. ExploitBench — arXiv paper 2605.14153, authored by security researcher Seunghyun Lee and computer scientist David Brumley — tests nine frontier AI models against 41 real, previously-exploited vulnerabilities in V8, the JavaScript engine inside Chrome, Edge, Node.js, and Cloudflare Workers. The scores confirm the numbers that have circulated widely: Mythos Preview, with occasional researcher guidance, averaged 9.90 out of a possible 16 and reached the top tier — full arbitrary code execution — on 21 of the 41 tested vulnerabilities. OpenAI's GPT-5.5, the second-best performer, reached that same tier on just two. In fully autonomous mode, the gap widened: Mythos scored 9.55, GPT-5.5 scored 4.30. The paper, published this week, also confirmed that running a Mythos session costs roughly twelve times what an equivalent GPT-5.5 run costs, a material asymmetry that rarely appears in louder retellings.

What makes these numbers significant is ExploitBench's design. Most prior cyber benchmarks collapse the exploitation process into a binary: did the model crash the target? ExploitBench instead maps 16 measurable flags across five tiers — from merely reaching vulnerable code up to taking full arbitrary control of a running system. Each flag is checked by a deterministic oracle using randomized challenge-response checks, removing the circularity that plagues LLM-as-judge evaluations.
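The paper's full harness is not reproduced here, but the tiered, oracle-checked design it describes can be sketched in a few lines. Everything below is illustrative: the flag names, the tier layout, and the proof function are assumptions for the sake of the example, not ExploitBench's actual definitions. The key idea is that each flag is verified by issuing a fresh random challenge and checking the exploit's response deterministically, rather than asking another LLM to judge success.

```python
import secrets

# Illustrative tier layout: 16 flags spread over five escalating tiers,
# from reaching vulnerable code (tier 1) to full arbitrary control (tier 5).
# These flag names are hypothetical, not ExploitBench's real flag set.
TIERS = {
    1: ["reach_vulnerable_code"],
    2: ["trigger_memory_corruption", "controlled_crash"],
    3: ["leak_heap_pointer", "leak_code_pointer", "bypass_aslr"],
    4: ["arbitrary_read", "arbitrary_write", "control_flow_hijack",
        "bypass_cfi", "stable_primitive"],
    5: ["spawn_process", "read_secret_file", "persist_payload",
        "full_arbitrary_execution", "sandbox_escape"],
}

def expected_proof(flag: str, nonce: str) -> str:
    # In a real harness this value would be computed inside the
    # instrumented target; here it is a deterministic stand-in.
    return f"{flag}:{nonce}"

def check_flag(flag: str, exploit_respond) -> bool:
    """Deterministic oracle: plant a fresh random challenge and verify the
    exploit echoes proof derived from it, so neither a lucky guess nor a
    hallucinating LLM judge can mark a flag as passed."""
    nonce = secrets.token_hex(16)
    # exploit_respond(flag, nonce) stands in for running the exploit
    # against the target with the nonce planted in its memory.
    return exploit_respond(flag, nonce) == expected_proof(flag, nonce)

def score(exploit_respond) -> tuple[int, int]:
    """Return (total flags passed out of 16, highest tier fully cleared)."""
    passed, top_tier = 0, 0
    for tier in sorted(TIERS):
        results = [check_flag(f, exploit_respond) for f in TIERS[tier]]
        passed += sum(results)
        if all(results):
            top_tier = tier
    return passed, top_tier
```

Under this scheme, a run that only crashes the target clears tier 2 and scores low, while only a run that passes every tier-5 oracle counts as full arbitrary code execution — which is why the binary "did it crash?" framing of earlier benchmarks and ExploitBench's graded scores diverge so sharply.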

The key finding is not that one model beats another. It is that publicly deployed frontier models routinely crash targets but almost none can reach reliable code execution against a hardened system. Mythos can. The near-identical scores in human-assisted versus fully autonomous mode are the genuinely alarming data point: on this class of task, the model barely needs a human to reach the same result.

ExploitBench co-author Seunghyun Lee, who has personally reported over 20 browser vulnerabilities, reviewed Mythos transcripts individually. His assessment: the model works like "a fairly competent browser / JS engine security researcher." In one case, Mythos developed an exploit technique that Lee and a colleague had previously dismissed as too complex to be viable. Two caveats the headline numbers do not capture: the tested vulnerabilities are publicly known, meaning training-data contamination cannot be ruled out, and ExploitBench explicitly does not measure the ability to discover new vulnerabilities or to fully weaponize an exploit for a live attack.

Unauthorized Users Accessed Mythos Through a Contractor Credential on Launch Day

Weeks before the Calif macOS disclosure, Anthropic's access controls were breached on the program's first day. On the same day the company announced Project Glasswing — April 7 — a private Discord group gained unauthorized access to Mythos Preview through a third-party vendor environment.

According to Bloomberg, the group combined credentials obtained through a third-party contractor who evaluates Anthropic models with information from a data breach at Mercor, an AI-recruitment startup. They used their knowledge of Anthropic's model-identifier URL format to guess the Mythos endpoint. Anthropic confirmed it was investigating: "We're investigating a report claiming unauthorized access to Claude Mythos Preview through one of our third-party vendor environments." The company said it had found no evidence of activity beyond the vendor environment.

The group — not publicly identified — told Bloomberg it was interested in exploring the model rather than causing harm. The reported use was not for cybersecurity attacks. But the incident exposes the central tension in Glasswing's premise: the defensive window the program is designed to create depends on access staying limited to vetted partners. On launch day, that limit was circumvented at the supply chain's weakest point — a contractor credential and a URL guess — not through any action by the model itself. Acalvio Technologies CEO Ram Varadarajan told SiliconAngle that the incident "didn't require a sophisticated attack — it just required a contractor, a URL pattern and a day-one guess."

Where the "About to Be Released" Narrative Falls Apart

The claim generating the most noise — that Anthropic is quietly preparing to release Mythos to the general public — rests on a specific reading of changes in the Google Cloud Vertex AI console that supposedly mirror the rollout pattern that preceded Claude Opus 4.7.

This reading is directly contradicted by Anthropic's own language. The Project Glasswing page states: "We do not plan to make Claude Mythos Preview generally available, but our eventual goal is to enable our users to safely deploy Mythos-class models at scale." The system card for the model states explicitly that its "large increase in capabilities has led us to decide not to make it generally available." Standard API accounts cannot see or access the model identifier. Approved Glasswing partners reach it through the Claude API, Amazon Bedrock, Google Cloud's Vertex AI, and Microsoft Foundry — but only after vetting.

The 7% Polymarket probability of a public release by June 30 reflects traders who have put real money on this question. The Google Cloud console observation conflates infrastructure labeling with a commercial release decision. No credible reporting from Bloomberg, the Wall Street Journal, TechCrunch, or CNBC supports the imminent-release reading.

Heidy Khlaaf, False Positives, and the Human-in-the-Loop Problem

Not every researcher accepts Anthropic's claims at face value. Heidy Khlaaf, chief AI scientist at the AI Now Institute — who has led safety audits across industries ranging from UAVs to nuclear power plants and helped build vulnerability evaluation methodology at OpenAI — has warned against treating the company's disclosures as verified fact. As Gary Marcus documented, Khlaaf flagged missing comparison benchmarks, opaque testing conditions, and ambiguous human involvement in Anthropic's initial cybersecurity disclosures.

Her practical objection cuts to the heart of the defensive case for Glasswing: Palo Alto Networks, a Glasswing launch partner, found a false positive rate of roughly 30% when deploying both Mythos and GPT-5.5 across its security products — meaning roughly one in three flagged vulnerabilities was not actually exploitable. That rate dropped as the company tuned the model to its specific environment, but it underscores Khlaaf's core concern. A model that surfaces hundreds of findings requiring days of expert triage can overwhelm the defenders it is meant to help, erasing the throughput advantage before it arrives.

XBOW, an AI-powered penetration testing startup involved in Glasswing testing, found Mythos to be "extremely powerful for source code audits" but "less powerful at validating exploits" and "too literal and conservative" in some cases, overstating the practical severity of findings. Yann LeCun, Meta's chief AI scientist, dismissed the Mythos claims as "BS from self-delusion." The UK AI Security Institute found Mythos to be the most capable model it tested for cybersecurity tasks but noted its evaluations were conducted in controlled environments without active defenders or defensive tooling — a significant limitation on how directly those scores translate to real-world defense.

On the other side, companies running the model are already reporting concrete returns. Palo Alto Networks found 75 vulnerabilities using Mythos and GPT-5.5 in testing periods where it would normally identify five to ten per month. CrowdStrike's 2026 Global Threat Report documented an 89% year-over-year increase in AI-driven attacks — a figure its CTO Elia Zaitsev used to argue that the industry has no choice but to embrace AI-native defense: "If you want to deploy AI, you need security."

The Government Battlefield: A Federal Ban, a Lawsuit, and Congress Moving In

Project Glasswing launched into the most legally complicated period in Anthropic's history.

On February 27, 2026, President Trump directed all federal agencies to cease using Anthropic's technology. Defense Secretary Pete Hegseth simultaneously designated the company a "supply chain risk" — the first time that designation had been applied to an American company. The dispute: Anthropic had refused to remove contractual restrictions barring the Pentagon from using Claude for fully autonomous weapons and mass domestic surveillance of Americans. Anthropic filed two federal lawsuits on March 9, 2026, arguing the designation was unlawful and violated its rights.

On March 26, Federal Judge Rita Lin of the Northern District of California granted a preliminary injunction temporarily blocking the ban, writing that the supply chain risk designation was "likely both contrary to law and arbitrary and capricious" and that nothing in the statute supported "the Orwellian notion that an American company may be branded a potential adversary and saboteur of the U.S. for exposing a disagreement with the government."

The White House subsequently met with CEO Dario Amodei and began developing guidance that could allow agencies to bypass the designation and access Mythos specifically — a signal that the model's capabilities have altered the political calculus even as the legal fight continues.

Congress is moving on a parallel track. The House Homeland Security Committee held a classified briefing on Mythos cyber risks and is preparing a public hearing. The briefing, reported by CyberScoop, came the day after OpenAI announced its own cybersecurity initiative, with the committee focused on Chinese state threats exploiting AI capabilities. DOW Assistant Secretary Katherine Sutton told a conference this week that Mythos presents "huge opportunity" for building secure code — a government framing that stands in direct tension with the administration's legal campaign against the company that built it.

The Real Risk: Diffusion, Not a Launch Date

The structural problem that no single disclosure event solves is the one Anthropic itself names in its Glasswing announcement: "The work of defending the world's cyber infrastructure might take years; frontier AI capabilities are likely to advance substantially over just the next few months."

Over 45% of discovered vulnerabilities in large organizations remain unpatched after 12 months, according to a 2025 security industry report. Many critical infrastructure operators still run end-of-life software that no vendor supports. This is the population most exposed if offensive AI capabilities diffuse beyond Glasswing's fifty-plus vetted organizations — whether through a competing lab releasing without Anthropic's restrictions, through open-weight models that are already being uncensored within days of release, or through the kind of contractor-credential breach that put Mythos in unauthorized hands on its first day.

Anthropic's own argument, explicit in the Glasswing announcement, is that comparable capability will not stay scarce. The defensive window Project Glasswing is creating depends on that window being used. The reader who needs to act on this story is not one waiting for a public launch date. It is the security officer whose patch backlog already contains vulnerabilities that a Mythos-class model can turn into a working exploit — and who may be operating with less lead time than this week's headlines suggest.

ⓒ 2026 TECHTIMES.com All rights reserved. Do not reproduce without permission.
