Anthropic’s Claude Mythos Preview shifts the AI security race from skills to budget


As Claude Mythos emerges with capabilities ranging from vulnerability detection to generating attack code, some say response strategies need to be redefined. [Photo: Shutterstock]

A claim has emerged that the artificial intelligence security race is shifting from vulnerability detection capabilities to how much money is poured into AI.

On April 20, online outlet Gigazine reported that evaluation results of Anthropic’s security-focused AI “Claude Mythos Preview,” unveiled earlier this month, are spreading the view that systems can be protected only by spending a bigger token budget than attackers.

The debate stems from Anthropic’s April 7 announcement that it is providing Claude Mythos Preview to some key software developers. Drew Breunig, who works in the IT industry, initially took a cautious stance on the claim. He viewed security as a field where it is relatively easy to verify after the fact whether AI actually found vulnerabilities, and where impressive success stories are easy to manufacture by pouring in large volumes of tokens. But the UK government’s AI security research body, the AI Security Institute (AISI), later assessed that “the model has gone a step beyond existing models in cyber performance as well.”

What particularly drew Breunig’s attention was “The Last Ones” test, which recreated the process of attacking a corporate network. The test consisted of 32 tasks ranging from information gathering to taking control of an entire network, and compared multiple models including Claude Mythos Preview and GPT-5.4.

The conditions were not light. Each test was designed to process up to 100 million tokens. Within that budget, Claude Mythos Preview was the only model to finish all 32 tasks, completing them in 3 of 10 test runs. AISI said there were cases in which performance gains did not slow even as Claude Mythos Preview approached the 100 million token mark.

The cost burden also surfaced. Anthropic offers the model at $25 (about 37,000 won) per 1 million input tokens and $125 (about 184,000 won) per 1 million output tokens. Under these conditions, running 10 tests that each use up to 100 million tokens could cost on the order of $12,500 (about 18.4 million won). Based on this, Breunig argued that “defenders need to spend more tokens to find vulnerabilities first, before hackers exploit them.”
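The cost arithmetic above can be sketched directly from the quoted per-token prices. Note that the article does not break the 100 million tokens per test down into input versus output, so the split below is an illustrative assumption, not a figure from Anthropic or AISI:

```python
# Cost sketch using the prices quoted in the article.
# The input/output token split per test is an assumption for illustration;
# the reporting does not specify how the 100M tokens divide between the two.

INPUT_PRICE_PER_M = 25.0    # USD per 1 million input tokens (quoted)
OUTPUT_PRICE_PER_M = 125.0  # USD per 1 million output tokens (quoted)

def run_cost_usd(input_tokens_m: float, output_tokens_m: float) -> float:
    """Cost in USD for a run measured in millions of tokens."""
    return (input_tokens_m * INPUT_PRICE_PER_M
            + output_tokens_m * OUTPUT_PRICE_PER_M)

# Bounds for a single 100M-token test:
all_input = run_cost_usd(100, 0)    # every token billed at the input rate
all_output = run_cost_usd(0, 100)   # every token billed at the output rate
print(f"one test, all input:  ${all_input:,.0f}")
print(f"one test, all output: ${all_output:,.0f}")
```

A single 100M-token test therefore lands somewhere between $2,500 (all input) and $12,500 (all output), which shows why multi-run evaluations at this scale quickly reach five-figure bills.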

The claim also points to changes in how software is developed. Breunig said it is likely that development work will split into three stages of “development,” “review” and “hardening” in an environment where AI agents write code. He said the phase of building functions quickly is constrained by human judgement and user responses, but the phase of finding and eliminating vulnerabilities becomes constrained primarily by budget. He added that while writing code itself will keep getting cheaper, making that code secure will have a separate cost structure.

Some also argue companies should reduce external dependencies and directly implement needed functions with large language models. Breunig said that did not mean dependencies should be abandoned immediately. He judged that if companies spend enough tokens to audit open-source libraries, that could be safer than implementing each one in-house. He added that widely used open source is also valuable to hackers, leaving open the possibility that hackers’ investment could rise as well.

Market and developer community reactions were mixed. On Hacker News, some cautioned it is too early to generalise from AISI’s assessment that defence requires more tokens. They said other defensive tools such as formal methods also need to be considered.

Others argued that defenders can operate more efficiently than hackers because they can periodically bundle and check full source code and changes, which could instead raise software security.

Limits of the evaluation are also clear. AISI said the test included a condition that there was “no penalty even if an AI model engages in actions that trigger security alerts.” It said it therefore cannot conclude whether the model can attack sufficiently defended systems where proactive security tools and security staff are actually operating. Despite these constraints, the evaluation has framed the central issue: AI-driven security competition is shifting from technical performance alone to how many tokens, and how much cost, each side can bear.


