On April 7, 2026, Anthropic announced that it was providing ‘Claude Mythos Preview,’ an AI model with very strong cyberattack capabilities, to select critical software developers. In response to these developments, Drew Breunig, who writes technical articles on his website, discussed what security looks like in the age of AI.
Cybersecurity Looks Like Proof of Work Now
https://www.dbreunig.com/2026/04/14/cybersecurity-is-proof-of-work-now.html
Breunig initially viewed Anthropic’s emphasis on the high capabilities of Claude Mythos Preview with some caution: in the security field, it is relatively easy to verify after the fact whether an AI has found real vulnerabilities, and it is also easy to manufacture impressive success stories simply by spending a very large number of tokens.
However, the UK government’s AI Security Institute (AISI) later evaluated that ‘Claude Mythos Preview surpasses previous models in terms of performance in the cyber domain.’

Following this evaluation, Breunig said he could no longer dismiss Anthropic’s claims. He paid particular attention to ‘The Last Ones,’ a test that simulates attacks on corporate networks through a total of 32 challenges, ranging from gathering information on the target to taking over the entire network.
The models tested ranged widely, including Claude Mythos Preview and GPT-5.4, and each test run continued until 100 million tokens had been spent. Only Claude Mythos Preview completed all 32 tasks within the 100-million-token budget, clearing every task in 3 of its 10 test runs.
Anthropic offers Claude Mythos Preview at $25 (approximately 3,980 yen) per 1 million input tokens and $125 (approximately 19,900 yen) per 1 million output tokens. AISI’s evaluation allows up to 100 million tokens per test, so at the output rate, a single test run to that limit costs approximately $12,500 (approximately 1,990,000 yen). Furthermore, while AI models tend to plateau in performance as tasks drag on, AISI confirmed that Claude Mythos Preview did not lose momentum even after approaching 100 million tokens.
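The per-test cost ceiling follows directly from the published rates. A minimal sketch of that arithmetic, assuming as a worst case that the entire 100-million-token budget is billed at the output rate (in practice the bill would be a mix of input and output tokens):

```python
# Back-of-the-envelope cost for one AISI-style evaluation run,
# using the published Claude Mythos Preview per-million-token rates.
INPUT_RATE_PER_M = 25    # USD per 1 million input tokens
OUTPUT_RATE_PER_M = 125  # USD per 1 million output tokens
TOKEN_BUDGET = 100_000_000  # per-test cap in AISI's evaluation

millions = TOKEN_BUDGET // 1_000_000  # 100 "million-token" units

# Bounds: every token billed as output (ceiling) vs. input (floor).
worst_case = millions * OUTPUT_RATE_PER_M
best_case = millions * INPUT_RATE_PER_M

print(f"one test, output-rate ceiling: ${worst_case:,}")  # prints $12,500
print(f"one test, input-rate floor:    ${best_case:,}")   # prints $2,500
```

Ten full-budget runs at the output-rate ceiling would therefore approach $125,000, which is the scale of spending Breunig has in mind when he talks about defenders buying tokens.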
Based on this assessment, Breunig argues that ‘those protecting a system must spend more tokens on AI to find vulnerabilities before attackers can exploit them,’ and he draws two further conclusions from this.
The first is that the importance of open-source software will actually increase. With LiteLLM and Axios, problems arose in which malicious code could slip in through the libraries’ distribution and update channels. As a result, Breunig says, there is a growing idea that configurations relying on external libraries should be reconsidered, with the necessary functions instead written in-house by large language models.
However, Breunig does not simply suggest that dependencies should be abandoned. If companies invest enough tokens in auditing open-source libraries, he says, that could be safer than implementing everything individually in-house. On the other hand, he cautions that widely used open-source software is also valuable to attackers, who may therefore invest more money in attacking it.
The second perspective is that in the era of writing code with AI agents, software development work will be divided into three parts: ‘development,’ ‘review,’ and ‘security enhancement.’
Breunig categorized these three tasks as follows: First, in ‘development,’ humans decide what to build and quickly create features while observing user feedback. Next, in ‘review,’ documentation is refined and refactoring is carried out. And finally, in ‘security enhancement,’ vulnerabilities are searched for and fixed until the budget runs out.
Breunig treats development and security enhancement as separate processes because human judgment and user reactions are the constraint in the development phase, while funding is the constraint in the security-enhancement phase.
Breunig states that ‘writing code itself will continue to be cheap.’ Securing that code, however, is a different story: if AI model performance keeps improving, defenders will need to buy more tokens than attackers and keep spending them to find vulnerabilities first.
In response to Breunig’s claims, commenters on the social news site Hacker News posted cautious opinions such as: ‘It’s too early to generalize that more tokens are needed for defense based solely on AISI’s evaluation, and there are other defensive measures such as formal methods,’ and ‘Defenders can periodically scan the entire source code and change diffs, so they can often act more efficiently than attackers; in fact, this may even increase the security of the software.’
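The diff-scanning point raised on Hacker News can be sketched minimally. The snippet below only extracts the changed hunks with Python’s standard difflib, so a token-metered auditor would read the delta rather than the whole codebase; the auditor itself, and the surrounding workflow, are assumptions left out of scope:

```python
import difflib


def diff_for_audit(old: str, new: str) -> str:
    """Return only the changed hunks in unified-diff form, so a
    token-metered security reviewer reads the change rather than
    re-reading the entire file (sketch of the defender's advantage
    described in the Hacker News comment)."""
    return "".join(difflib.unified_diff(
        old.splitlines(keepends=True),
        new.splitlines(keepends=True),
        fromfile="audited",
        tofile="candidate",
    ))


# Toy example: a risky eval() is replaced by json.loads().
old = "def parse(data):\n    return eval(data)\n"
new = "import json\n\ndef parse(data):\n    return json.loads(data)\n"

patch = diff_for_audit(old, new)
print(patch)
```

Because the patch grows with the change rather than with the repository, a defender re-auditing on every commit spends tokens proportional to the diff, which is the efficiency the commenter is pointing at.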
AISI also noted that the test had constraints, such as no penalties for AI models taking actions that trigger security alerts, so it cannot definitively say whether systems that are well protected by active security tools and security personnel could be attacked successfully.
