AI Outperforms 99% Of Human Hackers In Global Cyber Games


An Israeli startup let its AI loose in advanced cyber games. It did better than 125,000 humans.

The Tenzai cofounders have created an AI hacking agent using OpenAI and Anthropic tools. They say AI has become so adept at hacking it might need regulatory controls, urgently.

Elad Malka

Every year, more than 100,000 seasoned cybersecurity pros compete in global hacking competitions designed to show off their abilities at picking apart security systems to pilfer data. The games set hackers challenges that escalate in difficulty, from bypassing logins to more complex cyberattacks requiring exploitation of hidden software weaknesses. Ultimately, they aim to break through all the security layers protecting a digital “flag,” just like real-life capture the flag.

Now, Israeli startup Tenzai says that earlier this month its AI hacker performed better than 99% of the 125,000 human competitors who faced off in a series of six so-called capture the flag (CTF) competitions, which regularly update with new sets of tricky challenges.

Tenzai tailors models from both OpenAI and Anthropic for use in offensive cybersecurity. The firm proved itself in both old-school competitions, where participants had to hack a web application, and newer ones, where the aim was to break into AI apps with prompts that manipulated the underlying large language models. Tenzai cofounder and CEO Pavel Gurvich tells Forbes the AI was surprisingly adept at combining exploits for software vulnerabilities, something that had previously been difficult to automate.


AI-driven offensive security is no longer theoretical, Gurvich says, but works at scale. That’s cause for both concern and optimism. If artificial intelligence programs are able to exploit complex IT systems at speed, it lowers the barrier to entry for almost anyone wanting to launch potentially devastating cyberattacks. At the same time, AI could also be tasked with finding and fixing a significant number of security weaknesses before they’re exploited. It will come down to which AI finds the problem first.

It’s also significantly cheaper. It cost just $5,000 to run Tenzai’s AI models across all the competitions. That’s chump change for government agencies, cybercrime gangs or surveillance companies wanting to use artificial intelligence for snooping. It’s downright affordable for anyone with some expendable income who wants to do damage. “This is rapidly getting out of the realm of nations and military intelligence organizations and into the hands of college kids who may have very different incentives,” Gurvich says. He believes regulation may be needed to stop AI companies from widely selling models that could put highly capable hacking agents into the hands of the average person, restricting sales to select customers instead.

Tenzai was founded in 2025 by a group of seasoned Israeli cyber execs who worked together in their homeland’s intelligence agencies. Within six months it had a $75 million seed round and a $330 million valuation, attracting investors with AI agents that had “elite, nation-grade offensive capabilities.”

Gurvich isn’t the first to have AI compete openly with human hackers. Last year, startup Xbow made it to the top of the HackerOne leaderboard, which ranks participants by the number of real-world vulnerabilities they’ve helped find and that have been fixed. Also last year, Anthropic deployed Claude in some student hacking competitions, which are less challenging than those Tenzai participated in. It still fared well, ranking in the top 3% in a Carnegie Mellon CTF for school and college students. Last month, Anthropic announced Claude was able to find more than 500 high-severity vulnerabilities in open source software.

Gadi Evron, founder and CEO of AI security company Knostic, says that hackers have already had their “singularity moment.” It used to take days or weeks to go from discovering a software vulnerability to exploiting it. With the help of AI, it now takes hours. “Tenzai now showing how their agents win at 99% of six CTFs shows a maturity of the capability in the market, even though the proliferation of such capabilities to pretty much everybody is already there, and growing.”

Not that AI always outclasses humans. While it came in the top 100 in most of the competitions it entered, Tenzai’s AI never made it to the number one spot. “There’s still room at the top for humans,” Gurvich says.


This story was originally published on forbes.com and all figures are in USD.
