Intruvent EDGE: The First AI-Generated Zero-Day Just Dropped. Here’s What We Know.

A zero-day discovered by AI, written by AI, and deployed in the wild. Welcome to the new timeline.

May 14, 2026

Welcome to Intruvent Edge, our bi-weekly technical deep dive into a current cyber threat. If you found us through Prevent This, our weekly community newsletter covering cybersecurity for everyone, you’re in the right place. Both live on the same Substack. Feel free to share either one. We’re glad you’re here.

The Short Version

For the first time on record, a criminal hacking group used artificial intelligence to find a security flaw that nobody knew existed and to write the code that exploits it, then deployed that code against real targets. According to Google, the AI model handled both the discovery and the exploit writing. Google’s security team caught the exploit in use, worked with the affected software vendor to fix the flaw, and disrupted the operation.

This matters because it changes the math on how quickly attackers can move. Finding and exploiting unknown security flaws used to require deep expertise and significant time. AI compresses both. If you run a business, manage IT infrastructure, or make decisions about technology risk, this article explains what happened, what it means, and what you should be paying attention to going forward.

Some Quick Definitions

Before we go further, a few terms that will come up throughout:

Zero-day: A security flaw in software that the software maker does not yet know about. The name comes from the idea that developers have had “zero days” to fix it. These are the most dangerous type of vulnerability because there is no patch available when attackers start using them.
Exploit: A piece of code designed to take advantage of a specific flaw. Think of the flaw as an unlocked window. The exploit is the burglar who knows exactly which window and how to climb through it.
Two-factor authentication (2FA): The second verification step when you log in, typically a code sent to your phone or generated by an app. The flaw in this case allowed attackers to skip that second step entirely.
CVE: A standardized ID number assigned to known security flaws, like a case number. When the security community refers to “CVE-2026-31431,” everyone knows exactly which flaw is being discussed.
APT (Advanced Persistent Threat): A government-sponsored hacking team. These are professional, well-funded groups that conduct cyber operations on behalf of nation-states. They are named and tracked by the security industry the way intelligence agencies track foreign operatives.
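
For readers who handle advisories or logs, here is a minimal sketch of how CVE IDs can be picked out of free text. It relies only on the public ID format (the letters CVE, a four-digit year, then a sequence number of four or more digits). The function name and the sample sentence are made up for illustration.

```python
import re

# CVE IDs follow the pattern CVE-<4-digit year>-<sequence of 4 or more digits>.
CVE_PATTERN = re.compile(r"\bCVE-(\d{4})-(\d{4,})\b")


def find_cve_ids(text: str) -> list[str]:
    """Return every CVE-style identifier found in text, in order of appearance."""
    return [m.group(0) for m in CVE_PATTERN.finditer(text)]


sample = "Patch CVE-2026-31431 first; 'CVE-26-1' is not a well-formed ID."
print(find_cve_ids(sample))  # ['CVE-2026-31431']
```

A pattern match only confirms the format. Whether an ID refers to a real entry still has to be checked against the vulnerability database.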

What Happened

On May 11, 2026, Google’s Threat Intelligence Group (their security research division, often shortened to GTIG) published a report tracking how threat actors are using AI. Among dozens of findings, one stood out: a criminal hacking group used an AI model to discover a previously unknown security flaw and write a working exploit for it.

The exploit is a Python script that bypasses two-factor authentication on a popular, widely used system administration tool. The underlying flaw is a hard-coded trust assumption in the software’s login process. In plain terms: the software was programmed to trust certain login attempts automatically, skipping the second verification step. The AI found that blind spot and wrote the code to walk through it.
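
To make “hard-coded trust assumption” concrete, here is a deliberately simplified sketch of the general pattern. It is not the affected product’s code, which has not been disclosed. Every name in it (LoginRequest, TRUSTED_CLIENTS, the client label) is hypothetical.

```python
from dataclasses import dataclass


@dataclass
class LoginRequest:
    username: str
    password_ok: bool    # outcome of the first factor (password) check
    otp_ok: bool         # outcome of the second factor (one-time code) check
    client_header: str   # hypothetical label the client reports about itself


# Hypothetical hard-coded exception: anything claiming to be this internal
# tool is trusted without a second factor.
TRUSTED_CLIENTS = {"internal-backup-agent"}


def login_flawed(req: LoginRequest) -> bool:
    if not req.password_ok:
        return False
    if req.client_header in TRUSTED_CLIENTS:  # value is attacker-controlled
        return True                           # second factor silently skipped
    return req.otp_ok


def login_hardened(req: LoginRequest) -> bool:
    # Both factors are required on every path; the caller cannot opt out.
    return req.password_ok and req.otp_ok


attacker = LoginRequest("admin", password_ok=True, otp_ok=False,
                        client_header="internal-backup-agent")
print(login_flawed(attacker), login_hardened(attacker))  # True False
```

The point of the sketch is that the flawed version trusts a value the caller supplies. The real flaw may differ in its details, but any exemption keyed on something an attacker can set has the same shape.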

Google identified the exploit in active use, coordinated with the software vendor to get the flaw patched, and disrupted the operation before it could spread further.

How Did Google Know AI Was Involved?

The exploit code contained telltale signs of AI authorship, the digital equivalent of a forged painting that uses pigments that did not exist when the original was supposedly created. Specifically:

Excessive documentation: The code included detailed explanatory notes throughout, the kind a teacher would write for a student. Real attackers do not document their exploits. They want their code to be hard to understand, not easy.
A fabricated severity score: The code included a CVSS score (a standardized severity rating, like a hurricane category for software flaws) that the AI made up. The score did not correspond to any real entry in the vulnerability database.
Textbook code structure: The code was organized with a precision and readability that prioritized clarity over stealth. Human exploit developers optimize for evasion. AI optimizes for correctness.
Built-in help menus: The exploit included usage instructions. No human attacker builds a help menu into their attack tool.

Google stated there was no evidence that its own Gemini AI was used, and assessed with high confidence that an AI model was involved. The specific model and the specific group remain undisclosed.

Three Findings, Three Actors: Getting the Story Right

Several publications reported this story as “North Korea used AI to build a zero-day.” That is not what Google said. The GTIG report contains three distinct findings involving three different groups doing three different things. Conflating them creates a misleading picture. Here is what actually happened:

Finding 1: The AI-generated zero-day. An unnamed group of cybercrime actors used an AI model to discover a zero-day and write a working exploit. Google caught it in active use and shut it down. The group, the model, and the target product remain undisclosed. This is the headline finding.

Finding 2: North Korea’s APT45 using AI to research vulnerabilities at scale. Separately, Google found that a North Korean government hacking team called APT45 (also known as Andariel) sent “thousands of repetitive prompts” to AI models, asking them to analyze known security flaws and validate whether existing proof-of-concept exploits actually work. Think of it as using AI to do the research grunt work: reading thousands of vulnerability reports and testing whether the published attack code is functional. This is a serious capability, but it is vulnerability research, not zero-day creation. APT45 did not produce the AI-generated zero-day in Finding 1.

Finding 3: China’s APT27 using AI to build operational tools. A Chinese government hacking team called APT27 (also known as Threat Group-3390) used Google’s Gemini to write a fleet management application for their proxy network. A proxy network is a series of relay points that attackers route their traffic through to hide their real location, like forwarding mail through multiple PO boxes so the return address cannot be traced. APT27 used AI to build the software that manages those relay points. This is software engineering, not exploit development.

The distinction matters. One unnamed criminal group created an AI-generated zero-day. A North Korean government team is using AI to accelerate vulnerability research. A Chinese government team is using AI to build infrastructure tools. All three are significant. None of them should be described as the same thing.

Why This Matters

The security industry has debated whether AI would be used for exploit development since ChatGPT launched in late 2022. That debate is settled. The question now is how fast this scales.

Three dynamics make this consequential for anyone who manages technology risk:

1. The Clock Is Faster Now

When a security flaw is discovered and publicly disclosed, a race begins. Defenders race to install the patch. Attackers race to build an exploit before the patch is applied. Historically, building a working exploit took days to weeks of skilled manual work. AI compresses that timeline.

In April, a security research team demonstrated this directly: they used AI-assisted analysis to turn a newly disclosed Linux kernel vulnerability (the “Copy Fail” flaw we covered in our April 30 newsletter) into a complete attack chain in approximately one hour. Google’s finding confirms that criminal groups are achieving similar speeds.

For organizations that patch on a monthly cycle, this is a problem. When the time from disclosure to exploit drops from weeks to hours, a monthly patch schedule means spending most of the month exposed.
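
The arithmetic behind that claim can be sketched in a few lines. This is a worst-case model under a stated assumption: the flaw is disclosed just after a patch run, so the fix is not applied until the next run a full cycle later. The function name and the two-week comparison figure are illustrative choices, not numbers from Google’s report.

```python
def exposed_fraction(patch_cycle_days: float, exploit_ready_hours: float) -> float:
    """Worst-case share of a patch cycle spent exposed to a working exploit.

    Assumes disclosure happens right after a patch run, so the fix lands
    one full cycle later.
    """
    exploit_ready_days = exploit_ready_hours / 24
    exposed_days = max(patch_cycle_days - exploit_ready_days, 0)
    return exposed_days / patch_cycle_days


# 30-day cycle: exploit ready in about 1 hour versus about 2 weeks.
print(f"{exposed_fraction(30, 1):.1%}")        # 99.9%
print(f"{exposed_fraction(30, 14 * 24):.1%}")  # 53.3%
```

Under those assumptions, a one-hour exploit leaves a monthly patcher exposed for essentially the whole cycle. A two-week exploit leaves roughly half of it.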

2. The Expertise Barrier Is Lower

Building exploits used to require rare, specialized skills. Google’s report describes APT45’s approach as “thousands of repetitive prompts” rather than sophisticated engineering. That is not an elite technique. It is a volume play, like running a thousand internet searches instead of crafting one perfect query.

The AI artifacts in the zero-day exploit (the help menus, the documentation, the fabricated severity score) suggest the developer leaned heavily on the AI’s raw output rather than refining it. That implies someone with moderate technical skills, not a world-class exploit developer, produced a working zero-day with AI assistance.

The analogy in other fields: it is the difference between needing a board-certified specialist for a procedure versus a general practitioner with the right diagnostic tool. The tool does not replace expertise entirely, but it lowers the bar for who can produce a competent result.

3. The Trajectory Is Clear and Accelerating

Google has now published three AI Threat Tracker reports. Each one documents capabilities that were theoretical in the previous edition. Placed alongside Anthropic’s April disclosure, the milestones form a steep timeline:

Early 2024: Attackers used AI mostly for writing phishing emails and generating basic code. Think of this as using AI as a research assistant.
November 2025: Google discovered experimental malware called PROMPTFLUX that queried an AI model to rewrite its own code hourly, changing its appearance to avoid detection. The malware was still in testing, but it proved the concept of using AI as a live mutation engine.
April 7, 2026: Anthropic (the company behind the Claude AI) disclosed that its Mythos model discovered over 2,000 previously unknown security flaws in seven weeks during internal testing, including bugs that had gone undetected for 17 and 27 years. Anthropic restricted Mythos’s release because of its offensive potential.
May 11, 2026: Google confirmed the first AI-assisted zero-day exploit used in a real-world criminal operation.

Each milestone arrived faster than the previous one. The gap between Anthropic’s controlled testing disclosure and a real-world AI-generated exploit in the wild was approximately one month.
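
The gaps can be checked with simple date arithmetic. The sketch below uses the dates given in the timeline and the Sources list. The November 2025 entry uses the date of the cited coverage (November 5), which is an assumption about when that finding became public. “Early 2024” is left out because no specific date is given.

```python
from datetime import date

# Dates taken from the timeline and the Sources list. November 5, 2025 is
# the date of the cited coverage, used here as an assumed publication date.
milestones = [
    ("PROMPTFLUX coverage", date(2025, 11, 5)),
    ("Anthropic Mythos disclosure", date(2026, 4, 7)),
    ("Google AI-assisted zero-day report", date(2026, 5, 11)),
]

# Days elapsed between each milestone and the one before it.
gaps = [
    (later[0], (later[1] - earlier[1]).days)
    for earlier, later in zip(milestones, milestones[1:])
]
for name, days in gaps:
    print(f"{days:>4} days before: {name}")
# 153 days before: Anthropic Mythos disclosure
#  34 days before: Google AI-assisted zero-day report
```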

What Else Was in the Report

The zero-day was the headline, but Google’s full report documents AI being integrated across every phase of cyberattack operations:

Autonomous phone malware: Android malware called PROMPTSPY uses an AI model to navigate a phone’s screen and replay biometric data without human guidance. This is malware that can operate your phone by itself.
AI-generated decoy documents: Russian-linked actors are using AI to produce convincing fake documents for phishing campaigns targeting Ukraine. AI makes the lure material faster and more believable at scale.
AI voice cloning for impersonation: A pro-Russia influence operation used AI-generated voice clones to impersonate journalists.
Software supply chain attacks: A group called TeamPCP compromised popular security and developer tools (Trivy, Checkmarx, LiteLLM), affecting over 1,000 business software environments.
Autonomous reconnaissance: Chinese actors deployed AI-powered tools called Hexstrike and Strix that can scan and map target networks without human direction.

The pattern is consistent. AI is not being used for one thing. It is being used for everything: research, reconnaissance, exploit development, malware creation, phishing, impersonation, and infrastructure management.

The Actors Behind This

APT45 / Andariel (North Korea)

APT45 is a hacking team that operates under North Korea’s military intelligence agency, the Reconnaissance General Bureau. The security industry also tracks them as Andariel, Silent Chollima, and Onyx Sleet. They are a sub-unit of the Lazarus Group, the umbrella organization behind North Korea’s most prominent cyber operations (Intruvent Codex; MITRE ATT&CK G0138).

APT45 has been active since at least 2015. Their primary mission is generating revenue for North Korea’s weapons programs through ransomware, cryptocurrency theft, and extortion. They target defense contractors, financial institutions, and government agencies. They have also targeted hospitals with ransomware (the Maui campaign) and conducted espionage against nuclear research programs.

Google’s report reveals that APT45 is now using AI at an industrial scale to analyze security flaws and test whether published exploits actually work. Their approach (sending thousands of repetitive prompts, using automated testing tools in practice environments) suggests they have built AI into their standard research workflow. This is not an experiment. It is how they operate now.

APT27 / Threat Group-3390 (China)

APT27 is a Chinese government espionage group that has been active since at least 2010. The security industry tracks them under a long list of names: Threat Group-3390, Emissary Panda, BRONZE UNION, Iron Tiger, and LuckyMouse (Intruvent Codex; MITRE ATT&CK G0027). The US Department of Justice charged two alleged members of the group in March 2025.

APT27 is one of the more technically sophisticated state-sponsored groups. The Intruvent Codex maps them to 57 distinct attack techniques and 24 malware families. They target defense, government, energy, manufacturing, and technology organizations.

Google’s finding that APT27 used Gemini to build a proxy network management tool is significant because it shows AI being used for operational plumbing, not just flashy capabilities. They needed a piece of software to manage their network of relay servers (which disguise the origin of their attacks by routing traffic through multiple hops, including consumer-grade 4G/5G connections). Instead of writing it from scratch, they had an AI build it. It is the cyber equivalent of hiring a contractor through an app instead of building the addition yourself.

What Should Organizations Do?

Patch Faster

If your organization patches software on a monthly cycle, this report is a signal to reassess. When AI can analyze a published security flaw and produce an exploit in hours, a 30-day patch window means spending most of the month with a known, exploitable weakness. For systems that face the internet (web servers, VPNs, email gateways, login portals), the target should be patching critical flaws within 24 to 72 hours of disclosure, not 30 days.
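
One way to put that target into practice is to compute a deadline per finding and flag whatever is overdue. The sketch below assumes a simple two-tier policy: 72 hours for critical flaws on internet-facing systems, 30 days for everything else. The Finding fields and asset names are hypothetical, and your own tiers may differ.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Finding:
    asset: str
    internet_facing: bool
    critical: bool
    disclosed: datetime


def patch_deadline(f: Finding) -> datetime:
    """Hypothetical policy: 72 hours for critical flaws on internet-facing
    systems, 30 days for everything else."""
    if f.internet_facing and f.critical:
        return f.disclosed + timedelta(hours=72)
    return f.disclosed + timedelta(days=30)


def overdue(findings: list[Finding], now: datetime) -> list[str]:
    """Assets whose deadline has passed, most overdue first."""
    late = [(now - patch_deadline(f), f.asset)
            for f in findings if now > patch_deadline(f)]
    return [asset for _, asset in sorted(late, reverse=True)]


disclosed = datetime(2026, 5, 11, 9, 0)
findings = [
    Finding("vpn-gateway", internet_facing=True, critical=True, disclosed=disclosed),
    Finding("intranet-wiki", internet_facing=False, critical=True, disclosed=disclosed),
]
print(overdue(findings, now=datetime(2026, 5, 15, 9, 0)))  # ['vpn-gateway']
```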

Know What AI-Written Exploits Look Like

The AI fingerprints Google identified (excessive documentation, fabricated severity scores, textbook code structure, help menus) are a temporary detection opportunity. If your security team investigates an incident and finds exploit code on a compromised system, these patterns can help determine whether AI was involved. Think of it like identifying a counterfeiter: the first generation of AI-generated exploits has tells that experienced analysts can spot. Those tells will fade as attackers learn to clean up after their AI, but right now, they are useful.
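
As a rough illustration, three of those tells can be turned into a triage heuristic for a recovered Python script. This is a sketch, not a detector: the 30% comment threshold and the regular expressions are arbitrary assumptions, and “textbook structure” is not measured at all. A hit means the file deserves a closer look and nothing more.

```python
import re


def ai_authorship_signals(source: str) -> dict[str, bool]:
    """Flag some of the tells described above in a Python script.

    Heuristic only: a hit suggests closer review, not proof of AI authorship.
    """
    lines = [ln.strip() for ln in source.splitlines() if ln.strip()]
    comment_lines = sum(ln.startswith("#") for ln in lines)
    return {
        # The 30% comment-line threshold is an arbitrary assumption.
        "heavy_documentation": bool(lines) and comment_lines / len(lines) >= 0.30,
        "has_docstrings": '"""' in source or "'''" in source,
        # A CVSS mention followed closely by a number such as 9.8.
        "mentions_cvss_score": re.search(r"CVSS[^\n]{0,20}\d{1,2}\.\d", source, re.I) is not None,
        "builds_help_menu": re.search(r"argparse|--help|usage:", source, re.I) is not None,
    }


sample = '''\
# Step 1: parse command-line arguments
# CVSS score: 9.8 (Critical)
import argparse
parser = argparse.ArgumentParser(description="Demo tool")
# Step 2: run the main routine
'''
signals = ai_authorship_signals(sample)
print(signals)
```

The check only notes that a severity score is mentioned. Deciding whether the score is fabricated still means comparing it against the real vulnerability database entry, if one exists.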

Audit Your Two-Factor Authentication

The specific zero-day exploited a flaw in how a product implemented two-factor authentication. The software had a built-in exception that trusted certain login attempts without requiring the second step. These kinds of shortcuts are common in software development (they make testing easier, or they accommodate legacy systems), and they are exactly the kind of subtle flaw that AI is good at finding.

Ask your IT team or security vendor: are there any conditions under which our two-factor authentication can be bypassed? Is there a fallback path that skips the second factor? Are there API endpoints (programmatic access points) that do not enforce it?
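
Those questions amount to checking every combination of login path and condition, not just the main one. A minimal sketch of that idea follows. The policy function is a made-up stand-in with two exemptions planted for illustration. In a real audit, each result would come from testing the actual system.

```python
from itertools import product


# Hypothetical stand-in for a product's behavior. In a real audit, this
# answer would be observed by testing each path, not written by hand.
def requires_second_factor(entry_point: str, client: str, recovery_flow: bool) -> bool:
    if entry_point == "api" and client == "legacy":
        return False   # planted exemption: legacy API clients
    if recovery_flow:
        return False   # planted exemption: account recovery path
    return True


ENTRY_POINTS = ["web", "api"]
CLIENTS = ["browser", "legacy"]
RECOVERY = [False, True]


def find_bypasses() -> list[tuple[str, str, bool]]:
    """Enumerate every combination and return those that skip the second factor."""
    return [
        combo for combo in product(ENTRY_POINTS, CLIENTS, RECOVERY)
        if not requires_second_factor(*combo)
    ]


for entry_point, client, recovery in find_bypasses():
    print(f"2FA not enforced: entry={entry_point} client={client} recovery={recovery}")
```

Enumerating the full matrix is the design choice that matters here. Exemptions tend to hide in combinations nobody tests by hand.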

Prepare for More Exploits, Faster

APT45’s “thousands of repetitive prompts” approach is a preview. When vulnerability research becomes a volume game played by AI, the number of working exploits for known flaws will increase and the time between a flaw being disclosed and an exploit being available will decrease. Your security team should be planning for a world where every major vulnerability has a working exploit within days of disclosure, not weeks or months.

The Bigger Picture

The security industry has spent three years debating whether AI would be used to build cyberweapons. Google’s report ends that debate. It is happening. The first AI-assisted zero-day was caught in active use. The next one may not be caught at all.

The gap between Anthropic’s controlled disclosure of 2,000+ AI-discovered flaws in April and a real-world AI-generated exploit in May was about one month. The distance from APT45’s brute-force approach to something more refined is measured in AI model generations, not years. The models themselves are improving faster than defensive tooling is adapting.

There is a silver lining. The exploit Google caught was identifiable precisely because AI leaves fingerprints that human developers do not. The help menus, the documentation, the fabricated scores: these are artifacts of an AI trying to be helpful in a context where helpfulness is a tell. That detection window is real, but it is closing. As models improve and attackers learn to strip the artifacts, the fingerprints will fade.


For decision-makers, the takeaway is straightforward: the speed and scale at which attackers can find and exploit security flaws just changed. The organizations that adapt their patching speed, their detection capabilities, and their risk models to reflect this new reality will be in a stronger position than those that treat it as a future problem. The future arrived on May 11.

Sources

The Hacker News: Hackers Used AI to Develop First Known Zero-Day 2FA Bypass for Mass Exploitation (May 11, 2026)
Google Threat Intelligence Group, AI Threat Tracker, Third Edition (May 11, 2026)
The Hacker News: Google Uncovers PROMPTFLUX Malware That Uses Gemini AI to Rewrite Its Code Hourly (November 5, 2025)
The Hacker News: Anthropic’s Claude Mythos Finds Thousands of Zero-Day Flaws (April 7, 2026)
Intruvent Technologies, Golden CTI Database (Codex): Andariel (G0138), Threat Group-3390 (G0027), Lazarus Group (G0032), queried May 14, 2026
MITRE ATT&CK: Andariel (G0138), Threat Group-3390 (G0027)
Xint Code: Copy Fail, 732 Bytes to Root (April 29, 2026)
