Intel Node

Defense at AI speed: Microsoft’s new multi-model agentic security system tops leading industry benchmark

critical•vulnerability•2026-05-12T22:00:00+00:00

vulnerabilitycvewindows

Today Microsoft is announcing a major step forward in AI-powered cyber defense: a new multi-model agentic scanning harness (codenamed MDASH). The post Defense at AI speed: Microsoft’s new multi-model agentic security system tops leading industry benchmark appeared first on Microsoft Security Blog .

In this article AI-powered vulnerability discovery at hyper-scale Codename: MDASH—Microsoft Security’s new multi-model agentic scanning harness Using codename MDASH for security research The 5. 12. 2026 Patch Tuesday cohort Two deep dives CVE-2026-33827—Remote unauthenticated UAF in tcpip. sys via SSRR CVE-2026-33824: Unauthenticated IKEv2 SA_INIT + fragmentation → double-free → LocalSystem RCE How capable is codename MDASH?

What this all means Conclusion Today Microsoft announced a major step forward in AI-powered cyber defense: our new agentic security system helped researchers find 16 new vulnerabilities across the Windows networking and authentication stack—including four Critical remote code execution flaws in components such as the Windows kernel TCP/IP stack and the IKEv2 service. They used the new Microsoft Security m ulti-mo d el a gentic s canning h arness (codename MDASH) which was built by Microsoft’s Autonomous Code Security team.

Unlike single-model approaches, the harness orchestrates more than 100 specialized AI agents across an ensemble of frontier and distilled models to discover, debate, and prove exploitable bugs end-to-end. Learn more and sign up to join the preview The results speak for themselves: 21 of 21 planted vulnerabilities found with zero false positives on a private test driver; 96% recall against five years of confirmed Microsoft Security Response Center (MSRC) cases in clfs. sys and 100% in tcpip. sys; and an industry-leading 88.

45% score on the public CyberGym benchmark of 1,507 real-world vulnerabilities—the top score on the leaderboard, roughly five points ahead of the next entry. The strategic implication is clear: AI vulnerability discovery has crossed from research curiosity into production-grade defense at enterprise scale, and the durable advantage lies in the agentic system around the model rather than any single model itself. Codename MDASH is being used by Microsoft security engineering teams and tested by a small set of customers as part of a limited private preview.

This post explains how codename MDASH works, what we shipped today, what we learned along the way, and how you can sign up for the private preview.    AI-powered vulnerability discovery at hyper-scale The Microsoft Autonomous Code Security (ACS) team was assembled to take AI-powered vulnerability research from a research curiosity to production engineering at enterprise scale.

Several members of this team came to Microsoft from Team Atlanta, the team that won the $20 million DARPA AI Cyber Challenge by building an autonomous cyber-reasoning system that found and patched real bugs in complex open-source projects. The lessons from that work, especially the level of engineering required to make the frontier language models perform professional-level security auditing, are what our new multi-model agentic scanning harness (codename MDASH) is built around.

View Source