# Alibaba Research Shows a New AI Threat: Multi-Agent Collaboration Can Automatically Generate Software Exploit Code

- Source: Rohan Paul (@rohanpaul_ai)
- Published: 2026-05-17 17:29
- AIHOT score: 61
- AIHOT link: https://aihot.virxact.com/items/cmp9l4a9v0rq0slnzxfngd0y1
- Original link: https://x.com/rohanpaul_ai/status/2055943658080514243

## AI Summary

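An Alibaba research paper shows that AI is moving from finding vulnerabilities to generating working exploit code. Its proposed VulnSage framework uses a multi-agent workflow that breaks the process into dataflow extraction, rewriting the path as natural-language constraints, candidate exploit generation, and sandbox validation with reflection. The system's key advance is turning code understanding into reasoning about how the code is meant to be used, which lets it generate exploits against more complex, realistic software. In the evaluation, the authors report 34.64% more successful exploits than prior tools on SecBench.js and 146 zero-days in real packages, echoing the Google CEO's warning that frontier models could upend software security.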

## Full Text

Alibaba has published a paper that gives a strong example of what Sundar Pichai is warning about.

It shows AI moving beyond bug finding and into actually proving software is exploitable.

This paper asks a simple question with hard consequences: can LLMs confirm software vulnerabilities by actually building working exploits?

The authors' answer is yes, but only when the model stops acting like a single genius and starts acting like a team.

That sounds minor until you look at the mechanism.

Automated exploit generation usually fails for familiar reasons. Fuzzers miss deep paths. Symbolic execution chokes on messy real code, especially when the right input is not just a value but a carefully assembled object, class instance, or string with the right structure.

A plain LLM is not enough either. It can imitate code, but it loses the thread, hallucinates details, and struggles to repair its own mistakes once execution fails.

VulnSage's real move is to turn exploit generation into a workflow.
- One agent extracts the vulnerable dataflow.
- Another rewrites that path as natural-language constraints.
- Another generates candidate exploits.
- Then a validation agent runs them in a sandbox, and reflection agents use the resulting traces and errors to refine the next attempt or conclude the alert was probably a false positive.
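
The control flow of the steps listed above can be sketched as a small loop. This is a minimal illustration, not the paper's implementation: every name here (`run_workflow`, `Attempt`, `Result`, the five callables) is hypothetical, the agents are plain stub functions rather than LLM calls, and the "sandbox" in the demo is a string comparison that executes nothing.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Attempt:
    candidate: str  # candidate input proposed by the generator agent
    ok: bool        # True if the validator confirmed the behaviour
    trace: str      # error or trace text handed to the reflection agent


@dataclass
class Result:
    confirmed: bool = False
    likely_false_positive: bool = False
    attempts: List[Attempt] = field(default_factory=list)


def run_workflow(
    alert: str,
    extract_dataflow: Callable[[str], List[str]],
    to_constraints: Callable[[List[str]], List[str]],
    generate: Callable[[List[str], List[str]], str],
    validate: Callable[[str], Attempt],
    reflect: Callable[[Attempt], Optional[str]],
    max_rounds: int = 3,
) -> Result:
    """Extract, rewrite, then loop: generate, validate, reflect."""
    constraints = to_constraints(extract_dataflow(alert))
    feedback: List[str] = []  # hints accumulated from failed rounds
    result = Result()
    for _ in range(max_rounds):
        attempt = validate(generate(constraints, feedback))
        result.attempts.append(attempt)
        if attempt.ok:
            result.confirmed = True
            return result
        hint = reflect(attempt)
        if hint is None:  # reflection sees no way forward
            result.likely_false_positive = True
            return result
        feedback.append(hint)
    return result  # round budget exhausted without a verdict


def demo(reflect_gives_up: bool = False) -> Result:
    """Wire the loop to toy stubs so the control flow can be observed."""
    def extract(alert: str) -> List[str]:
        return ["entry: options argument", "sink: merge(target, options)"]

    def rewrite(path: List[str]) -> List[str]:
        return [f"The input must pass through {step}." for step in path]

    def generate(constraints: List[str], feedback: List[str]) -> str:
        # The stub changes its answer only after receiving feedback.
        return "nested-object" if feedback else "plain-string"

    def validate(candidate: str) -> Attempt:
        ok = candidate == "nested-object"
        return Attempt(candidate, ok, "" if ok else "TypeError: expected object")

    def reflect(attempt: Attempt) -> Optional[str]:
        if reflect_gives_up:
            return None
        return "Pass an object instead of a string."

    return run_workflow("alert-1", extract, rewrite, generate, validate, reflect)
```

Calling `demo()` takes two rounds: the first candidate fails, the reflection hint changes the second candidate, and validation then succeeds. Calling `demo(reflect_gives_up=True)` stops after one round and marks the alert as a likely false positive, mirroring the two outcomes the last bullet describes.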

Here's the part most people miss.

The point is that the hard part is often not "solve these equations," but "figure out how this code expects to be used." Their system writes the problem in ordinary language so the model can reason about code structure, like which object to build and which method path keeps the malicious input alive.
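
To make "writes the problem in ordinary language" concrete, here is a rough sketch of turning a dataflow path into plain-English constraints. The paper's system presumably uses an LLM agent for this step; the fixed templates, the `Step` type, and the step labels below are assumptions made only for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Step:
    kind: str    # hypothetical labels: "entry", "call", "guard", "sink"
    detail: str  # human-readable description of the code location


# One sentence template per step kind (illustrative wording only).
TEMPLATES = {
    "entry": "The input enters through {detail}.",
    "call": "Reaching the next step requires calling {detail}.",
    "guard": "The input must satisfy this check to continue: {detail}.",
    "sink": "The input must arrive unchanged at {detail}.",
}


def describe_path(steps: List[Step]) -> List[str]:
    """Render each dataflow step as a numbered natural-language constraint."""
    lines = []
    for i, step in enumerate(steps, start=1):
        template = TEMPLATES.get(step.kind, "Unclassified step: {detail}.")
        lines.append(f"{i}. " + template.format(detail=step.detail))
    return lines


if __name__ == "__main__":
    path = [
        Step("entry", "the exported function set(obj, path, value)"),
        Step("guard", "path is a non-empty string"),
        Step("sink", "the property assignment inside the loop"),
    ]
    print("\n".join(describe_path(path)))
```

The resulting numbered sentences are the kind of text a generator agent could read in place of solver-style equations, which is the shift the paragraph above describes.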

The concerning part is that this makes exploit generation work on messier, more realistic software where older methods often fail. In other words, the paper's claim is not just "we solved constraints differently," but "we can now turn code understanding itself into a path to real exploits."

In the paper's evaluation, the authors report 34.64% more successful exploits than prior tools on SecBench.js, and 146 zero-days in real packages.

The win is not that LLMs magically solve exploitation. It is that they become useful once they are forced to read, act, fail, and learn like a security researcher.

----

Paper Link - arxiv.org/abs/2604.05130

Paper Title: "A Multi-Agent Framework for Automated Exploit Generation with Constraint-Guided Comprehension and Reflection"

### Quoted Tweet

> Rohan Paul: Google CEO Sundar Pichai on current frontier model's ability to break the security of almost all current software. "These models are definitely, like really gon...
