# HarnessX: A Foundry for Composable, Adaptive, and Evolvable Agent Harnesses

- Source: HuggingFace Daily Papers (community trending papers)
- Published: 2026-06-12 08:00
- AIHOT score: 42
- AIHOT link: https://aihot.virxact.com/items/cmqemsqua03pfslun6xfd4wns
- Original link: https://arxiv.org/abs/2606.14249

## AI Summary

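HarnessX is a foundry for agent runtime harnesses. It assembles composable harnesses from typed primitives via a substitution algebra, and adapts them with AEGIS, a trace-driven multi-agent evolution engine that feeds execution trajectories back into both harness updates and model training. Across five benchmarks (ALFWorld, GAIA, WebShop, tau³-Bench, and SWE-bench Verified), HarnessX improves results by +14.5% on average and by up to +44.0%, with larger gains where baselines are lower. The complete codebase will be open-sourced in a future release.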

## Main Text

AI agent performance depends critically on the runtime harness, comprising the prompts, tools, memory, and control flow that mediate how a model observes, reasons, and acts. Yet today's harnesses remain largely hand-crafted and static: each new model or task still demands bespoke scaffolding, and the rich traces produced during execution are rarely distilled back into systematic improvement. We introduce HarnessX, a foundry for composable, adaptive, and evolvable agent harnesses. HarnessX assembles typed harness primitives via a substitution algebra, adapts them through AEGIS, a trace-driven multi-agent evolution engine grounded in an operational mirror between symbolic adaptation and reinforcement learning, and closes the harness-model loop by turning trajectories into both harness updates and model training signal. Across five benchmarks (ALFWorld, GAIA, WebShop, tau^3-Bench, and SWE-bench Verified), HarnessX yields an average gain of +14.5% (up to +44.0%), with gains largest where baselines are lowest. These results suggest that agent progress need not come from model scaling alone: composing and evolving runtime interfaces from execution feedback is an actionable and complementary lever. The complete codebase will be open-sourced in a future release.
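To make the idea of typed primitives plus substitution more concrete, here is a minimal Python sketch. It is not the HarnessX API, which has not been released: the names `Harness`, `substitute`, `evolve`, and `toy_score` are hypothetical, the four slots simply mirror the prompts, tools, memory, and control flow named in the abstract, and the greedy accept-if-better loop is a plain stand-in for feedback-driven adaptation, not AEGIS itself.

```python
from dataclasses import dataclass, fields, replace
from typing import Callable, Dict, Iterable, Tuple


@dataclass(frozen=True)
class Harness:
    """Hypothetical harness with one typed slot per component in the abstract."""
    prompt: str
    tools: Tuple[str, ...]
    memory_size: int
    max_steps: int


def substitute(harness: Harness, **changes) -> Harness:
    """Return a new harness with the named slots swapped.

    A replacement is rejected if the slot does not exist or if its type
    differs from the type of the value currently in that slot.
    """
    slot_names = {f.name for f in fields(harness)}
    for name, value in changes.items():
        if name not in slot_names:
            raise KeyError(f"unknown slot: {name}")
        expected = type(getattr(harness, name))
        if not isinstance(value, expected):
            raise TypeError(f"slot {name!r} expects {expected.__name__}")
    return replace(harness, **changes)


def evolve(
    base: Harness,
    candidates: Iterable[Dict[str, object]],
    score: Callable[[Harness], float],
) -> Harness:
    """Greedy stand-in for adaptation: keep a substitution only if it scores higher."""
    best, best_score = base, score(base)
    for changes in candidates:
        trial = substitute(best, **changes)
        trial_score = score(trial)
        if trial_score > best_score:
            best, best_score = trial, trial_score
    return best


def toy_score(h: Harness) -> float:
    """Assumed scoring rule for illustration only; a real one would run the agent."""
    return len(h.tools) + h.memory_size / 10


base = Harness(prompt="You are an agent.", tools=("search",), memory_size=4, max_steps=10)
candidates = [
    {"tools": ("search", "browser")},
    {"memory_size": 2},
    {"memory_size": 8},
]
best = evolve(base, candidates, toy_score)
```

Under the toy scoring rule, the first and third substitutions are accepted and the second is rejected, so `best` ends up with two tools and `memory_size` 8 while `base` is left unchanged. In a real system the score would come from executing the agent and inspecting its traces rather than from a closed-form function.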
