# Code as Agent Harness

- Source: HuggingFace Daily Papers (trending community papers)
- Published: 2026-05-18 08:00
- AIHOT score: 53
- AIHOT link: https://aihot.virxact.com/items/cmpc5tl9900bysl2i371nxr1e
- Original paper: https://arxiv.org/abs/2605.18747

## AI Summary

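Recent research shows that in emerging agentic systems, the role of code is shifting from a target output to the operational substrate of the agent. This paper proposes "code as agent harness" as a unified view and systematically surveys three core layers that underpin agentic systems: an interface layer that connects agents to the external world; a mechanism layer of planning, memory, and feedback control that supports long-horizon execution; and a shared-code layer that supports multi-agent collaboration. The view covers application areas such as coding assistants and OS automation, identifies engineering challenges such as evaluation, verification, and state consistency, and offers a clear roadmap for building executable, verifiable, and stateful agent systems.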

## Main Text

Recent large language models (LLMs) have demonstrated strong capabilities in understanding and generating code, from competitive programming to repository-level software engineering. In emerging agentic systems, code is no longer only a target output. It increasingly serves as an operational substrate for agent reasoning, acting, environment modeling, and execution-based verification. We frame this shift through the lens of agent harnesses and introduce code as agent harness: a unified view that centers code as the basis for agent infrastructure. To systematically study this perspective, we organize the survey around three connected layers. First, we study the harness interface, where code connects agents to reasoning, action, and environment modeling. Second, we examine harness mechanisms: planning, memory, and tool use for long-horizon execution, together with feedback-driven control and optimization that make the harness reliable and adaptive. Third, we discuss scaling the harness from single-agent systems to multi-agent settings, where shared code artifacts support multi-agent coordination, review, and verification. Across these layers, we summarize representative methods and practical applications of code as agent harness, spanning coding assistants, GUI/OS automation, embodied agents, scientific discovery, personalization and recommendation, DevOps, and enterprise workflows. We further outline open challenges for harness engineering, including evaluation beyond final task success, verification under incomplete feedback, regression-free harness improvement, consistent shared state across multiple agents, human oversight for safety-critical actions, and extensions to multimodal environments. By centering code as the harness of agentic AI, this survey provides a unified roadmap toward executable, verifiable, and stateful AI agent systems.
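
To make the interface and mechanism layers more concrete, here is a minimal Python sketch of a harness loop with execution-based verification. It is an illustration under assumptions, not the survey's method: the names `Harness`, `Attempt`, `scripted_agent`, and the `solve` entry point are all hypothetical, and the "agent" is a scripted stand-in for an LLM. The harness runs each proposed program, checks it against test cases, stores the outcome as memory, and hands that memory back to the agent on the next step.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple


@dataclass
class Attempt:
    """One memory record: the code that was tried and the feedback it produced."""
    source: str
    passed: bool
    feedback: str


@dataclass
class Harness:
    """Hypothetical minimal harness: executes code, verifies it, remembers results."""
    tests: List[Tuple[tuple, object]]      # (args, expected) pairs
    entry_point: str = "solve"             # assumed function name in each candidate
    memory: List[Attempt] = field(default_factory=list)

    def verify(self, source: str) -> Attempt:
        """Run the candidate and compare its outputs with the expected values."""
        namespace: dict = {}
        try:
            # NOTE: exec() is not a sandbox; a real harness would isolate execution.
            exec(source, namespace)
            fn = namespace[self.entry_point]
            for args, expected in self.tests:
                got = fn(*args)
                if got != expected:
                    msg = f"{self.entry_point}{args} returned {got!r}, expected {expected!r}"
                    return Attempt(source, False, msg)
        except Exception as exc:           # runtime errors also become feedback
            return Attempt(source, False, f"{type(exc).__name__}: {exc}")
        return Attempt(source, True, "all tests passed")

    def run(self, propose: Callable[[List[Attempt]], Optional[str]],
            max_steps: int = 5) -> Optional[Attempt]:
        """Feedback-driven control loop: propose, execute, verify, remember."""
        for _ in range(max_steps):
            source = propose(self.memory)  # the agent sees all past feedback
            if source is None:
                break
            attempt = self.verify(source)
            self.memory.append(attempt)
            if attempt.passed:
                return attempt
        return None


def scripted_agent(memory: List[Attempt]) -> Optional[str]:
    """Stand-in for an LLM: replays fixed candidates, one per step."""
    candidates = [
        "def solve(a, b):\n    return a - b\n",  # wrong on purpose
        "def solve(a, b):\n    return a + b\n",  # correct
    ]
    return candidates[len(memory)] if len(memory) < len(candidates) else None


harness = Harness(tests=[((1, 2), 3), ((0, 0), 0)])
result = harness.run(scripted_agent)
print(result.passed, len(harness.memory))
print(harness.memory[0].feedback)
```

In this sketch the first candidate fails verification, the failure is recorded, and the second candidate passes. A model-backed `propose` function would read the stored feedback to repair its next attempt. In a multi-agent setting, the same `memory` list could in principle play the role of a shared artifact that several agents read and review.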
