AgentSPEX: An Agent Specification and Execution Language
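Source: arxiv.org
AgentSPEX is a specification and execution language for LLM agents. It uses explicit control flow and modular structure to address a weakness of existing frameworks: workflow logic that is tightly coupled with Python and therefore hard to maintain. The system supports typed steps, branching and loops, parallel execution, and reusable submodules, and it ships with a visual editor and a customizable execution environment (with sandboxing, checkpointing, and logging). It is evaluated on 7 benchmarks, and a user study finds its workflow-authoring paradigm more interpretable and accessible than a popular existing agent framework. Ready-to-use agents for deep research and scientific research are also provided.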
Language-model agent systems commonly rely on reactive prompting, in which a single instruction guides the model through an open-ended sequence of reasoning and tool-use steps, leaving control flow and intermediate state implicit and making agent behavior potentially difficult to control. Orchestration frameworks such as LangGraph, DSPy, and CrewAI impose greater structure through explicit workflow definitions, but tightly couple workflow logic with Python, making agents difficult to maintain and modify. In this paper, we introduce AgentSPEX, an Agent SPecification and EXecution Language for specifying LLM-agent workflows with explicit control flow and modular structure, along with a customizable agent harness. AgentSPEX supports typed steps, branching and loops, parallel execution, reusable submodules, and explicit state management, and these workflows execute within an agent harness that provides tool access, a sandboxed virtual environment, and support for checkpointing, verification, and logging. Furthermore, we provide a visual editor with synchronized graph and workflow views for authoring and inspection. We include ready-to-use agents for deep research and scientific research, and we evaluate AgentSPEX on 7 benchmarks. Finally, we show through a user study that AgentSPEX provides a more interpretable and accessible workflow-authoring paradigm than a popular existing agent framework.
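The abstract does not show AgentSPEX syntax, so the following is only a minimal Python sketch of the general idea it describes: a workflow expressed as data, with explicit branching, loops, and state, run by a small interpreter. The names `Call`, `Branch`, `Loop`, and `run` are hypothetical and are not taken from AgentSPEX, and plain Python functions stand in for LLM and tool calls. Parallel execution, submodules, sandboxing, and checkpointing are omitted.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List, Union

State = Dict[str, Any]  # explicit workflow state, passed to every step


@dataclass
class Call:
    """Hypothetical step: run `fn` on the state and store the result under `out`."""
    fn: Callable[[State], Any]
    out: str


@dataclass
class Branch:
    """Hypothetical step: run `then` if `cond` holds, else `otherwise`."""
    cond: Callable[[State], bool]
    then: List["Step"]
    otherwise: List["Step"] = field(default_factory=list)


@dataclass
class Loop:
    """Hypothetical step: repeat `body` while `cond` holds, up to `max_iters` times."""
    cond: Callable[[State], bool]
    body: List["Step"]
    max_iters: int = 10


Step = Union[Call, Branch, Loop]


def run(steps: List[Step], state: State, log: List[str]) -> State:
    """Interpret the steps in order, mutating `state` and appending to `log`."""
    for step in steps:
        if isinstance(step, Call):
            state[step.out] = step.fn(state)
            log.append(f"call -> {step.out}")
        elif isinstance(step, Branch):
            holds = step.cond(state)
            log.append("branch -> " + ("then" if holds else "otherwise"))
            run(step.then if holds else step.otherwise, state, log)
        elif isinstance(step, Loop):
            iters = 0
            while iters < step.max_iters and step.cond(state):
                run(step.body, state, log)
                iters += 1
            log.append(f"loop x{iters}")
    return state


# A toy workflow: initialize a score, refine until it reaches 3, then decide.
workflow: List[Step] = [
    Call(lambda s: 0, out="score"),
    Loop(cond=lambda s: s["score"] < 3,
         body=[Call(lambda s: s["score"] + 1, out="score")]),
    Branch(cond=lambda s: s["score"] >= 3,
           then=[Call(lambda s: "accept", out="verdict")],
           otherwise=[Call(lambda s: "revise", out="verdict")]),
]

log: List[str] = []
final = run(workflow, {}, log)
print(final)
print(log)
```

Because the control flow lives in the `workflow` data structure rather than inside a prompt, the log records which branch was taken and how many loop iterations ran, which is the kind of inspectability the abstract attributes to explicit workflow definitions.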