宝玉@dotey

2026-06-08 09:50·25天前

AI 摘要

宝玉指出，Agent 能否自我验证是长时间运行的关键，否则可能浪费 Token。@bcherny 的基准测试显示 Claude Opus 最适合长时间运行，并给出 5 条自主运行技巧：1. 使用自动权限模式；2. 部署动态工作流让 Claude 协调数百/数千个 Agent；3. 用 /goal 或 /loop 指令持续推进；4. 在云端运行 Claude Code 以便关闭笔记本；5. 确保端到端自我验证——通过 Chrome 浏览器扩展验证网页、iOS/Android 模拟器 MCP 验证移动端、启动完整 Web 服务验证后端。

长时间运行 Agent，Agent 能自行验证才是关键，否则可能只是浪费 Token

Boris ChernySeeing a number of benchmarks showing Opus is the best model for long-running work. Five tips for running Opus autonomously for hours/days: 1. Use auto mode for...

智能体 Anthropic MCP/工具大佬观点

在 X 查看原推导出 Markdown

宝玉@dotey · X

44导出 Markdown

2026-06-08 09:50·25天前

在 X 看原推· x.com

AI 摘要

长时间运行 Agent，Agent 能自行验证才是关键，否则可能只是浪费 Token

Boris ChernySeeing a number of benchmarks showing Opus is the best model for long-running work. Five tips for running Opus autonomously for hours/days: 1. Use auto mode for...

智能体 Anthropic MCP/工具大佬观点