Eric Zakariasson 分享其AI智能体编程工作流:先设定可验证的完成标准(如模型评估分、测试全绿、p95阈值等),再将任务包装成循环——智能体反复修改、测量、保留或回退,直到达标、多轮无改进、思路用尽或遇阻。通过MCP和/notify向Slack发送通知,需要决策时主动联系人类。循环在云端运行,可同时启动多个长循环,并穿插PR、一次性调查等短任务。提示词模板用/loop驱动迭代、/notify保持更新。
http://x.com/i/article/2070417295810166784
Human in the /loop
What I like most about coding with agents right now is the room to leave a few runs going and still get on with other work. When something finishes or needs a call, I show up.
This post is a short explainer of the setup I use, a definition of done the agent can score, a loop that keeps going until it should stop, pings so I know when to lean in.
Find something the agent can verify
Before kicking off a longer running task, I lock a definition of done. Examples I actually use:
- Model or eval work. Target is a score. Change the approach, run the eval, keep the change only if the number moved the right way. Closest to Karpathy's autoresearch for ML training loops.