Dynamic workflows (generating harnesses on the fly) are a new form of test-time compute.
But LLMs aren't great at building them. I often have to steer agents to generate complex patterns.
Curious how effective Mythos/GPT-5.6 is at dynamically generating complex workflows.