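AI Summary
Zhipu's GLM-5.2 completed 48 of 70 trials on an internal set of 35 challenging mobile development tasks (70 trials in total), more than double GLM-5.1's 21 of 70; by comparison, Claude Fable 5 scored 56 of 70. The main post argues that long-horizon capability should be applied in real-world scenarios and says more scenarios are coming.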
Long-horizon is more than a concept. It should live in real-world scenarios, empowering AI builders to solve the problems that matter.
And more scenarios are on the way.
GLM-5.2 delivers a substantial leap in app development capabilities, which also represent demanding long-horizon tasks. Results:
- GLM-5.1: 21/70
- GLM-5.2: 48/...