AI 摘要
这里有一个非常重要的教训,但你们中的一些人还没准备好进行这场对话。
there's a really important lesson here, but some of yall aren't ready for that conversation yet
Today we're releasing DeepSWE, a new standard for agentic coding benchmarks. On public leaderboards, top models often look relatively close in capability. DeepS...