AI Summary
Claude is typically better at software engineering and worse at math than frontier competitors.
Aggregating benchmarks into domain-specific ECI scores, we find that the Claude family's SWE-ECI averages 2.7 points higher than its general ECI, while its Math-ECI averages 1.8 points lower.
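
One plausible reading of the "average gap" above is a per-model difference between the domain-specific score and the general score, averaged over a model family. The Python sketch below illustrates that arithmetic only. The function name `average_gap`, the model keys, and every number in it are hypothetical placeholders rather than real ECI values, and the source does not say that this is exactly how the averages were computed.

```python
from statistics import mean


def average_gap(domain_scores, general_scores):
    """Mean of (domain score - general score) over models present in both mappings."""
    shared = sorted(set(domain_scores) & set(general_scores))
    if not shared:
        raise ValueError("no models in common between the two score sets")
    return mean(domain_scores[m] - general_scores[m] for m in shared)


# Placeholder scores for two hypothetical models; NOT real ECI data.
general = {"model_a": 100.0, "model_b": 110.0}
swe = {"model_a": 103.0, "model_b": 112.0}
math_scores = {"model_a": 98.0, "model_b": 109.0}

swe_gap = average_gap(swe, general)           # positive: stronger on SWE than overall
math_gap = average_gap(math_scores, general)  # negative: weaker on math than overall
print(f"SWE gap: {swe_gap:+.1f}, Math gap: {math_gap:+.1f}")
```

A positive gap means the family scores higher in that domain than its general score would suggest, and a negative gap means it scores lower.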