# Anthropic releases Claude Opus 4.8, the new leader on the GDPval-AA benchmark

- Source: Artificial Analysis (@ArtificialAnlys)
- Published: 2026-05-29 00:57
- AIHOT score: 80
- AIHOT link: https://aihot.virxact.com/items/cmpprb12h00g0slm60nkg9meh
- Original link: https://x.com/ArtificialAnlys/status/2060042848268083411

## AI Summary

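Anthropic has officially released Claude Opus 4.8. With its 'max' effort setting, the model scored 1890 on Artificial Analysis's GDPval-AA benchmark, which focuses on agentic real-world work tasks. That is 137 points above its predecessor, Opus 4.7, and 121 points ahead of the next-best model, GPT-5.5 xhigh. Head-to-head, this implies a win rate of roughly 67% for Opus 4.8 against GPT-5.5 xhigh. Anthropic gave Artificial Analysis early access ahead of the public release so the model could be benchmarked.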

## Body

Anthropic just launched Claude Opus 4.8, and it is the new leader on our GDPval-AA benchmark for agentic real-world work tasks.

Opus 4.8 scored 1890 on GDPval-AA at launch with its 'max' effort setting, +137 points from Opus 4.7 and +121 points ahead of the next-best model, GPT-5.5 xhigh.

Compared head-to-head on the GDPval task set, this implies a ~67% win rate against GPT-5.5 xhigh.
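
The post does not say how the ~67% figure is derived. One plausible reading, assuming GDPval-AA scores behave like Elo ratings on the conventional 400-point logistic scale, is sketched below. The function name `elo_win_probability` and the 400-point scale are assumptions for illustration, not something the source confirms; the scores for Opus 4.7 and GPT-5.5 xhigh are simply back-computed from the stated gaps.

```python
def elo_win_probability(rating_gap: float, scale: float = 400.0) -> float:
    """Expected win rate of the higher-rated side under the standard Elo
    logistic curve. The 400-point scale is an assumption; the source does
    not state the scale used for GDPval-AA."""
    return 1.0 / (1.0 + 10.0 ** (-rating_gap / scale))


# Figures stated in the post.
opus_48 = 1890
gap_to_gpt_55_xhigh = 121
gap_to_opus_47 = 137

# Scores implied by the stated gaps (simple subtraction).
gpt_55_xhigh = opus_48 - gap_to_gpt_55_xhigh  # 1769
opus_47 = opus_48 - gap_to_opus_47            # 1753

win_rate = elo_win_probability(gap_to_gpt_55_xhigh)
print(f"Implied win rate vs GPT-5.5 xhigh: {win_rate:.1%}")  # prints 66.7%
```

Under these assumptions a 121-point gap maps to about 66.7%, which matches the ~67% quoted in the post.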

@AnthropicAI shared access with us ahead of the public release to benchmark this model and we're glad to see our benchmarks referenced in today's launch.

The rest of the Artificial Analysis Intelligence Index is in progress - we'll share final results soon!
