Chubby♨️@kimmonismus

2026-05-06 22:59 · 57 days ago

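AI Summary

An analysis of Claude Mythos and GPT-5.5 finds that the two are roughly on par in cybersecurity capabilities, and that GPT-5.5 may be more cost-efficient. Mythos is slightly ahead on some general benchmarks and on SWE-bench Pro, but this does not amount to a significant capability breakthrough. The analysis concludes that Mythos's performance is in line with prior trends rather than a large off-trend leap. Meanwhile, OpenAI has recently released several outstanding products, which throws into relief the question of why Claude Mythos remains so secretive.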

A very worthwhile Substack article (written by @natalia__coelho) that focuses particularly on the cyber capabilities of Claude Mythos and GPT-5.5.

tl;dr: according to the analysis, GPT-5.5 is basically tied with Claude Mythos Preview on cyber capabilities and may even be more cost-efficient. Mythos looks slightly ahead on some general benchmarks and on SWE-bench Pro, but not by enough to count as a major capability leap.

OpenAI has recently shipped some truly outstanding releases. Against this backdrop, the question arises as to why Claude Mythos remains so secretive.

Matthew Barnett: New post from @natalia__coelho on Mythos. She analyzes its capabilities using publicly reported benchmark results to determine whether the model represents a la...

Anthropic · OpenAI · Reasoning · Coding
