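An analysis of Claude Mythos and GPT-5.5 finds the two essentially tied on cybersecurity capabilities, with GPT-5.5 possibly the more cost-efficient. Mythos is slightly ahead on some general benchmarks and on SWE-bench Pro, but this does not amount to a significant capability breakthrough. The analysis concludes that Mythos's performance is in line with the prior trend rather than a large jump off it. Meanwhile, OpenAI has recently shipped several strong releases, which throws into relief the question of why Claude Mythos is still kept so secret.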
A very worthwhile Substack article (written by @natalia__coelho) that focuses on the cyber capabilities of Claude Mythos and GPT-5.5.
tl;dr: according to the analysis, GPT-5.5 is basically tied with Claude Mythos Preview on cyber capabilities and may even be more cost-efficient. Mythos looks slightly ahead on some general benchmarks and SWE-bench Pro, but not by enough to count as a major capability leap.
OpenAI has recently shipped some truly outstanding releases. Against this backdrop, the question arises of why Claude Mythos remains so secretive.