Artificial Analysis@ArtificialAnlys

2026-06-05 02:12 · 28 days ago

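AI Summary

NVIDIA released Nemotron 3 Ultra today, with a focus on low-latency agentic performance. On Terminal-Bench v2.1, the model was tested against peers under 4 increasing turn limits. Thanks to its high inference speed, Nemotron 3 Ultra completed tasks faster than peers at every turn limit while keeping a competitive benchmark score, placing it on the performance-versus-time Pareto frontier for this evaluation. Time per task was computed from token usage and measured endpoint output speeds (measured on a pre-release deployment on blackboxai for Nemotron 3 Ultra), plus the actual time spent executing tools.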

Nemotron 3 Ultra was launched today, with a focus on low-latency agentic performance. We tested it against peers under restricted turn-usage limits on Terminal-Bench v2.1: @NVIDIA Nemotron 3 Ultra completes tasks at a much faster pace than peers due to its high inference speed, while scoring competitively on the benchmark.

In this analysis, each model is given a 'turn limit' within which it can complete tasks, inside a customized version of the Terminus 2 harness that advises it of this limit. We apply 4 increasing turn limits and trace each result's tradeoff between task latency and performance. Time per task, on the X axis, is calculated as decode time based on token usage and measured endpoint output speeds (for Nemotron 3 Ultra, speeds were measured on a pre-release deployment on @blackboxai), plus the actual time spent executing tools to complete the benchmark.
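The time-per-task calculation described above can be sketched roughly as follows. This is a minimal illustration, not the actual analysis code: the function name, parameter names, and the example numbers are all hypothetical, and it assumes decode time is simply output tokens divided by measured output speed.

```python
def time_per_task_seconds(output_tokens: int,
                          output_speed_tps: float,
                          tool_time_s: float) -> float:
    """Estimate task latency as decode time plus tool execution time.

    All names here are hypothetical. Assumes decode time equals
    output tokens divided by the measured endpoint output speed
    (tokens per second).
    """
    if output_speed_tps <= 0:
        raise ValueError("output speed must be positive")
    decode_time_s = output_tokens / output_speed_tps
    return decode_time_s + tool_time_s


# Made-up example: 12,000 output tokens at 300 tokens/s, plus 45 s of tool time.
example = time_per_task_seconds(12_000, 300.0, 45.0)
print(example)  # 40 s of decode + 45 s of tools
```

Under this simplification, a model with a higher output speed spends less time per task even when it uses the same number of tokens and tool calls.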

Nemotron 3 Ultra is the fastest across all turn limits and sits on the Pareto frontier for performance versus time per task for this evaluation.
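As a rough sketch of what sitting on the Pareto frontier means here: a result is on the frontier if no other result is both faster and higher scoring. The function name and the data points below are hypothetical and are not results from this evaluation.

```python
def pareto_frontier(points):
    """Return the (name, time_s, score) points not dominated by another.

    Lower time is better and higher score is better. Hypothetical helper
    for illustration only.
    """
    frontier = []
    best_score = float("-inf")
    # Walk from fastest to slowest; on equal time, try the higher score first.
    for name, time_s, score in sorted(points, key=lambda p: (p[1], -p[2])):
        # Keep a point only if it beats every faster point on score.
        if score > best_score:
            frontier.append((name, time_s, score))
            best_score = score
    return frontier


# Made-up results: B is slower than A and scores lower, so it is dominated.
results = [("A", 100.0, 0.40), ("B", 200.0, 0.35), ("C", 300.0, 0.50)]
print(pareto_frontier(results))
```

In this made-up data, A and C remain on the frontier: A is the fastest, and C trades more time for a higher score.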

Agents · Inference · Evaluation/Benchmarks
