# NVIDIA Releases Nemotron 3 Ultra, Focused on Low-Latency Agentic Performance

- Source: Artificial Analysis (@ArtificialAnlys)
- Published: 2026-06-05 02:12
- AIHOT score: 65
- AIHOT link: https://aihot.virxact.com/items/cmpztszeu00afsll3pahwepvf
- Original link: https://x.com/ArtificialAnlys/status/2062598349757567359

## AI Summary

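NVIDIA released Nemotron 3 Ultra today, with a focus on low-latency agentic performance. On Terminal-Bench v2.1, the model was tested against peers under 4 increasing turn limits. Thanks to its high inference speed, Nemotron 3 Ultra completed tasks faster than its peers at every turn limit while keeping a competitive benchmark score, placing it on the performance-versus-time Pareto frontier for this evaluation. Time per task was computed as decode time (from token usage and measured endpoint output speeds, with Nemotron 3 Ultra measured on a pre-release deployment on @blackboxai) plus the actual time spent executing tools.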

## Body

Nemotron 3 Ultra launched today with a focus on low-latency agentic performance. We tested it against peers under restricted turn limits on Terminal-Bench v2.1: @NVIDIA Nemotron 3 Ultra completes tasks much faster than its peers thanks to its high inference speed, while scoring competitively on the benchmark.

In this analysis, each model is given a 'turn limit' within which it can complete tasks, inside a customized version of the Terminus 2 harness that informs the model of this limit. We apply 4 increasing turn limits and trace the tradeoff between task latency and performance for each result. Time per task, on the X axis, is calculated as decode time based on token usage and measured endpoint output speeds (for Nemotron 3 Ultra, speeds were measured on a pre-release deployment on @blackboxai), plus the actual time spent executing tools to complete the benchmark.
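The time-per-task calculation described above can be sketched roughly as follows. This is a minimal illustration, not the analysis code: it assumes decode time is simply output tokens divided by the measured output speed (the post does not spell out the exact formula), and the `TaskRun` type, its field names, and the numbers are hypothetical placeholders.

```python
from dataclasses import dataclass


@dataclass
class TaskRun:
    """Hypothetical record for one benchmark task (field names are assumptions)."""
    output_tokens: int        # tokens generated by the model across all turns
    output_speed_tps: float   # measured endpoint output speed, tokens per second
    tool_seconds: float       # wall-clock time spent executing tools


def time_per_task(run: TaskRun) -> float:
    """Estimate task time in seconds: decode time plus tool execution time."""
    if run.output_speed_tps <= 0:
        raise ValueError("output_speed_tps must be positive")
    # Assumed decode-time model: tokens generated / tokens per second.
    decode_seconds = run.output_tokens / run.output_speed_tps
    return decode_seconds + run.tool_seconds


# Placeholder numbers for illustration only, not measurements from the post.
example = TaskRun(output_tokens=12_000, output_speed_tps=200.0, tool_seconds=45.0)
print(time_per_task(example))  # 12000 / 200 + 45 = 105.0
```

Under this model, a faster endpoint shrinks only the decode term, so the benefit of high inference speed is largest on tasks where generation dominates tool execution time.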

Nemotron 3 Ultra is the fastest across all turn limits and sits on the Pareto frontier for performance versus time per task for this evaluation.
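For readers unfamiliar with the term, a result sits on the Pareto frontier when no other result is both at least as fast and at least as high-scoring (and strictly better on one of the two). The sketch below shows one common way to compute such a frontier; the `Result` type, labels, and values are hypothetical and do not reproduce the figures from this evaluation.

```python
from typing import NamedTuple


class Result(NamedTuple):
    """Hypothetical (model, turn limit) data point."""
    label: str
    seconds: float   # time per task, lower is better
    score: float     # benchmark score, higher is better


def pareto_frontier(results: list[Result]) -> list[Result]:
    """Return the results not dominated by any other, ordered by time."""
    frontier: list[Result] = []
    best_score = float("-inf")
    # Visit fastest first; among equal times, the highest score first.
    for r in sorted(results, key=lambda r: (r.seconds, -r.score)):
        # A slower result stays only if it beats every faster result's score.
        if r.score > best_score:
            frontier.append(r)
            best_score = r.score
    return frontier


# Made-up points for illustration only.
points = [
    Result("model_a@limit1", 100.0, 0.30),
    Result("model_a@limit2", 180.0, 0.42),
    Result("model_b@limit1", 150.0, 0.28),
    Result("model_b@limit2", 260.0, 0.45),
]
print([r.label for r in pareto_frontier(points)])
```

In this made-up data, `model_b@limit1` is dominated because `model_a@limit1` is both faster and higher-scoring, while the remaining three points each offer a tradeoff that no other point beats.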