# DeepReinforce Releases Ornith-1.0, an Open-Source Family of Agentic Coding LLMs (MIT License)

- Source: Rohan Paul (@rohanpaul_ai)
- Published: 2026-06-25 23:47
- AIHOT score: 72
- AIHOT link: https://aihot.virxact.com/items/cmqtolq8005g7sl0ei0z2vwkk
- Original link: https://x.com/rohanpaul_ai/status/2070171975386112372

## AI Summary

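DeepReinforce has released Ornith-1.0, an MIT-licensed open-source family of agentic coding large language models covering 9B Dense, 31B Dense, 35B MoE, and the flagship 397B MoE (17B active parameters). The flagship model scores 82.4 on SWE-Bench Verified and 77.5 on Terminal-Bench 2.1, both surpassing Claude Opus 4.7; it also achieves the best results among open-source models of the same size on benchmarks such as SWE-Bench Pro (62.2) and Multilingual (78.9). The models are post-trained on top of Gemma 4 and Qwen 3.5 with a new self-improvement strategy: reinforcement learning not only generates solutions but also jointly optimizes a task-specific scaffold (covering the plan, memory pattern, tool rhythm, error handling, and so on). Even the smallest 9B model reaches 69.4 on SWE-Bench Verified. All models are released under the MIT license, which permits both commercial and research use.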

## Full Text

Another fantastic open source release.

DeepReinforce just dropped Ornith-1.0, an MIT-licensed open-source family of agentic coding LLMs.

The flagship Ornith-1.0-397B MoE (17B-active) is the most powerful model in the release, reporting 82.4 on SWE-Bench Verified and 77.5 on Terminal-Bench 2.1, surpassing Claude Opus 4.7 on both benchmarks.

The models are built on top of pretrained Gemma 4 and Qwen 3.5.

Ornith employs a novel self-improving training strategy. It changes the training target by asking the model to improve both the answer and the task scaffold, meaning the plan, memory pattern, tool rhythm, error handling, and search process that shape the answer.

During RL， the model proposes a better scaffold first， then uses it to produce solution rollouts， and the reward updates both stages together.

That makes the model less like a coder following one rigid checklist and more like a coder learning which checklist works for each type of bug, repo, or terminal task.
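The two-stage loop described above (propose a scaffold, roll out a solution under it, let one reward update both stages) can be illustrated with a deliberately tiny sketch. This is not DeepReinforce's implementation: the post does not specify the RL algorithm, so plain REINFORCE with tabular softmax policies is swapped in here as an assumption, and every name (`toy_reward`, `train`, the two task types, the two scaffolds) is hypothetical.

```python
import math
import random


def softmax(logits):
    """Convert a list of logits into probabilities."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sample(probs, rng):
    """Draw one index according to the given probabilities."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


def reinforce_update(logits, probs, chosen, advantage, lr):
    """In-place REINFORCE step: the gradient of log softmax is one_hot - probs."""
    for i in range(len(logits)):
        grad = (1.0 if i == chosen else 0.0) - probs[i]
        logits[i] += lr * advantage * grad


def toy_reward(task, scaffold, action):
    """Hypothetical reward: 1 only if the scaffold fits the task and the
    solution action fits the scaffold."""
    return 1.0 if (scaffold == task and action == scaffold) else 0.0


def train(num_steps=5000, lr=0.1, seed=0):
    rng = random.Random(seed)
    n_tasks, n_scaffolds, n_actions = 2, 2, 2
    # Stage 1 policy: which scaffold to propose for each task type.
    scaffold_logits = [[0.0] * n_scaffolds for _ in range(n_tasks)]
    # Stage 2 policy: which solution action to take under each scaffold.
    solution_logits = [[0.0] * n_actions for _ in range(n_scaffolds)]
    baseline = 0.0  # running mean reward, used to reduce variance

    for _ in range(num_steps):
        task = rng.randrange(n_tasks)
        scaffold_probs = softmax(scaffold_logits[task])
        scaffold = sample(scaffold_probs, rng)
        action_probs = softmax(solution_logits[scaffold])
        action = sample(action_probs, rng)

        reward = toy_reward(task, scaffold, action)
        advantage = reward - baseline
        # The same reward signal updates both stages.
        reinforce_update(scaffold_logits[task], scaffold_probs, scaffold, advantage, lr)
        reinforce_update(solution_logits[scaffold], action_probs, action, advantage, lr)
        baseline += 0.01 * (reward - baseline)

    return scaffold_logits, solution_logits


if __name__ == "__main__":
    scaffold_logits, solution_logits = train()
    for task, logits in enumerate(scaffold_logits):
        print("task", task, "scaffold probabilities", softmax(logits))
```

In this toy setting the scaffold policy learns a different preferred scaffold per task type, which mirrors the "which checklist works for each type of task" framing. A real system would replace the tables with an LLM policy and the toy reward with benchmark-style task outcomes.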

The most interesting result is the 9B model reaching 69.4 on SWE-Bench Verified.

### Quoted Tweet

> Ornith: Aloha! 🌺 Meet Ornith-1.0, a family of open-source LLMs specialized for agentic coding. Ornith-1.0 spans the full parameter sizes including 9B Dense, 31B Dense,...
