Mistral AI's open-source model Leanstral 1.5 is built for formal verification in Lean 4
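Source: the-decoder.com. Mistral AI has released Leanstral 1.5 (Apache 2.0 license), a model built for formal verification in the Lean 4 programming language. The model reaches 100% accuracy on the miniF2F benchmark, solves 587 of the 672 problems in PutnamBench, and posts top scores of 87% and 34% on FATE-H and FATE-X, respectively. Beyond math, the model scanned 57 open-source repositories in a code verification test and found 5 previously unknown bugs, including an overflow bug in the Rust library varinteger. The model is available through Hugging Face and a free API. Training involved mid-training, supervised fine-tuning, and reinforcement learning.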
Mistral's open-source Leanstral 1.5 aces formal math benchmarks and catches real bugs in code
Mistral AI released Leanstral 1.5, a free open-source model (Apache 2.0 license) built for formal verification in the Lean 4 programming language. Lean 4 is designed to formally verify mathematical proofs and software correctness.
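For readers unfamiliar with Lean 4, the following is a minimal sketch of what a formally verified statement looks like. It is not taken from Mistral's benchmarks or from Leanstral's output, and the theorem names `add_comm_example` and `reverse_twice` are made up for illustration. The first proof cites the standard-library lemma `Nat.add_comm`; the second is assumed to be closed by Lean's built-in `simp` tactic.

```lean
-- A mathematical statement with a machine-checked proof:
-- addition of natural numbers is commutative.
theorem add_comm_example (a b : Nat) : a + b = b + a :=
  Nat.add_comm a b

-- A small software-correctness property:
-- reversing a list twice gives back the original list.
theorem reverse_twice (xs : List Nat) : xs.reverse.reverse = xs := by
  simp
```

If either proof were wrong or incomplete, Lean would reject the file. That is why benchmark results in this setting are checked mechanically rather than graded by hand.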
Mistral says the model hits 100 percent on miniF2F, a formal math benchmark covering problems from high school level up to math olympiad difficulty. On PutnamBench, which includes 672 problems from the Putnam math competition, it solves 587. On the algebra benchmarks FATE-H and FATE-X, which test master's and doctoral-level tasks in areas like group theory and ring theory, it posts top scores of 87 and 34 percent, respectively.

The model was trained mainly for math, but Mistral says it also performs well at code verification. In a hands-on test, it scanned 57 open-source repositories and caught five previously unknown bugs, including an overflow bug in the Rust library varinteger. The model is available through Hugging Face and a free API. Training involved mid-training, supervised fine-tuning, and reinforcement learning.