Mistral AI and NVIDIA Jointly Release Open-Source Model Mistral NeMo

2024-07-18 00:00

AI summary

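The Mistral AI team, in collaboration with NVIDIA, has released Mistral NeMo, a 12B-parameter large language model. It offers a context window of up to 128k tokens and reaches state-of-the-art reasoning, world knowledge, and coding ability for its size category. The model is built on a standard architecture, is a drop-in replacement for Mistral 7B, and supports FP8 inference. Mistral NeMo is open-sourced under the Apache 2.0 license in pre-trained and instruction-tuned versions; the weights are published on HuggingFace and the model can be called through Mistral's API platform. The newly introduced Tekken tokenizer was trained on more than 100 languages and compresses text in many languages significantly more efficiently than its predecessor.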

Original text · untranslated

Today, we are excited to release Mistral NeMo, a 12B model built in collaboration with NVIDIA. Mistral NeMo offers a large context window of up to 128k tokens. Its reasoning, world knowledge, and coding accuracy are state-of-the-art in its size category. As it relies on standard architecture, Mistral NeMo is easy to use and a drop-in replacement in any system using Mistral 7B.

We have released pre-trained base and instruction-tuned checkpoints under the Apache 2.0 license to promote adoption for researchers and enterprises. Mistral NeMo was trained with quantisation awareness, enabling FP8 inference without any performance loss.
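
To give a rough sense of why FP8 inference matters for a 12B model, the sketch below estimates the memory needed for the weights alone at different precisions. The 16-bit comparison point and the helper name `weight_memory_gb` are assumptions for illustration and do not come from the announcement; the estimate also ignores activations, the KV cache, and runtime overhead, and it treats "12B" as exactly 12 billion parameters.

```python
def weight_memory_gb(num_params: float, bits_per_param: int) -> float:
    """Approximate weight storage in decimal gigabytes (1 GB = 1e9 bytes).

    Counts only the model weights; activations and KV cache are excluded.
    """
    bytes_total = num_params * bits_per_param / 8
    return bytes_total / 1e9


# Assumption: "12B" is taken as exactly 12e9 parameters for this estimate.
NUM_PARAMS = 12e9

# 16-bit is included only as a hypothetical comparison point.
for label, bits in [("FP8 (8-bit)", 8), ("16-bit", 16)]:
    print(f"{label}: ~{weight_memory_gb(NUM_PARAMS, bits):.1f} GB of weights")
```

Under these assumptions, storing weights in 8 bits takes half the memory of a 16-bit representation, which is the practical appeal of a model that tolerates FP8 inference.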

The following table compares the accuracy of the Mistral NeMo base model with two recent open-source pre-trained models, Gemma 2 9B and Llama 3 8B.

Table 1: Mistral NeMo base model performance compared to Gemma 2 9B and Llama 3 8B.

Multilingual Model for the Masses

The model is designed for global, multilingual applications. It is trained on function calling, has a large context window, and is particularly strong in English, French, German, Spanish, Italian, Portuguese, Chinese, Japanese, Korean, Arabic, and Hindi. This is a new step toward bringing frontier AI models to everyone’s hands in all languages that form human culture.

Figure 1: Mistral NeMo performance on multilingual benchmarks.

Tekken, a more efficient tokenizer

Mistral NeMo uses a new tokenizer, Tekken, based on Tiktoken, that was trained on more than 100 languages and compresses natural language text and source code more efficiently than the SentencePiece tokenizer used in previous Mistral models. In particular, it is ~30% more efficient at compressing source code, Chinese, Italian, French, German, Spanish, and Russian. It is also 2x and 3x more efficient at compressing Korean and Arabic, respectively. Compared to the Llama 3 tokenizer, Tekken proved to be more proficient in compressing text for approximately 85% of all languages.
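
One way to make a tokenizer "compression" comparison concrete is to count how many tokens two tokenizers produce for the same text. The sketch below is only illustrative: the announcement does not state which metric was used, and the two toy tokenizers (`char_tokenizer`, `word_tokenizer`) are hypothetical stand-ins rather than Tekken or SentencePiece. Any callable that maps a string to a list of tokens could be swapped in.

```python
from typing import Callable, List

# A tokenizer here is any function that maps text to a list of tokens.
Tokenizer = Callable[[str], List[str]]


def tokens_per_char(tokenize: Tokenizer, text: str) -> float:
    """Tokens emitted per input character; lower means better compression."""
    if not text:
        raise ValueError("text must be non-empty")
    return len(tokenize(text)) / len(text)


def relative_savings(baseline: Tokenizer, candidate: Tokenizer, text: str) -> float:
    """Fraction of tokens saved by `candidate` relative to `baseline`.

    0.5 means the candidate needs half as many tokens for the same text.
    """
    base_count = len(baseline(text))
    if base_count == 0:
        raise ValueError("baseline produced no tokens")
    return 1.0 - len(candidate(text)) / base_count


# Hypothetical toy tokenizers, used only to make the example runnable.
def char_tokenizer(text: str) -> List[str]:
    return list(text)


def word_tokenizer(text: str) -> List[str]:
    return text.split()


sample = "def add(a, b): return a + b"
savings = relative_savings(char_tokenizer, word_tokenizer, sample)
print(f"token savings vs. baseline: {savings:.1%}")
```

With real tokenizers, the same measurement would be averaged over a representative corpus per language, since results on a single short string can be misleading.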
