Mistral内容审核API（2024年11月7日，Mistral AI团队）

2024-11-07 00:00·603天前

AI 摘要

Mistral AI发布了新的内容审核API，与驱动Le Chat审核服务的系统相同。该API基于一个大语言模型（LLM）分类器，能够将文本输入划分为9个预定义类别。它提供两个端点，分别用于处理原始文本和对话内容，模型专为评估对话上下文中的最后一条消息而训练。该分类器原生支持多语言，包括阿拉伯语、中文、英语等11种语言，旨在为用户的应用提供可扩展、轻量且可定制的安全防护。

原文 · 未翻译

Safety plays a key role in making AI useful. At Mistral AI, we believe that system level guardrails are critical to protecting downstream deployments.That's why we are releasing a new content moderation API. It is the same API that powers the moderation service in Le Chat. We are launching it to empower our users to utilize and tailor this tool to their specific applications and safety standards.

Over the past few months, we've seen growing enthusiasm across the industry and research community for new LLM based moderation systems, which can help make moderation more scalable and robust across applications. Our model is an LLM classifier trained to classify text inputs into 9 categories defined below. We are releasing two end-points: one for raw text and one for conversational content. Undesirable content is very specific to a given context, therefore we've trained our model to classify the last message of conversation within a conversational context. Check out our technical documentation for more information. The model is natively multilingual and in particular trained on Arabic, Chinese, English, French, German, Italian, Japanese, Korean, Portuguese, Russian, Spanish.

The Content Moderation classifier leverages the most relevant policy categories for effective guardrails and introduces a pragmatic approach to LLM safety by addressing model-generated harms such as unqualified advice and PII. The full set of policy definitions and details on how to get started are available in our technical documentation .

We are sharing AUC PR across policies on our internal testset below.

We're working with our customers to build and share scalable, lightweight and customizable moderation tooling, and will continue to engage with the research community to contribute safety advancements to the broader field.

Mistral AI：News（网页）

43导出 Markdown

Mistral内容审核API（2024年11月7日，Mistral AI团队）

2024-11-07 00:00·603天前

阅读原文· mistral.ai

AI 摘要

原文 · 保持原样，未翻译