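Mistral AI has released the OCR 4 model. In blind tests by independent annotators on 600+ real-world documents (12+ languages), OCR 4 was the preferred output, with an average win rate of 72%; it scores 85.20 on OlmOCRBench. OCR 4 also returns bounding boxes, typed-block classification, and inline confidence scores, serves as a component of Search Toolkit, supports 170 languages, and is compact enough to run in a single container.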
Mistral AI launched OCR 4 👀
OCR 4 posts win rates averaging 72%, alongside the top overall score on OlmOCRBench (85.20). In addition to the extracted text, it returns bounding boxes, typed-block classification, and inline confidence scores. OCR 4 is an ingestion component of Search Toolkit, Mistral's open-source, composable search framework. It supports 170 languages across 10 language groups and is compact enough to run in a single container.
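To make the structured output concrete, here is a minimal sketch of how a downstream ingestion step might use per-block confidence to route uncertain text to human review. Everything in it is an assumption for illustration: the `Block` class, its field names, the block-type labels, the `(x0, y0, x1, y1)` box layout, the 0 to 1 confidence range, and the 0.9 threshold are hypothetical and are not taken from Mistral's actual response schema.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Block:
    """Hypothetical stand-in for one OCR block (not Mistral's real schema)."""
    text: str
    block_type: str                           # assumed labels, e.g. "title", "table"
    bbox: Tuple[float, float, float, float]   # assumed (x0, y0, x1, y1) layout
    confidence: float                         # assumed to lie in [0, 1]


def split_by_confidence(
    blocks: List[Block], threshold: float = 0.9
) -> Tuple[List[Block], List[Block]]:
    """Return (accepted, flagged): blocks at or above the threshold, and the rest."""
    accepted = [b for b in blocks if b.confidence >= threshold]
    flagged = [b for b in blocks if b.confidence < threshold]
    return accepted, flagged


# Made-up sample data, purely for illustration.
sample = [
    Block("Invoice 2024-001", "title", (40.0, 30.0, 300.0, 60.0), 0.99),
    Block("Tota1 due: 1,2OO EUR", "paragraph", (40.0, 500.0, 300.0, 520.0), 0.61),
    Block("Item | Qty | Price", "table", (40.0, 100.0, 500.0, 400.0), 0.93),
]

accepted, flagged = split_by_confidence(sample)
for block in flagged:
    print(f"review needed ({block.confidence:.2f}): {block.text}")
```

A threshold split like this is only one possible policy; a real pipeline might instead weight confidence by block type or use the bounding box to show reviewers the source region.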