Baidu Inc. @Baidu_Inc

2026-06-23 17:56 · 9 days ago

AI Summary

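Baidu has open-sourced Unlimited OCR, a model designed to read long documents in a single pass. It has 3B total parameters with only 500M activated, and achieves end-to-end SOTA on OmniDocBench v1.5 and v1.6. Its core innovation is Reference Sliding Window Attention (R-SWA), which mimics how a person copies a book by hand: it keeps the source, the recent context, and the next focus in view while softly forgetting irrelevant information. With a constant KV cache size and lower attention cost, it can transcribe 40+ pages in a single forward pass without losing context or slowing down. The model has been open-sourced on GitHub and Hugging Face.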
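The post does not spell out how R-SWA works internally, so the following is only a rough illustration of the general idea behind a constant-size KV cache: a sliding window over recent tokens combined with a pinned block of reference (source) tokens. The function names, the `num_ref` and `window` parameters, and the hard cutoff are all assumptions made for this sketch. It is not Baidu's implementation, and it does not model the "soft forgetting" the summary mentions.

```python
def visible_positions(step, num_ref, window):
    """Return the absolute positions one decoding step may attend to.

    Hypothetical layout (an assumption, not taken from the post):
      - positions 0 .. num_ref-1 hold the pinned reference (source) tokens
      - generated tokens occupy positions num_ref, num_ref+1, ...
    `step` is the 0-based index of the generated token being produced.
    """
    reference = list(range(num_ref))           # always kept in view
    current = num_ref + step                   # absolute position of this token
    start = max(num_ref, current - window + 1) # oldest generated token still kept
    recent = list(range(start, current + 1))   # sliding window, includes current
    return reference + recent


def cache_size(step, num_ref, window):
    """Number of key/value entries needed at a given decoding step."""
    return len(visible_positions(step, num_ref, window))


if __name__ == "__main__":
    # With 4 reference tokens and a window of 3, the cache stops growing
    # once the window is full, no matter how long the output gets.
    for step in (0, 1, 2, 10, 1000):
        print(step, cache_size(step, num_ref=4, window=3))
```

Under these assumptions the cache never exceeds `num_ref + window` entries, which is one way a model could keep per-token attention cost flat across a 40+ page transcription.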

3B total parameters & 500M activated, yet powerful enough to transcribe 40+ pages in one pass while keeping context intact. Meet Unlimited OCR!

Baidu AI: We're open-sourcing Unlimited OCR - built to read long documents in one pass. With 3B total parameters and only 500M activated, Unlimited OCR sets new end-to-en...

Hugging Face · Multimodal model release
