ReMMD: A Realistic Multilingual Multi-Image Agentic Verification Framework for Multimodal Misinformation Detection

2026-06-23 08:00

AI Summary

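The paper proposes ReMMD, a framework comprising the benchmark ReMMDBench (500 samples, 2,756 images, five monolingual languages and two cross-lingual settings, multi-image posts, five-way veracity labels, and eight distortion labels) and ReMMD-Agent, a persistent-memory verifier. The agent decomposes each post into atomic points, builds a reusable evidence set, and outputs structured L1/L2/L3 predictions. Compared against proprietary systems, open LVLMs, MMD-Agent, and T2-Agent, ReMMD-Agent with GPT-5.2 achieves the best five-way veracity performance (41.80% accuracy, 39.12% macro-F1) while costing 17.5% less than MMD-Agent and 79.9% less than T2-Agent. The project is available at https://dang-ai.github.io/ReMMD.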

Original abstract

Multimodal misinformation detection is increasingly important because viral posts now combine long multilingual narratives, several images, mixed provenance, and subtle text-image framing errors. Existing benchmarks and methods remain poorly matched to this setting: they usually isolate short captions, single images, binary labels, or one manipulation source, while agentic verification remains costly under realistic evidence search. We present ReMMD, a realistic multilingual multi-image agentic verification framework for multimodal misinformation detection. ReMMD includes ReMMDBench, a real-world multimodal misinformation detection benchmark with 500 samples, 2,756 images, five monolingual languages, two cross-lingual settings, three text-length tiers, multi-image posts, five-way veracity labels, eight distortion labels, evidence provenance, and rationales. It also includes ReMMD-Agent, a persistent-memory verifier that decomposes posts into atomic points, builds a reusable evidence set, and predicts structured L1/L2/L3 outputs. Across proprietary systems, open LVLMs, MMD-Agent, and T2-Agent, ReMMD-Agent obtains the best five-way veracity performance, with 41.80% accuracy and 39.12% macro-F1 using GPT-5.2, while reducing cost by 17.5% relative to MMD-Agent and 79.9% relative to T2-Agent. The project is available at https://dang-ai.github.io/ReMMD.
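
The abstract reports accuracy and macro-F1 over a five-way veracity label set. As a reading aid, the sketch below shows how those two metrics are conventionally computed for a multi-class task. It is not the authors' evaluation code: the function names and the placeholder label names (`c0` through `c4`) are hypothetical, and a class with no gold or predicted instances is assumed to score an F1 of 0.

```python
from collections import Counter


def accuracy(gold, pred):
    """Fraction of samples whose predicted label equals the gold label."""
    if len(gold) != len(pred) or not gold:
        raise ValueError("gold and pred must be non-empty and equal in length")
    return sum(g == p for g, p in zip(gold, pred)) / len(gold)


def macro_f1(gold, pred, labels):
    """Unweighted mean of per-class F1 over the given label set."""
    if len(gold) != len(pred) or not gold:
        raise ValueError("gold and pred must be non-empty and equal in length")
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, pred):
        if g == p:
            tp[g] += 1
        else:
            fp[p] += 1  # predicted class p, but it was wrong
            fn[g] += 1  # gold class g was missed
    scores = []
    for label in labels:
        denom = 2 * tp[label] + fp[label] + fn[label]
        # Assumed convention: a class that never appears scores 0.
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical placeholder labels; the real five-way label names
# are not given in the abstract.
LABELS = ["c0", "c1", "c2", "c3", "c4"]
gold = ["c0", "c0", "c1", "c2", "c3", "c4"]
pred = ["c0", "c1", "c1", "c2", "c3", "c3"]
acc = accuracy(gold, pred)
mf1 = macro_f1(gold, pred, LABELS)
```

Because macro-F1 weights every class equally, it can sit below accuracy when some classes are rare or hard, which is consistent with the gap between the two figures reported above.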

HuggingFace Daily Papers (trending community papers)

Read the original: arxiv.org