# Web Retrieval-Aware Chunking (W-RAC) for Efficient and Cost-Effective RAG Systems

- Source: HuggingFace Daily Papers (community trending papers)
- Published: 2026-01-08 08:00
- AIHOT link: https://aihot.virxact.com/items/cmo6spv6305jrsl4r9klsl99i
- Original paper: https://arxiv.org/abs/2604.04936

## AI Summary

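The researchers propose W-RAC, a chunking framework designed specifically for web documents. It decouples text extraction from semantic chunk planning, manages content as structured, ID-addressable units, and uses the LLM only for retrieval-aware grouping decisions rather than text generation. The approach eliminates hallucination risk and improves system observability while cutting chunking-related LLM costs by an order of magnitude, with retrieval performance that matches or exceeds traditional methods.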

## Main Text

Retrieval-Augmented Generation (RAG) systems critically depend on effective document chunking strategies to balance retrieval quality, latency, and operational cost. Traditional chunking approaches, such as fixed-size, rule-based, or fully agentic chunking, often suffer from high token consumption, redundant text generation, limited scalability, and poor debuggability, especially for large-scale web content ingestion. In this paper, we propose Web Retrieval-Aware Chunking (W-RAC), a novel, cost-efficient chunking framework designed specifically for web-based documents. W-RAC decouples text extraction from semantic chunk planning by representing parsed web content as structured, ID-addressable units and leveraging large language models (LLMs) only for retrieval-aware grouping decisions rather than text generation. This significantly reduces token usage, eliminates hallucination risks, and improves system observability. Experimental analysis and architectural comparison demonstrate that W-RAC achieves comparable or better retrieval performance than traditional chunking approaches while reducing chunking-related LLM costs by an order of magnitude.
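
The decoupling described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the `Unit` class, the `u1`-style ID scheme, the outline format, and the function names are all hypothetical, and `stub_planner` is a deterministic stand-in for the LLM call, which the abstract does not specify. The idea shown is that the planner sees only IDs and short previews and returns groups of IDs, while chunk text is copied verbatim from the extracted units.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Unit:
    unit_id: str  # hypothetical ID scheme, e.g. "u1"
    kind: str     # e.g. "heading" or "paragraph"
    text: str     # text exactly as extracted from the page


def build_outline(units, preview_chars=40):
    """Compact view for the planner: IDs, kinds, and short previews only."""
    return [
        {"id": u.unit_id, "kind": u.kind, "preview": u.text[:preview_chars]}
        for u in units
    ]


def stub_planner(outline):
    """Stand-in for the LLM call (assumption): new group at every heading."""
    groups = []
    for item in outline:
        if item["kind"] == "heading" or not groups:
            groups.append([])
        groups[-1].append(item["id"])
    return groups


def assemble_chunks(units, plan):
    """Validate the plan, then build chunks by copying source text verbatim."""
    by_id = {u.unit_id: u for u in units}
    seen = set()
    chunks = []
    for group in plan:
        for uid in group:
            if uid not in by_id:
                raise ValueError(f"plan references unknown unit ID: {uid}")
            if uid in seen:
                raise ValueError(f"plan uses unit ID more than once: {uid}")
            seen.add(uid)
        chunks.append("\n".join(by_id[uid].text for uid in group))
    missing = set(by_id) - seen
    if missing:
        raise ValueError(f"plan omits unit IDs: {sorted(missing)}")
    return chunks


units = [
    Unit("u1", "heading", "Pricing"),
    Unit("u2", "paragraph", "The basic plan costs 10 USD per month."),
    Unit("u3", "heading", "Support"),
    Unit("u4", "paragraph", "Email support is available on weekdays."),
]
plan = stub_planner(build_outline(units))
chunks = assemble_chunks(units, plan)
```

Because the planner's output is a list of ID groups rather than regenerated text, a faulty plan can be rejected by a simple validation step and the final chunks cannot contain text that was not on the page. The plan itself also serves as a small, inspectable record of each grouping decision.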