# NeuraDock Agent: A Boundary-Aware Context-Grounding Architecture for Low-Channel EEG Agents

- Source: HuggingFace Daily Papers (community trending papers)
- Published: 2026-06-25 08:00
- AIHOT score: 45
- AIHOT link: https://aihot.virxact.com/items/cmqyruylk00xsslmcw5bo8542
- Original link: https://arxiv.org/abs/2606.26519

## AI Summary

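NeuraDock Agent is an open-source architecture that separates a deterministic local EEG engine from a hardware-aware language layer. The engine parses seven-channel EEG recordings, performs quality control, runs reviewed spectral workflows, and produces machine-readable results. The large language model receives only an allowlist-filtered summary and a versioned context pack covering the hardware description, workflows, result fields, implementation boundaries, scientific limits, and reference cases; raw EEG and dense array data stay local. The evaluation has three levels: 12 recordings gave identical results across ten numerical repetitions; request-capture and failure-injection experiments checked the data boundary and the preservation of local artifacts; and a boundary-awareness benchmark ran 36 ordinary and adversarial questions under 4 context-ablation settings and 2 LLMs, producing 288 outputs. The results support the feasibility of hardware- and implementation-aware grounding but do not establish clinical validity.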
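The size of the benchmark follows directly from its factorial design: 36 questions × 4 context ablations × 2 LLMs = 288 outputs. The short sketch below enumerates that grid. The question IDs, ablation labels, and model labels are placeholders invented for illustration, since the source does not name them.

```python
from itertools import product

# Counts taken from the summary; everything else here is a placeholder.
N_QUESTIONS = 36   # ordinary and adversarial questions
N_ABLATIONS = 4    # context-ablation settings (names not given in the source)
N_MODELS = 2       # LLMs (not named in the source)

# Hypothetical identifiers, used only to enumerate the grid.
questions = [f"q{i:02d}" for i in range(1, N_QUESTIONS + 1)]
ablations = [f"ablation_{i}" for i in range(1, N_ABLATIONS + 1)]
models = [f"llm_{c}" for c in "ab"]

# One output per (question, ablation, model) combination.
grid = list(product(questions, ablations, models))
print(len(grid))
```

Each tuple in `grid` corresponds to one prompt/response pair that would need to be scored in such a benchmark.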

## Main Text

Large language models (LLMs) can make scientific software easier to use. However, a general model does not automatically know which measurements a particular sensor can support, which algorithms are implemented in the current software, or which conclusions are justified by a computed result. These distinctions are especially important for low-channel electroencephalography (EEG), where sparse spatial coverage and variable signal quality make plausible but unsupported interpretations easy to produce. We present NeuraDock Agent, an open-source architecture that separates a deterministic local EEG engine from a hardware-aware language layer. The numerical engine parses recordings, performs quality control, executes reviewed spectral workflows, and writes machine-readable artifacts. The LLM receives only a compact, allowlisted summary and a versioned context pack. The context describes the seven-channel hardware, reviewed workflows, result fields, implementation boundaries, scientific limits, and reference cases. Raw EEG and dense per-sample arrays remain local. We evaluate the system at three levels. First, 12 recordings produced identical structured results over ten numerical repetitions, and a complete Rest/Task run produced identical result, report, and figure hashes over three repetitions. Second, request-capture and failure-injection experiments confirmed the tested data boundary and preservation of local artifacts under HTTP, malformed-output, and connection failures. Third, a boundary-awareness benchmark tested 36 ordinary and adversarial questions under four context ablations and two LLMs, yielding 288 outputs. These results support hardware- and implementation-aware grounding as a practical mechanism for calibrating what an EEG agent accepts, qualifies, or refuses; they do not establish clinical validity or a validated absolute cognitive-load index.
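Two ideas from the abstract lend themselves to a small illustration: filtering engine output through an allowlist before anything reaches the LLM, and checking determinism by comparing hashes across repeated runs. The following is a minimal sketch under stated assumptions, not the NeuraDock implementation. The field names in `ALLOWED_FIELDS`, the functions `build_llm_summary`, `artifact_hash`, and `run_engine`, and all toy values are hypothetical.

```python
import hashlib
import json

# Hypothetical allowlist; the real result-field names are not given in the source.
ALLOWED_FIELDS = {"n_channels", "qc_passed", "band_power_summary"}


def build_llm_summary(result: dict) -> dict:
    """Keep only allowlisted fields; anything else never leaves the machine."""
    return {k: v for k, v in result.items() if k in ALLOWED_FIELDS}


def artifact_hash(obj: dict) -> str:
    """Hash a canonical JSON encoding so equal results give equal digests."""
    canonical = json.dumps(obj, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def run_engine() -> dict:
    """Stand-in for a deterministic local engine (toy values, not real EEG)."""
    return {
        "n_channels": 7,
        "qc_passed": True,
        "band_power_summary": {"alpha": 0.42, "theta": 0.31},
        "raw_samples": [0.1, -0.2, 0.05],  # dense data that must stay local
    }


# Repeat the run ten times and collect the distinct digests.
hashes = {artifact_hash(run_engine()) for _ in range(10)}

# Build the compact payload that would be sent to the language layer.
summary = build_llm_summary(run_engine())
```

If the engine is deterministic, `hashes` holds a single digest. An allowlist (rather than a blocklist) means a newly added dense field is excluded by default until someone explicitly reviews and permits it.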
