# More Context, Larger Models, or Moral Knowledge? A Systematic Study of Schwartz Value Detection in Political Text

- Source: HuggingFace Daily Papers (community trending papers)
- Published: 2026-05-21 08:00
- AIHOT score: 49
- AIHOT link: https://aihot.virxact.com/items/cmpgl46l60ge5sljwy2sr9lor
- Original link: https://arxiv.org/abs/2605.22641

## AI Summary

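This study examines the roles of context and explicit moral knowledge in sentence-level value detection. It compares sentence, window, and full-document inputs, with and without retrieval augmentation from a moral knowledge base, across supervised DeBERTa encoders and zero-shot large language models. Full-document context markedly improves DeBERTa performance but gives no consistent benefit to zero-shot LLMs, whereas retrieved moral knowledge consistently improves performance across model types. Scaling up model size does not guarantee gains. The analysis shows that context and retrieval help most for easily confused value categories. Value-sensitive NLP should therefore evaluate context, knowledge, and model jointly rather than simply relying on longer inputs or larger models.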

## Main Text

Detecting Schwartz values in political text is difficult because implicit cues often depend on surrounding arguments and fine-grained distinctions between neighboring values. We study when context and explicit moral knowledge help sentence-level value detection. Using the ValuesML/Touché ValueEval format, we compare sentence, window, and full-document inputs; no-RAG and retrieval-augmented settings with a curated moral knowledge base; supervised DeBERTa-v3-base/large encoders; and zero-shot LLMs from 12B to 123B parameters. The results show that more context is not uniformly better: full-document context improves supervised DeBERTa encoders by 3.8 to 4.8 macro-F1 points over sentence-only input, but does not consistently help zero-shot LLMs. Retrieved moral knowledge is more consistently useful in matched comparisons, improving each tested model family and context condition under early fusion. However, scaling from DeBERTa-v3-base to large and from 12B to larger LLMs does not guarantee gains, and simple early fusion outperforms the tested late-fusion and cross-attention RAG variants for encoders. Per-value analyses show that context and retrieval help most for socially situated or conceptually confusable values. These findings suggest that value-sensitive NLP should evaluate context, knowledge, and model family jointly rather than treating longer inputs or larger models as universal improvements.
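
To make the three input conditions and early fusion concrete, here is a minimal sketch in Python. It is not the authors' implementation. The function names `build_context` and `early_fusion`, the `window` parameter, and the `[SEP]` separator are all hypothetical. The sketch also assumes that "early fusion" means concatenating retrieved knowledge passages onto the input text before it reaches the model, since the abstract does not specify the exact template.

```python
from typing import Sequence


def build_context(sentences: Sequence[str], idx: int, mode: str, window: int = 1) -> str:
    """Return the context for sentences[idx] under one input condition.

    mode is one of "sentence", "window", or "document" (hypothetical labels
    for the three conditions named in the abstract).
    """
    if not 0 <= idx < len(sentences):
        raise IndexError("idx is outside the document")
    if mode == "sentence":
        # Sentence-only: the target sentence with no surrounding text.
        return sentences[idx]
    if mode == "window":
        # Window: up to `window` neighbors on each side, clipped at the edges.
        lo = max(0, idx - window)
        hi = min(len(sentences), idx + window + 1)
        return " ".join(sentences[lo:hi])
    if mode == "document":
        # Full document: every sentence in order.
        return " ".join(sentences)
    raise ValueError(f"unknown mode: {mode!r}")


def early_fusion(target: str, context: str, knowledge: Sequence[str],
                 sep: str = " [SEP] ") -> str:
    """Concatenate target, context, and retrieved passages into one input string.

    Assumption: early fusion is plain text concatenation before encoding.
    The context is skipped when it is identical to the target sentence.
    """
    parts = [target]
    if context != target:
        parts.append(context)
    parts.extend(knowledge)
    return sep.join(parts)


if __name__ == "__main__":
    doc = ["We must protect our borders.", "Families deserve safety.", "Freedom matters too."]
    ctx = build_context(doc, 1, "window", window=1)
    # The knowledge passage below is a made-up placeholder, not from the paper's knowledge base.
    print(early_fusion(doc[1], ctx, ["Security: safety and stability of society."]))
```

In this layout the no-RAG condition is the same call with an empty `knowledge` list, so context and retrieval can be varied independently, which mirrors the matched comparisons the abstract describes.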