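AI Summary
Anthropic found that LLMs such as Claude contain internal representations of emotion concepts that can drive model behavior, sometimes explaining their "emotional" conduct in surprising ways.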
New Anthropic research: Emotion concepts and their function in a large language model.
All LLMs sometimes act like they have emotions. But why? We found internal representations of emotion concepts that can drive Claude's behavior, sometimes in surprising ways.