# ZeroUnlearn: Few-Shot Knowledge Unlearning in Large Language Models

- Source: HuggingFace Daily Papers (community trending papers)
- Published: 2026-05-20 08:00
- AIHOT score: 45
- AIHOT link: https://aihot.virxact.com/items/cmpnwyk2c0105slv4oqtv7cy8
- Original link: https://arxiv.org/abs/2605.18879

## AI Summary

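ZeroUnlearn reformulates machine unlearning as precise knowledge re-mapping via model editing. The framework operates in a few-shot setting: it enforces representational orthogonality through a multiplicative parameter update with a closed-form solution, overwriting sensitive inputs by mapping them to a neutral target state and thereby removing their original representations efficiently and in a targeted way. The method is also extended to a gradient-based variant for multi-sample unlearning. Experiments show that ZeroUnlearn outperforms existing baselines while preserving general model utility.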

## Main Text

Because they are trained on massive web corpora, large language models inevitably retain sensitive information, defined here as inputs that may induce harmful generations, which raises privacy and safety concerns. Existing machine unlearning methods primarily rely on retraining or aggressive fine-tuning, which are either computationally expensive or prone to degrading related knowledge and overall model utility. In this work, we reformulate machine unlearning as a precise knowledge re-mapping problem via model editing. We propose ZeroUnlearn, a few-shot unlearning framework that overwrites sensitive inputs by mapping them to a neutral target state and removing their original representations. ZeroUnlearn enforces representational orthogonality through a multiplicative parameter update with a closed-form solution, enabling efficient and targeted unlearning. We further extend ZeroUnlearn to a gradient-based variant for multi-sample unlearning. Experiments demonstrate that our approach outperforms existing baselines while preserving general model utility. Our code is available on GitHub: https://github.com/XMUDeepLIT/ZeroUnlearn.
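
The abstract does not state the update formula, so the following is only a minimal sketch of how a closed-form multiplicative update can make a weight matrix orthogonal to one sensitive input representation. It assumes a single linear layer `W`, a single key vector `k` standing in for the sensitive input's hidden representation, and a `target` vector standing in for the neutral state. The projector `P = I - k kᵀ / (kᵀ k)` and the rank-one re-mapping term are textbook linear algebra chosen for illustration, not the ZeroUnlearn rule, and every function name is hypothetical.

```python
# Illustrative sketch only: this is NOT the update rule from the ZeroUnlearn paper.
# All names (matmul, matvec, orthogonal_projector, remap) are hypothetical.


def matmul(A, B):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def matvec(A, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, x)) for row in A]


def orthogonal_projector(k):
    """Return P = I - k k^T / (k^T k), which satisfies P k = 0."""
    norm_sq = sum(x * x for x in k)
    n = len(k)
    return [
        [(1.0 if i == j else 0.0) - k[i] * k[j] / norm_sq for j in range(n)]
        for i in range(n)
    ]


def remap(W, k, target):
    """Assumed re-mapping: W_new = W P + target k^T / (k^T k).

    The multiplicative part W P removes the original response to k
    (W P k = 0); the rank-one part then sends k to `target`.
    """
    norm_sq = sum(x * x for x in k)
    WP = matmul(W, orthogonal_projector(k))
    return [
        [WP[i][j] + target[i] * k[j] / norm_sq for j in range(len(k))]
        for i in range(len(W))
    ]
```

Under these assumptions, `remap(W, k, target)` sends `k` to `target`, while any input orthogonal to `k` produces the same output as under the original `W`. That is one simple way to picture targeted removal that leaves unrelated inputs undisturbed.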
