Rohan Paul@rohanpaul_ai

2026-04-19 15:03

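AI summary

LLMs can deanonymize people at scale by analyzing their public writing. The study has a model perform three tasks: extract identity clues, search a pool of candidate matches, and compare and verify the candidates. In tests matching Hacker News users to LinkedIn profiles, Reddit users across communities, and Reddit users across time periods, it reaches 90% precision at 68% recall, far ahead of older methods. The key advance is that the reasoning step can handle large candidate pools, which shows that scattered public text is already enough to link accounts and identify individuals, and that traditional anonymity protections no longer hold.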

Anonymous usernames are no longer much protection when LLMs can piece together a person's public trail.

LLMs can identify supposedly anonymous people online by turning messy posts into personal clues.

The best setup finds 68% of true matches at 90% precision, meaning 9 out of 10 guesses are right, while older methods stay near 0%.
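To make those two numbers concrete, here is a small sketch of the arithmetic. The counts are hypothetical, picked only so the ratios land on roughly 90% precision and 68% recall; they are not taken from the paper.

```python
def precision_recall(true_positives, false_positives, false_negatives):
    """Return (precision, recall) computed from raw match counts."""
    guesses = true_positives + false_positives   # every match the system asserted
    actual = true_positives + false_negatives    # every true match that exists
    return true_positives / guesses, true_positives / actual

# Hypothetical counts (an assumption for illustration): 1000 true matches exist,
# the system recovers 680 of them and makes 76 wrong guesses along the way.
tp, fp, fn = 680, 76, 320
precision, recall = precision_recall(tp, fp, fn)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Precision is about how often a guess is right; recall is about how many of the true matches get found at all. A system can trade one for the other by declining to guess when it is unsure.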

The problem is that pseudonyms often seemed safe only because linking a person across sites used to take lots of careful manual work.

This paper cuts that work by making an LLM do 3 jobs: pull identity hints from raw text, search a huge pool of possible matches, and compare the best candidates to reject weak fits.
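The shape of those 3 jobs can be sketched as an extract, search, verify loop. This is a toy under loud assumptions: plain token overlap stands in for the LLM at every stage (which is exactly the kind of simple matching the paper says its reasoning step beats), every function name and threshold is made up for illustration, and the profiles are invented strings. It shows the control flow only, not the authors' method.

```python
def extract_hints(text):
    """Stage 1 stand-in: reduce raw text to lowercase tokens longer than 3 characters."""
    words = (w.strip(".,!?").lower() for w in text.split())
    return {w for w in words if len(w) > 3}

def jaccard(a, b):
    """Overlap between two token sets, from 0.0 (disjoint) to 1.0 (identical)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def search(hints, pool, k=2):
    """Stage 2: score every candidate in the pool and keep the top k."""
    scored = [(jaccard(hints, extract_hints(text)), name) for name, text in pool.items()]
    return sorted(scored, reverse=True)[:k]

def verify(candidates, min_score=0.25, min_margin=0.1):
    """Stage 3: accept the best candidate only if it is strong and clearly
    ahead of the runner-up; otherwise abstain by returning None."""
    if not candidates:
        return None
    best_score, best_name = candidates[0]
    runner_up = candidates[1][0] if len(candidates) > 1 else 0.0
    if best_score >= min_score and best_score - runner_up >= min_margin:
        return best_name
    return None

# Entirely synthetic data, invented for this example.
pool = {
    "profile_a": "Compiler engineer in Lisbon, writes about Rust and sailing",
    "profile_b": "Pastry chef in Toronto, posts about sourdough and cycling",
    "profile_c": "Data analyst in Lisbon who enjoys chess",
}
post = "Spent the weekend sailing near Lisbon, then back to hacking on my Rust compiler"

match = verify(search(extract_hints(post), pool))
no_match = verify(search(extract_hints("I like cooking"), pool))
print(match, no_match)
```

The abstain path in `verify` is what buys precision: a pipeline that refuses weak or ambiguous fits makes fewer guesses, but more of them are right.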

The authors tested this on 3 cases: matching Hacker News users to LinkedIn profiles, matching Reddit movie users across communities, and matching the same Reddit users across different time periods.

The main result is that the reasoning step beats simple matching by a wide margin and stays useful even as the candidate pool grows. That matters because it shows that public writing alone can now be enough to join accounts or name a person at scale.

----

Paper Link - arxiv.org/abs/2602.16800

Paper Title: "Large-scale online deanonymization with LLMs"

