# Authors Guild test: some AI detectors perfectly identify human writing, others misjudge every text

- Source: The Decoder: AI News (RSS)
- Author: Matthias Bastian
- Published: 2026-06-25 19:21
- AIHOT score: 53
- AIHOT link: https://aihot.virxact.com/items/cmqtfhj4s034gsl0einglm2fd
- Original link: https://the-decoder.com/authors-guild-test-finds-some-ai-detectors-perfectly-identify-human-writing-while-others-fail-on-every-single-text

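## AI Summary

The Authors Guild tested several AI detectors on ten articles published between 2020 and 2022. Pangram and Grammarly correctly identified every human-written text (0% false positives), and Originality.ai was similarly accurate. Sidekicker misclassified every article as AI-generated (two scored 100%), and ZeroGPT was also unreliable, reporting fairly high AI percentages for every human-written text. The Guild warns that these tools should not be the sole basis for any decision, because a false flag can cost an author contracts and reputation. The test mainly reflects how well detectors avoid false positives; it does not guarantee that they identify genuinely AI-generated text with equal accuracy.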

## Full Text

Authors Guild test finds some AI detectors perfectly identify human writing while others fail on every single text

Matthias Bastian

Jun 25, 2026

In a test by the Authors Guild, AI detectors from Pangram and Grammarly correctly identified every human-written text as human.

Originality.ai also performed well. The test used ten Guild articles published between 2020 and 2022, before generative AI went mainstream. Sidekicker delivered the worst results: every single article was flagged as mostly AI-generated, with two scoring 100 percent. ZeroGPT was also unreliable, assigning every human-written text a nonzero AI percentage, ranging from 5.3 to 76.3 percent.

| Article | ZeroGPT | Originality.ai | Sidekicker.ai | Grammarly | Pangram |
| --- | ---: | ---: | ---: | ---: | ---: |
| Obscenity Petitions Dismissed | 14.3% | 0.0% | 85.0% | 0.0% | 0.0% |
| Antitrust Litigation & Publications | 5.3% | 0.0% | 100.0% | 0.0% | 0.0% |
| Warhol Fair Use Letter | 40.7% | 0.0% | 79.0% | 0.0% | 0.0% |
| Copyright Claims Board | 28.1% | 0.0% | 96.0% | 0.0% | 0.0% |
| Banned Books Club | 64.5% | 1.0% | 71.0% | 0.0% | 0.0% |
| Kiss Library Piracy Lawsuit | 26.5% | 1.0% | 71.0% | 7.0% | 0.0% |
| Obituary: Joan Didion | 66.0% | 0.0% | 82.0% | 9.0% | 0.0% |
| Erdrich Pulitzer Prize | 76.3% | 0.0% | 100.0% | 0.0% | 0.0% |
| Support Authors & Literary Arts | 50.6% | 0.0% | 92.0% | 0.0% | 0.0% |
| The Roundup 12/2020 | 18.1% | 0.0% | 96.0% | 0.0% | 0.0% |
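The scores above can be condensed into one line per detector. The following is a minimal sketch, not part of the Guild's methodology: the 50 percent cutoff used to count an article as "flagged" is an assumption chosen for illustration, and the names `SCORES`, `THRESHOLD`, and `summarize` are hypothetical.

```python
from statistics import mean

# Column order matches the table above.
DETECTORS = ["ZeroGPT", "Originality.ai", "Sidekicker.ai", "Grammarly", "Pangram"]

# AI-likelihood scores (percent) reported for each human-written article.
SCORES = {
    "Obscenity Petitions Dismissed": [14.3, 0.0, 85.0, 0.0, 0.0],
    "Antitrust Litigation & Publications": [5.3, 0.0, 100.0, 0.0, 0.0],
    "Warhol Fair Use Letter": [40.7, 0.0, 79.0, 0.0, 0.0],
    "Copyright Claims Board": [28.1, 0.0, 96.0, 0.0, 0.0],
    "Banned Books Club": [64.5, 1.0, 71.0, 0.0, 0.0],
    "Kiss Library Piracy Lawsuit": [26.5, 1.0, 71.0, 7.0, 0.0],
    "Obituary: Joan Didion": [66.0, 0.0, 82.0, 9.0, 0.0],
    "Erdrich Pulitzer Prize": [76.3, 0.0, 100.0, 0.0, 0.0],
    "Support Authors & Literary Arts": [50.6, 0.0, 92.0, 0.0, 0.0],
    "The Roundup 12/2020": [18.1, 0.0, 96.0, 0.0, 0.0],
}

# ASSUMPTION: a score above 50% counts as "flagged as AI".
# The article does not state which cutoff, if any, the Guild applied.
THRESHOLD = 50.0


def summarize(scores, detectors, threshold):
    """Return the mean score and the number of flagged articles per detector."""
    summary = {}
    for i, name in enumerate(detectors):
        column = [row[i] for row in scores.values()]
        summary[name] = {
            "mean": round(mean(column), 2),
            "flagged": sum(score > threshold for score in column),
        }
    return summary


summary = summarize(SCORES, DETECTORS, THRESHOLD)
for name, stats in summary.items():
    print(f"{name}: mean AI score {stats['mean']}%, "
          f"flagged {stats['flagged']} of {len(SCORES)}")
```

Because every article in the test was written by a human, each flagged article is a false positive. Under the assumed cutoff, Sidekicker.ai flags all ten articles and ZeroGPT flags four, while Originality.ai, Grammarly, and Pangram flag none.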

### False positives can cost authors their contracts

Still, the oldest and largest professional organization for writers warns that even the best-performing tools should never be the sole basis for any decision. These tools change constantly, and their accuracy can't be taken for granted.

Pangram CEO Max Spero recently explained that his detector is essentially a black box, with no way to explain in detail why a text gets flagged as AI-generated. Language models do give themselves away through uniformity, though, especially in how they build arguments. Humans write with far more variety, Spero said.

Professionally written texts share many of the same statistical patterns as AI output, according to the Authors Guild, simply because language models were trained on exactly that kind of writing. False results can cost authors their contracts and their reputations, so publishers should disclose their methods and always give authors a chance to defend themselves.

> This creates a troubling paradox. A writer who has spent decades honing clarity, economy, and precision is, by definition, writing in a way that overlaps with what AI has learned to produce. Detection tools cannot distinguish between a human writer who has mastered the craft and a machine that has learned to imitate it, because at the level these tools operate, there may be little difference to find.
>
> Authors Guild

That said, the fact that Pangram and Originality reliably identify human-written texts as human doesn't necessarily mean they're equally good at catching AI-generated ones. The results mainly show that these tools are tuned to minimize false positives, avoiding cases where human text gets wrongly flagged as AI. Plenty of texts written by or with AI could still slip through undetected. The reliability shown in this test applies first and foremost to correctly recognizing human writing.
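The gap between the two kinds of accuracy can be made concrete with a small sketch. It assumes the usual definitions of false positive rate (human texts wrongly flagged, divided by all human texts) and recall (AI texts caught, divided by all AI texts). The helper names `rate` and `evaluate` are hypothetical, and the only inputs are the ten human-written articles from the test.

```python
from typing import Optional


def rate(numerator: int, denominator: int) -> Optional[float]:
    """Return numerator / denominator, or None when there is nothing to divide by."""
    return numerator / denominator if denominator else None


def evaluate(human_flagged: int, human_total: int,
             ai_caught: int, ai_total: int) -> dict:
    """Compute the two metrics a detector test could report."""
    return {
        # Share of human-written texts wrongly flagged as AI.
        "false_positive_rate": rate(human_flagged, human_total),
        # Share of AI-generated texts correctly flagged as AI.
        "recall": rate(ai_caught, ai_total),
    }


# Pangram in the Guild test: ten human-written texts, none flagged.
# The test set contained no AI-generated texts, so ai_total is 0.
pangram = evaluate(human_flagged=0, human_total=10, ai_caught=0, ai_total=0)
print(pangram)  # {'false_positive_rate': 0.0, 'recall': None}
```

With no AI-generated texts in the sample, recall has no denominator, so the test cannot say anything about it in either direction.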

### The cultural debate behind detection

Errors will keep happening, and that's why the usefulness of these detectors keeps getting questioned. This is especially true since AI can be a genuinely useful writing tool, and the broader debate often conflates using AI to write with using AI to think.

Detector advocates like Pangram CEO Max Spero justify their business model by pointing to a social contract between writer and reader. The writer invests time and effort to shape an idea; the reader invests time to engage with it. If AI drops the cost of writing to zero, bad incentives follow, and people flood the internet with worthless content that takes readers more time to consume than it took the author to produce, Spero said.

Whether a piece of writing gets its value from the typing or from the topic selection, the idea, the perspective, the story, the research, the argument, and the judgment behind it is a different question entirely. So is whether AI text detection can actually do anything about the flood of worthless content.