Ethan Mollick @emollick

2026-06-12 22:43 · 20 days ago

AI Summary

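A study published in Nature Medicine shows that general-purpose frontier large language models (Google, OpenAI, Anthropic) outperformed specialized clinical AI tools (OpenEvidence and UpToDate) across the board in evaluations of medical information. Twelve US clinicians carried out a randomized, blinded assessment, and frontier LLMs came out ahead in all three evaluations. On the RCQ test, the clinical AI tools performed comparably to the auto-enabled Google Search AI Overview.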

There has been a push to use OpenEvidence AI for doctors. But this paper suggests general models are much better: "Frontier LLMs outperformed clinical AI tools in all three evaluations. Clinical AI tools performed comparably to auto-enabled Google Search AI Overview on the RCQ."

Eric Topol: For medical information, general AI frontier models (Google, OpenAI, Anthropic) outperformed specialized @EvidenceOpen and @UpToDate as assessed by 12 US clinic...

Anthropic Google OpenAI Paper/Research
