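Claude Opus 4.8, working in Claude Code, autonomously wrote an academic paper from anonymized research data. After GPT-5.5 Pro acted as reviewer and pointed out errors, Claude gave a quantitative self-assessment of the paper's quality: on a 1-10 identification scale, its score rose from 3.5 to 4.5 after robustness checks, but it judged the paper still short of quasi-experimental level (about 7). Claude therefore kept the cautious framing "conditional association consistent with…" rather than claiming causal identification.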
Claude really can roleplay an economist. I love this little comment Claude made after some robustness checks on the paper it wrote: "On a 1-10 identification scale, I'd now put the paper at about 4.5 - better than the 3.5 I'd have given before these tests, but well short of quasi-experimental (~7). The framing 'conditional association consistent with…' is still the right calibration. I shouldn't claim causal identification."