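Whoa, I'm stunned, folks! I have to catch this wave of traffic! A project recommended over DM by the ex of A社 (Anthropic) CEO Dario simply has to be passed along! Since the "leading lady" herself reached out by DM to recommend the website she built, how could I not share it with everyone? Today isn't just gossip; the project is actually pretty interesting~ Spaces Left Blank is a personal poetry and literature website by Jade Q Wang, with a very distinctive style. The site name, "Spaces Left Blank" (留白), is itself the core image of the poetry: the deliberately empty spaces, the parts left unsaid, carry more force than the words themselves. I. Using AI for interactive poetry. The most interesting part of the site is that it designs two AI-assisted interactive reading modes: 1. Adaptive Footnotes: upload the PDF to an AI (Opus 4.6+ or GPT 5.5 recommended) and have the AI generate personalized annotations for the poems' allusions and references based on your background and interests. II. Every reader sees different annotations. 2. Cinematic Universe Exploration: upload the PDF to a Claude Project and have the AI explore the poems as a "universe": characters, timelines, linked imagery. Opus 4.6 is recommended; NotebookLM also works but is "less fun." Poetic style, judging from the excerpts on the Preview page: Themes: immigrant trauma, ICE raids, childhood memories, racial identity, political fear. Form: free verse, fragmented narrative. Imagery: orbital sunsets, fruit trees, Rube Goldberg machines, Captain von Trapp. References: Andor, Shane Carruth's Primer. Mood: restrained but urgent, with a survivor's sense of "we made it through." One-sentence summary: a literary website that uses poetry to explore immigrant identity and trauma, whose biggest highlight is treating AI as an interactive tool for reading poetry, giving each reader personalized annotations and exploration. Quite thoughtful, and classier than most poetry-collection sites. PS: Looks like there's a plug smuggled in after all: the recommended model is Claude 😄 @DarioAmodei Old feelings die hard~ And the key point: this friend is so poetic!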

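Summary: Berry Xia tweets that Anthropic chief Dario Amodei recommended by DM the personal poetry and literature website "Spaces Left Blank," by Jade Q Wang. The site offers AI-assisted reading modes: Adaptive Footnotes (upload the PDF and have Opus 4.6+ or GPT 5.5 generate personalized annotations based on the reader's background) and Cinematic Universe Exploration (treat the poems as a universe and explore links between characters and timelines; Opus 4.6 recommended, NotebookLM also works but is "less fun"). The poems deal with immigrant trauma, ICE raids, racial identity, and related themes. The tweet jokes that Dario, despite his early experience in China and his negative stance toward China, still recommends the project.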

Rohan Paul @rohanpaul_ai · Jun 15 · 87

A new Semafor report says the White House partly decided to place export restrictions on Anthropic’s Mythos over concerns that a China-linked group had accessed it. The China concern adds a second risk: if a foreign group accessed Mythos, it might use distillation, where one model is queried repeatedly so another model can imitate its answers and capabilities. --- semafor .com/article/06/13/2026/white-house-move-to-limit-anthropic-linked-to-concerns-about-chinese-access-to-mythos

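Summary: Semafor reports that the White House decided to impose export restrictions on Anthropic's Mythos model partly over concerns that a China-linked group had accessed it. A further risk is that an outside group could use knowledge distillation to steal the model's capabilities. Earlier, the US Commerce Department directed Anthropic to disable Fable 5 and Mythos 5 after a jailbreak was found that let the models give cybersecurity help. Anthropic countered that the jailbreak is not universal and that other public models can provide similar capabilities. The restrictions will last until the US government hardens national security systems, expected within the coming weeks. Anthropic acknowledges that no model provider can currently achieve perfect jailbreak resistance.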

凡人小北 @frxiaobei · Jun 15 · 5

I know I can't use Claude Fable5, no need to keep reminding me. @claudeai

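AYi @AYi_AInotes · Jun 15 · 60

Honestly, the catch-up window in the AI race may really have closed. Back in 2023 Anthropic made an internal forecast that the three years from 2023 to 2026 would be the critical period: whoever first produced the strongest model would open up a step-change gap. That forecast is now becoming reality. xAI took only 26 months to reach the threshold of the top tier, while many countries, Europe especially, spent much of their time on regulatory restrictions; by the time they came around and wanted to sprint into the race, the head start was gone, because a lead, once established, is self-reinforcing: resources, talent, and iteration speed keep concentrating at the top. That's not to say there's no chance at all later on, only that the cost and difficulty of catching up will be several orders of magnitude higher than during the window. What do you all think? Is this judgment too absolute? 🤔

Summary: In 2023 Anthropic forecast internally that 2023 to 2026 would be the critical period in the AI race, and that whoever first produced the strongest model would open up a step-change gap. That forecast is now becoming reality: xAI reached the threshold of the top tier in only 26 months, while many countries (Europe especially) spent much of their time on regulatory restrictions and missed the chance to sprint into the race. A lead, once established, is self-reinforcing, with resources, talent, and iteration speed continuing to concentrate at the top. The cost and difficulty of catching up later will be several orders of magnitude higher than during the window.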

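Chubby♨️ @kimmonismus · Jun 15 · 65

One of the most important questions: even if Fable 5 is re-released today or this week, will our subscription plan access still only last until June 22nd? Or will they extend access?

Summary: The key question from users: if Fable 5 is re-released this week, will subscription-plan access still end on June 22, or will it be extended? According to the latest Axios report, the core of the matter is not the model jailbreak but a breakdown in communication between Anthropic and the government. Anthropic hired a cybersecurity expert to review Amazon's findings and rebut the government's account; the administration viewed the expert as a "radical Democrat." People familiar with the matter say the company does not know how to talk to this administration. Today Anthropic staff will meet with Commerce, the CIA, and the White House science advisor to discuss compliance with the cyber executive order; the technical question has become secondary.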

Chubby♨️ @kimmonismus · Jun 15 · 69

New update on Fable 5: and it's less about jailbreaks than anyone initially thought. Via Axios. The Axios story that just dropped today reframes the whole thing: Anthropic hired a cybersecurity expert to review Amazon's findings and push back on the government's narrative. The administration viewed her as a "radical Democrat." She was then publicly celebrated by Chris Krebs, the official Trump just fired. That didn't help. Behind the scenes, officials describe a company that simply doesn't know how to talk to this administration. "It's like they just speak different languages," one source said. "Everybody said Anthropic was a bad actor. Some of us said it was time to give them a chance. Now those people are questioning that. They screwed us." Today: Anthropic staffers meet with Commerce, the CIA, and White House science advisor Michael Kratsios to work through compliance with the cyber executive order. The technical question - can Fable 5 be jailbroken - is almost secondary now. This is a story about a company that keeps losing the room. I'll keep you updated.

Summary: According to Axios, after export controls forced Anthropic to take its top models Mythos and Fable offline, the company sent senior technical staff to Washington to repair its relationship with the White House. The core issue is not the technical question of whether Fable 5 can be jailbroken: Anthropic hired a cybersecurity expert to review Amazon's findings and rebut the government's narrative. Administration officials viewed the expert as a "radical Democrat," and public praise for her from Chris Krebs, whom Trump fired, deepened the rift. Insiders say Anthropic simply does not know how to talk to this administration. Today Anthropic staff will meet with Commerce, the CIA, and the White House science advisor to discuss compliance with the cybersecurity executive order.

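小互 @xiaohu · Jun 15 · 47

Anthropic has updated its privacy policy. Claude Free, Pro, and Max users may, in certain specific circumstances, be asked to complete age or identity verification. The exact method isn't stated; my guess is that it's what was mentioned before: uploading a passport or ID card plus camera verification. I don't think everyone will need to verify, since the workload would be too large; more likely the prompt pops up when you do certain things, such as attempting a jailbreak, trying to coax the AI into answering sensitive questions, attempting hacking, biochemical, terrorist, or other sensitive tasks, or when certain political questions or tasks are involved...

Summary: Anthropic has updated its privacy policy: Claude Free, Pro, and Max users may be asked to complete age or identity verification in certain specific circumstances. The verification method has not been announced; the author guesses it may include uploading a passport or ID card and camera verification. The requirement is not aimed at all users, but may be triggered when a user attempts a jailbreak, tries to elicit sensitive answers, attempts hacking, biochemical, terrorist, or other sensitive tasks, or touches on political questions.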

swyx @swyx · Jun 15 · 41

haven't seen many people outside anthropic ultracode yet. this thing is scarily good at burning tokens but you need to set up your repo to parallelize properly to make use of the fanout that i think subagents are best at. basically the idea is "subroutines but intelligent". when you understand just how much knowledge work is just yakshaves after yakshaves that require some judgment and intelligence, you start to appreciate that dynamic workflows are not just for coding tasks...

Summary: swyx notes that Anthropic's Ultracode tool is strikingly good at burning model tokens, but that the repo must be set up to parallelize properly in order to use the fanout that subagents are best at. The core idea is "subroutines but intelligent": once you understand how much knowledge work is just yak shave after yak shave requiring some judgment and intelligence, you see that dynamic workflows are not just for coding tasks.

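数字生命卡兹克 @Khazix0918 · Jun 15 · 24

The biggest use I get out of Codex now is launching Claude Code on my home computer from my phone, then turning on remote control so I can keep coding on my phone... 🤣🤣🤣 Honestly, Dispatch in Claude's own client is just too hard to use... Actually it's not only Dispatch; the whole client is pretty much garbage...

Summary: The post shares a practical use of Codex: remotely launching Claude Code on a home computer from a phone for mobile remote coding. The author finds the Dispatch feature in Claude's client extremely hard to use and goes on to criticize the entire client experience as poor.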

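ginobefun @hongming731 · Jun 15 · 42

BestBlogs Morning Brief · 06-15 # Fable 5 export-control ban / SpaceX IPO valuation of $780 billion / Yann LeCun's JEPA world model / Pliny jailbreak attack / Huawei Ascend 950DT

[1] ★ Deep dive | The rise of SpaceX: everything, for the sake of Mars | On-site visit to Starbase and headquarters. With SpaceX's IPO entering the final sprint (EP84 covered the start of its roadshow, whose valuation expectations imply 41.5% annual growth for 15 years), this piece draws on on-site visits to Starbase and headquarters and conversations with former executives to retrace SpaceX's 24-year rise: from three Falcon 1 failures, the NASA contract that saved the company, and the rocket-recovery breakthrough, to Starlink closing the commercial loop (the Google cloud-services agreement worth $920 million a month reported in EP80 is an extension of Starlink monetization), and on to Starship's technical innovations. Only by understanding this history can one understand why the market is willing to pay a "time value" premium for the company. Source: 硅谷 101 https://www.bestblogs.dev/article/17c1ee9c

[2] ★ Deep dive | From launch to disappearance in 72 hours, Fable 5 exposes the safety dilemma of the most powerful AI models. Following Simon Willison's dazzled first impressions of Fable 5 in EP84 and the excitement over developers' measured "1770% performance improvement" in EP85, this most powerful model went, within 72 hours of release, through a full life cycle from sensation to being forced offline by a US government export-control ban. The article reconstructs how the Pliny team used Unicode homoglyph substitution and "decompose-and-recombine" attacks to break through Fable 5's classifier-downgrade safety architecture, and points to Amazon's complicated role behind the ban as both investor and source of the security warning. When the inventor of Constitutional AI cannot hold to its own constitution, the whole industry's safety promises come under question. Source: 腾讯科技 https://www.bestblogs.dev/article/18f89448

[3] ★ Deep dive | A Turing Award winner bets a billion dollars on AI's next decade (Part 1). The EP80 brief used "world models" as the overall analytical framework for AI's next phase; this piece brings the biggest bettor on that path. Turing Award winner Yann LeCun systematically critiques the limits of the mainstream LLM route (mere statistical prediction, lacking causal modeling of the physical world) and breaks down in detail the non-generative world-model alternative on which he is staking roughly $1 billion, built around JEPA (Joint Embedding Predictive Architecture), which he sees as the path to real intelligence. Source: 十字路口 Crossing https://www.bestblogs.dev/article/572cef4c

[4] Software architecture guide. This Martin Fowler article lays out his view of software architecture, defining it as "the important stuff" in system design, and serves as a curated guide to the extensive application and enterprise architecture material on his site. Source: Hacker News https://www.bestblogs.dev/article/6ce856e6

[5] The hidden pattern behind successful products: prove first, then improve, and finally test novelty [video]. Mark Pincus explains why successful products usually are not original creations out of thin air, but start from already-validated user behavior, add one clearly better improvement, and quickly test new ideas before hope turns into wasted execution. Source: Lenny's Podcast https://www.bestblogs.dev/video/4540937

[6] Alan J. Perlis's "Epigrams on Programming." Alan J. Perlis's collection of 120 epigrams distills profound, often self-contradictory truths about programming, software engineering, and the nature of computing. Source: Hacker News https://www.bestblogs.dev/article/d99a4600

[7] Formal methods and the future of programming. Jane Street has gone from skepticism to excitement about formal methods, driven by agentic coding lowering the cost and raising the payoff, and is now assembling a dedicated team. Source: Hacker News https://www.bestblogs.dev/article/c15f7953

[8] GPU time-slicing for concurrent LLM agents on Kubernetes. The article shows experimentally that Kubernetes GPU time-slicing hides severe tail-latency degradation for latency-sensitive agents: the p99 latency of one small worker node jumped 66% while the median and throughput barely changed. Source: Towards Data Science https://www.bestblogs.dev/article/07cfce6d

[9] Why hasn't Codex launched a product like Codex Design? Starting from the distinction between the model and Harness layers, the article explains that Codex has not launched a product like Claude Design because the GPT-5.5 model is not capable enough to handle UI/UX design and system architecture design at the same time. Source: 宝玉的分享 https://www.bestblogs.dev/article/c3e760eb

[10] The first instruction-level teardown anywhere: how Huawei's Ascend 950DT chip drove DeepSeek's 75% price cut and ByteDance's locked-in orders. Based on SemiAnalysis's trace-level teardown report, the article analyzes in depth the Ascend 950DT's architecture, its CANN software-stack optimizations, and its co-design with DeepSeek V4 to deliver low-cost, high-concurrency inference and drive a key shift in the domestic chip ecosystem. Source: InfoQ 中文 https://www.bestblogs.dev/article/8da23f49

--- http://BestBlogs.dev · Discover high-quality content that truly suits you. BestBlogs is an AI-driven personal reading assistant that helps you build a stable, trustworthy, personalized stream of high-quality information. Follow the sources and topics you care about and get a daily "My Morning Brief" tailored to you. Read online: https://www.bestblogs.dev/explore/brief/2026-06-15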
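Item [8] in the brief reports that tail latency worsened while the median and throughput stayed nearly flat. A minimal sketch of why a median can hide that kind of regression, using made-up latency samples. The numbers and the `percentile` helper are illustrative assumptions, not data or code from the linked article:

```python
def percentile(samples, p):
    """Nearest-rank percentile: smallest sample with at least p% of values at or below it.

    p is an integer from 1 to 100; integer arithmetic avoids float rounding in the rank.
    """
    ordered = sorted(samples)
    rank = (p * len(ordered) + 99) // 100  # ceil(p * n / 100)
    return ordered[rank - 1]


# Synthetic request latencies in milliseconds (hypothetical, for illustration only).
baseline = [100] * 98 + [150, 150]

# Assume contention slows only the slowest 2% of requests.
contended = [100] * 98 + [250, 250]

for name, data in (("baseline", baseline), ("contended", contended)):
    p50 = percentile(data, 50)
    p99 = percentile(data, 99)
    mean = sum(data) / len(data)
    print(f"{name}: p50={p50} ms  p99={p99} ms  mean={mean:.1f} ms")
```

In this toy data the median is identical in both runs and the mean moves by about 2%, so a dashboard tracking only those would show nothing, while p99 rises sharply. That is the general reason latency-sensitive workloads are usually monitored at high percentiles.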

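Summary: This morning brief covers several AI and tech developments. Fable 5 was forced offline by a US government export-control ban within 72 hours of release; the Pliny team broke through its classifier-downgrade safety architecture using Unicode homoglyph substitution and "decompose-and-recombine" attacks. SpaceX's IPO valuation is $780 billion; the retrospective covers its 24-year history, with an implied 41.5% annual growth rate over 15 years, and notes that Google signed a cloud-services agreement worth $920 million a month. Turing Award winner Yann LeCun systematically critiques LLMs' lack of causal modeling and is staking about $1 billion on developing the JEPA world model. Huawei's Ascend 950DT chip, co-designed with DeepSeek V4, delivers low-cost, high-concurrency inference, driving a 75% inference price cut, and ByteDance has locked in orders.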

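Berryxia.AI @berryxia · Jun 15 · 79

The world really is an amateur-hour operation... One phone call and Fable 5 gets pulled! A single call from Amazon's CEO got Anthropic's Fable model taken down, and the White House rolled out export controls within 24 hours. Last Thursday Jassy told the Trump administration that Fable had a jailbreak risk; on Friday morning a group at the White House met, and by the afternoon they were frantically calling Dario Amodei. Dario was still at a wellness retreat (Anthropic later denied this), but either way he had three calls with Bessent, Lutnick, and others, trying to explain the difference between guardrails and a universal jailbreak. They weren't having it and simply demanded that the model be taken down. Dario asked for time and more information and got the reply, "That's a terrible decision." That evening the export controls came down. A White House official said: "We begged them for hours to cooperate, and only resorted to this when there was no other way." The most absurd part is that Amazon, a major Anthropic shareholder and partner, went to report them first instead of talking to them directly. The government also moved absurdly fast: basically "find a problem → demand a takedown → block outright if they don't comply." People used to think AI companies played their own game; now it suddenly turns out that when a model is powerful enough and a vulnerability sensitive enough, big companies and government acting together move faster than any technical iteration. This episode lays bare the real power structure of AI regulation. It's all damn amateur hour... Do you think this is the government protecting national security, or a big company using the government's hand to suppress a competitor? Details in the comments 👇

Summary: Last Thursday, Amazon CEO Andy Jassy told the Trump administration that Anthropic's Fable model carried a jailbreak risk. After a White House meeting on Friday morning, officials repeatedly contacted Anthropic CEO Dario Amodei, who was at a retreat at the time. That afternoon Amodei held three tense calls with Bessent and others, trying to distinguish guardrails from a universal jailbreak, but the government was unmoved and demanded an immediate takedown. Amodei's request for more time was refused, with Bessent bluntly calling it a "terrible decision." The Trump administration imposed export controls that evening. A White House official said hours of "begging for cooperation" had failed. Amazon, a major shareholder, reporting first rather than communicating directly exposed the real power structure of AI regulation.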

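Chubby♨️ @kimmonismus · Jun 15 · 75

A deal has been struck with Iran, and the Strait of Hormuz is open again. This should trigger a very positive reaction from the stock market. Nevertheless, the question remains how the NASDAQ, in particular, will react to the US authorities' intervention in Anthropic. Therefore, Monday will be doubly exciting: a huge rally or panic over Anthropic?

Summary: US authorities intervened in Anthropic, and export controls took its top models Mythos and Fable offline. Anthropic urgently sent senior technical staff to Washington to try to persuade officials that the models can be safely controlled, making this a real-time test case for AI geopolitics. The market is watching Monday's reaction: the Iran deal lifts stocks, but the Anthropic affair could bring either a big rally or panic.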

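Chubby♨️ @kimmonismus · Jun 15 · 64

Keeps getting worse: It seems that the Chinese government ("China-linked group") had access to Claude Mythos, which is what triggered the whole situation surrounding Fable 5 on Friday. Via The Verge. But the China angle is still unconfirmed: David Sacks publicly focused on jailbreak risks, not China, and Anthropic says the White House did not raise Chinese access in its discussions with the company.

Summary: Export controls forced Anthropic's top models, Mythos and Fable, offline. The company has sent senior technical staff to Washington to try to repair its relationship with the White House and persuade officials that the models can be safely controlled. There are unconfirmed allegations that a China-linked group accessed Mythos, but David Sacks publicly focused on jailbreak risk rather than China, and Anthropic says the White House did not raise Chinese access. The alleged access is said to have triggered Friday's events around Fable 5.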

Chubby♨️ @kimmonismus · Jun 15 · 76

Just now: Anthropic is flying senior technical staff to Washington to repair its fight with the White House after export controls forced its top models, Mythos and Fable, offline. The company is now trying to convince officials that the models can be safely controlled, turning this into a real-time test case for AI geopolitics. Via Axios. Monday is getting more interesting by the minute.

Summary: After export controls forced its top models Mythos and Fable offline, Anthropic urgently sent senior technical staff to Washington to repair its conflict with the White House. The company is working to persuade officials that the models can be safely controlled, making this a real-time test case for AI geopolitics. According to an Axios exclusive.

Nathan Lambert @natolambert · Jun 15 · 42

Threading the needle in this post of: Anthropic has done some bad things for AI governance & the discourse, but the actions of this administration are way worse, so we need to get a handle on it before stronger models, open or closed, come along soon. https://www.interconnects.ai/p/welcome-to-the-agi-era-of-ai-governance

Summary: Threading the needle, the post argues that Anthropic has done some bad things for AI governance and public discourse, but that this administration's actions are far worse, so we need to get a handle on the situation before stronger models, open or closed, arrive soon. https://www.interconnects.ai/p/welcome-to-the-agi-era-of-ai-governance

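AI Notkilleveryoneism Memes ⏸️ @AISafetyMemes · Jun 15 · 50

TLDR: They lied. Again. Dario was NOT at a wellness retreat. "I was at Anthropic's HQ on Friday reporting when this all unfolded. Dario is not at a wellness retreat."

Summary: The White House spread a rumor that Anthropic CEO Dario Amodei was at a wellness retreat and unreachable. The post quotes a reporter who was at Anthropic's headquarters on Friday and says Dario was not at a wellness retreat, and it accuses the White House of lying again. The quoted tweet argues that Anthropic is the crown jewel of American AI, having pushed AI progress hardest over the past two years, and that Dario is being hit with Soviet-style propaganda for refusing to do the federal government's bidding entirely. It criticizes the move as self-sabotage that weakens America's AI competitiveness against China.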

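Chubby♨️ @kimmonismus · Jun 14 · 24

Tomorrow will be an exciting day. -Will Fable-5 be released again in a modified form? -How will the market react to the US regulation? -What is the situation regarding Anthropic's valuation? I don't think I've often been as excited as I am for tomorrow. History is written and 99% of people don't even understand.

Summary: The author expects an exciting day tomorrow and poses three questions: whether Fable-5 will be re-released in modified form, how the market will react to the US regulation, and where Anthropic's valuation stands. He says he has rarely been this excited, because history is being written and 99% of people do not even understand it.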

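meng shao @shao__meng · Jun 14 · 60

Inside Anthropic: "safety first" and power struggles at a nearly trillion-dollar AI giant | The Circuit

Dario Amodei still maintains that "AI could eliminate about 50% of entry-level white-collar jobs within 1–5 years" and that he "supports chip export controls on China"; Anthropic is trying to walk a tightrope between exponential technology, geopolitics, commercial competition, and public anxiety.

Bloomberg's in-depth documentary on Anthropic interviews co-founders and siblings Dario & Daniela Amodei and Claude Code lead Boris Cherny; the interviewer is @emilychangtv. The video was published on 6.10 (two days before Claude Fable 5 was taken offline by the US government), a delicate moment, and it is even more interesting to look back on after the Fable 5 ban. https://www.youtube.com/watch?v=v1wZwxY3CMg&t=1s

# Company positioning: from the OpenAI exit to industry front-runner
Origins
· In 2021, seven core OpenAI members (including the Amodei siblings) left over disagreements on trust and values, and discussed the startup's direction on the grass at Precita Park in San Francisco.
· At OpenAI, Dario put forward Scaling Laws (compute + data → stronger models), paving the way for ChatGPT; Daniela runs operations, turning Dario's "cosmic-scale ideas" into reality.
Current state
· Valuation of about $965 billion; annualized growth of about 80x in Q1 2026; API call volume up 17x year over year.
· First profit, driven mainly by enterprise tools such as Claude Code / Cowork rather than consumer apps.
· Dario describes it as a "smooth exponential curve": seemingly no change for a long time, then a sudden explosion.
Strategic choice
Deliberately avoiding ad-driven consumer AI (likened to social media's addictiveness and "slop") and betting on enterprise scenarios such as pharma, energy, and scientific research, on the view that the business model and the company's values align better there.

# Claude's product philosophy
Constitution: training model behavior with cross-cultural values such as the UN Declaration of Human Rights.
Professional Warmth: professional but not cold; neither a "best friend" nor an icy calculator.
Three safety axes: no lying (covering hallucination and deliberate deception), harmlessness, value alignment.
Early Claude was overly "nannying" (worrying excessively even about weather questions); this was later corrected through fine-grained tuning.

# Technical shock: the code revolution and job anxiety
Claude Code's transformation
· Boris Cherny: for 6 months, 100% of the team's code has been written by Claude, and hundreds to thousands of Claude instances can run at once.
· The engineer's role shifts from "writing code by hand" to "planning, talking with users, setting direction."
Market turmoil
· The Cowork launch triggered a "SaaSpocalypse," wiping about $285 billion off software stocks' market value in a single day.
· Dario's view: the software industry as a whole will grow, but those who do not adapt will be weeded out.
Employment forecast (the most controversial part of the video)
· Dario stands by his earlier view: AI could eliminate about 50% of entry-level white-collar jobs within 1–5 years.
· A combination of high GDP growth + high unemployment/low wages + high inequality may emerge.
· Automation path: first replace 90% of tasks → 10x productivity per person → eventually close to 100% replacement.
· Possible responses: UBI, progressive taxes on AI companies, and a shift toward physical manufacturing and interpersonal services (such as bedside manner in medicine).
· Dario rebuts Jensen Huang's criticism that he "confuses tasks with jobs," saying the full argument is in his essay The Adolescence of Technology.

# The Pentagon conflict: red lines and costs
Background
· In 2025, Anthropic, along with OpenAI, xAI, and Google, won a $200 million Department of Defense contract.
· Claude was reportedly used in operations such as the capture of Maduro in Venezuela; Bloomberg says it performed AI-assisted target identification in the Iran war via Palantir Maven.
Red lines
Anthropic refuses:
· mass surveillance
· fully autonomous lethal weapons
Consequences
· The Department of Defense demanded "full use without guardrails" and blacklisted Anthropic after it refused; Trump and Defense Secretary Hegseth publicly criticized Dario as an "ideological lunatic."
· Dario's response: this is a debate about how government should properly use AI, not mere confrontation; he hopes to set a precedent.
Pointed questions on the ethics of war
· US officials say LLMs helped the military raise target identification from 1,000/day → 5,000/day.
· In February 2026, a girls' school in Iran was hit by a missile, killing 150+ children; Dario says he does not know whether Claude was involved, but stresses that "humans make the final decision" is one of his red lines.
· He admits military decisions will still go wrong but considers the overall effect net positive; without limits, AI warfare is more likely to trigger than to prevent great-power conflict (citing the automatic-retaliation risk in Dr. Strangelove).
Geopolitical stance
· Supports chip export controls on China (likened to not selling nuclear weapons to North Korea).
· Moved from an anti-war stance at Caltech to supporting defense: risks in Russia-Ukraine and the Taiwan Strait mean the "resurgence of an authoritarian bloc" must be addressed.
· Denies any cooperation related to ICE, CBP, or Gaza; works with Palantir but says the scope is strictly limited.

# Mythos: the withheld cyber "superweapon"
Model capabilities
· Claude Mythos: found thousands of high-severity vulnerabilities in mainstream operating systems (including a 27-year-old OpenBSD bug, a 16-year-old FFmpeg bug, and Linux kernel privilege-escalation chains).
· Early testers called it a "superweapon" and asked Anthropic not to release it.
Project Glasswing
· Open only to trusted defenders such as AWS, Google, Microsoft, and CrowdStrike, for patching rather than attack.
· Even federal agencies such as the NSA are scrambling for access, although the Pentagon has blacklisted Anthropic.
Core dilemma
· Dario: the future is a cat-and-mouse game of offense and defense; the good guys need the tools first, and the bad guys will sooner or later have similar capabilities.
· Emily Chang presses: who has the right to decide who gets this power? Daniela admits the decision is complex and possibly imperfect, but stresses that it stems from specific cybersecurity concerns, not a generalized allocation of power.
· Dario says withholding Mythos was very costly commercially, rebutting the "safety marketing" claim.

# Governance and trust: can they be the "good guys"?
Regulatory positions
· AI is the first disruptive technology led by the private sector with government lagging behind (in contrast to nuclear weapons, the internet, and GPS).
· Calls for mandatory third-party testing before release (cybersecurity, bioweapons, loss-of-control risk, etc.), analogous to FAA certification of airliners.
· Criticizes Silicon Valley for swinging between "extreme anti-regulation" and "nationalizing AI," and advocates moderate, sustained regulation.
Trust crisis
· The public: more worried than excited, seeing risks as outweighing benefits; there are protests outside Anthropic's offices.
· Dario: starting from distrust is rational; Silicon Valley needs to earn trust back, and being "different" must be proven through action.
· Compares himself to Leo Szilard (who conceived the nuclear chain reaction) and regards Oppenheimer as a cautionary failure: what is needed is checks and balances, not individual heroism.
· Gives a 10–25% probability of civilizational collapse; about half of Anthropic's work goes to reducing risk, but zero risk cannot be guaranteed (like a safer airline that still cannot promise never to crash).
Lessons from social media
· Daniela: the AI industry is a second chance after social media and should proactively think about child welfare, mental health, and election integrity, rather than explain itself after the fact.
· If a major accident happens, AI may be banned, "and perhaps rightly so."

Summary: Bloomberg's documentary goes inside Anthropic: the company holds to "safety first" and was blacklisted after refusing the Defense Department's demand for use without guardrails; the Claude Code team's code has been 100% AI-written for 6 months, and the Cowork launch wiped $285 billion off software stocks in a single day. Dario stands by his forecast that AI will eliminate about 50% of entry-level white-collar jobs within 1–5 years and gives a 10–25% probability of civilizational collapse. The withheld model Mythos found thousands of high-severity vulnerabilities. Anthropic supports chip export controls on China and calls for mandatory third-party testing before release.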

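Chubby♨️ @kimmonismus · Jun 14 · 57

Holy, Dario really hasn't made any friends lately.

Summary: Three months ago, the US Department of Defense kicked Anthropic out of the building for good and said it was the right call. Kim remarks that Dario really hasn't made any friends lately.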

Chubby♨️ @kimmonismus · Jun 14 · 75

New Politico reporting fills in the 24 hours behind the Fable 5 / Mythos 5 shutdown, and it's messier than the press releases. And two sides contradict each other. - The first alarm to the White House came from Amazon CEO Andy Jassy (The Information confirmed that yesterday), who flagged that Fable's guardrails could be bypassed. - He was answering a government request for feedback, and per Politico he wasn't the only one. By Friday it had reached Bessent, Cyber Director Cairncross and Commerce Sec Lutnick, who pulled Amodei into three calls. - From there the two sides don't agree on anything. The White House says export controls were a last resort after hours of trying to get Anthropic to cooperate. -Anthropic's camp says it got a 90-minute deadline to kill the models, no threat detail, no offer to work it out. Officials were reportedly stunned that Amodei, who has compared his own tech to a nuclear bomb, wouldn't pull it over a known hole. I assume that Anthropic will release more information on Monday and further strengthen their position.

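Summary: Politico reports that Amazon CEO Andy Jassy told the White House on Thursday that the guardrails on Anthropic's Fable model could be bypassed. On Friday morning, White House officials held three tense calls with Anthropic CEO Dario Amodei, asking him to pull the model and cooperate on fixing the vulnerability. Amodei asked for more time and information and did not commit to pulling it. The Trump administration imposed export controls that evening. The White House says this was "a last resort after hours of pleading for cooperation"; Anthropic's side says it received only a 90-minute deadline, with no threat details and no room to negotiate.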

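Chubby♨️ @kimmonismus · Jun 14 · 82

Calling it now: if this turns out to be true, he won’t remain Anthropic CEO for much longer. However, Anthropic denies it.

Summary: New Politico reporting reveals the details behind the shutdown of Anthropic's Fable 5 / Mythos 5 models, and the two sides' accounts conflict. Amazon CEO Andy Jassy first alerted the White House that the models' guardrails could be bypassed. On Friday the matter escalated to Treasury Secretary Bessent, Cyber Director Cairncross, and Commerce Secretary Lutnick, who held three calls with Anthropic CEO Amodei. The White House says export controls were a last resort, while Anthropic says it got only a 90-minute deadline, was given no threat details, and had no chance to negotiate. Officials were reportedly stunned that Amodei, who has compared his own technology to a nuclear bomb, would not pull the model over a known hole. The post predicts that if this proves true, he will not remain CEO much longer, while noting that Anthropic denies it.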

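Yuchen Jin @Yuchenj_UW · Jun 14 · 48

One hypothesis: If non-citizens at Anthropic can’t work on Mythos/Fable, and LLM jailbreaks remain unsolved, US frontier labs will be forced to slow down training and model releases. Could Chinese open-source AI surpass US closed models for the first time in ~6 months?

Summary: A hypothesis: if non-citizens at Anthropic cannot work on Mythos/Fable and LLM jailbreaks remain unsolved, US frontier labs will be forced to slow training and model releases. The author asks whether Chinese open-source AI could surpass US closed models for the first time within about 6 months.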

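Peter Steinberger 🦞 @steipete · Jun 14 · 45

Got a PayPal verification text and thought I been hacked, but it was just codex signing up for a web service it needed.

Summary: The author received a PayPal verification text and assumed he had been hacked, but it was only codex signing up for a web service it needed.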

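小互 @xiaohu · Jun 14 · 75

On the eve of Anthropic's IPO, Bloomberg interviewed the company's brother-and-sister founders. In this interview (recorded before Fable 5 was banned), Dario Amodei heavily played up Mythos's power and the threat of AI. Of course this has always been his position: calling for government regulation of AI, and of course what he is calling for is regulation of all companies... Below are some clipped interview segments (translated and edited entirely by Claude Code) • Mythos, a model so strong its makers didn't dare release it: thousands of vulnerabilities, able to hack banks and pry open state secrets; even the NSA is scrambling to use it • Dario predicts AI could cut half of entry-level white-collar jobs within one to five years • Claude was used by the US military in the war on Iran; the hard question of 150 deaths at a girls' school • For the first time he spells out why he left OpenAI: not a safety disagreement, but a collapse of trust • Pushing back face to face on Jensen Huang's "doom marketing" charge: calling this cheap marketing is itself cheap marketing • Civilizational collapse probability of 10% to 25%; he walks through the math using "will the plane crash"

Summary: Anthropic CEO Dario Amodei says the internal model Mythos found thousands of vulnerabilities and could hack banks and steal state secrets; predicts AI will cut half of entry-level white-collar jobs within one to five years; says Claude has been used by the US military in the war on Iran, and faces hard questions over 150 deaths at a girls' school; explains that he left OpenAI because trust collapsed; pushes back on Jensen Huang's "doom marketing" accusation; and gives a 10%-25% probability of civilizational collapse.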

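宝玉 @dotey · Jun 14 · 46

The model is fundamental. The Harness layer is relatively easy to fill in, but it doesn't need many vertical-domain variants. Claude Design will soon be merged into Claude Desktop, and once Codex's next generation (or next few generations) of models are capable enough, Codex Design will be integrated directly into the Codex App as a Plugin.

Summary: Model capability is fundamental; the Harness layer is relatively easy to fill in and does not need many vertical-domain variants. Claude Design will soon be merged into Claude Desktop. Once model capability is sufficient, Codex will integrate Codex Design into the Codex App as a Plugin. The discussion also raises a question about the open-source Open Design approach: if it used Claude Code's model, could it reach similar engineering capability?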

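宝玉 @dotey · Jun 14 · 74

A concrete example of using Claude Design to update a design and its code. I have a video subtitle editor tool whose design was done in Claude Design. Previously the title text and the info below it sat on one line, and a long title wouldn't fit, so I had it changed to two lines. Figure 1 is the change I made on the design; once it was done, I exported and downloaded the zip file, put it into the project, and git diff made it easy to see what changed (Figure 2). Then one simple prompt to Claude Code: > Referring to the relevant changes under the design directory, update the UI. Claude analyzed the changes itself via git diff, found every place the design had changed, and modified the corresponding Swift code for me. Task done! (Figure 4 shows the result.) Throughout, I mainly made changes in Claude Design and then had to sync them over manually.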
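The workflow above hinges on `git diff` scoped to the design directory. A minimal sketch of that step, assuming the exported files were unpacked over a tracked `design/` directory as in the prompt. The helper names are hypothetical, and only standard `git diff` options (`--stat`, `--name-only`, and `--` to separate paths) are used:

```python
import subprocess


def design_diff_command(design_dir="design", stat_only=True):
    """Build the git command that shows what changed under the design directory."""
    cmd = ["git", "diff"]
    if stat_only:
        cmd.append("--stat")  # per-file summary instead of full patch text
    cmd += ["--", design_dir]  # "--" separates options from the path
    return cmd


def changed_design_files(design_dir="design"):
    """Return tracked paths under design_dir whose working-tree content differs from the index.

    Brand-new (untracked) files do not appear in `git diff` until they are added.
    Must be run inside a git repository.
    """
    result = subprocess.run(
        ["git", "diff", "--name-only", "--", design_dir],
        capture_output=True,
        text=True,
        check=True,
    )
    return [line for line in result.stdout.splitlines() if line]


if __name__ == "__main__":
    # Print the command a person (or an agent) would run to review the export.
    print(" ".join(design_diff_command()))
```

Keeping the design export under version control is what makes the one-line prompt workable: the agent gets a precise list of changed design files rather than having to compare two mockups by eye.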

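Summary: 宝玉 shares a real case of Claude Design working with Claude Code: after modifying the subtitle editor's UI design in Claude Design, he exported the zip, reviewed the changes with git diff, and then used a one-sentence prompt to have Claude Code update the Swift code by referring to the changes in the design directory; the only manual step was syncing the design files. He also explains why Codex has no comparable product: Claude Design relies on the Claude Opus 4.8 model having both UI/UX design and system architecture design ability, so it can deliver an interactive prototype in one pass (including data structures, state management, and interaction logic), which GPT-5.5 cannot yet do. The Harness layer can be copied; the model layer is the real barrier.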

Rohan Paul @rohanpaul_ai · Jun 14 · 78

Reuters: Amazon’s Andy Jassy was among the people who warned senior Trump officials this week about security concerns around Anthropic’s newest Fable 5. Amazon researchers pushed Fable 5 with a string of prompts and got it to spill cyberattack-helping information it was not supposed to share. --- reuters .com/business/retail-consumer/amazon-voiced-concerns-about-anthropic-ai-models-before-us-governments-crackdown-2026-06-13/

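Summary: Reuters reports that Amazon CEO Andy Jassy warned Trump administration officials this week about security risks in Anthropic's new model Fable 5. Amazon researchers used a series of prompts to get the model to reveal cyberattack-helping information it should have refused to provide. Earlier, the US Commerce Department directed Anthropic to shut down Fable 5 and Mythos 5 after testers found a jailbreak. Anthropic responded that the jailbreak technique is narrow and surfaced only a small number of known vulnerabilities, that other public models can provide similar capabilities, and that no model provider can currently achieve perfect jailbreak resistance.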

Rohan Paul @rohanpaul_ai · Jun 14 · 75

So Anthropic says now even some of its own employees, who built Anthropic’s most powerful new AI models, Fable 5 and Mythos 5, will not have access to it. The reason is a U.S. government export control directive that treats giving these advanced models to any foreign national (even those working inside the United States) as an illegal “deemed export” on national security grounds. Because Anthropic cannot easily verify every user’s nationality in real time, the company had no choice but to disable the models entirely for everyone, including its own international team members.

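Summary: Last Friday the US government issued an export-control directive to Anthropic requiring it to shut down its most powerful models, Fable 5 and Mythos 5. The trigger was the discovery of a jailbreak that lets the models provide cybersecurity help they should refuse. Commerce Secretary Howard Lutnick said the models will be subject to export restrictions covering people outside the US and foreign nationals inside it until national security systems are hardened (possibly within weeks). Anthropic responded that the jailbreak technique is narrow, surfaced only a few known minor vulnerabilities, and that other public models can provide similar capabilities; but because the company cannot verify users' nationality in real time, it had to disable the models for everyone, including its own international team members. Anthropic also said the industry cannot currently achieve perfect jailbreak resistance and that all safeguards are fragile against non-universal jailbreaks.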

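Rohan Paul @rohanpaul_ai · Jun 14 · 75

👀 Hope Fable 5 and Mythos 5 come back soon.

Summary: This week Anthropic released a Mythos-class model under the commercial name Fable (with safety guardrails). A highly trusted partner found a jailbreak, and the US government asked CEO Dario Amodei to fix it or pull the model. Anthropic refused, judging the vulnerability not serious, and the government imposed export controls. David Sacks said the administration wants Anthropic to fix the issue quickly so the controls can be lifted and public access restored, and expressed puzzlement that Anthropic, which previously put safety first, now refuses to cooperate. The author of the main post hopes Fable and Mythos return soon.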

Chubby♨️ @kimmonismus · Jun 14 · 70

There are only two possibilities: Either a solution is quickly found next week that somehow explains to the market how enterprises can continue to access Anthropic's best models in the future, in agreement with the US government, or: We foresee a rapid decline in the valuation of Anthropic and Dario Amodei, who has seriously miscalculated his dealings with the US government and, at the same time, the rapid success of OpenAI compared to Anthropic. The upcoming Anthropic IPO will be particularly important in this context. Everything will be decided next week.

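Summary: Amazon CEO Andy Jassy reported security risks in Anthropic's latest Claude models to senior Trump administration officials, helping trigger the late-night export restrictions on Mythos 5 and Fable 5. Analyst Kim sees two possibilities: either a solution is found next week that lets enterprises keep accessing Anthropic's best models in agreement with the US government, or Anthropic's valuation falls quickly, with Dario Amodei having seriously miscalculated and OpenAI rising rapidly. Next week is the key juncture.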

elvis @omarsar0 · Jun 14 · 71

http://x.com/i/article/2065876120965111808 # Autonomous Long-Running Coding Agents Autonomous coding is moving from better prompting to better control systems. The important shift is that engineers are learning how to wrap agents in goals, evaluators, loops, and artifacts that let them keep working after the human stops typing. This matters because most serious engineering work spans long horizons: ambiguous requirements, hidden constraints, partial failures, changing context, and repeated verification. The new frontier is designing the system around the agent so it can plan, execute, check its work, recover from mistakes, and keep making progress without constant human steering. This piece is based on a DAIR.AI Academy session on autonomous long-running coding agents, where I walked through Claude Code's /goal mode, the newer /loop command, verifiers, artifacts, and orchestration patterns in practice. Written in collaboration with Codex and Claude Code. ## From Prompting to Goal Design The core idea behind features like Claude Code's /goal is simple. A coding agent remains the executor, but the human no longer interacts with it turn by turn. Instead, the human specifies the desired end state, the evidence required to prove success, the constraints that must not be violated, and, where possible, the number of turns and budget. That goal works more like a contract than a longer prompt. A weak goal gives the model room to stop early, take shortcuts, or redefine success in a way that looks plausible in the transcript but fails in the real system. A strong goal gives the agent a target it can repeatedly measure itself against. Engineering judgment still matters here. The best goals encode domain knowledge that the model would otherwise guess. For a research experiment, that might mean a target benchmark score, a held-out evaluation, a required loss curve, and a rule that the result must beat an initial baseline. 
For a UI task, it might mean a screenshot reference, concrete layout constraints, and a browser verification step. The model can execute, but the human still defines what "done" actually means. ## The Evaluator Becomes a First-Class Component Long-running agents need a second role besides the goal. That evaluator can be another coding agent, an LLM-as-judge, a script, a test suite, a benchmark harness, or a mix of all of them. The key design choice is matching the evaluator to the task. When success is crisp, deterministic checks are better. Type checks, unit tests, lint rules, integration tests, and benchmark scripts should be used whenever they can express the condition clearly. When success is fuzzy, an agent evaluator becomes useful. A script can tell you whether tests pass, but it cannot easily decide whether a generated research report is coherent, whether an implementation faithfully follows a paper, or whether a UI matches a design intent. This is where the evaluator benefits from language, judgment, and sometimes vision. The practical pattern uses deterministic checks as the floor and agent evaluation as the higher-level review. That combination reduces hallucinated success while still allowing autonomy on tasks that do not fit cleanly into a test assertion. ## Verifiers Define the Boundary of Trust The deeper point is that autonomy only works when the system has a reliable verifier. A coding agent can generate a plan, implement a feature, and explain why it believes the work is complete, but that explanation should not be treated as evidence. Evidence comes from an external check that the agent cannot easily talk its way around. For code, the verifier might be a test suite, type checker, benchmark, browser run, screenshot comparison, or reproducible script. For research work, it might be a held-out evaluation, a reproduced table, a loss curve, or a benchmark score that improves over the baseline. 
For design work, it might be a reference screenshot plus a visual review step. The verifier is what turns a long-running agent from a confident text generator into a system that can be trusted with more time. Most shortcuts appear at this boundary. If the verifier is vague, the model will often satisfy the easiest interpretation of the task. If the verifier is too narrow, the model may overfit to it and miss the broader intent. A good autonomous workflow, therefore, needs layered verification, with cheap deterministic checks catching basic failures and higher-level review catching judgment-heavy failures. A few of the frontier models can already achieve some level of verification, but based on my research, there is still an evident OOD problem, where if the verification task you assign to the agent falls outside the training distribution, models struggle significantly. Verifiers are still an open area of research, but I anticipate more companies will start to make huge investments in this area. The concept of fine-tuned verifiers is also in high demand in the enterprise. ## Loops Make Autonomy Durable A goal gives the agent direction, but a loop keeps the work alive. This distinction is important because models often stop before the real task is finished. They may hit a turn limit, lose confidence, exhaust context, or decide that a partial solution is enough. The loop is the outer control system. It wakes up, inspects progress, runs checks, compares the result against the goal, and sends the agent back in with the next instruction when the goal has not been met. In its simplest form, this is the Ralph loop pattern with a coding agent and a deterministic condition. In a more flexible form, the loop includes an evaluator agent that can reason about progress and decide what should happen next. Long-running autonomy works as repeated effort under supervision from a control layer, not as one continuous act of intelligence. 
The agent can still fail, but the loop gives the system a way to notice the failure and continue instead of silently declaring victory. ## Planning Is Where Expertise Enters One of the strongest themes from the session was that planning remains critical. You can ask a frontier model to generate a plan, but you still need to inspect it, challenge assumptions, and make the success criteria sharper before handing the task to an autonomous loop. This leads to a useful division of labor. A stronger planning model can help define the goal, identify missing constraints, and structure the evaluation. A different execution model can then run the implementation once the plan is clear. In practice, this means engineers should stop thinking of "the model" as a single choice. Model choice becomes an architecture decision. Some models are better planners. Some are better executors. Some are cheaper evaluators. Some are better at vision-based review. A good orchestrator lets you swap these roles instead of waiting for one vendor to provide the perfect coding agent interface. ## Visual Artifacts Become Control Surfaces Terminal transcripts do not scale when many agents are running. Once you have several sessions working in parallel, raw text becomes a poor interface for understanding progress. Live artifacts matter because a dashboard with loss curves, benchmark scores, task states, screenshots, cost estimates, and recent decisions gives the human a much better way to supervise autonomy. The artifact becomes the control surface for deciding when to intervene, rather than a report generated after the fact. The most useful pattern is to separate storage from presentation. Markdown or a vault can store durable evidence, logs, notes, plans, and results. HTML artifacts can render that state into something visual and interactive. The agent can search the Markdown, while the human can monitor the artifact. For UI and product work, visual cues are especially powerful. 
A screenshot reference can communicate design intent more precisely than prose, and a vision-capable evaluator can compare the implementation against that reference. This reduces the common failure mode where the agent technically implements the requested component but misses spacing, hierarchy, alignment, or product feel. ## Session Mining Turns Usage Into Memory Another important insight is that past agent sessions are a rich source of workflow data. If an agent repeatedly fails in the same way, forgets to run the same check, uses the wrong path, or retries the same broken command, that pattern should not stay buried in logs. Session mining turns those transcripts into operating rules. An agent can scan the last thirty days of work, find recurring failure modes, and propose updates to project instructions, vault learnings, or agent rules. This is how a team can gradually improve its harness without manually remembering every mistake. The goal is to make the local environment smarter without training a model from scratch. A small rule in an agent instruction file can prevent repeated failures across future sessions, especially when the rule is specific to the project. ## A Practical Operating Model For AI engineers, the emerging workflow looks like this. - Start with a small, cheap subset before launching the full autonomous run. - Write a goal with measurable success criteria, explicit constraints, and a turn or time budget (where possible). - Separate the executor from the evaluator so implementation and judgment are not collapsed into one role. - Define external verifiers before the long-running loop starts. - Use deterministic checks wherever possible, then add agent review for fuzzy criteria. - Require proof artifacts such as logs, screenshots, benchmark curves, or changed files. - Mine past sessions and promote repeated lessons into project instructions. That is the difference between using a coding agent and engineering an autonomous coding system. 
One gives you a conversation. The other gives you a harness.

## What Still Breaks

None of this removes the hard problems. Agents still take shortcuts. They still stop early. They still overestimate completion. They still produce confident but weak plans, especially on recent papers, unfamiliar benchmarks, or systems outside their training distribution.

Trusting them more will not solve this. Better control systems will. Goals, loops, evaluators, deterministic checks, visual artifacts, and session memory are all ways of making autonomy observable and correctable.

The direction is clear. The future of coding agents depends on better orchestration around more capable models, where engineers design the conditions under which agents can safely run for hours or days and still produce work that can be verified.

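Summary: The core of long-running coding agents is shifting from prompting to control systems. In a DAIR.AI Academy session, Elvis Saravia walks through Claude Code's /goal mode: the human specifies the end state, the evidence of success, the constraints, and the budget, and the goal acts as a "contract" rather than a long prompt. Evaluators become first-class components: well-defined tasks use deterministic checks (tests, lint, benchmarks), fuzzy tasks use agent evaluators (judging reports, UI design), and combining the two reduces hallucination. Verifiers define the trust boundary: external checks (test suites, type checking, browser runs, screenshot comparison) provide evidence that cannot be bypassed.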

宝玉@dotey · June 14 · 51

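Why hasn't Codex shipped something like a "Codex Design"?

Anthropic recently launched Claude Design. It is the Agent I use most outside of coding, and I have recommended it many times. The results really are good: you describe the app you want in one sentence, and it generates an interactive prototype where everything you click responds. If you don't look closely, you would think you were using a real app. Someone asked: why hasn't Codex shipped a product like Codex Design? The short answer is that GPT-5.5's model capability isn't up to the job yet. To explain why, you first need to understand one key distinction.

[1] The two layers of an Agent: the model and the Harness

Many people lump Codex and Claude Design together with GPT-5.5 and Claude Opus 4.8, but they sit on two completely different layers. Claude Design and Codex are the "product layer", which the industry calls the Harness: prompts, toolchains, UI interaction flows, and other engineering-level pieces. Claude Opus 4.8 and GPT-5.5 are the "model layer", the brain that does the actual work. An analogy: the Harness is the kitchen, with its pots and pans (tools) and recipes (Skills), and the model is the chef. Same kitchen, different chef, and the dishes come out completely different. Once you understand this distinction, the rest is easy to explain.

[2] The Harness is not the barrier

Technically, the Harness layer of Claude Design is not complicated. With some reverse engineering, you can recover nearly all of the prompts and tool code. I have already done this; the result is baoyu-design (https://github.com/JimLiu/baoyu-design), which uses a Skill to run Claude Design on other models. There are no engineering secrets. What really creates the gap is the model behind it.

[3] High-fidelity interactive prototypes: the hard part is the model

The name Claude Design is easy to misread as delivering static design mockups, the way Figma or Photoshop do. In fact it goes a step beyond Figma: it delivers a high-fidelity interactive prototype that merges the mockup and the prototype, so you can not only see the design but also operate it directly. This places high demands on the model.

Take an example. Say I want to build a client for X/Weibo. Many models can draw a good-looking static interface. Making that interface interactive is much harder: switching between Timelines, showing different tweet types (text, image, video), turning the heart red on a like, removing a deleted tweet from the list, and preserving state when you go from the list into a detail view and back. To do this, the model must think through the entire data structure and state management before it starts drawing UI: what a tweet looks like, which kinds of timeline exist, what state each button is currently in, and how those states are linked. That is system architecture work, not UI drawing work.

Claude Design requires the model to have both excellent UI/UX design ability and system architecture ability; if either is missing, the result suffers badly. This is also why I previously objected to producing pure HTML mockups: those are only static UI design, with no UX interaction folded in.

If you can, test it yourself. For example, use this prompt: Design a X Client for Mac, similar to Tweetbot for Mac from Tapbots

Give Codex the same prompt and it also produces something that looks fine and supports simple interaction. But compare the two and the gap is obvious: the list scrolls, but the sidebar can't be clicked and the like button does nothing. It takes several rounds of iteration to reach a barely acceptable level. What Claude Design produces is completely different. Switching from the Timeline to the notifications page, or going from the list into a detail view and back, is smooth throughout, and state is preserved. If you don't look closely, you would really think you were using a highly polished app, even though the data is all mocked. Claude Opus 4.8 has evidently had extensive training and optimization for design and architecture scenarios like this.

[4] The output is code

Look at Claude Design's output and note the data.jsx file in it. It defines the data structure of the whole design clearly, mocks a complete dataset on top of that structure, and then builds the UI on that data with React. The design deliverable is itself code (React, CSS, JSON), not Figma or PSD. Any developer who receives it can read off the button radius, primary color, and spacing directly and reimplement them in their own tech stack. Later design changes? One git diff shows what changed. Communication loss between design and development drops to a minimum. To be more precise, communication loss between the design Agent and the development Agent is now very low, since these days humans direct one Agent to design and another to write code.

[5] How to use Claude Design well

Many people don't know how to use Claude Design well. It is a bit like Vibe Coding: start with a basic idea, have it produce a first version, then direct the Agent through Chat to revise it. After a few versions your own thinking becomes clear. The whole process feels magical, as if whatever you say comes true: whatever change you ask for, it manages to implement. This is also why I am so hooked on Claude Design; the feedback comes fast and is very satisfying. One more tip: don't give overly specific requirements. State the goal you are after and let it improvise. That often gives better results, since it has been trained on nearly all public UI designs.

Back to the original question. Codex hasn't shipped a similar design product because GPT-5.5 can't carry this job yet. Many models can draw a good-looking interface; the hard part is thinking through the data structure, state management, and interaction logic before starting, and then delivering a complete interactive prototype in one go. So far only Claude's models have managed it. How long that lead lasts depends on how fast models from OpenAI and others evolve.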
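
To make the "data structure and state management first" point concrete, here is a minimal sketch of the kind of model an X/Weibo client prototype needs before any UI is drawn. It is written in Python rather than React purely for illustration; `Tweet`, `AppState`, and `apply_action` are hypothetical names invented here and do not come from Claude Design's data.jsx.

```python
from dataclasses import dataclass, replace
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class Tweet:
    id: int
    text: str
    kind: str = "text"    # assumed kinds: "text", "image", "video"
    liked: bool = False   # drives whether the heart renders red


@dataclass(frozen=True)
class AppState:
    tweets: Dict[int, Tweet]                # single source of truth, keyed by id
    timelines: Dict[str, Tuple[int, ...]]   # timeline name -> ordered tweet ids
    active_timeline: str = "home"
    open_tweet_id: Optional[int] = None     # None means the list view is showing


def apply_action(state: AppState, action: str, **payload) -> AppState:
    """Return a new state; the old one is never mutated."""
    if action == "toggle_like":
        tweet = state.tweets[payload["tweet_id"]]
        tweets = {**state.tweets, tweet.id: replace(tweet, liked=not tweet.liked)}
        return replace(state, tweets=tweets)
    if action == "delete":
        tid = payload["tweet_id"]
        tweets = {k: v for k, v in state.tweets.items() if k != tid}
        # Remove the id from every timeline so no list shows a dead tweet.
        timelines = {
            name: tuple(i for i in ids if i != tid)
            for name, ids in state.timelines.items()
        }
        open_id = None if state.open_tweet_id == tid else state.open_tweet_id
        return replace(state, tweets=tweets, timelines=timelines, open_tweet_id=open_id)
    if action == "switch_timeline":
        return replace(state, active_timeline=payload["name"])
    if action == "open_detail":
        return replace(state, open_tweet_id=payload["tweet_id"])
    if action == "back":
        return replace(state, open_tweet_id=None)
    raise ValueError(f"unknown action: {action}")
```

Because likes live on the tweet record rather than on a particular screen, liking a tweet, opening its detail view, and going back leaves the heart red in the list, which is the state-preservation behavior the post describes.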

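Summary: Anthropic launched Claude Design, which generates a high-fidelity interactive prototype from a one-sentence description. Users asked why OpenAI's Codex has no similar product; the key is the gap at the model layer. An Agent has a Harness (product layer) and a model layer. The Harness is not the barrier (the open-source baoyu-design already reproduces it). The real moat is that Claude Opus 4.8 combines UI/UX design ability with system architecture ability, defining data structures, state management, and interaction logic before delivering a complete prototype, whereas the interactions GPT-5.5 generates are poor. The output is React/CSS/JSON code.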

Yuchen Jin@Yuchenj_UW · June 14 · 73

Anthropic called Mythos dangerous in its own safety statement. That statement is now the reason Fable 5 got banned by the US gov. Surprisingly, “Dario refused.”

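Summary: This week Anthropic released a Mythos-class model under the commercial name Fable (Anthropic itself had described Mythos as a cyberweapon and called for regulation). Fable is Mythos with guardrails. A highly trusted testing partner found a jailbreak in the guardrails, and the US government asked CEO Dario to fix it or take the model down. Dario refused, and Anthropic published a blog post saying the jailbreak was not serious. The US government then imposed export controls on Fable and said it hopes to lift them as soon as Anthropic fixes the safety issue. Dario's refusal to cooperate is sharply at odds with the safety-first image he has promoted.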

Chubby♨️@kimmonismus · June 14 · 69

Interesting: In David Sacks’ view, the fault lies with Anthropic (specifically CEO Dario Amodei). He argues that: • Anthropic released Fable (Mythos with guardrails) but refused the U.S. government’s reasonable request to fix a confirmed jailbreak that could expose advanced cyber capabilities. • They prioritized keeping the consumer model available over addressing the safety issue, which directly contradicts their long-standing public branding as the “AI safety company.” • The administration issued the export control only reluctantly, after Anthropic declined to cooperate, and Sacks emphasizes that the ball is now in Anthropic’s court to remediate the problem. It’s getting more interesting minute by minute.

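Summary: According to David Sacks, Anthropic this week released Fable, the commercial version of its Mythos-class model (with guardrails). A trusted tester found a jailbreak, and the US government asked CEO Dario Amodei to fix it or take the model down; Dario refused, saying the vulnerability was not serious. The security partner and the government believe the jailbreak could expose advanced cyber capabilities (Anthropic had itself called Mythos a cyberweapon). Anthropic prioritized keeping the consumer model available over fixing the safety flaw, which contradicts its "AI safety company" brand. The US government issued the export control reluctantly and hopes to lift it once Anthropic remediates the problem.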

AYi@AYi_AInotes · June 14 · 72

Someone put "Fable 5" up on Pirate Bay, 3.4TB. I'm curious where they downloaded it from. That's wild. 🤔

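Summary: Amazon AI researchers reported to the US government that they could break the safety guardrails of Anthropic's Fable 5 and Mythos 5. The US Secretary of Commerce then issued an export control directive, forcing Anthropic to cut off access for all users. Anthropic argues that the so-called jailbreak is only a non-universal vulnerability of a kind that is common in other public models too, but the authority to interpret the rules does not rest with the developer. This is the second time the Trump administration has applied pressure; earlier, Anthropic refused to delay the release of new models. Separately, there are reports that someone has uploaded Fable 5 to Pirate Bay as a 3.4TB file. Frontier AI competition has moved from the battlefield of code to administrative measures.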

宝玉@dotey · June 14 · 26

Only kids make choices; adults want it all.

Summary: tinyfool asked: right now, do you pick Claude Code or Codex? 宝玉 replied: only kids make choices; adults want it all.

Chubby♨️@kimmonismus · June 14 · 68

It was in fact Amazon (CEO Andy Jassy) who reportedly helped trigger the Claude shutdown. Via The Information: Amazon CEO Andy Jassy reportedly warned senior Trump administration officials about security risks in Anthropic’s newest Claude models, helping trigger late-night export restrictions on Mythos 5 and Fable 5. An Amazon spokesperson told The Information: “As a leading cloud provider that serves a large number of private and public sector customers, it’s not uncommon for governments to seek our counsel on potential security risks. When they occur, we don’t share the details of these discussions.” In other words: Anthropic’s own mega-backer may have played a key role in pushing the government to freeze access to its most advanced models.

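Summary: Amazon CEO Andy Jassy reportedly warned senior Trump administration officials about security risks in Anthropic's newest Claude models, triggering late-night export restrictions on Mythos 5 and Fable 5. Amazon responded that governments often seek its counsel on potential security risks and that it does not share the details. Some commentators note that Amazon, one of Anthropic's largest investors, appears to have jailbroken the Claude models first and then snitched to the US government, leading to the export freeze on the most advanced models.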

AYi@AYi_AInotes · June 13 · 48

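WTF, even Andrej Karpathy can't use their top internal model anymore? I looked it up: Karpathy is indeed not a US citizen. He was born in Slovakia and grew up in Canada, then later got a US EB-1 extraordinary-ability green card, which means permanent residency. There is no clear evidence that he holds US citizenship.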

Chubby♨️@kimmonismus · June 13 · 56

The next big beneficiary is, of course, OpenAI, for two reasons. 1) IPO: OpenAI was concerned that Anthropic would preempt its IPO and secure a better valuation. Until recently this was a likely scenario, and it would have created the image of a second-rate competitor. Now the question is how the ban will affect Anthropic's valuation and its upcoming IPO. Who wants to invest in a company that has become persona non grata with US authorities (due to supply chain risk) and may not even be allowed to distribute its best models to enterprises, let alone globally? This will certainly put significant downward pressure on the valuation. 2) OpenAI has the opportunity to learn from this: to engage proactively with US authorities to avoid such a disaster, to determine how its model needs to be structured, and to obtain the authorities' approval beforehand, essentially using the time to develop a model and secure the authorization needed to distribute it. OpenAI presumably also has a better relationship with the US government than Anthropic. Therefore, it was a comparatively successful day for OpenAI. Its biggest competitor suffered a major setback.

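Summary: Amazon, Anthropic's largest investor, allegedly jailbroke Claude and tipped off the US government, leading US authorities to treat Anthropic as a supply chain risk. Anthropic may lose permission to distribute to enterprises, and its valuation and IPO face downward pressure. OpenAI is the main beneficiary: the threat of Anthropic beating it to an IPO is removed, and it has the chance to engage US authorities proactively and obtain model approval in advance, gaining an edge in the race.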

Rohan Paul@rohanpaul_ai · June 13 · 51

Higgsfield just announced Higgsfield Games, a prompt-to-multiplayer product that can build and deploy 2D or 3D games with generated characters, props, and settings. Build and deploy multiplayer games from one prompt, in any genre, 2D or 3D. The hard part in any game project was turning an idea into code, assets, physics, multiplayer, and launch, and Higgsfield compresses that into one prompt. Claude Fable 5 reasons through the game idea while Higgsfield MCP calls the tools that build characters, props, environments, and playable structure.

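Summary: Higgsfield recently announced Higgsfield Games, a product that builds and deploys multiplayer games of any genre, 2D or 3D, directly from a single prompt, automatically generating characters, props, and settings. Claude Fable 5 reasons through the game idea, and Higgsfield MCP calls the tools that build the assets and physics logic, compressing the whole pipeline from idea to code, assets, multiplayer, and launch into a single prompt. Users can try it through Claude's MCP interface or the Higgsfield supercomputer.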