Time for Anthropic to rename a slightly worse checkpoint Claude Fable/Mythos 5.1, say it's been nerfed for safety, and roll it back out.

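Ethan Mollick @emollick · June 13 · 50

I don’t think this is going to result in more open weights models. As I wrote before the Anthropic news, if Mythos-level models are considered risky, China will also not want them to be open. And you can’t build a Mythos-class model without a very regulatable compute footprint.

Summary: Ethan Mollick argues that once Mythos-level models are treated as high-risk, China will also regulate their release, and that building such models requires a large, regulatable compute footprint, so continued open-sourcing of Mythos-level models is unrealistic. Open weights will still exist, but only for non-frontier models.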

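Berryxia.AI @berryxia · June 13 · 23

I've been lucky enough to use YouMind since 0.x, and without hype or bashing, its growth and change have been plain to see. The team listens carefully to users' needs and expectations and weighs the trade-offs as a whole. Whatever you build, the product itself should still come first: if the product is bad, no amount of marketing tricks will make the outcome much better. On this front, Y&M has done really well. I hope to see more possibilities in 2.0 and 3.0. Congratulations to YouMind, and congratulations to Yubo (玉伯). 🎉

Summary: Berry Xia looks back on using YouMind since version 0.x, says the product has improved visibly through iteration, and credits the team with listening to users and making sensible trade-offs. He stresses that product should come before marketing tactics, says YouMind does well on this, looks forward to 2.0 and 3.0, and congratulates YouMind and its founder Yubo (玉伯).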

Ethan Mollick @emollick · June 13 · 41

I wrote this a few months ago right after the Anthropic/DoW conflict & Citrini & Block: “But I think that single week is a good illustration of what the near future will feel like… as the stakes go up, it is likely things will feel even more unstable..” https://www.oneusefulthing.org/p/the-shape-of-the-thing

Chubby♨️ @kimmonismus · June 13 · 16

Uff. This meme goes hard.

Translated: Updated the Fable eval scores.

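meng shao @shao__meng · June 13 · 23

Claude Fable 5 may be banned, but that won't stop old-school hand-written programming from leaving the stage of history very soon. Looking back over the past decade or so, I did use a few non-mainstream programming languages: Cobol, Fortran, Flex, Silverlight...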

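Orange AI @oran_ge · June 13 · 73

Something happened at noon today that really shocked me: Fable 5 was suddenly taken down worldwide. I had thought Fable 5 was the best model available to the general public, and I did not expect the US government to require Anthropic to pull it on national security grounds. A friend in a group chat said he would pay $1,000 for an account just to use this model. Unfortunately, this is a full takedown, so you can't buy access even if you want to. This is the first time government power has been used to force a model offline. I knew this day would come; I just didn't expect it so soon. This is what I worry about most, the biggest risk of closed-source models. Intelligence ends up as a rationed good. Tokens are not something you can buy just because you have the money. Open-source models are the world's hope. China's leading open-source models, DeepSeek, Kimi, and GLM, need to keep pushing. Yesterday Kimi quietly released its latest coding model, K2.7 Code. It is one of the best open-source coding models available right now. Cursor's recently popular Composer 2.5 is trained on top of Kimi. Compared with the previous K2.6, the team chose not to chase benchmark scores for K2.7 Code and mainly used the evaluation metrics they rely on internally: coding ability is up 20%, the model's tendency to overthink has finally been addressed, and thinking tokens are down 30%. Pricing: API input 6.5, output 27, cache hit 1.3, slightly up from K2.6. But given the 30% saving on thinking tokens, the actual cost in use is about the same. Today Zhipu also responded to the Fable 5 shutdown by urgently announcing that GLM 5.2 is coming soon. One line in the official announcement resonated deeply: at a moment when some frontier models suddenly become unavailable, frontier intelligence should not belong only to a few, nor should it be revocable at any time by the rules of a few. It should be open, usable, and buildable, and serve every developer.

Summary: Anthropic's Fable 5 has been fully taken down at the US government's request on national security grounds, and users cannot buy access. The author argues that intelligence from closed-source models could become a rationed good and calls for open-source models. Yesterday Kimi released the open-source coding model K2.7 Code: coding ability up 20% over the previous generation, overthinking reduced, thinking tokens down 30%; API pricing is input 6.5, output 27, cache 1.3. In response to the Fable 5 incident, Zhipu urgently announced the upcoming GLM 5.2, saying frontier intelligence should not belong only to a few.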

swyx @swyx · June 13 · 13

how your email finds me (if you're waiting for a decision or reply pls don't take it personally, i'm just in peak crunch mode for aie)

Peter Steinberger 🦞 @steipete · June 13 · 48

I can barely keep up with implementing/testing/landing all the Issues/PRs folks submit to https://github.com/openclaw/crabbox#providers Codex runs INSIDE crabbox while it is building crabbox. This is becoming essential infra for my work. Codex has been looping nonstop for the last 4 days in multiple trees. Since all of it is e2e verifiable it basically builds itself. Codex even signs up for the services automatically via browser/computer use. My main job is adding credit card details and closing things that I don't see as a fit.

Summary: Peter Steinberger describes how he uses Codex on his project crabbox. Codex runs inside crabbox while building crabbox itself, and has been looping nonstop for four days across multiple trees. Because everything is end-to-end verifiable, the project essentially builds itself. Codex also signs up for the required services automatically via browser/computer use. His main job is now adding credit card details and closing things he doesn't see as a fit.

Rohan Paul @rohanpaul_ai · June 13 · 44

Kai-Fu Lee (founder of Sinovation Ventures) explains how the future is all about multi-agent systems. 1 agent today is like a pre-internet PC, useful but isolated. Connect agents, and they share context, split tasks, and coordinate instantly.

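Berryxia.AI @berryxia · June 13 · 60

In some areas AI is still at the "nowhere near it" level! The gaps and the room for progress are huge! AI still can't even grasp a cup correctly: before the hand has really touched it, the cup flies up on its own. In this GeekPark (极客公园) conversation, Professor Huang Biwei (黄碧薇), founder of Aether AI, gave this example: today's video generation models learn the correlation "when a hand approaches a cup, the cup often moves", not the causality of "why it moves, and what will actually happen when I grasp it". In chat, you can just correct a wrong statement, but once you enter the physical world (robots, autonomous driving, biomedicine), one miscalculated variable has real consequences. Hallucination is not so amusing here. So the dividing line for the next generation of AI is not predicting the world more realistically but truly understanding why the world works the way it does. That is what causal world models aim to do: let AI see not just surface appearances but the underlying mechanisms. The benchmark from Professor Huang's team shows that causal structure can raise robot success rates by 25-50% and cut sample requirements by 5-10x. Same pile of data, different structure, and the economics change outright. People used to think that exploiting correlation at scale would carry them all the way; now the physical world has flatly contradicted that playbook. Real intelligence has to evolve from "knowing what" to "knowing why".

Summary: Current video generation models learn only the correlation "hand approaches → cup moves" rather than the causal mechanism, so when grasping a cup the cup flies up too early. Professor Huang Biwei (黄碧薇), founder of Aether AI, proposes Causal World Models, which aim to have AI understand the mechanisms of the physical world rather than only predict appearances. The team's benchmark shows that introducing causal structure can raise robot success rates by 25-50% and cut sample requirements by 5-10x. This signals that the next generation of AI must evolve from "knowing what" to "knowing why", especially in real physical settings such as robotics and autonomous driving.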

Logan Kilpatrick @OfficialLoganK · June 13 · 27

Ilya was right and predicted much of this

elvis @omarsar0 · June 13 · 23

Open source AI must win!

Nathan Lambert @natolambert · June 13 · 24

This is so sad. I'm doomscrolling and everyone agrees it's horrible. So many people just want to build strong AI and safely deploy it. The government should facilitate this, not axe it. I'm going to get some rest and hopefully can resume this goal tomorrow. Thanks all.

Emad @EMostaque · June 13 · 44

So @Anthropic is about to learn the @SpaceX ITAR/EAR lessons. It will be very hard for non-nationals to work there and at @OpenAI on frontier models. Suppose AGI is the ultimate dual-purpose technology.

小互 @xiaohu · June 13 · 22

Good news: Claude reset everyone's usage, go take a look. Bad news: mine was due to reset today anyway, damn it.

Nathan Lambert @natolambert · June 13 · 16

Not even much to say, I think the government way overstepped but we’ll see if they can substantiate the evidence (in which case Anthropic would tell us). Anthropic’s messaging was pushing government action, but this is insane and a bad action by USG for the AI trajectory.

Nathan Lambert @natolambert · June 13 · 45

A good time to remind people that in my time doing LLM research I feel like a minority of my colleagues are American citizens. It would be industry-destroying to have to rebuild with segregation for frontier AI research to be legal.

fofr @fofrAI · June 13 · 18

Yeah I'm going to have fun with this.

Translated: I'm trying out an agentic flow that combines Hyperframes with Gemini video analysis to make fun annotated videos. Yeah, this is going to be fun.

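Peter Steinberger 🦞 @steipete · June 13 · 52

IMO something that is a bit overlooked but will become far more important in the future: GPT is 10-20x more token+cost effective for ~similar outcome.

Summary: Peter Steinberger points out that GPT is 10-20x more efficient than Fable in token consumption and cost while reaching similar results. A comparison test by @thorstenball backs this up: asked to build the same multi-surface feature set (a CLI, a web server, and so on), deep^2 cost $20 (it did not pass on the first try but was fixable), while Fable ran for 1 hour 40 minutes and cost $350 (it succeeded on the first try). After follow-up requests, Fable's total reached $457 while deep^2 was expected to cost at most $40, a gap of about 17x.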
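As a quick check on the cost gap quoted above, here is a minimal Python sketch that recomputes the ratios from the dollar figures in the summary. The function name `cost_ratio` is a made-up helper, and the only inputs are the four amounts given in the post ($350, $20, $457, $40). Under those figures, the "about 17x" gap appears to match the first-run costs, while the totals after follow-ups give a lower bound closer to 11x, since $40 is stated as a maximum.

```python
def cost_ratio(expensive_usd: float, cheap_usd: float) -> float:
    """Return how many times more the expensive run cost than the cheap one."""
    if cheap_usd <= 0:
        raise ValueError("cheap_usd must be positive")
    return expensive_usd / cheap_usd


# Figures quoted in the summary above (USD).
first_run = cost_ratio(350, 20)        # Fable first run vs deep^2 first run
with_followups = cost_ratio(457, 40)   # Fable total vs deep^2 upper estimate

print(f"first run: {first_run:.1f}x")
print(f"after follow-ups: at least {with_followups:.1f}x")
```

Both ratios fall inside or near the 10-20x range from the original post.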

Chubby♨️ @kimmonismus · June 13 · 49

I had already wondered how Apple manages to perform inference at Google while simultaneously protecting their privacy, essentially their unique selling point. The answer: the heaviest requests run on Blackwell B200s inside Google Cloud, with NVIDIA's Confidential Computing encrypting the data while it's processed, so neither Google nor Apple can see it. "NVIDIA Confidential Computing provides a hardware-based security layer for accelerated AI workloads. The technology protects data while it’s being processed by isolating workloads in trusted execution environments and enabling systems to cryptographically verify that the infrastructure has not been tampered with before any sensitive data is sent to the server."

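Summary: Chubby (@kimmonismus) explains how Apple protects privacy while running inference on Google Cloud: the heaviest requests run on Blackwell B200s in Google Cloud, where NVIDIA Confidential Computing provides a hardware-based security layer that isolates workloads in trusted execution environments and encrypts data while it is processed, so that neither Google nor Apple can see it.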

elvis @omarsar0 · June 13 · 69

How to effectively run autonomous long-running coding agents? This is one of the most exciting discussions on agents I've ever had. I recorded it and am making it freely available. (bookmark it) The idea of autonomous long-running agents is a real thing. We talk about lots of things like /goal, /loop, and dynamic workflows, and what comes next. One interesting discussion was around how to make the agent run for longer while ensuring it stays on track. Most models today will struggle to coordinate work effectively. They sometimes pause the work early. Lots of mistakes happen, and lots of weird shortcuts (reward hacking). What helps is to be extremely clear about the goals it needs to achieve. To clarify the dos and don'ts clearly. Eliminate any assumptions you think the model would make. Deep expertise matters so much in this. But you can get far through careful planning. My formula currently is to use Opus 4.8 for planning carefully and GPT-5.5 for all executions. For the evaluator (via /goal), I am often using something like Deepseek or the latest models from Qwen, Kimi, and MiniMax, etc. Another insight we discussed to enforce goals is to provide strong visual cues for the agent to compare with. I found that a multimodal goal is a much stronger goal than a plain text one. And use agents to help you set clear goals. Watch here: https://academy.dair.ai/events/cmplo7v3b000e04l1pxprat4d

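Summary: Elvis Saravia, founder of DAIR.AI, shares how to run long-running autonomous coding agents effectively. He notes that most current models struggle to coordinate work: they pause early, make mistakes, or take shortcuts (reward hacking). The key is to state goals clearly and eliminate assumptions so the model doesn't have to infer on its own. His current formula: Opus 4.8 for careful planning, GPT-5.5 for all execution, and Deepseek or the latest models from Qwen, Kimi, MiniMax, and others for the evaluator (via /goal). Another key insight is that multimodal visual cues make a stronger goal than plain text and constrain the agent better. The full discussion has been recorded and made freely available.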

Chubby♨️ @kimmonismus · June 13 · 65

Google DeepMind published a 60-page paper mapping the road from AGI to superintelligence, written by Hutter, Legg, and Genewein. No hype, just a sober analysis.
The paper uses three levels. AGI = roughly average human performance across most cognitive tasks. ASI = a system that beats large, well-coordinated groups of human experts across virtually everything (their bar: tens of thousands of experts working ten years on one problem). Universal AI / AIXI = the theoretical ceiling, uncomputable, only approachable from below.
Then they explore the question of how this could be achieved:
Scaling compute, models, and data, the continuation of the trend that drove the breakthrough so far. It is the only path with historical data available for extrapolation. The core question: Does quantity transform into quality? Even if individual models plateau, the sheer act of running millions of faster AGI instances could trigger the leap. (A quick aside: that is a fascinating philosophical idea. It always reminds me of Hegel’s dialectic, the notion that quantity transforms into quality. We ought to start drawing on philosophical theories to make sense of the future.)
Algorithmic paradigm shifts: a genuine break from the transformer pretraining paradigm. New architectures, new learning methods. However, hard to predict by definition.
Recursive self-improvement: AI accelerates AI research, which produces better AI, which accelerates research further.
Multi-agent coordination: superintelligence emerges from large collectives of AGI agents working together, like automated corporations or AI economies. Collective intelligence potentially far exceeding any individual model.
The authors naturally point to what I repeatedly describe as the biggest bottleneck: energy. I recently linked to a few graphs showing, on the one hand, the extent to which energy is already becoming a problem and, on the other, how China dominates the expansion of both nuclear and solar energy in the global race.
But the authors also address a profound shift in the world of work in a post-AGI era. I would say this is a reality we must face. So, it is not just about scaling, but also about whether the underlying conditions - such as energy and hardware - can be effectively established.
Six things that could slow or stop all of this:
The data wall. Quality training data runs out, possibly before the end of this decade.
Resource demand grows too fast. Energy, chips, rare earths, investment. The physical infrastructure can't scale arbitrarily.
The neural paradigm hits a ceiling. Pretrained transformers plus fine-tuning may not be enough to reach AGI, let alone go beyond it.
Research gets harder. Keeping Moore's law going already needs 18x more researchers than in the 1970s. Ideas are genuinely harder to find as fields mature.
The abstraction barrier. Models trained on human concepts may never invent new ones from scratch. Saturating GPQA or SWE-bench shows mastery of what humans already worked out, not the ability to go beyond it. Train only on pre-Newtonian physics and you won't reason your way to relativity.
Deliberate slowdown. Regulation, accidents, public backlash. Real, but likely countered by the competitive pressure between companies and nations.
I think it’s great that Google is addressing questions such as which paths they believe lead to AGI, what the road to ASI might look like, what challenges will arise, and much more. Overall, however, it sounds to me like all of this could actually succeed, making it, in that sense, a call to discuss and reflect on the consequences.

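Summary: Google DeepMind published a 60-page paper by Hutter, Legg, and Genewein that defines three levels: AGI (average human performance on most cognitive tasks), ASI (surpassing large groups of collaborating experts), and the uncomputable AIXI. Paths to get there include scaling, algorithmic breakthroughs, recursive self-improvement, and multi-agent coordination, with energy and hardware as the bottlenecks. Six obstacles: high-quality data may run out within this decade, resource demand grows too fast, the neural paradigm hits a ceiling, research gets much harder (keeping Moore's law going needs 18x the researchers of the 1970s), models may be unable to create entirely new concepts, and deliberate slowdown. The poster reads it as a call for serious reflection on the consequences of AGI.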

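Jeff Dean @JeffDean · June 13 · 48

Quite interesting thread on capabilities of real biological neurons (spoiler: they're way more capable than classical artificial neurons in a perceptron). Nice work @IdoAizenbud and collaborators!

Summary: According to the thread Jeff Dean shared, new research by Ido Aizenbud and collaborators finds that a single cortical neuron can classify cats versus dogs, recognize spoken words, and solve 10-bit parity, tasks previously thought to require an entire network.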

Emad @EMostaque · June 12 · 38

If you think AI valuations are crazy just wait until SpaceX, OpenAI and Anthropic all are liquid. Hopefully some crazy ideas and impactful ideas get funded, especially as many of the stockholders think AGI is coming so like use it or lose it

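AYi @AYi_AInotes · June 12 · 56

I feel that Garry Tan's post today punctures some of the bubbles and illusions around AI coding. Many people assumed AI coding tools would liberate founders. In reality it is rules, approvals, processes, and hierarchy: the same cage, just built faster. It used to take two engineers two weeks to add a layer of approval, and that cost was itself an immune system, so things that weren't worth it naturally didn't survive. Now AI can build it in an afternoon, and the moment the cost of building drops to zero, complexity starts multiplying without limit. Because the speed of building is the speed of ossification. AI does not really change our mental models; it only amplifies what we already have. Control-oriented teams use it to pile up denser bureaucracy, and creative teams use it to ship more new experiences. In both cases the tool itself takes no side; it is just a mirror with a compiler. So let's not rush to use AI to run old processes faster. We can try using AI to delete the old process entirely and create experiences that never existed before; otherwise we may win on efficiency and lose on direction.

Summary: Garry Tan points out that AI coding tools have not liberated founders; instead they let people build rules, approvals, processes, and hierarchy faster, the same cage assembled more quickly. Adding a layer of approval used to take two weeks, and that cost acted as an immune system; now AI can do it in an afternoon, complexity multiplies without limit, and the speed of building becomes the speed of ossification. AI amplifies existing mental models: control-oriented teams use it to pile up bureaucracy, creative teams use it to create new experiences. The reminder is not to use AI to run old processes faster but to delete the old process entirely and create something unprecedented, or else win on efficiency and lose on direction.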

Rohan Paul @rohanpaul_ai · June 12 · 64

Anthropic's Dario Amodei's new interview: on U.S. military use of Claude. Says “terrible” mistakes may be made. Argues that Anthropic has tried to set limits/"red lines" around how its models can be used, even if doing so risks the company’s future.

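meng shao @shao__meng · June 12 · 31

The recent string of senior leadership changes at Alibaba (Tongyi, DingTalk...) made me think of a question. If someone asks you again, "If a big company like Alibaba also does what your startup is doing, what is your competitive edge?", I would answer: Our edge? We don't do palace intrigue 😂 Alibaba is busy with palace intrigue and has no time for us...

Summary: meng shao notes that the recent frequent leadership changes at Alibaba (Tongyi, DingTalk, and others) prompt a question about how startups differentiate against big-company competition. His answer is that a startup's core edge is that it "doesn't do palace intrigue": internal power struggles drain big companies' energy and leave startups room to go unnoticed. The point rests on Alibaba's actual organizational dynamics rather than abstract discussion.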

Chubby♨️ @kimmonismus · June 12 · 56

Regardless of any political assessment of the war, a highly significant trend is emerging here: wars are increasingly being fought autonomously. I recall my school days, when we debated ethical and moral questions, such as whether it is justifiable to sacrifice several people for the sake of one, or to sacrifice younger people in favor of older ones, and so forth. Everyone is likely familiar with the "Trolley Problem," too. Decisions regarding these questions are increasingly being made by machines. Far be it from me to be a "doomer", not at all. Yet, this is a crucial debate, particularly concerning AI-powered autonomous weapons. Anthropic has stated that it does not want its models used for such purposes. They will likely remain the exception, however. My point is that we are entering an era where the human role as a moral arbiter is shifting; instead, AI models are trained in advance based on moral codes and endowed with underlying value systems. Humans, however, act differently. Even in the military, orders are refused if they are objectionable or violate moral principles. The situation is different with machines. Consequently, we will witness entirely new types of warfare and entirely new ethical and moral debates. For one thing is clear: autonomous weapons will become the standard, not the exception.

Summary: The post argues that, whatever one's political view of the war, a significant trend is emerging: wars are increasingly fought autonomously by machines. Recalling school debates on ethical questions like the trolley problem, the author says such decisions are increasingly made by machines. Anthropic has stated it does not want its models used for autonomous weapons, but it will likely remain the exception. Human soldiers refuse orders that violate moral principles; machines do not. As a result, AI operating on pre-trained value systems will take over the human role of moral arbiter, bringing entirely new forms of warfare and moral debate. Autonomous weapons will become the norm, not the exception.

Rohan Paul @rohanpaul_ai · June 12 · 35

So ex-Google exec @MGawdat correctly predicted last year. "We're going to start to see a trillionaire before 2030. I can guarantee you that someone will be a trillionaire. There will be a new Elon Musk or Larry Ellison that will become a trillionaire because of AI investments, right? And that trillionaire will have so much money to buy everything. There will be robots and AIs doing everything, and humans will have no jobs." --- Video from 'The Diary Of A CEO' YT Channel (link in comment)

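Summary: Former Google executive Mo Gawdat predicted last year that AI investment would produce the first trillionaire before 2030, by which time robots and AI would do everything and humans would have no jobs. The quoted post notes that SpaceX's IPO, raising $75 billion at a $1.77 trillion valuation, made Elon Musk the world's first trillionaire, bearing out the trend.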

Ethan Mollick @emollick · June 12 · 31

Not having access to native imagegen does hold Fable back somewhat. It is really good at making PNGs, etc, but there are lots of areas (including commercially valuable ones like presentations) where having the ability to have multimodal output would be helpful/token efficient.

Ethan Mollick @emollick · June 12 · 38

Are there toolkits (or skillsets) being created specifically for AIs to use for building games? They default to 3js, reinvent how to make sprites from scratch each time, test technical issues but not gameplay loops, etc. It would help to point AIs at some tools to focus them.

Berryxia.AI @berryxia · June 12 · 25

Trae AI is this badass. Can you handle it?

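swyx @swyx · June 12 · 66

## On Loopcraft
One might argue the entire game of the next century is to be able to stack loops as effectively as possible. In the early days of each phase, it will be valuable to know when to go **DOWN** a loop when things go wrong (for reliability)… but it will probably be more valuable to know how to go **UP** a loop as models improve (for leverage). If you don’t figure out how to do this, don’t be salty when you lose to those that do.

Summary: swyx introduces the idea of "Loopcraft", arguing that the core game of the next century is stacking loops effectively. Early on, you need to know how to go down a loop (for reliability when things go wrong); as models improve, it matters more to go up a loop (for leverage). He cites @latentspacepod's "Salty Lesson": in the agent era you should not fix problems by hand but build systems that scale with the number of agents (such as goals and orchestration), an extension of Richard Sutton's "Bitter Lesson" to agents.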

Deedy @deedydas · June 12 · 63

There’s a new phenomenon of small groups of people who are running these small little quant funds driven by AI models who are making fuck-you returns. I’ve personally seen many who are 2x’ing capital in months. Many unsubstantiated rumors also claim SSI is a quant shop too. Well known quant funds have all tested out LLMs for trading. Some claim it doesn’t work. Others, well.. what do we think Jane Street is doing with their huge GPU cluster? On top of that, there’s a ton of people asking Claude / GPT what stocks to buy and/or “vibe code me a trading engine”. Applies to other financial instruments too: derivatives, futures, crypto, and on the less sophisticated side, prediction markets. It begs the question: how does this change how we think about markets? How much retail volume is driven purely by the ripple effects of AI models? Does this completely destroy efficient market hypotheses in favor of a “correlated model hypothesis”? Early theories say:
– Small studies including one by the Fed show destabilizing effects.
– We see amplified concentrated trades into the 20 known names in the market.
– Can leave trading vulnerable to GEO attacks like publishing specific articles to “poison” the model's decision making.
Eventually any alpha generated by the model at a point of time *should* decay over time. New anti-AI trading strategies with custom post trains too. Remember, you need to be able to afford the tokens to participate in this alpha. What does this mean for the future of wealth accumulation? We live in a brave new world.

Summary: Deedy Das observes a new phenomenon: small teams running AI-driven quant funds and doubling capital within months. Rumor has it that SSI is also a quant shop. Well-known quant funds (such as Jane Street) are testing LLM trading on GPU clusters, while many retail traders ask Claude/GPT for stock picks or to "vibe code" a trading engine. This raises questions about market impact: the efficient market hypothesis may give way to a "correlated model hypothesis"; small studies, including one by the Fed, show destabilizing effects; concentrated trades are vulnerable to "poisoning" attacks; model alpha decays over time, prompting anti-AI trading strategies. In the end, participation depends on being able to afford the tokens.

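karminski-牙医 @karminski3 · June 12 · 64

My current feel is that how strong a model really is (programming only) shows up most sharply in code intuition, and that is the hardest part to train. It is built from massive development experience. Take this bug of mine: the generated road network is broken, and even GPT-5.5-pro-xhigh can't fix it. The problem is actually simple. When I tell it the road network is broken, it reasons: your rectangular block has 4 edges, which map to 4 tiles, plus 4 more tiles for the 4 corners, done, so how could it be broken? In reality each edge needs 2 tiles to be fully filled. As long as this built-in "one tile per edge" intuition goes undiscovered, asking the model to fix the bug never works, and slapping it in the face with screenshots via a multimodal model doesn't help either (I strongly suspect the two get mapped together in vector space). You can only find the root cause yourself and work backward to where the model went wrong. I spent about 4 hours on and off on this case until I realized I had to step in myself. So I had it assign an ID to every tile and then asked it directly: how many tiles do you think fit between these two tiles? It gave itself away immediately: it thought filling in one tile was enough... The fix then became almost trivially simple: tell it to apply the rule, how many unit lengths each tile corresponds to, then compute the fill and you're done... Now suppose one model never makes this mistake in the first place, another makes the mistake but fixes it after a few iterations, and the last can never fix it no matter what. Which model would you say is the most capable?

Summary: The author argues that a model's programming ability comes down to "code intuition", which is built from massive development experience and is extremely hard to train. His example is a broken road network bug: GPT-5.5-pro-xhigh wrongly assumed each edge of a rectangle needs only 1 tile when it actually needs 2, and multimodal screenshots could not correct it. After 4 hours, he had the model assign IDs to the tiles and asked how many tiles fit between two of them, which exposed the flaw; once told how many unit lengths each tile covers and to apply the rule, the fix became simple. Models differ: some never make the mistake, some fix it through iteration, and some can never fix it.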
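The fix described above (state how many unit lengths a tile covers, then compute the fill) can be sketched in a few lines of Python. This is an illustrative sketch only, not the author's code: the function names, the integer unit lengths, and the example block whose edges span 2 units with 1-unit tiles are all assumptions chosen to reproduce the "2 tiles per edge" case from the post.

```python
def tiles_to_fill(gap_units: int, tile_units: int) -> int:
    """Number of tiles needed to cover a gap, rounding up (hypothetical helper)."""
    if tile_units <= 0:
        raise ValueError("tile_units must be positive")
    if gap_units < 0:
        raise ValueError("gap_units must be non-negative")
    # Ceiling division on integers, so a partial tile still counts as one tile.
    return -(-gap_units // tile_units)


def perimeter_tiles(width_units: int, height_units: int, tile_units: int) -> int:
    """Road tiles around a rectangular block: 4 corners plus the fill on each edge."""
    horizontal = tiles_to_fill(width_units, tile_units)
    vertical = tiles_to_fill(height_units, tile_units)
    return 4 + 2 * horizontal + 2 * vertical


# Assumed example: each edge spans 2 units between corners, each tile covers 1 unit.
print(tiles_to_fill(2, 1))       # tiles needed between two corner tiles
print(perimeter_tiles(2, 2, 1))  # total tiles for the whole block
```

Under these assumed sizes, the "one tile per edge" intuition would place 8 tiles (4 corners plus 4 edges) where 12 are needed, which is the kind of gap that leaves a road network broken.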

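向阳乔木 @vista8 · June 12 · 37

The slides for my last few talks were all made with Youmind. Among my friends, Yubo (玉伯) is an outlier: he keeps thinking independently and always has a different perspective. Congratulations to Youmind; it has already been two years, and time flies. Quite a few people feel Yubo comes across very differently offline and online. He is someone who makes authenticity his principle and is candid to a frightening degree; CEOs like this are far too rare.

Summary: Vista says he has made several recent presentations with Youmind and congratulates Youmind on its two-year anniversary. He describes founder Yubo (玉伯) as an "outlier" among his friends who keeps thinking independently and comes across very differently online and offline. Yubo takes authenticity as a principle and is candid to a "frightening" degree; CEOs like this are very rare.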

swyx @swyx · June 12 · 46

the #1 thing that is driving me to build my own vibecoding platform rn is that none of them - and i love vercel, cloudflare, netlify etc - none of them really close the loop for you in terms of setting you on the right path with errors and pinging you when shit fails (shit always fails). there's way too much "webmaster" infra to set up for every single project and i just want to do it once and for all. instead i'm being asked to npx posthog wizard here and npx arize skills there, and it all just needs to be swallowed up into One Thing.

Summary: swyx complains that existing platforms such as Vercel, Cloudflare, and Netlify don't really close the loop: when you hit errors or a project fails, they don't steer you back onto the right path or notify you. On top of that, every project needs the same "webmaster" infrastructure set up again, such as running npx posthog wizard and npx arize skills. swyx is tired of this piecemeal setup and wants everything folded into one platform, done once.

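歸藏(guizang.ai) @op7418 · June 12 · 68

A 10,000-character long read: my view of Skills after building a few hit Skills. After recently making several Skills that spread fairly well, my understanding of Skills has shifted somewhat. This article is my most systematic retrospective on Skills so far. I wrote about why an Agent is not a chat box, why Agents amplify the gaps in people's abilities, and why Skills may be the key middle layer that lets ordinary users really make good use of Agents. I also wrote about how a good Skill should be designed, maintained, and distributed, why a Skill ecosystem can't just be a list of repositories, and how content, product, cases, and feedback can form a continuously iterating flywheel. This is not a concept explainer, nor a retelling of other people's views; it is mostly the judgment I've distilled after building a batch of real cases. If you are working on Agents, AI tools, plugin ecosystems, or content products, or want to turn your professional experience into reusable capabilities, this article should be of some use as a reference.

Summary: @op7418's long-form retrospective on building hit Skills. Core points: an Agent is not a chat box and amplifies gaps in people's ability; a Skill is the key middle layer that lets ordinary users make good use of Agents. A good Skill needs design, maintenance, and distribution; the ecosystem cannot just be a list of repositories and needs content, product, cases, and feedback to form an iterating flywheel. Based on real cases.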