The head of WhatsApp and CRED founder Kunal Shah (@kunalb11) on how India's BPO sector is standing at the edge of complete disruption, because the work that once came to India for cost efficiency can now be done by AI agents: "A lot of the jobs that were outsourced to India are actually significantly more likely to get impacted. The word outsource will get replaced to agents. Outsource will get replaced to AI. Outsource will get replaced to robots." "Banks today, or financial services, form 30 to 40% of India's market cap. In a bank, a lot of stability comes from lending, and from lending, it comes from IT-BPO jobs, which would form 30 to 40% of a bank's book. Even if 10 to 20% of India's BPO jobs get impacted, the safest part of those banks' book starts getting negatively impacted." ---- From "Thrive by Groww" YouTube channel (link in comment)

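Summary: Per the head of WhatsApp and CRED founder Kunal Shah, AI agents are disrupting India's BPO industry wholesale: jobs once outsourced to India for cost reasons can now be done by AI agents, and "outsourcing" will become "AI agents". Financial services make up 30-40% of India's market cap, and IT-BPO jobs account for 30-40% of a bank's book, so even if only 10-20% of BPO jobs are hit, the safest part of the banks' assets suffers. Vinod Khosla has likewise warned that traditional IT services and BPO "will be gone", though India can still win if it shifts to deploying AI.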
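
As a rough worked example of the exposure argument quoted above, the sketch below multiplies the two quoted ranges. It assumes, purely for illustration, that the share of BPO jobs impacted maps one-to-one onto the share of IT-BPO-linked lending affected; the quote does not state this, and the function name `exposed_share` is hypothetical.

```python
def exposed_share(book_share: float, impacted_share: float) -> float:
    """Fraction of a bank's total book touched, under a one-to-one mapping (assumption)."""
    return book_share * impacted_share


# Quoted ranges: IT-BPO-linked lending is 30-40% of the book, 10-20% of jobs impacted.
low = exposed_share(0.30, 0.10)
high = exposed_share(0.40, 0.20)

print(f"Exposed share of the book: {low:.0%} to {high:.0%}")
```

Under these assumptions, roughly 3% to 8% of a bank's book would be touched, concentrated in the segment the quote describes as the safest.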

Emad @EMostaque · 6 days ago · 37

What if the Great Filter is government bureaucracy The Dark Forest is export license paperwork

Rohan Paul @rohanpaul_ai · 6 days ago · 44

Vinod Khosla's warning for India's BPO in the age of AI: The traditional IT services and BPO business "will be gone". But India can still win if it shifts to deploying AI. ---- From "SparX by Mukesh Bansal" YouTube channel (link in comment)

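Berryxia.AI @berryxia · 6 days ago · 53

OpenAI has launched Daybreak, a frontier AI system built specifically for cybersecurity defenders. It brings together OpenAI's strongest models, Codex, and security partners, with the goal of helping defenders find and fix vulnerabilities faster, work through security backlogs, and automate detection, validation, and response. Put simply, it aims to let security teams move at the pace of attackers. This is an important cybersecurity play for OpenAI, applying agentic capabilities directly to real, high-stakes scenarios. They later strengthened these capabilities further in GPT-5.6 Sol. Interestingly, looking back at this project alongside the recent news of GPT-5.6's government-controlled limited preview, OpenAI seems increasingly inclined to serve vetted partners and enterprises first with its security-related frontier capabilities, rather than opening them up to everyone. https://x.com/OpenAI/status/2053939702110269822/video/1

Summary: OpenAI released Daybreak, which combines its strongest models, Codex, and security partners to help defenders find and fix vulnerabilities faster, clear security backlogs, and automate detection and response. The capabilities were later strengthened in GPT-5.6 Sol. Taken together with GPT-5.6's controlled preview, OpenAI appears to favor serving partners first over a fully open release.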

gabriel @gabriel1 · 6 days ago · 36

america banning ai models internationally making everyone else 40% less economically productive while the EU is still debating if DALLE-2 is ISO 335 compliant at this point im not surprised if USA would 10x gdp without EU noticing

Nathan Lambert @natolambert · 6 days ago · 43

I get feedback a lot that is like "your book should be the RL for LLMs book" or "the post-training book" and it's definitely true those would sell more copies. The reality is that this book was in many ways a side project, and by the time I realized I agreed with a bit of this I didn't have the time for *another* refactor. At the end of the day, I still dumped as much knowledge as I could from what I was doing into the book, and now the course and the code. In its spirit the book is totally a post-training book. The process to change this would've delayed the book by anywhere from 3 to 15 months. It is simply an amount of time I didn't have with Interconnects, Olmo, and other life necessities. So this isn't to say that I'll never do it. Re-prints and new versions are a common thing. It's doable for me to refactor most of the chapters, re-write the introduction, and make it a post-training centric book. Still, RLHF as a topic deserves a dedicated text and is far from solved. It's a technology that skyrocketed language models to prominence and points to a lot of fundamental problems interfacing the user and the AI. Much of the content that got me to where I am today in my career is by diving into caring about this interface, so I'm happy for it to have the space to live, breathe and thrive. So in reality, I probably could've hot-swapped the title to sell more copies, but it would have made me feel dishonest to do so. For anyone wanting to learn post-training, there's nothing in this book that doesn't apply to you -- post-training is just constantly evolving and growing in complexity. A final nitpick is that RLHF actually matches my more conceptual, intuitive vibe a good amount. Post-training is far more practical, in a data and systems sense, where this is more of a math & intuition book. Anyways, the RLHF "post-training" Book is coming soon and thank you for trusting me with your attention. 🩵

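Summary: Nathan Lambert responds to suggestions that his book "RLHF: Reinforcement Learning from Human Feedback" would sell better if retitled as a post-training book. He agrees the content is, in spirit, a post-training book, but the refactor would have delayed it by 3 to 15 months, time he did not have. He argues RLHF is far from solved and deserves a dedicated text; the book leans toward math and intuition, while post-training is more about data and systems. He kept the original title because changing it would have felt dishonest, and says the RLHF "post-training" book is coming soon.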

Elon Musk @elonmusk · 6 days ago · 21

Grok is balanced

Summary: Grok is balanced. (The quoted tweet says, in effect, that this is about as surprising as the sun rising in the east.)

Chubby♨️ @kimmonismus · 6 days ago · 36

Honestly, I no longer believe that people outside the U.S. will still have access to frontier models, and even there, access will be limited. We are now witnessing the end of public access to frontier intelligence. It is a very sad and serious turn of events.

Yuchen Jin @Yuchenj_UW · 6 days ago · 32

The biggest baller move Sam could make right now is to open source GPT-5.6 on Huggingface and declare that OpenAI’s original mission has been achieved.

gabriel @gabriel1 · 6 days ago · 22

it's just easier to describe the outcome you want than to do the work yourself all computer work will be AI next year. the only reason it's not here this year is that the interface doesn't exist and we need to culturally update give me another 2 months

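Orange AI @oran_ge · 6 days ago · 62

A few recent counterintuitive observations about models: 1. GLM 5.2 is displacing Claude Sonnet and Opus as paying users' favorite model. 2. DeepSeek v4 Pro is still the most popular model among general users. 3. GPT 5.5 is very capable, but almost nobody uses it. These observations are based on cola's token consumption statistics, which also suggests that cola users and Codex users (GPT 5.5) have completely different profiles.

Summary: The tweet shares three counterintuitive observations: GLM 5.2 is replacing Claude Sonnet and Opus as the favorite of paying users; DeepSeek v4 Pro remains the most popular model with the general public; GPT 5.5 is strong but barely used. The data comes from cola's token consumption statistics, which indirectly shows that cola users and Codex (GPT 5.5) users have completely different profiles.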

jason @jxnlco · 6 days ago · 19

We got a guy named Ferrari on the inference team. We can't lose.

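Summary: The quoted tweet marvels at how remarkably token-efficient GPT-5.6 is. The main tweet replies that the inference team has a guy named "Ferrari", so they can't lose.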

elvis @omarsar0 · 6 days ago · 32

Dynamic workflows (generating harnesses on the fly) are a new form of test-time compute. But LLMs aren't great at building them. I often have to steer agents to generate complex patterns. Curious how effective Mythos/GPT-5.6 is at dynamically generating complex workflows.

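宝玉 @dotey · 6 days ago · 71

OpenAI today (June 26) released its next-generation model family, GPT-5.6, in three versions: the flagship Sol, the everyday Terra, and the budget Luna. The most notable part of this news is not the models themselves but how they are being released: at the request of the US government, GPT-5.6 is currently open only to about 20 government-approved partners, and ordinary developers and ChatGPT users cannot use it yet. GPT-5.6 uses a new naming scheme: the number denotes the generation, while Sol, Terra, and Luna denote three fixed capability tiers, inspired by the Sun, Earth, and Moon. Sol is the strongest flagship, Terra performs close to the previous-generation GPT-5.5 at half the price, and Luna is positioned as cheap and fast. Sol adds two modes: max mode lets the model spend longer on deep reasoning, while ultra mode calls multiple sub-agents to work on complex tasks in parallel, effectively one AI splitting up the work for a group of AIs. On Terminal-Bench 2.1 (a coding benchmark that tests command-line workflows), as published by OpenAI, Sol Ultra scored 91.9%, Sol 88.8%, Claude Mythos 5 88%, and Google Gemini 3.1 Pro Preview 70.7%. On cybersecurity, Sol matched Mythos Preview on ExploitBench using roughly one third of the tokens. API pricing per million tokens: Sol is $5 input and $30 output; Terra is $2.5 and $15; Luna is $1 and $6. A Cerebras hardware-accelerated version is due in July, with inference speeds of up to 750 tokens per second. OpenAI devoted a lot of space to safety this time. It spent more than 700,000 A100-equivalent GPU hours on automated red-teaming, looking specifically for jailbreaks that generalize across scenarios. The model has built-in refusal mechanisms, and real-time classifiers check for cybersecurity and biological misuse during generation; suspicious outputs are paused and handed to a larger reasoning model for review. Under OpenAI's own Preparedness Framework, Sol's cybersecurity capability is rated "High" but does not reach "Critical". It can find browser vulnerabilities and exploit primitives (the basic building blocks of an attack), but under test conditions it could not complete a full attack chain autonomously. OpenAI reads this as a positive signal: the model is better at helping defenders find and patch holes than at helping attackers do damage. Whether that judgment holds up in the real world is exactly what the preview period is meant to answer. If you are an API user, the most practical near-term change is Terra's price-performance: performance close to GPT-5.5 at half the price is worth attention for teams running large volumes of inference. Luna suits high-throughput scenarios that are extremely cost-sensitive. If Sol's ultra mode really runs reliably, complex multi-step tasks can be handed to the model to decompose, delegate, and aggregate on its own, and developers will not need to build their own agent orchestration frameworks. This points in the same direction as Anthropic's agent capabilities in Claude and Cursor's background agents in the IDE: all are competing for the position of "AI managing AI". For now, though, most people cannot use it. OpenAI says it will broaden access within a few weeks, and Axios reports that more customers will be added next week. There is no clear timeline for when ChatGPT users will get access. Full report: https://openai.com/index/previewing-gpt-5-6-sol/

Summary: On June 26, OpenAI released the GPT-5.6 family: flagship Sol, everyday Terra, and budget Luna. Terra performs close to GPT-5.5 at half the price; Sol adds a max deep-reasoning mode and an ultra multi-agent parallel mode. On Terminal-Bench 2.1, Sol Ultra scored 91.9%, ahead of Claude Mythos 5 (88%) and Gemini 3.1 Pro Preview (70.7%). API pricing per million tokens: Sol $5 input / $30 output; Terra $2.5 / $15; Luna $1 / $6. A Cerebras-accelerated version is due in July. At the request of the US government, access is currently limited to about 20 approved partners, and ordinary developers and ChatGPT users cannot use it yet. OpenAI says it will broaden access within a few weeks.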
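
To make the pricing comparison above concrete, here is a small sketch that computes the cost of one workload on each tier from the quoted per-million-token prices. The tier keys, the `request_cost` helper, and the example token counts are illustrative assumptions, not part of any OpenAI SDK.

```python
# (input, output) price in USD per million tokens, as quoted in the post.
PRICES_PER_MILLION = {
    "sol": (5.0, 30.0),
    "terra": (2.5, 15.0),
    "luna": (1.0, 6.0),
}


def request_cost(tier: str, input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of a workload on the given tier."""
    in_price, out_price = PRICES_PER_MILLION[tier]
    return (input_tokens * in_price + output_tokens * out_price) / 1_000_000


# Hypothetical workload: 2M input tokens and 500k output tokens.
for tier in PRICES_PER_MILLION:
    print(f"{tier}: ${request_cost(tier, 2_000_000, 500_000):.2f}")
```

For this made-up workload the tiers come out at $25.00, $12.50, and $5.00, which is the "half the price" relationship between Sol and Terra that the post highlights.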

Nathan Lambert @natolambert · 6 days ago · 42

There's a lot of sloppy thinking around open models. You can ban them and make it impossible for US companies to use them, but this won't stop A) global open model progress B) bad actors using them So what exactly is gained by banning open models, including those from China?

elvis @omarsar0 · 6 days ago · 56

Great to see the new GPT-5.6 models finally announced. Sad to see this new release strategy where only a select few get access initially. Not a win for our industry IMO. Open-source AI must win!

Deedy @deedydas · 6 days ago · 60

We hosted an intimate event on Agentic Engineering in SF with speakers at the forefront of AI yesterday. Three big lessons I took away: – @steipete: I now force contributors to OpenClaw to use a skill that pushes their prompt history of the code change to find signal in noise, to avoid often bad PRs that are 10,000 lines from a prompt "fix this" – @trq212: I used Claude to be a video editor to create a launch video with visuals, while having it interactively teach me about color grading as it did the edits. I didn't even know it could do that! Getting the most out of a model is finding your unknown unknowns. – @georgepickett: I spend a lot more human energy on crafting a plan upfront and getting all my clarifications answered upfront before leaving Codex to spin for days, armed with Ousterhout's coding principles as a skill, on a well-crafted /goal We had about 30-odd people including some recognizable names like Theo (@theo), Gergely (@GergelyOrosz), Andy (@andykonwinski), Jerry (@MillionInt), Dave Morin (@davemorin), Patrick Hsu (@pdhsu), Eric (@ericho), Bucky (@buckymoore), Joff (@mejoff) with a surprise visit from cricketer Robin Uthappa (@robbieuthappa) We were graciously hosted by @timshi_ai at his house and cohosted with @GregKamradt. Videos will be up soon! If you're interested in coming to these, give me a shout in comments or in DM. (also incredible to see how huge the ClawFather is in the flesh)

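Summary: A small Agentic Engineering event was held in San Francisco yesterday, with three speakers sharing key lessons: @steipete forces OpenClaw contributors to use a skill that pushes the prompt history behind a code change, to filter noise and avoid low-quality PRs; @trq212 used Claude as a video editor to make a launch video while learning color grading from it; @georgepickett puts a lot of effort into a detailed plan before letting Codex run, with Ousterhout's coding principles as a skill. About 30 people attended, including well-known names such as Theo and Gergely, and videos will be posted soon.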

swyx 🔜 @aiDotEngineer @swyx · 6 days ago · 59

have been testing 5.6 for a while and VERY happy with it. DO NOT view this as just a “cyber” release, it is the new sota workhorse model, completely replacing opus for 80% of tasks for me > GPT‑5.6 Sol is competitive with Mythos Preview using only ~1/3 of the output tokens. this is a very key line. OAI posttraining team has shifted the reasoning pareto frontier by A LOT and they arent saying anything about how they did it because this is the single most important competitive advantage right now in agentic models for enterprise. team really locked in on this one, i honestly wish they just went ahead and called it GPT6 because this minor semver bump is far larger than even the 5.4->5.5 jump which itself was the single most successful openai launch since 4o/o1

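Summary: OpenAI released a limited preview of GPT-5.6 Sol (frontier model), Terra (balanced everyday model), and Luna (fast, low-cost model). After testing, swyx rates it very highly, saying it is not just a "cyber" release but the new SOTA workhorse model, completely replacing Opus for 80% of his tasks. The key data point: Sol is competitive with Mythos Preview using only about 1/3 of the output tokens. swyx notes that the OAI post-training team has shifted the reasoning Pareto frontier by a lot without disclosing how, and that this is now the most important competitive advantage in agentic models for enterprise. He considers this minor version bump far larger than the 5.4->5.5 jump and thinks it should simply have been called GPT-6.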

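AYi @AYi_AInotes · 6 days ago · 49

Here's a blunt truth: for most people, "learning LLMs" really just means learning to use tools someone else built, without ever lifting the hood to look at the engine. The most hardcore thing about Stanford's CS336 is that it rips the hood right off and has you hand-build a complete LLM pipeline from scratch: tokenization, Transformer architecture, GPU optimization, data cleaning, scaling laws, and alignment techniques, with five assignments covering the whole chain. Lectures are secondary; building things yourself is the core. Calling libraries gets you a quick demo, but building by hand is what gives you system intuition. Reading a hundred papers on why FlashAttention is fast leaves less of an impression than implementing it once yourself in Triton. Running someone else's training script ten times teaches you less about the essence of scaling than cleaning dirty data once with your own hands. Many people think it isn't worth the effort and that knowing how to use the tools is enough, without realizing that every ceiling ultimately comes from insufficient understanding of the lower layers: the better you understand each component, the larger your design space at the layers above. Knowledge is never kind: acquiring truly valuable knowledge inevitably involves frustration and time. The information has long been available to everyone; what's missing was never resources, but the discipline to sit down and build it all once. If you want to dig in, start with Assignment 1 and set aside fifteen hours a week; in three months your understanding of LLMs will be on a different level.

Summary: Stanford's CS336 has students implement a complete LLM pipeline from scratch, covering tokenization, Transformer architecture, GPU optimization, data cleaning, scaling laws, and alignment techniques. Five assignments span the whole chain, and the post stresses that building by hand gives more system intuition than calling libraries; implementing FlashAttention in Triton, for example, sticks better than reading papers about it. With about fifteen hours a week for three months, the author expects a systematic understanding of LLM internals. Acquiring knowledge involves frustration, but discipline is what sets people apart.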

Nathan Lambert @natolambert · 6 days ago · 38

A few issues right now 1. Figuring out state capacity for managing frontier capabilities. Dean's stuff is great for this 2. Figuring out how we manage the coming frontier open models 3. Disentangling the distillation accusations/nonsense from the above two

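Summary: Nathan Lambert names three key issues in AI right now: figuring out state capacity for managing frontier capabilities (he recommends Dean Ball's work on this); figuring out how to manage the coming frontier open models; and disentangling the distillation accusations and nonsense from those two. The quoted Dean Ball tweet adds context: within a few weeks, US federal AI policy has moved from remarkably laissez-faire to increasingly heavy-handed and opaque, and Dean analyzes the shift through 35 observations and proposes next steps.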

Yuchen Jin @Yuchenj_UW · 6 days ago · 30

best case: OSS surpasses Mythos, and the gov stops banning GPT-5.6/Fable. worst case: OSS surpasses Mythos, and then decides to stop being open source.

François Chollet @fchollet · 6 days ago · 47

If your benchmark relies on a static dataset or sampling from a static distribution densely known at training time, then it is fundamentally measuring memorization/retrieval. Which might be fine if you're looking for a retrieval benchmark! But don't confuse it with intelligence.

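AYi @AYi_AInotes · 6 days ago · 54

Calling Seedance 2.5's results terrifying is no exaggeration. In this fifteen-second snow leopard shot, every hair of the fur stands up, fine snow grains cling to the tips, and even the cold light in the pupils rises and falls with the animal's breathing. At 4K, its realism grinds the vast majority of AI video on the market to dust. Looking back, OpenAI's decision to shut down Sora seems remarkably clear-headed: it was not that they could not continue, but that squeezing out further incremental updates would no longer yield a generational gap. Seedance is a true generation ahead here, with no second contender anywhere in the world that can compete. This raises the industry baseline for AI video to a height most teams cannot reach; one look tells you the ceiling of the whole field changed today.

Summary: A 15-second snow leopard video generated by Seedance 2.5 reaches 4K quality with highly realistic fur, snow grains, and pupils, far beyond existing AI video models. Set against OpenAI shutting down Sora, the release is a generational lead that lifts the industry baseline beyond the reach of most teams.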

Ethan Mollick @emollick · 6 days ago · 41

It may be obvious, but a lot of the first reaction to the gains in AI capability is going to be "just muddling through" rather than executing on a rational plan. (this is what humans always do in rapidly changing & complex situations, and it certainly seems to be happening here)

elvis @omarsar0 · 6 days ago · 49

One of my best uses of agentic loops has been personal health. I don't talk about it often because it's very personal. But here it goes in the hope it helps someone who's struggling. (I am not doing this for attention. I am writing this as a personal entry to keep as a reminder for the future.) It started with a simple question last year: how can I best be positioned to leverage AGI/ASI in the future? Money wasn't it. Drowning in work/research (my biggest passion) also wasn't it. The obvious answer was prioritizing health. I took a hard look at the mirror, and my physical health was at an all-time low. It was hard to admit it initially because I love the work I do. But it was time to slow down a bit and prioritize my health. It was really tough in the first few months. The AI industry started to move exponentially at the beginning of the year, so that didn't help either. FOMO affects each one of us to some extent, whether we admit it or not. But I eventually convinced myself that it made perfect sense. If I am in top shape/health, I can probably more optimally use AI and contribute to it. It sounds like a contradiction initially, and your mind will remind you of that every day. But, in fact, it was the most optimal solution there was for me at the time. So at the beginning of this year, I started on the personal goal to get back on track with my physical health. I started to consult a physician and started a private ChatGPT session where I logged everything, from conversations to medications. I had to change my diet, significantly reduce the number of hours I worked, change many habits, and increase my sleeping hours. Initially, I mainly used ChatGPT for a second opinion, but it often reminded me to stay on track. That was so important. As the months passed, I became more confident in my health and the advice my physician and ChatGPT were giving me. So I opened up more and began sharing every little detail of how I felt physically. That made the difference. 
I believe that personal health is going to be one of the most profound applications of AI, besides personal tutors (which is what I am working on @dair_ai). After 6 months, I have lost 100 pounds and am feeling great. I sleep better, I eat healthier, do a lot of exercise, devote more time spiritually, spend a lot more time with friends/family, and I feel energetic throughout the day. But I am just getting started. I need to continue working on my personal health. It is now at the center of it all. Without getting into too many details, my physician and ChatGPT saved my life. It's not an exaggeration. This is why I wanted to share this personal experience. I am thankful for all the hard-working people who devote their lives to making this world a better place, and for those who tirelessly work on making human-centered AI. I feel like one of the first lucky beneficiaries of it. This is why I am very optimistic about human potential in the age of AI abundance. And I want to give back in any way I can. The best part is that I am now able to use AI more optimally for my work and help friends and family members to get back on track in terms of health. I know many colleagues who are also struggling with their health. You are not alone. Take the time you need. Get the help you need. Consult a health expert. Use AI to keep you on track. Focus on your health first, and you will be able to more optimally help others. I also want to thank the community we have built here (300K and counting). I feel privileged to be connected to some of the top minds around the world. I feel blessed to be able to share my ideas freely and continue learning from you all.

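Summary: DAIR.AI founder Elvis Saravia shares that last year he asked himself how best to position himself to leverage future AGI/ASI, and the answer was to put health first. At the start of this year he began consulting a physician and opened a private ChatGPT session to log details such as diet and medications, using ChatGPT as a second opinion and a reminder to stay on track. After 6 months he has lost 100 pounds, with improvements in sleep, diet, exercise, and social life, and he feels energetic. He believes personal health is one of the most profound applications of AI and credits his physician and ChatGPT with saving his life. He encourages colleagues to prioritize their health and use AI to help stay on track.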

DogeDesigner @cb_doge · 6 days ago · 49

Chamath Palihapitiya was asked to choose between keeping free shares in OpenAI, Anthropic or SpaceX, on the Axios Show. He chose SpaceX. He said the world’s communications infrastructure is overdue for a major overhaul, Starlink is positioned to capture a huge share of that transition, and what may sound like sci-fi today, building the same kinds of businesses beyond Earth, gives SpaceX enormous long-term optionality.

François Chollet @fchollet · 6 days ago · 48

Autonomy isn't the ability to act without human supervision. It's the ability to *learn* without human bottlenecks in the process. A system that is fully dependent on human training data and RL environments is only an imprint of human knowledge.

Chubby♨️ @kimmonismus · 6 days ago · 55

All hope now rests on open source. I have never been more bullish on DeepSeek, GLM, Qwen, and so many others than after the drama around Fable 5 and now GPT-5.6 - because we will presumably only rarely get access to frontier models, and even then only with difficulty.

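Summary: The author is more bullish on open-source models such as DeepSeek, GLM, and Qwen in light of the troubled GPT-5.6 release. According to Axios, OpenAI proactively engaged the Trump administration before the conflict over Anthropic's Fable 5: the White House previewed the model's capabilities, and Altman discussed it with Commerce Secretary Lutnick, asking for a government review before public release. Altman said the GPT-5.6 arrangement is "not our preferred long-term model", hinting that frontier releases will go through safety review and partner vetting. The author speculates that GPT-5.6 was originally due to launch this Thursday and was delayed by government intervention.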

Rohan Paul @rohanpaul_ai · 6 days ago · 40

"If we could snap our fingers and get a pile of data... we would solve general robotics right now." - Figure CEO Brett Adcock. General robotics is closer than it looks, but data is holding it back.

Alibaba Cloud @alibaba_cloud · 6 days ago · 46

At Flink Forward Asia Shenzhen 2026, Feifei Li, CTO of Alibaba Cloud and President of International Business, shared his perspective on the future of AI: "As the agent era takes off, one concept will dominate—Data Gravity. AI must tackle complex work and, more importantly, create tangible value in real enterprise workflows." AI isn't just about smarter models—it's about solving complex enterprise challenges and delivering real business value. #AlibabaCloud #ApacheFlink #ApachePaimon #ApacheFluss #DataAI #AI #Agent #RealTimeData

Rohan Paul @rohanpaul_ai · 6 days ago · 38

Marc Andreessen on anti-data-center sentiment in the US: "This completely fake meme about water use, which is just factually not true, running wild through the public discussion that somehow these data centers are basically destroying all the water"

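AYi @AYi_AInotes · 6 days ago · 56

The more I look at it, the more I think the maturity of AI tools in 2026 is making cross-domain transfer extremely cheap. On the surface this open-source book on GitHub teaches quant trading, but what it really gives us is a template for using AI to break into any field you know nothing about. Put plainly: get something running first, learn as you go, turn the places where you get stuck into a Spec, and let AI help you break through. Main repo 🔗 http://github.com/xingwudao/xquant-beginner

Summary: The open-source quant book "XQuant：人人都是量化交易员" (XQuant: Everyone Is a Quant Trader) is problem-driven rather than knowledge-driven: each chapter provides a ready-made Spec to hand to Claude or Cursor to generate code, so you get a strategy running first (even a losing one) and fill in theory afterwards. Nine questions tie the quant pipeline together (minimal closed loop, ETF selection, position sizing, buy and sell signals, backtesting, overfitting detection, live trading, and so on), and chapter 1 already has you running a minimal system. The text and the exercise code are maintained separately. The author argues that mature AI tools in 2026 make cross-domain transfer extremely cheap, and that the skill of turning a vague idea into a clear Spec carries over to any complex field.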

Rohan Paul @rohanpaul_ai · 6 days ago · 45

Brett Adcock, CEO of Figure AI: "we're working until midnight every night... we are here every weekend. By end of 2026, we'll be able to put a robot into home and be able to do fairly long horizon work."

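Berryxia.AI @berryxia · 6 days ago · 53

This recent video from Alibaba's Tongyi Lab has been getting a lot of attention! It also has a lot in common with the earlier post about Professor Huang's causal models. Tongyi Lab poses a question: why is AI so strong in the virtual world, yet a robot asked to pick up an egg easily gets stuck? Their new video covers the core difficulty of Embodied Intelligence: for a robot, "thinking clearly" and "acting reliably" are two entirely different things. In the digital world, models can try and fail repeatedly and iterate quickly. In the physical world, sensor noise, actuation latency, environmental change, and physical constraints make every step uncertain. A simple grasp can fail because of tiny differences in lighting, friction, or object shape. This draws a contrast between AI's two worlds today: in the world of language and code, scaling laws are still racing ahead. So there are still plenty of unsolved problems and a long road ahead. The AI era is the real Cambrian explosion.

Summary: The Tongyi Lab video argues that the core difficulty of embodied intelligence is that AI is strong in the virtual world, while physical tasks such as grasping an egg fail easily because of sensor noise and environmental change. The quoted post argues that the bottleneck for Physical AI is not model scale: current VLA/LLM approaches learn only statistical correlations, not causal laws (for example, failing when a table is 2 cm higher). Professor Biwei Huang (黄碧薇) of UCSD presented a Causal World Models framework at CVPR 2026, aiming to move AI from imitation to causal understanding, and announced that Aether AI has raised $20 million, describing it as the world's first causal world model company. The world-model space is hot, but Aether AI competes on causal structure rather than scale.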

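AYi @AYi_AInotes · 6 days ago · 56

A quant book was just open-sourced on GitHub with a somewhat different design. I think what it really teaches is not only quant trading but a badly underrated meta-skill: writing a vague idea down as a clear Spec and then letting AI execute it. That skill works in any complex field; quant trading is just its first practice ground. Most people have the quant learning path backwards. The traditional route: grind through the math first → feel unprepared → never start → give up. This open-source book flips the path: write a Spec first and let AI help you get a strategy running, even a losing one, then fill in theory once it runs. The book is called "XQuant：人人都是量化交易员" (XQuant: Everyone Is a Quant Trader), and its core design comes down to one rule: problem-driven, not knowledge-driven. Nine questions tie together the whole quant pipeline: 1. How does quant make money? (get a minimal closed loop running first) 2. What to buy? (start with 3 ETFs) 3. How much to buy? (3 position-sizing schemes tested) 4. When to buy and sell? (signals, rebalancing, take-profit and stop-loss) 5. How do you know it works? (backtesting framework) 6. How do you avoid fooling yourself? (overfitting detection); this chapter comes very early, which shows the author understands how beginners actually fail 7-9: live execution, continuous improvement, and day-to-day factor research. A few counterintuitive points: • Chapter 1 has you running a strategy right away. Instead of starting with CAPM and Black-Scholes, you build a minimal system that runs, and the feedback and dopamine from seeing it run drive you to keep learning more than any theory does. • The text and the exercise code are maintained separately: the manuscript repo holds the clean text, and the learning repo holds Specs and Jupyter Notebooks, so reading is uninterrupted and hands-on work has a full reference. • Each chapter gives you a ready-made Spec to hand to Claude or Cursor to generate code. What you train is not hand-writing code but the ability to turn a vague strategy idea into a clear task description.

Summary: The open-source quant book "XQuant：人人都是量化交易员" (XQuant: Everyone Is a Quant Trader) takes a problem-driven design: write a Spec, let AI generate code and get a strategy running, then fill in theory. Nine questions tie the quant pipeline together: how quant makes money, what to buy (3 ETFs), how much to buy (3 position-sizing schemes), when to buy and sell, how to backtest, overfitting detection (covered very early, in chapter 6), live trading, improvement, and factor research. The text and the exercise code are maintained separately, and each chapter provides a ready-made Spec for Claude or Cursor, training the ability to turn vague ideas into clear task descriptions.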
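
A minimal closed loop of the kind chapter 1 is described as starting with might look like the sketch below. This is not code from the XQuant book: the prices are made-up numbers, the tickers are placeholders, and equal-weight buy-and-hold is just one simple strategy chosen for illustration.

```python
# Hypothetical daily closing prices for three placeholder ETFs (not real data).
PRICES = {
    "ETF_A": [100.0, 102.0, 101.0, 105.0],
    "ETF_B": [50.0, 49.0, 51.0, 52.0],
    "ETF_C": [20.0, 20.5, 21.0, 20.0],
}


def equal_weight_backtest(prices: dict[str, list[float]], cash: float) -> list[float]:
    """Split cash equally on day 0, hold, and return the daily portfolio value."""
    per_asset = cash / len(prices)
    # Fractional shares are allowed to keep the sketch short.
    shares = {name: per_asset / series[0] for name, series in prices.items()}
    n_days = len(next(iter(prices.values())))
    return [
        sum(shares[name] * prices[name][day] for name in prices)
        for day in range(n_days)
    ]


equity = equal_weight_backtest(PRICES, cash=30_000.0)
total_return = equity[-1] / equity[0] - 1
print(f"Final value: {equity[-1]:.2f}, total return: {total_return:.2%}")
```

A loop this small ignores fees, slippage, and rebalancing, which is roughly what the later questions in the list (signals, backtesting, overfitting detection) would add.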

Ethan Mollick @emollick · 6 days ago · 50

I feel like on X all you hear about is elaborate plans by firms to build their own AI stacks but in my experience companies are full of people who want access to Claude or ChatGPT and are pressuring their purchasing staff to get licenses so they can just use the tools they know.

Rohan Paul @rohanpaul_ai · 7 days ago · 53

This paper pushes back on the habit of calling every capable AI system an “agent” and asks the cleaner question: what makes something an agent in the 1st place? Explains why today’s AI agents are mostly clever tools, not truly independent agents. The problem is that many systems called agents are really advanced workflows around LLMs, not independent actors. Complex behavior is not the same as self-directed behavior. A chess engine can crush a grandmaster without wanting anything, and a browser agent can complete a task without maintaining a durable sense of what it is, what it can do, or why this task matters beyond the current instruction. They can call tools, follow steps, and complete useful tasks, but their goals, roles, limits, and update cycles still mostly come from humans. The paper’s core idea is to separate "agentic AI" from "agentive AI", where agentic means it looks autonomous and agentive means its agency comes from inside the system. The authors propose the Goal-Identity-Configurator model, where an AI keeps long-term goals, updates its sense of itself, predicts possible outcomes, decides how much to think, and learns from real and simulated experience. They do not mainly test a finished system, but build an argument and architecture for what real machine agency would require. ---- Link – arxiv. org/abs/2606.23991 Title: "Critique of Agent Model"

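Summary: The paper questions the habit of calling every capable AI system an "agent", noting that many so-called agents are advanced workflows around LLMs rather than independent actors. Complex behavior is not the same as self-directed behavior. Its core distinction is between "agentic AI" (looks autonomous) and "agentive AI" (agency comes from inside the system), and it proposes the Goal-Identity-Configurator model, in which an AI keeps long-term goals, updates its sense of itself, predicts outcomes, decides how much to think, and learns from real and simulated experience. The paper mainly builds an argument and an architecture and does not test a finished system.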

Ethan Mollick @emollick · 7 days ago · 52

As this post points out, contrary to what many say, the US government could absolutely effectively ban open weights models. That doesn’t mean you won’t be able to download the weights & run them, but they can ensure that no US company would use or provide access or host them

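Summary: Ethan Mollick points out that the US government could absolutely ban open-weights models effectively. A ban would not stop individuals from downloading and running the weights; it would use regulation to ensure that US companies do not use, provide access to, or host unapproved models. The measures described in the quoted post include barring companies from using models the government has not approved, imposing severe criminal penalties for knowingly using unapproved models within the US to harm Americans or property, and requiring US government approval for all models above a given capability threshold. Such a framework would restrict commercial distribution without fully blocking individual use.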

SemiAnalysis @SemiAnalysis_ · 7 days ago · 23

"So you're saying that your SRAM supply is infinite?" "Yes" "But the logic wafers on which the SRAM is fabbed is supply constrained?" "Yes Dave that's right"

Nathan Lambert @natolambert · 7 days ago · 47

Is what happens when the world becomes AGI pilled then both the leading lab and the government tell you you need to bow down if you want access to their models. I feel it too. More of the last few weeks giving people the words to explain how they felt for months.

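Summary: Nathan Lambert comments that once the world became AGI-pilled, both the leading lab and the government began telling users they must "bow down" to get access to their models. He has noticed a clear change over the past few weeks: many large enterprises are looking to secure compute and do in-house post-training on GLM-5.2. The trend suggests open models are winning enterprise trust, and people are starting to understand how open source can win.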