We've reached an agreement to acquire @ona_hq. Its secure cloud execution technology will help Codex take on longer-runn...
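To address the question of how to write Goal instructions for Codex, the author released a Skill that automatically turns a one-sentence requirement into a goal, enabling a workflow of "write the instruction before bed, let the model develop on its own, collect the results the next day." Install command: npx skills add joeseesun/qiaomu-goal-meta-skill. The source code is free and open (see the comments), and the Skill is meant to spare users from reading a 40,000-character document.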
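Anthropic has told investors it is about to post its first profitable quarter, with revenue doubling to about $10.9 billion. OpenAI expects losses of billions of dollars in 2026 and is considering further price cuts to stop enterprise customers from switching to Claude. A SemiAnalysis analysis shows that the $200 ChatGPT Pro subscription can consume about $14,000 of API-equivalent tokens per month, while the same-priced Claude Max plan tops out at about $8,000. The company with the heaviest losses is being forced to cut prices to compete, while the company nearing profitability is setting the industry's pricing standard.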
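To make the scale of the subsidy concrete, the short sketch below divides the API-equivalent usage figures quoted above by the $200 plan price. The dollar amounts are the SemiAnalysis estimates as reported in this item; the variable and function names are illustrative assumptions, not anything published by either company.

```python
# Figures quoted in the item above (SemiAnalysis estimates, monthly).
# All names here are hypothetical and chosen only for illustration.
PLAN_PRICE_USD = 200

api_equivalent_usd = {
    "ChatGPT Pro": 14_000,
    "Claude Max": 8_000,
}


def subsidy_multiple(api_value_usd: float, price_usd: float) -> float:
    """Return the API-equivalent usage obtained per subscription dollar."""
    return api_value_usd / price_usd


multiples = {
    plan: subsidy_multiple(value, PLAN_PRICE_USD)
    for plan, value in api_equivalent_usd.items()
}

for plan, multiple in multiples.items():
    print(f"{plan}: about {multiple:.0f}x the subscription price")
```

On these reported numbers, the ChatGPT Pro plan works out to roughly 70x its price in API-equivalent usage and the Claude Max plan to roughly 40x.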
Subscription plans are massively subsidized. And by massively, I mean absurdly: Claude Max 20x: $200/month, with usage r...
Recently, we purchased one of each Anthropic/OpenAI subscription plan and randomly ran long horizon coding tasks until w...
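The tweet says Codex's Goal instruction is powerful: one website development task has been running continuously for 10 hours, with the AI handling development, testing, deployment, and launch on its own while the features keep improving. The AI news RSS subscription site the author previewed earlier is now open for trial at https://rss.qiaomu.ai/.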
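After Claude Fable 5 launched yesterday, the pressure shifted to OpenAI. OpenAI is considering steep price cuts to win more users from archrival Anthropic. It looks hard to catch up; Claude probably has a firm seat on the iron throne of global large models.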
OpenAI is considering drastic price cuts as it seeks to win over customers from archrival Anthropic https://on.wsj.com/4...
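You can't expect one model to be the strongest at everything. To use AI well you have to act like a player: love many models, find what each one is good at, take the best of each, and combine them. Opus 4.8 isn't great at writing, but it is much better than GPT-5.5 at UI design and UI implementation. I recommend using Claude more...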
Usage share of OpenAI grew vs Anthropic yesterday despite Mythos 5 / Fable 5 launch Multiple power users at SemiAnalysis...
We're making a small update to the model picker in ChatGPT! We know it's critical to a lot of people's work, and that we...
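A user found that Codex's Goal instruction works effectively even without a precise, measurable goal. After the goal was set to "iteratively optimize the website to make it more polished and easier to use," the first version was generated by Claude Fable 5 and later iterations were handed to Codex, which added several features within 6 hours of running. The user expects to open-source an online AI news RSS subscription site next week. It supports automatic content updates, AI transcription, and side-by-side bilingual reading; users can configure a large model for AI chat and translation, and all translations and human commentary will accumulate as shared assets. The site is already live but needs optimization, and closed-beta invitations are open.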
Career update: I've joined @OpenAI to lead Cyber with @michaelaiello. Why I joined, and what we'll be building: It's cle...
New: We got the memo OpenAI CEO Sam Altman and chief scientist Jakub Pachocki sent to staff earlier this week on IPO tim...
Imagine the alternate reality where we named GPT-5.4-Pro something like Fable.
Fable 5 is state-of-the-art on nearly all tested benchmarks, with exceptional performance in software engineering, knowl...
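Anthropic's Claude 5 Fable (codename Mythos) reaches SOTA on nearly all AI capability benchmarks, with an especially pronounced advantage on long, complex tasks. The model is more token-efficient and can stay focused through long tasks spanning millions of tokens. In early testing at Stripe, Fable 5 compressed the migration of a 50-million-line Ruby codebase into a single day, work that would take a human team more than two months. Gemini 3.5 Pro and GPT-5.6 are close to release (GPT-5.6 as early as next week) and now face pressure. The launch boosts Anthropic's upcoming IPO by showing it can still make large leaps in both performance and efficiency.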
Claude 5 Fable tl;dr - It is state-of-the-art on nearly all tested benchmarks of AI capability, showing exceptional perf...
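Mythos is now officially live on the FrontierCode benchmark, which is designed to measure AI's ability to produce maintainable code. The benchmark contains tasks backed by more than 1,000 hours of maintainer validation and introduces 3,000+ scoring rubrics to guard against reward hacking. On the hardest tier, FC Diamond, Opus 4.8 scores only 13.8%, and neither Opus 4.8 nor GPT 5.5 improves as effort is scaled up. Post-training for Mythos/Fable directs test-time compute toward long tasks lasting hours. The benchmark is already live on Devin, with ACU cost at only 1.4x. The easiest third of FC Extended tasks was rapidly solved in late 2025, with Opus rising from 41% to 74%, marking a new era of AI coding focused on "maintaining readable code."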
It's finally out!!! @METR_Evals found that more than half of SWEBench results is unmergeable slop. FrontierCode represen...
http://x.com/i/article/2057694226981257216
http://x.com/i/article/2059815427484655622
BREAKING: WSJ reports OpenAI just made its first formal move toward IPO. it has confidentially filed draft paperwork for...