http://x.com/i/article/2061850535708483585

# A harness for every task: dynamic workflows in Claude Code

Last week, we released dynamic workflows in Claude Code. Claude can now write its own harness on the fly, custom-built for the task at hand.

While the default Claude Code harness is built for coding, it is also useful for many other types of tasks because, as it turns out, many tasks resemble coding tasks. But for certain classes of tasks, such as research, security analysis, agent teams, or code review, we have had to build custom harnesses on top of Claude Code to achieve peak performance.

Workflows allow you to dynamically create harnesses that enable Claude to solve all of those problems and more, natively inside Claude Code. You can also share workflows with others and reuse them.

In this article, I'll cover my early experiences with workflows and what I've learned, so you can take full advantage of them. That said, best practices are still developing! Dynamic workflows often use more tokens, so think carefully about when and how to use them.

Note: this post is also available on the Claude Blog.

## Example prompts

Before diving into the technical details, I'd like to start with some example prompts to get you thinking about the possibilities with workflows:

- "This test fails maybe 1 in 50 runs. Set up a workflow to reproduce it, form theories and adversarially test them in worktrees /goal don't stop until one theory works."
- "Using a workflow, go through my last 50 sessions and mine them for corrections I keep making and turn the recurring ones into CLAUDE.md rules"
- "Use a workflow to dig through #incidents in Slack for the past six months and find recurring root causes where nobody has filed a ticket."
- "Take my business plan and run a workflow where different agents tear it apart from an investor's, a customer's, and a competitor's perspective."
- "Here's a folder of 80 resumes, use a workflow to rank them for the backend role and double-check the top ten. Interview me using the AskUserQuestion tool for a rubric."
- "I need a name for this CLI tool. Use a workflow to brainstorm a bunch of options and run a tournament to pick the top 3."
- "Use a workflow to rename our User model to Account everywhere."
- "Go through my blog post draft and using a workflow verify every technical claim against the codebase, I don't want to ship anything wrong."

## How dynamic workflows work

Dynamic workflows execute a JavaScript file with a few special functions that help spawn and coordinate subagents.

Dynamic workflows also have access to standard JavaScript built-ins like JSON, Math, and Array to help process data.

It's particularly useful to know that a dynamic workflow can decide which model each agent uses and whether subagents run in their own worktree, allowing Claude to choose the intelligence level and isolation needed.

If a workflow is interrupted, for example by user action or by quitting the terminal, resuming the session lets the workflow pick up where it left off.

## Why dynamic workflows

When you ask the default Claude Code harness to do a task, it needs to both plan and execute in the same context window. For many coding tasks this is highly effective, but it can break down on long-running, massively parallel, or highly structured adversarial tasks.

This is because the longer Claude works on a complex task in a single context window, the more susceptible it becomes to a few specific failure modes:

- Agentic laziness: Claude stops before finishing a particularly complex, multi-part task and declares the job done after partial progress, for example addressing 20 of the 50 items in a security review.
- Self-preferential bias: Claude tends to prefer its own results or findings, especially when asked to verify or judge them against a rubric.
- Goal drift: fidelity to the original objective is gradually lost across many turns, especially after compaction. Each summarization step is lossy, and details like edge-case requirements or "don't do X" constraints can get lost.

Creating a workflow helps combat these by orchestrating separate Claudes, each with its own context window and a focused, isolated goal.

## Dynamic vs static workflows

You may have previously created a static workflow using the Claude Agent SDK or claude -p to coordinate multiple instances of Claude Code. But because static workflows need to handle every edge case, they are usually more generic. With Claude Opus 4.8 and dynamic workflows, Claude is now intelligent enough to write a custom harness tailor-made for your use case.

# Helpful patterns when using dynamic workflows

You can start using dynamic workflows just by asking Claude to make one, or by using the trigger word "ultracode" to ensure that Claude Code creates a workflow. But building a mental model of how dynamic workflows work will help you understand when to use them and how you might nudge Claude via prompts.

There are a few common patterns that Claude might use and compose when building workflows:

**Classify-and-act**

Use a classifier agent to decide on the type of task, then route to different agents or behavior based on that type. Or use a classifier at the end to determine the output.

**Fan-out-and-synthesize**

Split a task into many smaller steps, run an agent on each step, and then synthesize the results. This is particularly useful when there are a large number of smaller steps, or when each step benefits from its own clean context window so that steps don't interfere with or cross-contaminate one another. The synthesize step is a barrier: it waits for all the fan-out agents, then merges their structured outputs into one result.

**Adversarial verification**

For each spawned agent, run a separate spawned agent to adversarially verify its output against a rubric or criteria.
**Generate-and-filter**

Generate a number of ideas on a topic, filter them by a rubric or by verification, deduplicate them, and return only the highest-quality, tested ideas.

**Tournament**

Instead of dividing the work, have agents compete on it. Spawn N agents that each attempt the same task using different approaches, prompts, or models, then have a judging agent compare the results pairwise until you have a winner.

**Loop until done**

For tasks with an unknown amount of work, keep spawning agents in a loop until a stop condition is met (no new findings, or no more errors in the logs) instead of running a fixed number of passes.

# Use cases

Think creatively about when and how to ask Claude Code to make dynamic workflows. I've found that workflows are sometimes even more useful for non-technical work.

## Migrations and refactors

Bun was rewritten from Zig to Rust using workflows. You can read more about how that was done in Jarred's X thread.

The key is to break the task down into a series of units to operate on, for example callsites, failing tests, or modules. Spin off a subagent in a worktree for every fix, have another agent adversarially review each fix, and then merge them.

Consider telling the agents not to use resource-intensive commands so that you can parallelize as much as possible without running out of resources on your machine.

## Deep research

We published a deep research skill (/deep-research) inside Claude Code that uses dynamic workflows. Specifically, it fans out web searches, fetches sources, adversarially verifies their claims, and synthesizes a cited report.

But you can do this sort of research for more than just web searches, for example by asking Claude to compile a status report from context in Slack, or to research how a feature works by exploring a codebase in depth.
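To make the fan-out, adversarial verification, and synthesis steps concrete, here is a minimal sketch in plain JavaScript. It does not use the real workflow functions, which this article does not list: `researchAgent`, `verifierAgent`, and `FAKE_SOURCES` are hypothetical stand-ins for spawned subagents, written as ordinary async functions so the sketch runs on its own in Node.js.

```javascript
// Hypothetical lookup table standing in for whatever a research agent finds.
const FAKE_SOURCES = {
  "rate limits": "docs/limits.md",
  "auth flow": "src/auth.js",
};

// Hypothetical stand-in for a spawned research subagent (one per topic).
async function researchAgent(topic) {
  return {
    topic,
    claim: `summary of ${topic}`,
    source: FAKE_SOURCES[topic] ?? null,
  };
}

// Hypothetical stand-in for a separate verifier subagent.
// It only sees the finding, not the researcher's reasoning.
async function verifierAgent(finding) {
  return { ...finding, verified: finding.source !== null };
}

async function fanOutAndSynthesize(topics) {
  // Fan-out: one agent per topic, each with its own clean context.
  const findings = await Promise.all(topics.map(researchAgent));
  // Adversarial verification: a second agent checks every finding.
  const checked = await Promise.all(findings.map(verifierAgent));
  // Synthesize: this acts as the barrier, because it runs only after
  // every agent above has finished.
  const kept = checked.filter((f) => f.verified);
  return { kept, dropped: checked.length - kept.length };
}
```

Calling `fanOutAndSynthesize(["rate limits", "auth flow", "billing"])` keeps the two findings that have a source and drops the unsourced one. In a real workflow the two stand-in functions would be replaced by actual subagent calls; the shape of the orchestration is the part this sketch is meant to show.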
## Deep verification

Conversely, if you have a report in which you want to check and source every factual claim, you may want to generate a workflow in which one agent identifies all of the factual claims and then spins off a subagent to check each one in detail. You could also have a verification agent check each sourcing subagent to make sure its source is high quality.

## Sorting

You may have a list of items that you want to sort by some qualitative measure that you believe Claude Code is good at evaluating, for example support tickets sorted by bug severity. But if you try to sort 1000+ rows in one prompt, quality degrades and the list won't fit in context.

Instead, run a tournament, a pipeline of pairwise-comparison agents (comparative judgment is more reliable than absolute scoring), or bucket-rank in parallel and then merge. Each comparison is its own agent, so the deterministic loop holds the bracket and only the running order stays in context.

## Memory and rule adherence

If you have a particular set of rules that Claude misses or struggles with, even when they are in your CLAUDE.md files, create a workflow with a list of rules that must be checked by verifier agents, one verifier per rule. Adding a skeptic-persona subagent that reviews the verifiers' findings will help avoid too many false positives.

The reverse direction works too: mine your recent sessions and code review comments for corrections you keep making, cluster them with parallel agents, adversarially verify each candidate (would this rule have prevented a real mistake?), and then distill the survivors back into a CLAUDE.md.

## Root-cause investigation

Debugging works best when you come up with several independent hypotheses and test them, but if you're only using one context window, Claude can run into self-preferential bias.

A workflow can structurally prevent this by spinning up agents that generate hypotheses from disjoint evidence, for example separate agents for logs, files, and data. Each hypothesis can then face a panel of verifiers and refuters.

This isn't just for code. Workflows can be used for sales (why did sales drop in March?), data engineering (why did this pipeline fail?), or any post-mortem exercise.

## Triaging at scale

Every team has a support queue, bug reports, or some other backlog that cannot be fully processed by humans. A triage workflow classifies each item, dedupes it against what's already tracked, and takes action. This could mean attempting the fix or escalating to a human.

A useful pattern for triage workflows is quarantine: agents that read untrusted public content are barred from taking high-privilege actions, which are instead performed by the agents in charge of acting on the information.

Pair triage workflows with /loop to have Claude do this continuously.

## Exploration and taste

Workflows can be useful when exploring different approaches to a solution, especially when the choice is taste-based, like design or naming, and would benefit from a rubric.

Try asking Claude to explore a bunch of solutions, and give a review agent a rubric for what a good solution looks like. The task is complete when the review agent judges that a solution meets the criteria. Solutions can also be ordered or selected via a tournament based on the rubric.

## Evals

You can run lightweight evals for particular tasks by spinning off separate agents, each in a worktree, and then spinning off comparison agents to compare and grade their outputs against a rubric. For example, you can evaluate and then refine a skill you've created against particular criteria.

## Model and intelligence routing

Create a classifier agent tuned to your tasks that decides which model to use. This can be helpful when your task involves many tool calls and some research before execution can identify the best model for the job.

For example, the best model for the task "explain how the auth module works" depends on how many files the auth module has and on the shape of the codebase. A classifier agent can do this research and then route to Sonnet or Opus based on the expected complexity of the task.

## When not to use dynamic workflows

Workflows are new. While there are many use cases where they will create outsized results, they are not needed for every task and may end up using significantly more tokens.

It's best to use workflows creatively to push Claude Code in ways that you haven't previously. For regular coding tasks, ask yourself: does this really need more compute? For example, most traditional coding tasks do not need a panel of 5 reviewers.

# Tips for building dynamic workflows

**Prompting**

Detailed prompting, using the specific techniques described above, produces the best results with dynamic workflows.

Workflows are not just for large tasks. You can prompt the model to use a "quick workflow", for example to run a quick adversarial review of an assumption.

**Combine with /goal and /loop**

When using workflows that can be repeated, for example triage, research, or verification, pair them with /loop to run them at regular intervals and with /goal to set a hard completion requirement.

**Token usage budgets**

You can set explicit token usage budgets for dynamic workflows to limit how many tokens a task uses. Include a budget in your prompt, such as "use 10k tokens", and that will set the cap.

**Saving and sharing dynamic workflows**

You can save workflows by pressing "s" in the workflow menu. You can check these into ~/.claude/workflows or distribute them via a skill.

To share them via a skill, put your JavaScript workflow files in the skill folder and reference them in the SKILL.md. To allow for more flexibility, you may want to prompt Claude to treat the workflows in the skill as templates rather than scripts that must be run verbatim.
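As an illustration of how the "loop until done" pattern and a token budget might interact, here is a minimal sketch in plain JavaScript. It is not the real workflow API: `runPass` is a hypothetical stand-in for spawning one subagent pass, the token counts are made up, and the budget cap is enforced by the loop itself rather than by Claude Code.

```javascript
// runPass(i) is a hypothetical stand-in for one subagent pass. It resolves to
// { items, tokens }: the findings from that pass and the tokens it spent.
async function loopUntilDone(runPass, tokenBudget) {
  const findings = new Set();
  let tokensUsed = 0;
  let passes = 0;

  // Keep going while there is budget left.
  while (tokensUsed < tokenBudget) {
    const { items, tokens } = await runPass(passes);
    tokensUsed += tokens;
    passes += 1;

    const before = findings.size;
    for (const item of items) findings.add(item);

    // Stop condition: the pass produced nothing we had not already seen.
    if (findings.size === before) break;
  }

  return { findings: [...findings], passes, tokensUsed };
}

// A scripted fake pass so the sketch can run without any agents.
const scriptedPasses = [["a", "b"], ["b", "c"], ["c"], ["d"]];
const fakePass = async (i) => ({ items: scriptedPasses[i] ?? [], tokens: 1000 });
```

With a budget of 10,000 the loop stops after the third pass because that pass adds nothing new; with a budget of 2,000 it stops after two passes because the budget runs out. The budget is checked before each pass, so the final pass can overshoot the cap; a stricter version would estimate the cost of a pass up front.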
## A whole new world

Workflows are a helpful new way to extend Claude Code. I encourage you to think of this as a starting point; there's still much to discover about how best to use them. Let us know what you find.

Thariq Shihipar and Sid Bidasaria (@sidbid) are members of technical staff at Anthropic, working on Claude Code.


Thariq @trq212 · June 3 · 69

Workflows are the biggest upgrade to Claude Code’s capabilities since skills and subagents. I dove deep into it with @sidbid to figure out best practices, examples and more. I’m particularly excited about the non-technical tasks it enables for Claude Code.


ClaudeDevs @ClaudeDevs · June 3 · 73

How do you get Claude Code to check its own work before handing it back? Watch how you can encode your manual checks so Claude closes its own feedback loop:


ClaudeDevs @ClaudeDevs · June 3 · 77

We’ve added a CLI for Claude Platform to make every API endpoint runnable from your terminal. Call the Messages API, stand up Claude Managed Agents, pipe results straight into your shell. The ant CLI is well understood by coding agents (Claude Code) using the claude-api skill.


Claude @claudeai · June 2 · 30

Interpreting law is one of the oldest jobs in the world. @MaxJunestrand, co-founder and CEO of @WeAreLegora, is bringing it into its next era with Claude. His bet: every new model release raises the tide, and Legora is building the boats for everyone else.


Emad @EMostaque · June 2 · 35

I wonder how many founders will pass on investors who passed on them in prior rounds I wonder how many would have three dinners & give them an allocation only to slash it to zero at the last moment.

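Translated summary: The tweet asks how many founders will turn away investors who passed on them in earlier rounds. It quotes inside detail from Anthropic's previous funding round: a partner at a well-known fund had three dinners with Dario, after which the fund's allocation was cut to zero. At least four other top-tier funds were also dropped at the last minute. The quoted tweet says these investors were penalized for having passed on the Series B led by Spark, which was the hardest round Dario ever raised. In venture capital, conviction is everything.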

Rohan Paul @rohanpaul_ai · June 2 · 59

Anthropic is expanding Claude Mythos Preview from about 50 Project Glasswing partners to about 200 vetted organizations. This model is much closer to a cyber weapon detector than a normal coding assistant, since it can find weak spots in software and sometimes prove the attack path by building a working test exploit. The select group is basically a defensive priority list. The list includes power, healthcare, water, communications, hardware, governments, nonprofits, and key software maintainers, with security checks before any group gets the model. With this priority access Anthropic is trying to create a patching head start before similar AI-assisted exploit discovery becomes common across the industry. Anthropic says partners have already found 10,000+ high- or critical-severity flaws. Anthropic is still not making Mythos fully public because its own testing says the model can find subtle old bugs, chain small issues into bigger exploits, and help non-experts reach outcomes that previously required elite security skill. The point is that other top frontier models usually flag suspicious code, while Mythos can inspect a codebase, form a theory about a bug, test it in a sandbox, read the failure, adjust the plan, and repeat until it has proof.


Anthropic @AnthropicAI · June 2 · 57

We’re expanding Project Glasswing. We’ve extended access to Claude Mythos Preview to approximately 150 additional organizations, based in more than fifteen countries. Read more about this expansion and our future plans for Project Glasswing: https://www.anthropic.com/news/expanding-project-glasswing


小互 @xiaohu · June 2 · 14

Since upgrading Claude Code to 4.8, I keep getting this: The model's tool call could not be parsed (retry also failed). Has anyone else run into it? So annoying.

🚨 AI News | TestingCatalog @testingcatalog · June 2 · 41

Claude for iOS will get a redesigned settings menu along with support for the upcoming Memory Files feature. > A slightly redesigned UI is being prepared for both Claude web and mobile, primarily revamping the settings and the navigation bar. > Memory Files is the upcoming knowledge-based memory system for Claude.


SemiAnalysis @SemiAnalysis_ · June 2 · 53

ARTICLE UPDATE ALERT: The day after we published Finding Miscompiles for Fun, Not Profit, Anthropic released Opus 4.8 and ultracode mode in Claude Code. Our preliminary experiments indicate that together these are significantly better at filtering out low-severity bugs, and that the cost per medium-to-high severity bug found is maybe 1/5 (with VERY large error bars) that of the workflow described in this article. (1/2)🧵


Yuchen Jin @Yuchenj_UW · June 2 · 12

Came home to a surprise gift box from Anthropic on my doorstep. What’s cooler than vibe-coding software? Vibe-coding hardware! I can probably vibe code this mini-computer into a remote control for my Claude Code session. Thanks @bcherny for sending it over!


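歸藏(guizang.ai) @op7418 · June 2 · 77

Anthropic has started preparing for its IPO. MiniMax and Zhipu have also both filed applications to list on the A-share market and the STAR Market, and have begun pre-listing guidance at the same time. Everyone has a bright future; I wonder when OpenAI will get started.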

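Berryxia.AI @berryxia · June 2 · 63

Never mind grinding through this 30 times; the main thing is that it really burns both people and tokens. Mr. Huang's research has worked out most of the core design underlying Claude Workflow, so it is worth borrowing from and learning for projects of your own. Anyway, I can't do it; it's enough that the AI can. Anyway, I'm not going to learn it; it's enough that the AI does. 😁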

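Translated summary: The tweet breaks down Claude Code's workflows. It is a task system that runs in the background and can be monitored, with three core roles: Claude decomposes the task and plans, the Runtime handles scheduling and state management, and each AI agent handles only one subtask, with progress driven through a concurrency pool and a queue. The key design decision is "externalized state": intermediate results are stored by the execution system and the main context reads only summaries, which lets it scale to a large number of agents. The tweet argues that this pattern (intelligent planning, Runtime execution, independent state, and on-demand model scheduling) represents a new way of orchestrating engineering work, and that the workflow can be converted into an executable format for one's own system.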

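meng shao @shao__meng · June 2 · 78

Claude Code core developer @trq212 shared a high-value "understanding verification" workflow for human-AI pair programming. With this workflow Skill, by the time the coding agent finishes its work, the human has a grasp of the problem, the solution, and the impact that they can restate and defend. Let's break it down. https://gist.github.com/ThariqS/1389dcdff9eba4789887a2211370f06b

Core positioning: the AI plays an "efficient and wise teacher"

The success criterion is not just "task completed" but whether the human truly understands the whole session. Differences from common agent modes:

- Incremental teaching at every step; move to the next stage only after passing the current one
- Have the user restate first, then fill the gaps
- The session ends only with checklist + quiz + demonstrated understanding

Three axes of understanding (the checklist should cover them)

1. Problem domain
   - What the problem is
   - Why it arises (root cause, history, branching paths)
   - Which trade-off routes were considered
2. Solution domain
   - What was done and why it was solved this way
   - Design decisions and trade-offs
   - Edge cases and failure modes
3. Context domain
   - What the change means within the system or business
   - Who, which processes, and which risks it affects

Keep asking why, then a deeper why, while also covering what and how. The emphasis: if the problem is not properly understood, understanding of the solution is usually illusory.

Operating procedure (an executable cadence)

1. Finish one small step: advance only one verifiable unit (for example, locate the root cause, choose a solution, change one piece of logic); don't jump across several stages at once.
2. Have the user restate first: before moving on, ask the user to explain in their own words what this step solves, why it is done this way, and what is still uncertain. This is a diagnosis, not leaking the answers before an exam.
3. Fill the gaps: look for holes in the restatement and fill in motivation, business logic, edge cases, and branches; switch abstraction levels as needed (for example ELI5 / ELI14 / "explain it like an intern would").
4. Verify on a small scale: use open-ended or multiple-choice questions to check real understanding; for multiple choice, shuffle the position of the correct option and do not reveal right or wrong until the user submits an answer.
5. Advance only after passing: a stage must be confirmed at both the high level (why do this) and the low level (how to do it, where the boundaries are) before moving to the next stage.
6. Keep the checklist in sync: in a running Markdown file, tick off or add the specific items the user should master under the three dimensions of problem, solution, and context.
7. Tie it to real material when necessary: if understanding depends on implementation details, paste the relevant code snippet or step through it together in a debugger, to avoid "I followed the explanation but still can't explain the diff".
8. Exit condition: before the session ends, the user must have demonstrated mastery of every item on the checklist (able to restate, answer questions, and explain trade-offs), rather than the agent unilaterally concluding "you should understand now".

Design intent (why it is valued inside Anthropic)

- Countering the "agent black box": in long sessions the human easily becomes an approve button; incremental confirmation spreads the cognitive load across the whole session.
- Making tacit knowledge explicit: branches, rejected solutions, and edge cases often exist only in the agent's context; the checklist forces them to be written down.
- Auditable learning: for a team lead or your future self, "why was it changed this way" leaves a trail.
- Alignment with product risk: only when you understand the impact can you talk about responsible shipping rather than just merging.

Practical notes (watch for these when applying it)

- The checklist is a living document: add and remove items as the session evolves; it is not a one-off outline.
- Quizzes should vary: avoid memorized answers; rotate the position of the correct option in multiple-choice questions.
- Alternate levels: switch the same topic between motivation, implementation, and boundaries, to avoid only reciting concepts or only following the diff.
- Sessions may run long: this is deliberate, since deep understanding takes priority over speed.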


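meng shao @shao__meng · June 2 · 60

Andrew Ng on "AI FDE" and "AI Engineer"

AI is creating new roles, but in terms of long-term headcount, in-house AI Engineers will far outnumber vendor-deployed Forward Deployed Engineers (FDEs). Right now the most valuable people are generalist AI engineers who can build applications and use AI coding tools.

A recap of the AI FDE: on-site + deep integration + strong delivery

- Pioneered by Palantir about 20 years ago: engineers are stationed at the customer site (for example government or air-gapped environments) for deep delivery
- OpenAI, Anthropic, and others are building AI FDE teams that embed engineers in customer organizations
- They turn general-purpose LLMs into customized agent workflows that fit the business (build, tune, deploy)
- Technical skill + communication + sometimes business judgment: uncovering needs, prioritizing, explaining the technology, and pushing back where reasonable

The numbers relative to "AI Engineer": Ng's judgment

Ng explicitly argues against treating FDE as the mainstream career of the AI era:

1. Companies prefer to grow their own people. They may accept a small number of external FDEs, but they would rather have many of their own employees doing AI projects. His own organization also "hires FDEs, but hires far more AI Engineers".
2. Vendor lock-in vs. optionality
   - FDEs often integrate deeply with one vendor's product, and customers worry about vendor lock-in
   - At a stage where it is still unclear which AI service will be best a year from now, keeping technology and vendor optionality is worth more than fast, deep commitment
   - Letting FDEs tie processes to a single vendor significantly reduces the room to change stacks later

Conclusion: FDE is an important but relatively niche delivery model; AI Engineer is the larger and more stable employment pool.

Who is really in demand right now?

Ng observes that demand is concentrated on AI Engineers, especially those who can:

- Build software applications with LLM capabilities (prompts, agent frameworks, evals, and so on)
- Use AI coding agents efficiently (Claude Code, Codex, Antigravity CLI, OpenCode, and so on)

These are engineers who "build products with AI components"; they need not be on-site, nor represent any one model company.

Career evolution: it will differentiate the way traditional Software Engineer did

He believes AI Engineer will split from generalists into specialists just as "software engineer" did decades ago, possibly including (he is guessing too):

- AI FDE (vendor-side or consulting-side, on-site)
- LLMOps Engineer
- Evals Engineer
- AI Data Engineer
- Harness Engineer (agent and eval harnesses)
- Roles that have not been named yet

For now, generalist, well-rounded AI Engineers can still create a lot of value. Specialization is a decade-scale trend, not today's entry requirement.

On the "AI destroys jobs" narrative

He uses the FDE revival as an example: AI is creating new kinds of jobs (FDE, AI Engineer, and future specialists), so the "job apocalypse / jobocalypse" narrative is too simplistic. A more accurate statement is that the structure of jobs is changing, and totals and types will be reshuffled rather than eliminated in one direction.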


SemiAnalysis @SemiAnalysis_ · June 2 · 61

AWS margins jumping 10 points while Azure and Cloud fall flat. The Tokenomics Team deep dives into selling tokens vs renting GPUs, Anthropic's $65 Billion Raise in Series H, and stabilized token margins. New Episode Out Now: https://youtu.be/3zGmZfZnChs


ginobefun @hongming731 · June 2 · 74

Anthropic submits a draft S-1 in preparation for an IPO

Yuchen Jin @Yuchenj_UW · June 2 · 50

OpenAI slept on coding, so Anthropic stole the crown. Anthropic didn’t secure enough GPUs/TPUs to turn that lead into a monopoly. Now Codex has caught up. Gemini will catch up too. It’s only a matter of time. AI coding is becoming a three-body problem.


elvis @omarsar0 · June 2 · 51

Go build!


宝玉 @dotey · June 2 · 50

Now I've seen everything: Claude has reset its limits too!

ClaudeDevs @ClaudeDevs · June 2 · 49

We've reset 5-hour and weekly rate limits for all users on Pro and Max plans. We fixed an issue that caused some Claude Code sessions to spawn excessive parallel subagents, burning through usage faster than expected.


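Berryxia.AI @berryxia · June 2 · 73

If Anthropic lists successfully, what will change in financing and governance across the AI industry? Can't Apple just acquire it already! The account bans are so damn annoying! 😡 I just saw Anthropic's official post. They have confidentially submitted a draft S-1 registration statement to the SEC. Once the review is complete, they will have the option to launch an IPO. This lab, led by Dario Amodei and always putting safety and controllability first, has chosen this moment to head toward the public markets, and the signal is fairly clear. Many people used to think that top AI companies would stay private for a long time to avoid short-term capital-market pressure and extra regulation. But Anthropic's move shows that, as the scaling race enters deep water, access to broader capital channels and acceptance of public scrutiny have become the realistic path to staying ahead. They are preparing to put the valuation and trust accumulated over the past few years to the test of the market.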


Chubby♨️ @kimmonismus · June 2 · 73

Here we go: Anthropic confidentially submits draft S-1 to the SEC The most important question: who will go public first, OpenAI or Anthropic? OpenAI wanted to beat Anthropic to the punch, but now Anthropic seems to be in a hurry. Currently, Anthropic's valuation is higher. The race to the stock market is on.


Rohan Paul @rohanpaul_ai · June 2 · 82

Anthropic just took the formal first step toward its IPO. They confidentially sent the draft S-1 to the SEC. A draft S-1 is the IPO document that will eventually disclose Anthropic’s business model, financials, risks, share structure, use of proceeds, and underwriters, so submitting it means the company has done enough legal, accounting, and banker work to begin the real pre-IPO process. The confidential part means the public still cannot read the filing yet, because the SEC allows companies to submit draft registration statements for nonpublic review before the public version appears. Anthropic can receive SEC comments before exposing the full filing, so the market gets a signal of intent before getting the financial x-ray. The next visible milestone will be the public S-1, because that is when everyone finally sees the numbers that matter: revenue, losses, gross margin, compute spending, customer concentration, Amazon or Google dependence, executive control, and legal risks.


Anthropic @AnthropicAI · June 2 · 86

Anthropic has confidentially submitted a draft S-1 registration statement to the Securities and Exchange Commission. Pending completion of SEC review, this gives us the option to pursue an initial public offering. Read more: https://www.anthropic.com/news/confidential-draft-s1-sec


🚨 AI News | TestingCatalog @testingcatalog · June 2 · 76

BREAKING 🔥: ANTHROPIC SUBMITTED DRAFT DOCUMENTS FOR THE UPCOMING IPO! SOON 📊


Emad @EMostaque · June 1 · 50

Let’s say half of OpenAI and Anthropic goes to the American people, $1 trillion That works out at $2,800 per American. With a 5% dividend (optimistic) that would be $142 a year Which alas would barely cover the cost of an OpenAI or Anthropic subscription.

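Translated summary: US Senator Sanders has proposed an "American AI Sovereign Wealth Fund Act", which aims to let the public directly own shares in AI companies. The tweet imagines that if half of OpenAI and Anthropic belonged to the American people, the total value would be about $1 trillion, or $2,800 per person. At a 5% dividend yield, each person would receive $142 a year, barely enough to pay for one AI company's subscription. The bill is based on the idea that "AI is built on humanity's collective knowledge" and aims for the wealth AI generates to benefit all of humanity rather than a few oligarchs.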

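宝玉 @dotey · June 1 · 70

You can't expect one model to be the strongest at everything. To use AI well you have to act like a player: love many models, discover what each is good at, take the best of each, and use them in combination. Opus 4.8 is not great at writing, but at UI design and UI implementation it is much better than GPT-5.5. I recommend using Claude Design more, then handing the result designed by Claude Design to both GPT-5.5 and Opus 4.8 to implement and comparing the differences. Its quality is also very high for system design and planning: a moderately complex task usually needs a plan and a system design first, and Opus 4.8 is very good at that too. It also depends on the agent you use. Each model has its own characteristics and needs its prompts redesigned and tuned repeatedly. If you use Opus 4.8 in Claude Code and Cursor, the results on tasks other than writing are fine.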



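meng shao @shao__meng · June 1 · 55

Claude Opus 4.8 > 4.7: true, but it doesn't matter

Opus 4.8 has improved over 4.7 on benchmarks, honesty, long tasks, and so on. That is true, but for LLM users this kind of progress does not bring real change; it only counts as an upgrade of 4.7. If you were already using Opus 4.7, switching to 4.8 is normal: tune your prompts and switch once your benchmarks pass. If you were using GPT-5.5, DeepSeek, or others, would you switch because Opus 4.8 was released? I don't think so, at least not for 4.8. As for whether Opus 5 would do it, I don't know, and it would be hard.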


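宝玉 @dotey · June 1 · 70

Since Claude Design started sharing quota, I can use it far more often, though it still burns through tokens. But what it produces is really good, and I strongly recommend using it more. It is one of the best agent products I've used recently. One tip: import an existing design system before asking it to design. Style consistency will be much better, and with a mature design system the results also look more polished. I personally recommend trying Adobe's Spectrum 2 design system. You can import it with the URL below, and once imported you can have it base its designs on it: https://github.com/adobe/react-spectrum You can find more design systems here: https://github.com/alexpate/awesome-design-systems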


Berryxia.AI @berryxia · June 1 · 73

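http://x.com/i/article/2060375125825036288

# A tutorial on the Tang-dynasty voice-interactive 3D mini-game built with Claude in 2 weeks and 800 US dollars

This is an explanatory tutorial for general readers, creators, and beginner developers. It does not assume you already know Three.js, real-time voice, or AI agents. Instead, it starts from a simple question:

> What would it be like if a High Tang Chang'an were not just something to look at, but a city you could walk into, trade verses with Li Bai, ask a guide for directions, and hear intelligent commentary in an AI pavilion?

In about two weeks of intensive development, we turned this idea into a Web 3D interactive project that can be visited online and reused as open source. Project links:

- Live demo: https://andyhuo520.github.io/tang-changan/
- GitHub: https://github.com/andyhuo520/tang-changan

> The image above shows a set of character visuals we prepared for the voice NPC panel, generated with the GPT-image-2 model. Each core character in the project can have its own portrait, intro video, and idle state, so that "talking to an NPC" feels more like meeting a specific person inside a game.

## 1. The original design goals

From the start, we did not want to build an ordinary "3D showcase page". Our goal was closer to a small digital cultural-tourism experiment:

1. It should be playable like a game. Players enter the scene and control a character with WASD, rather than only rotating a camera around a model.
2. It should be browsable like a museum. The scene includes palaces, Zhuque Avenue, a Treasure Hall, a poetry and painting gallery, and an AI pavilion.
3. It should talk like a real guided tour. Instead of clicking a few fixed buttons, players hold down the microphone and talk to NPCs by voice.
4. It should have a High Tang character. Colors, architecture, characters, poetry, and mini-games all revolve around "Chang'an", "poetry and wine", and "all nations paying tribute".
5. It should be open source. In the end it should deploy to GitHub Pages so that others can try it directly and read the code to learn.

In one sentence:

> We wanted to turn "High Tang Chang'an" into a browser 3D world that you can roam, talk to, play in, and use to showcase AI capabilities.

## 2. Stage 1: Build a Chang'an diorama worth looking at

For any complex interactive project, the first step is not building features but making the "world exist". We first built a low-poly miniature diorama of Chang'an with Web 3D technology. The core technology is Three.js, which renders 3D scenes in the browser without requiring users to install a client. The focus of this stage was:

- setting up the main scene, camera, lighting, and post-processing;
- building landmarks such as Zhuque Avenue, palaces, city gates, markets, towers, and waterways;
- using low-poly materials to preserve performance so that ordinary browsers can run it;
- adding atmospheric changes such as day and night, seasons, weather, and fog;
- creating a bird's-eye view so that at first glance it looks like a "living map of a Tang city".

This stage looks like "art setup", but it actually defines the boundaries of all later gameplay: where you can walk, where you can interact, and which landmarks can carry a story.

## 3. Stage 2: Turn the showcase into a playable game

A diorama alone is not enough. We wanted players to "walk into Chang'an" rather than "look at Chang'an". So the project entered its second stage: adding a WASD game mode. After clicking "Walk into Chang'an" (「走进长安」), the player reaches character selection:

- Heir (世子)
- Merchant (商贾)
- Maid (侍女)
- Wandering swordsman (游侠)

Each character has its own portrait, default name, starting wallet, and items. Once in the game, the player can:

- move with WASD;
- adjust the view with the mouse;
- press E near an NPC to talk;
- press F near a shop or pavilion to trigger an interaction;
- check the wallet, stamina, backpack, and quest hints.

This stage completed the real shift from "3D page" to "mini-game".

## 4. Stage 3: Make NPCs more than decoration

A common problem with 3D scenes is that the buildings look beautiful but nothing lives in them. So we added many NPCs and mini-games to give the city some everyday bustle.

4.1 NPC interaction

When the player approaches NPCs such as passers-by, scholars, merchants, ladies, officials, and monks, a dialogue can be triggered. Different NPCs have different identities and ways of speaking.

4.2 Poetry mini-games

We designed Tang-flavored interactions:

- Feihualing (飞花令): given a keyword character, the player picks the line of poetry that contains it;
- Couplet matching: given the first line, the player picks the second line from several candidates;
- Riddles: multiple-choice questions built from folk riddles and Chang'an history;
- Finger-guessing game (猜拳): a quick, lightweight interaction paired with random rewards.

The mini-games are not there simply to be "fun"; they turn poetry and historical knowledge into something you can take part in.

## 5. Stage 4: Build the Treasure Hall and the poetry and painting gallery

To make the project feel more like a digital cultural-tourism product, we added a gallery system. Players can enter different halls and view paintings, treasures, and history-themed content. For example:

- "Emperor Taizong Receiving the Tibetan Envoy" (《步辇图》)
- "The Thirteen Emperors" (《历代帝王图》)
- "Court Ladies Adorning Their Hair with Flowers" (《簪花仕女图》)
- themed exhibitions of poetry, calligraphy, and painting
- the Danqing Hall DIY gallery

The galleries connect "game" and "cultural content": players can play, but they can also view exhibits, listen to commentary, and understand the historical context behind them.

## 6. Stage 5: Add the AI pavilion

The most distinctive part of the project is that we turned modern AI brands into Tang-style pavilions. We designed a "Tianshu Mansion / AI pavilion" (天枢府) concept: a technology marketplace that crosses past and present, set in High Tang Chang'an. Each AI brand is no longer just a logo but a Tang-style hall, and each pavilion has its own lecture seat and style. Among them, the Agora Pavilion is the core voice-interaction pavilion and showcases the real-time voice capability.

> In the game scene, Agora is not just the name of an external service; it is designed as an "Agora Pavilion" that you can enter, interact with, and where you can summon an AI envoy (智机使) to give commentary. This helps non-technical users understand that voice AI is not just a backend API; it can become a scene-based experience.

Visually, we built:

- Tang-style halls;
- brand logo pillars;
- glowing plaques;
- pavilion information boards;
- interactive hotspots at the doorways;
- small Easter eggs mixing modern technology with ancient street scenes.

In the narrative, we framed it as:

> A "Zhiji Mansion" (智机府) has appeared in Tang Chang'an, where AI envoys explain different intelligent capabilities.

The benefit is that the AI showcase no longer feels like a cold product page; it becomes part of the game world that players can explore.

## 7. Stage 6: Connect the real-time voice agent

This is the core of the whole project and also the hardest part to tune. Our goal was not to have NPCs pop up a text box, but to let players actually talk to characters by voice.

7.0 Prerequisites: install Agora Skills / Agora CLI

In this project, the Agora voice capability is not wired in by hard-coding an App ID into the web page. Instead, Agora Skills + Agora CLI handle project login, capability checks, writing environment variables, and the ConvoAI readiness check. You can think of it this way:

> Agora Skills tell the agent how to integrate Agora; Agora CLI logs into the account, binds the project, and writes .env.local.

More specifically, there are two layers:

| Layer | Role | Used by |
| --- | --- | --- |
| Agora Skills | Integration guide for the AI coding agent: tells the agent to use the official quickstart, how to check ConvoAI, and how to handle tokens and environment variables | Cursor / Claude / Agent |
| Agora CLI | The command-line tool that actually performs login, project selection, capability checks, and environment variable writes | Both developers and agents |

So in practice, "installing Agora Skills" usually comes down to two things:

1. Make sure your AI development environment already has the Agora Skill / Agora reference material;
2. Install the agora CLI locally and log in, so that the project can obtain a valid Agora project configuration.

Step 1: Check whether Agora Skill / Agora CLI is already present

If the agora command is not yet available on your machine, you can install it:

After installation, reopen the terminal and confirm the command exists:

If it prints a path and a version number, the CLI is on your PATH. Post-install check:

If the terminal shows Agora CLI install is healthy, the CLI itself works.

> If the agora command is not found, the shell usually has not loaded the new PATH. Reopen the terminal, or check the PATH configuration suggested in the install script's output.

Step 2: Log in to your Agora account

The agora login command opens a browser to complete authorization. The normal flow is:

1. The terminal prints a login link of the form https://sso2.agora.io/...;
2. The browser opens the Agora SSO page;
3. You log in and authorize Agora CLI;
4. The browser calls back to localhost on your machine;
5. The terminal shows Session stored and Status: authenticated.

After logging in, check the status:

You should see something like:

If this shows that you are not logged in, run agora login again. If login succeeds but agora project list later returns:

ACCOUNT_BLOCKED

then this is not a code problem; the Agora account or its console permissions are restricted. You need to switch to a usable account or lift the account restriction first.

Step 3: Select or create an Agora project

After logging in, list your projects first:

agora project list

If you already have a project, select it:

agora project use <project-id-or-name>

If you do not have a project yet, create one in Agora Console, or use the CLI to initialize a quickstart project:

This command does three things:

- creates or binds an Agora project;
- clones the official quickstart;
- writes a local .env.local.

This project builds on the official quickstart approach: first make sure the official demo runs, then embed it into the 3D scene of Tang Chang'an.

Step 4: Check whether the project supports ConvoAI

The real-time voice agent depends on Agora's Conversational AI capability. You can run:

If it reports that the capability is not enabled, you can try:

Then run doctor again to confirm. The result you want is that project doctor reports no blocking issue. This does not mean "voice definitely works", but it does show that the project configuration on the console side is ready.

Step 5: Write the Agora project credentials into the voice backend

The voice backend of this project reads:

The most important of these are:

You can let Agora CLI write them automatically:

> Note: AGORA_APP_CERTIFICATE is sensitive; do not commit it to GitHub. The project's .gitignore already ignores .env.local.

After writing, you can check that the file exists, but do not paste the certificate anywhere public:

If you only want to confirm that the certificate is present, look at the key names and do not print the full values:

Step 6: Start the voice services

Backend:

Frontend iframe:

By default, the main game points the voice panel to:

http://localhost:3000

If you deploy the voice service online, you can specify it with a URL parameter:

?voiceOrigin=https://your-voice-frontend-domain

Step 7: Verify the voice pipeline

First verify that the backend can return the Agora configuration:

Then verify that it can start an agent:

If agent_id is returned, the backend has successfully asked Agora to create a voice agent. Finally, open the game, enter the Agora Pavilion, click the voice panel on the right, and watch for three things:

- the panel no longer stays stuck on "Summoning" (召唤中);
- the microphone captures audio;
- the AI returns speech and subtitles.

> The voice feature does not ultimately exist in isolation; it works together with the player's identity, NPCs, pavilions, subtitles, and the portrait panel. What the player sees is "a character talking with an AI envoy in Chang'an"; behind it are RTC, ConvoAI, and agent orchestration.

Common errors and troubleshooting

If you see:

the cause is usually not a broken frontend button but an unavailable Agora project or credentials. Check first:

- whether agora auth status shows that you are logged in;
- whether agora project list lists projects normally;
- whether the current account is restricted or blocked;
- whether agora project doctor --feature convoai passes;
- whether the App ID / Certificate in .env.local come from the same project;
- whether the backend was restarted after .env.local was changed.

You can troubleshoot in this order:

If CLI login works but project list returns ACCOUNT_BLOCKED, the account is restricted on Agora's side and code cannot work around it. Switch to a usable account or lift the restriction in Agora Console.

7.1 Basic architecture

The project is split into two parts:

- han-diorama: the browser 3D main scene, responsible for Three.js, WASD, NPCs, pavilions, and mini-games
- tang-voice-agent: the voice agent sub-project
  - the frontend is a Next.js iframe
  - the backend is FastAPI / Python
  - it is responsible for Agora ConvoAI, personas, and voice conversation

Clicking an NPC in the main scene opens the voice panel on the right. The panel is essentially an embedded iframe that communicates with the main game through postMessage.

7.2 What happens in a single voice conversation

When the player holds down the microphone and speaks, the flow is roughly:

1. Player microphone
2. Browser RTC uplink
3. Agora real-time audio link
4. ConvoAI: speech recognition → LLM reasoning → TTS synthesis
5. The AI voice returns to the browser over RTC
6. The NPC portrait, subtitles, and state update in sync in the game

Ordinary users see "I talked to Li Bai". Technically, real-time audio, speech recognition, a large model, speech synthesis, and game state synchronization are all working together behind it.

7.3 Why build personas

If every NPC used the same prompt, they would all sound like the same robot. So we gave different characters different personas:

- Li Bai: bold and unrestrained, poetry and wine;
- Du Fu: somber and compassionate;
- Wang Wei: ethereal landscapes;
- Zhou Yinzhi: a tour guide who can show the way;
- Su Ruanqing: a doctor of painting studies (画学博士) who explains paintings;
- AI envoy · Agora Pavilion: explains real-time voice and ConvoAI.

Each persona has its own:

- name;
- identity;
- location in the scene;
- speaking style;
- TTS voice;
- injectable scene context.

This makes the voice feature not merely "able to talk" but bound to the game world.

## 8. Stage 7: Character portraits, the video panel, and BGM

To make voice interaction feel more "face to face", we built a character portrait panel on the left. It supports:

- idle.jpg / idle.png static portraits;
- idle.mp4 muted looping video;
- intro.mp4 intro video with original audio;
- switching to a talking state while the AI speaks;
- automatic fallback when assets are missing.

Later we added period-style BGM:

- guqin / guzheng pieces loop by default;
- mute, volume, and track switching are supported;
- when the player opens a voice conversation, the BGM volume drops automatically so that it does not cover the voice.

This step looks like "packaging", but it strongly affects how the experience feels. Without sound and portraits, AI conversation feels like a tool; with character video, subtitles, and background music, it feels more like a character in a game.

## 9. Stage 8: Fix visual and scale problems

We hit a typical problem during development: the AI pavilion was initially too large, so once it was placed in the city it appeared to "float above the ground" or "vanish when the camera turned". The root cause was inconsistent unit scale:

- the main city used game-world units;
- the AI pavilion was initially designed at a larger, real-world scale;
- as a result, the pavilion extended beyond the main city's ground area.

The fix was to:

- scale Tianshu Mansion down to an area that fits the main city;
- reset the pavilion's center point;
- adjust the 3×3 pavilion layout;
- shrink the logo pillars, archways, courtyard walls, and pavilion models;
- confirm that every interaction point lies on visible ground.

This lesson matters: in a 3D project, good-looking art is not enough; consistent scale is the precondition for playability.

## 10. Stage 9: Deploy to GitHub

Once the project was complete, we open-sourced the frontend and deployed it to GitHub. The han-diorama frontend is a static web project and is well suited to GitHub Pages hosting. Deployment steps:

Then GitHub Actions publishes to Pages automatically. Live URL:

https://andyhuo520.github.io/tang-changan/

Note that:

- GitHub Pages can only host the static frontend;
- the real-time voice backend tang-voice-agent must be deployed separately;
- for local development you can use http://localhost:3000 as the voice iframe;
- to enable voice online, you need to pass the game a reachable voice frontend URL.

## 11. How ordinary users can try it

Open:

https://andyhuo520.github.io/tang-changan/

Once on the page you can:

1. browse High Tang Chang'an in the diorama view;
2. click "Walk into Chang'an" (「走进长安」);
3. choose a character: Heir / Merchant / Maid / Wandering swordsman;
4. move the character with WASD;
5. press E near an NPC to talk;
6. press F near a pavilion or shop to interact;
7. enter the Treasure Hall to view poetry and paintings;
8. enter the AI pavilion to try the voice agent.

Common keys:

| Key | Action |
| --- | --- |
| WASD | Move |
| Mouse | Adjust the view |
| E | Talk to an NPC / trigger a mini-game |
| F | Enter a pavilion / open a shop / trigger a scene |
| Esc | Close the voice panel |

## 12. How developers can understand the project structure

The project can be divided into layers:

- han-diorama/
  - index.html: page structure and UI containers
  - scene.js: main 3D scene, game mode, NPCs, voice panel
  - modelLoader.js: character model loading
  - assets/: logos, portraits, BGM, preview images
    - portraits/: NPC video / portrait assets
    - murals/: gallery assets
  - lib/
    - content/brand-data.js: AI pavilion brand data
    - world/brand-plaza.js: AI pavilion / Tianshu Mansion
    - world/gallery-hall.js: Treasure Hall / galleries
    - world/diy-hall.js: Danqing Hall DIY
    - ui/voice-intent.js: voice intent routing
    - hero/: landmark modules such as Daming Palace, the East and West Markets, and Qujiang
- tang-voice-agent/
  - web/: Next.js voice frontend iframe
  - server/: FastAPI backend
  - server/src/personas/: character personas

The core idea is:

> The 3D main project handles "where the player is, what they see, and what they can do"; the voice sub-project handles "what the player says, how the AI answers, and how the sound comes back".

## 13. Pitfalls we hit during development

13.1 Browser cache

Browsers cache JS and images. We append a version parameter to module paths:

scene.js?v=20260529-agora-only

so that after each important update, online users load the new code.

13.2 Video autoplay restrictions

Browsers usually do not allow videos with sound to autoplay. The workaround:

- first try to play intro.mp4;
- if the browser blocks it, fall back to muted playback;
- unlock audio after the user clicks the page.

13.3 Voice project account status

Real-time voice is not only a code problem; it also depends on the Agora account, project status, ConvoAI activation status, and token authentication. If you see:

CAN_NOT_GET_GATEWAY_SERVER: no active status
401 Invalid token

it usually means one of the following:

- the Agora account or project is blocked;
- the App ID / Certificate do not match;
- the project has not enabled the required capability;
- the local .env.local still holds old credentials.

This is the easiest thing to misjudge when developing an AI voice project: the page looks like "the microphone is on", but neither the browser nor the agent has actually joined the channel.

13.4 3D scale

If pavilions, the city, NPCs, and the ground are not in the same scale system, you get floating, clipping, disappearing, and unclickable objects. The fix is not to keep adjusting the camera but to go back to world coordinates and unify units, positions, and interactive ranges.

## 14. If you want to build a similar project

Follow this order:

1. Decide on a theme. Pick a world first, for example Tang Chang'an, Song Bianliang, the Dunhuang grottoes, or a museum of the future.
2. Build a 3D scene worth looking at. Do not start with a large map. Build one core area first and make sure it can be understood within 30 seconds.
3. Add a controllable character. WASD + simple collision + one NPC is enough to validate the "game feel".
4. Design 3 interaction points. One NPC, one pavilion, one mini-game. Do not start with 20.
5. Connect a voice agent. Get one default persona working first, then expand to multiple characters.
6. Modularize the content. Write brand data, NPC data, and pavilion data as configuration rather than scattering them through the code.
7. Deploy. Use GitHub Pages / Vercel for the frontend and a publicly reachable server for the backend.
8. Do the packaging last. BGM, portraits, video, cover images, tutorials, X posts, and the GitHub README all belong to the distribution layer.

## 15. What we ended up building

In the end, this project is not just a 3D page, nor just a voice demo. It is more like a small template for:

- how to gamify cultural-tourism content;
- how to make historical knowledge interactive;
- how to put AI capabilities into a scene;
- how to blend a voice agent into a 3D world;
- how an open-source project goes from demo to shareable work.

If we had to sum up the whole development process in one sentence:

> We did not put AI inside a button; we put AI inside a city.

That is the core of "Tang Chang'an · Zhiji Mansion" (《大唐长安 · 智机府》).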
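Step 5 of section 7.0 recommends confirming that the credentials are present by looking at key names only, never printing the values. Below is a minimal Python sketch of that check, assuming a dotenv-style `.env.local` file. The helper names `parse_env_keys` and `missing_keys` are hypothetical, and of the two required keys only `AGORA_APP_CERTIFICATE` is named in the tutorial; `AGORA_APP_ID` is an assumed key name used for illustration.

```python
from typing import Dict, Iterable, List

# AGORA_APP_CERTIFICATE is named in the tutorial; AGORA_APP_ID is an assumed
# key name used only for illustration.
REQUIRED_KEYS = ("AGORA_APP_ID", "AGORA_APP_CERTIFICATE")


def parse_env_keys(env_text: str) -> Dict[str, bool]:
    """Map each KEY in dotenv-style text to whether it has a non-empty value."""
    found: Dict[str, bool] = {}
    for raw_line in env_text.splitlines():
        line = raw_line.strip()
        # Skip blank lines, comments, and lines without an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        key = key.strip()
        if key.startswith("export "):
            key = key[len("export "):].strip()
        # Record only presence; the value itself is never stored or printed.
        found[key] = bool(value.strip().strip("'\""))
    return found


def missing_keys(env_text: str, required: Iterable[str] = REQUIRED_KEYS) -> List[str]:
    """Return the required keys that are absent or empty."""
    present = parse_env_keys(env_text)
    return [key for key in required if not present.get(key, False)]


sample = "# local only\nAGORA_APP_ID=abc123\nAGORA_APP_CERTIFICATE=\n"
print(missing_keys(sample))  # prints ['AGORA_APP_CERTIFICATE']
```

Against a real checkout, the same function could be called as `missing_keys(Path(".env.local").read_text())` after importing `Path` from `pathlib`. An empty result only shows that the keys are filled in; it does not show that the App ID and Certificate belong to the same Agora project, which still has to be checked on the Agora side.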

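Summary (translated): This tutorial explains how to build a Web 3D interactive project called Tang Chang'an (《大唐长安》). The project uses Three.js to build a low-poly diorama of Chang'an that players can roam in WASD mode. Core gameplay includes voice conversations with a variety of NPCs and poetry mini-games such as Feihualing. The project integrates Agora's real-time voice capability, using Agora Skills and the Agora CLI for agent integration and environment configuration, so that players can talk to characters such as Li Bai in real time through the microphone. It also includes Tang-style AI pavilions that bring modern AI brands into the game.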

Berryxia.AI @berryxia · June 1 · 54

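Folks, this data is wild! Zhipu is far ahead of DeepSeek! The combined valuation of China's top 5 pure-play LLM companies has already reached 226 billion US dollars, roughly a quarter of Anthropic's latest valuation. Yet their revenue run rate is only 1/40 of Anthropic's. Chinese open-weight model makers are raising large amounts of VC money while also generating real revenue. That is a completely different path from the closed-source, high-price model that dominates overseas. This huge gap between valuation and revenue puts the AI industry's central tension on the table: which part of AI is the market actually paying a premium for? Once model capabilities are rapidly commoditized and prices are pushed far down, how should valuation logic change? Should we keep looking only at short-term revenue, or seriously assess the disruptive effect on the whole industry's pricing structure? What do you think? Is China's low-price + open-weight approach just a short-term phenomenon, or will it become the mainstream model of global AI competition?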

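Summary (translated): The combined valuation of five Chinese pure-play LLM companies has reached 226 billion US dollars, about a quarter of Anthropic's latest valuation, yet their revenue run rate is only one fortieth of Anthropic's. The figures highlight the "low price + open weights" financing and business model common among Chinese vendors, in sharp contrast to the closed-source, high-price model that dominates overseas. The contrast puts the AI industry's core tension directly on the table: what the market is actually paying a premium for in a model, and how valuation logic evolves once model capabilities are commoditized.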

Emad @EMostaque · June 1 · 44

My review of Claude Opus 4.8: We should worry less about being turned into paper clips & more about being annoyed to death.


Rohan Paul @rohanpaul_ai · June 1 · 72

Jensen Huang thinks Dario Amodei's prediction of $1T in AI revenue by 2030 is too conservative. "I believe Dario and Anthropic are going to do way better than that. Way better than that. And the reason for that is the one part that he hasn't considered: I believe every single enterprise software company will also be a value-added reseller of Anthropic's tokens. And they’re going to get this logarithmic expansion. Their go-to-market is going to expand tremendously this year." --- From @theallinpod YT channel (link in comment)

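Summary (translated): Jensen Huang considers Dario Amodei's forecast of $1T in AI revenue by 2030 too conservative. He argues that Anthropic's tokens will become a value-added offering for many enterprise software companies, giving its market logarithmic expansion. A quoted view adds that when the labs' model capabilities converge, the real advantage may come from unique private data inputs. Such data (for example specialized workflows or medical records) can give AI systems differentiation and gains that are hard to replicate, and may become key acquisition targets in the future.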

Chubby♨️ @kimmonismus · June 1 · 45

Claude Mythos is $25 per million input tokens and $125 per million output tokens. I assume that the Mythos-like model that Anthropic will release in the coming weeks will be just as expensive. lets see


🚨 AI News | TestingCatalog @testingcatalog · May 31 · 57

Anthropic is planning to further expand into the consumer and bioscience sectors. The biggest things to watch for 👀 - Conway agent - Orbit assistant - Knowledge-based memory - Multilingual Voice Mode - Operon for bioscience researchers and more! Which one do you think will drop next?

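Summary (translated): Anthropic plans to expand further into the consumer and bioscience sectors and has teased several upcoming products, including Conway agent, Orbit assistant, knowledge-based memory, Multilingual Voice Mode, and Operon for bioscience research. A quoted view notes that Anthropic chose to focus on coding first, but as Claude becomes more intelligent its applications will extend to every field where human intelligence is useful.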

AYi @AYi_AInotes · May 31 · 69

Damn, has Anthropic passed a trillion?! Everyone who was shouting "AI bubble" has gone quiet today:


Chubby♨️ @kimmonismus · May 31 · 59

.@AndrewCurran_ has made a very important point here, with which I fully agree. Anthropic focused on coding from the very beginning and (almost) nothing else. Dario Amodei said early on that if the coding problem is "solved," all other problems will be solved as well. Therefore, no distractions from this area. All the other companies regularly got sidetracked with side quests and thus abandoned their focus. OpenAI invested massive amounts of compute in Sora but then even decided to discontinue the app. They also developed a language model, an image model, and extensive access to free ChatGPT. I don't want to judge this, just observe it. Google did the same: AI Mode, Image Model, Veo3.1, Music Model, and so much more. Again, these were certainly well-considered decisions. But Anthropic wanted one thing from the start, and only one thing: to focus on coding and then be at the forefront of enterprise computing. And it's safe to say: they succeeded. I like the term "intelligence company" because I would argue that Anthropic sees itself in exactly that way. At least so far, Anthropic's own path has been successful. And I would say that OpenAI has followed suit and is increasingly abandoning its side projects. Focus on Codex and ChatGPT, less Sora, voice mode, etc. It's about the race for the best models. Distraction costs money and intelligence resources.

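Summary (translated): Anthropic has focused on coding from start to finish and is seen as an "intelligence company" rather than a coding company. Its strategy rests on the idea that as Claude's intelligence scales, it will apply to every domain of human intelligence. By contrast, OpenAI and Google have frequently been distracted by other products (such as Sora, image models, and music models), and OpenAI even discontinued Sora. Anthropic's focus has put it ahead in enterprise computing, while OpenAI is now following the same route, dropping side projects and concentrating on core model competition around Codex and ChatGPT.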