Show Codex a workflow once. Reuse it as a skill. Record & Replay lets you show Codex a recurring task, like filing an ex...
New in Claude Code: Artifacts. Interactive pages built from your session, like a PR walkthrough or a living project dash...
@theo Honestly just use Devin. It's really really good now
Introducing autoresearch for arXiv papers Change 'arxiv' to 'autoarxiv' in any paper URL An agent deploys to resolve set...
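When writing code with Codex, moving review up front can significantly reduce rework. The author summarizes three tiers: a zero-cost version (paste a prompt asking the model to restate the task before executing), the official built-in version (/plan or Shift+Tab to trigger planning), and a persistent version (write the up-front rule into AGENTS.md). UCSD professor Biwei Huang (黄碧薇) has worked on causal AI for 12 years and proposes four generations of AI evolution: small correlational models → small causal models → large correlational models (LLMs) → large causal models. Her team's causal-learn was selected for Apple Scholar. Aether AI closed its first funding round today, which is seen as a signal of a shift from stacking parameters toward the next AI paradigm.
To this day, humans cannot write down the physical equations for a single fried egg. Crack an egg into a pan of hot oil: how it sets, how it spreads, how its edges brown, no formula describes any of it clearly, and the physical world is full of examples like this. This is precisely the ceiling of today's general AI paradigm: what video generation and VLA learn is all pixel-level ...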
Excited to announce Viktor in Microsoft Teams. This week we crossed $20M in annualized revenue run rate. In Slack. One a...
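Traditional skill routing for LLM agents selects only a single skill from the tool library, which falls short on real tasks that require combining several skills. This paper formally defines "compositional skill routing": decompose a complex query into atomic subtasks, retrieve a matching skill for each subtask, and compose the results into an executable plan. The system, SkillWeaver, consists of an LLM decomposer, a bi-encoder FAISS retriever, and a dependency-aware DAG planner. The authors also release the CompSkillBench benchmark, with 300 compositional queries and 2,209 real skills, which directly evaluates multi-skill routing. The DAG planner turns the retrieved skills into an ordered plan that respects dependencies.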
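The decompose → retrieve → plan pipeline described above can be sketched in a few lines. This is only an illustration under stated assumptions, not SkillWeaver's code: the skill library, the `decompose` splitter (standing in for the LLM decomposer), and the word-overlap `retrieve` (standing in for the bi-encoder FAISS retriever) are all hypothetical, and pulling in prerequisite skills transitively is a design choice made here for the example.

```python
from graphlib import TopologicalSorter

# Hypothetical skill library: name -> (description, prerequisite skills).
SKILLS = {
    "fetch_csv": ("download csv file from url", []),
    "clean_table": ("clean table rows remove missing values", ["fetch_csv"]),
    "plot_chart": ("plot chart from table", ["clean_table"]),
    "send_email": ("send email with attachment", []),
}


def decompose(query: str) -> list[str]:
    """Stand-in for the LLM decomposer: split the query on ' then '."""
    return [part.strip() for part in query.split(" then ") if part.strip()]


def retrieve(subtask: str) -> str:
    """Stand-in for the retriever: pick the skill whose description
    shares the most words with the subtask."""
    words = set(subtask.lower().split())
    return max(SKILLS, key=lambda name: len(words & set(SKILLS[name][0].split())))


def plan(query: str) -> list[str]:
    """Route each subtask to a skill, then order skills so that every
    prerequisite runs before the skill that depends on it."""
    graph: dict[str, set[str]] = {}
    stack = [retrieve(subtask) for subtask in decompose(query)]
    while stack:
        name = stack.pop()
        if name in graph:
            continue
        deps = SKILLS[name][1]
        graph[name] = set(deps)  # edges point from a skill to its prerequisites
        stack.extend(deps)       # assumption: include prerequisites transitively
    return list(TopologicalSorter(graph).static_order())


if __name__ == "__main__":
    print(plan("plot chart of sales then send email to the team"))
```

The exact position of independent skills such as `send_email` in the output is not fixed; only the dependency order is guaranteed.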
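Apodex is designed for hard problems that have no ready-made answer. It can dispatch up to 150 sub-agents to explore in parallel, for more than 15,000 total steps. It beats GPT-5.5-pro on BrowseComp, and Claude-Opus-4.8 and Kimi-K2.6 on DeepSearchQA. The workflow has three stages: deep research, self-verification, and writing. It has a built-in three-layer self-verification mechanism (conflict reviewer, fact checker, draft reviewer) plus an independent global verifier. AgentOS handles the underlying plumbing: scheduling, routing, event streams, checkpoints, cost accounting, and permission management. Adding a new application takes only plugin code, with no changes to the kernel.
Vercel has released Eve, an open-source agent framework whose core design is "an agent is a directory": behavior is declared through files such as agent.ts, instructions.md, tools, skills, subagents, channels, schedules, and connections. Built in: durable sessions (checkpointable), sandbox isolation (local Docker or Vercel Sandbox), human-in-the-loop approval (which uses no compute), MCP/OpenAPI connections (with authentication proxied by the framework), multi-channel support (HTTP/Slack/Discord), OpenTelemetry tracing, and an eve eval gate. Local development uses the eve dev TUI; deployment is an ordinary Vercel project and does not interrupt in-flight sessions. Validated internally: d0 handles 30,000+ queries a month, Lead Agent costs about $5k a year and returns 32x, and Vertex auto-resolves about 92% of tickets.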
Introducing eve, an agent framework. agent/ agent.ts instructions.md tools/ skills/ sandbox/ schedules/ Like Next.js, fo...
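Shao Meng (邵猛) explains the dual-loop architecture of Codex Automations. The inner loop brings context into the task and, following principles such as "retrieval is writing" and reversible actions (create drafts only, never auto-send), quickly produces reviewable drafts. The outer loop starts after human review: it extracts evidence from the diff between draft and final version, distinguishes edit types (writing preference, missing-fact fill-in, commitment removal, and so on), and writes approved lessons into Markdown for the inner loop to use next time. The two loops run at different cadences: the inner loop is fast (for example every 2 hours) and the outer loop is slow (end of day, after N reviews accumulate, or weekly), balancing immediate efficiency against pattern improvement. The design applies to any "draft → human review → send/revise" workflow.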
http://x.com/i/article/2067086994455601152
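The outer loop's "diff the draft against the final version, then label each edit" step might look roughly like the sketch below. It is a toy under stated assumptions, not the article's implementation: the function names, the keyword-based `classify` rules, and the Markdown layout are all hypothetical, and a real system would more plausibly ask a model to label each edit and would write only lessons a human has approved.

```python
import difflib

# Hypothetical cue words that suggest the draft made a commitment.
COMMITMENT_WORDS = {"will", "promise", "guarantee"}


def extract_edits(draft: str, final: str) -> list[tuple[str, str, str]]:
    """Return (op, draft_text, final_text) for each changed word span."""
    a, b = draft.split(), final.split()
    matcher = difflib.SequenceMatcher(a=a, b=b, autojunk=False)
    edits = []
    for op, i1, i2, j1, j2 in matcher.get_opcodes():
        if op != "equal":
            edits.append((op, " ".join(a[i1:i2]), " ".join(b[j1:j2])))
    return edits


def classify(edit: tuple[str, str, str]) -> str:
    """Toy rules mapping an edit to one of the edit types named above."""
    op, old, _new = edit
    if op == "delete" and COMMITMENT_WORDS & set(old.lower().split()):
        return "commitment_removed"
    if op == "insert":
        return "fact_added"
    return "writing_preference"


def to_markdown(edits: list[tuple[str, str, str]]) -> str:
    """Render edits as Markdown lines the inner loop could read next time."""
    lines = ["## Lessons from reviewed drafts"]
    for edit in edits:
        _op, old, new = edit
        lines.append(f"- [{classify(edit)}] '{old}' -> '{new}'")
    return "\n".join(lines)


if __name__ == "__main__":
    print(to_markdown(extract_edits("We will deliver Friday", "We deliver Friday")))
```

Diffing at word level rather than character level keeps each extracted edit readable as evidence, at the cost of treating a one-letter typo fix as a whole-word replacement.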
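An article on the "inner loop" and "outer loop" design for AI-drafted email replies. The inner loop is a scheduled job that checks for new mail every 2 hours, automatically retrieves relevant context, and generates a draft without sending it; the user edits the draft by hand and sends it. The outer loop is a self-evolving Skill: the agent records every edit the user makes to a draft and uses it to keep refining the writing-style Skill, so that output matches the user's habits more closely. The author compares this with an earlier practice of distilling a writing-style Skill by hand and notes that this approach automates the iteration, forming a closed loop of continuous improvement.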
It's now easier to move local agents to the cloud so they can keep working with your laptop closed. Prompt Cursor from y...
Email dashboards had a good run. Two decades. Billions of emails. I built two companies on them. But the dashboard was n...
🚀 Introducing Genspark AgentBase (Preview). Turn your data into custom databases, dashboards, and internal systems. Sto...
We built an internal AI system called Builderbot. It coordinates agents across our entire codebase. Engineers tag it in ...
Meet Apodex 1.0 🔭 - a heavy-duty agent team for deep research, which sets the SOTA! The team searches the web, reasons ...
1. as a mental model it is more correct to think of fable+ class models as english -> code interpreters - converts your ...
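NVIDIA's GEAR lab has introduced the ENPIRE system, the first to carry out autonomous research in the physical world. The system has 8 Codex agents control 8 robots, provisioned with GPU and token budgets. For safety it relies on two layers of hardware protection, hard motion-limit cutoffs and torque-limited grippers, and supports unattended overnight runs. The reward function is fixed offline by a vision classifier and then frozen, to prevent agents from cheating. The system monitors robot utilization (MRU), token utilization (MTU), and GPU utilization in real time, and evaluates efficiency with Tokens-to-Success and Time-to-Success. ENPIRE autonomously completed high-precision tasks such as fastening zip ties, sorting fine needles, and installing a GPU, and found that parallel exploration across 8 robots is significantly faster. The system will be open-sourced.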
Today, we enable AutoResearch in the physical world for the first time! Introducing ENPIRE: we give 8 Codex agents a fle...
Last week Apple previewed the future of Siri. In 1987 though, Apple showcased a far more advanced AI assistant that woul...
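Nitrosend has launched an AI-agent-based email automation tool. It lets Codex, ChatGPT, Claude, Cursor, Gemini, or any MCP agent build and send branded email campaigns from a single prompt. The system keeps learning from send data and automatically optimizes subject lines, send timing, and content rather than relying on generic advice. It quotes @gthartley: traditional email dashboards ran for two decades, but the dashboard itself was the bottleneck, and Nitrosend removes it.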
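Exa has officially released Exa Agent, a hosted API that packages frontier models and Exa's in-house search toolchain behind a single interface, aimed at deep research, list building, and entity enrichment. Core techniques: task decomposition with parallel sub-agents (a Map-Reduce architecture); Model Fusion, which dynamically mixes frontier and economical models per task; and a Highlights model that can cut token usage by up to 94%. On the WideSearch benchmark, scored with Row-F1, Exa Agent costs less than half as much as GPT 5.5 and Opus 4.8 and sits on the Pareto frontier. Use cases span finance, GTM/sales, company research, and literature/code review.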
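As a rough picture of the Map-Reduce shape with per-task model mixing described above, here is a minimal sketch. Nothing in it is Exa's API: `pick_model`, `sub_agent`, `run`, and the word-count routing rule are hypothetical placeholders, and the worker returns no real answer because no model or search call is made.

```python
from concurrent.futures import ThreadPoolExecutor


def pick_model(subtask: str) -> str:
    """Hypothetical routing rule: longer subtasks go to the larger model."""
    return "frontier" if len(subtask.split()) > 6 else "economy"


def sub_agent(subtask: str) -> dict:
    """Placeholder worker. A real one would call a model and a search tool."""
    entity = subtask.rsplit(": ", 1)[1]
    return {"entity": entity, "model": pick_model(subtask), "answer": None}


def run(task: str, entities: list[str], max_workers: int = 4) -> list[dict]:
    # Decompose: one subtask per entity (one row of the final table).
    subtasks = [f"{task}: {entity}" for entity in entities]
    # Map: run the sub-agents in parallel threads.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        rows = list(pool.map(sub_agent, subtasks))
    # Reduce: merge the per-entity rows into one sorted table.
    return sorted(rows, key=lambda row: row["entity"])


if __name__ == "__main__":
    for row in run("find the headquarters city", ["Vercel", "Exa"]):
        print(row)
```

Threads suit this shape because each sub-agent would spend most of its time waiting on network calls rather than computing.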
Introducing Exa Agent: frontier web research at less than half the cost of GPT 5.5 and Opus. /agent orchestrates a mixtu...
Get paid for your Hermes skills @NousResearch Capafy now supports Hermes Agent. Publish your Hermes skills and keep them...
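Following Alipay, WeChat has launched the "AI AgentPay Card" (AI专属卡), which lets an AI agent complete the whole flow from recommendation to ordering to payment inside a conversation. The card is fully isolated from the main WeChat Pay account: the agent can spend only the card's balance, and the user can top up or withdraw at any time. The feature is currently available through the Meituan life assistant (美团生活助手) in WorkBuddy (WeCom for Mac, v5.1.1 or later), where the AI can buy group-buy vouchers for the user to redeem in store. The design embeds payment capability into the agent workflow, protecting funds while closing the commercial loop, and paves the way for future automatic price negotiation and settlement between agents.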
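The isolation rule described above (the agent spends only from the card, while the user moves money in and out) can be modelled as a small ledger. This is an illustrative sketch only: the `Wallet` class and its method names are hypothetical and say nothing about how WeChat Pay is implemented. Amounts are integers in the smallest currency unit to avoid floating-point rounding.

```python
class InsufficientCardBalance(Exception):
    """Raised when an agent payment exceeds the card balance."""


class Wallet:
    def __init__(self, main_balance: int) -> None:
        self.main = main_balance  # main account: never touched by the agent
        self.card = 0             # agent card: the only funds the agent can spend

    def top_up(self, amount: int) -> None:
        """User action: move funds from the main account to the card."""
        if amount <= 0 or amount > self.main:
            raise ValueError("invalid top-up amount")
        self.main -= amount
        self.card += amount

    def withdraw(self, amount: int) -> None:
        """User action: move unspent funds from the card back to the main account."""
        if amount <= 0 or amount > self.card:
            raise ValueError("invalid withdrawal amount")
        self.card -= amount
        self.main += amount

    def agent_pay(self, amount: int) -> None:
        """Agent action: pay from the card only; never falls back to main."""
        if amount <= 0:
            raise ValueError("invalid payment amount")
        if amount > self.card:
            raise InsufficientCardBalance(f"card holds {self.card}, need {amount}")
        self.card -= amount


if __name__ == "__main__":
    wallet = Wallet(main_balance=10_000)
    wallet.top_up(3_000)
    wallet.agent_pay(2_500)
    print(wallet.main, wallet.card)
```

The worst-case loss from a misbehaving agent is bounded by whatever the user last chose to put on the card.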
With the "AI AgentPay Card"(AI专属卡), users can make purchases inside supported AI Agent conversations - from recommendati...
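GLM-5.2 has been officially released, and hands-on testing shows a qualitative change in its agent capabilities. The model can internalize map data into its 1M context and directly knows where battery-swap stations are, without calling a search function at any point; it was the only one of the 20+ models tested that could do this. Its backend agentic coding ability rose to second place on the overall leaderboard. Its weak spot is spatial understanding: it remembers the station locations but cannot reason about which station is nearest to the current position.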