Automating SKILL.md Generation: A Three-Stage Pipeline Paper

elvis@omarsar0

2026-06-19 23:04

AI Summary

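Key points: OpenAI released a similar feature yesterday that lets Codex package skills from interactions. The paper proposes a three-stage pipeline (segment GUI trajectories → cluster into candidate skills → train a skill-aware policy). Cluster purity is strong (5 of 8 clusters reach 0.95 or higher), but readability does not transfer: GRPO lifts skill-step accuracy only from 18.5% to 20.5%, brings no improvement on BrowseComp+, and even loses to simple frequency priors. The authors point to three flaws: a weak boundary detector, an orderless segment representation, and an offline reward model.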

// Automating SKILL.md Generation //

Increasingly, mining sessions is one of the best ways to improve your agents.

OpenAI released something similar yesterday that lets Codex package skills from interactions.

(bookmark it)

This paper explains a related approach.

They run a three-stage pipeline that segments GUI trajectories, clusters them into candidate skills, and trains a skill-aware policy.

The clusters are genuinely readable, with five of eight hitting 0.95 or higher purity against ground-truth workflow labels.
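
For anyone new to the metric: cluster purity is usually the share of a cluster's members that carry its most common ground-truth label. A minimal sketch, assuming that standard definition (the post doesn't say exactly how the paper computes it) and using made-up workflow labels:

```python
from collections import Counter, defaultdict

def cluster_purity(cluster_ids, true_labels):
    """Return {cluster_id: purity}, where purity is the fraction of a
    cluster's members that share its most common ground-truth label."""
    members = defaultdict(list)
    for cid, label in zip(cluster_ids, true_labels):
        members[cid].append(label)
    return {
        cid: Counter(labels).most_common(1)[0][1] / len(labels)
        for cid, labels in members.items()
    }

# Hypothetical toy data, not taken from the paper.
cluster_ids = [0, 0, 0, 0, 1, 1, 1, 1]
true_labels = ["login", "login", "login", "login",
               "search", "search", "checkout", "search"]
purity = cluster_purity(cluster_ids, true_labels)
```

Here cluster 0 is perfectly pure and cluster 1 is diluted by a single stray "checkout" segment. A purity of 0.95 means at most 1 in 20 segments in that cluster comes from a different workflow.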

But readability does not transfer. GRPO lifts skill-step accuracy only from 18.5% to 20.5%, leaves BrowseComp+ flat, and loses to trivial frequency priors.
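
To see why losing to a frequency prior stings: such a baseline presumably ignores the current state entirely and always predicts the most common label from training. A minimal sketch under that assumption (the post doesn't describe the paper's exact baseline, and the skill names are hypothetical):

```python
from collections import Counter

def fit_frequency_prior(train_skills):
    """Return the single most frequent skill label in the training data."""
    return Counter(train_skills).most_common(1)[0][0]

def accuracy(predictions, targets):
    """Fraction of positions where the prediction matches the target."""
    return sum(p == t for p, t in zip(predictions, targets)) / len(targets)

# Hypothetical toy data, not taken from the paper.
train_skills = ["search", "search", "click_result", "search", "fill_form"]
test_skills = ["search", "fill_form", "search", "click_result"]

prior = fit_frequency_prior(train_skills)
# The baseline predicts the same label at every step.
baseline_acc = accuracy([prior] * len(test_skills), test_skills)
```

When one skill dominates the data, this constant guess can score surprisingly well, so a learned policy has to beat it before it can claim to use the mined skills.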

The authors name three culprits: a weak boundary detector, an orderless segment representation, and an offline reward model.

Paper: https://arxiv.org/abs/2606.20363

Learn to build effective AI agents in our academy: https://academy.dair.ai/

