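OpenAI and Thrive have jointly developed a self-improving tax AI agent that has processed about 7,000 tax returns at more than 30 accounting firms. The agent cuts preparation time by about one-third, raises throughput by about 50%, and reaches up to 97% accuracy. The technical difficulty lies in handling unstructured documents such as messy K-1s and rental schedules, and in matching values across documents. The system records a full trace for every action and treats accountants' repeated corrections as eval targets, which drive testable code-fix tasks for Codex and close the self-improvement loop.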
OpenAI and Thrive just built a self-improving tax agent with up to 97% accuracy.
Tax AI processed 7,000 returns across 30+ accounting firms, saved about one-third of preparation time, reached up to 97% accuracy, and raised throughput by about 50%.
The hard part was not reading W-2s or 1099s, but handling messy K-1s, rental schedules, notes, spreadsheets, prior-year files, and values that must match across documents.
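As a rough illustration of the cross-document matching problem, the sketch below groups extracted values by a shared key and flags any key whose sources disagree. The file names, key names, and the `find_mismatches` helper are all hypothetical; the source does not describe how the real system performs this check.

```python
from collections import defaultdict
from decimal import Decimal


def find_mismatches(extracted, tolerance=Decimal("0.00")):
    """Return keys whose values differ across source documents.

    `extracted` is an iterable of (source_file, key, value) tuples.
    Hypothetical shape, chosen for illustration only.
    """
    by_key = defaultdict(list)
    for source_file, key, value in extracted:
        by_key[key].append((source_file, Decimal(value)))

    mismatches = {}
    for key, items in by_key.items():
        values = [v for _, v in items]
        # Flag the key when the spread between sources exceeds the tolerance.
        if max(values) - min(values) > tolerance:
            mismatches[key] = items
    return mismatches


# Illustrative data: the K-1 and the client spreadsheet disagree.
extracted = [
    ("k1_partnership_a.pdf", "partnership_a.ordinary_income", "12500.00"),
    ("client_summary.xlsx", "partnership_a.ordinary_income", "12050.00"),
    ("rental_schedule.pdf", "rental_1.gross_rents", "24000.00"),
    ("bookkeeping.xlsx", "rental_1.gross_rents", "24000.00"),
]
mismatches = find_mismatches(extracted)
```

Using `Decimal` rather than floats keeps currency comparisons exact, which matters when the tolerance is zero.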
The system records the full trace: source file, extracted field, citation, tax-engine mapping, accountant correction, and final filed value.
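One way to picture such a trace is as a single immutable record per extracted value. The sketch below is an assumption about shape only: the `TraceRecord` class, its field names, and the sample values are hypothetical and simply mirror the six items listed above, with the extracted value stored alongside the field name.

```python
from dataclasses import dataclass, asdict
from typing import Optional


@dataclass(frozen=True)
class TraceRecord:
    """One extracted value, from source document to filed return (hypothetical schema)."""
    source_file: str                      # document the value came from
    extracted_field: str                  # name of the field that was read
    extracted_value: str                  # value the agent extracted
    citation: str                         # where in the document it was found
    tax_engine_mapping: str               # target line in the tax engine
    accountant_correction: Optional[str]  # None if the accountant accepted the value
    final_filed_value: str                # value that went on the filed return

    @property
    def was_corrected(self) -> bool:
        return self.accountant_correction is not None


# Illustrative record: the accountant overrode the extracted amount.
record = TraceRecord(
    source_file="k1_partnership_a.pdf",
    extracted_field="ordinary_income",
    extracted_value="12050.00",
    citation="page 1, box 1",
    tax_engine_mapping="schedule_e.part_ii.line_28",
    accountant_correction="12500.00",
    final_filed_value="12500.00",
)
row = asdict(record)  # plain dict, ready to log or serialize
```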
Repeated corrections become eval targets, so Codex gets a narrow task with evidence, code, tests, and a pass condition.
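A minimal sketch of that step, assuming corrections are logged as dictionaries with a field name and an error kind: corrections that recur at least `min_repeats` times are bundled into a task with its evidence and a pass condition. The function names, dictionary keys, threshold, and sample data are hypothetical, not the system's actual interface.

```python
from collections import Counter


def build_eval_targets(corrections, min_repeats=3):
    """Turn recurring accountant corrections into narrow eval targets."""
    counts = Counter((c["field"], c["error_kind"]) for c in corrections)
    targets = []
    for (field, kind), n in counts.most_common():
        if n < min_repeats:
            continue  # one-off corrections are not treated as a pattern
        evidence = [
            c for c in corrections
            if c["field"] == field and c["error_kind"] == kind
        ]
        targets.append({
            "task": f"Fix {kind} for {field}",
            "evidence": evidence,
            "occurrences": n,
        })
    return targets


def passes(extract_fn, target):
    """Pass condition: the extractor reproduces every corrected value."""
    return all(
        extract_fn(c["source_file"]) == c["corrected"]
        for c in target["evidence"]
    )


# Illustrative log: the same K-1 error recurs three times, another occurs once.
corrections = [
    {"source_file": "a.pdf", "field": "k1.box_20", "error_kind": "wrong_code", "corrected": "Z"},
    {"source_file": "b.pdf", "field": "k1.box_20", "error_kind": "wrong_code", "corrected": "Z"},
    {"source_file": "c.pdf", "field": "k1.box_20", "error_kind": "wrong_code", "corrected": "Z"},
    {"source_file": "d.pdf", "field": "rental.gross_rents", "error_kind": "sign_flip", "corrected": "24000.00"},
]
targets = build_eval_targets(corrections)
```

Here `passes` plays the role of the pass condition: a candidate fix is accepted only if the patched extractor matches the accountant's value on every piece of evidence.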