# OpenAI and Thrive Build a Self-Improving Tax AI Agent with Up to 97% Accuracy

- Source: Rohan Paul (@rohanpaul_ai)
- Published: 2026-05-28 00:46
- AIHOT score: 53
- AIHOT link: https://aihot.virxact.com/items/cmpobfjav04muslv42az8yy03
- Original link: https://x.com/rohanpaul_ai/status/2059677513341931633

## AI Summary

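OpenAI and Thrive jointly developed a self-improving tax AI agent that has processed about 7,000 tax returns across more than 30 accounting firms. The agent cuts preparation time by about one-third, raises throughput by about 50%, and reaches up to 97% accuracy. The technical difficulty lies in handling unstructured documents such as messy K-1s and rental schedules, and in matching values across documents. The system records a full trace for every action and treats accountants' repeated corrections as eval targets, which drive testable code-fix tasks for Codex and close the self-improvement loop.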

## Body

OpenAI and Thrive just built a self-improving tax agent with up to 97% accuracy.

Tax AI processed 7,000 returns across 30+ accounting firms, saved about one-third of preparation time, reached up to 97% accuracy, and raised throughput by about 50%.

The hard part was not reading W-2s or 1099s, but handling messy K-1s, rental schedules, notes, spreadsheets, prior-year files, and values that must match across documents.
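To make "values that must match across documents" concrete, here is a minimal sketch of a cross-document consistency check. The post does not describe how the agent does this; the function name, field names, file names, and amounts below are all hypothetical.

```python
from decimal import Decimal
from typing import Dict, List


def find_mismatches(
    extracted: Dict[str, Dict[str, Decimal]],
    tolerance: Decimal = Decimal("0.00"),
) -> List[str]:
    """Return the fields whose values disagree across source documents.

    `extracted` maps a field name to {document name: extracted value}.
    This layout is an assumption made for illustration only.
    """
    mismatched = []
    for field, by_document in extracted.items():
        values = list(by_document.values())
        # A field is flagged when its highest and lowest readings differ
        # by more than the allowed tolerance.
        if values and max(values) - min(values) > tolerance:
            mismatched.append(field)
    return mismatched


# Made-up example data: one field agrees, one does not.
extracted = {
    "rental_income": {
        "spreadsheet.xlsx": Decimal("12000.00"),
        "rental_schedule.pdf": Decimal("12000.00"),
    },
    "mortgage_interest": {
        "notes.txt": Decimal("4300.00"),
        "lender_statement.pdf": Decimal("4350.00"),
    },
}
mismatches = find_mismatches(extracted)
print(mismatches)
```

`Decimal` is used instead of `float` so that currency amounts compare exactly.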

The system records the full trace: source file, extracted field, citation, tax-engine mapping, accountant correction, and final filed value.
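One way to picture such a trace is as a single record per tax field. The sketch below is only an illustration: the class name, attribute names, the extra `extracted_value` attribute, and the sample values are assumptions, not the actual schema.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class FieldTrace:
    """One traced tax field. All names and values here are hypothetical."""

    source_file: str                      # document the value came from
    extracted_field: str                  # field name assigned at extraction
    extracted_value: str                  # value as the agent read it (assumed)
    citation: str                         # location of the value in the source
    engine_mapping: str                   # tax-engine slot the field maps to
    accountant_correction: Optional[str]  # None when the value was accepted
    final_filed_value: str                # value on the filed return

    @property
    def was_corrected(self) -> bool:
        """True when an accountant overrode the extracted value."""
        return self.accountant_correction is not None


# Made-up example: the accountant replaced the extracted amount.
trace = FieldTrace(
    source_file="rental_schedule.xlsx",
    extracted_field="rental_depreciation",
    extracted_value="4100.00",
    citation="sheet 'Property A', cell D14",
    engine_mapping="rental.depreciation",
    accountant_correction="4600.00",
    final_filed_value="4600.00",
)
print(trace.was_corrected)
```

Keeping the correction and the filed value next to the extraction details is what lets a later step ask where in the chain a wrong number came from.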

Repeated corrections become eval targets, so Codex gets a narrow task with evidence, code, tests, and a pass condition.
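A rough sketch of how repeated corrections might be surfaced as eval targets: count corrections per (field, mapping) pair and keep the pairs seen often enough. The grouping key, the threshold of three, and all names are assumptions for illustration; the post does not say how repetition is measured.

```python
from collections import Counter
from typing import Iterable, List, Tuple

# Hypothetical grouping key: (extracted_field, engine_mapping).
CorrectionKey = Tuple[str, str]


def repeated_corrections(
    corrections: Iterable[CorrectionKey],
    min_repeats: int = 3,
) -> List[Tuple[CorrectionKey, int]]:
    """Return correction keys seen at least `min_repeats` times,
    most frequent first."""
    counts = Counter(corrections)
    return [(key, n) for key, n in counts.most_common() if n >= min_repeats]


# Made-up correction log: one pattern recurs, one is a one-off.
log = [
    ("rental_depreciation", "rental.depreciation"),
    ("k1_ordinary_income", "passthrough.income"),
    ("rental_depreciation", "rental.depreciation"),
    ("rental_depreciation", "rental.depreciation"),
]
targets = repeated_corrections(log)
print(targets)
```

Each surviving key would then be packaged with its traces, the relevant code, regression tests, and a pass condition before being handed over as a task.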

A wrong tax field can come from many places: bad extraction, weak mapping, unsupported workflow, prior-year carryover, or human judgment.

The clever part was not simply using Codex to write fixes, but building a product environment where repeated practitioner corrections became bounded, testable engineering tasks.

In the rental-property example, the agent could inspect source documents, extraction traces, mapper behavior, expected outputs, and regression tests before proposing a change.
