# Response Healing：将 JSON 缺陷减少 80% 以上

- 来源：OpenRouter：Announcements（RSS）
- 作者：Alex Atallah
- 发布时间：2025-12-18 08:00
- AIHOT 分数：61
- AIHOT 标记：精选
- AIHOT 链接：https://aihot.virxact.com/items/cmor004i80070slixlug7hiki
- 原文链接：https://openrouter.ai/announcements/response-healing-reduce-json-defects-by-80percent

## 精选理由

做 Agent 的人最怕 JSON 解析炸掉整个 pipeline，OpenRouter 这个 Response Healing 相当于在网关层加了自动纠错，接入成本几乎为零，值得试试。

## AI 摘要

OpenRouter 推出新功能 Response Healing，可在 LLM 生成的畸形 JSON 响应抵达用户应用前自动修复。该功能旨在将 JSON 格式错误减少超过 80%，直接提升 API 响应的结构完整性与可靠性，减少下游应用的处理负担。

## 正文

Response Healing: Reduce JSON Defects by 80%+ — OpenRouter Blog

Response Healing: Reduce JSON Defects by 80%+

Alex Atallah · 12/18/2025

On this page

The Math That Should Keep You Up at Night

What We’re Fixing

The Benchmarks

How to Enable It

What This Doesn’t Fix

Ship with Confidence

We expect our APIs to have 99.999% uptime. We’d never tolerate a payment processor that failed 2% of the time. So why do we accept LLMs that routinely break JSON syntax in structured output requests?

Today we’re launching Response Healing: a new feature on OpenRouter that automatically fixes malformed JSON responses from LLMs before they reach your application.

Two standout improvements from a week of data:

Gemini 2.0 Flash, our most popular model for structured output with over 1.6 million requests in the past week, saw its defect rate decline by 80%.

Qwen3 235B, one of the most capable open-weight models available, saw its defect rate decline by 99.8%.

The Math That Should Keep You Up at Night

Here’s something most developers overlook: if an LLM has a 2% JSON defect rate, and Response Healing drops that to 1%, you haven’t just made a 1% improvement. You’ve cut your defects, bugs, and support tickets in half.

At OpenRouter’s scale, we see this compounding effect across billions of tokens daily. A “small” improvement in structured output reliability translates to dramatically fewer 3am pages, fewer angry users, and fewer hours debugging why your agent suddenly stopped working.

This is why we obsess over this problem more than any other gateway. Reliability at the margins is where real production systems succeed or fail.

What We’re Fixing

LLMs make surprisingly creative mistakes when generating JSON. Common issues include trailing commas after the last element, unescaped control characters in strings, missing closing brackets, and various syntax errors that break parsers.

Here’s the data you requested: {…}

That’s not something that should ever take you down.

For a detailed breakdown of the failure modes we handle, check out our Response Healing documentation:

The Benchmarks

We analyzed millions of structured output generations across our platform. We did this on the fly, at inference time, without logging any completions or storing results.

Here are the results for the highest-volume models:

Model Requests Success Before Success After Defects Resolved

Gemini 2.0 Flash 1.62M 99.61% 99.92% 80.0%

Gemini 2.5 Flash 772k 98.97% 99.65% 66.3%

Gemini 2.5 Flash Lite 703k 99.64% 99.89% 68.7%

GPT-4o Mini 494k 99.98% 100.00% 80.7%

Grok 4 Fast 488k 92.89% 94.87% 27.8%

Grok 4.1 Fast 284k 98.70% 99.17% 36.4%

Gemini 2.0 Flash Lite 282k 99.94% 100.00% 98.9%

Deepseek Chat v3.1 196k 82.54% 97.39% 85.0%

GPT-4.1 155k 98.22% 98.40% 10.4%

Qwen3 235B 113k 88.02% 99.98% 99.8%

GPT-oss-120b 112k 99.53% 99.82% 62.2%

Devstral 2512 104k 96.59% 99.99% 99.6%

Gemini 2.5 Flash Lite Preview 93k 99.14% 99.86% 83.7%

Llama 3.1 8B Instruct 79k 99.68% 99.91% 72.4%

GPT-oss-20b 58k 99.01% 99.36% 34.8%

Mistral Small 3.2 24B 57k 98.82% 99.99% 99.3%

GPT-5 Nano 52k 99.96% 99.96% 8.7%

Ministral 3B 52k 99.99% 100.00% 100.0%

Some highlights worth noting since we soft-rolled this out a week ago:

mistralai/devstral-2512: customers who turned on the plugin had valid json rate taken from 97% to 99.99%, which is a 99.7% defect reduction

google/gemini-2.5-flash: success rate increased from 97.5% to 99.88%. 95.2% reduction

meta-llama/llama-3.1-8b-instruct: success rate increased from 99.9% to 100%. 100% reduction

Qwen3-235B: 87.97% valid to 99.98% valid, a 99.85% reduction

Deepseek Chat V3.1: 83.16% valid to 97.46% valid, a 84.89% reduction

Several models—Ministral 3B, Devstral 2512, Mistral Small 3.2—achieve near-perfect healing rates above 99%

Even models that already perform well see meaningful gains: Gemini 2.0 Flash Lite went from 99.94% to 100% validity

How to Enable It

Response Healing is opt-in. You can configure it through the new Plugins section in your settings:

openrouter.ai/settings/plugins

Toggle it on, and every structured output request will automatically pass through our healing layer before returning to your application.

Cost

The plugin is free to use. In terms of latency, we ran an analysis of added CPU time across all production data:

Category Mean Time Ops/Second

Schema-less Repair 0.018ms 54,700

Unified API 0.019ms 51,500

Type Coercion 0.041ms 32,600

Basic Parsing 0.133ms 16,900

Large Payloads (10KB) 2.3ms 437

In reality, factors outside the plugin are going to dominate any real world latency. So we can say that, for typical responses, healing adds less than 1ms of latency, negligible compared to LLM inference time.

What This Doesn’t Fix

To be clear about scope: Response Healing fixes JSON syntax errors, not schema adherence. If a model returns valid JSON that doesn’t match your expected schema (wrong field names, missing required properties, wrong types), healing won’t catch that.

It also only works for non-streaming requests today. Contact us with your use case if you have a need for fixing streaming requests as well.

That said, you should still see a meaningful drop in your overall error rate. Syntax errors are one of the most common failure modes, and eliminating them lets you focus your error handling on the semantic issues that actually require application logic to resolve.

What about tool calling and schema adherence? Tool calling has very few structural JSON issues, but schema adherence has many defects across most models. We’ll evaluate schema adherence soon.

What about XML? The plugin can heal XML output as well - contact us if you’d like access.

Ship with Confidence

We built OpenRouter to be the infrastructure layer you don’t have to think about. Response Healing is another step toward that goal: structured outputs that just work, every time.

Enable it today at openrouter.ai/settings/plugins, and let us know what you’re building.
