OpenRouter 推出 Advisor 工具:让低成本模型可随时调用强模型增强生成
OpenRouter 开放了跨模型顾问调用,让便宜模型在关键节点求助昂贵模型,这会让 agent 开发重心从选一个万能模型转向编排一组模型,值得所有做 agent 架构的人试一下。
OpenRouter 发布 advisor 服务器工具,允许一个快速、便宜的模型在生成过程中咨询一个更强大的模型。具体而言,可用 GPT-4o Mini 处理日常例行工作,在关键时刻调用 Claude Fable 解决真正重要的问题,从而实现成本和质量的动态平衡。
Advisor: Give Any Model a Lifeline to a Smarter One — OpenRouter Blog
Advisor: Give Any Model a Lifeline to a Smarter One
Kenny Rogers · 6/10/2026

On this page
- 67x price gap, selective consultation
- Server-side execution, one tool call
- Named advisors
- How this compares to other advisor tools
- Billing
Add openrouter:advisor to your tools array and your model can ask a stronger model for help mid-generation. When the executor hits a hard decision, gets stuck, or wants a sanity check before finishing, it calls the advisor with a prompt. The advisor thinks, returns guidance as the tool result, and the executor keeps going with better information.
Both roles are open: any model on OpenRouter can be the executor, and any model from any provider can be the advisor. Run a Gemini executor that consults Claude, or a GPT executor that consults DeepSeek. You pick the pairing.
Try it in the chatroom or read the docs for the full API reference.
{
"model": "openai/gpt-4o-mini",
"messages": [{ "role": "user", "content": "Design a rate limiter for a distributed API gateway." }],
"tools": [
{
"type": "openrouter:advisor",
"parameters": { "model": "anthropic/claude-fable-5" }
}
]
}
67x price gap, selective consultation
Claude Fable 5 costs $10 per million input tokens. GPT-4o Mini costs $0.15 per million. That’s a 67x spread.
Most requests don’t need frontier-level reasoning. A mid-tier model handles the bulk of a workload without issue. But the 10-20% that involves architectural decisions, ambiguous edge cases, or multi-step reasoning chains is where cheaper models stumble.
The advisor tool covers that gap selectively. Your fast model runs the show. When it hits something genuinely hard, it calls for help. You pay frontier prices only for the moments that need frontier thinking.
In an agentic coding session with 50 tool calls, maybe 2-3 are advisor consultations. The rest run at mini prices. You’ve sanded down your per-session cost while keeping the quality ceiling high.
Server-side execution, one tool call
The advisor runs server-side during generation. Your model calls it like any other tool: pass a prompt describing what it needs help with, get back the advisor’s text as the tool result. The model then writes the final answer itself, informed by the advice. The advisor is a consultant, not a ghostwriter.
Four things worth knowing:
Any model, from any provider, can be the advisor. Pin it in the tool config with
parameters.model(anything in the model catalog works), or let the executor pick per-call. Use~anthropic/claude-fable-latestto always resolve to the newest Fable.The advisor gets its own tools. Give it
openrouter:web_searchand it’ll ground its advice in fresh sources before responding. It runs as a sub-agent with its own tool loop, then returns just the final guidance.Recursion is blocked. The advisor can’t call itself. A depth header and self-reference check prevent unbounded nesting, and consultations are capped per request to bound cost.
The advisor remembers. Replay the conversation transcript in a follow-up request (with the advisor tool calls and results included) and each advisor reconstructs its prior consultations, so a follow-up question builds on what the advisor already said. Memory is per advisor (your security reviewer and your architect each keep their own thread) and works across Chat Completions, Responses, and Anthropic Messages. Full details.
Named advisors
For complex workflows, you can configure a roster of specialists. Add one openrouter:advisor entry per advisor, each with its own name, model, instructions, and tool set:
{
"tools": [
{
"type": "openrouter:advisor",
"parameters": {
"name": "security-reviewer",
"model": "anthropic/claude-fable-5",
"instructions": "You are a security engineer. Find vulnerabilities."
}
},
{
"type": "openrouter:advisor",
"parameters": {
"name": "architect",
"model": "openai/gpt-5.5",
"instructions": "You are a systems architect. Prioritize simplicity and scalability."
}
}
]
}
The executor sees a distinct tool for each advisor and calls whichever fits the task with just a prompt. An auth flow review routes to Claude Fable with the security persona; architecture questions go to GPT-5.5. Names can use letters, digits, spaces, underscores, and dashes (“Lead Architect” works), and must be unique across entries. One entry can omit name to act as the default advisor.
Advice can also stream. Set "stream": true on an advisor entry and you get the advice incrementally as the advisor writes it. In the Responses API that means response.output_text.delta events while the advice is in flight; the completed output item still carries the full text, so consumers that ignore deltas see no difference. (Chat Completions ignores the flag, and Messages-API streaming is a fast-follow.)
How this compares to other advisor tools
Some providers ship a similar advisor concept in their own APIs, but it stays inside their model family: the executor and the advisor both have to come from the same vendor, often from a fixed pairing matrix, and sometimes behind a beta gate. OpenRouter’s advisor removes those constraints and adds a few things on top:
- Any model, any provider, on both sides. Both the executor and the advisor can be any of the hundreds of models in the catalog: a cheap open-weights executor consulting a frontier model, a Gemini executor consulting Claude, or a Claude executor getting a second opinion from GPT-5.5 outside its own model family.
- A roster of named advisors. Configure multiple specialists with their own models, instructions, and tool sets in a single request, and let the executor route each question to the right one. Single-vendor versions give you one unnamed advisor.
- Advisors with their own tools. Hand an advisor
openrouter:web_searchand it grounds its advice in fresh sources before responding. - Works across API formats, no beta gate. The same tool works through Chat Completions, Responses, and Anthropic Messages (with cross-request memory in all three), and it’s generally available. No beta header, no account-team access request.
If you’re already using a provider-native advisor through one of our compatible API skins, swapping to openrouter:advisor opens up the full catalog without changing the rest of your request.
Billing
Advisor tokens bill at the advisor model’s rates, separate from the executor. If your executor is GPT-4o Mini ($0.15/$0.60 per M tokens) and the advisor is Claude Fable 5 ($10/$50 per M tokens), each model’s tokens bill at their own price. Both show up on your activity page.
One line in your tools array:
{ "type": "openrouter:advisor", "parameters": { "model": "anthropic/claude-fable-5" } } The model decides when to use it. Most requests won’t trigger a consultation; the ones that do will be better for it. Read the full docs for parameters, named advisors, sub-agent tools, and more.