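PolyAI's research confirms that Raven 3.5, a smaller model designed specifically for customer service, significantly outperforms general-purpose frontier models 100× its size. The model beats GPT-5 and Claude Sonnet 4.6 on all four customer service benchmarks while keeping response latency under 300 ms. The release also includes the ADK code development kit and PolyPhone, a web-based voice generation tool, to help enterprises build production-grade voice agents quickly. The aim is to turn enterprise voice AI from a large project into infrastructure that can be deployed quickly, addressing long customer service wait times and high costs while improving service efficiency and customer experience.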
Can a smaller model purpose-built for one domain beat a frontier general model that's 100× its size?
A recent paper says yes, and not by a small margin.
Raven 3.5 from PolyAI shows that a smaller specialist model can beat bigger general models on customer service calls.
It beats GPT-5 and Claude Sonnet 4.6 on all four customer service benchmarks while keeping latency under 300 ms.
This is one of the live debates in ML, and many researchers are asking the same question. The paper offers an empirical answer.
PolyAI's research team published "Raven 3.5: The post-training recipe that beats GPT-5 for customer service".