# Convex Language Detection for Low-Resource and Accent-Robust Speech Recognition

- Source: HuggingFace Daily Papers (community trending papers)
- Published: 2026-05-22 08:00
- AIHOT score: 40
- AIHOT link: https://aihot.virxact.com/items/cmprdn8f20d4xslnopjml67ce
- Original link: https://arxiv.org/abs/2605.23235

## AI Summary

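The diversity of speech worldwide causes existing spoken dialogue systems to misidentify the language when handling dialects and accents, which leads to downstream task failures. To address this, the study proposes the Convex Language Detection framework, which integrates convex optimization techniques into the system. The method is implemented efficiently in JAX using multi-GPU ADMM, offers global optimality guarantees and fast training, and comes with theoretical proofs of its stability and robustness. Experiments show that the framework achieves 97-98% language detection accuracy in low-resource settings, demonstrating high sample efficiency. An accompanying open-source toolkit has been released.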

## Main Text

Globalization and multiculturalism continue to produce increasingly diverse speech varieties. Yet current spoken dialogue systems frequently fail on under-represented dialects and accents, often misidentifying the input language and causing cascading failures in downstream dialogue tasks. Addressing this dialectal variance under low-resource constraints remains an open challenge, as standard fine-tuning is computationally expensive and prone to overfitting on high-dimensional speech data. We propose Convex Language Detection (CLD), a novel framework that integrates theoretically grounded convex optimization techniques into the spoken dialogue system pipeline. Our method is efficiently implemented via multi-GPU Alternating Direction Method of Multipliers (ADMM) in JAX, thus providing global optimality guarantees and fast training in polynomial time. Theoretically, we prove that our convex objective induces certified margin stability and provide guarantees against feature perturbations. Empirically, we demonstrate sample efficiency and robustness to input dialectal variation, achieving 97-98% accuracy in challenging low-resource regimes. Our open-source package is available at https://pypi.org/project/jaxcld/
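
The abstract names ADMM as the solver but does not spell out the objective it is applied to. As a rough illustration of the ADMM update pattern only, the sketch below solves a generic convex problem, the lasso, in plain single-process Python. It is not the CLD objective, not the multi-GPU JAX implementation, and not the `jaxcld` API; every function and variable name here is hypothetical.

```python
# Illustrative ADMM for the lasso:
#   minimize 0.5*||A x - b||^2 + lam*||z||_1   subject to x = z.
# Assumption: this toy problem and all names are hypothetical, not the jaxcld API.

def soft_threshold(v, kappa):
    """Proximal operator of kappa*|.|: shrink v toward zero by kappa."""
    if v > kappa:
        return v - kappa
    if v < -kappa:
        return v + kappa
    return 0.0


def solve(M, rhs):
    """Solve M y = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(M[i]) + [rhs[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    y = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * y[j] for j in range(i + 1, n))
        y[i] = (aug[i][n] - tail) / aug[i][i]
    return y


def admm_lasso(A, b, lam, rho=1.0, iters=200):
    m, n = len(A), len(A[0])
    # A^T A + rho*I and A^T b stay fixed across iterations, so build them once.
    gram = [[sum(A[k][i] * A[k][j] for k in range(m)) + (rho if i == j else 0.0)
             for j in range(n)] for i in range(n)]
    atb = [sum(A[k][i] * b[k] for k in range(m)) for i in range(n)]
    z = [0.0] * n
    u = [0.0] * n  # scaled dual variable
    for _ in range(iters):
        # x-update: ridge-like linear solve.
        x = solve(gram, [atb[i] + rho * (z[i] - u[i]) for i in range(n)])
        # z-update: elementwise soft-thresholding.
        z = [soft_threshold(x[i] + u[i], lam / rho) for i in range(n)]
        # u-update: accumulate the constraint residual x - z.
        u = [u[i] + x[i] - z[i] for i in range(n)]
    return z


# With an identity design matrix the lasso solution is soft_threshold(b, lam),
# which gives a closed-form answer to compare against.
A = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
b = [3.0, -0.5, 1.5]
z = admm_lasso(A, b, lam=1.0)
print([round(v, 6) for v in z])
```

The three updates are the part that carries over to other convex objectives: a smooth subproblem, a proximal step, and a dual update on the consensus constraint. Because the problem is convex, the iterates approach the global minimizer regardless of initialization, which is the kind of guarantee the abstract refers to.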
