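Summary: H Company has announced its new-generation Holo2 model in a blog post on Hugging Face. The model leads on UI localization, which here means pinpointing UI elements such as icons and text labels on a screen (GUI grounding), not translating or culturally adapting interfaces.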
H Company's new Holo2 model takes the lead in UI Localization
Two months after releasing our first batch of Holo2 models, H Company is back with our largest UI localization model yet: Holo2-235B-A22B Preview. The model sets a new state of the art (SOTA) of 78.5% on ScreenSpot-Pro and 79.0% on OSWorld-G.
Available on Hugging Face, Holo2-235B-A22B Preview is a research release focused on UI element localization.
Agentic Localization
High-resolution 4K interfaces are challenging for localization models. Small UI elements can be difficult to pinpoint on a large display. With agentic localization, however, Holo2 can iteratively refine its predictions, improving accuracy with each step and unlocking 10-20% relative gains across all Holo2 model sizes.
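To make the idea of iterative refinement concrete, here is a minimal sketch of one way such a loop could work: predict a point, zoom into a smaller window around it, and predict again. This post does not describe Holo2's actual mechanism, so the crop-and-zoom strategy, the `refine_click` and `zoom_box` helpers, and the `predict` callback are all hypothetical. The mock predictor simply assumes that error shrinks with the size of the view.

```python
from typing import Callable, Tuple

Box = Tuple[float, float, float, float]  # (left, top, right, bottom) in screen pixels
Point = Tuple[float, float]              # (x, y) in screen pixels


def zoom_box(center: Point, box: Box, screen: Box, shrink: float) -> Box:
    """Return a window `shrink` times the size of `box`, centered on `center`
    and shifted as needed so it stays inside `screen`."""
    w = (box[2] - box[0]) * shrink
    h = (box[3] - box[1]) * shrink
    left = min(max(center[0] - w / 2, screen[0]), screen[2] - w)
    top = min(max(center[1] - h / 2, screen[1]), screen[3] - h)
    return (left, top, left + w, top + h)


def refine_click(predict: Callable[[Box], Point], screen: Box,
                 steps: int = 3, shrink: float = 0.5) -> Point:
    """Run `steps` predictions, zooming into the previous guess each time.

    `predict` is a hypothetical wrapper: it would crop the screenshot to the
    given box, query the model, and map the answer back to screen pixels.
    """
    box = screen
    point = predict(box)
    for _ in range(steps - 1):
        box = zoom_box(point, box, screen, shrink)
        point = predict(box)
    return point


def make_mock_predictor(target: Point, rel_error: float = 0.02) -> Callable[[Box], Point]:
    """Stand-in for a model: misses the target by a fixed fraction of the view width."""
    def predict(box: Box) -> Point:
        err = (box[2] - box[0]) * rel_error
        return (target[0] + err, target[1] + err)
    return predict


screen_4k: Box = (0.0, 0.0, 3840.0, 2160.0)
target: Point = (3000.0, 500.0)
mock = make_mock_predictor(target)

single = refine_click(mock, screen_4k, steps=1)
refined = refine_click(mock, screen_4k, steps=3)
print("single-step error (px):", abs(single[0] - target[0]))
print("three-step error (px):", abs(refined[0] - target[0]))
```

In this toy setup each zoom halves the view, so the mock's error halves with it. A real model's gains would depend on how much its accuracy actually improves on cropped, higher-detail views.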
Holo2-235B-A22B's Performance on ScreenSpot-Pro
Holo2-235B-A22B Preview reaches 70.6% accuracy on ScreenSpot-Pro in a single step. In agent mode, it achieves 78.5% within 3 steps, setting a new state-of-the-art on the most challenging GUI grounding benchmark.
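As a quick sanity check, the two ScreenSpot-Pro figures above can be turned into absolute and relative gains. The snippet below uses only the numbers quoted in this post; the `relative_gain` helper is a name chosen for illustration.

```python
def relative_gain(baseline: float, improved: float) -> float:
    """Improvement expressed as a fraction of the baseline score."""
    return (improved - baseline) / baseline


single_step = 70.6  # ScreenSpot-Pro accuracy (%) in a single step
agentic = 78.5      # ScreenSpot-Pro accuracy (%) within 3 agentic steps

absolute = agentic - single_step
relative = relative_gain(single_step, agentic)
print(f"{absolute:.1f} points absolute, {relative:.1%} relative")
# prints "7.9 points absolute, 11.2% relative"
```

The roughly 11% relative improvement sits inside the 10-20% range reported for agentic localization across Holo2 model sizes.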
Trained with SkyPilot
Training Holo2 models at scale requires coordinating workloads across multiple cloud providers. H Company uses SkyPilot as a unified interface for launching training jobs on our clusters with Kubernetes (k8s). By abstracting away infrastructure complexity, SkyPilot lets researchers focus on model development instead of managing k8s manifests or maintaining separate deployment scripts.

