# TROPT: An Open-Source Framework for Unifying and Advancing Discrete Text Optimization

- Source: HuggingFace Daily Papers (community trending papers)
- Published: 2026-06-22 08:00
- AIHOT score: 69
- AIHOT link: https://aihot.virxact.com/items/cmqr79z320h50slp5oew4cpkz
- Original link: https://arxiv.org/abs/2606.23496

## AI Summary

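TROPT is the first open-source framework that standardizes the execution and development of discrete optimizers through a unified interface. It supports flexibly swapping models, objectives, and optimizers to customize end-to-end optimization recipes. The framework ships with 30+ optimization recipes (covering LLM jailbreaking, probing model internals, and more), built from 15+ optimizers (white-box to black-box) and 15+ loss functions. Large-scale comparative experiments validate improvements to LLM jailbreak optimization strategies, and optimizers are ported from the jailbreak setting to other domains such as corpus poisoning of embedding models, significantly lowering the barrier to using discrete text optimization.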

## Main Text

Discrete text-trigger optimization -- searching for text sequences that, when ingested by a model, steer it toward a specified objective -- underpins model red-teaming (e.g., LLM jailbreaks), as well as auditing and interpretability. However, the current state of discrete optimizers hinders their adoption and progress. First, existing optimizers, when open-sourced at all, are scattered across research codebases tied to specific models, objectives, and problem domains. Second, optimizer variants proliferate, each requiring engineering overhead to use or extend, and remaining hard to compare head-to-head. Together, these raise the bar for adopting optimizers in existing or new domains, and for advancing them via new strategies. We address these gaps with TROPT, the first open-source framework that unifies discrete optimizers' execution and standardizes their development under a single interface. TROPT makes it easy to customize end-to-end optimization recipes by swapping any component -- models, objectives, and optimizers -- extending its reach across domains and new applications. TROPT currently ships with 30+ optimization recipes -- covering applications such as jailbreaking and probing model internals -- built from 15+ optimizers (spanning white-box to black-box access) and 15+ losses, from foundational to state-of-the-art methods. Demonstrating its utility, we leverage TROPT in several studies: (i) controlled, large-scale experiments comparing and enhancing optimization strategies for LLM jailbreaks, revealing potent-yet-underadopted techniques; and (ii) porting optimizers from one domain (e.g., LLM jailbreak) to new domains (e.g., corpus-poisoning embedding model). In all, TROPT significantly lowers the barrier to adopting and advancing discrete text optimization.
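
The "swap any component" idea in the abstract can be illustrated with a small sketch. This is not TROPT's actual API, which the abstract does not show: the names `Recipe`, `RandomSearch`, `toy_model`, and `hamming_loss` are hypothetical, the "model" is a toy function instead of an LLM, and the optimizer is a plain black-box random search chosen only because it needs no gradients.

```python
import random
from dataclasses import dataclass
from typing import Callable, List, Protocol, Sequence

Tokens = List[str]


class Optimizer(Protocol):
    """Hypothetical interface: propose the next trigger given a loss oracle."""

    def step(self, trigger: Tokens, loss_fn: Callable[[Tokens], float]) -> Tokens: ...


@dataclass
class RandomSearch:
    """Black-box optimizer: mutate one random position and keep the
    candidate only if its loss is no worse than the current trigger's."""

    vocab: Sequence[str]
    rng: random.Random

    def step(self, trigger: Tokens, loss_fn: Callable[[Tokens], float]) -> Tokens:
        candidate = list(trigger)
        pos = self.rng.randrange(len(candidate))
        candidate[pos] = self.rng.choice(self.vocab)
        return candidate if loss_fn(candidate) <= loss_fn(trigger) else trigger


@dataclass
class Recipe:
    """Hypothetical recipe: a model, a loss, and an optimizer, each swappable."""

    model: Callable[[Tokens], Tokens]
    loss: Callable[[Tokens], float]
    optimizer: Optimizer

    def run(self, init: Tokens, steps: int) -> Tokens:
        def loss_fn(trigger: Tokens) -> float:
            # The objective is evaluated on the model's output, not on the trigger.
            return self.loss(self.model(trigger))

        trigger = list(init)
        for _ in range(steps):
            trigger = self.optimizer.step(trigger, loss_fn)
        return trigger


def toy_model(trigger: Tokens) -> Tokens:
    """Stand-in for a real model: upper-cases each token."""
    return [tok.upper() for tok in trigger]


TARGET = ["OPEN", "THE", "DOOR"]


def hamming_loss(output: Tokens) -> float:
    """Number of positions where the model output differs from TARGET."""
    return float(sum(a != b for a, b in zip(output, TARGET)))


vocab = ["open", "the", "door", "cat", "sky"]
init = ["cat", "cat", "cat"]
recipe = Recipe(
    model=toy_model,
    loss=hamming_loss,
    optimizer=RandomSearch(vocab=vocab, rng=random.Random(0)),
)
best = recipe.run(init, steps=200)
print(best, hamming_loss(toy_model(best)))
```

Because `Recipe` depends only on the call signatures of its three parts, replacing `RandomSearch` with another search strategy, or `hamming_loss` with a different objective, leaves the rest of the code untouched. That is the kind of head-to-head comparison the abstract describes, under the assumptions stated above.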
