PRISM: A Position-encoded Regressive Inverse Spectral Model for Multilayer Optical Thin-Film Design
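Source: arxiv.org
PRISM is a decoder-only autoregressive Transformer for the combinatorial-continuous optimization problem of multilayer optical thin-film design. It jointly predicts discrete material choices and continuous layer thicknesses with a single backbone. Its main innovations are spectrum prefix conditioning and cumulative-depth Rotary Position Embeddings, which encode continuous thickness directly into the positional representation. In benchmarks, the 13M-parameter PRISM-13M model reduces mean absolute error (MAE) by more than 50% relative to other Transformer baselines while using only one-fifth of the parameters. The 44M-parameter variant achieves state-of-the-art performance (MAE = 0.010) on the in-distribution validation benchmark and runs significantly faster than simulated annealing.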
The inverse design of multilayer thin-film optical coatings is a complex combinatorial-continuous optimization problem. We present PRISM (Position-encoded Regressive Inverse Spectral Model), a unified decoder-only autoregressive transformer that jointly performs discrete material selection and continuous thickness regression within a single backbone. PRISM introduces two primary architectural innovations: (1) spectrum prefix conditioning, which uses standard prefix tokens to inject the target spectrum in context, and (2) cumulative-depth Rotary Position Embeddings, which encode continuous thickness directly into the positional representation to preserve the physical spatial relationships of the stack. Our benchmarks show that a PRISM-13M model reduces MAE by over 50% compared to other transformer baselines while using only one-fifth of the parameters. Furthermore, a 44M-parameter variant achieves state-of-the-art performance (MAE = 0.010) on our in-distribution validation benchmark and runs significantly faster than simulated annealing, offering a highly efficient alternative to classical optimization methods.
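The abstract does not give the exact formulation of cumulative-depth Rotary Position Embeddings, so the following is only a minimal sketch of the general idea under stated assumptions: each layer token's position is taken to be the physical depth of its top interface (the running sum of the thicknesses above it) rather than its integer index, and that real-valued position drives a standard rotary rotation. The function names, the `scale_nm` normalization, and the choice of top-interface depth are all hypothetical and are not taken from the paper.

```python
import numpy as np

def cumulative_depth_positions(thicknesses_nm, scale_nm=100.0):
    """Real-valued position per layer: depth of the layer's top interface.

    Assumption: depth is measured from the first layer and divided by a
    hypothetical normalization constant scale_nm.
    """
    t = np.asarray(thicknesses_nm, dtype=np.float64)
    # Exclusive cumulative sum: the first layer starts at depth 0.
    depth = np.concatenate(([0.0], np.cumsum(t)[:-1]))
    return depth / scale_nm

def rope_rotate(x, positions, base=10000.0):
    """Apply a standard rotary embedding to x (seq, dim) at real-valued positions."""
    seq, dim = x.shape
    assert dim % 2 == 0, "rotary embeddings need an even feature dimension"
    inv_freq = base ** (-np.arange(0, dim, 2) / dim)       # (dim/2,)
    angles = np.asarray(positions)[:, None] * inv_freq     # (seq, dim/2)
    cos, sin = np.cos(angles), np.sin(angles)
    x_even, x_odd = x[:, 0::2], x[:, 1::2]
    out = np.empty_like(x)
    # Rotate each (even, odd) feature pair by its position-dependent angle.
    out[:, 0::2] = x_even * cos - x_odd * sin
    out[:, 1::2] = x_even * sin + x_odd * cos
    return out

# Usage: three layers of 100 nm, 50 nm, and 200 nm.
positions = cumulative_depth_positions([100.0, 50.0, 200.0])
queries = np.random.default_rng(0).normal(size=(3, 8))
rotated = rope_rotate(queries, positions)
```

With this construction, the dot product between a rotated query and a rotated key depends only on the difference of their depths, so two layers separated by the same physical distance interact identically regardless of how many layers lie between them. That is one plausible reading of "preserving the physical spatial relationships of the stack".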