# PianoKontext: Generating Expressive Performances from Deadpan Context

- Source: HuggingFace Daily Papers (trending community papers)
- Published: 2026-06-10 08:00
- AIHOT score: 56
- AIHOT link: https://aihot.virxact.com/items/cmqamzs2g0myyslldfnfknnom
- Original link: https://arxiv.org/abs/2606.12282

## AI Summary

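PianoKontext is a flow matching rendering model designed for classical piano music. It generates variable-length expressive performances in the latent space of a pretrained Music2Latent model. The method synthesizes MIDI scores into deadpan audio, uses Dynamic Time Warping (DTW) in the latent space to align score and performance data, and concatenates the aligned embeddings in DiT blocks, a simple and effective way to learn the dependencies between score and performance. Audio demos are available on the project page.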

## Full Text

Expressive performance rendering (EPR) aims to generate realistic performances constrained by sequences of notes. However, flow matching audio editing models manipulate only synchronized music samples of the same duration, which limits their understanding of expressive timing. We introduce PianoKontext, a flow matching rendering model for classical piano music that generates variable-length performances in the latent space of a pretrained Music2Latent model. We synthesize MIDI scores into deadpan audio and employ Dynamic Time Warping (DTW) in the latent space to construct paired data for training. The aligned embeddings are concatenated in DiT blocks, enabling simple and effective learning of the dependencies between scores and performances. Audio samples are available at our demo page: https://realfolkcode.github.io/pianokontext_demo/.
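
To make the alignment step concrete, the following is a minimal sketch of DTW between two latent sequences, not the authors' implementation. It assumes each sequence is an array of shape `(frames, dim)`, that frames are compared with Euclidean distance, and that the score latents are resampled onto the performance timeline. The function names `dtw_path` and `warp_score_to_performance` are hypothetical, and the abstract does not specify the distance measure or how the warped sequence is built.

```python
import numpy as np


def dtw_path(score, perf):
    """Return the DTW warping path as a list of (score_idx, perf_idx) pairs.

    score: (n, d) array of score latents; perf: (m, d) array of performance latents.
    Euclidean frame distance is an assumption, not taken from the paper.
    """
    n, m = len(score), len(perf)
    # Pairwise frame distances, shape (n, m).
    cost = np.linalg.norm(score[:, None, :] - perf[None, :, :], axis=-1)
    # Accumulated cost with a padded border so the recursion needs no special cases.
    acc = np.full((n + 1, m + 1), np.inf)
    acc[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            acc[i, j] = cost[i - 1, j - 1] + min(
                acc[i - 1, j - 1], acc[i - 1, j], acc[i, j - 1]
            )
    # Backtrack from the last cell to the first, following the cheapest predecessor.
    i, j = n, m
    path = [(i - 1, j - 1)]
    while (i, j) != (1, 1):
        candidates = [(i - 1, j - 1), (i - 1, j), (i, j - 1)]
        i, j = min(candidates, key=lambda s: acc[s])
        path.append((i - 1, j - 1))
    return path[::-1]


def warp_score_to_performance(score, perf):
    """Resample score latents so they have one frame per performance frame."""
    idx = np.zeros(len(perf), dtype=int)
    for i, j in dtw_path(score, perf):
        idx[j] = i  # if several score frames map to one performance frame, keep the last
    return score[idx]


# Toy example: the "performance" is the score with some frames held longer.
score = np.array([[0.0, 0.0], [1.0, 0.0], [2.0, 1.0], [3.0, 3.0]])
perf = score[[0, 0, 1, 2, 2, 2, 3]]
aligned = warp_score_to_performance(score, perf)
print(aligned.shape)
```

After warping, the score latents have the same number of frames as the performance latents, so the two sequences can be paired frame by frame for training. How the paired embeddings are then concatenated inside the DiT blocks is not described in the abstract, so that step is left out of this sketch.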
