# Sensor2Sensor: Cross-Embodiment Sensor Data Conversion for Autonomous Driving

- Source: HuggingFace Daily Papers (community trending papers)
- Published: 2026-05-21 08:00
- AIHOT score: 60
- AIHOT link: https://aihot.virxact.com/items/cmpgadnmr0dqlsljw48goo1o7
- Original link: https://arxiv.org/abs/2605.22809

## AI Summary

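To address the shortage of high-fidelity, diverse data for training autonomous driving systems, the study proposes Sensor2Sensor. The method converts unstructured monocular video from sources such as dashcams into high-fidelity multi-modal sensor data comprising multi-view camera images and LiDAR point clouds. Its core idea is to use 4D Gaussian Splatting to convert real autonomous-driving logs into dashcam-style video, which addresses the lack of paired training data, and to combine this with a diffusion model that performs the generative conversion. The evaluation indicates that the method can convert complex real-world scenes into usable data, unlocking vast external data sources for autonomous driving development.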

## Full Text

Robust training and validation of Autonomous Driving Systems (ADS) require massive, diverse datasets. Proprietary data collected by Autonomous Vehicle (AV) fleets, while high-fidelity, are limited in scale, diversity of sensor configurations, as well as geographic and long-tail-behavioral coverage. In contrast, in-the-wild data from sources like dashcams offers immense scale and diversity, capturing critical long-tail scenarios and novel environments. However, this unstructured, in-the-wild video data is incompatible with ADS expecting structured, multi-modal sensor inputs for validation and training. To bridge this data gap, we propose Sensor2Sensor, a novel generative modeling paradigm that translates in-the-wild monocular dashcam videos into a high-fidelity, multi-modal sensor suite (AV logs) comprising multi-view camera images and LiDAR point clouds. A core challenge is the lack of paired training data. We address this by converting real AV logs into dashcam-style videos via 4D Gaussian Splatting (4DGS) reconstruction and novel-view rendering. Sensor2Sensor then utilizes a diffusion architecture to perform the generative conversion. We perform comprehensive quantitative evaluations on the fidelity and realism of the generated sensor data. We demonstrate Sensor2Sensor's practical utility by converting challenging in-the-wild internet and dashcam footage into realistic, multi-modal data formats, further unlocking vast external data sources for AV development.
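The pairing idea in the abstract (render a dashcam-style view from a real AV log, then train on the resulting input/target pair) can be illustrated with a toy sketch. This is not the paper's method: a simple pinhole projection of LiDAR points stands in for 4DGS novel-view rendering, and every name here (`PinholeCamera`, `PairedSample`, `make_pair`) is hypothetical, as are the camera parameters and the assumption that the virtual dashcam is axis-aligned with the vehicle frame.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Point3D = Tuple[float, float, float]
Pixel = Tuple[float, float]


@dataclass
class PinholeCamera:
    """Hypothetical virtual dashcam, axis-aligned with the vehicle frame.

    Frame convention (an assumption): x right, y down, z forward.
    """
    fx: float
    fy: float
    cx: float
    cy: float
    width: int
    height: int
    position: Point3D  # camera centre in the vehicle frame

    def project(self, p: Point3D) -> Optional[Pixel]:
        # Express the point relative to the camera centre.
        x = p[0] - self.position[0]
        y = p[1] - self.position[1]
        z = p[2] - self.position[2]
        if z <= 0:
            return None  # behind the camera
        # Standard pinhole projection onto the image plane.
        u = self.fx * x / z + self.cx
        v = self.fy * y / z + self.cy
        if 0 <= u < self.width and 0 <= v < self.height:
            return (u, v)
        return None  # outside the image bounds


@dataclass
class PairedSample:
    dashcam_view: List[Pixel]     # stand-in for a rendered dashcam-style frame
    av_log_points: List[Point3D]  # the structured AV-log target


def make_pair(points: List[Point3D], cam: PinholeCamera) -> PairedSample:
    """Build one (dashcam-style input, AV-log target) training pair."""
    pixels = [uv for uv in (cam.project(p) for p in points) if uv is not None]
    return PairedSample(dashcam_view=pixels, av_log_points=list(points))


if __name__ == "__main__":
    cam = PinholeCamera(fx=100.0, fy=100.0, cx=50.0, cy=50.0,
                        width=100, height=100, position=(0.0, 0.0, 0.0))
    pair = make_pair([(0.0, 0.0, 10.0), (0.0, 0.0, -1.0)], cam)
    print(len(pair.dashcam_view), len(pair.av_log_points))
```

The point of the sketch is the direction of the data flow: the structured log is the ground truth, and the dashcam-style view is derived from it, so pairs exist without needing dashcam footage that was recorded alongside an AV sensor suite.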
