Drone-assisted Road Gaussian Splatting with Cross-view Uncertainty


Saining Zhang (Nanyang Technological University), Baijun Ye (Tsinghua University), Xiaoxue Chen (Tsinghua University), Yuantao Chen (The Chinese University of Hong Kong, Shenzhen), Zongzheng Zhang (Tsinghua University), Cheng Peng (Beijing Institute of Technology), Yongliang Shi (Tsinghua University), Hao Zhao (Tsinghua University)
The 35th British Machine Vision Conference

Abstract

Robust and realistic rendering of large-scale road scenes is essential in autonomous driving simulation. Recently, 3D Gaussian Splatting (3D-GS) has made groundbreaking progress in neural rendering, but the fidelity of large-scale road scene renderings is often limited by the input imagery, which usually has a narrow field of view and focuses mainly on the local street-level area. Intuitively, data from a drone's perspective can complement data from a ground vehicle's perspective, enhancing the completeness of scene reconstruction and rendering. However, training naively on aerial and ground images, which exhibit large view disparity, poses a significant convergence challenge for 3D-GS and does not yield notable improvements on road views. To enhance novel view synthesis of road views and to use the aerial information effectively, we design an uncertainty-aware training method that lets aerial images assist in the synthesis of areas where ground images are learned poorly, instead of weighting all pixels equally in 3D-GS training as prior work does. We are the first to introduce cross-view uncertainty to 3D-GS: we match the car-view ensemble-based rendering uncertainty to aerial images and use it to weight the contribution of each pixel to the training process. Additionally, to quantify evaluation metrics systematically, we assemble a high-quality synthesized dataset comprising both aerial and ground images of road scenes. Through comprehensive results, we show that: (1) Joint training on aerial and ground images helps improve the representation ability of 3D-GS when test views are shifted and rotated, but performs poorly on held-out road-view tests. (2) Our method mitigates this weakness of joint training and outperforms other baselines quantitatively on both held-out tests and scenes involving view shifting and rotation on our datasets. (3) Qualitatively, our method shows great improvements in the rendering of road scene details.
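
The per-pixel weighting idea in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: it assumes that uncertainty is taken as the variance across an ensemble of renderings, that the resulting map has already been matched to the aerial image (the cross-view matching step is omitted here), and that the weights enter a simple L1 photometric loss. The function names `ensemble_uncertainty`, `normalize_weights`, and `weighted_l1_loss` are hypothetical.

```python
import numpy as np


def ensemble_uncertainty(renders):
    """Per-pixel uncertainty as the variance across ensemble members.

    renders: array of shape (M, H, W, 3), one rendering per ensemble member.
    Returns an (H, W) map (variance averaged over colour channels).
    Assumption: variance is used as the uncertainty measure.
    """
    renders = np.asarray(renders, dtype=np.float64)
    return renders.var(axis=0).mean(axis=-1)


def normalize_weights(uncertainty, eps=1e-8):
    """Min-max scale an uncertainty map to [0, 1] for use as pixel weights."""
    lo, hi = uncertainty.min(), uncertainty.max()
    return (uncertainty - lo) / (hi - lo + eps)


def weighted_l1_loss(pred, target, weights, eps=1e-8):
    """L1 photometric loss where each pixel contributes according to its weight.

    pred, target: (H, W, 3) images; weights: (H, W) map.
    Assumption: weights are already aligned with the image being supervised.
    """
    per_pixel = np.abs(pred - target).mean(axis=-1)
    return float((weights * per_pixel).sum() / (weights.sum() + eps))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Hypothetical ensemble of 4 renderings of an 8x8 view.
    renders = rng.random((4, 8, 8, 3))
    weights = normalize_weights(ensemble_uncertainty(renders))
    pred, target = rng.random((8, 8, 3)), rng.random((8, 8, 3))
    print(weighted_l1_loss(pred, target, weights))
```

Under these assumptions, pixels where the ensemble disagrees receive larger weights, so supervision concentrates on regions that are learned poorly, while pixels with zero weight do not contribute to the loss.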

Citation

@inproceedings{Zhang_2024_BMVC,
author    = {Saining Zhang and Baijun Ye and Xiaoxue Chen and Yuantao Chen and Zongzheng Zhang and Cheng Peng and Yongliang Shi and Hao Zhao},
title     = {Drone-assisted Road Gaussian Splatting with Cross-view Uncertainty},
booktitle = {35th British Machine Vision Conference 2024, {BMVC} 2024, Glasgow, UK, November 25-28, 2024},
publisher = {BMVA},
year      = {2024},
url       = {https://papers.bmvc2024.org/0452.pdf}
}


Copyright © 2024 The British Machine Vision Association and Society for Pattern Recognition
The British Machine Vision Conference is organised by The British Machine Vision Association and Society for Pattern Recognition. The Association is a Company limited by guarantee, No.2543446, and a non-profit-making body, registered in England and Wales as Charity No.1002307 (Registered Office: Dept. of Computer Science, Durham University, South Road, Durham, DH1 3LE, UK).
