VISTA: Virtual Stereo based Augmentation for Depth Estimation in Automated Driving

Dec 2, 2022



Depth estimation is a primary task for automated vehicles to perceive the 3D environment. The classical approach to depth estimation leverages stereo cameras on the car. This approach provides accurate and robust depth estimates, but requires a more expensive setup and detailed calibration. The recent trend in depth estimation therefore focuses on learning depth from monocular videos. These approaches need only a simple setup but may be vulnerable to occlusion or lighting changes in the scene. In this work, we propose a novel idea that exploits the fact that data collected by large fleets naturally contains scenarios where vehicles with monocular cameras drive close to each other and look at the same scene. Our approach combines the monocular views of the ego vehicle and a neighboring vehicle to form a virtual stereo pair during training, while still requiring only the monocular image at inference. With such a virtual stereo view, we can train self-supervised depth estimation with two sources of constraints: 1) the spatial and temporal constraints between sequential monocular frames; 2) the geometric constraints between the frames from the two cameras that form the virtual stereo pair. Public datasets in which multiple vehicles share a common view to form possible virtual stereo pairs do not exist, so we also created a synthetic dataset using the CARLA simulator, where multiple vehicles can observe the same scene at the same time. Our evaluation shows that the virtual stereo approach can improve the ego vehicle's depth estimation accuracy by 8%.
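The two self-supervised signals described above can be sketched as one combined training loss. The sketch below is a minimal illustration, not the paper's actual implementation: the warping of neighboring frames into the target view (via predicted depth and camera pose) is assumed to have already happened, and the function name, the L1 photometric error, and the `stereo_weight` balance term are all hypothetical choices for illustration.

```python
import numpy as np

def photometric_error(target, reconstruction):
    # Simple L1 photometric difference between the target frame and a
    # reconstruction of it warped from another view (illustrative choice;
    # real pipelines often combine L1 with SSIM).
    return float(np.mean(np.abs(target - reconstruction)))

def vista_style_loss(frame_t, recon_from_temporal, recon_from_virtual_stereo,
                     stereo_weight=0.5):
    """Combine the two constraint sources from the abstract (sketch):
    1) temporal: frame t reconstructed by warping an adjacent monocular frame;
    2) virtual stereo: frame t reconstructed by warping the neighboring
       vehicle's view, which acts as the second camera of the virtual pair.
    `stereo_weight` is a hypothetical balancing hyperparameter.
    """
    temporal_loss = photometric_error(frame_t, recon_from_temporal)
    stereo_loss = photometric_error(frame_t, recon_from_virtual_stereo)
    return temporal_loss + stereo_weight * stereo_loss
```

At inference time only the monocular depth network would be used; the virtual-stereo term exists purely as a training-time supervision signal, which is why the second vehicle's camera is never needed on the deployed car.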



NeurIPS 2022