Collision-free space detection is of utmost importance for autonomous robot perception and navigation.
State-of-the-art (SoTA) approaches generally extract features from RGB images and an additional
modality of 3-D information, such as depth or disparity images, using a pair of independent encoders.
The extracted features are subsequently fused and decoded to yield semantic predictions of collision-free
spaces. Such feature-fusion approaches become infeasible in scenarios where the sensor for 3-D information
acquisition is unavailable, or when multi-sensor calibration falls short of the necessary precision.
To overcome these limitations, this paper introduces a novel end-to-end collision-free space detection
network, referred to as SG-RoadSeg, built upon our previous work SNE-RoadSeg. A key contribution of this
paper is a strategy for sharing encoder representations that are co-learned through both semantic
segmentation and unsupervised stereo matching tasks, enabling the features extracted from RGB images
to encode both semantic and spatial geometric information. The unsupervised deep stereo matching branch
serves as an auxiliary function, generating accurate disparity maps that can be used by other perception
tasks requiring depth-related data. Comprehensive experimental results on the KITTI road and semantics
datasets validate the effectiveness of our proposed architecture and encoder representation sharing strategy.
SG-RoadSeg also achieves superior performance compared to other SoTA collision-free space detection approaches.
Our source code, demo video, and supplement are publicly available at
https://mias.group/SG-RoadSeg/.
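
As a rough illustration of the encoder-sharing idea described above (not the authors' actual implementation), the following PyTorch sketch applies one shared RGB encoder to both stereo views and decodes its features with two heads: a segmentation head for collision-free space prediction and a disparity head for stereo matching. All module names, channel widths, and the single-scale disparity regression are illustrative assumptions; in an unsupervised setting, the disparity head would be trained with a photometric reconstruction loss rather than ground-truth disparity.

```python
# Minimal sketch of a shared encoder co-trained by a semantic segmentation
# head and a stereo matching head. Module names, channel sizes, and the
# disparity regression scheme are illustrative assumptions, not details
# taken from the SG-RoadSeg paper.
import torch
import torch.nn as nn
import torch.nn.functional as F


class SharedEncoder(nn.Module):
    """Lightweight convolutional encoder applied to both stereo views."""
    def __init__(self, in_ch=3, ch=32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_ch, ch, 3, stride=2, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(ch, 2 * ch, 3, stride=2, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(2 * ch, 4 * ch, 3, stride=2, padding=1), nn.ReLU(inplace=True),
        )

    def forward(self, x):
        return self.net(x)  # 1/8-resolution feature map


class SegHead(nn.Module):
    """Decodes shared features into per-pixel free-space logits."""
    def __init__(self, feat_ch=128, num_classes=2):
        super().__init__()
        self.head = nn.Conv2d(feat_ch, num_classes, 1)

    def forward(self, feat, out_size):
        return F.interpolate(self.head(feat), size=out_size,
                             mode="bilinear", align_corners=False)


class DisparityHead(nn.Module):
    """Regresses a disparity map from concatenated left/right features.

    Unsupervised training would warp the right view with this disparity and
    penalize the photometric error against the left view (omitted here).
    """
    def __init__(self, feat_ch=128, max_disp=64):
        super().__init__()
        self.max_disp = max_disp
        self.head = nn.Sequential(
            nn.Conv2d(2 * feat_ch, feat_ch, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(feat_ch, 1, 1), nn.Sigmoid(),  # normalized disparity in [0, 1]
        )

    def forward(self, feat_l, feat_r, out_size):
        disp = self.head(torch.cat([feat_l, feat_r], dim=1)) * self.max_disp
        return F.interpolate(disp, size=out_size,
                             mode="bilinear", align_corners=False)


class SharedEncoderNet(nn.Module):
    """One shared encoder feeding both task heads."""
    def __init__(self):
        super().__init__()
        self.encoder = SharedEncoder()
        self.seg_head = SegHead()
        self.disp_head = DisparityHead()

    def forward(self, left, right):
        out_size = left.shape[-2:]
        feat_l, feat_r = self.encoder(left), self.encoder(right)
        seg_logits = self.seg_head(feat_l, out_size)
        disparity = self.disp_head(feat_l, feat_r, out_size)
        return seg_logits, disparity


if __name__ == "__main__":
    net = SharedEncoderNet()
    left = torch.randn(1, 3, 256, 512)
    right = torch.randn(1, 3, 256, 512)
    seg_logits, disparity = net(left, right)
    print(seg_logits.shape, disparity.shape)  # (1, 2, 256, 512), (1, 1, 256, 512)
```

Because both heads consume the output of the same encoder, gradients from segmentation and from stereo matching jointly shape the RGB features, which is the property the encoder representation sharing strategy exploits.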