Unsupervised stereo matching has garnered significant attention for its independence from costly disparity annotations. Typical unsupervised methods rely on the multi-view consistency assumption to train networks, and therefore suffer considerably from stereo matching ambiguities such as repetitive patterns and texture-less regions. A feasible solution is to transfer 3D geometry knowledge from a relative depth map to the stereo matching networks. However, existing knowledge transfer methods learn depth ranking information from randomly built sparse correspondences, which makes inefficient use of the 3D geometry knowledge and introduces noise from erroneous estimations. To address these challenges, this work proposes a novel unsupervised learning framework comprising a plug-and-play disparity confidence estimation algorithm and two depth prior-guided loss functions. Specifically, the local coherence consistency between neighboring disparities and their corresponding relative depths is first checked to obtain disparity confidence. Afterwards, quasi-dense correspondences are built using only confident disparity estimations to facilitate efficient depth ranking learning. Finally, a dual disparity smoothness loss is proposed to boost stereo matching performance at disparity discontinuities. Experimental results demonstrate that our method achieves state-of-the-art accuracy among all unsupervised stereo matching methods on the KITTI Stereo benchmarks.
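To make the confidence-checking step concrete, the sketch below illustrates one plausible form of the local coherence consistency check: a disparity estimate is deemed confident only when its ordering relative to its neighbors agrees with the ordering of the corresponding relative depths. This is a minimal sketch, not the paper's exact algorithm; the function name `disparity_confidence`, the window size, and the agreement threshold `tau` are illustrative assumptions, and the sketch further assumes the relative depth map is an affine-invariant inverse depth, so its local ordering should match that of the disparity map.

```python
# Illustrative sketch of a local coherence consistency check between a
# predicted disparity map and a monocular relative (inverse) depth map.
# Not the paper's exact algorithm; names and thresholds are assumptions.
import torch
import torch.nn.functional as F

def disparity_confidence(disparity, rel_depth, window=5, tau=0.8):
    """disparity, rel_depth: (B, 1, H, W) tensors.

    Assumes rel_depth is an affine-invariant inverse depth, so larger
    values should correspond to larger disparities. Returns a binary
    (B, 1, H, W) confidence mask.
    """
    pad = window // 2
    # Gather local neighborhoods around every pixel: (B, window*window, H*W)
    d_patches = F.unfold(disparity, window, padding=pad)
    z_patches = F.unfold(rel_depth, window, padding=pad)
    B, _, HW = d_patches.shape
    d_center = disparity.reshape(B, 1, HW)
    z_center = rel_depth.reshape(B, 1, HW)
    # Sign of center-vs-neighbor differences encodes the local ranking
    d_sign = torch.sign(d_center - d_patches)
    z_sign = torch.sign(z_center - z_patches)
    # Fraction of neighbors whose depth ranking agrees with the disparity ranking
    agreement = (d_sign == z_sign).float().mean(dim=1, keepdim=True)
    # Keep only pixels whose local ordering is sufficiently consistent
    conf = (agreement > tau).float()
    return conf.reshape(B, 1, *disparity.shape[-2:])
```

Because the check relies only on the sign of local differences, it is invariant to the unknown scale and shift of the relative depth map, which is what would make such a filter usable in a plug-and-play fashion before building the quasi-dense correspondences.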