Fugu-MT Paper Translation (Abstract): Self-Supervised Stereo Matching with Multi-Baseline Contrastive Learning

Paper summary: Self-Supervised Stereo Matching with Multi-Baseline Contrastive Learning

arxiv url: http://arxiv.org/abs/2508.10838v1
Date: Thu, 14 Aug 2025 17:03:47 GMT
Status: Translation complete
Last updated in system: 2025-08-15 22:24:48.422743
Title: Self-Supervised Stereo Matching with Multi-Baseline Contrastive Learning
Authors: Peng Xu, Zhiyu Xiang, Jingyun Fu, Tianyu Pu, Kai Wang, Chaojie Ji, Tingming Bai, Eryun Liu
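Abstract summary: BaCon-Stereo is a contrastive learning framework for self-supervised stereo network training in both non-occluded and occluded regions. It adopts a teacher-student paradigm with multi-baseline inputs, in which the stereo pairs fed into the teacher and student share the same reference view but differ in target views. Experiments show that BaCon-Stereo improves prediction in both occluded and non-occluded regions, achieves strong generalization and robustness, and outperforms state-of-the-art self-supervised methods on the KITTI 2015 and 2012 benchmarks.
Reference score (attention score computed independently by this site): 12.013250652191477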
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Current self-supervised stereo matching relies on the photometric consistency assumption, which breaks down in occluded regions due to ill-posed correspondences. To address this issue, we propose BaCon-Stereo, a simple yet effective contrastive learning framework for self-supervised stereo network training in both non-occluded and occluded regions. We adopt a teacher-student paradigm with multi-baseline inputs, in which the stereo pairs fed into the teacher and student share the same reference view but differ in target views. Geometrically, regions occluded in the student's target view are often visible in the teacher's, making it easier for the teacher to predict in these regions. The teacher's prediction is rescaled to match the student's baseline and then used to supervise the student. We also introduce an occlusion-aware attention map to better guide the student in learning occlusion completion. To support training, we synthesize a multi-baseline dataset BaCon-20k. Extensive experiments demonstrate that BaCon-Stereo improves prediction in both occluded and non-occluded regions, achieves strong generalization and robustness, and outperforms state-of-the-art self-supervised methods on both KITTI 2015 and 2012 benchmarks. Our code and dataset will be released upon paper acceptance.
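The abstract states that the teacher's prediction is rescaled to the student's baseline before it supervises the student. A minimal sketch of that step follows, assuming rectified cameras that share the reference view and focal length, so that disparity d = f * B / Z is proportional to the baseline B. The function names, the plain nested-list disparity maps, and the weighted L1 comparison are illustrative assumptions; the abstract does not specify the actual loss or the form of the occlusion-aware attention map.

```python
def rescale_disparity(teacher_disp, teacher_baseline, student_baseline):
    """Rescale a teacher disparity map to the student's baseline.

    Assumes rectified pairs with a shared reference view and focal length,
    so disparity d = f * B / Z scales linearly with the baseline B.
    """
    if teacher_baseline <= 0 or student_baseline <= 0:
        raise ValueError("baselines must be positive")
    scale = student_baseline / teacher_baseline
    return [[d * scale for d in row] for row in teacher_disp]


def weighted_l1(student_disp, target_disp, weights):
    """Weighted mean absolute difference between two disparity maps.

    Hypothetical stand-in for the paper's supervision signal: `weights`
    plays the role of a per-pixel attention map, which is an assumption here.
    """
    total = 0.0
    weight_sum = 0.0
    for s_row, t_row, w_row in zip(student_disp, target_disp, weights):
        for s, t, w in zip(s_row, t_row, w_row):
            total += w * abs(s - t)
            weight_sum += w
    return total / weight_sum if weight_sum > 0 else 0.0


# Example with made-up values: the teacher pair has twice the student's baseline.
teacher_disp = [[8.0, 4.0], [2.0, 0.0]]
pseudo_label = rescale_disparity(teacher_disp, teacher_baseline=0.5, student_baseline=0.25)

student_disp = [[4.0, 3.0], [1.0, 0.0]]
attention = [[1.0, 2.0], [1.0, 0.0]]  # hypothetical occlusion-aware weights
loss = weighted_l1(student_disp, pseudo_label, attention)
```

The rescaling is a single multiplication because both pairs share the reference view: each pixel keeps its depth, and only the baseline factor in the disparity changes.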
