MC-Stereo: Multi-peak Lookup and Cascade Search Range for Stereo
Matching
- URL: http://arxiv.org/abs/2311.02340v2
- Date: Sat, 27 Jan 2024 10:49:02 GMT
- Title: MC-Stereo: Multi-peak Lookup and Cascade Search Range for Stereo
Matching
- Authors: Miaojie Feng, Junda Cheng, Hao Jia, Longliang Liu, Gangwei Xu,
Qingyong Hu, Xin Yang
- Abstract summary: We present a novel iterative optimization architecture called MC-Stereo.
It mitigates the multi-peak distribution problem in matching through the multi-peak lookup strategy.
It integrates the coarse-to-fine concept into the iterative framework via the cascade search range.
MC-Stereo ranks first among all publicly available methods on the KITTI-2012 and KITTI-2015 benchmarks.
- Score: 15.786593303130477
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Stereo matching is a fundamental task in scene comprehension. In recent
years, the method based on iterative optimization has shown promise in stereo
matching. However, the current iteration framework employs a single-peak
lookup, which struggles to handle the multi-peak problem effectively.
Additionally, the fixed search range used during the iteration process limits
the final convergence effects. To address these issues, we present a novel
iterative optimization architecture called MC-Stereo. This architecture
mitigates the multi-peak distribution problem in matching through the
multi-peak lookup strategy, and integrates the coarse-to-fine concept into the
iterative framework via the cascade search range. Furthermore, given that
feature representation learning is crucial for successful learning-based stereo
matching, we introduce a pre-trained network to serve as the feature extractor,
enhancing the front end of the stereo matching pipeline. Based on these
improvements, MC-Stereo ranks first among all publicly available methods on the
KITTI-2012 and KITTI-2015 benchmarks, and also achieves state-of-the-art
performance on ETH3D. Code is available at
https://github.com/MiaoJieF/MC-Stereo.
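The abstract names two ideas, multi-peak lookup and a cascade search range, without giving details. The following is a minimal NumPy sketch of how such ideas could look for a single pixel. It is not the authors' implementation. The function names (`find_peaks`, `multi_peak_lookup`, `cascade_radii`), the use of local maxima as "peaks", the radii `(4, 2, 1)`, and the even split of iterations across stages are all assumptions made for illustration.

```python
import numpy as np

def find_peaks(corr, k):
    """Return indices of the k strongest local maxima of a 1D correlation curve.

    Hypothetical helper: treating local maxima as 'peaks' is an assumption.
    """
    corr = np.asarray(corr, dtype=float)
    padded = np.pad(corr, 1, constant_values=-np.inf)
    # A peak is at least as large as its left neighbour and larger than its right one.
    is_peak = (corr >= padded[:-2]) & (corr > padded[2:])
    candidates = np.flatnonzero(is_peak)
    order = np.argsort(corr[candidates])[::-1][:k]  # strongest first
    return candidates[order]

def multi_peak_lookup(corr, k=2, radius=1):
    """Sample a window of 2*radius+1 correlation values around each of k peaks.

    A single-peak lookup would sample one window only; here several windows
    are gathered so that competing matches stay visible to later stages.
    """
    corr = np.asarray(corr, dtype=float)
    peaks = find_peaks(corr, k)
    offsets = np.arange(-radius, radius + 1)
    # Clamp indices so windows near the border stay inside the valid range.
    idx = np.clip(peaks[:, None] + offsets[None, :], 0, corr.shape[0] - 1)
    return corr[idx]

def cascade_radii(num_iters, radii=(4, 2, 1)):
    """Assign a lookup radius to each iteration, wide first and narrow last.

    Assumed schedule: iterations are split evenly across the given stages.
    """
    n = len(radii)
    return [radii[min(i * n // num_iters, n - 1)] for i in range(num_iters)]

# Correlation over 8 disparity candidates with two competing peaks (made-up values).
corr = np.array([0.1, 0.9, 0.2, 0.1, 0.7, 0.3, 0.05, 0.4])
windows = multi_peak_lookup(corr, k=2, radius=1)
schedule = cascade_radii(6)
```

In this toy setup the two windows are centred on the candidates at indices 1 and 4, and the radius shrinks from 4 to 1 over six iterations, which mirrors the coarse-to-fine idea described in the abstract.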
Related papers
- Hierarchical Temporal Context Learning for Camera-based Semantic Scene Completion [57.232688209606515]
We present HTCL, a novel Hierarchical Temporal Context Learning paradigm for improving camera-based semantic scene completion.
Our method ranks 1st on the SemanticKITTI benchmark and even surpasses LiDAR-based methods in terms of mIoU.
arXiv Detail & Related papers (2024-07-02T09:11:17Z)
- Match-Stereo-Videos: Bidirectional Alignment for Consistent Dynamic Stereo Matching [17.344430840048094]
Recent learning-based methods prioritize optimal performance on a single stereo pair, resulting in temporal inconsistencies.
We develop a bidirectional alignment mechanism for adjacent frames as a fundamental operation.
Unlike the existing methods, we model this task as local matching and global aggregation.
arXiv Detail & Related papers (2024-03-16T01:38:28Z)
- RomniStereo: Recurrent Omnidirectional Stereo Matching [6.153793254880079]
We propose a recurrent omnidirectional stereo matching (RomniStereo) algorithm.
Our best model improves the average MAE metric by 40.7% over the previous SOTA baseline.
When visualizing the results, our models demonstrate clear advantages on both synthetic and realistic examples.
arXiv Detail & Related papers (2024-01-09T04:06:01Z)
- MLCA-AVSR: Multi-Layer Cross Attention Fusion based Audio-Visual Speech Recognition [62.89464258519723]
We propose a multi-layer cross-attention fusion based AVSR approach that promotes representation of each modality by fusing them at different levels of audio/visual encoders.
Our proposed approach surpasses the first-place system, establishing a new SOTA cpCER of 29.13% on this dataset.
arXiv Detail & Related papers (2024-01-07T08:59:32Z)
- Uncertainty Guided Adaptive Warping for Robust and Efficient Stereo Matching [77.133400999703]
Correlation based stereo matching has achieved outstanding performance.
Current methods with a fixed model do not work uniformly well across various datasets.
This paper proposes a new perspective to dynamically calculate correlation for robust stereo matching.
arXiv Detail & Related papers (2023-07-26T09:47:37Z)
- Learning and Crafting for the Wide Multiple Baseline Stereo [4.7210697296108926]
This thesis introduces the wide multiple baseline stereo (WxBS) problem.
WxBS considers the matching of images that differ in more than one image acquisition factor.
A new dataset with the ground truth, evaluation metric and baselines has been introduced.
arXiv Detail & Related papers (2021-12-22T16:52:55Z)
- AdaStereo: An Efficient Domain-Adaptive Stereo Matching Approach [50.855679274530615]
We present a novel domain-adaptive approach called AdaStereo to align multi-level representations for deep stereo matching networks.
Our models achieve state-of-the-art cross-domain performance on multiple benchmarks, including KITTI, Middlebury, ETH3D and DrivingStereo.
Our method is robust to various domain adaptation settings, and can be easily integrated into quick adaptation application scenarios and real-world deployments.
arXiv Detail & Related papers (2021-12-09T15:10:47Z)
- AdaStereo: A Simple and Efficient Approach for Adaptive Stereo Matching [50.06646151004375]
A novel domain-adaptive pipeline called AdaStereo aims to align multi-level representations for deep stereo matching networks.
Our AdaStereo models achieve state-of-the-art cross-domain performance on multiple stereo benchmarks, including KITTI, Middlebury, ETH3D, and DrivingStereo.
arXiv Detail & Related papers (2020-04-09T16:15:13Z)
- Image Matching across Wide Baselines: From Paper to Practice [80.9424750998559]
We introduce a comprehensive benchmark for local features and robust estimation algorithms.
Our pipeline's modular structure allows easy integration, configuration, and combination of different methods.
We show that with proper settings, classical solutions may still outperform the perceived state of the art.
arXiv Detail & Related papers (2020-03-03T15:20:57Z)
This list is automatically generated from the titles and abstracts of the papers on this site.