OpenStereo: A Comprehensive Benchmark for Stereo Matching and Strong Baseline
- URL: http://arxiv.org/abs/2312.00343v7
- Date: Sat, 16 Nov 2024 02:44:39 GMT
- Title: OpenStereo: A Comprehensive Benchmark for Stereo Matching and Strong Baseline
- Authors: Xianda Guo, Chenming Zhang, Juntao Lu, Yiqun Duan, Yiqi Wang, Tian Yang, Zheng Zhu, Long Chen
- Abstract summary: We develop a flexible and efficient stereo matching codebase, called OpenStereo.
OpenStereo includes training and inference code for more than 10 network models.
We conduct an exhaustive analysis and deconstruction of recent developments in stereo matching through comprehensive ablative experiments.
Our StereoBase ranks 1st on SceneFlow, KITTI 2015, and KITTI 2012 (Reflective) among published methods and achieves the best performance across all metrics.
- Score: 25.4712469033627
- Abstract: Stereo matching aims to estimate the disparity between matching pixels in a stereo image pair, which is important to robotics, autonomous driving, and other computer vision tasks. Despite the development of numerous impressive methods in recent years, determining the most suitable architecture for practical application remains challenging. Addressing this gap, our paper introduces a comprehensive benchmark focusing on practical applicability rather than solely on individual models for optimized performance. Specifically, we develop a flexible and efficient stereo matching codebase, called OpenStereo. OpenStereo includes training and inference code for more than 10 network models, making it, to our knowledge, the most complete stereo matching toolbox available. Based on OpenStereo, we conducted experiments and have achieved or surpassed the performance metrics reported in the original papers. Additionally, we conduct an exhaustive analysis and deconstruction of recent developments in stereo matching through comprehensive ablative experiments. These investigations inspired the creation of StereoBase, a strong baseline model. Our StereoBase ranks 1st on SceneFlow, KITTI 2015, and KITTI 2012 (Reflective) among published methods and achieves the best performance across all metrics. In addition, StereoBase has strong cross-dataset generalization. Code is available at https://github.com/XiandaGuo/OpenStereo.
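As a rough illustration of the disparity estimation task described in the abstract, the sketch below implements classical brute-force block matching with NumPy. It is not OpenStereo's API, nor any of the benchmarked networks; the function name `block_matching_disparity`, its parameters, and the synthetic image pair are assumptions made purely for illustration.

```python
import numpy as np


def block_matching_disparity(left, right, max_disp, half_window=2):
    """Hypothetical helper: per-pixel disparity via windowed SAD block matching.

    Assumes rectified grayscale images of equal shape and max_disp < width.
    A left pixel at column x is compared with the right pixel at column x - d.
    """
    left = left.astype(np.float64)
    right = right.astype(np.float64)
    h, w = left.shape
    k = 2 * half_window + 1

    # cost[d, y, x] holds the matching cost of assigning disparity d to (y, x).
    # Entries with x < d stay at infinity because x - d falls outside the image.
    cost = np.full((max_disp + 1, h, w), np.inf)

    for d in range(max_disp + 1):
        # Absolute difference between left(x) and right(x - d).
        diff = np.abs(left[:, d:] - right[:, : w - d])
        # Sum the differences over a k x k window (simple box aggregation).
        padded = np.pad(diff, half_window, mode="edge")
        agg = np.zeros_like(diff)
        for dy in range(k):
            for dx in range(k):
                agg += padded[dy : dy + h, dx : dx + (w - d)]
        cost[d, :, d:] = agg

    # Winner-takes-all: pick the disparity with the lowest aggregated cost.
    return np.argmin(cost, axis=0)


if __name__ == "__main__":
    # Synthetic pair: the right view is the left view shifted by 3 pixels.
    rng = np.random.default_rng(0)
    left = rng.random((20, 40))
    right = np.roll(left, -3, axis=1)
    disp = block_matching_disparity(left, right, max_disp=8)
    print(np.unique(disp[:, 8:]))
```

Learning-based methods such as those collected in the codebase replace the hand-crafted cost and winner-takes-all step with learned features and cost aggregation, but the output has the same form: one disparity value per pixel.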
Related papers
- Stereo Anything: Unifying Stereo Matching with Large-Scale Mixed Data [26.029499450825092]
We introduce StereoAnything, a solution for robust stereo matching.
We scale up the dataset by collecting labeled stereo images and generating synthetic stereo pairs from unlabeled monocular images.
We extensively evaluate the zero-shot capabilities of our model on five public datasets.
arXiv Detail & Related papers (2024-11-21T11:59:04Z) - Match Stereo Videos via Bidirectional Alignment [15.876953256378224]
Recent learning-based methods often focus on optimizing performance for independent stereo pairs, leading to temporal inconsistencies in videos.
We introduce a novel video processing framework, BiDAStereo, and a plugin stabilizer network, BiDAStabilizer, compatible with general image-based methods.
We present a realistic synthetic dataset and benchmark focused on natural scenes, along with a real-world dataset captured by a stereo camera in diverse urban scenes for qualitative evaluation.
arXiv Detail & Related papers (2024-09-30T13:37:29Z) - MC-Stereo: Multi-peak Lookup and Cascade Search Range for Stereo Matching [15.786593303130477]
We present a novel iterative optimization architecture called MC-Stereo.
It mitigates the multi-peak distribution problem in matching through the multi-peak lookup strategy.
It integrates the coarse-to-fine concept into the iterative framework via the cascade search range.
MC-Stereo ranks first among all publicly available methods on the KITTI-2012 and KITTI-2015 benchmarks.
arXiv Detail & Related papers (2023-11-04T07:26:27Z) - Simple and Controllable Music Generation [94.61958781346176]
MusicGen is a single Language Model (LM) that operates over several streams of compressed discrete music representation, i.e., tokens.
Unlike prior work, MusicGen is comprised of a single-stage transformer LM together with efficient token interleaving patterns.
arXiv Detail & Related papers (2023-06-08T15:31:05Z) - AdaStereo: An Efficient Domain-Adaptive Stereo Matching Approach [50.855679274530615]
We present a novel domain-adaptive approach called AdaStereo to align multi-level representations for deep stereo matching networks.
Our models achieve state-of-the-art cross-domain performance on multiple benchmarks, including KITTI, Middlebury, ETH3D and DrivingStereo.
Our method is robust to various domain adaptation settings, and can be easily integrated into quick adaptation application scenarios and real-world deployments.
arXiv Detail & Related papers (2021-12-09T15:10:47Z) - PVStereo: Pyramid Voting Module for End-to-End Self-Supervised Stereo Matching [14.603116313499648]
We propose a robust and effective self-supervised stereo matching approach, consisting of a pyramid voting module (PVM) and a novel DCNN architecture, referred to as OptStereo.
Specifically, our OptStereo first builds multi-scale cost volumes, and then adopts a recurrent unit to iteratively update disparity estimations at high resolution.
We publish the HKUST-Drive dataset, a large-scale synthetic stereo dataset, collected under different illumination and weather conditions for research purposes.
arXiv Detail & Related papers (2021-03-12T05:27:14Z) - On the confidence of stereo matching in a deep-learning era: a quantitative evaluation [124.09613797008099]
We review more than ten years of developments in the field of confidence estimation for stereo matching.
We study the different behaviors of each measure when applied to a pool of different stereo algorithms and, for the first time in literature, when paired with a state-of-the-art deep stereo network.
arXiv Detail & Related papers (2021-01-02T11:40:17Z) - Reversing the cycle: self-supervised deep stereo through enhanced monocular distillation [51.714092199995044]
In many fields, self-supervised learning solutions are rapidly evolving and filling the gap with supervised approaches.
We propose a novel self-supervised paradigm reversing the link between the two.
In order to train deep stereo networks, we distill knowledge through a monocular completion network.
arXiv Detail & Related papers (2020-08-17T07:40:22Z) - Self-adapting confidence estimation for stereo [48.56220165347967]
We propose a flexible and lightweight solution enabling self-adapting confidence estimation to the stereo algorithm or network.
Our strategy allows not only seamless integration with any stereo system but also, thanks to its self-adapting capability, out-of-the-box deployment in the field.
arXiv Detail & Related papers (2020-08-14T16:17:28Z) - AdaStereo: A Simple and Efficient Approach for Adaptive Stereo Matching [50.06646151004375]
A novel domain-adaptive pipeline called AdaStereo aims to align multi-level representations for deep stereo matching networks.
Our AdaStereo models achieve state-of-the-art cross-domain performance on multiple stereo benchmarks, including KITTI, Middlebury, ETH3D, and DrivingStereo.
arXiv Detail & Related papers (2020-04-09T16:15:13Z)
This list is automatically generated from the titles and abstracts of the papers in this site.