Related papers: RomniStereo: Recurrent Omnidirectional Stereo Matching

URL: http://arxiv.org/abs/2401.04345v2
Date: Fri, 26 Jan 2024 03:02:34 GMT
Title: RomniStereo: Recurrent Omnidirectional Stereo Matching
Authors: Hualie Jiang, Rui Xu, Minglang Tan and Wenjie Jiang
Abstract summary: We propose a recurrent omnidirectional stereo matching (RomniStereo) algorithm. Our best model improves the average MAE metric by 40.7% over the previous SOTA baseline. When visualizing the results, our models demonstrate clear advantages on both synthetic and realistic examples.
Score: 6.153793254880079
License: http://creativecommons.org/licenses/by-nc-sa/4.0/
Abstract: Omnidirectional stereo matching (OSM) is an essential and reliable means for $360^{\circ}$ depth sensing. However, following earlier works on conventional stereo matching, prior state-of-the-art (SOTA) methods rely on a 3D encoder-decoder block to regularize the cost volume, which makes the whole system complicated and leads to sub-optimal results. Recently, the Recurrent All-pairs Field Transforms (RAFT) based approach employs the recurrent update in 2D and has efficiently improved image-matching tasks, i.e., optical flow and stereo matching. To bridge the gap between OSM and RAFT, we mainly propose an opposite adaptive weighting scheme to seamlessly transform the outputs of spherical sweeping of OSM into the required inputs for the recurrent update, thus creating a recurrent omnidirectional stereo matching (RomniStereo) algorithm. Furthermore, we introduce two techniques, i.e., grid embedding and adaptive context feature generation, which also contribute to RomniStereo's performance. Our best model improves the average MAE metric by 40.7% over the previous SOTA baseline across five datasets. When visualizing the results, our models demonstrate clear advantages on both synthetic and realistic examples. The code is available at https://github.com/HalleyJiang/RomniStereo.

Related papers

Boosting Omnidirectional Stereo Matching with a Pre-trained Depth Foundation Model [62.37493746544967]
Camera-based setups offer a cost-effective option by using stereo depth estimation to generate dense, high-resolution depth maps. Existing omnidirectional stereo matching approaches achieve only limited depth accuracy across diverse environments. We present DFI-OmniStereo, a novel omnidirectional stereo matching method that leverages a large-scale pre-trained foundation model for relative monocular depth estimation.
arXiv Detail & Related papers (2025-03-30T16:24:22Z)
Mono2Stereo: A Benchmark and Empirical Study for Stereo Conversion [88.67015254278859]
We introduce the Mono2Stereo dataset, providing high-quality training data and a benchmark to support in-depth exploration of stereo conversion. We conduct an empirical study that yields two primary findings. 1) The differences between the left and right views are subtle, yet existing metrics consider overall pixels, failing to concentrate on regions critical to stereo effects. We introduce a new evaluation metric, Stereo Intersection-over-Union, which harmonizes disparity and achieves a high correlation with human judgments on stereo effect.
arXiv Detail & Related papers (2025-03-28T09:25:58Z)
FoundationStereo: Zero-Shot Stereo Matching [50.79202911274819]
FoundationStereo is a foundation model for stereo depth estimation. We first construct a large-scale (1M stereo pairs) synthetic training dataset. We then design a number of network architecture components to enhance scalability.
arXiv Detail & Related papers (2025-01-17T01:01:44Z)
ALOcc: Adaptive Lifting-based 3D Semantic Occupancy and Cost Volume-based Flow Prediction [89.89610257714006]
Existing methods prioritize higher accuracy to cater to the demands of these tasks. We introduce a series of targeted improvements for 3D semantic occupancy prediction and flow estimation. Our purely convolutional architecture framework, named ALOcc, achieves an optimal tradeoff between speed and accuracy.
arXiv Detail & Related papers (2024-11-12T11:32:56Z)
LightStereo: Channel Boost Is All Your Need for Efficient 2D Cost Aggregation [27.00836175513738]
LightStereo is a cutting-edge stereo-matching network crafted to accelerate the matching process. Our breakthrough lies in enhancing performance through a dedicated focus on the channel dimension of the 3D cost volume. LightStereo achieves a competitive EPE metric in the SceneFlow datasets while demanding a minimum of only 22 GFLOPs and 17 ms of runtime.
arXiv Detail & Related papers (2024-06-28T11:11:24Z)
MC-Stereo: Multi-peak Lookup and Cascade Search Range for Stereo Matching [15.786593303130477]
We present a novel iterative optimization architecture called MC-Stereo. It mitigates the multi-peak distribution problem in matching through the multi-peak lookup strategy. It integrates the coarse-to-fine concept into the iterative framework via the cascade search range. MC-Stereo ranks first among all publicly available methods on the KITTI-2012 and KITTI-2015 benchmarks.
arXiv Detail & Related papers (2023-11-04T07:26:27Z)
Uncertainty Guided Adaptive Warping for Robust and Efficient Stereo Matching [77.133400999703]
Correlation-based stereo matching has achieved outstanding performance. Current methods with a fixed model do not work uniformly well across various datasets. This paper proposes a new perspective to dynamically calculate correlation for robust stereo matching.
arXiv Detail & Related papers (2023-07-26T09:47:37Z)
AdaStereo: An Efficient Domain-Adaptive Stereo Matching Approach [50.855679274530615]
We present a novel domain-adaptive approach called AdaStereo to align multi-level representations for deep stereo matching networks. Our models achieve state-of-the-art cross-domain performance on multiple benchmarks, including KITTI, Middlebury, ETH3D and DrivingStereo. Our method is robust to various domain adaptation settings, and can be easily integrated into quick adaptation application scenarios and real-world deployments.
arXiv Detail & Related papers (2021-12-09T15:10:47Z)
RAFT-Stereo: Multilevel Recurrent Field Transforms for Stereo Matching [60.44903340167672]
We introduce RAFT-Stereo, a new deep architecture for rectified stereo based on the optical flow network RAFT. We introduce multi-level convolutional GRUs, which more efficiently propagate information across the image. A modified version of RAFT-Stereo can perform accurate real-time inference.
arXiv Detail & Related papers (2021-09-15T19:27:31Z)
SMD-Nets: Stereo Mixture Density Networks [68.56947049719936]
We propose Stereo Mixture Density Networks (SMD-Nets), a simple yet effective learning framework compatible with a wide class of 2D and 3D architectures. Specifically, we exploit bimodal mixture densities as output representation and show that this allows for sharp and precise disparity estimates near discontinuities. We carry out comprehensive experiments on a new high-resolution and highly realistic synthetic stereo dataset, consisting of stereo pairs at 8Mpx resolution, as well as on real-world stereo datasets.
arXiv Detail & Related papers (2021-04-08T16:15:46Z)
PVStereo: Pyramid Voting Module for End-to-End Self-Supervised Stereo Matching [14.603116313499648]
We propose a robust and effective self-supervised stereo matching approach, consisting of a pyramid voting module (PVM) and a novel DCNN architecture, referred to as OptStereo. Specifically, our OptStereo first builds multi-scale cost volumes, and then adopts a recurrent unit to iteratively update disparity estimations at high resolution. We publish the HKUST-Drive dataset, a large-scale synthetic stereo dataset, collected under different illumination and weather conditions for research purposes.
arXiv Detail & Related papers (2021-03-12T05:27:14Z)
AdaStereo: A Simple and Efficient Approach for Adaptive Stereo Matching [50.06646151004375]
A novel domain-adaptive pipeline called AdaStereo aims to align multi-level representations for deep stereo matching networks. Our AdaStereo models achieve state-of-the-art cross-domain performance on multiple stereo benchmarks, including KITTI, Middlebury, ETH3D, and DrivingStereo.
arXiv Detail & Related papers (2020-04-09T16:15:13Z)

This list is automatically generated from the titles and abstracts of the papers on this site.