RAFT-Stereo: Multilevel Recurrent Field Transforms for Stereo Matching
- URL: http://arxiv.org/abs/2109.07547v1
- Date: Wed, 15 Sep 2021 19:27:31 GMT
- Title: RAFT-Stereo: Multilevel Recurrent Field Transforms for Stereo Matching
- Authors: Lahav Lipson, Zachary Teed, Jia Deng
- Abstract summary: We introduce RAFT-Stereo, a new deep architecture for rectified stereo based on the optical flow network RAFT.
We introduce multi-level convolutional GRUs, which more efficiently propagate information across the image.
A modified version of RAFT-Stereo can perform accurate real-time inference.
- Score: 60.44903340167672
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We introduce RAFT-Stereo, a new deep architecture for rectified stereo based
on the optical flow network RAFT. We introduce multi-level convolutional GRUs,
which more efficiently propagate information across the image. A modified
version of RAFT-Stereo can perform accurate real-time inference. RAFT-Stereo
ranks first on the Middlebury leaderboard, outperforming the next best method
on 1px error by 29%, and outperforms all published work on the ETH3D two-view
stereo benchmark. Code is available at
https://github.com/princeton-vl/RAFT-Stereo.
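The "1px error" quoted in the abstract is a bad-pixel rate: the share of pixels whose predicted disparity is off by more than one pixel. The snippet below is a minimal sketch of that metric, not the benchmark's evaluation code. The function name `bad_pixel_rate`, the use of `None` for pixels without ground truth, and the sample values are all assumptions made for illustration.

```python
def bad_pixel_rate(pred, gt, threshold=1.0):
    """Percentage of valid pixels whose absolute disparity error exceeds `threshold`.

    pred, gt: flat sequences of disparities in pixels.
    A ground-truth value of None marks a pixel with no measurement (assumed convention).
    """
    errors = [abs(p - g) for p, g in zip(pred, gt) if g is not None]
    if not errors:
        raise ValueError("no valid ground-truth pixels")
    return 100.0 * sum(e > threshold for e in errors) / len(errors)


# Hypothetical disparities for four pixels; the last has no ground truth.
pred = [10.2, 35.0, 7.9, 20.0]
gt = [10.0, 32.5, 8.0, None]
rate = bad_pixel_rate(pred, gt)  # one of three valid pixels is off by more than 1 px
```

A relative improvement such as the 29% above compares two such rates, so it is a reduction relative to the other method's error rather than a drop of 29 percentage points.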
Related papers
- UniTT-Stereo: Unified Training of Transformer for Enhanced Stereo Matching [18.02254687807291]
UniTT-Stereo is a method to maximize the potential of Transformer-based stereo architectures.
State-of-the-art performance of UniTT-Stereo is validated on various benchmarks such as ETH3D, KITTI 2012, and KITTI 2015 datasets.
arXiv Detail & Related papers (2024-09-04T09:02:01Z)
- RomniStereo: Recurrent Omnidirectional Stereo Matching [6.153793254880079]
We propose a recurrent omnidirectional stereo matching (RomniStereo) algorithm.
Our best model improves the average MAE metric by 40.7% over the previous SOTA baseline.
When visualizing the results, our models demonstrate clear advantages on both synthetic and realistic examples.
arXiv Detail & Related papers (2024-01-09T04:06:01Z)
- OpenStereo: A Comprehensive Benchmark for Stereo Matching and Strong Baseline [25.4712469033627]
We develop a flexible and efficient stereo matching codebase, called OpenStereo.
OpenStereo includes training and inference code for more than 10 network models.
We conduct an exhaustive analysis and deconstruction of recent developments in stereo matching through comprehensive ablative experiments.
Our StereoBase ranks 1st on SceneFlow, KITTI 2015, and KITTI 2012 (Reflective) among published methods and achieves the best performance across all metrics.
arXiv Detail & Related papers (2023-12-01T04:35:47Z)
- Multiview Stereo with Cascaded Epipolar RAFT [73.7619703879639]
We address multiview stereo (MVS), an important 3D vision task that reconstructs a 3D model such as a dense point cloud from multiple calibrated images.
We propose CER-MVS, a new approach based on the RAFT (Recurrent All-Pairs Field Transforms) architecture developed for optical flow. CER-MVS introduces five new changes to RAFT: epipolar cost volumes, cost volume cascading, multiview fusion of cost volumes, dynamic supervision, and multiresolution fusion of depth maps.
arXiv Detail & Related papers (2022-05-09T18:17:05Z)
- AdaStereo: An Efficient Domain-Adaptive Stereo Matching Approach [50.855679274530615]
We present a novel domain-adaptive approach called AdaStereo to align multi-level representations for deep stereo matching networks.
Our models achieve state-of-the-art cross-domain performance on multiple benchmarks, including KITTI, Middlebury, ETH3D and DrivingStereo.
Our method is robust to various domain adaptation settings, and can be easily integrated into quick adaptation application scenarios and real-world deployments.
arXiv Detail & Related papers (2021-12-09T15:10:47Z)
- Conditioning Trick for Training Stable GANs [70.15099665710336]
We propose a conditioning trick, called difference departure from normality, applied on the generator network in response to instability issues during GAN training.
We force the generator to get closer to the departure from normality function of real samples computed in the spectral domain of Schur decomposition.
arXiv Detail & Related papers (2020-10-12T16:50:22Z)
- AdaStereo: A Simple and Efficient Approach for Adaptive Stereo Matching [50.06646151004375]
A novel domain-adaptive pipeline called AdaStereo aims to align multi-level representations for deep stereo matching networks.
Our AdaStereo models achieve state-of-the-art cross-domain performance on multiple stereo benchmarks, including KITTI, Middlebury, ETH3D, and DrivingStereo.
arXiv Detail & Related papers (2020-04-09T16:15:13Z)
- RAFT: Recurrent All-Pairs Field Transforms for Optical Flow [78.92562539905951]
We introduce Recurrent All-Pairs Field Transforms (RAFT), a new deep network architecture for optical flow.
RAFT extracts per-pixel features, builds multi-scale 4D correlation volumes for all pairs of pixels, and iteratively updates a flow field.
RAFT achieves state-of-the-art performance.
arXiv Detail & Related papers (2020-03-26T17:12:42Z)
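The all-pairs correlation step described in the RAFT entry can be sketched in a few lines. This is an illustrative toy in plain Python, not the authors' implementation: the helper names `all_pairs_correlation` and `pool_last_two` are hypothetical, features are nested lists rather than tensors, any feature normalization is left out, and the iterative flow update is omitted. Average-pooling the last two dimensions is assumed here as one way to obtain a coarser level of a multi-scale volume.

```python
def all_pairs_correlation(f1, f2):
    """Build a 4D volume C[i][j][k][l] = dot(f1[i][j], f2[k][l]).

    f1, f2: H x W x D nested lists of per-pixel feature vectors.
    """
    return [[[[sum(a * b for a, b in zip(f1[i][j], f2[k][l]))
               for l in range(len(f2[0]))]
              for k in range(len(f2))]
             for j in range(len(f1[0]))]
            for i in range(len(f1))]


def pool_last_two(corr):
    """Average-pool the last two dimensions by a factor of 2 (even sizes assumed)."""
    def pool2d(m):
        return [[(m[2 * k][2 * l] + m[2 * k][2 * l + 1]
                  + m[2 * k + 1][2 * l] + m[2 * k + 1][2 * l + 1]) / 4.0
                 for l in range(len(m[0]) // 2)]
                for k in range(len(m) // 2)]
    return [[pool2d(plane) for plane in row] for row in corr]


# Toy 2x2 feature maps with 2-dimensional features (made-up values).
features = [[[1, 0], [0, 1]],
            [[1, 1], [0, 0]]]
corr = all_pairs_correlation(features, features)  # shape 2 x 2 x 2 x 2
coarse = pool_last_two(corr)                      # shape 2 x 2 x 1 x 1
```

Each pixel of the first image keeps its own 2D map of similarities to every pixel of the second image; pooling only the last two dimensions coarsens those maps while the first image stays at full resolution.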