Related papers: Level Set Binocular Stereo with Occlusions

URL: http://arxiv.org/abs/2109.03464v1
Date: Wed, 8 Sep 2021 07:22:25 GMT
Title: Level Set Binocular Stereo with Occlusions
Authors: Jialiang Wang, Todd Zickler
Abstract summary: Localizing stereo boundaries and predicting nearby disparities are difficult because stereo boundaries induce occluded regions where matching cues are absent. This paper introduces an energy and level-set optimizer that improves boundaries by encoding occlusion geometry. It can be implemented using messages that pass predominantly between parents and children in an undecimated hierarchy of image patches.
Score: 7.868449549351486
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Localizing stereo boundaries and predicting nearby disparities are difficult because stereo boundaries induce occluded regions where matching cues are absent. Most modern computer vision algorithms treat occlusions secondarily (e.g., via left-right consistency checks after matching) or rely on high-level cues to improve nearby disparities (e.g., via deep networks and large training sets). They ignore the geometry of stereo occlusions, which dictates that the spatial extent of occlusion must equal the amplitude of the disparity jump that causes it. This paper introduces an energy and level-set optimizer that improves boundaries by encoding occlusion geometry. Our model applies to two-layer, figure-ground scenes, and it can be implemented cooperatively using messages that pass predominantly between parents and children in an undecimated hierarchy of multi-scale image patches. In a small collection of figure-ground scenes curated from Middlebury and Falling Things stereo datasets, our model provides more accurate boundaries than previous occlusion-handling stereo techniques. This suggests new directions for creating cooperative stereo systems that incorporate occlusion cues in a human-like manner.
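
The occlusion constraint stated in the abstract (the spatial extent of an occlusion equals the amplitude of the disparity jump that causes it) can be checked on a toy example. The following is a minimal sketch, not the paper's energy or level-set optimizer: it assumes a single rectified scanline, integer disparities, and the convention that left-image column x appears at column x - d in the right image. The function name `occluded_mask` and all numeric values are hypothetical, chosen only for illustration.

```python
import numpy as np

def occluded_mask(disparity):
    """Mark left-image pixels on one scanline that are hidden in the right image.

    Assumed convention: left pixel x lands at column x - disparity[x] in the
    right image. When several left pixels land on the same column, only the
    one with the largest disparity (the nearest surface) is visible there;
    the others are treated as occluded.
    """
    disparity = np.asarray(disparity, dtype=int)
    x = np.arange(disparity.size)
    target = x - disparity
    # Largest disparity landing on each right-image column.
    best = {}
    for t, d in zip(target, disparity):
        best[t] = max(best.get(t, d), d)
    return np.array([d < best[t] for t, d in zip(target, disparity)])

# Hypothetical two-layer (figure-ground) scanline.
width, d_bg, d_fg = 40, 2, 7
disparity = np.full(width, d_bg)
disparity[20:30] = d_fg  # foreground occupies columns 20..29

mask = occluded_mask(disparity)
occluded_columns = np.flatnonzero(mask)
jump = d_fg - d_bg

print("occluded columns:", occluded_columns.tolist())
print("occlusion width:", int(mask.sum()), "disparity jump:", jump)
```

In this setup the occluded background pixels form a contiguous band next to the foreground's left edge, and the band's width matches the disparity jump as long as the foreground is at least that wide. Pixels that map outside the right image are not counted as occluded in this sketch.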

Related papers

StereoSpace: Depth-Free Synthesis of Stereo Geometry via End-to-End Diffusion in a Canonical Space [55.40440023281068]
We introduce StereoSpace, a diffusion-based framework for monocular-to-stereo synthesis. A canonical rectified space and the conditioning guide the generator to infer correspondences and fill disocclusions end-to-end.
arXiv Detail & Related papers (2025-12-11T18:59:59Z)
OmniDepth: Bridging Monocular and Stereo Reasoning with Latent Alignment [31.118114556998048]
We introduce OmniDepth, a unified framework that bridges monocular and stereo approaches to 3D estimation. At its core, a novel cross-attentive alignment mechanism dynamically synchronizes monocular contextual cues with stereo hypothesis representations. This mutual alignment resolves stereo ambiguities (e.g., specular surfaces) by injecting monocular structure priors while refining monocular depth with stereo geometry.
arXiv Detail & Related papers (2025-08-06T16:31:22Z)
Integrating Disparity Confidence Estimation into Relative Depth Prior-Guided Unsupervised Stereo Matching [55.784713740698365]
Unsupervised stereo matching has garnered significant attention for its independence from costly disparity annotations. A feasible solution lies in transferring 3D geometric knowledge from a relative depth map to the stereo matching networks. This work proposes a novel unsupervised learning framework to address these challenges.
arXiv Detail & Related papers (2025-08-02T09:11:05Z)
Boosting Omnidirectional Stereo Matching with a Pre-trained Depth Foundation Model [62.37493746544967]
Camera-based setups offer a cost-effective option by using stereo depth estimation to generate dense, high-resolution depth maps. Existing omnidirectional stereo matching approaches achieve only limited depth accuracy across diverse environments. We present DFI-OmniStereo, a novel omnidirectional stereo matching method that leverages a large-scale pre-trained foundation model for relative monocular depth estimation.
arXiv Detail & Related papers (2025-03-30T16:24:22Z)
Mono2Stereo: A Benchmark and Empirical Study for Stereo Conversion [88.67015254278859]
We introduce the Mono2Stereo dataset, providing high-quality training data and benchmark to support in-depth exploration of stereo conversion. We conduct an empirical study that yields two primary findings. 1) The differences between the left and right views are subtle, yet existing metrics consider overall pixels, failing to concentrate on regions critical to stereo effects. We introduce a new evaluation metric, Stereo Intersection-over-Union, which harmonizes disparity and achieves a high correlation with human judgments on stereo effect.
arXiv Detail & Related papers (2025-03-28T09:25:58Z)
Stereo Anything: Unifying Zero-shot Stereo Matching with Large-Scale Mixed Data [77.27700893908012]
Stereo matching serves as a cornerstone in 3D vision, aiming to establish pixel-wise correspondences between stereo image pairs for depth recovery. Current models often exhibit severe performance degradation when deployed in unseen domains. We introduce StereoAnything, a data-centric framework that substantially enhances the zero-shot generalization capability of existing stereo models.
arXiv Detail & Related papers (2024-11-21T11:59:04Z)
Bridging Stereo Geometry and BEV Representation with Reliable Mutual Interaction for Semantic Scene Completion [45.171150395915056]
3D semantic scene completion (SSC) is an ill-posed perception task that requires inferring a dense 3D scene from limited observations. Previous camera-based methods struggle to predict accurate semantic scenes due to inherent geometric ambiguity and incomplete observations. We resort to stereo matching and bird's-eye-view (BEV) representation learning to address these issues in SSC.
arXiv Detail & Related papers (2023-03-24T12:33:44Z)
Revisiting Domain Generalized Stereo Matching Networks from a Feature Consistency Perspective [65.37571681370096]
We propose a simple pixel-wise contrastive learning across the viewpoints. A stereo selective whitening loss is introduced to better preserve the stereo feature consistency across domains. Our method achieves superior performance over several state-of-the-art networks.
arXiv Detail & Related papers (2022-03-21T11:21:41Z)
AdaStereo: An Efficient Domain-Adaptive Stereo Matching Approach [50.855679274530615]
We present a novel domain-adaptive approach called AdaStereo to align multi-level representations for deep stereo matching networks. Our models achieve state-of-the-art cross-domain performance on multiple benchmarks, including KITTI, Middlebury, ETH3D and DrivingStereo. Our method is robust to various domain adaptation settings, and can be easily integrated into quick adaptation application scenarios and real-world deployments.
arXiv Detail & Related papers (2021-12-09T15:10:47Z)
Co-Teaching: An Ark to Unsupervised Stereo Matching [14.801038005597855]
CoT-Stereo is a novel unsupervised stereo matching approach. Experiments on the KITTI Stereo benchmarks demonstrate the superior performance of CoT-Stereo.
arXiv Detail & Related papers (2021-07-17T05:33:39Z)
H-Net: Unsupervised Attention-based Stereo Depth Estimation Leveraging Epipolar Geometry [4.968452390132676]
We introduce the H-Net, a deep-learning framework for unsupervised stereo depth estimation. For the first time, a Siamese autoencoder architecture is used for depth estimation. Our method outperforms the state-of-the-art unsupervised stereo depth estimation methods.
arXiv Detail & Related papers (2021-04-22T19:16:35Z)
SMD-Nets: Stereo Mixture Density Networks [68.56947049719936]
We propose Stereo Mixture Density Networks (SMD-Nets), a simple yet effective learning framework compatible with a wide class of 2D and 3D architectures. Specifically, we exploit bimodal mixture densities as output representation and show that this allows for sharp and precise disparity estimates near discontinuities. We carry out comprehensive experiments on a new high-resolution and highly realistic synthetic stereo dataset, consisting of stereo pairs at 8Mpx resolution, as well as on real-world stereo datasets.
arXiv Detail & Related papers (2021-04-08T16:15:46Z)
Reversing the cycle: self-supervised deep stereo through enhanced monocular distillation [51.714092199995044]
In many fields, self-supervised learning solutions are rapidly evolving and filling the gap with supervised approaches. We propose a novel self-supervised paradigm reversing the link between the two. In order to train deep stereo networks, we distill knowledge through a monocular completion network.
arXiv Detail & Related papers (2020-08-17T07:40:22Z)
Level Set Stereo for Cooperative Grouping with Occlusion [5.837881923712393]
Localizing stereo boundaries is difficult because matching cues are absent in the occluded regions that are adjacent to them. We introduce an energy and level-set disparity that improves boundaries by encoding the essential geometry of occlusions.
arXiv Detail & Related papers (2020-06-29T14:51:08Z)
StereoGAN: Bridging Synthetic-to-Real Domain Gap by Joint Optimization of Domain Translation and Stereo Matching [56.95846963856928]
Large-scale synthetic datasets are beneficial to stereo matching but usually introduce known domain bias. We propose an end-to-end training framework with domain translation and stereo matching networks to tackle this challenge.
arXiv Detail & Related papers (2020-05-05T03:11:38Z)
AdaStereo: A Simple and Efficient Approach for Adaptive Stereo Matching [50.06646151004375]
A novel domain-adaptive pipeline called AdaStereo aims to align multi-level representations for deep stereo matching networks. Our AdaStereo models achieve state-of-the-art cross-domain performance on multiple stereo benchmarks, including KITTI, Middlebury, ETH3D, and DrivingStereo.
arXiv Detail & Related papers (2020-04-09T16:15:13Z)

This list is automatically generated from the titles and abstracts of the papers on this site.