Dusk Till Dawn: Self-supervised Nighttime Stereo Depth Estimation using Visual Foundation Models
- URL: http://arxiv.org/abs/2405.11158v1
- Date: Sat, 18 May 2024 03:07:23 GMT
- Title: Dusk Till Dawn: Self-supervised Nighttime Stereo Depth Estimation using Visual Foundation Models
- Authors: Madhu Vankadari, Samuel Hodgson, Sangyun Shin, Kaichen Zhou, Andrew Markham, Niki Trigoni
- Abstract summary: Self-supervised depth estimation algorithms rely heavily on frame-warping relationships.
We introduce an algorithm designed to achieve accurate self-supervised stereo depth estimation focusing on nighttime conditions.
- Score: 16.792458193160407
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Self-supervised depth estimation algorithms rely heavily on frame-warping relationships, exhibiting substantial performance degradation when applied in challenging circumstances, such as low-visibility and nighttime scenarios with varying illumination conditions. Addressing this challenge, we introduce an algorithm designed to achieve accurate self-supervised stereo depth estimation focusing on nighttime conditions. Specifically, we use pretrained visual foundation models to extract generalised features across challenging scenes and present an efficient method for matching and integrating these features from stereo frames. Moreover, to prevent pixels violating photometric consistency assumption from negatively affecting the depth predictions, we propose a novel masking approach designed to filter out such pixels. Lastly, addressing weaknesses in the evaluation of current depth estimation algorithms, we present novel evaluation metrics. Our experiments, conducted on challenging datasets including Oxford RobotCar and Multi-Spectral Stereo, demonstrate the robust improvements realized by our approach. Code is available at: https://github.com/madhubabuv/dtd
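The masking idea described in the abstract (filtering out pixels that violate the photometric-consistency assumption) can be sketched as a simple per-pixel test on reconstruction error. The function below is an illustrative auto-masking sketch, not the authors' exact formulation; `thresh_ratio` is a hypothetical parameter:

```python
import numpy as np

def photometric_mask(target, warped, source, thresh_ratio=1.0):
    """Keep pixels whose warped-reconstruction error is lower than the raw
    (unwarped) error between the two frames.

    Pixels failing this test likely violate the photometric-consistency
    assumption (e.g. moving objects or varying nighttime illumination) and
    would be excluded from the self-supervised loss. Generic sketch only.
    """
    warp_err = np.abs(target - warped).mean(axis=-1)   # per-pixel L1 error after warping
    ident_err = np.abs(target - source).mean(axis=-1)  # per-pixel L1 error without warping
    return warp_err < thresh_ratio * ident_err         # boolean mask of valid pixels
```

In practice such a mask is applied element-wise to the photometric loss before averaging, so unreliable pixels contribute no gradient.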
Related papers
- D$^3$epth: Self-Supervised Depth Estimation with Dynamic Mask in Dynamic Scenes [23.731667977542454]
D$^3$epth is a novel method for self-supervised depth estimation in dynamic scenes.
It tackles the challenge of dynamic objects from two key perspectives.
It consistently outperforms existing self-supervised monocular depth estimation baselines.
arXiv Detail & Related papers (2024-11-07T16:07:00Z)
- Adaptive Stereo Depth Estimation with Multi-Spectral Images Across All Lighting Conditions [58.88917836512819]
We propose a novel framework incorporating stereo depth estimation to enforce accurate geometric constraints.
To mitigate the effects of poor lighting on stereo matching, we introduce Degradation Masking.
Our method achieves state-of-the-art (SOTA) performance on the Multi-Spectral Stereo (MS2) dataset.
arXiv Detail & Related papers (2024-11-06T03:30:46Z)
- Depth-aware Volume Attention for Texture-less Stereo Matching [67.46404479356896]
We propose a lightweight volume refinement scheme to tackle the texture deterioration in practical outdoor scenarios.
We introduce a depth volume supervised by the ground-truth depth map, capturing the relative hierarchy of image texture.
Local fine structure and context are emphasized to mitigate ambiguity and redundancy during volume aggregation.
arXiv Detail & Related papers (2024-02-14T04:07:44Z)
- SC-DepthV3: Robust Self-supervised Monocular Depth Estimation for Dynamic Scenes [58.89295356901823]
Self-supervised monocular depth estimation has shown impressive results in static scenes.
However, it relies on the multi-view consistency assumption to train networks, which is violated in dynamic object regions.
We introduce an external pretrained monocular depth estimation model for generating single-image depth prior.
Our model can predict sharp and accurate depth maps, even when training from monocular videos of highly-dynamic scenes.
arXiv Detail & Related papers (2022-11-07T16:17:47Z)
- Uncertainty Guided Depth Fusion for Spike Camera [49.41822923588663]
We propose a novel Uncertainty-Guided Depth Fusion (UGDF) framework to fuse predictions of monocular and stereo depth estimation networks for spike camera.
Our framework is motivated by the fact that stereo spike depth estimation achieves better results at close range.
In order to demonstrate the advantage of spike depth estimation over traditional camera depth estimation, we contribute a spike-depth dataset named CitySpike20K.
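The fusion idea in this entry (combining monocular and stereo predictions so the more reliable branch dominates per pixel) can be sketched with inverse-uncertainty weighting. This is an illustrative sketch under that assumption, not UGDF's exact fusion rule:

```python
import numpy as np

def fuse_depths(d_mono, d_stereo, u_mono, u_stereo, eps=1e-6):
    """Fuse monocular and stereo depth maps by inverse-uncertainty weighting.

    Stereo tends to be more reliable at close range and monocular at long
    range; weighting each prediction by the inverse of its estimated
    uncertainty lets the more confident branch dominate at each pixel.
    """
    w_mono = 1.0 / (u_mono + eps)      # high uncertainty -> low weight
    w_stereo = 1.0 / (u_stereo + eps)
    return (w_mono * d_mono + w_stereo * d_stereo) / (w_mono + w_stereo)
```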
arXiv Detail & Related papers (2022-08-26T13:04:01Z)
- Gated2Gated: Self-Supervised Depth Estimation from Gated Images [22.415893281441928]
Gated cameras hold promise as an alternative to scanning LiDAR sensors, providing high-resolution 3D depth.
We propose an entirely self-supervised depth estimation method that uses gated intensity profiles and temporal consistency as a training signal.
arXiv Detail & Related papers (2021-12-04T19:47:38Z)
- Regularizing Nighttime Weirdness: Efficient Self-supervised Monocular Depth Estimation in the Dark [20.66405067066299]
We introduce Priors-Based Regularization to learn distribution knowledge from unpaired depth maps.
We also leverage Mapping-Consistent Image Enhancement module to enhance image visibility and contrast.
Our framework achieves remarkable improvements and state-of-the-art results on two nighttime datasets.
arXiv Detail & Related papers (2021-08-09T06:24:35Z)
- Unsupervised Monocular Depth Estimation in Highly Complex Environments [9.580317751486636]
Unsupervised monocular depth estimation methods mainly focus on the day-time scenario.
In some challenging environments, such as night, rainy nights, or snowy winters, the photometry of the same pixel across frames is inconsistent.
We address this problem with domain adaptation, proposing a unified image-transfer-based adaptation framework.
arXiv Detail & Related papers (2021-07-28T02:35:38Z)
- Adaptive confidence thresholding for monocular depth estimation [83.06265443599521]
We propose a new approach to leverage pseudo ground truth depth maps of stereo images generated from self-supervised stereo matching methods.
The confidence map of the pseudo ground truth depth map is estimated to mitigate performance degeneration by inaccurate pseudo depth maps.
Experimental results demonstrate superior performance to state-of-the-art monocular depth estimation methods.
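Down-weighting pseudo ground-truth depth by a confidence map, as described in this entry, can be sketched as a confidence-gated loss. The threshold `tau` and the hard gating below are hypothetical illustrative choices, not the paper's exact adaptive scheme:

```python
import numpy as np

def confidence_weighted_loss(pred, pseudo_gt, confidence, tau=0.5):
    """Supervise a depth network with pseudo ground-truth stereo depth,
    discarding pixels whose estimated confidence falls below a threshold.

    This mitigates the performance degeneration caused by inaccurate
    pseudo depth maps. Illustrative sketch only.
    """
    weight = np.where(confidence >= tau, confidence, 0.0)  # zero out low-confidence pixels
    err = np.abs(pred - pseudo_gt)                         # per-pixel L1 error
    return (weight * err).sum() / (weight.sum() + 1e-6)    # confidence-weighted mean
```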
arXiv Detail & Related papers (2020-09-27T13:26:16Z)
- Occlusion-Aware Depth Estimation with Adaptive Normal Constraints [85.44842683936471]
We present a new learning-based method for multi-frame depth estimation from a color video.
Our method outperforms the state-of-the-art in terms of depth estimation accuracy.
arXiv Detail & Related papers (2020-04-02T07:10:45Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.