Related papers: FG-Depth: Flow-Guided Unsupervised Monocular Depth Estimation

URL: http://arxiv.org/abs/2301.08414v1
Date: Fri, 20 Jan 2023 04:02:13 GMT
Title: FG-Depth: Flow-Guided Unsupervised Monocular Depth Estimation
Authors: Junyu Zhu, Lina Liu, Yong Liu, Wanlong Li, Feng Wen and Hongbo Zhang
Abstract summary: We propose a flow distillation loss to replace the typical photometric loss and a prior flow based mask to remove invalid pixels. Our approach achieves state-of-the-art results on both KITTI and NYU-Depth-v2 datasets.
Score: 17.572459787107427
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Many works have demonstrated the great potential of unsupervised monocular depth estimation, owing to its low annotation cost and accuracy comparable to supervised methods. To further improve performance, recent works mainly focus on designing more complex network structures and exploiting extra supervised information, e.g., semantic segmentation. These methods optimize the models by exploiting, to varying degrees, the reconstruction relationship between the target and reference images. However, previous works have shown that this image reconstruction optimization is prone to getting trapped in local minima. In this paper, our core idea is to guide the optimization with prior knowledge from a pretrained Flow-Net. We show that the bottleneck of unsupervised monocular depth estimation can be broken with our simple but effective framework, named FG-Depth. In particular, we propose (i) a flow distillation loss to replace the typical photometric loss, which limits the capacity of the model, and (ii) a prior-flow-based mask to remove invalid pixels that introduce noise into the training loss. Extensive experiments demonstrate the effectiveness of each component, and our approach achieves state-of-the-art results on both KITTI and NYU-Depth-v2 datasets.
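
The abstract names the two components but does not say how they are computed. The sketch below is one plausible reading, not the authors' implementation. It assumes that the flow being distilled is the rigid flow induced by predicted depth and camera pose, that the loss is a masked L1 distance to the flow from a pretrained network, and that the mask comes from a forward-backward consistency check on the prior flow. The function names, the pinhole intrinsics `fx, fy, cx, cy`, and the threshold are all hypothetical.

```python
from typing import List, Sequence, Tuple

Flow = List[List[Tuple[float, float]]]  # flow[v][u] = (du, dv) in pixels


def rigid_flow(depth: Sequence[Sequence[float]], fx: float, fy: float,
               cx: float, cy: float, R: Sequence[Sequence[float]],
               t: Sequence[float]) -> Flow:
    """Flow induced by camera motion (R, t) on a static scene (assumes depth > 0)."""
    flow = []
    for v, row in enumerate(depth):
        out_row = []
        for u, d in enumerate(row):
            # Back-project pixel (u, v) to a 3D point in the target camera frame.
            X = ((u - cx) / fx * d, (v - cy) / fy * d, d)
            # Move the point into the reference camera frame.
            Xr = [sum(R[i][j] * X[j] for j in range(3)) + t[i] for i in range(3)]
            # Project to reference pixels and take the displacement.
            u2 = fx * Xr[0] / Xr[2] + cx
            v2 = fy * Xr[1] / Xr[2] + cy
            out_row.append((u2 - u, v2 - v))
        flow.append(out_row)
    return flow


def consistency_mask(fwd: Flow, bwd: Flow, thresh: float = 1.0) -> List[List[bool]]:
    """Assumed mask: keep pixels whose forward flow is undone by the backward flow."""
    h, w = len(fwd), len(fwd[0])
    mask = []
    for v in range(h):
        row = []
        for u in range(w):
            du, dv = fwd[v][u]
            u2, v2 = round(u + du), round(v + dv)  # nearest-neighbour lookup
            if 0 <= u2 < w and 0 <= v2 < h:
                bu, bv = bwd[v2][u2]
                row.append(abs(du + bu) + abs(dv + bv) < thresh)
            else:
                row.append(False)  # landed outside the image: treat as invalid
        mask.append(row)
    return mask


def flow_distillation_loss(pred: Flow, prior: Flow, mask: List[List[bool]]) -> float:
    """Mean L1 distance between predicted and prior flow over valid pixels."""
    total, count = 0.0, 0
    for p_row, q_row, m_row in zip(pred, prior, mask):
        for (pu, pv), (qu, qv), m in zip(p_row, q_row, m_row):
            if m:
                total += abs(pu - qu) + abs(pv - qv)
                count += 1
    return total / count if count else 0.0
```

In a training setup along these lines, `depth`, `R`, and `t` would come from the depth and pose networks, while the prior forward and backward flows would come from the frozen flow network. The loss then supervises depth and pose through `rigid_flow` rather than through pixel intensities.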

Related papers

Efficient Diffusion as Low Light Enhancer [63.789138528062225]
Reflectance-Aware Trajectory Refinement (RATR) is a simple yet effective module to refine the teacher trajectory using the reflectance component of images. Reflectance-aware Diffusion with Distilled Trajectory (ReDDiT) is an efficient and flexible distillation framework tailored for Low-Light Image Enhancement (LLIE).
arXiv Detail & Related papers (2024-10-16T08:07:18Z)
Fine-Tuning Image-Conditional Diffusion Models is Easier than You Think [53.2706196341054]
We show that the perceived inefficiency was caused by a flaw in the inference pipeline that has so far gone unnoticed. We perform end-to-end fine-tuning on top of the single-step model with task-specific losses and get a deterministic model that outperforms all other diffusion-based depth and normal estimation models.
arXiv Detail & Related papers (2024-09-17T16:58:52Z)
Unsupervised Monocular Depth Estimation Based on Hierarchical Feature-Guided Diffusion [21.939618694037108]
Unsupervised monocular depth estimation has received widespread attention because of its capability to train without ground truth. We employ a well-converging diffusion model among generative networks for unsupervised monocular depth estimation. This model significantly enriches the model's capacity for learning and interpreting depth distribution.
arXiv Detail & Related papers (2024-06-14T07:31:20Z)
DepthFM: Fast Monocular Depth Estimation with Flow Matching [22.206355073676082]
Current discriminative approaches to this problem are limited due to blurry artifacts. State-of-the-art generative methods suffer from slow sampling due to their SDE nature. We observe that this can be effectively framed using flow matching, since its straight trajectories through solution space offer efficiency and high quality.
arXiv Detail & Related papers (2024-03-20T17:51:53Z)
The Surprising Effectiveness of Diffusion Models for Optical Flow and Monocular Depth Estimation [42.48819460873482]
Denoising diffusion probabilistic models have transformed image generation with their impressive fidelity and diversity. We show that they also excel in estimating optical flow and monocular depth, surprisingly, without task-specific architectures and loss functions.
arXiv Detail & Related papers (2023-06-02T21:26:20Z)
Unpaired Overwater Image Defogging Using Prior Map Guided CycleGAN [60.257791714663725]
We propose a Prior map Guided CycleGAN (PG-CycleGAN) for defogging of images with overwater scenes. The proposed method outperforms the state-of-the-art supervised, semi-supervised, and unsupervised defogging approaches.
arXiv Detail & Related papers (2022-12-23T03:00:28Z)
CbwLoss: Constrained Bidirectional Weighted Loss for Self-supervised Learning of Depth and Pose [13.581694284209885]
Photometric differences are used to train neural networks for estimating depth and camera pose from unlabeled monocular videos. In this paper, we deal with moving objects and occlusions utilizing the difference of the flow fields and depth structure generated by affine transformation and view synthesis. We mitigate the effect of textureless regions on model optimization by measuring differences between features with more semantic and contextual information without adding networks.
arXiv Detail & Related papers (2022-12-12T12:18:24Z)
Frequency-Aware Self-Supervised Monocular Depth Estimation [41.97188738587212]
We present two versatile methods to enhance self-supervised monocular depth estimation models. The high generalizability of our methods is achieved by solving the fundamental and ubiquitous problems in the photometric loss function. We are the first to propose blurring images to improve depth estimators with an interpretable analysis.
arXiv Detail & Related papers (2022-10-11T14:30:26Z)
Image-specific Convolutional Kernel Modulation for Single Image Super-resolution [85.09413241502209]
In this issue, we propose a novel image-specific convolutional modulation kernel (IKM). We exploit the global contextual information of the image or features to generate an attention weight for adaptively modulating the convolutional kernels. Experiments on single image super-resolution show that the proposed methods achieve superior performance over state-of-the-art methods.
arXiv Detail & Related papers (2021-11-16T11:05:10Z)
NerfingMVS: Guided Optimization of Neural Radiance Fields for Indoor Multi-view Stereo [97.07453889070574]
We present a new multi-view depth estimation method that utilizes both conventional SfM reconstruction and learning-based priors. We show that our proposed framework significantly outperforms state-of-the-art methods on indoor scenes.
arXiv Detail & Related papers (2021-09-02T17:54:31Z)
SynDistNet: Self-Supervised Monocular Fisheye Camera Distance Estimation Synergized with Semantic Segmentation for Autonomous Driving [37.50089104051591]
State-of-the-art self-supervised learning approaches for monocular depth estimation usually suffer from scale ambiguity. This paper introduces a novel multi-task learning strategy to improve self-supervised monocular distance estimation on fisheye and pinhole camera images.
arXiv Detail & Related papers (2020-08-10T10:52:47Z)
What Matters in Unsupervised Optical Flow [51.45112526506455]
We compare and analyze a set of key components in unsupervised optical flow. We construct a number of novel improvements to unsupervised flow models. We present a new unsupervised flow technique that significantly outperforms the previous state-of-the-art.
arXiv Detail & Related papers (2020-06-08T19:36:26Z)

This list is automatically generated from the titles and abstracts of the papers on this site.