Depth self-supervision for single image novel view synthesis
- URL: http://arxiv.org/abs/2308.14108v1
- Date: Sun, 27 Aug 2023 13:50:15 GMT
- Title: Depth self-supervision for single image novel view synthesis
- Authors: Giovanni Minelli, Matteo Poggi, Samuele Salti
- Abstract summary: We tackle the problem of generating a novel image from an arbitrary viewpoint given a single frame as input.
We jointly optimize our framework for both novel view synthesis and depth estimation to unleash the synergy between the two.
- Score: 26.223796965401654
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we tackle the problem of generating a novel image from an
arbitrary viewpoint given a single frame as input. While existing methods
operating in this setup aim at predicting the target view depth map to guide
the synthesis without explicit supervision over that task, we jointly
optimize our framework for both novel view synthesis and depth estimation to
fully exploit the synergy between the two. Specifically, a shared depth
decoder is trained in a self-supervised manner to predict depth maps that are
consistent across the source and target views. Our results demonstrate the
effectiveness of our approach in addressing the challenges of both tasks
allowing for higher-quality generated images, as well as more accurate depth
for the target viewpoint.
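The cross-view depth consistency described in the abstract can be illustrated as a simple warping check: back-project source pixels with the predicted source depth, move them with the relative camera pose, and penalize disagreement with the predicted target depth at the projected locations. The sketch below is an illustrative numpy version under assumed pinhole intrinsics K and pose (R, t) with nearest-neighbour sampling; it is not the authors' implementation, and the function names are hypothetical.

```python
import numpy as np

def backproject(depth, K):
    """Lift each pixel (u, v) with depth d to the 3D point d * K^-1 [u, v, 1]^T."""
    h, w = depth.shape
    v, u = np.mgrid[0:h, 0:w]
    pix = np.stack([u, v, np.ones_like(u)], axis=-1).reshape(-1, 3).T  # 3 x N
    return (np.linalg.inv(K) @ pix) * depth.reshape(1, -1)             # 3 x N

def depth_consistency_loss(depth_src, depth_tgt, K, R, t):
    """Mean L1 gap between the source depth transformed into the target frame
    and the target depth sampled at the projected pixels."""
    h, w = depth_src.shape
    pts = backproject(depth_src, K)       # 3D points in the source frame
    pts = R @ pts + t.reshape(3, 1)       # same points in the target frame
    z = pts[2]                            # depth as seen from the target view
    proj = K @ pts
    u = np.round(proj[0] / z).astype(int)
    v = np.round(proj[1] / z).astype(int)
    valid = (u >= 0) & (u < w) & (v >= 0) & (v < h) & (z > 0)
    return np.abs(z[valid] - depth_tgt[v[valid], u[valid]]).mean()
```

With an identity pose and identical depth maps the loss is zero; any inconsistency between the two predicted maps contributes to the penalty, which is what makes the shared decoder's outputs agree across views.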
Related papers
- Diffusion Models for Monocular Depth Estimation: Overcoming Challenging Conditions [30.148969711689773]
We present a novel approach designed to address the complexities posed by challenging, out-of-distribution data in the single-image depth estimation task.
We systematically generate new, user-defined scenes with a comprehensive set of challenges and associated depth information.
This is achieved by leveraging cutting-edge text-to-image diffusion models with depth-aware control.
arXiv Detail & Related papers (2024-07-23T17:59:59Z) - Depth Information Assisted Collaborative Mutual Promotion Network for Single Image Dehazing [9.195173526948123]
We propose a dual-task collaborative mutual promotion framework to achieve the dehazing of a single image.
This framework integrates depth estimation and dehazing by a dual-task interaction mechanism.
We show that the proposed method outperforms state-of-the-art approaches.
arXiv Detail & Related papers (2024-03-02T06:29:44Z) - Panoptic-Depth Color Map for Combination of Depth and Image Segmentation [2.33877878310217]
We propose an innovative approach to combine image segmentation and depth estimation.
By incorporating an additional depth estimation branch into the segmentation network, it can predict the depth of each instance segment.
Our proposed method demonstrates a new possibility of combining different tasks and networks to generate a more comprehensive image recognition result.
arXiv Detail & Related papers (2023-08-24T17:25:09Z) - Multi-Camera Collaborative Depth Prediction via Consistent Structure
Estimation [75.99435808648784]
We propose a novel multi-camera collaborative depth prediction method.
It does not require large overlapping areas while maintaining structure consistency between cameras.
Experimental results on DDAD and NuScenes datasets demonstrate the superior performance of our method.
arXiv Detail & Related papers (2022-10-05T03:44:34Z) - NVS-MonoDepth: Improving Monocular Depth Prediction with Novel View
Synthesis [74.4983052902396]
We propose a novel training method split in three main steps to improve monocular depth estimation.
Experimental results prove that our method achieves state-of-the-art or comparable performance on the KITTI and NYU-Depth-v2 datasets.
arXiv Detail & Related papers (2021-12-22T12:21:08Z) - RigNet: Repetitive Image Guided Network for Depth Completion [20.66405067066299]
Recent approaches mainly focus on image guided learning to predict dense results.
However, blurry image guidance and object structures in depth still impede the performance of image-guided frameworks.
We explore a repetitive design in our image guided network to sufficiently and gradually recover depth values.
Our method achieves state-of-the-art result on the NYUv2 dataset and ranks 1st on the KITTI benchmark at the time of submission.
arXiv Detail & Related papers (2021-07-29T08:00:33Z) - Domain Adaptive Semantic Segmentation with Self-Supervised Depth Estimation [84.34227665232281]
Domain adaptation for semantic segmentation aims to improve the model performance in the presence of a distribution shift between source and target domain.
We leverage the guidance from self-supervised depth estimation, which is available on both domains, to bridge the domain gap.
We demonstrate the effectiveness of our proposed approach on the benchmark tasks SYNTHIA-to-Cityscapes and GTA-to-Cityscapes.
arXiv Detail & Related papers (2021-04-28T07:47:36Z) - Depth-conditioned Dynamic Message Propagation for Monocular 3D Object Detection [86.25022248968908]
We learn context- and depth-aware feature representation to solve the problem of monocular 3D object detection.
We show state-of-the-art results among the monocular-based approaches on the KITTI benchmark dataset.
arXiv Detail & Related papers (2021-03-30T16:20:24Z) - Self-Supervised Visibility Learning for Novel View Synthesis [79.53158728483375]
Conventional rendering methods estimate scene geometry and synthesize novel views in two separate steps.
We propose an end-to-end NVS framework to eliminate the error propagation issue.
Our network is trained in an end-to-end self-supervised fashion, thus significantly alleviating error accumulation in view synthesis.
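The end-to-end NVS frameworks above all hinge on warping the source image into the target view using estimated depth and pose. A minimal forward-splatting sketch of that step is shown below, assuming pinhole intrinsics K, relative pose (R, t), and a grayscale image; real pipelines use differentiable soft splatting with learned visibility rather than this hard nearest-pixel version, and the function name is hypothetical.

```python
import numpy as np

def forward_warp(img, depth, K, R, t):
    """Splat each source pixel into the target view using its depth and the
    relative pose; pixels are drawn far-to-near so closer points overwrite
    farther ones, resolving occlusions painter's-algorithm style."""
    h, w = depth.shape
    v, u = np.mgrid[0:h, 0:w]
    pix = np.stack([u, v, np.ones_like(u)], axis=-1).reshape(-1, 3).T  # 3 x N
    pts = (np.linalg.inv(K) @ pix) * depth.reshape(1, -1)  # source-frame 3D
    pts = R @ pts + t.reshape(3, 1)                        # target-frame 3D
    z = pts[2]
    proj = K @ pts
    ut = np.round(proj[0] / z).astype(int)
    vt = np.round(proj[1] / z).astype(int)
    out = np.zeros_like(img)
    src = img.reshape(-1)
    for i in np.argsort(-z):  # far to near: nearer pixels overwrite
        if 0 <= ut[i] < w and 0 <= vt[i] < h and z[i] > 0:
            out[vt[i], ut[i]] = src[i]
    return out
```

Holes left where no source pixel lands (disocclusions) are exactly what the synthesis networks in these papers learn to inpaint.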
arXiv Detail & Related papers (2021-03-29T08:11:25Z) - SOSD-Net: Joint Semantic Object Segmentation and Depth Estimation from Monocular Images [94.36401543589523]
We introduce the concept of semantic objectness to exploit the geometric relationship of these two tasks.
We then propose a Semantic Object and Depth Estimation Network (SOSD-Net) based on the objectness assumption.
To the best of our knowledge, SOSD-Net is the first network that exploits the geometry constraint for simultaneous monocular depth estimation and semantic segmentation.
arXiv Detail & Related papers (2021-01-19T02:41:03Z) - The Edge of Depth: Explicit Constraints between Segmentation and Depth [25.232436455640716]
We study the mutual benefits of two common computer vision tasks, self-supervised depth estimation and semantic segmentation from images.
We propose to explicitly measure the border consistency between segmentation and depth and minimize it.
Through extensive experiments, our proposed approach advances the state of the art in unsupervised monocular depth estimation on the KITTI benchmark.
arXiv Detail & Related papers (2020-04-01T00:03:20Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.