Background Prompting for Improved Object Depth
- URL: http://arxiv.org/abs/2306.05428v1
- Date: Thu, 8 Jun 2023 17:59:59 GMT
- Title: Background Prompting for Improved Object Depth
- Authors: Manel Baradad, Yuanzhen Li, Forrester Cole, Michael Rubinstein,
Antonio Torralba, William T. Freeman, Varun Jampani
- Abstract summary: Estimating the depth of objects from a single image is a valuable task for many vision, robotics, and graphics applications.
We propose a simple yet effective Background Prompting strategy that adapts the input object image with a learned background.
Results on multiple synthetic and real datasets demonstrate consistent improvements in real object depths for a variety of existing depth networks.
- Score: 70.25467510077706
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Estimating the depth of objects from a single image is a valuable task for
many vision, robotics, and graphics applications. However, current methods
often fail to produce accurate depth for objects in diverse scenes. In this
work, we propose a simple yet effective Background Prompting strategy that
adapts the input object image with a learned background. We learn the
background prompts only using small-scale synthetic object datasets. To infer
object depth on a real image, we place the segmented object into the learned
background prompt and run off-the-shelf depth networks. Background Prompting
helps the depth networks focus on the foreground object, as they are made
invariant to background variations. Moreover, Background Prompting minimizes
the domain gap between synthetic and real object images, leading to better
sim2real generalization than simple finetuning. Results on multiple synthetic
and real datasets demonstrate consistent improvements in real object depths for
a variety of existing depth networks. Code and optimized background prompts can
be found at: https://mbaradad.github.io/depth_prompt.
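The inference procedure described in the abstract (segment the object, composite it onto the learned background prompt, run an off-the-shelf depth network) can be summarized in a minimal sketch. The code below is an assumption-laden illustration, not the authors' implementation: the tensor shapes, function names, `depth_net`, and the masking of the output are placeholders chosen for clarity.

```python
# Minimal sketch of Background Prompting at inference time (illustrative only).
# Assumes a PyTorch-style setup; `depth_net`, the learned `background_prompt`,
# and the segmentation mask are placeholders, not the paper's exact code.
import torch


def composite_with_prompt(image: torch.Tensor,
                          mask: torch.Tensor,
                          background_prompt: torch.Tensor) -> torch.Tensor:
    """Place the segmented object onto the learned background prompt.

    image:             (B, 3, H, W) RGB image containing the object.
    mask:              (B, 1, H, W) soft segmentation mask in [0, 1].
    background_prompt: (3, H, W) learned background, broadcast over the batch.
    """
    return mask * image + (1.0 - mask) * background_prompt.unsqueeze(0)


@torch.no_grad()
def predict_object_depth(depth_net: torch.nn.Module,
                         image: torch.Tensor,
                         mask: torch.Tensor,
                         background_prompt: torch.Tensor) -> torch.Tensor:
    """Run an off-the-shelf depth network on the prompted image."""
    prompted = composite_with_prompt(image, mask, background_prompt)
    depth = depth_net(prompted)   # (B, 1, H, W) predicted depth
    return depth * mask           # keep only the foreground-object depth
```

During prompt learning, the background tensor would be optimized against ground-truth depth on small synthetic object datasets while reusing the existing depth network; those training details are not reproduced here and should be taken from the paper and released code.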
Related papers
- Depth-guided Texture Diffusion for Image Semantic Segmentation [47.46257473475867]
We introduce a Depth-guided Texture Diffusion approach that effectively tackles the outlined challenge.
Our method extracts low-level features from edges and textures to create a texture image.
By integrating this enriched depth map with the original RGB image into a joint feature embedding, our method effectively bridges the disparity between the depth map and the image.
arXiv Detail & Related papers (2024-08-17T04:55:03Z)
- DepGAN: Leveraging Depth Maps for Handling Occlusions and Transparency in Image Composition [7.693732944239458]
DepGAN is a Generative Adversarial Network that utilizes depth maps and alpha channels to rectify inaccurate occlusions.
Central to our network is a novel loss function, Depth Aware Loss, which quantifies the pixel-wise depth difference.
We enhance our network's learning process by utilizing opacity data, enabling it to effectively manage compositions involving transparent and semi-transparent objects.
arXiv Detail & Related papers (2024-07-16T16:18:40Z)
- Impact of Pseudo Depth on Open World Object Segmentation with Minimal User Guidance [18.176606453818557]
Pseudo depth maps are depth-map predictions that are used as ground truth during training.
In this paper, we leverage pseudo depth maps to segment objects of classes that have never been seen during training.
arXiv Detail & Related papers (2023-04-12T09:18:38Z)
- Source-free Depth for Object Pop-out [113.24407776545652]
Modern learning-based methods offer promising depth maps by inference in the wild.
We adapt such depth inference models for object segmentation using the objects' "pop-out" prior in 3D.
Our experiments on eight datasets consistently demonstrate the benefit of our method in terms of both performance and generalizability.
arXiv Detail & Related papers (2022-12-10T21:57:11Z)
- MonoGraspNet: 6-DoF Grasping with a Single RGB Image [73.96707595661867]
6-DoF robotic grasping is a long-standing but unsolved problem.
Recent methods utilize strong 3D networks to extract geometric grasping representations from depth sensors.
We propose the first RGB-only 6-DoF grasping pipeline called MonoGraspNet.
arXiv Detail & Related papers (2022-09-26T21:29:50Z)
- Domain Randomization-Enhanced Depth Simulation and Restoration for Perceiving and Grasping Specular and Transparent Objects [28.84776177634971]
We propose a powerful RGBD fusion network, SwinDRNet, for depth restoration.
We also propose Domain Randomization-Enhanced Depth Simulation (DREDS) approach to simulate an active stereo depth system.
We show that our depth restoration effectively boosts the performance of downstream tasks.
arXiv Detail & Related papers (2022-08-07T19:17:16Z)
- DnD: Dense Depth Estimation in Crowded Dynamic Indoor Scenes [68.38952377590499]
We present a novel approach for estimating depth from a monocular camera as it moves through complex indoor environments.
Our approach predicts absolute scale depth maps over the entire scene consisting of a static background and multiple moving people.
arXiv Detail & Related papers (2021-08-12T09:12:39Z)
- Moving SLAM: Fully Unsupervised Deep Learning in Non-Rigid Scenes [85.56602190773684]
We build on the idea of view synthesis, which uses classical camera geometry to re-render a source image from a different point-of-view.
By minimizing the error between the synthetic image and the corresponding real image in a video, the deep network that predicts pose and depth can be trained completely unsupervised.
arXiv Detail & Related papers (2021-05-05T17:08:10Z)
- S2R-DepthNet: Learning a Generalizable Depth-specific Structural Representation [63.58891781246175]
Humans can infer the 3D geometry of a scene from a sketch instead of a realistic image, which indicates that spatial structure plays a fundamental role in understanding the depth of scenes.
We are the first to explore the learning of a depth-specific structural representation, which captures the essential features for depth estimation and ignores irrelevant style information.
Our S2R-DepthNet can be well generalized to unseen real-world data directly even though it is only trained on synthetic data.
arXiv Detail & Related papers (2021-04-02T03:55:41Z)
This list is automatically generated from the titles and abstracts of the papers on this site.