Domain Randomization-Enhanced Depth Simulation and Restoration for
Perceiving and Grasping Specular and Transparent Objects
- URL: http://arxiv.org/abs/2208.03792v1
- Date: Sun, 7 Aug 2022 19:17:16 GMT
- Title: Domain Randomization-Enhanced Depth Simulation and Restoration for
Perceiving and Grasping Specular and Transparent Objects
- Authors: Qiyu Dai, Jiyao Zhang, Qiwei Li, Tianhao Wu, Hao Dong, Ziyuan Liu,
Ping Tan, He Wang
- Abstract summary: We propose a powerful RGBD fusion network, SwinDRNet, for depth restoration.
We also propose Domain Randomization-Enhanced Depth Simulation (DREDS) approach to simulate an active stereo depth system.
We show that our depth restoration effectively boosts the performance of downstream tasks.
- Score: 28.84776177634971
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Commercial depth sensors usually generate noisy and missing depths,
especially on specular and transparent objects, which poses critical issues to
downstream depth or point cloud-based tasks. To mitigate this problem, we
propose a powerful RGBD fusion network, SwinDRNet, for depth restoration. We
further propose Domain Randomization-Enhanced Depth Simulation (DREDS) approach
to simulate an active stereo depth system using physically based rendering and
generate a large-scale synthetic dataset that contains 130K photorealistic RGB
images along with their simulated depths carrying realistic sensor noises. To
evaluate depth restoration methods, we also curate a real-world dataset, namely
STD, that captures 30 cluttered scenes composed of 50 objects with materials
ranging from specular and transparent to diffuse. Experiments demonstrate that
the proposed DREDS dataset bridges the sim-to-real domain gap such that,
trained on DREDS, our SwinDRNet can seamlessly generalize to other real depth
datasets, e.g., ClearGrasp, and outperform competing methods on depth
restoration at real-time speed. We further show that our depth restoration
effectively boosts the performance of downstream tasks, including
category-level pose estimation and grasping tasks. Our data and code are
available at https://github.com/PKU-EPIC/DREDS
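The abstract describes SwinDRNet only at the level of an RGBD fusion network that restores noisy sensor depth. As a rough, hedged illustration of that interface, the PyTorch sketch below fuses RGB and raw-depth features and blends the prediction with the raw depth through a predicted confidence map; the layer choices, channel sizes, and blending rule are assumptions for illustration, not the actual SwinDRNet architecture (which builds on Swin Transformer backbones).
```python
# Minimal sketch of an RGBD-fusion depth restoration network (NOT the real
# SwinDRNet; layer choices and the confidence-weighted blend are assumptions).
import torch
import torch.nn as nn

class ToyRGBDRestorer(nn.Module):
    def __init__(self, feat=32):
        super().__init__()
        # Separate encoders for the RGB image and the raw (noisy) depth map.
        self.rgb_enc = nn.Sequential(nn.Conv2d(3, feat, 3, padding=1), nn.ReLU())
        self.depth_enc = nn.Sequential(nn.Conv2d(1, feat, 3, padding=1), nn.ReLU())
        # Fusion + decoder head predicting a refined depth and a confidence map.
        self.decoder = nn.Sequential(
            nn.Conv2d(2 * feat, feat, 3, padding=1), nn.ReLU(),
            nn.Conv2d(feat, 2, 3, padding=1),  # channel 0: depth, channel 1: confidence logit
        )

    def forward(self, rgb, raw_depth):
        fused = torch.cat([self.rgb_enc(rgb), self.depth_enc(raw_depth)], dim=1)
        out = self.decoder(fused)
        pred_depth, conf = out[:, :1], torch.sigmoid(out[:, 1:])
        # Blend the network prediction with the raw sensor depth using confidence,
        # so reliable raw measurements (e.g. on diffuse surfaces) are kept.
        return conf * pred_depth + (1.0 - conf) * raw_depth

rgb = torch.rand(1, 3, 128, 128)        # normalized RGB
raw_depth = torch.rand(1, 1, 128, 128)  # noisy / incomplete depth (meters)
restored = ToyRGBDRestorer()(rgb, raw_depth)
print(restored.shape)  # torch.Size([1, 1, 128, 128])
```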
Related papers
- SelfReDepth: Self-Supervised Real-Time Depth Restoration for Consumer-Grade Sensors [42.48726526726542]
SelfReDepth is a self-supervised deep learning technique for depth restoration.
It uses multiple sequential depth frames and color data to achieve high-quality depth videos with temporal coherence.
Our results demonstrate our approach's real-time performance on real-world datasets.
arXiv Detail & Related papers (2024-06-05T15:38:02Z)
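The SelfReDepth summary only states that sequential depth frames plus color yield temporally coherent depth videos. The snippet below is a deliberately crude illustration of that idea (hole filling from the previous restored frame followed by exponential smoothing, with the color guidance omitted); it is not the actual SelfReDepth pipeline, which is a trained self-supervised network.
```python
# Crude illustration of temporal depth restoration (NOT SelfReDepth itself):
# fill holes in the current frame from the previous restored frame, then
# exponentially smooth valid pixels for temporal coherence.
import numpy as np

def restore_sequence(depth_frames, alpha=0.7):
    """depth_frames: list of HxW arrays, 0 marks missing depth."""
    restored = []
    prev = None
    for d in depth_frames:
        d = d.astype(np.float32).copy()
        if prev is not None:
            holes = d == 0
            d[holes] = prev[holes]                                    # inpaint from history
            valid = (d > 0) & (prev > 0)
            d[valid] = alpha * d[valid] + (1 - alpha) * prev[valid]   # temporal smoothing
        restored.append(d)
        prev = d
    return restored

frames = [np.random.rand(120, 160) for _ in range(5)]
frames[2][40:60, 50:90] = 0  # simulated sensor dropout
out = restore_sequence(frames)
print(out[2][50, 60] > 0)    # hole filled from the previous frame -> True
```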
- Robust Depth Enhancement via Polarization Prompt Fusion Tuning [112.88371907047396]
We present a framework that leverages polarization imaging to improve inaccurate depth measurements from various depth sensors.
Our method first adopts a learning-based strategy where a neural network is trained to estimate a dense and complete depth map from polarization data and a sensor depth map from different sensors.
To further improve the performance, we propose a Polarization Prompt Fusion Tuning (PPFT) strategy to effectively utilize RGB-based models pre-trained on large-scale datasets.
arXiv Detail & Related papers (2024-04-05T17:55:33Z)
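The summary above names the mechanism (a small trainable component that lets a large RGB-pretrained model consume polarization data) without detail. The sketch below illustrates the general prompt-fusion idea under stated assumptions: the adapter design, channel counts, and the stand-in frozen backbone are hypothetical, not the actual PPFT implementation.
```python
# Hedged sketch of the general "prompt fusion" idea (NOT the actual PPFT code):
# keep an RGB-pretrained depth backbone frozen and train only a small adapter
# that injects polarization-derived features into its input space.
import torch
import torch.nn as nn

class PolarizationAdapter(nn.Module):
    def __init__(self, pol_channels=4, feat=16):
        super().__init__()
        # Tiny trainable branch: polarization images -> 3-channel "prompt".
        self.proj = nn.Sequential(
            nn.Conv2d(pol_channels, feat, 3, padding=1), nn.ReLU(),
            nn.Conv2d(feat, 3, 3, padding=1),
        )

    def forward(self, rgb, polarization):
        return rgb + self.proj(polarization)  # prompt added to the RGB input

# pretrained_depth_net stands in for any frozen RGB->depth model (assumption).
pretrained_depth_net = nn.Sequential(nn.Conv2d(3, 8, 3, padding=1), nn.ReLU(),
                                     nn.Conv2d(8, 1, 3, padding=1))
for p in pretrained_depth_net.parameters():
    p.requires_grad = False  # only the adapter is tuned

adapter = PolarizationAdapter()
rgb = torch.rand(1, 3, 64, 64)
pol = torch.rand(1, 4, 64, 64)   # e.g. four polarization angles
depth = pretrained_depth_net(adapter(rgb, pol))
print(depth.shape)  # torch.Size([1, 1, 64, 64])
```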
- SAID-NeRF: Segmentation-AIDed NeRF for Depth Completion of Transparent Objects [7.529049797077149]
Acquiring accurate depth information of transparent objects using off-the-shelf RGB-D cameras is a well-known challenge in Computer Vision and Robotics.
NeRFs are learning-free approaches and have demonstrated wide success in novel view synthesis and shape recovery.
Our proposed method, SAID-NeRF, shows strong performance on depth completion datasets for transparent objects and on robotic grasping.
arXiv Detail & Related papers (2024-03-28T17:28:32Z)
- Background Prompting for Improved Object Depth [70.25467510077706]
Estimating the depth of objects from a single image is a valuable task for many vision, robotics, and graphics applications.
We propose a simple yet effective Background Prompting strategy that adapts the input object image with a learned background.
Results on multiple synthetic and real datasets demonstrate consistent improvements in real object depths for a variety of existing depth networks.
arXiv Detail & Related papers (2023-06-08T17:59:59Z)
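"Adapts the input object image with a learned background" is compact; the sketch below shows one way such a prompt could be wired up: a learnable background image onto which the segmented object is composited before any depth network sees it. The compositing rule and shapes are assumptions for illustration, not the paper's exact recipe.
```python
# Minimal sketch of the background-prompting idea (assumptions, not the paper's
# exact recipe): paste the segmented object onto a learned background image
# before running an off-the-shelf depth network.
import torch
import torch.nn as nn

class BackgroundPrompt(nn.Module):
    def __init__(self, h=128, w=128):
        super().__init__()
        # The "prompt" is a learnable background image, optimized end to end.
        self.background = nn.Parameter(torch.rand(1, 3, h, w))

    def forward(self, image, object_mask):
        # object_mask: 1 inside the object, 0 elsewhere (same H x W).
        return object_mask * image + (1 - object_mask) * self.background

prompt = BackgroundPrompt()
image = torch.rand(1, 3, 128, 128)
mask = torch.zeros(1, 1, 128, 128)
mask[..., 32:96, 32:96] = 1.0          # hypothetical object region
prompted = prompt(image, mask)          # feed this to any depth network
print(prompted.shape)
```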
- MonoGraspNet: 6-DoF Grasping with a Single RGB Image [73.96707595661867]
6-DoF robotic grasping is a long-standing but unsolved problem.
Recent methods utilize strong 3D networks to extract geometric grasping representations from depth sensors.
We propose the first RGB-only 6-DoF grasping pipeline called MonoGraspNet.
arXiv Detail & Related papers (2022-09-26T21:29:50Z)
- Joint Learning of Salient Object Detection, Depth Estimation and Contour Extraction [91.43066633305662]
We propose a novel multi-task and multi-modal filtered transformer (MMFT) network for RGB-D salient object detection (SOD).
Specifically, we unify three complementary tasks: depth estimation, salient object detection and contour estimation. The multi-task mechanism promotes the model to learn the task-aware features from the auxiliary tasks.
Experiments show that it not only significantly surpasses the depth-based RGB-D SOD methods on multiple datasets, but also precisely predicts a high-quality depth map and salient contour at the same time.
arXiv Detail & Related papers (2022-03-09T17:20:18Z)
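To make the multi-task setup in the summary concrete, here is a toy layout with one shared backbone and separate heads for depth, saliency, and contours; it is only an illustration of that training structure, not the MMFT transformer itself, and all layer sizes are assumptions.
```python
# Toy multi-task head layout in the spirit of the summary (shared features,
# separate heads for depth, saliency, and contours); NOT the MMFT architecture.
import torch
import torch.nn as nn

class MultiTaskRGBD(nn.Module):
    def __init__(self, feat=32):
        super().__init__()
        self.backbone = nn.Sequential(nn.Conv2d(4, feat, 3, padding=1), nn.ReLU())
        self.depth_head = nn.Conv2d(feat, 1, 1)      # auxiliary depth estimation
        self.saliency_head = nn.Conv2d(feat, 1, 1)   # salient object detection
        self.contour_head = nn.Conv2d(feat, 1, 1)    # auxiliary contour estimation

    def forward(self, rgbd):
        f = self.backbone(rgbd)
        return self.depth_head(f), self.saliency_head(f), self.contour_head(f)

x = torch.rand(1, 4, 64, 64)  # RGB and depth stacked along channels
depth, saliency, contour = MultiTaskRGBD()(x)
print(depth.shape, saliency.shape, contour.shape)
```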
- TransCG: A Large-Scale Real-World Dataset for Transparent Object Depth Completion and Grasping [46.6058840385155]
We contribute a large-scale real-world dataset for transparent object depth completion.
Our dataset contains 57,715 RGB-D images from 130 different scenes.
We propose an end-to-end depth completion network, which takes the RGB image and the inaccurate depth map as inputs and outputs a refined depth map.
arXiv Detail & Related papers (2022-02-17T06:50:20Z)
- Consistent Depth Prediction under Various Illuminations using Dilated Cross Attention [1.332560004325655]
We propose to use internet 3D indoor scenes and manually tune their illuminations to render photo-realistic RGB photos and their corresponding depth and BRDF maps.
We perform cross attention on these dilated features to retain the consistency of depth prediction under different illuminations.
Our method is evaluated by comparing it with current state-of-the-art methods on the Vari dataset, and a significant improvement is observed in experiments.
arXiv Detail & Related papers (2021-12-15T10:02:46Z)
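"Cross attention on dilated features" is the only mechanism the summary names; the sketch below shows one plausible reading of it, with two dilated-convolution branches whose tokens attend to each other. The branch design, dilation rates, and attention layout are assumptions, not the paper's actual network.
```python
# Hedged sketch of "cross attention on dilated features" (an illustration of
# the mechanism named in the summary, not the paper's actual network).
import torch
import torch.nn as nn

class DilatedCrossAttention(nn.Module):
    def __init__(self, channels=32, heads=4):
        super().__init__()
        # Two branches with different dilation rates give different receptive fields.
        self.branch_a = nn.Conv2d(3, channels, 3, padding=2, dilation=2)
        self.branch_b = nn.Conv2d(3, channels, 3, padding=4, dilation=4)
        self.attn = nn.MultiheadAttention(channels, heads, batch_first=True)

    def forward(self, img):
        a = self.branch_a(img)                     # (B, C, H, W)
        b = self.branch_b(img)
        B, C, H, W = a.shape
        a_seq = a.flatten(2).transpose(1, 2)       # (B, H*W, C) tokens
        b_seq = b.flatten(2).transpose(1, 2)
        # Queries from one dilated branch attend to the other branch.
        fused, _ = self.attn(a_seq, b_seq, b_seq)
        return fused.transpose(1, 2).reshape(B, C, H, W)

out = DilatedCrossAttention()(torch.rand(1, 3, 32, 32))
print(out.shape)  # torch.Size([1, 32, 32, 32])
```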
- Towards Fast and Accurate Real-World Depth Super-Resolution: Benchmark Dataset and Baseline [48.69396457721544]
We build a large-scale dataset named "RGB-D-D" to promote the study of depth map super-resolution (SR).
We provide a fast depth map super-resolution (FDSR) baseline, in which the high-frequency component adaptively decomposed from the RGB image guides the depth map SR.
For real-world LR depth maps, our algorithm produces more accurate HR depth maps with clearer boundaries and, to some extent, corrects depth value errors.
arXiv Detail & Related papers (2021-04-13T13:27:26Z)
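The FDSR summary hinges on one idea: a high-frequency component extracted from the RGB image guides the depth upsampling. The sketch below illustrates that guidance pattern with a blur-subtraction decomposition and a tiny residual refiner; both are assumptions for illustration, not the FDSR model.
```python
# Rough illustration of RGB high-frequency guidance for depth super-resolution
# (the decomposition and the tiny refiner are assumptions, not the FDSR model).
import torch
import torch.nn as nn
import torch.nn.functional as F

def high_frequency(gray, kernel=5):
    # High-frequency component = image minus its local (blurred) average.
    blur = F.avg_pool2d(gray, kernel, stride=1, padding=kernel // 2)
    return gray - blur

class GuidedDepthSR(nn.Module):
    def __init__(self, feat=16):
        super().__init__()
        self.refine = nn.Sequential(
            nn.Conv2d(2, feat, 3, padding=1), nn.ReLU(),
            nn.Conv2d(feat, 1, 3, padding=1),
        )

    def forward(self, lr_depth, rgb):
        gray = rgb.mean(dim=1, keepdim=True)
        up = F.interpolate(lr_depth, size=gray.shape[-2:], mode="bilinear",
                           align_corners=False)
        hf = high_frequency(gray)              # edges/texture from the RGB guide
        return up + self.refine(torch.cat([up, hf], dim=1))  # residual refinement

rgb = torch.rand(1, 3, 128, 128)
lr_depth = torch.rand(1, 1, 32, 32)            # low-resolution sensor depth
hr_depth = GuidedDepthSR()(lr_depth, rgb)
print(hr_depth.shape)  # torch.Size([1, 1, 128, 128])
```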
- RGB-D Local Implicit Function for Depth Completion of Transparent Objects [43.238923881620494]
The majority of perception methods in robotics require depth information provided by RGB-D cameras.
Standard 3D sensors fail to capture depth of transparent objects due to refraction and absorption of light.
We present a novel framework that can complete missing depth given noisy RGB-D input.
arXiv Detail & Related papers (2021-04-01T17:00:04Z)
- Accurate RGB-D Salient Object Detection via Collaborative Learning [101.82654054191443]
RGB-D saliency detection shows impressive ability in some challenging scenarios.
We propose a novel collaborative learning framework where edge, depth and saliency are leveraged in a more efficient way.
arXiv Detail & Related papers (2020-07-23T04:33:36Z)