DCIRNet: Depth Completion with Iterative Refinement for Dexterous Grasping of Transparent and Reflective Objects
- URL: http://arxiv.org/abs/2506.09491v1
- Date: Wed, 11 Jun 2025 08:04:22 GMT
- Title: DCIRNet: Depth Completion with Iterative Refinement for Dexterous Grasping of Transparent and Reflective Objects
- Authors: Guanghu Xie, Zhiduo Jiang, Yonglong Zhang, Yang Liu, Zongwu Xie, Baoshi Cao, Hong Liu
- Abstract summary: We propose DCIRNet, a novel multimodal depth completion network for transparent and reflective objects. Our approach incorporates an innovative multimodal feature fusion module designed to extract complementary information between RGB images and incomplete depth maps. We achieve a $44\%$ improvement in the grasp success rate for transparent and reflective objects.
- Score: 9.235004977824026
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Transparent and reflective objects in everyday environments pose significant challenges for depth sensors due to their unique visual properties, such as specular reflections and light transmission. These characteristics often lead to incomplete or inaccurate depth estimation, which severely impacts downstream geometry-based vision tasks, including object recognition, scene reconstruction, and robotic manipulation. To address the issue of missing depth information in transparent and reflective objects, we propose DCIRNet, a novel multimodal depth completion network that effectively integrates RGB images and depth maps to enhance depth estimation quality. Our approach incorporates an innovative multimodal feature fusion module designed to extract complementary information between RGB images and incomplete depth maps. Furthermore, we introduce a multi-stage supervision and depth refinement strategy that progressively improves depth completion and effectively mitigates the issue of blurred object boundaries. We integrate our depth completion model into dexterous grasping frameworks and achieve a $44\%$ improvement in the grasp success rate for transparent and reflective objects. We conduct extensive experiments on public datasets, where DCIRNet demonstrates superior performance. The experimental results validate the effectiveness of our approach and confirm its strong generalization capability across various transparent and reflective objects.
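The abstract names two mechanisms: a multimodal fusion module that exchanges complementary information between RGB and incomplete depth features, and multi-stage supervision with progressive refinement. The listing gives no architectural details, so the following PyTorch snippet is only a minimal sketch of that general pattern; all module names, channel widths, and the stage count are assumptions rather than DCIRNet's actual design.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class FusionBlock(nn.Module):
    """Illustrative RGB-depth fusion: each modality gates the other,
    a common cross-modal fusion pattern (not DCIRNet's actual module)."""
    def __init__(self, ch):
        super().__init__()
        self.rgb_gate = nn.Sequential(nn.Conv2d(ch, ch, 3, padding=1), nn.Sigmoid())
        self.dep_gate = nn.Sequential(nn.Conv2d(ch, ch, 3, padding=1), nn.Sigmoid())
        self.merge = nn.Conv2d(2 * ch, ch, 1)

    def forward(self, f_rgb, f_dep):
        f_rgb = f_rgb * self.dep_gate(f_dep)  # depth features gate the RGB stream
        f_dep = f_dep * self.rgb_gate(f_rgb)  # RGB features gate the depth stream
        return self.merge(torch.cat([f_rgb, f_dep], dim=1))

class IterativeDepthCompletion(nn.Module):
    """Multi-stage refinement: each stage predicts a residual correction to
    the current depth estimate, and every stage's output is supervised."""
    def __init__(self, ch=32, stages=3):
        super().__init__()
        self.enc_rgb = nn.Conv2d(3, ch, 3, padding=1)
        self.enc_dep = nn.Conv2d(1, ch, 3, padding=1)
        self.fuse = FusionBlock(ch)
        self.refiners = nn.ModuleList([
            nn.Sequential(nn.Conv2d(ch + 1, ch, 3, padding=1), nn.ReLU(),
                          nn.Conv2d(ch, 1, 3, padding=1))
            for _ in range(stages)])

    def forward(self, rgb, incomplete_depth):
        feat = self.fuse(self.enc_rgb(rgb), self.enc_dep(incomplete_depth))
        depth, preds = incomplete_depth, []
        for refine in self.refiners:
            depth = depth + refine(torch.cat([feat, depth], dim=1))
            preds.append(depth)  # keep per-stage outputs for supervision
        return preds

def multi_stage_loss(preds, gt):
    # supervise every stage; later stages could be weighted more heavily
    return sum(F.l1_loss(p, gt) for p in preds)
```

Supervising every stage's output, as in multi_stage_loss above, is one plausible reading of the multi-stage supervision strategy the abstract credits with mitigating blurred object boundaries.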
Related papers
- Rethinking Transparent Object Grasping: Depth Completion with Monocular Depth Estimation and Instance Mask [10.472380465235629]
ReMake is a novel depth completion framework guided by an instance mask and monocular depth estimation. Our method outperforms existing approaches on benchmark datasets and in real-world scenarios.
arXiv Detail & Related papers (2025-08-04T15:14:47Z)
- TDCNet: Transparent Objects Depth Completion with CNN-Transformer Dual-Branch Parallel Network [8.487135422430972]
We propose TDCNet, a novel dual-branch CNN-Transformer parallel network for transparent object depth completion. Our model achieves state-of-the-art performance across multiple public datasets.
arXiv Detail & Related papers (2024-12-19T15:42:21Z)
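The TDCNet entry above describes a dual-branch CNN-Transformer parallel network. Below is a minimal sketch of that parallel-branch pattern; the branch backbones, patch size, and merge step are assumptions, not TDCNet's published architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DualBranchCompletion(nn.Module):
    """Parallel CNN branch (local detail) and Transformer branch (global
    context) whose outputs are merged into one depth prediction."""
    def __init__(self, ch=32, heads=4):
        super().__init__()
        self.cnn = nn.Sequential(
            nn.Conv2d(4, ch, 3, padding=1), nn.ReLU(),
            nn.Conv2d(ch, ch, 3, padding=1), nn.ReLU())
        self.embed = nn.Conv2d(4, ch, kernel_size=8, stride=8)  # 8x8 patch tokens
        self.transformer = nn.TransformerEncoder(
            nn.TransformerEncoderLayer(ch, heads, batch_first=True), num_layers=2)
        self.head = nn.Conv2d(2 * ch, 1, 3, padding=1)

    def forward(self, rgb, depth):
        x = torch.cat([rgb, depth], dim=1)  # (B, 4, H, W), H and W divisible by 8
        local = self.cnn(x)
        tok = self.embed(x)                 # (B, C, H/8, W/8)
        b, c, h, w = tok.shape
        glob = self.transformer(tok.flatten(2).transpose(1, 2))  # (B, N, C)
        glob = glob.transpose(1, 2).reshape(b, c, h, w)
        glob = F.interpolate(glob, size=local.shape[-2:], mode='bilinear',
                             align_corners=False)
        return self.head(torch.cat([local, glob], dim=1))
```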
- Diffusion-Based Depth Inpainting for Transparent and Reflective Objects [6.571006663689738]
We propose DITR, a diffusion-based Depth Inpainting framework specifically designed for Transparent and Reflective objects. DITR is highly effective in depth inpainting tasks for transparent and reflective objects, with robust adaptability.
arXiv Detail & Related papers (2024-10-11T06:45:15Z)
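For a diffusion-based depth inpainting framework like the one the DITR summary describes, a common generic recipe is to denoise the missing regions while repeatedly re-noising observed pixels to the current step (a RePaint-style conditioning trick). The sketch below illustrates only that generic loop; `model` is a hypothetical noise predictor, and the schedule and conditioning details are assumptions, not DITR's method.

```python
import torch

def diffusion_depth_inpaint(model, depth, mask, rgb, T=50):
    """Masked-diffusion inpainting: missing regions (mask == 0) are denoised
    from noise while observed pixels (mask == 1) are pinned to their step-t
    noisy version so they stay consistent with the forward process.
    `model(x_t, rgb, t)` is a hypothetical noise predictor."""
    betas = torch.linspace(1e-4, 0.02, T)
    alphas = 1.0 - betas
    abar = torch.cumprod(alphas, dim=0)

    x = torch.randn_like(depth)  # start the holes from pure noise
    for t in reversed(range(T)):
        # pin observed pixels to their forward-process version at step t
        noisy_obs = abar[t].sqrt() * depth + (1 - abar[t]).sqrt() * torch.randn_like(depth)
        x = mask * noisy_obs + (1 - mask) * x

        eps = model(x, rgb, t)  # predicted noise at step t
        mean = (x - betas[t] / (1 - abar[t]).sqrt() * eps) / alphas[t].sqrt()
        if t > 0:
            x = mean + betas[t].sqrt() * torch.randn_like(x)
        else:
            x = mean
    return mask * depth + (1 - mask) * x  # splice inpainted holes into the observation
```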
- ClearDepth: Enhanced Stereo Perception of Transparent Objects for Robotic Manipulation [18.140839442955485]
We develop a vision transformer-based algorithm for stereo depth recovery of transparent objects.
Our method incorporates a parameter-aligned, domain-adaptive, and physically realistic Sim2Real simulation for efficient data generation.
Our experimental results demonstrate the model's exceptional Sim2Real generalizability in real-world scenarios.
arXiv Detail & Related papers (2024-09-13T15:44:38Z)
- Transparent Object Depth Completion [11.825680661429825]
The perception of transparent objects for grasp and manipulation remains a major challenge.
Existing robotic grasping methods, which rely heavily on depth maps, are not suitable for transparent objects due to their unique visual properties.
We propose an end-to-end network for transparent object depth completion that combines the strengths of single-view RGB-D based depth completion and multi-view depth estimation.
arXiv Detail & Related papers (2024-05-24T07:38:06Z)
- Robust Depth Enhancement via Polarization Prompt Fusion Tuning [112.88371907047396]
We present a framework that leverages polarization imaging to improve inaccurate depth measurements from various depth sensors.
Our method first adopts a learning-based strategy in which a neural network is trained to estimate a dense and complete depth map from polarization data together with a depth map from any of several sensor types.
To further improve the performance, we propose a Polarization Prompt Fusion Tuning (PPFT) strategy to effectively utilize RGB-based models pre-trained on large-scale datasets.
arXiv Detail & Related papers (2024-04-05T17:55:33Z)
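Prompt fusion tuning, as summarized above, adapts a large RGB-pretrained model with a small trainable branch rather than fine-tuning the whole network. A minimal sketch of that freeze-and-prompt pattern follows; the layer layout, channel counts, and injection point are assumptions, not the paper's PPFT architecture.

```python
import torch
import torch.nn as nn

class PolarizationPromptModel(nn.Module):
    """Freeze-and-prompt pattern: a large RGB-pretrained trunk stays frozen
    while a small trainable branch turns polarization channels into additive
    'prompt' features injected at the stem."""
    def __init__(self, pretrained_trunk: nn.Module, feat_ch=64, polar_ch=4):
        super().__init__()
        self.trunk = pretrained_trunk
        for p in self.trunk.parameters():
            p.requires_grad = False  # keep the pretrained weights intact
        self.stem = nn.Conv2d(3, feat_ch, 3, padding=1)
        self.prompt = nn.Sequential(  # small trainable prompter
            nn.Conv2d(polar_ch, feat_ch, 3, padding=1), nn.ReLU(),
            nn.Conv2d(feat_ch, feat_ch, 3, padding=1))
        self.head = nn.Conv2d(feat_ch, 1, 3, padding=1)

    def forward(self, rgb, polar):
        feat = self.stem(rgb) + self.prompt(polar)  # inject the polarization prompt
        feat = self.trunk(feat)  # frozen trunk, assumed shape-preserving here
        return self.head(feat)
```

Only the stem, prompt branch, and head receive gradients, which is what makes this a tuning strategy rather than full retraining.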
- Unveiling the Depths: A Multi-Modal Fusion Framework for Challenging Scenarios [103.72094710263656]
This paper presents a novel approach that identifies and integrates dominant cross-modality depth features with a learning-based framework.
We propose a novel confidence loss that steers a confidence predictor network to yield a confidence map specifying latent potential depth areas.
With the resulting confidence map, we propose a multi-modal fusion network that fuses the final depth in an end-to-end manner.
arXiv Detail & Related papers (2024-02-19T04:39:16Z)
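The core idea above, a predicted confidence map steering how two modality-specific depth estimates are fused, can be illustrated compactly. The sketch below is generic; the confidence network's size and the blending rule are assumptions.

```python
import torch
import torch.nn as nn

class ConfidenceFusion(nn.Module):
    """A small network predicts a per-pixel confidence map from both depth
    estimates; the map then weights their fusion, trained end-to-end."""
    def __init__(self, ch=16):
        super().__init__()
        self.conf = nn.Sequential(
            nn.Conv2d(2, ch, 3, padding=1), nn.ReLU(),
            nn.Conv2d(ch, 1, 3, padding=1), nn.Sigmoid())

    def forward(self, depth_a, depth_b):
        c = self.conf(torch.cat([depth_a, depth_b], dim=1))  # (B, 1, H, W) in [0, 1]
        return c * depth_a + (1 - c) * depth_b               # confidence-weighted blend
```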
- Domain Randomization-Enhanced Depth Simulation and Restoration for Perceiving and Grasping Specular and Transparent Objects [28.84776177634971]
We propose a powerful RGBD fusion network, SwinDRNet, for depth restoration.
We also propose a Domain Randomization-Enhanced Depth Simulation (DREDS) approach to simulate an active stereo depth system.
We show that our depth restoration effectively boosts the performance of downstream tasks.
arXiv Detail & Related papers (2022-08-07T19:17:16Z)
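Domain randomization for a simulated active-stereo depth sensor amounts to sampling material, lighting, and sensor parameters per scene. The sketch below illustrates such a sampler; every field name and range is invented for illustration and does not come from the DREDS paper.

```python
import random

def sample_randomized_scene():
    """One domain-randomization draw: material, lighting, and sensor
    parameters vary per scene so the restoration network trains on a wide
    simulated distribution. All field names and ranges are hypothetical."""
    return {
        "material": random.choice(["diffuse", "specular", "transparent"]),
        "roughness": random.uniform(0.0, 1.0),            # specular sharpness
        "index_of_refraction": random.uniform(1.3, 1.7),  # for transparent materials
        "light_intensity": random.uniform(0.2, 3.0),
        "ir_pattern_strength": random.uniform(0.5, 1.5),  # projected IR dot pattern
        "baseline_mm": random.uniform(40.0, 60.0),        # stereo camera baseline
        "depth_noise_std": random.uniform(0.0, 0.02),
    }

# Each training sample renders RGB plus a noisy, incomplete simulated depth map
# under one such draw; the network learns to restore the clean depth.
scene_params = sample_randomized_scene()
```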
- High-resolution Iterative Feedback Network for Camouflaged Object Detection [128.893782016078]
Spotting camouflaged objects that are visually assimilated into the background is tricky for object detection algorithms.
We aim to extract high-resolution texture details to avoid the detail degradation that blurs edges and boundaries.
We introduce a novel HitNet to refine the low-resolution representations by high-resolution features in an iterative feedback manner.
arXiv Detail & Related papers (2022-03-22T11:20:21Z)
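The HitNet summary describes refining low-resolution representations with high-resolution features in an iterative feedback manner. A schematic of that loop, with the update block and iteration count assumed:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class FeedbackRefiner(nn.Module):
    """Iterative feedback loop: the working representation is repeatedly
    updated with high-resolution features, and each iteration's output is
    fed back as the next iteration's input."""
    def __init__(self, ch=32, iterations=3):
        super().__init__()
        self.iterations = iterations
        self.update = nn.Sequential(
            nn.Conv2d(2 * ch, ch, 3, padding=1), nn.ReLU(),
            nn.Conv2d(ch, ch, 3, padding=1))

    def forward(self, low_feat, high_feat):
        # bring high-resolution detail down to the working resolution
        high = F.interpolate(high_feat, size=low_feat.shape[-2:],
                             mode='bilinear', align_corners=False)
        x, outputs = low_feat, []
        for _ in range(self.iterations):
            x = x + self.update(torch.cat([x, high], dim=1))  # feedback step
            outputs.append(x)
        return outputs  # each iteration's output can be supervised
```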
- Joint Learning of Salient Object Detection, Depth Estimation and Contour Extraction [91.43066633305662]
We propose a novel multi-task and multi-modal filtered transformer (MMFT) network for RGB-D salient object detection (SOD).
Specifically, we unify three complementary tasks: depth estimation, salient object detection, and contour estimation. The multi-task mechanism encourages the model to learn task-aware features from the auxiliary tasks.
Experiments show that it not only significantly surpasses the depth-based RGB-D SOD methods on multiple datasets, but also precisely predicts a high-quality depth map and salient contour at the same time.
arXiv Detail & Related papers (2022-03-09T17:20:18Z)
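The multi-task setup above, one shared RGB-D trunk feeding saliency, depth, and contour heads, can be sketched as follows. The filtered-transformer module itself is omitted, and all sizes and loss weights are assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MultiTaskSOD(nn.Module):
    """One shared RGB-D trunk with three heads, so the auxiliary depth and
    contour tasks shape the features used for saliency."""
    def __init__(self, ch=32):
        super().__init__()
        self.trunk = nn.Sequential(
            nn.Conv2d(4, ch, 3, padding=1), nn.ReLU(),  # RGB (3) + depth (1)
            nn.Conv2d(ch, ch, 3, padding=1), nn.ReLU())
        self.saliency_head = nn.Conv2d(ch, 1, 1)
        self.depth_head = nn.Conv2d(ch, 1, 1)
        self.contour_head = nn.Conv2d(ch, 1, 1)

    def forward(self, rgb, depth):
        f = self.trunk(torch.cat([rgb, depth], dim=1))
        return self.saliency_head(f), self.depth_head(f), self.contour_head(f)

def joint_loss(preds, gts, weights=(1.0, 0.5, 0.5)):
    # weighted sum over the three task losses; the weights are an assumption
    sal, dep, con = preds
    return (weights[0] * F.binary_cross_entropy_with_logits(sal, gts["saliency"])
            + weights[1] * F.l1_loss(dep, gts["depth"])
            + weights[2] * F.binary_cross_entropy_with_logits(con, gts["contour"]))
```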
- High-resolution Depth Maps Imaging via Attention-based Hierarchical Multi-modal Fusion [84.24973877109181]
We propose a novel attention-based hierarchical multi-modal fusion network for guided depth super-resolution (DSR).
We show that our approach outperforms state-of-the-art methods in terms of reconstruction accuracy, running speed and memory efficiency.
arXiv Detail & Related papers (2021-04-04T03:28:33Z)
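Guided DSR fuses an upsampled low-resolution depth map with high-resolution RGB guidance, with an attention map deciding where to trust the guidance. A single illustrative fusion step follows; the channel widths and residual form are assumptions, not the paper's network.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class AttentionGuidedUpsample(nn.Module):
    """One guided-DSR step: low-resolution depth is upsampled and refined
    with RGB guidance through a learned spatial attention map."""
    def __init__(self, ch=16):
        super().__init__()
        self.dep = nn.Conv2d(1, ch, 3, padding=1)
        self.rgb = nn.Conv2d(3, ch, 3, padding=1)
        self.attn = nn.Sequential(nn.Conv2d(2 * ch, 1, 3, padding=1), nn.Sigmoid())
        self.out = nn.Conv2d(ch, 1, 3, padding=1)

    def forward(self, lr_depth, rgb):
        up = F.interpolate(lr_depth, size=rgb.shape[-2:], mode='bilinear',
                           align_corners=False)
        d, g = self.dep(up), self.rgb(rgb)
        a = self.attn(torch.cat([d, g], dim=1))    # where to trust the RGB guidance
        return up + self.out(a * g + (1 - a) * d)  # residual refinement

# A hierarchical variant would stack this block over successive 2x stages
# (e.g. two stages for 4x super-resolution).
```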
- Accurate RGB-D Salient Object Detection via Collaborative Learning [101.82654054191443]
RGB-D saliency detection shows impressive ability in some challenging scenarios.
We propose a novel collaborative learning framework where edge, depth and saliency are leveraged in a more efficient way.
arXiv Detail & Related papers (2020-07-23T04:33:36Z)
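One way to read the collaborative-learning idea above is that edge and depth predictions feed back into the saliency head so the three cues reinforce one another; the summary does not specify the wiring, so the sketch below is purely illustrative.

```python
import torch
import torch.nn as nn

class CollaborativeSOD(nn.Module):
    """Edge and depth are predicted first from shared features, then fed
    back as cues into the saliency head; here depth is a training-time
    prediction target, so no depth input is required at inference."""
    def __init__(self, ch=32):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(3, ch, 3, padding=1), nn.ReLU(),
            nn.Conv2d(ch, ch, 3, padding=1), nn.ReLU())
        self.edge_head = nn.Conv2d(ch, 1, 1)
        self.depth_head = nn.Conv2d(ch, 1, 1)
        self.sal_head = nn.Conv2d(ch + 2, 1, 3, padding=1)  # features + edge + depth cues

    def forward(self, rgb):
        f = self.backbone(rgb)
        edge = self.edge_head(f)
        depth = self.depth_head(f)
        saliency = self.sal_head(torch.cat([f, edge, depth], dim=1))
        return saliency, edge, depth
```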