Seeing Glass: Joint Point Cloud and Depth Completion for Transparent Objects
- URL: http://arxiv.org/abs/2110.00087v1
- Date: Thu, 30 Sep 2021 21:09:09 GMT
- Title: Seeing Glass: Joint Point Cloud and Depth Completion for Transparent Objects
- Authors: Haoping Xu, Yi Ru Wang, Sagi Eppel, Alán Aspuru-Guzik, Florian Shkurti, Animesh Garg
- Abstract summary: TranspareNet is a joint point cloud and depth completion method.
It can complete the depth of transparent objects in cluttered and complex scenes.
TranspareNet outperforms existing state-of-the-art depth completion methods on multiple datasets.
- Score: 16.714074893209713
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The basis of many object manipulation algorithms is RGB-D input. Yet,
commodity RGB-D sensors can only provide distorted depth maps for a wide range
of transparent objects due to light refraction and absorption. To tackle the
perception challenges posed by transparent objects, we propose TranspareNet, a
joint point cloud and depth completion method, with the ability to complete the
depth of transparent objects in cluttered and complex scenes, even with
partially filled fluid contents within the vessels. To address the shortcomings
of existing transparent object data collection schemes in the literature, we also
propose an automated dataset creation workflow that consists of
robot-controlled image collection and vision-based automatic annotation.
Through this automated workflow, we created the Toronto Transparent Objects Depth
Dataset (TODD), which consists of nearly 15,000 RGB-D images. Our experimental
evaluation demonstrates that TranspareNet outperforms existing state-of-the-art
depth completion methods on multiple datasets, including ClearGrasp, and that
it also handles cluttered scenes when trained on TODD. Code and dataset will be
released at https://www.pair.toronto.edu/TranspareNet/
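To make the formulation concrete, the sketch below is a minimal, hypothetical
Python/PyTorch illustration of depth completion for transparent objects: the
unreliable sensor readings on transparent pixels are masked out, and an
encoder-decoder regresses a dense depth map from RGB plus the masked depth.
This is not TranspareNet's actual architecture (which additionally performs
point cloud completion); every class name, layer, and shape here is an
assumption.

    # Hypothetical sketch of RGB-D depth completion for transparent objects.
    # Not the TranspareNet architecture; layer choices are illustrative only.
    import torch
    import torch.nn as nn

    class DepthCompletionNet(nn.Module):
        """Regress a dense depth map from RGB (3 ch) + raw depth (1 ch)."""
        def __init__(self, base: int = 32):
            super().__init__()
            # Encoder: downsample the 4-channel RGB-D input twice.
            self.enc = nn.Sequential(
                nn.Conv2d(4, base, 3, stride=2, padding=1),
                nn.ReLU(inplace=True),
                nn.Conv2d(base, 2 * base, 3, stride=2, padding=1),
                nn.ReLU(inplace=True),
            )
            # Decoder: upsample back to input resolution, 1-channel depth out.
            self.dec = nn.Sequential(
                nn.ConvTranspose2d(2 * base, base, 4, stride=2, padding=1),
                nn.ReLU(inplace=True),
                nn.ConvTranspose2d(base, 1, 4, stride=2, padding=1),
            )

        def forward(self, rgb: torch.Tensor, raw_depth: torch.Tensor) -> torch.Tensor:
            x = torch.cat([rgb, raw_depth], dim=1)  # (B, 4, H, W)
            return self.dec(self.enc(x))            # (B, 1, H, W)

    # Sensor depth on transparent surfaces is distorted, so zero it out with
    # a transparency mask (assumed given here) before completion.
    rgb = torch.rand(1, 3, 480, 640)
    depth = torch.rand(1, 1, 480, 640)
    mask = torch.zeros(1, 1, 480, 640)              # 1 = transparent pixel
    completed = DepthCompletionNet()(rgb, depth * (1 - mask))

Feeding the mask itself as an extra input channel is another common design
choice, since it tells the network explicitly which depth values to trust.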
Related papers
- ClearDepth: Enhanced Stereo Perception of Transparent Objects for Robotic Manipulation [18.140839442955485]
We develop a vision transformer-based algorithm for stereo depth recovery of transparent objects.
Our method incorporates a parameter-aligned, domain-adaptive, and physically realistic Sim2Real simulation for efficient data generation.
Our experimental results demonstrate the model's exceptional Sim2Real generalizability in real-world scenarios.
arXiv Detail & Related papers (2024-09-13T15:44:38Z)
- Transparent Object Depth Completion [11.825680661429825]
The perception of transparent objects for grasp and manipulation remains a major challenge.
Existing robotic grasp methods that heavily rely on depth maps are not suitable for transparent objects due to their unique visual properties.
We propose an end-to-end network for transparent object depth completion that combines the strengths of single-view RGB-D based depth completion and multi-view depth estimation.
arXiv Detail & Related papers (2024-05-24T07:38:06Z)
- MVTrans: Multi-View Perception of Transparent Objects [29.851395075937255]
We forgo the unreliable depth map from RGB-D sensors and extend the stereo-based method.
Our proposed method, MVTrans, is an end-to-end multi-view architecture with multiple perception capabilities.
We establish a novel procedural photo-realistic dataset generation pipeline and create a large-scale transparent object detection dataset.
arXiv Detail & Related papers (2023-02-22T22:45:28Z)
- StereoPose: Category-Level 6D Transparent Object Pose Estimation from Stereo Images via Back-View NOCS [106.62225866064313]
We present StereoPose, a novel stereo image framework for category-level object pose estimation.
For a robust estimation from pure stereo images, we develop a pipeline that decouples category-level pose estimation into object size estimation, initial pose estimation, and pose refinement.
To address the issue of image content aliasing, we define a back-view NOCS map for the transparent object.
The back-view NOCS aims to reduce the network learning ambiguity caused by content aliasing and to leverage informative cues on the back of the transparent object for more accurate pose estimation.
arXiv Detail & Related papers (2022-11-03T08:36:09Z)
- MonoGraspNet: 6-DoF Grasping with a Single RGB Image [73.96707595661867]
6-DoF robotic grasping is a long-standing but unsolved problem.
Recent methods utilize strong 3D networks to extract geometric grasping representations from depth sensors.
We propose the first RGB-only 6-DoF grasping pipeline called MonoGraspNet.
arXiv Detail & Related papers (2022-09-26T21:29:50Z)
- High-resolution Iterative Feedback Network for Camouflaged Object Detection [128.893782016078]
Spotting camouflaged objects that are visually assimilated into the background is tricky for object detection algorithms.
We aim to extract the high-resolution texture details to avoid the detail degradation that causes blurred vision in edges and boundaries.
We introduce a novel HitNet to refine the low-resolution representations by high-resolution features in an iterative feedback manner.
arXiv Detail & Related papers (2022-03-22T11:20:21Z)
- ClearPose: Large-scale Transparent Object Dataset and Benchmark [7.342978076186365]
We contribute a large-scale real-world RGB-Depth transparent object dataset named ClearPose to serve as a benchmark dataset for segmentation, scene-level depth completion and object-centric pose estimation tasks.
The ClearPose dataset contains over 350K labeled real-world RGB-Depth frames and 4M instance annotations covering 63 household objects.
arXiv Detail & Related papers (2022-03-08T07:29:31Z)
- TransCG: A Large-Scale Real-World Dataset for Transparent Object Depth Completion and Grasping [46.6058840385155]
We contribute a large-scale real-world dataset for transparent object depth completion.
Our dataset contains 57,715 RGB-D images from 130 different scenes.
We propose an end-to-end depth completion network, which takes the RGB image and the inaccurate depth map as inputs and outputs a refined depth map (a masked-loss sketch of this kind of supervision appears after this list).
arXiv Detail & Related papers (2022-02-17T06:50:20Z)
- Accurate RGB-D Salient Object Detection via Collaborative Learning [101.82654054191443]
RGB-D saliency detection shows impressive ability on some challenging scenarios.
We propose a novel collaborative learning framework where edge, depth and saliency are leveraged in a more efficient way.
arXiv Detail & Related papers (2020-07-23T04:33:36Z)
- Segmenting Transparent Objects in the Wild [98.80906604285163]
This work proposes a large-scale dataset for transparent object segmentation, named Trans10K, consisting of 10,428 images of real scenarios with careful manual annotations.
To evaluate the effectiveness of Trans10K, we propose a novel boundary-aware segmentation method, termed TransLab, which exploits boundary as the clue to improve segmentation of transparent objects.
arXiv Detail & Related papers (2020-03-31T04:44:31Z)
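A recurring setup across the depth-completion entries above (TranspareNet,
TransCG, and Transparent Object Depth Completion) is supervising the refined
depth map against ground truth while a transparent-object mask focuses the
loss on the unreliable pixels. The masked L1 loss below is a minimal
Python/PyTorch sketch of that idea; the actual training objectives in each
paper differ, and all names here are assumptions.

    # Hypothetical masked depth loss in the spirit of the papers above;
    # each method's actual objective differs.
    import torch

    def masked_l1_loss(pred: torch.Tensor, gt: torch.Tensor,
                       mask: torch.Tensor, eps: float = 1e-6) -> torch.Tensor:
        """Mean absolute depth error over masked (e.g. transparent) pixels."""
        diff = (pred - gt).abs() * mask
        return diff.sum() / (mask.sum() + eps)  # eps guards an empty mask

    # Toy usage: random tensors stand in for network output and labels.
    pred = torch.rand(2, 1, 480, 640, requires_grad=True)
    gt = torch.rand(2, 1, 480, 640)
    mask = (torch.rand(2, 1, 480, 640) > 0.9).float()  # 1 = transparent pixel
    loss = masked_l1_loss(pred, gt, mask)
    loss.backward()  # gradients flow only through masked pixels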