TransNet: Transparent Object Manipulation Through Category-Level Pose Estimation
- URL: http://arxiv.org/abs/2307.12400v1
- Date: Sun, 23 Jul 2023 18:38:42 GMT
- Title: TransNet: Transparent Object Manipulation Through Category-Level Pose Estimation
- Authors: Huijie Zhang, Anthony Opipari, Xiaotong Chen, Jiyue Zhu, Zeren Yu, Odest Chadwicke Jenkins
- Abstract summary: We propose a two-stage pipeline that estimates category-level transparent object pose using localized depth completion and surface normal estimation.
Results show that TransNet achieves improved pose estimation accuracy on transparent objects.
We use TransNet to build an autonomous transparent object manipulation system for robotic pick-and-place and pouring tasks.
- Score: 6.844391823478345
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Transparent objects present multiple distinct challenges to visual perception
systems. First, their lack of distinguishing visual features makes transparent
objects harder to detect and localize than opaque objects. Even humans find
certain transparent surfaces with little specular reflection or refraction,
like glass doors, difficult to perceive. A second challenge is that depth
sensors typically used for opaque object perception cannot obtain accurate
depth measurements on transparent surfaces due to their unique reflective
properties. Stemming from these challenges, we observe that transparent object
instances within the same category, such as cups, look more similar to each
other than to ordinary opaque objects of that same category. Given this
observation, the present paper explores the possibility of category-level
transparent object pose estimation rather than instance-level pose estimation.
We propose TransNet, a two-stage pipeline that estimates
category-level transparent object pose using localized depth completion and
surface normal estimation. TransNet is evaluated in terms of pose estimation
accuracy on a large-scale transparent object dataset and compared to a
state-of-the-art category-level pose estimation approach. Results from this
comparison demonstrate that TransNet achieves improved pose estimation accuracy
on transparent objects. Moreover, we use TransNet to build an autonomous
transparent object manipulation system for robotic pick-and-place and pouring
tasks.
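The abstract outlines a two-stage design: localized depth completion and surface normal estimation feed a category-level pose estimator. Below is a minimal PyTorch-style sketch of how such a pipeline could be wired together; all module internals and the 12-dimensional pose parameterization are illustrative assumptions, not the authors' architecture.

```python
import torch
import torch.nn as nn

class TransNetSketch(nn.Module):
    """Illustrative two-stage pipeline: stage 1 repairs depth and predicts
    surface normals inside the detected object region; stage 2 regresses a
    category-level pose (rotation, translation, scale) from the repaired
    geometry. Module internals are placeholders, not the paper's networks."""

    def __init__(self):
        super().__init__()
        # Stage 1: localized depth completion + surface normal estimation.
        self.depth_head = nn.Sequential(
            nn.Conv2d(4, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 1, 3, padding=1),          # completed depth
        )
        self.normal_head = nn.Sequential(
            nn.Conv2d(4, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 3, 3, padding=1),          # surface normals
        )
        # Stage 2: pose regression from pooled geometric features.
        self.pose_head = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(4, 64), nn.ReLU(),
            nn.Linear(64, 6 + 3 + 3),  # assumed: 6D rotation + translation + scale
        )

    def forward(self, rgb, raw_depth, mask):
        # Restrict both stages to the localized (detected) object region.
        x = torch.cat([rgb, raw_depth], dim=1) * mask
        depth = self.depth_head(x)
        normals = self.normal_head(x)
        geom = torch.cat([depth, normals], dim=1) * mask
        return self.pose_head(geom)

pipe = TransNetSketch()
pose = pipe(torch.rand(1, 3, 64, 64), torch.rand(1, 1, 64, 64),
            torch.ones(1, 1, 64, 64))
print(pose.shape)  # torch.Size([1, 12])
```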
Related papers
- Transparent Object Depth Completion [11.825680661429825]
The perception of transparent objects for grasp and manipulation remains a major challenge.
Existing robotic grasp methods which heavily rely on depth maps are not suitable for transparent objects due to their unique visual properties.
We propose an end-to-end network for transparent object depth completion that combines the strengths of single-view RGB-D based depth completion and multi-view depth estimation.
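The summary names two cues, single-view RGB-D depth completion and multi-view depth estimation, without detailing how they are combined. A minimal sketch of one plausible combination, a learned per-pixel confidence blend, follows; it is a guess at the idea, not the paper's network.

```python
import torch
import torch.nn as nn

class DepthFusionSketch(nn.Module):
    """Illustrative fusion of a single-view completed depth and a multi-view
    depth estimate: a learned per-pixel confidence blends the two maps.
    This design is an assumption, not the paper's architecture."""

    def __init__(self):
        super().__init__()
        self.conf = nn.Sequential(
            nn.Conv2d(2, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 1, 3, padding=1), nn.Sigmoid(),
        )

    def forward(self, single_view_depth, multi_view_depth):
        # c near 1 trusts the single-view branch, near 0 the multi-view one.
        c = self.conf(torch.cat([single_view_depth, multi_view_depth], dim=1))
        return c * single_view_depth + (1 - c) * multi_view_depth

fuse = DepthFusionSketch()
d = fuse(torch.rand(1, 1, 48, 48), torch.rand(1, 1, 48, 48))
print(d.shape)  # torch.Size([1, 1, 48, 48])
```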
arXiv Detail & Related papers (2024-05-24T07:38:06Z)
- A New Dataset and a Distractor-Aware Architecture for Transparent Object Tracking [34.08943612955157]
Performance of modern trackers degrades substantially on transparent objects compared to opaque objects.
We propose the first transparent object tracking training dataset, Trans2k, which consists of over 2,000 sequences with 104,343 images in total.
We also present a new distractor-aware transparent object tracker (DiTra) that treats localization accuracy and target identification as separate tasks.
arXiv Detail & Related papers (2024-01-08T13:04:28Z)
- StereoPose: Category-Level 6D Transparent Object Pose Estimation from Stereo Images via Back-View NOCS [106.62225866064313]
We present StereoPose, a novel stereo image framework for category-level object pose estimation.
For a robust estimation from pure stereo images, we develop a pipeline that decouples category-level pose estimation into object size estimation, initial pose estimation, and pose refinement.
To address the issue of image content aliasing, we define a back-view NOCS map for the transparent object.
The back-view NOCS aims to reduce the network learning ambiguity caused by content aliasing and to leverage informative cues on the back of the transparent object for more accurate pose estimation.
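In NOCS-style pipelines such as this one, the predicted normalized-object-coordinate map is aligned with back-projected camera-space points to recover scale, rotation, and translation, typically via the Umeyama similarity transform. The sketch below shows that generic alignment step; it is standard NOCS machinery assumed here, not StereoPose's published code.

```python
import numpy as np

def umeyama_alignment(nocs_pts, cam_pts):
    """Recover scale s, rotation R, translation t minimizing
    ||cam_pts - (s * R @ nocs_pts + t)||^2 (Umeyama, 1991).
    nocs_pts, cam_pts: (N, 3) corresponding point sets."""
    mu_n, mu_c = nocs_pts.mean(0), cam_pts.mean(0)
    xn, xc = nocs_pts - mu_n, cam_pts - mu_c
    cov = xc.T @ xn / len(nocs_pts)               # 3x3 cross-covariance
    U, D, Vt = np.linalg.svd(cov)
    S = np.eye(3)
    if np.linalg.det(U) * np.linalg.det(Vt) < 0:  # avoid reflections
        S[2, 2] = -1.0
    R = U @ S @ Vt
    var_n = (xn ** 2).sum() / len(nocs_pts)
    s = np.trace(np.diag(D) @ S) / var_n          # isotropic scale
    t = mu_c - s * R @ mu_n
    return s, R, t

# Toy check: the alignment recovers a known scale and translation.
rng = np.random.default_rng(0)
nocs = rng.random((100, 3)) - 0.5
s, R, t = umeyama_alignment(nocs, 2.0 * nocs + np.array([0.1, 0.0, 0.5]))
print(round(s, 3), np.round(t, 3))  # 2.0 [0.1 0.  0.5]
```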
arXiv Detail & Related papers (2022-11-03T08:36:09Z)
- MonoGraspNet: 6-DoF Grasping with a Single RGB Image [73.96707595661867]
6-DoF robotic grasping is a long-standing but unsolved problem.
Recent methods utilize strong 3D networks to extract geometric grasping representations from depth sensors.
We propose the first RGB-only 6-DoF grasping pipeline called MonoGraspNet.
arXiv Detail & Related papers (2022-09-26T21:29:50Z)
- TODE-Trans: Transparent Object Depth Estimation with Transformer [16.928131778902564]
We present a transformer-based transparent object depth estimation approach from a single RGB-D input.
To enhance fine-grained features, a feature fusion module (FFM) is designed to support coherent prediction.
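The summary names a feature fusion module without specifying its design. A minimal gated-fusion sketch in PyTorch follows; the gating scheme is an illustrative assumption, not the TODE-Trans FFM.

```python
import torch
import torch.nn as nn

class FeatureFusionSketch(nn.Module):
    """Generic gated fusion of RGB and depth feature maps. The gating design
    is an illustrative guess, not the TODE-Trans architecture."""

    def __init__(self, channels):
        super().__init__()
        self.gate = nn.Sequential(
            nn.Conv2d(2 * channels, channels, 1),
            nn.Sigmoid(),  # per-pixel weight for the RGB stream
        )
        self.mix = nn.Conv2d(channels, channels, 3, padding=1)

    def forward(self, rgb_feat, depth_feat):
        g = self.gate(torch.cat([rgb_feat, depth_feat], dim=1))
        return self.mix(g * rgb_feat + (1 - g) * depth_feat)

ffm = FeatureFusionSketch(channels=64)
out = ffm(torch.rand(1, 64, 32, 32), torch.rand(1, 64, 32, 32))
print(out.shape)  # torch.Size([1, 64, 32, 32])
```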
arXiv Detail & Related papers (2022-09-18T03:04:01Z)
- TransNet: Category-Level Transparent Object Pose Estimation [6.844391823478345]
The lack of distinguishing visual features makes transparent objects harder to detect and localize than opaque objects.
A second challenge is that common depth sensors typically used for opaque object perception cannot obtain accurate depth measurements on transparent objects.
We propose TransNet, a two-stage pipeline that learns to estimate category-level transparent object pose using localized depth completion and surface normal estimation.
arXiv Detail & Related papers (2022-08-22T01:34:31Z)
- High-resolution Iterative Feedback Network for Camouflaged Object Detection [128.893782016078]
Spotting camouflaged objects that are visually assimilated into the background is tricky for object detection algorithms.
We aim to extract high-resolution texture details to avoid the detail degradation that blurs edges and boundaries.
We introduce a novel network, HitNet, which refines low-resolution representations with high-resolution features in an iterative feedback manner.
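A toy version of the iterative feedback idea, repeatedly fusing the current prediction with high-resolution features through a shared refinement block, is sketched below; the layer choices are assumptions, not HitNet's.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class IterativeFeedbackSketch(nn.Module):
    """Toy iterative feedback refinement: a coarse prediction is repeatedly
    fused with high-resolution features and fed back through the same
    refinement block. Illustrative only, not HitNet's architecture."""

    def __init__(self, channels, steps=3):
        super().__init__()
        self.steps = steps
        self.refine = nn.Conv2d(channels + 1, 1, 3, padding=1)

    def forward(self, coarse_mask, hires_feat):
        pred = F.interpolate(coarse_mask, size=hires_feat.shape[-2:],
                             mode="bilinear", align_corners=False)
        for _ in range(self.steps):
            # Feed the current prediction back in with high-res detail cues.
            pred = torch.sigmoid(self.refine(
                torch.cat([hires_feat, pred], dim=1)))
        return pred

net = IterativeFeedbackSketch(channels=32)
mask = net(torch.rand(1, 1, 16, 16), torch.rand(1, 32, 64, 64))
print(mask.shape)  # torch.Size([1, 1, 64, 64])
```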
arXiv Detail & Related papers (2022-03-22T11:20:21Z)
- ClearPose: Large-scale Transparent Object Dataset and Benchmark [7.342978076186365]
We contribute a large-scale real-world RGB-Depth transparent object dataset named ClearPose to serve as a benchmark dataset for segmentation, scene-level depth completion and object-centric pose estimation tasks.
The ClearPose dataset contains over 350K labeled real-world RGB-Depth frames and 4M instance annotations covering 63 household objects.
arXiv Detail & Related papers (2022-03-08T07:29:31Z)
- TransCG: A Large-Scale Real-World Dataset for Transparent Object Depth Completion and Grasping [46.6058840385155]
We contribute a large-scale real-world dataset for transparent object depth completion.
Our dataset contains 57,715 RGB-D images from 130 different scenes.
We propose an end-to-end depth completion network, which takes the RGB image and the inaccurate depth map as inputs and outputs a refined depth map.
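The stated input/output contract (RGB plus inaccurate depth in, refined depth out) can be sketched as a small encoder-decoder that predicts a depth residual; the architecture below is a placeholder illustration, not the TransCG network.

```python
import torch
import torch.nn as nn

class DepthCompletionSketch(nn.Module):
    """Minimal encoder-decoder taking an RGB image plus the sensor's
    (inaccurate) depth and emitting a refined depth map. Internals are
    placeholders, not the paper's design."""

    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(4, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
        )
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(32, 1, 4, stride=2, padding=1),
        )

    def forward(self, rgb, raw_depth):
        x = torch.cat([rgb, raw_depth], dim=1)
        # Predict a residual so valid raw depth is easy to preserve.
        return raw_depth + self.decoder(self.encoder(x))

net = DepthCompletionSketch()
refined = net(torch.rand(1, 3, 64, 64), torch.rand(1, 1, 64, 64))
print(refined.shape)  # torch.Size([1, 1, 64, 64])
```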
arXiv Detail & Related papers (2022-02-17T06:50:20Z)
- Segmenting Transparent Objects in the Wild [98.80906604285163]
This work proposes a large-scale dataset for transparent object segmentation, named Trans10K, consisting of 10,428 images of real scenarios with careful manual annotations.
To evaluate the effectiveness of Trans10K, we propose a novel boundary-aware segmentation method, termed TransLab, which exploits boundaries as a cue to improve segmentation of transparent objects.
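One common way to exploit boundaries as a cue is to supervise an auxiliary boundary head alongside the segmentation head. The sketch below builds a boundary target from the ground-truth mask with a morphological gradient and adds it as a loss term; this recipe is an assumption, not TransLab's exact design.

```python
import torch
import torch.nn.functional as F

def boundary_target(mask, width=3):
    """Derive a thin boundary map from a binary mask via a morphological
    gradient (dilation minus erosion). Illustrative of how a boundary
    supervision signal can be built; not TransLab's exact recipe."""
    pad = width // 2
    dilated = F.max_pool2d(mask, width, stride=1, padding=pad)
    eroded = -F.max_pool2d(-mask, width, stride=1, padding=pad)
    return (dilated - eroded).clamp(0, 1)

def boundary_aware_loss(seg_logits, boundary_logits, gt_mask, w=1.0):
    # Standard segmentation loss plus an auxiliary boundary term.
    seg = F.binary_cross_entropy_with_logits(seg_logits, gt_mask)
    bnd = F.binary_cross_entropy_with_logits(
        boundary_logits, boundary_target(gt_mask))
    return seg + w * bnd

gt = (torch.rand(1, 1, 64, 64) > 0.5).float()
loss = boundary_aware_loss(torch.randn(1, 1, 64, 64),
                           torch.randn(1, 1, 64, 64), gt)
print(loss.item())
```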
arXiv Detail & Related papers (2020-03-31T04:44:31Z)
- Refined Plane Segmentation for Cuboid-Shaped Objects by Leveraging Edge Detection [63.942632088208505]
We propose a post-processing algorithm to align the segmented plane masks with edges detected in the image.
This allows us to increase the accuracy of state-of-the-art approaches, while limiting ourselves to cuboid-shaped objects.
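A toy version of such edge-guided post-processing, snapping each mask contour point to the nearest Canny edge within a small radius and refilling the polygon, is sketched below using OpenCV; it illustrates the idea only and is not the authors' algorithm.

```python
import cv2
import numpy as np

def snap_mask_to_edges(mask, gray, max_shift=5):
    """Toy edge-alignment post-processing in the spirit of the paper: move
    each plane-mask contour point onto the nearest Canny edge pixel within
    max_shift pixels. An illustrative stand-in, not the authors' method.
    mask: uint8 binary mask; gray: uint8 grayscale image."""
    edges = cv2.Canny(gray, 50, 150)
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_NONE)
    h, w = edges.shape
    snapped = []
    for cnt in contours:
        pts = cnt[:, 0, :].copy()          # (N, 2) as (x, y)
        for i, (x, y) in enumerate(pts):
            x0, x1 = max(x - max_shift, 0), min(x + max_shift + 1, w)
            y0, y1 = max(y - max_shift, 0), min(y + max_shift + 1, h)
            ys, xs = np.nonzero(edges[y0:y1, x0:x1])
            if len(xs):                    # snap to the closest edge pixel
                d = (xs + x0 - x) ** 2 + (ys + y0 - y) ** 2
                j = int(np.argmin(d))
                pts[i] = (xs[j] + x0, ys[j] + y0)
        snapped.append(pts.reshape(-1, 1, 2))
    out = np.zeros_like(mask)
    cv2.fillPoly(out, snapped, 255)        # rebuild the aligned mask
    return out
```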
arXiv Detail & Related papers (2020-03-28T18:51:43Z)
This list is automatically generated from the titles and abstracts of the papers in this site.