TransNet: Category-Level Transparent Object Pose Estimation
- URL: http://arxiv.org/abs/2208.10002v1
- Date: Mon, 22 Aug 2022 01:34:31 GMT
- Title: TransNet: Category-Level Transparent Object Pose Estimation
- Authors: Huijie Zhang, Anthony Opipari, Xiaotong Chen, Jiyue Zhu, Zeren Yu, Odest Chadwicke Jenkins
- Abstract summary: The lack of distinguishing visual features makes transparent objects harder to detect and localize than opaque objects.
A second challenge is that common depth sensors typically used for opaque object perception cannot obtain accurate depth measurements on transparent objects.
We propose TransNet, a two-stage pipeline that learns to estimate category-level transparent object pose using localized depth completion and surface normal estimation.
- Score: 6.844391823478345
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Transparent objects present multiple distinct challenges to visual perception
systems. First, their lack of distinguishing visual features makes transparent
objects harder to detect and localize than opaque objects. Even humans find
certain transparent surfaces with little specular reflection or refraction,
e.g. glass doors, difficult to perceive. A second challenge is that common
depth sensors typically used for opaque object perception cannot obtain
accurate depth measurements on transparent objects due to their unique
reflective properties. Stemming from these challenges, we observe that
transparent object instances within the same category (e.g. cups) look more
similar to each other than to ordinary opaque objects of that same category.
Given this observation, the present paper sets out to explore the possibility
of category-level transparent object pose estimation rather than instance-level
pose estimation. We propose TransNet, a two-stage pipeline that learns to
estimate category-level transparent object pose using localized depth
completion and surface normal estimation. TransNet is evaluated in terms of
pose estimation accuracy on a recent, large-scale transparent object dataset
and compared to a state-of-the-art category-level pose estimation approach.
Results from this comparison demonstrate that TransNet achieves improved pose
estimation accuracy on transparent objects and key findings from the included
ablation studies suggest future directions for performance improvements.
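
As background for the surface normal term used in the abstract, the sketch below shows the classical geometric relationship between a depth map and per-pixel surface normals. It is not TransNet's method: the abstract describes learned depth completion and surface normal estimation, and gives no implementation details. The function names, the pinhole intrinsics (`fx`, `fy`, `cx`, `cy`), and the convention that a depth of 0 means "no measurement" are all assumptions made for illustration. Pixels whose neighbours lack depth get no normal, which loosely mirrors the sensor failures on transparent surfaces that motivate depth completion.

```python
import math

def backproject(depth, fx, fy, cx, cy):
    """Lift a depth map (rows of depths) to camera-frame 3D points (pinhole model)."""
    return [
        [((u - cx) * z / fx, (v - cy) * z / fy, z) for u, z in enumerate(row)]
        for v, row in enumerate(depth)
    ]

def cross(a, b):
    """Cross product of two 3-vectors."""
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )

def normals_from_depth(depth, fx, fy, cx, cy):
    """Unit normal per interior pixel from central differences.

    Border pixels, and pixels with a missing (<= 0) depth at the pixel or
    any of its four neighbours, are left as None.
    """
    pts = backproject(depth, fx, fy, cx, cy)
    h, w = len(depth), len(depth[0])
    normals = [[None] * w for _ in range(h)]
    for v in range(1, h - 1):
        for u in range(1, w - 1):
            used = (depth[v][u], depth[v][u - 1], depth[v][u + 1],
                    depth[v - 1][u], depth[v + 1][u])
            if min(used) <= 0:
                continue  # assumed convention: 0 means the sensor returned nothing
            # Tangent vectors along image columns (du) and rows (dv).
            du = tuple(p - q for p, q in zip(pts[v][u + 1], pts[v][u - 1]))
            dv = tuple(p - q for p, q in zip(pts[v + 1][u], pts[v - 1][u]))
            n = cross(dv, du)  # this order makes the normal face the camera (-z)
            length = math.sqrt(sum(c * c for c in n))
            if length > 0:
                normals[v][u] = tuple(c / length for c in n)
    return normals

# Usage: a fronto-parallel plane one unit in front of the camera.
flat = [[1.0] * 3 for _ in range(3)]
centre_normal = normals_from_depth(flat, 1.0, 1.0, 1.0, 1.0)[1][1]
```

For the flat plane in the usage lines, the centre normal points back along the optical axis toward the camera. A learned estimator is needed for transparent objects because the depth this calculation relies on is unreliable there.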
Related papers
- Transparent Object Depth Completion [11.825680661429825]
The perception of transparent objects for grasp and manipulation remains a major challenge.
Existing robotic grasp methods which heavily rely on depth maps are not suitable for transparent objects due to their unique visual properties.
We propose an end-to-end network for transparent object depth completion that combines the strengths of single-view RGB-D based depth completion and multi-view depth estimation.
Date: 2024-05-24T07:38:06Z
- Tabletop Transparent Scene Reconstruction via Epipolar-Guided Optical Flow with Monocular Depth Completion Prior [14.049778178534588]
We introduce a two-stage pipeline for reconstructing transparent objects tailored for mobile platforms.
We introduce Epipolar-guided Optical Flow (EOF) to fuse several single-view shape priors into a cross-view consistent 3D reconstruction.
Our pipeline significantly outperforms baseline methods in 3D reconstruction quality.
Date: 2023-10-15T21:30:06Z
- TransNet: Transparent Object Manipulation Through Category-Level Pose Estimation [6.844391823478345]
We propose a two-stage pipeline that estimates category-level transparent object pose using localized depth completion and surface normal estimation.
Results show that TransNet achieves improved pose estimation accuracy on transparent objects.
We use TransNet to build an autonomous transparent object manipulation system for robotic pick-and-place and pouring tasks.
Date: 2023-07-23T18:38:42Z
- StereoPose: Category-Level 6D Transparent Object Pose Estimation from Stereo Images via Back-View NOCS [106.62225866064313]
We present StereoPose, a novel stereo image framework for category-level object pose estimation.
For a robust estimation from pure stereo images, we develop a pipeline that decouples category-level pose estimation into object size estimation, initial pose estimation, and pose refinement.
To address the issue of image content aliasing, we define a back-view NOCS map for the transparent object.
The back-view NOCS aims to reduce the network learning ambiguity caused by content aliasing, and leverage informative cues on the back of the transparent object for more accurate pose estimation.
Date: 2022-11-03T08:36:09Z
- MonoGraspNet: 6-DoF Grasping with a Single RGB Image [73.96707595661867]
6-DoF robotic grasping is a long-standing but unsolved problem.
Recent methods utilize strong 3D networks to extract geometric grasping representations from depth sensors.
We propose the first RGB-only 6-DoF grasping pipeline called MonoGraspNet.
Date: 2022-09-26T21:29:50Z
- TODE-Trans: Transparent Object Depth Estimation with Transformer [16.928131778902564]
We present a transformer-based transparent object depth estimation approach from a single RGB-D input.
To better enhance the fine-grained features, a feature fusion module (FFM) is designed to assist coherent prediction.
Date: 2022-09-18T03:04:01Z
- High-resolution Iterative Feedback Network for Camouflaged Object Detection [128.893782016078]
Spotting camouflaged objects that are visually assimilated into the background is tricky for object detection algorithms.
We aim to extract the high-resolution texture details to avoid the detail degradation that causes blurred vision in edges and boundaries.
We introduce a novel HitNet to refine the low-resolution representations by high-resolution features in an iterative feedback manner.
Date: 2022-03-22T11:20:21Z
- ClearPose: Large-scale Transparent Object Dataset and Benchmark [7.342978076186365]
We contribute a large-scale real-world RGB-Depth transparent object dataset named ClearPose to serve as a benchmark dataset for segmentation, scene-level depth completion and object-centric pose estimation tasks.
The ClearPose dataset contains over 350K labeled real-world RGB-Depth frames and 4M instance annotations covering 63 household objects.
Date: 2022-03-08T07:29:31Z
- Slender Object Detection: Diagnoses and Improvements [74.40792217534]
In this paper, we are concerned with the detection of a particular type of object with extreme aspect ratios, namely slender objects.
For a classical object detection method, a drastic drop of 18.9% mAP on COCO is observed when it is evaluated solely on slender objects.
Date: 2020-11-17T09:39:42Z
- Segmenting Transparent Objects in the Wild [98.80906604285163]
This work proposes a large-scale dataset for transparent object segmentation, named Trans10K, consisting of 10,428 images of real scenarios with careful manual annotations.
To evaluate the effectiveness of Trans10K, we propose a novel boundary-aware segmentation method, termed TransLab, which exploits boundary as the clue to improve segmentation of transparent objects.
Date: 2020-03-31T04:44:31Z
- Refined Plane Segmentation for Cuboid-Shaped Objects by Leveraging Edge Detection [63.942632088208505]
We propose a post-processing algorithm to align the segmented plane masks with edges detected in the image.
This allows us to increase the accuracy of state-of-the-art approaches, while limiting ourselves to cuboid-shaped objects.
Date: 2020-03-28T18:51:43Z