Towards Self-Supervised Category-Level Object Pose and Size Estimation
- URL: http://arxiv.org/abs/2203.02884v1
- Date: Sun, 6 Mar 2022 06:02:30 GMT
- Title: Towards Self-Supervised Category-Level Object Pose and Size Estimation
- Authors: Yisheng He, Haoqiang Fan, Haibin Huang, Qifeng Chen, Jian Sun
- Abstract summary: This work presents a self-supervised framework for category-level object pose and size estimation from a single depth image.
We leverage the geometric consistency residing in point clouds of the same shape for self-supervision.
- Score: 121.28537953301951
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This work presents a self-supervised framework for category-level object pose
and size estimation from a single depth image. Unlike previous works that rely
on time-consuming and labor-intensive ground truth pose labels for supervision,
we leverage the geometric consistency residing in point clouds of the same
shape for self-supervision. Specifically, given a normalized category template
mesh in the object-coordinate system and the partially observed object instance
in the scene, our key idea is to apply differentiable shape deformation,
registration, and rendering to enforce geometric consistency between the
predicted and the observed scene object point cloud. We evaluate our approach
on real-world datasets and find that our approach outperforms the simple
traditional baseline by large margins while being competitive with some
fully-supervised approaches.
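As a rough illustration of the geometric-consistency idea described in the abstract, the sketch below deforms a normalized category template point cloud, poses it with a predicted scale, rotation, and translation, and penalizes a chamfer-style distance to the partially observed scene points. This is a minimal sketch, not the paper's code: all tensor names and the one-directional chamfer formulation are assumptions made for illustration, and the paper additionally uses differentiable registration and rendering that are omitted here.

```python
# Minimal sketch of a geometric-consistency self-supervision term (assumed
# names and shapes; PyTorch required).
import torch

def geometric_consistency_loss(template_pts, deform_field, R, t, s, observed_pts):
    """
    template_pts : (N, 3) normalized category template points (object frame)
    deform_field : (N, 3) predicted per-point shape deformation offsets
    R            : (3, 3) predicted rotation matrix
    t            : (3,)   predicted translation
    s            : scalar predicted size/scale
    observed_pts : (M, 3) partially observed object points in the scene frame
    """
    # Deform the template toward the observed instance shape, then pose it
    # into the scene with the predicted similarity transform.
    deformed = template_pts + deform_field          # (N, 3)
    posed = s * deformed @ R.T + t                  # (N, 3)

    # One-directional chamfer term: every observed point should lie close to
    # some posed template point (the observation is partial, so only this
    # direction is enforced in this sketch).
    dists = torch.cdist(observed_pts, posed)        # (M, N)
    return dists.min(dim=1).values.mean()
```

Minimizing such a term over the predicted deformation and pose is one simple way to enforce consistency between the predicted and observed object point clouds without ground-truth pose labels.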
Related papers
- OP-Align: Object-level and Part-level Alignment for Self-supervised Category-level Articulated Object Pose Estimation [7.022004731560844]
Category-level articulated object pose estimation targets unknown articulated objects within known categories.
We propose a novel self-supervised approach that leverages a single-frame point cloud to solve this task.
Our model consistently generates reconstructions with a canonical pose and joint state for the entire input object.
arXiv Detail & Related papers (2024-08-29T14:10:14Z)
- Zero-Shot Object-Centric Representation Learning [72.43369950684057]
We study current object-centric methods through the lens of zero-shot generalization.
We introduce a benchmark comprising eight different synthetic and real-world datasets.
We find that training on diverse real-world images improves transferability to unseen scenarios.
arXiv Detail & Related papers (2024-08-17T10:37:07Z)
- Diffusion-Driven Self-Supervised Learning for Shape Reconstruction and Pose Estimation [26.982199143972835]
We introduce a diffusion-driven self-supervised network for multi-object shape reconstruction and categorical pose estimation.
Our method significantly outperforms state-of-the-art self-supervised category-level baselines and even surpasses some fully-supervised instance-level and category-level methods.
arXiv Detail & Related papers (2024-03-19T13:43:27Z)
- GenPose: Generative Category-level Object Pose Estimation via Diffusion Models [5.1998359768382905]
We propose a novel solution by reframing category-level object pose estimation as conditional generative modeling.
Our approach achieves state-of-the-art performance on the REAL275 dataset, surpassing 50% and 60% on the strict 5°2cm and 5°5cm metrics (see the sketch after this list for how such thresholds are typically computed).
arXiv Detail & Related papers (2023-06-18T11:45:42Z)
- Generative Category-Level Shape and Pose Estimation with Semantic Primitives [27.692997522812615]
We propose a novel framework for category-level object shape and pose estimation from a single RGB-D image.
To handle the intra-category variation, we adopt a semantic primitive representation that encodes diverse shapes into a unified latent space.
We show that the proposed method achieves state-of-the-art pose estimation performance and better generalization on the real-world dataset.
arXiv Detail & Related papers (2022-10-03T17:51:54Z)
- CATRE: Iterative Point Clouds Alignment for Category-level Object Pose Refinement [52.41884119329864]
The category-level object pose and size refiner CATRE iteratively enhances pose estimates from point clouds to produce accurate results.
Our approach remarkably outperforms state-of-the-art methods on the REAL275, CAMERA25, and LM benchmarks while running at up to 85.32 Hz.
arXiv Detail & Related papers (2022-07-17T05:55:00Z)
- 3D Object Classification on Partial Point Clouds: A Practical Perspective [91.81377258830703]
A point cloud is a popular shape representation adopted in 3D object classification.
This paper introduces a practical setting for classifying partial point clouds of object instances under arbitrary poses.
A novel algorithm following an alignment-classification scheme is proposed.
arXiv Detail & Related papers (2020-12-18T04:00:56Z)
- Self-supervised Human Detection and Segmentation via Multi-view Consensus [116.92405645348185]
We propose a multi-camera framework in which geometric constraints are embedded in the form of multi-view consistency during training.
We show that our approach outperforms state-of-the-art self-supervised person detection and segmentation techniques on images that visually depart from those of standard benchmarks.
arXiv Detail & Related papers (2020-12-09T15:47:21Z)
- Object-Centric Image Generation from Layouts [93.10217725729468]
We develop a layout-to-image-generation method to generate complex scenes with multiple objects.
Our method learns representations of the spatial relationships between objects in the scene, which lead to improved layout fidelity.
We introduce SceneFID, an object-centric adaptation of the popular Fréchet Inception Distance metric, which is better suited for multi-object images.
arXiv Detail & Related papers (2020-03-16T21:40:09Z)
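For reference, the strict 5°2cm and 5°5cm metrics cited in the GenPose entry above count a predicted pose as correct when both the rotation error and the translation error fall below the respective thresholds. The sketch below is a minimal illustration under assumed conventions (translations in meters, no special handling for symmetric categories), not the evaluation code of any of the listed papers.

```python
# Minimal sketch of an n-degree / m-cm pose accuracy check (assumed conventions).
import numpy as np

def rotation_error_deg(R_pred, R_gt):
    # Geodesic distance between two rotation matrices, in degrees.
    cos_angle = (np.trace(R_pred.T @ R_gt) - 1.0) / 2.0
    return np.degrees(np.arccos(np.clip(cos_angle, -1.0, 1.0)))

def meets_pose_threshold(R_pred, t_pred, R_gt, t_gt, deg_thresh=5.0, cm_thresh=2.0):
    # True when both errors fall under the thresholds, e.g. the 5°2cm
    # criterion with the defaults above.
    rot_ok = rotation_error_deg(R_pred, R_gt) <= deg_thresh
    trans_ok = np.linalg.norm(t_pred - t_gt) * 100.0 <= cm_thresh  # m -> cm
    return bool(rot_ok and trans_ok)
```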