Detection Based Part-level Articulated Object Reconstruction from Single RGBD Image
- URL: http://arxiv.org/abs/2504.03177v1
- Date: Fri, 04 Apr 2025 05:08:04 GMT
- Title: Detection Based Part-level Articulated Object Reconstruction from Single RGBD Image
- Authors: Yuki Kawana, Tatsuya Harada
- Abstract summary: We propose an end-to-end trainable, cross-category method for reconstructing multiple man-made articulated objects from a single RGBD image. We depart from previous works that rely on learning instance-level latent space, focusing on man-made articulated objects with predefined part counts. Our method successfully reconstructs variously structured multiple instances that previous works cannot handle, and outperforms prior works in shape reconstruction and kinematics estimation.
- Score: 52.11275397911693
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We propose an end-to-end trainable, cross-category method for reconstructing multiple man-made articulated objects from a single RGBD image, focusing on part-level shape reconstruction and pose and kinematics estimation. We depart from previous works that rely on learning instance-level latent space, focusing on man-made articulated objects with predefined part counts. Instead, we propose a novel alternative approach that employs part-level representation, representing instances as combinations of detected parts. While our detect-then-group approach effectively handles instances with diverse part structures and various part counts, it faces issues of false positives, varying part sizes and scales, and an increasing model size due to end-to-end training. To address these challenges, we propose 1) test-time kinematics-aware part fusion to improve detection performance while suppressing false positives, 2) anisotropic scale normalization for part shape learning to accommodate various part sizes and scales, and 3) a balancing strategy for cross-refinement between feature space and output space to improve part detection while maintaining model size. Evaluation on both synthetic and real data demonstrates that our method successfully reconstructs variously structured multiple instances that previous works cannot handle, and outperforms prior works in shape reconstruction and kinematics estimation.
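The abstract's second contribution, anisotropic scale normalization for part shape learning, can be illustrated with a minimal sketch. The idea, as commonly applied, is to rescale each detected part's point cloud independently along each axis so that elongated parts (e.g. a drawer front vs. a thin handle) occupy a comparable canonical range before shape learning. The function name and the choice of a [-0.5, 0.5] canonical cube are illustrative assumptions, not the paper's exact formulation:

```python
import numpy as np

def anisotropic_normalize(points):
    """Normalize a part point cloud independently along each axis.

    points: (N, 3) array of a single part's 3D points.
    Returns the normalized points (each axis spanning [-0.5, 0.5]),
    plus the center and per-axis extent needed to undo the mapping.
    """
    mins = points.min(axis=0)
    maxs = points.max(axis=0)
    center = (mins + maxs) / 2.0
    # Guard against degenerate (flat) parts to avoid division by zero.
    extent = np.maximum(maxs - mins, 1e-8)
    normalized = (points - center) / extent
    return normalized, center, extent

def anisotropic_denormalize(normalized, center, extent):
    """Map a normalized part shape back to its original size and position."""
    return normalized * extent + center
```

Because the scaling is per-axis rather than uniform, a shape decoder sees parts of very different aspect ratios in a shared canonical frame, and the stored (center, extent) restores each part's true size at reconstruction time.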
Related papers
- Structure-Aware Correspondence Learning for Relative Pose Estimation [65.44234975976451]
Relative pose estimation provides a promising way for achieving object-agnostic pose estimation. Existing 3D correspondence-based methods suffer from small overlaps in visible regions and unreliable feature estimation for invisible regions. We propose a novel Structure-Aware Correspondence Learning method for Relative Pose Estimation, which consists of two key modules.
arXiv Detail & Related papers (2025-03-24T13:43:44Z) - ArtGS: Building Interactable Replicas of Complex Articulated Objects via Gaussian Splatting [66.29782808719301]
Building articulated objects is a key challenge in computer vision.
Existing methods often fail to effectively integrate information across different object states.
We introduce ArtGS, a novel approach that leverages 3D Gaussians as a flexible and efficient representation.
arXiv Detail & Related papers (2025-02-26T10:25:32Z) - Articulate your NeRF: Unsupervised articulated object modeling via conditional view synthesis [24.007950839144918]
We propose an unsupervised method to learn the pose and part-segmentation of articulated objects with rigid parts.
Our method learns the geometry and appearance of object parts by using an implicit model from the first observation.
arXiv Detail & Related papers (2024-06-24T13:13:31Z) - FoundationPose: Unified 6D Pose Estimation and Tracking of Novel Objects [55.77542145604758]
FoundationPose is a unified foundation model for 6D object pose estimation and tracking.
Our approach can be instantly applied at test-time to a novel object without fine-tuning.
arXiv Detail & Related papers (2023-12-13T18:28:09Z) - Learning to Complete Object Shapes for Object-level Mapping in Dynamic Scenes [30.500198859451434]
We propose a novel object-level mapping system that can simultaneously segment, track, and reconstruct objects in dynamic scenes.
It can further predict and complete their full geometries by conditioning on reconstructions from depth inputs and a category-level shape prior.
We evaluate its effectiveness by quantitatively and qualitatively testing it in both synthetic and real-world sequences.
arXiv Detail & Related papers (2022-08-09T22:56:33Z) - ANISE: Assembly-based Neural Implicit Surface rEconstruction [12.745433575962842]
We present ANISE, a method that reconstructs a 3D shape from partial observations (images or sparse point clouds).
The shape is formulated as an assembly of neural implicit functions, each representing a different part instance.
We demonstrate that, when performing reconstruction by decoding part representations into implicit functions, our method achieves state-of-the-art part-aware reconstruction results from both images and sparse point clouds.
arXiv Detail & Related papers (2022-05-27T00:01:40Z) - Unsupervised Part Discovery from Contrastive Reconstruction [90.88501867321573]
The goal of self-supervised visual representation learning is to learn strong, transferable image representations.
We propose an unsupervised approach to object part discovery and segmentation.
Our method yields semantic parts consistent across fine-grained but visually distinct categories.
arXiv Detail & Related papers (2021-11-11T17:59:42Z) - Monocular Human Pose and Shape Reconstruction using Part Differentiable Rendering [53.16864661460889]
Recent works succeed in regression-based methods which estimate parametric models directly through a deep neural network supervised by 3D ground truth.
In this paper, we introduce body segmentation as critical supervision.
To improve the reconstruction with part segmentation, we propose a part-level differentiable renderer that enables part-based models to be supervised by part segmentation.
arXiv Detail & Related papers (2020-03-24T14:25:46Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.