Geometric Point Attention Transformer for 3D Shape Reassembly
- URL: http://arxiv.org/abs/2411.17788v2
- Date: Sun, 01 Dec 2024 08:00:56 GMT
- Title: Geometric Point Attention Transformer for 3D Shape Reassembly
- Authors: Jiahan Li, Chaoran Cheng, Jianzhu Ma, Ge Liu
- Abstract summary: We present a network specifically designed to address the challenges of reasoning about geometric relationships.
We integrate both global shape information and local pairwise geometric features, along with poses represented as rotation and translation vectors for each part.
We evaluate our model on both the semantic and geometric assembly tasks, showing that it outperforms previous methods in absolute pose estimation.
- Score: 17.34739330880715
- License:
- Abstract: Shape assembly, which aims to reassemble separate parts into a complete object, has gained significant interest in recent years. Existing methods primarily rely on networks to predict the poses of individual parts, but often fail to effectively capture the geometric interactions between the parts and their poses. In this paper, we present the Geometric Point Attention Transformer (GPAT), a network specifically designed to address the challenges of reasoning about geometric relationships. In the geometric point attention module, we integrate both global shape information and local pairwise geometric features, along with poses represented as rotation and translation vectors for each part. To enable iterative updates and dynamic reasoning, we introduce a geometric recycling scheme, where each prediction is fed into the next iteration for refinement. We evaluate our model on both the semantic and geometric assembly tasks, showing that it outperforms previous methods in absolute pose estimation, achieving accurate pose predictions and high alignment accuracy.
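The recycling idea in the abstract, where each pose prediction is fed into the next iteration for refinement, can be illustrated with a toy loop. This is a minimal sketch, not the GPAT implementation: `toy_update` is a hypothetical stand-in for the learned network, and `apply_pose`, `compose`, `centroid`, and `recycle` are illustrative names that do not come from the paper. It only shows how a per-part pose (rotation R, translation t) is applied, passed back to the predictor, and composed with the predicted correction.

```python
IDENTITY = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]


def centroid(points):
    """Mean of a list of 3D points."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(3))


def apply_pose(points, R, t):
    """Apply the rigid transform x -> R x + t to every point."""
    return [
        tuple(sum(R[i][j] * p[j] for j in range(3)) + t[i] for i in range(3))
        for p in points
    ]


def compose(dR, dt, R, t):
    """Pose equivalent to applying (R, t) first and then the correction (dR, dt)."""
    new_R = [[sum(dR[i][k] * R[k][j] for k in range(3)) for j in range(3)]
             for i in range(3)]
    new_t = [sum(dR[i][k] * t[k] for k in range(3)) + dt[i] for i in range(3)]
    return new_R, new_t


def toy_update(points):
    """Hypothetical predictor (assumption): no rotation, and a translation
    that moves the part's centroid halfway toward the origin."""
    c = centroid(points)
    return IDENTITY, [-0.5 * ci for ci in c]


def recycle(points, predict_update, num_iters=3):
    """Refine a pose by feeding the currently posed part back into the predictor."""
    R, t = IDENTITY, [0.0, 0.0, 0.0]
    for _ in range(num_iters):
        current = apply_pose(points, R, t)   # part under the current estimate
        dR, dt = predict_update(current)     # predicted correction
        R, t = compose(dR, dt, R, t)         # updated estimate for the next pass
    return R, t


part = [(2.0, 0.0, 0.0), (4.0, 0.0, 0.0)]
R, t = recycle(part, toy_update, num_iters=3)
print(centroid(apply_pose(part, R, t)))
```

With this toy predictor the centroid offset halves on every pass, so after three iterations the part sits at one eighth of its starting distance from the origin. A learned predictor would also output a rotation correction, which `compose` already handles.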
Related papers
- Geometry-guided Feature Learning and Fusion for Indoor Scene Reconstruction [14.225228781008209]
This paper proposes a novel geometry integration mechanism for 3D scene reconstruction.
Our approach incorporates 3D geometry at three levels, i.e. feature learning, feature fusion, and network supervision.
arXiv Detail & Related papers (2024-08-28T08:02:47Z)
- Geometrically Consistent Partial Shape Matching [50.29468769172704]
Finding correspondences between 3D shapes is a crucial problem in computer vision and graphics.
An often neglected but essential property of matching is geometric consistency.
We propose a novel integer linear programming partial shape matching formulation.
arXiv Detail & Related papers (2023-09-10T12:21:42Z)
- Zero-shot point cloud segmentation by transferring geometric primitives [68.18710039217336]
We investigate zero-shot point cloud semantic segmentation, where the network is trained on seen objects and is able to segment unseen objects.
We propose a novel framework that learns the geometric primitives shared by objects of seen and unseen categories and employs a fine-grained alignment between language and the learned geometric primitives.
arXiv Detail & Related papers (2022-10-18T15:06:54Z) - Learning to Complete Object Shapes for Object-level Mapping in Dynamic
Scenes [30.500198859451434]
We propose a novel object-level mapping system that can simultaneously segment, track, and reconstruct objects in dynamic scenes.
It can further predict and complete their full geometries by conditioning on reconstructions from depth inputs and a category-level shape prior.
We evaluate its effectiveness by quantitatively and qualitatively testing it in both synthetic and real-world sequences.
arXiv Detail & Related papers (2022-08-09T22:56:33Z) - Single-view 3D Mesh Reconstruction for Seen and Unseen Categories [69.29406107513621]
Single-view 3D Mesh Reconstruction is a fundamental computer vision task that aims at recovering 3D shapes from single-view RGB images.
This paper studies how single-view 3D mesh reconstruction models generalize to unseen categories.
We propose an end-to-end two-stage network, GenMesh, to break the category boundaries in reconstruction.
arXiv Detail & Related papers (2022-08-04T14:13:35Z) - Plane Geometry Diagram Parsing [29.921409628478152]
We propose a powerful diagram parser based on deep learning and graph reasoning.
A modified instance segmentation method is proposed to extract geometric primitives.
The graph neural network (GNN) is leveraged to realize relation parsing and primitive classification.
arXiv Detail & Related papers (2022-05-19T07:47:01Z) - Unsupervised Learning for Cuboid Shape Abstraction via Joint
Segmentation from Point Clouds [8.156355030558172]
Representing complex 3D objects as simple geometric primitives, known as shape abstraction, is important for geometric modeling, structural analysis, and shape synthesis.
We propose an unsupervised shape abstraction method to map a point cloud into a compact cuboid representation.
arXiv Detail & Related papers (2021-06-07T09:15:16Z) - Self-supervised Geometric Perception [96.89966337518854]
Self-supervised geometric perception is a framework to learn a feature descriptor for correspondence matching without any ground-truth geometric model labels.
We show that SGP achieves state-of-the-art performance that is on par with or superior to the supervised oracles trained using ground-truth labels.
arXiv Detail & Related papers (2021-03-04T15:34:43Z) - Learning Geometry-Disentangled Representation for Complementary
Understanding of 3D Object Point Cloud [50.56461318879761]
We propose the Geometry-Disentangled Attention Network (GDANet) for 3D point cloud understanding.
GDANet disentangles point clouds into the contour and flat parts of 3D objects, denoted respectively by sharp and gentle variation components.
Experiments on 3D object classification and segmentation benchmarks demonstrate that GDANet achieves state-of-the-art results with fewer parameters.
arXiv Detail & Related papers (2020-12-20T13:35:00Z) - A Divide et Impera Approach for 3D Shape Reconstruction from Multiple
Views [49.03830902235915]
Estimating the 3D shape of an object from a single or multiple images has gained popularity thanks to the recent breakthroughs powered by deep learning.
This paper proposes to rely on viewpoint variant reconstructions by merging the visible information from the given views.
To validate the proposed method, we perform a comprehensive evaluation on the ShapeNet reference benchmark in terms of relative pose estimation and 3D shape reconstruction.
arXiv Detail & Related papers (2020-11-17T09:59:32Z)
This list is automatically generated from the titles and abstracts of the papers in this site.