Unsupervised Learning of 3D Semantic Keypoints with Mutual
Reconstruction
- URL: http://arxiv.org/abs/2203.10212v1
- Date: Sat, 19 Mar 2022 01:49:21 GMT
- Title: Unsupervised Learning of 3D Semantic Keypoints with Mutual
Reconstruction
- Authors: Haocheng Yuan, Chen Zhao, Shichao Fan, Jiaxi Jiang and Jiaqi Yang
- Abstract summary: 3D semantic keypoints are category-level semantically consistent points on 3D objects.
We present an unsupervised method that explicitly generates consistent semantic keypoints from point clouds.
To the best of our knowledge, the proposed method is the first to mine semantically consistent 3D keypoints from a mutual reconstruction view.
- Score: 11.164069907549756
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Semantic 3D keypoints are category-level semantically consistent points on 3D objects. Detecting 3D semantic keypoints is a foundation for a number of 3D vision tasks but remains challenging due to the ambiguity of semantic information, especially when the objects are represented by unordered 3D point clouds. Existing unsupervised methods tend to generate category-level keypoints in an implicit manner, making it difficult to extract high-level information such as semantic labels and topology. From a novel mutual reconstruction perspective, we present an unsupervised method that explicitly generates consistent semantic keypoints from point clouds. To achieve this, the proposed model predicts keypoints that not only reconstruct the object itself but also reconstruct other instances in the same category. To the best of our knowledge, the proposed method is the first to mine semantically consistent 3D keypoints from a mutual reconstruction view. Experiments under various evaluation metrics, as well as comparisons with the state of the art, demonstrate the efficacy of our new solution to mining semantically consistent keypoints with mutual reconstruction.
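To make the mutual-reconstruction idea concrete, here is a minimal NumPy sketch of what such an objective could look like. The `decode` argument is a hypothetical stand-in for the paper's learned keypoint-to-shape decoder, and the exact pairing and weighting of the terms are illustrative assumptions, not the authors' implementation:

```python
import numpy as np

def chamfer(p, q):
    """Symmetric Chamfer distance between point sets p (N, 3) and q (M, 3)."""
    d = np.linalg.norm(p[:, None, :] - q[None, :, :], axis=-1)  # (N, M) pairwise distances
    return d.min(axis=1).mean() + d.min(axis=0).mean()

def mutual_reconstruction_loss(kp_a, kp_b, cloud_a, cloud_b, decode):
    """Keypoints must rebuild their own instance (self terms) and the
    other instance of the same category (mutual terms)."""
    self_terms = chamfer(decode(kp_a), cloud_a) + chamfer(decode(kp_b), cloud_b)
    mutual_terms = chamfer(decode(kp_a), cloud_b) + chamfer(decode(kp_b), cloud_a)
    return self_terms + mutual_terms

# Tiny smoke test with random data and a trivial stand-in decoder.
rng = np.random.default_rng(0)
decode = lambda kp: np.repeat(kp, 64, axis=0)  # hypothetical decoder: 8 keypoints -> 512 points
loss = mutual_reconstruction_loss(rng.normal(size=(8, 3)), rng.normal(size=(8, 3)),
                                  rng.normal(size=(512, 3)), rng.normal(size=(512, 3)), decode)
```

The intuition is that the self terms alone could be satisfied by instance-specific points, whereas the mutual terms stay low only if keypoints land on semantically corresponding locations across instances of the category.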
Related papers
- Distilling Coarse-to-Fine Semantic Matching Knowledge for Weakly Supervised 3D Visual Grounding [58.924180772480504]
3D visual grounding involves finding a target object in a 3D scene that corresponds to a given sentence query.
We propose to leverage weakly supervised annotations to learn the 3D visual grounding model.
We design a novel semantic matching model that analyzes the semantic similarity between object proposals and sentences in a coarse-to-fine manner.
arXiv Detail & Related papers (2023-07-18T13:49:49Z)
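As a toy illustration of the coarse-to-fine matching idea in the entry above, the sketch below ranks object proposals against a whole-sentence embedding and then re-scores the survivors against per-word embeddings. The shapes, the top-5 cutoff, and the use of cosine similarity are all illustrative assumptions rather than the paper's actual model:

```python
import numpy as np

def cosine_sim(a, b):
    """Row-wise cosine similarity between a (P, D) and b (Q, D) -> (P, Q)."""
    a = a / np.linalg.norm(a, axis=-1, keepdims=True)
    b = b / np.linalg.norm(b, axis=-1, keepdims=True)
    return a @ b.T

def coarse_to_fine_match(proposal_feats, sentence_feat, word_feats):
    """Coarse stage: rank proposals against the whole-sentence embedding.
    Fine stage: re-score the top candidates against per-word embeddings."""
    coarse = cosine_sim(proposal_feats, sentence_feat[None, :])[:, 0]  # (P,)
    keep = np.argsort(coarse)[-5:]                                     # top-5 candidates
    fine = cosine_sim(proposal_feats[keep], word_feats).max(axis=1)    # best word match per candidate
    return keep[np.argmax(fine)]                                       # index of the grounded proposal
```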
- 3D Keypoint Estimation Using Implicit Representation Learning [46.09594828635109]
We tackle the challenging problem of 3D keypoint estimation of general objects using a novel implicit representation.
Inspired by the recent success of advanced implicit representations in reconstruction tasks, we explore the idea of using an implicit field to represent keypoints.
Specifically, our key idea is employing spheres to represent 3D keypoints, thereby enabling the learnability of the corresponding signed distance field.
arXiv Detail & Related papers (2023-06-20T13:32:01Z)
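The sphere-based keypoint representation in the entry above admits a compact closed form: the signed distance from a query point to a union of keypoint spheres. The sketch below assumes a single shared radius, which is an illustrative simplification:

```python
import numpy as np

def keypoint_sphere_sdf(query, centers, radius=0.05):
    """Signed distance from query points (Q, 3) to the union of spheres
    of a shared radius centered at the keypoints (K, 3); negative inside."""
    d = np.linalg.norm(query[:, None, :] - centers[None, :, :], axis=-1)  # (Q, K)
    return d.min(axis=1) - radius
```

A network regressing this field turns keypoint locations into a smooth, differentiable target rather than a discrete point selection.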
- Single-view 3D Mesh Reconstruction for Seen and Unseen Categories [69.29406107513621]
Single-view 3D Mesh Reconstruction is a fundamental computer vision task that aims at recovering 3D shapes from single-view RGB images.
This paper tackles Single-view 3D Mesh Reconstruction, to study the model generalization on unseen categories.
We propose an end-to-end two-stage network, GenMesh, to break the category boundaries in reconstruction.
arXiv Detail & Related papers (2022-08-04T14:13:35Z)
- SNAKE: Shape-aware Neural 3D Keypoint Field [62.91169625183118]
Detecting 3D keypoints from point clouds is important for shape reconstruction.
This work investigates the dual question: can shape reconstruction benefit 3D keypoint detection?
We propose a novel unsupervised paradigm named SNAKE, which is short for shape-aware neural 3D keypoint field.
arXiv Detail & Related papers (2022-06-03T17:58:43Z)
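One rough reading of the shape-aware keypoint-field idea above is a coordinate-conditioned model with two heads, one scoring surface occupancy and one scoring keypoint saliency, with keypoints read off as salient surface points. The sketch below is a toy query interface under that assumption; `shape_head` and `saliency_head` are hypothetical stand-ins for the trained network:

```python
import numpy as np

def top_salient_surface_points(coords, shape_head, saliency_head, k=10):
    """Query a continuous field at coords (Q, 3): one head scores surface
    occupancy, the other keypoint saliency; keypoints are read off as the
    most salient points that lie on the predicted surface."""
    occ = shape_head(coords)                  # (Q,) occupancy probabilities in [0, 1]
    sal = saliency_head(coords)               # (Q,) keypoint saliency in [0, 1]
    surface = coords[occ > 0.5]
    order = np.argsort(sal[occ > 0.5])[::-1]  # most salient first
    return surface[order[:k]]
```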
- From Keypoints to Object Landmarks via Self-Training Correspondence: A novel approach to Unsupervised Landmark Discovery [37.78933209094847]
This paper proposes a novel paradigm for the unsupervised learning of object landmark detectors.
We validate our method on a variety of difficult datasets, including LS3D, BBCPose, Human3.6M and PennAction.
arXiv Detail & Related papers (2022-05-31T15:44:29Z)
- SASA: Semantics-Augmented Set Abstraction for Point-based 3D Object Detection [78.90102636266276]
We propose a novel set abstraction method named Semantics-Augmented Set Abstraction (SASA).
Based on the estimated point-wise foreground scores, we then propose a semantics-guided point sampling algorithm to help retain more important foreground points during down-sampling.
In practice, SASA proves effective in identifying valuable points related to foreground objects and in improving feature learning for point-based 3D detection.
arXiv Detail & Related papers (2022-01-06T08:54:47Z)
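One plausible form of the semantics-guided sampling described above is farthest point sampling whose selection criterion is weighted by per-point foreground scores, so that likely-foreground points preferentially survive down-sampling. The multiplicative weighting below is an assumption for illustration; see the paper for the exact criterion:

```python
import numpy as np

def semantics_guided_fps(points, fg_scores, m):
    """Farthest point sampling of m points from points (N, 3) whose
    selection score is weighted by per-point foreground probabilities."""
    chosen = [int(np.argmax(fg_scores))]  # seed with the most confident foreground point
    dist = np.full(len(points), np.inf)
    for _ in range(m - 1):
        d = np.linalg.norm(points - points[chosen[-1]], axis=1)
        dist = np.minimum(dist, d)        # distance to the nearest chosen point
        chosen.append(int(np.argmax(dist * fg_scores)))
    return np.array(chosen)
```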
- Canonical 3D Deformer Maps: Unifying parametric and non-parametric methods for dense weakly-supervised category reconstruction [79.98689027127855]
We propose a new representation of the 3D shape of common object categories that can be learned from a collection of 2D images of independent objects.
Our method builds in a novel way on concepts from parametric deformation models, non-parametric 3D reconstruction, and canonical embeddings.
It achieves state-of-the-art results in dense 3D reconstruction on public in-the-wild datasets of faces, cars, and birds.
arXiv Detail & Related papers (2020-08-28T15:44:05Z)
- Unsupervised Learning of Category-Specific Symmetric 3D Keypoints from Point Sets [71.84892018102465]
This paper aims at learning category-specific 3D keypoints, in an unsupervised manner, using a collection of misaligned 3D point clouds of objects from an unknown category.
To the best of our knowledge, this is the first work on learning such keypoints directly from 3D point clouds.
arXiv Detail & Related papers (2020-03-17T10:28:02Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of its content (including all information) and is not responsible for any consequences arising from its use.