Fine-Grained 3D Shape Classification with Hierarchical Part-View
Attentions
- URL: http://arxiv.org/abs/2005.12541v2
- Date: Mon, 28 Dec 2020 06:34:39 GMT
- Title: Fine-Grained 3D Shape Classification with Hierarchical Part-View
Attentions
- Authors: Xinhai Liu, Zhizhong Han, Yu-Shen Liu, Matthias Zwicker
- Abstract summary: We propose a novel fine-grained 3D shape classification method named FG3D-Net to capture the fine-grained local details of 3D shapes from multiple rendered views.
Our results on the fine-grained 3D shape dataset show that our method outperforms other state-of-the-art methods.
- Score: 70.0171362989609
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Fine-grained 3D shape classification is important for shape understanding and
analysis, which poses a challenging research problem. However, fine-grained
3D shape classification has rarely been explored, due to the lack of
fine-grained 3D shape benchmarks. To address this issue, we first
introduce a new 3D shape dataset (named FG3D dataset) with fine-grained class
labels, which consists of three categories including airplane, car and chair.
Each category consists of several subcategories at a fine-grained level.
In our experiments on this fine-grained dataset, we find that
state-of-the-art methods are significantly limited by the small variance among
subcategories in the same category. To resolve this problem, we further propose
a novel fine-grained 3D shape classification method named FG3D-Net to capture
the fine-grained local details of 3D shapes from multiple rendered views.
Specifically, we first train a Region Proposal Network (RPN), supervised by a
benchmark of generally semantic part detection, to detect the generally
semantic parts inside multiple views. Then, we design a hierarchical part-view attention
aggregation module to learn a global shape representation by aggregating
generally semantic part features, which preserves the local details of 3D
shapes. The part-view attention module hierarchically leverages part-level and
view-level attention to increase the discriminability of our features. The
part-level attention highlights the important parts in each view while the
view-level attention highlights the discriminative views among all the views of
the same object. In addition, we integrate a Recurrent Neural Network (RNN) to
capture the spatial relationships among sequential views from different
viewpoints. Our results on the fine-grained 3D shape dataset show that our
method outperforms other state-of-the-art methods.
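The hierarchical part-view attention aggregation described in the abstract can be sketched as follows. This is a minimal illustrative NumPy implementation, not the authors' code: `w_part` and `w_view` are hypothetical stand-ins for learned scoring parameters, and the RPN, view rendering, and RNN stages are omitted. Part-level attention first weights the part features within each view; view-level attention then weights the resulting view features into one global shape descriptor.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def part_view_attention(part_feats, w_part, w_view):
    """Aggregate part features into a global shape descriptor.

    part_feats: (V, P, D) -- P part features per each of V rendered views.
    w_part, w_view: (D,) scoring vectors (learned in the real model).
    Returns a single (D,) global shape feature.
    """
    # Part-level attention: highlight the important parts in each view.
    part_scores = part_feats @ w_part                 # (V, P)
    part_weights = softmax(part_scores, axis=1)       # (V, P)
    view_feats = (part_weights[..., None] * part_feats).sum(axis=1)  # (V, D)

    # View-level attention: highlight the discriminative views of the object.
    view_scores = view_feats @ w_view                 # (V,)
    view_weights = softmax(view_scores, axis=0)       # (V,)
    global_feat = (view_weights[:, None] * view_feats).sum(axis=0)   # (D,)
    return global_feat
```

Because both attention stages are convex combinations, the global feature always lies in the span of the input part features; discriminability comes from how the learned scoring vectors distribute the weights.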
Related papers
- Open3DIS: Open-Vocabulary 3D Instance Segmentation with 2D Mask Guidance [49.14140194332482]
We introduce Open3DIS, a novel solution designed to tackle the problem of open-vocabulary instance segmentation within 3D scenes.
Objects within 3D environments exhibit diverse shapes, scales, and colors, making precise instance-level identification a challenging task.
arXiv Detail & Related papers (2023-12-17T10:07:03Z)
- SAI3D: Segment Any Instance in 3D Scenes [68.57002591841034]
We introduce SAI3D, a novel zero-shot 3D instance segmentation approach.
Our method partitions a 3D scene into geometric primitives, which are then progressively merged into 3D instance segmentations.
Empirical evaluations on ScanNet, Matterport3D and the more challenging ScanNet++ datasets demonstrate the superiority of our approach.
arXiv Detail & Related papers (2023-12-17T09:05:47Z)
- U3DS$^3$: Unsupervised 3D Semantic Scene Segmentation [19.706172244951116]
This paper presents U3DS$^3$, a step towards completely unsupervised point cloud segmentation for holistic 3D scenes.
The initial step of our proposed approach involves generating superpoints based on the geometric characteristics of each scene.
We then apply a spatial clustering-based learning process, followed by iterative training with pseudo-labels generated from the cluster centroids.
arXiv Detail & Related papers (2023-11-10T12:05:35Z)
- MvDeCor: Multi-view Dense Correspondence Learning for Fine-grained 3D Segmentation [91.6658845016214]
We propose to utilize self-supervised techniques in the 2D domain for fine-grained 3D shape segmentation tasks.
We render a 3D shape from multiple views, and set up a dense correspondence learning task within the contrastive learning framework.
As a result, the learned 2D representations are view-invariant and geometrically consistent.
arXiv Detail & Related papers (2022-08-18T00:48:15Z)
- Single-view 3D Mesh Reconstruction for Seen and Unseen Categories [69.29406107513621]
Single-view 3D Mesh Reconstruction is a fundamental computer vision task that aims at recovering 3D shapes from single-view RGB images.
This paper tackles Single-view 3D Mesh Reconstruction, to study the model generalization on unseen categories.
We propose an end-to-end two-stage network, GenMesh, to break the category boundaries in reconstruction.
arXiv Detail & Related papers (2022-08-04T14:13:35Z)
- Topologically-Aware Deformation Fields for Single-View 3D Reconstruction [30.738926104317514]
We present a new framework for learning 3D object shapes and dense cross-object 3D correspondences from just an unaligned category-specific image collection.
The 3D shapes are generated implicitly as deformations to a category-specific signed distance field.
Our approach, dubbed TARS, achieves state-of-the-art reconstruction fidelity on several datasets.
arXiv Detail & Related papers (2022-05-12T17:59:59Z)
- Semi-supervised 3D shape segmentation with multilevel consistency and part substitution [21.075426681857024]
We propose an effective semi-supervised method for learning 3D segmentations from a few labeled 3D shapes and a large amount of unlabeled 3D data.
For the unlabeled data, we present a novel multilevel consistency loss to enforce consistency of network predictions between perturbed copies of a 3D shape.
For the labeled data, we develop a simple yet effective part substitution scheme to augment the labeled 3D shapes with more structural variations to enhance training.
arXiv Detail & Related papers (2022-04-19T11:48:24Z)
- Learning Geometry-Disentangled Representation for Complementary Understanding of 3D Object Point Cloud [50.56461318879761]
We propose Geometry-Disentangled Attention Network (GDANet) for 3D point cloud processing.
GDANet disentangles point clouds into the contour and flat parts of 3D objects, denoted by sharp and gentle variation components, respectively.
Experiments on 3D object classification and segmentation benchmarks demonstrate that GDANet achieves state-of-the-art performance with fewer parameters.
arXiv Detail & Related papers (2020-12-20T13:35:00Z)
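The sharp/gentle disentanglement idea in the GDANet summary above can be illustrated with a simple geometric heuristic: score each point by the local surface variation of its neighborhood (smallest PCA eigenvalue over the eigenvalue sum), then split on a threshold. This NumPy sketch is an assumption-laden stand-in for the paper's learned decomposition; the neighborhood size `k` and cutoff `thresh` are illustrative values, not ones from the paper.

```python
import numpy as np

def split_sharp_gentle(points, k=8, thresh=0.1):
    """Split a point cloud (N, 3) into sharp (contour-like) and gentle
    (flat-like) subsets using local surface variation from PCA over the
    k nearest neighbors of each point."""
    n = len(points)
    # Pairwise squared distances, then k nearest neighbors per point.
    d2 = ((points[:, None, :] - points[None, :, :]) ** 2).sum(-1)  # (N, N)
    idx = np.argsort(d2, axis=1)[:, :k]
    variation = np.empty(n)
    for i in range(n):
        nbrs = points[idx[i]]
        cov = np.cov(nbrs.T)                     # 3x3 local covariance
        eig = np.sort(np.linalg.eigvalsh(cov))   # ascending eigenvalues
        # Surface variation: flat patches score near 0, edges/corners higher.
        variation[i] = eig[0] / max(eig.sum(), 1e-12)
    sharp = variation >= thresh
    return points[sharp], points[~sharp]
```

Points sampled from a plane all score near zero variation and land in the gentle subset, while points on creases and corners score higher, which is the intuition behind contour-vs-flat decomposition.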
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.