Two Heads are Better than One: Geometric-Latent Attention for Point
Cloud Classification and Segmentation
- URL: http://arxiv.org/abs/2111.00231v1
- Date: Sat, 30 Oct 2021 11:20:56 GMT
- Authors: Hanz Cuevas-Velasquez, Antonio Javier Gallego, Robert B. Fisher
- Abstract summary: We present an innovative two-headed attention layer that combines geometric and latent features to segment a 3D scene into meaningful subsets.
Each head combines local and global information, using either the geometric or latent features, of a neighborhood of points and uses this information to learn better local relationships.
- Score: 10.2254921311882
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: We present an innovative two-headed attention layer that combines geometric
and latent features to segment a 3D scene into semantically meaningful subsets.
Each head combines local and global information, using either the geometric or
latent features, of a neighborhood of points and uses this information to learn
better local relationships. This Geometric-Latent attention layer (Ge-Latto) is
combined with a sub-sampling strategy to capture global features. Our method is
invariant to permutation thanks to the use of shared-MLP layers, and it can
also be used with point clouds with varying densities because the local
attention layer does not depend on the neighbor order. Our proposal is simple
yet robust, which allows it to achieve competitive results in the ShapeNetPart
and ModelNet40 datasets, and the state-of-the-art when segmenting the complex
dataset S3DIS, with 69.2% IoU on Area 5, and 89.7% overall accuracy using
K-fold cross-validation on the 6 areas.
Related papers
- GSTran: Joint Geometric and Semantic Coherence for Point Cloud Segmentation [33.72549134362884]
We propose GSTran, a novel transformer network tailored for the segmentation task.
The proposed network mainly consists of two principal components: a local geometric transformer and a global semantic transformer.
Experiments on ShapeNetPart and S3DIS benchmarks demonstrate the effectiveness of the proposed method.
arXiv Detail & Related papers (2024-08-21T12:12:37Z)
- On-the-fly Point Feature Representation for Point Clouds Analysis [7.074010861305738]
We propose On-the-fly Point Feature Representation (OPFR), which explicitly captures abundant geometric information through the Curve Feature Generator module.
We also introduce the Local Reference Constructor module, which approximates the local coordinate systems based on triangle sets.
OPFR requires only an extra 1.56 ms for inference (65x faster than vanilla PFH) and 0.012M more parameters, and it can serve as a versatile plug-and-play module for various backbones.
arXiv Detail & Related papers (2024-07-31T04:57:06Z)
- X-3D: Explicit 3D Structure Modeling for Point Cloud Recognition [73.0588783479853]
X-3D is an explicit 3D structure modeling approach.
It captures explicit local structural information within the input 3D space.
It produces dynamic kernels with shared weights for all neighborhood points within the current local region.
arXiv Detail & Related papers (2024-04-23T13:15:35Z)
- GeoSpark: Sparking up Point Cloud Segmentation with Geometry Clue [25.747471104753426]
GeoSpark is a Plug-in module that incorporates geometry clues into the network to Spark up feature learning and downsampling.
For feature aggregation, GeoSpark improves by allowing the network to learn from both local points and neighboring geometry partitions.
GeoSpark utilizes geometry partition information to guide the downsampling process, where points with unique features are preserved while redundant points are fused.
arXiv Detail & Related papers (2023-03-14T23:30:46Z)
- Adaptive Edge-to-Edge Interaction Learning for Point Cloud Analysis [118.30840667784206]
A key issue in point cloud data processing is extracting useful information from local regions.
Previous works ignore the relation between edges in local regions, which encodes the local shape information.
This paper proposes a novel Adaptive Edge-to-Edge Interaction Learning module.
arXiv Detail & Related papers (2022-11-20T07:10:14Z)
- Learning Implicit Feature Alignment Function for Semantic Segmentation [51.36809814890326]
Implicit Feature Alignment function (IFA) is inspired by the rapidly expanding topic of implicit neural representations.
We show that IFA implicitly aligns the feature maps at different levels and is capable of producing segmentation maps in arbitrary resolutions.
Our method can be combined with various architectures to improve them, and it achieves a state-of-the-art accuracy trade-off on common benchmarks.
arXiv Detail & Related papers (2022-06-17T09:40:14Z)
- SemAffiNet: Semantic-Affine Transformation for Point Cloud Segmentation [94.11915008006483]
We propose SemAffiNet for point cloud semantic segmentation.
We conduct extensive experiments on the ScanNetV2 and NYUv2 datasets.
arXiv Detail & Related papers (2022-05-26T17:00:23Z)
- GraNet: Global Relation-aware Attentional Network for ALS Point Cloud Classification [7.734726150561088]
We propose a novel neural network focusing on semantic labeling of ALS point clouds.
GraNet learns local geometric description and local dependencies.
Experiments were conducted on two ALS point cloud datasets.
arXiv Detail & Related papers (2020-12-24T23:54:45Z)
- Learning Geometry-Disentangled Representation for Complementary Understanding of 3D Object Point Cloud [50.56461318879761]
We propose Geometry-Disentangled Attention Network (GDANet) for 3D image processing.
GDANet disentangles point clouds into the contour and flat parts of 3D objects, denoted by sharp and gentle variation components, respectively.
Experiments on 3D object classification and segmentation benchmarks demonstrate that GDANet achieves state-of-the-art results with fewer parameters.
arXiv Detail & Related papers (2020-12-20T13:35:00Z)
- PointGroup: Dual-Set Point Grouping for 3D Instance Segmentation [111.7241018610573]
We present PointGroup, a new end-to-end bottom-up architecture for instance segmentation.
We design a two-branch network to extract point features and predict semantic labels and offsets, which shift each point towards its respective instance centroid.
A clustering component follows, using both the original and the offset-shifted point coordinate sets to take advantage of their complementary strengths.
We conduct extensive experiments on two challenging datasets, ScanNet v2 and S3DIS, on which our method achieves the highest performance, 63.6% and 64.0%, compared to 54.9% and 54.4% achieved by former best
arXiv Detail & Related papers (2020-04-03T16:26:37Z)