GeoSpark: Sparking up Point Cloud Segmentation with Geometry Clue
- URL: http://arxiv.org/abs/2303.08274v1
- Date: Tue, 14 Mar 2023 23:30:46 GMT
- Title: GeoSpark: Sparking up Point Cloud Segmentation with Geometry Clue
- Authors: Zhening Huang, Xiaoyang Wu, Hengshuang Zhao, Lei Zhu, Shujun Wang,
Georgios Hadjidemetriou, Ioannis Brilakis
- Abstract summary: GeoSpark is a Plug-in module that incorporates geometry clues into the network to Spark up feature learning and downsampling.
For feature aggregation, GeoSpark improves feature modeling by allowing the network to learn from both local points and neighboring geometry partitions.
GeoSpark utilizes geometry partition information to guide the downsampling process, where points with unique features are preserved while redundant points are fused.
- Score: 25.747471104753426
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Current point cloud segmentation architectures suffer from limited long-range
feature modeling, as they mostly rely on aggregating information with local
neighborhoods. Furthermore, in order to learn point features at multiple
scales, most methods utilize a data-agnostic sampling approach to decrease the
number of points after each stage. Such sampling methods, however, often
discard points for small objects in the early stages, leading to inadequate
feature learning. We believe these issues can be mitigated by introducing
explicit geometry clues as guidance. To this end, we propose GeoSpark, a
Plug-in module that incorporates Geometry clues into the network to Spark up
feature learning and downsampling. GeoSpark can be easily integrated into
various backbones. For feature aggregation, it improves feature modeling by
allowing the network to learn from both local points and neighboring geometry
partitions, resulting in an enlarged data-tailored receptive field.
Additionally, GeoSpark utilizes geometry partition information to guide the
downsampling process, where points with unique features are preserved while
redundant points are fused, resulting in better preservation of key points
throughout the network. We observed consistent improvements after adding
GeoSpark to various backbones including PointNet++, KPConv, and
PointTransformer. Notably, when integrated with Point Transformer, our GeoSpark
module achieves a 74.7% mIoU on the ScanNetv2 dataset (4.1% improvement) and
71.5% mIoU on the S3DIS Area 5 dataset (1.1% improvement), ranking top on both
benchmarks. Code and models will be made publicly available.
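To make the downsampling idea concrete, here is a minimal NumPy sketch of partition-guided downsampling. The function name, the feature-distance criterion for "unique" versus "redundant" points, and the mean-fusion rule are all illustrative assumptions, not the paper's released implementation:

```python
import numpy as np

def partition_guided_downsample(points, feats, part_ids, dist_thresh=0.5):
    """Hypothetical sketch of geometry-guided downsampling: within each
    geometry partition, points whose features sit close to the partition's
    mean feature are treated as redundant and fused into one representative,
    while points whose features deviate are preserved as-is."""
    kept_pts, kept_feats = [], []
    for pid in np.unique(part_ids):
        mask = part_ids == pid
        p, f = points[mask], feats[mask]
        # distance of each point's feature to the partition's mean feature
        d = np.linalg.norm(f - f.mean(axis=0), axis=1)
        unique = d > dist_thresh
        # preserve the "unique" points unchanged
        kept_pts.append(p[unique])
        kept_feats.append(f[unique])
        # fuse the "redundant" points into a single averaged point
        if (~unique).any():
            kept_pts.append(p[~unique].mean(axis=0, keepdims=True))
            kept_feats.append(f[~unique].mean(axis=0, keepdims=True))
    return np.concatenate(kept_pts), np.concatenate(kept_feats)
```

Unlike data-agnostic sampling, a rule like this keeps at least one representative per geometry partition, which is how small objects can survive early downsampling stages.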
Related papers
- GSTran: Joint Geometric and Semantic Coherence for Point Cloud Segmentation [33.72549134362884]
We propose GSTran, a novel transformer network tailored for the segmentation task.
The proposed network mainly consists of two principal components: a local geometric transformer and a global semantic transformer.
Experiments on ShapeNetPart and S3DIS benchmarks demonstrate the effectiveness of the proposed method.
arXiv Detail & Related papers (2024-08-21T12:12:37Z)
- GeoFormer: Learning Point Cloud Completion with Tri-Plane Integrated Transformer [41.26276375114911]
Point cloud completion aims to recover accurate global geometry and preserve fine-grained local details from partial point clouds.
Conventional methods typically predict unseen points directly from 3D point cloud coordinates or use self-projected multi-view depth maps.
We introduce a GeoFormer that simultaneously enhances the global geometric structure of the points and improves the local details.
arXiv Detail & Related papers (2024-08-13T03:15:36Z)
- On-the-fly Point Feature Representation for Point Clouds Analysis [7.074010861305738]
We propose On-the-fly Point Feature Representation (OPFR), which captures abundant geometric information explicitly through a Curve Feature Generator module.
We also introduce the Local Reference Constructor module, which approximates the local coordinate systems based on triangle sets.
OPFR only requires extra 1.56ms for inference (65x faster than vanilla PFH) and 0.012M more parameters, and it can serve as a versatile plug-and-play module for various backbones.
arXiv Detail & Related papers (2024-07-31T04:57:06Z)
- Geometrically-driven Aggregation for Zero-shot 3D Point Cloud Understanding [11.416392706435415]
Zero-shot 3D point cloud understanding can be achieved via 2D Vision-Language Models (VLMs).
Existing strategies directly map Vision-Language Models from 2D pixels of rendered or captured views to 3D points, overlooking the inherent and expressible point cloud geometric structure.
We introduce the first training-free aggregation technique that leverages the point cloud's 3D geometric structure to improve the quality of the transferred Vision-Language Models.
arXiv Detail & Related papers (2023-12-04T12:30:07Z)
- Clustering based Point Cloud Representation Learning for 3D Analysis [80.88995099442374]
We propose a clustering based supervised learning scheme for point cloud analysis.
Unlike the current de facto scene-wise training paradigm, our algorithm conducts within-class clustering in the point embedding space.
Our algorithm shows notable improvements on famous point cloud segmentation datasets.
arXiv Detail & Related papers (2023-07-27T03:42:12Z)
- SemAffiNet: Semantic-Affine Transformation for Point Cloud Segmentation [94.11915008006483]
We propose SemAffiNet for point cloud semantic segmentation.
We conduct extensive experiments on the ScanNetV2 and NYUv2 datasets.
arXiv Detail & Related papers (2022-05-26T17:00:23Z)
- Stratified Transformer for 3D Point Cloud Segmentation [89.9698499437732]
Stratified Transformer is able to capture long-range contexts and demonstrates strong generalization ability and high performance.
To combat the challenges posed by irregular point arrangements, we propose first-layer point embedding to aggregate local information.
Experiments demonstrate the effectiveness and superiority of our method on S3DIS, ScanNetv2 and ShapeNetPart datasets.
arXiv Detail & Related papers (2022-03-28T05:35:16Z)
- Two Heads are Better than One: Geometric-Latent Attention for Point Cloud Classification and Segmentation [10.2254921311882]
We present an innovative two-headed attention layer that combines geometric and latent features to segment a 3D scene into meaningful subsets.
Each head combines local and global information, using either the geometric or latent features, of a neighborhood of points and uses this information to learn better local relationships.
arXiv Detail & Related papers (2021-10-30T11:20:56Z)
- GSIP: Green Semantic Segmentation of Large-Scale Indoor Point Clouds [64.86292006892093]
GSIP (Green Segmentation of Indoor Point clouds) is an efficient solution to the semantic segmentation of large-scale indoor scene point clouds.
GSIP has two novel components: 1) a room-style data pre-processing method that selects a proper subset of points for further processing, and 2) a new feature extractor which is extended from PointHop.
Experiments show that GSIP outperforms PointNet in segmentation performance for the S3DIS dataset.
arXiv Detail & Related papers (2021-09-24T09:26:53Z)
- Learning Geometry-Disentangled Representation for Complementary Understanding of 3D Object Point Cloud [50.56461318879761]
We propose Geometry-Disentangled Attention Network (GDANet) for 3D image processing.
GDANet disentangles point clouds into contour and flat part of 3D objects, respectively denoted by sharp and gentle variation components.
Experiments on 3D object classification and segmentation benchmarks demonstrate that GDANet achieves state-of-the-art results with fewer parameters.
arXiv Detail & Related papers (2020-12-20T13:35:00Z)
- PointGroup: Dual-Set Point Grouping for 3D Instance Segmentation [111.7241018610573]
We present PointGroup, a new end-to-end bottom-up architecture for instance segmentation.
We design a two-branch network to extract point features and predict semantic labels and offsets, for shifting each point towards its respective instance centroid.
A clustering component is followed to utilize both the original and offset-shifted point coordinate sets, taking advantage of their complementary strength.
We conduct extensive experiments on two challenging datasets, ScanNet v2 and S3DIS, on which our method achieves the highest performance, 63.6% and 64.0%, compared to 54.9% and 54.4% achieved by the former best method.
arXiv Detail & Related papers (2020-04-03T16:26:37Z)
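PointGroup's offset-then-cluster idea can be illustrated with a toy NumPy sketch: each point is shifted by its predicted offset toward its instance centroid, and shifted points are then grouped by proximity. The radius-growing routine below is a simplified stand-in for the paper's clustering component, not its actual algorithm, and the radius value is an arbitrary assumption:

```python
import numpy as np

def cluster_shifted_points(coords, offsets, radius=0.3):
    """Toy sketch of offset-based grouping: shift each point by its
    predicted offset (ideally landing on its instance centroid), then
    grow groups over shifted points that lie within `radius` of each
    other (a simple single-linkage grouping)."""
    shifted = coords + offsets
    group = np.full(len(shifted), -1)
    gid = 0
    for i in range(len(shifted)):
        if group[i] != -1:
            continue
        # breadth-first growth of a new group seeded at point i
        group[i] = gid
        frontier = [i]
        while frontier:
            j = frontier.pop()
            d = np.linalg.norm(shifted - shifted[j], axis=1)
            for k in np.where((d < radius) & (group == -1))[0]:
                group[k] = gid
                frontier.append(int(k))
        gid += 1
    return group
```

Clustering shifted coordinates separates instances of the same class that are spatially adjacent; PointGroup additionally clusters the original coordinates and exploits the complementary strengths of both sets.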
This list is automatically generated from the titles and abstracts of the papers in this site.