PatchMixer: Rethinking network design to boost generalization for 3D
point cloud understanding
- URL: http://arxiv.org/abs/2307.15692v1
- Date: Fri, 28 Jul 2023 17:37:53 GMT
- Title: PatchMixer: Rethinking network design to boost generalization for 3D
point cloud understanding
- Authors: Davide Boscaini, Fabio Poiesi
- Abstract summary: We argue that the ability of a model to transfer the learnt knowledge to different domains is an important feature that should be evaluated to exhaustively assess the quality of a deep network architecture.
In this work we propose PatchMixer, a simple yet effective architecture that extends the ideas behind the recent MLP-Mixer paper to 3D point clouds.
- Score: 2.512827436728378
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The recent trend in deep learning methods for 3D point cloud understanding is
to propose increasingly sophisticated architectures, either to better capture 3D
geometries or to introduce possibly undesired inductive biases. Moreover,
prior works introducing novel architectures compared their performance on the
same domain, devoting less attention to their generalization to other domains.
We argue that the ability of a model to transfer the learnt knowledge to
different domains is an important feature that should be evaluated to
exhaustively assess the quality of a deep network architecture. In this work we
propose PatchMixer, a simple yet effective architecture that extends the ideas
behind the recent MLP-Mixer paper to 3D point clouds. The novelties of our
approach are the processing of local patches instead of the whole shape to
promote robustness to partial point clouds, and the aggregation of patch-wise
features using an MLP as a simpler alternative to the graph convolutions or the
attention mechanisms that are used in prior works. We evaluated our method on
the shape classification and part segmentation tasks, achieving superior
generalization performance compared to a selection of the most relevant deep
architectures.
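The abstract describes two key ideas: encoding local patches rather than the whole shape, and aggregating patch-wise features with a plain MLP instead of graph convolutions or attention. The following is a minimal numpy sketch of that pipeline under stated assumptions; it is an illustration, not the authors' implementation. The patch grouping (random centers plus k nearest neighbours), the layer sizes, and the PointNet-style max-pool patch encoder are all assumptions chosen for brevity.

```python
import numpy as np

rng = np.random.default_rng(0)

def knn_patches(points, num_patches, patch_size):
    """Group a point cloud into local patches around sampled centers.
    (Illustrative: random center sampling stands in for farthest-point sampling.)"""
    centers = points[rng.choice(len(points), num_patches, replace=False)]
    d = np.linalg.norm(points[None, :, :] - centers[:, None, :], axis=-1)
    idx = np.argsort(d, axis=1)[:, :patch_size]         # k nearest neighbours per center
    return points[idx] - centers[:, None, :]            # center-normalized patches

def mlp(x, w1, w2):
    """Two-layer MLP with ReLU, applied along the last axis."""
    return np.maximum(x @ w1, 0.0) @ w2

# Toy shapes: a 256-point cloud -> 16 patches of 32 points -> 64-d patch features.
points = rng.normal(size=(256, 3))
patches = knn_patches(points, num_patches=16, patch_size=32)   # (16, 32, 3)

# Per-point MLP + max-pool inside each patch (PointNet-style patch encoder).
w1, w2 = rng.normal(size=(3, 64)), rng.normal(size=(64, 64))
feats = mlp(patches, w1, w2).max(axis=1)                       # (16, 64)

# MLP-Mixer-style aggregation: a token-mixing MLP applied across patches,
# standing in for the graph convolutions or attention used in prior works.
t1, t2 = rng.normal(size=(16, 32)), rng.normal(size=(32, 16))
mixed = mlp(feats.T, t1, t2).T + feats                         # (16, 64), residual

global_feat = mixed.max(axis=0)                                # (64,) shape descriptor
```

Because each patch is normalized to its own center, a missing region of the shape only removes some patch tokens rather than distorting a global encoding, which is the intuition behind the claimed robustness to partial point clouds.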
Related papers
- Boosting Cross-Domain Point Classification via Distilling Relational Priors from 2D Transformers [59.0181939916084]
Traditional 3D networks mainly focus on local geometric details and ignore the topological structure between local geometries.
We propose a novel Priors Distillation (RPD) method to extract priors from the well-trained transformers on massive images.
Experiments on the PointDA-10 and the Sim-to-Real datasets verify that the proposed method consistently achieves the state-of-the-art performance of UDA for point cloud classification.
arXiv Detail & Related papers (2024-07-26T06:29:09Z) - GeoMask3D: Geometrically Informed Mask Selection for Self-Supervised Point Cloud Learning in 3D [18.33878596057853]
We introduce a pioneering approach to self-supervised learning for point clouds.
We employ a geometrically informed mask selection strategy called GeoMask3D (GM3D) to boost the efficiency of Masked Autoencoders.
arXiv Detail & Related papers (2024-05-20T23:53:42Z) - Towards Compact 3D Representations via Point Feature Enhancement Masked
Autoencoders [52.66195794216989]
We propose Point Feature Enhancement Masked Autoencoders (Point-FEMAE) to learn compact 3D representations.
Point-FEMAE consists of a global branch and a local branch to capture latent semantic features.
Our method significantly improves the pre-training efficiency compared to cross-modal alternatives.
arXiv Detail & Related papers (2023-12-17T14:17:05Z) - Human as Points: Explicit Point-based 3D Human Reconstruction from
Single-view RGB Images [78.56114271538061]
We introduce an explicit point-based human reconstruction framework called HaP.
Our approach is featured by fully-explicit point cloud estimation, manipulation, generation, and refinement in the 3D geometric space.
Our results may indicate a paradigm rollback to the fully-explicit and geometry-centric algorithm design.
arXiv Detail & Related papers (2023-11-06T05:52:29Z) - Patch-Wise Point Cloud Generation: A Divide-and-Conquer Approach [83.05340155068721]
We devise a new 3D point cloud generation framework using a divide-and-conquer approach.
All patch generators are based on learnable priors, which aim to capture the information of geometry primitives.
Experimental results on a variety of object categories from the most popular point cloud dataset, ShapeNet, show the effectiveness of the proposed patch-wise point cloud generation.
arXiv Detail & Related papers (2023-07-22T11:10:39Z) - Small but Mighty: Enhancing 3D Point Clouds Semantic Segmentation with
U-Next Framework [7.9395601503353825]
We propose U-Next, a small but mighty framework designed for point cloud semantic segmentation.
We build our U-Next by stacking multiple U-Net $L^1$ codecs in a nested and densely arranged manner to minimize the semantic gap.
Extensive experiments conducted on three large-scale benchmarks including S3DIS, Toronto3D, and SensatUrban demonstrate the superiority and the effectiveness of the proposed U-Next architecture.
arXiv Detail & Related papers (2023-04-03T06:59:08Z) - Quality evaluation of point clouds: a novel no-reference approach using
transformer-based architecture [11.515951211296361]
We propose a novel no-reference quality metric that operates directly on the whole point cloud without requiring extensive pre-processing.
We use a novel model design consisting primarily of cross and self-attention layers, in order to learn the best set of local semantic affinities.
arXiv Detail & Related papers (2023-03-15T14:01:12Z) - PointAttN: You Only Need Attention for Point Cloud Completion [89.88766317412052]
Point cloud completion refers to completing 3D shapes from partial 3D point clouds.
We propose a novel neural network for processing point clouds in a per-point manner, eliminating the need for kNN operations.
The proposed framework, namely PointAttN, is simple, neat and effective, which can precisely capture the structural information of 3D shapes.
arXiv Detail & Related papers (2022-03-16T09:20:01Z) - Revisiting Point Cloud Simplification: A Learnable Feature Preserving
Approach [57.67932970472768]
Mesh and Point Cloud simplification methods aim to reduce the complexity of 3D models while retaining visual quality and relevant salient features.
We propose a fast point cloud simplification method by learning to sample salient points.
The proposed method relies on a graph neural network architecture trained to select an arbitrary, user-defined, number of points from the input space and to re-arrange their positions so as to minimize the visual perception error.
arXiv Detail & Related papers (2021-09-30T10:23:55Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences.