PointMixer: MLP-Mixer for Point Cloud Understanding
- URL: http://arxiv.org/abs/2111.11187v1
- Date: Mon, 22 Nov 2021 13:25:54 GMT
- Title: PointMixer: MLP-Mixer for Point Cloud Understanding
- Authors: Jaesung Choe, Chunghyun Park, Francois Rameau, Jaesik Park, In So Kweon
- Abstract summary: The concept of channel-mixing and token-mixing MLPs achieves noticeable performance in visual recognition tasks.
Unlike images, point clouds are inherently sparse, unordered and irregular, which limits the direct use of MLP-Mixer for point cloud understanding.
We propose PointMixer, a universal point set operator that facilitates information sharing among unstructured 3D points.
- Score: 74.694733918351
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: MLP-Mixer has recently emerged as a new challenger to CNNs and
transformers. Despite its simplicity compared to transformers, the concept of
channel-mixing MLPs and token-mixing MLPs achieves noticeable performance in
visual recognition tasks. Unlike images, point clouds are inherently sparse,
unordered and irregular, which limits the direct use of MLP-Mixer for point
cloud understanding. In this paper, we propose PointMixer, a universal point
set operator that facilitates information sharing among unstructured 3D points.
By simply replacing token-mixing MLPs with a softmax function, PointMixer can
"mix" features within/between point sets. By doing so, PointMixer can be
broadly used in the network as inter-set mixing, intra-set mixing, and pyramid
mixing. Extensive experiments show the competitive or superior performance of
PointMixer in semantic segmentation, classification, and point reconstruction
against transformer-based methods.
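The abstract's core idea, replacing the token-mixing MLP with a softmax over each point's neighbourhood, can be illustrated with a minimal PyTorch sketch. This is not the authors' implementation; the k-NN grouping, the layer names, and the feature-only scoring (the paper also conditions on relative positions) are simplifying assumptions.

```python
# Minimal sketch of PointMixer-style intra-set mixing (assumptions noted above).
import torch
import torch.nn as nn

class IntraSetMixing(nn.Module):
    """Mix features within each point's k-NN set; hypothetical layer names."""
    def __init__(self, channels: int, k: int = 16):
        super().__init__()
        self.k = k
        # Scores one mixing weight per neighbour from its feature vector.
        self.score_mlp = nn.Sequential(
            nn.Linear(channels, channels), nn.ReLU(), nn.Linear(channels, 1)
        )
        # Standard channel-mixing MLP, applied point-wise.
        self.channel_mlp = nn.Sequential(
            nn.Linear(channels, channels), nn.ReLU(), nn.Linear(channels, channels)
        )

    def forward(self, xyz: torch.Tensor, feats: torch.Tensor) -> torch.Tensor:
        # xyz: (N, 3) coordinates, feats: (N, C) features.
        dists = torch.cdist(xyz, xyz)                          # (N, N)
        knn_idx = dists.topk(self.k, largest=False).indices    # (N, k)
        neigh = feats[knn_idx]                                 # (N, k, C)
        # Softmax over the k neighbours replaces the token-mixing MLP,
        # so the mixing is permutation-invariant and set-size-agnostic.
        weights = torch.softmax(self.score_mlp(neigh), dim=1)  # (N, k, 1)
        mixed = (weights * neigh).sum(dim=1)                   # (N, C)
        return self.channel_mlp(mixed)

# Usage on a toy cloud:
# layer = IntraSetMixing(64)
# out = layer(torch.rand(1024, 3), torch.rand(1024, 64))  # -> (1024, 64)
```

Because the softmax aggregates an unordered, variable-size neighbour set, the same operator can serve the inter-set, intra-set, and pyramid mixing roles mentioned above, which a fixed-size token-mixing MLP cannot.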
Related papers
- Mixer is more than just a model [23.309064032922507]
This study focuses on the domain of audio recognition, introducing a novel model named Audio Spectrogram Mixer with Roll-Time and Hermit FFT (ASM-RH).
Experimental results demonstrate that ASM-RH is particularly well-suited for audio data and yields promising outcomes across multiple classification tasks.
arXiv Detail & Related papers (2024-02-28T02:45:58Z)
- iMixer: hierarchical Hopfield network implies an invertible, implicit and iterative MLP-Mixer [2.5782420501870296]
We generalize studies on Hopfield networks and Transformer-like architecture to iMixer.
iMixer involves MLPs that propagate forward from the output side to the input side.
We evaluate the model performance with various datasets on image classification tasks.
The results imply that the correspondence between the Hopfield networks and the Mixer models serves as a principle for understanding a broader class of Transformer-like architecture designs.
arXiv Detail & Related papers (2023-04-25T18:00:08Z)
- SMMix: Self-Motivated Image Mixing for Vision Transformers [65.809376136455]
CutMix is a vital augmentation strategy that determines the performance and generalization ability of vision transformers (ViTs).
Existing CutMix variants tackle the inconsistency between mixed images and their labels by generating more consistent mixed images or more precise mixed labels.
We propose an efficient and effective Self-Motivated image Mixing method (SMMix), which motivates both image and label enhancement by the model under training itself.
arXiv Detail & Related papers (2022-12-26T00:19:39Z)
- OAMixer: Object-aware Mixing Layer for Vision Transformers [73.10651373341933]
We propose OAMixer, which calibrates the patch mixing layers of patch-based models based on the object labels.
By learning an object-centric representation, we demonstrate that OAMixer improves the classification accuracy and background robustness of various patch-based models.
arXiv Detail & Related papers (2022-12-13T14:14:48Z)
- SplitMixer: Fat Trimmed From MLP-like Models [53.12472550578278]
We present SplitMixer, a simple and lightweight isotropic MLP-like architecture for visual recognition.
It contains two types of interleaving convolutional operations to mix information across locations (spatial mixing) and channels (channel mixing).
arXiv Detail & Related papers (2022-07-21T01:37:07Z)
- Boosting Adversarial Transferability of MLP-Mixer [9.957957463532738]
We propose an adversarial attack method against MLP-Mixer called Maxwell's demon Attack (MA).
Our method can be easily combined with existing methods and can improve the transferability by up to 38.0% on ResMLP.
To the best of our knowledge, ours is the first work to study the adversarial transferability of MLP-Mixer.
arXiv Detail & Related papers (2022-04-26T10:18:59Z)
- S$^2$-MLP: Spatial-Shift MLP Architecture for Vision [34.47616917228978]
Recently, visual Transformer (ViT) and its following works abandon the convolution and exploit the self-attention operation.
In this paper, we propose a novel pure MLP architecture, spatial-shift MLP (S$^2$-MLP).
arXiv Detail & Related papers (2021-06-14T15:05:11Z)
- MLP-Mixer: An all-MLP Architecture for Vision [93.16118698071993]
We present MLP-Mixer, an architecture based exclusively on multi-layer perceptrons (MLPs).
Mixer attains competitive scores on image classification benchmarks, with pre-training and inference cost comparable to state-of-the-art models.
arXiv Detail & Related papers (2021-05-04T16:17:21Z)
- PointMixup: Augmentation for Point Clouds [65.61212404598524]
We introduce PointMixup, a method that generates new examples through an optimal assignment of the path function between two point clouds.
We show the potential of PointMixup for point cloud classification, especially when examples are scarce (a toy sketch of the interpolation follows this entry).
arXiv Detail & Related papers (2020-08-14T13:57:20Z)
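The PointMixup summary above describes interpolation via an optimal assignment between two clouds. A toy NumPy/SciPy sketch of that reading follows; `point_mixup` is a hypothetical helper, not the released code, and the Hungarian assignment on squared distances is an assumption standing in for the paper's assignment step.

```python
# Toy sketch of PointMixup-style interpolation (assumptions noted above).
import numpy as np
from scipy.optimize import linear_sum_assignment

def point_mixup(a: np.ndarray, b: np.ndarray, lam: float) -> np.ndarray:
    """a, b: (N, 3) point clouds; lam in [0, 1] sets the mixing ratio."""
    # Pairwise squared distances, then a one-to-one optimal assignment.
    cost = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)  # (N, N)
    rows, cols = linear_sum_assignment(cost)
    # Move each point a fraction lam along the straight path to its partner.
    return (1.0 - lam) * a[rows] + lam * b[cols]

# Usage: mixed = point_mixup(np.random.rand(256, 3), np.random.rand(256, 3), 0.3)
```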