SA-MLP: Enhancing Point Cloud Classification with Efficient Addition and Shift Operations in MLP Architectures
- URL: http://arxiv.org/abs/2409.01998v1
- Date: Tue, 3 Sep 2024 15:43:44 GMT
- Title: SA-MLP: Enhancing Point Cloud Classification with Efficient Addition and Shift Operations in MLP Architectures
- Authors: Qiang Zheng, Chao Zhang, Jian Sun,
- Abstract summary: Traditional neural networks heavily rely on multiplication operations, which are computationally expensive.
We propose Add-MLP and Shift-MLP, which replace multiplications with addition and shift operations, respectively, significantly enhancing computational efficiency.
This study offers an efficient and effective solution for point cloud classification, balancing performance with computational efficiency.
- Score: 46.266960248570086
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This study addresses the computational inefficiencies in point cloud classification by introducing novel MLP-based architectures inspired by recent advances in CNN optimization. Traditional neural networks heavily rely on multiplication operations, which are computationally expensive. To tackle this, we propose Add-MLP and Shift-MLP, which replace multiplications with addition and shift operations, respectively, significantly enhancing computational efficiency. Building on this, we introduce SA-MLP, a hybrid model that intermixes alternately distributed shift and adder layers to replace MLP layers, maintaining the original number of layers without freezing shift layer weights. This design contrasts with the ShiftAddNet model from previous literature, which replaces convolutional layers with shift and adder layers, leading to a doubling of the number of layers and limited representational capacity due to frozen shift weights. Moreover, SA-MLP optimizes learning by setting distinct learning rates and optimizers specifically for the adder and shift layers, fully leveraging their complementary strengths. Extensive experiments demonstrate that while Add-MLP and Shift-MLP achieve competitive performance, SA-MLP significantly surpasses the multiplication-based baseline MLP model and achieves performance comparable to state-of-the-art MLP-based models. This study offers an efficient and effective solution for point cloud classification, balancing performance with computational efficiency.
Related papers
- A deeper look at depth pruning of LLMs [49.30061112976263]
Large Language Models (LLMs) are resource-intensive to train but more costly to deploy in production.
Recent work has attempted to prune blocks of LLMs based on cheap proxies for estimating block importance.
We show that adaptive metrics exhibit a trade-off in performance between tasks.
arXiv Detail & Related papers (2024-07-23T08:40:27Z) - MLP Can Be A Good Transformer Learner [73.01739251050076]
Self-attention mechanism is the key of the Transformer but often criticized for its computation demands.
This paper introduces a novel strategy that simplifies vision transformers and reduces computational load through the selective removal of non-essential attention layers.
arXiv Detail & Related papers (2024-04-08T16:40:15Z) - NTK-approximating MLP Fusion for Efficient Language Model Fine-tuning [40.994306592119266]
Fine-tuning a pre-trained language model (PLM) emerges as the predominant strategy in many natural language processing applications.
Some general approaches (e.g. quantization and distillation) have been widely studied to reduce the compute/memory of PLM fine-tuning.
We propose to coin a lightweight PLM through NTK-approximating modules in fusion.
arXiv Detail & Related papers (2023-07-18T03:12:51Z) - Caterpillar: A Pure-MLP Architecture with Shifted-Pillars-Concatenation [68.24659910441736]
Shifted-Pillars-Concatenation (SPC) module offers superior local modeling power and performance gains.
We build a pure-MLP architecture called Caterpillar by replacing the convolutional layer with the SPC module in a hybrid model of sMLPNet.
Experiments show Caterpillar's excellent performance on both small-scale and ImageNet-1k classification benchmarks.
arXiv Detail & Related papers (2023-05-28T06:19:36Z) - Efficient Language Modeling with Sparse all-MLP [53.81435968051093]
All-MLPs can match Transformers in language modeling, but still lag behind in downstream tasks.
We propose sparse all-MLPs with mixture-of-experts (MoEs) in both feature and input (tokens)
We evaluate its zero-shot in-context learning performance on six downstream tasks, and find that it surpasses Transformer-based MoEs and dense Transformers.
arXiv Detail & Related papers (2022-03-14T04:32:19Z) - Mixing and Shifting: Exploiting Global and Local Dependencies in Vision
MLPs [84.3235981545673]
Token-mixing multi-layer perceptron (MLP) models have shown competitive performance in computer vision tasks.
We present Mix-Shift-MLP which makes the size of the local receptive field used for mixing increase with respect to the amount of spatial shifting.
MS-MLP achieves competitive performance in multiple vision benchmarks.
arXiv Detail & Related papers (2022-02-14T06:53:48Z) - Using Fitness Dependent Optimizer for Training Multi-layer Perceptron [13.280383503879158]
This study presents a novel training algorithm depending upon the recently proposed Fitness Dependent (FDO)
The stability of this algorithm has been verified and performance-proofed in both the exploration and exploitation stages.
The proposed approach using FDO as a trainer can outperform the other approaches using different trainers on the dataset.
arXiv Detail & Related papers (2022-01-03T10:23:17Z) - Sparse-MLP: A Fully-MLP Architecture with Conditional Computation [7.901786481399378]
Mixture-of-Experts (MoE) with sparse conditional computation has been proved an effective architecture for scaling attention-based models to more parameters with comparable computation cost.
We propose Sparse-MLP, scaling the recent-Mixer model with MoE, to achieve a more-efficient architecture.
arXiv Detail & Related papers (2021-09-05T06:43:08Z) - MOI-Mixer: Improving MLP-Mixer with Multi Order Interactions in
Sequential Recommendation [40.20599070308035]
Transformer-based models require quadratic memory and time complexity to the sequence length, making it difficult to extract the long-term interest of users.
MLP-based models, renowned for their linear memory and time complexity, have recently shown competitive results compared to Transformer in various tasks.
We propose the Multi-Order Interaction layer, which is capable of expressing an arbitrary order of interactions while maintaining the memory and time complexity of the layer.
arXiv Detail & Related papers (2021-08-17T08:38:49Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.