Lateralization MLP: A Simple Brain-inspired Architecture for Diffusion
- URL: http://arxiv.org/abs/2405.16098v1
- Date: Sat, 25 May 2024 07:10:02 GMT
- Title: Lateralization MLP: A Simple Brain-inspired Architecture for Diffusion
- Authors: Zizhao Hu, Mohammad Rostami
- Abstract summary: Inspired by the lateralization of the human brain, we propose a new simple but effective architecture called the Lateralization MLP (L-MLP).
- Score: 20.437172251393257
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The Transformer architecture has dominated machine learning across a wide range of tasks. The defining characteristic of this architecture is an expensive scaled dot-product attention mechanism that models inter-token interactions and is known to be the reason behind its success. However, such a mechanism has no direct parallel in the human brain, which raises the question of whether scaled dot-product attention is necessary for intelligence with strong expressive power. Inspired by the lateralization of the human brain, we propose a new simple but effective architecture called the Lateralization MLP (L-MLP). Stacking L-MLP blocks can generate complex architectures. Each L-MLP block is based on a multi-layer perceptron (MLP) that permutes data dimensions, processes each dimension in parallel, merges them, and finally passes the result through a joint MLP. We discover that this specific design outperforms other MLP variants and performs comparably to a transformer-based architecture on the challenging diffusion task while being highly efficient. We conduct experiments on text-to-image generation tasks to demonstrate the effectiveness and efficiency of L-MLP. Further, we look into the model's behavior and discover a connection to the function of the human brain. Our code is publicly available: \url{https://github.com/zizhao-hu/L-MLP}
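The block description above maps onto a short PyTorch sketch. The following is a minimal illustration only: the (batch, tokens, channels) layout, the two parallel branches (one on the permuted token axis, one on the channel axis), the concatenation-based merge, the hidden widths, and the residual connection are assumptions made for readability rather than the authors' exact design; the linked repository contains the actual implementation.

```python
# Minimal sketch of an L-MLP-style block (assumed shapes and merge scheme).
import torch
import torch.nn as nn


class LMLPBlockSketch(nn.Module):
    """Permute, process each dimension in parallel, merge, then a joint MLP."""

    def __init__(self, num_tokens: int, dim: int, hidden: int = 256):
        super().__init__()
        # Branch operating on the permuted layout: mixes along the token axis.
        self.token_mlp = nn.Sequential(
            nn.Linear(num_tokens, hidden), nn.GELU(), nn.Linear(hidden, num_tokens)
        )
        # Branch operating on the original layout: mixes along the channel axis.
        self.channel_mlp = nn.Sequential(
            nn.Linear(dim, hidden), nn.GELU(), nn.Linear(hidden, dim)
        )
        # Joint MLP applied after the two branches are merged by concatenation.
        self.joint_mlp = nn.Sequential(
            nn.Linear(2 * dim, hidden), nn.GELU(), nn.Linear(hidden, dim)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, tokens, channels)
        token_out = self.token_mlp(x.transpose(1, 2)).transpose(1, 2)  # permute, mix tokens, permute back
        channel_out = self.channel_mlp(x)                              # mix channels in parallel
        merged = torch.cat([token_out, channel_out], dim=-1)           # merge the two branches
        return x + self.joint_mlp(merged)                              # joint MLP with a residual


if __name__ == "__main__":
    block = LMLPBlockSketch(num_tokens=64, dim=128)
    print(block(torch.randn(2, 64, 128)).shape)  # torch.Size([2, 64, 128])
```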
Related papers
- X-MLP: A Patch Embedding-Free MLP Architecture for Vision [4.493200639605705]
Multi-layer perceptron (MLP) architectures for vision have recently become popular again.
We propose X-MLP, an architecture constructed entirely from fully connected layers and free of patch embedding.
X-MLP is tested on ten benchmark datasets, achieving better performance than other vision models on all of them.
arXiv Detail & Related papers (2023-07-02T15:20:25Z) - Equivariant Architectures for Learning in Deep Weight Spaces [54.61765488960555]
We present a novel network architecture for learning in deep weight spaces.
It takes as input a concatenation of the weights and biases of a pre-trained network.
We show how these layers can be implemented using three basic operations.
arXiv Detail & Related papers (2023-01-30T10:50:33Z) - R2-MLP: Round-Roll MLP for Multi-View 3D Object Recognition [33.53114929452528]
Vision architectures based exclusively on multi-layer perceptrons (MLPs) have gained much attention in the computer vision community.
We present an MLP-like architecture that performs view-based 3D object recognition by considering the communication between patches from different views.
With a conceptually simple structure, our R$^2$-MLP achieves competitive performance compared with existing methods.
arXiv Detail & Related papers (2022-11-20T21:13:02Z) - GraphMLP: A Graph MLP-Like Architecture for 3D Human Pose Estimation [68.65764751482774]
GraphMLP is a global-local-graphical unified architecture for 3D human pose estimation.
It incorporates the graph structure of human bodies into a model to meet the domain-specific demand of the 3D human pose.
It can be extended to model complex temporal dynamics in a simple way, with a negligible increase in computational cost with respect to sequence length.
arXiv Detail & Related papers (2022-06-13T18:59:31Z) - MLP-3D: A MLP-like 3D Architecture with Grouped Time Mixing [123.43419144051703]
We present a novel MLP-like 3D architecture for video recognition.
The results are comparable to those of state-of-the-art, widely-used 3D CNNs and video transformers.
arXiv Detail & Related papers (2022-06-13T16:21:33Z) - MDMLP: Image Classification from Scratch on Small Datasets with MLP [7.672827879118106]
Recently, the attention mechanism has become a go-to technique for natural language processing and computer vision tasks.
Recently, the MLP-Mixer and other MLP-based architectures, built simply on multi-layer perceptrons (MLPs), have also proven powerful compared to CNNs and attention techniques.
arXiv Detail & Related papers (2022-05-28T16:26:59Z) - Efficient Language Modeling with Sparse all-MLP [53.81435968051093]
All-MLPs can match Transformers in language modeling, but still lag behind in downstream tasks.
We propose sparse all-MLPs with mixture-of-experts (MoEs) in both the feature and input (token) dimensions.
We evaluate its zero-shot in-context learning performance on six downstream tasks, and find that it surpasses Transformer-based MoEs and dense Transformers.
arXiv Detail & Related papers (2022-03-14T04:32:19Z) - RepMLPNet: Hierarchical Vision MLP with Re-parameterized Locality [113.1414517605892]
We propose a methodology, Locality Injection, to incorporate local priors into an FC layer.
RepMLPNet is the first MLP model that seamlessly transfers to Cityscapes semantic segmentation.
arXiv Detail & Related papers (2021-12-21T10:28:17Z) - ConvMLP: Hierarchical Convolutional MLPs for Vision [7.874749885641495]
We propose ConvMLP: a hierarchical, light-weight, stage-wise co-design of convolution layers and MLPs for visual recognition.
We show that ConvMLP can be seamlessly transferred, achieving competitive results with fewer parameters.
arXiv Detail & Related papers (2021-09-09T17:52:57Z) - AS-MLP: An Axial Shifted MLP Architecture for Vision [50.11765148947432]
An Axial Shifted MLP architecture (AS-MLP) is proposed in this paper.
By axially shifting channels of the feature map, AS-MLP obtains information flow from different directions (a minimal sketch of this axial shift appears after this list).
With the proposed AS-MLP architecture, our model obtains 83.3% Top-1 accuracy with 88M parameters and 15.2 GFLOPs on the ImageNet-1K dataset.
arXiv Detail & Related papers (2021-07-18T08:56:34Z)
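As referenced in the AS-MLP entry above, the axial-shift operation admits a short PyTorch sketch. This illustrates the idea only: the channel grouping, the shift range, and the zero-padding of wrapped positions are assumptions, not the AS-MLP authors' exact implementation.

```python
# Minimal sketch of an axial shift on a (batch, channels, height, width) map.
import torch


def axial_shift(x: torch.Tensor, shift: int = 1, dim: int = 2) -> torch.Tensor:
    """Split channels into groups and shift each group along one spatial axis.

    Groups receive offsets -shift..+shift, so a subsequent channel-mixing MLP
    sees features from neighboring spatial positions along that axis.
    """
    offsets = list(range(-shift, shift + 1))        # e.g. [-1, 0, +1]
    chunks = torch.chunk(x, len(offsets), dim=1)    # channel groups
    shifted = []
    for chunk, off in zip(chunks, offsets):
        rolled = torch.roll(chunk, shifts=off, dims=dim)
        # Zero out positions that wrapped around (a simple padding choice).
        if off > 0:
            rolled.narrow(dim, 0, off).zero_()
        elif off < 0:
            rolled.narrow(dim, rolled.size(dim) + off, -off).zero_()
        shifted.append(rolled)
    return torch.cat(shifted, dim=1)


if __name__ == "__main__":
    x = torch.randn(2, 6, 8, 8)
    print(axial_shift(x, dim=2).shape, axial_shift(x, dim=3).shape)  # height / width shifts
```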