Lateralization MLP: A Simple Brain-inspired Architecture for Diffusion
- URL: http://arxiv.org/abs/2405.16098v1
- Date: Sat, 25 May 2024 07:10:02 GMT
- Title: Lateralization MLP: A Simple Brain-inspired Architecture for Diffusion
- Authors: Zizhao Hu, Mohammad Rostami
- Abstract summary: Inspired by the lateralization of the human brain, we propose a new simple but effective architecture called the Lateralization MLP (L-MLP).
- Score: 20.437172251393257
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The Transformer architecture has dominated machine learning across a wide range of tasks. The defining characteristic of this architecture is an expensive scaled dot-product attention mechanism that models inter-token interactions and is known to be the reason behind its success. However, such a mechanism has no direct parallel in the human brain, which raises the question of whether scaled dot-product attention is necessary for intelligence with strong expressive power. Inspired by the lateralization of the human brain, we propose a new simple but effective architecture called the Lateralization MLP (L-MLP). Stacking L-MLP blocks can generate complex architectures. Each L-MLP block is based on a multi-layer perceptron (MLP) that permutes data dimensions, processes each dimension in parallel, merges them, and finally passes the result through a joint MLP. We discover that this specific design outperforms other MLP variants and performs comparably to a transformer-based architecture on the challenging diffusion task while being highly efficient. We conduct experiments on text-to-image generation tasks to demonstrate the effectiveness and efficiency of L-MLP. Further, we look into the model's behavior and discover a connection to the function of the human brain. Our code is publicly available: \url{https://github.com/zizhao-hu/L-MLP}
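The block description above maps onto a short PyTorch sketch. The following is a minimal illustration only: the (batch, tokens, channels) layout, the two parallel branches (one on the permuted token axis, one on the channel axis), the concatenation-based merge, the hidden widths, and the residual connection are assumptions made for readability rather than the authors' exact design; the linked repository contains the actual implementation.

```python
# Minimal sketch of an L-MLP-style block (assumed shapes and merge scheme).
import torch
import torch.nn as nn


class LMLPBlockSketch(nn.Module):
    """Permute, process each dimension in parallel, merge, then a joint MLP."""

    def __init__(self, num_tokens: int, dim: int, hidden: int = 256):
        super().__init__()
        # Branch operating on the permuted layout: mixes along the token axis.
        self.token_mlp = nn.Sequential(
            nn.Linear(num_tokens, hidden), nn.GELU(), nn.Linear(hidden, num_tokens)
        )
        # Branch operating on the original layout: mixes along the channel axis.
        self.channel_mlp = nn.Sequential(
            nn.Linear(dim, hidden), nn.GELU(), nn.Linear(hidden, dim)
        )
        # Joint MLP applied after the two branches are merged by concatenation.
        self.joint_mlp = nn.Sequential(
            nn.Linear(2 * dim, hidden), nn.GELU(), nn.Linear(hidden, dim)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, tokens, channels)
        token_out = self.token_mlp(x.transpose(1, 2)).transpose(1, 2)  # permute, mix tokens, permute back
        channel_out = self.channel_mlp(x)                              # mix channels in parallel
        merged = torch.cat([token_out, channel_out], dim=-1)           # merge the two branches
        return x + self.joint_mlp(merged)                              # joint MLP with a residual


if __name__ == "__main__":
    block = LMLPBlockSketch(num_tokens=64, dim=128)
    print(block(torch.randn(2, 64, 128)).shape)  # torch.Size([2, 64, 128])
```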
Related papers
- X-MLP: A Patch Embedding-Free MLP Architecture for Vision [4.493200639605705]
Multi-layer perceptron (MLP) architectures for vision have recently become popular again.
We propose X-MLP, an architecture constructed entirely from fully connected layers and free of patch embedding.
X-MLP is tested on ten benchmark datasets, achieving better performance than other vision models on all of them.
arXiv Detail & Related papers (2023-07-02T15:20:25Z) - Equivariant Architectures for Learning in Deep Weight Spaces [54.61765488960555]
We present a novel network architecture for learning in deep weight spaces.
It takes as input a concatenation of the weights and biases of a pre-trained network.
We show how these layers can be implemented using three basic operations.
arXiv Detail & Related papers (2023-01-30T10:50:33Z) - R2-MLP: Round-Roll MLP for Multi-View 3D Object Recognition [33.53114929452528]
Vision architectures based exclusively on multi-layer perceptrons (MLPs) have gained much attention in the computer vision community.
We present an MLP-like architecture that performs view-based 3D object recognition by considering the communication between patches from different views.
With a conceptually simple structure, our R$^2$-MLP achieves competitive performance compared with existing methods.
arXiv Detail & Related papers (2022-11-20T21:13:02Z) - GraphMLP: A Graph MLP-Like Architecture for 3D Human Pose Estimation [68.65764751482774]
GraphMLP is a global-local-graphical unified architecture for 3D human pose estimation.
It incorporates the graph structure of human bodies into a model to meet the domain-specific demand of the 3D human pose.
It can be extended to model complex temporal dynamics in a simple way, with a negligible increase in computational cost with respect to sequence length.
arXiv Detail & Related papers (2022-06-13T18:59:31Z) - MLP-3D: A MLP-like 3D Architecture with Grouped Time Mixing [123.43419144051703]
We present a novel MLP-like 3D architecture for video recognition.
The results are comparable to those of state-of-the-art, widely-used 3D CNNs and video transformers.
arXiv Detail & Related papers (2022-06-13T16:21:33Z) - MDMLP: Image Classification from Scratch on Small Datasets with MLP [7.672827879118106]
Recently, the attention mechanism has become a go-to technique for natural language processing and computer vision tasks.
Recently, the MLP-Mixer and other MLP-based architectures, built simply on multi-layer perceptrons (MLPs), have also proven powerful compared to CNNs and attention techniques.
arXiv Detail & Related papers (2022-05-28T16:26:59Z) - Efficient Language Modeling with Sparse all-MLP [53.81435968051093]
All-MLPs can match Transformers in language modeling, but still lag behind in downstream tasks.
We propose sparse all-MLPs with mixture-of-experts (MoEs) in both the feature and input (token) dimensions.
We evaluate its zero-shot in-context learning performance on six downstream tasks, and find that it surpasses Transformer-based MoEs and dense Transformers.
arXiv Detail & Related papers (2022-03-14T04:32:19Z) - RepMLPNet: Hierarchical Vision MLP with Re-parameterized Locality [113.1414517605892]
We propose a methodology, Locality Injection, to incorporate local priors into an FC layer.
RepMLPNet is the first MLP model that seamlessly transfers to Cityscapes semantic segmentation.
arXiv Detail & Related papers (2021-12-21T10:28:17Z) - ConvMLP: Hierarchical Convolutional MLPs for Vision [7.874749885641495]
We propose ConvMLP: a hierarchical, light-weight, stage-wise co-design of convolution layers and MLPs for visual recognition.
We show that ConvMLP can be seamlessly transferred, achieving competitive results with fewer parameters.
arXiv Detail & Related papers (2021-09-09T17:52:57Z) - AS-MLP: An Axial Shifted MLP Architecture for Vision [50.11765148947432]
An Axial Shifted MLP architecture (AS-MLP) is proposed in this paper.
By axially shifting channels of the feature map, AS-MLP obtains information flow from different directions (a minimal sketch of this axial shift appears after this list).
With the proposed AS-MLP architecture, our model obtains 83.3% Top-1 accuracy with 88M parameters and 15.2 GFLOPs on the ImageNet-1K dataset.
arXiv Detail & Related papers (2021-07-18T08:56:34Z)
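As referenced in the AS-MLP entry above, the axial-shift operation admits a short PyTorch sketch. This illustrates the idea only: the channel grouping, the shift range, and the zero-padding of wrapped positions are assumptions, not the AS-MLP authors' exact implementation.

```python
# Minimal sketch of an axial shift on a (batch, channels, height, width) map.
import torch


def axial_shift(x: torch.Tensor, shift: int = 1, dim: int = 2) -> torch.Tensor:
    """Split channels into groups and shift each group along one spatial axis.

    Groups receive offsets -shift..+shift, so a subsequent channel-mixing MLP
    sees features from neighboring spatial positions along that axis.
    """
    offsets = list(range(-shift, shift + 1))        # e.g. [-1, 0, +1]
    chunks = torch.chunk(x, len(offsets), dim=1)    # channel groups
    shifted = []
    for chunk, off in zip(chunks, offsets):
        rolled = torch.roll(chunk, shifts=off, dims=dim)
        # Zero out positions that wrapped around (a simple padding choice).
        if off > 0:
            rolled.narrow(dim, 0, off).zero_()
        elif off < 0:
            rolled.narrow(dim, rolled.size(dim) + off, -off).zero_()
        shifted.append(rolled)
    return torch.cat(shifted, dim=1)


if __name__ == "__main__":
    x = torch.randn(2, 6, 8, 8)
    print(axial_shift(x, dim=2).shape, axial_shift(x, dim=3).shape)  # height / width shifts
```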