Convolutional Gated MLP: Combining Convolutions & gMLP
- URL: http://arxiv.org/abs/2111.03940v1
- Date: Sat, 6 Nov 2021 19:11:24 GMT
- Title: Convolutional Gated MLP: Combining Convolutions & gMLP
- Authors: A. Rajagopal, V. Nirmala
- Abstract summary: This paper introduces Convolutions to Gated MultiLayer Perceptron.
Google Brain introduced the gMLP in May 2021; Microsoft introduced Convolutions in Vision Transformer in Mar 2021.
Inspired by both gMLP and CvT, we introduce convolutional layers in gMLP.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: To the best of our knowledge, this is the first paper to introduce
Convolutions to the Gated MultiLayer Perceptron, and it contributes an
implementation of this novel Deep Learning architecture. Google Brain introduced
the gMLP in May 2021. Microsoft introduced Convolutions in the Vision Transformer
in Mar 2021. Inspired by both gMLP and CvT, we introduce convolutional layers in
gMLP. CvT combined the power of Convolutions and Attention. Our implementation
combines the best of Convolutional learning with the spatial gated MLP. Further,
the paper visualizes how CgMLP learns. Visualizations show how CgMLP learns from
features such as the outline of a car. While Attention was the basis of much of
the recent progress in Deep Learning, gMLP proposed an approach that does not use
the Attention computation. In Transformer-based approaches, a large number of
Attention matrices need to be learnt using vast amounts of training data. In gMLP,
fine-tuning for new tasks via transfer learning with smaller datasets can be
challenging. We implement CgMLP and compare it with gMLP on the CIFAR dataset.
Experimental results explore the generalization power of CgMLP, while gMLP tends
to drastically overfit the training data.
To summarize, the paper contributes a novel Deep Learning architecture and
demonstrates the learning mechanism of CgMLP through visualizations, for the
first time in the literature.
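Implementation sketch (not from the paper): to make the combination described above concrete, the following is a minimal PyTorch-style sketch of a CgMLP-like model, in which a convolutional stem produces tokens that are then processed by gMLP blocks built around a spatial gating unit. All layer sizes, the 3x3 convolutional stem, the patch size, and the classifier head are illustrative assumptions, not the authors' exact configuration.

```python
# Minimal sketch of a CgMLP-style model (assumed configuration, not the
# authors' exact design): a convolutional stem produces tokens, which are
# processed by standard gMLP blocks with a Spatial Gating Unit.
import torch
import torch.nn as nn
import torch.nn.functional as F


class SpatialGatingUnit(nn.Module):
    """gMLP's spatial gating: half of the channels gate the other half via a
    learned linear mixing across the token (spatial) dimension.
    (gMLP initializes this projection near identity-gating; omitted here.)"""
    def __init__(self, dim, seq_len):
        super().__init__()
        self.norm = nn.LayerNorm(dim // 2)
        self.spatial_proj = nn.Linear(seq_len, seq_len)  # mixes tokens, not channels

    def forward(self, x):                                # x: (batch, seq_len, dim)
        u, v = x.chunk(2, dim=-1)
        v = self.norm(v)
        v = self.spatial_proj(v.transpose(1, 2)).transpose(1, 2)
        return u * v


class GMLPBlock(nn.Module):
    def __init__(self, dim, seq_len, expansion=4):
        super().__init__()
        hidden = dim * expansion
        self.norm = nn.LayerNorm(dim)
        self.proj_in = nn.Linear(dim, hidden)
        self.sgu = SpatialGatingUnit(hidden, seq_len)
        self.proj_out = nn.Linear(hidden // 2, dim)

    def forward(self, x):
        y = F.gelu(self.proj_in(self.norm(x)))
        y = self.sgu(y)
        return x + self.proj_out(y)                      # residual connection


class CgMLP(nn.Module):
    """Convolutional token embedding (the 'C' in CgMLP) followed by gMLP blocks."""
    def __init__(self, in_ch=3, dim=128, depth=4, img_size=32, patch=4, num_classes=10):
        super().__init__()
        # Convolutional stem: an overlapping 3x3 conv followed by a strided
        # conv that forms tokens, in the spirit of CvT's convolutional
        # projections (an assumption, not necessarily the paper's exact stem).
        self.conv_stem = nn.Sequential(
            nn.Conv2d(in_ch, dim, kernel_size=3, padding=1),
            nn.GELU(),
            nn.Conv2d(dim, dim, kernel_size=patch, stride=patch),
        )
        seq_len = (img_size // patch) ** 2
        self.blocks = nn.Sequential(*[GMLPBlock(dim, seq_len) for _ in range(depth)])
        self.head = nn.Linear(dim, num_classes)

    def forward(self, x):                                # x: (batch, 3, H, W)
        x = self.conv_stem(x)                            # (batch, dim, H/patch, W/patch)
        x = x.flatten(2).transpose(1, 2)                 # (batch, seq_len, dim)
        x = self.blocks(x)
        return self.head(x.mean(dim=1))                  # global average pool over tokens


# e.g. CgMLP()(torch.randn(2, 3, 32, 32)).shape -> torch.Size([2, 10])
```

In this sketch the spatial gating unit is the Attention-free token-mixing mechanism the abstract refers to, while the convolutional stem is where the CvT-inspired convolutional learning enters.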
Related papers
- SimMLP: Training MLPs on Graphs without Supervision [38.63554842214315]
We introduce SimMLP, a self-supervised framework for learning MLPs on graphs.
SimMLP is the first MLP-learning method that can achieve equivalence to GNNs in the optimal case.
We provide a comprehensive theoretical analysis, demonstrating the equivalence between SimMLP and GNNs based on mutual information and inductive bias.
arXiv Detail & Related papers (2024-02-14T03:16:13Z) - SA-MLP: Distilling Graph Knowledge from GNNs into Structure-Aware MLP [46.52398427166938]
One promising inference acceleration direction is to distill the GNNs into message-passing-free student multi-layer perceptrons.
We introduce a novel structure-mixing knowledge distillation strategy to enhance the learning ability of students for structure information.
Our SA-MLP can consistently outperform the teacher GNNs, while maintaining faster inference.
arXiv Detail & Related papers (2022-10-18T05:55:36Z) - GraphMLP: A Graph MLP-Like Architecture for 3D Human Pose Estimation [68.65764751482774]
GraphMLP is a global-local-graphical unified architecture for 3D human pose estimation.
It incorporates the graph structure of human bodies into a model to meet the domain-specific demand of the 3D human pose.
It can be extended to model complex temporal dynamics in a simple way with negligible computational cost gains in the sequence length.
arXiv Detail & Related papers (2022-06-13T18:59:31Z) - MDMLP: Image Classification from Scratch on Small Datasets with MLP [7.672827879118106]
Recently, the attention mechanism has become a go-to technique for natural language processing and computer vision tasks.
Recently, the MLP-Mixer and other MLP-based architectures, based simply on multi-layer perceptrons (MLPs), are also powerful compared to CNNs and attention techniques.
arXiv Detail & Related papers (2022-05-28T16:26:59Z) - Efficient Language Modeling with Sparse all-MLP [53.81435968051093]
All-MLPs can match Transformers in language modeling, but still lag behind in downstream tasks.
We propose sparse all-MLPs with mixture-of-experts (MoEs) in both the feature and input (token) dimensions.
We evaluate its zero-shot in-context learning performance on six downstream tasks, and find that it surpasses Transformer-based MoEs and dense Transformers.
arXiv Detail & Related papers (2022-03-14T04:32:19Z) - Are we ready for a new paradigm shift? A Survey on Visual Deep MLP [33.00328314841369]
Multilayer perceptron (MLP), as the first neural network structure to appear, was a big hit.
Constrained by the hardware computing power and the size of the datasets, it once sank for tens of years.
We have witnessed a paradigm shift from manual feature extraction to the CNN with local receptive fields, and further to the Transformer with global receptive fields.
arXiv Detail & Related papers (2021-11-07T12:02:00Z) - ConvMLP: Hierarchical Convolutional MLPs for Vision [7.874749885641495]
We propose ConvMLP: a hierarchical, light-weight, stage-wise co-design of convolution layers and MLPs for visual recognition.
We show that ConvMLP can be seamlessly transferred and achieve competitive results with fewer parameters.
arXiv Detail & Related papers (2021-09-09T17:52:57Z) - Hire-MLP: Vision MLP via Hierarchical Rearrangement [58.33383667626998]
Hire-MLP is a simple yet competitive vision MLP architecture via hierarchical rearrangement.
The proposed Hire-MLP architecture is built with simple channel-mixing operations, thus enjoys high flexibility and inference speed.
Experiments show that our Hire-MLP achieves state-of-the-art performance on the ImageNet-1K benchmark.
arXiv Detail & Related papers (2021-08-30T16:11:04Z) - AS-MLP: An Axial Shifted MLP Architecture for Vision [50.11765148947432]
An Axial Shifted MLP architecture (AS-MLP) is proposed in this paper.
By axially shifting channels of the feature map, AS-MLP is able to obtain information flow from different directions (a rough sketch of this shift appears after this list).
With the proposed AS-MLP architecture, our model obtains 83.3% Top-1 accuracy with 88M parameters and 15.2 GFLOPs on the ImageNet-1K dataset.
arXiv Detail & Related papers (2021-07-18T08:56:34Z) - MLP-Mixer: An all-MLP Architecture for Vision [93.16118698071993]
We present MLP-Mixer, an architecture based exclusively on multi-layer perceptrons (MLPs).
Mixer attains competitive scores on image classification benchmarks, with pre-training and inference cost comparable to state-of-the-art models.
arXiv Detail & Related papers (2021-05-04T16:17:21Z)
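As a rough illustration of the axial shift mentioned in the AS-MLP entry above, here is a minimal sketch assuming PyTorch; the shift size, the four-way channel split, and the use of a circular roll (the paper uses padded, non-circular shifts) are simplifying assumptions.

```python
# Minimal sketch of an AS-MLP-style axial channel shift (assumed details).
import torch


def axial_shift(x, shift=1):
    """Shift channel groups of a (batch, C, H, W) feature map along the height
    and width axes so that a following channel-mixing MLP sees neighbouring
    positions from both spatial directions."""
    up, down, left, right = x.chunk(4, dim=1)
    up    = torch.roll(up,    shifts=-shift, dims=2)   # shift up along H
    down  = torch.roll(down,  shifts=shift,  dims=2)   # shift down along H
    left  = torch.roll(left,  shifts=-shift, dims=3)   # shift left along W
    right = torch.roll(right, shifts=shift,  dims=3)   # shift right along W
    return torch.cat([up, down, left, right], dim=1)


# e.g. axial_shift(torch.randn(1, 8, 4, 4)).shape -> torch.Size([1, 8, 4, 4])
```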