TriMLP: Revenge of a MLP-like Architecture in Sequential Recommendation
- URL: http://arxiv.org/abs/2305.14675v3
- Date: Tue, 25 Jul 2023 09:55:38 GMT
- Title: TriMLP: Revenge of a MLP-like Architecture in Sequential Recommendation
- Authors: Yiheng Jiang, Yuanbo Xu, Yongjian Yang, Funing Yang, Pengyang Wang and
Hui Xiong
- Abstract summary: We present an MLP-like architecture for sequential recommendation, namely TriMLP, with a novel Triangular Mixer for cross-token communications.
In designing Triangular Mixer, we simplify the cross-token operation in MLP as the basic matrix multiplication, and drop the lower-triangle neurons of the weight matrix to block the anti-chronological connections from future tokens.
- Score: 23.32537260687907
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we present an MLP-like architecture for sequential
recommendation, namely TriMLP, with a novel Triangular Mixer for cross-token
communications. In designing Triangular Mixer, we simplify the cross-token
operation in MLP as the basic matrix multiplication, and drop the
lower-triangle neurons of the weight matrix to block the anti-chronological
connections from future tokens. Accordingly, the information leakage issue is
remedied and the prediction capability of MLP can be fully exploited under the
standard auto-regressive mode. Taking a step further, the mixer serially
alternates two delicate MLPs with triangular shape, tagged as global and local
mixing, to separately capture long-range dependencies and fine-grained local
patterns, i.e., long- and short-term preferences. An empirical study on 12
datasets of different scales (50K~10M user-item interactions) from 4 benchmarks
(Amazon, MovieLens, Tenrec and LBSN) shows that TriMLP consistently attains a
promising accuracy/efficiency trade-off, with an average performance boost over
several state-of-the-art baselines of up to 14.88% at 8.65% less inference cost.
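To make the masking idea concrete, below is a minimal PyTorch sketch of a triangular token-mixing layer in the spirit of the abstract: the cross-token weight matrix keeps only its upper triangle so that no position receives information from future tokens, and a global mixer and a session-restricted local mixer are applied serially. The class names, session length, initialization, and the ordering of the two mixings are illustrative assumptions, not the paper's exact design.

```python
import torch
import torch.nn as nn
from typing import Optional


class TriangularTokenMixer(nn.Module):
    """Token-mixing layer whose weight matrix keeps only its upper triangle,
    so output position i only receives information from input positions j <= i."""

    def __init__(self, seq_len: int, session_len: Optional[int] = None):
        super().__init__()
        self.weight = nn.Parameter(0.02 * torch.randn(seq_len, seq_len))
        # mask[j, i] = 1 keeps the connection from input token j to output token i;
        # zeroing the lower triangle (j > i) blocks connections from future tokens.
        mask = torch.triu(torch.ones(seq_len, seq_len))
        if session_len is not None:
            # Local mixing (an assumption of this sketch): additionally keep only
            # connections between tokens inside the same fixed-length session,
            # giving a block-diagonal triangular pattern.
            idx = torch.arange(seq_len)
            same_session = (idx // session_len)[:, None] == (idx // session_len)[None, :]
            mask = mask * same_session.float()
        self.register_buffer("mask", mask)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, channels); mix information across the token axis.
        return torch.einsum("ji,bjc->bic", self.weight * self.mask, x)


class TriangularMixer(nn.Module):
    """Serially applies a global and a local (session-restricted) triangular
    mixing MLP; the ordering here is an illustrative choice."""

    def __init__(self, seq_len: int, session_len: int):
        super().__init__()
        self.global_mixing = TriangularTokenMixer(seq_len)
        self.local_mixing = TriangularTokenMixer(seq_len, session_len)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.local_mixing(self.global_mixing(x))


if __name__ == "__main__":
    mixer = TriangularMixer(seq_len=8, session_len=4)
    out = mixer(torch.randn(2, 8, 16))  # batch of 2 sequences, 8 tokens, 16-dim
    print(out.shape)                    # torch.Size([2, 8, 16])
```

Because the mask is a constant, each layer remains a single (masked) matrix multiplication, consistent with the abstract's framing of cross-token mixing as basic matrix multiplication.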
Related papers
- Strip-MLP: Efficient Token Interaction for Vision MLP [31.02197585697145]
We introduce Strip-MLP to enrich the token interaction power in three ways.
Strip-MLP significantly improves the performance of spatial-based models on small datasets.
Models achieve higher average Top-1 accuracy than existing MLP-based models by +2.44% on Caltech-101 and +2.16% on CIFAR-100.
arXiv Detail & Related papers (2023-07-21T09:40:42Z) - MLP-3D: A MLP-like 3D Architecture with Grouped Time Mixing [123.43419144051703]
We present a novel MLP-like 3D architecture for video recognition.
The results are comparable to those of state-of-the-art widely-used 3D CNNs and video transformers.
arXiv Detail & Related papers (2022-06-13T16:21:33Z) - Mixing and Shifting: Exploiting Global and Local Dependencies in Vision
MLPs [84.3235981545673]
Token-mixing multi-layer perceptron (MLP) models have shown competitive performance in computer vision tasks.
We present Mix-Shift-MLP which makes the size of the local receptive field used for mixing increase with respect to the amount of spatial shifting.
MS-MLP achieves competitive performance in multiple vision benchmarks.
arXiv Detail & Related papers (2022-02-14T06:53:48Z) - RepMLPNet: Hierarchical Vision MLP with Re-parameterized Locality [113.1414517605892]
We propose a methodology, Locality Injection, to incorporate local priors into an FC layer.
RepMLPNet is the first MLP that seamlessly transfers to Cityscapes semantic segmentation.
arXiv Detail & Related papers (2021-12-21T10:28:17Z) - Sparse-MLP: A Fully-MLP Architecture with Conditional Computation [7.901786481399378]
Mixture-of-Experts (MoE) with sparse conditional computation has been proved an effective architecture for scaling attention-based models to more parameters with comparable computation cost.
We propose Sparse-MLP, scaling the recent MLP-Mixer model with MoE, to achieve a more efficient architecture.
arXiv Detail & Related papers (2021-09-05T06:43:08Z) - Hire-MLP: Vision MLP via Hierarchical Rearrangement [58.33383667626998]
Hire-MLP is a simple yet competitive vision MLP architecture via hierarchical rearrangement.
The proposed Hire-MLP architecture is built with simple channel-mixing operations, thus enjoys high flexibility and inference speed.
Experiments show that our Hire-MLP achieves state-of-the-art performance on the ImageNet-1K benchmark.
arXiv Detail & Related papers (2021-08-30T16:11:04Z) - CycleMLP: A MLP-like Architecture for Dense Prediction [26.74203747156439]
CycleMLP is a versatile backbone for visual recognition and dense predictions.
It can cope with various image sizes and achieves linear computational complexity to image size by using local windows.
CycleMLP aims to provide a competitive baseline on object detection, instance segmentation, and semantic segmentation for MLP-like models.
arXiv Detail & Related papers (2021-07-21T17:23:06Z) - AS-MLP: An Axial Shifted MLP Architecture for Vision [50.11765148947432]
An Axial Shifted MLP architecture (AS-MLP) is proposed in this paper.
By axially shifting channels of the feature map, AS-MLP is able to obtain the information flow from different directions.
With the proposed AS-MLP architecture, our model obtains 83.3% Top-1 accuracy with 88M parameters and 15.2 GFLOPs on the ImageNet-1K dataset.
arXiv Detail & Related papers (2021-07-18T08:56:34Z) - S$^2$-MLP: Spatial-Shift MLP Architecture for Vision [34.47616917228978]
Recently, visual Transformer (ViT) and its follow-up works abandon convolution and exploit the self-attention operation.
In this paper, we propose a novel pure MLP architecture, spatial-shift MLP (S$^2$-MLP).
arXiv Detail & Related papers (2021-06-14T15:05:11Z) - MLP-Mixer: An all-MLP Architecture for Vision [93.16118698071993]
We present MLP-Mixer, an architecture based exclusively on multi-layer perceptrons (MLPs).
MLP-Mixer attains competitive scores on image classification benchmarks, with pre-training and inference cost comparable to state-of-the-art models.
arXiv Detail & Related papers (2021-05-04T16:17:21Z) - Irregularly Tabulated MLP for Fast Point Feature Embedding [13.218995242910497]
We propose a new framework that uses a pair of multi-layer perceptrons (MLP) and a lookup table (LUT) to transform point-coordinate inputs into high-dimensional features.
LUTI-MLP also provides a significant speedup for computing the Jacobian of the embedding function.
arXiv Detail & Related papers (2020-11-13T04:15:57Z)