MOI-Mixer: Improving MLP-Mixer with Multi Order Interactions in
Sequential Recommendation
- URL: http://arxiv.org/abs/2108.07505v1
- Date: Tue, 17 Aug 2021 08:38:49 GMT
- Title: MOI-Mixer: Improving MLP-Mixer with Multi Order Interactions in
Sequential Recommendation
- Authors: Hojoon Lee, Dongyoon Hwang, Sunghwan Hong, Changyeon Kim, Seungryong
Kim, Jaegul Choo
- Abstract summary: Transformer-based models require memory and time that scale quadratically with the sequence length, making it difficult to capture users' long-term interests.
MLP-based models, renowned for their linear memory and time complexity, have recently shown results competitive with Transformers across various tasks.
We propose the Multi-Order Interaction layer, which is capable of expressing an arbitrary order of interactions while maintaining the memory and time complexity of an MLP layer.
- Score: 40.20599070308035
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Successful sequential recommendation systems rely on accurately capturing the
user's short-term and long-term interest. Although Transformer-based models
achieved state-of-the-art performance in the sequential recommendation task,
they generally require memory and time that scale quadratically with the sequence length, making it difficult to capture users' long-term interests. On the
other hand, Multi-Layer Perceptrons (MLP)-based models, renowned for their
linear memory and time complexity, have recently shown competitive results
compared to Transformers in various tasks. Given the availability of a massive amount of user behavior history, the linear memory and time complexity of
MLP-based models make them a promising alternative to explore in the sequential
recommendation task. To this end, we adopted MLP-based models in sequential
recommendation but consistently observed that MLP-based methods underperform their Transformer-based counterparts despite their computational benefits.
From experiments, we observed that introducing explicit high-order interactions into the MLP layers mitigates this performance gap. In response, we propose the
Multi-Order Interaction (MOI) layer, which is capable of expressing an
arbitrary order of interactions within the inputs while maintaining the memory
and time complexity of the MLP layer. By replacing the MLP layer with the MOI
layer, our model achieves performance comparable to that of
Transformer-based models while retaining the MLP-based models' computational
benefits.
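The abstract does not give the exact formulation of the MOI layer, so the following is only a minimal sketch of the general idea, assuming order-k interactions are built as element-wise products of k linear projections of the input; the class name, the multiplicative construction, and all dimensions are illustrative assumptions rather than the authors' implementation. Because each additional order only adds one more linear projection, memory and time stay linear in the sequence length, unlike the quadratic cost of self-attention.

```python
import torch
import torch.nn as nn


class MultiOrderInteractionSketch(nn.Module):
    """Hypothetical higher-order MLP block (not the paper's exact MOI layer).

    Order-k interactions are formed by element-wise multiplication of k
    linear projections of the input, so the cost of one block grows
    linearly with the sequence length.
    """

    def __init__(self, dim: int, hidden_dim: int, order: int = 3):
        super().__init__()
        # One projection per interaction order.
        self.projections = nn.ModuleList(
            [nn.Linear(dim, hidden_dim) for _ in range(order)]
        )
        self.act = nn.GELU()
        self.out = nn.Linear(hidden_dim, dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, dim)
        h = self.projections[0](x)
        for proj in self.projections[1:]:
            # Each element-wise product raises the interaction order by one.
            h = h * proj(x)
        return self.out(self.act(h))


if __name__ == "__main__":
    layer = MultiOrderInteractionSketch(dim=64, hidden_dim=128, order=3)
    x = torch.randn(2, 50, 64)   # (batch, sequence length, embedding dim)
    print(layer(x).shape)        # torch.Size([2, 50, 64])
```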
Related papers
- Adaptive Multi-Scale Decomposition Framework for Time Series Forecasting [26.141054975797868]
We propose a novel Adaptive Multi-Scale Decomposition (AMD) framework for time series forecasting (TSF).
Our framework decomposes time series into distinct temporal patterns at multiple scales, leveraging the Multi-Scale Decomposable Mixing (MDM) block.
Our approach effectively models both temporal and channel dependencies and utilizes autocorrelation to refine multi-scale data integration.
arXiv Detail & Related papers (2024-06-06T05:27:33Z)
- BMLP: Behavior-aware MLP for Heterogeneous Sequential Recommendation [16.6816199104481]
We propose a novel multilayer perceptron (MLP)-based heterogeneous sequential recommendation method, namely the behavior-aware multilayer perceptron (BMLP).
BMLP achieves significant improvement over state-of-the-art algorithms on four public datasets.
arXiv Detail & Related papers (2024-02-20T05:57:01Z)
- Attentive Multi-Layer Perceptron for Non-autoregressive Generation [46.14195464583495]
Non-autoregressive (NAR) generation is gaining popularity for its efficiency and growing efficacy.
In this paper, we propose a novel variant, Attentive Multi-Layer Perceptron (AMLP), to produce a generation model with linear time and space complexity.
arXiv Detail & Related papers (2023-10-14T06:44:24Z)
- Tuning Pre-trained Model via Moment Probing [62.445281364055795]
We propose a novel Moment Probing (MP) method to explore the potential of linear probing (LP).
MP trains a linear classification head on the mean of the final features; a minimal sketch of this idea appears after this list.
Our MP significantly outperforms LP and is competitive with counterparts at a lower training cost.
arXiv Detail & Related papers (2023-07-21T04:15:02Z)
- AutoMLP: Automated MLP for Sequential Recommendations [20.73096302505791]
Sequential recommender systems aim to predict the next item a user will be interested in, given their historical interactions.
Existing approaches usually set a pre-defined short-term interest length via exhaustive search or empirical experience.
This paper proposes a novel sequential recommender system, AutoMLP, aiming for better modeling users' long/short-term interests.
arXiv Detail & Related papers (2023-03-11T07:50:49Z)
- Efficient Language Modeling with Sparse all-MLP [53.81435968051093]
All-MLPs can match Transformers in language modeling, but still lag behind in downstream tasks.
We propose sparse all-MLPs with mixture-of-experts (MoEs) in both the feature and input (token) dimensions.
We evaluate its zero-shot in-context learning performance on six downstream tasks, and find that it surpasses Transformer-based MoEs and dense Transformers.
arXiv Detail & Related papers (2022-03-14T04:32:19Z)
- Bayesian Inference in High-Dimensional Time-Series with the Orthogonal Stochastic Linear Mixing Model [2.7909426811685893]
Many modern time-series datasets contain large numbers of output response variables sampled for prolonged periods of time.
In this paper, we propose a new Markov chain Monte Carlo framework for the analysis of diverse, large-scale time-series datasets.
arXiv Detail & Related papers (2021-06-25T01:12:54Z)
- Learning representations with end-to-end models for improved remaining useful life prognostics [64.80885001058572]
The Remaining Useful Life (RUL) of equipment is defined as the duration between the current time and its failure.
We propose an end-to-end deep learning model based on multi-layer perceptron (MLP) and long short-term memory (LSTM) layers to predict the RUL.
We will discuss how the proposed end-to-end model is able to achieve such good results and compare it to other deep learning and state-of-the-art methods.
arXiv Detail & Related papers (2021-04-11T16:45:18Z)
- Convolutional Tensor-Train LSTM for Spatio-temporal Learning [116.24172387469994]
We propose a higher-order LSTM model that can efficiently learn long-term correlations in the video sequence.
This is accomplished through a novel tensor train module that performs prediction by combining convolutional features across time.
Our results achieve state-of-the-art performance in a wide range of applications and datasets.
arXiv Detail & Related papers (2020-02-21T05:00:01Z)
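For the Moment Probing entry above, the summary only states that a linear classification head is trained on the mean of the final features. The snippet below is a minimal sketch of that single idea under assumed names and shapes (a frozen stand-in encoder, a 10-class head, random data); it is not the authors' code.

```python
import torch
import torch.nn as nn

# A frozen stand-in backbone producing token features of shape
# (batch, num_tokens, feature_dim); any pre-trained encoder could be used here.
backbone = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model=64, nhead=4, batch_first=True),
    num_layers=2,
)
backbone.eval()
for p in backbone.parameters():
    p.requires_grad_(False)          # probing keeps the backbone frozen

head = nn.Linear(64, 10)             # linear classification head (10 classes assumed)
optimizer = torch.optim.Adam(head.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()

# One illustrative training step on random data.
x = torch.randn(8, 16, 64)           # (batch, tokens, feature_dim)
y = torch.randint(0, 10, (8,))
with torch.no_grad():
    features = backbone(x)            # final features from the frozen encoder
logits = head(features.mean(dim=1))   # classify the mean of the final features
loss = criterion(logits, y)
loss.backward()
optimizer.step()
```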
This list is automatically generated from the titles and abstracts of the papers on this site.