TiM4Rec: An Efficient Sequential Recommendation Model Based on Time-Aware Structured State Space Duality Model
- URL: http://arxiv.org/abs/2409.16182v3
- Date: Thu, 05 Dec 2024 14:28:42 GMT
- Title: TiM4Rec: An Efficient Sequential Recommendation Model Based on Time-Aware Structured State Space Duality Model
- Authors: Hao Fan, Mengyi Zhu, Yanrong Hu, Hailin Feng, Zhijie He, Hongjiu Liu, Qingyang Liu,
- Abstract summary: Sequential Recommendation modeling is shifting from Transformer to Mamba architecture, which comprises two generations: Mamba1, and Mamba2, based on State Space Duality (SSD)
SSD offers superior computational efficiency compared to SSM, but suffers performance degradation in sequential recommendation tasks.
We propose Time-Aware Mamba for Recommendation (TiM4Rec), which mitigates performance degradation in low-dimensional SSD contexts while preserving computational efficiency.
- Score: 4.414031127599392
- License:
- Abstract: The Sequential Recommendation modeling paradigm is shifting from Transformer to Mamba architecture, which comprises two generations: Mamba1, based on the State Space Model (SSM), and Mamba2, based on State Space Duality (SSD). Although SSD offers superior computational efficiency compared to SSM, it suffers performance degradation in sequential recommendation tasks, especially in low-dimensional scenarios that are critical for these tasks. Considering that time-aware enhancement methods are commonly employed to mitigate performance loss, our analysis reveals that the performance decline of SSD can similarly be fundamentally compensated by leveraging mechanisms in time-aware methods. Thus, we propose integrating time-awareness into the SSD framework to address these performance issues. However, integrating current time-aware methods, modeled after TiSASRec, into SSD faces the following challenges: 1) the complexity of integrating these transformer-based mechanisms with the SSD architecture, and 2) the computational inefficiency caused by the need for dimensionality expansion of time-difference modeling. To overcome these challenges, we introduce a novel Time-aware Structured Masked Matrix that efficiently incorporates time-aware capabilities into SSD. Building on this, we propose Time-Aware Mamba for Recommendation (TiM4Rec), which mitigates performance degradation in low-dimensional SSD contexts while preserving computational efficiency. This marks the inaugural application of a time-aware enhancement method specifically tailored for the Mamba architecture within the domain of sequential recommendation. Extensive experiments conducted on three real-world datasets demonstrate the superiority of our approach. The code for our model is accessible at https://github.com/AlwaysFHao/TiM4Rec.
Related papers
- Mamba-CL: Optimizing Selective State Space Model in Null Space for Continual Learning [54.19222454702032]
Continual Learning aims to equip AI models with the ability to learn a sequence of tasks over time, without forgetting previously learned knowledge.
State Space Models (SSMs) have achieved notable success in computer vision.
We introduce Mamba-CL, a framework that continuously fine-tunes the core SSMs of the large-scale Mamba foundation model.
arXiv Detail & Related papers (2024-11-23T06:36:16Z) - SSD4Rec: A Structured State Space Duality Model for Efficient Sequential Recommendation [18.537426536491203]
We propose a novel generic and efficient sequential recommendation backbone, SSD4Rec.
SSD4Rec marks the variable- and long-length item sequences with sequence registers and processes the item representations with bidirectional Structured State Space Duality (SSD) blocks.
Our model achieves state-of-the-art performance while maintaining near-linear scalability with user sequence length.
arXiv Detail & Related papers (2024-09-02T11:58:56Z) - Temporal Feature Matters: A Framework for Diffusion Model Quantization [105.3033493564844]
Diffusion models rely on the time-step for the multi-round denoising.
We introduce a novel quantization framework that includes three strategies.
This framework preserves most of the temporal information and ensures high-quality end-to-end generation.
arXiv Detail & Related papers (2024-07-28T17:46:15Z) - VSSD: Vision Mamba with Non-Causal State Space Duality [26.96416515847115]
State Space Models (SSMs) have gained prominence in vision tasks as they offer linear computational complexity.
We introduce Visual State Space Duality (VSSD) model, which has a non-causal format of SSD.
We conduct extensive experiments on various benchmarks including image classification, detection, and segmentation, where VSSD surpasses existing state-of-the-art SSM-based models.
arXiv Detail & Related papers (2024-07-26T07:16:52Z) - A-SDM: Accelerating Stable Diffusion through Model Assembly and Feature Inheritance Strategies [51.7643024367548]
Stable Diffusion Model is a prevalent and effective model for text-to-image (T2I) and image-to-image (I2I) generation.
This study focuses on reducing redundant computation in SDM and optimizing the model through both tuning and tuning-free methods.
arXiv Detail & Related papers (2024-05-31T21:47:05Z) - HOPE for a Robust Parameterization of Long-memory State Space Models [51.66430224089725]
State-space models (SSMs) that utilize linear, time-invariant (LTI) systems are known for their effectiveness in learning long sequences.
We develop a new parameterization scheme, called HOPE, for LTI systems that utilize Markov parameters within Hankel operators.
Our new parameterization endows the SSM with non-decaying memory within a fixed time window, which is empirically corroborated by a sequential CIFAR-10 task with padded noise.
arXiv Detail & Related papers (2024-05-22T20:20:14Z) - EfficientState Space Model viaFast Tensor Convolutionand Block Diagonalization [5.260841516691153]
We propose a new state space layer based on multiple-input multiple-output SSM, called efficient SSM.
Our eSSM is built on the convolutional representation of multi-input and multi-input (MIMO) SSM.
In the model efficiency benchmark, the parameters of eSSM are only 12.89% of LSTM and 13.24% of Mamba.
arXiv Detail & Related papers (2024-02-23T12:36:31Z) - A-SDM: Accelerating Stable Diffusion through Redundancy Removal and
Performance Optimization [54.113083217869516]
In this work, we first explore the computational redundancy part of the network.
We then prune the redundancy blocks of the model and maintain the network performance.
Thirdly, we propose a global-regional interactive (GRI) attention to speed up the computationally intensive attention part.
arXiv Detail & Related papers (2023-12-24T15:37:47Z) - Multitask Learning for Time Series Data with 2D Convolution [32.72419542473646]
Multitask learning (MTL) aims to develop a unified model that can handle a set of closely related tasks simultaneously.
In this paper, we investigate the application of MTL to the time series classification problem.
We show that when we integrate the state-of-the-art 1D convolution-based TSC model with MTL, the performance of the TSC model actually deteriorates.
arXiv Detail & Related papers (2023-10-05T22:00:17Z) - Gait Recognition in the Wild with Multi-hop Temporal Switch [81.35245014397759]
gait recognition in the wild is a more practical problem that has attracted the attention of the community of multimedia and computer vision.
This paper presents a novel multi-hop temporal switch method to achieve effective temporal modeling of gait patterns in real-world scenes.
arXiv Detail & Related papers (2022-09-01T10:46:09Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.