Mamba for Scalable and Efficient Personalized Recommendations
- URL: http://arxiv.org/abs/2409.17165v1
- Date: Wed, 11 Sep 2024 14:26:14 GMT
- Title: Mamba for Scalable and Efficient Personalized Recommendations
- Authors: Andrew Starnes, Clayton Webster
- Abstract summary: We present a novel hybrid model that replaces Transformer layers with Mamba layers within the FT-Transformer architecture.
We evaluate FT-Mamba in comparison to a traditional Transformer-based model within a Two-Tower architecture on three datasets.
- Score: 0.135975510645475
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In this work, we propose using the Mamba architecture to handle tabular
data in personalized recommendation systems. We present FT-Mamba (Feature
Tokenizer + Mamba), a novel hybrid model that replaces the Transformer layers
of the FT-Transformer architecture with Mamba layers. Mamba offers an efficient
alternative to Transformers, building on State Space Models (SSMs) to reduce
computational complexity from quadratic to linear.
FT-Mamba is designed to improve the scalability and efficiency of
recommendation systems while maintaining performance. We evaluate FT-Mamba in
comparison to a traditional Transformer-based model within a Two-Tower
architecture on three datasets: Spotify music recommendation, H&M fashion
recommendation, and vaccine messaging recommendation. Each model is trained on
160,000 user-action pairs, and performance is measured using precision (P),
recall (R), Mean Reciprocal Rank (MRR), and Hit Ratio (HR) at several
truncation values. Our results demonstrate that FT-Mamba outperforms the
Transformer-based model in terms of computational efficiency while maintaining
or exceeding performance across key recommendation metrics. By leveraging Mamba
layers, FT-Mamba provides a scalable and effective solution for large-scale
personalized recommendation systems, showcasing the potential of the Mamba
architecture to enhance both efficiency and accuracy.
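For intuition, the following is a minimal sketch of how an FT-Mamba tower could be wired inside a Two-Tower recommender. It is an illustrative assumption rather than the authors' released code: the feature-tokenizer layout, mean pooling, layer count, and embedding width are hypothetical choices, and the Mamba layer is assumed to come from the open-source `mamba_ssm` package (state-spaces/mamba).

```python
# Hypothetical sketch of an FT-Mamba encoder inside a Two-Tower recommender.
# Names, dimensions, and the tokenizer layout are illustrative assumptions,
# not the authors' implementation. Assumes PyTorch and the open-source
# `mamba_ssm` package (state-spaces/mamba) are installed.
import torch
import torch.nn as nn
from mamba_ssm import Mamba


class FeatureTokenizer(nn.Module):
    """Maps tabular features to a sequence of d_model-dimensional tokens
    (one token per feature), as in the FT-Transformer feature tokenizer."""

    def __init__(self, n_numeric: int, cat_cardinalities: list, d_model: int):
        super().__init__()
        # Per-feature affine map for numeric columns: token_i = x_i * w_i + b_i.
        self.num_weight = nn.Parameter(torch.randn(n_numeric, d_model))
        self.num_bias = nn.Parameter(torch.zeros(n_numeric, d_model))
        # One embedding table per categorical column.
        self.cat_embeddings = nn.ModuleList(
            [nn.Embedding(card, d_model) for card in cat_cardinalities]
        )

    def forward(self, x_num: torch.Tensor, x_cat: torch.Tensor) -> torch.Tensor:
        # x_num: (batch, n_numeric) floats; x_cat: (batch, n_categorical) int indices.
        num_tokens = x_num.unsqueeze(-1) * self.num_weight + self.num_bias
        cat_tokens = torch.stack(
            [emb(x_cat[:, i]) for i, emb in enumerate(self.cat_embeddings)], dim=1
        )
        return torch.cat([num_tokens, cat_tokens], dim=1)  # (batch, n_features, d_model)


class FTMambaTower(nn.Module):
    """FT-Transformer-style backbone with its Transformer blocks swapped for Mamba layers."""

    def __init__(self, n_numeric, cat_cardinalities, d_model=64, n_layers=4):
        super().__init__()
        self.tokenizer = FeatureTokenizer(n_numeric, cat_cardinalities, d_model)
        # Each Mamba block runs in O(L) over the token sequence instead of the
        # O(L^2) self-attention used by the original FT-Transformer blocks.
        self.blocks = nn.ModuleList([Mamba(d_model=d_model) for _ in range(n_layers)])
        self.norm = nn.LayerNorm(d_model)

    def forward(self, x_num, x_cat) -> torch.Tensor:
        tokens = self.tokenizer(x_num, x_cat)
        for block in self.blocks:
            tokens = tokens + block(tokens)  # residual Mamba block
        return self.norm(tokens.mean(dim=1))  # pooled tower embedding


class TwoTowerFTMamba(nn.Module):
    """User and item towers share the FT-Mamba design; the recommendation
    score is the dot product of the two tower embeddings."""

    def __init__(self, user_spec, item_spec, d_model=64):
        super().__init__()
        self.user_tower = FTMambaTower(*user_spec, d_model=d_model)
        self.item_tower = FTMambaTower(*item_spec, d_model=d_model)

    def forward(self, user_num, user_cat, item_num, item_cat):
        u = self.user_tower(user_num, user_cat)
        v = self.item_tower(item_num, item_cat)
        return (u * v).sum(dim=-1)  # higher score = stronger user-item match
```

Because each Mamba block processes the token sequence in linear time, the per-tower cost grows with the number of tabular features rather than quadratically, which is the efficiency argument made in the abstract. The sketch omits the training loop and negative sampling, which the abstract does not specify.

Likewise, a minimal version of the truncated ranking metrics named in the abstract (MRR and HR at a cutoff k) could look like the following; the function names and argument conventions are placeholders, not the paper's evaluation code.

```python
def mrr_at_k(ranked_item_ids, relevant_ids, k):
    """Reciprocal rank of the first relevant item within the top-k list (0 if none)."""
    for rank, item in enumerate(ranked_item_ids[:k], start=1):
        if item in relevant_ids:
            return 1.0 / rank
    return 0.0


def hit_ratio_at_k(ranked_item_ids, relevant_ids, k):
    """1.0 if any relevant item appears in the top-k list, else 0.0."""
    return float(any(item in relevant_ids for item in ranked_item_ids[:k]))
```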
Related papers
- MobileMamba: Lightweight Multi-Receptive Visual Mamba Network [51.33486891724516]
Previous research on lightweight models has primarily focused on CNNs and Transformer-based designs.
We propose the MobileMamba framework, which balances efficiency and performance.
MobileMamba achieves up to 83.6% Top-1 accuracy, surpassing existing state-of-the-art methods.
arXiv Detail & Related papers (2024-11-24T18:01:05Z)
- Bi-Mamba: Towards Accurate 1-Bit State Space Models [28.478762133816726]
Bi-Mamba is a scalable and powerful 1-bit Mamba architecture designed for more efficient large language models.
Bi-Mamba achieves performance comparable to its full-precision counterparts (e.g., FP16 or BF16) and much better accuracy than post-training-binarization (PTB) Mamba baselines.
arXiv Detail & Related papers (2024-11-18T18:59:15Z)
- MambaPEFT: Exploring Parameter-Efficient Fine-Tuning for Mamba [0.5530212768657544]
Mamba, a State Space Model (SSM)-based model, has attracted attention as a potential alternative to Transformers.
We investigate the effectiveness of existing PEFT methods for Transformers when applied to Mamba.
We propose new Mamba-specific PEFT methods that leverage the distinctive structure of Mamba.
arXiv Detail & Related papers (2024-11-06T11:57:55Z)
- SepMamba: State-space models for speaker separation using Mamba [2.840381306234341]
We propose SepMamba, a U-Net-based architecture composed primarily of bidirectional Mamba layers.
We find that our approach outperforms similarly-sized prominent models on the WSJ0 2-speaker dataset.
arXiv Detail & Related papers (2024-10-28T13:20:53Z)
- ReMamba: Equip Mamba with Effective Long-Sequence Modeling [50.530839868893786]
We propose ReMamba, which enhances Mamba's ability to comprehend long contexts.
ReMamba incorporates selective compression and adaptation techniques within a two-stage re-forward process.
arXiv Detail & Related papers (2024-08-28T02:47:27Z)
- Bidirectional Gated Mamba for Sequential Recommendation [56.85338055215429]
Mamba, a recent advancement, has exhibited exceptional performance in time series prediction.
We introduce a new framework named Selective Gated Mamba (SIGMA) for Sequential Recommendation.
Our results indicate that SIGMA outperforms current models on five real-world datasets.
arXiv Detail & Related papers (2024-08-21T09:12:59Z)
- Transformers to SSMs: Distilling Quadratic Knowledge to Subquadratic Models [92.36510016591782]
We present a method that distills a pretrained Transformer architecture into alternative architectures such as state space models (SSMs).
Our method, called MOHAWK, is able to distill a Mamba-2 variant based on the Phi-1.5 architecture using only 3B tokens and a hybrid version (Hybrid Phi-Mamba) using 5B tokens.
Despite using less than 1% of the training data typically used to train models from scratch, Phi-Mamba boasts substantially stronger performance compared to all past open-source non-Transformer models.
arXiv Detail & Related papers (2024-08-19T17:48:11Z)
- MambaVision: A Hybrid Mamba-Transformer Vision Backbone [54.965143338206644]
We propose a novel hybrid Mamba-Transformer backbone, denoted as MambaVision, which is specifically tailored for vision applications.
Our core contribution includes redesigning the Mamba formulation to enhance its capability for efficient modeling of visual features.
We conduct a comprehensive ablation study on the feasibility of integrating Vision Transformers (ViT) with Mamba.
arXiv Detail & Related papers (2024-07-10T23:02:45Z)
- An Empirical Study of Mamba-based Language Models [69.74383762508805]
Selective state-space models (SSMs) like Mamba overcome some shortcomings of Transformers.
We present a direct comparison between 8B-parameter Mamba, Mamba-2, and Transformer models trained on the same datasets.
We find that the 8B Mamba-2-Hybrid exceeds the 8B Transformer on all 12 standard tasks.
arXiv Detail & Related papers (2024-06-12T05:25:15Z)
- Mamba State-Space Models Are Lyapunov-Stable Learners [1.6385815610837167]
Mamba state-space models (SSMs) were recently shown to outperform Transformer large language models (LLMs) across various tasks.
We show that Mamba's recurrent dynamics are robust to small input changes.
We also show that instruction tuning allows Mamba models to narrow this gap to 81%, and Mamba-2 models to surpass it, reaching 132%.
arXiv Detail & Related papers (2024-05-31T21:46:23Z)
- Is Mamba Effective for Time Series Forecasting? [30.85990093479062]
We propose a Mamba-based model named Simple-Mamba (S-Mamba) for time series forecasting.
Specifically, we tokenize the time points of each variate autonomously via a linear layer.
Experiments on thirteen public datasets prove that S-Mamba maintains low computational overhead and achieves leading performance.
arXiv Detail & Related papers (2024-03-17T08:50:44Z)
This list is automatically generated from the titles and abstracts of the papers in this site.