Decision Mamba Architectures
- URL: http://arxiv.org/abs/2405.07943v2
- Date: Thu, 17 Oct 2024 09:48:06 GMT
- Title: Decision Mamba Architectures
- Authors: André Correia, Luís A. Alexandre
- Abstract summary: The Mamba architecture has been shown to outperform Transformers across various task domains.
We introduce two novel methods, Decision Mamba (DM) and Hierarchical Decision Mamba (HDM).
We demonstrate the superiority of Mamba models over their Transformer counterparts in a majority of tasks.
- Abstract: Recent advancements in imitation learning have been largely fueled by the integration of sequence models, which provide a structured flow of information to effectively mimic task behaviours. The Decision Transformer (DT) and, subsequently, the Hierarchical Decision Transformer (HDT) introduced Transformer-based approaches to learning task policies. More recently, the Mamba architecture has been shown to outperform Transformers across various task domains. In this work, we introduce two novel methods, Decision Mamba (DM) and Hierarchical Decision Mamba (HDM), aimed at improving on the performance of these Transformer-based models. Through extensive experimentation across diverse environments such as OpenAI Gym and D4RL, leveraging varying demonstration datasets, we demonstrate the superiority of Mamba models over their Transformer counterparts in a majority of tasks. Results show that DM outperforms other methods in most settings. The code can be found at https://github.com/meowatthemoon/DecisionMamba.
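As a rough illustration of the idea, the sketch below swaps the causal self-attention of a Decision-Transformer-style policy for Mamba blocks while keeping the usual (return-to-go, state, action) token layout. It assumes the `mamba-ssm` package for the Mamba block; the dimensions, layer count, and class name are illustrative rather than the authors' exact configuration (see their repository for the real implementation).

```python
import torch
import torch.nn as nn
from mamba_ssm import Mamba  # assumed dependency: pip install mamba-ssm

class DecisionMambaSketch(nn.Module):
    """Illustrative DT-style policy with Mamba blocks in place of attention."""

    def __init__(self, state_dim, act_dim, d_model=128, n_layers=3):
        super().__init__()
        # Per-modality embeddings, as in Decision Transformer.
        self.embed_rtg = nn.Linear(1, d_model)
        self.embed_state = nn.Linear(state_dim, d_model)
        self.embed_action = nn.Linear(act_dim, d_model)
        # The structural change vs. DT: Mamba blocks replace causal self-attention.
        self.blocks = nn.ModuleList(Mamba(d_model=d_model) for _ in range(n_layers))
        self.norm = nn.LayerNorm(d_model)
        self.predict_action = nn.Linear(d_model, act_dim)

    def forward(self, rtg, states, actions):
        # rtg: (B, T, 1), states: (B, T, state_dim), actions: (B, T, act_dim)
        B, T = states.shape[:2]
        # Interleave (return-to-go, state, action) tokens per timestep.
        tokens = torch.stack(
            (self.embed_rtg(rtg), self.embed_state(states), self.embed_action(actions)),
            dim=2,
        ).reshape(B, 3 * T, -1)
        for block in self.blocks:
            tokens = tokens + block(tokens)  # residual Mamba block
        h = self.norm(tokens).reshape(B, T, 3, -1)
        # Predict the action from each timestep's state token, as in DT.
        return self.predict_action(h[:, :, 1])
```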
Related papers
- MobileMamba: Lightweight Multi-Receptive Visual Mamba Network [51.33486891724516]
Previous research on lightweight models has primarily focused on CNNs and Transformer-based designs.
We propose the MobileMamba framework, which balances efficiency and performance.
MobileMamba achieves up to 83.6% Top-1 accuracy, surpassing existing state-of-the-art methods.
arXiv Detail & Related papers (2024-11-24T18:01:05Z)
- MambaPEFT: Exploring Parameter-Efficient Fine-Tuning for Mamba [0.5530212768657544]
Mamba, a State Space Model (SSM)-based model, has attracted attention as a potential alternative to Transformers.
We investigate the effectiveness of existing PEFT methods for Transformers when applied to Mamba.
We propose new Mamba-specific PEFT methods that leverage the distinctive structure of Mamba.
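For intuition, here is a minimal sketch of carrying one Transformer-style PEFT method (LoRA) over to Mamba by wrapping a block's input projection. The `in_proj` attribute name follows the reference `mamba-ssm` implementation but should be treated as an assumption, and this is not one of the paper's proposed Mamba-specific methods.

```python
import torch.nn as nn

class LoRALinear(nn.Module):
    """Wrap a frozen linear layer with a trainable low-rank update."""

    def __init__(self, base: nn.Linear, rank=8, alpha=16):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False          # freeze the pretrained weights
        self.lora_a = nn.Linear(base.in_features, rank, bias=False)
        self.lora_b = nn.Linear(rank, base.out_features, bias=False)
        nn.init.zeros_(self.lora_b.weight)   # the low-rank update starts at zero
        self.scale = alpha / rank

    def forward(self, x):
        return self.base(x) + self.scale * self.lora_b(self.lora_a(x))

# Assumed usage on a mamba_ssm.Mamba block (attribute name is an assumption):
# block.in_proj = LoRALinear(block.in_proj, rank=8)
```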
arXiv Detail & Related papers (2024-11-06T11:57:55Z)
- Integrating Multi-Modal Input Token Mixer Into Mamba-Based Decision Models: Decision MetaMamba [0.0]
Sequence modeling with state space models (SSMs) has demonstrated performance surpassing that of Transformers in various tasks.
However, decision models based on Mamba, a state-of-the-art SSM, failed to achieve superior performance compared to enhanced Decision Transformers.
We propose the Decision MetaMamba (DMM), which augments Mamba with a token mixer in its input layer.
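As a generic illustration of an input-layer token mixer (not the paper's exact multi-modal design), the sketch below mixes neighboring trajectory tokens with a causal depthwise 1-D convolution before they enter the SSM stack.

```python
import torch.nn as nn

class ConvTokenMixer(nn.Module):
    """Causal depthwise 1-D convolution over the token sequence (illustrative)."""

    def __init__(self, d_model, kernel_size=3):
        super().__init__()
        self.mix = nn.Conv1d(
            d_model, d_model, kernel_size,
            padding=kernel_size - 1, groups=d_model,
        )

    def forward(self, tokens):                # tokens: (B, L, d_model)
        L = tokens.size(1)
        # Trim the right-side padding so position t only sees tokens <= t.
        mixed = self.mix(tokens.transpose(1, 2))[..., :L].transpose(1, 2)
        return tokens + mixed                 # residual mixing
```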
arXiv Detail & Related papers (2024-08-20T03:35:28Z)
- Transformers to SSMs: Distilling Quadratic Knowledge to Subquadratic Models [92.36510016591782]
We present a method that is able to distill a pretrained Transformer architecture into alternative architectures such as state space models (SSMs).
Our method, called MOHAWK, is able to distill a Mamba-2 variant based on the Phi-1.5 architecture using only 3B tokens and a hybrid version (Hybrid Phi-Mamba) using 5B tokens.
Despite using less than 1% of the training data typically used to train models from scratch, Phi-Mamba boasts substantially stronger performance compared to all past open-source non-Transformer models.
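For intuition, the sketch below shows one generic ingredient of such Transformer-to-SSM distillation: layerwise hidden-state alignment against a frozen teacher. This is a simplified stand-in rather than the full MOHAWK procedure, and the helper function is hypothetical.

```python
import torch

def hidden_state_alignment_loss(student_layers, teacher_layers, x):
    """Hypothetical helper: match each student (SSM) block's output to the
    corresponding frozen teacher (Transformer) block's output."""
    loss = x.new_zeros(())
    for s_block, t_block in zip(student_layers, teacher_layers):
        with torch.no_grad():
            target = t_block(x)   # frozen teacher block
        loss = loss + torch.nn.functional.mse_loss(s_block(x), target)
        x = target                # feed teacher activations to the next depth
    return loss
```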
arXiv Detail & Related papers (2024-08-19T17:48:11Z)
- A Mamba-based Siamese Network for Remote Sensing Change Detection [27.200043694866388]
Change detection in remote sensing images is an essential tool for analyzing how a region changes over time.
Deep learning methods like CNNs and Transformers have achieved remarkable success in detecting significant changes.
We propose a Mamba-based Change Detector (M-CD) that segments regions of interest more accurately than these approaches.
arXiv Detail & Related papers (2024-07-08T17:05:48Z)
- Venturing into Uncharted Waters: The Navigation Compass from Transformer to Mamba [77.21394300708172]
Transformer, a deep neural network architecture, has long dominated the field of natural language processing and beyond.
The recent introduction of Mamba challenges its supremacy, sparks considerable interest among researchers, and gives rise to a series of Mamba-based models that have exhibited notable potential.
This survey paper orchestrates a comprehensive discussion, diving into essential research dimensions, covering: (i) the functioning of the Mamba mechanism and its foundation on the principles of structured state space models; (ii) the proposed improvements and the integration of Mamba with various networks, exploring its potential as a substitute for Transformers; (iii) the combination of…
arXiv Detail & Related papers (2024-06-24T15:27:21Z)
- An Empirical Study of Mamba-based Language Models [69.74383762508805]
Selective state-space models (SSMs) like Mamba overcome some shortcomings of Transformers.
We present a direct comparison between 8B-parameter Mamba, Mamba-2, and Transformer models trained on the same datasets.
We find that the 8B Mamba-2-Hybrid exceeds the 8B Transformer on all 12 standard tasks.
arXiv Detail & Related papers (2024-06-12T05:25:15Z)
- Decision Mamba: Reinforcement Learning via Hybrid Selective Sequence Modeling [13.253878928833688]
We propose Decision Mamba-Hybrid (DM-H) for in-context reinforcement learning.
DM-H generates high-value sub-goals from long-term memory through the Mamba model.
Online testing of DM-H on long-term tasks is 28× faster than transformer-based baselines.
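As a rough sketch of this hybrid idea, the module below lets a Mamba model scan the long history in linear time and emit a sub-goal token that conditions a short-horizon Transformer policy; the shapes and the sub-goal interface are assumptions, not the authors' exact DM-H design.

```python
import torch
import torch.nn as nn
from mamba_ssm import Mamba  # assumed dependency

class HybridSubgoalPolicy(nn.Module):
    """Illustrative hybrid: Mamba for long-term memory, Transformer for short-term control."""

    def __init__(self, state_dim, act_dim, d_model=128):
        super().__init__()
        self.embed = nn.Linear(state_dim, d_model)
        self.long_term = Mamba(d_model=d_model)   # linear-time scan over the long history
        self.to_subgoal = nn.Linear(d_model, d_model)
        self.short_term = nn.TransformerEncoder(  # short-window policy
            nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True),
            num_layers=2,
        )
        self.head = nn.Linear(d_model, act_dim)

    def forward(self, history, recent):
        # history: (B, T_long, state_dim); recent: (B, T_short, state_dim)
        h = self.long_term(self.embed(history))        # cheap pass over the long context
        subgoal = self.to_subgoal(h[:, -1:])           # sub-goal from the last hidden state
        ctx = torch.cat([subgoal, self.embed(recent)], dim=1)
        return self.head(self.short_term(ctx)[:, 1:])  # actions for the recent steps
```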
arXiv Detail & Related papers (2024-05-31T10:41:03Z)
- Is Mamba Capable of In-Context Learning? [63.682741783013306]
State-of-the-art foundation models such as GPT-4 perform surprisingly well at in-context learning (ICL).
This work provides empirical evidence that Mamba, a newly proposed state space model, has similar ICL capabilities.
arXiv Detail & Related papers (2024-02-05T16:39:12Z)
- Emergent Agentic Transformer from Chain of Hindsight Experience [96.56164427726203]
We show, for the first time, that a simple transformer-based model performs competitively with both temporal-difference and imitation-learning-based approaches.
arXiv Detail & Related papers (2023-05-26T00:43:02Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.