Integrating Multi-Modal Input Token Mixer Into Mamba-Based Decision Models: Decision MetaMamba
- URL: http://arxiv.org/abs/2408.10517v1
- Date: Tue, 20 Aug 2024 03:35:28 GMT
- Title: Integrating Multi-Modal Input Token Mixer Into Mamba-Based Decision Models: Decision MetaMamba
- Authors: Wall Kim,
- Abstract summary: This study introduces a model named Decision MetaMamba to resolve these challenges.
It employs an input token mixer to extract patterns from short sequences and uses a State Space Model (SSM) to selectively combine information from relatively distant sequences.
Based on these innovations, DMM demonstrated excellent performance across various datasets in offline RL.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Return-Conditioned Transformer Decision Models (RCTDM) have demonstrated the potential to enhance transformer performance in offline reinforcement learning by replacing rewards in the input sequence with returns-to-go. However, to achieve the goal of learning an optimal policy from offline datasets composed of limited suboptimal trajectories, RCTDM required alternative methods. One prominent approach, trajectory stitching, was designed to enable the network to combine multiple trajectories to find the optimal path. To implement this using only transformers without auxiliary networks, it was necessary to shorten the input sequence length to better capture the Markov property in reinforcement learnings. This, however, introduced a trade-off, as it reduced the accuracy of action inference. Our study introduces a model named Decision MetaMamba to resolve these challenges. DMM employs an input token mixer to extract patterns from short sequences and uses a State Space Model (SSM) to selectively combine information from relatively distant sequences. Inspired by Metaformer, this structure was developed by transforming Mamba's input layer into various multi-modal layers. Fortunately, with the advent of Mamba, implemented using parallel selective scanning, we achieved a high-performance sequence model capable of replacing transformers. Based on these innovations, DMM demonstrated excellent performance across various datasets in offline RL, confirming that models using SSM can improve performance by domain-specific alterations of the input layer. Additionally, it maintained its performance even in lightweight models with fewer parameters. These results suggest that decision models based on SSM can pave the way for improved outcomes in future developments.
Related papers
- Bidirectional Gated Mamba for Sequential Recommendation [56.85338055215429]
Mamba, a recent advancement, has exhibited exceptional performance in time series prediction.
We introduce a new framework named Selective Gated Mamba ( SIGMA) for Sequential Recommendation.
Our results indicate that SIGMA outperforms current models on five real-world datasets.
arXiv Detail & Related papers (2024-08-21T09:12:59Z) - Transformers to SSMs: Distilling Quadratic Knowledge to Subquadratic Models [92.36510016591782]
We present a method that is able to distill a pretrained Transformer architecture into alternative architectures such as state space models (SSMs)
Our method, called MOHAWK, is able to distill a Mamba-2 variant based on the Phi-1.5 architecture using only 3B tokens and a hybrid version (Hybrid Phi-Mamba) using 5B tokens.
Despite using less than 1% of the training data typically used to train models from scratch, Phi-Mamba boasts substantially stronger performance compared to all past open-source non-Transformer models.
arXiv Detail & Related papers (2024-08-19T17:48:11Z) - Layerwise Recurrent Router for Mixture-of-Experts [42.36093735411238]
Mixture-of-Experts (MoE) architecture stands out for its ability to scale model size without significantly increasing training costs.
Current MoE models often display parameter inefficiency.
We introduce the Layerwise Recurrent Router for Mixture-of-Experts (RMoE)
arXiv Detail & Related papers (2024-08-13T10:25:13Z) - Mamba as Decision Maker: Exploring Multi-scale Sequence Modeling in Offline Reinforcement Learning [16.23977055134524]
We propose a novel action predictor sequence, named Mamba Decision Maker (MambaDM)
MambaDM is expected to be a promising alternative for sequence modeling paradigms, owing to its efficient modeling of multi-scale dependencies.
This paper delves into the sequence modeling capabilities of MambaDM in the RL domain, paving the way for future advancements.
arXiv Detail & Related papers (2024-06-04T06:49:18Z) - Decision Mamba Architectures [1.4255659581428335]
Decision Mamba architecture has shown to outperform Transformers across various task domains.
We introduce two novel methods, Decision Mamba (DM) and Hierarchical Decision Mamba (HDM)
We demonstrate the superiority of Mamba models over their Transformer counterparts in a majority of tasks.
arXiv Detail & Related papers (2024-05-13T17:18:08Z) - Modality Prompts for Arbitrary Modality Salient Object Detection [57.610000247519196]
This paper delves into the task of arbitrary modality salient object detection (AM SOD)
It aims to detect salient objects from arbitrary modalities, eg RGB images, RGB-D images, and RGB-D-T images.
A novel modality-adaptive Transformer (MAT) will be proposed to investigate two fundamental challenges of AM SOD.
arXiv Detail & Related papers (2024-05-06T11:02:02Z) - Unleashing Network Potentials for Semantic Scene Completion [50.95486458217653]
This paper proposes a novel SSC framework - Adrial Modality Modulation Network (AMMNet)
AMMNet introduces two core modules: a cross-modal modulation enabling the interdependence of gradient flows between modalities, and a customized adversarial training scheme leveraging dynamic gradient competition.
Extensive experimental results demonstrate that AMMNet outperforms state-of-the-art SSC methods by a large margin.
arXiv Detail & Related papers (2024-03-12T11:48:49Z) - The Hidden Attention of Mamba Models [54.50526986788175]
The Mamba layer offers an efficient selective state space model (SSM) that is highly effective in modeling multiple domains.
We show that such models can be viewed as attention-driven models.
This new perspective enables us to empirically and theoretically compare the underlying mechanisms to that of the self-attention layers in transformers.
arXiv Detail & Related papers (2024-03-03T18:58:21Z) - Mamba: Linear-Time Sequence Modeling with Selective State Spaces [31.985243136674146]
Foundation models are almost universally based on the Transformer architecture and its core attention module.
We identify that a key weakness of such models is their inability to perform content-based reasoning.
We integrate these selective SSMs into a simplified end-to-end neural network architecture without attention or even blocks (Mamba)
As a general sequence model backbone, Mamba achieves state-of-the-art performance across several modalities such as language, audio, and genomics.
arXiv Detail & Related papers (2023-12-01T18:01:34Z) - A Reinforcement Learning Approach for Sequential Spatial Transformer
Networks [6.585049648605185]
We formulate the task as a Markovian Decision Process (MDP) and use RL to solve this sequential decision-making problem.
In our method, we are not bound to the differentiability of the sampling modules.
We design multiple experiments to verify the effectiveness of our method using cluttered MNIST and Fashion-MNIST datasets.
arXiv Detail & Related papers (2021-06-27T17:41:17Z) - Model Fusion via Optimal Transport [64.13185244219353]
We present a layer-wise model fusion algorithm for neural networks.
We show that this can successfully yield "one-shot" knowledge transfer between neural networks trained on heterogeneous non-i.i.d. data.
arXiv Detail & Related papers (2019-10-12T22:07:15Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.