Nirvana: A Specialized Generalist Model With Task-Aware Memory Mechanism
- URL: http://arxiv.org/abs/2510.26083v1
- Date: Thu, 30 Oct 2025 02:41:54 GMT
- Title: Nirvana: A Specialized Generalist Model With Task-Aware Memory Mechanism
- Authors: Yuhua Jiang, Shuang Cheng, Yihao Liu, Ermo Hua, Che Jiang, Weigao Sun, Yu Cheng, Feifei Gao, Biqing Qi, Bowen Zhou
- Abstract summary: Specialized Generalist Models (SGMs) aim to preserve broad capabilities while achieving expert-level performance in target domains. We present Nirvana, an SGM with a specialized memory mechanism, linear time complexity, and test-time task information extraction.
- Score: 50.24237923346775
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Specialized Generalist Models (SGMs) aim to preserve broad capabilities while achieving expert-level performance in target domains. However, traditional LLM architectures, including Transformers, Linear Attention, and hybrid models, do not employ a specialized memory mechanism guided by task information. In this paper, we present Nirvana, an SGM with a specialized memory mechanism, linear time complexity, and test-time task information extraction. In addition, we propose the Task-Aware Memory Trigger ($\textit{Trigger}$), which flexibly adjusts the memory mechanism based on the current task's requirements. In Trigger, each incoming sample is treated as a self-supervised fine-tuning task, enabling Nirvana to adapt its task-related parameters on the fly to domain shifts. We also design the Specialized Memory Updater ($\textit{Updater}$), which dynamically memorizes the context under Trigger's guidance. We conduct experiments on both general language tasks and specialized medical tasks. On a variety of natural language modeling benchmarks, Nirvana achieves results competitive with or superior to existing LLM architectures. To demonstrate the effectiveness of Trigger on specialized tasks, we test Nirvana's performance on a challenging medical task, Magnetic Resonance Imaging (MRI). We post-train the frozen Nirvana backbone with lightweight codecs on paired electromagnetic signals and MRI images. Even with the backbone frozen, Trigger guides the model to adapt to the MRI domain by adjusting its task-related parameters. Nirvana achieves higher-quality MRI reconstruction than both conventional MRI models and models built on traditional LLM backbones, and it can also generate accurate preliminary clinical reports.
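The abstract gives no equations or code, but the Trigger/Updater pairing it describes (a linear-time recurrent memory whose task-related parameters are adapted by a per-sample self-supervised objective at test time) can be illustrated with a small sketch. Everything below is an assumption made for illustration: the matrix-valued memory, the `task_gate` parameter, and the next-step MSE objective are hypothetical stand-ins, not the paper's actual design.

```python
# Hypothetical sketch of a Trigger/Updater-style mechanism as described in the
# abstract; names, shapes, and the self-supervised objective are assumptions.
import torch
import torch.nn as nn


class SpecializedMemoryUpdater(nn.Module):
    """Linear-time memory: one matrix-state update per token (assumed form)."""

    def __init__(self, dim: int):
        super().__init__()
        self.k_proj = nn.Linear(dim, dim, bias=False)
        self.v_proj = nn.Linear(dim, dim, bias=False)
        self.q_proj = nn.Linear(dim, dim, bias=False)
        # The "task-related" parameter that Trigger may adapt at test time.
        self.task_gate = nn.Parameter(torch.zeros(dim))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, dim); memory: (batch, dim, dim)
        b, t, d = x.shape
        memory = x.new_zeros(b, d, d)
        gate = torch.sigmoid(self.task_gate)  # per-channel decay in (0, 1)
        outputs = []
        for i in range(t):  # O(seq_len) recurrence, not O(seq_len^2) attention
            k = self.k_proj(x[:, i])
            v = self.v_proj(x[:, i])
            q = self.q_proj(x[:, i])
            # Decay the old memory, then write the new key-value outer product.
            memory = gate.view(1, d, 1) * memory + k.unsqueeze(-1) * v.unsqueeze(1)
            outputs.append(torch.einsum("bd,bde->be", q, memory))
        return torch.stack(outputs, dim=1)


def trigger_step(updater: SpecializedMemoryUpdater, x: torch.Tensor,
                 lr: float = 1e-2) -> float:
    """Treat one incoming sample as a self-supervised fine-tuning task
    (next-step prediction, an assumed stand-in objective) and update only the
    task-related gate, leaving every other weight frozen."""
    loss = nn.functional.mse_loss(updater(x)[:, :-1], x[:, 1:])
    (grad,) = torch.autograd.grad(loss, updater.task_gate)
    with torch.no_grad():
        updater.task_gate -= lr * grad
    return loss.item()


# Example usage: adapt on an incoming sample, then run the specialized memory.
updater = SpecializedMemoryUpdater(dim=64)
x = torch.randn(2, 16, 64)
print(trigger_step(updater, x))  # self-supervised loss before adaptation
y = updater(x)                   # (2, 16, 64)
```

Under this reading, `trigger_step` plays the role of Trigger (test-time adaptation of the task-related parameters only) and the recurrent loop in `forward` plays the role of Updater, with cost growing linearly in sequence length.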
Related papers
- Few-Shot Deployment of Pretrained MRI Transformers in Brain Imaging Tasks [2.982793366290863]
We propose a framework for the few-shot deployment of pretrained MRI transformers in diverse brain imaging tasks.
By utilizing the Masked Autoencoder (MAE) pretraining strategy, we obtain highly transferable latent representations that generalize well across tasks and datasets.
arXiv Detail & Related papers (2025-08-07T18:53:28Z)
- ContextMRI: Enhancing Compressed Sensing MRI through Metadata Conditioning [51.26601171361753]
We propose ContextMRI, a text-conditioned diffusion model for MRI that integrates granular metadata into the reconstruction process.
We show that increasing the fidelity of metadata, ranging from slice location and contrast to patient age, sex, and pathology, systematically boosts reconstruction performance.
arXiv Detail & Related papers (2025-01-08T05:15:43Z)
- LLM-RG4: Flexible and Factual Radiology Report Generation across Diverse Input Contexts [14.72366043711941]
Current radiology report generation models are constrained to a fixed task paradigm.
We propose LLM-RG4, a novel radiology report generation (RRG) framework based on large language models (LLMs).
We show that our model has minimal input-agnostic hallucinations, whereas current open-source models commonly suffer from this problem.
arXiv Detail & Related papers (2024-12-16T17:29:51Z)
- Prompt Your Brain: Scaffold Prompt Tuning for Efficient Adaptation of fMRI Pre-trained Model [15.330413605539542]
Scaffold Prompt Tuning (ScaPT) is a novel prompt-based framework for adapting large-scale functional magnetic resonance imaging (fMRI) pre-trained models to downstream tasks.
It offers high parameter efficiency and improved performance compared to fine-tuning and prompt-tuning baselines.
ScaPT outperforms fine-tuning and multitask-based prompt tuning in neurodegenerative disease diagnosis/prognosis and personality trait prediction.
arXiv Detail & Related papers (2024-08-20T06:08:37Z)
- fMRI-PTE: A Large-scale fMRI Pretrained Transformer Encoder for Multi-Subject Brain Activity Decoding [54.17776744076334]
We propose fMRI-PTE, an innovative auto-encoder approach for fMRI pre-training.
Our approach involves transforming fMRI signals into unified 2D representations, ensuring consistency in dimensions and preserving brain activity patterns.
Our contributions encompass introducing fMRI-PTE, innovative data transformation, efficient training, a novel learning strategy, and the universal applicability of our approach.
arXiv Detail & Related papers (2023-11-01T07:24:22Z)
- Recurrent Action Transformer with Memory [39.58317527488534]
This paper proposes a novel model architecture that incorporates a recurrent memory mechanism designed to regulate information retention.
We conduct experiments on memory-intensive environments (ViZDoom-Two-Colors, T-Maze, Memory Maze, Minigrid-Memory), classic Atari games, and MuJoCo control environments.
The results show that using memory can significantly improve performance in memory-intensive environments, while maintaining or improving results in classic environments.
arXiv Detail & Related papers (2023-06-15T19:29:08Z)
- Sequential Transfer Learning to Decode Heard and Imagined Timbre from fMRI Data [0.0]
We present a sequential transfer learning framework for transformers on functional Magnetic Resonance Imaging (fMRI) data.
In the first phase, we pre-train our stacked-encoder transformer architecture on Next Thought Prediction.
In the second phase, we fine-tune the models and train additional fresh models on the supervised task of predicting whether or not two sequences of fMRI data were recorded while listening to the same musical timbre.
arXiv Detail & Related papers (2023-05-22T16:58:26Z)
- BrainCLIP: Bridging Brain and Visual-Linguistic Representation Via CLIP for Generic Natural Visual Stimulus Decoding [51.911473457195555]
BrainCLIP is a task-agnostic fMRI-based brain decoding model.
It bridges the modality gap between brain activity, image, and text.
BrainCLIP can reconstruct visual stimuli with high semantic fidelity.
arXiv Detail & Related papers (2023-02-25T03:28:54Z)
- Why Do Pretrained Language Models Help in Downstream Tasks? An Analysis of Head and Prompt Tuning [66.44344616836158]
We propose an analysis framework that links the pretraining and downstream tasks with an underlying latent variable generative model of text.
We show that 1) under certain non-degeneracy conditions on the HMM, simple classification heads can solve the downstream task, 2) prompt tuning obtains downstream guarantees with weaker non-degeneracy conditions, and 3) our recovery guarantees for the memory-augmented HMM are stronger than for the vanilla HMM.
arXiv Detail & Related papers (2021-06-17T03:31:47Z)
- Overcoming Catastrophic Forgetting with Gaussian Mixture Replay [79.0660895390689]
We present a rehearsal-based approach for continual learning (CL) based on Gaussian Mixture Models (GMMs).
We mitigate catastrophic forgetting (CF) by generating samples from previous tasks and merging them with current training data; a minimal sketch of this replay loop appears after this entry.
We evaluate GMR on multiple image datasets, which are divided into class-disjoint sub-tasks.
arXiv Detail & Related papers (2021-04-19T11:41:34Z)
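The Gaussian-mixture replay loop summarized above is concrete enough to sketch. The snippet below is a generic reconstruction of pseudo-rehearsal with per-class Gaussian mixtures, not the paper's released code; the per-class-generator design, `replay_ratio`, and `n_components` are all assumptions.

```python
# Generic pseudo-rehearsal sketch: fit a GMM per seen class, sample from the
# mixtures to replay old tasks, and merge replays with current training data.
import numpy as np
from sklearn.mixture import GaussianMixture


def train_with_gmr(tasks, fit_fn, n_components=5, replay_ratio=0.5):
    """tasks: iterable of (X, y) arrays arriving sequentially (X: 2D features).
    fit_fn: callable that trains the classifier on the merged data."""
    gmms = {}  # one generator per class seen so far (assumed design)
    for X, y in tasks:
        # 1) Generate replay samples for previously seen classes.
        replay_X, replay_y = [], []
        if gmms:
            per_class = max(1, int(replay_ratio * len(X)) // len(gmms))
            for cls, gmm in gmms.items():
                Xr, _ = gmm.sample(per_class)
                replay_X.append(Xr)
                replay_y.append(np.full(len(Xr), cls))
        # 2) Merge generated samples with the current task's real data.
        fit_fn(np.vstack([X, *replay_X]), np.concatenate([y, *replay_y]))
        # 3) Refresh the per-class generators from the current task's data.
        for cls in np.unique(y):
            n = min(n_components, int((y == cls).sum()))
            gmms[cls] = GaussianMixture(n_components=n).fit(X[y == cls])
```

For class-disjoint sub-tasks, each new task introduces fresh classes, so step 1 replays only earlier tasks' classes while step 3 adds generators for the new ones.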
- Attend and Decode: 4D fMRI Task State Decoding Using Attention Models [2.6954666679827137]
We present a novel architecture called Brain Attend and Decode (BAnD).
BAnD uses residual convolutional neural networks for spatial feature extraction and self-attention mechanisms for temporal modeling.
We achieve significant performance gain compared to previous works on a 7-task benchmark from the Human Connectome Project-Young Adult dataset.
arXiv Detail & Related papers (2020-04-10T21:29:34Z)
This list is automatically generated from the titles and abstracts of the papers on this site.