MSM-Seg: A Modality-and-Slice Memory Framework with Category-Agnostic Prompting for Multi-Modal Brain Tumor Segmentation
- URL: http://arxiv.org/abs/2510.10679v1
- Date: Sun, 12 Oct 2025 16:08:16 GMT
- Title: MSM-Seg: A Modality-and-Slice Memory Framework with Category-Agnostic Prompting for Multi-Modal Brain Tumor Segmentation
- Authors: Yuxiang Luo, Qing Xu, Hai Huang, Yuqi Ouyang, Zhen Chen, Wenting Duan,
- Abstract summary: Multi-modal brain tumor segmentation is critical for clinical diagnosis. We propose the MSM-Seg framework for multi-modal brain tumor segmentation. Our framework outperforms state-of-the-art methods in multi-modal metastases and glioma tumor segmentation.
- Score: 7.189055048337262
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Multi-modal brain tumor segmentation is critical for clinical diagnosis, and it requires accurate identification of distinct internal anatomical subregions. While recent prompt-based segmentation paradigms enable interactive experiences for clinicians, existing methods ignore cross-modal correlations and rely on labor-intensive category-specific prompts, limiting their applicability in real-world scenarios. To address these issues, we propose the MSM-Seg framework for multi-modal brain tumor segmentation. MSM-Seg introduces a novel dual-memory segmentation paradigm that synergistically integrates multi-modal and inter-slice information with an efficient category-agnostic prompt for brain tumor understanding. To this end, we first devise a modality-and-slice memory attention (MSMA) to exploit the cross-modal and inter-slice relationships among the input scans. Then, we propose a multi-scale category-agnostic prompt encoder (MCP-Encoder) to provide tumor region guidance for decoding. Moreover, we devise a modality-adaptive fusion decoder (MF-Decoder) that leverages the complementary decoding information across different modalities to improve segmentation accuracy. Extensive experiments on different MRI datasets demonstrate that our MSM-Seg framework outperforms state-of-the-art methods in multi-modal metastases and glioma tumor segmentation. The code is available at https://github.com/xq141839/MSM-Seg.
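The dual-memory idea in the abstract can be sketched as follows. This is a minimal NumPy illustration, not the paper's implementation: it assumes plain scaled dot-product attention and additive residual fusion, and all names, shapes, and the fusion rule are hypothetical.

```python
# Sketch of a dual-memory attention step in the spirit of MSMA: a query slice
# attends over a modality memory (same slice, other modalities) and a slice
# memory (same modality, neighboring slices), and both contexts are fused
# back into the slice features via a residual sum (an assumed choice).
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attend(query, memory):
    """Scaled dot-product attention of `query` (n, d) over `memory` (m, d)."""
    d = query.shape[-1]
    scores = query @ memory.T / np.sqrt(d)      # (n, m) similarity scores
    return softmax(scores, axis=-1) @ memory    # (n, d) attended context

def dual_memory_attention(slice_feat, modality_mem, slice_mem):
    """Fuse cross-modal and inter-slice context into the current slice.

    slice_feat:   (n_tokens, d) features of the current slice/modality
    modality_mem: (m1, d) features from the other modalities of this slice
    slice_mem:    (m2, d) features from adjacent slices of this modality
    """
    cross_modal = attend(slice_feat, modality_mem)
    inter_slice = attend(slice_feat, slice_mem)
    return slice_feat + cross_modal + inter_slice  # residual fusion (assumed)

rng = np.random.default_rng(0)
d = 16
out = dual_memory_attention(rng.normal(size=(8, d)),
                            rng.normal(size=(24, d)),
                            rng.normal(size=(12, d)))
print(out.shape)  # (8, 16)
```

The output keeps the query's token count and feature width, so such a block could be stacked slice-by-slice without reshaping.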
Related papers
- TGC-Net: A Structure-Aware and Semantically-Aligned Framework for Text-Guided Medical Image Segmentation [56.09179939570486]
We propose TGC-Net, a CLIP-based framework focusing on parameter-efficient, task-specific adaptations. TGC-Net achieves state-of-the-art performance with substantially fewer trainable parameters, including notable Dice gains on challenging benchmarks.
arXiv Detail & Related papers (2025-12-24T12:06:26Z) - Sim4Seg: Boosting Multimodal Multi-disease Medical Diagnosis Segmentation with Region-Aware Vision-Language Similarity Masks [54.00822479127598]
We introduce a medical vision-language task named Medical Diagnosis Segmentation (MDS). MDS aims to understand clinical queries for medical images and generate the corresponding segmentation masks as well as diagnostic results. We propose Sim4Seg, a novel framework that improves the performance of diagnosis segmentation.
arXiv Detail & Related papers (2025-11-10T03:22:42Z) - Unified Multimodal Coherent Field: Synchronous Semantic-Spatial-Vision Fusion for Brain Tumor Segmentation [13.108816349958659]
We propose the Unified Multimodal Coherent Field (UMCF) method for brain tumor segmentation. The method achieves synchronous interactive fusion of visual, semantic, and spatial information within a unified 3D space. On Brain Tumor Segmentation (BraTS) datasets, UMCF+nnU-Net achieves average Dice scores of 0.8579 and 0.8977, respectively, with an average 4.18% improvement across mainstream architectures.
arXiv Detail & Related papers (2025-09-22T08:45:39Z) - A Multi-Modal Fusion Framework for Brain Tumor Segmentation Based on 3D Spatial-Language-Vision Integration and Bidirectional Interactive Attention Mechanism [6.589206192038366]
The framework was evaluated on the BraTS 2020 dataset comprising 369 multi-institutional MRI scans. The proposed method achieved an average Dice coefficient of 0.8505 and a 95% Hausdorff distance of 2.8256 mm across the enhancing tumor, tumor core, and whole tumor regions.
arXiv Detail & Related papers (2025-07-11T13:21:56Z) - MAST-Pro: Dynamic Mixture-of-Experts for Adaptive Segmentation of Pan-Tumors with Knowledge-Driven Prompts [54.915060471994686]
We propose MAST-Pro, a novel framework that integrates dynamic Mixture-of-Experts (D-MoE) and knowledge-driven prompts for pan-tumor segmentation. Specifically, text and anatomical prompts provide domain-specific priors guiding tumor representation learning, while D-MoE dynamically selects experts to balance generic and tumor-specific feature learning. Experiments on multi-anatomical tumor datasets demonstrate that MAST-Pro outperforms state-of-the-art approaches, achieving up to a 5.20% average improvement while reducing trainable parameters by 91.04%, without compromising accuracy.
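Dynamic expert routing of the kind this summary describes can be sketched in a few lines. This is a generic mixture-of-experts gating illustration, not MAST-Pro's actual design: the linear experts, the top-k choice, and the additive prompt conditioning are all assumptions.

```python
# Sketch of dynamic mixture-of-experts routing: a gating network scores
# experts from the input feature (here conditioned on a prompt embedding by
# simple addition, an assumed choice), keeps the top-k experts, and mixes
# their outputs with renormalized softmax weights.
import numpy as np

rng = np.random.default_rng(1)
d, n_experts, top_k = 8, 4, 2

experts = [rng.normal(size=(d, d)) for _ in range(n_experts)]  # linear experts
gate_w = rng.normal(size=(d, n_experts))                       # gating weights

def moe_forward(x, prompt):
    """Route a feature x (d,) through the top-k experts, gated by x + prompt."""
    logits = (x + prompt) @ gate_w                 # (n_experts,) gate scores
    top = np.argsort(logits)[-top_k:]              # indices of the top-k experts
    w = np.exp(logits[top] - logits[top].max())    # stable softmax over top-k
    w /= w.sum()
    return sum(wi * (x @ experts[i]) for wi, i in zip(w, top))

y = moe_forward(rng.normal(size=d), rng.normal(size=d))
print(y.shape)  # (8,)
```

Only the selected experts are evaluated, which is what makes this kind of routing cheaper than running every expert on every input.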
arXiv Detail & Related papers (2025-03-18T15:39:44Z) - Enhanced MRI Representation via Cross-series Masking [48.09478307927716]
We propose a Cross-Series Masking (CSM) strategy for effectively learning MRI representations in a self-supervised manner. The method achieves state-of-the-art performance on both public and in-house datasets.
arXiv Detail & Related papers (2024-12-10T10:32:09Z) - MulModSeg: Enhancing Unpaired Multi-Modal Medical Image Segmentation with Modality-Conditioned Text Embedding and Alternating Training [10.558275557142137]
We propose a simple Multi-Modal (MulModSeg) strategy to enhance medical image segmentation across multiple modalities.
MulModSeg incorporates a modality-conditioned text embedding framework via a frozen text encoder.
It consistently outperforms previous methods in segmenting abdominal multiorgan and cardiac substructures for both CT and MR.
arXiv Detail & Related papers (2024-11-23T14:37:01Z) - multiPI-TransBTS: A Multi-Path Learning Framework for Brain Tumor Image Segmentation Based on Multi-Physical Information [1.7359724605901228]
Brain tumor segmentation (BraTS) plays a critical role in clinical diagnosis, treatment planning, and monitoring the progression of brain tumors.
Due to the variability in tumor appearance, size, and intensity across different MRI modalities, automated segmentation remains a challenging task.
We propose a novel Transformer-based framework, multiPI-TransBTS, which integrates multi-physical information to enhance segmentation accuracy.
arXiv Detail & Related papers (2024-09-18T17:35:19Z) - Mask-Enhanced Segment Anything Model for Tumor Lesion Semantic Segmentation [48.107348956719775]
We introduce Mask-Enhanced SAM (M-SAM), an innovative architecture tailored for 3D tumor lesion segmentation.
We propose a novel Mask-Enhanced Adapter (MEA) within M-SAM that enriches the semantic information of medical images with positional data from coarse segmentation masks.
Our M-SAM achieves high segmentation accuracy and also exhibits robust generalization.
arXiv Detail & Related papers (2024-03-09T13:37:02Z) - Modality-Aware and Shift Mixer for Multi-modal Brain Tumor Segmentation [12.094890186803958]
We present a novel Modality-Aware and Shift Mixer that integrates intra-modality and inter-modality dependencies of multi-modal images for effective and robust brain tumor segmentation.
Specifically, we introduce a Modality-Aware module, informed by neuroimaging studies, to model specific modality-pair relationships at low levels, and a Modality-Shift module with specific mosaic patterns to explore complex cross-modal relationships at high levels via self-attention.
arXiv Detail & Related papers (2024-03-04T14:21:51Z) - Cross-modality Guidance-aided Multi-modal Learning with Dual Attention for MRI Brain Tumor Grading [47.50733518140625]
Brain tumor represents one of the most fatal cancers around the world, and is very common in children and the elderly.
We propose a novel cross-modality guidance-aided multi-modal learning with dual attention for addressing the task of MRI brain tumor grading.
arXiv Detail & Related papers (2024-01-17T07:54:49Z) - Cross-Modality Deep Feature Learning for Brain Tumor Segmentation [158.8192041981564]
This paper proposes a novel cross-modality deep feature learning framework to segment brain tumors from the multi-modality MRI data.
The core idea is to mine rich patterns across the multi-modality data to compensate for the insufficient data scale.
Comprehensive experiments are conducted on the BraTS benchmarks, which show that the proposed cross-modality deep feature learning framework can effectively improve the brain tumor segmentation performance.
arXiv Detail & Related papers (2022-01-07T07:46:01Z) - Robust Multimodal Brain Tumor Segmentation via Feature Disentanglement and Gated Fusion [71.87627318863612]
We propose a novel multimodal segmentation framework which is robust to the absence of imaging modalities.
Our network uses feature disentanglement to decompose the input modalities into modality-specific appearance codes.
We validate our method on the important yet challenging multimodal brain tumor segmentation task with the BRATS challenge dataset.
arXiv Detail & Related papers (2020-02-22T14:32:04Z)
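Gated fusion that tolerates missing modalities, as in the last entry above, can be sketched as follows. This is an illustrative NumPy toy, not the paper's network: the elementwise sigmoid gate and the gate-normalized average are assumed choices, and all names and shapes are hypothetical.

```python
# Sketch of gated fusion over per-modality features: each available modality
# contributes a feature vector, a sigmoid gate weights it elementwise, and
# the fused representation is the gate-weighted average. Because the average
# is normalized by the sum of gates, a missing modality can simply be omitted
# from the input dict without rescaling anything.
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gated_fusion(features, gate_w):
    """Fuse a dict of modality -> (d,) features with per-modality gates.

    features: available modality features (missing modalities simply absent)
    gate_w:   dict of modality -> (d,) gate parameters
    """
    gated, total = 0.0, 0.0
    for name, f in features.items():
        g = sigmoid(gate_w[name] * f)   # elementwise gate for this modality
        gated += g * f
        total += g
    return gated / total                # normalize so scale stays stable

rng = np.random.default_rng(2)
d = 6
gate_w = {m: rng.normal(size=d) for m in ("t1", "t1ce", "t2", "flair")}
feats = {m: rng.normal(size=d) for m in ("t1", "t2", "flair")}  # t1ce missing
fused = gated_fusion(feats, gate_w)
print(fused.shape)  # (6,)
```

Dropping "t1ce" from the input changes which gates participate but leaves the output shape and scale intact, which is the robustness-to-missing-modalities property the abstract highlights.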
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.