MambaX-Net: Dual-Input Mamba-Enhanced Cross-Attention Network for Longitudinal MRI Segmentation
- URL: http://arxiv.org/abs/2510.17529v1
- Date: Mon, 20 Oct 2025 13:32:42 GMT
- Title: MambaX-Net: Dual-Input Mamba-Enhanced Cross-Attention Network for Longitudinal MRI Segmentation
- Authors: Yovin Yahathugoda, Davide Prezzi, Piyalitt Ittichaiwong, Vicky Goh, Sebastien Ourselin, Michela Antonelli
- Abstract summary: MambaX-Net is a novel semi-supervised, dual-scan 3D segmentation architecture. It computes the segmentation for time point t by leveraging the MRI and the corresponding segmentation mask from the previous time point. MambaX-Net was evaluated on a longitudinal AS dataset, and results showed that it significantly outperforms state-of-the-art U-Net and Transformer-based models.
- Score: 0.9342663938194833
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Active Surveillance (AS) is a treatment option for managing low- and intermediate-risk prostate cancer (PCa), aiming to avoid overtreatment while monitoring disease progression through serial MRI and clinical follow-up. Accurate prostate segmentation is an important preliminary step for automating this process, enabling automated detection and diagnosis of PCa. However, existing deep-learning segmentation models are often trained on single-time-point, expertly annotated datasets, making them unsuitable for longitudinal AS analysis, where multiple time points and a scarcity of expert labels hinder effective fine-tuning. To address these challenges, we propose MambaX-Net, a novel semi-supervised, dual-scan 3D segmentation architecture that computes the segmentation for time point t by leveraging the MRI and the corresponding segmentation mask from the previous time point. We introduce two new components: (i) a Mamba-enhanced Cross-Attention Module, which integrates the Mamba block into cross-attention to efficiently capture temporal evolution and long-range spatial dependencies, and (ii) a Shape Extractor Module that encodes the previous segmentation mask into a latent anatomical representation for refined zone delineation. Moreover, we introduce a semi-supervised self-training strategy that leverages pseudo-labels generated from a pre-trained nnU-Net, enabling effective learning without expert annotations. MambaX-Net was evaluated on a longitudinal AS dataset, and results showed that it significantly outperforms state-of-the-art U-Net and Transformer-based models, achieving superior prostate zone segmentation even when trained on limited and noisy data.
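The dual-scan idea above conditions the segmentation at time t on features from the scan at time t-1 via cross-attention. A minimal NumPy sketch of that core operation is shown below; the token shapes, feature dimension, and function names are illustrative assumptions, not the paper's implementation (which additionally routes the attention through a Mamba block and a Shape Extractor).

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(curr_feats, prev_feats):
    """Toy cross-attention: queries come from the current scan's features,
    keys/values from the previous scan's features (hypothetical shapes)."""
    d_k = curr_feats.shape[-1]
    scores = curr_feats @ prev_feats.T / np.sqrt(d_k)  # (Nq, Nk) similarity
    weights = softmax(scores, axis=-1)                 # rows sum to 1
    return weights @ prev_feats                        # (Nq, d) attended output

rng = np.random.default_rng(0)
curr = rng.standard_normal((4, 8))  # 4 tokens from the scan at time t
prev = rng.standard_normal((6, 8))  # 6 tokens from the scan at time t-1
out = cross_attention(curr, prev)
print(out.shape)  # (4, 8)
```

Each current-scan token thus aggregates a weighted mixture of previous-scan features, which is what lets the network exploit temporal evolution between time points.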
Related papers
- Promptable segmentation with region exploration enables minimal-effort expert-level prostate cancer delineation [2.056389633034677]
This work aims to bridge the gap between automated and manual segmentation through a framework driven by user-provided point prompts. The framework was evaluated on two public prostate MR datasets (PROMIS and PICAI, with 566 and 1090 cases).
arXiv Detail & Related papers (2026-02-19T20:29:41Z) - Rethinking Convergence in Deep Learning: The Predictive-Corrective Paradigm for Anatomy-Informed Brain MRI Segmentation [30.94379425064039]
We introduce the Predictive-Corrective (PC) paradigm, a framework that decouples the modeling task to fundamentally accelerate learning. PCambaNet is composed of two synergistic modules. First, the Predictive Prior Module (PPM) generates a coarse approximation at low computational cost. Next, the Corrective Residual Network (CRN) learns to model the residual error, focusing the network's full capacity on refining these challenging regions.
arXiv Detail & Related papers (2025-10-17T08:51:33Z) - CLAPS: A CLIP-Unified Auto-Prompt Segmentation for Multi-Modal Retinal Imaging [47.04292769940597]
We propose CLIP-unified Auto-Prompt (CLAPS), a novel method for unified segmentation across diverse tasks and modalities in retinal imaging. Our approach begins by pre-training a CLIP-based image encoder on a large, multi-modal retinal dataset. To unify tasks and resolve ambiguity, we use text prompts enhanced with a unique "modality signature" for each imaging modality.
arXiv Detail & Related papers (2025-09-10T14:14:49Z) - Semi-Supervised Medical Image Segmentation via Dual Networks [1.904929457002693]
We propose an innovative semi-supervised 3D medical image segmentation method to reduce the dependency on large, expert-labeled datasets. We introduce a dual-network architecture to address the limitations of existing methods in using contextual information. Experiments on clinical magnetic resonance imaging demonstrate that our approach outperforms state-of-the-art techniques.
arXiv Detail & Related papers (2025-05-23T09:59:26Z) - PathSegDiff: Pathology Segmentation using Diffusion model representations [63.20694440934692]
We propose PathSegDiff, a novel approach for histopathology image segmentation that leverages Latent Diffusion Models (LDMs) as pre-trained feature extractors. Our method utilizes a pathology-specific LDM, guided by a self-supervised encoder, to extract rich semantic information from H&E stained histopathology images. Our experiments demonstrate significant improvements over traditional methods on the BCSS and GlaS datasets.
arXiv Detail & Related papers (2025-04-09T14:58:21Z) - Multi-encoder nnU-Net outperforms transformer models with self-supervised pretraining [0.0]
This study addresses the essential task of medical image segmentation, which involves the automatic identification and delineation of anatomical structures and pathological regions in medical images. We propose a novel self-supervised learning Multi-encoder nnU-Net architecture designed to process multiple MRI modalities independently through separate encoders. Our Multi-encoder nnU-Net demonstrates exceptional performance, achieving a Dice Similarity Coefficient (DSC) of 93.72%, which surpasses that of other models such as vanilla nnU-Net, SegResNet, and Swin UNETR.
arXiv Detail & Related papers (2025-04-04T14:31:06Z) - PMT: Progressive Mean Teacher via Exploring Temporal Consistency for Semi-Supervised Medical Image Segmentation [51.509573838103854]
We propose a semi-supervised learning framework, termed Progressive Mean Teachers (PMT), for medical image segmentation.
Our PMT generates high-fidelity pseudo labels by learning robust and diverse features in the training process.
Experimental results on two datasets with different modalities, i.e., CT and MRI, demonstrate that our method outperforms the state-of-the-art medical image segmentation approaches.
arXiv Detail & Related papers (2024-09-08T15:02:25Z) - Self-Supervised Neuron Segmentation with Multi-Agent Reinforcement
Learning [53.00683059396803]
Mask image model (MIM) has been widely used due to its simplicity and effectiveness in recovering original information from masked images.
We propose a decision-based MIM that utilizes reinforcement learning (RL) to automatically search for optimal image masking ratio and masking strategy.
Our approach has a significant advantage over alternative self-supervised methods on the task of neuron segmentation.
arXiv Detail & Related papers (2023-10-06T10:40:46Z) - Self-Supervised Correction Learning for Semi-Supervised Biomedical Image
Segmentation [84.58210297703714]
We propose a self-supervised correction learning paradigm for semi-supervised biomedical image segmentation.
We design a dual-task network, including a shared encoder and two independent decoders for segmentation and lesion region inpainting.
Experiments on three medical image segmentation datasets for different tasks demonstrate the outstanding performance of our method.
arXiv Detail & Related papers (2023-01-12T08:19:46Z) - Longitudinal detection of new MS lesions using Deep Learning [0.0]
We describe a deep-learning-based pipeline addressing the task of detecting and segmenting new MS lesions.
First, we propose to use transfer-learning from a model trained on a segmentation task using single time-points.
Second, we propose a data synthesis strategy to generate realistic longitudinal time-points with new lesions.
arXiv Detail & Related papers (2022-06-16T16:09:04Z) - MetricUNet: Synergistic Image- and Voxel-Level Learning for Precise CT Prostate Segmentation via Online Sampling [66.01558025094333]
We propose a two-stage framework, with the first stage to quickly localize the prostate region and the second stage to precisely segment the prostate.
We introduce a novel online metric learning module through voxel-wise sampling in the multi-task network.
Our method can effectively learn more representative voxel-level features compared with the conventional learning methods with cross-entropy or Dice loss.
arXiv Detail & Related papers (2020-05-15T10:37:02Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.