PanopMamba: Vision State Space Modeling for Nuclei Panoptic Segmentation
- URL: http://arxiv.org/abs/2601.16631v1
- Date: Fri, 23 Jan 2026 10:33:15 GMT
- Title: PanopMamba: Vision State Space Modeling for Nuclei Panoptic Segmentation
- Authors: Ming Kang, Fung Fung Ting, Raphaël C.-W. Phan, Zongyuan Ge, Chee-Ming Ting
- Abstract summary: PanopMamba is a novel hybrid encoder-decoder architecture that integrates Mamba and Transformer. To the best of our knowledge, this is the first Mamba-based approach for panoptic segmentation. We introduce alternative evaluation metrics, including image-level Panoptic Quality ($i$PQ), boundary-weighted PQ ($w$PQ), and frequency-weighted PQ ($fw$PQ).
- Score: 20.689908446030856
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Nuclei panoptic segmentation supports cancer diagnostics by integrating both semantic and instance segmentation of different cell types to analyze overall tissue structure and individual nuclei in histopathology images. Major challenges include detecting small objects, handling ambiguous boundaries, and addressing class imbalance. To address these issues, we propose PanopMamba, a novel hybrid encoder-decoder architecture that integrates Mamba and Transformer with additional feature-enhanced fusion via state space modeling. We design a multiscale Mamba backbone and a State Space Model (SSM)-based fusion network to enable efficient long-range perception in pyramid features, thereby extending the pure encoder-decoder framework while facilitating information sharing across multiscale features of nuclei. The proposed SSM-based feature-enhanced fusion integrates pyramid feature networks and dynamic feature enhancement across different spatial scales, enhancing the feature representation of densely overlapping nuclei in both semantic and spatial dimensions. To the best of our knowledge, this is the first Mamba-based approach for panoptic segmentation. Additionally, we introduce alternative evaluation metrics, including image-level Panoptic Quality ($i$PQ), boundary-weighted PQ ($w$PQ), and frequency-weighted PQ ($fw$PQ), which are specifically designed to address the unique challenges of nuclei segmentation and thereby mitigate the potential bias inherent in vanilla PQ. Experimental evaluations on two multiclass nuclei segmentation benchmark datasets, MoNuSAC2020 and NuInsSeg, demonstrate the superiority of PanopMamba for nuclei panoptic segmentation over state-of-the-art methods. Consequently, the robustness of PanopMamba is validated across various metrics, while the distinctiveness of PQ variants is also demonstrated. Code is available at https://github.com/mkang315/PanopMamba.
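For readers unfamiliar with the baseline that the proposed variants modify, the following is a minimal sketch of vanilla PQ for a single class, using the standard definition: predicted and ground-truth instances match when their IoU exceeds 0.5, and PQ is the sum of matched IoUs divided by |TP| + 0.5|FP| + 0.5|FN|. The function name `panoptic_quality`, the integer instance-map input format, and the choice to return 0.0 when both maps are empty are assumptions made for illustration; this is not taken from the PanopMamba code, and the $i$PQ, $w$PQ, and $fw$PQ variants are not reproduced here because the abstract does not define them.

```python
import numpy as np


def panoptic_quality(gt, pred, iou_thresh=0.5):
    """Vanilla PQ for one class from two integer instance maps (0 = background).

    Hypothetical helper for illustration; not the PanopMamba implementation.
    """
    gt_ids = [i for i in np.unique(gt) if i != 0]
    pred_ids = [i for i in np.unique(pred) if i != 0]

    matched_pred = set()
    iou_sum = 0.0
    tp = 0
    for g in gt_ids:
        g_mask = gt == g
        # Only predictions that overlap this ground-truth instance can match.
        for p in np.unique(pred[g_mask]):
            if p == 0 or p in matched_pred:
                continue
            p_mask = pred == p
            inter = np.logical_and(g_mask, p_mask).sum()
            union = np.logical_or(g_mask, p_mask).sum()
            iou = inter / union
            # IoU > 0.5 guarantees at most one match per instance.
            if iou > iou_thresh:
                tp += 1
                iou_sum += iou
                matched_pred.add(p)
                break

    fp = len(pred_ids) - tp  # unmatched predictions
    fn = len(gt_ids) - tp    # unmatched ground-truth instances
    denom = tp + 0.5 * fp + 0.5 * fn
    # Assumed convention: PQ is 0.0 when there are no instances at all.
    return float(iou_sum / denom) if denom > 0 else 0.0


# Two ground-truth nuclei; the second prediction covers 3 of 4 pixels.
gt = np.array([[1, 1, 0, 0],
               [1, 1, 0, 0],
               [0, 0, 2, 2],
               [0, 0, 2, 2]])
pred = np.array([[1, 1, 0, 0],
                 [1, 1, 0, 0],
                 [0, 0, 2, 2],
                 [0, 0, 0, 2]])
print(panoptic_quality(gt, pred))
```

In this toy case both nuclei are matched (IoUs of 1.0 and 0.75), so PQ is their mean. If the second prediction covered only half of its nucleus, its IoU would be exactly 0.5, the match would fail, and the instance would count as both a false positive and a false negative, which illustrates how sharply vanilla PQ penalizes small objects with imprecise boundaries.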
Related papers
- Multi-label Classification with Panoptic Context Aggregation Networks [61.82285737410154]
This paper introduces the Deep Panoptic Context Aggregation Network (PanCAN), a novel approach that hierarchically integrates multi-order geometric contexts. PanCAN learns multi-order neighborhood relationships at each scale by combining random walks with an attention mechanism. Experiments on NUS-WIDE, PASCAL VOC 2007, and MS-COCO benchmarks demonstrate that PanCAN consistently achieves competitive results.
arXiv Detail & Related papers (2025-12-29T14:16:21Z) - PanFoMa: A Lightweight Foundation Model and Benchmark for Pan-Cancer [54.958921946378304]
We introduce PanFoMa, a lightweight hybrid neural network that combines the strengths of Transformers and state-space models. PanFoMa consists of a front-end local-context encoder with shared self-attention layers to capture complex, order-independent gene interactions. We also construct a large-scale pan-cancer single-cell benchmark, PanFoMaBench, containing over 3.5 million high-quality cells.
arXiv Detail & Related papers (2025-12-02T08:31:31Z) - HyM-UNet: Synergizing Local Texture and Global Context via Hybrid CNN-Mamba Architecture for Medical Image Segmentation [3.976000861085382]
HyM-UNet is designed to synergize the local feature extraction capabilities of CNNs with the efficient global modeling capabilities of Mamba. To bridge the semantic gap between the encoder and the decoder, we propose a Mamba-Guided Fusion Skip Connection. The results demonstrate that HyM-UNet significantly outperforms existing state-of-the-art methods in terms of Dice coefficient and IoU.
arXiv Detail & Related papers (2025-11-22T09:02:06Z) - MambaCAFU: Hybrid Multi-Scale and Multi-Attention Model with Mamba-Based Fusion for Medical Image Segmentation [11.967890140626716]
We propose a hybrid segmentation architecture featuring a three-branch encoder that integrates CNNs, Transformers, and a Mamba-based Attention Fusion mechanism. A multi-scale attention-based CNN decoder reconstructs fine-grained segmentation maps while preserving contextual consistency. Our approach outperforms state-of-the-art methods in accuracy and generalization, while maintaining comparable computational complexity.
arXiv Detail & Related papers (2025-10-04T11:25:10Z) - HSA-Net: Hierarchical and Structure-Aware Framework for Efficient and Scalable Molecular Language Modeling [7.26697833663902]
We propose the Hierarchical and Structure-Aware Network (HSA-Net), a novel framework with two modules that enable hierarchical feature projection and fusion. To adaptively merge multi-level features, we design a Source-Aware Fusion (SAF) module, which flexibly selects fusion experts based on the characteristics of the aggregation features. Extensive experiments demonstrate that our HSA-Net framework quantitatively and qualitatively outperforms current state-of-the-art (SOTA) methods.
arXiv Detail & Related papers (2025-08-10T15:22:42Z) - MS-UMamba: An Improved Vision Mamba Unet for Fetal Abdominal Medical Image Segmentation [1.2721397985664153]
We propose MS-UMamba, a novel hybrid convolutional-Mamba model for fetal ultrasound image segmentation. Specifically, we design a visual state space block integrated with a CNN branch, which leverages Mamba's global modeling strengths. We also propose an efficient multi-scale feature fusion module, which integrates feature information from different layers.
arXiv Detail & Related papers (2025-06-14T10:34:10Z) - MambaHSI: Spatial-Spectral Mamba for Hyperspectral Image Classification [46.111607032455225]
We propose a novel HSI classification model based on a Mamba model, named MambaHSI. Specifically, we design a spatial Mamba block (SpaMB) to model the long-range interaction of the whole image at the pixel level. We propose a spectral Mamba block (SpeMB) to split the spectral vector into multiple groups, mine the relations across different spectral groups, and extract spectral features.
arXiv Detail & Related papers (2025-01-09T03:27:47Z) - S$^2$Mamba: A Spatial-spectral State Space Model for Hyperspectral Image Classification [44.99672241508994]
Land cover analysis using hyperspectral images (HSI) remains an open problem due to their low spatial resolution and complex spectral information.
We propose S$^2$Mamba, a spatial-spectral state space model for hyperspectral image classification, to excavate spatial-spectral contextual features.
arXiv Detail & Related papers (2024-04-28T15:12:56Z) - LKM-UNet: Large Kernel Vision Mamba UNet for Medical Image Segmentation [9.862277278217045]
In this paper, we introduce a Large Kernel Vision Mamba U-shape Network, or LKM-UNet, for medical image segmentation.
A distinguishing feature of LKM-UNet is its use of large Mamba kernels, which excel at local spatial modeling compared with small-kernel CNNs and Transformers.
Comprehensive experiments demonstrate the feasibility and the effectiveness of using large-size Mamba kernels to achieve large receptive fields.
arXiv Detail & Related papers (2024-03-12T05:34:51Z) - Dual-scale Enhanced and Cross-generative Consistency Learning for Semi-supervised Medical Image Segmentation [49.57907601086494]
Medical image segmentation plays a crucial role in computer-aided diagnosis.
We propose a novel Dual-scale Enhanced and Cross-generative consistency learning framework for semi-supervised medical image segmentation (DEC-Seg).
arXiv Detail & Related papers (2023-12-26T12:56:31Z) - Lesion-aware Dynamic Kernel for Polyp Segmentation [49.63274623103663]
We propose a lesion-aware dynamic network (LDNet) for polyp segmentation.
It is a traditional U-shaped encoder-decoder structure that incorporates a dynamic kernel generation and updating scheme.
This simple but effective scheme endows our model with powerful segmentation performance and generalization capability.
arXiv Detail & Related papers (2023-01-12T09:53:57Z) - Adaptive Context Selection for Polyp Segmentation [99.9959901908053]
We propose an adaptive context selection based encoder-decoder framework composed of a Local Context Attention (LCA) module, a Global Context Module (GCM), and an Adaptive Selection Module (ASM).
LCA modules deliver local context features from encoder layers to decoder layers, enhancing attention to hard regions, which are determined by the prediction map of the previous layer.
GCM aims to further explore global context features and send them to the decoder layers. ASM is used for adaptive selection and aggregation of context features through channel-wise attention.
arXiv Detail & Related papers (2023-01-12T04:06:44Z) - Auto-Panoptic: Cooperative Multi-Component Architecture Search for Panoptic Segmentation [144.50154657257605]
We propose an efficient framework to simultaneously search for all main components including backbone, segmentation branches, and feature fusion module.
Our searched architecture, namely Auto-Panoptic, achieves the new state-of-the-art on the challenging COCO and ADE20K benchmarks.
arXiv Detail & Related papers (2020-10-30T08:34:35Z)