DCMM-Transformer: Degree-Corrected Mixed-Membership Attention for Medical Imaging
- URL: http://arxiv.org/abs/2511.12047v1
- Date: Sat, 15 Nov 2025 05:55:01 GMT
- Title: DCMM-Transformer: Degree-Corrected Mixed-Membership Attention for Medical Imaging
- Authors: Huimin Cheng, Xiaowei Yu, Shushan Wu, Luyang Fang, Chao Cao, Jing Zhang, Tianming Liu, Dajiang Zhu, Wenxuan Zhong, Ping Ma
- Abstract summary: We present DCMM-Transformer, a novel ViT architecture for medical image analysis that incorporates a Degree-Corrected Mixed-Membership (DCMM) model as an additive bias in self-attention.
- Score: 19.273362844763806
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Medical images exhibit latent anatomical groupings, such as organs, tissues, and pathological regions, that standard Vision Transformers (ViTs) fail to exploit. While recent work like SBM-Transformer attempts to incorporate such structure through stochastic binary masking, these approaches suffer from non-differentiability, training instability, and an inability to model complex community structure. We present DCMM-Transformer, a novel ViT architecture for medical image analysis that incorporates a Degree-Corrected Mixed-Membership (DCMM) model as an additive bias in self-attention. Unlike prior approaches that rely on multiplicative masking and binary sampling, our method introduces community structure and degree heterogeneity in a fully differentiable and interpretable manner. Comprehensive experiments across diverse medical imaging datasets, including brain, chest, breast, and ocular modalities, demonstrate the superior performance and generalizability of the proposed approach. Furthermore, the learned group structure and structured attention modulation substantially enhance interpretability by yielding attention maps that are anatomically meaningful and semantically coherent.
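In a standard degree-corrected mixed-membership block model, the affinity between nodes i and j takes the form theta_i * theta_j * pi_i^T B pi_j, where pi_i is a mixed-membership vector over K communities, theta_i is a per-node degree-correction factor, and B is a K-by-K community-connectivity matrix. The abstract states only that this structure enters self-attention as an additive bias, so the sketch below is one plausible PyTorch parameterization, not the paper's exact formulation: the module name, predicting pi and theta from token embeddings, the log-transform of the affinity, and sharing the bias across heads are all assumptions made for illustration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class DCMMBiasedAttention(nn.Module):
    """Multi-head self-attention with an additive DCMM-style structural bias (illustrative sketch)."""

    def __init__(self, dim, num_heads=8, num_communities=4):
        super().__init__()
        assert dim % num_heads == 0
        self.num_heads = num_heads
        self.head_dim = dim // num_heads
        self.scale = self.head_dim ** -0.5
        self.qkv = nn.Linear(dim, dim * 3)
        self.proj = nn.Linear(dim, dim)
        # Per-token mixed-membership logits and degree-correction logit,
        # predicted from the token embedding (one possible parameterization).
        self.membership = nn.Linear(dim, num_communities)
        self.degree = nn.Linear(dim, 1)
        # Global community-connectivity matrix B (K x K), learned end-to-end;
        # identity init corresponds to an assortative prior.
        self.B = nn.Parameter(torch.eye(num_communities))

    def forward(self, x):
        # x: (batch, tokens, dim)
        bsz, n, d = x.shape
        qkv = self.qkv(x).reshape(bsz, n, 3, self.num_heads, self.head_dim)
        q, k, v = qkv.permute(2, 0, 3, 1, 4)                      # each (batch, heads, tokens, head_dim)

        # DCMM components: pi_i on the probability simplex, theta_i > 0.
        pi = F.softmax(self.membership(x), dim=-1)                # (batch, tokens, K)
        theta = F.softplus(self.degree(x))                        # (batch, tokens, 1)

        # Structural affinity theta_i * theta_j * pi_i^T B pi_j, used as an additive log-bias;
        # clamping keeps the log finite if the affinity is driven toward zero.
        affinity = theta * (pi @ self.B @ pi.transpose(1, 2)) * theta.transpose(1, 2)
        bias = torch.log(affinity.clamp_min(1e-6)).unsqueeze(1)   # (batch, 1, tokens, tokens)

        attn = (q @ k.transpose(-2, -1)) * self.scale + bias      # bias broadcast across heads
        attn = attn.softmax(dim=-1)
        out = (attn @ v).transpose(1, 2).reshape(bsz, n, d)
        return self.proj(out)
```

Because the structural term is added to the attention logits rather than applied as a multiplicative binary mask, gradients flow through pi, theta, and B without any discrete sampling, which is consistent with the fully differentiable behavior claimed in the abstract.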
Related papers
- Transformer-Based Explainable Deep Learning for Breast Cancer Detection in Mammography: The MammoFormer Framework [0.0]
MammoFormer framework unites transformer-based architecture with multi-feature enhancement components and XAI functionalities.
Seven different architectures consisting of CNNs, Vision Transformer, Swin Transformer, and ConvNext were tested.
The MammoFormer framework addresses critical clinical adoption barriers of AI mammography systems.
arXiv Detail & Related papers (2025-08-08T08:59:54Z)
- Multimodal Causal-Driven Representation Learning for Generalizable Medical Image Segmentation [56.52520416420957]
We propose Multimodal Causal-Driven Representation Learning (MCDRL) to tackle domain generalization in medical image segmentation.
MCDRL consistently outperforms competing methods, yielding superior segmentation accuracy and exhibiting robust generalizability.
arXiv Detail & Related papers (2025-08-07T03:41:41Z)
- MIRROR: Multi-Modal Pathological Self-Supervised Representation Learning via Modality Alignment and Retention [57.044719143401664]
Histopathology and transcriptomics are fundamental modalities in oncology, encapsulating the morphological and molecular aspects of the disease.
We present MIRROR, a novel multi-modal representation learning method designed to foster both modality alignment and retention.
Extensive evaluations on TCGA cohorts for cancer subtyping and survival analysis highlight MIRROR's superior performance.
arXiv Detail & Related papers (2025-03-01T07:02:30Z)
- Enhanced MRI Representation via Cross-series Masking [48.09478307927716]
A Cross-Series Masking (CSM) strategy is proposed for effectively learning MRI representations in a self-supervised manner.
The method achieves state-of-the-art performance on both public and in-house datasets.
arXiv Detail & Related papers (2024-12-10T10:32:09Z)
- Advancing Medical Image Segmentation: Morphology-Driven Learning with Diffusion Transformer [4.672688418357066]
We propose a novel Transformer Diffusion (DTS) model for robust segmentation in the presence of noise.
Our model, which analyzes the morphological representation of images, shows better results than previous models across various medical imaging modalities.
arXiv Detail & Related papers (2024-08-01T07:35:54Z)
- QUBIQ: Uncertainty Quantification for Biomedical Image Segmentation Challenge [93.61262892578067]
Uncertainty in medical image segmentation tasks, especially inter-rater variability, presents a significant challenge.
This variability directly impacts the development and evaluation of automated segmentation algorithms.
We report the set-up and summarize the benchmark results of the Quantification of Uncertainties in Biomedical Image Quantification Challenge (QUBIQ).
arXiv Detail & Related papers (2024-03-19T17:57:24Z)
- Modality-Agnostic Structural Image Representation Learning for Deformable Multi-Modality Medical Image Registration [22.157402663162877]
We propose a modality-agnostic structural representation learning method to learn discriminative and contrast-invariant deep structural image representations.
Our method is superior to conventional local structural representations and statistics-based similarity measures in terms of discriminability and accuracy.
arXiv Detail & Related papers (2024-02-29T08:01:31Z)
- Bayesian Unsupervised Disentanglement of Anatomy and Geometry for Deep Groupwise Image Registration [59.062085785106234]
This article presents a general Bayesian learning framework for multi-modal groupwise image registration.
We propose a novel hierarchical variational auto-encoding architecture to realise the inference procedure of the latent variables.
Experiments were conducted to validate the proposed framework, including four different datasets from cardiac, brain, and abdominal medical images.
arXiv Detail & Related papers (2024-01-04T08:46:39Z)
- MedSegDiff-V2: Diffusion based Medical Image Segmentation with Transformer [53.575573940055335]
We propose a novel Transformer-based Diffusion framework, called MedSegDiff-V2.
We verify its effectiveness on 20 medical image segmentation tasks with different image modalities.
arXiv Detail & Related papers (2023-01-19T03:42:36Z)
- Beyond CNNs: Exploiting Further Inherent Symmetries in Medical Image Segmentation [21.6412682130116]
We propose a novel group equivariant segmentation framework by encoding those inherent symmetries for learning more precise representations.
Based on our novel framework, extensive experiments conducted on real-world clinical data demonstrate that a Group Equivariant Res-UNet (named GER-UNet) outperforms its regular CNN-based counterpart.
The newly built GER-UNet also shows potential in reducing the sample complexity and the redundancy of filters.
arXiv Detail & Related papers (2022-07-29T04:28:20Z)
- Unsupervised Image Registration Towards Enhancing Performance and Explainability in Cardiac And Brain Image Analysis [3.5718941645696485]
Inter- and intra-modality affine and non-rigid image registration is an essential medical image analysis process in clinical imaging.
We present an unsupervised deep learning registration methodology which can accurately model affine and non-rigid transformations.
Our methodology performs bi-directional cross-modality image synthesis to learn modality-invariant latent representations.
arXiv Detail & Related papers (2022-03-07T12:54:33Z)