Information Bottleneck-based Causal Attention for Multi-label Medical Image Recognition
- URL: http://arxiv.org/abs/2508.08069v1
- Date: Mon, 11 Aug 2025 15:12:54 GMT
- Title: Information Bottleneck-based Causal Attention for Multi-label Medical Image Recognition
- Authors: Xiaoxiao Cui, Yiran Li, Kai He, Shanzhi Jiang, Mengli Xue, Wentao Li, Junhong Leng, Zhi Liu, Lizhen Cui, Shuo Li
- Abstract summary: We propose a new structural causal model (SCM) that treats class-specific attention as a mixture of causal, spurious, and noisy factors. We then propose a novel Information Bottleneck-based Causal Attention (IBCA) that is capable of learning the discriminative class-specific attention for medical images.
- Score: 23.458130705598204
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Multi-label classification (MLC) of medical images aims to identify multiple diseases and holds significant clinical potential. A critical step is to effectively learn class-specific features for accurate diagnosis and improved interpretability. However, current works focus primarily on causal attention to learn class-specific features, yet they struggle to interpret the true cause because of inadvertent attention to class-irrelevant features. To address this challenge, we propose a new structural causal model (SCM) that treats class-specific attention as a mixture of causal, spurious, and noisy factors, and a novel Information Bottleneck-based Causal Attention (IBCA) that is capable of learning the discriminative class-specific attention for MLC of medical images. Specifically, we propose learning Gaussian mixture multi-label spatial attention to filter out class-irrelevant information and capture each class-specific attention pattern. Then a contrastive enhancement-based causal intervention is proposed to gradually mitigate the spurious attention and reduce noise information by aligning multi-head attention with the Gaussian mixture multi-label spatial attention. Quantitative and ablation results on Endo and MuReD show that IBCA outperforms all compared methods. Compared to the second-best results for each metric, IBCA achieves improvements of 6.35% in CR, 7.72% in OR, and 5.02% in mAP for MuReD, and 1.47% in CR, 1.65% in CF1, and 1.42% in mAP for Endo.
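The abstract reports gains in CR, OR, and CF1 without defining them. As a rough illustration only, the sketch below computes these quantities under the convention common in multi-label classification work: CR is recall averaged over classes, OR is recall pooled over all labels, and CF1 is the harmonic mean of class-averaged precision and class-averaged recall. The function name `multilabel_recall_f1` and the toy arrays are hypothetical, and the paper's exact definitions may differ.

```python
import numpy as np

def multilabel_recall_f1(y_true, y_pred):
    """Compute CR, OR and CF1 from binary label matrices.

    y_true, y_pred: array-likes of shape (n_samples, n_classes) with 0/1 entries.
    Definitions follow a common multi-label convention (an assumption here).
    """
    y_true = np.asarray(y_true, dtype=bool)
    y_pred = np.asarray(y_pred, dtype=bool)

    tp = (y_true & y_pred).sum(axis=0).astype(float)   # true positives per class
    n_pos = y_true.sum(axis=0).astype(float)           # ground-truth positives per class
    n_pred = y_pred.sum(axis=0).astype(float)          # predicted positives per class

    # Per-class recall and precision; classes with an empty denominator count as 0.
    recall = np.divide(tp, n_pos, out=np.zeros_like(tp), where=n_pos > 0)
    precision = np.divide(tp, n_pred, out=np.zeros_like(tp), where=n_pred > 0)

    cr = float(recall.mean())       # class-averaged recall
    cp = float(precision.mean())    # class-averaged precision
    cf1 = 2 * cp * cr / (cp + cr) if (cp + cr) > 0 else 0.0
    overall_recall = float(tp.sum() / n_pos.sum()) if n_pos.sum() > 0 else 0.0

    return {"CR": cr, "OR": overall_recall, "CF1": cf1}

# Toy example: 3 images, 2 disease labels (values are made up).
metrics = multilabel_recall_f1(
    y_true=[[1, 0], [1, 1], [0, 1]],
    y_pred=[[1, 0], [0, 1], [0, 1]],
)
```

In this toy case one positive of the first class is missed, so both the class-averaged and the pooled recall fall below 1 while precision stays perfect.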
Related papers
- Attention-Enhanced Deep Learning Ensemble for Breast Density Classification in Mammography [0.0]
This study proposes an automated deep learning system for robust binary classification of breast density. We implemented and compared four advanced convolutional neural networks. We developed a novel Combined Focal Label Smoothing Loss function that integrates focal loss, label smoothing, and class-balanced weighting.
arXiv Detail & Related papers (2025-07-08T21:26:33Z) - Cross-Modal Clustering-Guided Negative Sampling for Self-Supervised Joint Learning from Medical Images and Reports [11.734906190235066]
This paper presents a Cross-Modal Cluster-Guided Negative Sampling (CM-CGNS) method with two-fold ideas. First, it extends the k-means clustering used for local text features in the single-modal domain to the multimodal domain through cross-modal attention. Second, it introduces a Cross-Modal Masked Image Reconstruction (CM-MIR) module that leverages local text-to-image features obtained via cross-modal attention to reconstruct masked local image regions.
arXiv Detail & Related papers (2025-06-13T11:08:16Z) - MSA-UNet3+: Multi-Scale Attention UNet3+ with New Supervised Prototypical Contrastive Loss for Coronary DSA Image Segmentation [8.850534640462081]
We propose a Supervised Prototypical Contrastive Loss that fuses supervised and prototypical contrastive learning to enhance coronary DSA image segmentation. We implement the proposed SPCL loss within an MSA-UNet3+: a Multi-Scale Attention-Enhanced UNet3+ architecture. Experiments on a private coronary DSA dataset show that MSA-UNet3+ outperforms state-of-the-art methods.
arXiv Detail & Related papers (2025-04-07T15:35:30Z) - MedFocusCLIP : Improving few shot classification in medical datasets using pixel wise attention [1.2277343096128712]
We propose to leverage the advanced segmentation capabilities of Segment Anything Model 2 (SAM2) as a visual prompting cue to help the visual encoder in CLIP (Contrastive Language-Image Pretraining). This helps the model focus on highly discriminative regions without being distracted by visually similar background features. We evaluate our method on diverse medical datasets including X-rays, CT scans, and MRI images, and report an accuracy of (71%, 81%, 86%, 58%) from the proposed approach.
arXiv Detail & Related papers (2025-01-07T14:49:12Z) - Improving Multiple Sclerosis Lesion Segmentation Across Clinical Sites: A Federated Learning Approach with Noise-Resilient Training [75.40980802817349]
Deep learning models have shown promise for automatically segmenting MS lesions, but the scarcity of accurately annotated data hinders progress in this area.
We introduce a Decoupled Hard Label Correction (DHLC) strategy that considers the imbalanced distribution and fuzzy boundaries of MS lesions.
We also introduce a Centrally Enhanced Label Correction (CELC) strategy, which leverages the aggregated central model as a correction teacher for all sites.
arXiv Detail & Related papers (2023-08-31T00:36:10Z) - Graph-Ensemble Learning Model for Multi-label Skin Lesion Classification using Dermoscopy and Clinical Images [7.159532626507458]
This study introduces a Graph Convolution Network (GCN) that incorporates prior co-occurrence between categories, expressed as a correlation matrix, into the deep learning model for multi-label classification.
We propose a Graph-Ensemble Learning Model (GELN) that views the prediction from GCN as complementary information of the predictions from the fusion model.
arXiv Detail & Related papers (2023-07-04T13:19:57Z) - Learnable Weight Initialization for Volumetric Medical Image Segmentation [66.3030435676252]
We propose a learnable weight-based hybrid medical image segmentation approach.
Our approach is easy to integrate into any hybrid model and requires no external training data.
Experiments on multi-organ and lung cancer segmentation tasks demonstrate the effectiveness of our approach.
arXiv Detail & Related papers (2023-06-15T17:55:05Z) - RoS-KD: A Robust Stochastic Knowledge Distillation Approach for Noisy Medical Imaging [67.02500668641831]
Deep learning models trained on noisy datasets are sensitive to the noise type and generalize less well to unseen samples.
We propose a Robust Knowledge Distillation (RoS-KD) framework which mimics the notion of learning a topic from multiple sources to ensure deterrence in learning noisy information.
RoS-KD learns a smooth, well-informed, and robust student manifold by distilling knowledge from multiple teachers trained on overlapping subsets of training data.
arXiv Detail & Related papers (2022-10-15T22:32:20Z) - Learning Discriminative Representation via Metric Learning for Imbalanced Medical Image Classification [52.94051907952536]
We propose embedding metric learning into the first stage of the two-stage framework specially to help the feature extractor learn to extract more discriminative feature representations.
Experiments mainly on three medical image datasets show that the proposed approach consistently outperforms existing one-stage and two-stage approaches.
arXiv Detail & Related papers (2022-07-14T14:57:01Z) - Dynamic Sub-Cluster-Aware Network for Few-Shot Skin Disease Classification [31.539129126161978]
This paper introduces a novel approach called the Sub-Cluster-Aware Network (SCAN) that enhances accuracy in diagnosing rare skin diseases.
The key insight motivating the design of SCAN is the observation that skin disease images within a class often exhibit multiple sub-clusters.
We evaluate the proposed approach on two public datasets for few-shot skin disease classification.
arXiv Detail & Related papers (2022-07-03T16:06:04Z) - Cross-Site Severity Assessment of COVID-19 from CT Images via Domain Adaptation [64.59521853145368]
Early and accurate severity assessment of Coronavirus disease 2019 (COVID-19) based on computed tomography (CT) images greatly helps in estimating intensive care unit events.
To augment the labeled data and improve the generalization ability of the classification model, it is necessary to aggregate data from multiple sites.
This task faces several challenges including class imbalance between mild and severe infections, domain distribution discrepancy between sites, and presence of heterogeneous features.
arXiv Detail & Related papers (2021-09-08T07:56:51Z) - Categorical Relation-Preserving Contrastive Knowledge Distillation for Medical Image Classification [75.27973258196934]
We propose a novel Categorical Relation-preserving Contrastive Knowledge Distillation (CRCKD) algorithm, which takes the commonly used mean-teacher model as the supervisor.
With this regularization, the feature distribution of the student model shows higher intra-class similarity and inter-class variance.
With the contribution of the CCD and CRP, our CRCKD algorithm can distill the relational knowledge more comprehensively.
arXiv Detail & Related papers (2021-07-07T13:56:38Z) - Multiple Sclerosis Lesion Activity Segmentation with Attention-Guided Two-Path CNNs [49.32653090178743]
Convolutional neural networks (CNNs) are studied for lesion activity segmentation from two time points.
CNNs that combine the information from the two time points in different ways are designed and evaluated.
It is demonstrated that deep learning-based methods outperform classic approaches.
arXiv Detail & Related papers (2020-08-05T08:49:20Z)
This list is automatically generated from the titles and abstracts of the papers in this site.