DCAT: Dual Cross-Attention Fusion for Disease Classification in Radiological Images with Uncertainty Estimation
- URL: http://arxiv.org/abs/2503.11851v2
- Date: Wed, 19 Mar 2025 12:18:48 GMT
- Title: DCAT: Dual Cross-Attention Fusion for Disease Classification in Radiological Images with Uncertainty Estimation
- Authors: Jutika Borah, Hidam Kumarjit Singh
- Abstract summary: This paper proposes a novel dual cross-attention fusion model for medical image analysis. It addresses key challenges in feature integration and interpretability. The proposed model achieved AUC of 99.75%, 100%, 99.93%, and 98.69% and AUPR of 99.81%, 100%, 99.97%, and 96.36% on COVID-19, Tuberculosis, and Pneumonia chest X-ray images and retinal OCT images, respectively.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Accurate and reliable image classification is crucial in radiology, where diagnostic decisions significantly impact patient outcomes. Conventional deep learning models tend to produce overconfident predictions despite underlying uncertainties, potentially leading to misdiagnoses. Attention mechanisms have emerged as powerful tools in deep learning, enabling models to focus on the relevant parts of the input data; combined with feature fusion, they can help address uncertainty challenges. Cross-attention has become increasingly important in medical image analysis for capturing dependencies across features and modalities. This paper proposes a novel dual cross-attention fusion model for medical image analysis that addresses key challenges in feature integration and interpretability. Our approach introduces a bidirectional cross-attention mechanism with refined channel and spatial attention that dynamically fuses feature maps from EfficientNetB4 and ResNet34, leveraging multi-network contextual dependencies. The features refined through channel and spatial attention highlight discriminative patterns crucial for accurate classification. The proposed model achieved AUC of 99.75%, 100%, 99.93%, and 98.69% and AUPR of 99.81%, 100%, 99.97%, and 96.36% on COVID-19, Tuberculosis, and Pneumonia chest X-ray images and retinal OCT images, respectively. Entropy values and examples of highly uncertain samples provide interpretable visualizations of the model's confidence, enhancing transparency. By combining multi-scale feature extraction, bidirectional attention, and uncertainty estimation, the proposed model offers a strong approach to medical image analysis.
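The abstract describes two mechanisms: bidirectional cross-attention that fuses feature maps from two backbones, and entropy-based uncertainty estimation. The paper's actual implementation is not shown here; the following is a minimal NumPy sketch of the general pattern, where `feats_eff` and `feats_res` are illustrative stand-ins for flattened EfficientNetB4 and ResNet34 feature maps, and learned projection weights, channel attention, and spatial attention are omitted for brevity.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(query_feats, context_feats):
    # Each query position attends over all positions of the other network's map.
    d_k = context_feats.shape[-1]
    scores = query_feats @ context_feats.T / np.sqrt(d_k)
    return softmax(scores) @ context_feats

def bidirectional_fusion(feats_a, feats_b):
    # Bidirectional: A attends to B and B attends to A; residual add, then concat.
    a_enriched = feats_a + cross_attention(feats_a, feats_b)
    b_enriched = feats_b + cross_attention(feats_b, feats_a)
    return np.concatenate([a_enriched, b_enriched], axis=-1)

def predictive_entropy(probs, eps=1e-12):
    # Shannon entropy of the class probabilities; higher means more uncertain.
    return -(probs * np.log(probs + eps)).sum(axis=-1)

rng = np.random.default_rng(0)
feats_eff = rng.normal(size=(49, 64))  # stand-in: flattened 7x7 EfficientNetB4 map
feats_res = rng.normal(size=(49, 64))  # stand-in: ResNet34 map projected to 64 dims
fused = bidirectional_fusion(feats_eff, feats_res)
print(fused.shape)  # (49, 128)

peaked = np.array([0.97, 0.01, 0.01, 0.01])
uniform = np.full(4, 0.25)
print(predictive_entropy(peaked) < predictive_entropy(uniform))  # True
```

In a real model the two maps would first be projected to a common dimension by learned linear layers, and the fused features would pass through the refined channel and spatial attention before classification; the entropy of the final softmax output is what flags high-uncertainty samples for visualization.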
Related papers
- Causal Disentanglement for Robust Long-tail Medical Image Generation [80.15257897500578]
We propose a novel medical image generation framework, which generates independent pathological and structural features.
We leverage a diffusion model guided by pathological findings to model pathological features, enabling the generation of diverse counterfactual images.
arXiv Detail & Related papers (2025-04-20T01:54:18Z) - An Intrinsically Explainable Approach to Detecting Vertebral Compression Fractures in CT Scans via Neurosymbolic Modeling [9.108675519106319]
Vertebral compression fractures (VCFs) are a common and potentially serious consequence of osteoporosis.
In high-stakes scenarios like opportunistic medical diagnosis, model interpretability is a key factor for the adoption of AI recommendations.
We introduce a neurosymbolic approach for VCF detection in CT volumes.
arXiv Detail & Related papers (2024-12-23T04:01:44Z) - Multiscale Latent Diffusion Model for Enhanced Feature Extraction from Medical Images [5.395912799904941]
Variations in CT scanner models and acquisition protocols introduce significant variability in the extracted radiomic features. LTDiff++ is a multiscale latent diffusion model designed to enhance feature extraction in medical imaging.
arXiv Detail & Related papers (2024-10-05T02:13:57Z) - StealthDiffusion: Towards Evading Diffusion Forensic Detection through Diffusion Model [62.25424831998405]
StealthDiffusion is a framework that modifies AI-generated images into high-quality, imperceptible adversarial examples.
It is effective in both white-box and black-box settings, transforming AI-generated images into high-quality adversarial forgeries.
arXiv Detail & Related papers (2024-08-11T01:22:29Z) - Beyond Images: An Integrative Multi-modal Approach to Chest X-Ray Report
Generation [47.250147322130545]
Image-to-text radiology report generation aims to automatically produce radiology reports that describe the findings in medical images.
Most existing methods focus solely on the image data, disregarding the other patient information accessible to radiologists.
We present a novel multi-modal deep neural network framework for generating chest X-ray reports by integrating structured patient data, such as vital signs and symptoms, alongside unstructured clinical notes.
arXiv Detail & Related papers (2023-11-18T14:37:53Z) - Reconstruction of Patient-Specific Confounders in AI-based Radiologic
Image Interpretation using Generative Pretraining [12.656718786788758]
We propose a self-conditioned diffusion model termed DiffChest and train it on a dataset of chest radiographs.
DiffChest explains classifications on a patient-specific level and visualizes the confounding factors that may mislead the model.
Our findings highlight the potential of pretraining based on diffusion models in medical image classification.
arXiv Detail & Related papers (2023-09-29T10:38:08Z) - HGT: A Hierarchical GCN-Based Transformer for Multimodal Periprosthetic
Joint Infection Diagnosis Using CT Images and Text [0.0]
Prosthetic Joint Infection (PJI) is a prevalent and severe complication.
Currently, a unified diagnostic standard incorporating both computed tomography (CT) images and numerical text data for PJI remains unestablished.
This study introduces a diagnostic method, HGT, based on deep learning and multimodal techniques.
arXiv Detail & Related papers (2023-05-29T11:25:57Z) - Brain Imaging-to-Graph Generation using Adversarial Hierarchical Diffusion Models for MCI Causality Analysis [44.45598796591008]
Brain imaging-to-graph generation (BIGG) framework is proposed to map functional magnetic resonance imaging (fMRI) into effective connectivity for mild cognitive impairment analysis.
The hierarchical transformers in the generator are designed to estimate the noise at multiple scales.
Evaluations of the ADNI dataset demonstrate the feasibility and efficacy of the proposed model.
arXiv Detail & Related papers (2023-05-18T06:54:56Z) - MAF-Net: Multiple attention-guided fusion network for fundus vascular
image segmentation [1.3295074739915493]
We propose a multiple attention-guided fusion network (MAF-Net) to accurately detect blood vessels in retinal fundus images.
Traditional UNet-based models may lose partial information because they cannot explicitly model long-distance dependencies.
We show that our method produces satisfactory results compared to some state-of-the-art methods.
arXiv Detail & Related papers (2023-05-05T15:22:20Z) - Towards Reliable Medical Image Segmentation by utilizing Evidential Calibrated Uncertainty [52.03490691733464]
We introduce DEviS, an easily implementable foundational model that seamlessly integrates into various medical image segmentation networks.
By leveraging subjective logic theory, we explicitly model probability and uncertainty for the problem of medical image segmentation.
DEviS incorporates an uncertainty-aware filtering module, which uses the metric of uncertainty-calibrated error to filter reliable data.
arXiv Detail & Related papers (2023-01-01T05:02:46Z) - Diagnose Like a Radiologist: Hybrid Neuro-Probabilistic Reasoning for
Attribute-Based Medical Image Diagnosis [42.624671531003166]
We introduce a hybrid neuro-probabilistic reasoning algorithm for verifiable attribute-based medical image diagnosis.
We have successfully applied our hybrid reasoning algorithm to two challenging medical image diagnosis tasks.
arXiv Detail & Related papers (2022-08-19T12:06:46Z) - Variational Knowledge Distillation for Disease Classification in Chest
X-Rays [102.04931207504173]
We propose variational knowledge distillation (VKD), a new probabilistic inference framework for disease classification based on X-rays.
We demonstrate the effectiveness of our method on three public benchmark datasets with paired X-ray images and EHRs.
arXiv Detail & Related papers (2021-03-19T14:13:56Z) - Many-to-One Distribution Learning and K-Nearest Neighbor Smoothing for
Thoracic Disease Identification [83.6017225363714]
Deep learning has become the most powerful computer-aided diagnosis technology for improving disease identification performance.
For chest X-ray imaging, annotating large-scale data requires professional domain knowledge and is time-consuming.
In this paper, we propose many-to-one distribution learning (MODL) and K-nearest neighbor smoothing (KNNS) methods to improve a single model's disease identification performance.
arXiv Detail & Related papers (2021-02-26T02:29:30Z) - Two-Stream Deep Feature Modelling for Automated Video Endoscopy Data
Analysis [45.19890687786009]
We propose a two-stream model for endoscopic image analysis.
Our model fuses two streams of deep feature inputs by mapping their inherent relations through a novel relational network model.
arXiv Detail & Related papers (2020-07-12T05:24:08Z) - Diagnosis of Coronavirus Disease 2019 (COVID-19) with Structured Latent
Multi-View Representation Learning [48.05232274463484]
Recently, the outbreak of Coronavirus Disease 2019 (COVID-19) has spread rapidly across the world.
Due to the large number of affected patients and heavy labor for doctors, computer-aided diagnosis with machine learning algorithm is urgently needed.
In this study, we propose to conduct the diagnosis of COVID-19 with a series of features extracted from CT images.
arXiv Detail & Related papers (2020-05-06T15:19:15Z)
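Several papers above, notably DEviS, estimate uncertainty via subjective logic rather than softmax entropy. The following is a minimal NumPy sketch of the generic evidential recipe (non-negative evidence mapped to Dirichlet parameters), not DEviS's actual implementation; all names are illustrative.

```python
import numpy as np

def evidential_output(logits):
    # Subjective-logic view: softplus maps logits to non-negative evidence,
    # and alpha = evidence + 1 parameterizes a Dirichlet over class probabilities.
    evidence = np.log1p(np.exp(logits))       # softplus keeps evidence >= 0
    alpha = evidence + 1.0                    # Dirichlet concentration parameters
    strength = alpha.sum(axis=-1, keepdims=True)
    prob = alpha / strength                   # expected class probabilities
    k = logits.shape[-1]
    vacuity = k / strength.squeeze(-1)        # belief mass left unassigned = uncertainty
    return prob, vacuity

confident_logits = np.array([8.0, -4.0, -4.0, -4.0])
ambiguous_logits = np.zeros(4)
p_conf, u_conf = evidential_output(confident_logits)
p_amb, u_amb = evidential_output(ambiguous_logits)
print(u_conf < u_amb)  # True: the ambiguous input carries higher vacuity
```

The appeal of this formulation over plain softmax entropy is that vacuity directly measures how little total evidence the model has, which is what an uncertainty-aware filtering module can threshold on.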
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.