LungX: A Hybrid EfficientNet-Vision Transformer Architecture with Multi-Scale Attention for Accurate Pneumonia Detection
- URL: http://arxiv.org/abs/2511.18425v1
- Date: Sun, 23 Nov 2025 12:44:13 GMT
- Title: LungX: A Hybrid EfficientNet-Vision Transformer Architecture with Multi-Scale Attention for Accurate Pneumonia Detection
- Authors: Mansur Yerzhanuly,
- Abstract summary: Pneumonia remains a leading global cause of mortality where timely diagnosis is critical.<n>We introduce LungX, a novel hybrid architecture combining EfficientNet's multi-scale features, CBAM attention mechanisms, and Vision Transformer's global context modeling.<n>LungX achieves state-of-the-art performance (86.5 percent accuracy, 0.943 AUC), representing a 6.7 percent AUC improvement over EfficientNet-B0 baselines.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Pneumonia remains a leading global cause of mortality where timely diagnosis is critical. We introduce LungX, a novel hybrid architecture combining EfficientNet's multi-scale features, CBAM attention mechanisms, and Vision Transformer's global context modeling for enhanced pneumonia detection. Evaluated on 20,000 curated chest X-rays from RSNA and CheXpert, LungX achieves state-of-the-art performance (86.5 percent accuracy, 0.943 AUC), representing a 6.7 percent AUC improvement over EfficientNet-B0 baselines. Visual analysis demonstrates superior lesion localization through interpretable attention maps. Future directions include multi-center validation and architectural optimizations targeting 88 percent accuracy for clinical deployment as an AI diagnostic aid.
Related papers
- A DeepSeek-Powered AI System for Automated Chest Radiograph Interpretation in Clinical Practice [83.11942224668127]
Janus-Pro-CXR (1B) is a chest X-ray interpretation system based on DeepSeek Janus-Pro model.<n>Our system outperforms state-of-the-art X-ray report generation models in automated report generation.
arXiv Detail & Related papers (2025-12-23T13:26:13Z) - Enhanced Chest Disease Classification Using an Improved CheXNet Framework with EfficientNetV2-M and Optimization-Driven Learning [1.0499611180329804]
The original CheXNet architecture has shown potential in automated analysis of chest radiographs.<n>We suggest a better classification framework of chest disease that relies on EfficientNetV2-M.
arXiv Detail & Related papers (2025-12-08T05:02:47Z) - LungEvaty: A Scalable, Open-Source Transformer-based Deep Learning Model for Lung Cancer Risk Prediction in LDCT Screening [37.29507297342265]
LungEvaty is a transformer-based framework for predicting 1-6 year lung cancer risk from a single LDCT scan.<n>It learns directly from large-scale screening data to capture comprehensive anatomical and pathological cues relevant for malignancy risk.<n>LungEvaty was trained on more than 90,000 CT scans, including over 28,000 for fine-tuning and 6,000 for evaluation.
arXiv Detail & Related papers (2025-11-25T09:38:10Z) - Cancer-Net PCa-MultiSeg: Multimodal Enhancement of Prostate Cancer Lesion Segmentation Using Synthetic Correlated Diffusion Imaging [55.62977326180104]
Current deep learning approaches for prostate cancer lesion segmentation achieve limited performance.<n>We investigate synthetic correlated diffusion imaging (CDI$s$) as an enhancement to standard diffusion-based protocols.<n>Our results establish validated integration pathways for CDI$s$ as a practical drop-in enhancement for PCa lesion segmentation tasks.
arXiv Detail & Related papers (2025-11-11T04:16:12Z) - Weakly Supervised Pneumonia Localization from Chest X-Rays Using Deep Neural Network and Grad-CAM Explanations [0.0]
This study proposes a weakly supervised deep learning framework for pneumonia classification and localization from chest X-rays.<n>Instead of costly pixel-level annotations, our approach utilizes image-level labels to generate clinically meaningful heatmaps.
arXiv Detail & Related papers (2025-11-01T08:44:24Z) - An Explainable Hybrid AI Framework for Enhanced Tuberculosis and Symptom Detection [55.35661671061754]
Tuberculosis remains a critical global health issue, particularly in resource-limited and remote areas.<n>We propose a framework which enhances disease and symptom detection on chest X-rays by integrating two supervised heads and a self-supervised head.<n>Our model achieves an accuracy of 98.85% for distinguishing between COVID-19, tuberculosis, and normal cases, and a macro-F1 score of 90.09% for multilabel symptom detection.
arXiv Detail & Related papers (2025-10-21T17:18:55Z) - REN: Anatomically-Informed Mixture-of-Experts for Interstitial Lung Disease Diagnosis [32.83724094606554]
We introduce Regional Expert Networks (REN), the first anatomically-informed MoE framework tailored specifically for medical image classification.<n>REN leverages anatomical priors to train seven specialized experts, each dedicated to distinct lung lobes and bilateral lung combinations.<n>Through rigorous patient-level cross-validation, REN demonstrates strong generalizability and clinical interpretability.
arXiv Detail & Related papers (2025-10-06T15:35:08Z) - A Disease-Centric Vision-Language Foundation Model for Precision Oncology in Kidney Cancer [54.58205672910646]
RenalCLIP is a visual-language foundation model for characterization, diagnosis and prognosis of renal mass.<n>It achieved better performance and superior generalizability across 10 core tasks spanning the full clinical workflow of kidney cancer.
arXiv Detail & Related papers (2025-08-22T17:48:19Z) - XAI-Guided Analysis of Residual Networks for Interpretable Pneumonia Detection in Paediatric Chest X-rays [0.0]
We propose an interpretable deep learning model on Residual Networks (ResNets) for automatically diagnosing paediatric pneumonia on chest X-rays.<n>Our ResNet-50 model, trained on a large paediatric chest X-rays dataset, achieves high classification accuracy (95.94%), AUC-ROC (98.91%), and Cohen's Kappa (0.913), accompanied by clinically meaningful visual explanations.
arXiv Detail & Related papers (2025-07-18T21:19:26Z) - Comparative Analysis of Vision Transformers and Traditional Deep Learning Approaches for Automated Pneumonia Detection in Chest X-Rays [1.2310316230437004]
Pneumonia, particularly when induced by diseases like COVID-19, remains a critical global health challenge requiring rapid and accurate diagnosis.<n>This study presents a comprehensive comparison of traditional machine learning and state-of-the-art deep learning approaches for automated pneumonia detection using chest X-rays.<n>We demonstrate that Vision Transformers, particularly the Cross-ViT architecture, achieve superior performance with 88.25% accuracy and 99.42% recall.
arXiv Detail & Related papers (2025-07-11T16:26:24Z) - RadFabric: Agentic AI System with Reasoning Capability for Radiology [61.25593938175618]
RadFabric is a multi agent, multimodal reasoning framework that unifies visual and textual analysis for comprehensive CXR interpretation.<n>System employs specialized CXR agents for pathology detection, an Anatomical Interpretation Agent to map visual findings to precise anatomical structures, and a Reasoning Agent powered by large multimodal reasoning models to synthesize visual, anatomical, and clinical data into transparent and evidence based diagnoses.
arXiv Detail & Related papers (2025-06-17T03:10:33Z) - Lung Disease Detection with Vision Transformers: A Comparative Study of Machine Learning Methods [0.0]
This study explores the application of Vision Transformers (ViT), a state-of-the-art architecture in machine learning, to chest X-ray analysis.
I present a comparative analysis of two ViT-based approaches: one utilizing full chest X-ray images and another focusing on segmented lung regions.
arXiv Detail & Related papers (2024-11-18T08:40:25Z) - Deep learning in computed tomography pulmonary angiography imaging: a
dual-pronged approach for pulmonary embolism detection [0.0]
The aim of this study is to leverage deep learning techniques to enhance the Computer Assisted Diagnosis (CAD) of Pulmonary Embolism (PE)
Our classification system includes an Attention-Guided Convolutional Neural Network (AG-CNN) that uses local context by employing an attention mechanism.
AG-CNN achieves robust performance on the FUMPE dataset, achieving an AUROC of 0.927, sensitivity of 0.862, specificity of 0.879, and an F1-score of 0.805 with the Inception-v3 backbone architecture.
arXiv Detail & Related papers (2023-11-09T08:23:44Z) - Differentiable Topology-Preserved Distance Transform for Pulmonary
Airway Segmentation [34.22415353209505]
We propose a Differentiable Topology-Preserved Distance Transform (DTPDT) framework to improve the performance of airway segmentation.
A Topology-Preserved Surrogate (TPS) learning strategy is first proposed to balance the training progress within-class distribution.
A Convolutional Distance Transform (CDT) is designed to identify the breakage phenomenon with superior sensitivity and minimize the variation of the distance map between the predictionand ground-truth.
arXiv Detail & Related papers (2022-09-17T15:47:01Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.