Related papers: LungX: A Hybrid EfficientNet-Vision Transformer Architecture with Multi-Scale Attention for Accurate Pneumonia Detection

LungX: A Hybrid EfficientNet-Vision Transformer Architecture with Multi-Scale Attention for Accurate Pneumonia Detection

URL: http://arxiv.org/abs/2511.18425v1
Date: Sun, 23 Nov 2025 12:44:13 GMT
Title: LungX: A Hybrid EfficientNet-Vision Transformer Architecture with Multi-Scale Attention for Accurate Pneumonia Detection
Authors: Mansur Yerzhanuly,
Abstract summary: Pneumonia remains a leading global cause of mortality where timely diagnosis is critical.<n>We introduce LungX, a novel hybrid architecture combining EfficientNet's multi-scale features, CBAM attention mechanisms, and Vision Transformer's global context modeling.<n>LungX achieves state-of-the-art performance (86.5 percent accuracy, 0.943 AUC), representing a 6.7 percent AUC improvement over EfficientNet-B0 baselines.
Score: 0.0
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Pneumonia remains a leading global cause of mortality where timely diagnosis is critical. We introduce LungX, a novel hybrid architecture combining EfficientNet's multi-scale features, CBAM attention mechanisms, and Vision Transformer's global context modeling for enhanced pneumonia detection. Evaluated on 20,000 curated chest X-rays from RSNA and CheXpert, LungX achieves state-of-the-art performance (86.5 percent accuracy, 0.943 AUC), representing a 6.7 percent AUC improvement over EfficientNet-B0 baselines. Visual analysis demonstrates superior lesion localization through interpretable attention maps. Future directions include multi-center validation and architectural optimizations targeting 88 percent accuracy for clinical deployment as an AI diagnostic aid.

Related papers

A DeepSeek-Powered AI System for Automated Chest Radiograph Interpretation in Clinical Practice [83.11942224668127]
Janus-Pro-CXR (1B) is a chest X-ray interpretation system based on DeepSeek Janus-Pro model.<n>Our system outperforms state-of-the-art X-ray report generation models in automated report generation.
arXiv Detail & Related papers (2025-12-23T13:26:13Z)
Enhanced Chest Disease Classification Using an Improved CheXNet Framework with EfficientNetV2-M and Optimization-Driven Learning [1.0499611180329804]
The original CheXNet architecture has shown potential in automated analysis of chest radiographs.<n>We suggest a better classification framework of chest disease that relies on EfficientNetV2-M.
arXiv Detail & Related papers (2025-12-08T05:02:47Z)
LungEvaty: A Scalable, Open-Source Transformer-based Deep Learning Model for Lung Cancer Risk Prediction in LDCT Screening [37.29507297342265]
LungEvaty is a transformer-based framework for predicting 1-6 year lung cancer risk from a single LDCT scan.<n>It learns directly from large-scale screening data to capture comprehensive anatomical and pathological cues relevant for malignancy risk.<n>LungEvaty was trained on more than 90,000 CT scans, including over 28,000 for fine-tuning and 6,000 for evaluation.
arXiv Detail & Related papers (2025-11-25T09:38:10Z)
Cancer-Net PCa-MultiSeg: Multimodal Enhancement of Prostate Cancer Lesion Segmentation Using Synthetic Correlated Diffusion Imaging [55.62977326180104]
Current deep learning approaches for prostate cancer lesion segmentation achieve limited performance.<n>We investigate synthetic correlated diffusion imaging (CDI$s$) as an enhancement to standard diffusion-based protocols.<n>Our results establish validated integration pathways for CDI$s$ as a practical drop-in enhancement for PCa lesion segmentation tasks.
arXiv Detail & Related papers (2025-11-11T04:16:12Z)
Weakly Supervised Pneumonia Localization from Chest X-Rays Using Deep Neural Network and Grad-CAM Explanations [0.0]
This study proposes a weakly supervised deep learning framework for pneumonia classification and localization from chest X-rays.<n>Instead of costly pixel-level annotations, our approach utilizes image-level labels to generate clinically meaningful heatmaps.
arXiv Detail & Related papers (2025-11-01T08:44:24Z)
An Explainable Hybrid AI Framework for Enhanced Tuberculosis and Symptom Detection [55.35661671061754]
Tuberculosis remains a critical global health issue, particularly in resource-limited and remote areas.<n>We propose a framework which enhances disease and symptom detection on chest X-rays by integrating two supervised heads and a self-supervised head.<n>Our model achieves an accuracy of 98.85% for distinguishing between COVID-19, tuberculosis, and normal cases, and a macro-F1 score of 90.09% for multilabel symptom detection.
arXiv Detail & Related papers (2025-10-21T17:18:55Z)
REN: Anatomically-Informed Mixture-of-Experts for Interstitial Lung Disease Diagnosis [32.83724094606554]
We introduce Regional Expert Networks (REN), the first anatomically-informed MoE framework tailored specifically for medical image classification.<n>REN leverages anatomical priors to train seven specialized experts, each dedicated to distinct lung lobes and bilateral lung combinations.<n>Through rigorous patient-level cross-validation, REN demonstrates strong generalizability and clinical interpretability.
arXiv Detail & Related papers (2025-10-06T15:35:08Z)
A Disease-Centric Vision-Language Foundation Model for Precision Oncology in Kidney Cancer [54.58205672910646]
RenalCLIP is a visual-language foundation model for characterization, diagnosis and prognosis of renal mass.<n>It achieved better performance and superior generalizability across 10 core tasks spanning the full clinical workflow of kidney cancer.
arXiv Detail & Related papers (2025-08-22T17:48:19Z)
XAI-Guided Analysis of Residual Networks for Interpretable Pneumonia Detection in Paediatric Chest X-rays [0.0]
We propose an interpretable deep learning model on Residual Networks (ResNets) for automatically diagnosing paediatric pneumonia on chest X-rays.<n>Our ResNet-50 model, trained on a large paediatric chest X-rays dataset, achieves high classification accuracy (95.94%), AUC-ROC (98.91%), and Cohen's Kappa (0.913), accompanied by clinically meaningful visual explanations.
arXiv Detail & Related papers (2025-07-18T21:19:26Z)
Comparative Analysis of Vision Transformers and Traditional Deep Learning Approaches for Automated Pneumonia Detection in Chest X-Rays [1.2310316230437004]
Pneumonia, particularly when induced by diseases like COVID-19, remains a critical global health challenge requiring rapid and accurate diagnosis.<n>This study presents a comprehensive comparison of traditional machine learning and state-of-the-art deep learning approaches for automated pneumonia detection using chest X-rays.<n>We demonstrate that Vision Transformers, particularly the Cross-ViT architecture, achieve superior performance with 88.25% accuracy and 99.42% recall.
arXiv Detail & Related papers (2025-07-11T16:26:24Z)
RadFabric: Agentic AI System with Reasoning Capability for Radiology [61.25593938175618]
RadFabric is a multi agent, multimodal reasoning framework that unifies visual and textual analysis for comprehensive CXR interpretation.<n>System employs specialized CXR agents for pathology detection, an Anatomical Interpretation Agent to map visual findings to precise anatomical structures, and a Reasoning Agent powered by large multimodal reasoning models to synthesize visual, anatomical, and clinical data into transparent and evidence based diagnoses.
arXiv Detail & Related papers (2025-06-17T03:10:33Z)
Lung Disease Detection with Vision Transformers: A Comparative Study of Machine Learning Methods [0.0]
This study explores the application of Vision Transformers (ViT), a state-of-the-art architecture in machine learning, to chest X-ray analysis. I present a comparative analysis of two ViT-based approaches: one utilizing full chest X-ray images and another focusing on segmented lung regions.
arXiv Detail & Related papers (2024-11-18T08:40:25Z)
Deep learning in computed tomography pulmonary angiography imaging: a dual-pronged approach for pulmonary embolism detection [0.0]
The aim of this study is to leverage deep learning techniques to enhance the Computer Assisted Diagnosis (CAD) of Pulmonary Embolism (PE) Our classification system includes an Attention-Guided Convolutional Neural Network (AG-CNN) that uses local context by employing an attention mechanism. AG-CNN achieves robust performance on the FUMPE dataset, achieving an AUROC of 0.927, sensitivity of 0.862, specificity of 0.879, and an F1-score of 0.805 with the Inception-v3 backbone architecture.
arXiv Detail & Related papers (2023-11-09T08:23:44Z)
Differentiable Topology-Preserved Distance Transform for Pulmonary Airway Segmentation [34.22415353209505]
We propose a Differentiable Topology-Preserved Distance Transform (DTPDT) framework to improve the performance of airway segmentation. A Topology-Preserved Surrogate (TPS) learning strategy is first proposed to balance the training progress within-class distribution. A Convolutional Distance Transform (CDT) is designed to identify the breakage phenomenon with superior sensitivity and minimize the variation of the distance map between the predictionand ground-truth.
arXiv Detail & Related papers (2022-09-17T15:47:01Z)

This list is automatically generated from the titles and abstracts of the papers in this site.