Related papers: NEURAL: Attention-Guided Pruning for Unified Multimodal Resource-Constrained Clinical Evaluation

NEURAL: Attention-Guided Pruning for Unified Multimodal Resource-Constrained Clinical Evaluation

URL: http://arxiv.org/abs/2508.09715v1
Date: Wed, 13 Aug 2025 11:08:09 GMT
Title: NEURAL: Attention-Guided Pruning for Unified Multimodal Resource-Constrained Clinical Evaluation
Authors: Devvrat Joshi, Islem Rekik,
Abstract summary: NEURAL is a novel framework that addresses the storage and transmission challenges of medical imaging data.<n>Our approach repurposes cross-attention scores between the image and its radiological report to structurally prune chest X-rays.<n>NEURAL achieves a 93.4-97.7% reduction in image data size while maintaining a high diagnostic performance of 0.88-0.95 AUC.
Score: 6.253771639590563
License: http://creativecommons.org/licenses/by-nc-nd/4.0/
Abstract: The rapid growth of multimodal medical imaging data presents significant storage and transmission challenges, particularly in resource-constrained clinical settings. We propose NEURAL, a novel framework that addresses this by using semantics-guided data compression. Our approach repurposes cross-attention scores between the image and its radiological report from a fine-tuned generative vision-language model to structurally prune chest X-rays, preserving only diagnostically critical regions. This process transforms the image into a highly compressed, graph representation. This unified graph-based representation fuses the pruned visual graph with a knowledge graph derived from the clinical report, creating a universal data structure that simplifies downstream modeling. Validated on the MIMIC-CXR and CheXpert Plus dataset for pneumonia detection, NEURAL achieves a 93.4-97.7\% reduction in image data size while maintaining a high diagnostic performance of 0.88-0.95 AUC, outperforming other baseline models that use uncompressed data. By creating a persistent, task-agnostic data asset, NEURAL resolves the trade-off between data size and clinical utility, enabling efficient workflows and teleradiology without sacrificing performance. Our NEURAL code is available at https://github.com/basiralab/NEURAL.

Related papers

SV-DRR: High-Fidelity Novel View X-Ray Synthesis Using Diffusion Model [0.3823356975862005]
We propose a novel view-conditioned model for multi-view X-ray images from a single view.<n>Our approach leverages the Diffusion Transformer to preserve fine details and employs a weak-to-strong training strategy for stable high-resolution image generation.<n>This capability has significant implications not only for clinical applications but also for medical education and data extension.
arXiv Detail & Related papers (2025-07-07T15:58:11Z)
ContextMRI: Enhancing Compressed Sensing MRI through Metadata Conditioning [51.26601171361753]
We propose ContextMRI, a text-conditioned diffusion model for MRI that integrates granular metadata into the reconstruction process.<n>We show that increasing the fidelity of metadata, ranging from slice location and contrast to patient age, sex, and pathology, systematically boosts reconstruction performance.
arXiv Detail & Related papers (2025-01-08T05:15:43Z)
MedPromptX: Grounded Multimodal Prompting for Chest X-ray Diagnosis [1.2903829793534272]
Chest X-ray images are commonly used for predicting acute and chronic cardiopulmonary conditions.<n>Efforts to integrate them with structured clinical data face challenges due to incomplete electronic health records.<n>This paper introduces MedPromptX, the first clinical decision support system that integrates multimodal large language models (MLLMs), few-shot prompting (FP) and visual grounding (VG)<n>Results demonstrate the SOTA performance of MedPromptX, achieving an 11% improvement in F1-score compared to the baselines.
arXiv Detail & Related papers (2024-03-22T19:19:51Z)
Medical Report Generation based on Segment-Enhanced Contrastive Representation Learning [39.17345313432545]
We propose MSCL (Medical image with Contrastive Learning) to segment organs, abnormalities, bones, etc. We introduce a supervised contrastive loss that assigns more weight to reports that are semantically similar to the target while training. Experimental results demonstrate the effectiveness of our proposed model, where we achieve state-of-the-art performance on the IU X-Ray public dataset.
arXiv Detail & Related papers (2023-12-26T03:33:48Z)
Radiology Report Generation Using Transformers Conditioned with Non-imaging Data [55.17268696112258]
This paper proposes a novel multi-modal transformer network that integrates chest x-ray (CXR) images and associated patient demographic information. The proposed network uses a convolutional neural network to extract visual features from CXRs and a transformer-based encoder-decoder network that combines the visual features with semantic text embeddings of patient demographic information.
arXiv Detail & Related papers (2023-11-18T14:52:26Z)
Connecting the Dots: Graph Neural Network Powered Ensemble and Classification of Medical Images [0.0]
Deep learning for medical imaging is limited due to the requirement for large amounts of training data. We employ the Image Foresting Transform to optimally segment images into superpixels. These superpixels are subsequently transformed into graph-structured data, enabling the proficient extraction of features and modeling of relationships.
arXiv Detail & Related papers (2023-11-13T13:20:54Z)
Vision-Language Modelling For Radiological Imaging and Reports In The Low Data Regime [70.04389979779195]
This paper explores training medical vision-language models (VLMs) where the visual and language inputs are embedded into a common space. We explore several candidate methods to improve low-data performance, including adapting generic pre-trained models to novel image and text domains. Using text-to-image retrieval as a benchmark, we evaluate the performance of these methods with variable sized training datasets of paired chest X-rays and radiological reports.
arXiv Detail & Related papers (2023-03-30T18:20:00Z)
Dynamic Graph Enhanced Contrastive Learning for Chest X-ray Report Generation [92.73584302508907]
We propose a knowledge graph with Dynamic structure and nodes to facilitate medical report generation with Contrastive Learning. In detail, the fundamental structure of our graph is pre-constructed from general knowledge. Each image feature is integrated with its very own updated graph before being fed into the decoder module for report generation.
arXiv Detail & Related papers (2023-03-18T03:53:43Z)
Preservation of High Frequency Content for Deep Learning-Based Medical Image Classification [74.84221280249876]
An efficient analysis of large amounts of chest radiographs can aid physicians and radiologists. We propose a novel Discrete Wavelet Transform (DWT)-based method for the efficient identification and encoding of visual information.
arXiv Detail & Related papers (2022-05-08T15:29:54Z)
Covid-19 Detection from Chest X-ray and Patient Metadata using Graph Convolutional Neural Networks [6.420262246029286]
We propose a novel Graph Convolution Neural Network (GCN) that is capable of identifying bio-markers of Covid-19 pneumonia. The proposed method exploits important relational knowledge between data instances and their features using graph representation and applies convolution to learn the graph data.
arXiv Detail & Related papers (2021-05-20T13:13:29Z)
Many-to-One Distribution Learning and K-Nearest Neighbor Smoothing for Thoracic Disease Identification [83.6017225363714]
deep learning has become the most powerful computer-aided diagnosis technology for improving disease identification performance. For chest X-ray imaging, annotating large-scale data requires professional domain knowledge and is time-consuming. In this paper, we propose many-to-one distribution learning (MODL) and K-nearest neighbor smoothing (KNNS) methods to improve a single model's disease identification performance.
arXiv Detail & Related papers (2021-02-26T02:29:30Z)

This list is automatically generated from the titles and abstracts of the papers in this site.