Towards Interpretable Radiology Report Generation via Concept Bottlenecks using a Multi-Agentic RAG
- URL: http://arxiv.org/abs/2412.16086v2
- Date: Wed, 22 Jan 2025 17:18:15 GMT
- Title: Towards Interpretable Radiology Report Generation via Concept Bottlenecks using a Multi-Agentic RAG
- Authors: Hasan Md Tusfiqur Alam, Devansh Srivastav, Md Abdul Kadir, Daniel Sonntag
- Abstract summary: This study enhances interpretability in Chest X-ray (CXR) classification by using concept bottleneck models (CBMs) and a multi-agent Retrieval-Augmented Generation (RAG) system for report generation.
By modeling relationships between visual features and clinical concepts, we create interpretable concept vectors that guide a multi-agent RAG system to generate radiology reports.
- Score: 1.9374282535132377
- Abstract: Deep learning has advanced medical image classification, but interpretability challenges hinder its clinical adoption. This study enhances interpretability in Chest X-ray (CXR) classification by using concept bottleneck models (CBMs) and a multi-agent Retrieval-Augmented Generation (RAG) system for report generation. By modeling relationships between visual features and clinical concepts, we create interpretable concept vectors that guide a multi-agent RAG system to generate radiology reports, enhancing clinical relevance, explainability, and transparency. Evaluation of the generated reports using an LLM-as-a-judge confirmed the interpretability and clinical utility of our model's outputs. On the COVID-QU dataset, our model achieved 81% classification accuracy and demonstrated robust report generation performance, with five key metrics ranging between 84% and 90%. This interpretable multi-agent framework bridges the gap between high-performance AI and the explainability required for reliable AI-driven CXR analysis in clinical settings. Our code is available at https://github.com/tifat58/IRR-with-CBM-RAG.git.
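The concept-bottleneck idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the embeddings, concept names, class weights, and the helper functions `concept_vector`, `classify`, and `top_concepts` are all hypothetical, and cosine similarity is only an assumed way of scoring concepts. The point is that the classifier sees only the concept scores, so each prediction can be traced back to named clinical concepts, which could then be passed to a report-generation step.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def concept_vector(image_emb, concept_embs):
    """Score each named concept against the image embedding (assumed scoring rule)."""
    return {name: cosine(image_emb, emb) for name, emb in concept_embs.items()}

def classify(concepts, weights):
    """Linear layer over concept scores only: this is the bottleneck."""
    scores = {
        label: sum(w.get(name, 0.0) * s for name, s in concepts.items())
        for label, w in weights.items()
    }
    return max(scores, key=scores.get), scores

def top_concepts(concepts, k=2):
    """Concepts with the highest scores, e.g. to condition report generation."""
    return sorted(concepts, key=concepts.get, reverse=True)[:k]

# Toy, made-up values purely for illustration.
image_emb = [1.0, 0.0]
concept_embs = {
    "ground-glass opacity": [1.0, 0.0],
    "clear lung fields": [0.0, 1.0],
}
weights = {
    "COVID-19": {"ground-glass opacity": 1.0, "clear lung fields": -1.0},
    "Normal": {"ground-glass opacity": -1.0, "clear lung fields": 1.0},
}

concepts = concept_vector(image_emb, concept_embs)
label, scores = classify(concepts, weights)
print(label, top_concepts(concepts, k=1))
```

Because the class scores are a weighted sum of concept scores, the weights themselves show which concepts pushed the prediction toward each class.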
Related papers
- Hybrid Interpretable Deep Learning Framework for Skin Cancer Diagnosis: Integrating Radial Basis Function Networks with Explainable AI [1.1049608786515839]
Skin cancer is one of the most prevalent and potentially life-threatening diseases worldwide.
We propose a novel hybrid deep learning framework that integrates convolutional neural networks (CNNs) with Radial Basis Function (RBF) Networks to achieve high classification accuracy and enhanced interpretability.
arXiv Detail & Related papers (2025-01-24T19:19:02Z)
- RadAlign: Advancing Radiology Report Generation with Vision-Language Concept Alignment [10.67889367763112]
RadAlign is a novel framework that combines the predictive accuracy of vision-language models with the reasoning capabilities of large language models.
Our framework maintains strong clinical interpretability while reducing hallucinations, advancing automated medical imaging and report analysis through integrated predictive and generative AI.
arXiv Detail & Related papers (2025-01-13T17:55:32Z)
- CXR-Agent: Vision-language models for chest X-ray interpretation with uncertainty aware radiology reporting [0.0]
We evaluate the publicly available, state of the art, foundational vision-language models for chest X-ray interpretation.
We find that vision-language models often hallucinate with confident language, which slows down clinical interpretation.
We develop an agent-based vision-language approach for report generation using CheXagent's linear probes and BioViL-T's phrase grounding tools.
arXiv Detail & Related papers (2024-07-11T18:39:19Z)
- WoLF: Wide-scope Large Language Model Framework for CXR Understanding [8.265578494822087]
We introduce Wide-scope Large Language Model Framework for Chest X-ray understanding.
We capture multi-faceted records of patients, which are utilized for accurate diagnoses in real-world clinical scenarios.
arXiv Detail & Related papers (2024-03-19T06:39:23Z)
- HistGen: Histopathology Report Generation via Local-Global Feature Encoding and Cross-modal Context Interaction [16.060286162384536]
HistGen is a learning-empowered framework for histopathology report generation.
It aims to boost report generation by aligning whole slide images (WSIs) and diagnostic reports from local and global granularity.
Experimental results on WSI report generation show the proposed model outperforms state-of-the-art (SOTA) models by a large margin.
arXiv Detail & Related papers (2024-03-08T15:51:43Z)
- Radiology Report Generation Using Transformers Conditioned with Non-imaging Data [55.17268696112258]
This paper proposes a novel multi-modal transformer network that integrates chest x-ray (CXR) images and associated patient demographic information.
The proposed network uses a convolutional neural network to extract visual features from CXRs and a transformer-based encoder-decoder network that combines the visual features with semantic text embeddings of patient demographic information.
arXiv Detail & Related papers (2023-11-18T14:52:26Z)
- A Transformer-based representation-learning model with unified processing of multimodal input for clinical diagnostics [63.106382317917344]
We report a Transformer-based representation-learning model as a clinical diagnostic aid that processes multimodal input in a unified manner.
The unified model outperformed an image-only model and non-unified multimodal diagnosis models in the identification of pulmonary diseases.
arXiv Detail & Related papers (2023-06-01T16:23:47Z)
- Learning disentangled representations for explainable chest X-ray classification using Dirichlet VAEs [68.73427163074015]
This study explores the use of the Dirichlet Variational Autoencoder (DirVAE) for learning disentangled latent representations of chest X-ray (CXR) images.
The predictive capacity of multi-modal latent representations learned by DirVAE models is investigated through implementation of an auxiliary multi-label classification task.
arXiv Detail & Related papers (2023-02-06T18:10:08Z)
- Cross-modal Clinical Graph Transformer for Ophthalmic Report Generation [116.87918100031153]
We propose a Cross-modal clinical Graph Transformer (CGT) for ophthalmic report generation (ORG).
CGT injects clinical relation triples into the visual features as prior knowledge to drive the decoding procedure.
Experiments on the large-scale FFA-IR benchmark demonstrate that the proposed CGT is able to outperform previous benchmark methods.
arXiv Detail & Related papers (2022-06-04T13:16:30Z)
- Improving Classification Model Performance on Chest X-Rays through Lung Segmentation [63.45024974079371]
We propose a deep learning approach to enhance abnormal chest x-ray (CXR) identification performance through segmentations.
Our approach is designed in a cascaded manner and incorporates two modules: a deep neural network with criss-cross attention modules (XLSor) for localizing lung region in CXR images and a CXR classification model with a backbone of a self-supervised momentum contrast (MoCo) model pre-trained on large-scale CXR data sets.
arXiv Detail & Related papers (2022-02-22T15:24:06Z)
- G-MIND: An End-to-End Multimodal Imaging-Genetics Framework for Biomarker Identification and Disease Classification [49.53651166356737]
We propose a novel deep neural network architecture to integrate imaging and genetics data, as guided by diagnosis, that provides interpretable biomarkers.
We have evaluated our model on a population study of schizophrenia that includes two functional MRI (fMRI) paradigms and Single Nucleotide Polymorphism (SNP) data.
arXiv Detail & Related papers (2021-01-27T19:28:04Z)
This list is automatically generated from the titles and abstracts of the papers on this site.