Improving Pneumonia Localization via Cross-Attention on Medical Images
and Reports
- URL: http://arxiv.org/abs/2110.03094v1
- Date: Wed, 6 Oct 2021 22:47:48 GMT
- Title: Improving Pneumonia Localization via Cross-Attention on Medical Images
and Reports
- Authors: Riddhish Bhalodia and Ali Hatamizadeh and Leo Tam and Ziyue Xu and
Xiaosong Wang and Evrim Turkbey and Daguang Xu
- Abstract summary: We propose a novel weakly-supervised attention-driven deep learning model that leverages encoded information in medical reports during training to facilitate better localization.
Our model also performs classification of attributes that are associated with pneumonia and extracted from medical reports for supervision.
In this paper, we explore and analyze the model using chest X-ray datasets and demonstrate qualitatively and quantitatively that the introduction of textual information improves pneumonia localization.
- Score: 9.034599866957945
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Localization and characterization of diseases like pneumonia are primary
steps in a clinical pipeline, facilitating detailed clinical diagnosis and
subsequent treatment planning. Additionally, such location-annotated datasets
can provide a pathway for deep learning models to be used for downstream tasks.
However, acquiring quality annotations is expensive in terms of human resources and
usually requires domain expertise. On the other hand, medical reports contain a
plethora of information both about pneumonia characteristics and its location.
In this paper, we propose a novel weakly-supervised attention-driven deep
learning model that leverages encoded information in medical reports during
training to facilitate better localization. Our model also performs
classification of attributes that are associated with pneumonia and extracted
from medical reports for supervision. Both the classification and localization
are trained jointly, and once trained, the model can be used for both
the localization and characterization of pneumonia using only the input image.
In this paper, we explore and analyze the model using chest X-ray datasets and
demonstrate qualitatively and quantitatively that the introduction of textual
information improves pneumonia localization. We showcase quantitative results
on two datasets, MIMIC-CXR and Chest X-ray-8, and also demonstrate severity
characterization on the COVID-19 dataset.
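As a rough illustration of the cross-attention idea described in the abstract, the sketch below (a minimal, assumed design, not the authors' exact architecture) lets encoded report tokens attend over CNN feature-map locations, producing a spatial attention map that can serve as a weak localization signal alongside attribute-classification logits. All module names, dimensions, and pooling choices are illustrative assumptions.

```python
# Minimal sketch, NOT the authors' exact architecture: report tokens attend over
# spatial image features; the attention weights double as a coarse localization
# map, and the attended features feed an attribute classifier. Dimensions,
# module names, and the mean-pooling choices are illustrative assumptions.
import torch
import torch.nn as nn


class ImageReportCrossAttention(nn.Module):
    def __init__(self, img_channels=1024, text_dim=768, embed_dim=256, num_attributes=8):
        super().__init__()
        self.img_proj = nn.Conv2d(img_channels, embed_dim, kernel_size=1)  # project CNN feature map
        self.txt_proj = nn.Linear(text_dim, embed_dim)                      # project report token embeddings
        self.attr_head = nn.Linear(embed_dim, num_attributes)               # pneumonia-attribute classifier

    def forward(self, feat_map, report_emb):
        # feat_map: (B, C, H, W) from an image backbone; report_emb: (B, T, text_dim) from a text encoder
        img = self.img_proj(feat_map)                         # (B, D, H, W)
        B, D, H, W = img.shape
        img = img.flatten(2).transpose(1, 2)                  # (B, H*W, D) spatial tokens
        txt = self.txt_proj(report_emb)                       # (B, T, D) text tokens

        # Cross-attention: each text token attends over all spatial locations
        attn = torch.softmax(txt @ img.transpose(1, 2) / D ** 0.5, dim=-1)  # (B, T, H*W)
        attended = attn @ img                                 # (B, T, D) text-conditioned image context

        loc_map = attn.mean(dim=1).reshape(B, H, W)           # coarse localization heat map
        attr_logits = self.attr_head(attended.mean(dim=1))    # (B, num_attributes)
        return loc_map, attr_logits
```

At test time the abstract states that only the input image is needed, so the text branch would have to be dropped or replaced (e.g., by learned queries); how that is done is not specified here and is left as an assumption.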
Related papers
- Towards reliable respiratory disease diagnosis based on cough sounds and vision transformers [14.144599890583308]
We propose a novel approach to cough-based disease classification based on both self-supervised and supervised learning on a large-scale cough data set.
Experimental results demonstrate that our proposed approach consistently outperforms prior art on two benchmark datasets for COVID-19 diagnosis and on a proprietary dataset for COPD/non-COPD classification, achieving an AUROC of 92.5%.
arXiv Detail & Related papers (2024-08-28T09:40:40Z)
- Self-supervised vision-language alignment of deep learning representations for bone X-rays analysis [53.809054774037214]
This paper proposes leveraging vision-language pretraining on bone X-rays paired with French reports.
It is the first study to integrate French reports to shape the embedding space for bone X-ray representations.
arXiv Detail & Related papers (2024-05-14T19:53:20Z)
- Classifying Cancer Stage with Open-Source Clinical Large Language Models [0.35998666903987897]
Open-source clinical large language models (LLMs) can extract pathologic tumor-node-metastasis (pTNM) staging information from real-world pathology reports.
Our findings suggest that while LLMs still exhibit subpar performance on Tumor (T) classification, appropriate prompting strategies yield comparable performance on Metastasis (M) classification and improved performance on Node (N) classification.
arXiv Detail & Related papers (2024-04-02T02:30:47Z)
- Vision-Language Modelling For Radiological Imaging and Reports In The Low Data Regime [70.04389979779195]
This paper explores training medical vision-language models (VLMs) where the visual and language inputs are embedded into a common space.
We explore several candidate methods to improve low-data performance, including adapting generic pre-trained models to novel image and text domains.
Using text-to-image retrieval as a benchmark, we evaluate the performance of these methods with variable-sized training datasets of paired chest X-rays and radiological reports.
arXiv Detail & Related papers (2023-03-30T18:20:00Z)
- Medical Image Captioning via Generative Pretrained Transformers [57.308920993032274]
We combine two language models, the Show-Attend-Tell and the GPT-3, to generate comprehensive and descriptive radiology records.
The proposed model is tested on two medical datasets, Open-I and MIMIC-CXR, and on the general-purpose MS-COCO.
arXiv Detail & Related papers (2022-09-28T10:27:10Z)
- Deep Pneumonia: Attention-Based Contrastive Learning for Class-Imbalanced Pneumonia Lesion Recognition in Chest X-rays [11.229472535033558]
We propose a deep learning framework named Attention-Based Contrastive Learning for Class-Imbalanced X-Ray Pneumonia Lesion Recognition.
Our proposed framework can be used as a reliable computer-aided pneumonia diagnosis system to assist doctors in diagnosing pneumonia cases accurately.
arXiv Detail & Related papers (2022-07-23T02:28:37Z)
- MMLN: Leveraging Domain Knowledge for Multimodal Diagnosis [10.133715767542386]
We propose a knowledge-driven and data-driven framework for lung disease diagnosis.
We formulate diagnosis rules according to authoritative clinical medicine guidelines and learn the weights of rules from text data.
A multimodal fusion consisting of text and image data is designed to infer the marginal probability of lung disease.
arXiv Detail & Related papers (2022-02-09T04:12:30Z)
- CoRSAI: A System for Robust Interpretation of CT Scans of COVID-19 Patients Using Deep Learning [133.87426554801252]
We adopted an approach based on an ensemble of deep convolutional neural networks for segmentation of lung CT scans.
Using our models, we are able to segment the lesions, evaluate patient dynamics, estimate the relative volume of lungs affected by lesions, and evaluate the lung damage stage.
arXiv Detail & Related papers (2021-05-25T12:06:55Z)
- Variational Knowledge Distillation for Disease Classification in Chest X-Rays [102.04931207504173]
We propose variational knowledge distillation (VKD), a new probabilistic inference framework for disease classification based on X-rays.
We demonstrate the effectiveness of our method on three public benchmark datasets with paired X-ray images and EHRs.
arXiv Detail & Related papers (2021-03-19T14:13:56Z)
- Select-ProtoNet: Learning to Select for Few-Shot Disease Subtype Prediction [55.94378672172967]
We focus on the few-shot disease subtype prediction problem, identifying subgroups of similar patients.
We introduce meta learning techniques to develop a new model, which can extract the common experience or knowledge from interrelated clinical tasks.
Our new model is built upon a carefully designed meta-learner, the Prototypical Network, a simple yet effective meta-learning method for few-shot image classification.
arXiv Detail & Related papers (2020-09-02T02:50:30Z)
- Localization of Critical Findings in Chest X-Ray without Local Annotations Using Multi-Instance Learning [0.0]
Deep learning models commonly suffer from a lack of explainability.
They also require locally annotated training data in the form of pixel-level labels or bounding box coordinates.
In this work, we address these shortcomings with an interpretable DL algorithm based on multi-instance learning (an illustrative sketch of attention-based multi-instance pooling follows this list).
arXiv Detail & Related papers (2020-01-23T21:29:14Z)
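Several entries above, including the main paper's weakly-supervised setting and the multi-instance learning work just listed, aggregate patch-level evidence into an image-level prediction without pixel or box annotations. The sketch below shows a generic attention-based multi-instance pooling step, given purely as an illustration of that pattern; the dimensions and module names are assumptions, not any specific paper's method.

```python
# Illustrative only: attention-based multi-instance pooling. Patch (instance)
# features are weighted by a learned attention score and summed into a single
# bag (image) embedding; the per-patch weights act as a weak localization signal.
import torch
import torch.nn as nn


class AttentionMILPooling(nn.Module):
    def __init__(self, feat_dim=512, hidden_dim=128, num_classes=1):
        super().__init__()
        self.score = nn.Sequential(
            nn.Linear(feat_dim, hidden_dim),
            nn.Tanh(),
            nn.Linear(hidden_dim, 1),
        )
        self.classifier = nn.Linear(feat_dim, num_classes)

    def forward(self, patch_feats):
        # patch_feats: (B, N, feat_dim), N patch features per image (one bag per image)
        weights = torch.softmax(self.score(patch_feats), dim=1)  # (B, N, 1) per-patch importance
        bag_feat = (weights * patch_feats).sum(dim=1)            # (B, feat_dim) weighted bag embedding
        logits = self.classifier(bag_feat)                       # (B, num_classes) image-level prediction
        return logits, weights.squeeze(-1)                       # weights highlight suspicious regions
```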
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.