Indication as Prior Knowledge for Multimodal Disease Classification in
Chest Radiographs with Transformers
- URL: http://arxiv.org/abs/2202.06076v1
- Date: Sat, 12 Feb 2022 14:23:30 GMT
- Title: Indication as Prior Knowledge for Multimodal Disease Classification in
Chest Radiographs with Transformers
- Authors: Grzegorz Jacenków, Alison Q. O'Neil, Sotirios A. Tsaftaris
- Abstract summary: We use the indication field to drive better image classification by taking a transformer network that is unimodally pre-trained on text.
We evaluate the method on the MIMIC-CXR dataset, and present ablation studies to investigate the effect of the indication field on the classification performance.
- Score: 15.841982111622626
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: When a clinician refers a patient for an imaging exam, they include the
reason (e.g. relevant patient history, suspected disease) in the scan request;
this appears as the indication field in the radiology report. The
interpretation and reporting of the image are substantially influenced by this
request text, steering the radiologist to focus on particular aspects of the
image. We use the indication field to drive better image classification by
taking a transformer network that is unimodally pre-trained on text (BERT) and
fine-tuning it for multimodal classification of a dual image-text input. We
evaluate the method on the MIMIC-CXR dataset, and present ablation studies to
investigate the effect of the indication field on the classification
performance. The experimental results show our approach achieves 87.8 average
micro AUROC, outperforming the state-of-the-art methods for unimodal (84.4) and
multimodal (86.0) classification. Our code is available at
https://github.com/jacenkow/mmbt.
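To make the architecture concrete, here is a minimal PyTorch sketch of the MMBT-style design the abstract describes: pooled CNN features of the radiograph are projected to BERT's hidden width and prepended as a single extra token to the tokenized indication field, and the text-pretrained BERT is then fine-tuned end to end for multi-label classification. The ResNet-50 backbone, the 14-way label head, and all module names are illustrative assumptions; the authors' actual implementation lives in the repository linked above.

```python
# Minimal sketch (not the authors' exact code; see the repository above) of
# an MMBT-style classifier: a text-pretrained BERT fine-tuned on a dual
# image-text input, with pooled CNN features of the radiograph injected as
# one extra token embedding alongside the tokenized indication field.
import torch
import torch.nn as nn
from torchvision.models import resnet50
from transformers import BertModel

class IndicationMultimodalClassifier(nn.Module):
    def __init__(self, num_labels: int = 14):  # e.g. the CheXpert label set
        super().__init__()
        self.bert = BertModel.from_pretrained("bert-base-uncased")
        backbone = resnet50(weights="IMAGENET1K_V2")
        self.cnn = nn.Sequential(*list(backbone.children())[:-1])  # drop fc head
        hidden = self.bert.config.hidden_size
        self.img_proj = nn.Linear(2048, hidden)  # CNN features -> BERT width
        self.classifier = nn.Linear(hidden, num_labels)

    def forward(self, image, input_ids, attention_mask):
        # One image "token" prepended to the indication-field text tokens.
        img_tok = self.img_proj(self.cnn(image).flatten(1)).unsqueeze(1)
        txt_emb = self.bert.embeddings.word_embeddings(input_ids)
        embeds = torch.cat([img_tok, txt_emb], dim=1)
        img_mask = attention_mask.new_ones(image.size(0), 1)
        mask = torch.cat([img_mask, attention_mask], dim=1)
        out = self.bert(inputs_embeds=embeds, attention_mask=mask)
        return self.classifier(out.last_hidden_state[:, 0])  # multi-label logits
```

Training such a model would pair the logits with torch.nn.BCEWithLogitsLoss over the label vector and report micro AUROC, matching the multi-label evaluation described in the abstract.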
Related papers
- MedIAnomaly: A comparative study of anomaly detection in medical images [26.319602363581442]
Anomaly detection (AD) aims at detecting abnormal samples that deviate from the expected normal patterns.
Despite numerous methods for medical AD, we observe a lack of a fair and comprehensive evaluation.
This paper builds a benchmark with unified comparison.
arXiv Detail & Related papers (2024-04-06T06:18:11Z)
- VALD-MD: Visual Attribution via Latent Diffusion for Medical Diagnostics [0.0]
Visual attribution in medical imaging seeks to make evident the diagnostically-relevant components of a medical image.
We here present a novel generative visual attribution technique, one that leverages latent diffusion models in combination with domain-specific large language models.
The resulting system also exhibits a range of latent capabilities including zero-shot localized disease induction.
arXiv Detail & Related papers (2023-12-11T02:27:45Z)
- Mining Gaze for Contrastive Learning toward Computer-Assisted Diagnosis [61.089776864520594]
We propose eye-tracking as an alternative to text reports for medical images.
By tracking the gaze of radiologists as they read and diagnose medical images, we can understand their visual attention and clinical reasoning.
We introduce the Medical contrastive Gaze Image Pre-training (McGIP) as a plug-and-play module for contrastive learning frameworks.
arXiv Detail & Related papers (2023-11-18T14:52:26Z)
- Radiology Report Generation Using Transformers Conditioned with Non-imaging Data [55.17268696112258]
This paper proposes a novel multi-modal transformer network that integrates chest X-ray (CXR) images and associated patient demographic information.
The proposed network uses a convolutional neural network to extract visual features from CXRs and a transformer-based encoder-decoder network that combines the visual features with semantic text embeddings of the patient demographic information (a schematic sketch of this conditioning pattern appears after this list).
arXiv Detail & Related papers (2023-09-30T15:52:18Z)
- MVC: A Multi-Task Vision Transformer Network for COVID-19 Diagnosis from Chest X-ray Images [10.616065108433798]
We propose a new method, namely Multi-task Vision Transformer (MVC) for simultaneously classifying chest X-ray images and identifying affected regions from the input data.
Our method is built upon the Vision Transformer but extends its learning capability in a multi-task setting.
arXiv Detail & Related papers (2023-05-29T11:25:57Z)
- HGT: A Hierarchical GCN-Based Transformer for Multimodal Periprosthetic Joint Infection Diagnosis Using CT Images and Text [0.0]
Prosthetic Joint Infection (PJI) is a prevalent and severe complication.
Currently, a unified diagnostic standard incorporating both computed tomography (CT) images and numerical text data for PJI remains unestablished.
This study introduces a diagnostic method, HGT, based on deep learning and multimodal techniques.
arXiv Detail & Related papers (2022-08-17T09:07:45Z)
- Data-Efficient Vision Transformers for Multi-Label Disease Classification on Chest Radiographs [55.78588835407174]
Vision Transformers (ViTs) have not been applied to this task despite their high classification performance on generic images.
ViTs do not rely on convolutions but on patch-based self-attention; in contrast to CNNs, they encode no prior knowledge of local connectivity.
Our results show that while ViTs and CNNs perform on par, with a small benefit for ViTs, DeiTs outperform plain ViTs if a reasonably large dataset is available for training.
arXiv Detail & Related papers (2022-07-10T06:32:56Z)
- Radiomics-Guided Global-Local Transformer for Weakly Supervised Pathology Localization in Chest X-Rays [65.88435151891369]
The Radiomics-Guided Transformer (RGT) fuses global image information with local knowledge-guided radiomics information.
RGT consists of an image Transformer branch, a radiomics Transformer branch, and fusion layers that aggregate image and radiomic information (a dual-branch sketch appears after this list).
arXiv Detail & Related papers (2022-07-05T22:06:52Z)
- Multi-Label Retinal Disease Classification using Transformers [0.0]
A new multi-label retinal disease dataset, MuReD, is constructed, using a number of publicly available datasets for fundus disease classification.
A transformer-based model optimized through extensive experimentation is used for image analysis and decision making.
It is shown that the approach outperforms state-of-the-art works on the same task by 7.9% and 8.1% in AUC for disease detection and disease classification, respectively.
arXiv Detail & Related papers (2022-05-08T15:29:54Z)
- Preservation of High Frequency Content for Deep Learning-Based Medical Image Classification [74.84221280249876]
Efficient analysis of large numbers of chest radiographs can aid physicians and radiologists.
We propose a novel Discrete Wavelet Transform (DWT)-based method for the efficient identification and encoding of visual information (a minimal wavelet sketch appears after this list).
arXiv Detail & Related papers (2022-07-05T22:06:52Z)
- Synergistic Learning of Lung Lobe Segmentation and Hierarchical Multi-Instance Classification for Automated Severity Assessment of COVID-19 in CT Images [61.862364277007934]
We propose a synergistic learning framework for automated severity assessment of COVID-19 in 3D CT images.
A multi-task deep network (called M$^2$UNet) is then developed to assess the severity of COVID-19 patients.
Our M$^2$UNet consists of a patch-level encoder, a segmentation sub-network for lung lobe segmentation, and a classification sub-network for severity assessment.
arXiv Detail & Related papers (2020-05-08T03:16:15Z)
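Hedged code sketches for a few of the entries above follow; each is a minimal illustration of the described design under stated assumptions, not a reproduction of the cited work. First, the conditioning pattern from "Radiology Report Generation Using Transformers Conditioned with Non-imaging Data": CNN feature-map tokens and embedded demographic tokens are concatenated into the encoder memory of a transformer encoder-decoder that generates the report. All sizes and names are placeholders.

```python
# Illustrative sketch (not the cited paper's code): a transformer
# encoder-decoder whose encoder memory mixes CNN patch features with
# embedded demographic tokens; the decoder generates report tokens.
import torch
import torch.nn as nn
from torchvision.models import resnet50

class ConditionedReportGenerator(nn.Module):
    def __init__(self, vocab_size: int = 30522, d_model: int = 512):
        super().__init__()
        backbone = resnet50(weights="IMAGENET1K_V2")
        self.cnn = nn.Sequential(*list(backbone.children())[:-2])  # keep spatial map
        self.img_proj = nn.Conv2d(2048, d_model, kernel_size=1)    # to model width
        self.tok_emb = nn.Embedding(vocab_size, d_model)  # demographics and report
        self.transformer = nn.Transformer(d_model, batch_first=True)
        self.lm_head = nn.Linear(d_model, vocab_size)

    def forward(self, image, demo_ids, report_ids):
        # (B, 2048, h, w) -> (B, h*w, d_model) image tokens.
        img_tokens = self.img_proj(self.cnn(image)).flatten(2).transpose(1, 2)
        memory_in = torch.cat([img_tokens, self.tok_emb(demo_ids)], dim=1)
        tgt = self.tok_emb(report_ids)
        causal = self.transformer.generate_square_subsequent_mask(
            tgt.size(1)).to(image.device)
        out = self.transformer(memory_in, tgt, tgt_mask=causal)
        return self.lm_head(out)  # next-token logits for the report text
```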
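Next, the dual-branch idea from "Radiomics-Guided Global-Local Transformer": an image Transformer branch, a radiomics Transformer branch, and fusion layers over their concatenated tokens. The layer counts, model width, and the 107-feature radiomics vector are placeholder assumptions.

```python
# Illustrative dual-branch fusion: separate transformer encoders for image
# patch tokens and radiomics feature tokens, fused by a shared encoder.
import torch
import torch.nn as nn

class DualBranchFusion(nn.Module):
    def __init__(self, d_model: int = 256, n_radiomics: int = 107):
        super().__init__()
        layer = lambda: nn.TransformerEncoderLayer(d_model, nhead=8, batch_first=True)
        self.image_branch = nn.TransformerEncoder(layer(), num_layers=4)
        self.radiomics_proj = nn.Linear(1, d_model)   # scalar feature -> token
        self.radiomics_branch = nn.TransformerEncoder(layer(), num_layers=2)
        self.fusion = nn.TransformerEncoder(layer(), num_layers=2)
        self.head = nn.Linear(d_model, 14)            # e.g. a CXR label set

    def forward(self, patch_tokens, radiomics):  # (B, N, d), (B, n_radiomics)
        img = self.image_branch(patch_tokens)
        rad = self.radiomics_branch(self.radiomics_proj(radiomics.unsqueeze(-1)))
        fused = self.fusion(torch.cat([img, rad], dim=1))
        return self.head(fused.mean(dim=1))       # pooled classification logits
```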
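The wavelet sketch promised in the "Preservation of High Frequency Content" entry: a one-level 2D Haar decomposition whose four sub-bands are stacked as input channels, keeping high-frequency detail explicit for a downstream classifier. The exact encoding in the paper may differ.

```python
# Minimal sketch with PyWavelets: split a radiograph into approximation and
# detail sub-bands and stack them as channels for a classifier.
import numpy as np
import pywt  # PyWavelets

def dwt_channels(image: np.ndarray) -> np.ndarray:
    """Stack the four level-1 Haar sub-bands of a 2D image as channels."""
    cA, (cH, cV, cD) = pywt.dwt2(image, "haar")
    return np.stack([cA, cH, cV, cD])  # shape: (4, H/2, W/2)

x = np.random.rand(256, 256).astype(np.float32)  # stand-in radiograph
print(dwt_channels(x).shape)  # (4, 128, 128)
```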
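Finally, the multi-task wiring described for M$^2$UNet in the last entry: a shared patch-level encoder feeding both a lung-lobe segmentation head and a severity-classification head. All shapes and module sizes are placeholders, not the paper's.

```python
# Schematic multi-task layout: one shared 3D encoder, two task heads.
import torch
import torch.nn as nn

class MultiTaskSeverityNet(nn.Module):
    def __init__(self, in_ch: int = 1, num_lobes: int = 5, num_severity: int = 2):
        super().__init__()
        self.encoder = nn.Sequential(  # shared patch-level feature extractor
            nn.Conv3d(in_ch, 32, 3, padding=1), nn.ReLU(),
            nn.Conv3d(32, 64, 3, padding=1), nn.ReLU(),
        )
        self.seg_head = nn.Conv3d(64, num_lobes, 1)  # per-voxel lobe labels
        self.cls_head = nn.Sequential(               # pooled features -> severity
            nn.AdaptiveAvgPool3d(1), nn.Flatten(),
            nn.Linear(64, num_severity),
        )

    def forward(self, ct_patch):
        feats = self.encoder(ct_patch)
        return self.seg_head(feats), self.cls_head(feats)
```

Joint training, e.g. a Dice or cross-entropy segmentation loss added to the severity cross-entropy, is what couples the two tasks.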