A Data-Driven Guided Decoding Mechanism for Diagnostic Captioning
- URL: http://arxiv.org/abs/2406.14164v1
- Date: Thu, 20 Jun 2024 10:08:17 GMT
- Title: A Data-Driven Guided Decoding Mechanism for Diagnostic Captioning
- Authors: Panagiotis Kaliosis, John Pavlopoulos, Foivos Charalampakos, Georgios Moschovis, Ion Androutsopoulos
- Abstract summary: Diagnostic Captioning (DC) automatically generates a diagnostic text from one or more medical images of a patient.
We propose a new data-driven guided decoding method that incorporates medical information into the beam search of the diagnostic text generation process.
We evaluate the proposed method on two medical datasets using four DC systems that range from generic image-to-text systems with CNN encoders to pre-trained Large Language Models.
- Score: 11.817595076396925
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Diagnostic Captioning (DC) automatically generates a diagnostic text from one or more medical images (e.g., X-rays, MRIs) of a patient. Treated as a draft, the generated text may assist clinicians, by providing an initial estimation of the patient's condition, speeding up and helping safeguard the diagnostic process. The accuracy of a diagnostic text, however, strongly depends on how well the key medical conditions depicted in the images are expressed. We propose a new data-driven guided decoding method that incorporates medical information, in the form of existing tags capturing key conditions of the image(s), into the beam search of the diagnostic text generation process. We evaluate the proposed method on two medical datasets using four DC systems that range from generic image-to-text systems with CNN encoders and RNN decoders to pre-trained Large Language Models. The latter can also be used in few- and zero-shot learning scenarios. In most cases, the proposed mechanism improves performance with respect to all evaluation measures. We provide an open-source implementation of the proposed method at https://github.com/nlpaueb/dmmcs.
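The idea of injecting tag information into beam search can be illustrated with a toy sketch. This is not the authors' implementation (see the repository linked above for that). It assumes a simple additive rule in which each candidate's log-probability is increased by a weight `alpha` times the fraction of image tags already expressed in the text. The names `tag_coverage`, `guided_beam_search`, `toy_model`, and `alpha`, as well as the toy vocabulary and probabilities, are hypothetical and chosen only for illustration.

```python
import math
from typing import Callable, Dict, List, Sequence, Set, Tuple

# Hypothetical interface: maps a token prefix to next-token probabilities.
NextFn = Callable[[Sequence[str]], Dict[str, float]]


def tag_coverage(tokens: Sequence[str], tag_words: Dict[str, Set[str]]) -> float:
    """Fraction of tags that have at least one associated word in `tokens`."""
    if not tag_words:
        return 0.0
    present = set(tokens)
    hits = sum(1 for words in tag_words.values() if words & present)
    return hits / len(tag_words)


def guided_beam_search(next_probs: NextFn, tag_words: Dict[str, Set[str]],
                       beam_size: int = 2, max_len: int = 4,
                       alpha: float = 1.0, eos: str = "<eos>") -> List[str]:
    """Beam search ranked by log-probability plus an assumed tag-coverage bonus."""
    beams: List[Tuple[List[str], float]] = [([], 0.0)]
    for _ in range(max_len):
        candidates: List[Tuple[List[str], float]] = []
        for tokens, logp in beams:
            if tokens and tokens[-1] == eos:
                candidates.append((tokens, logp))  # finished hypothesis, keep as is
                continue
            for tok, p in next_probs(tokens).items():
                if p > 0:
                    candidates.append((tokens + [tok], logp + math.log(p)))
        # Assumed guidance rule: reward hypotheses that already mention the tags.
        candidates.sort(
            key=lambda c: c[1] + alpha * tag_coverage(c[0], tag_words),
            reverse=True,
        )
        beams = candidates[:beam_size]
        if all(t and t[-1] == eos for t, _ in beams):
            break
    return beams[0][0]


def toy_model(prefix: Sequence[str]) -> Dict[str, float]:
    """Stand-in for a captioning decoder; the probabilities are made up."""
    if not prefix:
        return {"no": 0.6, "mild": 0.4}
    table = {
        "no": {"findings": 0.9, "effusion": 0.1},
        "mild": {"cardiomegaly": 0.7, "effusion": 0.3},
        "findings": {"<eos>": 1.0},
        "cardiomegaly": {"<eos>": 1.0},
        "effusion": {"<eos>": 1.0},
    }
    return table[prefix[-1]]


# One tag attached to the image, with words assumed to express it.
tags = {"cardiomegaly": {"cardiomegaly", "enlarged"}}

unguided = guided_beam_search(toy_model, tags, alpha=0.0)
guided = guided_beam_search(toy_model, tags, alpha=1.0)
print(unguided)
print(guided)
```

With `alpha=0.0` the search reduces to ordinary beam search and the toy decoder prefers the generic "no findings" caption. With `alpha=1.0` the hypothesis that mentions the tagged condition is ranked first.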
Related papers
- Clinical Evaluation of Medical Image Synthesis: A Case Study in Wireless Capsule Endoscopy [63.39037092484374]
This study focuses on the clinical evaluation of medical Synthetic Data Generation using Artificial Intelligence (AI) models.
The paper contributes by a) presenting a protocol for the systematic evaluation of synthetic images by medical experts and b) applying it to assess TIDE-II, a novel variational autoencoder-based model for high-resolution WCE image synthesis.
The results show that TIDE-II generates clinically relevant WCE images, helping to address data scarcity and enhance diagnostic tools.
arXiv Detail & Related papers (2024-10-31T19:48:50Z)
- Radiology Report Generation Using Transformers Conditioned with Non-imaging Data [55.17268696112258]
This paper proposes a novel multi-modal transformer network that integrates chest x-ray (CXR) images and associated patient demographic information.
The proposed network uses a convolutional neural network to extract visual features from CXRs and a transformer-based encoder-decoder network that combines the visual features with semantic text embeddings of patient demographic information.
arXiv Detail & Related papers (2023-11-18T14:52:26Z)
- A ChatGPT Aided Explainable Framework for Zero-Shot Medical Image Diagnosis [15.13309228766603]
We propose a novel CLIP-based zero-shot medical image classification framework supplemented with ChatGPT for explainable diagnosis.
The key idea is to query large language models (LLMs) with category names to automatically generate additional cues and knowledge.
Extensive results on one private dataset and four public datasets along with detailed analysis demonstrate the effectiveness and explainability of our training-free zero-shot diagnosis pipeline.
arXiv Detail & Related papers (2023-07-05T01:45:19Z)
- Multimorbidity Content-Based Medical Image Retrieval Using Proxies [37.47987844057842]
We propose a novel multi-label metric learning method that can be used for both classification and content-based image retrieval.
Our model is able to support diagnosis by predicting the presence of diseases and provide evidence for these predictions.
We demonstrate the efficacy of our approach to both classification and content-based image retrieval on two multimorbidity radiology datasets.
arXiv Detail & Related papers (2022-11-22T11:23:53Z)
- Morphology-Aware Interactive Keypoint Estimation [32.52024944963992]
Diagnosis based on medical images often involves manual annotation of anatomical keypoints.
We propose a novel deep neural network that automatically detects and refines the anatomical keypoints through a user-interactive system.
arXiv Detail & Related papers (2022-09-15T09:27:14Z)
- Preservation of High Frequency Content for Deep Learning-Based Medical Image Classification [74.84221280249876]
An efficient analysis of large amounts of chest radiographs can aid physicians and radiologists.
We propose a novel Discrete Wavelet Transform (DWT)-based method for the efficient identification and encoding of visual information.
arXiv Detail & Related papers (2022-05-08T15:29:54Z)
- Generative Residual Attention Network for Disease Detection [51.60842580044539]
We present a novel approach for disease generation in X-rays using conditional generative adversarial learning.
We generate a corresponding radiology image in a target domain while preserving the identity of the patient.
We then use the generated X-ray image in the target domain to augment our training to improve the detection performance.
arXiv Detail & Related papers (2021-10-25T14:15:57Z)
- BI-RADS-Net: An Explainable Multitask Learning Approach for Cancer Diagnosis in Breast Ultrasound Images [69.41441138140895]
This paper introduces BI-RADS-Net, a novel explainable deep learning approach for cancer detection in breast ultrasound images.
The proposed approach incorporates tasks for explaining and classifying breast tumors, by learning feature representations relevant to clinical diagnosis.
Explanations of the predictions (benign or malignant) are provided in terms of morphological features that are used by clinicians for diagnosis and reporting in medical practice.
arXiv Detail & Related papers (2021-10-05T19:14:46Z)
- A Meta-embedding-based Ensemble Approach for ICD Coding Prediction [64.42386426730695]
International Classification of Diseases (ICD) codes are the de facto codes used globally for clinical coding.
These codes enable healthcare providers to claim reimbursement and facilitate efficient storage and retrieval of diagnostic information.
Our proposed approach enhances the performance of neural models by effectively training word vectors using routine medical data as well as external knowledge from scientific articles.
arXiv Detail & Related papers (2021-02-26T17:49:58Z)
- Explaining Predictions of Deep Neural Classifier via Activation Analysis [0.11470070927586014]
We present a novel approach to explain and support an interpretation of the decision-making process to a human expert operating a deep learning system based on a Convolutional Neural Network (CNN).
Our results indicate that our method is capable of detecting distinct prediction strategies that enable us to identify the most similar predictions from an existing atlas.
arXiv Detail & Related papers (2020-12-03T20:36:19Z)
This list is automatically generated from the titles and abstracts of the papers on this site.