Insight: A Multi-Modal Diagnostic Pipeline using LLMs for Ocular Surface Disease Diagnosis
- URL: http://arxiv.org/abs/2410.00292v1
- Date: Tue, 1 Oct 2024 00:23:05 GMT
- Title: Insight: A Multi-Modal Diagnostic Pipeline using LLMs for Ocular Surface Disease Diagnosis
- Authors: Chun-Hsiao Yeh, Jiayun Wang, Andrew D. Graham, Andrea J. Liu, Bo Tan, Yubei Chen, Yi Ma, Meng C. Lin
- Abstract summary: We introduce an innovative multi-modal diagnostic pipeline (MDPipe) by employing large language models (LLMs) for ocular surface disease diagnosis.
- Score: 17.970320199904084
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Accurate diagnosis of ocular surface diseases is critical in optometry and ophthalmology, and it hinges on integrating clinical data sources (e.g., meibography imaging and clinical metadata). Traditional human assessments lack precision in quantifying clinical observations, while current machine-based methods often treat diagnoses as multi-class classification problems, limiting the diagnoses to a predefined closed set of curated answers without reasoning about the clinical relevance of each variable to the diagnosis. To tackle these challenges, we introduce an innovative multi-modal diagnostic pipeline (MDPipe) by employing large language models (LLMs) for ocular surface disease diagnosis. We first employ a visual translator to interpret meibography images by converting them into quantifiable morphology data, facilitating their integration with clinical metadata and enabling the communication of nuanced medical insight to LLMs. To further advance this communication, we introduce an LLM-based summarizer to contextualize the insight from the combined morphology and clinical metadata, and generate clinical report summaries. Finally, we refine the LLMs' reasoning ability with domain-specific insight from real-life clinician diagnoses. Our evaluation across diverse ocular surface disease diagnosis benchmarks demonstrates that MDPipe outperforms existing standards, including GPT-4, and provides clinically sound rationales for diagnoses.
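The abstract describes merging quantified morphology data with clinical metadata before handing the result to an LLM-based summarizer. The following is a minimal sketch of that merging step only, not the authors' implementation: the class names, the field names (`gland_count`, `atrophy_percent`, `age`, `symptom_score`), and the prompt wording are all hypothetical, since the abstract does not specify them.

```python
from dataclasses import dataclass


@dataclass
class Morphology:
    # Hypothetical quantified outputs of a visual translator;
    # the abstract does not name the actual morphology features.
    gland_count: int
    atrophy_percent: float


@dataclass
class ClinicalMetadata:
    # Hypothetical clinical fields, chosen only for illustration.
    age: int
    symptom_score: float


def build_summary_prompt(m: Morphology, c: ClinicalMetadata) -> str:
    """Combine morphology and metadata into one text prompt for a summarizer LLM."""
    lines = [
        "Summarize the following findings as a clinical report.",
        f"- Gland count: {m.gland_count}",
        f"- Gland atrophy: {m.atrophy_percent:.1f}%",
        f"- Age: {c.age}",
        f"- Symptom score: {c.symptom_score:.1f}",
    ]
    return "\n".join(lines)


# Example values are made up and carry no clinical meaning.
prompt = build_summary_prompt(Morphology(18, 32.5), ClinicalMetadata(54, 7.0))
print(prompt)
```

Turning image-derived measurements into plain text is what lets them sit alongside tabular metadata in a single LLM input; the summarizer call itself is left out because the abstract gives no details about the model or its interface.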
Related papers
- DiReCT: Diagnostic Reasoning for Clinical Notes via Large Language Models [32.85606857702375]
We aim to evaluate the reasoning ability and interpretability of large language models (LLMs) compared to human doctors.
The diagnostic reasoning dataset for clinical notes (DiReCT) contains 511 clinical notes, each meticulously annotated by physicians.
arXiv Detail & Related papers (2024-08-04T05:15:02Z)
- DSCENet: Dynamic Screening and Clinical-Enhanced Multimodal Fusion for MPNs Subtype Classification [8.95697172316912]
We propose a Dynamic Screening and Clinical-Enhanced Network (DSCENet) for the subtype classification of MPNs on the multimodal fusion of whole slide images (WSIs) and clinical information.
Our approach has been validated on real clinical data, achieving an increase of 7.91% in AUC and 16.89% in accuracy compared with the previous state-of-the-art methods.
arXiv Detail & Related papers (2024-07-11T04:13:58Z)
- Potential of Multimodal Large Language Models for Data Mining of Medical Images and Free-text Reports [51.45762396192655]
Multimodal large language models (MLLMs) have recently transformed many domains, significantly affecting the medical field. Notably, Gemini-Vision-series (Gemini) and GPT-4-series (GPT-4) models have epitomized a paradigm shift in Artificial General Intelligence for computer vision.
This study evaluated the performance of Gemini, GPT-4, and four popular large models across 14 medical imaging datasets.
arXiv Detail & Related papers (2024-07-08T09:08:42Z) - medIKAL: Integrating Knowledge Graphs as Assistants of LLMs for Enhanced Clinical Diagnosis on EMRs [13.806201934732321]
medIKAL combines Large Language Models (LLMs) with knowledge graphs (KGs) to enhance diagnostic capabilities.
medIKAL assigns weighted importance to entities in medical records based on their type, enabling precise localization of candidate diseases within KGs.
We validated medIKAL's effectiveness through extensive experiments on a newly introduced open-sourced Chinese EMR dataset.
arXiv Detail & Related papers (2024-06-20T13:56:52Z) - CliBench: A Multifaceted and Multigranular Evaluation of Large Language Models for Clinical Decision Making [16.310913127940857]
We introduce CliBench, a novel benchmark developed from the MIMIC IV dataset.
This benchmark offers a comprehensive and realistic assessment of LLMs' capabilities in clinical diagnosis.
We conduct a zero-shot evaluation of leading LLMs to assess their proficiency in clinical decision-making.
arXiv Detail & Related papers (2024-06-14T11:10:17Z) - Dr-LLaVA: Visual Instruction Tuning with Symbolic Clinical Grounding [53.629132242389716]
Vision-Language Models (VLM) can support clinicians by analyzing medical images and engaging in natural language interactions.
VLMs often exhibit "hallucinogenic" behavior, generating textual outputs not grounded in contextual multimodal information.
We propose a new alignment algorithm that uses symbolic representations of clinical reasoning to ground VLMs in medical knowledge.
arXiv Detail & Related papers (2024-05-29T23:19:28Z) - Large Language Models are Clinical Reasoners: Reasoning-Aware Diagnosis Framework with Prompt-Generated Rationales [15.362903610463285]
We present a "reasoning-aware" diagnosis framework that rationalizes the diagnostic process via prompt-based learning.
We propose a novel set of criteria for evaluating machine-generated rationales' potential for real-world clinical settings.
arXiv Detail & Related papers (2023-12-12T16:14:45Z) - Polar-Net: A Clinical-Friendly Model for Alzheimer's Disease Detection
in OCTA Images [53.235117594102675]
Optical Coherence Tomography Angiography is a promising tool for detecting Alzheimer's disease (AD) by imaging the retinal microvasculature.
We propose a novel deep-learning framework called Polar-Net to provide interpretable results and leverage clinical prior knowledge.
We show that Polar-Net outperforms existing state-of-the-art methods and provides more valuable pathological evidence for the association between retinal vascular changes and AD.
arXiv Detail & Related papers (2023-11-10T11:49:49Z) - A Transformer-based representation-learning model with unified
processing of multimodal input for clinical diagnostics [63.106382317917344]
We report a Transformer-based representation-learning model as a clinical diagnostic aid that processes multimodal input in a unified manner.
The unified model outperformed an image-only model and non-unified multimodal diagnosis models in the identification of pulmonary diseases.
arXiv Detail & Related papers (2023-06-01T16:23:47Z) - BI-RADS-Net: An Explainable Multitask Learning Approach for Cancer
Diagnosis in Breast Ultrasound Images [69.41441138140895]
This paper introduces BI-RADS-Net, a novel explainable deep learning approach for cancer detection in breast ultrasound images.
The proposed approach incorporates tasks for explaining and classifying breast tumors, by learning feature representations relevant to clinical diagnosis.
Explanations of the predictions (benign or malignant) are provided in terms of morphological features that are used by clinicians for diagnosis and reporting in medical practice.
arXiv Detail & Related papers (2021-10-05T19:14:46Z) - Inheritance-guided Hierarchical Assignment for Clinical Automatic
Diagnosis [50.15205065710629]
Clinical diagnosis, which aims to assign diagnosis codes for a patient based on the clinical note, plays an essential role in clinical decision-making.
We propose a novel framework to combine the inheritance-guided hierarchical assignment and co-occurrence graph propagation for clinical automatic diagnosis.
arXiv Detail & Related papers (2021-01-27T13:16:51Z)
This list is automatically generated from the titles and abstracts of the papers on this site.