Related papers: Knowledge-driven AI-generated data for accurate and interpretable breast ultrasound diagnoses

URL: http://arxiv.org/abs/2407.16634v1
Date: Tue, 23 Jul 2024 16:49:01 GMT
Title: Knowledge-driven AI-generated data for accurate and interpretable breast ultrasound diagnoses
Authors: Haojun Yu, Youcheng Li, Nan Zhang, Zihan Niu, Xuantong Gong, Yanwen Luo, Quanlin Wu, Wangyan Qin, Mengyuan Zhou, Jie Han, Jia Tao, Ziwei Zhao, Di Dai, Di He, Dong Wang, Binghui Tang, Ling Huo, Qingli Zhu, Yong Wang, Liwei Wang
Abstract summary: We introduce a pipeline, TAILOR, that builds a knowledge-driven generative model to produce tailored synthetic data. The generative model, using 3,749 lesions as source data, can generate millions of breast-US images, especially for error-prone rare cases. In the prospective external evaluation, our diagnostic model outperforms the average performance of nine radiologists by 33.5% in specificity with the same sensitivity.
Score: 29.70102468004044
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Data-driven deep learning models have shown great capabilities to assist radiologists in breast ultrasound (US) diagnoses. However, their effectiveness is limited by the long-tail distribution of training data, which leads to inaccuracies in rare cases. In this study, we address a long-standing challenge of improving the diagnostic model performance on rare cases using long-tailed data. Specifically, we introduce a pipeline, TAILOR, that builds a knowledge-driven generative model to produce tailored synthetic data. The generative model, using 3,749 lesions as source data, can generate millions of breast-US images, especially for error-prone rare cases. The generated data can be further used to build a diagnostic model for accurate and interpretable diagnoses. In the prospective external evaluation, our diagnostic model outperforms the average performance of nine radiologists by 33.5% in specificity with the same sensitivity, improving their performance by providing predictions with an interpretable decision-making process. Moreover, on ductal carcinoma in situ (DCIS), our diagnostic model outperforms all radiologists by a large margin, with only 34 DCIS lesions in the source data. We believe that TAILOR can potentially be extended to various diseases and imaging modalities.

Related papers

A Foundational Generative Model for Breast Ultrasound Image Analysis [42.618964727896156]
Foundational models have emerged as powerful tools for addressing various tasks in clinical settings. We present BUSGen, the first foundational generative model specifically designed for breast ultrasound analysis. With few-shot adaptation, BUSGen can generate repositories of realistic and informative task-specific data.
arXiv Detail & Related papers (2025-01-12T16:39:13Z)
HIST-AID: Leveraging Historical Patient Reports for Enhanced Multi-Modal Automatic Diagnosis [38.13689106933105]
We present HIST-AID, a framework that enhances automatic diagnostic accuracy using historical reports. Our experiments demonstrate significant improvements, with AUROC increasing by 6.56% and AUPRC by 9.51% compared to models that rely solely on radiographic scans.
arXiv Detail & Related papers (2024-11-16T03:20:53Z)
MGH Radiology Llama: A Llama 3 70B Model for Radiology [50.42811030970618]
This paper presents an advanced radiology-focused large language model: MGH Radiology Llama. It is developed using the Llama 3 70B model, building upon previous domain-specific models like Radiology-GPT and Radiology-Llama2. Our evaluation, incorporating both traditional metrics and a GPT-4-based assessment, highlights the enhanced performance of this work over general-purpose LLMs.
arXiv Detail & Related papers (2024-08-13T01:30:03Z)
A Lung Nodule Dataset with Histopathology-based Cancer Type Annotation [12.617587827105496]
This research aims to bridge the gap by providing publicly accessible datasets and reliable tools for medical diagnosis. We curated a diverse dataset of lung Computed Tomography (CT) images, comprising 330 annotated nodules (nodules are labeled as bounding boxes) from 95 distinct patients. These promising results demonstrate that the dataset has a feasible application and further facilitate intelligent auxiliary diagnosis.
arXiv Detail & Related papers (2024-06-26T06:39:11Z)
Large-scale Long-tailed Disease Diagnosis on Radiology Images [51.453990034460304]
RadDiag is a foundational model supporting 2D and 3D inputs across various modalities and anatomies. Our dataset, RP3D-DiagDS, contains 40,936 cases with 195,010 scans covering 5,568 disorders.
arXiv Detail & Related papers (2023-12-26T18:20:48Z)
DDxT: Deep Generative Transformer Models for Differential Diagnosis [51.25660111437394]
We show that a generative approach trained with simpler supervised and self-supervised learning signals can achieve superior results on the current benchmark. The proposed Transformer-based generative network, named DDxT, autoregressively produces a set of possible pathologies, i.e., DDx, and predicts the actual pathology using a neural network.
arXiv Detail & Related papers (2023-12-02T22:57:25Z)
Deep Reinforcement Learning Framework for Thoracic Diseases Classification via Prior Knowledge Guidance [49.87607548975686]
The scarcity of labeled data for related diseases poses a huge challenge to an accurate diagnosis. We propose a novel deep reinforcement learning framework, which introduces prior knowledge to direct the learning of diagnostic agents. Our approach's performance was demonstrated using the well-known NIH ChestX-ray 14 and CheXpert datasets.
arXiv Detail & Related papers (2023-06-02T01:46:31Z)
A Novel Automated Classification and Segmentation for COVID-19 using 3D CT Scans [5.5957919486531935]
In COVID-19 computed tomography (CT) images of the lungs, ground glass turbidity is the most common finding that requires specialist diagnosis. Some researchers propose the relevant DL models which can replace professional diagnostic specialists in clinics when lacking expertise. Our model achieves 94.52% accuracy in the classification of lung lesions by 3 types: COVID, Pneumonia and Normal.
arXiv Detail & Related papers (2022-08-04T22:14:18Z)
Multi-confound regression adversarial network for deep learning-based diagnosis on highly heterogenous clinical data [1.2891210250935143]
We developed a novel deep learning architecture, MUCRAN, to train a deep learning model on highly heterogeneous clinical data. We trained MUCRAN using 16,821 clinical T1 Axial brain MRIs collected from Massachusetts General Hospital before 2019. The model showed a robust performance of over 90% accuracy on newly collected data.
arXiv Detail & Related papers (2022-05-05T18:39:09Z)
Federated Learning Enables Big Data for Rare Cancer Boundary Detection [98.5549882883963]
We present findings from the largest Federated ML study to-date, involving data from 71 healthcare institutions across 6 continents. We generate an automatic tumor boundary detector for the rare disease of glioblastoma. We demonstrate a 33% improvement over a publicly trained model to delineate the surgically targetable tumor, and 23% improvement over the tumor's entire extent.
arXiv Detail & Related papers (2022-04-22T17:27:00Z)
Towards Reliable and Explainable AI Model for Solid Pulmonary Nodule Diagnosis [20.510918720980467]
Lung cancer has the highest mortality rate of deadly cancers in the world. Computer-aided diagnosis (CAD) systems have been developed to assist radiologists in nodule detection and diagnosis. Lack of model reliability and interpretability remains a major obstacle for its large-scale clinical application.
arXiv Detail & Related papers (2022-04-08T08:21:00Z)
Variational Knowledge Distillation for Disease Classification in Chest X-Rays [102.04931207504173]
We propose variational knowledge distillation (VKD), which is a new probabilistic inference framework for disease classification based on X-rays. We demonstrate the effectiveness of our method on three public benchmark datasets with paired X-ray images and EHRs.
arXiv Detail & Related papers (2021-03-19T14:13:56Z)
Many-to-One Distribution Learning and K-Nearest Neighbor Smoothing for Thoracic Disease Identification [83.6017225363714]
Deep learning has become the most powerful computer-aided diagnosis technology for improving disease identification performance. For chest X-ray imaging, annotating large-scale data requires professional domain knowledge and is time-consuming. In this paper, we propose many-to-one distribution learning (MODL) and K-nearest neighbor smoothing (KNNS) methods to improve a single model's disease identification performance.
arXiv Detail & Related papers (2021-02-26T02:29:30Z)

This list is automatically generated from the titles and abstracts of the papers on this site.