Knowledge-driven AI-generated data for accurate and interpretable breast ultrasound diagnoses
- URL: http://arxiv.org/abs/2407.16634v1
- Date: Tue, 23 Jul 2024 16:49:01 GMT
- Title: Knowledge-driven AI-generated data for accurate and interpretable breast ultrasound diagnoses
- Authors: Haojun Yu, Youcheng Li, Nan Zhang, Zihan Niu, Xuantong Gong, Yanwen Luo, Quanlin Wu, Wangyan Qin, Mengyuan Zhou, Jie Han, Jia Tao, Ziwei Zhao, Di Dai, Di He, Dong Wang, Binghui Tang, Ling Huo, Qingli Zhu, Yong Wang, Liwei Wang
- Abstract summary: We introduce a pipeline, TAILOR, that builds a knowledge-driven generative model to produce tailored synthetic data.
The generative model, using 3,749 lesions as source data, can generate millions of breast-US images, especially for error-prone rare cases.
In the prospective external evaluation, our diagnostic model outperforms the average performance of nine radiologists by 33.5% in specificity with the same sensitivity.
- Score: 29.70102468004044
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Data-driven deep learning models have shown great capabilities to assist radiologists in breast ultrasound (US) diagnoses. However, their effectiveness is limited by the long-tail distribution of training data, which leads to inaccuracies in rare cases. In this study, we address a long-standing challenge of improving the diagnostic model performance on rare cases using long-tailed data. Specifically, we introduce a pipeline, TAILOR, that builds a knowledge-driven generative model to produce tailored synthetic data. The generative model, using 3,749 lesions as source data, can generate millions of breast-US images, especially for error-prone rare cases. The generated data can be further used to build a diagnostic model for accurate and interpretable diagnoses. In the prospective external evaluation, our diagnostic model outperforms the average performance of nine radiologists by 33.5% in specificity with the same sensitivity, improving their performance by providing predictions with an interpretable decision-making process. Moreover, on ductal carcinoma in situ (DCIS), our diagnostic model outperforms all radiologists by a large margin, with only 34 DCIS lesions in the source data. We believe that TAILOR can potentially be extended to various diseases and imaging modalities.
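The headline comparison in the abstract (higher specificity at the same sensitivity as the radiologists) can be illustrated with a minimal sketch. The function names, the threshold-selection rule, and the toy labels and scores below are assumptions made for illustration; none of them come from the paper, and the paper's own evaluation protocol may differ.

```python
from typing import Optional, Sequence, Tuple


def sensitivity_specificity(labels: Sequence[int], preds: Sequence[int]) -> Tuple[float, float]:
    """Compute (sensitivity, specificity) for binary labels: 1 = malignant, 0 = benign.

    Assumes both classes are present, so neither denominator is zero.
    """
    tp = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 1)
    fn = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 0)
    tn = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 0)
    fp = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


def specificity_at_sensitivity(
    labels: Sequence[int], scores: Sequence[float], target_sensitivity: float
) -> Optional[float]:
    """Best specificity over all score thresholds whose sensitivity meets the target.

    Hypothetical matching rule: a case is called malignant when score >= threshold.
    Returns None if no threshold reaches the target sensitivity.
    """
    best = None
    for threshold in sorted(set(scores)):
        preds = [1 if s >= threshold else 0 for s in scores]
        sens, spec = sensitivity_specificity(labels, preds)
        if sens >= target_sensitivity and (best is None or spec > best):
            best = spec
    return best


# Toy data (invented): 4 malignant and 6 benign lesions.
labels = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
reader_preds = [1, 1, 1, 0, 1, 1, 1, 0, 0, 0]  # one reader's binary calls
model_scores = [0.9, 0.8, 0.7, 0.3, 0.6, 0.2, 0.2, 0.1, 0.1, 0.4]

reader_sens, reader_spec = sensitivity_specificity(labels, reader_preds)
model_spec = specificity_at_sensitivity(labels, model_scores, reader_sens)
print(f"reader: sensitivity={reader_sens:.2f}, specificity={reader_spec:.2f}")
print(f"model specificity at matched sensitivity: {model_spec:.2f}")
```

Fixing the model's operating point at the reader's sensitivity makes the two specificities directly comparable, which is the kind of comparison the abstract reports.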
Related papers
- A Foundational Generative Model for Breast Ultrasound Image Analysis [42.618964727896156]
Foundational models have emerged as powerful tools for addressing various tasks in clinical settings.
We present BUSGen, the first foundational generative model specifically designed for breast ultrasound analysis.
With few-shot adaptation, BUSGen can generate repositories of realistic and informative task-specific data.
arXiv Detail & Related papers (2025-01-12T16:39:13Z)
- HIST-AID: Leveraging Historical Patient Reports for Enhanced Multi-Modal Automatic Diagnosis [38.13689106933105]
We present HIST-AID, a framework that enhances automatic diagnostic accuracy using historical reports.
Our experiments demonstrate significant improvements, with AUROC increasing by 6.56% and AUPRC by 9.51% compared to models that rely solely on radiographic scans.
arXiv Detail & Related papers (2024-11-16T03:20:53Z)
- MGH Radiology Llama: A Llama 3 70B Model for Radiology [50.42811030970618]
This paper presents an advanced radiology-focused large language model: MGH Radiology Llama.
It is developed using the Llama 3 70B model, building upon previous domain-specific models like Radiology-GPT and Radiology-Llama2.
Our evaluation, incorporating both traditional metrics and a GPT-4-based assessment, highlights the enhanced performance of this work over general-purpose LLMs.
arXiv Detail & Related papers (2024-08-13T01:30:03Z)
- A Lung Nodule Dataset with Histopathology-based Cancer Type Annotation [12.617587827105496]
This research aims to bridge the gap by providing publicly accessible datasets and reliable tools for medical diagnosis.
We curated a diverse dataset of lung Computed Tomography (CT) images, comprising 330 annotated nodules (nodules are labeled as bounding boxes) from 95 distinct patients.
These promising results demonstrate that the dataset has a feasible application and further facilitate intelligent auxiliary diagnosis.
arXiv Detail & Related papers (2024-06-26T06:39:11Z)
- Large-scale Long-tailed Disease Diagnosis on Radiology Images [51.453990034460304]
RadDiag is a foundational model supporting 2D and 3D inputs across various modalities and anatomies.
Our dataset, RP3D-DiagDS, contains 40,936 cases with 195,010 scans covering 5,568 disorders.
arXiv Detail & Related papers (2023-12-26T18:20:48Z)
- DDxT: Deep Generative Transformer Models for Differential Diagnosis [51.25660111437394]
We show that a generative approach trained with simpler supervised and self-supervised learning signals can achieve superior results on the current benchmark.
The proposed Transformer-based generative network, named DDxT, autoregressively produces a set of possible pathologies, i.e., DDx, and predicts the actual pathology using a neural network.
arXiv Detail & Related papers (2023-12-02T22:57:25Z)
- Deep Reinforcement Learning Framework for Thoracic Diseases Classification via Prior Knowledge Guidance [49.87607548975686]
The scarcity of labeled data for related diseases poses a huge challenge to an accurate diagnosis.
We propose a novel deep reinforcement learning framework, which introduces prior knowledge to direct the learning of diagnostic agents.
Our approach's performance was demonstrated using the well-known NIH ChestX-ray14 and CheXpert datasets.
arXiv Detail & Related papers (2023-06-02T01:46:31Z)
- A Novel Automated Classification and Segmentation for COVID-19 using 3D CT Scans [5.5957919486531935]
In COVID-19 computed tomography (CT) images of the lungs, ground-glass opacity is the most common finding that requires specialist diagnosis.
Some researchers have proposed deep learning (DL) models that can stand in for diagnostic specialists in clinics that lack such expertise.
Our model achieves 94.52% accuracy in classifying lung lesions into three types: COVID, Pneumonia and Normal.
arXiv Detail & Related papers (2022-08-04T22:14:18Z)
- Towards Reliable and Explainable AI Model for Solid Pulmonary Nodule Diagnosis [20.510918720980467]
Lung cancer has the highest mortality rate of any cancer worldwide.
Computer-aided diagnosis (CAD) systems have been developed to assist radiologists in nodule detection and diagnosis.
However, the lack of model reliability and interpretability remains a major obstacle to their large-scale clinical application.
arXiv Detail & Related papers (2022-04-08T08:21:00Z)
- Variational Knowledge Distillation for Disease Classification in Chest X-Rays [102.04931207504173]
We propose variational knowledge distillation (VKD), a new probabilistic inference framework for disease classification based on X-rays.
We demonstrate the effectiveness of our method on three public benchmark datasets with paired X-ray images and EHRs.
arXiv Detail & Related papers (2021-03-19T14:13:56Z)
- Many-to-One Distribution Learning and K-Nearest Neighbor Smoothing for Thoracic Disease Identification [83.6017225363714]
Deep learning has become the most powerful computer-aided diagnosis technology for improving disease identification performance.
For chest X-ray imaging, annotating large-scale data requires professional domain knowledge and is time-consuming.
In this paper, we propose many-to-one distribution learning (MODL) and K-nearest neighbor smoothing (KNNS) methods to improve a single model's disease identification performance.
arXiv Detail & Related papers (2021-02-26T02:29:30Z)
This list is automatically generated from the titles and abstracts of the papers in this site.