Knowledge-Guided Data-Centric AI in Healthcare: Progress, Shortcomings,
and Future Directions
- URL: http://arxiv.org/abs/2212.13591v2
- Date: Sun, 30 Apr 2023 06:10:44 GMT
- Title: Knowledge-Guided Data-Centric AI in Healthcare: Progress, Shortcomings,
and Future Directions
- Authors: Edward Y. Chang
- Abstract summary: Deep learning succeeds largely because large amounts of training data cover a wide range of examples of a particular concept or meaning.
In medicine, having a diverse set of training data on a particular disease can lead to the development of a model that is able to accurately predict the disease.
Despite the potential benefits, there have not been significant advances in image-based diagnosis due to a lack of high-quality annotated data.
This article highlights the importance of using a data-centric approach to improve the quality of data representations.
- Score: 7.673853485227739
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The success of deep learning is largely due to the availability of large
amounts of training data that cover a wide range of examples of a particular
concept or meaning. In the field of medicine, having a diverse set of training
data on a particular disease can lead to the development of a model that is
able to accurately predict the disease. However, despite the potential
benefits, there have not been significant advances in image-based diagnosis due
to a lack of high-quality annotated data. This article highlights the
importance of using a data-centric approach to improve the quality of data
representations, particularly in cases where the available data is limited. To
address this "small-data" issue, we discuss four methods for generating and
aggregating training data: data augmentation, transfer learning, federated
learning, and GANs (generative adversarial networks). We also propose the use
of knowledge-guided GANs to incorporate domain knowledge in the training data
generation process. With the recent progress in large pre-trained language
models, we believe it is possible to acquire high-quality knowledge that can be
used to improve the effectiveness of knowledge-guided generative methods.
Related papers
- A Survey of Few-Shot Learning for Biomedical Time Series [3.845248204742053]
Data-driven models have tremendous potential to assist clinical diagnosis and improve patient care.
An emerging approach to overcome the scarcity of labeled data is to augment AI methods with human-like capabilities to learn new tasks with limited examples, called few-shot learning.
This survey provides a comprehensive review and comparison of few-shot learning methods for biomedical time series applications.
arXiv Detail & Related papers (2024-05-03T21:22:27Z)
- Amplifying Pathological Detection in EEG Signaling Pathways through
Cross-Dataset Transfer Learning [10.212217551908525]
We study the effectiveness of data and model scaling and cross-dataset knowledge transfer in a real-world pathology classification task.
We identify the challenges of possible negative transfer and emphasize the significance of some key components.
Our findings indicate that a small, generic model (e.g., ShallowNet) performs well on a single dataset; however, a larger model (e.g., TCN) performs better at transfer and at learning from a larger, more diverse dataset.
arXiv Detail & Related papers (2023-09-19T20:09:15Z)
- ProtoKD: Learning from Extremely Scarce Data for Parasite Ova
Recognition [5.224806515926022]
We introduce ProtoKD, one of the first approaches to tackle the problem of multi-class parasitic ova recognition using extremely scarce data.
We establish a new benchmark to drive research in this critical direction and validate that the proposed ProtoKD framework achieves state-of-the-art performance.
arXiv Detail & Related papers (2023-09-18T23:49:04Z)
- Deep Reinforcement Learning Framework for Thoracic Diseases
Classification via Prior Knowledge Guidance [49.87607548975686]
The scarcity of labeled data for related diseases poses a huge challenge to an accurate diagnosis.
We propose a novel deep reinforcement learning framework, which introduces prior knowledge to direct the learning of diagnostic agents.
Our approach's performance was demonstrated using the well-known NIH ChestX-ray 14 and CheXpert datasets.
arXiv Detail & Related papers (2023-06-02T01:46:31Z)
- Incomplete Multimodal Learning for Complex Brain Disorders Prediction [65.95783479249745]
We propose a new incomplete multimodal data integration approach that employs transformers and generative adversarial networks.
We apply our new method to predict cognitive degeneration and disease outcomes using the multimodal imaging genetic data from Alzheimer's Disease Neuroimaging Initiative cohort.
arXiv Detail & Related papers (2023-05-25T16:29:16Z)
- More From Less: Self-Supervised Knowledge Distillation for Routine
Histopathology Data [3.93181912653522]
We show that it is possible to distil knowledge during training from information-dense data into models which only require information-sparse data for inference.
This improves downstream classification accuracy on information-sparse data, making it comparable with the fully-supervised baseline.
This approach enables the design of models which require only routine images, but contain insights from state-of-the-art data, allowing better use of the available resources.
arXiv Detail & Related papers (2023-03-19T13:41:59Z)
- When Accuracy Meets Privacy: Two-Stage Federated Transfer Learning
Framework in Classification of Medical Images on Limited Data: A COVID-19
Case Study [77.34726150561087]
The COVID-19 pandemic has spread rapidly and caused a global shortage of medical resources.
CNNs have been widely used and validated in analyzing medical images.
arXiv Detail & Related papers (2022-03-24T02:09:41Z)
- Select-ProtoNet: Learning to Select for Few-Shot Disease Subtype
Prediction [55.94378672172967]
We focus on the few-shot disease subtype prediction problem, identifying subgroups of similar patients.
We introduce meta-learning techniques to develop a new model, which can extract the common experience or knowledge from interrelated clinical tasks.
Our new model is built upon a carefully designed meta-learner, the Prototypical Network, a simple yet effective meta-learning method for few-shot image classification.
arXiv Detail & Related papers (2020-09-02T02:50:30Z)
- Uncovering the structure of clinical EEG signals with self-supervised
learning [64.4754948595556]
Supervised learning paradigms are often limited by the amount of labeled data that is available.
This phenomenon is particularly problematic in clinically relevant data, such as electroencephalography (EEG).
By extracting information from unlabeled data, it might be possible to reach competitive performance with deep neural networks.
arXiv Detail & Related papers (2020-07-31T14:34:47Z)
- Adversarial Multi-Source Transfer Learning in Healthcare: Application to
Glucose Prediction for Diabetic People [4.17510581764131]
We propose a multi-source adversarial transfer learning framework that enables the learning of a feature representation that is similar across the sources.
We apply this idea to glucose forecasting for diabetic people using a fully convolutional neural network.
In particular, it shines when using data from different datasets, or when there is too little data in an intra-dataset situation.
arXiv Detail & Related papers (2020-06-29T11:17:50Z)
- GS-WGAN: A Gradient-Sanitized Approach for Learning Differentially
Private Generators [74.16405337436213]
We propose Gradient-sanitized Wasserstein Generative Adversarial Networks (GS-WGAN).
GS-WGAN allows releasing a sanitized form of sensitive data with rigorous privacy guarantees.
We find our approach consistently outperforms state-of-the-art approaches across multiple metrics.
arXiv Detail & Related papers (2020-06-15T10:01:01Z)
This list is automatically generated from the titles and abstracts of the papers on this site.