Real-World Multi-Domain Data Applications for Generalizations to
Clinical Settings
- URL: http://arxiv.org/abs/2007.12672v1
- Date: Fri, 24 Jul 2020 17:41:23 GMT
- Authors: Nooshin Mojab, Vahid Noroozi, Darvin Yi, Manoj Prabhakar Nallabothula,
Abdullah Aleem, Phillip S. Yu, Joelle A. Hallak
- Abstract summary: Deep learning models perform well when trained on standardized datasets from artificial settings, such as clinical trials.
We show that by employing a self-supervised approach with transfer learning on a multi-domain real-world dataset, we can achieve 16% relative improvement on a standardized dataset.
- Score: 1.508558791031741
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: With promising results of machine learning based models in computer vision,
applications on medical imaging data have been increasing exponentially.
However, generalization to complex real-world clinical data remains a persistent
problem. Deep learning models perform well when trained on standardized
datasets from artificial settings, such as clinical trials. However, real-world
data is different and translations are yielding varying results. The complexity
of real-world applications in healthcare could emanate from a mixture of
different data distributions across multiple device domains alongside the
inevitable noise sourced from varying image resolutions, human errors, and the
lack of manual gradings. In addition, healthcare applications not only suffer
from the scarcity of labeled data, but also face limited access to unlabeled
data due to HIPAA regulations, patient privacy, ambiguity in data ownership,
and challenges in collecting data from different sources. These limitations
pose additional challenges to applying deep learning algorithms in healthcare
and clinical translations. In this paper, we utilize self-supervised
representation learning methods, formulated effectively in transfer learning
settings, to address limited data availability. Our experiments verify the
importance of diverse real-world data for generalization to clinical settings.
We show that by employing a self-supervised approach with transfer learning on
a multi-domain real-world dataset, we can achieve 16% relative improvement on a
standardized dataset over supervised baselines.
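The "16% relative improvement" above is a ratio against the baseline score, not a gain in absolute points. The following minimal sketch shows the arithmetic. The function name `relative_improvement` and both scores are hypothetical, chosen only for illustration; they are not taken from the paper.

```python
def relative_improvement(baseline: float, new: float) -> float:
    """Return (new - baseline) / baseline, the gain as a fraction of the baseline."""
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    return (new - baseline) / baseline


# Hypothetical scores, used only to illustrate the arithmetic (not from the paper).
baseline_score = 0.50          # assumed supervised baseline
self_supervised_score = 0.58   # assumed self-supervised + transfer learning result

gain = relative_improvement(baseline_score, self_supervised_score)
absolute_gain = self_supervised_score - baseline_score

print(f"relative improvement: {gain:.0%}")
print(f"absolute improvement: {absolute_gain:.2f}")
```

With these assumed numbers, an absolute gain of 0.08 corresponds to a 16% relative gain, because the gain is divided by a baseline of 0.50. The same relative figure would mean a different absolute gain for a different baseline.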
Related papers
- Federated Impression for Learning with Distributed Heterogeneous Data [19.50235109938016]
Federated learning (FL) provides a paradigm that can learn from distributed datasets across clients without requiring them to share data.
In FL, sub-optimal convergence is common among data from different health centers due to the variety in data collection protocols and patient demographics across centers.
We propose FedImpres, which alleviates catastrophic forgetting by restoring synthetic data that represents the global information as federated impression.
(2024-09-11T15:37:52Z)
- Universal Medical Imaging Model for Domain Generalization with Data Privacy [2.8727695958743364]
We propose a federated learning approach to transfer knowledge from multiple local models to a global model.
The primary objective is to train a global model capable of performing a wide variety of medical imaging tasks.
(2024-07-20T01:24:15Z)
- Generalization in medical AI: a perspective on developing scalable models [3.003979691986621]
Many prestigious journals now require reporting results both on the local hidden test set and on external datasets.
This is because of the variability encountered in intended use and specificities across hospital cultures.
We establish a hierarchical three-level scale system reflecting the generalization level of a medical AI algorithm.
(2023-11-09T14:54:28Z)
- GastroVision: A Multi-class Endoscopy Image Dataset for Computer Aided Gastrointestinal Disease Detection [6.231109933741383]
This dataset includes different anatomical landmarks, pathological abnormalities, polyp removal cases and normal findings from the GI tract.
It was annotated and verified by experienced GI endoscopists.
We believe our dataset can facilitate the development of AI-based algorithms for GI disease detection and classification.
(2023-07-16T19:36:03Z)
- Unsupervised pre-training of graph transformers on patient population graphs [48.02011627390706]
We propose a graph-transformer-based network to handle heterogeneous clinical data.
We show the benefit of our pre-training method in a self-supervised and a transfer learning setting.
(2022-07-21T16:59:09Z)
- When Accuracy Meets Privacy: Two-Stage Federated Transfer Learning Framework in Classification of Medical Images on Limited Data: A COVID-19 Case Study [77.34726150561087]
The COVID-19 pandemic has spread rapidly and caused a shortage of global medical resources.
CNNs have been widely used and validated for analyzing medical images.
(2022-03-24T02:09:41Z)
- Federated Cycling (FedCy): Semi-supervised Federated Learning of Surgical Phases [57.90226879210227]
FedCy is a federated semi-supervised learning (FSSL) method that combines FL and self-supervised learning to exploit a decentralized dataset of both labeled and unlabeled videos.
We demonstrate significant performance gains over state-of-the-art FSSL methods on the task of automatic recognition of surgical phases.
(2022-03-14T17:44:53Z)
- A Real Use Case of Semi-Supervised Learning for Mammogram Classification in a Local Clinic of Costa Rica [0.5541644538483946]
Training a deep learning model requires a considerable amount of labeled images.
A number of publicly available datasets have been built with data from different hospitals and clinics.
The use of the semi-supervised deep learning approach known as MixMatch to leverage unlabeled data is proposed and evaluated.
(2021-07-24T22:26:50Z)
- Health Status Prediction with Local-Global Heterogeneous Behavior Graph [69.99431339130105]
Estimation of health status can be achieved with various kinds of data streams continuously collected from wearable sensors.
We propose to model the behavior-related multi-source data streams with a local-global graph.
We conduct experiments on the StudentLife dataset, and extensive results demonstrate the effectiveness of our proposed model.
(2021-03-23T11:10:04Z)
- FLOP: Federated Learning on Medical Datasets using Partial Networks [84.54663831520853]
COVID-19, the disease caused by the novel coronavirus, has led to a shortage of medical resources.
Different data-driven deep learning models have been developed to support the diagnosis of COVID-19.
The data itself is still scarce due to patient privacy concerns.
We propose a simple yet effective algorithm, named Federated Learning on Medical datasets using Partial Networks (FLOP).
(2021-02-10T01:56:58Z)
- GS-WGAN: A Gradient-Sanitized Approach for Learning Differentially Private Generators [74.16405337436213]
We propose Gradient-sanitized Wasserstein Generative Adversarial Networks (GS-WGAN).
GS-WGAN allows releasing a sanitized form of sensitive data with rigorous privacy guarantees.
We find our approach consistently outperforms state-of-the-art approaches across multiple metrics.
(2020-06-15T10:01:01Z)