Handling Data Heterogeneity with Generative Replay in Collaborative
Learning for Medical Imaging
- URL: http://arxiv.org/abs/2106.13208v1
- Date: Thu, 24 Jun 2021 17:39:55 GMT
- Title: Handling Data Heterogeneity with Generative Replay in Collaborative
Learning for Medical Imaging
- Authors: Liangqiong Qu, Niranjan Balachandar, Miao Zhang, Daniel Rubin
- Abstract summary: We present a novel generative replay strategy to address the challenge of data heterogeneity in collaborative learning methods.
A primary model learns the desired task, and an auxiliary "generative replay model" either synthesizes images that closely resemble the input images or helps extract latent variables.
The generative replay strategy is flexible: it can either be incorporated into existing collaborative learning methods to improve their capability of handling data heterogeneity across institutions, or be used as a novel, standalone collaborative learning framework (termed FedReplay) to reduce communication cost.
- Score: 21.53220262343254
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Collaborative learning, which enables collaborative and decentralized
training of deep neural networks at multiple institutions in a
privacy-preserving manner, is rapidly emerging as a valuable technique in
healthcare applications. However, its distributed nature often leads to
significant heterogeneity in data distributions across institutions. Existing
collaborative learning approaches generally either do not account for
heterogeneity in data among institutions or study only mildly skewed label
distributions. In this paper, we present a novel generative replay
strategy to address the challenge of data heterogeneity in collaborative
learning methods. Instead of directly training a model for task performance, we
leverage recent image synthesis techniques to develop a novel dual model
architecture: a primary model learns the desired task, and an auxiliary
"generative replay model" either synthesizes images that closely resemble the
input images or helps extract latent variables. The generative replay strategy
is flexible: it can either be incorporated into existing collaborative
learning methods to improve their capability of handling data heterogeneity
across institutions, or be used as a novel, standalone collaborative
learning framework (termed FedReplay) to reduce communication cost.
Experimental results demonstrate the capability of the proposed method in
handling heterogeneous data across institutions. On highly heterogeneous data
partitions, our model achieves a ~4.88% improvement in prediction accuracy on
a diabetic retinopathy classification dataset and a ~49.8% reduction in mean
absolute error on a Bone Age prediction dataset, compared to
state-of-the-art collaborative learning methods.
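The dual-model idea in the abstract can be illustrated with a deliberately small toy, which is a sketch under stated assumptions and not the authors' implementation. Here a per-class 1-D Gaussian stands in for the image-synthesis "generative replay model", a nearest-centroid classifier stands in for the primary model, and the assumption that only fitted generator parameters leave each institution is ours. All function names (`fit_generator`, `replay`, `fit_primary`, `predict`) are hypothetical.

```python
import random
import statistics


def fit_generator(samples):
    """Fit a per-class 1-D Gaussian (mean, std) to local (value, label) pairs."""
    by_label = {}
    for x, y in samples:
        by_label.setdefault(y, []).append(x)
    return {y: (statistics.fmean(xs), statistics.pstdev(xs))
            for y, xs in by_label.items()}


def replay(generator, n_per_class, rng):
    """Draw synthetic (value, label) pairs from a fitted generator."""
    return [(rng.gauss(mu, sigma), y)
            for y, (mu, sigma) in generator.items()
            for _ in range(n_per_class)]


def fit_primary(samples):
    """Toy primary model: one centroid per label (nearest-centroid classifier)."""
    return {y: mu for y, (mu, _) in fit_generator(samples).items()}


def predict(centroids, x):
    """Return the label whose centroid is closest to x."""
    return min(centroids, key=lambda y: abs(x - centroids[y]))


rng = random.Random(0)

# Highly heterogeneous (label-skewed) partition:
# institution A holds only class 0, institution B holds only class 1.
site_a = [(rng.gauss(0.0, 1.0), 0) for _ in range(200)]
site_b = [(rng.gauss(5.0, 1.0), 1) for _ in range(200)]

# Assumption of this sketch: each site fits its generator locally and
# shares only the fitted parameters, never the raw samples.
generators = [fit_generator(site_a), fit_generator(site_b)]

# The primary model is trained on replayed samples pooled from all generators,
# so it sees both classes even though no single site holds both.
synthetic = [s for g in generators for s in replay(g, 200, rng)]
primary = fit_primary(synthetic)

# Evaluate on fresh samples drawn from both classes.
test_set = ([(rng.gauss(0.0, 1.0), 0) for _ in range(100)]
            + [(rng.gauss(5.0, 1.0), 1) for _ in range(100)])
accuracy = sum(predict(primary, x) == y for x, y in test_set) / len(test_set)
print(f"accuracy on mixed test set: {accuracy:.2f}")
```

A model trained at either site alone would only ever see one class; pooling replayed samples is what gives the toy primary model a view of both. In a realistic setting the Gaussian would be replaced by an image synthesis network and the classifier by a deep network.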
Related papers
- Seeing Unseen: Discover Novel Biomedical Concepts via
Geometry-Constrained Probabilistic Modeling [53.7117640028211]
We present a geometry-constrained probabilistic modeling treatment to resolve the identified issues.
We incorporate a suite of critical geometric properties to impose proper constraints on the layout of constructed embedding space.
A spectral graph-theoretic method is devised to estimate the number of potential novel classes.
(2024-03-02T00:56:05Z)
- Federated Learning for Data and Model Heterogeneity in Medical Imaging [19.0931609571649]
Federated Learning (FL) is an evolving machine learning method in which multiple clients participate in collaborative learning without sharing their data with each other and the central server.
In real-world applications such as hospitals and industries, FL encounters the challenges of data heterogeneity and model heterogeneity.
We propose a method, MDH-FL (Exploiting Model and Data Heterogeneity in FL), to solve such problems.
(2023-07-31T21:08:45Z)
- Incomplete Multimodal Learning for Complex Brain Disorders Prediction [65.95783479249745]
We propose a new incomplete multimodal data integration approach that employs transformers and generative adversarial networks.
We apply our new method to predict cognitive degeneration and disease outcomes using the multimodal imaging genetic data from Alzheimer's Disease Neuroimaging Initiative cohort.
(2023-05-25T16:29:16Z)
- Drug Synergistic Combinations Predictions via Large-Scale Pre-Training
and Graph Structure Learning [82.93806087715507]
Drug combination therapy is a well-established strategy for disease treatment with better effectiveness and less safety degradation.
Deep learning models have emerged as an efficient way to discover synergistic combinations.
Our framework achieves state-of-the-art results in comparison with other deep learning-based methods.
(2023-01-14T15:07:43Z)
- Decentralized Distributed Learning with Privacy-Preserving Data
Synthesis [9.276097219140073]
In the medical field, multi-center collaborations are often sought to yield more generalizable findings by leveraging the heterogeneity of patient and clinical data.
Recent privacy regulations hinder the possibility to share data, and consequently, to come up with machine learning-based solutions that support diagnosis and prognosis.
We present a decentralized distributed method that integrates features from local nodes, providing models able to generalize across multiple datasets while maintaining privacy.
(2022-06-20T23:49:38Z)
- Learning Neural Causal Models with Active Interventions [83.44636110899742]
We introduce an active intervention-targeting mechanism which enables a quick identification of the underlying causal structure of the data-generating process.
Our method significantly reduces the required number of interactions compared with random intervention targeting.
We demonstrate superior performance on multiple benchmarks from simulated to real-world data.
(2021-09-06T13:10:37Z)
- An Experimental Study of Data Heterogeneity in Federated Learning
Methods for Medical Imaging [8.984706828657814]
Federated learning enables multiple institutions to collaboratively train machine learning models on their local data in a privacy-preserving way.
We investigate the deleterious impact of a taxonomy of data heterogeneity regimes on federated learning methods, including quantity skew, label distribution skew, and imaging acquisition skew.
We present several mitigation strategies to overcome performance drops from data heterogeneity, including weighted average for data quantity skew, weighted loss and batch normalization averaging for label distribution skew.
(2021-07-18T05:47:48Z)
- SplitAVG: A heterogeneity-aware federated deep learning method for
medical imaging [29.271291030933966]
Federated learning is an emerging research paradigm for collaboratively training deep learning models without sharing patient data.
In this study, we propose a novel heterogeneity-aware federated learning method, SplitAVG, to overcome the performance drops from data heterogeneity in federated learning.
We compare SplitAVG with seven state-of-the-art federated learning methods, using centrally hosted training data as the baseline.
(2021-07-06T03:58:10Z)
- Towards an Automatic Analysis of CHO-K1 Suspension Growth in
Microfluidic Single-cell Cultivation [63.94623495501023]
We propose a novel machine learning architecture, which allows us to infuse a deep neural network with human-powered abstraction on the level of data.
Specifically, we train a generative model simultaneously on natural and synthetic data, so that it learns a shared representation, from which a target variable, such as the cell count, can be reliably estimated.
(2020-10-20T08:36:51Z)
- Select-ProtoNet: Learning to Select for Few-Shot Disease Subtype
Prediction [55.94378672172967]
We focus on the few-shot disease subtype prediction problem, identifying subgroups of similar patients.
We introduce meta learning techniques to develop a new model, which can extract the common experience or knowledge from interrelated clinical tasks.
Our new model is built upon a carefully designed meta-learner, called Prototypical Network, that is a simple yet effective meta learning machine for few-shot image classification.
(2020-09-02T02:50:30Z)
- Siloed Federated Learning for Multi-Centric Histopathology Datasets [0.17842332554022694]
This paper proposes a novel federated learning approach for deep learning architectures in the medical domain.
Local-statistic batch normalization (BN) layers are introduced, resulting in collaboratively-trained, yet center-specific models.
We benchmark the proposed method on the classification of tumorous histopathology image patches extracted from the Camelyon16 and Camelyon17 datasets.
(2020-08-17T15:49:30Z)