Asynchronous Collaborative Learning Across Data Silos
- URL: http://arxiv.org/abs/2203.12637v1
- Date: Wed, 23 Mar 2022 18:00:19 GMT
- Title: Asynchronous Collaborative Learning Across Data Silos
- Authors: Tiffany Tuor, Joshua Lockhart, Daniele Magazzeni
- Abstract summary: We propose a framework to enable asynchronous collaborative training of machine learning models across data silos.
This allows data science teams to collaboratively train a machine learning model, without sharing data with one another.
- Score: 9.094748832034746
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Machine learning algorithms can perform well when trained on large datasets.
While large organisations often have considerable data assets, it can be
difficult for these assets to be unified in a manner that makes training
possible. Data is very often 'siloed' in different parts of the organisation,
with little to no access between silos. This fragmentation of data assets is
especially prevalent in heavily regulated industries like financial services or
healthcare. In this paper we propose a framework to enable asynchronous
collaborative training of machine learning models across data silos. This
allows data science teams to collaboratively train a machine learning model,
without sharing data with one another. Our proposed approach enhances
conventional federated learning techniques to make them suitable for
asynchronous training in this intra-organisation, cross-silo setting. We
validate our proposed approach via extensive experiments.
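The abstract does not spell out the aggregation rule, so the following is only a minimal, hypothetical sketch of the general idea of asynchronous cross-silo training: each silo snapshots the shared model, trains locally at its own pace, and a coordinator merges each update as it arrives, discounting stale ones. The merge rule, staleness discount, learning rate, and linear model are all illustrative assumptions, not the paper's method.

```python
# Hypothetical sketch of asynchronous federated averaging across data silos.
import numpy as np

rng = np.random.default_rng(0)
DIM = 5

# A toy "true" model and three silos holding private data around it.
true_w = rng.normal(size=DIM)
silos = []
for _ in range(3):
    X = rng.normal(size=(50, DIM))
    y = X @ true_w + 0.1 * rng.normal(size=50)
    silos.append((X, y))

global_model = np.zeros(DIM)
global_version = 0  # bumped on every server-side merge

def local_update(start_w, X, y, lr=0.1, steps=10):
    """A few local gradient steps on one silo's private least-squares data."""
    w = start_w.copy()
    for _ in range(steps):
        grad = 2 * X.T @ (X @ w - y) / len(y)
        w -= lr * grad
    return w

def merge(global_w, local_w, staleness, base_alpha=0.5):
    """Illustrative server merge rule: blend in the update, discounted by staleness."""
    alpha = base_alpha / (1 + staleness)
    return (1 - alpha) * global_w + alpha * local_w

# Every silo snapshots the model at version 0, then finishes at its own pace;
# updates arrive out of order and are merged as they come in.
snapshots = {sid: (global_model.copy(), global_version) for sid in range(3)}
for sid in [1, 0, 2]:
    X, y = silos[sid]
    start_w, version_read = snapshots[sid]
    local_w = local_update(start_w, X, y)
    staleness = global_version - version_read
    global_model = merge(global_model, local_w, staleness)
    global_version += 1
    print(f"silo {sid}: staleness {staleness}, "
          f"error {np.linalg.norm(global_model - true_w):.3f}")
```

The staleness discount is the usual lever in asynchronous schemes: an update computed against an old snapshot is blended in with a smaller weight so it cannot drag the shared model backwards.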
Related papers
- Towards Multi-User Activity Recognition through Facilitated Training Data and Deep Learning for Human-Robot Collaboration Applications [2.3274633659223545]
This study proposes an alternative way of gathering data regarding multi-user activity, by collecting data related to single users and merging them in post-processing.
Data collected in this way can be used in pair HRC settings, achieving performance similar to models trained on data recorded from groups of users under the same settings.
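As a concrete, hypothetical illustration of the post-processing merge, the sketch below builds pair samples by time-aligning and concatenating two independently recorded single-user feature streams; the window counts and feature dimensions are made up.

```python
# Hypothetical sketch: merge two single-user recordings into pair samples.
import numpy as np

rng = np.random.default_rng(1)

def make_pair_samples(user_a, user_b):
    """Concatenate time-aligned single-user feature windows into pair samples."""
    n = min(len(user_a), len(user_b))          # align by truncating to the shorter stream
    return np.concatenate([user_a[:n], user_b[:n]], axis=1)

# e.g. feature windows of 16 features per user, recorded independently
user_a = rng.normal(size=(100, 16))
user_b = rng.normal(size=(120, 16))
pairs = make_pair_samples(user_a, user_b)      # shape (100, 32)
print(pairs.shape)
```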
arXiv Detail & Related papers (2023-02-11T19:27:07Z)
- Privacy-Preserving Machine Learning for Collaborative Data Sharing via Auto-encoder Latent Space Embeddings [57.45332961252628]
Privacy-preserving machine learning in data-sharing processes is an ever-critical task.
This paper presents an innovative framework that uses Representation Learning via autoencoders to generate privacy-preserving embedded data.
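A minimal sketch of the embedding idea, under assumed dimensions and architecture (not the paper's): each silo trains an autoencoder locally on its raw records and shares only the low-dimensional latent codes, never the records themselves.

```python
# Illustrative sketch: share autoencoder latent embeddings instead of raw data.
import torch
import torch.nn as nn

torch.manual_seed(0)

class AutoEncoder(nn.Module):
    def __init__(self, in_dim=20, latent_dim=4):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(in_dim, 16), nn.ReLU(),
                                     nn.Linear(16, latent_dim))
        self.decoder = nn.Sequential(nn.Linear(latent_dim, 16), nn.ReLU(),
                                     nn.Linear(16, in_dim))

    def forward(self, x):
        return self.decoder(self.encoder(x))

X_private = torch.randn(256, 20)              # stays inside the silo
model = AutoEncoder()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
for _ in range(200):                          # train locally on reconstruction
    opt.zero_grad()
    loss = nn.functional.mse_loss(model(X_private), X_private)
    loss.backward()
    opt.step()

# Only the low-dimensional embeddings leave the silo.
with torch.no_grad():
    shared_embeddings = model.encoder(X_private)
print(shared_embeddings.shape)                # torch.Size([256, 4])
```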
arXiv Detail & Related papers (2022-11-10T17:36:58Z)
- Non-IID data and Continual Learning processes in Federated Learning: A long road ahead [58.720142291102135]
Federated Learning is a novel framework that allows multiple devices or institutions to train a machine learning model collaboratively while keeping their data private.
In this work, we formally classify data statistical heterogeneity and review the most notable learning strategies able to address it.
At the same time, we introduce approaches from other machine learning frameworks, such as Continual Learning, that also deal with data heterogeneity and could easily be adapted to the Federated Learning setting.
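A common way such statistical heterogeneity is simulated in federated learning experiments, offered here only as an illustrative sketch, is Dirichlet label partitioning: each class's samples are split across clients with Dirichlet-distributed proportions, and a smaller concentration parameter yields more skewed, less IID clients.

```python
# Sketch: simulate non-IID federated data with Dirichlet label partitioning.
import numpy as np

rng = np.random.default_rng(2)
NUM_CLIENTS, NUM_CLASSES = 4, 10
labels = rng.integers(0, NUM_CLASSES, size=1000)   # stand-in dataset labels

client_indices = [[] for _ in range(NUM_CLIENTS)]
for c in range(NUM_CLASSES):
    idx = np.where(labels == c)[0]
    rng.shuffle(idx)
    # split this class's samples according to Dirichlet proportions;
    # smaller alpha -> more skewed, more heterogeneous clients
    proportions = rng.dirichlet(alpha=[0.3] * NUM_CLIENTS)
    cuts = (np.cumsum(proportions)[:-1] * len(idx)).astype(int)
    for client, part in enumerate(np.split(idx, cuts)):
        client_indices[client].extend(part)

for i, idx in enumerate(client_indices):
    counts = np.bincount(labels[idx], minlength=NUM_CLASSES)
    print(f"client {i}: {counts}")                 # visibly skewed label mix
```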
arXiv Detail & Related papers (2021-11-26T09:57:11Z)
- Understanding the World Through Action [91.3755431537592]
I will argue that a general, principled, and powerful framework for utilizing unlabeled data can be derived from reinforcement learning.
I will discuss how such a procedure is more closely aligned with potential downstream tasks.
arXiv Detail & Related papers (2021-10-24T22:33:52Z)
- Fairness-Driven Private Collaborative Machine Learning [7.25130576615102]
We suggest a feasible privacy-preserving pre-process mechanism for enhancing fairness of collaborative machine learning algorithms.
Our experimentation with the proposed method shows that it is able to enhance fairness considerably with only a minor compromise in accuracy.
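The paper's mechanism is privacy-preserving; the sketch below shows only a plain, non-private version of a classic fairness pre-processing step (reweighing), where each (group, label) cell is weighted so that group and outcome look independent in the training distribution. The data and group/label encoding are invented for illustration.

```python
# Sketch of fairness pre-processing via reweighing (non-private version).
import numpy as np

rng = np.random.default_rng(3)
group = rng.integers(0, 2, size=1000)    # sensitive attribute (0/1)
label = (rng.random(1000) < 0.3 + 0.2 * group).astype(int)  # biased outcome

n = len(label)
weights = np.empty(n)
for g in (0, 1):
    for y in (0, 1):
        mask = (group == g) & (label == y)
        # weight = expected count under independence / observed count
        expected = (group == g).sum() * (label == y).sum() / n
        weights[mask] = expected / max(mask.sum(), 1)

# A learner trained with these sample weights sees a de-biased distribution.
print(weights[:5])
```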
arXiv Detail & Related papers (2021-09-29T12:22:00Z)
- Assisted Learning for Organizations with Limited Imbalanced Data [17.34334881241701]
We develop an assisted learning framework that helps organizations improve their learning performance.
Our framework allows the learner to only occasionally share information with the service provider, but still obtain a model that achieves near-oracle performance.
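A rough sketch of the residual-exchange flavour of assisted learning, with invented data and a plain least-squares learner: the learner fits its own features, occasionally ships only residuals (not raw data) to the provider, and the provider fits those residuals on its private features. This illustrates the pattern, not the paper's exact protocol.

```python
# Sketch: assisted learning by occasionally exchanging residuals, not data.
import numpy as np

rng = np.random.default_rng(4)
n = 200
X_learner = rng.normal(size=(n, 3))       # learner's private features
X_provider = rng.normal(size=(n, 3))      # provider's private features
y = (X_learner @ [1.0, -2.0, 0.5] + X_provider @ [0.5, 1.0, -1.0]
     + 0.1 * rng.normal(size=n))

def least_squares(X, y):
    """Ordinary least squares fit."""
    return np.linalg.lstsq(X, y, rcond=None)[0]

pred = np.zeros(n)
for round_ in range(3):                    # a few assistance rounds
    w_l = least_squares(X_learner, y - pred)     # learner fits the residual
    pred += X_learner @ w_l
    w_p = least_squares(X_provider, y - pred)    # provider fits what's left
    pred += X_provider @ w_p
    print(f"round {round_}: residual norm {np.linalg.norm(y - pred):.3f}")
```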
arXiv Detail & Related papers (2021-09-20T05:57:52Z)
- Federated Learning Versus Classical Machine Learning: A Convergence Comparison [7.730827805192975]
In the past few decades, machine learning has revolutionized data processing for large-scale applications.
In particular, federated learning allows participants to collaboratively train local models on local data without revealing their sensitive information to a central cloud server.
The simulation results demonstrate that federated learning achieves higher convergence within limited communication rounds while maintaining participants' anonymity.
arXiv Detail & Related papers (2021-07-22T17:14:35Z)
- The Role of Cross-Silo Federated Learning in Facilitating Data Sharing in the Agri-Food Sector [5.219568203653523]
Data sharing remains a major hindering factor when it comes to adopting emerging AI technologies in the agri-food sector.
We propose a technical solution based on federated learning that uses decentralized data.
Our results demonstrate that our approach performs better than each of the models trained on an individual data source.
arXiv Detail & Related papers (2021-04-14T16:00:28Z)
- Multi-modal AsynDGAN: Learn From Distributed Medical Image Data without Sharing Private Information [55.866673486753115]
We propose an extendable and elastic learning framework to preserve privacy and security.
The proposed framework is named distributed Asynchronized Discriminator Generative Adversarial Networks (AsynDGAN).
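A toy sketch of the split that the name suggests, with invented 1-D data and tiny networks: the generator is central, each discriminator stays in the silo next to its private data, and only losses on generated samples cross the boundary. Architectures and hyperparameters are placeholders, not the paper's.

```python
# Toy sketch: central generator, per-silo discriminators on private data.
import torch
import torch.nn as nn

torch.manual_seed(0)
G = nn.Sequential(nn.Linear(2, 16), nn.ReLU(), nn.Linear(16, 1))  # central
Ds = [nn.Sequential(nn.Linear(1, 16), nn.ReLU(), nn.Linear(16, 1))
      for _ in range(2)]                                          # one per silo
opt_g = torch.optim.Adam(G.parameters(), lr=1e-3)
opt_ds = [torch.optim.Adam(d.parameters(), lr=1e-3) for d in Ds]
bce = nn.BCEWithLogitsLoss()

# Each silo holds private samples from a different mode of the distribution.
silo_data = [torch.randn(64, 1) + 2.0, torch.randn(64, 1) - 2.0]

for step in range(200):
    for d, opt_d, real in zip(Ds, opt_ds, silo_data):
        z = torch.randn(64, 2)
        fake = G(z).detach()
        # local discriminator update on private real data + received fakes
        loss_d = (bce(d(real), torch.ones(64, 1))
                  + bce(d(fake), torch.zeros(64, 1)))
        opt_d.zero_grad(); loss_d.backward(); opt_d.step()
    # central generator update using feedback from every silo's discriminator
    z = torch.randn(64, 2)
    fake = G(z)
    loss_g = sum(bce(d(fake), torch.ones(64, 1)) for d in Ds)
    opt_g.zero_grad(); loss_g.backward(); opt_g.step()

print(G(torch.randn(5, 2)).squeeze())
```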
arXiv Detail & Related papers (2020-12-15T20:41:24Z)
- Federated Residual Learning [53.77128418049985]
We study a new form of federated learning where the clients train personalized local models and make predictions jointly with the server-side shared model.
Using this new federated learning framework, the complexity of the central shared model can be minimized while still gaining all the performance benefits that joint training provides.
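A tiny sketch of the residual idea with invented linear data: the shared server model is deliberately kept simple, and each client fits a personal model to whatever the shared model leaves unexplained, then predicts with the sum. The data and models are placeholders.

```python
# Sketch: shared server model + personalized local residual model.
import numpy as np

rng = np.random.default_rng(5)

def fit_linear(X, y):
    """Ordinary least squares via lstsq."""
    return np.linalg.lstsq(X, y, rcond=None)[0]

# Client data: the signal depends on both features, but the shared server
# model is deliberately kept simple (first feature only).
X = rng.normal(size=(100, 2))
y = X @ np.array([1.0, -1.5]) + 0.05 * rng.normal(size=100)

w_server = fit_linear(X[:, :1], y)        # simple shared model
residual = y - X[:, :1] @ w_server
w_local = fit_linear(X, residual)         # personal model fits what's left

pred = X[:, :1] @ w_server + X @ w_local  # joint client-side prediction
print(f"server-only rmse: {np.sqrt(np.mean((X[:, :1] @ w_server - y) ** 2)):.3f}")
print(f"joint rmse:       {np.sqrt(np.mean((pred - y) ** 2)):.3f}")
```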
arXiv Detail & Related papers (2020-03-28T19:55:24Z)
- DeGAN: Data-Enriching GAN for Retrieving Representative Samples from a Trained Classifier [58.979104709647295]
We bridge the gap between the abundance of available data and the lack of relevant data for the future learning tasks of a trained network.
We use the available data, that may be an imbalanced subset of the original training dataset, or a related domain dataset, to retrieve representative samples.
We demonstrate that data from a related domain can be leveraged to achieve state-of-the-art performance.
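DeGAN itself pairs a generator with a discriminator on proxy-domain data; the sketch below instead uses a simpler, related data-free trick (confidence plus diversity losses against the frozen classifier) just to illustrate how generated samples can be steered to be class-representative without the original training set. The classifier, dimensions, and losses are all invented for illustration.

```python
# Sketch: steer a generator toward class-representative samples using only
# a frozen, pre-trained classifier (confidence + diversity objectives).
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)
NUM_CLASSES = 3
classifier = nn.Sequential(nn.Linear(4, 16), nn.ReLU(),
                           nn.Linear(16, NUM_CLASSES))  # stand-in trained net
for p in classifier.parameters():
    p.requires_grad_(False)                      # classifier stays frozen

G = nn.Sequential(nn.Linear(2, 16), nn.ReLU(), nn.Linear(16, 4))
opt = torch.optim.Adam(G.parameters(), lr=1e-3)

for step in range(300):
    z = torch.randn(64, 2)
    probs = F.softmax(classifier(G(z)), dim=1)
    # low per-sample entropy -> confident predictions on generated samples
    sample_entropy = -(probs * probs.clamp_min(1e-8).log()).sum(1).mean()
    # high batch-level entropy -> samples spread across all classes
    mean_probs = probs.mean(0)
    batch_entropy = -(mean_probs * mean_probs.clamp_min(1e-8).log()).sum()
    loss = sample_entropy - batch_entropy
    opt.zero_grad(); loss.backward(); opt.step()

print(F.softmax(classifier(G(torch.randn(5, 2))), dim=1))
```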
arXiv Detail & Related papers (2019-12-27T02:05:45Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this information and is not responsible for any consequences arising from its use.