A Federated Learning Benchmark for Drug-Target Interaction
- URL: http://arxiv.org/abs/2302.07684v4
- Date: Wed, 18 Oct 2023 08:14:43 GMT
- Title: A Federated Learning Benchmark for Drug-Target Interaction
- Authors: Gianluca Mittone, Filip Svoboda, Marco Aldinucci, Nicholas D. Lane, Pietro Lio
- Abstract summary: This work proposes the application of federated learning in the drug-target interaction (DTI) domain.
It achieves up to 15% improved performance relative to the best available non-privacy-preserving alternative.
Our extensive battery of experiments shows that, unlike in other domains, the non-IID data distribution in the DTI datasets does not deteriorate FL performance.
- Score: 17.244787426504626
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Aggregating pharmaceutical data in the drug-target interaction (DTI) domain
has the potential to deliver life-saving breakthroughs. It is, however,
notoriously difficult due to regulatory constraints and commercial interests.
This work proposes the application of federated learning, which we argue to be
reconcilable with the industry's constraints, as it does not require sharing of
any information that would reveal the entities' data or any other high-level
summary of it. When used on a representative GraphDTA model and the KIBA
dataset, it achieves up to 15% improved performance relative to the best
available non-privacy-preserving alternative. Our extensive battery of
experiments shows that, unlike in other domains, the non-IID data distribution
in the DTI datasets does not deteriorate FL performance. Additionally, we
identify a material trade-off between the benefits of adding new data, and the
cost of adding more clients.
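As an illustration of the general idea only: the abstract does not state which aggregation rule the paper uses, so the sketch below assumes a FedAvg-style weighted average, and it swaps the GraphDTA model and KIBA data for a toy linear regression on synthetic data. The names `local_update`, `fed_avg`, and `run_rounds` are hypothetical helpers, not the authors' code. The point it shows is that only model parameters leave each client; the raw data never does.

```python
import numpy as np


def local_update(weights, X, y, lr=0.1, epochs=5):
    """Run a few gradient-descent steps of linear regression on one client's private data."""
    w = weights.copy()
    for _ in range(epochs):
        grad = 2.0 * X.T @ (X @ w - y) / len(y)  # gradient of the mean squared error
        w -= lr * grad
    return w


def fed_avg(client_weights, client_sizes):
    """Average client models, weighting each by its number of local samples."""
    sizes = np.asarray(client_sizes, dtype=float)
    stacked = np.stack(client_weights)
    return (stacked * sizes[:, None]).sum(axis=0) / sizes.sum()


def run_rounds(clients, dim, rounds=50):
    """Alternate local training and server-side averaging for a fixed number of rounds."""
    w_global = np.zeros(dim)
    for _ in range(rounds):
        # Each client starts from the current global model and trains locally.
        updates = [local_update(w_global, X, y) for X, y in clients]
        # The server sees only the updated parameters and the sample counts.
        w_global = fed_avg(updates, [len(y) for _, y in clients])
    return w_global


# Synthetic stand-in for three data holders of different sizes (assumption, not KIBA).
rng = np.random.default_rng(0)
true_w = np.array([1.5, -2.0, 0.5])
clients = []
for n in (40, 80, 120):
    X = rng.normal(size=(n, 3))
    y = X @ true_w + 0.01 * rng.normal(size=n)
    clients.append((X, y))

w = run_rounds(clients, dim=3)
```

In this toy setting, adding a client adds both data and one more model to average each round, which is the kind of trade-off between new data and more clients that the abstract refers to.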
Related papers
- Non-IID data in Federated Learning: A Systematic Review with Taxonomy, Metrics, Methods, Frameworks and Future Directions [2.9434966603161072]
This systematic review aims to fill a gap by providing a detailed taxonomy for non-IID data, partition protocols, and metrics.
We describe popular solutions to address non-IID data and standardized frameworks employed in Federated Learning with heterogeneous data.
arXiv Detail & Related papers (2024-11-19T09:53:28Z) - Stable Diffusion-based Data Augmentation for Federated Learning with Non-IID Data [9.045647166114916]
Federated Learning (FL) is a promising paradigm for decentralized and collaborative model training.
FL struggles with a significant performance reduction and poor convergence when confronted with Non-Independent and Identically Distributed (Non-IID) data distributions.
We introduce Gen-FedSD, a novel approach that harnesses the powerful capability of state-of-the-art text-to-image foundation models.
arXiv Detail & Related papers (2024-05-13T16:57:48Z) - Approximate Gradient Coding for Privacy-Flexible Federated Learning with Non-IID Data [9.984630251008868]
This work focuses on the challenges of non-IID data and stragglers/dropouts in federated learning.
We introduce and explore a privacy-flexible paradigm that models parts of the clients' local data as non-private.
arXiv Detail & Related papers (2024-04-04T15:29:50Z) - PS-FedGAN: An Efficient Federated Learning Framework Based on Partially
Shared Generative Adversarial Networks For Data Privacy [56.347786940414935]
Federated Learning (FL) has emerged as an effective learning paradigm for distributed computation.
This work proposes a novel FL framework that requires only partial GAN model sharing.
Named PS-FedGAN, this new framework enhances the GAN releasing and training mechanism to address heterogeneous data distributions.
arXiv Detail & Related papers (2023-05-19T05:39:40Z) - Model-Contrastive Federated Domain Adaptation [3.9435648520559177]
Federated domain adaptation (FDA) aims to collaboratively transfer knowledge from source clients (domains) to the related but different target client.
We propose a model-based method named FDAC, which addresses Federated Domain Adaptation based on Contrastive learning and Vision Transformer (ViT).
To the best of our knowledge, FDAC is the first attempt to learn transferable representations by manipulating the latent architecture of ViT under the federated setting.
arXiv Detail & Related papers (2023-05-07T23:48:03Z) - Evaluating the effect of data augmentation and BALD heuristics on
distillation of Semantic-KITTI dataset [63.20765930558542]
Active Learning has remained relatively unexplored for LiDAR perception tasks in autonomous driving datasets.
We evaluate Bayesian active learning methods applied to the task of dataset distillation or core subset selection.
We also study the effect of application of data augmentation within Bayesian AL based dataset distillation.
arXiv Detail & Related papers (2023-02-21T13:56:47Z) - Proposing Novel Extrapolative Compounds by Nested Variational
Autoencoders [0.685316573653194]
The authors propose a deep generative model with two nested variational autoencoders (VAEs).
The outer VAE learns the structural features of compounds using large-scale public data, while the inner VAE learns the relationship between the latent variables of the outer VAE and the properties from small-scale experimental data.
The results indicate that this loss function contributes to improving the probability of generating high-performance candidates.
arXiv Detail & Related papers (2023-02-06T04:12:12Z) - Rethinking Data Heterogeneity in Federated Learning: Introducing a New
Notion and Standard Benchmarks [65.34113135080105]
We show that data heterogeneity in current setups is not necessarily a problem and can in fact be beneficial for the FL participants.
Our observations are intuitive.
Our code is available at https://github.com/MMorafah/FL-SC-NIID.
arXiv Detail & Related papers (2022-09-30T17:15:19Z) - FedDM: Iterative Distribution Matching for Communication-Efficient
Federated Learning [87.08902493524556]
Federated learning (FL) has recently attracted increasing attention from academia and industry.
We propose FedDM to build the global training objective from multiple local surrogate functions.
In detail, we construct synthetic sets of data on each client to locally match the loss landscape from original data.
arXiv Detail & Related papers (2022-07-20T04:55:18Z) - VFed-SSD: Towards Practical Vertical Federated Advertising [53.08038962443853]
We propose a semi-supervised split distillation framework VFed-SSD to alleviate the two limitations.
Specifically, we develop a self-supervised task MatchedPair Detection (MPD) to exploit the vertically partitioned unlabeled data.
Our framework provides an efficient federation-enhanced solution for real-time display advertising with minimal deploying cost and significant performance lift.
arXiv Detail & Related papers (2022-05-31T17:45:30Z) - Provably Efficient Causal Reinforcement Learning with Confounded
Observational Data [135.64775986546505]
We study how to incorporate the dataset (observational data) collected offline, which is often abundantly available in practice, to improve the sample efficiency in the online setting.
We propose the deconfounded optimistic value iteration (DOVI) algorithm, which incorporates the confounded observational data in a provably efficient manner.
arXiv Detail & Related papers (2020-06-22T14:49:33Z)
This list is automatically generated from the titles and abstracts of the papers on this site.