Privacy-Preserving Federated Unsupervised Domain Adaptation for Regression on Small-Scale and High-Dimensional Biological Data
- URL: http://arxiv.org/abs/2411.17287v2
- Date: Thu, 13 Feb 2025 14:31:24 GMT
- Title: Privacy-Preserving Federated Unsupervised Domain Adaptation for Regression on Small-Scale and High-Dimensional Biological Data
- Authors: Cem Ata Baykara, Ali Burak Ünal, Nico Pfeifer, Mete Akgün
- Abstract summary: freda is a privacy-preserving federated method for unsupervised domain adaptation in regression tasks.
We evaluate freda on the challenging task of age prediction from DNA methylation data, demonstrating that it achieves performance comparable to the centralized state-of-the-art method.
- Score: 2.699900017799093
- License:
- Abstract: Machine learning models often struggle with generalization in small, heterogeneous datasets due to domain shifts caused by variations in data collection and population differences. This challenge is particularly pronounced in biological data, where data is high-dimensional, small-scale, and decentralized across institutions. While federated domain adaptation methods (FDA) aim to address these challenges, most existing approaches rely on deep learning and focus on classification tasks, making them unsuitable for small-scale, high-dimensional applications. In this work, we propose freda, a privacy-preserving federated method for unsupervised domain adaptation in regression tasks. Unlike deep learning-based FDA approaches, freda is the first method to enable the federated training of Gaussian Processes to model complex feature relationships while ensuring complete data privacy through randomized encoding and secure aggregation. This allows for effective domain adaptation without direct access to raw data, making it well-suited for applications involving high-dimensional, heterogeneous datasets. We evaluate freda on the challenging task of age prediction from DNA methylation data, demonstrating that it achieves performance comparable to the centralized state-of-the-art method while preserving complete data privacy.
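The abstract states that freda trains Gaussian Processes in a federated setting and protects client data through randomized encoding and secure aggregation. Purely as a hedged illustration of the secure-aggregation building block, and not freda's actual protocol (whose details are in the paper), the Python sketch below shows additive masking: pairwise masks cancel in the server-side sum, so only the aggregate statistic is revealed. The `Client` class and `aggregate` function are hypothetical names introduced here.

```python
import numpy as np

# Illustrative sketch only: additive-masking secure aggregation, a standard
# building block that lets a server recover the SUM of client contributions
# without seeing any individual contribution. Not freda's actual protocol.

rng = np.random.default_rng(0)

class Client:
    def __init__(self, client_id, local_stat):
        self.client_id = client_id
        self.local_stat = local_stat      # e.g. a local summary statistic
        self.pairwise_masks = {}

    def agree_mask(self, other):
        # In a real deployment the shared mask would come from a pairwise
        # key exchange; here it is drawn jointly for simplicity.
        mask = rng.normal(size=self.local_stat.shape)
        self.pairwise_masks[other.client_id] = mask
        other.pairwise_masks[self.client_id] = -mask   # cancels in the sum

    def masked_share(self):
        # What the server actually receives: statistic plus random masks.
        return self.local_stat + sum(self.pairwise_masks.values())

def aggregate(clients):
    # Server side: pairwise masks cancel, so only the aggregate is revealed.
    return sum(c.masked_share() for c in clients)

# Toy usage: three clients, each holding a 4-dimensional local statistic.
clients = [Client(i, rng.normal(size=4)) for i in range(3)]
for i in range(len(clients)):
    for j in range(i + 1, len(clients)):
        clients[i].agree_mask(clients[j])

print(aggregate(clients))                  # equals the plain sum below
print(sum(c.local_stat for c in clients))
```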
Related papers
- Privacy-preserving datasets by capturing feature distributions with Conditional VAEs [0.11999555634662634]
The approach trains Conditional Variational Autoencoders (CVAEs) on feature vectors extracted from large pre-trained vision foundation models.
Our method notably outperforms traditional approaches in both medical and natural image domains.
Results underscore the potential of generative models to significantly impact deep learning applications in data-scarce and privacy-sensitive environments.
arXiv Detail & Related papers (2024-08-01T15:26:24Z)
- PeFAD: A Parameter-Efficient Federated Framework for Time Series Anomaly Detection [51.20479454379662]
In response to increasing privacy concerns, we propose a parameter-efficient federated anomaly detection framework named PeFAD.
We conduct extensive evaluations on four real datasets, where PeFAD outperforms existing state-of-the-art baselines by up to 28.74%.
arXiv Detail & Related papers (2024-06-04T13:51:08Z)
- DiffClass: Diffusion-Based Class Incremental Learning [30.514281721324853]
Class Incremental Learning (CIL) is challenging due to catastrophic forgetting.
Recent exemplar-free CIL methods attempt to mitigate catastrophic forgetting by synthesizing previous task data.
We propose a novel exemplar-free CIL method to overcome these issues.
arXiv Detail & Related papers (2024-03-08T03:34:18Z) - Subject-Based Domain Adaptation for Facial Expression Recognition [51.10374151948157]
Adapting a deep learning model to a specific target individual is a challenging facial expression recognition task.
This paper introduces a new MSDA method for subject-based domain adaptation in FER.
It efficiently leverages information from multiple source subjects to adapt a deep FER model to a single target individual.
arXiv Detail & Related papers (2023-12-09T18:40:37Z) - Fed-MIWAE: Federated Imputation of Incomplete Data via Deep Generative
Models [5.373862368597948]
Federated learning allows for the training of machine learning models on multiple local datasets without requiring explicit data exchange.
Data pre-processing, including strategies for handling missing data, remains a major bottleneck in real-world federated learning deployment.
We propose Fed-MIWAE, a deep latent variable model for missing data imputation based on variational autoencoders.
arXiv Detail & Related papers (2023-04-17T08:14:08Z) - One-Shot Domain Adaptive and Generalizable Semantic Segmentation with
Class-Aware Cross-Domain Transformers [96.51828911883456]
Unsupervised sim-to-real domain adaptation (UDA) for semantic segmentation aims to improve the real-world test performance of a model trained on simulated data.
Traditional UDA often assumes that there are abundant unlabeled real-world data samples available during training for the adaptation.
We explore the one-shot unsupervised sim-to-real domain adaptation (OSUDA) and generalization problem, where only one real-world data sample is available.
arXiv Detail & Related papers (2022-12-14T15:54:15Z) - Cluster-level pseudo-labelling for source-free cross-domain facial
expression recognition [94.56304526014875]
We propose the first Source-Free Unsupervised Domain Adaptation (SFUDA) method for Facial Expression Recognition (FER).
Our method exploits self-supervised pretraining to learn good feature representations from the target data.
We validate the effectiveness of our method in four adaptation setups, proving that it consistently outperforms existing SFUDA methods when applied to FER.
arXiv Detail & Related papers (2022-10-11T08:24:50Z) - Mitigating Data Heterogeneity in Federated Learning with Data
Augmentation [26.226057709504733]
Federated Learning (FL) is a framework that enables training a centralized model while securing user privacy by fusing local, decentralized models.
One major obstacle is data heterogeneity, i.e., each client having non-identically and independently distributed (non-IID) data.
Recent evidence suggests that data augmentation can yield equal or greater performance.
arXiv Detail & Related papers (2022-06-20T19:47:43Z)
- Local Learning Matters: Rethinking Data Heterogeneity in Federated Learning [61.488646649045215]
Federated learning (FL) is a promising strategy for performing privacy-preserving, distributed learning with a network of clients (i.e., edge devices).
arXiv Detail & Related papers (2021-11-28T19:03:39Z)
- Unsupervised Domain Adaptation in Person re-ID via k-Reciprocal Clustering and Large-Scale Heterogeneous Environment Synthesis [76.46004354572956]
We introduce an unsupervised domain adaptation approach for person re-identification.
Experimental results show that the proposed ktCUDA and SHRED approach achieves an average improvement of +5.7 mAP in re-identification performance.
arXiv Detail & Related papers (2020-01-14T17:43:52Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.