Privacy Preserving Federated Unsupervised Domain Adaptation with Application to Age Prediction from DNA Methylation Data
- URL: http://arxiv.org/abs/2411.17287v1
- Date: Tue, 26 Nov 2024 10:19:16 GMT
- Title: Privacy Preserving Federated Unsupervised Domain Adaptation with Application to Age Prediction from DNA Methylation Data
- Authors: Cem Ata Baykara, Ali Burak Ünal, Nico Pfeifer, Mete Akgün,
- Abstract summary: We introduce a privacy-preserving framework for unsupervised domain adaptation in high-dimensional settings.
Our framework is the first privacy-preserving solution for high-dimensional domain adaptation in federated environments.
- Score: 2.699900017799093
- License:
- Abstract: In computational biology, predictive models are widely used to address complex tasks, but their performance can suffer greatly when applied to data from different distributions. The current state-of-the-art domain adaptation method for high-dimensional data aims to mitigate these issues by aligning the input dependencies between training and test data. However, this approach requires centralized access to both source and target domain data, raising concerns about data privacy, especially when the data comes from multiple sources. In this paper, we introduce a privacy-preserving federated framework for unsupervised domain adaptation in high-dimensional settings. Our method employs federated training of Gaussian processes and weighted elastic nets to effectively address the problem of distribution shift between domains, while utilizing secure aggregation and randomized encoding to protect the local data of participating data owners. We evaluate our framework on the task of age prediction using DNA methylation data from multiple tissues, demonstrating that our approach performs comparably to existing centralized methods while maintaining data privacy, even in distributed environments where data is spread across multiple institutions. Our framework is the first privacy-preserving solution for high-dimensional domain adaptation in federated environments, offering a promising tool for fields like computational biology and medicine, where protecting sensitive data is essential.
Related papers
- Privacy-preserving datasets by capturing feature distributions with Conditional VAEs [0.11999555634662634]
Conditional Variational Autoencoders (CVAEs) trained on feature vectors extracted from large pre-trained vision foundation models.
Our method notably outperforms traditional approaches in both medical and natural image domains.
Results underscore the potential of generative models to significantly impact deep learning applications in data-scarce and privacy-sensitive environments.
arXiv Detail & Related papers (2024-08-01T15:26:24Z) - PeFAD: A Parameter-Efficient Federated Framework for Time Series Anomaly Detection [51.20479454379662]
We propose a.
Federated Anomaly Detection framework named PeFAD with the increasing privacy concerns.
We conduct extensive evaluations on four real datasets, where PeFAD outperforms existing state-of-the-art baselines by up to 28.74%.
arXiv Detail & Related papers (2024-06-04T13:51:08Z) - A Unified View of Differentially Private Deep Generative Modeling [60.72161965018005]
Data with privacy concerns comes with stringent regulations that frequently prohibited data access and data sharing.
Overcoming these obstacles is key for technological progress in many real-world application scenarios that involve privacy sensitive data.
Differentially private (DP) data publishing provides a compelling solution, where only a sanitized form of the data is publicly released.
arXiv Detail & Related papers (2023-09-27T14:38:16Z) - Trust your Good Friends: Source-free Domain Adaptation by Reciprocal
Neighborhood Clustering [50.46892302138662]
We address the source-free domain adaptation problem, where the source pretrained model is adapted to the target domain in the absence of source data.
Our method is based on the observation that target data, which might not align with the source domain classifier, still forms clear clusters.
We demonstrate that this local structure can be efficiently captured by considering the local neighbors, the reciprocal neighbors, and the expanded neighborhood.
arXiv Detail & Related papers (2023-09-01T15:31:18Z) - Deep Unsupervised Domain Adaptation: A Review of Recent Advances and
Perspectives [16.68091981866261]
Unsupervised domain adaptation (UDA) is proposed to counter the performance drop on data in a target domain.
UDA has yielded promising results on natural image processing, video analysis, natural language processing, time-series data analysis, medical image analysis, etc.
arXiv Detail & Related papers (2022-08-15T20:05:07Z) - Source-Free Domain Adaptation via Distribution Estimation [106.48277721860036]
Domain Adaptation aims to transfer the knowledge learned from a labeled source domain to an unlabeled target domain whose data distributions are different.
Recently, Source-Free Domain Adaptation (SFDA) has drawn much attention, which tries to tackle domain adaptation problem without using source data.
In this work, we propose a novel framework called SFDA-DE to address SFDA task via source Distribution Estimation.
arXiv Detail & Related papers (2022-04-24T12:22:19Z) - Secure Domain Adaptation with Multiple Sources [13.693640425403636]
Multi-source unsupervised domain adaptation (MUDA) is a recently explored learning framework.
The goal is to address the challenge of labeled data scarcity in a target domain via transferring knowledge from multiple source domains with annotated data.
Since the source data is distributed, the privacy of source domains' data can be a natural concern.
We benefit from the idea of domain alignment in an embedding space to address the privacy concern for MUDA.
arXiv Detail & Related papers (2021-06-23T02:26:36Z) - TOHAN: A One-step Approach towards Few-shot Hypothesis Adaptation [73.75784418508033]
In few-shot domain adaptation (FDA), classifiers for the target domain are trained with labeled data in the source domain (SD) and few labeled data in the target domain (TD)
Data usually contain private information in the current era, e.g., data distributed on personal phones.
We propose a target orientated hypothesis adaptation network (TOHAN) to solve the problem.
arXiv Detail & Related papers (2021-06-11T11:46:20Z) - Data-driven Regularized Inference Privacy [33.71757542373714]
We propose a data-driven inference privacy preserving framework to sanitize data.
We develop an inference privacy framework based on the variational method.
We present empirical methods to estimate the privacy metric.
arXiv Detail & Related papers (2020-10-10T08:42:59Z) - Open-Set Hypothesis Transfer with Semantic Consistency [99.83813484934177]
We introduce a method that focuses on the semantic consistency under transformation of target data.
Our model first discovers confident predictions and performs classification with pseudo-labels.
As a result, unlabeled data can be classified into discriminative classes coincided with either source classes or unknown classes.
arXiv Detail & Related papers (2020-10-01T10:44:31Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.