Privacy-Preserving Multi-Center Differential Protein Abundance Analysis with FedProt
- URL: http://arxiv.org/abs/2407.15220v1
- Date: Sun, 21 Jul 2024 17:09:20 GMT
- Title: Privacy-Preserving Multi-Center Differential Protein Abundance Analysis with FedProt
- Authors: Yuliya Burankova, Miriam Abele, Mohammad Bakhtiari, Christine von Törne, Teresa Barth, Lisa Schweizer, Pieter Giesbertz, Johannes R. Schmidt, Stefan Kalkhof, Janina Müller-Deile, Peter A van Veelen, Yassene Mohammed, Elke Hammer, Lis Arend, Klaudia Adamowicz, Tanja Laske, Anne Hartebrodt, Tobias Frisch, Chen Meng, Julian Matschinske, Julian Späth, Richard Röttger, Veit Schwämmle, Stefanie M. Hauck, Stefan Lichtenthaler, Axel Imhof, Matthias Mann, Christina Ludwig, Bernhard Kuster, Jan Baumbach, Olga Zolotareva,
- Abstract summary: FedProt is the first privacy-preserving tool for collaborative differential protein abundance analysis of distributed data.
It achieves accuracy equivalent to DEqMS applied to pooled data, with completely negligible absolute differences.
FedProt is available as a web tool with detailed documentation as a FeatureCloud App.
- Score: 1.0691609140312175
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Quantitative mass spectrometry has revolutionized proteomics by enabling simultaneous quantification of thousands of proteins. Pooling patient-derived data from multiple institutions enhances statistical power but raises significant privacy concerns. Here we introduce FedProt, the first privacy-preserving tool for collaborative differential protein abundance analysis of distributed data, which utilizes federated learning and additive secret sharing. In the absence of a multicenter patient-derived dataset for evaluation, we created two: one spanning five centers based on LFQ E. coli experiments and one spanning three centers based on TMT human serum data. Evaluations using these datasets confirm that FedProt achieves accuracy equivalent to DEqMS applied to pooled data, with negligible absolute differences no greater than $4 \times 10^{-12}$. In contrast, -log10(p-values) computed by the most accurate meta-analysis methods diverged from the centralized analysis results by up to 25-27. FedProt is available as a web tool with detailed documentation as a FeatureCloud App.
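The abstract's key privacy primitive, additive secret sharing, lets a coordinator learn only the sum of per-center statistics, never any single center's value. The following is a minimal sketch of that idea, not FedProt's actual protocol; the field size, the `share`/`reconstruct` helpers, and the per-center sums are illustrative assumptions.

```python
import random

PRIME = 2**61 - 1  # large prime modulus for the share field (illustrative choice)

def share(secret: int, n_parties: int) -> list[int]:
    """Split an integer into n additive shares that sum to it mod PRIME."""
    shares = [random.randrange(PRIME) for _ in range(n_parties - 1)]
    shares.append((secret - sum(shares)) % PRIME)
    return shares

def reconstruct(shares: list[int]) -> int:
    """Recover the shared value by summing all shares mod PRIME."""
    return sum(shares) % PRIME

# Each center secret-shares its local statistic; any single aggregator
# sees only one random-looking share per center, never the raw value.
local_sums = [120, 340, 215]                      # hypothetical per-center statistics
all_shares = [share(s, 3) for s in local_sums]
# each aggregator sums the shares it received from every center ...
partials = [sum(col) % PRIME for col in zip(*all_shares)]
# ... and combining the partial sums reveals only the global total
total = reconstruct(partials)                     # == 675 == sum(local_sums)
```

Only the pooled total becomes visible, which is exactly what a downstream statistic such as a group mean needs.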
Related papers
- Federated Transformer-GNN for Privacy-Preserving Brain Tumor Localization with Modality-Level Explainability [0.2683233968306505]
We present a federated learning framework for brain tumor localization that enables multi-institutional collaboration without sharing sensitive patient data. Our method extends a hybrid Transformer-Graph Neural Network architecture derived from prior decoder-free supervoxel GNNs. We provide an explainability analysis through Transformer attention mechanisms that reveals which MRI modalities drive the model predictions.
arXiv Detail & Related papers (2026-01-21T14:46:00Z) - FedOnco-Bench: A Reproducible Benchmark for Privacy-Aware Federated Tumor Segmentation with Synthetic CT Data [0.0]
Federated Learning (FL) allows multiple institutions to cooperatively train machine learning models while retaining sensitive data at the source. This paper presents FedOnco-Bench, a reproducible benchmark for privacy-aware FL using synthetic oncologic CT scans with tumor annotations. It evaluates segmentation performance and privacy leakage across FL methods: FedAvg, FedProx, FedBN, and FedAvg with DP-SGD.
arXiv Detail & Related papers (2025-11-02T04:17:14Z) - Insights into the Unknown: Federated Data Diversity Analysis on Molecular Data [0.0]
Federated learning (FL) offers a promising approach to integrate private data into privacy-preserving, collaborative model training across data silos. We benchmark three approaches, Federated kMeans (Fed-kMeans), Federated Principal Component Analysis combined with Fed-kMeans (Fed-PCA+Fed-kMeans), and Federated Locality-Sensitive Hashing (Fed-LSH), against their centralized counterparts on eight diverse molecular datasets.
arXiv Detail & Related papers (2025-10-22T12:41:04Z) - A Robust Pipeline for Differentially Private Federated Learning on Imbalanced Clinical Data using SMOTETomek and FedProx [0.0]
Federated Learning (FL) presents a groundbreaking approach for collaborative health research. FL offers formal security guarantees when combined with Differential Privacy (DP). An optimal operational region was identified on the privacy-utility frontier.
arXiv Detail & Related papers (2025-08-06T20:47:50Z) - Efficient Federated Learning with Heterogeneous Data and Adaptive Dropout [62.73150122809138]
Federated Learning (FL) is a promising distributed machine learning approach that enables collaborative training of a global model using multiple edge devices. We propose the FedDHAD FL framework, which comes with two novel methods: Dynamic Heterogeneous model aggregation (FedDH) and Adaptive Dropout (FedAD). The combination of these two methods makes FedDHAD significantly outperform state-of-the-art solutions in terms of accuracy (up to 6.7% higher), efficiency (up to 2.02 times faster), and cost (up to 15.0% smaller).
arXiv Detail & Related papers (2025-07-14T16:19:00Z) - Federated Causal Inference from Multi-Site Observational Data via Propensity Score Aggregation [0.0]
Causal inference typically assumes centralized access to individual-level data. We address this by estimating the Average Treatment Effect (ATE) from decentralized observational data using federated learning.
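One common way such a federated ATE can be obtained is for each site to compute a local propensity-score-weighted estimate and share only the estimate and its sample size, which the coordinator then pools. This is a generic sketch of that pattern, not necessarily the aggregation rule used in the cited paper; the `ipw_ate` helper, the toy data, and the constant propensity scores are all hypothetical.

```python
def ipw_ate(treat, outcome, pscore):
    """Local inverse-probability-weighted ATE estimate at one site."""
    n = len(treat)
    # weighted mean outcome under treatment and under control
    y1 = sum(t * y / p for t, y, p in zip(treat, outcome, pscore)) / n
    y0 = sum((1 - t) * y / (1 - p) for t, y, p in zip(treat, outcome, pscore)) / n
    return y1 - y0, n

sites = [
    # (treatment indicators, outcomes, estimated propensity scores)
    ([1, 0, 1, 0], [5.0, 3.0, 6.0, 2.0], [0.5, 0.5, 0.5, 0.5]),
    ([1, 1, 0, 0, 0, 1], [4.0, 5.5, 2.5, 3.0, 2.0, 6.0], [0.5] * 6),
]
estimates = [ipw_ate(t, y, p) for t, y, p in sites]

# the coordinator pools only (ATE, n) pairs, never row-level data
ate = sum(e * n for e, n in estimates) / sum(n for _, n in estimates)
```

The privacy benefit comes from the fact that only two scalars leave each site.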
arXiv Detail & Related papers (2025-05-23T14:32:57Z) - Improved Robustness for Deep Learning-based Segmentation of Multi-Center Myocardial Perfusion MRI Datasets Using Data Adaptive Uncertainty-guided Space-time Analysis [0.24285581051793656]
Fully automatic analysis of perfusion datasets enables rapid and objective reporting of stress/rest studies in patients.
Developing deep learning techniques that can analyze multi-center datasets despite limited training data and variations in software and hardware remains an ongoing challenge.
The proposed DAUGS analysis approach has the potential to improve robustness of deep learning methods for segmentation of multi-center stress perfusion datasets.
arXiv Detail & Related papers (2024-08-09T01:21:41Z) - PrivFED -- A Framework for Privacy-Preserving Federated Learning in Enhanced Breast Cancer Diagnosis [0.0]
This study introduces a federated learning framework, trained on the Wisconsin dataset, to mitigate challenges such as data scarcity and imbalance.
The model exhibits an average accuracy of 99.95% on edge devices and 98% on the central server.
arXiv Detail & Related papers (2024-05-13T18:01:57Z) - Investigation of Federated Learning Algorithms for Retinal Optical Coherence Tomography Image Classification with Statistical Heterogeneity [6.318288071829899]
We investigate the effectiveness of FedAvg and FedProx to train an OCT image classification model in a decentralized fashion.
We partitioned a publicly available OCT dataset across multiple clients under IID and Non-IID settings and conducted local training on the subsets for each client.
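The FedAvg aggregation step that several of these entries rely on is simply a sample-size-weighted average of client parameters. A minimal sketch of that step (not any specific paper's implementation; the parameter vectors and client sizes are hypothetical):

```python
def fedavg(client_weights, client_sizes):
    """Sample-size-weighted average of client parameter vectors (FedAvg)."""
    total = sum(client_sizes)
    dim = len(client_weights[0])
    return [
        sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
        for i in range(dim)
    ]

# three clients holding unequal shares of the data (10%, 20%, 70%)
w_global = fedavg([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]], [10, 20, 70])
```

Under Non-IID partitions the weighted average can drift away from any single client's optimum, which is what variants like FedProx are designed to counteract.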
arXiv Detail & Related papers (2024-02-15T15:58:42Z) - Efficiently Predicting Protein Stability Changes Upon Single-point Mutation with Large Language Models [51.57843608615827]
The ability to precisely predict protein thermostability is pivotal for various subfields and applications in biochemistry.
We introduce an ESM-assisted efficient approach that integrates protein sequence and structural features to predict thermostability changes in proteins upon single-point mutations.
arXiv Detail & Related papers (2023-12-07T03:25:49Z) - Data-Free Distillation Improves Efficiency and Privacy in Federated Thorax Disease Analysis [11.412151951949102]
Thorax disease analysis in large-scale, multi-centre, and multi-scanner settings is often limited by strict privacy policies.
We introduce FedKDF, a data-free distillation-based FL approach.
In FedKDF, the server employs a lightweight generator to aggregate knowledge from different clients without requiring access to their private data or a proxy dataset.
arXiv Detail & Related papers (2023-10-22T18:27:35Z) - Source-Free Collaborative Domain Adaptation via Multi-Perspective Feature Enrichment for Functional MRI Analysis [55.03872260158717]
Resting-state functional MRI (rs-fMRI) is increasingly employed in multi-site research to aid neurological disorder analysis.
Many methods have been proposed to reduce fMRI heterogeneity between source and target domains.
But acquiring source data is challenging due to privacy concerns and/or data storage burdens in multi-site studies.
We design a source-free collaborative domain adaptation framework for fMRI analysis, where only a pretrained source model and unlabeled target data are accessible.
arXiv Detail & Related papers (2023-08-24T01:30:18Z) - Differentially Private Federated Clustering over Non-IID Data [59.611244450530315]
The federated clustering (FedC) problem aims to accurately partition unlabeled data samples distributed over massive clients into a finite number of clusters under the orchestration of a server. We propose a novel FedC algorithm with differential privacy, referred to as DP-Fed, in which partial participation and multiple local updates are also considered. Various attributes of the proposed DP-Fed are obtained through theoretical analyses of privacy protection, especially for the case of non-identically and independently distributed (non-i.i.d.) data.
arXiv Detail & Related papers (2023-01-03T05:38:43Z) - FedSkip: Combatting Statistical Heterogeneity with Federated Skip Aggregation [95.85026305874824]
We introduce a data-driven approach called FedSkip to improve the client optima by periodically skipping federated averaging and scattering local models to the cross devices.
We conduct extensive experiments on a range of datasets to demonstrate that FedSkip achieves much higher accuracy, better aggregation efficiency, and competitive communication efficiency.
arXiv Detail & Related papers (2022-12-14T13:57:01Z) - Scotch: An Efficient Secure Computation Framework for Secure Aggregation [0.0]
Federated learning enables multiple data owners to jointly train a machine learning model without revealing their private datasets.
A malicious aggregation server might use the model parameters to derive sensitive information about the training dataset used.
We propose Scotch, a decentralized m-party secure-computation framework for federated aggregation.
arXiv Detail & Related papers (2022-01-19T17:16:35Z) - Sensitivity analysis in differentially private machine learning using hybrid automatic differentiation [54.88777449903538]
We introduce a novel hybrid automatic differentiation (AD) system for sensitivity analysis.
This enables modelling the sensitivity of arbitrary differentiable function compositions, such as the training of neural networks on private data.
Our approach enables principled reasoning about privacy loss in the setting of data processing.
arXiv Detail & Related papers (2021-07-09T07:19:23Z) - Accuracy and Privacy Evaluations of Collaborative Data Analysis [4.987315310656657]
Collaborative data analysis through sharing dimensionality-reduced representations of data has been proposed as a non-model-sharing type of federated learning.
This paper analyzes the accuracy and privacy evaluations of this novel framework.
arXiv Detail & Related papers (2021-01-27T00:38:47Z) - Federated Doubly Stochastic Kernel Learning for Vertically Partitioned Data [93.76907759950608]
We propose a federated doubly stochastic kernel learning (FDSKL) algorithm for vertically partitioned data.
We show that FDSKL is significantly faster than state-of-the-art federated learning methods when dealing with kernels.
arXiv Detail & Related papers (2020-08-14T05:46:56Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this list (including all information) and is not responsible for any consequences.