Privacy-Preserving Federated Learning over Vertically and Horizontally
Partitioned Data for Financial Anomaly Detection
- URL: http://arxiv.org/abs/2310.19304v1
- Date: Mon, 30 Oct 2023 06:51:33 GMT
- Title: Privacy-Preserving Federated Learning over Vertically and Horizontally
Partitioned Data for Financial Anomaly Detection
- Authors: Swanand Ravindra Kadhe, Heiko Ludwig, Nathalie Baracaldo, Alan King,
Yi Zhou, Keith Houck, Ambrish Rawat, Mark Purcell, Naoise Holohan, Mikio
Takeuchi, Ryo Kawahara, Nir Drucker, Hayim Shaul, Eyal Kushnir, Omri Soceanu
- Abstract summary: In real-world financial anomaly detection scenarios, the data is partitioned both vertically and horizontally.
Our solution combines fully homomorphic encryption (HE), secure multi-party computation (SMPC), differential privacy (DP)
Our solution won second prize in the first phase of the U.S. Privacy Enhancing Technologies (PETs) Prize Challenge.
- Score: 11.167661320589488
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: The effective detection of evidence of financial anomalies requires
collaboration among multiple entities who own a diverse set of data, such as a
payment network system (PNS) and its partner banks. Trust among these financial
institutions is limited by regulation and competition. Federated learning (FL)
enables entities to collaboratively train a model when data is either
vertically or horizontally partitioned across the entities. However, in
real-world financial anomaly detection scenarios, the data is partitioned both
vertically and horizontally and hence it is not possible to use existing FL
approaches in a plug-and-play manner.
Our novel solution, PV4FAD, combines fully homomorphic encryption (HE),
secure multi-party computation (SMPC), differential privacy (DP), and
randomization techniques to balance privacy and accuracy during training and to
prevent inference threats at model deployment time. Our solution provides input
privacy through HE and SMPC, and output privacy against inference time attacks
through DP. Specifically, we show that, in the honest-but-curious threat model,
banks do not learn any sensitive features about PNS transactions, and the PNS
does not learn any information about the banks' dataset but only learns
prediction labels. We also develop and analyze a DP mechanism to protect output
privacy during inference. Our solution generates high-utility models by
significantly reducing the per-bank noise level while satisfying distributed
DP. To ensure high accuracy, our approach produces an ensemble model, in
particular, a random forest. This enables us to take advantage of the
well-known properties of ensembles to reduce variance and increase accuracy.
Our solution won second prize in the first phase of the U.S. Privacy Enhancing
Technologies (PETs) Prize Challenge.
Related papers
- Pseudo-Probability Unlearning: Towards Efficient and Privacy-Preserving Machine Unlearning [59.29849532966454]
We propose PseudoProbability Unlearning (PPU), a novel method that enables models to forget data to adhere to privacy-preserving manner.
Our method achieves over 20% improvements in forgetting error compared to the state-of-the-art.
arXiv Detail & Related papers (2024-11-04T21:27:06Z) - Convergent Differential Privacy Analysis for General Federated Learning: the $f$-DP Perspective [57.35402286842029]
Federated learning (FL) is an efficient collaborative training paradigm with a focus on local privacy.
differential privacy (DP) is a classical approach to capture and ensure the reliability of private protections.
arXiv Detail & Related papers (2024-08-28T08:22:21Z) - Privacy Amplification for the Gaussian Mechanism via Bounded Support [64.86780616066575]
Data-dependent privacy accounting frameworks such as per-instance differential privacy (pDP) and Fisher information loss (FIL) confer fine-grained privacy guarantees for individuals in a fixed training dataset.
We propose simple modifications of the Gaussian mechanism with bounded support, showing that they amplify privacy guarantees under data-dependent accounting.
arXiv Detail & Related papers (2024-03-07T21:22:07Z) - Libertas: Privacy-Preserving Computation for Decentralised Personal Data Stores [19.54818218429241]
We propose a modular design for integrating Secure Multi-Party Computation with Solid.
Our architecture, Libertas, requires no protocol level changes in the underlying design of Solid.
We show how this can be combined with existing differential privacy techniques to also ensure output privacy.
arXiv Detail & Related papers (2023-09-28T12:07:40Z) - Breaking the Communication-Privacy-Accuracy Tradeoff with
$f$-Differential Privacy [51.11280118806893]
We consider a federated data analytics problem in which a server coordinates the collaborative data analysis of multiple users with privacy concerns and limited communication capability.
We study the local differential privacy guarantees of discrete-valued mechanisms with finite output space through the lens of $f$-differential privacy (DP)
More specifically, we advance the existing literature by deriving tight $f$-DP guarantees for a variety of discrete-valued mechanisms.
arXiv Detail & Related papers (2023-02-19T16:58:53Z) - Is Vertical Logistic Regression Privacy-Preserving? A Comprehensive
Privacy Analysis and Beyond [57.10914865054868]
We consider vertical logistic regression (VLR) trained with mini-batch descent gradient.
We provide a comprehensive and rigorous privacy analysis of VLR in a class of open-source Federated Learning frameworks.
arXiv Detail & Related papers (2022-07-19T05:47:30Z) - Federated Learning with Sparsified Model Perturbation: Improving
Accuracy under Client-Level Differential Privacy [27.243322019117144]
Federated learning (FL) enables distributed clients to collaboratively learn a shared statistical model.
sensitive information about the training data can still be inferred from model updates shared in FL.
Differential privacy (DP) is the state-of-the-art technique to defend against those attacks.
This paper develops a novel FL scheme named Fed-SMP that provides client-level DP guarantee while maintaining high model accuracy.
arXiv Detail & Related papers (2022-02-15T04:05:42Z) - Differentially Private Federated Learning on Heterogeneous Data [10.431137628048356]
Federated Learning (FL) is a paradigm for large-scale distributed learning.
It faces two key challenges: (i) efficient training from highly heterogeneous user data, and (ii) protecting the privacy of participating users.
We propose a novel FL approach to tackle these two challenges together by incorporating Differential Privacy (DP) constraints.
arXiv Detail & Related papers (2021-11-17T18:23:49Z) - Federated Learning with Sparsification-Amplified Privacy and Adaptive
Optimization [27.243322019117144]
Federated learning (FL) enables distributed agents to collaboratively learn a centralized model without sharing their raw data with each other.
We propose a new FL framework with sparsification-amplified privacy.
Our approach integrates random sparsification with gradient perturbation on each agent to amplify privacy guarantee.
arXiv Detail & Related papers (2020-08-01T20:22:57Z) - Differentially Private Federated Learning with Laplacian Smoothing [72.85272874099644]
Federated learning aims to protect data privacy by collaboratively learning a model without sharing private data among users.
An adversary may still be able to infer the private training data by attacking the released model.
Differential privacy provides a statistical protection against such attacks at the price of significantly degrading the accuracy or utility of the trained models.
arXiv Detail & Related papers (2020-05-01T04:28:38Z) - FedSel: Federated SGD under Local Differential Privacy with Top-k
Dimension Selection [26.54574385850849]
In this work, we propose a two-stage framework FedSel for federated SGD under LDP.
Specifically, we propose three private dimension selection mechanisms and adapt the accumulation technique to stabilize the learning process with noisy updates.
We also theoretically analyze privacy, accuracy and time complexity of FedSel, which outperforms the state-of-the-art solutions.
arXiv Detail & Related papers (2020-03-24T03:31:21Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.