A Survey of Privacy Threats and Defense in Vertical Federated Learning:
From Model Life Cycle Perspective
- URL: http://arxiv.org/abs/2402.03688v1
- Date: Tue, 6 Feb 2024 04:22:44 GMT
- Title: A Survey of Privacy Threats and Defense in Vertical Federated Learning:
From Model Life Cycle Perspective
- Authors: Lei Yu, Meng Han, Yiming Li, Changting Lin, Yao Zhang, Mingyang Zhang,
Yan Liu, Haiqin Weng, Yuseok Jeon, Ka-Ho Chow, Stacy Patterson
- Abstract summary: We conduct the first comprehensive survey of the state-of-the-art in privacy attacks and defenses in Vertical Federated Learning.
We provide taxonomies for both attacks and defenses, based on their characterizations, and discuss open challenges and future research directions.
- Score: 31.19776505014808
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Vertical Federated Learning (VFL) is a federated learning paradigm where
multiple participants, who share the same set of samples but hold different
features, jointly train machine learning models. Although VFL enables
collaborative machine learning without sharing raw data, it is still
susceptible to various privacy threats. In this paper, we conduct the first
comprehensive survey of the state-of-the-art in privacy attacks and defenses in
VFL. We provide taxonomies for both attacks and defenses, based on their
characterizations, and discuss open challenges and future research directions.
Specifically, our discussion is structured around the model's life cycle, by
delving into the privacy threats encountered during different stages of machine
learning and their corresponding countermeasures. This survey not only serves
as a resource for the research community but also offers clear guidance and
actionable insights for practitioners to safeguard data privacy throughout the
model's life cycle.
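To make the setting concrete, the sketch below illustrates the VFL paradigm described above: two parties hold different feature columns of the same samples and jointly fit a model by exchanging only partial scores and a shared gradient signal, never raw features. This is a minimal hypothetical example (a linear model in plain NumPy; all names are ours), not code from the survey.

```python
# Minimal vertical federated learning (VFL) sketch: two parties hold
# different feature columns of the SAME samples and jointly fit a
# logistic-regression model without exchanging raw features.
# Hypothetical illustration, not code from the surveyed systems.
import numpy as np

rng = np.random.default_rng(0)
n = 256
X_a = rng.normal(size=(n, 3))   # party A's private features
X_b = rng.normal(size=(n, 2))   # party B's private features (same samples)
w_true = rng.normal(size=5)
y = (np.concatenate([X_a, X_b], axis=1) @ w_true > 0).astype(float)

w_a, w_b = np.zeros(3), np.zeros(2)   # each party's local parameters
lr = 0.1

for step in range(200):
    # Each party computes a partial score on its own features only;
    # the active party aggregates the partial scores.
    z = X_a @ w_a + X_b @ w_b
    p = 1.0 / (1.0 + np.exp(-z))
    residual = p - y                  # shared gradient signal
    # Each party updates its local parameters from the residual alone.
    w_a -= lr * X_a.T @ residual / n
    w_b -= lr * X_b.T @ residual / n

p = 1.0 / (1.0 + np.exp(-(X_a @ w_a + X_b @ w_b)))
print(f"joint training accuracy: {np.mean((p > 0.5) == y):.2f}")
```

Even in this toy protocol, the residual exchanged each round is an intermediate signal of exactly the kind the attacks surveyed in the paper exploit.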
Related papers
- UIFV: Data Reconstruction Attack in Vertical Federated Learning [5.404398887781436]
Vertical Federated Learning (VFL) facilitates collaborative machine learning without the need for participants to share raw private data.
Recent studies have revealed privacy risks where adversaries might reconstruct sensitive features through data leakage during the learning process.
Our work exposes severe privacy vulnerabilities within VFL systems that pose real threats to practical VFL applications.
arXiv Detail & Related papers (2024-06-18T13:18:52Z)
- Vertical Federated Learning for Effectiveness, Security, Applicability: A Survey [67.48187503803847]
Vertical Federated Learning (VFL) is a privacy-preserving distributed learning paradigm.
Recent research has shown promising results addressing various challenges in VFL.
This survey offers a systematic overview of recent developments.
arXiv Detail & Related papers (2024-05-25T16:05:06Z)
- Privacy Side Channels in Machine Learning Systems [87.53240071195168]
We introduce privacy side channels: attacks that exploit system-level components to extract private information.
For example, we show that deduplicating training data before applying differentially-private training creates a side-channel that completely invalidates any provable privacy guarantees.
We further show that systems which block language models from regenerating training data can be exploited to exfiltrate private keys contained in the training set.
arXiv Detail & Related papers (2023-09-11T16:49:05Z)
- Students Parrot Their Teachers: Membership Inference on Model Distillation [54.392069096234074]
We study the privacy provided by knowledge distillation to both the teacher and student training sets.
Our attacks are strongest when student and teacher sets are similar, or when the attacker can poison the teacher set.
arXiv Detail & Related papers (2023-03-06T19:16:23Z)
- Vertical Federated Learning: Concepts, Advances and Challenges [18.38260017835129]
We review the concept and algorithms of Vertical Federated Learning (VFL).
We provide an exhaustive categorization for VFL settings and privacy-preserving protocols.
We propose a unified framework, termed VFLow, which considers the VFL problem under communication, computation, privacy, as well as effectiveness and fairness constraints.
arXiv Detail & Related papers (2022-11-23T10:00:06Z)
- ADI: Adversarial Dominating Inputs in Vertical Federated Learning Systems [19.63921342081889]
We find that certain inputs of a participant, named adversarial dominating inputs (ADIs), can dominate the joint inference toward the direction of the adversary's will.
We propose gradient-based methods to synthesize ADIs of various formats and exploit common VFL systems.
Our study reveals new VFL attack opportunities, promoting the identification of unknown threats before breaches and building more secure VFL systems.
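As a rough illustration of the gradient-based synthesis idea (a hypothetical sketch, simplified to a linear joint model; the victim's weights appear only to evaluate the outcome and would be unknown to the adversary), the adversary ascends the gradient of the joint score with respect to its own input until its partial contribution dominates the aggregation:

```python
# Hypothetical sketch of synthesizing an adversarial dominating input
# (ADI) by gradient ascent on a simplified linear joint model.
import numpy as np

rng = np.random.default_rng(1)
w_adv = rng.normal(size=4)     # adversary's local model weights (known to it)
w_victim = rng.normal(size=3)  # other party's weights (unknown in practice)

x_adv = rng.normal(size=4)     # the input the adversary controls
target = +1.0                  # direction the adversary wants to force

for _ in range(100):
    # For a linear model, the gradient of the joint score w.r.t. the
    # adversary's input is simply w_adv; ascend toward the target.
    # A real attack would also constrain the perturbation for stealth.
    x_adv += 0.1 * target * w_adv

# The adversary's partial score now dominates: the joint prediction is
# pushed positive for almost any input the other party contributes.
trials = rng.normal(size=(1000, 3))
joint = x_adv @ w_adv + trials @ w_victim
print(f"fraction of victim inputs dominated: {np.mean(joint > 0):.3f}")
```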
arXiv Detail & Related papers (2022-01-08T06:18:17Z)
- Federated Test-Time Adaptive Face Presentation Attack Detection with Dual-Phase Privacy Preservation [100.69458267888962]
Face presentation attack detection (fPAD) plays a critical role in the modern face recognition pipeline.
Due to legal and privacy issues, training data (real face images and spoof images) are not allowed to be directly shared between different data sources.
We propose a Federated Test-Time Adaptive Face Presentation Attack Detection with Dual-Phase Privacy Preservation framework.
arXiv Detail & Related papers (2021-10-25T02:51:05Z)
- Privacy and Robustness in Federated Learning: Attacks and Defenses [74.62641494122988]
We conduct the first comprehensive survey on this topic.
Through a concise introduction to the concept of FL and a unique taxonomy covering: 1) threat models; 2) poisoning attacks against robustness and the corresponding defenses; and 3) inference attacks against privacy and the corresponding defenses, we provide an accessible review of this important topic.
arXiv Detail & Related papers (2020-12-07T12:11:45Z)
- Sampling Attacks: Amplification of Membership Inference Attacks by Repeated Queries [74.59376038272661]
We introduce the sampling attack, a membership inference technique that, unlike standard membership adversaries, operates under the severe restriction of having no access to the victim model's confidence scores.
We show that a victim model that publishes only labels is still susceptible to sampling attacks, and that the adversary can recover up to 100% of the attack performance attainable with full score access.
For defense, we choose differential privacy in the form of gradient perturbation during training of the victim model, as well as output perturbation at prediction time.
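The repeated-query idea can be sketched as follows (a hypothetical toy example, not the paper's implementation: the stand-in victim is a fixed linear rule, and the two test points are chosen to mimic the robustness gap between members and non-members that the attack relies on). The adversary perturbs a candidate point many times and uses the label-agreement rate as a membership score:

```python
# Toy sketch of a label-only membership signal via repeated queries:
# perturb a point many times and score membership by how often the
# victim's predicted label stays the same. Hypothetical illustration.
import numpy as np

rng = np.random.default_rng(2)

def label_stability(predict_label, x, n_queries=50, sigma=0.3):
    """Fraction of perturbed queries whose label matches the clean query."""
    base = predict_label(x)
    noisy = x + sigma * rng.normal(size=(n_queries, x.shape[0]))
    return np.mean([predict_label(q) == base for q in noisy])

# Stand-in victim that publishes labels only, never confidence scores.
w = np.array([1.0, -2.0, 0.5])
predict = lambda x: int(x @ w > 0)

member = np.array([2.0, -1.0, 1.0])       # robustly classified, stable labels
non_member = np.array([0.1, 0.04, 0.02])  # near the boundary, unstable labels
print("member score:    ", label_stability(predict, member))
print("non-member score:", label_stability(predict, non_member))
```

Gradient perturbation during training (the differential-privacy defense mentioned above) is meant to damp this signal by limiting how sharply the model fits individual training points.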
arXiv Detail & Related papers (2020-09-01T12:54:54Z)