Label Leakage and Protection from Forward Embedding in Vertical
Federated Learning
- URL: http://arxiv.org/abs/2203.01451v2
- Date: Fri, 4 Mar 2022 01:54:15 GMT
- Title: Label Leakage and Protection from Forward Embedding in Vertical
Federated Learning
- Authors: Jiankai Sun and Xin Yang and Yuanshun Yao and Chong Wang
- Abstract summary: We propose a practical label inference method which can steal private labels from the shared intermediate embedding.
The effectiveness of the label attack is inseparable from the correlation between the intermediate embedding and corresponding private labels.
- Score: 19.96017956261838
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Vertical federated learning (vFL) has gained much attention and has been
deployed to solve machine learning problems with data privacy concerns in recent
years. However, recent work has demonstrated that vFL is vulnerable to privacy
leakage even though only the forward intermediate embedding (rather than raw
features) and the backpropagated gradients (rather than raw labels) are communicated
between the participants. Because the raw labels often contain highly sensitive
information, several methods have been proposed to effectively prevent label leakage
from the backpropagated gradients in vFL. However, these works only identify and
defend against the threat of label leakage from the backpropagated gradients; none
of them addresses label leakage from the intermediate embedding. In this paper, we
propose a practical label inference method that can effectively steal private labels
from the shared intermediate embedding, even when existing protection methods such
as label differential privacy and gradient perturbation are applied. The
effectiveness of the attack is inseparable from the correlation between the
intermediate embedding and the corresponding private labels. To mitigate label
leakage from the forward embedding, we add an additional optimization goal at the
label party that limits the adversary's label-stealing ability by minimizing the
distance correlation between the intermediate embedding and the corresponding
private labels. We conduct extensive experiments to demonstrate the effectiveness of
our proposed protection methods.
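The defense is described only at this high level in the abstract: the label party adds a regularizer that penalizes the distance correlation between the intermediate embedding and the private labels. The sketch below shows one way such a regularizer could look in PyTorch; it is a minimal illustration under assumptions, and the function names (`dcor`, `label_party_loss`), the weight `alpha`, and the one-hot encoding of labels are illustrative choices rather than the paper's reference implementation.

```python
# Hypothetical sketch of a distance-correlation regularizer for the label party.
# Names and hyperparameters are illustrative, not taken from the paper's code.
import torch
import torch.nn.functional as F


def _pairwise_dist(x: torch.Tensor) -> torch.Tensor:
    # Euclidean distance matrix between the rows of x, shape (n, n).
    return torch.cdist(x, x, p=2)


def _double_center(d: torch.Tensor) -> torch.Tensor:
    # Subtract row means and column means, then add back the grand mean.
    return d - d.mean(dim=0, keepdim=True) - d.mean(dim=1, keepdim=True) + d.mean()


def dcor(x: torch.Tensor, y: torch.Tensor, eps: float = 1e-9) -> torch.Tensor:
    # Sample distance correlation between batches x (n, d_x) and y (n, d_y).
    a = _double_center(_pairwise_dist(x))
    b = _double_center(_pairwise_dist(y))
    dcov2_xy = (a * b).mean()
    dcov2_xx = (a * a).mean()
    dcov2_yy = (b * b).mean()
    return dcov2_xy / (torch.sqrt(dcov2_xx * dcov2_yy) + eps)


def label_party_loss(logits, embedding, labels, alpha=0.1):
    # Task loss plus the distance-correlation penalty between the intermediate
    # embedding received from the non-label party and the one-hot private labels.
    task_loss = F.cross_entropy(logits, labels)
    y = F.one_hot(labels, num_classes=logits.shape[-1]).float()
    return task_loss + alpha * dcor(embedding, y)
```

In a vFL training loop the label party would use `label_party_loss` in place of its plain task loss; the non-label party's forward pass and the exchanged messages stay unchanged, and larger values of `alpha` trade task accuracy for a lower embedding-label correlation.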
Related papers
- LabObf: A Label Protection Scheme for Vertical Federated Learning Through Label Obfuscation [10.224977496821154]
The Split Neural Network is popular in industry due to its privacy-preserving characteristics.
However, malicious participants may still infer label information from the uploaded embeddings, leading to privacy leakage.
We propose a new label obfuscation defense strategy, called LabObf, which randomly maps each original integer-valued label to multiple real-valued soft labels.
arXiv Detail & Related papers (2024-05-27T10:54:42Z)
- KDk: A Defense Mechanism Against Label Inference Attacks in Vertical Federated Learning [2.765106384328772]
In a Vertical Federated Learning (VFL) scenario, the labels of the samples are kept private from all parties except the aggregating server, which is the label owner.
Recent works discovered that, by exploiting the gradient information returned by the server to the bottom models, an adversary can infer the private labels.
We propose a novel framework called KDk, which combines Knowledge Distillation and k-anonymity to provide a defense mechanism.
arXiv Detail & Related papers (2024-04-18T17:51:02Z)
- Defending Label Inference Attacks in Split Learning under Regression Setting [20.77178463903939]
Split Learning is a privacy-preserving method for implementing Vertical Federated Learning.
In this paper, we focus on label inference attacks in Split Learning under the regression setting.
We propose Random Label Extension (RLE), where labels are extended to obfuscate the label information contained in the gradients.
To further minimize the impact on the original task, we propose Model-based adaptive Label Extension (MLE), where the original labels are preserved in the extended labels and dominate the training process.
arXiv Detail & Related papers (2023-08-18T10:22:31Z)
- Adversary-Aware Partial Label Learning with Label Distillation [47.18584755798137]
We present Adversary-Aware Partial Label Learning and introduce the 'rival', a set of noisy labels, to the collection of candidate labels for each instance.
Our method achieves promising results on the CIFAR10, CIFAR100 and CUB200 datasets.
arXiv Detail & Related papers (2023-04-02T10:18:30Z)
- Label Inference Attack against Split Learning under Regression Setting [24.287752556622312]
We study label leakage in the regression setting, where the private labels are continuous numbers.
We propose a novel learning-based attack that integrates gradient information and extra learning regularization objectives.
arXiv Detail & Related papers (2023-01-18T03:17:24Z)
- Exploiting Completeness and Uncertainty of Pseudo Labels for Weakly Supervised Video Anomaly Detection [149.23913018423022]
Weakly supervised video anomaly detection aims to identify abnormal events in videos using only video-level labels.
Two-stage self-training methods have achieved significant improvements by self-generating pseudo labels.
We propose an enhancement framework by exploiting completeness and uncertainty properties for effective self-training.
arXiv Detail & Related papers (2022-12-08T05:53:53Z)
- Dist-PU: Positive-Unlabeled Learning from a Label Distribution Perspective [89.5370481649529]
We propose a label distribution perspective for PU learning in this paper.
Motivated by this perspective, we propose to pursue consistency between the predicted and ground-truth label distributions.
Experiments on three benchmark datasets validate the effectiveness of the proposed method.
arXiv Detail & Related papers (2022-12-06T07:38:29Z)
- Protecting Split Learning by Potential Energy Loss [70.81375125791979]
We focus on the privacy leakage from the forward embeddings of split learning.
We propose the potential energy loss to make the forward embeddings more 'complicated'.
arXiv Detail & Related papers (2022-10-18T06:21:11Z)
- Acknowledging the Unknown for Multi-label Learning with Single Positive Labels [65.5889334964149]
Traditionally, all unannotated labels are assumed to be negative labels in single positive multi-label learning (SPML).
We propose entropy-maximization (EM) loss to maximize the entropy of predicted probabilities for all unannotated labels.
Considering the positive-negative label imbalance of unannotated labels, we propose asymmetric pseudo-labeling (APL) with asymmetric-tolerance strategies and a self-paced procedure to provide more precise supervision.
arXiv Detail & Related papers (2022-03-30T11:43:59Z)
- Does Label Differential Privacy Prevent Label Inference Attacks? [26.87328379562665]
Label differential privacy (label-DP) is a popular framework for training private ML models on datasets with public features and sensitive private labels.
Despite its rigorous privacy guarantee, it has been observed that, in practice, label-DP does not preclude label inference attacks (LIAs).
arXiv Detail & Related papers (2022-02-25T20:57:29Z)
- Does label smoothing mitigate label noise? [57.76529645344897]
We show that label smoothing is competitive with loss-correction under label noise.
We show that when distilling models from noisy data, label smoothing of the teacher is beneficial.
arXiv Detail & Related papers (2020-03-05T18:43:17Z)
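As context for the last entry, a minimal sketch of standard label smoothing is given below; it illustrates the general technique only (mixing the one-hot target with a uniform distribution), not that paper's experimental setup, and the function names and epsilon value are illustrative.

```python
# Minimal sketch of standard label smoothing (illustrative only):
# the one-hot target is mixed with a uniform distribution over classes.
import torch
import torch.nn.functional as F


def smoothed_targets(labels: torch.Tensor, num_classes: int, eps: float = 0.1) -> torch.Tensor:
    # (1 - eps) of the mass on the true class, eps spread uniformly over all classes.
    one_hot = F.one_hot(labels, num_classes).float()
    return (1.0 - eps) * one_hot + eps / num_classes


def smoothed_cross_entropy(logits: torch.Tensor, labels: torch.Tensor, eps: float = 0.1) -> torch.Tensor:
    # Cross-entropy against the smoothed target distribution.
    targets = smoothed_targets(labels, logits.shape[-1], eps)
    log_probs = F.log_softmax(logits, dim=-1)
    return -(targets * log_probs).sum(dim=-1).mean()
```

Recent PyTorch releases also expose the same behaviour directly through the `label_smoothing` argument of `F.cross_entropy`.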