Label Leakage and Protection from Forward Embedding in Vertical
Federated Learning
- URL: http://arxiv.org/abs/2203.01451v2
- Date: Fri, 4 Mar 2022 01:54:15 GMT
- Title: Label Leakage and Protection from Forward Embedding in Vertical
Federated Learning
- Authors: Jiankai Sun and Xin Yang and Yuanshun Yao and Chong Wang
- Abstract summary: We propose a practical label inference method which can steal private labels from the shared intermediate embedding.
The effectiveness of the label attack is inseparable from the correlation between the intermediate embedding and corresponding private labels.
- Score: 19.96017956261838
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Vertical federated learning (vFL) has gained much attention and been deployed
to solve machine learning problems with data privacy concerns in recent years.
However, some recent work demonstrated that vFL is vulnerable to privacy
leakage even though only the forward intermediate embedding (rather than raw
features) and backpropagated gradients (rather than raw labels) are
communicated between the involved participants. As the raw labels often contain
highly sensitive information, several methods have recently been proposed to
effectively prevent label leakage from the backpropagated gradients in vFL.
However, these works only identify and defend against the threat of label
leakage from the backpropagated gradients; none of them addresses the problem
of label leakage from the intermediate embedding. In this paper, we propose a
practical label inference method which can effectively steal private labels
from the shared intermediate embedding even when existing protection methods
such as label differential privacy and gradient perturbation are applied. The
effectiveness of the label attack is inseparable from the
correlation between the intermediate embedding and corresponding private
labels. To mitigate the issue of label leakage from the forward embedding, we
add an additional optimization goal at the label party to limit the label
stealing ability of the adversary by minimizing the distance correlation
between the intermediate embedding and corresponding private labels. We
conducted extensive experiments to demonstrate the effectiveness of our proposed
protection methods.
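
For intuition, the kind of forward-embedding attack described in the abstract can be as simple as unsupervised clustering of the embeddings exchanged during the forward pass. The sketch below is a hypothetical, generic illustration (not the paper's exact attack): it clusters embeddings with k-means and maps each cluster to a label using a handful of samples whose labels the adversary already knows. All names (`infer_labels`, `known_idx`, etc.) are illustrative.

```python
# Hypothetical sketch of a forward-embedding label inference attack.
# This is a generic clustering-based illustration, not the method
# proposed in the paper.
import numpy as np
from sklearn.cluster import KMeans

def infer_labels(embeddings, known_idx, known_labels, num_classes):
    """embeddings: (n, d) intermediate embeddings observed by the adversary.
    known_idx / known_labels: a small auxiliary set of samples whose
    labels the adversary already knows (a common attack assumption)."""
    clusters = KMeans(n_clusters=num_classes, n_init=10).fit_predict(embeddings)

    # Map each cluster to the majority label among the known samples it
    # contains; fall back to label 0 when a cluster holds no known sample.
    mapping = {}
    for c in range(num_classes):
        votes = [l for i, l in zip(known_idx, known_labels) if clusters[i] == c]
        mapping[c] = max(set(votes), key=votes.count) if votes else 0

    return np.array([mapping[c] for c in clusters])
```

Such an attack only works to the extent that embeddings of same-label samples lie close together, which is exactly the correlation the paper's defense targets.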
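The proposed defense adds a distance-correlation penalty to the label party's objective. Below is a minimal sketch of an empirical distance correlation loss in PyTorch, assuming one-hot labels; names such as `top_model` and `alpha` are illustrative, and the paper's exact estimator and training setup may differ.

```python
import torch
import torch.nn.functional as F

def distance_correlation(X, Y, eps=1e-9):
    """Empirical distance correlation between batches X (n, d) and Y (n, k).
    Near 0 when X and Y are (nearly) independent, up to 1 when strongly related."""
    def doubly_centered_dist(Z):
        d = torch.cdist(Z, Z)  # pairwise Euclidean distances
        return d - d.mean(0, keepdim=True) - d.mean(1, keepdim=True) + d.mean()

    A, B = doubly_centered_dist(X), doubly_centered_dist(Y)
    dcov2_xy = (A * B).mean()                      # squared distance covariance
    dvar2_x, dvar2_y = (A * A).mean(), (B * B).mean()
    dcor2 = dcov2_xy.clamp(min=0.0) / (torch.sqrt(dvar2_x * dvar2_y) + eps)
    return torch.sqrt(dcor2 + eps)

# Label party's objective: task loss plus the decorrelation penalty, so the
# gradient sent back to the non-label party pushes the embedding away from
# encoding the private labels. `alpha` trades utility for privacy.
def label_party_loss(top_model, embedding, labels, num_classes, alpha=0.1):
    logits = top_model(embedding)
    y_onehot = F.one_hot(labels, num_classes).float()
    return F.cross_entropy(logits, labels) + alpha * distance_correlation(embedding, y_onehot)
```

Since distance correlation is computed per batch and is differentiable through `torch.cdist`, the penalty can be minimized with ordinary SGD alongside the task loss.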
Related papers
- Mixed Blessing: Class-Wise Embedding guided Instance-Dependent Partial Label Learning [53.64180787439527]
In partial label learning (PLL), every sample is associated with a candidate label set comprising the ground-truth label and several noisy labels.
For the first time, we create class-wise embeddings for each sample, which allow us to explore the relationship of instance-dependent noisy labels.
To reduce the high label ambiguity, we introduce the concept of class prototypes containing global feature information.
arXiv Detail & Related papers (2024-12-06T13:25:39Z)
- Learning from Concealed Labels [5.235218636685312]
We propose a novel setting to protect the privacy of each instance, namely learning from concealed labels for multi-class classification.
Concealed labels prevent sensitive labels from appearing in the label set during the label collection stage: the setting specifies 'none' and some randomly sampled insensitive labels as the concealed label set used to annotate sensitive data.
arXiv Detail & Related papers (2024-12-03T08:00:19Z)
- KDk: A Defense Mechanism Against Label Inference Attacks in Vertical Federated Learning [2.765106384328772]
In a Vertical Federated Learning (VFL) scenario, the labels of the samples are kept private from all parties except the aggregating server, which is the label owner.
Recent works discovered that by exploiting gradient information returned by the server to bottom models, an adversary can infer the private labels.
We propose a novel framework called KDk, which combines Knowledge Distillation and k-anonymity to provide a defense mechanism.
arXiv Detail & Related papers (2024-04-18T17:51:02Z)
- Defending Label Inference Attacks in Split Learning under Regression Setting [20.77178463903939]
Split Learning is a privacy-preserving method for implementing Vertical Federated Learning.
In this paper, we focus on label inference attacks in Split Learning under the regression setting.
We propose Random Label Extension (RLE), where labels are extended to obfuscate the label information contained in the gradients.
To further minimize the impact on the original task, we propose Model-based adaptive Label Extension (MLE), where original labels are preserved in the extended labels and dominate the training process.
arXiv Detail & Related papers (2023-08-18T10:22:31Z)
- Adversary-Aware Partial label learning with Label distillation [47.18584755798137]
We present Adversary-Aware Partial Label Learning and introduce the $\textit{rival}$, a set of noisy labels, into the collection of candidate labels for each instance.
Our method achieves promising results on the CIFAR10, CIFAR100 and CUB200 datasets.
arXiv Detail & Related papers (2023-04-02T10:18:30Z)
- Label Inference Attack against Split Learning under Regression Setting [24.287752556622312]
We study label leakage in the regression setting, where the private labels are continuous values.
We propose a novel learning-based attack that integrates gradient information and extra learning regularization objectives.
arXiv Detail & Related papers (2023-01-18T03:17:24Z)
- Exploiting Completeness and Uncertainty of Pseudo Labels for Weakly Supervised Video Anomaly Detection [149.23913018423022]
Weakly supervised video anomaly detection aims to identify abnormal events in videos using only video-level labels.
Two-stage self-training methods have achieved significant improvements by self-generating pseudo labels.
We propose an enhancement framework by exploiting completeness and uncertainty properties for effective self-training.
arXiv Detail & Related papers (2022-12-08T05:53:53Z)
- Dist-PU: Positive-Unlabeled Learning from a Label Distribution Perspective [89.5370481649529]
We propose a label distribution perspective for PU learning in this paper.
Motivated by this perspective, we pursue consistency between the predicted and ground-truth label distributions.
Experiments on three benchmark datasets validate the effectiveness of the proposed method.
arXiv Detail & Related papers (2022-12-06T07:38:29Z)
- Protecting Split Learning by Potential Energy Loss [70.81375125791979]
We focus on the privacy leakage from the forward embeddings of split learning.
We propose the potential energy loss to make the forward embeddings more 'complicated'.
arXiv Detail & Related papers (2022-10-18T06:21:11Z)
- Does Label Differential Privacy Prevent Label Inference Attacks? [26.87328379562665]
Label differential privacy (label-DP) is a popular framework for training private ML models on datasets with public features and sensitive private labels.
Despite its rigorous privacy guarantee, it has been observed that in practice label-DP does not preclude label inference attacks (LIAs).
arXiv Detail & Related papers (2022-02-25T20:57:29Z)
- Does label smoothing mitigate label noise? [57.76529645344897]
We show that label smoothing is competitive with loss-correction under label noise.
We show that when distilling models from noisy data, label smoothing of the teacher is beneficial.
arXiv Detail & Related papers (2020-03-05T18:43:17Z)