Label Leakage and Protection from Forward Embedding in Vertical
Federated Learning
- URL: http://arxiv.org/abs/2203.01451v2
- Date: Fri, 4 Mar 2022 01:54:15 GMT
- Title: Label Leakage and Protection from Forward Embedding in Vertical
Federated Learning
- Authors: Jiankai Sun and Xin Yang and Yuanshun Yao and Chong Wang
- Abstract summary: We propose a practical label inference method which can steal private labels from the shared intermediate embedding.
The effectiveness of the label attack is inseparable from the correlation between the intermediate embedding and corresponding private labels.
- Score: 19.96017956261838
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Vertical federated learning (vFL) has gained much attention and been deployed
to solve machine learning problems with data privacy concerns in recent years.
However, some recent work demonstrated that vFL is vulnerable to privacy
leakage even though only the forward intermediate embedding (rather than raw
features) and backpropagated gradients (rather than raw labels) are
communicated between the involved participants. As the raw labels often contain
highly sensitive information, several methods have recently been proposed to
effectively prevent label leakage from the backpropagated gradients in vFL.
However, these works only identify and defend against the threat of label
leakage from the backpropagated gradients; none of them addresses the problem
of label leakage from the intermediate embedding. In this paper, we propose a
practical label inference method which can effectively steal private labels
from the shared intermediate embedding even when existing protection methods
such as label differential privacy and gradient perturbation are applied. The
effectiveness of the label attack is inseparable from the
correlation between the intermediate embedding and corresponding private
labels. To mitigate the issue of label leakage from the forward embedding, we
add an additional optimization goal at the label party to limit the label
stealing ability of the adversary by minimizing the distance correlation
between the intermediate embedding and corresponding private labels. We
conducted extensive experiments to demonstrate the effectiveness of our proposed
protection methods.
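
For intuition, the kind of forward-embedding attack described in the abstract can be as simple as unsupervised clustering of the embeddings exchanged during the forward pass. The sketch below is a hypothetical, generic illustration (not the paper's exact attack): it clusters embeddings with k-means and maps each cluster to a label using a handful of samples whose labels the adversary already knows. All names (`infer_labels`, `known_idx`, etc.) are illustrative.

```python
# Hypothetical sketch of a forward-embedding label inference attack.
# This is a generic clustering-based illustration, not the method
# proposed in the paper.
import numpy as np
from sklearn.cluster import KMeans

def infer_labels(embeddings, known_idx, known_labels, num_classes):
    """embeddings: (n, d) intermediate embeddings observed by the adversary.
    known_idx / known_labels: a small auxiliary set of samples whose
    labels the adversary already knows (a common attack assumption)."""
    clusters = KMeans(n_clusters=num_classes, n_init=10).fit_predict(embeddings)

    # Map each cluster to the majority label among the known samples it
    # contains; fall back to label 0 when a cluster holds no known sample.
    mapping = {}
    for c in range(num_classes):
        votes = [l for i, l in zip(known_idx, known_labels) if clusters[i] == c]
        mapping[c] = max(set(votes), key=votes.count) if votes else 0

    return np.array([mapping[c] for c in clusters])
```

Such an attack only works to the extent that embeddings of same-label samples lie close together, which is exactly the correlation the paper's defense targets.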
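The proposed defense adds a distance-correlation penalty to the label party's objective. Below is a minimal sketch of an empirical distance correlation loss in PyTorch, assuming one-hot labels; names such as `top_model` and `alpha` are illustrative, and the paper's exact estimator and training setup may differ.

```python
import torch
import torch.nn.functional as F

def distance_correlation(X, Y, eps=1e-9):
    """Empirical distance correlation between batches X (n, d) and Y (n, k).
    Near 0 when X and Y are (nearly) independent, up to 1 when strongly related."""
    def doubly_centered_dist(Z):
        d = torch.cdist(Z, Z)  # pairwise Euclidean distances
        return d - d.mean(0, keepdim=True) - d.mean(1, keepdim=True) + d.mean()

    A, B = doubly_centered_dist(X), doubly_centered_dist(Y)
    dcov2_xy = (A * B).mean()                      # squared distance covariance
    dvar2_x, dvar2_y = (A * A).mean(), (B * B).mean()
    dcor2 = dcov2_xy.clamp(min=0.0) / (torch.sqrt(dvar2_x * dvar2_y) + eps)
    return torch.sqrt(dcor2 + eps)

# Label party's objective: task loss plus the decorrelation penalty, so the
# gradient sent back to the non-label party pushes the embedding away from
# encoding the private labels. `alpha` trades utility for privacy.
def label_party_loss(top_model, embedding, labels, num_classes, alpha=0.1):
    logits = top_model(embedding)
    y_onehot = F.one_hot(labels, num_classes).float()
    return F.cross_entropy(logits, labels) + alpha * distance_correlation(embedding, y_onehot)
```

Since distance correlation is computed per batch and is differentiable through `torch.cdist`, the penalty can be minimized with ordinary SGD alongside the task loss.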
Related papers
- Mixed Blessing: Class-Wise Embedding guided Instance-Dependent Partial Label Learning [53.64180787439527]
In partial label learning (PLL), every sample is associated with a candidate label set comprising the ground-truth label and several noisy labels.
For the first time, we create class-wise embeddings for each sample, which allow us to explore the relationship of instance-dependent noisy labels.
To reduce the high label ambiguity, we introduce the concept of class prototypes containing global feature information.
arXiv Detail & Related papers (2024-12-06T13:25:39Z)
- Learning from Concealed Labels [5.235218636685312]
We propose a novel setting to protect the privacy of each instance, namely learning from concealed labels for multi-class classification.
Concealed labels prevent sensitive labels from appearing in the label set during the label collection stage: the setting specifies 'none' and some randomly sampled insensitive labels as the concealed label set used to annotate sensitive data.
arXiv Detail & Related papers (2024-12-03T08:00:19Z)
- KDk: A Defense Mechanism Against Label Inference Attacks in Vertical Federated Learning [2.765106384328772]
In a Vertical Federated Learning (VFL) scenario, the labels of the samples are kept private from all parties except the aggregating server, which is the label owner.
Recent works discovered that by exploiting gradient information returned by the server to bottom models, an adversary can infer the private labels.
We propose a novel framework called KDk, which combines Knowledge Distillation and k-anonymity to provide a defense mechanism.
arXiv Detail & Related papers (2024-04-18T17:51:02Z)
- Defending Label Inference Attacks in Split Learning under Regression Setting [20.77178463903939]
Split Learning is a privacy-preserving method for implementing Vertical Federated Learning.
In this paper, we focus on label inference attacks in Split Learning under the regression setting.
We propose Random Label Extension (RLE), where labels are extended to obfuscate the label information contained in the gradients.
To further minimize the impact on the original task, we propose Model-based adaptive Label Extension (MLE), where original labels are preserved in the extended labels and dominate the training process.
arXiv Detail & Related papers (2023-08-18T10:22:31Z)
- Adversary-Aware Partial label learning with Label distillation [47.18584755798137]
We present Adversary-Aware Partial Label Learning and introduce the $\textit{rival}$, a set of noisy labels, into the collection of candidate labels for each instance.
Our method achieves promising results on the CIFAR10, CIFAR100 and CUB200 datasets.
arXiv Detail & Related papers (2023-04-02T10:18:30Z)
- Label Inference Attack against Split Learning under Regression Setting [24.287752556622312]
We study label leakage in the regression setting, where the private labels are continuous values.
We propose a novel learning-based attack that integrates gradient information and extra learning regularization objectives.
arXiv Detail & Related papers (2023-01-18T03:17:24Z)
- Exploiting Completeness and Uncertainty of Pseudo Labels for Weakly Supervised Video Anomaly Detection [149.23913018423022]
Weakly supervised video anomaly detection aims to identify abnormal events in videos using only video-level labels.
Two-stage self-training methods have achieved significant improvements by self-generating pseudo labels.
We propose an enhancement framework by exploiting completeness and uncertainty properties for effective self-training.
arXiv Detail & Related papers (2022-12-08T05:53:53Z)
- Dist-PU: Positive-Unlabeled Learning from a Label Distribution Perspective [89.5370481649529]
We propose a label distribution perspective for PU learning in this paper.
Motivated by this perspective, we pursue consistency between the predicted and ground-truth label distributions.
Experiments on three benchmark datasets validate the effectiveness of the proposed method.
arXiv Detail & Related papers (2022-12-06T07:38:29Z)
- Protecting Split Learning by Potential Energy Loss [70.81375125791979]
We focus on the privacy leakage from the forward embeddings of split learning.
We propose the potential energy loss to make the forward embeddings more 'complicated'.
arXiv Detail & Related papers (2022-10-18T06:21:11Z)
- Does Label Differential Privacy Prevent Label Inference Attacks? [26.87328379562665]
Label differential privacy (label-DP) is a popular framework for training private ML models on datasets with public features and sensitive private labels.
Despite its rigorous privacy guarantee, it has been observed that in practice label-DP does not preclude label inference attacks (LIAs).
arXiv Detail & Related papers (2022-02-25T20:57:29Z)
- Does label smoothing mitigate label noise? [57.76529645344897]
We show that label smoothing is competitive with loss-correction under label noise.
We show that when distilling models from noisy data, label smoothing of the teacher is beneficial.
arXiv Detail & Related papers (2020-03-05T18:43:17Z)