LabObf: A Label Protection Scheme for Vertical Federated Learning Through Label Obfuscation
- URL: http://arxiv.org/abs/2405.17042v2
- Date: Mon, 22 Jul 2024 06:25:54 GMT
- Title: LabObf: A Label Protection Scheme for Vertical Federated Learning Through Label Obfuscation
- Authors: Ying He, Mingyang Niu, Jingyu Hua, Yunlong Mao, Xu Huang, Chen Li, Sheng Zhong
- Abstract summary: Split Neural Network is popular in industry due to its privacy-preserving characteristics.
However, malicious participants may still infer label information from the uploaded embeddings, leading to privacy leakage.
We propose a new label obfuscation defense strategy, called LabObf, which randomly maps each original integer-valued label to multiple real-valued soft labels.
- Score: 10.224977496821154
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Split Neural Network, as one of the most common architectures used in vertical federated learning, is popular in industry due to its privacy-preserving characteristics. In this architecture, the party holding the labels seeks cooperation from other parties to improve model performance due to insufficient feature data. Each of these participants has a self-defined bottom model to learn hidden representations from its own feature data and uploads the embedding vectors to the top model held by the label holder for final predictions. This design allows participants to conduct joint training without directly exchanging data. However, existing research points out that malicious participants may still infer label information from the uploaded embeddings, leading to privacy leakage. In this paper, we first propose an embedding extension attack that manipulates embeddings to undermine existing defense strategies, which rely on constraining the correlation between the embeddings uploaded by participants and the labels. Subsequently, we propose a new label obfuscation defense strategy, called 'LabObf', which randomly maps each original integer-valued label to multiple real-valued soft labels with values intertwined, significantly increasing the difficulty for attackers to infer the labels. We conduct experiments on four different types of datasets, and the results show that LabObf significantly reduces the attacker's success rate compared to raw models while maintaining desirable model accuracy.
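To make the obfuscation idea concrete, here is a minimal sketch of a LabObf-style mapping in Python. It is our own illustrative reading of the abstract, assuming a random secret codebook; the number of soft labels per class (`k`), their dimensionality, and the nearest-neighbor recovery rule are assumptions, not the paper's exact construction.

```python
import numpy as np

rng = np.random.default_rng(0)

def build_codebook(num_classes: int, k: int, dim: int) -> np.ndarray:
    # codebook[c, j] is the j-th soft-label vector for class c. Drawing all
    # vectors from one overlapping range "intertwines" the values across
    # classes, so soft-label magnitudes alone do not separate the classes.
    return rng.uniform(-1.0, 1.0, size=(num_classes, k, dim))

def obfuscate(labels: np.ndarray, codebook: np.ndarray) -> np.ndarray:
    # Randomly replace each integer label with one of its k soft labels.
    picks = rng.integers(0, codebook.shape[1], size=len(labels))
    return codebook[labels, picks]

def recover(soft: np.ndarray, codebook: np.ndarray) -> np.ndarray:
    # Only the label holder keeps the codebook, so only it can map a
    # predicted soft label back to the nearest class.
    num_classes, k, dim = codebook.shape
    flat = codebook.reshape(num_classes * k, dim)
    dists = np.linalg.norm(soft[:, None, :] - flat[None, :, :], axis=-1)
    return dists.argmin(axis=1) // k

codebook = build_codebook(num_classes=10, k=4, dim=8)
y = rng.integers(0, 10, size=32)
y_soft = obfuscate(y, codebook)   # the top model is trained against these
assert (recover(y_soft, codebook) == y).all()
```

Under this reading, the top model is trained with a regression-style loss against the soft labels, and an attacker observing embeddings or gradients faces many intertwined real-valued targets instead of a handful of integer classes.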
Related papers
- Training on Fake Labels: Mitigating Label Leakage in Split Learning via Secure Dimension Transformation [10.404379188947383]
Two-party split learning has been shown to be vulnerable to label inference attacks.
We propose a novel two-party split learning method to defend against existing label inference attacks.
arXiv Detail & Related papers (2024-10-11T09:25:21Z)
- Federated Learning with Only Positive Labels by Exploring Label Correlations [78.59613150221597]
Federated learning aims to collaboratively learn a model by using the data from multiple users under privacy constraints.
In this paper, we study the multi-label classification problem under the federated learning setting.
We propose a novel and generic method termed Federated Averaging by exploring Label Correlations (FedALC).
arXiv Detail & Related papers (2024-04-24T02:22:50Z)
- Defending Label Inference Attacks in Split Learning under Regression Setting [20.77178463903939]
Split Learning is a privacy-preserving method for implementing Vertical Federated Learning.
In this paper, we focus on label inference attacks in Split Learning under regression setting.
We propose Random Label Extension (RLE), where labels are extended to obfuscate the label information contained in the gradients.
To further minimize the impact on the original task, we propose Model-based adaptive Label Extension (MLE), where the original labels are preserved in the extended labels and dominate the training process (see the sketch after this entry).
arXiv Detail & Related papers (2023-08-18T10:22:31Z)
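The label-extension idea lends itself to a short sketch. The following is our own simplified rendering, assuming the true label is kept as the first component with a dominant loss weight (MLE-style) or blended into random components (RLE-style); the paper's exact extension and weighting may differ.

```python
import torch

def extend_labels(y: torch.Tensor, extra_dims: int, keep_original: bool = True,
                  weight: float = 0.9) -> tuple[torch.Tensor, torch.Tensor]:
    # Returns extended regression targets and per-dimension loss weights.
    noise = torch.rand(y.shape[0], extra_dims)  # random auxiliary targets
    if keep_original:
        # MLE-style (assumed): the true label is preserved as component 0 and
        # given a dominant weight, so the original task still drives training.
        targets = torch.cat([y.unsqueeze(1), noise], dim=1)
        weights = torch.cat([torch.tensor([weight]),
                             torch.full((extra_dims,), (1 - weight) / extra_dims)])
    else:
        # RLE-style (assumed): y is blended into random components, so no
        # single output dimension carries the private label verbatim.
        targets = noise + y.unsqueeze(1) * torch.rand(1, extra_dims)
        weights = torch.full((extra_dims,), 1.0 / extra_dims)
    return targets, weights

y = torch.randn(16)                      # private regression labels
targets, weights = extend_labels(y, extra_dims=4)
# Training would minimize a weighted MSE between the top model's multi-output
# prediction and `targets`; gradients sent back through the cut layer then
# reflect the extended targets rather than y alone.
```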
- ProtoCon: Pseudo-label Refinement via Online Clustering and Prototypical Consistency for Efficient Semi-supervised Learning [60.57998388590556]
ProtoCon is a novel method for confidence-based pseudo-labeling.
Online nature of ProtoCon allows it to utilise the label history of the entire dataset in one training cycle.
It delivers significant gains and faster convergence over the state of the art.
arXiv Detail & Related papers (2023-03-22T23:51:54Z)
- Label Inference Attack against Split Learning under Regression Setting [24.287752556622312]
We study the leakage in the scenario of the regression model, where the private labels are continuous numbers.
We propose a novel learning-based attack that integrates gradient information and extra learning regularization objectives.
arXiv Detail & Related papers (2023-01-18T03:17:24Z)
- Exploiting Completeness and Uncertainty of Pseudo Labels for Weakly Supervised Video Anomaly Detection [149.23913018423022]
Weakly supervised video anomaly detection aims to identify abnormal events in videos using only video-level labels.
Two-stage self-training methods have achieved significant improvements by self-generating pseudo labels.
We propose an enhancement framework by exploiting completeness and uncertainty properties for effective self-training.
arXiv Detail & Related papers (2022-12-08T05:53:53Z)
- Transductive CLIP with Class-Conditional Contrastive Learning [68.51078382124331]
We propose Transductive CLIP, a novel framework for learning a classification network with noisy labels from scratch.
A class-conditional contrastive learning mechanism is proposed to mitigate the reliance on pseudo labels.
Ensemble labels are adopted as a pseudo-label updating strategy to stabilize the training of deep neural networks with noisy labels.
arXiv Detail & Related papers (2022-06-13T14:04:57Z)
- Gradient Inversion Attack: Leaking Private Labels in Two-Party Split Learning [12.335698325757491]
We propose a label leakage attack that allows an adversarial input owner to learn the label owner's private labels.
Our attack can uncover the private label data on several multi-class image classification problems and a binary conversion prediction task with near-perfect accuracy.
While this technique is effective for simpler datasets, it significantly degrades utility for datasets with higher input dimensionality.
arXiv Detail & Related papers (2021-11-25T16:09:59Z)
- GuidedMix-Net: Learning to Improve Pseudo Masks Using Labeled Images as Reference [153.354332374204]
We propose a novel method for semi-supervised semantic segmentation named GuidedMix-Net.
We first introduce a feature alignment objective between labeled and unlabeled data to capture potentially similar image pairs.
MITrans is shown to be a powerful knowledge module for progressively refining the features of unlabeled data.
Along with supervised learning for labeled data, the prediction of unlabeled data is jointly learned with the generated pseudo masks.
arXiv Detail & Related papers (2021-06-29T02:48:45Z)
- User Label Leakage from Gradients in Federated Learning [12.239472997714804]
Federated learning enables multiple users to build a joint model by sharing their model updates (gradients).
We propose Label Leakage from Gradients (LLG), a novel attack to extract the labels of the users' training data from their shared gradients; the sketch below illustrates the underlying leakage.
arXiv Detail & Related papers (2021-05-19T19:21:05Z)
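The leakage that LLG builds on can be demonstrated in a few lines: with softmax cross-entropy, the gradient of the last layer's bias equals softmax(z) - onehot(y), so for a single example the one negative entry marks the private label. The sketch below shows this underlying property only; the full LLG attack additionally handles batched and aggregated gradients.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
num_classes, dim = 5, 16

w = torch.randn(num_classes, dim, requires_grad=True)
b = torch.zeros(num_classes, requires_grad=True)   # last-layer bias
x = torch.randn(1, dim)                            # one training example
y = torch.tensor([3])                              # the "private" label

loss = F.cross_entropy(x @ w.t() + b, y)
loss.backward()

# b.grad = softmax(logits) - onehot(y): entry y is p_y - 1 < 0, all others
# are positive probabilities, so the sign of the shared gradient leaks y.
inferred = int(torch.argmin(b.grad))
assert inferred == y.item()
```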
- Adversarial Self-Supervised Contrastive Learning [62.17538130778111]
Existing adversarial learning approaches mostly use class labels to generate adversarial samples that lead to incorrect predictions.
We propose a novel adversarial attack for unlabeled data, which makes the model confuse the instance-level identities of the perturbed data samples.
We present a self-supervised contrastive learning framework to adversarially train a robust neural network without labeled data.
arXiv Detail & Related papers (2020-06-13T08:24:33Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it presents and is not responsible for any consequences.