Training on Fake Labels: Mitigating Label Leakage in Split Learning via Secure Dimension Transformation
- URL: http://arxiv.org/abs/2410.09125v1
- Date: Fri, 11 Oct 2024 09:25:21 GMT
- Title: Training on Fake Labels: Mitigating Label Leakage in Split Learning via Secure Dimension Transformation
- Authors: Yukun Jiang, Peiran Wang, Chengguo Lin, Ziyue Huang, Yong Cheng,
- Abstract summary: Two-party split learning has recently been shown to be vulnerable to label inference attacks.
We propose a novel two-party split learning method to defend against existing label inference attacks.
- Score: 10.404379188947383
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Two-party split learning has emerged as a popular paradigm for vertical federated learning. To preserve the privacy of the label owner, split learning uses a split model, which only requires the two parties to exchange intermediate representations (IRs) computed from the inputs, and the gradients for each IR, during the learning process. However, split learning has recently been shown to be vulnerable to label inference attacks. Although several defense methods could be adopted, they either offer limited defensive performance or significantly harm the original learning task. In this paper, we propose a novel two-party split learning method that defends against existing label inference attacks while maintaining the high utility of the learned models. Specifically, we first craft a dimension transformation module, SecDT, which achieves a bidirectional mapping between the original labels and an enlarged set of K classes to mitigate label leakage from the directional perspective. Then, a gradient normalization algorithm is designed to remove the magnitude divergence of gradients from different classes. We further propose softmax-normalized Gaussian noise to mitigate privacy leakage and keep K unknown to adversaries. We conducted experiments on real-world datasets, including two binary-classification datasets (Avazu and Criteo) and three multi-class datasets (MNIST, FashionMNIST, CIFAR-10), and considered current attack schemes, including direction, norm, spectral, and model completion attacks. The detailed experiments demonstrate our proposed method's effectiveness and superiority over existing approaches. For instance, on the Avazu dataset, the attack AUC of the four evaluated prominent attacks is reduced by 0.4532 ± 0.0127.
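As a rough illustration of the dimension-transformation idea described in the abstract, the sketch below maps original labels to a randomly enlarged set of K fake classes, maps predicted probabilities back to the original classes, and rescales per-sample gradient magnitudes. This is a minimal sketch under assumptions, not the authors' SecDT implementation: the bucket-style partition, the back-mapping by probability summation, the unit-norm rescaling, and all function names are illustrative, and the softmax-normalized Gaussian noise component is omitted because its exact placement is specified in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def build_label_mapping(num_classes: int, K: int, rng):
    """Randomly partition K fake classes among the original classes.

    Returns (to_fake, to_real): to_fake[c] lists the fake classes assigned to
    original class c; to_real[k] gives the original class behind fake class k.
    """
    assert K >= num_classes
    buckets = np.array_split(rng.permutation(K), num_classes)
    to_fake = {c: [int(k) for k in bucket] for c, bucket in enumerate(buckets)}
    to_real = {k: c for c, bucket in to_fake.items() for k in bucket}
    return to_fake, to_real

def to_fake_labels(y, to_fake, rng):
    """Replace each original label with a random fake class from its bucket,
    so the label party trains its top model on K-class fake labels."""
    return np.array([rng.choice(to_fake[int(c)]) for c in y])

def to_real_probs(fake_probs, to_fake, num_classes: int):
    """Map K-class probabilities back to the original classes by summing the
    probability mass inside each class's bucket."""
    real = np.zeros((fake_probs.shape[0], num_classes))
    for c, bucket in to_fake.items():
        real[:, c] = fake_probs[:, bucket].sum(axis=1)
    return real

def normalize_gradients(per_sample_grads, eps=1e-12):
    """Rescale each per-sample gradient to unit L2 norm, removing the
    magnitude divergence between classes that norm-based attacks exploit."""
    norms = np.linalg.norm(per_sample_grads, axis=1, keepdims=True)
    return per_sample_grads / (norms + eps)

# Tiny usage example on binary labels with K = 6 fake classes.
to_fake, to_real = build_label_mapping(num_classes=2, K=6, rng=rng)
y = np.array([0, 1, 1, 0])
y_fake = to_fake_labels(y, to_fake, rng)            # fake labels drawn from each class's bucket
probs_fake = rng.dirichlet(np.ones(6), size=4)      # stand-in for the top model's K-class output
probs_real = to_real_probs(probs_fake, to_fake, 2)  # back to 2-class probabilities
assert np.allclose(probs_real.sum(axis=1), 1.0)
```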
Related papers
- Towards Modality-agnostic Label-efficient Segmentation with Entropy-Regularized Distribution Alignment [62.73503467108322]
This topic is widely studied in 3D point cloud segmentation due to the difficulty of annotating point clouds densely.
Until recently, pseudo-labels have been widely employed to facilitate training with limited ground-truth labels.
Existing pseudo-labeling approaches could suffer heavily from the noises and variations in unlabelled data.
We propose a novel learning strategy to regularize the pseudo-labels generated for training, thus effectively narrowing the gaps between pseudo-labels and model predictions.
arXiv Detail & Related papers (2024-08-29T13:31:15Z)
- JointMatch: A Unified Approach for Diverse and Collaborative Pseudo-Labeling to Semi-Supervised Text Classification [65.268245109828]
Semi-supervised text classification (SSTC) has gained increasing attention due to its ability to leverage unlabeled data.
Existing approaches based on pseudo-labeling suffer from the issues of pseudo-label bias and error accumulation.
We propose JointMatch, a holistic approach for SSTC that addresses these challenges by unifying ideas from recent semi-supervised learning.
arXiv Detail & Related papers (2023-10-23T05:43:35Z)
- Defending Label Inference Attacks in Split Learning under Regression Setting [20.77178463903939]
Split Learning is a privacy-preserving method for implementing Vertical Federated Learning.
In this paper, we focus on label inference attacks in Split Learning under regression setting.
We propose Random Label Extension (RLE), where labels are extended to obfuscate the label information contained in the gradients.
To further minimize the impact on the original task, we propose Model-based adaptive Label Extension (MLE), where original labels are preserved in the extended labels and dominate the training process (a minimal sketch of this label-extension idea appears after the list below).
arXiv Detail & Related papers (2023-08-18T10:22:31Z)
- Label Inference Attack against Split Learning under Regression Setting [24.287752556622312]
We study the leakage in the scenario of the regression model, where the private labels are continuous numbers.
We propose a novel learning-based attack that integrates gradient information and extra learning regularization objectives.
arXiv Detail & Related papers (2023-01-18T03:17:24Z)
- Spacing Loss for Discovering Novel Categories [72.52222295216062]
Novel Class Discovery (NCD) is a learning paradigm, where a machine learning model is tasked to semantically group instances from unlabeled data.
We first characterize existing NCD approaches into single-stage and two-stage methods based on whether they require access to labeled and unlabeled data together.
We devise a simple yet powerful loss function that enforces separability in the latent space using cues from multi-dimensional scaling.
arXiv Detail & Related papers (2022-04-22T09:37:11Z)
- Similarity-based Label Inference Attack against Training and Inference of Split Learning [13.104547182351332]
Split learning is a promising paradigm for privacy-preserving distributed learning.
This paper shows that the exchanged intermediate results, including smashed data, can already reveal the private labels.
We propose three label inference attacks to efficiently recover the private labels during both the training and inference phases.
arXiv Detail & Related papers (2022-03-10T08:02:03Z)
- Defending Label Inference and Backdoor Attacks in Vertical Federated Learning [11.319694528089773]
In collaborative learning, parties may be honest but curious, attempting to infer other parties' private data through inference attacks.
In this paper, we show that private labels can be reconstructed from per-sample gradients.
We introduce a novel technique termed confusional autoencoder (CoAE), based on autoencoders and entropy regularization.
arXiv Detail & Related papers (2021-12-10T09:32:09Z)
- Gradient Inversion Attack: Leaking Private Labels in Two-Party Split Learning [12.335698325757491]
We propose a label leakage attack that allows an adversarial input owner to learn the label owner's private labels.
Our attack can uncover the private label data on several multi-class image classification problems and a binary conversion prediction task with near-perfect accuracy.
While this technique is effective for simpler datasets, it significantly degrades utility for datasets with higher input dimensionality.
arXiv Detail & Related papers (2021-11-25T16:09:59Z)
- Staircase Sign Method for Boosting Adversarial Attacks [123.19227129979943]
Crafting adversarial examples for the transfer-based attack is challenging and remains a research hot spot.
We propose a novel Staircase Sign Method (S$^2$M) to alleviate this issue, thus boosting transfer-based attacks.
Our method can be generally integrated into any transfer-based attacks, and the computational overhead is negligible.
arXiv Detail & Related papers (2021-04-20T02:31:55Z)
- Adversarial Self-Supervised Contrastive Learning [62.17538130778111]
Existing adversarial learning approaches mostly use class labels to generate adversarial samples that lead to incorrect predictions.
We propose a novel adversarial attack for unlabeled data, which makes the model confuse the instance-level identities of the perturbed data samples.
We present a self-supervised contrastive learning framework to adversarially train a robust neural network without labeled data.
arXiv Detail & Related papers (2020-06-13T08:24:33Z)
- Attentive WaveBlock: Complementarity-enhanced Mutual Networks for Unsupervised Domain Adaptation in Person Re-identification and Beyond [97.25179345878443]
This paper proposes a novel lightweight module, the Attentive WaveBlock (AWB).
AWB can be integrated into the dual networks of mutual learning to enhance the complementarity and further depress noise in the pseudo-labels.
Experiments demonstrate that the proposed method achieves state-of-the-art performance with significant improvements on multiple UDA person re-identification tasks.
arXiv Detail & Related papers (2020-06-11T15:40:40Z)
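As a rough, hypothetical sketch of the Random Label Extension idea summarized above (RLE/MLE, "Defending Label Inference Attacks in Split Learning under Regression Setting"), the snippet below pads each scalar regression label with random extra targets and uses a loss in which the original label's dimension dominates. The extension width, the weighting scheme, and all names are assumptions for illustration, not that paper's actual construction.

```python
import numpy as np

rng = np.random.default_rng(1)

def extend_labels(y, extra_dims: int, rng, scale: float = 1.0):
    """RLE-style extension: pad each scalar label with random extra targets so
    the gradients returned to the other party mix true and fake label signal."""
    fake = rng.normal(0.0, scale, size=(y.shape[0], extra_dims))
    return np.concatenate([y.reshape(-1, 1), fake], axis=1)

def weighted_mse(pred, y_ext, main_weight: float = 0.9):
    """MLE-style loss: the original label sits in dimension 0 and dominates
    training, limiting the utility cost of the extension."""
    d = y_ext.shape[1]
    weights = np.full(d, (1.0 - main_weight) / (d - 1))
    weights[0] = main_weight
    return float(np.mean(((pred - y_ext) ** 2) * weights))

# Tiny usage example: 4 samples, labels extended by 3 random dimensions.
y = np.array([0.2, 1.5, -0.7, 3.1])
y_ext = extend_labels(y, extra_dims=3, rng=rng)
pred = rng.normal(size=y_ext.shape)  # stand-in for model output
loss = weighted_mse(pred, y_ext)
```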