Similarity-based Label Inference Attack against Training and Inference of Split Learning
- URL: http://arxiv.org/abs/2203.05222v2
- Date: Fri, 22 Mar 2024 09:38:29 GMT
- Title: Similarity-based Label Inference Attack against Training and Inference of Split Learning
- Authors: Junlin Liu, Xinchen Lyu, Qimei Cui, Xiaofeng Tao
- Abstract summary: Split learning is a promising paradigm for privacy-preserving distributed learning.
This paper shows that the exchanged intermediate results, including smashed data, can already reveal the private labels.
We propose three label inference attacks to efficiently recover the private labels during both the training and inference phases.
- Score: 13.104547182351332
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Split learning is a promising paradigm for privacy-preserving distributed learning. The learning model can be cut into multiple portions to be collaboratively trained at the participants by exchanging only the intermediate results at the cut layer. Understanding the security performance of split learning is critical for many privacy-sensitive applications. This paper shows that the exchanged intermediate results, including the smashed data (i.e., extracted features from the raw data) and gradients during training and inference of split learning, can already reveal the private labels. We mathematically analyze the potential label leakages and propose the cosine and Euclidean similarity measurements for gradients and smashed data, respectively. Then, the two similarity measurements are shown to be unified in Euclidean space. Based on the similarity metric, we design three label inference attacks to efficiently recover the private labels during both the training and inference phases. Experimental results validate that the proposed approaches can achieve close to 100% accuracy of label attacks. The proposed attack can still achieve accurate predictions against various state-of-the-art defense mechanisms, including DP-SGD, label differential privacy, gradient compression, and Marvell.
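A minimal sketch of the attack idea (not the authors' implementation): a party holding the cut layer can group the smashed data it observes by Euclidean distance and name each group using a few samples whose labels it already knows; for gradients, cosine similarity plays the same role, and after unit normalization Euclidean distance orders pairs exactly as cosine similarity does, which is the sense in which the two measurements are unified. The function names, the use of k-means, and the known-label assumption below are illustrative, not taken from the paper.
```python
# Illustrative sketch only -- not the paper's code. Assumes the attacker sees
# per-sample smashed data (cut-layer features) and knows the labels of a few
# samples; all helper names are hypothetical.
import numpy as np
from sklearn.cluster import KMeans

def infer_labels(smashed, known_idx, known_labels, num_classes, seed=0):
    """Cluster smashed data by Euclidean distance, then label each cluster by
    majority vote over the few samples whose labels are known."""
    clusters = KMeans(n_clusters=num_classes, n_init=10,
                      random_state=seed).fit_predict(smashed)
    inferred = np.full(len(smashed), -1, dtype=int)
    for c in range(num_classes):
        votes = [lab for i, lab in zip(known_idx, known_labels) if clusters[i] == c]
        if votes:
            inferred[clusters == c] = np.bincount(votes).argmax()
    return inferred

def unit_normalize(grads):
    """For cut-layer gradients the paper measures cosine similarity; on unit
    vectors squared Euclidean distance equals 2 - 2*cos, so the same
    Euclidean clustering applies after normalization."""
    return grads / np.linalg.norm(grads, axis=1, keepdims=True)
```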
Related papers
- Training on Fake Labels: Mitigating Label Leakage in Split Learning via Secure Dimension Transformation [10.404379188947383]
Two-party split learning has been shown to be vulnerable to label inference attacks.
We propose a novel two-party split learning method to defend against existing label inference attacks.
arXiv Detail & Related papers (2024-10-11T09:25:21Z)
- LabObf: A Label Protection Scheme for Vertical Federated Learning Through Label Obfuscation [10.224977496821154]
Split Neural Network is popular in industry due to its privacy-preserving characteristics.
However, malicious participants may still infer label information from the uploaded embeddings, leading to privacy leakage.
We propose a new label obfuscation defense strategy, called LabObf, which randomly maps each original integer-valued label to multiple real-valued soft labels.
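A rough illustration of that mapping follows; everything beyond "randomly map each integer label to several real-valued soft labels" (pool size, noise scale, helper names) is an assumption of this sketch, not the LabObf construction.
```python
# Hypothetical sketch of label obfuscation in the spirit described above;
# NOT the LabObf scheme, only an illustration of the stated idea.
import numpy as np

rng = np.random.default_rng(0)

def build_soft_label_pool(num_classes, per_class=4, noise=0.5):
    """Draw several random real-valued soft-label vectors for each class."""
    return {c: [np.eye(num_classes)[c] + noise * rng.random(num_classes)
                for _ in range(per_class)]
            for c in range(num_classes)}

def obfuscate(labels, pool):
    """Replace each integer label with one of its randomly drawn soft labels."""
    return np.stack([pool[int(y)][rng.integers(len(pool[int(y)]))]
                     for y in labels])
```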
arXiv Detail & Related papers (2024-05-27T10:54:42Z)
- JointMatch: A Unified Approach for Diverse and Collaborative Pseudo-Labeling to Semi-Supervised Text Classification [65.268245109828]
Semi-supervised text classification (SSTC) has gained increasing attention due to its ability to leverage unlabeled data.
Existing approaches based on pseudo-labeling suffer from the issues of pseudo-label bias and error accumulation.
We propose JointMatch, a holistic approach for SSTC that addresses these challenges by unifying ideas from recent semi-supervised learning.
arXiv Detail & Related papers (2023-10-23T05:43:35Z)
- Defending Label Inference Attacks in Split Learning under Regression Setting [20.77178463903939]
Split Learning is a privacy-preserving method for implementing Vertical Federated Learning.
In this paper, we focus on label inference attacks in Split Learning under regression setting.
We propose Random Label Extension (RLE), where labels are extended to obfuscate the label information contained in the gradients.
To further minimize the impact on the original task, we propose Model-based adaptive Label Extension (MLE), where original labels are preserved in the extended labels and dominate the training process.
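One plausible reading of the extension idea just described, for illustration only; the column layout, dimension count, and training note are assumptions of this sketch rather than details from the paper.
```python
# Illustrative sketch only -- one possible form of Random Label Extension, not
# the paper's construction. A scalar regression target is padded with random
# components so the cut-layer gradients no longer depend on the true label alone.
import numpy as np

rng = np.random.default_rng(0)

def random_label_extension(y, extra_dims=7):
    """Extend a (batch,) regression target to shape (batch, 1 + extra_dims):
    column 0 keeps the true label (the MLE variant above additionally lets it
    dominate training), the remaining columns are random padding."""
    y = np.asarray(y, dtype=float).reshape(-1, 1)
    return np.concatenate([y, rng.normal(size=(y.shape[0], extra_dims))], axis=1)

# The label party would then regress the extended target (e.g. with MSE) and
# read off only column 0 at inference time -- an assumption of this sketch.
```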
arXiv Detail & Related papers (2023-08-18T10:22:31Z)
- All Points Matter: Entropy-Regularized Distribution Alignment for Weakly-supervised 3D Segmentation [67.30502812804271]
Pseudo-labels are widely employed in weakly supervised 3D segmentation tasks where only sparse ground-truth labels are available for learning.
We propose a novel learning strategy to regularize the generated pseudo-labels and effectively narrow the gaps between pseudo-labels and model predictions.
arXiv Detail & Related papers (2023-05-25T08:19:31Z)
- Label Inference Attack against Split Learning under Regression Setting [24.287752556622312]
We study the leakage in the scenario of the regression model, where the private labels are continuous numbers.
We propose a novel learning-based attack that integrates gradient information and extra learning regularization objectives.
arXiv Detail & Related papers (2023-01-18T03:17:24Z)
- Dense FixMatch: a simple semi-supervised learning method for pixel-wise prediction tasks [68.36996813591425]
We propose Dense FixMatch, a simple method for online semi-supervised learning of dense and structured prediction tasks.
We enable the application of FixMatch in semi-supervised learning problems beyond image classification by adding a matching operation on the pseudo-labels.
Dense FixMatch significantly improves results compared to supervised learning using only labeled data, approaching its performance with 1/4 of the labeled samples.
arXiv Detail & Related papers (2022-10-18T15:02:51Z)
- Resolving label uncertainty with implicit posterior models [71.62113762278963]
We propose a method for jointly inferring labels across a collection of data samples.
By implicitly assuming the existence of a generative model for which a differentiable predictor is the posterior, we derive a training objective that allows learning under weak beliefs.
arXiv Detail & Related papers (2022-02-28T18:09:44Z)
- Gradient Inversion Attack: Leaking Private Labels in Two-Party Split Learning [12.335698325757491]
We propose a label leakage attack that allows an adversarial input owner to learn the label owner's private labels.
Our attack can uncover the private label data on several multi-class image classification problems and a binary conversion prediction task with near-perfect accuracy.
While this technique is effective for simpler datasets, it significantly degrades utility for datasets with higher input dimensionality.
arXiv Detail & Related papers (2021-11-25T16:09:59Z)
- OpenCoS: Contrastive Semi-supervised Learning for Handling Open-set Unlabeled Data [65.19205979542305]
Unlabeled data may include out-of-class samples in practice.
OpenCoS is a method for handling this realistic semi-supervised learning scenario.
arXiv Detail & Related papers (2021-06-29T06:10:05Z)
- DivideMix: Learning with Noisy Labels as Semi-supervised Learning [111.03364864022261]
We propose DivideMix, a framework for learning with noisy labels.
Experiments on multiple benchmark datasets demonstrate substantial improvements over state-of-the-art methods.
arXiv Detail & Related papers (2020-02-18T06:20:06Z)
This list is automatically generated from the titles and abstracts of the papers in this site.