BlindU: Blind Machine Unlearning without Revealing Erasing Data
- URL: http://arxiv.org/abs/2601.07214v1
- Date: Mon, 12 Jan 2026 05:09:09 GMT
- Title: BlindU: Blind Machine Unlearning without Revealing Erasing Data
- Authors: Weiqi Wang, Zhiyi Tian, Chenhan Zhang, Shui Yu
- Abstract summary: Machine unlearning enables data holders to remove the contribution of their specified samples from trained models to protect their privacy. Most unlearning methods require the unlearning requesters to upload their data to the server as a prerequisite for unlearning. We propose Blind Unlearning (BlindU), which carries out unlearning using compressed representations instead of original inputs.
- Score: 29.11439332064202
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Machine unlearning enables data holders to remove the contribution of their specified samples from trained models to protect their privacy. However, it is paradoxical that most unlearning methods require the unlearning requesters to first upload their data to the server as a prerequisite for unlearning. These methods are infeasible in many privacy-preserving scenarios where servers are prohibited from accessing users' data, such as federated learning (FL). In this paper, we explore how to implement unlearning without revealing the erasing data to the server. We propose Blind Unlearning (BlindU), which carries out unlearning using compressed representations instead of original inputs. BlindU involves only the server and the unlearning user: the user locally generates privacy-preserving representations, and the server performs unlearning solely on these representations and their labels. For FL model training, we employ the information bottleneck (IB) mechanism. The encoder of the IB-based FL model learns representations that discard maximal task-irrelevant information from the inputs, allowing FL users to generate compressed representations locally. For effective unlearning on compressed representations, BlindU integrates two dedicated unlearning modules tailored explicitly to IB-based models and uses a multiple gradient descent algorithm to balance forgetting and utility retention. While IB compression already protects task-irrelevant information in the inputs, to further strengthen privacy protection we introduce a noise-free differential privacy (DP) masking method applied to the raw erasing data before compression. Theoretical analysis and extensive experimental results illustrate the superiority of BlindU in privacy protection and unlearning effectiveness compared with the best existing privacy-preserving unlearning benchmarks.
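To make the described workflow concrete, the sketch below (in PyTorch) shows one way the pieces could fit together: a client-side information-bottleneck encoder that emits compressed representations, and a server-side unlearning step that balances a forgetting gradient against a retention gradient with a two-task min-norm (MGDA-style) combination. All names (IBEncoder, min_norm_weight, unlearn_step), architectures, losses, and hyperparameters are illustrative assumptions rather than the authors' implementation, and the paper's noise-free DP masking of the raw erasing data before encoding is omitted because its construction is specific to the paper.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class IBEncoder(nn.Module):
    """Client-side information-bottleneck encoder: maps raw inputs to a
    compressed stochastic representation z ~ N(mu, sigma^2)."""
    def __init__(self, in_dim, z_dim):
        super().__init__()
        self.backbone = nn.Sequential(nn.Linear(in_dim, 256), nn.ReLU())
        self.mu = nn.Linear(256, z_dim)
        self.logvar = nn.Linear(256, z_dim)

    def forward(self, x):
        h = self.backbone(x)
        mu, logvar = self.mu(h), self.logvar(h)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)  # reparameterization trick
        # KL(q(z|x) || N(0, I)) serves as the IB compression penalty during FL training,
        # encouraging z to retain little task-irrelevant information about x.
        kl = 0.5 * (mu.pow(2) + logvar.exp() - 1.0 - logvar).sum(dim=1).mean()
        return z, kl

def min_norm_weight(g_forget, g_retain):
    """Two-task min-norm (MGDA-style) weight: alpha in [0, 1] minimizing
    ||alpha * g_forget + (1 - alpha) * g_retain||^2."""
    diff = g_forget - g_retain
    alpha = torch.dot(g_retain - g_forget, g_retain) / (diff.dot(diff) + 1e-12)
    return float(alpha.clamp(0.0, 1.0))

def unlearn_step(classifier, z_erase, y_erase, z_retain, y_retain, lr=1e-3):
    """Server-side unlearning step: the server sees only compressed representations
    and labels, never the raw erasing data. A forgetting gradient (ascent on the
    erased samples' loss) is balanced against a retention gradient (descent on the
    remaining samples' loss) before a single manual SGD update."""
    params = [p for p in classifier.parameters() if p.requires_grad]

    # Forgetting objective: gradient ascent on the erased representations.
    loss_forget = -F.cross_entropy(classifier(z_erase), y_erase)
    g_f = torch.cat([g.flatten() for g in torch.autograd.grad(loss_forget, params)])

    # Retention objective: ordinary cross-entropy on retained representations.
    loss_retain = F.cross_entropy(classifier(z_retain), y_retain)
    g_r = torch.cat([g.flatten() for g in torch.autograd.grad(loss_retain, params)])

    alpha = min_norm_weight(g_f, g_r)
    combined = alpha * g_f + (1.0 - alpha) * g_r
    with torch.no_grad():  # apply the balanced gradient as a plain SGD step
        offset = 0
        for p in params:
            n = p.numel()
            p -= lr * combined[offset:offset + n].view_as(p)
            offset += n
```

In this reading, the client keeps the raw data on-device and uploads only (z, y) pairs from IBEncoder, while the server repeats unlearn_step until the forgetting and retention objectives stabilize; the specific forgetting loss and update rule above are placeholders for the paper's two dedicated unlearning modules.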
Related papers
- Data-Free Privacy-Preserving for LLMs via Model Inversion and Selective Unlearning [27.452191507918148]
Large language models (LLMs) exhibit powerful capabilities but risk memorizing sensitive personally identifiable information (PII) from their training data.
We propose Data-Free Selective Unlearning (DFSU), a novel privacy-preserving framework that removes sensitive PII from an LLM without requiring its training data.
Our approach first synthesizes pseudo-PII through language model inversion, then constructs token-level privacy masks for these synthetic samples, and finally performs token-level selective unlearning.
arXiv Detail & Related papers (2026-01-22T02:43:12Z)
- T2UE: Generating Unlearnable Examples from Text Descriptions [60.111026156038264]
Unlearnable Examples (UEs) have emerged as a promising countermeasure against unauthorized model training.
We introduce Text-to-Unlearnable Example (T2UE), a novel framework that enables users to generate UEs using only text descriptions.
arXiv Detail & Related papers (2025-08-05T05:10:14Z)
- Blockchain-enabled Trustworthy Federated Unlearning [50.01101423318312]
Federated unlearning is a promising paradigm for protecting the data ownership of distributed clients.
Existing works require central servers to retain the historical model parameters from distributed clients.
This paper proposes a new blockchain-enabled trustworthy federated unlearning framework.
arXiv Detail & Related papers (2024-01-29T07:04:48Z)
- SecureCut: Federated Gradient Boosting Decision Trees with Efficient Machine Unlearning [10.011146979811752]
It has become imperative to enable data removal in Vertical Federated Learning (VFL) where multiple parties provide private features for model training.
In VFL, data removal, i.e., machine unlearning, often requires removing specific features across all samples under privacy guarantees.
We propose SecureCut, a novel Gradient Boosting Decision Tree (GBDT) framework that effectively enables both instance unlearning and feature unlearning without the need for retraining from scratch.
arXiv Detail & Related papers (2023-11-22T05:38:53Z)
- Privacy Side Channels in Machine Learning Systems [87.53240071195168]
We introduce privacy side channels: attacks that exploit system-level components to extract private information.
For example, we show that deduplicating training data before applying differentially-private training creates a side-channel that completely invalidates any provable privacy guarantees.
We further show that systems which block language models from regenerating training data can be exploited to exfiltrate private keys contained in the training set.
arXiv Detail & Related papers (2023-09-11T16:49:05Z)
- Exploring One-shot Semi-supervised Federated Learning with A Pre-trained Diffusion Model [40.83058938096914]
We propose FedDISC, a Federated Diffusion-Inspired Semi-supervised Co-training method.
We first extract prototypes of the labeled server data and use these prototypes to predict pseudo-labels of the client data.
For each category, we compute the cluster centroids and domain-specific representations to signify the semantic and stylistic information of their distributions.
These representations are sent back to the server, which uses the pre-trained diffusion model to generate synthetic datasets complying with the client distributions and trains a global model on them.
arXiv Detail & Related papers (2023-05-06T14:22:33Z)
- Unlearnable Clusters: Towards Label-agnostic Unlearnable Examples [128.25509832644025]
There is a growing interest in developing unlearnable examples (UEs) against visual privacy leaks on the Internet.
UEs are training samples with invisible but unlearnable noise added, which has been found to prevent unauthorized training of machine learning models.
We present a novel technique called Unlearnable Clusters (UCs) to generate label-agnostic unlearnable examples with cluster-wise perturbations.
arXiv Detail & Related papers (2022-12-31T04:26:25Z)
- Self-supervised On-device Federated Learning from Unlabeled Streams [15.94978097767473]
We propose a Self-supervised On-device Federated learning framework with coreset selection, which we call SOFed, to automatically select a coreset.
Experiments demonstrate the effectiveness and significance of the proposed method in visual representation learning.
arXiv Detail & Related papers (2022-12-02T07:22:00Z)
- Attribute Inference Attack of Speech Emotion Recognition in Federated Learning Settings [56.93025161787725]
Federated learning (FL) is a distributed machine learning paradigm that coordinates clients to train a model collaboratively without sharing local data.
We propose an attribute inference attack framework that infers sensitive attribute information of the clients from shared gradients or model parameters.
We show that the attribute inference attack is achievable for SER systems trained using FL.
arXiv Detail & Related papers (2021-12-26T16:50:42Z)
- FedSEAL: Semi-Supervised Federated Learning with Self-Ensemble Learning and Negative Learning [7.771967424619346]
Federated learning (FL) is a popular decentralized and privacy-preserving machine learning framework.
In this paper, we propose a new FL algorithm, called FedSEAL, to solve this Semi-Supervised Federated Learning (SSFL) problem.
Our algorithm utilizes self-ensemble learning and complementary negative learning to enhance both the accuracy and the efficiency of clients' unsupervised learning on unlabeled data.
arXiv Detail & Related papers (2021-10-15T03:03:23Z)
- Privacy Adversarial Network: Representation Learning for Mobile Data Privacy [33.75500773909694]
A growing number of cloud-based intelligent services for mobile users require user data to be sent to the provider.
Prior works either obfuscate the data, e.g. add noise and remove identity information, or send representations extracted from the data, e.g. anonymized features.
This work departs from prior works in methodology: we leverage adversarial learning to achieve a better balance between privacy and utility.
arXiv Detail & Related papers (2020-06-08T09:42:04Z)
- TIPRDC: Task-Independent Privacy-Respecting Data Crowdsourcing Framework for Deep Learning with Anonymized Intermediate Representations [49.20701800683092]
We present TIPRDC, a task-independent privacy-respecting data crowdsourcing framework with anonymized intermediate representation.
The goal of this framework is to learn a feature extractor that hides private information from the intermediate representations, while maximally retaining the original information embedded in the raw data so that the data collector can accomplish unknown learning tasks.
arXiv Detail & Related papers (2020-05-23T06:21:26Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.