Teleportation-Based Defenses for Privacy in Approximate Machine Unlearning
- URL: http://arxiv.org/abs/2512.00272v1
- Date: Sat, 29 Nov 2025 01:50:33 GMT
- Title: Teleportation-Based Defenses for Privacy in Approximate Machine Unlearning
- Authors: Mohammad M Maheri, Xavier Cadet, Peter Chin, Hamed Haddadi
- Abstract summary: Approximate machine unlearning aims to efficiently remove the influence of specific data points from a trained model. An adversary with access to pre- and post-unlearning models can exploit their differences for membership inference or data reconstruction. We show these vulnerabilities arise from two factors: large gradient norms of forget-set samples and the close proximity of unlearned parameters to the original model.
- Score: 8.735490611482364
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Approximate machine unlearning aims to efficiently remove the influence of specific data points from a trained model, offering a practical alternative to full retraining. However, it introduces privacy risks: an adversary with access to pre- and post-unlearning models can exploit their differences for membership inference or data reconstruction. We show these vulnerabilities arise from two factors: large gradient norms of forget-set samples and the close proximity of unlearned parameters to the original model. To demonstrate their severity, we propose unlearning-specific membership inference and reconstruction attacks, showing that several state-of-the-art methods (e.g., NGP, SCRUB) remain vulnerable. To mitigate this leakage, we introduce WARP, a plug-and-play teleportation defense that leverages neural network symmetries to reduce forget-set gradient energy and increase parameter dispersion while preserving predictions. This reparameterization obfuscates the signal of forgotten data, making it harder for attackers to distinguish forgotten samples from non-members or recover them via reconstruction. Across six unlearning algorithms, our approach achieves consistent privacy gains, reducing adversarial advantage (AUC) by up to 64% in black-box and 92% in white-box settings, while maintaining accuracy on retained data. These results highlight teleportation as a general tool for reducing attack success in approximate unlearning.
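The scaling symmetry that such teleportation defenses exploit is easy to demonstrate. Below is a minimal PyTorch sketch, not the paper's WARP implementation (whose transformation and objective are not reproduced here), of one symmetry step: rescaling a ReLU layer by a positive factor and undoing it in the next layer leaves predictions unchanged while redistributing gradient energy across layers.

```python
# Minimal sketch of one teleportation step using the positive-scaling
# symmetry of ReLU networks. This is NOT the paper's WARP method; it only
# illustrates the mechanism WARP builds on: a reparameterization that
# preserves the network's function while changing its gradients.
import torch
import torch.nn as nn

torch.manual_seed(0)
net = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 4))
x = torch.randn(32, 8)
before = net(x)

alpha = 2.5  # any alpha > 0 is a valid symmetry for ReLU activations
with torch.no_grad():
    net[0].weight *= alpha   # scale the first layer up ...
    net[0].bias *= alpha
    net[2].weight /= alpha   # ... and undo the scaling in the next layer

after = net(x)
print(torch.allclose(before, after, atol=1e-5))  # True: predictions preserved
```

Because the loss surface is unchanged along such symmetry orbits, a defender can search the orbit for a point with low forget-set gradient norm and large parameter distance from the original weights, which is the intuition the abstract describes.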
Related papers
- Shadow Unlearning: A Neuro-Semantic Approach to Fidelity-Preserving Faceless Forgetting in LLMs [10.135445130232265]
We propose Shadow Unlearning, a novel paradigm of approximate unlearning that performs machine unlearning on anonymized forget data without exposing PII. We further propose a novel privacy-preserving framework, Neuro-Semantic Projector Unlearning (NSPU), to achieve Shadow Unlearning. Experimental results show that NSPU achieves superior unlearning performance, preserves model utility, and enhances user privacy.
arXiv Detail & Related papers (2026-01-07T12:11:25Z)
- An Efficient Gradient-Based Inference Attack for Federated Learning [0.0]
Federated learning is a machine learning setting that reduces direct data exposure, improving the privacy guarantees of machine learning models. We present a new gradient-based membership inference attack for federated learning scenarios. Our method uses the shadow technique to learn round-wise gradient patterns of the training records, requiring no access to the private dataset.
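As a rough illustration of the gradient signal such attacks use (a hedged sketch, not this paper's round-wise shadow-model pipeline): members of the training set typically produce smaller per-record loss gradients than non-members, so the gradient norm itself is a usable membership feature.

```python
# Hedged sketch: per-record gradient-norm feature for membership inference.
# A full attack in the style described above would collect this feature
# across FL rounds for records of known membership and fit a classifier.
import torch
import torch.nn as nn

model = nn.Linear(10, 3)  # stand-in for a model snapshot from one FL round

def grad_norm(x, y):
    """L2 norm of the loss gradient for a single (x, y) record."""
    loss = nn.functional.cross_entropy(model(x.unsqueeze(0)), y.unsqueeze(0))
    grads = torch.autograd.grad(loss, model.parameters())
    return torch.cat([g.flatten() for g in grads]).norm().item()

print(grad_norm(torch.randn(10), torch.tensor(2)))
```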
arXiv Detail & Related papers (2025-12-17T07:10:04Z)
- Reminiscence Attack on Residuals: Exploiting Approximate Machine Unlearning for Privacy [18.219835803238837]
We show that approximate unlearning algorithms fail to adequately protect the privacy of unlearned data. We propose the Reminiscence Attack (ReA), which amplifies the correlation between residuals and membership privacy. We develop a dual-phase approximate unlearning framework that first eliminates deep-layer unlearned data traces and then enforces convergence stability.
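A minimal sketch of the residual signal in question, assuming only black-box loss access to the pre- and post-unlearning models (ReA's amplification procedure is more involved):

```python
# Hedged sketch: the loss "residual" between pre- and post-unlearning
# models. A sharp loss increase after unlearning suggests the record was
# in the forget set, which is the leakage ReA amplifies.
import torch
import torch.nn as nn

def residual_score(model_pre, model_post, x, y):
    """Loss increase after unlearning; large values suggest (x, y) was forgotten."""
    ce = nn.functional.cross_entropy
    with torch.no_grad():
        return (ce(model_post(x), y) - ce(model_pre(x), y)).item()

model_pre, model_post = nn.Linear(32, 10), nn.Linear(32, 10)  # stand-ins
x, y = torch.randn(4, 32), torch.randint(0, 10, (4,))
print(residual_score(model_pre, model_post, x, y))
```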
arXiv Detail & Related papers (2025-07-28T07:12:12Z)
- CRFU: Compressive Representation Forgetting Against Privacy Leakage on Machine Unlearning [14.061404670832097]
An effective unlearning method removes the information of the specified data from the trained model, resulting in different outputs for the same input before and after unlearning. We introduce a Compressive Representation Forgetting Unlearning scheme (CRFU) to safeguard against privacy leakage during unlearning.
arXiv Detail & Related papers (2025-02-27T05:59:02Z)
- Pseudo-Probability Unlearning: Towards Efficient and Privacy-Preserving Machine Unlearning [59.29849532966454]
We propose Pseudo-Probability Unlearning (PPU), a novel method that enables models to forget data in a privacy-preserving manner.
Our method achieves over 20% improvements in forgetting error compared to the state-of-the-art.
arXiv Detail & Related papers (2024-11-04T21:27:06Z)
- Accurate Forgetting for All-in-One Image Restoration Model [3.367455972998532]
Machine Unlearning is a low-cost scheme for forgetting private data memorized by a model.
Inspired by this, we use the concept to bridge the gap between the fields of image restoration and security.
arXiv Detail & Related papers (2024-09-01T10:14:16Z)
- Ungeneralizable Examples [70.76487163068109]
Current approaches to creating unlearnable data incorporate small, specially designed perturbations.
We extend the concept of unlearnable data to conditional data learnability and introduce UnGeneralizable Examples (UGEs).
UGEs exhibit learnability for authorized users while maintaining unlearnability for potential hackers.
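For context, here is a hedged sketch of the error-minimizing noise idea behind conventional unlearnable examples, the baseline UGEs extend (the conditional, authorized-user mechanism of UGEs is not shown):

```python
# Hedged sketch of error-minimizing "unlearnable" noise: perturb each
# sample so its training loss is already near zero, leaving the model
# little to learn from it. The full method alternates this with model
# updates; UGEs additionally condition learnability on authorization.
import torch
import torch.nn as nn

model = nn.Linear(32, 10)  # stand-in classifier, held fixed here
for p in model.parameters():
    p.requires_grad_(False)

x, y = torch.randn(16, 32), torch.randint(0, 10, (16,))
delta = torch.zeros_like(x, requires_grad=True)
opt = torch.optim.SGD([delta], lr=0.1)

for _ in range(100):  # inner minimization over the noise
    loss = nn.functional.cross_entropy(model(x + delta), y)
    opt.zero_grad()
    loss.backward()
    opt.step()
    with torch.no_grad():
        delta.clamp_(-8 / 255, 8 / 255)  # keep the perturbation small
```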
arXiv Detail & Related papers (2024-04-22T09:29:14Z)
- Avoid Adversarial Adaption in Federated Learning by Multi-Metric Investigations [55.2480439325792]
Federated Learning (FL) facilitates decentralized machine learning model training, preserving data privacy, lowering communication costs, and boosting model performance through diversified data sources.
FL is vulnerable to poisoning attacks, which undermine model integrity through both untargeted performance degradation and targeted backdoor attacks.
We define a new notion of strong adaptive adversaries, capable of adapting to multiple objectives simultaneously.
Our defense, MESAS, is the first that is robust against strong adaptive adversaries and effective in real-world data scenarios, with an average overhead of just 24.37 seconds.
arXiv Detail & Related papers (2023-06-06T11:44:42Z)
- Learning to Unlearn: Instance-wise Unlearning for Pre-trained Classifiers [71.70205894168039]
We consider instance-wise unlearning, whose goal is to delete information about a set of instances from a pre-trained model.
We propose two methods that reduce forgetting on the remaining data: 1) utilizing adversarial examples to overcome forgetting at the representation-level and 2) leveraging weight importance metrics to pinpoint network parameters guilty of propagating unwanted information.
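A hedged sketch of the second ingredient, a gradient-based weight-importance score (the paper's exact metric may differ):

```python
# Hedged sketch: Fisher-style weight importance on the forget set.
# Parameters with large squared gradients on the instances to delete are
# the likeliest carriers of the unwanted information and can be targeted
# for updating or masking during unlearning.
import torch
import torch.nn as nn

model = nn.Linear(32, 10)
x_forget, y_forget = torch.randn(8, 32), torch.randint(0, 10, (8,))

loss = nn.functional.cross_entropy(model(x_forget), y_forget)
grads = torch.autograd.grad(loss, model.parameters())
importance = [g.pow(2) for g in grads]  # one score per parameter entry
top = importance[0].flatten().topk(10).indices  # e.g., 10 most implicated weights
```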
arXiv Detail & Related papers (2023-01-27T07:53:50Z)
- Learning to Invert: Simple Adaptive Attacks for Gradient Inversion in Federated Learning [31.374376311614675]
Gradient inversion attacks enable recovery of training samples from model gradients in federated learning.
We show that existing defenses can be broken by a simple adaptive attack.
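The basic gradient-matching recipe such adaptive attacks build on can be sketched as follows (a hedged toy version in the style of "Deep Leakage from Gradients", assuming the label is known; it is not this paper's adaptive attack):

```python
# Hedged sketch of gradient inversion via gradient matching: optimize a
# dummy input until its gradient matches the gradient the server observed.
import torch
import torch.nn as nn

model = nn.Linear(20, 5)
loss_fn = nn.functional.cross_entropy

# Victim's private record and the gradient the server observes.
x_true, y_true = torch.randn(1, 20), torch.tensor([3])
true_grads = torch.autograd.grad(loss_fn(model(x_true), y_true),
                                 model.parameters())

# Attacker optimizes a dummy input so its gradient matches.
x_fake = torch.randn(1, 20, requires_grad=True)
opt = torch.optim.Adam([x_fake], lr=0.1)
for _ in range(300):
    grads = torch.autograd.grad(loss_fn(model(x_fake), y_true),
                                model.parameters(), create_graph=True)
    match = sum(((g - t) ** 2).sum() for g, t in zip(grads, true_grads))
    opt.zero_grad()
    match.backward()
    opt.step()

print(torch.dist(x_fake.detach(), x_true))  # shrinks as reconstruction improves
```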
arXiv Detail & Related papers (2022-10-19T20:41:30Z)
- RelaxLoss: Defending Membership Inference Attacks without Losing Utility [68.48117818874155]
We propose a novel training framework based on a relaxed loss with a more achievable learning target.
RelaxLoss is applicable to any classification model with added benefits of easy implementation and negligible overhead.
Our approach consistently outperforms state-of-the-art defense mechanisms in terms of resilience against MIAs.
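A hedged sketch of the core idea (the published method also includes refinements such as posterior flattening that are not shown):

```python
# Hedged sketch of RelaxLoss's central mechanism: keep the training loss
# near an achievable target alpha instead of driving it to zero, which
# narrows the member/non-member loss gap that MIAs exploit.
import torch
import torch.nn as nn

model = nn.Linear(32, 10)
opt = torch.optim.SGD(model.parameters(), lr=0.05)
alpha = 0.7  # relaxed loss target (hyperparameter)

x, y = torch.randn(64, 32), torch.randint(0, 10, (64,))
for _ in range(200):
    loss = nn.functional.cross_entropy(model(x), y)
    opt.zero_grad()
    if loss.item() > alpha:
        loss.backward()        # normal gradient descent above the target
    else:
        (-loss).backward()     # gradient ascent once below the target
    opt.step()
```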
arXiv Detail & Related papers (2022-07-12T19:34:47Z)
- Learning to Learn Transferable Attack [77.67399621530052]
A transfer adversarial attack is a non-trivial black-box attack that crafts adversarial perturbations on a surrogate model and then applies them to the victim model.
We propose a Learning to Learn Transferable Attack (LLTA) method, which makes the adversarial perturbations more generalized via learning from both data and model augmentation.
Empirical results on a widely used dataset demonstrate the effectiveness of our attack method, with a 12.85% higher transfer-attack success rate than state-of-the-art methods.
arXiv Detail & Related papers (2021-12-10T07:24:21Z)
- Privacy-Preserving Federated Learning on Partitioned Attributes [6.661716208346423]
Federated learning empowers collaborative training without exposing local data or models.
We introduce an adversarial-learning-based procedure that tunes a local model to release privacy-preserving intermediate representations.
To alleviate the accuracy decline, we propose a defense method based on the forward-backward splitting algorithm.
arXiv Detail & Related papers (2021-04-29T14:49:14Z)
- Sampling Attacks: Amplification of Membership Inference Attacks by Repeated Queries [74.59376038272661]
We introduce the sampling attack, a novel membership inference technique that, unlike other standard membership adversaries, can work under the severe restriction of having no access to the victim model's scores.
We show that a victim model that publishes only labels is still susceptible to sampling attacks, and that the adversary can recover up to 100% of the attack's original performance.
For defense, we choose differential privacy in the form of gradient perturbation during the training of the victim model as well as output perturbation at prediction time.
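A hedged sketch of the kind of label-only signal repeated queries can extract (an illustration of the setting, not the paper's exact procedure): training members tend to keep their predicted label under small input perturbations more often than non-members.

```python
# Hedged sketch: label-only membership signal from repeated noisy queries.
# The fraction of perturbed copies that keep the original predicted label
# acts as a robustness score; higher robustness suggests membership.
import torch
import torch.nn as nn

model = nn.Linear(32, 10)  # stand-in victim that only exposes labels

def membership_score(x, n_samples=100, sigma=0.1):
    base = model(x).argmax(-1)
    noisy = x + sigma * torch.randn(n_samples, *x.shape)
    return (model(noisy).argmax(-1) == base).float().mean().item()

print(membership_score(torch.randn(32)))
```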
arXiv Detail & Related papers (2020-09-01T12:54:54Z)