Pre-Forgettable Models: Prompt Learning as a Native Mechanism for Unlearning
- URL: http://arxiv.org/abs/2509.15230v1
- Date: Fri, 05 Sep 2025 13:28:04 GMT
- Title: Pre-Forgettable Models: Prompt Learning as a Native Mechanism for Unlearning
- Authors: Rutger Hendrix, Giovanni Patanè, Leonardo G. Russo, Simone Carnemolla, Giovanni Bellitto, Federica Proietto Salanitri, Concetto Spampinato, Matteo Pennisi
- Abstract summary: Foundation models have transformed multimedia analysis by enabling robust and transferable representations across diverse modalities and tasks. Traditional unlearning approaches, including retraining, activation editing, or distillation, are often expensive, fragile, and ill-suited for real-time or continuously evolving systems. We introduce a prompt-based learning framework that unifies knowledge acquisition and removal within a single training phase.
- Score: 9.512928441517811
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Foundation models have transformed multimedia analysis by enabling robust and transferable representations across diverse modalities and tasks. However, their static deployment conflicts with growing societal and regulatory demands -- particularly the need to unlearn specific data upon request, as mandated by privacy frameworks such as the GDPR. Traditional unlearning approaches, including retraining, activation editing, or distillation, are often computationally expensive, fragile, and ill-suited for real-time or continuously evolving systems. In this paper, we propose a paradigm shift: rethinking unlearning not as a retroactive intervention but as a built-in capability. We introduce a prompt-based learning framework that unifies knowledge acquisition and removal within a single training phase. Rather than encoding information in model weights, our approach binds class-level semantics to dedicated prompt tokens. This design enables instant unlearning simply by removing the corresponding prompt -- without retraining, model modification, or access to original data. Experiments demonstrate that our framework preserves predictive performance on retained classes while effectively erasing forgotten ones. Beyond utility, our method exhibits strong privacy and security guarantees: it is resistant to membership inference attacks, and prompt removal prevents any residual knowledge extraction, even under adversarial conditions. This ensures compliance with data protection principles and safeguards against unauthorized access to forgotten information, making the framework suitable for deployment in sensitive and regulated environments. Overall, by embedding removability into the architecture itself, this work establishes a new foundation for designing modular, scalable and ethically responsive AI models.
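As a rough illustration of the mechanism the abstract describes, the sketch below binds each class to a dedicated learnable prompt token and unlearns a class by deleting its token. All names (`PromptPool`, `forget`) and the similarity-based head are assumptions for illustration, not the authors' implementation.

```python
# Minimal sketch of per-class prompt tokens as a built-in unlearning handle.
# Hypothetical names; not the paper's API.
import torch
import torch.nn as nn

class PromptPool(nn.Module):
    """One learnable prompt token per class; logits come from similarity
    between a frozen backbone's features and each class prompt."""

    def __init__(self, num_classes: int, dim: int):
        super().__init__()
        # Dedicated prompt per class: class knowledge lives here,
        # not in the (frozen) backbone weights.
        self.prompts = nn.ParameterDict({
            str(c): nn.Parameter(torch.randn(dim)) for c in range(num_classes)
        })

    def forward(self, feats: torch.Tensor) -> tuple[torch.Tensor, list[int]]:
        classes = sorted(self.prompts.keys(), key=int)
        bank = torch.stack([self.prompts[c] for c in classes])  # (C, D)
        logits = feats @ bank.t()                               # (B, C)
        return logits, [int(c) for c in classes]

    def forget(self, class_id: int) -> None:
        # "Instant unlearning": dropping the prompt removes the only
        # pathway through which this class can be predicted.
        del self.prompts[str(class_id)]

pool = PromptPool(num_classes=10, dim=512)
feats = torch.randn(4, 512)       # stand-in for frozen-backbone features
logits, classes = pool(feats)     # 10-way scores
pool.forget(3)                    # honor a deletion request
logits, classes = pool(feats)     # now 9-way; class 3 is unreachable
```

Because the backbone never encodes class-specific weights in this sketch, deletion requires no retraining and no access to the original data, mirroring the removability the paper argues for.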
Related papers
- Auditing Language Model Unlearning via Information Decomposition [68.48660428111593]
We introduce an interpretable, information-theoretic framework for auditing unlearning using Partial Information Decomposition (PID). By comparing model representations before and after unlearning, we decompose the mutual information with the forgotten data into distinct components, formalizing the notions of unlearned and residual knowledge. Our work introduces a principled, representation-level audit for unlearning, offering theoretical insight and actionable tools for safer deployment of language models.
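A minimal sketch of the audit idea, using a simple scikit-learn mutual-information estimator as a stand-in for the paper's PID decomposition (which splits this aggregate quantity into finer components):

```python
# Hedged sketch of a representation-level unlearning audit: estimate how much
# information about forget-set membership remains in pre- vs post-unlearning
# features. PID goes further and decomposes this aggregate MI; this only
# measures the total. Data below is synthetic for illustration.
import numpy as np
from sklearn.feature_selection import mutual_info_classif

def residual_information(feats: np.ndarray, is_forgotten: np.ndarray) -> float:
    """Per-feature MI with forget-set membership, summed (nats)."""
    return float(mutual_info_classif(feats, is_forgotten).sum())

rng = np.random.default_rng(0)
labels = rng.integers(0, 2, size=500)                  # 1 = forget-set sample
before = rng.normal(size=(500, 16)) + labels[:, None]  # leaks membership
after = rng.normal(size=(500, 16))                     # ideally leaks nothing

print(residual_information(before, labels))  # large: knowledge still present
print(residual_information(after, labels))   # near zero: knowledge removed
```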
arXiv Detail & Related papers (2026-01-21T15:51:19Z)
- Retrieval-augmented Prompt Learning for Pre-trained Foundation Models [101.13972024610733]
We present RetroPrompt, which aims to achieve a balance between memorization and generalization. Unlike traditional prompting methods, RetroPrompt incorporates a retrieval mechanism throughout the input, training, and inference stages. We conduct comprehensive experiments on a variety of datasets across natural language processing and computer vision tasks to demonstrate the superior performance of our proposed approach.
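A hedged sketch of the retrieval idea: fetch the nearest stored training examples and splice them into a cloze-style prompt. The store, encoding, and template here are illustrative assumptions, not RetroPrompt's actual components.

```python
# Sketch of retrieval-augmented prompting: nearest stored examples become
# demonstrations prepended to the query. All names are hypothetical.
import numpy as np

class RetrievalPrompter:
    def __init__(self, keys: np.ndarray, examples: list[str]):
        # Normalize keys once so dot products rank by cosine similarity.
        self.keys = keys / np.linalg.norm(keys, axis=1, keepdims=True)
        self.examples = examples

    def build_prompt(self, query: str, query_vec: np.ndarray, k: int = 2) -> str:
        q = query_vec / np.linalg.norm(query_vec)
        nearest = np.argsort(self.keys @ q)[-k:]      # top-k neighbors
        demos = "\n".join(self.examples[i] for i in nearest)
        return f"{demos}\n{query} It was [MASK]."     # cloze-style template

rng = np.random.default_rng(1)
keys = rng.normal(size=(100, 32))                     # stand-in encodings
examples = [f"demo sentence {i}. It was [LABEL]." for i in range(100)]
prompter = RetrievalPrompter(keys, examples)
print(prompter.build_prompt("The plot twist was stunning.", keys[7]))
```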
arXiv Detail & Related papers (2025-12-23T08:15:34Z)
- Distill, Forget, Repeat: A Framework for Continual Unlearning in Text-to-Image Diffusion Models [42.10036183563499]
We introduce a novel generative-distillation-based continual unlearning framework that ensures targeted and stable unlearning under sequences of deletion requests. Experiments on a 10-step sequential benchmark demonstrate that our method unlearns forget concepts with better fidelity. This framework provides a viable pathway for the responsible deployment and maintenance of large-scale generative models.
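A hedged sketch of the distillation idea behind continual unlearning: a frozen copy of the current model acts as teacher on retained concepts, while the forget concept is remapped to a surrogate anchor. The shapes and anchor scheme are assumptions, not the paper's recipe.

```python
# Sketch of a distillation-style unlearning loss: preserve the teacher's
# behavior on retained data, redirect the forget concept toward an anchor.
import torch
import torch.nn.functional as F

def unlearning_step(student, teacher, x_retain, x_forget, x_anchor):
    """One gradient-ready loss: match the teacher on retained inputs,
    match an anchor concept's outputs on forget inputs."""
    with torch.no_grad():
        t_retain = teacher(x_retain)   # predictions to preserve
        t_anchor = teacher(x_anchor)   # surrogate target for the forget concept
    keep = F.mse_loss(student(x_retain), t_retain)
    drop = F.mse_loss(student(x_forget), t_anchor)
    return keep + drop
```

Refreshing the teacher after each deletion step is what makes the procedure continual: every request distills from the most recently unlearned model rather than the original one.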
arXiv Detail & Related papers (2025-12-02T11:22:32Z)
- Unlearning Imperative: Securing Trustworthy and Responsible LLMs through Engineered Forgetting [0.0]
Large language models deployed in sensitive domains cannot guarantee that private information is permanently forgotten. Retraining from scratch is prohibitively costly. Existing unlearning methods remain fragmented, difficult to verify, and often vulnerable to recovery.
arXiv Detail & Related papers (2025-11-13T01:29:05Z)
- Rethinking Data Protection in the (Generative) Artificial Intelligence Era [138.07763415496288]
We propose a four-level taxonomy that captures the diverse protection needs arising in modern (generative) AI models and systems. Our framework offers a structured understanding of the trade-offs between data utility and control, spanning the entire AI pipeline.
arXiv Detail & Related papers (2025-07-03T02:45:51Z)
- Privacy-Aware Lifelong Learning [14.83033354320841]
The field of machine unlearning focuses on explicitly forgetting certain previous knowledge from pretrained models when requested. We propose a solution, privacy-aware lifelong learning (PALL), involving optimization of task-specific sparse subnetworks with parameter sharing within a single architecture. We empirically demonstrate the scalability of PALL across various architectures in image classification, and provide a state-of-the-art solution.
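A minimal sketch of the subnetwork idea: each task owns a binary mask over shared weights, so forgetting a task means discarding its mask (and zeroing weights used by no other task). Random masks stand in for learned ones; the names are illustrative.

```python
# Sketch of task-specific sparse subnetworks within one shared layer.
import torch

class MaskedLinear(torch.nn.Module):
    def __init__(self, dim_in: int, dim_out: int):
        super().__init__()
        self.weight = torch.nn.Parameter(torch.randn(dim_out, dim_in) * 0.01)
        self.masks: dict[int, torch.Tensor] = {}   # task_id -> binary mask

    def add_task(self, task_id: int, sparsity: float = 0.8):
        # Placeholder: a learned pruning criterion would pick the mask.
        self.masks[task_id] = (torch.rand_like(self.weight) > sparsity).float()

    def forward(self, x: torch.Tensor, task_id: int) -> torch.Tensor:
        return x @ (self.weight * self.masks[task_id]).t()

    def forget_task(self, task_id: int):
        gone = self.masks.pop(task_id)
        # Zero only the weights that belonged exclusively to this task.
        still_used = torch.zeros_like(gone)
        for m in self.masks.values():
            still_used = torch.maximum(still_used, m)
        with torch.no_grad():
            self.weight *= torch.maximum(still_used, 1 - gone)

layer = MaskedLinear(8, 4)
layer.add_task(0); layer.add_task(1)
y = layer(torch.randn(2, 8), task_id=0)
layer.forget_task(0)   # task 0's exclusive weights are zeroed out
```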
arXiv Detail & Related papers (2025-05-16T07:27:00Z)
- Learn while Unlearn: An Iterative Unlearning Framework for Generative Language Models [52.40798352740857]
We introduce the Iterative Contrastive Unlearning (ICU) framework, which consists of three core components. A Knowledge Unlearning Induction module targets specific knowledge for removal using an unlearning loss. A Contrastive Learning Enhancement module preserves the model's expressive capabilities against the pure unlearning goal. An Iterative Unlearning Refinement module dynamically adjusts the unlearning process through ongoing evaluation and updates.
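A hedged sketch of one such round, combining an unlearning term with a preservation term against a frozen reference model; the loss weight and exact formulation are assumptions for illustration.

```python
# Sketch of an iterative-contrastive unlearning round: gradient ascent on
# forget data, KL anchoring to a reference model on retained data. The outer
# loop would re-evaluate after each round and stop once forgetting is verified.
import torch
import torch.nn.functional as F

def icu_round(model, ref_model, forget_batch, retain_batch, alpha: float = 1.0):
    x_f, y_f = forget_batch
    x_r, _ = retain_batch
    # Unlearning induction: push the model away from the forget targets.
    unlearn = -F.cross_entropy(model(x_f), y_f)
    # Preservation: stay close to the reference on retained data.
    with torch.no_grad():
        ref_logits = ref_model(x_r)
    preserve = F.kl_div(F.log_softmax(model(x_r), dim=-1),
                        F.softmax(ref_logits, dim=-1),
                        reduction="batchmean")
    return unlearn + alpha * preserve
```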
arXiv Detail & Related papers (2024-07-25T07:09:35Z)
- Silver Linings in the Shadows: Harnessing Membership Inference for Machine Unlearning [7.557226714828334]
We present a novel unlearning mechanism designed to remove the impact of specific data samples from a neural network.
To achieve this goal, we crafted a novel loss function tailored to eliminate privacy-sensitive information from the weights and activation values of the target model.
Our results showcase the superior performance of our approach in terms of unlearning efficacy and latency as well as the fidelity of the primary task.
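A hedged sketch of how a membership signal can steer unlearning: drive the model's confidence on forget samples toward a non-member level while keeping retained accuracy. This confidence-matching term is a simplified stand-in, not the paper's loss.

```python
# Sketch of an MIA-aware unlearning loss: make forget samples look like
# non-members to a confidence-based attacker, preserve the primary task.
import torch
import torch.nn.functional as F

def mia_aware_loss(model, x_forget, y_forget, x_retain, y_retain,
                   nonmember_conf: float = 0.5):
    # Confidence assigned to the true label of each forget sample.
    conf_f = F.softmax(model(x_forget), dim=-1) \
              .gather(1, y_forget[:, None]).squeeze(1)
    # Privacy term: match the confidence typical of unseen data.
    privacy = F.mse_loss(conf_f, torch.full_like(conf_f, nonmember_conf))
    # Utility term: keep performance on retained data.
    utility = F.cross_entropy(model(x_retain), y_retain)
    return privacy + utility
```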
arXiv Detail & Related papers (2024-07-01T00:20:26Z)
- Towards Robust Continual Learning with Bayesian Adaptive Moment Regularization [51.34904967046097]
Continual learning seeks to overcome the challenge of catastrophic forgetting, where a model forgets previously learnt information.
We introduce a novel prior-based method that better constrains parameter growth, reducing catastrophic forgetting.
Results show that BAdam achieves state-of-the-art performance for prior-based methods on challenging single-headed class-incremental experiments.
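A hedged sketch of the generic prior-based template that such methods refine: penalize drift from parameters learned on earlier tasks, weighted by a per-parameter importance estimate. The importance values here are placeholders, not BAdam's adaptive moments.

```python
# Sketch of a prior-based continual-learning penalty (EWC-style template).
import torch

def prior_penalty(model: torch.nn.Module,
                  prior_means: dict[str, torch.Tensor],
                  importances: dict[str, torch.Tensor],
                  lam: float = 1.0) -> torch.Tensor:
    """Quadratic penalty on drift from old-task parameters, scaled by
    how important each parameter was for previous tasks."""
    loss = torch.zeros(())
    for name, p in model.named_parameters():
        loss = loss + (importances[name] * (p - prior_means[name]) ** 2).sum()
    return lam * loss
```

The total training loss would be the new task's loss plus this penalty, so important parameters stay near their prior means while unimportant ones remain free to adapt.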
arXiv Detail & Related papers (2023-09-15T17:10:51Z)
- RoFL: Attestable Robustness for Secure Federated Learning [59.63865074749391]
Federated Learning allows a large number of clients to train a joint model without the need to share their private data.
To ensure the confidentiality of the client updates, Federated Learning systems employ secure aggregation.
We present RoFL, a secure Federated Learning system that improves robustness against malicious clients.
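A hedged sketch of the robustness constraint: bound the norm of every client update before aggregation so a single malicious client cannot dominate the joint model. RoFL enforces this on encrypted, committed updates via cryptographic proofs; the plaintext check below only illustrates the constraint itself.

```python
# Sketch of norm-bounded aggregation of client updates.
import numpy as np

def aggregate(updates: list[np.ndarray], bound: float) -> np.ndarray:
    """Average only those updates whose L2 norm respects the bound."""
    accepted = [u for u in updates if np.linalg.norm(u) <= bound]
    if not accepted:
        raise ValueError("no update passed the norm check")
    return np.mean(accepted, axis=0)

rng = np.random.default_rng(2)
honest = [rng.normal(scale=0.1, size=16) for _ in range(9)]
malicious = rng.normal(scale=100.0, size=16)   # oversized poisoning attempt
model_delta = aggregate(honest + [malicious], bound=1.0)  # attacker filtered out
```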
arXiv Detail & Related papers (2021-07-07T15:42:49Z)