Related papers: Soft Prompting for Unlearning in Large Language Models

Soft Prompting for Unlearning in Large Language Models

URL: http://arxiv.org/abs/2406.12038v2
Date: Mon, 5 Aug 2024 21:48:22 GMT
Title: Soft Prompting for Unlearning in Large Language Models
Authors: Karuna Bhaila, Minh-Hao Van, Xintao Wu,
Abstract summary: This work focuses on investigating machine unlearning for Large Language Models motivated by data protection regulations. We propose a framework textbfSoft textbfPrompting for textbfUntextbflearning (SPUL) We conduct a rigorous evaluation of the proposed method and our results indicate that SPUL can significantly improve the trade-off between utility and forgetting.
Score: 11.504012974208466
License: http://creativecommons.org/licenses/by/4.0/
Abstract: The widespread popularity of Large Language Models (LLMs), partly due to their unique ability to perform in-context learning, has also brought to light the importance of ethical and safety considerations when deploying these pre-trained models. In this work, we focus on investigating machine unlearning for LLMs motivated by data protection regulations. In contrast to the growing literature on fine-tuning methods to achieve unlearning, we focus on a comparatively lightweight alternative called soft prompting to realize the unlearning of a subset of training data. With losses designed to enforce forgetting as well as utility preservation, our framework \textbf{S}oft \textbf{P}rompting for \textbf{U}n\textbf{l}earning (SPUL) learns prompt tokens that can be appended to an arbitrary query to induce unlearning of specific examples at inference time without updating LLM parameters. We conduct a rigorous evaluation of the proposed method and our results indicate that SPUL can significantly improve the trade-off between utility and forgetting in the context of text classification and question answering with LLMs. We further validate our method using multiple LLMs to highlight the scalability of our framework and provide detailed insights into the choice of hyperparameters and the influence of the size of unlearning data. Our implementation is available at \url{https://github.com/karuna-bhaila/llm_unlearning}.

Related papers

Beyond Static LLM Policies: Imitation-Enhanced Reinforcement Learning for Recommendation [23.945049006150555]
Large language models (LLMs) have become critical tools for enhancing user engagement by delivering personalized content across diverse digital platforms.<n>Direct deployment of LLMs as primary recommendation policies presents notable challenges, including persistent latency issues.<n>This paper proposes a novel offline reinforcement learning framework that leverages imitation learning from LLM-generated trajectories.
arXiv Detail & Related papers (2025-10-15T07:28:29Z)
Weighted Multi-Prompt Learning with Description-free Large Language Model Distillation [1.3381749415517021]
New approaches leveraging Large Language Models (LLM) in prompts have been proposed, enhancing robustness to unseen and diverse data.<n>Existing methods typically extract text-based responses (i.e., descriptions) from LLM to incorporate into prompts.<n>We propose Description-free Multi-prompt Learning(DeMul), a novel method that eliminates the process of extracting descriptions and instead directly distills knowledge from LLM into prompts.
arXiv Detail & Related papers (2025-07-09T07:55:25Z)
Low-Perplexity LLM-Generated Sequences and Where To Find Them [0.0]
We introduce a systematic approach centered on analyzing low-perplexity sequences - high-probability text spans generated by the model.<n>Our pipeline reliably extracts such long sequences across diverse topics while avoiding degeneration, then traces them back to their sources in the training data.<n>For those that do match, we quantify the distribution of occurrences across source documents, highlighting the scope and nature of verbatim recall.
arXiv Detail & Related papers (2025-07-02T15:58:51Z)
Enhancing Input-Label Mapping in In-Context Learning with Contrastive Decoding [71.01099784480597]
Large language models (LLMs) excel at a range of tasks through in-context learning (ICL) We introduce In-Context Contrastive Decoding (ICCD), a novel method that emphasizes input-label mapping. ICCD emphasizes input-label mapping by contrasting the output distributions between positive and negative in-context examples.
arXiv Detail & Related papers (2025-02-19T14:04:46Z)
Teaching Models to Improve on Tape [30.330699770714165]
Large Language Models (LLMs) often struggle when prompted to generate content under specific constraints. Recent works have shown that LLMs can benefit from such "corrective feedback" We introduce an RL framework for teaching models to use such rewards, by simulating interaction sessions, and rewarding the model according to its ability to satisfy the constraints.
arXiv Detail & Related papers (2024-11-03T08:49:55Z)
Real-Time Personalization for LLM-based Recommendation with Customized In-Context Learning [57.28766250993726]
This work explores adapting to dynamic user interests without any model updates. Existing Large Language Model (LLM)-based recommenders often lose the in-context learning ability during recommendation tuning. We propose RecICL, which customizes recommendation-specific in-context learning for real-time recommendations.
arXiv Detail & Related papers (2024-10-30T15:48:36Z)
Zero-shot Model-based Reinforcement Learning using Large Language Models [12.930241182192988]
We investigate how pre-trained Large Language Models can be leveraged to predict in context the dynamics of continuous Markov decision processes. We present proof-of-concept applications in two reinforcement learning settings: model-based policy evaluation and data-augmented off-policy reinforcement learning.
arXiv Detail & Related papers (2024-10-15T15:46:53Z)
ControlMLLM: Training-Free Visual Prompt Learning for Multimodal Large Language Models [73.34709921061928]
We propose a training-free method to inject visual prompts into Multimodal Large Language Models (MLLMs) We optimize a learnable latent variable based on an energy function, enhancing the strength of referring regions in the attention map. Our method offers a promising direction for integrating referring abilities into MLLMs, and supports referring with box, mask, scribble and point.
arXiv Detail & Related papers (2024-07-31T11:40:29Z)
Large Language Model Unlearning via Embedding-Corrupted Prompts [10.889859281637406]
We present textbfEmbedding-COrrupted (ECO) Prompts, a lightweight unlearning framework for large language models. We enforce an unlearned state during inference by employing a prompt classifier to identify and safeguard prompts to forget. We find that these embedding-corrupted prompts not only lead to desirable outputs that satisfy the unlearning objective but also closely approximate the output from a model that has never been trained on the data intended for forgetting.
arXiv Detail & Related papers (2024-06-12T06:56:20Z)
One Token Can Help! Learning Scalable and Pluggable Virtual Tokens for Retrieval-Augmented Large Language Models [67.49462724595445]
Retrieval-augmented generation (RAG) is a promising way to improve large language models (LLMs) We propose a novel method that involves learning scalable and pluggable virtual tokens for RAG.
arXiv Detail & Related papers (2024-05-30T03:44:54Z)
C-ICL: Contrastive In-context Learning for Information Extraction [54.39470114243744]
c-ICL is a novel few-shot technique that leverages both correct and incorrect sample constructions to create in-context learning demonstrations. Our experiments on various datasets indicate that c-ICL outperforms previous few-shot in-context learning methods.
arXiv Detail & Related papers (2024-02-17T11:28:08Z)
Supervised Knowledge Makes Large Language Models Better In-context Learners [94.89301696512776]
Large Language Models (LLMs) exhibit emerging in-context learning abilities through prompt engineering. The challenge of improving the generalizability and factuality of LLMs in natural language understanding and question answering remains under-explored. We propose a framework that enhances the reliability of LLMs as it: 1) generalizes out-of-distribution data, 2) elucidates how LLMs benefit from discriminative models, and 3) minimizes hallucinations in generative tasks.
arXiv Detail & Related papers (2023-12-26T07:24:46Z)
Prompt Highlighter: Interactive Control for Multi-Modal LLMs [50.830448437285355]
This study targets a critical aspect of multi-modal LLMs' (LLMs&VLMs) inference: explicit controllable text generation. We introduce a novel inference method, Prompt Highlighter, which enables users to highlight specific prompt spans to interactively control the focus during generation. We find that, during inference, guiding the models with highlighted tokens through the attention weights leads to more desired outputs.
arXiv Detail & Related papers (2023-12-07T13:53:29Z)
ReLLa: Retrieval-enhanced Large Language Models for Lifelong Sequential Behavior Comprehension in Recommendation [43.270424225285105]
We focus on adapting and empowering a pure large language model for zero-shot and few-shot recommendation tasks. We propose Retrieval-enhanced Large Language models (ReLLa) for recommendation tasks in both zero-shot and few-shot settings.
arXiv Detail & Related papers (2023-08-22T02:25:04Z)
Language models are weak learners [71.33837923104808]
We show that prompt-based large language models can operate effectively as weak learners. We incorporate these models into a boosting approach, which can leverage the knowledge within the model to outperform traditional tree-based boosting. Results illustrate the potential for prompt-based LLMs to function not just as few-shot learners themselves, but as components of larger machine learning pipelines.
arXiv Detail & Related papers (2023-06-25T02:39:19Z)
OverPrompt: Enhancing ChatGPT through Efficient In-Context Learning [49.38867353135258]
We propose OverPrompt, leveraging the in-context learning capability of LLMs to handle multiple task inputs. Our experiments show that OverPrompt can achieve cost-efficient zero-shot classification without causing significant detriment to task performance.
arXiv Detail & Related papers (2023-05-24T10:08:04Z)
Prompt-Learning for Fine-Grained Entity Typing [40.983849729537795]
We investigate the application of prompt-learning on fine-grained entity typing in fully supervised, few-shot and zero-shot scenarios. We propose a self-supervised strategy that carries out distribution-level optimization in prompt-learning to automatically summarize the information of entity types.
arXiv Detail & Related papers (2021-08-24T09:39:35Z)

This list is automatically generated from the titles and abstracts of the papers in this site.