GRAIL: Gradient-Based Adaptive Unlearning for Privacy and Copyright in LLMs
- URL: http://arxiv.org/abs/2504.12681v1
- Date: Thu, 17 Apr 2025 06:16:32 GMT
- Title: GRAIL: Gradient-Based Adaptive Unlearning for Privacy and Copyright in LLMs
- Authors: Kun-Woo Kim, Ji-Hoon Park, Ju-Min Han, Seong-Whan Lee
- Abstract summary: Large Language Models (LLMs) trained on extensive datasets often learn sensitive information. Retraining entire models from scratch to remove undesired information is both costly and impractical. We propose GRAIL (GRadient-based AdaptIve unLearning), a novel multi-domain unlearning framework.
- Score: 26.13653211674955
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Large Language Models (LLMs) trained on extensive datasets often learn sensitive information, which raises significant social and legal concerns under principles such as the "Right to be forgotten." Retraining entire models from scratch to remove undesired information is both costly and impractical. Furthermore, existing single-domain unlearning methods fail to address multi-domain scenarios, where knowledge is interwoven across domains such as privacy and copyright, creating overlapping representations that lead to excessive knowledge removal or degraded performance. To tackle these issues, we propose GRAIL (GRadient-based AdaptIve unLearning), a novel multi-domain unlearning framework. GRAIL leverages gradient information from multiple domains to precisely distinguish the unlearning scope from the retention scope, and applies an adaptive parameter-wise localization strategy to selectively remove targeted knowledge while preserving critical parameters for each domain. Experimental results on unlearning benchmarks show that GRAIL achieves unlearning success on par with existing approaches, while also demonstrating up to 17% stronger knowledge retention success compared to the previous state-of-the-art method. Our findings establish a new paradigm for effectively managing and regulating sensitive information in large-scale pre-trained language models.
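The abstract's core mechanism lends itself to a short illustration. Below is a minimal PyTorch sketch of the gradient-based localization idea as described: parameters whose gradients are dominated by the forget data are treated as the unlearning scope and pushed by gradient ascent, while everything else is left untouched. All names, the ratio threshold, and the batch format are illustrative assumptions, not details from the paper.

```python
import torch

def grail_style_unlearn_step(model, forget_batch, retain_batch, loss_fn,
                             lr=1e-5, ratio_threshold=2.0):
    """One hypothetical unlearning step: localize parameters whose gradients
    are dominated by the forget data, then ascend the loss only on those."""
    # Per-parameter gradient magnitudes on the forget set.
    model.zero_grad()
    loss_fn(model(forget_batch["x"]), forget_batch["y"]).backward()
    forget_grads = {n: p.grad.detach().abs().clone()
                    for n, p in model.named_parameters() if p.grad is not None}

    # Per-parameter gradient magnitudes on the retain set.
    model.zero_grad()
    loss_fn(model(retain_batch["x"]), retain_batch["y"]).backward()
    retain_grads = {n: p.grad.detach().abs().clone()
                    for n, p in model.named_parameters() if p.grad is not None}

    # Gradient ascent on the forget loss, restricted to parameters where the
    # forget-set gradient dominates the retain-set gradient (the "unlearning
    # scope"); the rest (the "retention scope") is preserved.
    model.zero_grad()
    loss_fn(model(forget_batch["x"]), forget_batch["y"]).backward()
    with torch.no_grad():
        for n, p in model.named_parameters():
            if p.grad is None or n not in forget_grads:
                continue
            mask = forget_grads[n] > ratio_threshold * (retain_grads[n] + 1e-12)
            p.add_(lr * p.grad * mask)  # ascent: increase loss on forget data
    model.zero_grad()
```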
Related papers
- GRU: Mitigating the Trade-off between Unlearning and Retention for Large Language Models [34.90826139012299]
Large language model (LLM) unlearning has demonstrated its essential role in removing privacy- and copyright-related responses. The pursuit of complete unlearning often comes at a substantial cost because it compromises the model's general functionality. We propose Gradient Rectified Unlearning (GRU), an enhanced unlearning framework that controls the updating gradients in a geometry-focused manner.
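One common way to realize geometry-focused gradient control is to project the conflicting component of the unlearning gradient away from the retention gradient. The sketch below shows that projection in PyTorch; it is a generic rectification in the spirit of the summary, not necessarily GRU's exact update rule.

```python
import torch

def rectify_unlearning_grad(g_unlearn, g_retain, eps=1e-12):
    """Generic geometry-focused rectification: if the unlearning gradient
    conflicts with the retention gradient (negative inner product), remove
    the conflicting component. Both tensors must have the same shape
    (gradients of the same parameter)."""
    dot = torch.dot(g_unlearn.flatten(), g_retain.flatten())
    if dot < 0:
        # After this step the rectified gradient is orthogonal to g_retain,
        # so the update no longer opposes retention.
        g_unlearn = g_unlearn - (dot / (g_retain.norm() ** 2 + eps)) * g_retain
    return g_unlearn
```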
arXiv Detail & Related papers (2025-03-12T07:08:54Z)
- Transfer Learning through Enhanced Sufficient Representation: Enriching Source Domain Knowledge with Target Data [2.308168896770315]
We introduce a novel method for transfer learning called Transfer learning through Enhanced Sufficient Representation (TESR). Our approach begins by estimating a sufficient and invariant representation from the source domains. This representation is then enhanced with an independent component derived from the target data, ensuring that it is sufficient for the target domain and adaptable to its specific characteristics.
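As a rough numerical illustration (not the paper's estimator), one can augment a source-learned representation basis with principal directions of the target features that are orthogonal to it; everything in the sketch, including the linear setting, is an assumption.

```python
import numpy as np

def enhanced_representation(X_source, X_target, B_source, k_extra=1):
    """B_source: basis (d x r) of a representation estimated on the source
    domains. Augment it with leading principal directions of the target
    features after projecting out the source representation."""
    # Orthonormalize the source representation basis.
    Q, _ = np.linalg.qr(B_source)
    # Target variation not explained by the source representation.
    residual = X_target - X_target @ Q @ Q.T
    # Target-specific directions: top principal components of the residual.
    _, _, Vt = np.linalg.svd(residual - residual.mean(0), full_matrices=False)
    return np.hstack([Q, Vt[:k_extra].T])  # enriched basis, d x (r + k_extra)
```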
arXiv Detail & Related papers (2025-02-22T13:18:28Z)
- CodeUnlearn: Amortized Zero-Shot Machine Unlearning in Language Models Using Discrete Concept [5.345828824625758]
We propose a novel amortized unlearning approach using codebook features and Sparse Autoencoders (SAEs).
By leveraging a bottleneck to decompose the activation space and regulate information flow, our method efficiently unlearns targeted information while preserving the model's performance on unrelated data.
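A minimal sketch of such a codebook bottleneck is shown below: activations are vector-quantized with a straight-through estimator, and unlearning amounts to disabling the codes tied to the target concept. The class and its interface are assumptions for illustration, not the paper's implementation.

```python
import torch

class CodebookBottleneck(torch.nn.Module):
    """Quantize activations against a learned codebook; unlearning amounts to
    switching off the codes associated with the target concept."""
    def __init__(self, num_codes=512, dim=768):
        super().__init__()
        self.codebook = torch.nn.Parameter(torch.randn(num_codes, dim))
        self.register_buffer("enabled", torch.ones(num_codes, dtype=torch.bool))

    def forward(self, h):                      # h: (batch, dim)
        d = torch.cdist(h, self.codebook)      # distances to every code
        d = d.masked_fill(~self.enabled, float("inf"))  # skip disabled codes
        codes = d.argmin(dim=-1)
        q = self.codebook[codes]
        return h + (q - h).detach()            # straight-through estimator

    def unlearn(self, concept_codes):
        """Zero-shot removal: disable the codes that carry the concept."""
        self.enabled[concept_codes] = False
```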
arXiv Detail & Related papers (2024-10-08T10:26:22Z)
- Learning to Discover Knowledge: A Weakly-Supervised Partial Domain Adaptation Approach [20.899013563493202]
Domain adaptation has shown appealing performance by leveraging knowledge from a source domain with rich annotations.
For a specific target task, it is cumbersome to collect related and high-quality source domains.
In this paper, we propose a simple yet effective domain adaptation approach, termed self-paced transfer classifier learning (SP-TCL).
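Self-paced learning is typically implemented by admitting only samples whose current loss falls below a growing "age" threshold, so training proceeds from easy to hard examples. The snippet below sketches that generic schedule, assuming a per-sample loss; SP-TCL's actual objective may differ.

```python
import torch

def self_paced_weights(losses, lam):
    """Hard self-paced regularizer: a sample participates in training only if
    its current loss is below the age parameter lam (easy-to-hard schedule)."""
    return (losses.detach() < lam).float()

# Inside a training loop one would weight the per-sample loss and grow lam
# each epoch, so the classifier gradually admits harder target samples:
#   losses = criterion_none(model(x), pseudo_y)   # criterion with reduction="none"
#   loss = (self_paced_weights(losses, lam) * losses).mean()
#   lam *= growth_rate
```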
arXiv Detail & Related papers (2024-06-20T12:54:07Z)
- Unsupervised domain adaptation by learning using privileged information [6.748420131629902]
We show that training-time access to side information in the form of auxiliary variables can help relax restrictions on input variables.
We propose a simple two-stage learning algorithm, inspired by our analysis of the expected error in the target domain, and a practical end-to-end variant for image classification.
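A standard two-stage instantiation of learning using privileged information is generalized distillation: a teacher trained with the auxiliary variables is distilled into a student that sees only the inputs. The sketch below follows that pattern, with all model and loader interfaces as assumptions; it is not necessarily the paper's exact algorithm.

```python
import torch
import torch.nn.functional as F

def lupi_two_stage(teacher, student, loader, opt_t, opt_s, epochs=1, T=2.0):
    """Stage 1: fit a teacher that sees the privileged variables `a`.
    Stage 2: distill it into a student that uses only `x`, so no side
    information is needed at test time."""
    for _ in range(epochs):  # stage 1: teacher with privileged access
        for x, a, y in loader:
            opt_t.zero_grad()
            F.cross_entropy(teacher(x, a), y).backward()
            opt_t.step()
    for _ in range(epochs):  # stage 2: distill into a student without `a`
        for x, a, _ in loader:
            opt_s.zero_grad()
            with torch.no_grad():
                soft = F.softmax(teacher(x, a) / T, dim=-1)
            F.kl_div(F.log_softmax(student(x) / T, dim=-1),
                     soft, reduction="batchmean").backward()
            opt_s.step()
```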
arXiv Detail & Related papers (2023-03-16T14:31:50Z)
- Deep face recognition with clustering based domain adaptation [57.29464116557734]
We propose a new clustering-based domain adaptation method designed for the face recognition task, in which the source and target domains do not share any classes.
Our method effectively learns discriminative target features by aligning the feature domains globally while, at the same time, distinguishing the target clusters locally.
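Global alignment plus local clustering can be sketched with two generic ingredients: a mean-matching (linear MMD) term and k-means pseudo-labels on target features. Both pieces below are illustrative stand-ins for the paper's specific losses.

```python
import torch

def mmd_linear(f_src, f_tgt):
    """Global alignment term: linear-kernel MMD between feature means."""
    delta = f_src.mean(0) - f_tgt.mean(0)
    return delta.dot(delta)

def cluster_pseudo_labels(f_tgt, k, iters=10):
    """Local structure: k-means on target features (source and target share
    no classes, so target identities come from clustering, not source labels)."""
    centers = f_tgt[torch.randperm(len(f_tgt))[:k]].clone()
    for _ in range(iters):
        assign = torch.cdist(f_tgt, centers).argmin(1)
        for j in range(k):
            if (assign == j).any():
                centers[j] = f_tgt[assign == j].mean(0)
    return assign, centers
```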
arXiv Detail & Related papers (2022-05-27T12:29:11Z)
- Domain Adaptive Semantic Segmentation without Source Data [50.18389578589789]
We investigate domain adaptive semantic segmentation without source data, which assumes that the model is pre-trained on the source domain.
We propose an effective framework for this challenging problem with two components: positive learning and negative learning.
Our framework can be easily implemented and incorporated with other methods to further enhance the performance.
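One plausible reading of the two components: positive learning trusts high-confidence pseudo-labels, while negative learning penalizes classes the model deems very unlikely via a complementary -log(1 - p) loss. The thresholds and exact formulation below are assumptions for illustration, not the paper's verified losses.

```python
import torch
import torch.nn.functional as F

def positive_negative_loss(logits, conf_hi=0.9, conf_lo=0.05):
    """Positive learning: supervise with high-confidence pseudo-labels.
    Negative learning: for very unlikely classes, push their probability
    further down with the complementary loss -log(1 - p)."""
    probs = F.softmax(logits, dim=-1)
    conf, pseudo = probs.max(dim=-1)

    pos_mask = conf > conf_hi
    pos_loss = F.cross_entropy(logits[pos_mask], pseudo[pos_mask]) \
        if pos_mask.any() else logits.new_zeros(())

    neg_mask = probs < conf_lo                   # per-class "surely not" labels
    neg_loss = -(torch.log1p(-probs.clamp(max=1 - 1e-6)) * neg_mask).sum() \
        / neg_mask.sum().clamp(min=1)
    return pos_loss + neg_loss
```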
arXiv Detail & Related papers (2021-10-13T04:12:27Z)
- A Curriculum-style Self-training Approach for Source-Free Semantic Segmentation [91.13472029666312]
We propose a curriculum-style self-training approach for source-free domain adaptive semantic segmentation.
Our method yields state-of-the-art performance on source-free semantic segmentation tasks for both synthetic-to-real and adverse conditions.
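Curriculum-style self-training usually amounts to pseudo-labeling with a confidence threshold that relaxes over rounds, so harder pixels enter training later. A minimal sketch, with the schedule and thresholds as illustrative assumptions:

```python
import torch
import torch.nn.functional as F

def curriculum_threshold(round_idx, num_rounds, start=0.9, end=0.5):
    """Easy-to-hard curriculum: begin with only the most confident
    pseudo-labels and admit harder pixels in later self-training rounds."""
    return start + (end - start) * round_idx / max(num_rounds - 1, 1)

def pseudo_label(logits, threshold):
    """Keep predictions above the current confidence threshold; mark the rest
    with 255, the usual 'ignore' index in semantic segmentation."""
    probs = F.softmax(logits, dim=1)             # (B, C, H, W)
    conf, labels = probs.max(dim=1)
    labels[conf < threshold] = 255
    return labels
```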
arXiv Detail & Related papers (2021-06-22T10:21:39Z)
- Learning to Generalize Unseen Domains via Memory-based Multi-Source Meta-Learning for Person Re-Identification [59.326456778057384]
We propose the Memory-based Multi-Source Meta-Learning (M$^3$L) framework to train a generalizable model for unseen domains.
We also present a meta batch normalization layer (MetaBN) to diversify meta-test features.
Experiments demonstrate that our M$^3$L can effectively enhance the generalization ability of the model for unseen domains.
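One plausible reading of MetaBN is a batch-norm layer that, during meta-test, mixes the current batch statistics with statistics saved from meta-train domains to simulate unseen feature distributions. The sketch below assumes that mixing rule and a 1-D feature layout; it is not the paper's verified implementation.

```python
import torch

class MetaBN1d(torch.nn.BatchNorm1d):
    """Batch norm that can blend the current batch statistics with statistics
    drawn from other (meta-train) domains, diversifying meta-test features."""
    def forward(self, x, domain_stats=None, alpha=0.5):
        if domain_stats is None or not self.training:
            return super().forward(x)  # plain BN when no domain stats given
        # Mix batch statistics with saved domain statistics (assumed rule).
        mean = alpha * x.mean(0) + (1 - alpha) * domain_stats["mean"]
        var = alpha * x.var(0, unbiased=False) + (1 - alpha) * domain_stats["var"]
        x_hat = (x - mean) / torch.sqrt(var + self.eps)
        return x_hat * self.weight + self.bias
```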
arXiv Detail & Related papers (2020-12-01T11:38:16Z)
- Unsupervised Transfer Learning with Self-Supervised Remedy [60.315835711438936]
Generalising deep networks to novel domains without manual labels is a challenge for deep learning.
Pre-learned knowledge does not transfer well without making strong assumptions about the learned and the novel domains.
In this work, we aim to learn a discriminative latent space of the unlabelled target data in a novel domain by knowledge transfer from labelled related domains.
arXiv Detail & Related papers (2020-06-08T16:42:17Z)
- Self-paced Contrastive Learning with Hybrid Memory for Domain Adaptive Object Re-ID [55.21702895051287]
Domain adaptive object re-ID aims to transfer the learned knowledge from the labeled source domain to the unlabeled target domain.
We propose a novel self-paced contrastive learning framework with hybrid memory.
Our method outperforms the state of the art on multiple domain adaptation tasks of object re-ID.
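The hybrid memory can be sketched as a single normalized feature bank with one slot per entity (source class, target cluster, or un-clustered instance), updated with momentum and contrasted against via an InfoNCE-style loss. Names and hyperparameters below are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

class HybridMemory:
    """Memory with one slot per entity; entries are updated with momentum and
    serve as the contrast targets of a unified InfoNCE-style loss."""
    def __init__(self, features, momentum=0.2, temp=0.05):
        self.bank = F.normalize(features, dim=1)   # (num_entities, dim)
        self.momentum, self.temp = momentum, temp

    def loss(self, feats, targets):
        """Classify each feature against every memory slot."""
        feats = F.normalize(feats, dim=1)
        logits = feats @ self.bank.t() / self.temp
        return F.cross_entropy(logits, targets)

    @torch.no_grad()
    def update(self, feats, targets):
        """Momentum update of the slots hit by this batch."""
        f = F.normalize(feats, dim=1)
        self.bank[targets] = F.normalize(
            self.momentum * self.bank[targets] + (1 - self.momentum) * f, dim=1)
```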
arXiv Detail & Related papers (2020-06-04T09:12:44Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the content (including all information) and is not responsible for any consequences.