Related papers: Rethinking Entity-level Unlearning for Large Language Models

URL: http://arxiv.org/abs/2406.15796v1
Date: Sat, 22 Jun 2024 09:40:07 GMT
Title: Rethinking Entity-level Unlearning for Large Language Models
Authors: Weitao Ma, Xiaocheng Feng, Weihong Zhong, Lei Huang, Yangfan Ye, Bing Qin,
Abstract summary: We propose a novel task of entity-level unlearning, where the entity-related knowledge within the target model is supposed to be entirely erased. Experiments reveal that current unlearning algorithms struggle to achieve effective entity-level unlearning.
Score: 28.708701013154993
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Large language model unlearning has gained increasing attention due to its potential to mitigate security and privacy concerns. Current research predominantly focuses on Instance-level unlearning, specifically aiming at forgetting predefined instances of sensitive content. However, a notable gap still exists in exploring the deletion of complete entity-related information, which is crucial in many real-world scenarios, such as copyright protection. To this end, we propose a novel task of Entity-level unlearning, where the entity-related knowledge within the target model is supposed to be entirely erased. Given the challenge of practically accessing all entity-related knowledge within a model, we begin by simulating entity-level unlearning scenarios through fine-tuning models to introduce pseudo entities. Following this, we develop baseline methods inspired by trending unlearning techniques and conduct a detailed comparison of their effectiveness in this task. Extensive experiments reveal that current unlearning algorithms struggle to achieve effective entity-level unlearning. Additionally, our analyses further indicate that entity-related knowledge injected through fine-tuning is more susceptible than original entities from pre-training during unlearning, highlighting the necessity for more thorough pseudo-entity injection methods to make them closer to pre-trained knowledge.

Related papers

Teaching Language Models To Gather Information Proactively [53.85419549904644]
Large language models (LLMs) are increasingly expected to function as collaborative partners. In this work, we introduce a new task paradigm: proactive information gathering. We design a scalable framework that generates partially specified, real-world tasks, masking key information. Within this setup, our core innovation is a reinforcement finetuning strategy that rewards questions that elicit genuinely new, implicit user information.
arXiv Detail & Related papers (2025-07-28T23:50:09Z)
SoK: Machine Unlearning for Large Language Models [14.88062383081161]
Large language model (LLM) unlearning has become a critical topic in machine learning. We propose a new taxonomy based on the intention of unlearning.
arXiv Detail & Related papers (2025-06-10T20:30:39Z)
Does Machine Unlearning Truly Remove Model Knowledge? A Framework for Auditing Unlearning in LLMs [58.24692529185971]
We introduce a comprehensive auditing framework for unlearning evaluation comprising three benchmark datasets, six unlearning algorithms, and five prompt-based auditing methods. We evaluate the effectiveness and robustness of different unlearning strategies.
arXiv Detail & Related papers (2025-05-29T09:19:07Z)
Rethinking LLM Unlearning Objectives: A Gradient Perspective and Go Beyond [39.39558417665764]
Large language models (LLMs) should undergo rigorous audits to identify potential risks, such as copyright and privacy infringements. We propose a toolkit of the gradient effect (G-effect), quantifying the impacts of unlearning objectives on model performance.
arXiv Detail & Related papers (2025-02-26T16:59:21Z)
A Comprehensive Survey of Machine Unlearning Techniques for Large Language Models [36.601209595620446]
This study investigates machine unlearning techniques within the context of large language models (LLMs). LLM unlearning offers a principled approach to removing the influence of undesirable data from LLMs. Despite growing research interest, there is no comprehensive survey that systematically organizes existing work and distills key insights.
arXiv Detail & Related papers (2025-02-22T12:46:14Z)
Oriented Tiny Object Detection: A Dataset, Benchmark, and Dynamic Unbiased Learning [51.170479006249195]
We introduce a new dataset, benchmark, and a dynamic coarse-to-fine learning scheme in this study. Our proposed dataset, AI-TOD-R, features the smallest object sizes among all oriented object detection datasets. We present a benchmark spanning a broad range of detection paradigms, including both fully-supervised and label-efficient approaches.
arXiv Detail & Related papers (2024-12-16T09:14:32Z)
Benchmarking Vision Language Model Unlearning via Fictitious Facial Identity Dataset [94.13848736705575]
We introduce Facial Identity Unlearning Benchmark (FIUBench), a novel VLM unlearning benchmark designed to robustly evaluate the effectiveness of unlearning algorithms. We apply a two-stage evaluation pipeline that is designed to precisely control the sources of information and their exposure levels. Through the evaluation of four baseline VLM unlearning algorithms within FIUBench, we find that all methods remain limited in their unlearning performance.
arXiv Detail & Related papers (2024-11-05T23:26:10Z)
Aggregation Artifacts in Subjective Tasks Collapse Large Language Models' Posteriors [74.04775677110179]
In-context Learning (ICL) has become the primary method for performing natural language tasks with Large Language Models (LLMs). In this work, we examine whether this is the result of the aggregation used in corresponding datasets, where trying to combine low-agreement, disparate annotations might lead to annotation artifacts that create detrimental noise in the prompt. Our results indicate that aggregation is a confounding factor in the modeling of subjective tasks, and advocate focusing on modeling individuals instead.
arXiv Detail & Related papers (2024-10-17T17:16:00Z)
CodeUnlearn: Amortized Zero-Shot Machine Unlearning in Language Models Using Discrete Concept [5.345828824625758]
We propose a novel amortized unlearning approach using codebook features and Sparse Autoencoders (SAEs). By leveraging a bottleneck to decompose the activation space and regulate information flow, our method efficiently unlearns targeted information while preserving the model's performance on unrelated data.
arXiv Detail & Related papers (2024-10-08T10:26:22Z)
Federated Learning driven Large Language Models for Swarm Intelligence: A Survey [2.769238399659845]
Federated learning (FL) offers a compelling framework for training large language models (LLMs). We focus on machine unlearning, a crucial aspect for complying with privacy regulations like the Right to be Forgotten. We explore various strategies that enable effective unlearning, such as perturbation techniques, model decomposition, and incremental learning.
arXiv Detail & Related papers (2024-06-14T08:40:58Z)
Fusing Domain-Specific Content from Large Language Models into Knowledge Graphs for Enhanced Zero Shot Object State Classification [0.8232137862012223]
This study investigates the potential of Large Language Models (LLMs) in generating and providing domain-specific information. To achieve this, an LLM is integrated into a pipeline that utilizes Knowledge Graphs and pre-trained semantic vectors. Our findings reveal that the integration of LLM-based embeddings, in combination with general-purpose pre-trained embeddings, leads to substantial performance improvements.
arXiv Detail & Related papers (2024-03-18T18:08:44Z)
Rethinking Machine Unlearning for Large Language Models [85.92660644100582]
We explore machine unlearning in the domain of large language models (LLMs). This initiative aims to eliminate undesirable data influence (e.g., sensitive or illegal information) and the associated model capabilities.
arXiv Detail & Related papers (2024-02-13T20:51:58Z)
A Survey of Label-Efficient Deep Learning for 3D Point Clouds [109.07889215814589]
This paper presents the first comprehensive survey of label-efficient learning of point clouds. We propose a taxonomy that organizes label-efficient learning methods based on the data prerequisites provided by different types of labels. For each approach, we outline the problem setup and provide an extensive literature review that showcases relevant progress and challenges.
arXiv Detail & Related papers (2023-05-31T12:54:51Z)
What Makes Good Contrastive Learning on Small-Scale Wearable-based Tasks? [59.51457877578138]
We study contrastive learning on the wearable-based activity recognition task. This paper presents an open-source PyTorch library, CL-HAR, which can serve as a practical tool for researchers.
arXiv Detail & Related papers (2022-02-12T06:10:15Z)
The Value of Information When Deciding What to Learn [21.945359614094503]
This work builds upon the seminal design principle of information-directed sampling (Russo & Van Roy, 2014). We offer new insights into learning targets from the literature on rate-distortion theory before turning to empirical results that confirm the value of information when deciding what to learn.
arXiv Detail & Related papers (2021-10-26T19:23:12Z)
Incremental Object Detection via Meta-Learning [77.55310507917012]
We propose a meta-learning approach that learns to reshape model gradients, such that information across incremental tasks is optimally shared. In comparison to existing meta-learning methods, our approach is task-agnostic, allows incremental addition of new-classes and scales to high-capacity models for object detection.
arXiv Detail & Related papers (2020-03-17T13:40:00Z)

This list is automatically generated from the titles and abstracts of the papers on this site.