Related papers: Avoiding Copyright Infringement via Large Language Model Unlearning

Avoiding Copyright Infringement via Large Language Model Unlearning

URL: http://arxiv.org/abs/2406.10952v2
Date: Thu, 17 Oct 2024 01:15:31 GMT
Title: Avoiding Copyright Infringement via Large Language Model Unlearning
Authors: Guangyao Dou, Zheyuan Liu, Qing Lyu, Kaize Ding, Eric Wong,
Abstract summary: We propose a novel framework designed to unlearn copyrighted content from Large Language Models over multiple time steps. We improve unlearning efficacy by introducing random labeling loss and ensuring the model retains its general-purpose knowledge. Experimental results show that SSU achieves an effective trade-off between unlearning efficacy and general-purpose language abilities.
Score: 24.050754626661124
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Pre-trained Large Language Models (LLMs) have demonstrated remarkable capabilities but also pose risks by learning and generating copyrighted material, leading to significant legal and ethical concerns. In real-world scenarios, model owners need to continuously address copyright infringement as new requests for content removal emerge at different time points. This leads to the need for sequential unlearning, where copyrighted content is removed sequentially as new requests arise. Despite its practical relevance, sequential unlearning in the context of copyright infringement has not been rigorously explored in existing literature. To address this gap, we propose Stable Sequential Unlearning (SSU), a novel framework designed to unlearn copyrighted content from LLMs over multiple time steps. Our approach works by identifying and removing specific weight updates in the model's parameters that correspond to copyrighted content. We improve unlearning efficacy by introducing random labeling loss and ensuring the model retains its general-purpose knowledge by adjusting targeted parameters. Experimental results show that SSU achieves an effective trade-off between unlearning efficacy and general-purpose language abilities, outperforming existing baselines.

Related papers

SWAP: Towards Copyright Auditing of Soft Prompts via Sequential Watermarking [58.475471437150674]
We propose sequential watermarking for soft prompts (SWAP)<n>SWAP encodes watermarks through a specific order of defender-specified out-of-distribution classes.<n>Experiments on 11 datasets demonstrate SWAP's effectiveness, harmlessness, and robustness against potential adaptive attacks.
arXiv Detail & Related papers (2025-11-05T13:48:48Z)
SoK: Large Language Model Copyright Auditing via Fingerprinting [69.14570598973195]
We introduce a unified framework and formal taxonomy that categorizes existing methods into white-box and black-box approaches.<n>We propose LeaFBench, the first systematic benchmark for evaluating LLM fingerprinting under realistic deployment scenarios.
arXiv Detail & Related papers (2025-08-27T12:56:57Z)
Large Language Model Unlearning for Source Code [65.42425213605114]
PROD is a novel unlearning approach that enables LLMs to forget undesired code content while preserving their code generation capabilities.<n>Our evaluation demonstrates that PROD achieves superior balance between forget quality and model utility compared to existing unlearning approaches.
arXiv Detail & Related papers (2025-06-20T16:27:59Z)
UniErase: Unlearning Token as a Universal Erasure Primitive for Language Models [54.75551043657238]
We introduce UniErase, a novel unlearning paradigm that employs learnable parametric suffix (unlearning token) to steer language models toward targeted forgetting behaviors.<n>UniErase achieves state-of-the-art (SOTA) performance across batch, sequential, and precise unlearning under fictitious and real-world knowledge settings.
arXiv Detail & Related papers (2025-05-21T15:53:28Z)
SUV: Scalable Large Language Model Copyright Compliance with Regularized Selective Unlearning [22.76025238218253]
SUV is a selective unlearning framework designed to prevent Large Language Models from memorizing copyrighted content. We replace verbatim copyrighted content with plausible and coherent alternatives. We validate our approach using a large-scale dataset of 500 famous books.
arXiv Detail & Related papers (2025-03-29T02:33:26Z)
CopyJudge: Automated Copyright Infringement Identification and Mitigation in Text-to-Image Diffusion Models [58.58208005178676]
We propose CopyJudge, an automated copyright infringement identification framework. We employ an abstraction-filtration-comparison test framework with multi-LVLM debate to assess the likelihood of infringement. Based on the judgments, we introduce a general LVLM-based mitigation strategy.
arXiv Detail & Related papers (2025-02-21T08:09:07Z)
Investigating the Feasibility of Mitigating Potential Copyright Infringement via Large Language Model Unlearning [0.0]
Pre-trained Large Language Models (LLMs) have demonstrated remarkable capabilities but also pose risks by learning and generating copyrighted material. We propose Stable Sequential Unlearning (SSU), a novel framework designed to unlearn copyrighted content from LLMs over multiple time steps. SSU sometimes achieves an effective trade-off between unlearning efficacy and general-purpose language abilities, outperforming existing baselines, but it's not a cure-all for unlearning copyrighted material.
arXiv Detail & Related papers (2024-12-16T20:01:06Z)
RLCP: A Reinforcement Learning-based Copyright Protection Method for Text-to-Image Diffusion Model [42.77851688874563]
We propose a Reinforcement Learning-based Copyright Protection(RLCP) method for Text-to-Image Diffusion Model. Our approach minimizes the generation of copyright-infringing content while maintaining the quality of the model-generated dataset.
arXiv Detail & Related papers (2024-08-29T15:39:33Z)
Can Watermarking Large Language Models Prevent Copyrighted Text Generation and Hide Training Data? [62.72729485995075]
We investigate the effectiveness of watermarking as a deterrent against the generation of copyrighted texts. We find that watermarking adversely affects the success rate of Membership Inference Attacks (MIAs) We propose an adaptive technique to improve the success rate of a recent MIA under watermarking.
arXiv Detail & Related papers (2024-07-24T16:53:09Z)
MUSE: Machine Unlearning Six-Way Evaluation for Language Models [109.76505405962783]
Language models (LMs) are trained on vast amounts of text data, which may include private and copyrighted content. We propose MUSE, a comprehensive machine unlearning evaluation benchmark. We benchmark how effectively eight popular unlearning algorithms can unlearn Harry Potter books and news articles.
arXiv Detail & Related papers (2024-07-08T23:47:29Z)
UnUnlearning: Unlearning is not sufficient for content regulation in advanced generative AI [50.61495097098296]
We revisit the paradigm in which unlearning is used for Large Language Models (LLMs) We introduce a concept of ununlearning, where unlearned knowledge gets reintroduced in-context. We argue that content filtering for impermissible knowledge will be required and even exact unlearning schemes are not enough for effective content regulation.
arXiv Detail & Related papers (2024-06-27T10:24:35Z)
Evaluating Copyright Takedown Methods for Language Models [100.38129820325497]
Language models (LMs) derive their capabilities from extensive training on diverse data, including potentially copyrighted material. This paper introduces the first evaluation of the feasibility and side effects of copyright takedowns for LMs. We examine several strategies, including adding system prompts, decoding-time filtering interventions, and unlearning approaches.
arXiv Detail & Related papers (2024-06-26T18:09:46Z)
Machine Unlearning in Large Language Models [0.7864304771129751]
This paper introduces a methodology to align large language models (LLMs) with ethical, privacy, and safety standards. Our approach aims to selectively erase or modify learned information in LLMs, targeting harmful responses and copyrighted content.
arXiv Detail & Related papers (2024-05-24T02:12:51Z)
Copyright Traps for Large Language Models [6.902279764206365]
We propose to use copyright traps to detect the use of copyrighted content in large language models. We train a 1.3B model from scratch and insert traps into original content (books) We show, contrary to intuition, that even medium-length trap sentences repeated a significant number of times (100) are not detectable using existing methods.
arXiv Detail & Related papers (2024-02-14T18:09:53Z)
A Dataset and Benchmark for Copyright Infringement Unlearning from Text-to-Image Diffusion Models [52.49582606341111]
Copyright law confers creators the exclusive rights to reproduce, distribute, and monetize their creative works. Recent progress in text-to-image generation has introduced formidable challenges to copyright enforcement. We introduce a novel pipeline that harmonizes CLIP, ChatGPT, and diffusion models to curate a dataset.
arXiv Detail & Related papers (2024-01-04T11:14:01Z)
Generative Context-aware Fine-tuning of Self-supervised Speech Models [54.389711404209415]
We study the use of generative large language models (LLM) generated context information. We propose an approach to distill the generated information during fine-tuning of self-supervised speech models. We evaluate the proposed approach using the SLUE and Libri-light benchmarks for several downstream tasks: automatic speech recognition, named entity recognition, and sentiment analysis.
arXiv Detail & Related papers (2023-12-15T15:46:02Z)

This list is automatically generated from the titles and abstracts of the papers in this site.