Avoiding Copyright Infringement via Large Language Model Unlearning
- URL: http://arxiv.org/abs/2406.10952v2
- Date: Thu, 17 Oct 2024 01:15:31 GMT
- Title: Avoiding Copyright Infringement via Large Language Model Unlearning
- Authors: Guangyao Dou, Zheyuan Liu, Qing Lyu, Kaize Ding, Eric Wong,
- Abstract summary: We propose a novel framework designed to unlearn copyrighted content from Large Language Models over multiple time steps.
We improve unlearning efficacy by introducing random labeling loss and ensuring the model retains its general-purpose knowledge.
Experimental results show that SSU achieves an effective trade-off between unlearning efficacy and general-purpose language abilities.
- Score: 24.050754626661124
- License:
- Abstract: Pre-trained Large Language Models (LLMs) have demonstrated remarkable capabilities but also pose risks by learning and generating copyrighted material, leading to significant legal and ethical concerns. In real-world scenarios, model owners need to continuously address copyright infringement as new requests for content removal emerge at different time points. This leads to the need for sequential unlearning, where copyrighted content is removed sequentially as new requests arise. Despite its practical relevance, sequential unlearning in the context of copyright infringement has not been rigorously explored in existing literature. To address this gap, we propose Stable Sequential Unlearning (SSU), a novel framework designed to unlearn copyrighted content from LLMs over multiple time steps. Our approach works by identifying and removing specific weight updates in the model's parameters that correspond to copyrighted content. We improve unlearning efficacy by introducing random labeling loss and ensuring the model retains its general-purpose knowledge by adjusting targeted parameters. Experimental results show that SSU achieves an effective trade-off between unlearning efficacy and general-purpose language abilities, outperforming existing baselines.
Related papers
- RLCP: A Reinforcement Learning-based Copyright Protection Method for Text-to-Image Diffusion Model [42.77851688874563]
We propose a Reinforcement Learning-based Copyright Protection(RLCP) method for Text-to-Image Diffusion Model.
Our approach minimizes the generation of copyright-infringing content while maintaining the quality of the model-generated dataset.
arXiv Detail & Related papers (2024-08-29T15:39:33Z) - Can Watermarking Large Language Models Prevent Copyrighted Text Generation and Hide Training Data? [62.72729485995075]
We investigate the effectiveness of watermarking as a deterrent against the generation of copyrighted texts.
We find that watermarking adversely affects the success rate of Membership Inference Attacks (MIAs)
We propose an adaptive technique to improve the success rate of a recent MIA under watermarking.
arXiv Detail & Related papers (2024-07-24T16:53:09Z) - MUSE: Machine Unlearning Six-Way Evaluation for Language Models [109.76505405962783]
Language models (LMs) are trained on vast amounts of text data, which may include private and copyrighted content.
We propose MUSE, a comprehensive machine unlearning evaluation benchmark.
We benchmark how effectively eight popular unlearning algorithms can unlearn Harry Potter books and news articles.
arXiv Detail & Related papers (2024-07-08T23:47:29Z) - UnUnlearning: Unlearning is not sufficient for content regulation in advanced generative AI [50.61495097098296]
We revisit the paradigm in which unlearning is used for Large Language Models (LLMs)
We introduce a concept of ununlearning, where unlearned knowledge gets reintroduced in-context.
We argue that content filtering for impermissible knowledge will be required and even exact unlearning schemes are not enough for effective content regulation.
arXiv Detail & Related papers (2024-06-27T10:24:35Z) - Evaluating Copyright Takedown Methods for Language Models [100.38129820325497]
Language models (LMs) derive their capabilities from extensive training on diverse data, including potentially copyrighted material.
This paper introduces the first evaluation of the feasibility and side effects of copyright takedowns for LMs.
We examine several strategies, including adding system prompts, decoding-time filtering interventions, and unlearning approaches.
arXiv Detail & Related papers (2024-06-26T18:09:46Z) - Machine Unlearning in Large Language Models [0.7864304771129751]
This paper introduces a methodology to align large language models (LLMs) with ethical, privacy, and safety standards.
Our approach aims to selectively erase or modify learned information in LLMs, targeting harmful responses and copyrighted content.
arXiv Detail & Related papers (2024-05-24T02:12:51Z) - Copyright Traps for Large Language Models [6.902279764206365]
We propose to use copyright traps to detect the use of copyrighted content in large language models.
We train a 1.3B model from scratch and insert traps into original content (books)
We show, contrary to intuition, that even medium-length trap sentences repeated a significant number of times (100) are not detectable using existing methods.
arXiv Detail & Related papers (2024-02-14T18:09:53Z) - A Dataset and Benchmark for Copyright Infringement Unlearning from Text-to-Image Diffusion Models [52.49582606341111]
Copyright law confers creators the exclusive rights to reproduce, distribute, and monetize their creative works.
Recent progress in text-to-image generation has introduced formidable challenges to copyright enforcement.
We introduce a novel pipeline that harmonizes CLIP, ChatGPT, and diffusion models to curate a dataset.
arXiv Detail & Related papers (2024-01-04T11:14:01Z) - Generative Context-aware Fine-tuning of Self-supervised Speech Models [54.389711404209415]
We study the use of generative large language models (LLM) generated context information.
We propose an approach to distill the generated information during fine-tuning of self-supervised speech models.
We evaluate the proposed approach using the SLUE and Libri-light benchmarks for several downstream tasks: automatic speech recognition, named entity recognition, and sentiment analysis.
arXiv Detail & Related papers (2023-12-15T15:46:02Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.