SCOPE: Intrinsic Semantic Space Control for Mitigating Copyright Infringement in LLMs
- URL: http://arxiv.org/abs/2511.07001v2
- Date: Wed, 12 Nov 2025 01:16:36 GMT
- Title: SCOPE: Intrinsic Semantic Space Control for Mitigating Copyright Infringement in LLMs
- Authors: Zhenliang Zhang, Xinyu Hu, Xiaojun Wan
- Abstract summary: SCOPE is an inference-time method that requires no parameter updates or auxiliary filters. We identify a copyright-sensitive subspace and clamp its activations during decoding. Experiments on widely recognized benchmarks show that SCOPE mitigates copyright infringement without degrading general utility.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Large language models sometimes inadvertently reproduce copyrighted passages, exposing downstream applications to legal risk. Most existing inference-time defences focus on surface-level token matching and rely on external blocklists or filters, which add deployment complexity and may overlook semantically paraphrased leakage. In this work, we reframe copyright-infringement mitigation as intrinsic semantic-space control and introduce SCOPE, an inference-time method that requires no parameter updates or auxiliary filters. Specifically, a sparse autoencoder (SAE) projects hidden states into a high-dimensional, near-monosemantic space; within this representation, we identify a copyright-sensitive subspace and clamp its activations during decoding. Experiments on widely recognized benchmarks show that SCOPE mitigates copyright infringement without degrading general utility. Further interpretability analyses confirm that the isolated subspace captures high-level semantics.
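The clamping mechanism the abstract describes (encode a hidden state with an SAE, cap the activations of an identified sensitive subspace, decode back) can be sketched roughly as below. This is not the paper's released code; the function and variable names, the ReLU encoder form, and the toy dimensions are all illustrative assumptions.

```python
import numpy as np

def sae_clamp(hidden, W_enc, b_enc, W_dec, sensitive_idx, cap=0.0):
    """Sketch of SCOPE-style activation clamping (hypothetical names).

    hidden: model hidden state at one decoding step, shape (d,)
    W_enc/b_enc, W_dec: SAE encoder and decoder parameters
    sensitive_idx: indices of the copyright-sensitive SAE features
    cap: maximum allowed activation for those features
    """
    # Encode into the SAE's sparse, near-monosemantic feature space
    feats = np.maximum(W_enc @ hidden + b_enc, 0.0)
    # Clamp only the identified sensitive subspace; other features pass through
    feats[sensitive_idx] = np.minimum(feats[sensitive_idx], cap)
    # Decode back to the model's hidden dimension for the next layer
    return W_dec @ feats

# Toy example with random weights (d = hidden size, f = SAE feature count)
rng = np.random.default_rng(0)
d, f = 8, 32
W_enc, b_enc = rng.normal(size=(f, d)), np.zeros(f)
W_dec = rng.normal(size=(d, f))
h = rng.normal(size=d)
out = sae_clamp(h, W_enc, b_enc, W_dec, sensitive_idx=[3, 7], cap=0.0)
```

In an actual deployment this hook would run inside the forward pass at the layer where the SAE was trained, at every decoding step; here it is applied once to a random vector purely to show the data flow.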
Related papers
- LaSER: Internalizing Explicit Reasoning into Latent Space for Dense Retrieval [74.72139580745511]
LaSER is a novel self-distillation framework that internalizes explicit reasoning into the latent space of retrievers. Our method successfully combines the reasoning depth of explicit CoT pipelines with the inference efficiency of standard dense retrievers.
arXiv Detail & Related papers (2026-03-02T04:11:18Z) - Breaking Semantic-Aware Watermarks via LLM-Guided Coherence-Preserving Semantic Injection [6.443002210168185]
Traditional noise-layer-based watermarking remains vulnerable to inversion attacks that can recover embedded signals. Recent content-aware semantic watermarking schemes bind watermark signals to high-level image semantics, constraining local edits that would otherwise disrupt global coherence. We introduce a Coherence-Preserving Semantic Injection (CSI) attack that leverages LLM-guided semantic manipulation under embedding-space similarity constraints.
arXiv Detail & Related papers (2026-02-25T05:38:08Z) - On the Evidentiary Limits of Membership Inference for Copyright Auditing [8.81439045962811]
We ask whether membership inference attacks (MIAs) can serve as admissible evidence in adversarial copyright disputes. We introduce SAGE, a paraphrasing framework guided by Sparse Autoencoders (SAEs) that rewrites training data to alter lexical structure. Experiments show that state-of-the-art MIAs degrade when models are fine-tuned on SAGE-generated paraphrases, indicating that their signals are not robust to semantics-preserving transformations.
arXiv Detail & Related papers (2026-01-19T10:46:51Z) - SWAP: Towards Copyright Auditing of Soft Prompts via Sequential Watermarking [58.475471437150674]
We propose sequential watermarking for soft prompts (SWAP). SWAP encodes watermarks through a specific order of defender-specified out-of-distribution classes. Experiments on 11 datasets demonstrate SWAP's effectiveness, harmlessness, and robustness against potential adaptive attacks.
arXiv Detail & Related papers (2025-11-05T13:48:48Z) - Position: LLM Watermarking Should Align Stakeholders' Incentives for Practical Adoption [94.887133335656]
We revisit three classes of watermarking through this lens. LLM text watermarking offers modest provider benefit when framed solely as an anti-misuse tool. In-context watermarking (ICW) is tailored for trusted parties, such as conference organizers or educators.
arXiv Detail & Related papers (2025-10-21T06:34:51Z) - Large Language Models Encode Semantics in Low-Dimensional Linear Subspaces [31.401762286885656]
Understanding the latent space geometry of large language models (LLMs) is key to their behavior and alignment. We conduct a large-scale study of 11 models across 6 scientific topics.
arXiv Detail & Related papers (2025-07-13T17:03:25Z) - Certified Mitigation of Worst-Case LLM Copyright Infringement [46.571805194176825]
"copyright takedown" methods are aimed at preventing models from generating content substantially similar to copyrighted ones.<n>We propose BloomScrub, a remarkably simple yet highly effective inference-time approach that provides certified copyright takedown.<n>Our results suggest that lightweight, inference-time methods can be surprisingly effective for copyright prevention.
arXiv Detail & Related papers (2025-04-22T17:16:53Z) - CopyJudge: Automated Copyright Infringement Identification and Mitigation in Text-to-Image Diffusion Models [58.58208005178676]
We propose CopyJudge, a novel automated infringement identification framework. We employ an abstraction-filtration-comparison test framework to assess the likelihood of infringement. We also introduce a general LVLM-based mitigation strategy that automatically optimizes infringing prompts.
arXiv Detail & Related papers (2025-02-21T08:09:07Z) - Towards Copyright Protection for Knowledge Bases of Retrieval-augmented Language Models via Reasoning [58.57194301645823]
Large language models (LLMs) are increasingly integrated into real-world personalized applications. The valuable and often proprietary nature of the knowledge bases used in retrieval-augmented generation (RAG) introduces the risk of unauthorized usage by adversaries. Existing methods for protecting these knowledge bases, which can be generalized as watermarking techniques, typically involve poisoning or backdoor attacks. We propose a reasoning-based approach for harmless copyright protection of knowledge bases.
arXiv Detail & Related papers (2025-02-10T09:15:56Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information above and is not responsible for any consequences of its use.