Watermarking Large Language Models in Europe: Interpreting the AI Act in Light of Technology
- URL: http://arxiv.org/abs/2511.03641v1
- Date: Wed, 05 Nov 2025 17:00:39 GMT
- Title: Watermarking Large Language Models in Europe: Interpreting the AI Act in Light of Technology
- Authors: Thomas Souverain
- Abstract summary: Article 50 and Recital 133 call for marking methods that are "sufficiently reliable, interoperable, effective and robust". Yet, the rapidly evolving and heterogeneous landscape of watermarks for Large Language Models makes it difficult to determine how these four standards can be translated into concrete evaluations. Our paper addresses this challenge, anchoring the normativity of European requirements in the multiplicity of watermarking techniques.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: To foster trustworthy Artificial Intelligence (AI) within the European Union, the AI Act requires providers to mark and detect the outputs of their general-purpose models. Article 50 and Recital 133 call for marking methods that are "sufficiently reliable, interoperable, effective and robust". Yet, the rapidly evolving and heterogeneous landscape of watermarks for Large Language Models (LLMs) makes it difficult to determine how these four standards can be translated into concrete and measurable evaluations. Our paper addresses this challenge, anchoring the normativity of European requirements in the multiplicity of watermarking techniques. Introducing clear and distinct concepts on LLM watermarking, our contribution is threefold. (1) Watermarking Categorisation: We propose an accessible taxonomy of watermarking methods according to the stage of the LLM lifecycle at which they are applied - before, during, or after training, and during next-token distribution or sampling. (2) Watermarking Evaluation: We interpret the EU AI Act's requirements by mapping each criterion to state-of-the-art evaluations of the watermark's robustness and detectability, and of the quality of the LLM. Since interoperability remains largely untheorised in LLM watermarking research, we propose three normative dimensions to frame its assessment. (3) Watermarking Comparison: We compare current watermarking methods for LLMs against the operationalised European criteria and show that no approach yet satisfies all four standards. Encouraged by emerging empirical tests, we recommend further research into watermarking directly embedded within the low-level architecture of LLMs.
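To make the "during next-token distribution" category of the taxonomy concrete, here is a minimal sketch of a green-list logit-biasing watermark in the style of common distribution-stage schemes. The names `GREEN_FRACTION`, `DELTA`, and the SHA-256 seeding are illustrative assumptions, not details taken from the paper.

```python
import hashlib
import random

GREEN_FRACTION = 0.5   # fraction of the vocabulary placed on the green list
DELTA = 2.0            # logit bonus added to green-list tokens

def green_list(prev_token: int, vocab_size: int) -> set:
    """Pseudo-randomly partition the vocabulary, seeded by the previous token.

    The detector can recompute the same partition from the text alone,
    without access to the model.
    """
    seed = int(hashlib.sha256(str(prev_token).encode()).hexdigest(), 16) % (2**32)
    rng = random.Random(seed)
    ids = list(range(vocab_size))
    rng.shuffle(ids)
    return set(ids[: int(GREEN_FRACTION * vocab_size)])

def watermark_logits(logits: list, prev_token: int) -> list:
    """Boost green-list logits by DELTA before sampling the next token."""
    green = green_list(prev_token, len(logits))
    return [x + DELTA if i in green else x for i, x in enumerate(logits)]
```

Because the bias shifts the distribution toward green tokens, watermarked text over-represents them, which is exactly what a statistical detector later tests for.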
Related papers
- WMVLM: Evaluating Diffusion Model Image Watermarking via Vision-Language Models [79.32764976020435]
Digital watermarking is essential for securing generated images from diffusion models. Previous watermark evaluation methods lack a unified framework for both residual and semantic watermarks. We propose WMVLM, the first unified and interpretable evaluation framework for diffusion model image watermarking via vision-language models.
arXiv Detail & Related papers (2026-01-29T12:14:32Z) - BiMark: Unbiased Multilayer Watermarking for Large Language Models [68.64050157343334]
We propose BiMark, a novel watermarking framework that balances text quality preservation and message embedding capacity. BiMark achieves up to 30% higher extraction rates for short texts while maintaining text quality indicated by lower perplexity.
arXiv Detail & Related papers (2025-06-19T11:08:59Z) - In-Context Watermarks for Large Language Models [71.29952527565749]
In-Context Watermarking (ICW) embeds watermarks into generated text solely through prompt engineering. We investigate four ICW strategies at different levels of granularity, each paired with a tailored detection method. Our experiments validate the feasibility of ICW as a model-agnostic, practical watermarking approach.
arXiv Detail & Related papers (2025-05-22T17:24:51Z) - Watermarking Language Models through Language Models [1.249418440326334]
We introduce a prompt-guided watermarking framework that operates entirely at the input level and requires no access to model parameters or decoding logits. We evaluate the framework over 25 combinations of Prompting and Marking LMs, such as GPT-4o, Mistral, LLaMA3, and DeepSeek. Experimental results show that watermark signals generalize across architectures and remain robust under fine-tuning, model distillation, and prompt-based adversarial attacks.
arXiv Detail & Related papers (2024-11-07T19:16:49Z) - De-mark: Watermark Removal in Large Language Models [59.00698153097887]
We present De-mark, an advanced framework designed to remove n-gram-based watermarks effectively. Our method utilizes a novel querying strategy, termed random selection probing, which aids in assessing the strength of the watermark.
arXiv Detail & Related papers (2024-10-17T17:42:10Z) - MarkLLM: An Open-Source Toolkit for LLM Watermarking [80.00466284110269]
MarkLLM is an open-source toolkit for implementing LLM watermarking algorithms.
For evaluation, MarkLLM offers a comprehensive suite of 12 tools spanning three perspectives, along with two types of automated evaluation pipelines.
arXiv Detail & Related papers (2024-05-16T12:40:01Z) - Unbiased Watermark for Large Language Models [67.43415395591221]
This study examines how significantly watermarks impact the quality of model-generated outputs.
It is possible to integrate watermarks without affecting the output probability distribution.
The presence of watermarks does not compromise the performance of the model in downstream tasks.
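A watermark can indeed leave the output distribution untouched: instead of biasing logits, the sampler derandomises token selection with a keyed pseudorandom source, so each token is still drawn with its model probability while the choice correlates with the key. The sketch below illustrates this idea with exponential-minimum (Gumbel-style) sampling; the key string, context handling, and SHA-256 construction are illustrative assumptions, not the paper's exact method.

```python
import hashlib

def gumbel_watermark_sample(probs: list, context: str, key: str = "secret") -> int:
    """Pick argmax_i r_i ** (1 / p_i), with r_i pseudorandom in (0, 1)
    keyed by (key, context, i).

    Marginalising over the pseudorandom values, token i is selected with
    probability p_i, so the sampled distribution is unchanged (unbiased);
    yet a detector holding the key can test whether the chosen tokens
    had suspiciously large r_i values.
    """
    def r(i: int) -> float:
        h = hashlib.sha256(f"{key}|{context}|{i}".encode()).hexdigest()
        return (int(h, 16) % (2**53) + 1) / (2**53 + 1)  # strictly in (0, 1)

    return max(
        range(len(probs)),
        key=lambda i: r(i) ** (1.0 / probs[i]) if probs[i] > 0 else 0.0,
    )
```

Given the same key and context the choice is deterministic, which is what the detection side exploits; across varying contexts the empirical token frequencies match the model's probabilities.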
arXiv Detail & Related papers (2023-09-22T12:46:38Z) - Three Bricks to Consolidate Watermarks for Large Language Models [13.559357913735122]
This research consolidates watermarks for large language models based on three theoretical and empirical considerations.
First, we introduce new statistical tests that offer robust theoretical guarantees which remain valid even at low false-positive rates.
Second, we compare the effectiveness of watermarks using classical benchmarks in the field of natural language processing, gaining insights into their real-world applicability.
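Detection tests of the kind described above can be made exact rather than approximate: under the null hypothesis of unwatermarked text, the number of green-list tokens follows a binomial distribution, and the exact tail probability bounds the false-positive rate. The sketch below is a generic illustration of such a test (the parameter `gamma` is the assumed green-list fraction), not the specific statistics proposed in the paper.

```python
import math

def detection_pvalue(green_count: int, n_tokens: int, gamma: float = 0.5) -> float:
    """Exact binomial tail P[X >= green_count] under the null hypothesis
    (no watermark), where X ~ Binomial(n_tokens, gamma).

    Rejecting only when this p-value falls below a strict threshold keeps
    the false-positive rate provably at or below that threshold, even for
    very small thresholds.
    """
    return sum(
        math.comb(n_tokens, k) * gamma**k * (1 - gamma) ** (n_tokens - k)
        for k in range(green_count, n_tokens + 1)
    )
```

For example, 8 green tokens out of 10 with gamma = 0.5 gives p = 56/1024, roughly 0.055: suggestive, but not enough to reject at a 1% false-positive rate, which is why longer texts are easier to certify.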
arXiv Detail & Related papers (2023-07-26T17:56:36Z) - Provable Robust Watermarking for AI-Generated Text [41.5510809722375]
We propose a robust and high-quality watermark method, Unigram-Watermark.
We prove that our watermark method enjoys guaranteed generation quality, correctness in watermark detection, and is robust against text editing and paraphrasing.
arXiv Detail & Related papers (2023-06-30T07:24:32Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences of its use.