Invariant-based Robust Weights Watermark for Large Language Models
- URL: http://arxiv.org/abs/2507.08288v1
- Date: Fri, 11 Jul 2025 03:24:47 GMT
- Title: Invariant-based Robust Weights Watermark for Large Language Models
- Authors: Qingxiao Guo, Xinjie Zhu, Yilong Ma, Hui Jin, Yunhao Wang, Weifeng Zhang, Xiaobing Guo
- Abstract summary: This paper introduces a robust watermarking scheme for transformer models that requires no retraining or fine-tuning. The scheme generates a unique key for each user and derives a stable watermark value from model invariants. A noise mechanism hides watermark locations in multi-user scenarios to resist collusion attacks.
- Score: 7.80619681921019
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Watermarking technology has gained significant attention due to the increasing importance of intellectual property (IP) rights, particularly with the growing deployment of large language models (LLMs) on billions of resource-constrained edge devices. To counter the potential threat of IP theft by malicious users, this paper introduces a robust watermarking scheme for transformer models that requires no retraining or fine-tuning. The scheme generates a unique key for each user and derives a stable watermark value by solving linear constraints constructed from model invariants. Moreover, it uses a noise mechanism to hide watermark locations in multi-user scenarios and defend against collusion attacks. The approach is evaluated on three popular models (Llama3, Phi3, Gemma), and the experimental results confirm strong robustness against a range of attacks (fine-tuning, pruning, quantization, permutation, scaling, reversible-matrix, and collusion attacks).
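The abstract does not spell out which invariants or constraints the scheme actually uses, so the sketch below is only an illustration of the general idea under stated assumptions: a quantity that is invariant under permutation and per-channel scaling of a transformer MLP's hidden dimension is combined with a per-user key to produce a reproducible watermark value, and key-dependent noise perturbs the location selection as a stand-in for the collusion-resistance mechanism. All names (`mlp_invariant`, `derive_watermark`) and parameters are hypothetical, not the paper's API.

```python
# A minimal, illustrative sketch -- NOT the paper's exact construction.
import numpy as np

def mlp_invariant(w_up: np.ndarray, w_down: np.ndarray) -> np.ndarray:
    """Product of an MLP block's projections (shape: d_model x d_model).

    If the hidden dimension is permuted and rescaled consistently
    (w_up -> D P w_up, w_down -> w_down P^T D^{-1}), this product is
    unchanged, so values derived from it survive permutation/scaling attacks.
    """
    return w_down @ w_up

def derive_watermark(invariant: np.ndarray, user_key: int,
                     n_bits: int = 64, noise_std: float = 1e-3) -> np.ndarray:
    """Derive a stable watermark vector from key-dependent linear constraints.

    A per-user RNG chooses the constraint coefficients; small random noise in
    the location scores keeps different users' watermark positions from
    coinciding (a stand-in for the paper's collusion-resistant noise mechanism).
    """
    rng = np.random.default_rng(user_key)
    flat = invariant.ravel()
    # Key-dependent, noise-perturbed selection of watermark locations.
    scores = rng.random(flat.size) + rng.normal(0.0, noise_std, flat.size)
    locations = np.argsort(scores)[: 4 * n_bits]
    # Build an overdetermined linear system A x = b from the chosen entries
    # and solve it in the least-squares sense; x is the watermark value.
    a = rng.standard_normal((locations.size, n_bits))
    b = flat[locations]
    x, *_ = np.linalg.lstsq(a, b, rcond=None)
    return x

# Toy usage with random stand-in weights (shapes kept small for brevity).
d_model, d_ff = 64, 256
w_up = np.random.default_rng(0).standard_normal((d_ff, d_model))
w_down = np.random.default_rng(1).standard_normal((d_model, d_ff))
wm = derive_watermark(mlp_invariant(w_up, w_down), user_key=0xC0FFEE)

# The same key reproduces the watermark after a permutation/scaling attack.
perm = np.random.default_rng(2).permutation(d_ff)
scale = np.random.default_rng(3).uniform(0.5, 2.0, d_ff)
w_up_atk = w_up[perm, :] * scale[:, None]
w_down_atk = w_down[:, perm] / scale[None, :]
wm_atk = derive_watermark(mlp_invariant(w_up_atk, w_down_atk), user_key=0xC0FFEE)
assert np.allclose(wm, wm_atk)
```

Because the product `w_down @ w_up` is unchanged when a permutation or invertible diagonal scaling of the hidden dimension is applied to `w_up` and compensated in `w_down`, any value derived from it survives those two attack classes; robustness to fine-tuning, pruning, or quantization would depend on how the constraints are constructed, which the abstract does not detail.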
Related papers
- Hot-Swap MarkBoard: An Efficient Black-box Watermarking Approach for Large-scale Model Distribution [14.60627694687767]
We propose Hot-Swap MarkBoard, an efficient watermarking method. It encodes user-specific $n$-bit binary signatures by independently embedding multiple watermarks. The method supports black-box verification and is compatible with various model architectures.
arXiv Detail & Related papers (2025-07-28T09:14:21Z) - TriniMark: A Robust Generative Speech Watermarking Method for Trinity-Level Attribution [3.1682080884953736]
We propose a generative speech watermarking method (TriniMark) for authenticating the generated content. We first design a structure-lightweight watermark encoder that embeds watermarks into the time-domain features of speech. A temporal-aware gated convolutional network is meticulously designed in the watermark decoder for bit-wise watermark recovery.
arXiv Detail & Related papers (2025-04-29T08:23:28Z) - Gaussian Shading++: Rethinking the Realistic Deployment Challenge of Performance-Lossless Image Watermark for Diffusion Models [66.54457339638004]
Copyright protection and inappropriate content generation pose challenges for the practical implementation of diffusion models. We propose a diffusion model watermarking method tailored for real-world deployment. Gaussian Shading++ not only maintains performance losslessness but also outperforms existing methods in terms of robustness.
arXiv Detail & Related papers (2025-04-21T11:18:16Z) - Watermarking Recommender Systems [52.207721219147814]
We introduce Autoregressive Out-of-distribution Watermarking (AOW), a novel technique tailored specifically for recommender systems.
Our approach entails selecting an initial item and querying it through the oracle model, followed by the selection of subsequent items with small prediction scores.
To assess the efficacy of the watermark, the model is tasked with predicting the subsequent item given a truncated watermark sequence.
arXiv Detail & Related papers (2024-07-17T06:51:24Z) - ModelShield: Adaptive and Robust Watermark against Model Extraction Attack [58.46326901858431]
Large language models (LLMs) demonstrate general intelligence across a variety of machine learning tasks. Adversaries can still utilize model extraction attacks to steal the model intelligence encoded in model generation. Watermarking technology offers a promising solution for defending against such attacks by embedding unique identifiers into the model-generated content.
arXiv Detail & Related papers (2024-05-03T06:41:48Z) - 3D Face Modeling via Weakly-supervised Disentanglement Network joint Identity-consistency Prior [62.80458034704989]
Generative 3D face models featuring disentangled controlling factors hold immense potential for diverse applications in computer vision and computer graphics.
Previous 3D face modeling methods face a challenge as they demand specific labels to effectively disentangle these factors.
This paper introduces a Weakly-Supervised Disentanglement Framework, denoted as WSDF, to facilitate the training of controllable 3D face models without an overly stringent labeling requirement.
arXiv Detail & Related papers (2024-04-25T11:50:47Z) - Functional Invariants to Watermark Large Transformers [30.598259061227594]
The rapid growth of transformer-based models increases concerns about their integrity and ownership assurance.
Watermarking addresses this issue by embedding a unique identifier into the model, while preserving its performance.
This paper explores watermarks with virtually no computational cost, applicable to a non-blind white-box setting.
arXiv Detail & Related papers (2023-10-17T17:56:18Z) - Safe and Robust Watermark Injection with a Single OoD Image [90.71804273115585]
Training a high-performance deep neural network requires large amounts of data and computational resources.
We propose a safe and robust backdoor-based watermark injection technique.
We induce random perturbation of model parameters during watermark injection to defend against common watermark removal attacks.
arXiv Detail & Related papers (2023-09-04T19:58:35Z) - Reversible Quantization Index Modulation for Static Deep Neural Network Watermarking [57.96787187733302]
Reversible data hiding (RDH) methods offer a potential solution, but existing approaches suffer from weaknesses in terms of usability, capacity, and fidelity.
We propose a novel RDH-based static DNN watermarking scheme using quantization index modulation (QIM); a generic sketch of plain QIM embedding appears after this list.
Our scheme incorporates a novel approach based on a one-dimensional quantizer for watermark embedding.
arXiv Detail & Related papers (2023-05-29T04:39:17Z) - A Systematic Review on Model Watermarking for Neural Networks [1.2691047660244335]
This work presents a taxonomy identifying and analyzing different classes of watermarking schemes for machine learning models.
It introduces a unified threat model to allow structured reasoning on and comparison of the effectiveness of watermarking methods.
It systematizes desired security requirements and attacks against ML model watermarking.
arXiv Detail & Related papers (2020-09-25T12:03:02Z)
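For readers unfamiliar with the primitive named in the Reversible Quantization Index Modulation entry above, here is a minimal sketch of plain (non-reversible) QIM: each host value is rounded onto one of two interleaved quantizer lattices depending on the bit to embed, and extraction checks which lattice the value lies closest to. The step size, function names, and use of raw weights as host values are illustrative assumptions, not that paper's reversible scheme.

```python
# Plain QIM embedding/extraction sketch (illustrative only).
import numpy as np

def qim_embed(host: np.ndarray, bits: np.ndarray, step: float = 0.05) -> np.ndarray:
    """Embed one bit per host value using two interleaved quantizer lattices."""
    offset = bits * (step / 2.0)  # lattice 0 at k*step, lattice 1 at k*step + step/2
    return np.round((host - offset) / step) * step + offset

def qim_extract(marked: np.ndarray, step: float = 0.05) -> np.ndarray:
    """Recover bits by checking which lattice each value is closer to."""
    d0 = np.abs(marked - np.round(marked / step) * step)
    d1 = np.abs(marked - (np.round((marked - step / 2.0) / step) * step + step / 2.0))
    return (d1 < d0).astype(int)

# Toy usage: embed a 128-bit message into stand-in static DNN weights.
rng = np.random.default_rng(0)
weights = rng.standard_normal(128)
bits = rng.integers(0, 2, size=128)
marked = qim_embed(weights, bits)
assert np.array_equal(qim_extract(marked), bits)
```

The distortion per value is bounded by half the quantization step, which is the usual robustness/fidelity trade-off knob in QIM-style schemes; the reversible variant described in the entry above additionally records the information needed to restore the original weights, which this sketch omits.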