WaterVIB: Learning Minimal Sufficient Watermark Representations via Variational Information Bottleneck
- URL: http://arxiv.org/abs/2602.21508v1
- Date: Wed, 25 Feb 2026 02:38:17 GMT
- Title: WaterVIB: Learning Minimal Sufficient Watermark Representations via Variational Information Bottleneck
- Authors: Haoyuan He, Yu Zheng, Jie Zhou, Jiwen Lu,
- Abstract summary: Existing methods fail because they entangle the watermark with high-frequency cover texture, which is susceptible to being rewritten during generative purification.<n>We propose WaterVIB, a framework that reformulates the encoder as an information sieve via the Variational Information Bottleneck.<n>This effectively filters out redundant cover nuances prone to generative shifts, retaining only the essential signal invariant to regeneration.
- Score: 54.83612291453286
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Robust watermarking is critical for intellectual property protection, whereas existing methods face a severe vulnerability against regeneration-based AIGC attacks. We identify that existing methods fail because they entangle the watermark with high-frequency cover texture, which is susceptible to being rewritten during generative purification. To address this, we propose WaterVIB, a theoretically grounded framework that reformulates the encoder as an information sieve via the Variational Information Bottleneck. Instead of overfitting to fragile cover details, our approach forces the model to learn a Minimal Sufficient Statistic of the message. This effectively filters out redundant cover nuances prone to generative shifts, retaining only the essential signal invariant to regeneration. We theoretically prove that optimizing this bottleneck is a necessary condition for robustness against distribution-shifting attacks. Extensive experiments demonstrate that WaterVIB significantly outperforms state-of-the-art methods, achieving superior zero-shot resilience against unknown diffusion-based editing.
Related papers
- More Haste, Less Speed: Weaker Single-Layer Watermark Improves Distortion-Free Watermark Ensembles [58.941305935872265]
We show that strong watermarks significantly reduce the entropy of the token distribution.<n>We propose a framework that utilizes weaker single-layer watermarks to preserve the entropy required for effective multi-layer ensembling.
arXiv Detail & Related papers (2026-02-12T10:18:16Z) - StableGuard: Towards Unified Copyright Protection and Tamper Localization in Latent Diffusion Models [55.05404953041403]
We propose a novel framework that seamlessly integrates a binary watermark into the diffusion generation process.<n>We show that StableGuard consistently outperforms state-of-the-art methods in image fidelity, watermark verification, and tampering localization.
arXiv Detail & Related papers (2025-09-22T16:35:19Z) - Character-Level Perturbations Disrupt LLM Watermarks [64.60090923837701]
We formalize the system model for Large Language Model (LLM) watermarking.<n>We characterize two realistic threat models constrained on limited access to the watermark detector.<n>We demonstrate character-level perturbations are significantly more effective for watermark removal under the most restrictive threat model.<n> Experiments confirm the superiority of character-level perturbations and the effectiveness of the Genetic Algorithm (GA) in removing watermarks under realistic constraints.
arXiv Detail & Related papers (2025-09-11T02:50:07Z) - Semantic Watermarking Reinvented: Enhancing Robustness and Generation Quality with Fourier Integrity [31.666430190864947]
We propose a novel embedding method called Hermitian Symmetric Fourier Watermarking (SFW)<n>SFW maintains frequency integrity by enforcing Hermitian symmetry.<n>We introduce a center-aware embedding strategy that reduces the vulnerability of semantic watermarking due to cropping attacks.
arXiv Detail & Related papers (2025-09-09T12:15:16Z) - Watermarking Degrades Alignment in Language Models: Analysis and Mitigation [8.866121740748447]
This paper presents a systematic analysis of how two popular watermarking approaches-Gumbel and KGW-affect truthfulness, safety, and helpfulness.<n>We propose an inference-time sampling method that uses an external reward model to restore alignment.
arXiv Detail & Related papers (2025-06-04T21:29:07Z) - Theoretically Grounded Framework for LLM Watermarking: A Distribution-Adaptive Approach [53.32564762183639]
We introduce a novel, unified theoretical framework for watermarking Large Language Models (LLMs)<n>Our approach aims to maximize detection performance while maintaining control over the worst-case false positive rate (FPR) and distortion on text quality.<n>We propose a distortion-free, distribution-adaptive watermarking algorithm (DAWA) that leverages a surrogate model for model-agnosticism and efficiency.
arXiv Detail & Related papers (2024-10-03T18:28:10Z) - On the Weaknesses of Backdoor-based Model Watermarking: An Information-theoretic Perspective [39.676548104635096]
Safeguarding the intellectual property of machine learning models has emerged as a pressing concern in AI security.
Model watermarking is a powerful technique for protecting ownership of machine learning models.
We propose a novel model watermarking scheme, In-distribution Watermark Embedding (IWE), to overcome the limitations of existing method.
arXiv Detail & Related papers (2024-09-10T00:55:21Z) - Reliable Model Watermarking: Defending Against Theft without Compromising on Evasion [15.086451828825398]
evasion adversaries can readily exploit the shortcuts created by models memorizing watermark samples.
By learning the model to accurately recognize them, unique watermark behaviors are promoted through knowledge injection.
arXiv Detail & Related papers (2024-04-21T03:38:20Z) - Wide Flat Minimum Watermarking for Robust Ownership Verification of GANs [23.639074918667625]
We propose a novel multi-bit box-free watermarking method for GANs with improved robustness against white-box attacks.
The watermark is embedded by adding an extra watermarking loss term during GAN training.
We show that the presence of the watermark has a negligible impact on the quality of the generated images.
arXiv Detail & Related papers (2023-10-25T18:38:10Z) - Safe and Robust Watermark Injection with a Single OoD Image [90.71804273115585]
Training a high-performance deep neural network requires large amounts of data and computational resources.
We propose a safe and robust backdoor-based watermark injection technique.
We induce random perturbation of model parameters during watermark injection to defend against common watermark removal attacks.
arXiv Detail & Related papers (2023-09-04T19:58:35Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.