Pixel Seal: Adversarial-only training for invisible image and video watermarking
- URL: http://arxiv.org/abs/2512.16874v1
- Date: Thu, 18 Dec 2025 18:42:19 GMT
- Title: Pixel Seal: Adversarial-only training for invisible image and video watermarking
- Authors: Tomáš Souček, Pierre Fernandez, Hady Elsahar, Sylvestre-Alvise Rebuffi, Valeriu Lacatusu, Tuan Tran, Tom Sander, Alexandre Mourachko
- Abstract summary: Invisible watermarking is essential for tracing the provenance of digital content. Current approaches often struggle to balance robustness against true imperceptibility. This work introduces Pixel Seal, which sets a new state-of-the-art for image and video watermarking.
- Score: 43.360750005378954
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Invisible watermarking is essential for tracing the provenance of digital content. However, training state-of-the-art models remains notoriously difficult, with current approaches often struggling to balance robustness against true imperceptibility. This work introduces Pixel Seal, which sets a new state-of-the-art for image and video watermarking. We first identify three fundamental issues of existing methods: (i) the reliance on proxy perceptual losses such as MSE and LPIPS that fail to mimic human perception and result in visible watermark artifacts; (ii) the optimization instability caused by conflicting objectives, which necessitates exhaustive hyperparameter tuning; and (iii) reduced robustness and imperceptibility of watermarks when scaling models to high-resolution images and videos. To overcome these issues, we first propose an adversarial-only training paradigm that eliminates unreliable pixel-wise imperceptibility losses. Second, we introduce a three-stage training schedule that stabilizes convergence by decoupling robustness and imperceptibility. Third, we address the resolution gap via high-resolution adaptation, employing JND-based attenuation and training-time inference simulation to eliminate upscaling artifacts. We thoroughly evaluate the robustness and imperceptibility of Pixel Seal on different image types and across a wide range of transformations, and show clear improvements over the state-of-the-art. We finally demonstrate that the model efficiently adapts to video via temporal watermark pooling, positioning Pixel Seal as a practical and scalable solution for reliable provenance in real-world image and video settings.
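The abstract's video adaptation hinges on temporal watermark pooling: aggregating per-frame decoder evidence before thresholding the payload bits. The paper's code is not reproduced here; the following is a minimal, hypothetical sketch (the function names and the choice of mean pooling are assumptions, not the authors' implementation) of how pooling per-frame bit logits can stabilize extraction under per-frame noise:

```python
def temporal_pool(frame_logits):
    """Average per-frame bit logits across a clip (hypothetical mean pooling)."""
    n_frames = len(frame_logits)
    n_bits = len(frame_logits[0])
    return [sum(frame[i] for frame in frame_logits) / n_frames
            for i in range(n_bits)]

def decode_bits(pooled_logits, threshold=0.0):
    """Threshold pooled logits into watermark bits."""
    return [1 if v > threshold else 0 for v in pooled_logits]

# Example: three noisy frames carrying the payload 1,0,1,1. Decoding any
# single frame would flip at least one bit (e.g. frame 2, bit 0), but
# pooling across frames recovers the full payload.
frames = [
    [ 0.9, -0.2,  0.4,  1.1],
    [-0.1, -0.8,  0.7,  0.6],
    [ 0.5, -0.4, -0.2,  0.9],
]
bits = decode_bits(temporal_pool(frames))
```

Mean pooling is only one aggregation choice; a max- or confidence-weighted pool would slot into the same interface.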
Related papers
- Decoupling Defense Strategies for Robust Image Watermarking [13.474717200403147]
Deep learning-based image watermarking is vulnerable to adversarial and regeneration attacks. We propose AdvMark, a novel two-stage fine-tuning framework that decouples the defense strategies. We show AdvMark outperforms with the highest image quality and comprehensive robustness.
arXiv Detail & Related papers (2026-02-23T17:02:55Z) - TIACam: Text-Anchored Invariant Feature Learning with Auto-Augmentation for Camera-Robust Zero-Watermarking [0.5429166905724048]
TIACam is a text-anchored invariant feature learning framework with auto-augmentation for camera-robust zero-watermarking. Experiments on both synthetic and real-world camera captures demonstrate that TIACam achieves feature stability and watermark extraction accuracy.
arXiv Detail & Related papers (2026-02-21T15:06:16Z) - Semantic Watermarking Reinvented: Enhancing Robustness and Generation Quality with Fourier Integrity [31.666430190864947]
We propose a novel embedding method called Hermitian Symmetric Fourier Watermarking (SFW). SFW maintains frequency integrity by enforcing Hermitian symmetry. We introduce a center-aware embedding strategy that reduces the vulnerability of semantic watermarking to cropping attacks.
arXiv Detail & Related papers (2025-09-09T12:15:16Z) - WaterFlow: Learning Fast & Robust Watermarks using Stable Diffusion [46.10882190865747]
WaterFlow is a fast and extremely robust approach for high fidelity visual watermarking based on a learned latent-dependent watermark. WaterFlow demonstrates state-of-the-art performance on general robustness and is the first method capable of effectively defending against difficult combination attacks.
arXiv Detail & Related papers (2025-04-15T23:27:52Z) - Bridging Knowledge Gap Between Image Inpainting and Large-Area Visible Watermark Removal [57.84348166457113]
We introduce a novel feature adapting framework that leverages the representation capacity of a pre-trained image inpainting model. Our approach bridges the knowledge gap between image inpainting and watermark removal by fusing information of the residual background content beneath watermarks into the inpainting backbone model. To relieve the dependence on high-quality watermark masks, we introduce a new training paradigm that utilizes coarse watermark masks to guide the inference process.
arXiv Detail & Related papers (2025-04-07T02:37:14Z) - Towards Effective User Attribution for Latent Diffusion Models via Watermark-Informed Blending [54.26862913139299]
We introduce a novel framework, Towards Effective user Attribution for latent diffusion models via Watermark-Informed Blending (TEAWIB). TEAWIB incorporates a unique ready-to-use configuration approach that allows seamless integration of user-specific watermarks into generative models. Experiments validate the effectiveness of TEAWIB, showcasing state-of-the-art performance in perceptual quality and attribution accuracy.
arXiv Detail & Related papers (2024-09-17T07:52:09Z) - JIGMARK: A Black-Box Approach for Enhancing Image Watermarks against Diffusion Model Edits [76.25962336540226]
JIGMARK is a first-of-its-kind watermarking technique that enhances robustness through contrastive learning.
Our evaluation reveals that JIGMARK significantly surpasses existing watermarking solutions in resilience to diffusion-model edits.
arXiv Detail & Related papers (2024-06-06T03:31:41Z) - Normalizing Flow as a Flexible Fidelity Objective for Photo-Realistic Super-resolution [161.39504409401354]
Super-resolution is an ill-posed problem, where a ground-truth high-resolution image represents only one possibility in the space of plausible solutions.
Yet, the dominant paradigm is to employ pixel-wise losses, such as L_1, which drive the prediction towards a blurry average.
We address this issue by revisiting the L_1 loss and show that it corresponds to a one-layer conditional flow.
Inspired by this relation, we explore general flows as a fidelity-based alternative to the L_1 objective.
We demonstrate that the flexibility of deeper flows leads to better visual quality and consistency when combined with adversarial losses.
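The "blurry average" effect of pixel-wise losses noted in this summary (and in the Pixel Seal abstract's critique of MSE) can be illustrated with a toy computation not taken from either paper: when two ground-truth pixel values are equally plausible, the squared-error-optimal prediction is their mean, a value that matches neither solution (the L_1 case behaves analogously, yielding the median):

```python
# Two equally plausible ground-truth values for a single pixel.
candidates = [0.0, 1.0]

def expected_mse(pred):
    """Expected squared error of a prediction over the plausible solutions."""
    return sum((pred - c) ** 2 for c in candidates) / len(candidates)

# Search a grid of predictions in [0, 1]; the minimizer is the mean, 0.5,
# which matches neither plausible solution -- the "blurry average".
best = min((x / 100 for x in range(101)), key=expected_mse)
```

Applied per pixel over an entire image, this averaging is what washes out high-frequency detail and motivates flow-based or adversarial alternatives.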
arXiv Detail & Related papers (2021-11-05T17:56:51Z) - AdvFilter: Predictive Perturbation-aware Filtering against Adversarial Attack via Multi-domain Learning [17.95784884411471]
We propose predictive perturbation-aware pixel-wise filtering, where dual-perturbation filtering and an uncertainty-aware fusion module are employed.
We show advantages in enhancing CNNs' robustness, along with high generalization across different models and noise levels.
arXiv Detail & Related papers (2021-07-14T06:08:48Z) - Gated Fusion Network for Degraded Image Super Resolution [78.67168802945069]
We propose a dual-branch convolutional neural network to extract base features and recovered features separately.
By decomposing the feature extraction step into two task-independent streams, the dual-branch model can facilitate the training process.
arXiv Detail & Related papers (2020-03-02T13:28:32Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it provides and is not responsible for any consequences arising from its use.