InvZW: Invariant Feature Learning via Noise-Adversarial Training for Robust Image Zero-Watermarking
- URL: http://arxiv.org/abs/2506.20370v1
- Date: Wed, 25 Jun 2025 12:32:08 GMT
- Title: InvZW: Invariant Feature Learning via Noise-Adversarial Training for Robust Image Zero-Watermarking
- Authors: Abdullah All Tanvir, Xin Zhong
- Abstract summary: This paper introduces a novel deep learning framework for robust image zero-watermarking based on distortion-invariant feature learning. As a zero-watermarking scheme, our method leaves the original image unaltered and learns a reference signature through optimization in the feature space.
- Score: 1.4042211166197214
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper introduces a novel deep learning framework for robust image zero-watermarking based on distortion-invariant feature learning. As a zero-watermarking scheme, our method leaves the original image unaltered and learns a reference signature through optimization in the feature space. The proposed framework consists of two key modules. In the first module, a feature extractor is trained via noise-adversarial learning to generate representations that are both invariant to distortions and semantically expressive. This is achieved by combining adversarial supervision against a distortion discriminator and a reconstruction constraint to retain image content. In the second module, we design a learning-based multibit zero-watermarking scheme where the trained invariant features are projected onto a set of trainable reference codes optimized to match a target binary message. Extensive experiments on diverse image datasets and a wide range of distortions show that our method achieves state-of-the-art robustness in both feature stability and watermark recovery. Comparative evaluations against existing self-supervised and deep watermarking techniques further highlight the superiority of our framework in generalization and robustness.
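To make the two modules concrete, the following PyTorch sketch is one possible reading of the abstract, not the authors' implementation: the network sizes, the distortion function, and all names (FeatureExtractor, DistortionDiscriminator, fit_reference_codes, etc.) are illustrative assumptions.

```python
# Minimal sketch of the two modules described in the abstract (hypothetical shapes/names).
import torch
import torch.nn as nn
import torch.nn.functional as F


class FeatureExtractor(nn.Module):
    """Maps an image to a d-dimensional feature vector (trained to be distortion-invariant)."""
    def __init__(self, feat_dim=256):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.fc = nn.Linear(64, feat_dim)

    def forward(self, x):
        return self.fc(self.conv(x).flatten(1))


class DistortionDiscriminator(nn.Module):
    """Predicts whether a feature vector came from a clean or a distorted image."""
    def __init__(self, feat_dim=256):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(feat_dim, 128), nn.ReLU(), nn.Linear(128, 1))

    def forward(self, f):
        return self.net(f)


class Decoder(nn.Module):
    """Reconstructs the image from its feature, enforcing the content-retention constraint."""
    def __init__(self, feat_dim=256, img_size=64):
        super().__init__()
        self.img_size = img_size
        self.fc = nn.Linear(feat_dim, 3 * img_size * img_size)

    def forward(self, f):
        return torch.sigmoid(self.fc(f)).view(-1, 3, self.img_size, self.img_size)


def module1_losses(extractor, decoder, disc, x_clean, distort):
    """Module 1: noise-adversarial invariant feature learning."""
    x_noisy = distort(x_clean)                      # e.g. additive noise, blur, JPEG-like corruption
    f_clean, f_noisy = extractor(x_clean), extractor(x_noisy)
    # Discriminator step: tell clean features apart from distorted ones.
    d_clean, d_noisy = disc(f_clean.detach()), disc(f_noisy.detach())
    d_loss = (F.binary_cross_entropy_with_logits(d_clean, torch.ones_like(d_clean))
              + F.binary_cross_entropy_with_logits(d_noisy, torch.zeros_like(d_noisy)))
    # Extractor step: fool the discriminator (invariance) while preserving image content.
    adv = disc(f_noisy)
    adv_loss = F.binary_cross_entropy_with_logits(adv, torch.ones_like(adv))
    rec_loss = F.mse_loss(decoder(f_clean), x_clean)
    return d_loss, adv_loss + rec_loss


def fit_reference_codes(extractor, x_owner, message_bits, feat_dim=256, steps=500):
    """Module 2: learn reference codes so that sign(feature @ codes) matches the target bits.
    The protected image itself is never modified (zero-watermarking)."""
    codes = nn.Parameter(torch.randn(feat_dim, len(message_bits)))
    target = torch.tensor(message_bits, dtype=torch.float32)
    opt = torch.optim.Adam([codes], lr=1e-2)
    with torch.no_grad():
        f = extractor(x_owner)                      # invariant feature of the owner's image
    for _ in range(steps):
        logits = (f @ codes).squeeze(0)             # one logit per message bit
        loss = F.binary_cross_entropy_with_logits(logits, target)
        opt.zero_grad()
        loss.backward()
        opt.step()
    return codes.detach()


def recover_message(extractor, x_query, codes):
    """Verification: recover the multibit message from a possibly distorted query image."""
    with torch.no_grad():
        return (extractor(x_query) @ codes > 0).long().squeeze(0)
```

Under these assumptions, verification never touches the pixels of the protected image: a query image is passed through the frozen extractor, the sign of its projection onto the reference codes yields the recovered bits, and those bits are compared against the registered message.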
Related papers
- Bridging Knowledge Gap Between Image Inpainting and Large-Area Visible Watermark Removal [57.84348166457113]
We introduce a novel feature adapting framework that leverages the representation capacity of a pre-trained image inpainting model. Our approach bridges the knowledge gap between image inpainting and watermark removal by fusing information of the residual background content beneath watermarks into the inpainting backbone model. To relieve the dependence on high-quality watermark masks, we introduce a new training paradigm that uses coarse watermark masks to guide the inference process.
arXiv Detail & Related papers (2025-04-07T02:37:14Z)
- Text-Guided Image Invariant Feature Learning for Robust Image Watermarking [1.4042211166197214]
We propose a novel text-guided invariant feature learning framework for robust image watermarking. We evaluate the proposed method across multiple datasets, demonstrating superior robustness against various image transformations.
arXiv Detail & Related papers (2025-03-18T01:32:38Z)
- A Spitting Image: Modular Superpixel Tokenization in Vision Transformers [0.0]
Vision Transformer (ViT) architectures traditionally employ a grid-based approach to tokenization that is independent of the semantic content of an image.
We propose a modular superpixel tokenization strategy which decouples tokenization and feature extraction.
arXiv Detail & Related papers (2024-08-14T17:28:58Z)
- ZePo: Zero-Shot Portrait Stylization with Faster Sampling [61.14140480095604]
This paper presents an inversion-free portrait stylization framework based on diffusion models that accomplishes content and style feature fusion in merely four sampling steps.
We propose a feature merging strategy to amalgamate redundant features in Consistency Features, thereby reducing the computational load of attention control.
arXiv Detail & Related papers (2024-08-10T08:53:41Z)
- ConDL: Detector-Free Dense Image Matching [2.7582789611575897]
We introduce a deep-learning framework designed for estimating dense image correspondences.
Our fully convolutional model generates dense feature maps for images, where each pixel is associated with a descriptor that can be matched across multiple images.
arXiv Detail & Related papers (2024-08-05T18:34:15Z)
- Fine-grained Image-to-LiDAR Contrastive Distillation with Visual Foundation Models [55.99654128127689]
Visual Foundation Models (VFMs) are used to generate semantic labels for weakly-supervised pixel-to-point contrastive distillation. We adapt sampling probabilities of points to address imbalances in spatial distribution and category frequency. Our approach consistently surpasses existing image-to-LiDAR contrastive distillation methods in downstream tasks.
arXiv Detail & Related papers (2024-05-23T07:48:19Z)
- Cross-Image Attention for Zero-Shot Appearance Transfer [68.43651329067393]
We introduce a cross-image attention mechanism that implicitly establishes semantic correspondences across images.
We harness three mechanisms that either manipulate the noisy latent codes or the model's internal representations throughout the denoising process.
Experiments show that our method is effective across a wide range of object categories and is robust to variations in shape, size, and viewpoint.
arXiv Detail & Related papers (2023-11-06T18:33:24Z)
- Robust One-shot Segmentation of Brain Tissues via Image-aligned Style Transformation [13.430851964063534]
We propose a novel image-aligned style transformation to reinforce the dual-model iterative learning for one-shot segmentation of brain tissues.
Experimental results on two public datasets demonstrate 1) segmentation performance competitive with the fully-supervised method, and 2) superior performance over other state-of-the-art methods, with an increase in average Dice of up to 4.67%.
arXiv Detail & Related papers (2022-11-26T09:14:01Z)
- MAGIC: Mask-Guided Image Synthesis by Inverting a Quasi-Robust Classifier [37.774220727662914]
We propose a one-shot mask-guided image synthesis method that allows controlled manipulation of a single image.
Our proposed method, entitled MAGIC, leverages structured gradients from a pre-trained quasi-robust classifier.
MAGIC aggregates gradients over the input, driven by a guide binary mask that enforces a strong, spatial prior.
arXiv Detail & Related papers (2022-09-23T12:15:40Z)
- Learning Contrastive Representation for Semantic Correspondence [150.29135856909477]
We propose a multi-level contrastive learning approach for semantic matching.
We show that image-level contrastive learning is a key component to encourage the convolutional features to find correspondence between similar objects.
arXiv Detail & Related papers (2021-09-22T18:34:14Z)
- Towards Unsupervised Deep Image Enhancement with Generative Adversarial Network [92.01145655155374]
We present an unsupervised image enhancement generative adversarial network (UEGAN).
It learns the corresponding image-to-image mapping from a set of images with desired characteristics in an unsupervised manner.
Results show that the proposed model effectively improves the aesthetic quality of images.
arXiv Detail & Related papers (2020-12-30T03:22:46Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.