AMUSE: Adaptive Multi-Segment Encoding for Dataset Watermarking
- URL: http://arxiv.org/abs/2403.05628v2
- Date: Thu, 18 Jul 2024 08:00:38 GMT
- Title: AMUSE: Adaptive Multi-Segment Encoding for Dataset Watermarking
- Authors: Saeed Ranjbar Alvar, Mohammad Akbari, David Ming Xuan Yue, Yong Zhang,
- Abstract summary: Watermarking techniques are used to store ownership information (i.e., a watermark) in the individual image samples.
Embedding the entire watermark into all samples leads to significant redundancy in the embedded information.
We propose a multi-segment encoding-decoding method for dataset watermarking (called AMUSE).
Our decoder is then used to reconstruct the original message from the extracted sub-messages.
- Score: 12.2352706636564
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Curating high-quality datasets that play a key role in the emergence of new AI applications requires considerable time, money, and computational resources. So, effective ownership protection of datasets is becoming critical. Recently, to protect the ownership of an image dataset, imperceptible watermarking techniques are used to store ownership information (i.e., watermark) into the individual image samples. Embedding the entire watermark into all samples leads to significant redundancy in the embedded information, which damages the watermarked dataset quality and extraction accuracy. In this paper, a multi-segment encoding-decoding method for dataset watermarking (called AMUSE) is proposed to adaptively map the original watermark into a set of shorter sub-messages and vice versa. Our message encoder is an adaptive method that adjusts the length of the sub-messages according to the protection requirements for the target dataset. Existing image watermarking methods are then employed to embed the sub-messages into the original images in the dataset and also to extract them from the watermarked images. Our decoder is then used to reconstruct the original message from the extracted sub-messages. The proposed encoder and decoder are plug-and-play modules that can easily be added to any watermarking method. To this end, extensive experiments are performed with multiple watermarking solutions, which show that applying AMUSE improves the overall message extraction accuracy by up to 28% for the same given dataset quality. Furthermore, the image dataset quality is enhanced by a PSNR of $\approx$2 dB on average, while improving the extraction accuracy for one of the tested image watermarking methods.
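The split-and-reconstruct idea in the abstract can be illustrated with a minimal sketch. This is not the AMUSE encoder or decoder: the fixed `segment_len`, the round-robin assignment of segments to images, the explicit segment index carried with each sub-message, and the majority-vote reconstruction are all assumptions made for illustration, and every function name here is hypothetical. The actual image watermarking step (embedding and extracting the sub-messages) is left out entirely.

```python
from collections import Counter


def split_message(bits: str, segment_len: int) -> list[tuple[int, str]]:
    """Split a bit string into (index, segment) pairs; the last segment is zero-padded."""
    if segment_len <= 0:
        raise ValueError("segment_len must be positive")
    segments = []
    for start in range(0, len(bits), segment_len):
        chunk = bits[start:start + segment_len].ljust(segment_len, "0")
        segments.append((start // segment_len, chunk))
    return segments


def assign_to_images(segments: list[tuple[int, str]], num_images: int) -> list[tuple[int, str]]:
    """Round-robin assignment (an assumption): image i carries segment i mod len(segments)."""
    return [segments[i % len(segments)] for i in range(num_images)]


def reconstruct(extracted: list[tuple[int, str]], num_segments: int, message_len: int) -> str:
    """Majority vote over the copies of each segment, then concatenate and trim padding."""
    votes = {k: Counter() for k in range(num_segments)}
    for idx, seg in extracted:
        if idx in votes:
            votes[idx][seg] += 1
    parts = []
    for k in range(num_segments):
        if not votes[k]:
            raise ValueError(f"no copy of segment {k} was extracted")
        parts.append(votes[k].most_common(1)[0][0])
    return "".join(parts)[:message_len]


# Hypothetical 16-bit watermark, split into 4-bit sub-messages over 12 images.
message = "1011001110001111"
segments = split_message(message, 4)
payloads = assign_to_images(segments, 12)

# Simulate one faulty extraction: the first image returns a wrong sub-message.
payloads[0] = (0, "0000")

recovered = reconstruct(payloads, len(segments), len(message))
```

Each image carries only 4 bits instead of 16 in this toy setup, which is the kind of redundancy reduction the abstract describes. How sub-message length is actually chosen, and how the decoder reassembles the message, is specified in the paper rather than here.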
Related papers
- TAG-WM: Tamper-Aware Generative Image Watermarking via Diffusion Inversion Sensitivity [68.95168727940973]
This paper proposes a Tamper-Aware Generative image WaterMarking method named TAG-WM.
arXiv Detail & Related papers (2025-06-30T03:14:07Z)
- Bridging Knowledge Gap Between Image Inpainting and Large-Area Visible Watermark Removal [57.84348166457113]
We introduce a novel feature adapting framework that leverages the representation capacity of a pre-trained image inpainting model.
Our approach bridges the knowledge gap between image inpainting and watermark removal by fusing information of the residual background content beneath watermarks into the inpainting backbone model.
For relieving the dependence on high-quality watermark masks, we introduce a new training paradigm by utilizing coarse watermark masks to guide the inference process.
arXiv Detail & Related papers (2025-04-07T02:37:14Z)
- DERMARK: A Dynamic, Efficient and Robust Multi-bit Watermark for Large Language Models [18.023143082876015]
We propose DERMARK, a dynamic, efficient, and robust multi-bit watermarking method.
DERMARK divides the text into segments of varying lengths for each bit embedding, adaptively matching the text's capacity.
It achieves this with negligible overhead and robust performance against text editing by minimizing watermark extraction loss.
arXiv Detail & Related papers (2025-02-04T11:23:49Z)
- On the Coexistence and Ensembling of Watermarks [93.15379331904602]
We find that various open-source watermarks can coexist with only minor impacts on image quality and decoding robustness.
We show how ensembling can increase the overall message capacity and enable new trade-offs between capacity, accuracy, robustness and image quality, without needing to retrain the base models.
arXiv Detail & Related papers (2025-01-29T00:37:06Z)
- WaterSeeker: Pioneering Efficient Detection of Watermarked Segments in Large Documents [65.11018806214388]
WaterSeeker is a novel approach to efficiently detect and locate watermarked segments amid extensive natural text.
It achieves a superior balance between detection accuracy and computational efficiency.
WaterSeeker's localization ability supports the development of interpretable AI detection systems.
arXiv Detail & Related papers (2024-09-08T14:45:47Z)
- Watermarking Language Models with Error Correcting Codes [41.21656847672627]
We propose a watermarking framework that encodes statistical signals through an error correcting code.
Our method, termed robust binary code (RBC) watermark, introduces no distortion compared to the original probability distribution.
Our empirical findings suggest our watermark is fast, powerful, and robust, comparing favorably to the state-of-the-art.
arXiv Detail & Related papers (2024-06-12T05:13:09Z)
- Finding needles in a haystack: A Black-Box Approach to Invisible Watermark Detection [68.90458499700038]
WaterMark Detection (WMD) is the first invisible watermark detection method under a black-box and annotation-free setting.
We develop WMD using foundations of offset learning, where a clean non-watermarked dataset enables us to isolate the influence of only watermarked samples.
arXiv Detail & Related papers (2024-03-23T23:22:54Z)
- Multi-Bit Distortion-Free Watermarking for Large Language Models [4.7381853007029475]
We extend an existing zero-bit distortion-free watermarking method by embedding multiple bits of meta-information as part of the watermark.
We also develop a computationally efficient decoder that extracts the embedded information from the watermark with low bit error rate.
arXiv Detail & Related papers (2024-02-26T14:01:34Z)
- RAW: A Robust and Agile Plug-and-Play Watermark Framework for AI-Generated Images with Provable Guarantees [33.61946642460661]
This paper introduces a robust and agile watermark detection framework, dubbed as RAW.
We employ a classifier that is jointly trained with the watermark to detect the presence of the watermark.
We show that the framework provides provable guarantees regarding the false positive rate for misclassifying a watermarked image.
arXiv Detail & Related papers (2024-01-23T22:00:49Z)
- ReMark: Receptive Field based Spatial WaterMark Embedding Optimization using Deep Network [23.357707056321534]
We investigate a novel deep learning-based architecture for embedding imperceptible watermarks.
The proposed method is robust against most common distortions on watermarks including collusive distortion.
arXiv Detail & Related papers (2023-05-11T13:21:29Z)
- Divided Attention: Unsupervised Multi-Object Discovery with Contextually Separated Slots [78.23772771485635]
We introduce a method to segment the visual field into independently moving regions, trained with no ground truth or supervision.
It consists of an adversarial conditional encoder-decoder architecture based on Slot Attention.
arXiv Detail & Related papers (2023-04-04T00:26:13Z)
- Did You Train on My Dataset? Towards Public Dataset Protection with Clean-Label Backdoor Watermarking [54.40184736491652]
We propose a backdoor-based watermarking approach that serves as a general framework for safeguarding publicly available data.
By inserting a small number of watermarking samples into the dataset, our approach enables the learning model to implicitly learn a secret function set by defenders.
This hidden function can then be used as a watermark to track down third-party models that use the dataset illegally.
arXiv Detail & Related papers (2023-03-20T21:54:30Z)
- Watermarking Images in Self-Supervised Latent Spaces [75.99287942537138]
We revisit watermarking techniques based on pre-trained deep networks, in the light of self-supervised approaches.
We present a way to embed both marks and binary messages into their latent spaces, leveraging data augmentation at marking time.
arXiv Detail & Related papers (2021-12-17T15:52:46Z)
- Split then Refine: Stacked Attention-guided ResUNets for Blind Single Image Visible Watermark Removal [69.92767260794628]
Previous watermark removal methods either require users to supply the watermark location or train a multi-task network to recover the background indiscriminately.
We propose a novel two-stage framework with a stacked attention-guided ResUNets to simulate the process of detection, removal and refinement.
We extensively evaluate our algorithm over four different datasets under various settings and the experiments show that our approach outperforms other state-of-the-art methods by a large margin.
arXiv Detail & Related papers (2020-12-13T09:05:37Z)
This list is automatically generated from the titles and abstracts of the papers on this site.