Purification-Agnostic Proxy Learning for Agentic Copyright Watermarking against Adversarial Evidence Forgery
- URL: http://arxiv.org/abs/2409.01541v1
- Date: Tue, 3 Sep 2024 02:18:45 GMT
- Title: Purification-Agnostic Proxy Learning for Agentic Copyright Watermarking against Adversarial Evidence Forgery
- Authors: Erjin Bao, Ching-Chun Chang, Hanrui Wang, Isao Echizen
- Abstract summary: Unauthorized use and illegal distribution of AI models pose serious threats to intellectual property.
Model watermarking has emerged as a key technique to address this issue.
This paper presents several contributions to model watermarking.
- Score: 8.695511322757262
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: With the proliferation of AI agents in various domains, protecting the ownership of AI models has become crucial due to the significant investment in their development. Unauthorized use and illegal distribution of these models pose serious threats to intellectual property, necessitating effective copyright protection measures. Model watermarking has emerged as a key technique to address this issue, embedding ownership information within models to assert rightful ownership during copyright disputes. This paper presents several contributions to model watermarking: a self-authenticating black-box watermarking protocol using hash techniques, a study on evidence forgery attacks using adversarial perturbations, a proposed defense involving a purification step to counter adversarial attacks, and a purification-agnostic proxy learning method to enhance watermark reliability and model performance. Experimental results demonstrate the effectiveness of these approaches in improving the security, reliability, and performance of watermarked models.
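The abstract names the components but not their mechanics. As a rough, non-authoritative sketch of what a hash-based, self-authenticating black-box verification could look like, the snippet below derives a trigger set and its target labels deterministically from a hash of the owner's secret, so the verification evidence is bound to the owner's identity; `derive_trigger_set`, `verify_black_box`, `owner_secret`, and the 0.9 match threshold are illustrative choices, not the paper's actual construction.

```python
import hashlib
import numpy as np

def derive_trigger_set(owner_secret: str, n_triggers: int = 32,
                       input_shape=(3, 32, 32), n_classes: int = 10):
    """Deterministically derive trigger inputs and target labels from a hash
    of the owner's secret, binding the verification evidence to the owner."""
    triggers, labels = [], []
    for i in range(n_triggers):
        digest = hashlib.sha256(f"{owner_secret}:{i}".encode()).digest()
        # Seed a PRNG with the hash so anyone holding the secret can
        # regenerate exactly the same trigger set at verification time.
        rng = np.random.default_rng(int.from_bytes(digest[:8], "big"))
        triggers.append(rng.random(input_shape, dtype=np.float32))
        labels.append(int.from_bytes(digest[8:10], "big") % n_classes)
    return np.stack(triggers), np.array(labels)

def verify_black_box(predict_fn, owner_secret: str, threshold: float = 0.9):
    """Black-box verification: query the suspect model on the hash-derived
    triggers and check whether the label agreement rate exceeds a threshold."""
    triggers, labels = derive_trigger_set(owner_secret)
    preds = np.array([predict_fn(x) for x in triggers])
    match_rate = float((preds == labels).mean())
    return match_rate >= threshold, match_rate
```

In this picture, adversarial evidence forgery would perturb inputs so that even a non-watermarked model appears to match the claimed labels, and the purification step studied in the paper is the proposed countermeasure to such perturbations.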
Related papers
- On the Weaknesses of Backdoor-based Model Watermarking: An Information-theoretic Perspective [39.676548104635096]
Safeguarding the intellectual property of machine learning models has emerged as a pressing concern in AI security.
Model watermarking is a powerful technique for protecting ownership of machine learning models.
We propose a novel model watermarking scheme, In-distribution Watermark Embedding (IWE), to overcome the limitations of existing methods.
arXiv Detail & Related papers (2024-09-10T00:55:21Z) - Robustness of Watermarking on Text-to-Image Diffusion Models [9.277492743469235]
We investigate the robustness of generative watermarking, which is created from the integration of watermarking embedding and text-to-image generation processing.
We found that generative watermarking methods are robust to direct evasion attacks, such as discriminator-based attacks and manipulation of edge information in edge-prediction-based attacks, but are vulnerable to malicious fine-tuning.
arXiv Detail & Related papers (2024-08-04T13:59:09Z) - Watermarking Recommender Systems [52.207721219147814]
We introduce Autoregressive Out-of-distribution Watermarking (AOW), a novel technique tailored specifically for recommender systems.
Our approach entails selecting an initial item and querying it through the oracle model, followed by the selection of subsequent items with small prediction scores.
To assess the efficacy of the watermark, the model is tasked with predicting the subsequent item given a truncated watermark sequence.
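As a minimal paraphrase of the AOW procedure described above, assuming the recommender exposes a `score_items(history)` function returning a score per catalogue item (an illustrative interface, not the paper's API), the construction and verification steps might look like:

```python
import numpy as np

def build_watermark_sequence(score_items, initial_item: int, length: int = 10):
    """Autoregressively build an out-of-distribution watermark sequence:
    start from an initial item, query the model, and repeatedly append the
    item with the smallest prediction score (an unlikely continuation)."""
    seq = [initial_item]
    for _ in range(length - 1):
        scores = np.asarray(score_items(seq), dtype=float)
        scores[seq] = np.inf               # avoid repeating items already chosen
        seq.append(int(scores.argmin()))
    return seq

def verify_watermark(score_items, watermark_seq, prefix_len: int = 5):
    """Verification: given a truncated prefix of the watermark sequence, a
    watermarked model should rank the memorized next item highest."""
    hits = 0
    for t in range(prefix_len, len(watermark_seq)):
        scores = np.asarray(score_items(watermark_seq[:t]), dtype=float)
        hits += int(scores.argmax() == watermark_seq[t])
    return hits / (len(watermark_seq) - prefix_len)
```

A clean model should rank these memorized continuations poorly (they were chosen precisely for having small scores), so a high hit rate is evidence that the suspect model was derived from the watermarked one.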
arXiv Detail & Related papers (2024-07-17T06:51:24Z) - EnTruth: Enhancing the Traceability of Unauthorized Dataset Usage in Text-to-image Diffusion Models with Minimal and Robust Alterations [73.94175015918059]
We introduce a novel approach, EnTruth, which Enhances Traceability of unauthorized dataset usage.
By strategically incorporating template memorization, EnTruth can trigger specific behavior in unauthorized models as evidence of infringement.
Our method is the first to investigate the positive application of memorization and use it for copyright protection, which turns a curse into a blessing.
arXiv Detail & Related papers (2024-06-20T02:02:44Z) - ModelShield: Adaptive and Robust Watermark against Model Extraction Attack [58.46326901858431]
Large language models (LLMs) demonstrate general intelligence across a variety of machine learning tasks.
However, adversaries can still use model extraction attacks to steal the model intelligence encoded in the generated content.
Watermarking technology offers a promising solution for defending against such attacks by embedding unique identifiers into the model-generated content.
arXiv Detail & Related papers (2024-05-03T06:41:48Z) - Reliable Model Watermarking: Defending Against Theft without Compromising on Evasion [15.086451828825398]
Evasion adversaries can readily exploit the shortcuts created by models memorizing watermark samples.
By training the model to accurately recognize watermark samples, unique watermark behaviors are promoted through knowledge injection.
arXiv Detail & Related papers (2024-04-21T03:38:20Z) - Performance-lossless Black-box Model Watermarking [69.22653003059031]
We propose a branch backdoor-based model watermarking protocol to protect model intellectual property.
In addition, we analyze the potential threats to the protocol and provide a secure and feasible watermarking instance for language models.
arXiv Detail & Related papers (2023-12-11T16:14:04Z) - Towards Robust Model Watermark via Reducing Parametric Vulnerability [57.66709830576457]
Backdoor-based ownership verification has recently become popular, allowing the model owner to watermark the model.
We propose a minimax formulation to find watermark-removed models and recover their watermark behavior.
Our method improves the robustness of the model watermarking against parametric changes and numerous watermark-removal attacks.
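The min-max idea can be pictured as an adversarial-weight-perturbation loop: an inner step perturbs the weights to erase the watermark, and an outer step updates the model so the watermark survives that perturbation. The PyTorch sketch below is a hedged reading of that formulation, with `inner_lr`, the single ascent step, and the equal loss weighting all being illustrative simplifications rather than the paper's exact algorithm.

```python
import torch
import torch.nn.functional as F

def minimax_watermark_step(model, clean_batch, trigger_batch, outer_opt, inner_lr=1e-2):
    """One sketched min-max step: inner ascent simulates watermark removal,
    outer descent recovers watermark behavior under that perturbation."""
    x, y = clean_batch
    xt, yt = trigger_batch

    # Inner maximization: a single gradient-ascent step on the trigger loss,
    # moving the weights toward a "watermark-removed" model.
    trigger_loss = F.cross_entropy(model(xt), yt)
    grads = torch.autograd.grad(trigger_loss, list(model.parameters()))
    backup = [p.detach().clone() for p in model.parameters()]
    with torch.no_grad():
        for p, g in zip(model.parameters(), grads):
            p.add_(inner_lr * g)

    # Outer minimization: at the perturbed point, penalize both the task loss
    # and the trigger loss so the watermark behavior is recovered.
    outer_opt.zero_grad()
    loss = F.cross_entropy(model(x), y) + F.cross_entropy(model(xt), yt)
    loss.backward()

    # Restore the unperturbed weights, then apply the update computed above.
    with torch.no_grad():
        for p, b in zip(model.parameters(), backup):
            p.copy_(b)
    outer_opt.step()
    return loss.item()
```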
arXiv Detail & Related papers (2023-09-09T12:46:08Z) - Safe and Robust Watermark Injection with a Single OoD Image [90.71804273115585]
Training a high-performance deep neural network requires large amounts of data and computational resources.
We propose a safe and robust backdoor-based watermark injection technique.
We induce random perturbation of model parameters during watermark injection to defend against common watermark removal attacks.
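One simple way to picture this defense is to add transient Gaussian noise to the weights while fine-tuning on the trigger set, so the watermark does not depend on a brittle weight configuration. The sketch below is one plausible instantiation under that assumption; `sigma` and the per-step noise schedule are illustrative, not the paper's settings.

```python
import torch
import torch.nn.functional as F

def noisy_injection_step(model, trigger_batch, optimizer, sigma=1e-3):
    """Fine-tune on watermark triggers while the weights are randomly
    perturbed, then undo the noise before applying the clean update."""
    xt, yt = trigger_batch

    # Temporarily add Gaussian noise to every parameter.
    noise = []
    with torch.no_grad():
        for p in model.parameters():
            n = sigma * torch.randn_like(p)
            p.add_(n)
            noise.append(n)

    optimizer.zero_grad()
    loss = F.cross_entropy(model(xt), yt)
    loss.backward()

    # Remove the noise so only the original weights receive the update.
    with torch.no_grad():
        for p, n in zip(model.parameters(), noise):
            p.sub_(n)
    optimizer.step()
    return loss.item()
```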
arXiv Detail & Related papers (2023-09-04T19:58:35Z) - Protecting the Intellectual Properties of Deep Neural Networks with an Additional Class and Steganographic Images [7.234511676697502]
We propose a method to protect the intellectual property of deep neural network (DNN) models by using an additional class and steganographic images.
We adopt the least significant bit (LSB) image steganography to embed users' fingerprints into watermark key images.
On Fashion-MNIST and CIFAR-10 datasets, the proposed method can obtain 100% watermark accuracy and 100% fingerprint authentication success rate.
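LSB steganography itself is standard: the fingerprint bits overwrite the lowest bit of the leading pixels of the watermark key image. A self-contained sketch follows (an illustrative encoding, not necessarily the paper's exact layout):

```python
import numpy as np

def embed_fingerprint_lsb(key_image: np.ndarray, fingerprint: bytes) -> np.ndarray:
    """Write the fingerprint bits into the least significant bit of the first
    len(fingerprint) * 8 values of a uint8 watermark key image."""
    flat = key_image.astype(np.uint8).ravel().copy()
    bits = np.unpackbits(np.frombuffer(fingerprint, dtype=np.uint8))
    if bits.size > flat.size:
        raise ValueError("fingerprint too large for this image")
    flat[:bits.size] = (flat[:bits.size] & 0xFE) | bits
    return flat.reshape(key_image.shape)

def extract_fingerprint_lsb(stego_image: np.ndarray, n_bytes: int) -> bytes:
    """Recover the fingerprint from the lowest bit of the leading pixels."""
    bits = stego_image.astype(np.uint8).ravel()[: n_bytes * 8] & 1
    return np.packbits(bits).tobytes()

# Round-trip example on a random 32x32 RGB key image:
key = np.random.randint(0, 256, (32, 32, 3), dtype=np.uint8)
stego = embed_fingerprint_lsb(key, b"user-42")
assert extract_fingerprint_lsb(stego, 7) == b"user-42"
```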
arXiv Detail & Related papers (2021-04-19T11:03:53Z) - A Systematic Review on Model Watermarking for Neural Networks [1.2691047660244335]
This work presents a taxonomy identifying and analyzing different classes of watermarking schemes for machine learning models.
It introduces a unified threat model to allow structured reasoning on and comparison of the effectiveness of watermarking methods.
It systematizes desired security requirements and attacks against ML model watermarking.
arXiv Detail & Related papers (2020-09-25T12:03:02Z)