WARDEN: Multi-Directional Backdoor Watermarks for Embedding-as-a-Service Copyright Protection
- URL: http://arxiv.org/abs/2403.01472v2
- Date: Sun, 9 Jun 2024 04:34:55 GMT
- Title: WARDEN: Multi-Directional Backdoor Watermarks for Embedding-as-a-Service Copyright Protection
- Authors: Anudeex Shetty, Yue Teng, Ke He, Qiongkai Xu,
- Abstract summary: We propose a new protocol to make the removal of watermarks more challenging by incorporating multiple possible watermark directions.
Our defense approach, WARDEN, notably increases the stealthiness of watermarks and has been empirically shown to be effective against CSE attack.
- Score: 7.660430606056949
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Embedding as a Service (EaaS) has become a widely adopted solution, which offers feature extraction capabilities for addressing various downstream tasks in Natural Language Processing (NLP). Prior studies have shown that EaaS can be prone to model extraction attacks; nevertheless, this concern could be mitigated by adding backdoor watermarks to the text embeddings and subsequently verifying the attack models post-publication. Through the analysis of the recent watermarking strategy for EaaS, EmbMarker, we design a novel CSE (Clustering, Selection, Elimination) attack that removes the backdoor watermark while maintaining the high utility of embeddings, indicating that the previous watermarking approach can be breached. In response to this new threat, we propose a new protocol to make the removal of watermarks more challenging by incorporating multiple possible watermark directions. Our defense approach, WARDEN, notably increases the stealthiness of watermarks and has been empirically shown to be effective against CSE attack.
Related papers
- Your Fixed Watermark is Fragile: Towards Semantic-Aware Watermark for EaaS Copyright Protection [5.2431999629987]
Embedding-as-a-Service (E) has emerged as a successful business pattern but faces significant challenges related to copyright infringement.
Various studies have proposed backdoor-based watermarking schemes to protect the copyright of E services.
In this paper, we reveal that previous watermarking schemes possess semantic-independent characteristics.
arXiv Detail & Related papers (2024-11-14T11:06:34Z) - ESpeW: Robust Copyright Protection for LLM-based EaaS via Embedding-Specific Watermark [50.08021440235581]
Embeds as a Service (Eding) is emerging as a crucial role in AI applications.
Eding is vulnerable to model extraction attacks, highlighting the urgent need for copyright protection.
We propose a novel embedding-specific watermarking (ESpeW) mechanism to offer robust copyright protection for Eding.
arXiv Detail & Related papers (2024-10-23T04:34:49Z) - WET: Overcoming Paraphrasing Vulnerabilities in Embeddings-as-a-Service with Linear Transformation Watermarks [28.992750031041744]
We show that existing E watermarks can be removed by paraphrasing when attackers clone the model.
We propose a novel watermarking technique that involves linearly transforming the embeddings.
arXiv Detail & Related papers (2024-08-29T18:59:56Z) - Certifiably Robust Image Watermark [57.546016845801134]
Generative AI raises many societal concerns such as boosting disinformation and propaganda campaigns.
Watermarking AI-generated content is a key technology to address these concerns.
We propose the first image watermarks with certified robustness guarantees against removal and forgery attacks.
arXiv Detail & Related papers (2024-07-04T17:56:04Z) - DIP-Watermark: A Double Identity Protection Method Based on Robust Adversarial Watermark [13.007649270429493]
Face Recognition (FR) systems pose privacy risks.
One countermeasure is adversarial attack, deceiving unauthorized malicious FR.
We propose the first double identity protection scheme based on traceable adversarial watermarking.
arXiv Detail & Related papers (2024-04-23T02:50:38Z) - DeepEclipse: How to Break White-Box DNN-Watermarking Schemes [60.472676088146436]
We present obfuscation techniques that significantly differ from the existing white-box watermarking removal schemes.
DeepEclipse can evade watermark detection without prior knowledge of the underlying watermarking scheme.
Our evaluation reveals that DeepEclipse excels in breaking multiple white-box watermarking schemes.
arXiv Detail & Related papers (2024-03-06T10:24:47Z) - Towards Robust Model Watermark via Reducing Parametric Vulnerability [57.66709830576457]
backdoor-based ownership verification becomes popular recently, in which the model owner can watermark the model.
We propose a mini-max formulation to find these watermark-removed models and recover their watermark behavior.
Our method improves the robustness of the model watermarking against parametric changes and numerous watermark-removal attacks.
arXiv Detail & Related papers (2023-09-09T12:46:08Z) - Exploring Structure Consistency for Deep Model Watermarking [122.38456787761497]
The intellectual property (IP) of Deep neural networks (DNNs) can be easily stolen'' by surrogate model attack.
We propose a new watermarking methodology, namely structure consistency'', based on which a new deep structure-aligned model watermarking algorithm is designed.
arXiv Detail & Related papers (2021-08-05T04:27:15Z) - Fine-tuning Is Not Enough: A Simple yet Effective Watermark Removal
Attack for DNN Models [72.9364216776529]
We propose a novel watermark removal attack from a different perspective.
We design a simple yet powerful transformation algorithm by combining imperceptible pattern embedding and spatial-level transformations.
Our attack can bypass state-of-the-art watermarking solutions with very high success rates.
arXiv Detail & Related papers (2020-09-18T09:14:54Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.