TokenMark: A Modality-Agnostic Watermark for Pre-trained Transformers
- URL: http://arxiv.org/abs/2403.05842v3
- Date: Sat, 26 Apr 2025 08:10:01 GMT
- Title: TokenMark: A Modality-Agnostic Watermark for Pre-trained Transformers
- Authors: Hengyuan Xu, Liyao Xiang, Borui Yang, Xingjun Ma, Siheng Chen, Baochun Li
- Abstract summary: TokenMark is a modality-agnostic, robust watermarking system for pre-trained models. It embeds the watermark by fine-tuning the pre-trained model on a set of specifically permuted data samples. It significantly improves the robustness, efficiency, and universality of model watermarking.
- Score: 67.57928750537185
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Watermarking is a critical tool for model ownership verification. However, existing watermarking techniques are often designed for specific data modalities and downstream tasks, without considering the inherent architectural properties of the model. This lack of generality and robustness underscores the need for a more versatile watermarking approach. In this work, we investigate the properties of Transformer models and propose TokenMark, a modality-agnostic, robust watermarking system for pre-trained models, leveraging the permutation equivariance property. TokenMark embeds the watermark by fine-tuning the pre-trained model on a set of specifically permuted data samples, resulting in a watermarked model that contains two distinct sets of weights -- one for normal functionality and the other for watermark extraction, the latter triggered only by permuted inputs. Extensive experiments on state-of-the-art pre-trained models demonstrate that TokenMark significantly improves the robustness, efficiency, and universality of model watermarking, highlighting its potential as a unified watermarking solution.
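The property TokenMark builds on, the permutation equivariance of self-attention without positional encodings, can be checked directly. The following is a minimal PyTorch sketch of that property, not the authors' code; TokenMark fine-tunes the model so that a secret input permutation triggers watermark extraction while unpermuted inputs retain normal functionality.

```python
import torch

torch.manual_seed(0)
attn = torch.nn.MultiheadAttention(embed_dim=64, num_heads=4, batch_first=True)
x = torch.randn(1, 10, 64)        # (batch, tokens, dim), no positional encoding
perm = torch.randperm(10)         # a secret token permutation

out, _ = attn(x, x, x)                                   # normal forward pass
out_perm, _ = attn(x[:, perm], x[:, perm], x[:, perm])   # permuted forward pass

# Self-attention is permutation equivariant: permuting the input tokens
# permutes the output tokens in exactly the same way.
assert torch.allclose(out[:, perm], out_perm, atol=1e-5)
```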
Related papers
- Task-Agnostic Language Model Watermarking via High Entropy Passthrough Layers [11.089926858383476]
We propose model watermarking via passthrough layers, which are added to existing pre-trained networks.
Our method is fully task-agnostic, and can be applied to both classification and sequence-to-sequence tasks.
We show our method is robust to downstream fine-tuning, fine-pruning, and layer-removal attacks.
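One plausible reading of a passthrough layer is an extra block initialized to the identity map, so that inserting it leaves the host network unchanged until it is trained to carry the watermark; the sketch below assumes this reading, and the paper's exact construction (including its high-entropy aspect) may differ.

```python
import torch

class PassthroughLayer(torch.nn.Module):
    """Extra layer initialized to the identity map, so inserting it leaves the
    host network's outputs unchanged until it is trained to carry the watermark."""
    def __init__(self, dim: int):
        super().__init__()
        self.proj = torch.nn.Linear(dim, dim)
        torch.nn.init.eye_(self.proj.weight)   # exact identity at insertion time
        torch.nn.init.zeros_(self.proj.bias)

    def forward(self, x):
        return self.proj(x)

# Slot the layer into an existing (here, toy) network without changing behavior.
host = torch.nn.Sequential(
    torch.nn.Linear(8, 16), PassthroughLayer(16), torch.nn.ReLU(), torch.nn.Linear(16, 2))
```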
arXiv Detail & Related papers (2024-12-17T05:46:50Z)
- SleeperMark: Towards Robust Watermark against Fine-Tuning Text-to-image Diffusion Models [77.80595722480074]
SleeperMark is a framework designed to embed resilient watermarks into T2I diffusion models.
It guides the model to disentangle the watermark information from the semantic concepts it learns.
Our experiments demonstrate the effectiveness of SleeperMark across various types of diffusion models.
arXiv Detail & Related papers (2024-12-06T08:44:18Z)
- Trigger-Based Fragile Model Watermarking for Image Transformation Networks [2.38776871944507]
In fragile watermarking, a sensitive watermark is embedded in an object such that the watermark breaks upon tampering.
We introduce a novel, trigger-based fragile model watermarking system for image transformation/generation networks.
Our approach, distinct from robust watermarking, effectively verifies the model's source and integrity across various datasets and attacks.
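A minimal sketch of the fragile-verification idea, with a hypothetical secret trigger and a stored response signature; any weight change breaks the match:

```python
import torch

def verify_fragile_watermark(model, trigger, signature, tol=1e-4):
    """The watermark passes only while the model's response to the secret
    trigger still matches the stored signature; fine-tuning, pruning, or
    quantization perturbs the weights and breaks the match."""
    with torch.no_grad():
        return bool(torch.allclose(model(trigger), signature, atol=tol))

model = torch.nn.Linear(4, 4)                 # toy stand-in for the network
trigger = torch.randn(1, 4)                   # secret trigger input
signature = model(trigger).detach()           # response recorded at embedding time
assert verify_fragile_watermark(model, trigger, signature)
```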
arXiv Detail & Related papers (2024-09-28T19:34:55Z)
- Towards Effective User Attribution for Latent Diffusion Models via Watermark-Informed Blending [54.26862913139299]
We introduce a novel framework Towards Effective user Attribution for latent diffusion models via Watermark-Informed Blending (TEAWIB)
TEAWIB incorporates a unique ready-to-use configuration approach that allows seamless integration of user-specific watermarks into generative models.
Experiments validate the effectiveness of TEAWIB, showcasing the state-of-the-art performance in perceptual quality and attribution accuracy.
arXiv Detail & Related papers (2024-09-17T07:52:09Z)
- Open-Set Deepfake Detection: A Parameter-Efficient Adaptation Method with Forgery Style Mixture [58.60915132222421]
We introduce an approach that is both general and parameter-efficient for face forgery detection.
We design a forgery-style mixture formulation that augments the diversity of forgery source domains.
We show that the designed model achieves state-of-the-art generalizability with significantly reduced trainable parameters.
arXiv Detail & Related papers (2024-08-23T01:53:36Z)
- Watermarking Recommender Systems [52.207721219147814]
We introduce Autoregressive Out-of-distribution Watermarking (AOW), a novel technique tailored specifically for recommender systems.
Our approach entails selecting an initial item and querying it through the oracle model, followed by the selection of subsequent items with small prediction scores.
To assess the efficacy of the watermark, the model is tasked with predicting the subsequent item given a truncated watermark sequence.
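A toy sketch of the sequence construction described above, with a stand-in scorer in place of the oracle recommender (the item embeddings and scorer are hypothetical, for illustration only):

```python
import torch

torch.manual_seed(0)
N_ITEMS = 100
item_emb = torch.randn(N_ITEMS, 16)   # toy item embeddings (stand-in for the oracle)

def predict_scores(seq: list[int]) -> torch.Tensor:
    """Toy scorer: score every candidate item against the current sequence."""
    context = item_emb[seq].mean(dim=0)
    return item_emb @ context

def build_watermark_sequence(seed_item: int, length: int = 8) -> list[int]:
    """Grow the watermark autoregressively: at each step append the item with
    the lowest prediction score, giving an out-of-distribution chain that a
    non-watermarked model is unlikely to reproduce."""
    seq = [seed_item]
    for _ in range(length - 1):
        scores = predict_scores(seq)
        scores[torch.tensor(seq)] = float("inf")  # never repeat an item
        seq.append(int(scores.argmin()))
    return seq

print(build_watermark_sequence(seed_item=0))
```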
arXiv Detail & Related papers (2024-07-17T06:51:24Z)
- EMR-Merging: Tuning-Free High-Performance Model Merging [55.03509900949149]
We show that Elect, Mask & Rescale-Merging (EMR-Merging) achieves outstanding performance compared to existing merging methods.
EMR-Merging is tuning-free, thus requiring no data availability or any additional training while showing impressive performance.
arXiv Detail & Related papers (2024-05-23T05:25:45Z)
- Provable Adversarial Robustness for Group Equivariant Tasks: Graphs, Point Clouds, Molecules, and More [9.931513542441612]
We propose a sound notion of adversarial robustness that accounts for task equivariance.
Certification methods are, however, unavailable for many models.
We derive the first architecture-specific graph edit distance certificates, i.e., sound robustness guarantees for isomorphism-equivariant tasks such as node classification.
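Roughly, the equivariance-aware notion requires correctness only up to the group action. The formalization below is a paraphrase in my own notation, not the paper's definition, for a group G acting on both inputs and outputs:

```latex
% f is robust at x (up to the action of G) within budget \epsilon if
\min_{g \in G} d\!\left(x',\, g \cdot x\right) \le \epsilon
\;\Longrightarrow\;
\exists\, g \in G :\; f(x') = g \cdot f(x).
```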
arXiv Detail & Related papers (2023-12-05T12:09:45Z)
- Wide Flat Minimum Watermarking for Robust Ownership Verification of GANs [23.639074918667625]
We propose a novel multi-bit box-free watermarking method for GANs with improved robustness against white-box attacks.
The watermark is embedded by adding an extra watermarking loss term during GAN training.
We show that the presence of the watermark has a negligible impact on the quality of the generated images.
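A hedged sketch of such a combined objective; the bit decoder here is a toy stand-in, and the paper's box-free extraction mechanism may differ:

```python
import torch

# Toy stand-in for a watermark decoder that reads bits from generated images.
decode_bits = torch.nn.Sequential(
    torch.nn.Flatten(), torch.nn.Linear(3 * 32 * 32, 32))

def generator_loss(adv_loss, fake_images, target_bits, lam=0.1):
    """Total generator objective: adversarial loss plus a watermark term that
    asks the decoder to recover the owner's bits; lam trades off image quality
    against watermark strength."""
    logits = decode_bits(fake_images)
    wm_loss = torch.nn.functional.binary_cross_entropy_with_logits(logits, target_bits)
    return adv_loss + lam * wm_loss

bits = torch.randint(0, 2, (4, 32)).float()
loss = generator_loss(torch.tensor(1.0), torch.randn(4, 3, 32, 32), bits)
```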
arXiv Detail & Related papers (2023-10-25T18:38:10Z)
- ClearMark: Intuitive and Robust Model Watermarking via Transposed Model Training [50.77001916246691]
This paper introduces ClearMark, the first DNN watermarking method designed for intuitive human assessment.
ClearMark embeds visible watermarks, enabling human decision-making without rigid value thresholds.
It shows an 8,544-bit watermark capacity comparable to the strongest existing work.
arXiv Detail & Related papers (2023-10-25T08:16:55Z)
- Functional Invariants to Watermark Large Transformers [30.598259061227594]
The rapid growth of transformer-based models raises concerns about their integrity and ownership assurance.
Watermarking addresses this issue by embedding a unique identifier into the model, while preserving its performance.
This paper explores watermarks with virtually no computational cost, applicable to a non-blind white-box setting.
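Neuron permutation is a classic invariant of this kind: permuting one layer's hidden units and un-permuting them in the next leaves the network's function intact while hiding a secret key in the weights. A minimal check (illustrative, not the paper's code):

```python
import torch

torch.manual_seed(0)
mlp = torch.nn.Sequential(
    torch.nn.Linear(8, 16), torch.nn.ReLU(), torch.nn.Linear(16, 4))
x = torch.randn(5, 8)
y = mlp(x)

# Permute the hidden units: reorder the first layer's rows (and bias) and the
# second layer's matching columns. The function is unchanged, but the weights
# now carry a secret permutation that an owner can later test for.
perm = torch.randperm(16)
with torch.no_grad():
    mlp[0].weight.copy_(mlp[0].weight[perm])
    mlp[0].bias.copy_(mlp[0].bias[perm])
    mlp[2].weight.copy_(mlp[2].weight[:, perm])

assert torch.allclose(y, mlp(x), atol=1e-5)
```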
arXiv Detail & Related papers (2023-10-17T17:56:18Z)
- Unbiased Watermark for Large Language Models [67.43415395591221]
This study examines how significantly watermarks impact the quality of model-generated outputs.
It is possible to integrate watermarks without affecting the output probability distribution.
The presence of watermarks does not compromise the performance of the model in downstream tasks.
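For intuition, here is one standard distribution-preserving construction, keyed Gumbel-max sampling, which is not necessarily this paper's reweighting scheme: averaged over random keys, the sampled token is distributed exactly as softmax(logits).

```python
import torch

def watermarked_sample(logits: torch.Tensor, key: int, step: int) -> int:
    """Keyed Gumbel-max sampling: the secret key seeds the noise, so the owner
    can later detect the watermark, yet over random keys the sampled token is
    distributed exactly as softmax(logits)."""
    g = torch.Generator().manual_seed(hash((key, step)) % (2**31))
    u = torch.rand(logits.shape, generator=g)
    gumbel = -torch.log(-torch.log(u))
    return int((logits + gumbel).argmax())

print(watermarked_sample(torch.randn(50), key=42, step=0))
```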
arXiv Detail & Related papers (2023-09-22T12:46:38Z)
- Watermarking for Out-of-distribution Detection [76.20630986010114]
Out-of-distribution (OOD) detection aims to identify OOD data based on representations extracted from well-trained deep models.
This paper proposes a general methodology named watermarking.
We learn a unified pattern that is superimposed onto features of original data, and the model's detection capability is largely boosted after watermarking.
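A minimal sketch of the superimposed-pattern idea, assuming the pattern is a learnable tensor added to every input while the detector model stays frozen (shapes and clamping are illustrative):

```python
import torch

pattern = torch.nn.Parameter(torch.zeros(3, 32, 32))  # the learnable unified pattern

def watermark_inputs(x: torch.Tensor) -> torch.Tensor:
    """Superimpose the same learned pattern on every input; `pattern` is then
    optimized so the frozen model's scores better separate in- from
    out-of-distribution data."""
    return (x + pattern).clamp(0.0, 1.0)

x_wm = watermark_inputs(torch.rand(4, 3, 32, 32))
```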
arXiv Detail & Related papers (2022-10-27T06:12:32Z)
- Certifying Model Accuracy under Distribution Shifts [151.67113334248464]
We present provable robustness guarantees on the accuracy of a model under bounded Wasserstein shifts of the data distribution.
We show that a simple procedure that randomizes the input of the model within a transformation space is provably robust to distributional shifts under the transformation.
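A hedged sketch of the randomize-and-vote procedure the certificate is built on, with a toy model and transformation space:

```python
import torch

def smoothed_predict(model, x, sample_transform, n=100):
    """Classify many randomly transformed copies of the input and majority-vote;
    the vote margin is the quantity the shift certificate is derived from."""
    with torch.no_grad():
        preds = [int(model(sample_transform(x)).argmax()) for _ in range(n)]
    return max(set(preds), key=preds.count)

model = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(3 * 8 * 8, 10))
noise = lambda x: x + 0.1 * torch.randn_like(x)   # toy transformation space
print(smoothed_predict(model, torch.rand(1, 3, 8, 8), noise))
```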
arXiv Detail & Related papers (2022-01-28T22:03:50Z)
- Characterizing and Taming Model Instability Across Edge Devices [4.592454933053539]
This paper presents the first methodical characterization of the variations in model prediction across real-world mobile devices.
We introduce a new metric, instability, which captures this variation.
In experiments, 14-17% of images produced divergent classifications across one or more phone models.
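A minimal reading of the instability metric, the fraction of inputs whose predictions diverge across devices:

```python
def instability(preds_per_device):
    """Fraction of inputs whose predicted class differs across devices."""
    n = len(preds_per_device[0])
    divergent = sum(
        1 for i in range(n)
        if len({preds[i] for preds in preds_per_device}) > 1)
    return divergent / n

# Two devices, four images: the last two disagree, so instability is 0.5.
print(instability([[0, 1, 2, 3], [0, 1, 5, 4]]))
```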
arXiv Detail & Related papers (2020-10-18T16:52:06Z)
- AvgOut: A Simple Output-Probability Measure to Eliminate Dull Responses [97.50616524350123]
We build dialogue models that are dynamically aware of which utterances or tokens are dull, without any feature engineering.
The first model, MinAvgOut, directly maximizes the diversity score through the output distributions of each batch.
The second model, Label Fine-Tuning (LFT), prepends to the source sequence a label continuously scaled by the diversity score to control the diversity level.
The third model, RL, adopts Reinforcement Learning and treats the diversity score as a reward signal.
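A hedged proxy for such a batch-level diversity score, assuming it is computed from the batch-averaged output distribution (the paper's exact definition may differ):

```python
import torch

def batch_diversity_score(probs: torch.Tensor) -> torch.Tensor:
    """Entropy of the batch-averaged output distribution: batches dominated by
    dull tokens concentrate the average and score low."""
    avg = probs.mean(dim=0)                              # (vocab,)
    return -(avg * avg.clamp_min(1e-12).log()).sum()

probs = torch.softmax(torch.randn(16, 1000), dim=-1)     # toy batch of distributions
print(batch_diversity_score(probs))
```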
arXiv Detail & Related papers (2020-01-15T18:32:06Z)