SEAL: Entangled White-box Watermarks on Low-Rank Adaptation
- URL: http://arxiv.org/abs/2501.09284v2
- Date: Fri, 17 Jan 2025 04:59:32 GMT
- Title: SEAL: Entangled White-box Watermarks on Low-Rank Adaptation
- Authors: Giyeong Oh, Saejin Kim, Woohyun Cho, Sangkyu Lee, Jiwan Chung, Dokyung Song, Youngjae Yu
- Abstract summary: SEAL embeds a secret, non-trainable matrix between trainable LoRA weights, serving as a passport to claim ownership. When applying SEAL, we observed no performance degradation across commonsense reasoning, textual/visual instruction tuning, and text-to-image synthesis tasks.
- Score: 14.478685983719128
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recently, LoRA and its variants have become the de facto strategy for training and sharing task-specific versions of large pretrained models, thanks to their efficiency and simplicity. However, the issue of copyright protection for LoRA weights, especially through watermark-based techniques, remains underexplored. To address this gap, we propose SEAL (SEcure wAtermarking on LoRA weights), a universal white-box watermarking scheme for LoRA. SEAL embeds a secret, non-trainable matrix between trainable LoRA weights, serving as a passport to claim ownership. SEAL then entangles the passport with the LoRA weights through training, without any extra loss term for entanglement, and distributes the finetuned weights after hiding the passport. When applying SEAL, we observed no performance degradation across commonsense reasoning, textual/visual instruction tuning, and text-to-image synthesis tasks. We demonstrate that SEAL is robust against a variety of known attacks: removal, obfuscation, and ambiguity attacks.
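The mechanism described in the abstract (a frozen passport matrix between the trainable LoRA factors, later folded away before release) can be sketched in a few lines of numpy. This is a minimal illustration, not the paper's implementation: the passport's exact shape and the hiding scheme are assumptions here (an invertible r x r passport, hidden via a matrix factorization folded into the shipped factors).

```python
import numpy as np

rng = np.random.default_rng(0)
d, k, r = 16, 16, 4          # output dim, input dim, LoRA rank

# Trainable LoRA factors (usual LoRA convention: delta_W = B @ A).
B = rng.normal(size=(d, r)) * 0.01
A = rng.normal(size=(r, k)) * 0.01

# Secret, non-trainable passport inserted between B and A (an invertible
# r x r matrix is one simple instantiation; the paper's choice may differ).
C = rng.normal(size=(r, r))

# Update used during training: delta_W = B @ C @ A, with C frozen.
delta_W = B @ C @ A

# "Hiding" the passport before release: split C = C1 @ C2 and fold the
# halves into the trainable weights, so only ordinary LoRA factors ship.
C1 = np.linalg.cholesky(C @ C.T)   # one illustrative split (assumption)
C2 = np.linalg.solve(C1, C)        # C1 @ C2 == C
B_released, A_released = B @ C1, C2 @ A

# The released adapter reproduces the trained update exactly.
assert np.allclose(B_released @ A_released, delta_W)
```

Ownership verification would then amount to exhibiting factors of the released weights that recover the secret C, which a party without the passport cannot plausibly produce.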
Related papers
- AuthenLoRA: Entangling Stylization with Imperceptible Watermarks for Copyright-Secure LoRA Adapters [52.556959321030966]
Low-Rank Adaptation (LoRA) offers an efficient paradigm for customizing diffusion models. Existing watermarking techniques either target base models or verify LoRA modules themselves. We propose AuthenLoRA, a unified watermarking framework that embeds imperceptible, traceable watermarks directly into the LoRA training process.
arXiv Detail & Related papers (2025-11-26T09:48:11Z) - LoRA Patching: Exposing the Fragility of Proactive Defenses against Deepfakes [4.217198925206348]
Low-Rank Adaptation (LoRA) patching injects a plug-and-play LoRA patch into Deepfake generators to bypass state-of-the-art defenses. A learnable gating mechanism adaptively controls the effect of the LoRA patch and prevents explosions during fine-tuning. With only 1,000 facial examples and a single epoch of fine-tuning, LoRA patching successfully defeats multiple preemptive defenses.
arXiv Detail & Related papers (2025-10-04T09:22:26Z) - Activated LoRA: Fine-tuned LLMs for Intrinsics [9.503174205896533]
Low-Rank Adaptation (LoRA) has emerged as a highly efficient framework for finetuning the weights of large foundation models.
We propose Activated LoRA (aLoRA), which modifies the LoRA framework to only adapt weights for the tokens in the sequence after the aLoRA is invoked.
This change crucially allows aLoRA to accept the base model's KV cache of the input string, meaning that aLoRA can be instantly activated whenever needed in a chain.
arXiv Detail & Related papers (2025-04-16T18:03:21Z) - RaSA: Rank-Sharing Low-Rank Adaptation [67.40422142257091]
Low-rank adaptation (LoRA) has been prominently employed for parameter-efficient fine-tuning of large language models (LLMs).
We introduce Rank-Sharing Low-Rank Adaptation (RaSA), an innovative extension that enhances the expressive capacity of LoRA by leveraging partial rank sharing across layers.
Our theoretically grounded and empirically validated approach demonstrates that RaSA not only maintains the core advantages of LoRA but also significantly boosts performance in challenging code and math tasks.
arXiv Detail & Related papers (2025-03-16T17:16:36Z) - LoRAGuard: An Effective Black-box Watermarking Approach for LoRAs [14.199095322820314]
We introduce LoRAGuard, a novel black-box watermarking technique for detecting unauthorized misuse of LoRAs.
LoRAGuard achieves nearly 100% watermark verification success and demonstrates strong effectiveness.
arXiv Detail & Related papers (2025-01-26T10:46:59Z) - Compress then Serve: Serving Thousands of LoRA Adapters with Little Overhead [41.31302904190149]
Fine-tuning large language models with low-rank adaptations (LoRAs) has become common practice, often yielding numerous copies of the same LLM differing only in their LoRA updates.
This paradigm presents challenges for systems that serve real-time responses to queries that each involve a different LoRA.
We propose a method for the joint compression of LoRAs into a shared basis paired with LoRA-specific scaling matrices.
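The idea of a shared basis with adapter-specific scaling matrices can be sketched with a toy SVD construction. This is an illustrative sketch only; the paper's actual compression objective and the structure of its scaling matrices may differ.

```python
import numpy as np

rng = np.random.default_rng(1)
d, k, r, n = 32, 32, 4, 5            # layer dims, per-adapter rank, adapter count

# Per-task LoRA updates delta_W_i = B_i @ A_i.
deltas = [rng.normal(size=(d, r)) @ rng.normal(size=(r, k)) for _ in range(n)]

# Shared column/row bases from SVDs of the stacked updates (one plausible
# construction). R = n * r spans the union of all adapters' subspaces,
# so reconstruction is exact in this toy setting.
R = n * r
U = np.linalg.svd(np.hstack(deltas), full_matrices=False)[0][:, :R]   # d x R
V = np.linalg.svd(np.vstack(deltas), full_matrices=False)[2][:R].T    # k x R

# Each adapter collapses to a small R x R matrix in the shared basis;
# the bases U and V are stored once and amortized over all adapters.
sigmas = [U.T @ dW @ V for dW in deltas]

# Reconstruction at serving time: dW_i ~= U @ Sigma_i @ V.T.
for dW, S in zip(deltas, sigmas):
    assert np.allclose(U @ S @ V.T, dW)
```

With thousands of adapters, choosing R well below n * r trades a small reconstruction error for a much lower memory footprint per adapter.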
arXiv Detail & Related papers (2024-06-17T15:21:35Z) - Safe LoRA: the Silver Lining of Reducing Safety Risks when Fine-tuning Large Language Models [51.20476412037321]
We propose Safe LoRA, a simple one-liner patch to the original LoRA implementation by introducing the projection of LoRA weights from selected layers to the safety-aligned subspace.
Our experiments demonstrate that when fine-tuning on purely malicious data, Safe LoRA retains similar safety performance as the original aligned model.
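The projection idea can be sketched as follows. This is a toy illustration under stated assumptions: the "alignment matrix" here is a random stand-in for the difference between aligned and unaligned base weights, and the projector formula P = C (C^T C)^{-1} C^T is one standard column-space projection, not necessarily the paper's exact construction.

```python
import numpy as np

rng = np.random.default_rng(2)
d, k, r = 32, 8, 4                   # toy layer dims and LoRA rank

# "Alignment matrix" for one layer: W_aligned - W_unaligned
# (random stand-in here).
C = rng.normal(size=(d, k))

# Projector onto the column space of C.
P = C @ np.linalg.inv(C.T @ C) @ C.T

# Project the LoRA update into the safety-aligned subspace. In the spirit
# of a one-liner patch, only B needs modifying: (P @ B) @ A == P @ (B @ A).
B = rng.normal(size=(d, r))
A = rng.normal(size=(r, k))
B_safe = P @ B
delta_W_safe = B_safe @ A

# P is idempotent, so re-projecting changes nothing.
assert np.allclose(P @ delta_W_safe, delta_W_safe)
```

The appeal of this formulation is that it leaves training untouched: the projection is applied post hoc to the finished LoRA weights of selected layers.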
arXiv Detail & Related papers (2024-05-27T05:04:05Z) - LoRA Learns Less and Forgets Less [25.09261710396838]
Low-Rank Adaptation (LoRA) is a widely-used parameter-efficient finetuning method for large language models.
We compare the performance of LoRA and full finetuning on two target domains, programming and mathematics.
arXiv Detail & Related papers (2024-05-15T19:27:45Z) - LoRA-as-an-Attack! Piercing LLM Safety Under The Share-and-Play Scenario [61.99243609126672]
We study how to inject a backdoor into the LoRA module and dive deeper into LoRA's infection mechanisms.
Our aim is to raise awareness of the potential risks under the emerging share-and-play scenario, so as to proactively prevent potential consequences caused by LoRA-as-an-Attack.
arXiv Detail & Related papers (2024-02-29T20:25:16Z) - ResLoRA: Identity Residual Mapping in Low-Rank Adaption [96.59370314485074]
We propose ResLoRA, an improved framework of low-rank adaptation (LoRA).
Our method can achieve better results in fewer training steps without any extra trainable parameters or inference cost compared to LoRA.
The experiments on NLG, NLU, and text-to-image tasks demonstrate the effectiveness of our method.
arXiv Detail & Related papers (2024-02-28T04:33:20Z) - LoRA-Flow: Dynamic LoRA Fusion for Large Language Models in Generative Tasks [72.88244322513039]
LoRA employs lightweight modules to customize large language models (LLMs) for each downstream task or domain.
We propose LoRA-Flow, which utilizes dynamic weights to adjust the impact of different LoRAs.
Experiments across six generative tasks demonstrate that our method consistently outperforms baselines with task-level fusion weights.
arXiv Detail & Related papers (2024-02-18T04:41:25Z) - DoRA: Weight-Decomposed Low-Rank Adaptation [57.68678247436207]
We introduce a novel weight decomposition analysis to investigate the inherent differences between FT and LoRA.
Aiming to resemble the learning capacity of FT from the findings, we propose Weight-Decomposed Low-Rank Adaptation (DoRA).
DoRA decomposes the pre-trained weight into two components, magnitude and direction, for fine-tuning.
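The magnitude/direction decomposition can be sketched in numpy. This is an illustrative sketch: the per-column norm decomposition matches the summary above, but the toy low-rank update and the rank r used here are assumptions, not the paper's configuration.

```python
import numpy as np

rng = np.random.default_rng(3)
d, k = 8, 8

# Pre-trained weight for one layer (random stand-in).
W0 = rng.normal(size=(d, k))

# Decompose into per-column magnitude m and unit-norm direction.
m = np.linalg.norm(W0, axis=0, keepdims=True)    # 1 x k magnitudes
direction = W0 / m                               # columns have norm 1

# The decomposition is exact at initialization.
assert np.allclose(m * direction, W0)

# During fine-tuning, the direction is updated with a low-rank term while
# the magnitudes are trained directly (toy update; r is an assumption).
r = 2
B, A = rng.normal(size=(d, r)) * 0.01, rng.normal(size=(r, k)) * 0.01
V = W0 + B @ A
W_new = m * (V / np.linalg.norm(V, axis=0, keepdims=True))
```

Because the direction is renormalized column-wise, the magnitudes of W_new stay exactly at m, decoupling how much each column changes in scale from how much it changes in orientation.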
arXiv Detail & Related papers (2024-02-14T17:59:34Z)
This list is automatically generated from the titles and abstracts of the papers in this site.