LoRAShield: Data-Free Editing Alignment for Secure Personalized LoRA Sharing
- URL: http://arxiv.org/abs/2507.07056v1
- Date: Sat, 05 Jul 2025 02:53:17 GMT
- Title: LoRAShield: Data-Free Editing Alignment for Secure Personalized LoRA Sharing
- Authors: Jiahao Chen, Junhao Li, Yiming Wang, Zhe Ma, Yi Jiang, Chunyi Zhou, Qingming Li, Tianyu Du, Shouling Ji
- Abstract summary: Low-Rank Adaptation (LoRA) models can be shared on platforms like Civitai and Liblib. LoRAShield is the first data-free editing framework for securing LoRA models against misuse.
- Score: 43.88211522311429
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The proliferation of Low-Rank Adaptation (LoRA) models has democratized personalized text-to-image generation, enabling users to share lightweight models (e.g., personal portraits) on platforms like Civitai and Liblib. However, this "share-and-play" ecosystem introduces critical risks: benign LoRAs can be weaponized by adversaries to generate harmful content (e.g., political, defamatory imagery), undermining creator rights and platform safety. Existing defenses like concept-erasure methods focus on full diffusion models (DMs), neglecting LoRA's unique role as a modular adapter and its vulnerability to adversarial prompt engineering. To bridge this gap, we propose LoRAShield, the first data-free editing framework for securing LoRA models against misuse. Our platform-driven approach dynamically edits and realigns LoRA's weight subspace via adversarial optimization and semantic augmentation. Experimental results demonstrate that LoRAShield achieves remarkable effectiveness, efficiency, and robustness in blocking malicious generations without sacrificing the functionality of the benign task. By shifting the defense to platforms, LoRAShield enables secure, scalable sharing of personalized models, a critical step toward trustworthy generative ecosystems.
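The abstract names the mechanism (realigning the LoRA weight subspace) but not its details. As a rough, data-free illustration of the idea, not the paper's adversarial-optimization-plus-semantic-augmentation procedure, the sketch below edits a LoRA's down-projection so the adapter becomes inert along embeddings of blocked prompts; the plain orthogonal projection and all variable names are assumptions.

```python
import torch

def realign_lora(A: torch.Tensor, blocked: torch.Tensor) -> torch.Tensor:
    """Project the LoRA down-projection A (r x d_in) onto the orthogonal
    complement of the blocked concept directions (k x d_in)."""
    Q, _ = torch.linalg.qr(blocked.T)   # orthonormal basis of the blocked span, d_in x k
    return A - (A @ Q) @ Q.T            # A' = A (I - Q Q^T)

d_in, d_out, r, k = 768, 768, 8, 4
A, B = torch.randn(r, d_in), torch.randn(d_out, r)
blocked = torch.randn(k, d_in)          # e.g., text embeddings of disallowed prompts
A_safe = realign_lora(A, blocked)
# The edited update B @ A_safe maps every blocked direction to ~0, while
# directions orthogonal to them (the benign task) pass through unchanged.
print(torch.norm(B @ A_safe @ blocked.T))   # ≈ 0
```

Because such an edit touches only the adapter's weights and needs no training data, a platform could apply it at upload time, which is the platform-driven deployment model the paper argues for.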
Related papers
- Cross-LoRA: A Data-Free LoRA Transfer Framework across Heterogeneous LLMs [10.218401136555064]
Cross-LoRA is a framework for transferring LoRA modules between diverse base models. Experiments show that Cross-LoRA achieves relative gains of up to 5.26% over base models.
arXiv Detail & Related papers (2025-08-07T10:21:08Z)
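The Cross-LoRA summary gives no mechanism for moving an adapter between heterogeneous bases. One plausible data-free primitive, shown purely as an assumption rather than the paper's actual procedure, is to align the dominant singular subspaces of corresponding base-weight matrices and transport the low-rank update through that alignment:

```python
import torch

def align_delta(delta_src: torch.Tensor, W_src: torch.Tensor,
                W_tgt: torch.Tensor, k: int = 64) -> torch.Tensor:
    """Transport a dense LoRA update through the top-k singular subspaces
    of the source and target base weight matrices."""
    Us, _, Vhs = torch.linalg.svd(W_src, full_matrices=False)
    Ut, _, Vht = torch.linalg.svd(W_tgt, full_matrices=False)
    P = Ut[:, :k] @ Us[:, :k].T     # output-space alignment (d_tgt x d_src)
    Q = Vhs[:k].T @ Vht[:k]         # input-space alignment (d_src x d_tgt)
    return P @ delta_src @ Q

# Toy usage: move a rank-8 update from a 1024-dim layer to a 2048-dim one.
W_src, W_tgt = torch.randn(1024, 1024), torch.randn(2048, 2048)
delta_src = torch.randn(1024, 8) @ torch.randn(8, 1024)   # B @ A from the source LoRA
delta_tgt = align_delta(delta_src, W_src, W_tgt)
print(delta_tgt.shape)   # torch.Size([2048, 2048])
```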
- AutoLoRA: Automatic LoRA Retrieval and Fine-Grained Gated Fusion for Text-to-Image Generation [32.46570968627392]
Low-rank adaptation (LoRA) has demonstrated efficacy in enabling model customization with minimal parameter overhead. We introduce a novel framework that enables semantic-driven LoRA retrieval and dynamic aggregation. Our approach achieves significant improvements in image generation performance.
arXiv Detail & Related papers (2025-08-04T06:36:00Z)
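AutoLoRA's two stages, semantic retrieval and gated fusion, are straightforward to sketch. The embedding model, gate parameterization, and `top_k` below are illustrative assumptions, not the paper's design:

```python
import torch
import torch.nn.functional as F

def retrieve(prompt_emb, lora_embs, top_k=3):
    """Semantic retrieval: cosine similarity between the prompt embedding
    and the embeddings of each LoRA's text description."""
    sims = F.cosine_similarity(prompt_emb.unsqueeze(0), lora_embs, dim=-1)
    return sims.topk(top_k).indices

def gated_fusion(x, deltas, gate_logits):
    """Dynamic aggregation: softmax-gated sum of the retrieved LoRA outputs."""
    gates = torch.softmax(gate_logits, dim=0)           # one gate per adapter
    return sum(g * (x @ dW.T) for g, dW in zip(gates, deltas))

# Toy usage with 5 candidate adapters on a 768-dim layer.
prompt_emb, lora_embs = torch.randn(768), torch.randn(5, 768)
idx = retrieve(prompt_emb, lora_embs)
deltas = [torch.randn(768, 768) for _ in idx]           # B_i @ A_i, precomputed
x = torch.randn(1, 768)
y = gated_fusion(x, deltas, gate_logits=torch.randn(len(deltas)))
```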
- LoRA-Gen: Specializing Large Language Model via Online LoRA Generation [68.01864057372067]
We propose the LoRA-Gen framework to generate LoRA parameters for edge-side models based on task descriptions. We merge the LoRA parameters into the edge-side model to achieve flexible specialization. Our method facilitates knowledge transfer between models while significantly improving the inference efficiency of the specialized model.
arXiv Detail & Related papers (2025-06-13T10:11:01Z)
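A minimal sketch of the LoRA-Gen recipe as the abstract states it: a hypernetwork (here a toy linear one; the real architecture is not specified above) maps a task-description embedding to LoRA factors, which are then folded into the edge model's weights so specialized inference adds no overhead. All sizes are illustrative assumptions:

```python
import torch
import torch.nn as nn

d_task, d_model, rank = 512, 768, 8

class LoRAGenerator(nn.Module):
    """Toy hypernetwork: task embedding -> LoRA factors (A, B)."""
    def __init__(self):
        super().__init__()
        self.to_A = nn.Linear(d_task, rank * d_model)
        self.to_B = nn.Linear(d_task, d_model * rank)

    def forward(self, task_emb):
        A = self.to_A(task_emb).view(rank, d_model)
        B = self.to_B(task_emb).view(d_model, rank)
        return A, B

gen = LoRAGenerator()
task_emb = torch.randn(d_task)        # e.g., pooled encoding of the task description
A, B = gen(task_emb)

# "Merging" = folding the generated update into the edge-side weight matrix,
# so the specialized model runs at exactly the base model's inference cost.
W_edge = torch.randn(d_model, d_model)
W_specialized = W_edge + B @ A
```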
- MISLEADER: Defending against Model Extraction with Ensembles of Distilled Models [56.09354775405601]
Model extraction attacks aim to replicate the functionality of a black-box model through query access. Most existing defenses presume that attacker queries contain out-of-distribution (OOD) samples, enabling them to detect and disrupt suspicious inputs. We propose MISLEADER, a novel defense strategy that does not rely on OOD assumptions.
arXiv Detail & Related papers (2025-06-03T01:37:09Z)
- ZKLoRA: Efficient Zero-Knowledge Proofs for LoRA Verification [0.20482269513546458]
Low-Rank Adaptation (LoRA) is a widely adopted method for customizing large-scale language models. In distributed, untrusted training environments, a user of an open-source base model may want to use LoRA weights created by an external contributor. We present ZKLoRA, a zero-knowledge verification protocol that relies on succinct proofs and our novel Multi-Party Inference procedure.
arXiv Detail & Related papers (2025-01-21T23:20:33Z)
- Safe LoRA: the Silver Lining of Reducing Safety Risks when Fine-tuning Large Language Models [51.20476412037321]
We propose Safe LoRA, a simple one-liner patch to the original LoRA implementation that projects LoRA weights from selected layers onto the safety-aligned subspace. Our experiments demonstrate that when fine-tuning on purely malicious data, Safe LoRA retains safety performance similar to that of the original aligned model.
arXiv Detail & Related papers (2024-05-27T05:04:05Z)
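A minimal sketch of a Safe LoRA-style projection, under the assumption that the safety-aligned subspace is spanned by the top singular directions of the difference between safety-aligned and unaligned base weights; the rank `k` and similarity threshold `tau` are illustrative, not the paper's values:

```python
import torch
import torch.nn.functional as F

def safe_project(delta: torch.Tensor, W_aligned: torch.Tensor,
                 W_unaligned: torch.Tensor, k: int = 32, tau: float = 0.5):
    """Project a LoRA update onto an (assumed) safety-aligned subspace."""
    V = W_aligned - W_unaligned          # what safety alignment added to the base
    U, _, _ = torch.linalg.svd(V, full_matrices=False)
    Uk = U[:, :k]                        # top-k directions span the safe subspace
    proj = Uk @ (Uk.T @ delta)           # the "one-liner" projection
    # Intervene only when the raw update strays from the safe subspace.
    sim = F.cosine_similarity(proj.flatten(), delta.flatten(), dim=0)
    return delta if sim >= tau else proj
```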
- LoRATK: LoRA Once, Backdoor Everywhere in the Share-and-Play Ecosystem [55.2986934528672]
We study how backdoors can be injected into task-enhancing LoRAs. We find that with a simple, efficient, yet specific recipe, a backdoor LoRA can be trained once and then seamlessly merged with multiple LoRAs. Our work is among the first to study this new threat model of training-free distribution of downstream-capable-yet-backdoor-injected LoRAs.
arXiv Detail & Related papers (2024-02-29T20:25:16Z)
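The threat works because merging LoRAs is, at its core, just summing low-rank updates, so a backdoor adapter survives any merge untouched. A minimal sketch (weights and shapes are illustrative):

```python
import torch

def merge_loras(pairs, weights=None):
    """Sum weighted low-rank updates (B_i @ A_i) into one dense delta."""
    weights = weights if weights is not None else [1.0] * len(pairs)
    return sum(w * (B @ A) for w, (B, A) in zip(weights, pairs))

d, r = 512, 8
benign   = (torch.randn(d, r), torch.randn(r, d))   # task-enhancing LoRA
backdoor = (torch.randn(d, r), torch.randn(r, d))   # trained once, reused everywhere
delta = merge_loras([benign, backdoor])
# W_effective = W_base + delta: nothing in the merge inspects or removes
# the backdoor component, which is exactly the risk LoRATK studies.
```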
- Privacy-Preserving Low-Rank Adaptation against Membership Inference Attacks for Latent Diffusion Models [18.472894244598503]
Low-rank adaptation (LoRA) is an efficient strategy for adapting latent diffusion models (LDMs) to a private dataset in order to generate specific images. However, LoRA-adapted LDMs are vulnerable to membership inference (MI) attacks, which can judge whether a particular data point belongs to the private dataset. We propose Membership-Privacy-preserving LoRA (MP-LoRA) to defend against MI attacks and generate high-quality images.
arXiv Detail & Related papers (2024-02-19T09:32:48Z)
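The abstract does not spell out MP-LoRA's training objective; the loop below is a schematic guess at a privacy-aware setup, alternating a proxy membership-inference adversary with a LoRA update that penalizes the adversary's confidence. The adversary architecture and `lambda_p` are assumptions, not the paper's method:

```python
import torch
import torch.nn as nn

adversary = nn.Linear(1, 1)   # proxy MI attacker: per-sample loss -> membership logit
opt_adv = torch.optim.SGD(adversary.parameters(), lr=1e-2)

def mp_lora_step(loss_member, loss_holdout, opt_lora, lambda_p=0.1):
    # (1) Train the proxy adversary to separate member (1) vs. holdout (0) losses.
    feats = torch.stack([loss_member.detach(), loss_holdout.detach()]).view(2, 1)
    labels = torch.tensor([[1.0], [0.0]])
    adv_loss = nn.functional.binary_cross_entropy_with_logits(adversary(feats), labels)
    opt_adv.zero_grad(); adv_loss.backward(); opt_adv.step()
    # (2) Update the LoRA: fit the task while keeping membership hard to infer.
    confidence = torch.sigmoid(adversary(loss_member.view(1, 1))).mean()
    total = loss_member + lambda_p * confidence
    opt_lora.zero_grad(); total.backward(); opt_lora.step()
    return total.detach()
```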
- ZipLoRA: Any Subject in Any Style by Effectively Merging LoRAs [56.85106417530364]
Low-rank adaptations (LoRA) have been proposed as a parameter-efficient way of achieving concept-driven personalization.
We propose ZipLoRA, a method to cheaply and effectively merge independently trained style and subject LoRAs.
Experiments show that ZipLoRA can generate compelling results with meaningful improvements over baselines in subject and style fidelity.
arXiv Detail & Related papers (2023-11-22T18:59:36Z)
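ZipLoRA's "cheap and effective" merge can be sketched as learning per-column mixing coefficients that preserve each adapter's behavior on its own activations while discouraging overlap between the scaled column spaces. The losses below are loose proxies for the paper's prompt-conditioned objectives, and `lam`, `steps`, and all shapes are illustrative assumptions:

```python
import torch

def zip_merge(dW_subj, dW_style, x_subj, x_style, steps=200, lam=0.01, lr=1e-2):
    """Learn per-column mixers m1, m2 and return the merged dense update."""
    m1 = torch.ones(dW_subj.shape[1], requires_grad=True)   # subject gates
    m2 = torch.ones(dW_style.shape[1], requires_grad=True)  # style gates
    opt = torch.optim.Adam([m1, m2], lr=lr)
    for _ in range(steps):
        merged = dW_subj * m1 + dW_style * m2               # per-column scaling
        # Preserve each adapter's behavior on its own activations...
        loss = ((merged - dW_subj) @ x_subj).pow(2).mean() \
             + ((merged - dW_style) @ x_style).pow(2).mean()
        # ...while discouraging overlap between the scaled column spaces.
        loss = loss + lam * ((dW_subj * m1) * (dW_style * m2)).sum(0).abs().mean()
        opt.zero_grad(); loss.backward(); opt.step()
    return (dW_subj * m1 + dW_style * m2).detach()

# Toy usage on a 512-dim layer with 16 cached activations per concept.
d = 512
dW_subj, dW_style = torch.randn(d, d) * 0.01, torch.randn(d, d) * 0.01
x_subj, x_style = torch.randn(d, 16), torch.randn(d, 16)
merged = zip_merge(dW_subj, dW_style, x_subj, x_style)
```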