StolenLoRA: Exploring LoRA Extraction Attacks via Synthetic Data
- URL: http://arxiv.org/abs/2509.23594v1
- Date: Sun, 28 Sep 2025 02:51:35 GMT
- Title: StolenLoRA: Exploring LoRA Extraction Attacks via Synthetic Data
- Authors: Yixu Wang, Yan Teng, Yingchun Wang, Xingjun Ma
- Abstract summary: This paper introduces a new class of model extraction attack, LoRA extraction. We propose a novel extraction method, StolenLoRA, which trains a substitute model to replicate the functionality of a LoRA-adapted model. Our experiments demonstrate the effectiveness of StolenLoRA, achieving up to a 96.60% attack success rate with only 10k queries.
- Score: 39.230850434780756
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Parameter-Efficient Fine-Tuning (PEFT) methods like LoRA have transformed vision model adaptation, enabling the rapid deployment of customized models. However, the compactness of LoRA adaptations introduces new safety concerns, particularly vulnerability to model extraction attacks. This paper introduces a new class of model extraction attack, LoRA extraction, which targets LoRA-adapted models built on a public pre-trained model. We then propose a novel extraction method, StolenLoRA, which trains a substitute model to replicate the functionality of a LoRA-adapted model using synthetic data. StolenLoRA leverages a Large Language Model to craft effective prompts for data generation, and it incorporates a Disagreement-based Semi-supervised Learning (DSL) strategy to maximize the information gained from a limited query budget. Our experiments demonstrate the effectiveness of StolenLoRA, achieving up to a 96.60% attack success rate with only 10k queries, even in cross-backbone scenarios where the attacker and victim models use different pre-trained backbones. These findings reveal the specific vulnerability of LoRA-adapted models to this type of extraction and underscore the urgent need for robust defense mechanisms tailored to PEFT methods. We also explore a preliminary defense strategy based on diversified LoRA deployments, highlighting its potential to mitigate such attacks.
Related papers
- When LoRA Betrays: Backdooring Text-to-Image Models by Masquerading as Benign Adapters [10.859491015719088]
Low-Rank Adaptation (LoRA) has emerged as a leading technique for efficiently fine-tuning text-to-image diffusion models. MasqLoRA is the first systematic attack framework that leverages an independent LoRA module as the attack vehicle. MasqLoRA achieves a high attack success rate of 99.8%.
arXiv Detail & Related papers (2026-02-25T14:56:51Z) - Causal-Guided Detoxify Backdoor Attack of Open-Weight LoRA Models [2.7625323526446413]
Low-Rank Adaptation (LoRA) has emerged as an efficient method for fine-tuning large language models (LLMs). We propose Causal-Guided Detoxify Backdoor Attack (CBA), a novel backdoor attack framework specifically designed for open-weight LoRA models.
arXiv Detail & Related papers (2025-12-22T11:40:47Z) - LoRAShield: Data-Free Editing Alignment for Secure Personalized LoRA Sharing [43.88211522311429]
Low-Rank Adaptation (LoRA) models can be shared on platforms like Civitai and Liblib. LoRAShield is the first data-free editing framework for securing LoRA models against misuse.
arXiv Detail & Related papers (2025-07-05T02:53:17Z) - MISLEADER: Defending against Model Extraction with Ensembles of Distilled Models [56.09354775405601]
Model extraction attacks aim to replicate the functionality of a black-box model through query access. Most existing defenses presume that attacker queries are out-of-distribution (OOD) samples, enabling them to detect and disrupt suspicious inputs. We propose MISLEADER, a novel defense strategy that does not rely on OOD assumptions.
arXiv Detail & Related papers (2025-06-03T01:37:09Z) - RepLoRA: Reparameterizing Low-Rank Adaptation via the Perspective of Mixture of Experts [37.43961020113692]
Low-rank Adaptation (LoRA) has emerged as a powerful method for fine-tuning large-scale foundation models. This paper presents a theoretical analysis of LoRA by examining its connection to Mixture of Experts models.
arXiv Detail & Related papers (2025-02-05T10:03:09Z) - LoRA vs Full Fine-tuning: An Illusion of Equivalence [76.11938177294178]
We study how Low-Rank Adaptation (LoRA) and full fine-tuning change pre-trained models. We find that LoRA and full fine-tuning yield weight matrices whose singular value decompositions exhibit very different structure. We extend the finding that LoRA forgets less than full fine-tuning and find that its forgetting is largely localized to the intruder dimensions (see the SVD sketch after this list).
arXiv Detail & Related papers (2024-10-28T17:14:01Z) - Task-Specific Directions: Definition, Exploration, and Utilization in Parameter Efficient Fine-Tuning [65.31677646659895]
Large language models demonstrate impressive performance on downstream tasks, yet fully fine-tuning all of their parameters consumes extensive resources. We propose a framework that clearly defines task-specific directions (TSDs) and explores their properties and practical utilization challenges. We then introduce a novel approach, LoRA-Dash, which aims to maximize the impact of TSDs during fine-tuning.
arXiv Detail & Related papers (2024-09-02T08:10:51Z) - Mixture of LoRA Experts [87.50120181861362]
This paper introduces the Mixture of LoRA Experts (MoLE) approach, which harnesses hierarchical control and unfettered branch selection.
The MoLE approach achieves superior LoRA fusion performance in comparison to direct arithmetic merging.
arXiv Detail & Related papers (2024-04-21T11:59:53Z) - LoRA Dropout as a Sparsity Regularizer for Overfitting Control [18.992276878667997]
We propose a LoRA Dropout mechanism for the LoRA-based methods.
We show that appropriate sparsity helps tighten the gap between empirical and generalization risks.
arXiv Detail & Related papers (2024-04-15T09:32:12Z) - Continual Forgetting for Pre-trained Vision Models [70.51165239179052]
In real-world scenarios, selective information is expected to be continuously removed from a pre-trained model.
We propose Group Sparse LoRA (GS-LoRA) for efficient and effective deletion.
We conduct extensive experiments on face recognition, object detection and image classification and demonstrate that GS-LoRA manages to forget specific classes with minimal impact on other classes.
arXiv Detail & Related papers (2024-03-18T07:33:56Z) - LoRATK: LoRA Once, Backdoor Everywhere in the Share-and-Play Ecosystem [55.2986934528672]
We study how backdoors can be injected into task-enhancing LoRAs. We find that with a simple, efficient, yet specific recipe, a backdoor LoRA can be trained once and then seamlessly merged with multiple LoRAs. Our work is among the first to study this new threat model of training-free distribution of downstream-capable-yet-backdoor-injected LoRAs.
arXiv Detail & Related papers (2024-02-29T20:25:16Z)