Dataset Size Recovery from LoRA Weights
- URL: http://arxiv.org/abs/2406.19395v1
- Date: Thu, 27 Jun 2024 17:59:53 GMT
- Title: Dataset Size Recovery from LoRA Weights
- Authors: Mohammad Salama, Jonathan Kahana, Eliahu Horwitz, Yedid Hoshen
- Abstract summary: DSiRe is a method for recovering the number of images used to fine-tune a model.
We release a new benchmark, LoRA-WiSE, consisting of over 25,000 weight snapshots from more than 2,000 diverse LoRA fine-tuned models.
- Score: 41.031813850749174
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Model inversion and membership inference attacks aim to reconstruct and verify the data on which a model was trained. However, they are not guaranteed to find all training samples, as they do not know the size of the training set. In this paper, we introduce a new task, dataset size recovery, which aims to determine the number of samples used to train a model directly from its weights. We then propose DSiRe, a method for recovering the number of images used to fine-tune a model, in the common case where fine-tuning uses LoRA. We discover that both the norm and the spectrum of the LoRA matrices are closely linked to the fine-tuning dataset size; we leverage this finding to propose a simple yet effective prediction algorithm. To evaluate dataset size recovery of LoRA weights, we develop and release a new benchmark, LoRA-WiSE, consisting of over 25,000 weight snapshots from more than 2,000 diverse LoRA fine-tuned models. Our best classifier can predict the number of fine-tuning images with a mean absolute error of 0.36 images, establishing the feasibility of this attack.
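As a reading aid, here is a minimal sketch of how a predictor in this spirit could be assembled: extract the per-layer Frobenius norm and leading singular values of each LoRA update, then predict the dataset size with a per-layer nearest-neighbor vote over reference models whose fine-tuning set sizes are known. This is an illustration under assumed data structures, not the authors' released DSiRe implementation; all function names and parameters below are hypothetical.

```python
# Hedged sketch: dataset-size recovery from LoRA weights via norm/spectrum
# features. Illustrative only; not the authors' DSiRe code.
import numpy as np

def lora_layer_features(A: np.ndarray, B: np.ndarray, top_k: int = 8) -> np.ndarray:
    """Features of one LoRA layer: Frobenius norm and the top_k singular
    values of the update Delta_W = B @ A (B: d_out x r, A: r x d_in)."""
    delta_w = B @ A
    frob = np.linalg.norm(delta_w)                    # Frobenius norm of the update
    svals = np.linalg.svd(delta_w, compute_uv=False)  # singular values, descending
    return np.concatenate([[frob], svals[:top_k]])

def predict_dataset_size(query_feats, ref_feats, ref_sizes):
    """Per-layer 1-nearest-neighbor over reference models with known
    fine-tuning set sizes, followed by a majority vote across layers.
    query_feats[l] and ref_feats[i][l] are feature vectors for layer l."""
    votes = []
    for l, q in enumerate(query_feats):
        dists = [np.linalg.norm(q - ref_feats[i][l]) for i in range(len(ref_sizes))]
        votes.append(ref_sizes[int(np.argmin(dists))])
    sizes, counts = np.unique(votes, return_counts=True)
    return int(sizes[np.argmax(counts)])

# Toy usage with random matrices standing in for real LoRA checkpoints.
rng = np.random.default_rng(0)
ref_feats = [[lora_layer_features(rng.normal(size=(4, 64)), rng.normal(size=(64, 4)))
              for _ in range(3)] for _ in range(5)]
ref_sizes = [1, 5, 10, 20, 50]
query = [lora_layer_features(rng.normal(size=(4, 64)), rng.normal(size=(64, 4)))
         for _ in range(3)]
print(predict_dataset_size(query, ref_feats, ref_sizes))
```

The per-layer vote is one simple way to aggregate the norm/spectrum signal that, per the abstract, appears across LoRA layers; other aggregations (median, a small regressor) could be substituted.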
Related papers
- How Much Knowledge Can You Pack into a LoRA Adapter without Harming LLM? [55.33467849079774]
Low-rank adaptation (LoRA) is a popular and efficient training technique for updating Large Language Models or adapting them to specific domains.
We investigate how new facts can be incorporated into the LLM using LoRA without compromising the previously learned knowledge.
arXiv Detail & Related papers (2025-02-20T12:31:03Z)
- LoRA-X: Bridging Foundation Models with Training-Free Cross-Model Adaptation [48.22550575107633]
A new adapter, Cross-Model Low-Rank Adaptation (LoRA-X), enables the training-free transfer of LoRA parameters across source and target models.
Our experiments demonstrate the effectiveness of LoRA-X for text-to-image generation.
arXiv Detail & Related papers (2025-01-27T23:02:24Z)
- A LoRA is Worth a Thousand Pictures [28.928964530616593]
Low Rank Adaptation (LoRA) can replicate an artist's style or subject using minimal data and computation.
We show that LoRA weights alone can serve as an effective descriptor of style, without the need for additional image generation or knowledge of the original training set.
We conclude with a discussion on potential future applications, such as zero-shot LoRA fine-tuning and model attribution.
arXiv Detail & Related papers (2024-12-16T18:18:17Z)
- LoRA vs Full Fine-tuning: An Illusion of Equivalence [76.11938177294178]
We study how different fine-tuning methods change pre-trained models by analyzing the model's weight matrices through the lens of their spectral properties.
We find that full fine-tuning and LoRA yield weight matrices whose singular value decompositions exhibit very different structure (see the sketch after this list).
We conclude by examining why intruder dimensions appear in LoRA fine-tuned models, why they are undesirable, and how their effects can be minimized.
arXiv Detail & Related papers (2024-10-28T17:14:01Z)
- Learning on LoRAs: GL-Equivariant Processing of Low-Rank Weight Spaces for Large Finetuned Models [38.197552424549514]
Low-rank adaptations (LoRAs) have revolutionized the finetuning of large foundation models.
LoRAs present opportunities for applying machine learning techniques that take these low-rank weights themselves as inputs.
In this paper, we investigate the potential of Learning on LoRAs (LoL), a paradigm where LoRA weights serve as input to machine learning models.
arXiv Detail & Related papers (2024-10-05T15:52:47Z)
- Continual Forgetting for Pre-trained Vision Models [70.51165239179052]
In real-world scenarios, selective information is expected to be continuously removed from a pre-trained model.
We propose Group Sparse LoRA (GS-LoRA) for efficient and effective deletion.
We conduct extensive experiments on face recognition, object detection, and image classification, and demonstrate that GS-LoRA manages to forget specific classes with minimal impact on other classes.
arXiv Detail & Related papers (2024-03-18T07:33:56Z)
- Non-Visible Light Data Synthesis and Application: A Case Study for Synthetic Aperture Radar Imagery [30.590315753622132]
We explore the "hidden" ability of large-scale pre-trained image generation models, such as Stable Diffusion and Imagen, in non-visible light domains.
We propose a two-stage low-rank adaptation method, which we call 2LoRA.
In the first stage, the model is adapted using aerial-view regular image data (whose structure matches SAR); in the second stage, the resulting model is further adapted using SAR-modality data.
arXiv Detail & Related papers (2023-11-29T09:48:01Z)
- Delving Deeper into Data Scaling in Masked Image Modeling [145.36501330782357]
We conduct an empirical study on the scaling capability of masked image modeling (MIM) methods for visual recognition.
Specifically, we utilize the web-collected Coyo-700M dataset.
Our goal is to investigate how the performance changes on downstream tasks when scaling with different sizes of data and models.
arXiv Detail & Related papers (2023-05-24T15:33:46Z)
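Relatedly, for the spectral comparison described in the "LoRA vs Full Fine-tuning: An Illusion of Equivalence" entry above, the sketch below shows one plausible way to flag "intruder-like" directions: take the SVD of a pre-trained weight matrix and of its fine-tuned counterpart, and report fine-tuned singular vectors that align poorly with every pre-trained one. The function name, threshold, and top-k choice are assumptions for illustration, not the cited paper's exact protocol.

```python
# Illustrative sketch (not the cited paper's code): compare the spectral
# structure of a pre-trained weight matrix W and its fine-tuned counterpart
# W + delta_W, and flag singular directions of the fine-tuned matrix that
# align poorly with every pre-trained singular direction.
import numpy as np

def intruder_like_directions(W: np.ndarray, delta_W: np.ndarray,
                             top_k: int = 10, sim_threshold: float = 0.5):
    """Return indices (among the top_k left singular vectors of W + delta_W)
    whose maximum absolute cosine similarity to the left singular vectors of
    W falls below sim_threshold."""
    U_pre, _, _ = np.linalg.svd(W, full_matrices=False)
    U_ft, _, _ = np.linalg.svd(W + delta_W, full_matrices=False)
    flagged = []
    for j in range(min(top_k, U_ft.shape[1])):
        # Columns of U_pre and U_ft are unit-norm, so dot products are cosines.
        sims = np.abs(U_pre.T @ U_ft[:, j])
        if sims.max() < sim_threshold:
            flagged.append(j)
    return flagged
```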