Related papers: Impact of Fine-Tuning Methods on Memorization in Large Language Models

Impact of Fine-Tuning Methods on Memorization in Large Language Models

URL: http://arxiv.org/abs/2507.00258v1
Date: Mon, 30 Jun 2025 20:52:15 GMT
Title: Impact of Fine-Tuning Methods on Memorization in Large Language Models
Authors: Jie Hou, Chuxiong Wu, Lannan Luo, Qiang Zeng,
Abstract summary: We categorize popular fine-tuning approaches and assess their impact on memorization through the lens of membership inference attacks (MIAs)<n>Our results show that, compared to parameter-based fine-tuning, prompt-based fine-tuning achieves competitive performance while exhibiting lower vulnerability to MIAs.<n>These findings suggest that parameter-based fine-tuning is more prone to leaking private information, whereas prompt-based fine-tuning serves as a more privacy-preserving option.
Score: 8.334869916058746
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: As the capabilities of pre-trained large language models (LLMs) continue to advance, the "pre-train and fine-tune" paradigm has become increasingly mainstream, leading to the development of various fine-tuning methods. However, the privacy risks arising from memorization during fine-tuning have received relatively little attention. To address this gap, we categorize popular fine-tuning approaches and assess their impact on memorization through the lens of membership inference attacks (MIAs). Our results show that, compared to parameter-based fine-tuning, prompt-based fine-tuning achieves competitive performance while exhibiting lower vulnerability to MIAs. Furthermore, prompt-based methods maintain low memorization regardless of model scale. These findings suggest that parameter-based fine-tuning is more prone to leaking private information, whereas prompt-based fine-tuning serves as a more privacy-preserving option.

Related papers

DynaNoise: Dynamic Probabilistic Noise Injection for Defending Against Membership Inference Attacks [6.610581923321801]
Membership Inference Attacks (MIAs) pose a significant risk to the privacy of training datasets.<n>Traditional mitigation techniques rely on injecting a fixed amount of noise during training or inference.<n>We present DynaNoise, an adaptive approach that dynamically modulates noise injection based on query sensitivity.
arXiv Detail & Related papers (2025-05-19T17:07:00Z)
Panacea: Mitigating Harmful Fine-tuning for Large Language Models via Post-fine-tuning Perturbation [58.7395356511539]
Harmful fine-tuning attack introduces significant security risks to the fine-tuning services.<n> Mainstream defenses aim to vaccinate the model such that the later harmful fine-tuning attack is less effective.<n>We propose Panacea, which optimize an adaptive perturbation that will be applied to the model after fine-tuning.
arXiv Detail & Related papers (2025-01-30T02:47:09Z)
Efficient and Private: Memorisation under differentially private parameter-efficient fine-tuning in language models [2.3281513013731145]
Fine-tuning large language models (LLMs) for specific tasks introduces privacy risks, as models may inadvertently memorise and leak sensitive training data. Differential Privacy (DP) offers a solution to mitigate these risks, but introduces significant computational and performance trade-offs. We show that PEFT methods achieve comparable performance to standard fine-tuning while requiring fewer parameters and significantly reducing privacy leakage.
arXiv Detail & Related papers (2024-11-24T13:17:36Z)
Pseudo-Probability Unlearning: Towards Efficient and Privacy-Preserving Machine Unlearning [59.29849532966454]
We propose PseudoProbability Unlearning (PPU), a novel method that enables models to forget data to adhere to privacy-preserving manner. Our method achieves over 20% improvements in forgetting error compared to the state-of-the-art.
arXiv Detail & Related papers (2024-11-04T21:27:06Z)
Fine-Tuning Language Models with Differential Privacy through Adaptive Noise Allocation [33.795122935686706]
We propose ANADP, a novel algorithm that adaptively allocates additive noise based on the importance of model parameters. We demonstrate that ANADP narrows the performance gap between regular fine-tuning and traditional DP fine-tuning on a series of datasets.
arXiv Detail & Related papers (2024-10-03T19:02:50Z)
SMILE: Zero-Shot Sparse Mixture of Low-Rank Experts Construction From Pre-Trained Foundation Models [85.67096251281191]
We present an innovative approach to model fusion called zero-shot Sparse MIxture of Low-rank Experts (SMILE) construction. SMILE allows for the upscaling of source models into an MoE model without extra data or further training. We conduct extensive experiments across diverse scenarios, such as image classification and text generation tasks, using full fine-tuning and LoRA fine-tuning.
arXiv Detail & Related papers (2024-08-19T17:32:15Z)
Low-rank finetuning for LLMs: A fairness perspective [54.13240282850982]
Low-rank approximation techniques have become the de facto standard for fine-tuning Large Language Models. This paper investigates the effectiveness of these methods in capturing the shift of fine-tuning datasets from the initial pre-trained data distribution. We show that low-rank fine-tuning inadvertently preserves undesirable biases and toxic behaviors.
arXiv Detail & Related papers (2024-05-28T20:43:53Z)
Advancing the Robustness of Large Language Models through Self-Denoised Smoothing [50.54276872204319]
Large language models (LLMs) have achieved significant success, but their vulnerability to adversarial perturbations has raised considerable concerns. We propose to leverage the multitasking nature of LLMs to first denoise the noisy inputs and then to make predictions based on these denoised versions. Unlike previous denoised smoothing techniques in computer vision, which require training a separate model to enhance the robustness of LLMs, our method offers significantly better efficiency and flexibility.
arXiv Detail & Related papers (2024-04-18T15:47:00Z)
DPZero: Private Fine-Tuning of Language Models without Backpropagation [49.365749361283704]
We introduce DPZero, a novel private zeroth-order algorithm with nearly dimension-independent rates. The memory efficiency of DPZero is demonstrated in privately fine-tuning RoBERTa and OPT on several downstream tasks.
arXiv Detail & Related papers (2023-10-14T18:42:56Z)
Memorization in NLP Fine-tuning Methods [34.66743495192471]
We empirically study memorization of fine-tuning methods using membership inference and extraction attacks. Fine-tuning the head of the model has the highest susceptibility to attacks, whereas fine-tuning smaller adapters appears to be less vulnerable to known extraction attacks.
arXiv Detail & Related papers (2022-05-25T05:49:31Z)
Know Where You're Going: Meta-Learning for Parameter-Efficient Fine-tuning [34.66092282348687]
We show that taking the ultimate choice of fine-tuning method into consideration boosts the performance of parameter-efficient fine-tuning. We prime the pretrained model specifically for parameter-efficient fine-tuning, resulting in gains of up to 1.7 points on cross-lingual NER fine-tuning.
arXiv Detail & Related papers (2022-05-25T02:51:57Z)
Avoiding Inference Heuristics in Few-shot Prompt-based Finetuning [57.4036085386653]
We show that prompt-based models for sentence pair classification tasks still suffer from a common pitfall of adopting inferences based on lexical overlap. We then show that adding a regularization that preserves pretraining weights is effective in mitigating this destructive tendency of few-shot finetuning.
arXiv Detail & Related papers (2021-09-09T10:10:29Z)

This list is automatically generated from the titles and abstracts of the papers in this site.