From Teacher to Student: Tracking Memorization Through Model Distillation
- URL: http://arxiv.org/abs/2506.16170v1
- Date: Thu, 19 Jun 2025 09:44:25 GMT
- Title: From Teacher to Student: Tracking Memorization Through Model Distillation
- Authors: Simardeep Singh,
- Abstract summary: Large language models (LLMs) are known to memorize parts of their training data, raising important concerns around privacy and security. In this study, we explore how different knowledge distillation (KD) methods influence the memorization of fine-tuned task data when a large teacher model is distilled into smaller student variants.
- Score: 0.9065034043031668
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Large language models (LLMs) are known to memorize parts of their training data, raising important concerns around privacy and security. While previous research has focused on studying memorization in pre-trained models, much less is known about how knowledge distillation (KD) affects memorization. In this study, we explore how different KD methods influence the memorization of fine-tuned task data when a large teacher model is distilled into smaller student variants. This study demonstrates that distilling a larger teacher model, fine-tuned on a dataset, into a smaller variant not only lowers computational costs and model size but also significantly reduces the memorization risks compared to standard fine-tuning approaches.
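As context for the abstract, the sketch below illustrates the two ingredients such a study typically combines: a standard knowledge-distillation objective (a softened teacher-student KL term blended with the hard-label loss on the fine-tuning data) and a crude verbatim-memorization probe that checks whether a model greedily completes a training example from a short prefix. This is a minimal illustration only, not the paper's code; the function names, hyperparameters (temperature, alpha, prefix_len), and the Hugging Face-style tokenizer/generate interface are assumptions.

```python
# Minimal sketch (not the paper's implementation): a standard KD objective
# plus a simple verbatim-memorization probe. Hyperparameters and the
# Hugging Face-style model/tokenizer interface are illustrative assumptions.
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels,
                      temperature=2.0, alpha=0.5):
    """Blend a softened teacher-student KL term with the usual hard-label
    cross-entropy. Assumes logits and labels are already aligned for
    next-token prediction (shifting handled by the caller)."""
    soft_teacher = F.log_softmax(teacher_logits / temperature, dim=-1)
    soft_student = F.log_softmax(student_logits / temperature, dim=-1)
    kl = F.kl_div(soft_student, soft_teacher, log_target=True,
                  reduction="batchmean") * temperature ** 2
    ce = F.cross_entropy(student_logits.view(-1, student_logits.size(-1)),
                         labels.view(-1), ignore_index=-100)
    return alpha * kl + (1.0 - alpha) * ce

@torch.no_grad()
def is_verbatim_memorized(model, tokenizer, text, prefix_len=32):
    """Crude probe: feed the first prefix_len tokens of a training example
    and check whether greedy decoding reproduces the rest exactly."""
    ids = tokenizer(text, return_tensors="pt").input_ids
    prefix, target = ids[:, :prefix_len], ids[:, prefix_len:]
    out = model.generate(prefix, max_new_tokens=target.size(1), do_sample=False)
    return torch.equal(out[:, prefix_len:prefix_len + target.size(1)], target)
```

Comparing the fraction of training examples flagged by such a probe for the fine-tuned teacher versus the distilled student is one simple way to operationalize the memorization gap the abstract describes.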
Related papers
- Memorization in Fine-Tuned Large Language Models [0.0]
This study investigates the mechanisms and factors influencing memorization in fine-tuned large language models (LLMs). We examine how different aspects of the fine-tuning process affect a model's propensity to memorize training data, using the PHEE dataset of pharmacovigilance events.
arXiv Detail & Related papers (2025-07-28T17:22:10Z)
- Extending Memorization Dynamics in Pythia Models from Instance-Level Insights [8.476099189609565]
This paper presents a detailed analysis of memorization in the Pythia model family across varying scales and training steps. Using granular metrics, we examine how model architecture, data characteristics, and perturbations influence memorization patterns.
arXiv Detail & Related papers (2025-06-14T03:02:42Z)
- Learning from Stochastic Teacher Representations Using Student-Guided Knowledge Distillation [64.15918654558816]
A self-distillation (SSD) training strategy is introduced for filtering and weighting teacher representations so that the student distills only from task-relevant representations. Experimental results on real-world affective computing, wearable/biosignal datasets from the UCR Archive, the HAR dataset, and image classification datasets show that the proposed SSD method can outperform state-of-the-art methods.
arXiv Detail & Related papers (2025-04-19T14:08:56Z)
- CustomKD: Customizing Large Vision Foundation for Edge Model Improvement via Knowledge Distillation [57.91828170220308]
We propose a knowledge distillation approach, CustomKD, that effectively leverages large vision foundation models (LVFMs) to enhance the performance of edge models. Our simple yet effective CustomKD customizes the well-generalized features inherent in LVFMs to a given student model in order to reduce model discrepancies.
arXiv Detail & Related papers (2025-03-23T23:53:08Z)
- Exploring and Enhancing the Transfer of Distribution in Knowledge Distillation for Autoregressive Language Models [62.5501109475725]
Knowledge distillation (KD) is a technique that compresses large teacher models by training smaller student models to mimic them.
This paper introduces Online Knowledge Distillation (OKD), where the teacher network integrates small online modules to concurrently train with the student model.
OKD achieves or exceeds the performance of leading methods across various model architectures and sizes while reducing training time by up to a factor of four.
arXiv Detail & Related papers (2024-09-19T07:05:26Z)
- Causal Estimation of Memorisation Profiles [58.20086589761273]
Understanding memorisation in language models has practical and societal implications.
Memorisation is the causal effect of training with an instance on the model's ability to predict that instance.
This paper proposes a new, principled, and efficient method to estimate memorisation based on the difference-in-differences design from econometrics (a generic form of this estimator is sketched after this list).
arXiv Detail & Related papers (2024-06-06T17:59:09Z)
- Comparative Knowledge Distillation [102.35425896967791]
Traditional Knowledge Distillation (KD) assumes readily available access to teacher models for frequent inference.
We propose Comparative Knowledge Distillation (CKD), which encourages student models to understand the nuanced differences in a teacher model's interpretations of samples.
CKD consistently outperforms state-of-the-art data augmentation and KD techniques.
arXiv Detail & Related papers (2023-11-03T21:55:33Z)
- Exploring Memorization in Fine-tuned Language Models [53.52403444655213]
We conduct the first comprehensive analysis to explore language models' memorization during fine-tuning across tasks.
Our studies with open-sourced and our own fine-tuned LMs across various tasks indicate that memorization presents a strong disparity among different fine-tuning tasks.
We provide an intuitive explanation of this task disparity via sparse coding theory and unveil a strong correlation between memorization and attention score distribution.
arXiv Detail & Related papers (2023-10-10T15:41:26Z)
- Emergent and Predictable Memorization in Large Language Models [23.567027014457775]
Memorization, or the tendency of large language models to output entire sequences from their training data verbatim, is a key concern for safely deploying language models.
We seek to predict which sequences will be memorized before a large model's full train-time by extrapolating the memorization behavior of lower-compute trial runs.
We provide further novel discoveries on the distribution of memorization scores across models and data.
arXiv Detail & Related papers (2023-04-21T17:58:31Z)
- Understanding Unintended Memorization in Federated Learning [5.32880378510767]
We show that different components of Federated Learning play an important role in reducing unintended memorization.
We also show that training with a strong user-level differential privacy guarantee results in models that exhibit the least amount of unintended memorization.
arXiv Detail & Related papers (2020-06-12T22:10:16Z)
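For the "Causal Estimation of Memorisation Profiles" entry above, the generic two-period difference-in-differences form gives a rough sense of such an estimator (illustrative notation only, not the paper's): the change in a performance metric m (for example, the log-likelihood of instance x) over a training interval for models that trained on x is contrasted with the same change for models that did not.

```latex
% Generic difference-in-differences form (illustrative notation, not the paper's):
\[
  \widehat{\mathrm{mem}}(x) =
  \bigl( m^{\mathrm{treated}}_{\mathrm{post}}(x) - m^{\mathrm{treated}}_{\mathrm{pre}}(x) \bigr)
  -
  \bigl( m^{\mathrm{control}}_{\mathrm{post}}(x) - m^{\mathrm{control}}_{\mathrm{pre}}(x) \bigr)
\]
```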
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences arising from its use.