Unintended Memorization in Large ASR Models, and How to Mitigate It
- URL: http://arxiv.org/abs/2310.11739v1
- Date: Wed, 18 Oct 2023 06:45:49 GMT
- Title: Unintended Memorization in Large ASR Models, and How to Mitigate It
- Authors: Lun Wang, Om Thakkar, Rajiv Mathews
- Abstract summary: Auditing memorization in large non-auto-regressive automatic speech recognition (ASR) models has been challenging due to the high compute cost of existing methods.
We design a simple auditing method to measure memorization in large ASR models without the extra compute overhead.
We show that in large-scale distributed training, clipping the average gradient on each compute core maintains neutral model quality and compute cost.
- Score: 16.047859326721046
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: It is well-known that neural networks can unintentionally memorize their
training examples, causing privacy concerns. However, auditing memorization in
large non-auto-regressive automatic speech recognition (ASR) models has been
challenging due to the high compute cost of existing methods such as hardness
calibration. In this work, we design a simple auditing method to measure
memorization in large ASR models without the extra compute overhead.
Concretely, we speed up randomly-generated utterances to create a mapping
between vocal and text information that is difficult to learn from typical
training examples. Hence, accurate predictions only for sped-up training
examples can serve as clear evidence for memorization, and the corresponding
accuracy can be used to measure memorization. Using the proposed method, we
showcase memorization in the state-of-the-art ASR models. To mitigate
memorization, we tried gradient clipping during training to bound the influence
of any individual example on the final model. We empirically show that clipping
each example's gradient can mitigate memorization for sped-up training examples
with up to 16 repetitions in the training set. Furthermore, we show that in
large-scale distributed training, clipping the average gradient on each compute
core maintains neutral model quality and compute cost while providing strong
privacy protection.
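To make the auditing idea concrete, here is a minimal sketch of how a speed-up canary audit could be wired up. The `asr_model.transcribe(...)` interface, the `wer(...)` helper, and the canary bookkeeping are placeholders chosen for illustration, not the paper's implementation:

```python
import numpy as np

def speed_up(waveform: np.ndarray, factor: float = 2.0) -> np.ndarray:
    """Naively resample the waveform so it plays back `factor` times faster.

    Sped-up speech pairs vocal information with text in a way that is hard to
    learn from typical utterances, so correct transcriptions of sped-up
    canaries seen in training are treated as evidence of memorization.
    """
    positions = np.arange(0.0, len(waveform), factor)
    return np.interp(positions, np.arange(len(waveform)), waveform)

def audit_memorization(asr_model, canaries, wer, factor: float = 2.0) -> float:
    """Average word error rate on sped-up canary utterances.

    `canaries` is a list of (waveform, sample_rate, transcript) tuples of
    randomly generated utterances injected into training; a low WER here,
    relative to held-out sped-up utterances, indicates memorization.
    """
    errors = []
    for waveform, sample_rate, transcript in canaries:
        hypothesis = asr_model.transcribe(speed_up(waveform, factor), sample_rate)
        errors.append(wer(transcript, hypothesis))
    return float(np.mean(errors))
```

The mitigation side of the abstract, per-example versus per-core gradient clipping, can be sketched in the same hedged spirit. How the per-example or per-core gradients are obtained is framework-specific and assumed here; the sketch only shows the clipping and averaging step:

```python
import numpy as np

def clip_to_norm(grad: np.ndarray, clip_norm: float) -> np.ndarray:
    """Scale `grad` down so its L2 norm is at most `clip_norm`."""
    norm = np.linalg.norm(grad)
    return grad * min(1.0, clip_norm / (norm + 1e-12))

def per_example_clipped_update(per_example_grads, clip_norm):
    """Per-example clipping: bound each example's influence, then average."""
    clipped = [clip_to_norm(g, clip_norm) for g in per_example_grads]
    return np.mean(clipped, axis=0)

def per_core_clipped_update(per_core_avg_grads, clip_norm):
    """Per-core clipping: clip the *average* gradient computed on each compute
    core before combining across cores; cheaper than per-example clipping
    because no per-example gradients need to be materialized."""
    clipped = [clip_to_norm(g, clip_norm) for g in per_core_avg_grads]
    return np.mean(clipped, axis=0)
```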
Related papers
- Predicting and analyzing memorization within fine-tuned Large Language Models [0.0]
Large language models memorize a significant proportion of their training data, posing a serious privacy threat when that data is disclosed at inference time.
We propose a new approach based on sliced mutual information to detect memorized samples a priori.
We obtain strong empirical results, paving the way for systematic inspection and protection of these vulnerable samples before memorization happens.
arXiv Detail & Related papers (2024-09-27T15:53:55Z)
- Detecting, Explaining, and Mitigating Memorization in Diffusion Models [49.438362005962375]
We introduce a straightforward yet effective method for detecting memorized prompts by inspecting the magnitude of text-conditional predictions.
Our proposed method seamlessly integrates without disrupting sampling algorithms, and delivers high accuracy even at the first generation step.
Building on our detection strategy, we unveil an explainable approach that shows the contribution of individual words or tokens to memorization.
arXiv Detail & Related papers (2024-07-31T16:13:29Z)
- Mitigating Approximate Memorization in Language Models via Dissimilarity Learned Policy [0.0]
Large language models (LLMs) are trained on large amounts of data.
LLMs have been shown to memorize parts of their training data and to emit it verbatim when prompted appropriately by an adversary.
arXiv Detail & Related papers (2023-05-02T15:53:28Z)
- Reducing Training Sample Memorization in GANs by Training with Memorization Rejection [80.0916819303573]
We propose memorization rejection, a training scheme that rejects generated samples that are near-duplicates of training samples during training; a minimal illustrative sketch appears after this list.
Our scheme is simple, generic, and can be directly applied to any GAN architecture.
arXiv Detail & Related papers (2022-10-21T20:17:50Z)
- Decoupling Knowledge from Memorization: Retrieval-augmented Prompt Learning [113.58691755215663]
We develop RetroPrompt to help a model strike a balance between generalization and memorization.
In contrast with vanilla prompt learning, RetroPrompt constructs an open-book knowledge-store from training instances.
Extensive experiments demonstrate that RetroPrompt can obtain better performance in both few-shot and zero-shot settings.
arXiv Detail & Related papers (2022-05-29T16:07:30Z)
- Counterfactual Memorization in Neural Language Models [91.8747020391287]
Modern neural language models that are widely used in various NLP tasks risk memorizing sensitive information from their training data.
An open question in previous studies of language model memorization is how to filter out "common" memorization.
We formulate a notion of counterfactual memorization which characterizes how a model's predictions change if a particular document is omitted during training.
arXiv Detail & Related papers (2021-12-24T04:20:57Z)
- Exploring Memorization in Adversarial Training [58.38336773082818]
We investigate the memorization effect in adversarial training (AT) for promoting a deeper understanding of capacity, convergence, generalization, and especially robust overfitting.
We propose a new mitigation algorithm motivated by detailed memorization analyses.
arXiv Detail & Related papers (2021-06-03T05:39:57Z)
- Automatic Recall Machines: Internal Replay, Continual Learning and the Brain [104.38824285741248]
Replay in neural networks involves training on sequential data with memorized samples, which counteracts forgetting of previous behavior caused by non-stationarity.
We present a method where these auxiliary samples are generated on the fly, given only the model that is being trained for the assessed objective.
Instead, the implicit memory of learned samples within the assessed model itself is exploited.
arXiv Detail & Related papers (2020-06-22T15:07:06Z)
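As referenced in the memorization-rejection entry above, here is a minimal illustrative sketch of a near-duplicate filter in that spirit. The L2 distance metric, the threshold, and the flattened-array layout are assumptions for illustration, not that paper's implementation:

```python
import numpy as np

def reject_near_duplicates(generated: np.ndarray,
                           training_set: np.ndarray,
                           threshold: float) -> np.ndarray:
    """Keep only generated samples whose nearest training sample is farther
    than `threshold` in L2 distance (rows are flattened samples).

    Hypothetical filter: samples that would be rejected are dropped before
    they contribute to the generator/discriminator update.
    """
    keep = []
    for sample in generated:
        distances = np.linalg.norm(training_set - sample, axis=1)
        if distances.min() > threshold:
            keep.append(sample)
    return np.stack(keep) if keep else generated[:0]
```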