Decoupling Knowledge from Memorization: Retrieval-augmented Prompt
Learning
- URL: http://arxiv.org/abs/2205.14704v5
- Date: Tue, 19 Sep 2023 12:33:09 GMT
- Title: Decoupling Knowledge from Memorization: Retrieval-augmented Prompt
Learning
- Authors: Xiang Chen, Lei Li, Ningyu Zhang, Xiaozhuan Liang, Shumin Deng,
Chuanqi Tan, Fei Huang, Luo Si, Huajun Chen
- Abstract summary: We develop RetroPrompt to help a model strike a balance between generalization and memorization.
In contrast with vanilla prompt learning, RetroPrompt constructs an open-book knowledge-store from training instances.
Extensive experiments demonstrate that RetroPrompt can obtain better performance in both few-shot and zero-shot settings.
- Score: 113.58691755215663
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Prompt learning approaches have made waves in natural language
processing by inducing better few-shot performance, yet they still follow a
parametric learning paradigm in which forgetting and rote memorization can lead
to unstable generalization. Specifically, vanilla prompt learning may resort to
rote memorization of atypical instances during fully-supervised training, or
overfit shallow patterns when only low-shot data are available. To alleviate such
limitations, we develop RetroPrompt with the motivation of decoupling knowledge
from memorization to help the model strike a balance between generalization and
memorization. In contrast with vanilla prompt learning, RetroPrompt constructs
an open-book knowledge-store from training instances and implements a retrieval
mechanism during input, training, and inference, thus equipping
the model with the ability to retrieve related contexts from the training
corpus as cues for enhancement. Extensive experiments demonstrate that
RetroPrompt can obtain better performance in both few-shot and zero-shot
settings. We further show that RetroPrompt yields better generalization on new
datasets. A detailed analysis of memorization reveals that RetroPrompt reduces
the reliance of language models on memorization, thereby improving
generalization on downstream tasks. Code is available at
https://github.com/zjunlp/PromptKG/tree/main/research/RetroPrompt.
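To make the retrieve-and-augment idea concrete, here is a minimal sketch in the spirit of the open-book knowledge-store described in the abstract: index training instances, retrieve the nearest neighbors for a query, and prepend them as cues to a cloze-style prompt. The hashing encoder, the KnowledgeStore class, and the prompt template are illustrative assumptions, not the paper's actual implementation.

```python
# Sketch of retrieval-augmented prompting: an "open-book" datastore built from
# training instances, queried at input time to supply cues for a cloze prompt.
# The encoder, datastore layout, and template below are illustrative assumptions.
import numpy as np


def embed(text: str, dim: int = 256) -> np.ndarray:
    """Toy bag-of-words hashing encoder standing in for a PLM sentence encoder."""
    vec = np.zeros(dim)
    for token in text.lower().split():
        vec[hash(token) % dim] += 1.0
    norm = np.linalg.norm(vec)
    return vec / norm if norm > 0 else vec


class KnowledgeStore:
    """Open-book datastore of (embedding, text, label) triples from training data."""

    def __init__(self, examples):  # examples: list of (text, label) pairs
        self.texts, self.labels = zip(*examples)
        self.keys = np.stack([embed(t) for t in self.texts])

    def retrieve(self, query: str, k: int = 2):
        """Return the k most similar training instances as (text, label) cues."""
        sims = self.keys @ embed(query)
        top = np.argsort(-sims)[:k]
        return [(self.texts[i], self.labels[i]) for i in top]


def build_prompt(query: str, store: KnowledgeStore, k: int = 2) -> str:
    """Prepend retrieved neighbors as in-context cues to a cloze-style prompt."""
    cues = store.retrieve(query, k)
    cue_block = " ".join(f"{text} It was {label}." for text, label in cues)
    return f"{cue_block} {query} It was [MASK]."


if __name__ == "__main__":
    train = [
        ("The movie was thrilling from start to finish.", "great"),
        ("A dull plot and wooden acting.", "terrible"),
        ("Gorgeous cinematography and a moving score.", "great"),
    ]
    store = KnowledgeStore(train)
    print(build_prompt("The film kept me on the edge of my seat.", store))
```

In the paper's setting the retrieval also operates during training and inference rather than only at input construction; the sketch above only conveys the basic retrieve-and-augment flow.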
Related papers
- Detecting, Explaining, and Mitigating Memorization in Diffusion Models [49.438362005962375]
We introduce a straightforward yet effective method for detecting memorized prompts by inspecting the magnitude of text-conditional predictions.
Our proposed method seamlessly integrates without disrupting sampling algorithms, and delivers high accuracy even at the first generation step.
Building on our detection strategy, we unveil an explainable approach that shows the contribution of individual words or tokens to memorization.
arXiv Detail & Related papers (2024-07-31T16:13:29Z)
- Unintended Memorization in Large ASR Models, and How to Mitigate It [16.047859326721046]
Auditing memorization in large non-auto-regressive automatic speech recognition (ASR) models has been challenging.
We design a simple auditing method to measure memorization in large ASR models without the extra compute overhead.
We show that in large-scale distributed training, clipping the average gradient on each compute core maintains neutral model quality and compute cost.
arXiv Detail & Related papers (2023-10-18T06:45:49Z)
- KnowPrefix-Tuning: A Two-Stage Prefix-Tuning Framework for Knowledge-Grounded Dialogue Generation [37.36605012674462]
Existing knowledge-grounded conversation systems typically generate responses in a retrieve-then-generate manner.
We propose a two-stage tuning framework, bypassing the retrieval process by injecting prior knowledge into the lightweight knowledge prefix.
KnowPrefix-Tuning outperforms fine-tuning and other lightweight tuning approaches.
arXiv Detail & Related papers (2023-06-27T12:38:49Z)
- Detachedly Learn a Classifier for Class-Incremental Learning [11.865788374587734]
We present an analysis showing that the failure of vanilla experience replay (ER) stems from unnecessary re-learning of previous tasks and an inability to distinguish the current task from previous ones.
We propose a novel replay strategy, task-aware experience replay.
Experimental results show our method outperforms current state-of-the-art methods.
arXiv Detail & Related papers (2023-02-23T01:35:44Z)
- Preventing Verbatim Memorization in Language Models Gives a False Sense of Privacy [91.98116450958331]
We argue that verbatim memorization definitions are too restrictive and fail to capture more subtle forms of memorization.
Specifically, we design and implement an efficient defense that perfectly prevents all verbatim memorization.
We conclude by discussing potential alternative definitions and why defining memorization is a difficult yet crucial open question for neural language models.
arXiv Detail & Related papers (2022-10-31T17:57:55Z)
- Relation Extraction as Open-book Examination: Retrieval-enhanced Prompt Tuning [109.7767515627765]
We propose a new semiparametric paradigm of retrieval-enhanced prompt tuning for relation extraction.
Our model infers relations through knowledge stored in the weights during training.
Our method can achieve state-of-the-art in both standard supervised and few-shot settings.
arXiv Detail & Related papers (2022-05-04T23:38:37Z)
- Counterfactual Memorization in Neural Language Models [91.8747020391287]
Modern neural language models that are widely used in various NLP tasks risk memorizing sensitive information from their training data.
An open question in previous studies of language model memorization is how to filter out "common" memorization.
We formulate a notion of counterfactual memorization which characterizes how a model's predictions change if a particular document is omitted during training.
arXiv Detail & Related papers (2021-12-24T04:20:57Z)
- Remembering for the Right Reasons: Explanations Reduce Catastrophic Forgetting [100.75479161884935]
We propose a novel training paradigm called Remembering for the Right Reasons (RRR).
RRR stores visual model explanations for each example in the buffer and ensures the model has "the right reasons" for its predictions.
We demonstrate how RRR can be easily added to any memory or regularization-based approach and results in reduced forgetting.
arXiv Detail & Related papers (2020-10-04T10:05:27Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.