Decoupling Knowledge from Memorization: Retrieval-augmented Prompt
Learning
- URL: http://arxiv.org/abs/2205.14704v5
- Date: Tue, 19 Sep 2023 12:33:09 GMT
- Title: Decoupling Knowledge from Memorization: Retrieval-augmented Prompt
Learning
- Authors: Xiang Chen, Lei Li, Ningyu Zhang, Xiaozhuan Liang, Shumin Deng,
Chuanqi Tan, Fei Huang, Luo Si, Huajun Chen
- Abstract summary: We develop RetroPrompt to help a model strike a balance between generalization and memorization.
In contrast with vanilla prompt learning, RetroPrompt constructs an open-book knowledge-store from training instances.
Extensive experiments demonstrate that RetroPrompt can obtain better performance in both few-shot and zero-shot settings.
- Score: 113.58691755215663
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Prompt learning approaches have made waves in natural language processing by
inducing better few-shot performance, yet they still follow a parametric
learning paradigm in which forgetting and rote memorization can lead to
unstable generalization. Specifically, vanilla prompt learning may resort to
rote memorization of atypical instances during fully-supervised training, or
overfit shallow patterns with low-shot data. To alleviate such
limitations, we develop RetroPrompt with the motivation of decoupling knowledge
from memorization to help the model strike a balance between generalization and
memorization. In contrast with vanilla prompt learning, RetroPrompt constructs
an open-book knowledge-store from training instances and applies a retrieval
mechanism during input, training, and inference, equipping
the model with the ability to retrieve related contexts from the training
corpus as cues for enhancement. Extensive experiments demonstrate that
RetroPrompt can obtain better performance in both few-shot and zero-shot
settings. We further show that RetroPrompt yields better generalization on new
datasets. A detailed analysis of memorization reveals that RetroPrompt reduces
the reliance of language models on memorization, thus improving generalization
on downstream tasks. Code is available at
https://github.com/zjunlp/PromptKG/tree/main/research/RetroPrompt.
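A rough sketch of the retrieval mechanism described in the abstract, assuming a hypothetical frozen encoder `encode` that maps a list of strings to L2-normalized NumPy embeddings; this is an illustrative approximation, not the authors' RetroPrompt implementation (see the repository above for that).

    import numpy as np

    def build_knowledge_store(train_texts, train_labels, encode):
        """Encode every training instance into an 'open-book' knowledge-store."""
        keys = encode(train_texts)                   # (n, d), assumed L2-normalized
        return {"keys": keys, "texts": list(train_texts), "labels": list(train_labels)}

    def retrieve(store, query_text, encode, k=3):
        """Return the k nearest training instances for a query (cosine similarity)."""
        q = encode([query_text])[0]                  # (d,)
        sims = store["keys"] @ q                     # inner product == cosine here
        idx = np.argsort(-sims)[:k]
        return [(store["texts"][i], store["labels"][i]) for i in idx]

    def retrieval_augmented_prompt(store, query_text, encode, template, k=3):
        """Prepend retrieved neighbors as demonstration-style cues to a cloze prompt."""
        cues = retrieve(store, query_text, encode, k)
        demo = " ".join(f"{text} It was {label}." for text, label in cues)
        return demo + " " + template.format(x=query_text)

    # Hypothetical usage with a cloze-style template:
    # store = build_knowledge_store(train_texts, train_labels, encode)
    # prompt = retrieval_augmented_prompt(store, "A gripping, well-acted drama.",
    #                                     encode, template="{x} It was [MASK].")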
Related papers
- Detecting, Explaining, and Mitigating Memorization in Diffusion Models [49.438362005962375]
We introduce a straightforward yet effective method for detecting memorized prompts by inspecting the magnitude of text-conditional predictions.
Our proposed method seamlessly integrates without disrupting sampling algorithms, and delivers high accuracy even at the first generation step.
Building on our detection strategy, we unveil an explainable approach that shows the contribution of individual words or tokens to memorization.
arXiv Detail & Related papers (2024-07-31T16:13:29Z)
- Unintended Memorization in Large ASR Models, and How to Mitigate It [16.047859326721046]
Auditing memorization in large non-auto-regressive automatic speech recognition (ASR) models has been challenging.
We design a simple auditing method to measure memorization in large ASR models without extra compute overhead.
We show that in large-scale distributed training, clipping the average gradient on each compute core maintains neutral model quality and compute cost; a minimal sketch of this per-core clipping idea appears after this list.
arXiv Detail & Related papers (2023-10-18T06:45:49Z)
- KnowPrefix-Tuning: A Two-Stage Prefix-Tuning Framework for Knowledge-Grounded Dialogue Generation [37.36605012674462]
Existing knowledge-grounded conversation systems generate responses typically in a retrieve-then-generate manner.
We propose a two-stage tuning framework, bypassing the retrieval process by injecting prior knowledge into the lightweight knowledge prefix.
KnowPrefix-Tuning outperforms fine-tuning and other lightweight tuning approaches.
arXiv Detail & Related papers (2023-06-27T12:38:49Z)
- Retrieval-Enhanced Visual Prompt Learning for Few-shot Classification [9.843214426749764]
We propose Retrieval-enhanced Prompt learning (RePrompt), which introduces retrieval mechanisms to cache the knowledge representations from downstream tasks.
Our experiments over 15 vision datasets, including 11 downstream tasks with few-shot setting and 4 domain generalization benchmarks, demonstrate that RePrompt achieves considerably improved performance.
arXiv Detail & Related papers (2023-06-04T03:06:37Z)
- Detachedly Learn a Classifier for Class-Incremental Learning [11.865788374587734]
We present an analysis showing that the failure of vanilla experience replay (ER) stems from unnecessary re-learning of previous tasks and an inability to distinguish the current task from previous ones.
We propose a novel replay strategy, task-aware experience replay.
Experimental results show our method outperforms current state-of-the-art methods.
arXiv Detail & Related papers (2023-02-23T01:35:44Z)
- Preventing Verbatim Memorization in Language Models Gives a False Sense of Privacy [91.98116450958331]
We argue that verbatim memorization definitions are too restrictive and fail to capture more subtle forms of memorization.
Specifically, we design and implement an efficient defense that perfectly prevents all verbatim memorization.
We conclude by discussing potential alternative definitions and why defining memorization is a difficult yet crucial open question for neural language models.
arXiv Detail & Related papers (2022-10-31T17:57:55Z)
- Relation Extraction as Open-book Examination: Retrieval-enhanced Prompt Tuning [109.7767515627765]
We propose a new semiparametric paradigm of retrieval-enhanced prompt tuning for relation extraction.
Our model infers relations through knowledge stored in the weights during training.
Our method can achieve state-of-the-art in both standard supervised and few-shot settings.
arXiv Detail & Related papers (2022-05-04T23:38:37Z)
- Counterfactual Memorization in Neural Language Models [91.8747020391287]
Modern neural language models that are widely used in various NLP tasks risk memorizing sensitive information from their training data.
An open question in previous studies of language model memorization is how to filter out "common" memorization.
We formulate a notion of counterfactual memorization which characterizes how a model's predictions change if a particular document is omitted during training.
arXiv Detail & Related papers (2021-12-24T04:20:57Z)
- Remembering for the Right Reasons: Explanations Reduce Catastrophic Forgetting [100.75479161884935]
We propose a novel training paradigm called Remembering for the Right Reasons (RRR).
RRR stores visual model explanations for each example in the buffer and ensures the model has "the right reasons" for its predictions.
We demonstrate how RRR can be easily added to any memory or regularization-based approach and results in reduced forgetting.
arXiv Detail & Related papers (2020-10-04T10:05:27Z)
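As referenced in the ASR memorization entry above, the following is a minimal NumPy sketch of clipping each compute core's average gradient before cross-core aggregation; the function names and data layout are assumptions for illustration, not that paper's code.

    import numpy as np

    def clip_by_global_norm(grads, clip_norm):
        """Scale a list of gradient arrays so their joint L2 norm is at most clip_norm."""
        total_norm = np.sqrt(sum(float(np.sum(g ** 2)) for g in grads))
        scale = min(1.0, clip_norm / (total_norm + 1e-12))
        return [g * scale for g in grads]

    def per_core_clipped_aggregate(per_core_grads, clip_norm):
        """Clip each core's already-averaged gradient, then average across cores.

        per_core_grads: list with one entry per compute core; each entry is a list
        of gradient arrays averaged over that core's local micro-batch.
        """
        clipped = [clip_by_global_norm(core, clip_norm) for core in per_core_grads]
        n_cores = len(clipped)
        return [sum(parts) / n_cores for parts in zip(*clipped)]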
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of this list (including all information) and is not responsible for any consequences of its use.