Retrieval Enhanced Model for Commonsense Generation
- URL: http://arxiv.org/abs/2105.11174v1
- Date: Mon, 24 May 2021 09:49:17 GMT
- Title: Retrieval Enhanced Model for Commonsense Generation
- Authors: Han Wang, Yang Liu, Chenguang Zhu, Linjun Shou, Ming Gong, Yichong Xu,
Michael Zeng
- Abstract summary: We propose a novel framework using retrieval methods to enhance both the pre-training and fine-tuning for commonsense generation.
We retrieve prototype sentence candidates by concept matching and use them as auxiliary input.
We demonstrate experimentally on the large-scale CommonGen benchmark that our approach achieves new state-of-the-art results.
- Score: 27.808363395849536
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Commonsense generation is the challenging task of producing a
plausible sentence that describes an everyday scenario using a provided set of
concepts. Its demands for reasoning over commonsense knowledge and for
compositional generalization challenge even strong pre-trained language
generation models. We propose a novel framework that uses retrieval to enhance
both pre-training and fine-tuning for commonsense generation. We retrieve
prototype sentence candidates by concept matching and use them as auxiliary
input. For fine-tuning, we further boost the model's performance with a
trainable sentence retriever. We demonstrate experimentally on the large-scale
CommonGen benchmark that our approach achieves new state-of-the-art results.
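The abstract names two concrete mechanisms: prototype retrieval by concept matching and a trainable sentence retriever for fine-tuning. Below is a minimal sketch of the concept-matching step, assuming a simple word-overlap score; the toy corpus, scoring rule, and function names are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch of prototype retrieval by concept matching. Assumption:
# candidates are ranked by how many input concepts appear in the sentence;
# a real system would lemmatize (so "throws" matches "throw") and search a
# large external corpus rather than the toy list below.

def retrieve_prototypes(concepts, corpus, k=2):
    """Return the k corpus sentences sharing the most concepts with the input."""
    def match_score(sentence):
        words = set(sentence.lower().split())
        return len(concepts & words)
    return sorted(corpus, key=match_score, reverse=True)[:k]

concepts = {"dog", "frisbee", "throw", "catch"}
corpus = [
    "A man throws a frisbee and his dog leaps to grab it.",
    "The cat sleeps on the warm windowsill.",
    "Children throw a ball back and forth in the park.",
]
# The top-ranked prototypes are concatenated with the concept set as
# auxiliary input to the sequence-to-sequence generator.
print(retrieve_prototypes(concepts, corpus))
```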
Related papers
- Vector-Quantized Prompt Learning for Paraphrase Generation [18.40940464497253]
This paper proposes to generate diverse, high-quality paraphrases by exploiting pre-trained models with instance-dependent prompts.
Extensive experiments demonstrate that the proposed method achieves new state-of-the-art results on three benchmark datasets.
arXiv Detail & Related papers (2023-11-25T07:13:06Z)
- Commonsense Knowledge Transfer for Pre-trained Language Models [83.01121484432801]
We introduce commonsense knowledge transfer, a framework to transfer the commonsense knowledge stored in a neural commonsense knowledge model to a general-purpose pre-trained language model.
It first exploits general texts to form queries for extracting commonsense knowledge from the neural commonsense knowledge model.
It then refines the language model with two self-supervised objectives: commonsense mask infilling and commonsense relation prediction.
arXiv Detail & Related papers (2023-06-04T15:44:51Z)
- Enhancing Retrieval-Augmented Large Language Models with Iterative Retrieval-Generation Synergy [164.83371924650294]
We show that strong performance can be achieved by a method called Iter-RetGen, which synergizes retrieval and generation in an iterative manner: a model's output reveals what is still needed to finish a task and thus provides informative context for retrieving more relevant knowledge (a minimal sketch of this loop appears after this list).
Iter-RetGen processes all retrieved knowledge as a whole and largely preserves flexibility in generation without imposing structural constraints.
arXiv Detail & Related papers (2023-05-24T16:17:36Z)
- Retrieval Augmentation for Commonsense Reasoning: A Unified Approach [64.63071051375289]
We propose RACo, a unified framework for retrieval-augmented commonsense reasoning.
RACo significantly outperforms other knowledge-enhanced methods.
arXiv Detail & Related papers (2022-10-23T23:49:08Z)
- Generative or Contrastive? Phrase Reconstruction for Better Sentence Representation Learning [86.01683892956144]
We propose a novel generative self-supervised learning objective based on phrase reconstruction.
This generative objective can yield sentence representations powerful enough to perform on par with contrastive learning on Semantic Textual Similarity tasks.
arXiv Detail & Related papers (2022-04-20T10:00:46Z)
- Augmenting BERT-style Models with Predictive Coding to Improve Discourse-level Representations [20.855686009404703]
We propose to use ideas from predictive coding theory to augment BERT-style language models with a mechanism that allows them to learn discourse-level representations.
Our proposed approach is able to predict future sentences using explicit top-down connections that operate at the intermediate layers of the network.
arXiv Detail & Related papers (2021-09-10T00:45:28Z)
- Lexically-constrained Text Generation through Commonsense Knowledge Extraction and Injection [62.071938098215085]
We focus on the CommonGen benchmark, wherein the aim is to generate a plausible sentence for a given set of input concepts.
We propose strategies for enhancing the semantic correctness of the generated text.
arXiv Detail & Related papers (2020-12-19T23:23:40Z)
- An Enhanced Knowledge Injection Model for Commonsense Generation [68.12943221053025]
Commonsense generation aims at producing a plausible description of an everyday scenario based on a set of provided concepts.
We retrieve prototypes from external knowledge to aid understanding of the scenario and improve description generation.
We conduct experiments on the CommonGen benchmark, and the results show that our method significantly improves performance on all metrics.
arXiv Detail & Related papers (2020-12-01T09:51:23Z)
- EnsembleGAN: Adversarial Learning for Retrieval-Generation Ensemble Model on Short-Text Conversation [37.80290058812499]
EnsembleGAN is an adversarial learning framework for enhancing a retrieval-generation ensemble model in the open-domain conversation scenario.
It consists of a language-model-like generator, a ranker generator, and a ranker discriminator.
arXiv Detail & Related papers (2020-04-30T05:59:12Z)
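Several of the entries above combine retrieval with generation, and the Iter-RetGen summary is the most explicit about the mechanism. Below is a minimal sketch of that iterative retrieve-then-generate loop, assuming hypothetical `retrieve` and `generate` callables and a fixed iteration count; it illustrates the idea rather than the published implementation.

```python
# Minimal sketch of an iterative retrieval-generation loop in the spirit of
# Iter-RetGen: each draft answer is folded back into the retrieval query,
# so generation output guides the next retrieval round. `retrieve` and
# `generate` are hypothetical stand-ins for a search index and an LLM call.

def iter_retgen(question, retrieve, generate, iterations=2):
    answer = ""
    for _ in range(iterations):
        # Query with the question plus the current draft, so the draft's
        # content steers retrieval toward what is still missing.
        docs = retrieve(query=f"{question} {answer}".strip())
        # Regenerate conditioned on all retrieved knowledge at once,
        # with no structural constraints on the output.
        answer = generate(question=question, context=docs)
    return answer
```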