An Enhanced Knowledge Injection Model for Commonsense Generation
- URL: http://arxiv.org/abs/2012.00366v1
- Date: Tue, 1 Dec 2020 09:51:23 GMT
- Title: An Enhanced Knowledge Injection Model for Commonsense Generation
- Authors: Zhihao Fan, Yeyun Gong, Zhongyu Wei, Siyuan Wang, Yameng Huang, Jian
Jiao, Xuanjing Huang, Nan Duan, Ruofei Zhang
- Abstract summary: Commonsense generation aims at generating plausible everyday scenario descriptions based on a set of provided concepts.
We retrieve prototypes from external knowledge to assist the understanding of the scenario for better description generation.
We conduct experiments on the CommonGen benchmark, and the results show that our method significantly improves performance on all metrics.
- Score: 68.12943221053025
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Commonsense generation aims at generating plausible everyday scenario
descriptions based on a set of provided concepts. Digging out the relationships
among concepts from scratch is non-trivial; therefore, we retrieve prototypes
from external knowledge to assist the understanding of the scenario for better
description generation. We integrate two additional modules, namely a position
indicator and a scaling module, into the pretrained encoder-decoder model for
prototype modeling to enhance the knowledge injection procedure. We conduct
experiments on the CommonGen benchmark, and the results show that our method
significantly improves performance on all metrics.
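The abstract only names the two added modules. As a minimal illustrative sketch (the function names, embedding dimensions, and the scalar gate below are assumptions for illustration, not the authors' implementation), a position indicator that marks whether a token comes from the concept set or the retrieved prototype, plus a scaling factor that down-weights prototype tokens, might look like:

```python
import numpy as np

rng = np.random.default_rng(0)

d_model = 8
# Toy vocabulary and token embeddings (hypothetical; stands in for the
# pretrained model's embedding table).
vocab = {"dog": 0, "frisbee": 1, "catch": 2, "a": 3, "man": 4, "throws": 5}
embed = rng.normal(size=(len(vocab), d_model))

# Position-indicator embeddings: row 0 marks concept tokens,
# row 1 marks retrieved prototype tokens.
indicator = rng.normal(size=(2, d_model))

def inject(concepts, prototype, scale):
    """Build the encoder input: concept and prototype embeddings are
    concatenated, each tagged with an indicator embedding for its source,
    and prototype tokens are scaled down so retrieved (possibly noisy)
    knowledge does not dominate the provided concepts."""
    c = embed[[vocab[w] for w in concepts]] + indicator[0]
    p = (embed[[vocab[w] for w in prototype]] + indicator[1]) * scale
    return np.concatenate([c, p], axis=0)

x = inject(["dog", "frisbee", "catch"], ["a", "man", "throws"], scale=0.5)
print(x.shape)  # one row per token: 3 concept + 3 prototype tokens
```

In the paper the scaling is a learned module rather than a fixed scalar; the sketch only shows where such a gate would sit in the input pipeline.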
Related papers
- Deep Fusion: Capturing Dependencies in Contrastive Learning via Transformer Projection Heads [0.0]
Contrastive Learning (CL) has emerged as a powerful method for training feature extraction models using unlabeled data.
Recent studies suggest that incorporating a linear projection head post-backbone significantly enhances model performance.
We introduce a novel application of transformers in the projection head role for contrastive learning, marking the first endeavor of its kind.
arXiv Detail & Related papers (2024-03-27T15:24:54Z)
- Self-Supervised Representation Learning with Meta Comprehensive Regularization [11.387994024747842]
We introduce a module called CompMod with Meta Comprehensive Regularization (MCR), embedded into existing self-supervised frameworks.
We update our proposed model through a bi-level optimization mechanism, enabling it to capture comprehensive features.
We provide theoretical support for our proposed method from information theory and causal counterfactual perspective.
arXiv Detail & Related papers (2024-03-03T15:53:48Z)
- R-Cut: Enhancing Explainability in Vision Transformers with Relationship Weighted Out and Cut [14.382326829600283]
We introduce two modules: the "Relationship Weighted Out" module and the "Cut" module.
The "Cut" module performs fine-grained feature decomposition, taking into account factors such as position, texture, and color.
We validate our method with extensive qualitative and quantitative experiments on the ImageNet dataset.
arXiv Detail & Related papers (2023-07-18T08:03:51Z)
- Set-to-Sequence Ranking-based Concept-aware Learning Path Recommendation [49.85548436111153]
We propose a novel framework named Set-to-Sequence Ranking-based Concept-aware Learning Path Recommendation (SRC)
SRC formulates the recommendation task under a set-to-sequence paradigm.
We conduct extensive experiments on two real-world public datasets and one industrial dataset.
arXiv Detail & Related papers (2023-06-07T08:24:44Z)
- Plug-and-Play Knowledge Injection for Pre-trained Language Models [116.37916535076478]
Injecting external knowledge can improve the performance of pre-trained language models (PLMs) on various downstream NLP tasks.
Massive retraining is required to deploy new knowledge injection methods or knowledge bases for downstream tasks.
We study how to improve the flexibility and efficiency of knowledge injection by reusing existing downstream models.
arXiv Detail & Related papers (2023-05-28T10:58:00Z)
- Fine-grained Contrastive Learning for Definition Generation [10.549051541793544]
Previous encoder-decoder models lack effective representation learning to capture the full semantic components of the given word.
We propose a novel contrastive learning method, encouraging the model to capture more detailed semantic representations from the definition sequence encoding.
arXiv Detail & Related papers (2022-10-02T14:55:01Z)
- Towards a Predictive Processing Implementation of the Common Model of Cognition [79.63867412771461]
We describe an implementation of the common model of cognition grounded in neural generative coding and holographic associative memory.
The proposed system creates the groundwork for developing agents that learn continually from diverse tasks as well as model human performance at larger scales.
arXiv Detail & Related papers (2021-05-15T22:55:23Z)
- Common Sense or World Knowledge? Investigating Adapter-Based Knowledge Injection into Pretrained Transformers [54.417299589288184]
We investigate models for complementing the distributional knowledge of BERT with conceptual knowledge from ConceptNet and its corresponding Open Mind Common Sense (OMCS) corpus.
Our adapter-based models substantially outperform BERT on inference tasks that require the type of conceptual knowledge explicitly present in ConceptNet and OMCS.
arXiv Detail & Related papers (2020-05-24T15:49:57Z)
- Hierarchical Predictive Coding Models in a Deep-Learning Framework [1.370633147306388]
We review some of the more well known models of predictive coding.
We also survey some recent attempts to cast these models within a deep learning framework.
arXiv Detail & Related papers (2020-05-07T03:39:57Z)
- Multi-Scale Boosted Dehazing Network with Dense Feature Fusion [92.92572594942071]
We propose a Multi-Scale Boosted Dehazing Network with Dense Feature Fusion based on the U-Net architecture.
We show that the proposed model performs favorably against the state-of-the-art approaches on the benchmark datasets as well as real-world hazy images.
arXiv Detail & Related papers (2020-04-28T09:34:47Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.