ConcEPT: Concept-Enhanced Pre-Training for Language Models
- URL: http://arxiv.org/abs/2401.05669v1
- Date: Thu, 11 Jan 2024 05:05:01 GMT
- Title: ConcEPT: Concept-Enhanced Pre-Training for Language Models
- Authors: Xintao Wang, Zhouhong Gu, Jiaqing Liang, Dakuan Lu, Yanghua Xiao, Wei Wang
- Abstract summary: ConcEPT aims to infuse conceptual knowledge into pre-trained language models.
It exploits external taxonomies through entity concept prediction, a pre-training objective that predicts the concepts of entities mentioned in the pre-training contexts.
Results of experiments show that ConcEPT gains improved conceptual knowledge with concept-enhanced pre-training.
- Score: 57.778895980999124
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Pre-trained language models (PLMs) have been prevailing in state-of-the-art
methods for natural language processing, and knowledge-enhanced PLMs are
further proposed to promote model performance in knowledge-intensive tasks.
However, conceptual knowledge, one essential kind of knowledge for human
cognition, still remains understudied in this line of research. This limits
PLMs' performance in scenarios requiring human-like cognition, such as
understanding long-tail entities with concepts. In this paper, we propose
ConcEPT, which stands for Concept-Enhanced Pre-Training for language models, to
infuse conceptual knowledge into PLMs. ConcEPT exploits external taxonomies
with entity concept prediction, a novel pre-training objective to predict the
concepts of entities mentioned in the pre-training contexts. Unlike previous
concept-enhanced methods, ConcEPT can be readily adapted to various downstream
applications without entity linking or concept mapping. Results of extensive
experiments show the effectiveness of ConcEPT in four tasks such as entity
typing, which validates that our model gains improved conceptual knowledge with
concept-enhanced pre-training.
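The entity concept prediction objective described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes the objective is framed as multi-label classification over taxonomy concepts with a binary cross-entropy loss, and it uses plain string matching as a stand-in for however entity mentions are located during pre-training. The toy taxonomy and all function names below are hypothetical.

```python
import math

# Hypothetical toy taxonomy (entity -> concepts); not taken from the paper.
TAXONOMY = {
    "Paris": {"city", "capital"},
    "Einstein": {"physicist", "person"},
}
# Fixed, sorted concept vocabulary so label vectors have a stable order.
CONCEPTS = sorted({c for cs in TAXONOMY.values() for c in cs})


def concept_labels(entity):
    """Return a multi-hot target over CONCEPTS for one entity."""
    gold = TAXONOMY.get(entity, set())
    return [1.0 if c in gold else 0.0 for c in CONCEPTS]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def concept_prediction_loss(logits, labels):
    """Mean binary cross-entropy between per-concept logits and labels."""
    eps = 1e-12  # guards against log(0)
    total = 0.0
    for z, y in zip(logits, labels):
        p = sigmoid(z)
        total -= y * math.log(p + eps) + (1.0 - y) * math.log(1.0 - p + eps)
    return total / len(labels)


def make_examples(context, taxonomy=TAXONOMY):
    """Pair each taxonomy entity found in the context with its labels.

    Substring matching is an assumption made only for this sketch.
    """
    return [(e, concept_labels(e)) for e in taxonomy if e in context]
```

In a real setup the logits would come from the language model's representation of the entity mention; here they are passed in directly so that the loss can be inspected in isolation.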
Related papers
- Interpreting Pretrained Language Models via Concept Bottlenecks [55.47515772358389]
Pretrained language models (PLMs) have made significant strides in various natural language processing tasks.
The lack of interpretability due to their "black-box" nature poses challenges for responsible implementation.
We propose a novel approach to interpreting PLMs by employing high-level, meaningful concepts that are easily understandable for humans.
arXiv Detail & Related papers (2023-11-08T20:41:18Z) - Improving Language Models Meaning Understanding and Consistency by
Learning Conceptual Roles from Dictionary [65.268245109828]
Non-human-like behaviour of contemporary pre-trained language models (PLMs) is a leading cause undermining their trustworthiness.
A striking phenomenon is the generation of inconsistent predictions, which produces contradictory results.
We propose a practical approach that alleviates the inconsistent behaviour issue by improving PLM awareness.
arXiv Detail & Related papers (2023-10-24T06:15:15Z) - Towards a General Framework for Continual Learning with Pre-training [55.88910947643436]
We present a general framework for continual learning of sequentially arrived tasks with the use of pre-training.
We decompose its objective into three hierarchical components, including within-task prediction, task-identity inference, and task-adaptive prediction.
We propose an innovative approach to explicitly optimize these components with parameter-efficient fine-tuning (PEFT) techniques and representation statistics.
arXiv Detail & Related papers (2023-10-21T02:03:38Z) - Learning to Receive Help: Intervention-Aware Concept Embedding Models [44.1307928713715]
Concept Bottleneck Models (CBMs) tackle the opacity of neural architectures by constructing and explaining their predictions using a set of high-level concepts.
Recent work has shown that intervention efficacy can be highly dependent on the order in which concepts are intervened.
We propose Intervention-aware Concept Embedding models (IntCEMs), a novel CBM-based architecture and training paradigm that improves a model's receptiveness to test-time interventions.
arXiv Detail & Related papers (2023-09-29T02:04:24Z) - Interpretable Neural-Symbolic Concept Reasoning [7.1904050674791185]
Concept-based models aim to address this issue by learning tasks based on a set of human-understandable concepts.
We propose the Deep Concept Reasoner (DCR), the first interpretable concept-based model that builds upon concept embeddings.
arXiv Detail & Related papers (2023-04-27T09:58:15Z) - ConceptX: A Framework for Latent Concept Analysis [21.760620298330235]
We present ConceptX, a human-in-the-loop framework for interpreting and annotating latent representational space in pre-trained Language Models (pLMs).
We use an unsupervised method to discover concepts learned in these models and enable a graphical interface for humans to generate explanations for the concepts.
arXiv Detail & Related papers (2022-11-12T11:31:09Z) - COPEN: Probing Conceptual Knowledge in Pre-trained Language Models [60.10147136876669]
Conceptual knowledge is fundamental to human cognition and knowledge bases.
Existing knowledge probing works only focus on factual knowledge of pre-trained language models (PLMs) and ignore conceptual knowledge.
We design three tasks to probe whether PLMs organize entities by conceptual similarities, learn conceptual properties, and conceptualize entities in contexts.
For the tasks, we collect and annotate 24k data instances covering 393 concepts, which is COPEN, a COnceptual knowledge Probing bENchmark.
arXiv Detail & Related papers (2022-11-08T08:18:06Z) - A Survey of Knowledge Enhanced Pre-trained Models [28.160826399552462]
We refer to pre-trained language models with knowledge injection as knowledge-enhanced pre-trained language models (KEPLMs).
These models demonstrate deep understanding and logical reasoning and introduce interpretability.
arXiv Detail & Related papers (2021-10-01T08:51:58Z) - A Competence-aware Curriculum for Visual Concepts Learning via Question
Answering [95.35905804211698]
We propose a competence-aware curriculum for visual concept learning in a question-answering manner.
We design a neural-symbolic concept learner for learning the visual concepts and a multi-dimensional Item Response Theory (mIRT) model for guiding the learning process.
Experimental results on CLEVR show that with a competence-aware curriculum, the proposed method achieves state-of-the-art performances.
arXiv Detail & Related papers (2020-07-03T05:08:09Z)