Improving Language Models Meaning Understanding and Consistency by
Learning Conceptual Roles from Dictionary
- URL: http://arxiv.org/abs/2310.15541v1
- Date: Tue, 24 Oct 2023 06:15:15 GMT
- Title: Improving Language Models Meaning Understanding and Consistency by
Learning Conceptual Roles from Dictionary
- Authors: Myeongjun Erik Jang, Thomas Lukasiewicz
- Abstract summary: The non-human-like behaviour of contemporary pre-trained language models (PLMs) is a leading factor undermining their trustworthiness.
A striking phenomenon is the generation of inconsistent predictions, which produces contradictory results.
We propose a practical approach that alleviates the inconsistent behaviour issue by improving PLMs' meaning awareness.
- Score: 65.268245109828
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: The non-human-like behaviour of contemporary pre-trained language models
(PLMs) is a leading factor undermining their trustworthiness. A striking
phenomenon of such faulty behaviours is the generation of inconsistent
predictions, which produces logically contradictory results, such as generating
different predictions for texts conveying the same meaning or violating
logical properties. Previous studies exploited data augmentation or implemented
specialised loss functions to alleviate the issue. However, their applicability is
limited because they require expensive training resources for large PLMs
and can each handle only a certain type of consistency. To address this, we propose a
practical approach that alleviates the inconsistent behaviour issue by
fundamentally improving PLMs' meaning awareness. Based on the conceptual role
theory, our method allows PLMs to capture accurate meaning by learning precise
interrelationships between concepts from word-definition pairs in a dictionary.
Next, we propose an efficient parameter integration technique that updates only
a few additional parameters to combine the learned interrelationship with PLMs'
pre-trained knowledge. Our experimental results reveal that the approach
concurrently improves multiple types of consistency, enables efficient knowledge
integration, and applies easily to other languages.
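To make the recipe above concrete, the sketch below shows one way the two ingredients could fit together: fine-tuning on word-definition pairs drawn from a dictionary while updating only a small number of additional parameters. It is a minimal illustration under assumptions of our own (a frozen roberta-base encoder, a small projection head standing in for the extra parameters, an in-batch contrastive loss, and a hypothetical three-entry dictionary), not the authors' actual objective or parameter-integration technique.

```python
# Illustrative sketch only: the paper's exact objective and parameter-integration
# mechanism are not reproduced here. A hypothetical dictionary of word-definition
# pairs is used to train a small projection head (standing in for the "few
# additional parameters") with an in-batch contrastive loss that pulls each
# word's embedding toward the embedding of its definition.
import torch
import torch.nn.functional as F
from transformers import AutoModel, AutoTokenizer

MODEL_NAME = "roberta-base"  # any encoder PLM would do
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
encoder = AutoModel.from_pretrained(MODEL_NAME)

# Freeze the pre-trained weights; only the projection head below is updated.
for p in encoder.parameters():
    p.requires_grad = False

proj = torch.nn.Linear(encoder.config.hidden_size, 256)  # the few extra parameters
optimizer = torch.optim.AdamW(proj.parameters(), lr=1e-4)

# Hypothetical word-definition pairs, as found in a dictionary.
pairs = [
    ("bank", "a financial institution that accepts deposits and makes loans"),
    ("river", "a large natural stream of water flowing to the sea or a lake"),
    ("loan", "a sum of money that is borrowed and expected to be paid back"),
]

def embed(texts):
    batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    hidden = encoder(**batch).last_hidden_state            # (B, T, H)
    mask = batch["attention_mask"].unsqueeze(-1).float()
    pooled = (hidden * mask).sum(1) / mask.sum(1)           # mean pooling
    return F.normalize(proj(pooled), dim=-1)

for _ in range(3):                                          # toy training loop
    words, defs = zip(*pairs)
    w_emb, d_emb = embed(list(words)), embed(list(defs))
    logits = w_emb @ d_emb.T / 0.05                         # in-batch similarity logits
    labels = torch.arange(len(pairs))
    loss = F.cross_entropy(logits, labels)                  # align word_i with definition_i
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```

In this reading, the contrastive loss pulls each word's representation toward the representation of its dictionary definition, a rough stand-in for learning the interrelationships between concepts that conceptual role theory emphasises; the frozen encoder plus small trainable head loosely mirrors the idea of integrating the learned knowledge without retraining the full PLM.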
Related papers
- Tuning-Free Accountable Intervention for LLM Deployment -- A
Metacognitive Approach [55.613461060997004]
Large Language Models (LLMs) have catalyzed transformative advances across a spectrum of natural language processing tasks.
We propose an innovative metacognitive approach, dubbed CLEAR, to equip LLMs with capabilities for self-aware error identification and correction.
arXiv Detail & Related papers (2024-03-08T19:18:53Z)
- Interpreting Pretrained Language Models via Concept Bottlenecks [55.47515772358389]
Pretrained language models (PLMs) have made significant strides in various natural language processing tasks.
The lack of interpretability due to their "black-box" nature poses challenges for responsible implementation.
We propose a novel approach to interpreting PLMs by employing high-level, meaningful concepts that are easily understandable for humans.
arXiv Detail & Related papers (2023-11-08T20:41:18Z)
- Explaining Language Models' Predictions with High-Impact Concepts [11.47612457613113]
We propose a complete framework for extending concept-based interpretability methods to NLP.
We optimize for features whose existence causes the output predictions to change substantially.
Our method achieves superior results on predictive impact, usability, and faithfulness compared to the baselines.
arXiv Detail & Related papers (2023-05-03T14:48:27Z)
- Towards Linguistically Informed Multi-Objective Pre-Training for Natural Language Inference [0.38233569758620045]
We introduce a linguistically enhanced combination of pre-training methods for transformers.
The pre-training objectives include POS-tagging, synset prediction based on semantic knowledge graphs, and parent prediction based on dependency parse trees.
Our approach achieves competitive results on the Natural Language Inference task, compared to the state of the art.
arXiv Detail & Related papers (2022-12-14T10:50:13Z)
- Semantic Interactive Learning for Text Classification: A Constructive Approach for Contextual Interactions [0.0]
We propose a novel interaction framework called Semantic Interactive Learning for the text domain.
We frame the problem of incorporating constructive and contextual feedback into the learner as the task of finding an architecture that enables greater semantic alignment between humans and machines.
We introduce a technique called SemanticPush that is effective for translating humans' conceptual corrections into non-extrapolating training examples.
arXiv Detail & Related papers (2022-09-07T08:13:45Z)
- Measuring and Improving Consistency in Pretrained Language Models [40.46184998481918]
We study the question: Are Pretrained Language Models (PLMs) consistent with respect to factual knowledge?
Using ParaRel, we show that the consistency of all PLMs we experiment with is poor -- though with high variance between relations (a minimal paraphrase-consistency probe is sketched after this list).
arXiv Detail & Related papers (2021-02-01T17:48:42Z)
- Learning Causal Semantic Representation for Out-of-Distribution Prediction [125.38836464226092]
We propose a Causal Semantic Generative model (CSG) based on causal reasoning, so that the semantic and variation factors are modeled separately.
We show that CSG can identify the semantic factor by fitting training data, and this semantic identification guarantees the boundedness of the OOD generalization error.
arXiv Detail & Related papers (2020-11-03T13:16:05Z)
- Pre-training Text-to-Text Transformers for Concept-centric Common Sense [48.11844351407072]
We propose a concept-aware language model (CALM) to augment pre-trained language models with concept-centric commonsense knowledge.
We show that CALM can pack more commonsense knowledge into the parameters of a pre-trained text-to-text transformer without relying on external knowledge graphs.
arXiv Detail & Related papers (2020-10-24T07:00:37Z)
- Prototypical Contrastive Learning of Unsupervised Representations [171.3046900127166]
Prototypical Contrastive Learning (PCL) is an unsupervised representation learning method.
PCL implicitly encodes semantic structures of the data into the learned embedding space.
PCL outperforms state-of-the-art instance-wise contrastive learning methods on multiple benchmarks.
arXiv Detail & Related papers (2020-05-11T09:53:36Z)
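As referenced in the entry for Measuring and Improving Consistency in Pretrained Language Models above, here is a minimal sketch of a paraphrase-consistency probe in that spirit (not ParaRel itself): the same factual relation is expressed through several prompt templates, and the model counts as consistent only if every paraphrase yields the same top prediction. The templates, the subject, and the choice of bert-base-uncased are illustrative assumptions.

```python
# Minimal paraphrase-consistency probe (illustrative, not the ParaRel benchmark).
# A masked LM is queried with several hypothetical paraphrases of the same
# relation; agreement of the top predictions is taken as a consistency signal.
from transformers import pipeline

fill = pipeline("fill-mask", model="bert-base-uncased")

# Hypothetical paraphrases expressing the same birthplace-style relation.
paraphrases = [
    "Albert Einstein was born in [MASK].",
    "The birthplace of Albert Einstein is [MASK].",
    "Albert Einstein is originally from [MASK].",
]

top_predictions = [fill(p, top_k=1)[0]["token_str"] for p in paraphrases]
consistent = len(set(t.strip().lower() for t in top_predictions)) == 1

print(top_predictions)
print("consistent across paraphrases:", consistent)
```

Averaging this kind of agreement over many subjects and relation templates is, roughly, how consistency is quantified at benchmark scale.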