The Queen of England is not England's Queen: On the Lack of Factual
Coherency in PLMs
- URL: http://arxiv.org/abs/2402.01453v1
- Date: Fri, 2 Feb 2024 14:42:09 GMT
- Title: The Queen of England is not England's Queen: On the Lack of Factual
Coherency in PLMs
- Authors: Paul Youssef, Jörg Schlötterer, Christin Seifert
- Abstract summary: Factual knowledge encoded in Pre-trained Language Models (PLMs) enriches their representations and justifies their use as knowledge bases.
Previous work has focused on probing PLMs for factual knowledge by measuring how often they can correctly predict an object entity given a subject and a relation.
In this work, we consider a complementary aspect, namely the coherency of factual knowledge in PLMs, i.e., how often PLMs can predict the subject entity given their initial prediction of the object entity.
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Factual knowledge encoded in Pre-trained Language Models (PLMs) enriches
their representations and justifies their use as knowledge bases. Previous work
has focused on probing PLMs for factual knowledge by measuring how often they
can correctly predict an object entity given a subject and a relation, and
improving fact retrieval by optimizing the prompts used for querying PLMs. In
this work, we consider a complementary aspect, namely the coherency of factual
knowledge in PLMs, i.e., how often PLMs can predict the subject entity given
their initial prediction of the object entity. This goes beyond evaluating how
much PLMs know, and focuses on the internal state of knowledge inside them. Our
results indicate that PLMs have low coherency using manually written, optimized
and paraphrased prompts, but including an evidence paragraph leads to
substantial improvement. This shows that PLMs fail to model inverse relations
and need further enhancements to be able to handle retrieving facts from their
parameters in a coherent manner, and to be considered as knowledge bases.
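To make the probing setup concrete, the following is a minimal sketch of the forward/inverse querying loop described above. The capital-of relation, the hand-written templates, and `bert-base-cased` are illustrative assumptions, not the paper's exact prompts or models; the optimized and paraphrased prompts and the evidence-paragraph condition are omitted.

```python
# Minimal coherency-probe sketch: query a masked LM for the object, then feed
# its own prediction back through the inverse relation and check whether the
# original subject is recovered. Entities here are single-token by design.
from transformers import pipeline

fill = pipeline("fill-mask", model="bert-base-cased")

subject = "Paris"

# Step 1 (standard probing): predict the object from subject + relation.
forward = fill(f"{subject} is the capital of [MASK].", top_k=1)[0]
predicted_object = forward["token_str"]

# Step 2 (coherency): query the inverse relation with the model's own
# prediction; a coherent model should return the original subject.
backward = fill(f"The capital of {predicted_object} is [MASK].", top_k=1)[0]
recovered_subject = backward["token_str"]

print(f"forward:  {subject} -> {predicted_object}")
print(f"backward: {predicted_object} -> {recovered_subject}")
print(f"coherent: {recovered_subject.strip() == subject}")
```

Coherency is then the fraction of facts for which the subject is recovered; the evidence condition would simply prepend a supporting paragraph to both prompts.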
Related papers
- Enhancing Fact Retrieval in PLMs through Truthfulness
Pre-trained Language Models (PLMs) encode various facts about the world at their pre-training phase as they are trained to predict the next or missing word in a sentence.
Recent work shows that the hidden states of PLMs can be leveraged to determine the truthfulness of the PLMs' inputs.
In this work, we investigate the use of a helper model to improve fact retrieval (a hidden-state probe in this spirit is sketched after this list).
arXiv Detail & Related papers (2024-10-17T14:00:13Z)
- Understanding the Interplay between Parametric and Contextual Knowledge for Large Language Models
Large language models (LLMs) encode vast amounts of parametric knowledge (PK) during pre-training.
They can be further enhanced by incorporating contextual knowledge (CK).
Can LLMs effectively integrate their internal PK with external CK to solve complex problems?
arXiv Detail & Related papers (2024-10-10T23:09:08Z)
- Robust and Scalable Model Editing for Large Language Models
We propose EREN (Edit models by REading Notes) to improve the scalability and robustness of LLM editing.
Unlike existing techniques, it can integrate knowledge from multiple edits, and correctly respond to syntactically similar but semantically unrelated inputs.
arXiv Detail & Related papers (2024-03-26T06:57:23Z)
- Give Me the Facts! A Survey on Factual Knowledge Probing in Pre-trained Language Models
Pre-trained Language Models (PLMs) are trained on vast unlabeled data, rich in world knowledge.
This has sparked the interest of the community in quantifying the amount of factual knowledge present in PLMs.
In this work, we survey methods and datasets that are used to probe PLMs for factual knowledge.
arXiv Detail & Related papers (2023-10-25T11:57:13Z)
- Eva-KELLM: A New Benchmark for Evaluating Knowledge Editing of LLMs
Eva-KELLM is a new benchmark for evaluating knowledge editing of large language models.
Experimental results indicate that the current methods for knowledge editing using raw documents are not effective in yielding satisfactory results.
arXiv Detail & Related papers (2023-08-19T09:17:19Z)
- Give Us the Facts: Enhancing Large Language Models with Knowledge Graphs for Fact-aware Language Modeling
ChatGPT, a representative large language model (LLM), has gained considerable attention due to its powerful emergent abilities.
This paper proposes to enhance LLMs with knowledge graphs, yielding knowledge graph-enhanced large language models (KGLLMs).
KGLLMs provide a way to improve LLMs' factual reasoning ability, opening up new avenues for LLM research.
arXiv Detail & Related papers (2023-06-20T12:21:06Z)
- How Does Pretraining Improve Discourse-Aware Translation?
We introduce a probing task to interpret the ability of pretrained language models to capture discourse relation knowledge.
We validate our probing task on three state-of-the-art PLMs spanning encoder-, decoder-, and encoder-decoder-based architectures.
Our findings are instructive to understand how and when discourse knowledge in PLMs should work for downstream tasks.
arXiv Detail & Related papers (2023-05-31T13:36:51Z)
- Knowledge Rumination for Pre-trained Language Models
We propose a new paradigm dubbed Knowledge Rumination to help pre-trained language models utilize related latent knowledge without retrieving it from an external corpus (a simplified prompting version is sketched after this list).
We apply the proposed knowledge rumination to various language models, including RoBERTa, DeBERTa, and GPT-3.
arXiv Detail & Related papers (2023-05-15T15:47:09Z)
- Can LMs Learn New Entities from Descriptions? Challenges in Propagating Injected Knowledge
We study LMs' abilities to make inferences based on injected facts (or to propagate those facts).
We find that existing methods for updating knowledge show little propagation of injected knowledge.
Yet, prepending entity definitions in an LM's context improves performance across all settings.
arXiv Detail & Related papers (2023-05-02T17:59:46Z)
- Pre-training Language Models with Deterministic Factual Knowledge
We propose to let PLMs learn the deterministic relationship between the remaining context and the masked content.
Two pre-training tasks are introduced to motivate PLMs to rely on the deterministic relationship when filling masks.
Experiments indicate that the continuously pre-trained PLMs achieve better robustness in capturing factual knowledge.
arXiv Detail & Related papers (2022-10-20T11:04:09Z)
- Can Pretrained Language Models (Yet) Reason Deductively?
We conduct a comprehensive evaluation of the learnable deductive (also known as explicit) reasoning capability of PLMs.
Our main results suggest that PLMs cannot yet perform reliable deductive reasoning.
We reach beyond (misleading) task performance, revealing that PLMs are still far from human-level reasoning capabilities.
arXiv Detail & Related papers (2022-10-12T17:44:15Z)
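Two of the entries above describe mechanisms that are easy to sketch. First, for "Enhancing Fact Retrieval in PLMs through Truthfulness", a minimal hidden-state truthfulness probe; the mean pooling, `bert-base-cased`, and the toy training set are assumptions, not the paper's actual helper model.

```python
# Generic truthfulness probe: represent each statement by its mean-pooled
# last-layer hidden state, then train a linear classifier ("helper model")
# to separate true from false statements.
import torch
from sklearn.linear_model import LogisticRegression
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-cased")
model = AutoModel.from_pretrained("bert-base-cased")
model.eval()

def embed(statement: str) -> torch.Tensor:
    """Mean-pooled last-layer hidden state as a statement representation."""
    inputs = tokenizer(statement, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**inputs).last_hidden_state  # (1, seq_len, dim)
    return hidden.mean(dim=1).squeeze(0)

# Toy training set: label 1 = factually true, 0 = false.
statements = [
    ("Paris is the capital of France.", 1),
    ("Berlin is the capital of Germany.", 1),
    ("Paris is the capital of Germany.", 0),
    ("Berlin is the capital of France.", 0),
]
X = torch.stack([embed(s) for s, _ in statements]).numpy()
y = [label for _, label in statements]

probe = LogisticRegression(max_iter=1000).fit(X, y)
test = embed("Madrid is the capital of Spain.").numpy().reshape(1, -1)
print("P(true) =", probe.predict_proba(test)[0, 1])
```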
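Second, for "Knowledge Rumination for Pre-trained Language Models", a surface-level two-stage "recall, then answer" sketch. The paper integrates recalled latent knowledge inside the model; this prompting variant, along with `gpt2` and the prompt wording, is an illustrative simplification.

```python
# Generic knowledge-rumination sketch: first let the model verbalize related
# latent knowledge, then condition the final answer on that recalled text.
from transformers import pipeline

generate = pipeline("text-generation", model="gpt2")

question = "What is the capital of France?"

# Stage 1: rumination, i.e. eliciting related latent knowledge.
rumination_prompt = f"Question: {question}\nAs far as I know,"
background = generate(rumination_prompt, max_new_tokens=30,
                      do_sample=False)[0]["generated_text"]

# Stage 2: answer conditioned on the model's own recalled knowledge.
answer_prompt = f"{background}\nTherefore, the answer is"
print(generate(answer_prompt, max_new_tokens=10,
               do_sample=False)[0]["generated_text"])
```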