The Queen of England is not England's Queen: On the Lack of Factual
Coherency in PLMs
- URL: http://arxiv.org/abs/2402.01453v1
- Date: Fri, 2 Feb 2024 14:42:09 GMT
- Title: The Queen of England is not England's Queen: On the Lack of Factual
Coherency in PLMs
- Authors: Paul Youssef, Jörg Schlötterer, Christin Seifert
- Abstract summary: Factual knowledge encoded in Pre-trained Language Models (PLMs) enriches their representations and justifies their use as knowledge bases.
Previous work has focused on probing PLMs for factual knowledge by measuring how often they can correctly predict an object entity given a subject and a relation.
In this work, we consider a complementary aspect, namely the coherency of factual knowledge in PLMs, i.e., how often PLMs can predict the subject entity given their initial prediction of the object entity.
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Factual knowledge encoded in Pre-trained Language Models (PLMs) enriches
their representations and justifies their use as knowledge bases. Previous work
has focused on probing PLMs for factual knowledge by measuring how often they
can correctly predict an object entity given a subject and a relation, and
improving fact retrieval by optimizing the prompts used for querying PLMs. In
this work, we consider a complementary aspect, namely the coherency of factual
knowledge in PLMs, i.e., how often PLMs can predict the subject entity given
their initial prediction of the object entity. This goes beyond evaluating how
much PLMs know, and focuses on the internal state of knowledge inside them. Our
results indicate that PLMs have low coherency using manually written, optimized
and paraphrased prompts, but including an evidence paragraph leads to
substantial improvement. This shows that PLMs fail to model inverse relations
and need further enhancements to be able to handle retrieving facts from their
parameters in a coherent manner, and to be considered as knowledge bases.
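To make the probing setup concrete, the following is a minimal sketch of the forward/inverse querying loop described above. The capital-of relation, the hand-written templates, and `bert-base-cased` are illustrative assumptions, not the paper's exact prompts or models; the optimized and paraphrased prompts and the evidence-paragraph condition are omitted.

```python
# Minimal coherency-probe sketch: query a masked LM for the object, then feed
# its own prediction back through the inverse relation and check whether the
# original subject is recovered. Entities here are single-token by design.
from transformers import pipeline

fill = pipeline("fill-mask", model="bert-base-cased")

subject = "Paris"

# Step 1 (standard probing): predict the object from subject + relation.
forward = fill(f"{subject} is the capital of [MASK].", top_k=1)[0]
predicted_object = forward["token_str"]

# Step 2 (coherency): query the inverse relation with the model's own
# prediction; a coherent model should return the original subject.
backward = fill(f"The capital of {predicted_object} is [MASK].", top_k=1)[0]
recovered_subject = backward["token_str"]

print(f"forward:  {subject} -> {predicted_object}")
print(f"backward: {predicted_object} -> {recovered_subject}")
print(f"coherent: {recovered_subject.strip() == subject}")
```

Coherency is then the fraction of facts for which the subject is recovered; the evidence condition would simply prepend a supporting paragraph to both prompts.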
Related papers
- Enhancing Fact Retrieval in PLMs through Truthfulness
Pre-trained Language Models (PLMs) encode various facts about the world at their pre-training phase as they are trained to predict the next or missing word in a sentence.
Recent work shows that the hidden states of PLMs can be leveraged to determine the truthfulness of the PLMs' inputs.
In this work, we investigate the use of a helper model to improve fact retrieval (a hidden-state probe in this spirit is sketched after this list).
arXiv Detail & Related papers (2024-10-17T14:00:13Z)
- Understanding the Interplay between Parametric and Contextual Knowledge for Large Language Models
Large language models (LLMs) encode vast amounts of parametric knowledge (PK) during pre-training.
They can be further enhanced by incorporating contextual knowledge (CK).
Can LLMs effectively integrate their internal PK with external CK to solve complex problems?
arXiv Detail & Related papers (2024-10-10T23:09:08Z)
- Robust and Scalable Model Editing for Large Language Models
We propose EREN (Edit models by REading Notes) to improve the scalability and robustness of LLM editing.
Unlike existing techniques, it can integrate knowledge from multiple edits, and correctly respond to syntactically similar but semantically unrelated inputs.
arXiv Detail & Related papers (2024-03-26T06:57:23Z)
- Give Me the Facts! A Survey on Factual Knowledge Probing in Pre-trained Language Models
Pre-trained Language Models (PLMs) are trained on vast unlabeled data, rich in world knowledge.
This has sparked the interest of the community in quantifying the amount of factual knowledge present in PLMs.
In this work, we survey methods and datasets that are used to probe PLMs for factual knowledge.
arXiv Detail & Related papers (2023-10-25T11:57:13Z)
- Eva-KELLM: A New Benchmark for Evaluating Knowledge Editing of LLMs
Eva-KELLM is a new benchmark for evaluating knowledge editing of large language models.
Experimental results indicate that the current methods for knowledge editing using raw documents are not effective in yielding satisfactory results.
arXiv Detail & Related papers (2023-08-19T09:17:19Z)
- Give Us the Facts: Enhancing Large Language Models with Knowledge Graphs for Fact-aware Language Modeling
ChatGPT, a representative large language model (LLM), has gained considerable attention due to its powerful emergent abilities.
This paper proposes to enhance LLMs with knowledge graphs, yielding knowledge graph-enhanced large language models (KGLLMs).
KGLLMs provide a way to improve LLMs' factual reasoning ability, opening up new avenues for LLM research.
arXiv Detail & Related papers (2023-06-20T12:21:06Z)
- How Does Pretraining Improve Discourse-Aware Translation?
We introduce a probing task to interpret the ability of pretrained language models to capture discourse relation knowledge.
We validate our probing task on three state-of-the-art PLMs spanning encoder-, decoder-, and encoder-decoder-based architectures.
Our findings are instructive to understand how and when discourse knowledge in PLMs should work for downstream tasks.
arXiv Detail & Related papers (2023-05-31T13:36:51Z)
- Knowledge Rumination for Pre-trained Language Models
We propose a new paradigm dubbed Knowledge Rumination to help pre-trained language models utilize related latent knowledge without retrieving it from an external corpus (a simplified prompting version is sketched after this list).
We apply the proposed knowledge rumination to various language models, including RoBERTa, DeBERTa, and GPT-3.
arXiv Detail & Related papers (2023-05-15T15:47:09Z)
- Can LMs Learn New Entities from Descriptions? Challenges in Propagating Injected Knowledge
We study LMs' abilities to make inferences based on injected facts (or to propagate those facts).
We find that existing methods for updating knowledge show little propagation of injected knowledge.
Yet, prepending entity definitions in an LM's context improves performance across all settings.
arXiv Detail & Related papers (2023-05-02T17:59:46Z)
- Pre-training Language Models with Deterministic Factual Knowledge
We propose to let PLMs learn the deterministic relationship between the remaining context and the masked content.
Two pre-training tasks are introduced to motivate PLMs to rely on the deterministic relationship when filling masks.
Experiments indicate that the continuously pre-trained PLMs achieve better robustness in capturing factual knowledge.
arXiv Detail & Related papers (2022-10-20T11:04:09Z)
- Can Pretrained Language Models (Yet) Reason Deductively?
We conduct a comprehensive evaluation of the learnable deductive (also known as explicit) reasoning capability of PLMs.
Our main results suggest that PLMs cannot yet perform reliable deductive reasoning.
We reach beyond (misleading) task performance, revealing that PLMs are still far from human-level reasoning capabilities.
arXiv Detail & Related papers (2022-10-12T17:44:15Z)
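Two of the entries above describe mechanisms that are easy to sketch. First, for "Enhancing Fact Retrieval in PLMs through Truthfulness", a minimal hidden-state truthfulness probe; the mean pooling, `bert-base-cased`, and the toy training set are assumptions, not the paper's actual helper model.

```python
# Generic truthfulness probe: represent each statement by its mean-pooled
# last-layer hidden state, then train a linear classifier ("helper model")
# to separate true from false statements.
import torch
from sklearn.linear_model import LogisticRegression
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-cased")
model = AutoModel.from_pretrained("bert-base-cased")
model.eval()

def embed(statement: str) -> torch.Tensor:
    """Mean-pooled last-layer hidden state as a statement representation."""
    inputs = tokenizer(statement, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**inputs).last_hidden_state  # (1, seq_len, dim)
    return hidden.mean(dim=1).squeeze(0)

# Toy training set: label 1 = factually true, 0 = false.
statements = [
    ("Paris is the capital of France.", 1),
    ("Berlin is the capital of Germany.", 1),
    ("Paris is the capital of Germany.", 0),
    ("Berlin is the capital of France.", 0),
]
X = torch.stack([embed(s) for s, _ in statements]).numpy()
y = [label for _, label in statements]

probe = LogisticRegression(max_iter=1000).fit(X, y)
test = embed("Madrid is the capital of Spain.").numpy().reshape(1, -1)
print("P(true) =", probe.predict_proba(test)[0, 1])
```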
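Second, for "Knowledge Rumination for Pre-trained Language Models", a surface-level two-stage "recall, then answer" sketch. The paper integrates recalled latent knowledge inside the model; this prompting variant, along with `gpt2` and the prompt wording, is an illustrative simplification.

```python
# Generic knowledge-rumination sketch: first let the model verbalize related
# latent knowledge, then condition the final answer on that recalled text.
from transformers import pipeline

generate = pipeline("text-generation", model="gpt2")

question = "What is the capital of France?"

# Stage 1: rumination, i.e. eliciting related latent knowledge.
rumination_prompt = f"Question: {question}\nAs far as I know,"
background = generate(rumination_prompt, max_new_tokens=30,
                      do_sample=False)[0]["generated_text"]

# Stage 2: answer conditioned on the model's own recalled knowledge.
answer_prompt = f"{background}\nTherefore, the answer is"
print(generate(answer_prompt, max_new_tokens=10,
               do_sample=False)[0]["generated_text"])
```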