Do Large Language Models Know about Facts?
- URL: http://arxiv.org/abs/2310.05177v1
- Date: Sun, 8 Oct 2023 14:26:55 GMT
- Title: Do Large Language Models Know about Facts?
- Authors: Xuming Hu, Junzhe Chen, Xiaochuan Li, Yufei Guo, Lijie Wen, Philip S. Yu, Zhijiang Guo
- Abstract summary: Large language models (LLMs) have recently driven striking performance improvements across a range of natural language processing tasks.
We aim to evaluate the extent and scope of factual knowledge within LLMs by designing the benchmark Pinocchio.
Pinocchio contains 20K diverse factual questions that span different sources, timelines, domains, regions, and languages.
- Score: 60.501902866946
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Large language models (LLMs) have recently driven striking performance
improvements across a range of natural language processing tasks. The factual
knowledge acquired during pretraining and instruction tuning can be useful in
various downstream tasks, such as question answering and language generation.
Unlike conventional Knowledge Bases (KBs) that explicitly store factual
knowledge, LLMs store facts implicitly in their parameters. Content generated
by LLMs often exhibits inaccuracies or deviations from the truth, because
facts can be induced incorrectly during training or become obsolete over time.
To this end, we aim to comprehensively evaluate the extent and scope of factual
knowledge within LLMs by designing the benchmark Pinocchio. Pinocchio contains
20K diverse factual questions that span different sources, timelines, domains,
regions, and languages. Furthermore, we investigate whether LLMs are able to
compose multiple facts, update factual knowledge temporally, reason over
multiple pieces of factual evidence, identify subtle factual differences, and
resist adversarial examples. Extensive experiments on LLMs of different sizes
and types show that existing LLMs still lack factual knowledge and suffer from
various spurious correlations. We believe this is a critical bottleneck for
realizing trustworthy artificial intelligence. The dataset Pinocchio and our
code will be publicly available.
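As a rough illustration of how a factual-knowledge benchmark of this kind can be scored, here is a minimal sketch. It is not the authors' released code, and `query_llm` is a hypothetical placeholder for any chat-completion API:

```python
# Minimal sketch of multiple-choice factual probing, in the spirit of a
# benchmark like Pinocchio. This is NOT the authors' released code;
# `query_llm` is a hypothetical placeholder for any chat-completion API.

def query_llm(prompt: str) -> str:
    """Placeholder: swap in a real model call that returns raw answer text."""
    raise NotImplementedError("plug in an actual LLM API here")

def factual_accuracy(questions: list[dict]) -> float:
    """Each item: {'question': str, 'options': list[str], 'answer': str},
    where 'answer' is the correct option string."""
    correct = 0
    for q in questions:
        # Render options as "A. ...", "B. ...", etc.
        opts = "\n".join(f"{chr(65 + i)}. {o}" for i, o in enumerate(q["options"]))
        prompt = (
            "Answer with a single letter.\n"
            f"Question: {q['question']}\n{opts}\nAnswer:"
        )
        pred = query_llm(prompt).strip()[:1].upper()
        gold = chr(65 + q["options"].index(q["answer"]))
        correct += pred == gold
    return correct / len(questions)
```

Scoring letter choices avoids fuzzy matching of free-form answers; the actual benchmark may use other question formats and metrics.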
Related papers
- Scaling Laws for Fact Memorization of Large Language Models [67.94080978627363]
We analyze the scaling laws governing LLMs' fact knowledge and their behavior when memorizing different types of facts.
We find that an LLM's fact knowledge capacity scales linearly with model size and follows a negative exponential law in the number of training epochs (an illustrative functional form is sketched after this entry).
Our findings reveal the capacity and characteristics of LLMs' fact knowledge learning and suggest directions for augmenting LLMs' factual knowledge.
arXiv Detail & Related papers (2024-06-22T03:32:09Z)
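The abstract gives only the qualitative shape of the law. A minimal illustrative reading, assuming capacity grows linearly in parameter count N and saturates exponentially in training epochs E (the constants a and k are hypothetical fit parameters, not values from the paper):

```latex
% Illustrative functional form only; a and k are hypothetical constants
% fit per model family. The paper reports the qualitative shape (linear
% in model size, negative exponential in epochs), not this exact equation.
C(N, E) \approx a \, N \left( 1 - e^{-kE} \right)
```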
- What Matters in Memorizing and Recalling Facts? Multifaceted Benchmarks for Knowledge Probing in Language Models [15.057992220389604]
Language models often struggle to handle factual knowledge, exhibiting factual hallucination issues.
We introduce a knowledge probing benchmark, BELIEF(ICL), to evaluate the knowledge recall ability of both encoder- and decoder-based pre-trained language models.
We semi-automatically create MyriadLAMA, which contains a massively diverse set of prompts.
arXiv Detail & Related papers (2024-06-18T05:11:35Z)
- Prompting Large Language Models with Knowledge Graphs for Question Answering Involving Long-tail Facts [50.06633829833144]
Large Language Models (LLMs) are effective at many NLP tasks but struggle with tasks that require extensive real-world knowledge.
We propose a benchmark whose questions can only be answered with knowledge of long-tail facts.
Our experiments show that LLMs alone struggle to answer these questions, especially when the long-tail level is high or rich knowledge is required (a prompting sketch follows this entry).
arXiv Detail & Related papers (2024-05-10T15:10:20Z)
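As a hedged sketch of the general recipe this entry describes, serializing retrieved KG triples into the prompt; the (subject, relation, object) format and the `query_llm` helper are assumptions, not the paper's actual interface:

```python
# Sketch: serialize retrieved knowledge-graph triples into the prompt so the
# model can answer long-tail questions it is unlikely to have memorized.
# The (subject, relation, object) format and `query_llm` are assumptions,
# not the paper's actual interface.

def query_llm(prompt: str) -> str:
    raise NotImplementedError("plug in an actual LLM API here")

def answer_with_kg(question: str, triples: list[tuple[str, str, str]]) -> str:
    facts = "\n".join(f"({s}, {r}, {o})" for s, r, o in triples)
    prompt = (
        "Use the following facts to answer the question.\n"
        f"Facts:\n{facts}\n"
        f"Question: {question}\nAnswer:"
    )
    return query_llm(prompt)
```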
- Towards Reliable Latent Knowledge Estimation in LLMs: In-Context Learning vs. Prompting Based Factual Knowledge Extraction [15.534647327246239]
We propose an approach for estimating the latent knowledge embedded inside large language models (LLMs).
We leverage the in-context learning abilities of LLMs to estimate the extent to which an LLM knows the facts stored in a knowledge base (a minimal sketch follows this entry).
arXiv Detail & Related papers (2024-04-19T15:40:39Z)
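A minimal sketch of the in-context estimation idea, under the assumption that facts are (subject, object) pairs for a single relation; exact-match scoring and `query_llm` are simplifying assumptions:

```python
# Sketch: estimate how much of a knowledge base an LLM "knows" by showing k
# in-context examples of one relation, then asking it to complete held-out
# facts; the fraction recalled is the knowledge estimate. Exact-match
# scoring and `query_llm` are simplifying assumptions.

def query_llm(prompt: str) -> str:
    raise NotImplementedError("plug in an actual LLM API here")

def latent_knowledge(facts: list[tuple[str, str]], k: int = 4) -> float:
    """facts: (subject, object) pairs for a single relation, e.g. capital-of."""
    demos, probes = facts[:k], facts[k:]
    context = "\n".join(f"{s} -> {o}" for s, o in demos)
    hits = sum(
        query_llm(f"{context}\n{s} ->").strip() == o for s, o in probes
    )
    return hits / len(probes)
```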
- KnowTuning: Knowledge-aware Fine-tuning for Large Language Models [83.5849717262019]
We propose a knowledge-aware fine-tuning (KnowTuning) method to improve the fine-grained and coarse-grained knowledge awareness of LLMs.
Under fine-grained fact evaluation, KnowTuning generates more facts with a lower factual error rate.
arXiv Detail & Related papers (2024-02-17T02:54:32Z)
- DoLa: Decoding by Contrasting Layers Improves Factuality in Large Language Models [79.01926242857613]
Large language models (LLMs) are prone to hallucination, generating content that deviates from the facts seen during pretraining.
We propose a simple decoding strategy for reducing hallucinations in pretrained LLMs: contrasting the next-token distributions obtained from later versus earlier transformer layers.
We find that this Decoding by Contrasting Layers (DoLa) approach better surfaces factual knowledge and reduces the generation of incorrect facts (a numerical sketch follows this entry).
arXiv Detail & Related papers (2023-09-07T17:45:31Z)
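DoLa scores tokens by contrasting a mature (final) layer's next-token distribution against a premature (earlier) layer's. A minimal numerical sketch in plain Python, assuming you can extract per-layer logits from your model; the layer choice and the toy numbers are illustrative, not the paper's exact procedure:

```python
# Minimal sketch of decoding by contrasting layers (DoLa-style): score each
# candidate token by the difference between final-layer and early-layer
# log-probabilities, so knowledge that emerges in later layers is amplified.
# Assumes per-layer logits can be extracted from the model; the layer choice
# and toy numbers below are illustrative only.
import math

def log_softmax(logits: list[float]) -> list[float]:
    m = max(logits)
    lse = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - lse for x in logits]

def dola_scores(final_logits: list[float], early_logits: list[float]) -> list[float]:
    final_lp = log_softmax(final_logits)
    early_lp = log_softmax(early_logits)
    # Contrast: favor tokens the mature layer prefers over the premature one.
    return [f - e for f, e in zip(final_lp, early_lp)]

# Example: token 2's evidence emerges late, so contrasting boosts it.
print(dola_scores([1.0, 0.5, 3.0], [1.0, 0.9, 1.2]))
```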
- Eva-KELLM: A New Benchmark for Evaluating Knowledge Editing of LLMs [54.22416829200613]
Eva-KELLM is a new benchmark for evaluating knowledge editing of large language models.
Experimental results indicate that current methods for knowledge editing using raw documents do not yield satisfactory results.
arXiv Detail & Related papers (2023-08-19T09:17:19Z)
This list is automatically generated from the titles and abstracts of the papers on this site.