Information Association for Language Model Updating by Mitigating LM-Logical Discrepancy
- URL: http://arxiv.org/abs/2305.18582v2
- Date: Fri, 9 Feb 2024 06:37:51 GMT
- Title: Information Association for Language Model Updating by Mitigating LM-Logical Discrepancy
- Authors: Pengfei Yu and Heng Ji
- Abstract summary: Large Language Models (LLMs) struggle with providing current information due to their outdated pre-training data.
Existing methods for updating LLMs, such as knowledge editing and continual fine-tuning, have significant drawbacks in the generalizability of new information.
We identify the core challenge behind these drawbacks: the LM-logical discrepancy, i.e., the difference between language modeling probabilities and logical probabilities.
- Score: 68.31760483418901
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Large Language Models (LLMs) struggle with providing current information due
to their outdated pre-training data. Existing methods for updating LLMs, such as
knowledge editing and continual fine-tuning, have significant drawbacks: limited
generalizability of the new information and the requirement for a structured updating
corpus. We identify the core challenge behind these drawbacks: the LM-logical
discrepancy, i.e., the difference between language modeling probabilities and logical
probabilities. To evaluate and address this challenge, we propose a new formulation
of the information updating task that requires only an unstructured updating corpus
and evaluates performance by how well the update generalizes to question-answer
pairs pertaining to the updated information. We further propose a novel and
effective pipeline approach for the task, highlighting a self-prompting-based
question-answer generation process and an associative distillation method to
bridge the LM-logical discrepancy. We develop two datasets for evaluation, one
sourced from news articles published in March and April 2023, and the other
from the Natural Questions benchmark. Experimental results demonstrate the
superiority of our approach, which increases the factual consistency
score (on a scale from 0 to 1) by up to 0.16. Furthermore, our method
effectively mitigates forgetting using a compact replay buffer containing only
2.3% of the training tokens.
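To make the described pipeline more concrete, the sketch below illustrates only the self-prompting question-answer generation stage over an unstructured updating document. The model choice (gpt2 as a stand-in), the prompt wording, and the helper `generate_qa_pairs` are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch (illustrative only): self-prompting QA generation over an
# unstructured updating document, using a local causal LM via Hugging Face
# transformers. Prompt wording, model choice, and function names are
# assumptions for illustration, not the paper's exact method.
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")  # stand-in model


def generate_qa_pairs(article: str, n_pairs: int = 3) -> list[str]:
    """Ask the model itself to write QA pairs grounded in the new article."""
    prompt = (
        f"Article:\n{article}\n\n"
        f"Write {n_pairs} question-answer pairs about the new facts above.\n"
        "Q1:"
    )
    out = generator(prompt, max_new_tokens=128, do_sample=True)[0]["generated_text"]
    # Keep only the newly generated continuation, one line per Q or A.
    return [line for line in out[len(prompt):].strip().splitlines() if line.strip()]


if __name__ == "__main__":
    article = "In April 2023, the city opened a new light-rail line connecting ..."
    print(generate_qa_pairs(article))
```

In the full pipeline described in the abstract, such generated question-answer pairs would then serve as targets for the associative distillation step, with a compact replay buffer (only 2.3% of the training tokens, per the abstract) used to mitigate forgetting.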
Related papers
- LLM Surgery: Efficient Knowledge Unlearning and Editing in Large Language Models [16.67999382790238]
Large language models (LLMs) have revolutionized various domains, yet their utility comes with challenges related to outdated or problematic knowledge embedded during pretraining.
This paper addresses the challenge of modifying LLMs to unlearn problematic and outdated information while efficiently integrating new knowledge without retraining from scratch.
Using Llama2-7B, we demonstrate that LLM Surgery can achieve significant forgetting on the unlearn set, a 20% increase in accuracy on the update set, and maintain performance on the retain set.
arXiv Detail & Related papers (2024-09-19T19:07:01Z)
- Belief Revision: The Adaptability of Large Language Models Reasoning [63.0281286287648]
We introduce Belief-R, a new dataset designed to test LMs' belief revision ability when presented with new evidence.
Inspired by how humans suppress prior inferences, this task assesses LMs within the newly proposed delta reasoning framework.
We evaluate about 30 LMs across diverse prompting strategies and find that LMs generally struggle to appropriately revise their beliefs in response to new information.
arXiv Detail & Related papers (2024-06-28T09:09:36Z)
- Factual Dialogue Summarization via Learning from Large Language Models [35.63037083806503]
Large language model (LLM)-based automatic text summarization models generate more factually consistent summaries.
We employ zero-shot learning to extract symbolic knowledge from LLMs, generating factually consistent (positive) and inconsistent (negative) summaries.
Our approach achieves better factual consistency while maintaining coherence, fluency, and relevance, as confirmed by various automatic evaluation metrics.
arXiv Detail & Related papers (2024-06-20T20:03:37Z)
- Evaluating Generative Language Models in Information Extraction as Subjective Question Correction [49.729908337372436]
Inspired by the principles in subjective question correction, we propose a new evaluation method, SQC-Score.
Results on three information extraction tasks show that SQC-Score is more preferred by human annotators than the baseline metrics.
arXiv Detail & Related papers (2024-04-04T15:36:53Z)
- Improving Open Information Extraction with Large Language Models: A Study on Demonstration Uncertainty [52.72790059506241]
The Open Information Extraction (OIE) task aims to extract structured facts from unstructured text.
Despite the potential of large language models (LLMs) like ChatGPT as general task solvers, they lag behind state-of-the-art (supervised) methods on OIE tasks.
arXiv Detail & Related papers (2023-09-07T01:35:24Z)
- Large Language Models with Controllable Working Memory [64.71038763708161]
Large language models (LLMs) have led to a series of breakthroughs in natural language processing (NLP).
What further sets these models apart is the massive amounts of world knowledge they internalize during pretraining.
How the model's world knowledge interacts with the factual information presented in the context remains underexplored.
arXiv Detail & Related papers (2022-11-09T18:58:29Z)
- InfoBERT: Improving Robustness of Language Models from an Information Theoretic Perspective [84.78604733927887]
Large-scale language models such as BERT have achieved state-of-the-art performance across a wide range of NLP tasks.
Recent studies show that such BERT-based models are vulnerable to textual adversarial attacks.
We propose InfoBERT, a novel learning framework for robust fine-tuning of pre-trained language models.
arXiv Detail & Related papers (2020-10-05T20:49:26Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of this information and is not responsible for any consequences of its use.