Time-Aware Language Models as Temporal Knowledge Bases
- URL: http://arxiv.org/abs/2106.15110v1
- Date: Tue, 29 Jun 2021 06:18:57 GMT
- Title: Time-Aware Language Models as Temporal Knowledge Bases
- Authors: Bhuwan Dhingra, Jeremy R. Cole, Julian Martin Eisenschlos, Daniel Gillick, Jacob Eisenstein, William W. Cohen
- Abstract summary: Language models (LMs) are trained on snapshots of data collected at a specific moment in time.
We introduce a diagnostic dataset aimed at probing LMs for factual knowledge that changes over time.
We propose a simple technique for jointly modeling text with its timestamp.
- Score: 39.00042720454899
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Many facts come with an expiration date, from the name of the President to
the basketball team LeBron James plays for. But language models (LMs) are
trained on snapshots of data collected at a specific moment in time, and this
can limit their utility, especially in the closed-book setting where the
pretraining corpus must contain the facts the model should memorize. We
introduce a diagnostic dataset aimed at probing LMs for factual knowledge that
changes over time and highlight problems with LMs at either end of the spectrum
-- those trained on specific slices of temporal data, as well as those trained
on a wide range of temporal data. To mitigate these problems, we propose a
simple technique for jointly modeling text with its timestamp. This improves
memorization of seen facts from the training time period, as well as
calibration on predictions about unseen facts from future time periods. We also
show that models trained with temporal context can be efficiently "refreshed"
as new data arrives, without the need for retraining from scratch.
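The abstract describes jointly modeling text with its timestamp but does not spell out the mechanism. One simple way to do this, shown here only as a minimal sketch, is to prepend the document's year to the input as plain text. The `year: ... text: ...` prefix format, the function name `add_time_prefix`, and the `_X_` mask token are assumptions made for illustration and are not taken from the abstract.

```python
from datetime import date


def add_time_prefix(text: str, when: date) -> str:
    """Prepend the year so the model sees time as ordinary input text.

    The "year: ... text: ..." layout is a hypothetical format chosen
    for this sketch, not one confirmed by the abstract.
    """
    return f"year: {when.year} text: {text}"


# The same cloze query conditioned on two different years.
query = "The President of the United States is _X_."
examples = [
    (query, date(2010, 6, 1)),
    (query, date(2019, 6, 1)),
]

prefixed = [add_time_prefix(text, when) for text, when in examples]
for line in prefixed:
    print(line)
```

Because the timestamp is part of the input string, the same query can be posed for different years without changing the model architecture, and newly arrived data can be added under a new year value.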
Related papers
- Formality is Favored: Unraveling the Learning Preferences of Large Language Models on Data with Conflicting Knowledge [55.65162959527848]
Large language models have shown excellent performance on many knowledge-intensive tasks.
However, pretraining data tends to contain misleading and even conflicting information.
This study systematically analyzes LLMs' learning preferences for data with conflicting knowledge.
arXiv Detail & Related papers (2024-10-07T06:49:41Z)
- Time Machine GPT [15.661920010658626]
Large language models (LLMs) are often trained on extensive, temporally indiscriminate text corpora.
This approach is not aligned with the evolving nature of language.
This paper presents a new approach: a series of point-in-time LLMs called Time Machine GPT (TiMaGPT).
arXiv Detail & Related papers (2024-04-29T09:34:25Z)
- Temporal Blind Spots in Large Language Models [20.631107338678234]
Large language models (LLMs) have recently gained significant attention due to their unparalleled ability to perform various natural language processing tasks.
This study investigates the underlying limitations of general-purpose LLMs when deployed for tasks that require a temporal understanding.
arXiv Detail & Related papers (2024-01-22T16:20:14Z)
- Time-LLM: Time Series Forecasting by Reprogramming Large Language Models [110.20279343734548]
Time series forecasting holds significant importance in many real-world dynamic systems.
We present Time-LLM, a reprogramming framework to repurpose large language models for time series forecasting.
Time-LLM is a powerful time series learner that outperforms state-of-the-art, specialized forecasting models.
arXiv Detail & Related papers (2023-10-03T01:31:25Z)
- Mitigating Temporal Misalignment by Discarding Outdated Facts [58.620269228776294]
Large language models are often used under temporal misalignment, tasked with answering questions about the present.
We propose fact duration prediction: the task of predicting how long a given fact will remain true.
Our data and code are released publicly at https://github.com/mikejqzhang/mitigating_misalignment.
arXiv Detail & Related papers (2023-05-24T07:30:08Z)
- Can LMs Learn New Entities from Descriptions? Challenges in Propagating Injected Knowledge [72.63368052592004]
We study LMs' abilities to make inferences based on injected facts (or propagate those facts).
We find that existing methods for updating knowledge show little propagation of injected knowledge.
Yet, prepending entity definitions in an LM's context improves performance across all settings.
arXiv Detail & Related papers (2023-05-02T17:59:46Z)
- Counterfactual Memorization in Neural Language Models [91.8747020391287]
Modern neural language models that are widely used in various NLP tasks risk memorizing sensitive information from their training data.
An open question in previous studies of language model memorization is how to filter out "common" memorization.
We formulate a notion of counterfactual memorization which characterizes how a model's predictions change if a particular document is omitted during training.
arXiv Detail & Related papers (2021-12-24T04:20:57Z)
- Lifelong Pretraining: Continually Adapting Language Models to Emerging Corpora [31.136334214818305]
We study a lifelong language model pretraining challenge where a PTLM is continually updated so as to adapt to emerging data.
Over a domain-incremental research paper stream and a chronologically ordered tweet stream, we incrementally pretrain a PTLM with different continual learning algorithms.
Our experiments show continual learning algorithms improve knowledge preservation, with logit distillation being the most effective approach.
arXiv Detail & Related papers (2021-10-16T09:59:33Z)
This list is automatically generated from the titles and abstracts of the papers on this site.