TimeLMs: Diachronic Language Models from Twitter
- URL: http://arxiv.org/abs/2202.03829v1
- Date: Tue, 8 Feb 2022 12:47:38 GMT
- Title: TimeLMs: Diachronic Language Models from Twitter
- Authors: Daniel Loureiro, Francesco Barbieri, Leonardo Neves, Luis Espinosa Anke, Jose Camacho-Collados
- Abstract summary: We present TimeLMs, a set of language models specialized on diachronic Twitter data.
We show that a continual learning strategy contributes to enhancing Twitter-based language models' capacity to deal with future and out-of-distribution tweets.
- Score: 22.166929118357714
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Despite its importance, the time variable has been largely neglected in the
NLP and language model literature. In this paper, we present TimeLMs, a set of
language models specialized on diachronic Twitter data. We show that a
continual learning strategy contributes to enhancing Twitter-based language
models' capacity to deal with future and out-of-distribution tweets, while
making them competitive with standardized and more monolithic benchmarks. We
also perform a number of qualitative analyses showing how they cope with trends
and peaks in activity involving specific named entities or concept drift.
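Released TimeLMs checkpoints can be probed directly with masked-token prediction, which is how the qualitative trend analyses above are framed. The sketch below is a minimal illustration using the Hugging Face transformers library; the checkpoint identifier (a cardiffnlp 2021 snapshot) and the example tweet are assumptions for illustration rather than details taken from this abstract.

```python
# Minimal sketch: probing a time-specific Twitter masked LM.
# Assumes the transformers library and a TimeLMs-style checkpoint
# hosted on the Hugging Face Hub (identifier is illustrative).
from transformers import pipeline

MODEL = "cardiffnlp/twitter-roberta-base-2021-124m"  # assumed checkpoint name

fill_mask = pipeline("fill-mask", model=MODEL, tokenizer=MODEL)

# Diachronic probe: predictions for the masked slot should reflect
# the vocabulary and trends of the model's training period.
tweet = "So glad I'm <mask> vaccinated."
for prediction in fill_mask(tweet, top_k=5):
    print(f"{prediction['token_str']:>15}  {prediction['score']:.3f}")
```

Running the same probe against snapshots from different quarters is one way to surface the concept drift and trend effects mentioned above.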
Related papers
- Towards Fast Multilingual LLM Inference: Speculative Decoding and Specialized Drafters [21.19251212483406]
Large language models (LLMs) have revolutionized natural language processing and broadened their applicability across diverse commercial applications.
This paper explores a training recipe for assistant models in speculative decoding, which are leveraged to draft tokens that are then verified by the target LLM.
We show that language-specific draft models, optimized through a targeted pretrain-and-finetune strategy, substantially speed up inference compared to previous methods (a generic draft-and-verify sketch follows this entry).
arXiv Detail & Related papers (2024-06-24T16:06:50Z)
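As a generic illustration of the draft-and-verify idea in the entry above, the sketch below uses the assisted-generation support in recent versions of the Hugging Face transformers library, with small GPT-2 checkpoints standing in for the drafter and the target; it is not the paper's multilingual, language-specific drafter recipe.

```python
# Sketch of speculative / assisted decoding: a small draft model proposes
# tokens and the larger target model verifies them. Generic GPT-2 checkpoints
# are used purely for illustration; the paper trains language-specific
# drafters for a multilingual target LLM.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

target_name = "gpt2-large"   # stand-in for the target LLM
draft_name = "distilgpt2"    # stand-in for the specialized drafter

tokenizer = AutoTokenizer.from_pretrained(target_name)
target = AutoModelForCausalLM.from_pretrained(target_name)
drafter = AutoModelForCausalLM.from_pretrained(draft_name)

inputs = tokenizer("Speculative decoding works by", return_tensors="pt")

# assistant_model enables assisted (speculative) generation: the drafter emits
# candidate continuations and the target accepts or rejects them, so the final
# output matches ordinary greedy decoding from the target model.
with torch.no_grad():
    output = target.generate(
        **inputs,
        assistant_model=drafter,
        max_new_tokens=40,
        do_sample=False,
    )

print(tokenizer.decode(output[0], skip_special_tokens=True))
```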
- Time Machine GPT [15.661920010658626]
Large language models (LLMs) are often trained on extensive, temporally indiscriminate text corpora.
This approach is not aligned with the evolving nature of language.
This paper presents a new approach: a series of point-in-time LLMs called Time Machine GPT (TiMaGPT).
arXiv Detail & Related papers (2024-04-29T09:34:25Z)
- AutoTimes: Autoregressive Time Series Forecasters via Large Language Models [67.83502953961505]
AutoTimes projects time series into the embedding space of language tokens and autoregressively generates future predictions with arbitrary lengths.
We formulate time series as prompts, extending the context for prediction beyond the lookback window.
AutoTimes achieves state-of-the-art results with 0.1% trainable parameters and over 5× training/inference speedup (a generic sketch of the series-to-embedding idea follows this entry).
arXiv Detail & Related papers (2024-02-04T06:59:21Z)
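The AutoTimes entry above hinges on mapping time-series segments into a language model's token-embedding space and decoding future segments autoregressively. The sketch below is a generic reconstruction of that idea under assumed choices (a frozen GPT-2 backbone, a patch length of 24, simple linear projections); it is not the authors' implementation.

```python
# Generic sketch: treat fixed-length time-series segments as "tokens",
# map them into a frozen LLM's embedding space, and predict the next segment.
# Patch length, backbone, and head are assumptions for illustration only.
import torch
import torch.nn as nn
from transformers import GPT2Model


class SeriesAsTokensForecaster(nn.Module):
    def __init__(self, patch_len: int = 24, backbone: str = "gpt2"):
        super().__init__()
        self.llm = GPT2Model.from_pretrained(backbone)
        for p in self.llm.parameters():              # keep the LLM frozen
            p.requires_grad = False
        d_model = self.llm.config.hidden_size
        self.embed = nn.Linear(patch_len, d_model)   # segment -> token embedding
        self.head = nn.Linear(d_model, patch_len)    # hidden state -> next segment

    def forward(self, patches: torch.Tensor) -> torch.Tensor:
        # patches: (batch, num_patches, patch_len)
        hidden = self.llm(inputs_embeds=self.embed(patches)).last_hidden_state
        # causal-LM style: a next-segment prediction at every position
        return self.head(hidden)


# Toy usage: 4 history segments of 24 steps each for a batch of 2 series.
model = SeriesAsTokensForecaster()
history = torch.randn(2, 4, 24)
next_patch = model(history)[:, -1]                   # forecast the 5th segment
print(next_patch.shape)                              # torch.Size([2, 24])
```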
- Time-LLM: Time Series Forecasting by Reprogramming Large Language Models [110.20279343734548]
Time series forecasting holds significant importance in many real-world dynamic systems.
We present Time-LLM, a reprogramming framework to repurpose large language models for time series forecasting.
Time-LLM is a powerful time series learner that outperforms state-of-the-art specialized forecasting models.
arXiv Detail & Related papers (2023-10-03T01:31:25Z)
- TRAM: Benchmarking Temporal Reasoning for Large Language Models [12.112914393948415]
We introduce TRAM, a temporal reasoning benchmark composed of ten datasets.
We evaluate popular language models like GPT-4 and Llama2 in zero-shot and few-shot scenarios.
Our findings indicate that the best-performing model lags significantly behind human performance.
arXiv Detail & Related papers (2023-10-02T00:59:07Z)
- Jamp: Controlled Japanese Temporal Inference Dataset for Evaluating Generalization Capacity of Language Models [18.874880342410876]
We present Jamp, a Japanese benchmark focused on temporal inference.
Our dataset includes a range of temporal inference patterns, which enables us to conduct fine-grained analysis.
We evaluate the generalization capacities of monolingual/multilingual LMs by splitting our dataset based on tense fragments.
arXiv Detail & Related papers (2023-06-19T07:00:14Z)
- Language Model Pre-Training with Sparse Latent Typing [66.75786739499604]
We propose a new pre-training objective, Sparse Latent Typing, which enables the model to sparsely extract sentence-level keywords with diverse latent types.
Experimental results show that our model is able to learn interpretable latent type categories in a self-supervised manner without using any external knowledge.
arXiv Detail & Related papers (2022-10-23T00:37:08Z)
- Towards Language Modelling in the Speech Domain Using Sub-word Linguistic Units [56.52704348773307]
We propose a novel LSTM-based generative speech LM built on linguistic units such as syllables and phonemes.
With a limited dataset, orders of magnitude smaller than that required by contemporary generative models, our model closely approximates babbling speech.
We show the effect of training with auxiliary text LMs, multitask learning objectives, and auxiliary articulatory features.
arXiv Detail & Related papers (2021-10-31T22:48:30Z)
- Sentiment analysis in tweets: an assessment study from classical to modern text representation models [59.107260266206445]
Short texts published on Twitter have earned significant attention as a rich source of information.
Their inherent characteristics, such as their informal and noisy linguistic style, remain challenging for many natural language processing (NLP) tasks.
This study provides an assessment of how well existing language models distinguish the sentiment expressed in tweets, using a rich collection of 22 datasets (a minimal classification sketch follows this entry).
arXiv Detail & Related papers (2021-05-29T21:05:28Z)
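As a minimal illustration of the tweet-level sentiment task assessed in the entry above, the sketch below runs a single tweet through an off-the-shelf Twitter sentiment classifier. The checkpoint identifier and example tweet are assumptions for illustration; the study itself compares many representation models over 22 datasets.

```python
# Minimal sketch: classifying the sentiment of a single tweet with a
# pretrained Twitter sentiment model. The checkpoint name is assumed for
# illustration; the study above benchmarks many models across 22 datasets.
from transformers import pipeline

MODEL = "cardiffnlp/twitter-roberta-base-sentiment-latest"  # assumed checkpoint

classifier = pipeline("sentiment-analysis", model=MODEL, tokenizer=MODEL)

tweet = "The new update is surprisingly good, loving it so far!"
print(classifier(tweet))   # e.g. [{'label': 'positive', 'score': ...}]
```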
- Multi-timescale Representation Learning in LSTM Language Models [69.98840820213937]
Language models must capture statistical dependencies between words at timescales ranging from very short to very long.
We derived a theory for how the memory gating mechanism in long short-term memory language models can capture power law decay.
Experiments showed that LSTM language models trained on natural English text learn to approximate this theoretical distribution (the underlying gate-to-timescale relation is sketched below).
arXiv Detail & Related papers (2020-09-27T02:13:38Z)
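The entry above ties LSTM memory gating to power-law temporal decay. A standard way to make that concrete is the effective timescale of a unit whose forget gate sits at a value f: stored information decays roughly as f^t, giving a characteristic timescale of about T = -1/ln(f). The sketch below simply evaluates this relation for a few assumed gate values; the numbers are illustrative, not the paper's fitted results.

```python
# Worked sketch: effective memory timescale of an LSTM unit as a function of
# its forget-gate value. With forget gate f, a stored value decays like f**t,
# so it falls to 1/e of its size after roughly T = -1 / ln(f) steps.
# Gate values below are illustrative, not taken from the paper.
import math

for f in (0.5, 0.9, 0.99, 0.999):
    T = -1.0 / math.log(f)
    print(f"forget gate {f:>6}: effective timescale ~ {T:8.1f} steps")

# Output (approx.): 1.4, 9.5, 99.5, 999.5 steps -- a spread of gate values
# close to 1 is what lets an LSTM cover both short and very long timescales.
```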