Related papers: One Fits All:Power General Time Series Analysis by Pretrained LM

One Fits All:Power General Time Series Analysis by Pretrained LM

URL: http://arxiv.org/abs/2302.11939v6
Date: Sun, 15 Oct 2023 05:07:17 GMT
Title: One Fits All:Power General Time Series Analysis by Pretrained LM
Authors: Tian Zhou, PeiSong Niu, Xue Wang, Liang Sun, Rong Jin
Abstract summary: We show that pre-trained models on natural language or images can lead to a comparable or state-of-the-art performance in all main time series analysis tasks. Our results demonstrate that pre-trained models on natural language or images can lead to a comparable or state-of-the-art performance in all main time series analysis tasks.
Score: 23.292260325891032
License: http://creativecommons.org/licenses/by-nc-sa/4.0/
Abstract: Although we have witnessed great success of pre-trained models in natural language processing (NLP) and computer vision (CV), limited progress has been made for general time series analysis. Unlike NLP and CV where a unified model can be used to perform different tasks, specially designed approach still dominates in each time series analysis task such as classification, anomaly detection, forecasting, and few-shot learning. The main challenge that blocks the development of pre-trained model for time series analysis is the lack of a large amount of data for training. In this work, we address this challenge by leveraging language or CV models, pre-trained from billions of tokens, for time series analysis. Specifically, we refrain from altering the self-attention and feedforward layers of the residual blocks in the pre-trained language or image model. This model, known as the Frozen Pretrained Transformer (FPT), is evaluated through fine-tuning on all major types of tasks involving time series. Our results demonstrate that pre-trained models on natural language or images can lead to a comparable or state-of-the-art performance in all main time series analysis tasks, as illustrated in Figure 1. We also found both theoretically and empirically that the self-attention module behaviors similarly to principle component analysis (PCA), an observation that helps explains how transformer bridges the domain gap and a crucial step towards understanding the universality of a pre-trained transformer.The code is publicly available at https://github.com/DAMO-DI-ML/One_Fits_All.

Related papers

TimesBERT: A BERT-Style Foundation Model for Time Series Understanding [72.64824086839631]
GPT-style models have been positioned as foundation models for time series forecasting. BERT-style architecture has not been fully unlocked for time series understanding. We design TimesBERT to learn generic representations of time series. Our model is pre-trained on 260 billion time points across diverse domains.
arXiv Detail & Related papers (2025-02-28T17:14:44Z)
Transformers and Their Roles as Time Series Foundation Models [14.61139607588868]
We analyze transformers as time series foundation models, focusing on their approximation and generalization capabilities. We prove that MOIRAI is capable of automatically fitting autoregressive models with an arbitrary number of covariates. Experiments support our theoretical findings, highlighting the efficacy of transformers as time series foundation models.
arXiv Detail & Related papers (2025-02-05T17:18:55Z)
General Time-series Model for Universal Knowledge Representation of Multivariate Time-Series data [61.163542597764796]
We show that time series with different time granularities (or corresponding frequency resolutions) exhibit distinct joint distributions in the frequency domain. A novel Fourier knowledge attention mechanism is proposed to enable learning time-aware representations from both the temporal and frequency domains. An autoregressive blank infilling pre-training framework is incorporated to time series analysis for the first time, leading to a generative tasks agnostic pre-training strategy.
arXiv Detail & Related papers (2025-02-05T15:20:04Z)
LSEAttention is All You Need for Time Series Forecasting [0.0]
Transformer-based architectures have achieved remarkable success in natural language processing and computer vision. I introduce textbfLSEAttention, an approach designed to address entropy collapse and training instability commonly observed in transformer models.
arXiv Detail & Related papers (2024-10-31T09:09:39Z)
Towards Generalisable Time Series Understanding Across Domains [10.350643783811174]
We introduce a novel pre-training paradigm specifically designed to handle time series heterogeneity. We propose a tokeniser with learnable domain signatures, a dual masking strategy, and a normalised cross-correlation loss. Our code and pre-trained weights are available at https://www.oetu.com/oetu/otis.
arXiv Detail & Related papers (2024-10-09T17:09:30Z)
Unified Training of Universal Time Series Forecasting Transformers [104.56318980466742]
We present a Masked-based Universal Time Series Forecasting Transformer (Moirai) Moirai is trained on our newly introduced Large-scale Open Time Series Archive (LOTSA) featuring over 27B observations across nine domains. Moirai achieves competitive or superior performance as a zero-shot forecaster when compared to full-shot models.
arXiv Detail & Related papers (2024-02-04T20:00:45Z)
Timer: Generative Pre-trained Transformers Are Large Time Series Models [83.03091523806668]
This paper aims at the early development of large time series models (LTSM) During pre-training, we curate large-scale datasets with up to 1 billion time points. To meet diverse application needs, we convert forecasting, imputation, and anomaly detection of time series into a unified generative task.
arXiv Detail & Related papers (2024-02-04T06:55:55Z)
Large Pre-trained time series models for cross-domain Time series analysis tasks [20.228846068418765]
Large Pre-trained Time-series Models (LPTM) is a novel method of adaptive segmentation that automatically identifies optimal dataset-specific segmentation strategy during pre-training. LPTM achieves superior forecasting and time-series classification results taking up to 40% less data and 50% less training time compared to state-of-art baselines.
arXiv Detail & Related papers (2023-11-19T20:16:16Z)
Lag-Llama: Towards Foundation Models for Probabilistic Time Series Forecasting [54.04430089029033]
We present Lag-Llama, a general-purpose foundation model for time series forecasting based on a decoder-only transformer architecture. Lag-Llama is pretrained on a large corpus of diverse time series data from several domains, and demonstrates strong zero-shot generalization capabilities. When fine-tuned on relatively small fractions of such previously unseen datasets, Lag-Llama achieves state-of-the-art performance.
arXiv Detail & Related papers (2023-10-12T12:29:32Z)
Toward a Foundation Model for Time Series Data [34.1973242428317]
A foundation model is a machine learning model trained on a large and diverse set of data. We develop an effective time series foundation model by leveraging unlabeled samples from multiple domains.
arXiv Detail & Related papers (2023-10-05T21:44:50Z)
Time-LLM: Time Series Forecasting by Reprogramming Large Language Models [110.20279343734548]
Time series forecasting holds significant importance in many real-world dynamic systems. We present Time-LLM, a reprogramming framework to repurpose large language models for time series forecasting. Time-LLM is a powerful time series learner that outperforms state-of-the-art, specialized forecasting models.
arXiv Detail & Related papers (2023-10-03T01:31:25Z)
Ti-MAE: Self-Supervised Masked Time Series Autoencoders [16.98069693152999]
We propose a novel framework named Ti-MAE, in which the input time series are assumed to follow an integrate distribution. Ti-MAE randomly masks out embedded time series data and learns an autoencoder to reconstruct them at the point-level. Experiments on several public real-world datasets demonstrate that our framework of masked autoencoding could learn strong representations directly from the raw data.
arXiv Detail & Related papers (2023-01-21T03:20:23Z)
Ripple: Concept-Based Interpretation for Raw Time Series Models in Education [5.374524134699487]
Time series is the most prevalent form of input data for educational prediction tasks. We propose an approach that utilizes irregular multivariate time series modeling with graph neural networks to achieve comparable or better accuracy. We analyze these advances in the education domain, addressing the task of early student performance prediction.
arXiv Detail & Related papers (2022-12-02T12:26:00Z)
A Generative Language Model for Few-shot Aspect-Based Sentiment Analysis [90.24921443175514]
We focus on aspect-based sentiment analysis, which involves extracting aspect term, category, and predicting their corresponding polarities. We propose to reformulate the extraction and prediction tasks into the sequence generation task, using a generative language model with unidirectional attention. Our approach outperforms the previous state-of-the-art (based on BERT) on average performance by a large margins in few-shot and full-shot settings.
arXiv Detail & Related papers (2022-04-11T18:31:53Z)
Train No Evil: Selective Masking for Task-Guided Pre-Training [97.03615486457065]
We propose a three-stage framework by adding a task-guided pre-training stage with selective masking between general pre-training and fine-tuning. We show that our method can achieve comparable or even better performance with less than 50% of cost.
arXiv Detail & Related papers (2020-04-21T03:14:22Z)

This list is automatically generated from the titles and abstracts of the papers in this site.