NuwaTS: a Foundation Model Mending Every Incomplete Time Series
- URL: http://arxiv.org/abs/2405.15317v3
- Date: Wed, 02 Oct 2024 14:34:08 GMT
- Title: NuwaTS: a Foundation Model Mending Every Incomplete Time Series
- Authors: Jinguo Cheng, Chunwei Yang, Wanlin Cai, Yuxuan Liang, Qingsong Wen, Yuankai Wu,
- Abstract summary: We present textbfNuwaTS, a novel framework that repurposes Pre-trained Language Models for general time series imputation.
NuwaTS can be applied to impute missing data across any domain.
We show that NuwaTS generalizes to other time series tasks, such as forecasting.
- Score: 24.768755438620666
- License:
- Abstract: Time series imputation is critical for many real-world applications and has been widely studied. However, existing models often require specialized designs tailored to specific missing patterns, variables, or domains which limits their generalizability. In addition, current evaluation frameworks primarily focus on domain-specific tasks and often rely on time-wise train/validation/test data splits, which fail to rigorously assess a model's ability to generalize across unseen variables or domains. In this paper, we present \textbf{NuwaTS}, a novel framework that repurposes Pre-trained Language Models (PLMs) for general time series imputation. Once trained, NuwaTS can be applied to impute missing data across any domain. We introduce specialized embeddings for each sub-series patch, capturing information about the patch, its missing data patterns, and its statistical characteristics. By combining contrastive learning with the imputation task, we train PLMs to create a versatile, one-for-all imputation model. Additionally, we employ a plug-and-play fine-tuning approach, enabling efficient adaptation to domain-specific tasks with minimal adjustments. To evaluate cross-variable and cross-domain generalization, we propose a new benchmarking protocol that partitions the datasets along the variable dimension. Experimental results on over seventeen million time series samples from diverse domains demonstrate that NuwaTS outperforms state-of-the-art domain-specific models across various datasets under the proposed benchmarking protocol. Furthermore, we show that NuwaTS generalizes to other time series tasks, such as forecasting. Our codes are available at https://github.com/Chengyui/NuwaTS.
Related papers
- FoundTS: Comprehensive and Unified Benchmarking of Foundation Models for Time Series Forecasting [44.33565276128137]
Time Series Forecasting (TSF) is key functionality in numerous fields, including in finance, weather services, and energy management.
Foundation models exhibit promising inferencing capabilities in new or unseen data.
We propose a new benchmark, FoundTS, to enable thorough and fair evaluation and comparison of such models.
arXiv Detail & Related papers (2024-10-15T17:23:49Z) - Towards Generalisable Time Series Understanding Across Domains [10.350643783811174]
We introduce OTiS, an open model for general time series analysis.
We propose a novel pre-training paradigm including a tokeniser with learnable domain-specific signatures.
Our model is pre-trained on a large corpus of 640,187 samples and 11 billion time points spanning 8 distinct domains.
arXiv Detail & Related papers (2024-10-09T17:09:30Z) - PeFAD: A Parameter-Efficient Federated Framework for Time Series Anomaly Detection [51.20479454379662]
We propose a.
Federated Anomaly Detection framework named PeFAD with the increasing privacy concerns.
We conduct extensive evaluations on four real datasets, where PeFAD outperforms existing state-of-the-art baselines by up to 28.74%.
arXiv Detail & Related papers (2024-06-04T13:51:08Z) - UniCL: A Universal Contrastive Learning Framework for Large Time Series Models [18.005358506435847]
Time-series analysis plays a pivotal role across a range of critical applications, from finance to healthcare.
Traditional supervised learning methods first annotate extensive labels for time-series data in each task.
This paper introduces UniCL, a universal and scalable contrastive learning framework designed for pretraining time-series foundation models.
arXiv Detail & Related papers (2024-05-17T07:47:11Z) - TOTEM: TOkenized Time Series EMbeddings for General Time Series Analysis [32.854449155765344]
We propose a simple tokenizer architecture that embeds time series data from varying domains using a discrete vectorized representation learned in a self-supervised manner.
We study the efficacy of TOTEM with an extensive evaluation on 17 real world time series datasets across 3 tasks.
arXiv Detail & Related papers (2024-02-26T09:11:12Z) - Unified Training of Universal Time Series Forecasting Transformers [104.56318980466742]
We present a Masked-based Universal Time Series Forecasting Transformer (Moirai)
Moirai is trained on our newly introduced Large-scale Open Time Series Archive (LOTSA) featuring over 27B observations across nine domains.
Moirai achieves competitive or superior performance as a zero-shot forecaster when compared to full-shot models.
arXiv Detail & Related papers (2024-02-04T20:00:45Z) - Large Pre-trained time series models for cross-domain Time series analysis tasks [20.228846068418765]
We propose a novel method of textitadaptive segmentation that automatically identifies optimal dataset-specific segmentation strategy during pre-training.
This enables LPTM to perform similar to or better than domain-specific state-of-art model when fine-tuned to different downstream time-series analysis tasks and under zero-shot settings.
arXiv Detail & Related papers (2023-11-19T20:16:16Z) - Adaptive Test-Time Personalization for Federated Learning [51.25437606915392]
We introduce a novel setting called test-time personalized federated learning (TTPFL)
In TTPFL, clients locally adapt a global model in an unsupervised way without relying on any labeled data during test-time.
We propose a novel algorithm called ATP to adaptively learn the adaptation rates for each module in the model from distribution shifts among source domains.
arXiv Detail & Related papers (2023-10-28T20:42:47Z) - UniTime: A Language-Empowered Unified Model for Cross-Domain Time Series
Forecasting [59.11817101030137]
This research advocates for a unified model paradigm that transcends domain boundaries.
Learning an effective cross-domain model presents the following challenges.
We propose UniTime for effective cross-domain time series learning.
arXiv Detail & Related papers (2023-10-15T06:30:22Z) - Learning to Generalize across Domains on Single Test Samples [126.9447368941314]
We learn to generalize across domains on single test samples.
We formulate the adaptation to the single test sample as a variational Bayesian inference problem.
Our model achieves at least comparable -- and often better -- performance than state-of-the-art methods on multiple benchmarks for domain generalization.
arXiv Detail & Related papers (2022-02-16T13:21:04Z) - Improving QA Generalization by Concurrent Modeling of Multiple Biases [61.597362592536896]
Existing NLP datasets contain various biases that models can easily exploit to achieve high performances on the corresponding evaluation sets.
We propose a general framework for improving the performance on both in-domain and out-of-domain datasets by concurrent modeling of multiple biases in the training data.
We extensively evaluate our framework on extractive question answering with training data from various domains with multiple biases of different strengths.
arXiv Detail & Related papers (2020-10-07T11:18:49Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.