Related papers: Not All Data are Good Labels: On the Self-supervised Labeling for Time Series Forecasting

URL: http://arxiv.org/abs/2502.14704v1
Date: Thu, 20 Feb 2025 16:29:37 GMT
Title: Not All Data are Good Labels: On the Self-supervised Labeling for Time Series Forecasting
Authors: Yuxuan Yang, Dalin Zhang, Yuxuan Liang, Hua Lu, Huan Li, Gang Chen
Abstract summary: This paper explores a novel self-supervised approach to re-label time series datasets by inherently constructing candidate datasets. During the optimization of a simple reconstruction network, intermediates are used as pseudo labels in a self-supervised paradigm. Our experiments on eleven real-world datasets demonstrate that SCAM consistently improves the performance of various backbone models.
Score: 18.25649205265032
Abstract: Time Series Forecasting (TSF) is a crucial task in various domains, yet existing TSF models rely heavily on high-quality data and insufficiently exploit all available data. This paper explores a novel self-supervised approach to re-label time series datasets by inherently constructing candidate datasets. During the optimization of a simple reconstruction network, intermediates are used as pseudo labels in a self-supervised paradigm, improving generalization for any predictor. We introduce the Self-Correction with Adaptive Mask (SCAM), which discards overfitted components and selectively replaces them with pseudo labels generated from reconstructions. Additionally, we incorporate Spectral Norm Regularization (SNR) to further suppress overfitting from a loss landscape perspective. Our experiments on eleven real-world datasets demonstrate that SCAM consistently improves the performance of various backbone models. This work offers a new perspective on constructing datasets and enhancing the generalization of TSF models through self-supervised learning.
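
The abstract does not say how the Spectral Norm Regularization term is computed. As a rough illustration only, one common way to penalize a weight matrix's spectral norm is to estimate its largest singular value by power iteration and add the squared value to the training loss. The sketch below assumes NumPy and 2-D weight matrices; the names `spectral_norm`, `spectral_penalty`, `n_iters`, and `coeff` are hypothetical and are not taken from the paper.

```python
import numpy as np


def spectral_norm(weight, n_iters=50, seed=0):
    """Estimate the largest singular value of a 2-D matrix by power iteration."""
    rng = np.random.default_rng(seed)
    # Start from a random unit vector in the input space.
    v = rng.standard_normal(weight.shape[1])
    v /= np.linalg.norm(v)
    for _ in range(n_iters):
        # Alternate between the left and right singular vector estimates.
        u = weight @ v
        u /= np.linalg.norm(u) + 1e-12
        v = weight.T @ u
        v /= np.linalg.norm(v) + 1e-12
    # u^T W v approximates the top singular value once u and v have converged.
    return float(u @ weight @ v)


def spectral_penalty(weights, coeff=1e-2):
    """Sum of squared spectral norms, scaled by a hypothetical coefficient."""
    return coeff * sum(spectral_norm(w) ** 2 for w in weights)


if __name__ == "__main__":
    W = np.diag([3.0, 1.0, 0.5])
    print(spectral_norm(W))             # close to the exact value np.linalg.norm(W, 2)
    print(spectral_penalty([W], 0.5))   # penalty term that would be added to a loss
```

In a training loop, a term like `spectral_penalty` would be added to the forecasting loss. Whether SCAM applies its regularizer this way, or to which layers, is not stated in the abstract.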

Related papers

VSFormer: Value and Shape-Aware Transformer with Prior-Enhanced Self-Attention for Multivariate Time Series Classification [47.92529531621406]
We propose a novel method, VSFormer, that incorporates both discriminative patterns (shape) and numerical information (value). In addition, we extract class-specific prior information derived from supervised information to enrich the positional encoding. Extensive experiments on all 30 UEA archived datasets demonstrate the superior performance of our method compared to SOTA models.
arXiv Detail & Related papers (2024-12-21T07:31:22Z)
Tackling Data Heterogeneity in Federated Time Series Forecasting [61.021413959988216]
Time series forecasting plays a critical role in various real-world applications, including energy consumption prediction, disease transmission monitoring, and weather forecasting. Most existing methods rely on a centralized training paradigm, where large amounts of data are collected from distributed devices to a central cloud server. We propose a novel framework, Fed-TREND, to address data heterogeneity by generating informative synthetic data as auxiliary knowledge carriers.
arXiv Detail & Related papers (2024-11-24T04:56:45Z)
Distributionally robust self-supervised learning for tabular data [2.942619386779508]
Learning robust representations in the presence of error slices is challenging due to high-cardinality features and the complexity of constructing error sets. Traditional robust representation learning methods largely focus on improving worst-group performance in the supervised setting in computer vision. Our approach utilizes an encoder-decoder model trained with Masked Language Modeling (MLM) loss to learn robust latent representations.
arXiv Detail & Related papers (2024-10-11T04:23:56Z)
PeFAD: A Parameter-Efficient Federated Framework for Time Series Anomaly Detection [51.20479454379662]
In response to increasing privacy concerns, we propose a parameter-efficient federated anomaly detection framework named PeFAD. We conduct extensive evaluations on four real datasets, where PeFAD outperforms existing state-of-the-art baselines by up to 28.74%.
arXiv Detail & Related papers (2024-06-04T13:51:08Z)
UniCL: A Universal Contrastive Learning Framework for Large Time Series Models [18.005358506435847]
Time-series analysis plays a pivotal role across a range of critical applications, from finance to healthcare. Traditional supervised learning methods first annotate extensive labels for time-series data in each task. This paper introduces UniCL, a universal and scalable contrastive learning framework designed for pretraining time-series foundation models.
arXiv Detail & Related papers (2024-05-17T07:47:11Z)
Federated Learning with Projected Trajectory Regularization [65.6266768678291]
Federated learning enables joint training of machine learning models from distributed clients without sharing their local data. One key challenge in federated learning is to handle non-identically distributed data across the clients. We propose a novel federated learning framework with projected trajectory regularization (FedPTR) for tackling this data heterogeneity issue.
arXiv Detail & Related papers (2023-12-22T02:12:08Z)
TRIAGE: Characterizing and auditing training data for improved regression [80.11415390605215]
We introduce TRIAGE, a novel data characterization framework tailored to regression tasks and compatible with a broad class of regressors. TRIAGE utilizes conformal predictive distributions to provide a model-agnostic scoring method, the TRIAGE score. We show that TRIAGE's characterization is consistent and highlight its utility to improve performance via data sculpting/filtering, in multiple regression settings.
arXiv Detail & Related papers (2023-10-29T10:31:59Z)
MADS: Modulated Auto-Decoding SIREN for time series imputation [9.673093148930874]
We propose MADS, a novel auto-decoding framework for time series imputation, built upon implicit neural representations. We evaluate our model on two real-world datasets, and show that it outperforms state-of-the-art methods for time series imputation.
arXiv Detail & Related papers (2023-07-03T09:08:47Z)
TSI-GAN: Unsupervised Time Series Anomaly Detection using Convolutional Cycle-Consistent Generative Adversarial Networks [2.4469484645516837]
Anomaly detection is widely used in network intrusion detection, autonomous driving, medical diagnosis, credit card fraud detection, etc. This paper proposes TSI-GAN, an unsupervised anomaly detection model for time series that can learn complex temporal patterns automatically. We evaluate TSI-GAN using 250 well-curated and harder-than-usual datasets and compare it with 8 state-of-the-art baseline methods.
arXiv Detail & Related papers (2023-03-22T23:24:47Z)
The CLEAR Benchmark: Continual LEArning on Real-World Imagery [77.98377088698984]
Continual learning (CL) is widely regarded as a crucial challenge for lifelong AI. We introduce CLEAR, the first continual image classification benchmark dataset with a natural temporal evolution of visual concepts. We find that a simple unsupervised pre-training step can already boost state-of-the-art CL algorithms.
arXiv Detail & Related papers (2022-01-17T09:09:09Z)

This list is automatically generated from the titles and abstracts of the papers on this site.