Related papers: Online Continual Learning for Time Series: a Natural Score-driven Approach

Online Continual Learning for Time Series: a Natural Score-driven Approach

URL: http://arxiv.org/abs/2601.12931v1
Date: Mon, 19 Jan 2026 10:31:01 GMT
Title: Online Continual Learning for Time Series: a Natural Score-driven Approach
Authors: Edoardo Urettini, Daniele Atzeni, Ioanna-Yvonni Tsaknaki, Antonio Carta,
Abstract summary: Online continual learning (OCL) methods adapt to changing environments without forgetting past knowledge.<n>Online time series forecasting (OTSF) is a real-world problem where data evolve in time and success depends on both rapid adaptation and long-term memory.<n>This paper aims to strengthen the theoretical and practical connections between time series methods and OCL.
Score: 2.8989185098518626
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Online continual learning (OCL) methods adapt to changing environments without forgetting past knowledge. Similarly, online time series forecasting (OTSF) is a real-world problem where data evolve in time and success depends on both rapid adaptation and long-term memory. Indeed, time-varying and regime-switching forecasting models have been extensively studied, offering a strong justification for the use of OCL in these settings. Building on recent work that applies OCL to OTSF, this paper aims to strengthen the theoretical and practical connections between time series methods and OCL. First, we reframe neural network optimization as a parameter filtering problem, showing that natural gradient descent is a score-driven method and proving its information-theoretic optimality. Then, we show that using a Student's t likelihood in addition to natural gradient induces a bounded update, which improves robustness to outliers. Finally, we introduce Natural Score-driven Replay (NatSR), which combines our robust optimizer with a replay buffer and a dynamic scale heuristic that improves fast adaptation at regime drifts. Empirical results demonstrate that NatSR achieves stronger forecasting performance than more complex state-of-the-art methods.

Related papers

When Learning Hurts: Fixed-Pole RNN for Real-Time Online Training [58.25341036646294]
We analytically examine why learning recurrent poles does not provide tangible benefits in data and empirically offer real-time learning scenarios.<n>We show that fixed-pole networks achieve superior performance with lower training complexity, making them more suitable for online real-time tasks.
arXiv Detail & Related papers (2026-02-25T00:15:13Z)
Adaptive Benign Overfitting (ABO): Overparameterized RLS for Online Learning in Non-stationary Time-series [0.0]
ABO is highly accurate (comparable to baseline kernel methods) while achieving speed improvements of between 20 and 40 percent.<n>Results provide a unified view linking adaptive filtering, kernel approximation, and benign overfitting within a stable online learning framework.
arXiv Detail & Related papers (2026-01-29T15:58:01Z)
TreeLoRA: Efficient Continual Learning via Layer-Wise LoRAs Guided by a Hierarchical Gradient-Similarity Tree [52.44403214958304]
In this paper, we introduce TreeLoRA, a novel approach that constructs layer-wise adapters by leveraging hierarchical gradient similarity.<n>To reduce the computational burden of task similarity estimation, we employ bandit techniques to develop an algorithm based on lower confidence bounds.<n> experiments on both vision transformers (ViTs) and large language models (LLMs) demonstrate the effectiveness and efficiency of our approach.
arXiv Detail & Related papers (2025-06-12T05:25:35Z)
TS-ACL: Closed-Form Solution for Time Series-oriented Continual Learning [16.270548433574465]
Time series class-incremental learning faces two major challenges: catastrophic forgetting and intra-class variations.<n>We propose TS-ACL, which leverages a gradient-free closed-form solution to avoid the catastrophic forgetting problem.<n>It also provides privacy protection and efficiency.
arXiv Detail & Related papers (2024-10-21T12:34:02Z)
Gradient-Free Training of Recurrent Neural Networks using Random Perturbations [1.1742364055094265]
Recurrent neural networks (RNNs) hold immense potential for computations due to their Turing completeness and sequential processing capabilities. Backpropagation through time (BPTT), the prevailing method, extends the backpropagation algorithm by unrolling the RNN over time. BPTT suffers from significant drawbacks, including the need to interleave forward and backward phases and store exact gradient information. We present a new approach to perturbation-based learning in RNNs whose performance is competitive with BPTT.
arXiv Detail & Related papers (2024-05-14T21:15:29Z)
Estimating Post-Synaptic Effects for Online Training of Feed-Forward SNNs [0.27016900604393124]
Facilitating online learning in spiking neural networks (SNNs) is a key step in developing event-based models. We propose Online Training with Postsynaptic Estimates (OTPE) for training feed-forward SNNs. We show improved scaling for multi-layer networks using a novel approximation of temporal effects on the subsequent layer's activity.
arXiv Detail & Related papers (2023-11-07T16:53:39Z)
A Hybrid Framework for Sequential Data Prediction with End-to-End Optimization [0.0]
We investigate nonlinear prediction in an online setting and introduce a hybrid model that effectively mitigates hand-designed features and manual model selection issues. We employ a recurrent neural network (LSTM) for adaptive feature extraction from sequential data and a gradient boosting machinery (soft GBDT) for effective supervised regression. We demonstrate the learning behavior of our algorithm on synthetic data and the significant performance improvements over the conventional methods over various real life datasets.
arXiv Detail & Related papers (2022-03-25T17:13:08Z)
Learning Fast and Slow for Online Time Series Forecasting [76.50127663309604]
Fast and Slow learning Networks (FSNet) is a holistic framework for online time-series forecasting. FSNet balances fast adaptation to recent changes and retrieving similar old knowledge. Our code will be made publicly available.
arXiv Detail & Related papers (2022-02-23T18:23:07Z)
Learning to Continuously Optimize Wireless Resource in a Dynamic Environment: A Bilevel Optimization Perspective [52.497514255040514]
This work develops a new approach that enables data-driven methods to continuously learn and optimize resource allocation strategies in a dynamic environment. We propose to build the notion of continual learning into wireless system design, so that the learning model can incrementally adapt to the new episodes. Our design is based on a novel bilevel optimization formulation which ensures certain fairness" across different data samples.
arXiv Detail & Related papers (2021-05-03T07:23:39Z)
Learning to Continuously Optimize Wireless Resource In Episodically Dynamic Environment [55.91291559442884]
This work develops a methodology that enables data-driven methods to continuously learn and optimize in a dynamic environment. We propose to build the notion of continual learning into the modeling process of learning wireless systems. Our design is based on a novel min-max formulation which ensures certain fairness" across different data samples.
arXiv Detail & Related papers (2020-11-16T08:24:34Z)
Adaptive Gradient Method with Resilience and Momentum [120.83046824742455]
We propose an Adaptive Gradient Method with Resilience and Momentum (AdaRem) AdaRem adjusts the parameter-wise learning rate according to whether the direction of one parameter changes in the past is aligned with the direction of the current gradient. Our method outperforms previous adaptive learning rate-based algorithms in terms of the training speed and the test error.
arXiv Detail & Related papers (2020-10-21T14:49:00Z)
AdaS: Adaptive Scheduling of Stochastic Gradients [50.80697760166045]
We introduce the notions of textit"knowledge gain" and textit"mapping condition" and propose a new algorithm called Adaptive Scheduling (AdaS) Experimentation reveals that, using the derived metrics, AdaS exhibits: (a) faster convergence and superior generalization over existing adaptive learning methods; and (b) lack of dependence on a validation set to determine when to stop training.
arXiv Detail & Related papers (2020-06-11T16:36:31Z)

This list is automatically generated from the titles and abstracts of the papers in this site.