Learning to Predict Chaos: Curriculum-Driven Training for Robust Forecasting of Chaotic Dynamics
- URL: http://arxiv.org/abs/2510.04342v1
- Date: Sun, 05 Oct 2025 20:06:16 GMT
- Title: Learning to Predict Chaos: Curriculum-Driven Training for Robust Forecasting of Chaotic Dynamics
- Authors: Harshil Vejendla
- Abstract summary: CCF organizes training data based on fundamental principles of dynamical systems theory. CCF enables the model to build a robust and generalizable representation of dynamical behaviors. CCF extends the valid prediction horizon by up to 40% compared to random-order training.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Forecasting chaotic systems is a cornerstone challenge in many scientific fields, complicated by the exponential amplification of even infinitesimal prediction errors. Modern machine learning approaches often falter due to two opposing pitfalls: over-specializing on a single, well-known chaotic system (e.g., Lorenz-63), which limits generalizability, or indiscriminately mixing vast, unrelated time-series, which prevents the model from learning the nuances of any specific dynamical regime. We propose Curriculum Chaos Forecasting (CCF), a training paradigm that bridges this gap. CCF organizes training data based on fundamental principles of dynamical systems theory, creating a curriculum that progresses from simple, periodic behaviors to highly complex, chaotic dynamics. We quantify complexity using the largest Lyapunov exponent and attractor dimension, two well-established metrics of chaos. By first training a sequence model on predictable systems and gradually introducing more chaotic trajectories, CCF enables the model to build a robust and generalizable representation of dynamical behaviors. We curate a library of over 50 synthetic ODE/PDE systems to build this curriculum. Our experiments show that pre-training with CCF significantly enhances performance on unseen, real-world benchmarks. On datasets including Sunspot numbers, electricity demand, and human ECG signals, CCF extends the valid prediction horizon by up to 40% compared to random-order training and more than doubles it compared to training on real-world data alone. We demonstrate that this benefit is consistent across various neural architectures (GRU, Transformer) and provide extensive ablations to validate the importance of the curriculum's structure.
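The abstract names the largest Lyapunov exponent as CCF's primary complexity metric for ordering the curriculum. As a minimal, self-contained sketch (not the authors' implementation), the exponent for a system such as Lorenz-63 can be estimated with a Benettin-style two-trajectory method: evolve a reference and a perturbed trajectory, renormalize the separation each step, and average the log growth rate. All parameters below are illustrative.

```python
import math

def lorenz63(state, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    x, y, z = state
    return (sigma * (y - x), x * (rho - z) - y, x * y - beta * z)

def rk4_step(state, dt=0.01):
    # One classical 4th-order Runge-Kutta step of the Lorenz-63 flow.
    def add(a, b, s):
        return tuple(ai + s * bi for ai, bi in zip(a, b))
    k1 = lorenz63(state)
    k2 = lorenz63(add(state, k1, dt / 2))
    k3 = lorenz63(add(state, k2, dt / 2))
    k4 = lorenz63(add(state, k3, dt))
    return tuple(s + dt / 6 * (a + 2 * b + 2 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

def largest_lyapunov(x0=(1.0, 1.0, 1.0), dt=0.01, n_steps=20000, d0=1e-8):
    # Benettin two-trajectory estimate: renormalize the separation to d0
    # after every step and accumulate the log expansion rate.
    a = x0
    b = (x0[0] + d0, x0[1], x0[2])
    # Discard a transient so both points settle onto the attractor.
    for _ in range(1000):
        a, b = rk4_step(a), rk4_step(b)
        d = math.dist(a, b)
        b = tuple(ai + (bi - ai) * d0 / d for ai, bi in zip(a, b))
    log_sum = 0.0
    for _ in range(n_steps):
        a, b = rk4_step(a), rk4_step(b)
        d = math.dist(a, b)
        log_sum += math.log(d / d0)
        b = tuple(ai + (bi - ai) * d0 / d for ai, bi in zip(a, b))
    return log_sum / (n_steps * dt)

lam = largest_lyapunov()
```

Sorting a system library by such estimates (periodic systems first, strongly chaotic ones last) is one plausible way to realize the easy-to-hard ordering the paper describes.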
Related papers
- ChaosNexus: A Foundation Model for Universal Chaotic System Forecasting with Multi-scale Representations [15.381819123860259]
ChaosNexus is a foundation model pre-trained on a diverse corpus of chaotic dynamics. It demonstrates state-of-the-art zero-shot generalization across both synthetic and real-world benchmarks.
arXiv Detail & Related papers (2025-09-26T02:59:12Z) - Scaling Collapse Reveals Universal Dynamics in Compute-Optimally Trained Neural Networks [59.552873049024775]
We show that compute-optimally trained models exhibit a remarkably precise universality. With learning rate decay, the collapse becomes so tight that differences in the normalized curves across models fall below the noise floor. We explain these phenomena by connecting collapse to the power-law structure in typical neural scaling laws.
arXiv Detail & Related papers (2025-07-02T20:03:34Z) - Learning Physical Systems: Symplectification via Gauge Fixing in Dirac Structures [8.633430288397376]
We introduce Presymplectification Networks (PSNs), the first framework to learn the symplectification lift via Dirac structures. Our architecture combines a recurrent encoder with a flow-matching objective to learn the augmented phase-space dynamics end-to-end. We then attach a lightweight Symplectic Network (SympNet) to forecast constrained trajectories while preserving energy, momentum, and constraint satisfaction.
arXiv Detail & Related papers (2025-06-23T16:23:37Z) - In-Context Linear Regression Demystified: Training Dynamics and Mechanistic Interpretability of Multi-Head Softmax Attention [52.159541540613915]
We study how multi-head softmax attention models are trained to perform in-context learning on linear data. Our results reveal that in-context learning ability emerges from the trained transformer as an aggregated effect of its architecture and the underlying data distribution.
arXiv Detail & Related papers (2025-03-17T02:00:49Z) - Attractor Memory for Long-Term Time Series Forecasting: A Chaos Perspective [63.60312929416228]
Attraos incorporates chaos theory into long-term time series forecasting.
We show that Attraos outperforms various LTSF methods on mainstream datasets and chaotic datasets with only one-twelfth of the parameters compared to PatchTST.
arXiv Detail & Related papers (2024-02-18T05:35:01Z) - Causal Temporal Regime Structure Learning [49.77103348208835]
We present CASTOR, a novel method that concurrently learns the Directed Acyclic Graph (DAG) for each regime. We establish the identifiability of the regimes and DAGs within our framework. Experiments show that CASTOR consistently outperforms existing causal discovery models.
arXiv Detail & Related papers (2023-11-02T17:26:49Z) - Model scale versus domain knowledge in statistical forecasting of chaotic systems [7.6146285961466]
We benchmark 24 state-of-the-art forecasting methods on a crowdsourced database of 135 low-dimensional systems with 17 forecast metrics.
We find that large-scale, domain-agnostic forecasting methods consistently produce predictions that remain accurate up to two dozen Lyapunov times.
In data-limited settings outside the long-horizon regime, we find that physics-based hybrid methods retain a comparative advantage due to their strong inductive biases.
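Both this benchmark and the CCF abstract report forecast quality as a horizon measured in Lyapunov times. A minimal sketch of one common way to compute such a metric (the exact definition used by either paper may differ): the time until the normalized pointwise error first exceeds a threshold, rescaled by the largest Lyapunov exponent.

```python
import math

def valid_prediction_time(truth, pred, dt, lyap, threshold=0.3):
    """Horizon, in Lyapunov times (lambda * t), until the error normalized
    by the RMS of the true signal first exceeds `threshold`."""
    scale = math.sqrt(sum(v * v for v in truth) / len(truth))
    for i, (t, p) in enumerate(zip(truth, pred)):
        if abs(t - p) / scale > threshold:
            return i * dt * lyap
    return len(truth) * dt * lyap  # error never exceeded the threshold

# Toy usage: the forecast diverges at the fourth step (index 3).
truth = [1.0] * 10
pred = [1.0, 1.0, 1.0, 1.6, 2.0, 3.0, 3.0, 3.0, 3.0, 3.0]
horizon = valid_prediction_time(truth, pred, dt=0.1, lyap=0.9)
```

Expressing the horizon in units of 1/lambda makes scores comparable across systems with very different intrinsic timescales.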
arXiv Detail & Related papers (2023-03-13T03:03:17Z) - Unifying Synergies between Self-supervised Learning and Dynamic Computation [53.66628188936682]
We present a novel perspective on the interplay between SSL and DC paradigms.
We show that it is feasible to simultaneously learn a dense and gated sub-network from scratch in a SSL setting.
The co-evolution during pre-training of both dense and gated encoder offers a good accuracy-efficiency trade-off.
arXiv Detail & Related papers (2023-01-22T17:12:58Z) - Knowledge-based Deep Learning for Modeling Chaotic Systems [7.075125892721573]
This paper considers extreme events and their dynamics and proposes models based on deep neural networks, called knowledge-based deep learning (KDL).
Our proposed KDL can learn the complex patterns governing chaotic systems by jointly training on real and simulated data.
We validate our model by assessing it on three real-world benchmark datasets: El Nino sea surface temperature, San Juan Dengue viral infection, and Bjornoya daily precipitation.
arXiv Detail & Related papers (2022-09-09T11:46:25Z) - Physics-Inspired Temporal Learning of Quadrotor Dynamics for Accurate Model Predictive Trajectory Tracking [76.27433308688592]
Accurately modeling a quadrotor's system dynamics is critical for guaranteeing agile, safe, and stable navigation.
We present a novel Physics-Inspired Temporal Convolutional Network (PI-TCN) approach to learning a quadrotor's system dynamics purely from robot experience.
Our approach combines the expressive power of sparse temporal convolutions and dense feed-forward connections to make accurate system predictions.
arXiv Detail & Related papers (2022-06-07T13:51:35Z) - Using scientific machine learning for experimental bifurcation analysis of dynamic systems [2.204918347869259]
This study focuses on training universal differential equation (UDE) models for physical nonlinear dynamical systems with limit cycles.
We consider examples where training data is generated by numerical simulations, and we also apply the proposed modelling concept to physical experiments.
We use both neural networks and Gaussian processes as universal approximators alongside the mechanistic models to give a critical assessment of the accuracy and robustness of the UDE modelling approach.
arXiv Detail & Related papers (2021-10-22T15:43:03Z) - Optimized ensemble deep learning framework for scalable forecasting of dynamics containing extreme events [0.0]
Two machine learning techniques are jointly used to achieve synergistic improvements in model accuracy, stability, and scalability, prompting a new wave of applications in the forecasting of dynamics.
The proposed OEDL model based on a best convex combination of feed-forward neural networks, reservoir computing, and long short-term memory can play a key role in advancing predictions of dynamics consisting of extreme events.
arXiv Detail & Related papers (2021-06-09T10:59:41Z)
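The OEDL entry above rests on a "best convex combination" of base forecasters. As an illustrative sketch only (the paper's actual optimization procedure is not given here), the simplest two-model case can be solved by a grid search over the mixing weight on a validation window:

```python
def best_convex_combination(preds_a, preds_b, truth, grid=101):
    """Grid-search the weight w in [0, 1] minimizing the MSE of the
    convex combination w * a + (1 - w) * b against held-out truth."""
    best_w, best_mse = 0.0, float("inf")
    for k in range(grid):
        w = k / (grid - 1)
        mse = sum((w * a + (1 - w) * b - t) ** 2
                  for a, b, t in zip(preds_a, preds_b, truth)) / len(truth)
        if mse < best_mse:
            best_w, best_mse = w, mse
    return best_w, best_mse

# Toy usage: the truth lies exactly halfway between the two forecasters,
# so the optimal weight is 0.5 with zero residual error.
a = [0.0, 2.0, 4.0]
b = [2.0, 4.0, 6.0]
truth = [1.0, 3.0, 5.0]
w, mse = best_convex_combination(a, b, truth)
```

Constraining the weights to the simplex guarantees the ensemble never does worse on the validation window than its best single member, which is the intuition behind convex model averaging.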
This list is automatically generated from the titles and abstracts of the papers in this site.