Towards Stability of Autoregressive Neural Operators
- URL: http://arxiv.org/abs/2306.10619v2
- Date: Sun, 10 Dec 2023 22:07:45 GMT
- Title: Towards Stability of Autoregressive Neural Operators
- Authors: Michael McCabe, Peter Harrington, Shashank Subramanian, Jed Brown
- Abstract summary: We present results on scientific systems that include Navier-Stokes fluid flow, rotating shallow water, and a high-resolution global weather forecasting system.
Applying our design principles to neural operators leads to significantly lower errors on long-term forecasts and stability over longer time horizons.
- Score: 5.161531917413708
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Neural operators have proven to be a promising approach for modeling
spatiotemporal systems in the physical sciences. However, training these models
for large systems can be quite challenging as they incur significant
computational and memory expense -- these systems are often forced to rely on
autoregressive time-stepping of the neural network to predict future temporal
states. While this is effective in managing costs, it can lead to uncontrolled
error growth over time and eventual instability. We analyze the sources of this
autoregressive error growth using prototypical neural operator models for
physical systems and explore ways to mitigate it. We introduce architectural
and application-specific improvements that allow for careful control of
instability-inducing operations within these models without inflating the
compute/memory expense. We present results on several scientific systems that
include Navier-Stokes fluid flow, rotating shallow water, and a high-resolution
global weather forecasting system. We demonstrate that applying our design
principles to neural operators leads to significantly lower errors on
long-term forecasts and remains stable over longer time horizons, without
qualitative signs of divergence, compared to the original models for these
systems. We open-source our code for reproducibility:
https://github.com/mikemccabe210/stabilizing_neural_operators
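To make the failure mode concrete, below is a minimal sketch of the autoregressive rollout the abstract describes. The `neural_operator` stand-in is a hypothetical placeholder, not the paper's architecture (the actual models are in the linked repository); the point is only that predictions are fed back as inputs, so per-step errors compound.

```python
import numpy as np

def neural_operator(state: np.ndarray) -> np.ndarray:
    """Hypothetical stand-in for a learned one-step solution operator.

    In practice this would be a trained network (e.g., a Fourier neural
    operator) mapping the field at time t to the field at t + dt.
    """
    return state + 0.01 * np.roll(state, 1)  # toy placeholder update

def autoregressive_rollout(x0: np.ndarray, n_steps: int) -> list:
    """Feed each prediction back in as the next input.

    Any per-step error is re-ingested on the following step, which is the
    compounding error-growth mechanism the paper analyzes and mitigates.
    """
    states = [x0]
    for _ in range(n_steps):
        states.append(neural_operator(states[-1]))
    return states
```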
Related papers
- Langevin Flows for Modeling Neural Latent Dynamics [81.81271685018284]
We introduce LangevinFlow, a sequential Variational Auto-Encoder in which the time evolution of latent variables is governed by the underdamped Langevin equation.
Our approach incorporates physical priors -- such as inertia, damping, a learned potential function, and forces -- to represent both autonomous and non-autonomous processes in neural systems.
Our method outperforms state-of-the-art baselines on synthetic neural populations generated by a Lorenz attractor.
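For reference, a standard form of the underdamped Langevin dynamics named above (the exact parameterization used in LangevinFlow may differ); the mass $m$, damping $\gamma$, learned potential $U$, and forcing $\mathbf{F}$ correspond to the physical priors listed in the summary:

```latex
d\mathbf{z}_t = \mathbf{v}_t\,dt, \qquad
m\,d\mathbf{v}_t = \bigl(-\gamma\,\mathbf{v}_t - \nabla U(\mathbf{z}_t) + \mathbf{F}(t)\bigr)\,dt + \sqrt{2\gamma k_B T}\,d\mathbf{W}_t
```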
arXiv Detail & Related papers (2025-07-15T17:57:48Z)
- Real-Time Anomaly Detection and Reactive Planning with Large Language Models [18.57162998677491]
Foundation models, e.g., large language models (LLMs) trained on internet-scale data, possess zero-shot capabilities.
We present a two-stage reasoning framework that incorporates judgments about potential anomalies into a safe control framework.
This enables our monitor to improve the trustworthiness of dynamic robotic systems, such as quadrotors or autonomous vehicles.
arXiv Detail & Related papers (2024-07-11T17:59:22Z)
- The Disappearance of Timestep Embedding in Modern Time-Dependent Neural Networks [11.507779310946853]
We report a vulnerability in which the timestep embedding vanishes, disabling the time-awareness of a time-dependent neural network.
Our analysis provides a detailed description of this phenomenon as well as several solutions to address the root cause.
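One plausible way to picture this failure mode (an illustrative sketch, not the paper's exact analysis): when a sinusoidal timestep embedding is added to activations of much larger magnitude, the time signal's share of the normalized activation shrinks toward zero.

```python
import numpy as np

def sinusoidal_embedding(t: float, dim: int) -> np.ndarray:
    """Standard sinusoidal timestep embedding."""
    freqs = np.exp(-np.log(10000.0) * np.arange(dim // 2) / (dim // 2))
    return np.concatenate([np.sin(t * freqs), np.cos(t * freqs)])

dim = 64
emb = sinusoidal_embedding(t=10.0, dim=dim)
emb_dir = emb / np.linalg.norm(emb)
rng = np.random.default_rng(0)
for scale in (1.0, 1e2, 1e4):
    features = scale * rng.standard_normal(dim)
    mixed = (features + emb) / np.linalg.norm(features + emb)  # normalized activation
    print(scale, abs(mixed @ emb_dir))  # the time signal's share shrinks as scale grows
```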
arXiv Detail & Related papers (2024-05-23T02:58:23Z)
- RefreshNet: Learning Multiscale Dynamics through Hierarchical Refreshing [0.0]
"refreshing" mechanism in RefreshNet allows coarser blocks to reset inputs of finer blocks, effectively controlling and alleviating error accumulation.
"refreshing" mechanism in RefreshNet allows coarser blocks to reset inputs of finer blocks, effectively controlling and alleviating error accumulation.
arXiv Detail & Related papers (2024-01-24T07:47:01Z)
- Koopman Invertible Autoencoder: Leveraging Forward and Backward Dynamics for Temporal Modeling [13.38194491846739]
We propose a novel machine learning model based on Koopman operator theory, which we call the Koopman Invertible Autoencoder (KIA).
KIA captures the inherent characteristics of the system by modeling both forward and backward dynamics in an infinite-dimensional Hilbert space.
This enables us to efficiently learn low-dimensional representations, resulting in more accurate predictions of long-term system behavior.
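As background, the standard Koopman-operator picture this builds on (textbook notation, not an excerpt from the paper): the operator $\mathcal{K}$ advances observables $g$ along the flow map $F$, and invertibility of the learned dynamics gives the backward direction as well.

```latex
(\mathcal{K} g)(x) = g\bigl(F(x)\bigr), \qquad
(\mathcal{K}^{-1} g)(x) = g\bigl(F^{-1}(x)\bigr)
```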
arXiv Detail & Related papers (2023-09-19T03:42:55Z)
- Continuous time recurrent neural networks: overview and application to forecasting blood glucose in the intensive care unit [56.801856519460465]
Continuous-time autoregressive recurrent neural networks (CTRNNs) are deep learning models that account for irregularly spaced observations.
We demonstrate the application of these models to probabilistic forecasting of blood glucose in a critical care setting.
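For orientation, the classic continuous-time RNN hidden-state ODE (a textbook form; the probabilistic forecasting models in the paper build on variants of it). Because the state evolves in continuous time, it can be integrated to arbitrary observation times, which is how irregular sampling is handled:

```latex
\tau\,\frac{d\mathbf{h}(t)}{dt} = -\mathbf{h}(t) + f\bigl(W\,\mathbf{h}(t) + U\,\mathbf{x}(t) + \mathbf{b}\bigr)
```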
arXiv Detail & Related papers (2023-04-14T09:39:06Z)
- Physics-Inspired Temporal Learning of Quadrotor Dynamics for Accurate Model Predictive Trajectory Tracking [76.27433308688592]
Accurately modeling a quadrotor's system dynamics is critical for guaranteeing agile, safe, and stable navigation.
We present a novel Physics-Inspired Temporal Convolutional Network (PI-TCN) approach to learning a quadrotor's system dynamics purely from robot experience.
Our approach combines the expressive power of sparse temporal convolutions and dense feed-forward connections to make accurate system predictions.
arXiv Detail & Related papers (2022-06-07T13:51:35Z)
- An advanced spatio-temporal convolutional recurrent neural network for storm surge predictions [73.4962254843935]
We study the capability of artificial neural network models to emulate storm surge based on the storm track/size/intensity history.
This study presents a neural network model that can predict storm surge, informed by a database of synthetic storm simulations.
arXiv Detail & Related papers (2022-04-18T23:42:18Z)
- Leveraging the structure of dynamical systems for data-driven modeling [111.45324708884813]
We consider the impact of the training set and its structure on the quality of the long-term prediction.
We show how an informed design of the training set, based on invariants of the system and the structure of the underlying attractor, significantly improves the resulting models.
arXiv Detail & Related papers (2021-12-15T20:09:20Z)
- Sparse Flows: Pruning Continuous-depth Models [107.98191032466544]
We show that pruning improves generalization for neural ODEs in generative modeling.
We also show that pruning finds minimal and efficient neural ODE representations with up to 98% fewer parameters than the original network, without loss of accuracy.
arXiv Detail & Related papers (2021-06-24T01:40:17Z)
- Liquid Time-constant Networks [117.57116214802504]
We introduce a new class of time-continuous recurrent neural network models.
Instead of declaring a learning system's dynamics by implicit nonlinearities, we construct networks of linear first-order dynamical systems.
These neural networks exhibit stable and bounded behavior and yield superior expressivity within the family of neural ordinary differential equations.
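The liquid time-constant dynamics take roughly the following form (a commonly cited statement of the LTC update; see the paper for exact notation), where the input-dependent term $f$ modulates the effective time constant, hence "liquid":

```latex
\frac{d\mathbf{x}(t)}{dt} = -\left[\frac{1}{\tau} + f\bigl(\mathbf{x}(t), \mathbf{I}(t), t, \theta\bigr)\right]\mathbf{x}(t) + f\bigl(\mathbf{x}(t), \mathbf{I}(t), t, \theta\bigr)\,A
```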
arXiv Detail & Related papers (2020-06-08T09:53:35Z)
- Industrial Forecasting with Exponentially Smoothed Recurrent Neural Networks [0.0]
We present a class of exponentially smoothed recurrent neural networks (RNNs) that are well suited to modeling the non-stationary dynamical systems arising in industrial applications.
Application of exponentially smoothed RNNs to forecasting electricity load, weather data, and stock prices highlights the efficacy of exponential smoothing of the hidden state for multi-step time series forecasting.
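A minimal sketch of exponential smoothing applied to a hidden-state sequence (illustrative only; the paper's RNNs may place the smoothing inside the recurrence rather than applying it post hoc):

```python
import numpy as np

def exponentially_smoothed(h: np.ndarray, alpha: float) -> np.ndarray:
    """h_tilde[t] = alpha * h[t] + (1 - alpha) * h_tilde[t-1].

    Smaller alpha gives a longer effective memory, damping high-frequency
    noise in the hidden state for multi-step forecasting.
    """
    out = np.empty_like(h)
    out[0] = h[0]
    for t in range(1, len(h)):
        out[t] = alpha * h[t] + (1 - alpha) * out[t - 1]
    return out
```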
arXiv Detail & Related papers (2020-04-09T17:53:49Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.