T-SHRED: Symbolic Regression for Regularization and Model Discovery with Transformer Shallow Recurrent Decoders
- URL: http://arxiv.org/abs/2506.15881v1
- Date: Wed, 18 Jun 2025 21:14:38 GMT
- Title: T-SHRED: Symbolic Regression for Regularization and Model Discovery with Transformer Shallow Recurrent Decoders
- Authors: Alexey Yermakov, David Zoro, Mars Liyao Gao, J. Nathan Kutz
- Abstract summary: SHallow REcurrent Decoders (SHRED) are effective for system identification and forecasting from sparse sensor measurements. We improve SHRED by leveraging transformers (T-SHRED) for the temporal encoding, which improves performance on next-step state prediction. Symbolic regression improves model interpretability by learning and regularizing the dynamics of the latent space during training.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: SHallow REcurrent Decoders (SHRED) are effective for system identification and forecasting from sparse sensor measurements. Such models are lightweight and computationally efficient, allowing them to be trained on consumer laptops. SHRED-based models rely on Recurrent Neural Networks (RNNs) and a simple Multi-Layer Perceptron (MLP) for the temporal encoding and spatial decoding, respectively. Despite their relatively simple structure, SHRED models are able to predict chaotic dynamical systems on different physical, spatial, and temporal scales directly from a sparse set of sensor measurements. In this work, we improve SHRED by leveraging transformers (T-SHRED) for the temporal encoding, which improves performance on next-step state prediction on large datasets. We also introduce a sparse identification of nonlinear dynamics (SINDy) attention mechanism into T-SHRED to perform symbolic regression directly on the latent space as part of the model regularization architecture. Symbolic regression improves model interpretability by learning and regularizing the dynamics of the latent space during training. We analyze the performance of T-SHRED on three different dynamical systems ranging from low-data to high-data regimes. We observe that SINDy attention T-SHRED accurately predicts future frames based on an interpretable symbolic model across all tested datasets.
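To make the architecture concrete, below is a minimal sketch of a SHRED-style model with a SINDy-style latent regularizer, assuming PyTorch. The class names, layer sizes, quadratic candidate library, and finite-difference penalty are illustrative assumptions, not the authors' released code; the T-SHRED variant described above would replace the GRU encoder with a transformer.

```python
# Minimal SHRED-style sketch: GRU temporal encoder + shallow MLP spatial decoder,
# with a SINDy-style regularizer on the latent trajectory. Illustrative only.
import torch
import torch.nn as nn


class SHRED(nn.Module):
    """Encode sparse sensor traces over time, decode the full spatial field."""

    def __init__(self, num_sensors: int, latent_dim: int, field_dim: int):
        super().__init__()
        self.encoder = nn.GRU(num_sensors, latent_dim, batch_first=True)
        self.decoder = nn.Sequential(  # shallow MLP decoder
            nn.Linear(latent_dim, 256), nn.ReLU(),
            nn.Linear(256, field_dim),
        )

    def forward(self, sensors: torch.Tensor) -> tuple[torch.Tensor, torch.Tensor]:
        # sensors: (batch, time, num_sensors) -> latent trajectory z: (batch, time, latent_dim)
        z, _ = self.encoder(sensors)
        # decode only the final latent state into the full spatial field
        return self.decoder(z[:, -1]), z


def sindy_library(z: torch.Tensor) -> torch.Tensor:
    """Candidate terms [1, z, z^2] for sparse regression on the latent dynamics."""
    ones = torch.ones_like(z[..., :1])
    return torch.cat([ones, z, z**2], dim=-1)


def sindy_regularizer(z: torch.Tensor, xi: torch.Tensor, dt: float) -> torch.Tensor:
    """Penalize mismatch between finite-difference latent derivatives and the
    sparse symbolic model dz/dt ~= Theta(z) @ xi, plus an L1 sparsity penalty."""
    dz_dt = (z[:, 1:] - z[:, :-1]) / dt        # forward differences along time
    theta = sindy_library(z[:, :-1])           # (batch, time-1, 1 + 2*latent_dim)
    return ((dz_dt - theta @ xi) ** 2).mean() + 1e-3 * xi.abs().mean()
```

During training, `xi` would be a learnable `nn.Parameter` of shape `(1 + 2*latent_dim, latent_dim)` and the regularizer would be added to the reconstruction loss; after training, the nonzero entries of `xi` read off a symbolic model for the latent dynamics, which is what makes the latent space interpretable.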
Related papers
- IONext: Unlocking the Next Era of Inertial Odometry [24.137981640306034]
We present a new CNN-based inertial odometry backbone, named Next Era of Inertial Odometry (IONext). IONext consistently outperforms state-of-the-art (SOTA) Transformer- and CNN-based methods. For instance, on the RNIN dataset, IONext reduces the average ATE by 10% and the average RTE by 12% compared to the representative model iMOT.
arXiv Detail & Related papers (2025-07-23T00:09:36Z) - Weight-Space Linear Recurrent Neural Networks [0.5937476291232799]
WARP (Weight-space Adaptive Recurrent Prediction) is a powerful framework that unifies weight-space learning with linear recurrence. We show that WARP matches or surpasses state-of-the-art baselines on diverse classification tasks.
arXiv Detail & Related papers (2025-06-01T20:13:28Z) - Sparse identification of nonlinear dynamics and Koopman operators with Shallow Recurrent Decoder Networks [3.1484174280822845]
We present a method to jointly solve the sensing and model identification problems with a simple implementation and efficient, robust performance. SINDy-SHRED uses Gated Recurrent Units to model sparse sensor measurements along with a shallow network decoder to reconstruct the full spatiotemporal field from the latent state space. We conduct systematic experimental studies on PDE data such as turbulent flows, real-world sensor measurements of sea surface temperature, and direct video data.
arXiv Detail & Related papers (2025-01-23T02:18:13Z) - DyG-Mamba: Continuous State Space Modeling on Dynamic Graphs [59.434893231950205]
Dynamic graph learning aims to uncover evolutionary laws in real-world systems.
We propose DyG-Mamba, a new continuous state space model for dynamic graph learning.
We show that DyG-Mamba achieves state-of-the-art performance on most datasets.
arXiv Detail & Related papers (2024-08-13T15:21:46Z) - KFD-NeRF: Rethinking Dynamic NeRF with Kalman Filter [49.85369344101118]
We introduce KFD-NeRF, a novel dynamic neural radiance field integrated with an efficient and high-quality motion reconstruction framework based on Kalman filtering.
Our key idea is to model the dynamic radiance field as a dynamic system whose temporally varying states are estimated based on two sources of knowledge: observations and predictions.
Our KFD-NeRF demonstrates similar or even superior performance within comparable computational time, and achieves state-of-the-art view synthesis performance with thorough training.
arXiv Detail & Related papers (2024-07-18T05:48:24Z) - Shallow Recurrent Decoder for Reduced Order Modeling of Plasma Dynamics [2.9320342785886973]
We develop a model reduction scheme based upon a Shallow REcurrent Decoder (SHRED) architecture.
Based upon the theory of separation of variables, the SHRED architecture is capable of reconstructing full spatiotemporal fields with as few as three point sensors.
arXiv Detail & Related papers (2024-05-20T11:21:23Z) - Brain-Inspired Spiking Neural Network for Online Unsupervised Time Series Prediction [13.521272923545409]
We present a novel Continuous Learning-based Unsupervised Recurrent Spiking Neural Network Model (CLURSNN).
CLURSNN makes online predictions by reconstructing the underlying dynamical system using Random Delay Embedding (a minimal delay-embedding sketch appears after this list).
We show that the proposed online time series prediction methodology outperforms state-of-the-art DNN models when predicting an evolving Lorenz63 dynamical system.
arXiv Detail & Related papers (2023-04-10T16:18:37Z) - Online Evolutionary Neural Architecture Search for Multivariate Non-Stationary Time Series Forecasting [72.89994745876086]
This work presents the Online Neuro-Evolution-based Neural Architecture Search (ONE-NAS) algorithm.
ONE-NAS is a novel neural architecture search method capable of automatically designing and dynamically training recurrent neural networks (RNNs) for online forecasting tasks.
Results demonstrate that ONE-NAS outperforms traditional statistical time series forecasting methods.
arXiv Detail & Related papers (2023-02-20T22:25:47Z) - Dynamic Spatial Sparsification for Efficient Vision Transformers and Convolutional Neural Networks [88.77951448313486]
We present a new approach for model acceleration by exploiting spatial sparsity in visual data.
We propose a dynamic token sparsification framework to prune redundant tokens.
We extend our method to hierarchical models including CNNs and hierarchical vision Transformers.
arXiv Detail & Related papers (2022-07-04T17:00:51Z) - PredRNN: A Recurrent Neural Network for Spatiotemporal Predictive Learning [109.84770951839289]
We present PredRNN, a new recurrent network for learning visual dynamics from historical context.
We show that our approach obtains highly competitive results on three standard datasets.
arXiv Detail & Related papers (2021-03-17T08:28:30Z) - Liquid Time-constant Networks [117.57116214802504]
We introduce a new class of time-continuous recurrent neural network models.
Instead of declaring a learning system's dynamics by implicit nonlinearities, we construct networks of linear first-order dynamical systems.
These neural networks exhibit stable and bounded behavior and yield superior expressivity within the family of neural ordinary differential equations.
arXiv Detail & Related papers (2020-06-08T09:53:35Z)
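The CLURSNN entry above relies on delay-coordinate (Takens-style) embedding to reconstruct a dynamical system from a scalar observable. Below is a minimal sketch, assuming NumPy; the `delay_embed` helper, the random choice of delays, and the crude Euler integration of Lorenz-63 are illustrative assumptions rather than that paper's actual method.

```python
# Delay-coordinate embedding sketch: rebuild state vectors from a scalar series.
import numpy as np


def delay_embed(x: np.ndarray, delays: np.ndarray) -> np.ndarray:
    """Stack delayed copies [x(t - d_0), ..., x(t - d_{m-1})] for each valid t."""
    d_max = int(delays.max())
    return np.stack([x[d_max - d : len(x) - d] for d in delays], axis=1)


# Example: embed the x-coordinate of a crudely Euler-integrated Lorenz-63 trajectory.
rng = np.random.default_rng(0)
dt, steps = 0.01, 5000
state = np.array([1.0, 1.0, 1.0])
xs = np.empty(steps)
for i in range(steps):
    x, y, z = state
    state = state + dt * np.array(
        [10.0 * (y - x), x * (28.0 - z) - y, x * y - 8.0 / 3.0 * z]
    )
    xs[i] = state[0]

# Five randomly chosen delays (in steps), mimicking a "random delay embedding".
delays = np.sort(rng.choice(np.arange(1, 50), size=5, replace=False))
embedded = delay_embed(xs, delays)  # shape: (steps - max_delay, 5)
```

Each row of `embedded` is a reconstructed state vector; a downstream predictor can then be trained on these vectors instead of the unobserved full state.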