Related papers: Switching Autoregressive Low-rank Tensor Models

Switching Autoregressive Low-rank Tensor Models

URL: http://arxiv.org/abs/2306.03291v2
Date: Wed, 7 Jun 2023 00:35:34 GMT
Title: Switching Autoregressive Low-rank Tensor Models
Authors: Hyun Dong Lee, Andrew Warrington, Joshua I. Glaser, Scott W. Linderman
Abstract summary: We show how to switch autoregressive low-rank tensor (SALT) models. SALT parameterizes the tensor of an ARHMM with a low-rank factorization to control the number of parameters. We prove theoretical and discuss practical connections between SALT, linear dynamical systems, and SLDSs.
Score: 12.461139675114818
License: http://creativecommons.org/licenses/by-nc-sa/4.0/
Abstract: An important problem in time-series analysis is modeling systems with time-varying dynamics. Probabilistic models with joint continuous and discrete latent states offer interpretable, efficient, and experimentally useful descriptions of such data. Commonly used models include autoregressive hidden Markov models (ARHMMs) and switching linear dynamical systems (SLDSs), each with its own advantages and disadvantages. ARHMMs permit exact inference and easy parameter estimation, but are parameter intensive when modeling long dependencies, and hence are prone to overfitting. In contrast, SLDSs can capture long-range dependencies in a parameter efficient way through Markovian latent dynamics, but present an intractable likelihood and a challenging parameter estimation task. In this paper, we propose switching autoregressive low-rank tensor (SALT) models, which retain the advantages of both approaches while ameliorating the weaknesses. SALT parameterizes the tensor of an ARHMM with a low-rank factorization to control the number of parameters and allow longer range dependencies without overfitting. We prove theoretical and discuss practical connections between SALT, linear dynamical systems, and SLDSs. We empirically demonstrate quantitative advantages of SALT models on a range of simulated and real prediction tasks, including behavioral and neural datasets. Furthermore, the learned low-rank tensor provides novel insights into temporal dependencies within each discrete state.

Related papers

Merging Models on the Fly Without Retraining: A Sequential Approach to Scalable Continual Model Merging [75.93960998357812]
Deep model merging represents an emerging research direction that combines multiple fine-tuned models to harness their capabilities across different tasks and domains. Current model merging techniques focus on merging all available models simultaneously, with weight matrices-based methods being the predominant approaches. We propose a training-free projection-based continual merging method that processes models sequentially.
arXiv Detail & Related papers (2025-01-16T13:17:24Z)
Combined Optimization of Dynamics and Assimilation with End-to-End Learning on Sparse Observations [1.492574139257933]
CODA is an end-to-end optimization scheme for jointly learning dynamics and DA directly from sparse and noisy observations. We introduce a novel learning objective that combines unrolled auto-regressive dynamics with the data- and self-consistency terms of weak-constraint 4Dvar DA.
arXiv Detail & Related papers (2024-09-11T09:36:15Z)
SMILE: Zero-Shot Sparse Mixture of Low-Rank Experts Construction From Pre-Trained Foundation Models [85.67096251281191]
We present an innovative approach to model fusion called zero-shot Sparse MIxture of Low-rank Experts (SMILE) construction. SMILE allows for the upscaling of source models into an MoE model without extra data or further training. We conduct extensive experiments across diverse scenarios, such as image classification and text generation tasks, using full fine-tuning and LoRA fine-tuning.
arXiv Detail & Related papers (2024-08-19T17:32:15Z)
HOPE for a Robust Parameterization of Long-memory State Space Models [51.66430224089725]
State-space models (SSMs) that utilize linear, time-invariant (LTI) systems are known for their effectiveness in learning long sequences. We develop a new parameterization scheme, called HOPE, for LTI systems that utilize Markov parameters within Hankel operators. Our new parameterization endows the SSM with non-decaying memory within a fixed time window, which is empirically corroborated by a sequential CIFAR-10 task with padded noise.
arXiv Detail & Related papers (2024-05-22T20:20:14Z)
Boosting Inference Efficiency: Unleashing the Power of Parameter-Shared Pre-trained Language Models [109.06052781040916]
We introduce a technique to enhance the inference efficiency of parameter-shared language models. We also propose a simple pre-training technique that leads to fully or partially shared models. Results demonstrate the effectiveness of our methods on both autoregressive and autoencoding PLMs.
arXiv Detail & Related papers (2023-10-19T15:13:58Z)
Multi-fidelity reduced-order surrogate modeling [5.346062841242067]
We present a new data-driven strategy that combines dimensionality reduction with multi-fidelity neural network surrogates. We show that the onset of instabilities and transients are well captured by this surrogate technique.
arXiv Detail & Related papers (2023-09-01T08:16:53Z)
Reduced order modeling of parametrized systems through autoencoders and SINDy approach: continuation of periodic solutions [0.0]
This work presents a data-driven, non-intrusive framework which combines ROM construction with reduced dynamics identification. The proposed approach leverages autoencoder neural networks with parametric sparse identification of nonlinear dynamics (SINDy) to construct a low-dimensional dynamical model. These aim at tracking the evolution of periodic steady-state responses as functions of system parameters, avoiding the computation of the transient phase, and allowing to detect instabilities and bifurcations.
arXiv Detail & Related papers (2022-11-13T01:57:18Z)
ES-dRNN: A Hybrid Exponential Smoothing and Dilated Recurrent Neural Network Model for Short-Term Load Forecasting [1.4502611532302039]
Short-term load forecasting (STLF) is challenging due to complex time series (TS) This paper proposes a novel hybrid hierarchical deep learning model that deals with multiple seasonality. It combines exponential smoothing (ES) and a recurrent neural network (RNN)
arXiv Detail & Related papers (2021-12-05T19:38:42Z)
Closed-form Continuous-Depth Models [99.40335716948101]
Continuous-depth neural models rely on advanced numerical differential equation solvers. We present a new family of models, termed Closed-form Continuous-depth (CfC) networks, that are simple to describe and at least one order of magnitude faster.
arXiv Detail & Related papers (2021-06-25T22:08:51Z)
Anomaly Detection of Time Series with Smoothness-Inducing Sequential Variational Auto-Encoder [59.69303945834122]
We present a Smoothness-Inducing Sequential Variational Auto-Encoder (SISVAE) model for robust estimation and anomaly detection of time series. Our model parameterizes mean and variance for each time-stamp with flexible neural networks. We show the effectiveness of our model on both synthetic datasets and public real-world benchmarks.
arXiv Detail & Related papers (2021-02-02T06:15:15Z)
Neural Closure Models for Dynamical Systems [35.000303827255024]
We develop a novel methodology to learn non-Markovian closure parameterizations for low-fidelity models. New "neural closure models" augment low-fidelity models with neural delay differential equations (nDDEs) We show that using non-Markovian over Markovian closures improves long-term accuracy and requires smaller networks.
arXiv Detail & Related papers (2020-12-27T05:55:33Z)
Convolutional Tensor-Train LSTM for Spatio-temporal Learning [116.24172387469994]
We propose a higher-order LSTM model that can efficiently learn long-term correlations in the video sequence. This is accomplished through a novel tensor train module that performs prediction by combining convolutional features across time. Our results achieve state-of-the-art performance-art in a wide range of applications and datasets.
arXiv Detail & Related papers (2020-02-21T05:00:01Z)

This list is automatically generated from the titles and abstracts of the papers in this site.