DeLELSTM: Decomposition-based Linear Explainable LSTM to Capture
Instantaneous and Long-term Effects in Time Series
- URL: http://arxiv.org/abs/2308.13797v1
- Date: Sat, 26 Aug 2023 07:45:41 GMT
- Title: DeLELSTM: Decomposition-based Linear Explainable LSTM to Capture
Instantaneous and Long-term Effects in Time Series
- Authors: Chaoqun Wang, Yijun Li, Xiangqian Sun, Qi Wu, Dongdong Wang and
Zhixiang Huang
- Abstract summary: We propose a Decomposition-based Linear Explainable LSTM (DeLELSTM) to improve the interpretability of LSTM.
We demonstrate the effectiveness and interpretability of DeLELSTM on three empirical datasets.
- Score: 26.378073712630467
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Time series forecasting is prevalent in various real-world applications.
Despite the promising results of deep learning models in time series
forecasting, especially the Recurrent Neural Networks (RNNs), the explanations
of time series models, which are critical in high-stakes applications, have
received little attention. In this paper, we propose a Decomposition-based
Linear Explainable LSTM (DeLELSTM) to improve the interpretability of LSTM.
Conventionally, the interpretability of RNNs concentrates only on variable
importance and time importance. We additionally distinguish between the
instantaneous influence of newly arriving data and the long-term effects of
historical data. Specifically, DeLELSTM consists of two components, i.e., a
standard LSTM and a tensorized LSTM. The tensorized LSTM assigns each variable
a unique hidden state, which together make up a matrix $\mathbf{h}_t$, while
the standard LSTM models all the variables with a shared hidden state
$\mathbf{H}_t$. By decomposing $\mathbf{H}_t$ into a linear combination of the
past information $\mathbf{h}_{t-1}$ and the fresh information
$\mathbf{h}_{t}-\mathbf{h}_{t-1}$, we obtain the instantaneous influence and
the long-term effect of each variable. In addition, the use of linear
regression makes the explanation transparent and clear. We demonstrate the
effectiveness and interpretability of DeLELSTM on three empirical datasets.
Extensive experiments show that the proposed method achieves competitive
performance against the baseline methods and provides explanations that are
reliable with respect to domain knowledge.
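To make the decomposition concrete, below is a minimal NumPy sketch of one plausible reading of the abstract: at each step, the shared hidden state of the standard LSTM is regressed on the per-variable past states and per-variable updates produced by the tensorized LSTM, and the regression coefficients are read off as long-term effects and instantaneous influences. The function and variable names (decompose_step, h_prev, h_curr, H_shared) and the plain least-squares fit are illustrative assumptions, not the paper's implementation.

```python
# Illustrative sketch of the decomposition idea from the abstract (not the
# authors' code). Names and the ordinary least-squares fit are assumptions.
import numpy as np

def decompose_step(h_prev, h_curr, H_shared):
    """Explain the shared hidden state at one time step.

    h_prev   : (N, d) per-variable hidden states (tensorized LSTM) at t-1
    h_curr   : (N, d) per-variable hidden states (tensorized LSTM) at t
    H_shared : (d,)   shared hidden state (standard LSTM) at t

    Returns per-variable long-term effects (weights on h_{t-1}) and
    instantaneous influences (weights on h_t - h_{t-1}).
    """
    N, _ = h_prev.shape
    fresh = h_curr - h_prev                         # "fresh information" h_t - h_{t-1}
    # Regress H_t on the 2N candidate components: past info and fresh info.
    X = np.concatenate([h_prev, fresh], axis=0).T   # shape (d, 2N)
    coef, *_ = np.linalg.lstsq(X, H_shared, rcond=None)
    long_term = coef[:N]          # contribution of each variable's history
    instantaneous = coef[N:]      # contribution of each variable's new data
    return long_term, instantaneous

# Toy usage with random states for 5 variables and hidden size 16.
rng = np.random.default_rng(0)
h_prev = rng.normal(size=(5, 16))
h_curr = h_prev + 0.1 * rng.normal(size=(5, 16))
H_shared = rng.normal(size=16)
lt, inst = decompose_step(h_prev, h_curr, H_shared)
print("long-term effects:", np.round(lt, 3))
print("instantaneous influences:", np.round(inst, 3))
```

Because the coefficients come from a linear fit at each time step, they can be inspected directly, which is what the abstract means by the explanation being transparent.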
Related papers
- packetLSTM: Dynamic LSTM Framework for Streaming Data with Varying Feature Space [44.62845936150961]
We study the online learning problem characterized by the varying input feature space of streaming data.
We propose a novel dynamic LSTM-based method, called packetLSTM, to model dimension-varying streams.
packetLSTM achieves state-of-the-art results on five datasets, and its underlying principle is extended to other RNN types, like GRU and vanilla RNN.
arXiv Detail & Related papers (2024-10-22T20:01:39Z) - On the Performance of Empirical Risk Minimization with Smoothed Data [59.3428024282545]
We show that Empirical Risk Minimization (ERM) is able to achieve sublinear error whenever a class is learnable with iid data.
arXiv Detail & Related papers (2024-02-22T21:55:41Z) - B-LSTM-MIONet: Bayesian LSTM-based Neural Operators for Learning the
Response of Complex Dynamical Systems to Length-Variant Multiple Input
Functions [6.75867828529733]
Multiple-input deep neural operators (MIONet) extended DeepONet to allow multiple input functions in different Banach spaces.
MIONet offers flexibility in training dataset grid spacing, without constraints on output location.
This work redesigns MIONet, integrating Long Short Term Memory (LSTM) to learn neural operators from time-dependent data.
arXiv Detail & Related papers (2023-11-28T04:58:17Z) - RigLSTM: Recurrent Independent Grid LSTM for Generalizable Sequence
Learning [75.61681328968714]
We propose recurrent independent Grid LSTM (RigLSTM) to exploit the underlying modular structure of the target task.
Our model adopts cell selection, input feature selection, hidden state selection, and soft state updating to achieve a better generalization ability.
arXiv Detail & Related papers (2023-11-03T07:40:06Z) - Bayesian Neural Network Language Modeling for Speech Recognition [59.681758762712754]
State-of-the-art neural network language models (NNLMs) represented by long short-term memory recurrent neural networks (LSTM-RNNs) and Transformers are becoming highly complex.
In this paper, an overarching full Bayesian learning framework is proposed to account for the underlying uncertainty in LSTM-RNN and Transformer LMs.
arXiv Detail & Related papers (2022-08-28T17:50:19Z) - Learning Mixtures of Linear Dynamical Systems [94.49754087817931]
We develop a two-stage meta-algorithm to efficiently recover each ground-truth LDS model up to error $\tilde{O}(\sqrt{d/T})$.
We validate our theoretical studies with numerical experiments, confirming the efficacy of the proposed algorithm.
arXiv Detail & Related papers (2022-01-26T22:26:01Z) - Combining Recurrent, Convolutional, and Continuous-time Models with
Linear State-Space Layers [21.09321438439848]
We introduce a simple sequence model inspired by control systems that generalizes recurrent, convolutional, and continuous-time models.
We show that LSSL models are closely related to the three aforementioned families of models and inherit their strengths.
For example, they generalize convolutions to continuous-time, explain common RNN heuristics, and share features of neural differential equations (NDEs) such as time-scale adaptation.
arXiv Detail & Related papers (2021-10-26T19:44:53Z) - Compressing LSTM Networks by Matrix Product Operators [7.395226141345625]
Long Short-Term Memory (LSTM) models are the building blocks of many state-of-the-art natural language processing (NLP) and speech enhancement (SE) algorithms.
Here we introduce the matrix product operator (MPO) decomposition, which describes the local correlation of quantum states in quantum many-body physics.
We propose an MPO-based neural network architecture to replace the LSTM model.
arXiv Detail & Related papers (2020-12-22T11:50:06Z) - Learning summary features of time series for likelihood free inference [93.08098361687722]
We present a data-driven strategy for automatically learning summary features from time series data.
Our results indicate that learning summary features from data can compete and even outperform LFI methods based on hand-crafted values.
arXiv Detail & Related papers (2020-12-04T19:21:37Z) - Understanding Self-supervised Learning with Dual Deep Networks [74.92916579635336]
We propose a novel framework to understand contrastive self-supervised learning (SSL) methods that employ dual pairs of deep ReLU networks.
We prove that in each SGD update of SimCLR with various loss functions, the weights at each layer are updated by a covariance operator.
To further study what role the covariance operator plays and which features are learned in such a process, we model the data generation and augmentation processes through a hierarchical latent tree model (HLTM).
arXiv Detail & Related papers (2020-10-01T17:51:49Z) - Long short-term memory networks and laglasso for bond yield forecasting:
Peeping inside the black box [10.412912723760172]
We conduct the first study of bond yield forecasting using long short-term memory (LSTM) networks.
We calculate the LSTM signals through time, at selected locations in the memory cell, using sequence-to-sequence architectures.
arXiv Detail & Related papers (2020-05-05T14:23:00Z)