DeLELSTM: Decomposition-based Linear Explainable LSTM to Capture
Instantaneous and Long-term Effects in Time Series
- URL: http://arxiv.org/abs/2308.13797v1
- Date: Sat, 26 Aug 2023 07:45:41 GMT
- Title: DeLELSTM: Decomposition-based Linear Explainable LSTM to Capture
Instantaneous and Long-term Effects in Time Series
- Authors: Chaoqun Wang, Yijun Li, Xiangqian Sun, Qi Wu, Dongdong Wang and
Zhixiang Huang
- Abstract summary: We propose a Decomposition-based Linear Explainable LSTM (DeLELSTM) to improve the interpretability of LSTM.
We demonstrate the effectiveness and interpretability of DeLELSTM on three empirical datasets.
- Score: 26.378073712630467
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Time series forecasting is prevalent in various real-world applications.
Despite the promising results of deep learning models in time series
forecasting, especially the Recurrent Neural Networks (RNNs), the explanations
of time series models, which are critical in high-stakes applications, have
received little attention. In this paper, we propose a Decomposition-based
Linear Explainable LSTM (DeLELSTM) to improve the interpretability of LSTM.
Conventionally, the interpretability of RNNs concentrates only on variable
importance and time importance. We additionally distinguish between the
instantaneous influence of newly arriving data and the long-term effects of
historical data. Specifically, DeLELSTM consists of two components, i.e., a
standard LSTM and a tensorized LSTM. The tensorized LSTM assigns each variable
a unique hidden state, which together make up a matrix $\mathbf{h}_t$, while
the standard LSTM models all the variables with a shared hidden state
$\mathbf{H}_t$. By decomposing $\mathbf{H}_t$ into a linear combination of the
past information $\mathbf{h}_{t-1}$ and the fresh information
$\mathbf{h}_{t}-\mathbf{h}_{t-1}$, we obtain the instantaneous influence and
the long-term effect of each variable. In addition, the use of linear
regression makes the explanation transparent and clear. We demonstrate the
effectiveness and interpretability of DeLELSTM on three empirical datasets.
Extensive experiments show that the proposed method achieves competitive
performance against the baseline methods and provides explanations that are
reliable with respect to domain knowledge.
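To make the decomposition concrete, below is a minimal NumPy sketch of one plausible reading of the abstract: at each step, the shared hidden state of the standard LSTM is regressed on the per-variable past states and per-variable updates produced by the tensorized LSTM, and the regression coefficients are read off as long-term effects and instantaneous influences. The function and variable names (decompose_step, h_prev, h_curr, H_shared) and the plain least-squares fit are illustrative assumptions, not the paper's implementation.

```python
# Illustrative sketch of the decomposition idea from the abstract (not the
# authors' code). Names and the ordinary least-squares fit are assumptions.
import numpy as np

def decompose_step(h_prev, h_curr, H_shared):
    """Explain the shared hidden state at one time step.

    h_prev   : (N, d) per-variable hidden states (tensorized LSTM) at t-1
    h_curr   : (N, d) per-variable hidden states (tensorized LSTM) at t
    H_shared : (d,)   shared hidden state (standard LSTM) at t

    Returns per-variable long-term effects (weights on h_{t-1}) and
    instantaneous influences (weights on h_t - h_{t-1}).
    """
    N, _ = h_prev.shape
    fresh = h_curr - h_prev                         # "fresh information" h_t - h_{t-1}
    # Regress H_t on the 2N candidate components: past info and fresh info.
    X = np.concatenate([h_prev, fresh], axis=0).T   # shape (d, 2N)
    coef, *_ = np.linalg.lstsq(X, H_shared, rcond=None)
    long_term = coef[:N]          # contribution of each variable's history
    instantaneous = coef[N:]      # contribution of each variable's new data
    return long_term, instantaneous

# Toy usage with random states for 5 variables and hidden size 16.
rng = np.random.default_rng(0)
h_prev = rng.normal(size=(5, 16))
h_curr = h_prev + 0.1 * rng.normal(size=(5, 16))
H_shared = rng.normal(size=16)
lt, inst = decompose_step(h_prev, h_curr, H_shared)
print("long-term effects:", np.round(lt, 3))
print("instantaneous influences:", np.round(inst, 3))
```

Because the coefficients come from a linear fit at each time step, they can be inspected directly, which is what the abstract means by the explanation being transparent.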
Related papers
- packetLSTM: Dynamic LSTM Framework for Streaming Data with Varying Feature Space [44.62845936150961]
We study the online learning problem characterized by the varying input feature space of streaming data.
We propose a novel dynamic LSTM-based method, called packetLSTM, to model dimension-varying streams.
packetLSTM achieves state-of-the-art results on five datasets, and its underlying principle is extended to other RNN types, like GRU and vanilla RNN.
arXiv Detail & Related papers (2024-10-22T20:01:39Z) - On the Performance of Empirical Risk Minimization with Smoothed Data [59.3428024282545]
We show that Empirical Risk Minimization (ERM) is able to achieve sublinear error whenever a class is learnable with iid data.
arXiv Detail & Related papers (2024-02-22T21:55:41Z) - B-LSTM-MIONet: Bayesian LSTM-based Neural Operators for Learning the
Response of Complex Dynamical Systems to Length-Variant Multiple Input
Functions [6.75867828529733]
Multiple-input deep neural operators (MIONet) extended DeepONet to allow multiple input functions in different Banach spaces.
MIONet offers flexibility in training dataset grid spacing, without constraints on output location.
This work redesigns MIONet, integrating Long Short Term Memory (LSTM) to learn neural operators from time-dependent data.
arXiv Detail & Related papers (2023-11-28T04:58:17Z) - RigLSTM: Recurrent Independent Grid LSTM for Generalizable Sequence
Learning [75.61681328968714]
We propose recurrent independent Grid LSTM (RigLSTM) to exploit the underlying modular structure of the target task.
Our model adopts cell selection, input feature selection, hidden state selection, and soft state updating to achieve a better generalization ability.
arXiv Detail & Related papers (2023-11-03T07:40:06Z) - Bayesian Neural Network Language Modeling for Speech Recognition [59.681758762712754]
State-of-the-art neural network language models (NNLMs) represented by long short-term memory recurrent neural networks (LSTM-RNNs) and Transformers are becoming highly complex.
In this paper, an overarching full Bayesian learning framework is proposed to account for the underlying uncertainty in LSTM-RNN and Transformer LMs.
arXiv Detail & Related papers (2022-08-28T17:50:19Z) - Learning Mixtures of Linear Dynamical Systems [94.49754087817931]
We develop a two-stage meta-algorithm to efficiently recover each ground-truth LDS model up to error $\tilde{O}(\sqrt{d/T})$.
We validate our theoretical studies with numerical experiments, confirming the efficacy of the proposed algorithm.
arXiv Detail & Related papers (2022-01-26T22:26:01Z) - Combining Recurrent, Convolutional, and Continuous-time Models with
Linear State-Space Layers [21.09321438439848]
We introduce a simple sequence model inspired by control systems that generalizes recurrent, convolutional, and continuous-time models.
We show that LSSL models are closely related to the three aforementioned families of models and inherit their strengths.
For example, they generalize convolutions to continuous-time, explain common RNN heuristics, and share features of neural differential equations (NDEs) such as time-scale adaptation.
arXiv Detail & Related papers (2021-10-26T19:44:53Z) - Compressing LSTM Networks by Matrix Product Operators [7.395226141345625]
Long Short-Term Memory (LSTM) models are the building blocks of many state-of-the-art natural language processing (NLP) and speech enhancement (SE) algorithms.
Here we introduce the matrix product operator (MPO) decomposition, which describes the local correlation of quantum states in quantum many-body physics.
We propose an MPO-based neural network architecture to replace the LSTM model.
arXiv Detail & Related papers (2020-12-22T11:50:06Z) - Learning summary features of time series for likelihood free inference [93.08098361687722]
We present a data-driven strategy for automatically learning summary features from time series data.
Our results indicate that learning summary features from data can compete and even outperform LFI methods based on hand-crafted values.
arXiv Detail & Related papers (2020-12-04T19:21:37Z) - Understanding Self-supervised Learning with Dual Deep Networks [74.92916579635336]
We propose a novel framework to understand contrastive self-supervised learning (SSL) methods that employ dual pairs of deep ReLU networks.
We prove that in each SGD update of SimCLR with various loss functions, the weights at each layer are updated by a covariance operator.
To further study what role the covariance operator plays and which features are learned in such a process, we model the data generation and augmentation processes through a hierarchical latent tree model (HLTM).
arXiv Detail & Related papers (2020-10-01T17:51:49Z) - Long short-term memory networks and laglasso for bond yield forecasting:
Peeping inside the black box [10.412912723760172]
We conduct the first study of bond yield forecasting using long short-term memory (LSTM) networks.
We calculate the LSTM signals through time, at selected locations in the memory cell, using sequence-to-sequence architectures.
arXiv Detail & Related papers (2020-05-05T14:23:00Z)