Related papers: A Tree Architecture of LSTM Networks for Sequential Regression with Missing Data

A Tree Architecture of LSTM Networks for Sequential Regression with Missing Data

URL: http://arxiv.org/abs/2005.11353v1
Date: Fri, 22 May 2020 18:57:47 GMT
Title: A Tree Architecture of LSTM Networks for Sequential Regression with Missing Data
Authors: S. Onur Sahin and Suleyman S. Kozat
Abstract summary: We introduce a novel tree architecture based on the Long Short-Term Memory (LSTM) networks. In our architecture, we employ a variable number of LSTM networks, which use only the existing inputs in the sequence. We achieve significant performance improvements with respect to the state-of-the-art methods for the well-known financial and real life datasets.
Score: 0.0
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: We investigate regression for variable length sequential data containing missing samples and introduce a novel tree architecture based on the Long Short-Term Memory (LSTM) networks. In our architecture, we employ a variable number of LSTM networks, which use only the existing inputs in the sequence, in a tree-like architecture without any statistical assumptions or imputations on the missing data, unlike all the previous approaches. In particular, we incorporate the missingness information by selecting a subset of these LSTM networks based on "presence-pattern" of a certain number of previous inputs. From the mixture of experts perspective, we train different LSTM networks as our experts for various missingness patterns and then combine their outputs to generate the final prediction. We also provide the computational complexity analysis of the proposed architecture, which is in the same order of the complexity of the conventional LSTM architectures for the sequence length. Our method can be readily extended to similar structures such as GRUs, RNNs as remarked in the paper. In the experiments, we achieve significant performance improvements with respect to the state-of-the-art methods for the well-known financial and real life datasets.

Related papers

Closed-Loop LLM Discovery of Non-Standard Channel Priors in Vision Models [48.83701310501069]
Large Language Models (LLMs) offer a transformative approach to Neural Architecture Search (NAS)<n>We formulate the search as a sequence of conditional code generation tasks, where an LLM refines architectural specifications based on performance telemetry.<n>We generate a vast corpus of valid, shape-consistent architectures via Abstract Syntax Tree (AST) mutations.<n> Experimental results on CIFAR-100 validate the efficacy of this approach, demonstrating that the model yields statistically significant improvements in accuracy.
arXiv Detail & Related papers (2026-01-13T13:00:30Z)
Recurrent Stochastic Configuration Networks for Temporal Data Analytics [3.8719670789415925]
This paper develops a recurrent version of configuration networks (RSCNs) for problem solving. We build an initial RSCN model in the light of a supervisory mechanism, followed by an online update of the output weights. Numerical results clearly indicate that the proposed RSCN performs favourably over all of the datasets.
arXiv Detail & Related papers (2024-06-21T03:21:22Z)
Set-based Neural Network Encoding Without Weight Tying [91.37161634310819]
We propose a neural network weight encoding method for network property prediction. Our approach is capable of encoding neural networks in a model zoo of mixed architecture. We introduce two new tasks for neural network property prediction: cross-dataset and cross-architecture.
arXiv Detail & Related papers (2023-05-26T04:34:28Z)
Disentangling Structured Components: Towards Adaptive, Interpretable and Scalable Time Series Forecasting [52.47493322446537]
We develop a adaptive, interpretable and scalable forecasting framework, which seeks to individually model each component of the spatial-temporal patterns. SCNN works with a pre-defined generative process of MTS, which arithmetically characterizes the latent structure of the spatial-temporal patterns. Extensive experiments are conducted to demonstrate that SCNN can achieve superior performance over state-of-the-art models on three real-world datasets.
arXiv Detail & Related papers (2023-05-22T13:39:44Z)
Image Classification using Sequence of Pixels [3.04585143845864]
This study compares sequential image classification methods based on recurrent neural networks. We describe methods based on Long-Short-Term memory(LSTM), bidirectional Long-Short-Term memory(BiLSTM) architectures, etc.
arXiv Detail & Related papers (2022-09-23T09:42:44Z)
PredRNN: A Recurrent Neural Network for Spatiotemporal Predictive Learning [109.84770951839289]
We present PredRNN, a new recurrent network for learning visual dynamics from historical context. We show that our approach obtains highly competitive results on three standard datasets.
arXiv Detail & Related papers (2021-03-17T08:28:30Z)
Understanding Self-supervised Learning with Dual Deep Networks [74.92916579635336]
We propose a novel framework to understand contrastive self-supervised learning (SSL) methods that employ dual pairs of deep ReLU networks. We prove that in each SGD update of SimCLR with various loss functions, the weights at each layer are updated by a emphcovariance operator. To further study what role the covariance operator plays and which features are learned in such a process, we model data generation and augmentation processes through a emphhierarchical latent tree model (HLTM)
arXiv Detail & Related papers (2020-10-01T17:51:49Z)
Automatic Remaining Useful Life Estimation Framework with Embedded Convolutional LSTM as the Backbone [5.927250637620123]
We propose a new LSTM variant called embedded convolutional LSTM (E NeuralTM) In ETM a group of different 1D convolutions is embedded into the LSTM structure. Through this, the temporal information is preserved between and within windows. We show the superiority of our proposed ETM approach over the state-of-the-art approaches on several widely used benchmark data sets for RUL Estimation.
arXiv Detail & Related papers (2020-08-10T08:34:20Z)
The Heterogeneity Hypothesis: Finding Layer-Wise Differentiated Network Architectures [179.66117325866585]
We investigate a design space that is usually overlooked, i.e. adjusting the channel configurations of predefined networks. We find that this adjustment can be achieved by shrinking widened baseline networks and leads to superior performance. Experiments are conducted on various networks and datasets for image classification, visual tracking and image restoration.
arXiv Detail & Related papers (2020-06-29T17:59:26Z)
Stacked Bidirectional and Unidirectional LSTM Recurrent Neural Network for Forecasting Network-wide Traffic State with Missing Values [23.504633202965376]
We focus on RNN-based models and attempt to reformulate the way to incorporate RNN and its variants into traffic prediction models. A stacked bidirectional and unidirectional LSTM network architecture (SBU-LSTM) is proposed to assist the design of neural network structures for traffic state forecasting. We also propose a data imputation mechanism in the LSTM structure (LSTM-I) by designing an imputation unit to infer missing values and assist traffic prediction.
arXiv Detail & Related papers (2020-05-24T00:17:15Z)
Sentiment Analysis Using Simplified Long Short-term Memory Recurrent Neural Networks [1.5146765382501612]
We perform sentiment analysis on a GOP Debate Twitter dataset. To speed up training and reduce the computational cost and time, six different parameter reduced slim versions of the LSTM model are proposed.
arXiv Detail & Related papers (2020-05-08T12:50:10Z)
Meta-learning framework with applications to zero-shot time-series forecasting [82.61728230984099]
This work provides positive evidence using a broad meta-learning framework. residual connections act as a meta-learning adaptation mechanism. We show that it is viable to train a neural network on a source TS dataset and deploy it on a different target TS dataset without retraining.
arXiv Detail & Related papers (2020-02-07T16:39:43Z)

This list is automatically generated from the titles and abstracts of the papers in this site.