Process Outcome Prediction: CNN vs. LSTM (with Attention)
- URL: http://arxiv.org/abs/2104.06934v1
- Date: Wed, 14 Apr 2021 15:38:32 GMT
- Title: Process Outcome Prediction: CNN vs. LSTM (with Attention)
- Authors: Hans Weytjens and Jochen De Weerdt
- Abstract summary: We study the performance of Convolutional Neural Networks (CNN) and Long Short-Term Memory (LSTM) on time series problems.
Our findings show that all these neural networks achieve satisfactory to high predictive power.
We argue that CNNs' speed, early predictive power and robustness should pave the way for their application in process outcome prediction.
- Score: 0.15229257192293202
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: The early outcome prediction of ongoing or completed processes confers
competitive advantage to organizations. The performance of classic machine
learning and, more recently, deep learning techniques such as Long Short-Term
Memory (LSTM) on this type of classification problem has been thoroughly
investigated. Recently, much research focused on applying Convolutional Neural
Networks (CNN) to time series problems including classification, however not
yet to outcome prediction. The purpose of this paper is to close this gap and
compare CNNs to LSTMs. Attention is another technique that, in combination with
LSTMs, has found application in time series classification and was included in
our research. Our findings show that all these neural networks achieve
satisfactory to high predictive power provided sufficiently large datasets.
CNNs perform on par with LSTMs; the Attention mechanism adds no value to the
latter. Since CNNs run one order of magnitude faster than both types of LSTM,
their use is preferable. All models are robust with respect to their
hyperparameters and achieve their maximal predictive power early on in the
cases, usually after only a few events, making them highly suitable for runtime
predictions. We argue that CNNs' speed, early predictive power and robustness
should pave the way for their application in process outcome prediction.
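The paper itself does not ship code; the following is a minimal PyTorch sketch of the three model families being compared (a 1D CNN, a plain LSTM, and an LSTM with attention pooling) applied to padded event-prefix tensors. Layer sizes, the softmax attention pooling, and the input encoding are illustrative assumptions, not the authors' exact architectures.

```python
# Minimal sketch (not the authors' exact models): three prefix classifiers
# for binary process outcome prediction on padded event sequences.
# Input shape: (batch, seq_len, n_features), one feature vector per event.
import torch
import torch.nn as nn


class CNNOutcome(nn.Module):
    """1D CNN over the time axis, global max pooling, linear head."""
    def __init__(self, n_features, n_filters=64, kernel_size=3):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv1d(n_features, n_filters, kernel_size, padding="same"),
            nn.ReLU(),
            nn.Conv1d(n_filters, n_filters, kernel_size, padding="same"),
            nn.ReLU(),
        )
        self.head = nn.Linear(n_filters, 1)

    def forward(self, x):                       # x: (B, T, F)
        h = self.conv(x.transpose(1, 2))        # (B, n_filters, T)
        return self.head(h.max(dim=2).values)   # (B, 1) logits


class LSTMOutcome(nn.Module):
    """Plain LSTM; the last hidden state feeds the classifier."""
    def __init__(self, n_features, hidden=64):
        super().__init__()
        self.lstm = nn.LSTM(n_features, hidden, batch_first=True)
        self.head = nn.Linear(hidden, 1)

    def forward(self, x):
        _, (h_n, _) = self.lstm(x)               # h_n: (1, B, hidden)
        return self.head(h_n[-1])


class AttnLSTMOutcome(nn.Module):
    """LSTM whose hidden states are pooled with learned attention weights."""
    def __init__(self, n_features, hidden=64):
        super().__init__()
        self.lstm = nn.LSTM(n_features, hidden, batch_first=True)
        self.score = nn.Linear(hidden, 1)        # scores each time step
        self.head = nn.Linear(hidden, 1)

    def forward(self, x):
        h, _ = self.lstm(x)                      # (B, T, hidden)
        w = torch.softmax(self.score(h), dim=1)  # (B, T, 1) attention weights
        return self.head((w * h).sum(dim=1))     # weighted sum over time


if __name__ == "__main__":
    x = torch.randn(8, 20, 12)                   # 8 prefixes, 20 events, 12 features
    for model in (CNNOutcome(12), LSTMOutcome(12), AttnLSTMOutcome(12)):
        print(type(model).__name__, model(x).shape)  # -> torch.Size([8, 1])
```

The CNN's convolutions run in parallel across the time axis, whereas the LSTM processes events sequentially, which is consistent with the order-of-magnitude speed difference reported in the abstract.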
Related papers
- Time Elastic Neural Networks [2.1756081703276]
We introduce and detail an atypical neural network architecture, called the time elastic neural network (teNN).
Its novelty compared to classical neural network architectures is that it explicitly incorporates time warping ability.
We demonstrate that, during the training process, the teNN succeeds in reducing the number of neurons required within each cell.
arXiv Detail & Related papers (2024-05-27T09:01:30Z)
- SwinLSTM: Improving Spatiotemporal Prediction Accuracy using Swin Transformer and LSTM [10.104358712577215]
We propose a new recurrent cell, SwinLSTM, which integrates Swin Transformer blocks and a simplified LSTM, replacing the convolutional structure in ConvLSTM with self-attention.
Our competitive experimental results demonstrate that learning global spatial dependencies is more advantageous for models to capture spatiotemporal dependencies.
arXiv Detail & Related papers (2023-08-19T03:08:28Z)
- How neural networks learn to classify chaotic time series [77.34726150561087]
We study the inner workings of neural networks trained to classify regular-versus-chaotic time series.
We find that the relation between input periodicity and activation periodicity is key for the performance of LKCNN models.
arXiv Detail & Related papers (2023-06-04T08:53:27Z)
- Continuous time recurrent neural networks: overview and application to forecasting blood glucose in the intensive care unit [56.801856519460465]
Continuous time autoregressive recurrent neural networks (CTRNNs) are deep learning models that account for irregular observations.
We demonstrate the application of these models to probabilistic forecasting of blood glucose in a critical care setting.
arXiv Detail & Related papers (2023-04-14T09:39:06Z)
- Boosted Dynamic Neural Networks [53.559833501288146]
A typical EDNN has multiple prediction heads at different layers of the network backbone.
To optimize the model, these prediction heads together with the network backbone are trained on every batch of training data.
Treating training and testing inputs differently in the two phases causes a mismatch between the training and testing data distributions.
We formulate an EDNN as an additive model inspired by gradient boosting and propose multiple training techniques to optimize the model effectively (an illustrative additive-model sketch appears after this list).
arXiv Detail & Related papers (2022-11-30T04:23:12Z)
- Evaluation of deep learning models for multi-step ahead time series prediction [1.3764085113103222]
We present an evaluation study that compares the performance of deep learning models for multi-step ahead time series prediction.
Our deep learning methods comprise simple recurrent neural networks, long short-term memory (LSTM) networks, bidirectional LSTMs, encoder-decoder LSTM networks, and convolutional neural networks.
arXiv Detail & Related papers (2021-03-26T04:07:11Z)
- A Meta-Learning Approach to the Optimal Power Flow Problem Under Topology Reconfigurations [69.73803123972297]
We propose a DNN-based OPF predictor that is trained using a meta-learning (MTL) approach.
The developed OPF-predictor is validated through simulations using benchmark IEEE bus systems.
arXiv Detail & Related papers (2020-12-21T17:39:51Z)
- Automatic Remaining Useful Life Estimation Framework with Embedded Convolutional LSTM as the Backbone [5.927250637620123]
We propose a new LSTM variant called embedded convolutional LSTM (ECLSTM).
In ECLSTM, a group of different 1D convolutions is embedded into the LSTM structure; through this, the temporal information is preserved between and within windows.
We show the superiority of our proposed ECLSTM approach over state-of-the-art approaches on several widely used benchmark data sets for RUL estimation.
arXiv Detail & Related papers (2020-08-10T08:34:20Z)
- Continual Learning in Recurrent Neural Networks [67.05499844830231]
We evaluate the effectiveness of continual learning methods for processing sequential data with recurrent neural networks (RNNs)
We shed light on the particularities that arise when applying weight-importance methods, such as elastic weight consolidation (EWC), to RNNs (a minimal sketch of the EWC penalty appears after this list).
We show that the performance of weight-importance methods is not directly affected by the length of the processed sequences, but rather by high working memory requirements.
arXiv Detail & Related papers (2020-06-22T10:05:12Z)
- Tensor train decompositions on recurrent networks [60.334946204107446]
Matrix product state (MPS) tensor trains have more attractive features than MPOs, in terms of storage reduction and computing time at inference.
We show that MPS tensor trains should be at the forefront of LSTM network compression, through a theoretical analysis and practical experiments on NLP tasks.
arXiv Detail & Related papers (2020-06-09T18:25:39Z)
- COVID-19 growth prediction using multivariate long short term memory [2.588973722689844]
We use long short-term memory (LSTM) method to learn the correlation of COVID-19 growth over time.
First, we trained the model on data containing confirmed cases from around the globe.
We achieved favorable performance compared with that of the recurrent neural network (RNN) method, with a comparably low validation error.
arXiv Detail & Related papers (2020-05-10T23:21:19Z)
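The Boosted Dynamic Neural Networks entry above formulates an early-exit dynamic neural network (EDNN) as an additive model in the spirit of gradient boosting. As a rough illustration only (not the paper's actual training procedure), the sketch below shows a multi-exit backbone in which each exit adds its logits to the running sum produced by the earlier exits.

```python
# Illustrative sketch (not the paper's implementation): a multi-exit network
# treated as an additive model, where each exit refines the running sum of
# logits produced by the earlier exits, as in gradient boosting.
import torch
import torch.nn as nn


class AdditiveMultiExitNet(nn.Module):
    def __init__(self, in_dim=32, hidden=64, n_classes=10, n_exits=3):
        super().__init__()
        self.blocks = nn.ModuleList()
        self.exits = nn.ModuleList()
        dim = in_dim
        for _ in range(n_exits):
            self.blocks.append(nn.Sequential(nn.Linear(dim, hidden), nn.ReLU()))
            self.exits.append(nn.Linear(hidden, n_classes))
            dim = hidden

    def forward(self, x):
        logits_sum = 0.0
        outputs = []                      # additive prediction after each exit
        for block, exit_head in zip(self.blocks, self.exits):
            x = block(x)
            logits_sum = logits_sum + exit_head(x)
            outputs.append(logits_sum)
        return outputs


if __name__ == "__main__":
    net = AdditiveMultiExitNet()
    batch = torch.randn(4, 32)
    for k, logits in enumerate(net(batch), start=1):
        print(f"exit {k}:", logits.shape)  # each exit returns (4, 10) logits
```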
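The Continual Learning in Recurrent Neural Networks entry refers to weight-importance methods such as elastic weight consolidation (EWC). As background only (the RNN-specific findings of that paper are not reproduced here), here is a minimal sketch of the standard EWC quadratic penalty, loss = task loss + (lambda/2) * sum_i F_i (theta_i - theta_i*)^2, using a diagonal Fisher estimate.

```python
# Minimal sketch of the elastic weight consolidation (EWC) penalty:
# after finishing task A, store a copy of the weights and a diagonal Fisher
# estimate; while training on task B, add lambda/2 * F * (theta - theta_A)^2.
import torch
import torch.nn as nn


def diagonal_fisher(model, data_loader, loss_fn):
    """Diagonal Fisher information estimated from squared gradients."""
    fisher = {n: torch.zeros_like(p) for n, p in model.named_parameters()}
    for x, y in data_loader:
        model.zero_grad()
        loss_fn(model(x), y).backward()
        for n, p in model.named_parameters():
            if p.grad is not None:
                fisher[n] += p.grad.detach() ** 2
    return {n: f / max(len(data_loader), 1) for n, f in fisher.items()}


def ewc_penalty(model, old_params, fisher, lam=100.0):
    """Quadratic penalty anchoring parameters to their task-A values."""
    penalty = 0.0
    for n, p in model.named_parameters():
        penalty = penalty + (fisher[n] * (p - old_params[n]) ** 2).sum()
    return 0.5 * lam * penalty


if __name__ == "__main__":
    model = nn.Linear(4, 2)
    # Stand-in task-A data; in practice this is the task-A data loader.
    loader = [(torch.randn(8, 4), torch.randint(0, 2, (8,)))]
    old_params = {n: p.detach().clone() for n, p in model.named_parameters()}
    fisher = diagonal_fisher(model, loader, nn.CrossEntropyLoss())
    # Zero at the anchor point; grows as task-B training moves the weights.
    print("EWC penalty:", ewc_penalty(model, old_params, fisher).item())
```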
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.