Process Outcome Prediction: CNN vs. LSTM (with Attention)
- URL: http://arxiv.org/abs/2104.06934v1
- Date: Wed, 14 Apr 2021 15:38:32 GMT
- Title: Process Outcome Prediction: CNN vs. LSTM (with Attention)
- Authors: Hans Weytjens and Jochen De Weerdt
- Abstract summary: We study the performance of Convolutional Neural Networks (CNN) and Long Short-Term Memory (LSTM) on time series problems.
Our findings show that all these neural networks achieve satisfactory to high predictive power.
We argue that CNNs' speed, early predictive power and robustness should pave the way for their application in process outcome prediction.
- Score: 0.15229257192293202
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: The early outcome prediction of ongoing or completed processes confers
competitive advantage to organizations. The performance of classic machine
learning and, more recently, deep learning techniques such as Long Short-Term
Memory (LSTM) on this type of classification problem has been thoroughly
investigated. Recently, much research focused on applying Convolutional Neural
Networks (CNN) to time series problems including classification, however not
yet to outcome prediction. The purpose of this paper is to close this gap and
compare CNNs to LSTMs. Attention is another technique that, in combination with
LSTMs, has found application in time series classification and was included in
our research. Our findings show that all these neural networks achieve
satisfactory to high predictive power provided sufficiently large datasets.
CNNs perform on par with LSTMs; the Attention mechanism adds no value to the
latter. Since CNNs run one order of magnitude faster than both types of LSTM,
their use is preferable. All models are robust with respect to their
hyperparameters and achieve their maximal predictive power early on in the
cases, usually after only a few events, making them highly suitable for runtime
predictions. We argue that CNNs' speed, early predictive power and robustness
should pave the way for their application in process outcome prediction.
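The paper itself does not ship code; the following is a minimal PyTorch sketch of the three model families being compared (a 1D CNN, a plain LSTM, and an LSTM with attention pooling) applied to padded event-prefix tensors. Layer sizes, the softmax attention pooling, and the input encoding are illustrative assumptions, not the authors' exact architectures.

```python
# Minimal sketch (not the authors' exact models): three prefix classifiers
# for binary process outcome prediction on padded event sequences.
# Input shape: (batch, seq_len, n_features), one feature vector per event.
import torch
import torch.nn as nn


class CNNOutcome(nn.Module):
    """1D CNN over the time axis, global max pooling, linear head."""
    def __init__(self, n_features, n_filters=64, kernel_size=3):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv1d(n_features, n_filters, kernel_size, padding="same"),
            nn.ReLU(),
            nn.Conv1d(n_filters, n_filters, kernel_size, padding="same"),
            nn.ReLU(),
        )
        self.head = nn.Linear(n_filters, 1)

    def forward(self, x):                       # x: (B, T, F)
        h = self.conv(x.transpose(1, 2))        # (B, n_filters, T)
        return self.head(h.max(dim=2).values)   # (B, 1) logits


class LSTMOutcome(nn.Module):
    """Plain LSTM; the last hidden state feeds the classifier."""
    def __init__(self, n_features, hidden=64):
        super().__init__()
        self.lstm = nn.LSTM(n_features, hidden, batch_first=True)
        self.head = nn.Linear(hidden, 1)

    def forward(self, x):
        _, (h_n, _) = self.lstm(x)               # h_n: (1, B, hidden)
        return self.head(h_n[-1])


class AttnLSTMOutcome(nn.Module):
    """LSTM whose hidden states are pooled with learned attention weights."""
    def __init__(self, n_features, hidden=64):
        super().__init__()
        self.lstm = nn.LSTM(n_features, hidden, batch_first=True)
        self.score = nn.Linear(hidden, 1)        # scores each time step
        self.head = nn.Linear(hidden, 1)

    def forward(self, x):
        h, _ = self.lstm(x)                      # (B, T, hidden)
        w = torch.softmax(self.score(h), dim=1)  # (B, T, 1) attention weights
        return self.head((w * h).sum(dim=1))     # weighted sum over time


if __name__ == "__main__":
    x = torch.randn(8, 20, 12)                   # 8 prefixes, 20 events, 12 features
    for model in (CNNOutcome(12), LSTMOutcome(12), AttnLSTMOutcome(12)):
        print(type(model).__name__, model(x).shape)  # -> torch.Size([8, 1])
```

The CNN's convolutions run in parallel across the time axis, whereas the LSTM processes events sequentially, which is consistent with the order-of-magnitude speed difference reported in the abstract.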
Related papers
- Time Elastic Neural Networks [2.1756081703276]
We introduce and detail an atypical neural network architecture, called the time elastic neural network (teNN).
Its novelty compared to classical neural network architectures is that it explicitly incorporates time warping ability.
We demonstrate that, during the training process, the teNN succeeds in reducing the number of neurons required within each cell.
arXiv Detail & Related papers (2024-05-27T09:01:30Z)
- SwinLSTM: Improving Spatiotemporal Prediction Accuracy using Swin Transformer and LSTM [10.104358712577215]
We propose a new recurrent cell, SwinLSTM, which integrates Swin Transformer blocks and a simplified LSTM, replacing the convolutional structure in ConvLSTM with self-attention.
Our competitive experimental results demonstrate that learning global spatial dependencies is more advantageous for models to capture spatiotemporal dependencies.
arXiv Detail & Related papers (2023-08-19T03:08:28Z)
- How neural networks learn to classify chaotic time series [77.34726150561087]
We study the inner workings of neural networks trained to classify regular-versus-chaotic time series.
We find that the relation between input periodicity and activation periodicity is key for the performance of LKCNN models.
arXiv Detail & Related papers (2023-06-04T08:53:27Z)
- Continuous time recurrent neural networks: overview and application to forecasting blood glucose in the intensive care unit [56.801856519460465]
Continuous time autoregressive recurrent neural networks (CTRNNs) are deep learning models that account for irregular observations.
We demonstrate the application of these models to probabilistic forecasting of blood glucose in a critical care setting.
arXiv Detail & Related papers (2023-04-14T09:39:06Z)
- Boosted Dynamic Neural Networks [53.559833501288146]
A typical EDNN has multiple prediction heads at different layers of the network backbone.
To optimize the model, these prediction heads together with the network backbone are trained on every batch of training data.
Treating training and testing inputs differently in the two phases causes a mismatch between the training and testing data distributions.
We formulate an EDNN as an additive model inspired by gradient boosting and propose multiple training techniques to optimize the model effectively (an illustrative additive-model sketch appears after this list).
arXiv Detail & Related papers (2022-11-30T04:23:12Z)
- Evaluation of deep learning models for multi-step ahead time series prediction [1.3764085113103222]
We present an evaluation study that compares the performance of deep learning models for multi-step ahead time series prediction.
Our deep learning methods comprise simple recurrent neural networks, long short-term memory (LSTM) networks, bidirectional LSTMs, encoder-decoder LSTM networks, and convolutional neural networks.
arXiv Detail & Related papers (2021-03-26T04:07:11Z)
- A Meta-Learning Approach to the Optimal Power Flow Problem Under Topology Reconfigurations [69.73803123972297]
We propose a DNN-based OPF predictor that is trained using a meta-learning (MTL) approach.
The developed OPF-predictor is validated through simulations using benchmark IEEE bus systems.
arXiv Detail & Related papers (2020-12-21T17:39:51Z)
- Automatic Remaining Useful Life Estimation Framework with Embedded Convolutional LSTM as the Backbone [5.927250637620123]
We propose a new LSTM variant called embedded convolutional LSTM (ECLSTM).
In ECLSTM, a group of different 1D convolutions is embedded into the LSTM structure; through this, the temporal information is preserved between and within windows.
We show the superiority of our proposed ECLSTM approach over state-of-the-art approaches on several widely used benchmark data sets for RUL estimation.
arXiv Detail & Related papers (2020-08-10T08:34:20Z)
- Continual Learning in Recurrent Neural Networks [67.05499844830231]
We evaluate the effectiveness of continual learning methods for processing sequential data with recurrent neural networks (RNNs)
We shed light on the particularities that arise when applying weight-importance methods, such as elastic weight consolidation (EWC), to RNNs (a minimal sketch of the EWC penalty appears after this list).
We show that the performance of weight-importance methods is not directly affected by the length of the processed sequences, but rather by high working memory requirements.
arXiv Detail & Related papers (2020-06-22T10:05:12Z)
- Tensor train decompositions on recurrent networks [60.334946204107446]
Matrix product state (MPS) tensor trains have more attractive features than MPOs, in terms of storage reduction and computing time at inference.
We show that MPS tensor trains should be at the forefront of LSTM network compression, through a theoretical analysis and practical experiments on NLP tasks.
arXiv Detail & Related papers (2020-06-09T18:25:39Z)
- COVID-19 growth prediction using multivariate long short term memory [2.588973722689844]
We use long short-term memory (LSTM) method to learn the correlation of COVID-19 growth over time.
First, we trained the model on data containing confirmed cases from around the globe.
We achieved favorable performance compared with that of the recurrent neural network (RNN) method, with a comparably low validation error.
arXiv Detail & Related papers (2020-05-10T23:21:19Z)
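The Boosted Dynamic Neural Networks entry above formulates an early-exit dynamic neural network (EDNN) as an additive model in the spirit of gradient boosting. As a rough illustration only (not the paper's actual training procedure), the sketch below shows a multi-exit backbone in which each exit adds its logits to the running sum produced by the earlier exits.

```python
# Illustrative sketch (not the paper's implementation): a multi-exit network
# treated as an additive model, where each exit refines the running sum of
# logits produced by the earlier exits, as in gradient boosting.
import torch
import torch.nn as nn


class AdditiveMultiExitNet(nn.Module):
    def __init__(self, in_dim=32, hidden=64, n_classes=10, n_exits=3):
        super().__init__()
        self.blocks = nn.ModuleList()
        self.exits = nn.ModuleList()
        dim = in_dim
        for _ in range(n_exits):
            self.blocks.append(nn.Sequential(nn.Linear(dim, hidden), nn.ReLU()))
            self.exits.append(nn.Linear(hidden, n_classes))
            dim = hidden

    def forward(self, x):
        logits_sum = 0.0
        outputs = []                      # additive prediction after each exit
        for block, exit_head in zip(self.blocks, self.exits):
            x = block(x)
            logits_sum = logits_sum + exit_head(x)
            outputs.append(logits_sum)
        return outputs


if __name__ == "__main__":
    net = AdditiveMultiExitNet()
    batch = torch.randn(4, 32)
    for k, logits in enumerate(net(batch), start=1):
        print(f"exit {k}:", logits.shape)  # each exit returns (4, 10) logits
```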
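The Continual Learning in Recurrent Neural Networks entry refers to weight-importance methods such as elastic weight consolidation (EWC). As background only (the RNN-specific findings of that paper are not reproduced here), here is a minimal sketch of the standard EWC quadratic penalty, loss = task loss + (lambda/2) * sum_i F_i (theta_i - theta_i*)^2, using a diagonal Fisher estimate.

```python
# Minimal sketch of the elastic weight consolidation (EWC) penalty:
# after finishing task A, store a copy of the weights and a diagonal Fisher
# estimate; while training on task B, add lambda/2 * F * (theta - theta_A)^2.
import torch
import torch.nn as nn


def diagonal_fisher(model, data_loader, loss_fn):
    """Diagonal Fisher information estimated from squared gradients."""
    fisher = {n: torch.zeros_like(p) for n, p in model.named_parameters()}
    for x, y in data_loader:
        model.zero_grad()
        loss_fn(model(x), y).backward()
        for n, p in model.named_parameters():
            if p.grad is not None:
                fisher[n] += p.grad.detach() ** 2
    return {n: f / max(len(data_loader), 1) for n, f in fisher.items()}


def ewc_penalty(model, old_params, fisher, lam=100.0):
    """Quadratic penalty anchoring parameters to their task-A values."""
    penalty = 0.0
    for n, p in model.named_parameters():
        penalty = penalty + (fisher[n] * (p - old_params[n]) ** 2).sum()
    return 0.5 * lam * penalty


if __name__ == "__main__":
    model = nn.Linear(4, 2)
    # Stand-in task-A data; in practice this is the task-A data loader.
    loader = [(torch.randn(8, 4), torch.randint(0, 2, (8,)))]
    old_params = {n: p.detach().clone() for n, p in model.named_parameters()}
    fisher = diagonal_fisher(model, loader, nn.CrossEntropyLoss())
    # Zero at the anchor point; grows as task-B training moves the weights.
    print("EWC penalty:", ewc_penalty(model, old_params, fisher).item())
```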
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.