DQLAP: Deep Q-Learning Recommender Algorithm with Update Policy for a
Real Steam Turbine System
- URL: http://arxiv.org/abs/2210.06399v1
- Date: Wed, 12 Oct 2022 16:58:40 GMT
- Title: DQLAP: Deep Q-Learning Recommender Algorithm with Update Policy for a
Real Steam Turbine System
- Authors: M.H. Modirrousta, M. Aliyari Shoorehdeli, M. Yari, A. Ghahremani
- Abstract summary: Machine learning and deep learning research has produced various methods for data-based fault diagnosis.
This paper aims to develop a framework based on deep learning and reinforcement learning for fault detection.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In modern industrial systems, diagnosing faults in time and with the best
available methods is increasingly crucial. A system may fail, or resources may be
wasted, if faults are not detected or are detected late. Machine learning and deep
learning research has produced various methods for data-based fault diagnosis, and we
seek the most reliable and practical ones. This paper aims to develop a framework
based on deep learning and reinforcement learning for fault detection. We can increase
accuracy, overcome data imbalance, and better predict future defects by updating the
reinforcement learning policy whenever new data are received. Implementing this method
yields a $3\%$ increase in all evaluation metrics, faster prediction, and a
$3\%$-$4\%$ improvement in all evaluation metrics compared to a typical
backpropagation multi-layer neural network with similar parameters.
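The policy-update loop described in the abstract can be pictured with a minimal sketch. It assumes the fault-detection task is cast as a classification MDP in which the agent's action is the predicted class and the reward is weighted inversely to class frequency to counter imbalance; the network size, reward rule, and update schedule below are illustrative assumptions, not the authors' exact design.

```python
# Minimal sketch of a DQN-style fault detector that is re-fit when new data
# arrive. The MDP framing (action = predicted fault class, imbalance-aware
# reward) and all hyperparameters are illustrative assumptions.
import numpy as np
import torch
import torch.nn as nn


class QNet(nn.Module):
    def __init__(self, n_features: int, n_classes: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_features, 64), nn.ReLU(),
            nn.Linear(64, 64), nn.ReLU(),
            nn.Linear(64, n_classes),          # one Q-value per fault class
        )

    def forward(self, x):
        return self.net(x)


def class_weights(y: np.ndarray) -> np.ndarray:
    """Reward magnitude per class, inversely proportional to class frequency."""
    counts = np.bincount(y)
    return counts.max() / counts


def train_policy(model, X, y, epochs=10, lr=1e-3, batch=64):
    """One round of Q-learning-style updates; called again whenever new data arrive."""
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    w = torch.tensor(class_weights(y), dtype=torch.float32)
    Xt = torch.tensor(X, dtype=torch.float32)
    yt = torch.tensor(y, dtype=torch.long)
    for _ in range(epochs):
        perm = torch.randperm(len(yt))
        for i in range(0, len(yt), batch):
            idx = perm[i:i + batch]
            xb, yb = Xt[idx], yt[idx]
            q = model(xb)                                       # Q(s, a) for every class
            action = q.argmax(dim=1)                            # greedy predicted class
            reward = torch.where(action == yb, w[yb], -w[yb])   # +/- imbalance-aware reward
            # One-step terminal target: the chosen action's Q-value should match the reward.
            target = q.detach().clone()
            target[torch.arange(len(idx)), action] = reward
            loss = nn.functional.mse_loss(q, target)
            opt.zero_grad()
            loss.backward()
            opt.step()
    return model


# Usage (hypothetical arrays): fit on historical turbine data, then update the
# policy on each newly received batch.
# model = QNet(n_features=X_hist.shape[1], n_classes=2)
# train_policy(model, X_hist, y_hist)
# train_policy(model, X_new, y_new, epochs=3)   # policy update when new data are received
```

Re-running the same update routine on incoming batches is what makes the recommender adapt to drifting operating conditions without retraining from scratch; the exact schedule is an assumption here.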
Related papers
- Improving Malware Detection with Adversarial Domain Adaptation and Control Flow Graphs [10.352741619176383]
Existing solutions to combat concept drift use active learning.
We propose a method that learns the information retained in malware control flow graphs after drift by leveraging graph neural networks.
Our approach demonstrates a significant enhancement in predicting unseen malware families in a binary classification task and predicting drifted malware families in a multi-class setting.
arXiv Detail & Related papers (2024-07-18T22:06:20Z) - Predicted Embedding Power Regression for Large-Scale Out-of-Distribution
Detection [77.1596426383046]
We develop a novel approach that calculates the probability of the predicted class label based on label distributions learned during the training process.
Our method performs better than current state-of-the-art methods with only a negligible increase in compute cost.
arXiv Detail & Related papers (2023-03-07T18:28:39Z) - Imbalanced Classification In Faulty Turbine Data: New Proximal Policy
Optimization [0.5735035463793008]
We propose a framework for fault detection based on reinforcement learning and an algorithm known as proximal policy optimization.
Using modified Proximal Policy Optimization, we can increase performance, overcome data imbalance, and better predict future faults.
arXiv Detail & Related papers (2023-01-10T16:03:25Z) - Temporal-Difference Value Estimation via Uncertainty-Guided Soft Updates [110.92598350897192]
Q-Learning has proven effective at learning a policy to perform control tasks.
Estimation noise becomes a bias after the max operator in the policy improvement step.
We present Unbiased Soft Q-Learning (UQL), which extends the work of EQL from two-action, finite-state spaces to multi-action, infinite-state Markov Decision Processes.
arXiv Detail & Related papers (2021-10-28T00:07:19Z) - Multivariate Anomaly Detection based on Prediction Intervals Constructed
using Deep Learning [0.0]
We benchmark our approach against the oft-preferred well-established statistical models.
We focus on three deep learning architectures, namely, cascaded neural networks, reservoir computing and long short-term memory recurrent neural networks.
arXiv Detail & Related papers (2021-10-07T12:34:31Z) - Towards Reducing Labeling Cost in Deep Object Detection [61.010693873330446]
We propose a unified framework for active learning that considers both the uncertainty and the robustness of the detector.
Our method is able to pseudo-label the very confident predictions, suppressing a potential distribution drift.
arXiv Detail & Related papers (2021-06-22T16:53:09Z) - Cross Learning in Deep Q-Networks [82.20059754270302]
We propose a novel cross Q-learning algorithm aimed at alleviating the well-known overestimation problem in value-based reinforcement learning methods.
Our algorithm builds on double Q-learning by maintaining a set of parallel models and estimating the Q-value based on a randomly selected network (see the sketch after this list).
arXiv Detail & Related papers (2020-09-29T04:58:17Z) - META-Learning Eligibility Traces for More Sample Efficient Temporal
Difference Learning [2.0559497209595823]
We propose a meta-learning method for adjusting the eligibility trace parameter, in a state-dependent manner.
The adaptation is achieved with the help of auxiliary learners that learn distributional information about the update targets online.
We prove that, under some assumptions, the proposed method improves the overall quality of the update targets, by minimizing the overall target error.
arXiv Detail & Related papers (2020-06-16T03:41:07Z) - DisCor: Corrective Feedback in Reinforcement Learning via Distribution
Correction [96.90215318875859]
We show that bootstrapping-based Q-learning algorithms do not necessarily benefit from corrective feedback.
We propose a new algorithm, DisCor, which computes an approximation to this optimal distribution and uses it to re-weight the transitions used for training.
arXiv Detail & Related papers (2020-03-16T16:18:52Z) - Debiased Off-Policy Evaluation for Recommendation Systems [8.63711086812655]
A/B tests are reliable, but they are time-consuming and costly, and they entail a risk of failure.
We develop an alternative method, which predicts the performance of algorithms given historical data.
Our method produces smaller mean squared errors than state-of-the-art methods.
arXiv Detail & Related papers (2020-02-20T02:30:02Z) - Value-driven Hindsight Modelling [68.658900923595]
Value estimation is a critical component of the reinforcement learning (RL) paradigm.
Model learning can make use of the rich transition structure present in sequences of observations, but this approach is usually not sensitive to the reward function.
We develop an approach for representation learning in RL that sits in between these two extremes.
This provides tractable prediction targets that are directly relevant for a task, and can thus accelerate learning the value function.
arXiv Detail & Related papers (2020-02-19T18:10:20Z)
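The "Cross Learning in Deep Q-Networks" entry above motivates a small illustration: keep several parallel Q-networks and bootstrap each update from a randomly chosen ensemble member, damping the overestimation introduced by the max operator. The ensemble size, network shape, and target rule below are assumptions for illustration, not the authors' published algorithm.

```python
# Illustrative sketch of updating one Q-network in an ensemble while taking
# the bootstrap value from a different, randomly chosen member.
import random
import torch
import torch.nn as nn


def make_qnet(n_states: int, n_actions: int) -> nn.Module:
    return nn.Sequential(nn.Linear(n_states, 32), nn.ReLU(), nn.Linear(32, n_actions))


def cross_q_update(ensemble, opts, batch, gamma=0.99):
    """One TD update of a randomly chosen learner.

    `batch` is a tuple (s, a, r, s_next, done) of tensors; `done` is a float
    mask (1.0 at episode termination). `opts` holds one optimizer per network.
    """
    s, a, r, s_next, done = batch
    learner = random.randrange(len(ensemble))
    target_net = random.choice([i for i in range(len(ensemble)) if i != learner])
    with torch.no_grad():
        q_next = ensemble[target_net](s_next).max(dim=1).values   # bootstrap from another member
        target = r + gamma * (1.0 - done) * q_next
    q = ensemble[learner](s).gather(1, a.unsqueeze(1)).squeeze(1)  # Q of the taken actions
    loss = nn.functional.mse_loss(q, target)
    opts[learner].zero_grad()
    loss.backward()
    opts[learner].step()
    return loss.item()
```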
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.