DQLAP: Deep Q-Learning Recommender Algorithm with Update Policy for a
Real Steam Turbine System
- URL: http://arxiv.org/abs/2210.06399v1
- Date: Wed, 12 Oct 2022 16:58:40 GMT
- Title: DQLAP: Deep Q-Learning Recommender Algorithm with Update Policy for a
Real Steam Turbine System
- Authors: M.H. Modirrousta, M. Aliyari Shoorehdeli, M. Yari, A. Ghahremani
- Abstract summary: Machine learning and deep learning have proposed various methods for data-based fault diagnosis.
This paper aims to develop a framework based on deep learning and reinforcement learning for fault detection.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In modern industrial systems, diagnosing faults in time and using the best
methods becomes more and more crucial. It is possible to fail a system or to
waste resources if faults are not detected or are detected late. Machine
learning and deep learning have proposed various methods for data-based fault
diagnosis, and we are looking for the most reliable and practical ones. This
paper aims to develop a framework based on deep learning and reinforcement
learning for fault detection. We can increase accuracy, overcome data
imbalance, and better predict future defects by updating the reinforcement
learning policy when new data is received. By implementing this method, we will
see an increase of $3\%$ in all evaluation metrics, an improvement in
prediction speed, and $3\%$ - $4\%$ in all evaluation metrics compared to
typical backpropagation multi-layer neural network prediction with similar
parameters.
Related papers
- ReLearn: Unlearning via Learning for Large Language Models [64.2802606302194]
We propose ReLearn, a data augmentation and fine-tuning pipeline for effective unlearning.
This framework introduces Knowledge Forgetting Rate (KFR) and Knowledge Retention Rate (KRR) to measure knowledge-level preservation.
Our experiments show that ReLearn successfully achieves targeted forgetting while preserving high-quality output.
arXiv Detail & Related papers (2025-02-16T16:31:00Z) - What Really Matters for Learning-based LiDAR-Camera Calibration [50.2608502974106]
This paper revisits the development of learning-based LiDAR-Camera calibration.
We identify the critical limitations of regression-based methods with the widely used data generation pipeline.
We also investigate how the input data format and preprocessing operations impact network performance.
arXiv Detail & Related papers (2025-01-28T14:12:32Z) - Online-BLS: An Accurate and Efficient Online Broad Learning System for Data Stream Classification [52.251569042852815]
We introduce an online broad learning system framework with closed-form solutions for each online update.
We design an effective weight estimation algorithm and an efficient online updating strategy.
Our framework is naturally extended to data stream scenarios with concept drift and exceeds state-of-the-art baselines.
arXiv Detail & Related papers (2025-01-28T13:21:59Z) - A Rate-Distortion View of Uncertainty Quantification [36.85921945174863]
In supervised learning, understanding an input's proximity to the training data can help a model decide whether it has sufficient evidence for reaching a reliable prediction.
We introduce Distance Aware Bottleneck (DAB), a new method for enriching deep neural networks with this property.
arXiv Detail & Related papers (2024-06-16T01:33:22Z) - Predicted Embedding Power Regression for Large-Scale Out-of-Distribution
Detection [77.1596426383046]
We develop a novel approach that calculates the probability of the predicted class label based on label distributions learned during the training process.
Our method performs better than current state-of-the-art methods with only a negligible increase in compute cost.
arXiv Detail & Related papers (2023-03-07T18:28:39Z) - Imbalanced Classification In Faulty Turbine Data: New Proximal Policy
Optimization [0.5735035463793008]
We propose a framework for fault detection based on reinforcement learning and a policy known as proximal policy optimization.
Using modified Proximal Policy Optimization, we can increase performance, overcome data imbalance, and better predict future faults.
arXiv Detail & Related papers (2023-01-10T16:03:25Z) - Temporal-Difference Value Estimation via Uncertainty-Guided Soft Updates [110.92598350897192]
Q-Learning has proven effective at learning a policy to perform control tasks.
estimation noise becomes a bias after the max operator in the policy improvement step.
We present Unbiased Soft Q-Learning (UQL), which extends the work of EQL from two action, finite state spaces to multi-action, infinite state Markov Decision Processes.
arXiv Detail & Related papers (2021-10-28T00:07:19Z) - Multivariate Anomaly Detection based on Prediction Intervals Constructed
using Deep Learning [0.0]
We benchmark our approach against the oft-preferred well-established statistical models.
We focus on three deep learning architectures, namely, cascaded neural networks, reservoir computing and long short-term memory recurrent neural networks.
arXiv Detail & Related papers (2021-10-07T12:34:31Z) - Cross Learning in Deep Q-Networks [82.20059754270302]
We propose a novel cross Q-learning algorithm, aim at alleviating the well-known overestimation problem in value-based reinforcement learning methods.
Our algorithm builds on double Q-learning, by maintaining a set of parallel models and estimate the Q-value based on a randomly selected network.
arXiv Detail & Related papers (2020-09-29T04:58:17Z) - META-Learning Eligibility Traces for More Sample Efficient Temporal
Difference Learning [2.0559497209595823]
We propose a meta-learning method for adjusting the eligibility trace parameter, in a state-dependent manner.
The adaptation is achieved with the help of auxiliary learners that learn distributional information about the update targets online.
We prove that, under some assumptions, the proposed method improves the overall quality of the update targets, by minimizing the overall target error.
arXiv Detail & Related papers (2020-06-16T03:41:07Z) - Debiased Off-Policy Evaluation for Recommendation Systems [8.63711086812655]
A/B tests are reliable, but are time- and money-consuming, and entail a risk of failure.
We develop an alternative method, which predicts the performance of algorithms given historical data.
Our method produces smaller mean squared errors than state-of-the-art methods.
arXiv Detail & Related papers (2020-02-20T02:30:02Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.