DQLAP: Deep Q-Learning Recommender Algorithm with Update Policy for a
Real Steam Turbine System
- URL: http://arxiv.org/abs/2210.06399v1
- Date: Wed, 12 Oct 2022 16:58:40 GMT
- Title: DQLAP: Deep Q-Learning Recommender Algorithm with Update Policy for a
Real Steam Turbine System
- Authors: M.H. Modirrousta, M. Aliyari Shoorehdeli, M. Yari, A. Ghahremani
- Abstract summary: Machine learning and deep learning research has produced various methods for data-based fault diagnosis.
This paper aims to develop a framework based on deep learning and reinforcement learning for fault detection.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In modern industrial systems, diagnosing faults in time and with the best
available methods is increasingly crucial. A system may fail, or resources may be
wasted, if faults are not detected or are detected late. Machine learning and deep
learning research has produced various methods for data-based fault diagnosis, and we
seek the most reliable and practical ones. This paper aims to develop a framework
based on deep learning and reinforcement learning for fault detection. We can increase
accuracy, overcome data imbalance, and better predict future defects by updating the
reinforcement learning policy whenever new data are received. Implementing this method
yields a $3\%$ increase in all evaluation metrics, faster prediction, and a
$3\%$-$4\%$ improvement in all evaluation metrics compared to a typical
backpropagation multi-layer neural network with similar parameters.
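The policy-update loop described in the abstract can be pictured with a minimal sketch. It assumes the fault-detection task is cast as a classification MDP in which the agent's action is the predicted class and the reward is weighted inversely to class frequency to counter imbalance; the network size, reward rule, and update schedule below are illustrative assumptions, not the authors' exact design.

```python
# Minimal sketch of a DQN-style fault detector that is re-fit when new data
# arrive. The MDP framing (action = predicted fault class, imbalance-aware
# reward) and all hyperparameters are illustrative assumptions.
import numpy as np
import torch
import torch.nn as nn


class QNet(nn.Module):
    def __init__(self, n_features: int, n_classes: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_features, 64), nn.ReLU(),
            nn.Linear(64, 64), nn.ReLU(),
            nn.Linear(64, n_classes),          # one Q-value per fault class
        )

    def forward(self, x):
        return self.net(x)


def class_weights(y: np.ndarray) -> np.ndarray:
    """Reward magnitude per class, inversely proportional to class frequency."""
    counts = np.bincount(y)
    return counts.max() / counts


def train_policy(model, X, y, epochs=10, lr=1e-3, batch=64):
    """One round of Q-learning-style updates; called again whenever new data arrive."""
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    w = torch.tensor(class_weights(y), dtype=torch.float32)
    Xt = torch.tensor(X, dtype=torch.float32)
    yt = torch.tensor(y, dtype=torch.long)
    for _ in range(epochs):
        perm = torch.randperm(len(yt))
        for i in range(0, len(yt), batch):
            idx = perm[i:i + batch]
            xb, yb = Xt[idx], yt[idx]
            q = model(xb)                                       # Q(s, a) for every class
            action = q.argmax(dim=1)                            # greedy predicted class
            reward = torch.where(action == yb, w[yb], -w[yb])   # +/- imbalance-aware reward
            # One-step terminal target: the chosen action's Q-value should match the reward.
            target = q.detach().clone()
            target[torch.arange(len(idx)), action] = reward
            loss = nn.functional.mse_loss(q, target)
            opt.zero_grad()
            loss.backward()
            opt.step()
    return model


# Usage (hypothetical arrays): fit on historical turbine data, then update the
# policy on each newly received batch.
# model = QNet(n_features=X_hist.shape[1], n_classes=2)
# train_policy(model, X_hist, y_hist)
# train_policy(model, X_new, y_new, epochs=3)   # policy update when new data are received
```

Re-running the same update routine on incoming batches is what makes the recommender adapt to drifting operating conditions without retraining from scratch; the exact schedule is an assumption here.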
Related papers
- Improving Malware Detection with Adversarial Domain Adaptation and Control Flow Graphs [10.352741619176383]
Existing solutions to combat concept drift use active learning.
We propose a method that learns the information retained in malware control flow graphs after drift by leveraging graph neural networks.
Our approach demonstrates a significant enhancement in predicting unseen malware families in a binary classification task and predicting drifted malware families in a multi-class setting.
arXiv Detail & Related papers (2024-07-18T22:06:20Z) - Predicted Embedding Power Regression for Large-Scale Out-of-Distribution
Detection [77.1596426383046]
We develop a novel approach that calculates the probability of the predicted class label based on label distributions learned during the training process.
Our method performs better than current state-of-the-art methods with only a negligible increase in compute cost.
arXiv Detail & Related papers (2023-03-07T18:28:39Z) - Imbalanced Classification In Faulty Turbine Data: New Proximal Policy
Optimization [0.5735035463793008]
We propose a framework for fault detection based on reinforcement learning and an algorithm known as proximal policy optimization.
Using modified Proximal Policy Optimization, we can increase performance, overcome data imbalance, and better predict future faults.
arXiv Detail & Related papers (2023-01-10T16:03:25Z) - Temporal-Difference Value Estimation via Uncertainty-Guided Soft Updates [110.92598350897192]
Q-Learning has proven effective at learning a policy to perform control tasks.
Estimation noise becomes a bias after the max operator in the policy improvement step.
We present Unbiased Soft Q-Learning (UQL), which extends the work of EQL from two-action, finite-state spaces to multi-action, infinite-state Markov Decision Processes.
arXiv Detail & Related papers (2021-10-28T00:07:19Z) - Multivariate Anomaly Detection based on Prediction Intervals Constructed
using Deep Learning [0.0]
We benchmark our approach against the oft-preferred well-established statistical models.
We focus on three deep learning architectures, namely, cascaded neural networks, reservoir computing and long short-term memory recurrent neural networks.
arXiv Detail & Related papers (2021-10-07T12:34:31Z) - Towards Reducing Labeling Cost in Deep Object Detection [61.010693873330446]
We propose a unified framework for active learning that considers both the uncertainty and the robustness of the detector.
Our method is able to pseudo-label the very confident predictions, suppressing a potential distribution drift.
arXiv Detail & Related papers (2021-06-22T16:53:09Z) - Cross Learning in Deep Q-Networks [82.20059754270302]
We propose a novel cross Q-learning algorithm aimed at alleviating the well-known overestimation problem in value-based reinforcement learning methods.
Our algorithm builds on double Q-learning by maintaining a set of parallel models and estimating the Q-value based on a randomly selected network (see the sketch after this list).
arXiv Detail & Related papers (2020-09-29T04:58:17Z) - META-Learning Eligibility Traces for More Sample Efficient Temporal
Difference Learning [2.0559497209595823]
We propose a meta-learning method for adjusting the eligibility trace parameter, in a state-dependent manner.
The adaptation is achieved with the help of auxiliary learners that learn distributional information about the update targets online.
We prove that, under some assumptions, the proposed method improves the overall quality of the update targets, by minimizing the overall target error.
arXiv Detail & Related papers (2020-06-16T03:41:07Z) - DisCor: Corrective Feedback in Reinforcement Learning via Distribution
Correction [96.90215318875859]
We show that bootstrapping-based Q-learning algorithms do not necessarily benefit from corrective feedback.
We propose a new algorithm, DisCor, which computes an approximation to this optimal distribution and uses it to re-weight the transitions used for training.
arXiv Detail & Related papers (2020-03-16T16:18:52Z) - Debiased Off-Policy Evaluation for Recommendation Systems [8.63711086812655]
A/B tests are reliable, but they are time-consuming and costly, and they entail a risk of failure.
We develop an alternative method, which predicts the performance of algorithms given historical data.
Our method produces smaller mean squared errors than state-of-the-art methods.
arXiv Detail & Related papers (2020-02-20T02:30:02Z) - Value-driven Hindsight Modelling [68.658900923595]
Value estimation is a critical component of the reinforcement learning (RL) paradigm.
Model learning can make use of the rich transition structure present in sequences of observations, but this approach is usually not sensitive to the reward function.
We develop an approach for representation learning in RL that sits in between these two extremes.
This provides tractable prediction targets that are directly relevant for a task, and can thus accelerate learning the value function.
arXiv Detail & Related papers (2020-02-19T18:10:20Z)
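The "Cross Learning in Deep Q-Networks" entry above motivates a small illustration: keep several parallel Q-networks and bootstrap each update from a randomly chosen ensemble member, damping the overestimation introduced by the max operator. The ensemble size, network shape, and target rule below are assumptions for illustration, not the authors' published algorithm.

```python
# Illustrative sketch of updating one Q-network in an ensemble while taking
# the bootstrap value from a different, randomly chosen member.
import random
import torch
import torch.nn as nn


def make_qnet(n_states: int, n_actions: int) -> nn.Module:
    return nn.Sequential(nn.Linear(n_states, 32), nn.ReLU(), nn.Linear(32, n_actions))


def cross_q_update(ensemble, opts, batch, gamma=0.99):
    """One TD update of a randomly chosen learner.

    `batch` is a tuple (s, a, r, s_next, done) of tensors; `done` is a float
    mask (1.0 at episode termination). `opts` holds one optimizer per network.
    """
    s, a, r, s_next, done = batch
    learner = random.randrange(len(ensemble))
    target_net = random.choice([i for i in range(len(ensemble)) if i != learner])
    with torch.no_grad():
        q_next = ensemble[target_net](s_next).max(dim=1).values   # bootstrap from another member
        target = r + gamma * (1.0 - done) * q_next
    q = ensemble[learner](s).gather(1, a.unsqueeze(1)).squeeze(1)  # Q of the taken actions
    loss = nn.functional.mse_loss(q, target)
    opts[learner].zero_grad()
    loss.backward()
    opts[learner].step()
    return loss.item()
```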
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.