Imbalanced Classification In Faulty Turbine Data: New Proximal Policy Optimization
- URL: http://arxiv.org/abs/2301.04049v1
- Date: Tue, 10 Jan 2023 16:03:25 GMT
- Title: Imbalanced Classification In Faulty Turbine Data: New Proximal Policy Optimization
- Authors: Mohammad Hossein Modirrousta, Mahdi Aliyari Shoorehdeli, Mostafa Yari and Arash Ghahremani
- Abstract summary: We propose a framework for fault detection based on reinforcement learning and a policy-gradient method known as Proximal Policy Optimization.
Using modified Proximal Policy Optimization, we can increase performance, overcome data imbalance, and better predict future faults.
- Score: 0.5735035463793008
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Detecting faults, and deploying the best methods for doing so, is of growing importance in industrial and real-world systems. We seek the most trustworthy and practical data-driven fault detection methods that artificial intelligence can offer. In this paper, we propose a framework for fault detection based on reinforcement learning and a policy-gradient method known as Proximal Policy Optimization. Because fault data are scarce, a significant weakness of the traditional policy is poor detection of the fault classes; we address this by modifying the cost function. Using the modified Proximal Policy Optimization, we increase performance, overcome data imbalance, and better predict future faults. With the modified policy, all evaluation metrics improve by $3\%$ to $4\%$ over the traditional policy on the first benchmark, by $20\%$ to $55\%$ on the second, and by $6\%$ to $14\%$ on the third, along with gains in performance and prediction speed over previous methods.
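The abstract does not spell out how the cost function is modified, but a common way to handle class imbalance in an RL-based classifier is to scale the per-class reward inversely to class frequency, so that missing a rare fault class is penalized more heavily than missing a common healthy sample. The Python sketch below illustrates that idea with a one-step-episode formulation and a PPO-style clipped update; the weighting scheme, network, and hyperparameters are assumptions for illustration, not the authors' implementation.

```python
# Hypothetical sketch: imbalanced fault classification cast as a one-step RL
# problem with class-weighted rewards and a PPO-style clipped update.
# Illustrates the general idea only; not the paper's implementation.
import numpy as np
import torch
import torch.nn as nn

def class_weights(labels, n_classes):
    """Inverse-frequency weights: rare fault classes get larger rewards/penalties."""
    counts = np.bincount(labels, minlength=n_classes).astype(float)
    weights = counts.sum() / (n_classes * np.maximum(counts, 1.0))
    return torch.tensor(weights, dtype=torch.float32)

class PolicyNet(nn.Module):
    """Maps a feature vector to a categorical distribution over class labels."""
    def __init__(self, n_features, n_classes, hidden=64):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(n_features, hidden), nn.ReLU(),
                                 nn.Linear(hidden, n_classes))

    def forward(self, x):
        return torch.distributions.Categorical(logits=self.net(x))

def ppo_step(policy, optimizer, x, y, weights, clip_eps=0.2):
    """One clipped-surrogate update; each labelled sample is a one-step episode."""
    with torch.no_grad():                      # act with the "old" (frozen) policy
        old_dist = policy(x)
        actions = old_dist.sample()            # the predicted class is the action
        old_logp = old_dist.log_prob(actions)
        # Class-weighted reward: +w_c for a correct call, -w_c for a miss.
        reward = torch.where(actions == y, weights[y], -weights[y])
        advantage = reward - reward.mean()     # crude baseline, for illustration
    new_logp = policy(x).log_prob(actions)
    ratio = torch.exp(new_logp - old_logp)
    clipped = torch.clamp(ratio, 1.0 - clip_eps, 1.0 + clip_eps)
    loss = -torch.min(ratio * advantage, clipped * advantage).mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

A training loop would call ppo_step on mini-batches of (features, labels) with weights computed once from the training labels; because the weights enter through the reward, rare fault classes dominate the advantage signal even when they are heavily outnumbered.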
Related papers
- Optimal Baseline Corrections for Off-Policy Contextual Bandits [61.740094604552475]
We aim to learn decision policies that optimize an unbiased offline estimate of an online reward metric.
We propose a single framework built on their equivalence in learning scenarios.
Our framework enables us to characterize the variance-optimal unbiased estimator and provide a closed-form solution for it.
arXiv Detail & Related papers (2024-05-09T12:52:22Z)
- Off-Policy Primal-Dual Safe Reinforcement Learning [16.918188277722503]
We show that the error in cumulative cost estimation causes significant underestimation of cost when using off-policy methods.
We propose conservative policy optimization, which learns a policy in a constraint-satisfying area by considering the uncertainty in estimation.
We then introduce local policy convexification to help eliminate such suboptimality by gradually reducing the estimation uncertainty.
arXiv Detail & Related papers (2024-01-26T10:33:38Z)
- Importance-Weighted Offline Learning Done Right [16.4989952150404]
We study the problem of offline policy optimization in contextual bandit problems.
The goal is to learn a near-optimal policy based on a dataset of decision data collected by a suboptimal behavior policy.
We show that a simple alternative approach based on the "implicit exploration" estimator of Neu (2015) yields performance guarantees that are superior in nearly all respects to all previous results (a generic sketch of this family of estimators appears after this list).
arXiv Detail & Related papers (2023-09-27T16:42:10Z)
- Uncertainty-Aware Instance Reweighting for Off-Policy Learning [63.31923483172859]
We propose an Uncertainty-aware Inverse Propensity Score estimator (UIPS) for improved off-policy learning.
Experimental results on synthetic and three real-world recommendation datasets demonstrate the advantageous sample efficiency of the proposed UIPS estimator.
arXiv Detail & Related papers (2023-03-11T11:42:26Z)
- Value Enhancement of Reinforcement Learning via Efficient and Robust Trust Region Optimization [14.028916306297928]
Reinforcement learning (RL) is a powerful machine learning technique that enables an intelligent agent to learn an optimal policy.
We propose a novel value enhancement method to improve the performance of a given initial policy computed by existing state-of-the-art RL algorithms.
arXiv Detail & Related papers (2023-01-05T18:43:40Z)
- DQLAP: Deep Q-Learning Recommender Algorithm with Update Policy for a Real Steam Turbine System [0.0]
Various machine learning and deep learning methods have been proposed for data-based fault diagnosis.
This paper aims to develop a framework based on deep learning and reinforcement learning for fault detection.
arXiv Detail & Related papers (2022-10-12T16:58:40Z)
- Understanding the Effect of Stochasticity in Policy Optimization [86.7574122154668]
We show that the preferability of optimization methods depends critically on whether exact gradients are used.
Second, to explain these findings we introduce the concept of committal rate for policy optimization.
Third, we show that in the absence of external oracle information, there is an inherent trade-off between exploiting geometry to accelerate convergence versus achieving optimality almost surely.
arXiv Detail & Related papers (2021-10-29T06:35:44Z)
- Optimizing for the Future in Non-Stationary MDPs [52.373873622008944]
We present a policy gradient algorithm that maximizes a forecast of future performance.
We show that our algorithm, called Prognosticator, is more robust to non-stationarity than two online adaptation techniques.
arXiv Detail & Related papers (2020-05-17T03:41:19Z)
- Greedy Policy Search: A Simple Baseline for Learnable Test-Time Augmentation [65.92151529708036]
We introduce greedy policy search (GPS) as a simple but high-performing method for learning a policy of test-time augmentation.
We demonstrate that augmentation policies learned with GPS achieve superior predictive performance on image classification problems.
arXiv Detail & Related papers (2020-02-21T02:57:13Z)
- Efficient Policy Learning from Surrogate-Loss Classification Reductions [65.91730154730905]
We consider the estimation problem given by a weighted surrogate-loss classification reduction of policy learning.
We show that, under a correct specification assumption, the weighted classification formulation need not be efficient for policy parameters.
We propose an estimation approach based on generalized method of moments, which is efficient for the policy parameters.
arXiv Detail & Related papers (2020-02-12T18:54:41Z)
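Several entries above (the baseline-corrections work, the importance-weighted offline learning paper, and UIPS) build on variants of the inverse propensity score (IPS) estimator for off-policy value estimation. As background only, here is a minimal numpy sketch of the vanilla IPS estimate, the "implicit exploration" smoothing, and a baseline control variate; the function names and specific forms are generic textbook versions, not the exact estimators of any one of these papers.

```python
# Generic off-policy value estimators, for orientation only; not any paper's
# exact method. All arrays are per logged interaction (context, action, reward).
import numpy as np

def ips_value(target_probs, logging_probs, rewards):
    """Vanilla inverse propensity scoring: reweight logged rewards by pi/pi0."""
    w = target_probs / logging_probs
    return np.mean(w * rewards)

def ix_ips_value(target_probs, logging_probs, rewards, gamma=0.05):
    """Implicit-exploration smoothing: add gamma to the denominator so huge
    importance weights are tamed, trading a small bias for lighter tails."""
    w = target_probs / (logging_probs + gamma)
    return np.mean(w * rewards)

def baseline_ips_value(target_probs, logging_probs, rewards, baseline=None):
    """Control-variate form: subtract a baseline from the rewards and add it
    back; the estimate stays unbiased while its variance can drop."""
    if baseline is None:
        baseline = rewards.mean()
    w = target_probs / logging_probs
    return np.mean(w * (rewards - baseline)) + baseline
```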
This list is automatically generated from the titles and abstracts of the papers on this site.