Reliability Quantification of Deep Reinforcement Learning-based Control
- URL: http://arxiv.org/abs/2309.16977v2
- Date: Sat, 14 Oct 2023 01:34:30 GMT
- Title: Reliability Quantification of Deep Reinforcement Learning-based Control
- Authors: Hitoshi Yoshioka, Hirotada Hashimoto
- Abstract summary: This study proposes a method for quantifying the reliability of DRL-based control.
The reliability is quantified using two neural networks: reference and evaluator.
The proposed method was applied to the problem of switching trained models depending on the state.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Reliability quantification of deep reinforcement learning (DRL)-based control
is a significant challenge for the practical application of artificial
intelligence (AI) in safety-critical systems. This study proposes a method for
quantifying the reliability of DRL-based control. First, an existing method,
random noise distillation, was applied to the reliability evaluation to clarify
the issues to be solved. Second, a novel method for reliability quantification
was proposed to solve these issues. The reliability is quantified using two
neural networks: reference and evaluator. They have the same structure and the
same initial parameters. The outputs of the two networks were the same before
training. During training, the evaluator network parameters were updated to
maximize the difference between the reference and evaluator networks for
trained data. Thus, the reliability of the DRL-based control for a state can be
evaluated based on the difference in output between the two networks. The
proposed method was applied to DQN-based control as an example of a simple
task, and its effectiveness was demonstrated. Finally, the proposed method was
applied to the problem of switching trained models depending on the state.
Consequently, the performance of the DRL-based control was improved by
switching the trained models according to their reliability.
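The two-network mechanism described in the abstract can be illustrated with a short sketch. This is a minimal, hedged reconstruction rather than the authors' implementation: the network architecture, the bounded (tanh) output, the Adam optimizer, and the negative-MSE loss used to "maximize the difference" on trained data are assumptions made for illustration.
```python
import copy
import torch
import torch.nn as nn

def make_net(state_dim: int, embed_dim: int = 32) -> nn.Sequential:
    # Bounded outputs keep the "maximize the difference" objective well-posed
    # (an assumption of this sketch, not a detail stated in the abstract).
    return nn.Sequential(
        nn.Linear(state_dim, 64), nn.ReLU(),
        nn.Linear(64, embed_dim), nn.Tanh(),
    )

class ReliabilityQuantifier:
    """Reference/evaluator pair: identical at initialization, only the evaluator is trained."""

    def __init__(self, state_dim: int):
        self.reference = make_net(state_dim)
        self.evaluator = copy.deepcopy(self.reference)    # same structure, same initial parameters
        for p in self.reference.parameters():             # the reference stays frozen
            p.requires_grad_(False)
        self.opt = torch.optim.Adam(self.evaluator.parameters(), lr=1e-4)

    def update(self, states: torch.Tensor) -> float:
        """Push the evaluator away from the reference on states seen during DRL training."""
        gap = (self.evaluator(states) - self.reference(states)).pow(2).mean()
        loss = -gap                                       # maximize the output difference on trained data
        self.opt.zero_grad()
        loss.backward()
        self.opt.step()
        return gap.item()

    @torch.no_grad()
    def reliability(self, states: torch.Tensor) -> torch.Tensor:
        """Large output gap -> state resembles the training data -> the control is trusted more."""
        return (self.evaluator(states) - self.reference(states)).pow(2).mean(dim=-1)
```
In the model-switching experiment mentioned at the end of the abstract, one such quantifier could be attached to each trained agent, with control handed to the agent whose quantifier reports the largest gap for the current state; the exact switching rule is not specified in this summary.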
Related papers
- Digital Twin-Assisted Data-Driven Optimization for Reliable Edge Caching in Wireless Networks [60.54852710216738]
We introduce a novel digital twin-assisted optimization framework, called D-REC, to ensure reliable caching in nextG wireless networks.
By incorporating reliability modules into a constrained decision process, D-REC can adaptively adjust actions, rewards, and states to comply with advantageous constraints.
arXiv Detail & Related papers (2024-06-29T02:40:28Z) - A Perspective of Q-value Estimation on Offline-to-Online Reinforcement
Learning [54.48409201256968]
Offline-to-online Reinforcement Learning (O2O RL) aims to improve the performance of an offline pretrained policy using only a few online samples.
Most O2O methods focus on the balance between RL objective and pessimism, or the utilization of offline and online samples.
arXiv Detail & Related papers (2023-12-12T19:24:35Z) - Digital Twin Assisted Deep Reinforcement Learning for Online Admission
Control in Sliced Network [19.152875040151976]
We propose a digital twin (DT) accelerated DRL solution to address this issue.
A neural network-based DT is established with a customized output layer for queuing systems, trained through supervised learning, and then employed to assist the training phase of the DRL model.
Extensive simulations show that the DT-accelerated DRL improves resource utilization by over 40% compared to the directly trained state-of-the-art dueling deep Q-learning model.
arXiv Detail & Related papers (2023-10-07T09:09:19Z) - Statistically Efficient Variance Reduction with Double Policy Estimation
for Off-Policy Evaluation in Sequence-Modeled Reinforcement Learning [53.97273491846883]
We propose DPE: an RL algorithm that blends offline sequence modeling and offline reinforcement learning with Double Policy Estimation.
We validate our method in multiple tasks of OpenAI Gym with D4RL benchmarks.
arXiv Detail & Related papers (2023-08-28T20:46:07Z) - Efficient Deep Reinforcement Learning Requires Regulating Overfitting [91.88004732618381]
We show that high temporal-difference (TD) error on the validation set of transitions is the main culprit that severely degrades the performance of deep RL algorithms.
We show that a simple online model selection method that targets the validation TD error is effective across state-based DMC and Gym tasks (a minimal sketch of this selection rule appears after this list).
arXiv Detail & Related papers (2023-04-20T17:11:05Z) - Steady-State Error Compensation in Reference Tracking and Disturbance
Rejection Problems for Reinforcement Learning-Based Control [0.9023847175654602]
Reinforcement learning (RL) is a promising, upcoming topic in automatic control applications.
Initiative action state augmentation (IASA) for actor-critic-based RL controllers is introduced.
This augmentation does not require any expert knowledge, keeping the approach model-free.
arXiv Detail & Related papers (2022-01-31T16:29:19Z) - On the Robustness of Controlled Deep Reinforcement Learning for Slice
Placement [0.8459686722437155]
We compare two Deep Reinforcement Learning (DRL) algorithms: a pure DRL-based algorithm and a hybrid DRL-heuristic algorithm.
The evaluation results show that the proposed hybrid DRL-heuristic approach is more robust and reliable than pure DRL in the case of unpredictable network load changes.
arXiv Detail & Related papers (2021-08-05T10:24:33Z) - Enforcing robust control guarantees within neural network policies [76.00287474159973]
We propose a generic nonlinear control policy class, parameterized by neural networks, that enforces the same provable robustness criteria as robust control.
We demonstrate the power of this approach on several domains, improving in average-case performance over existing robust control methods and in worst-case stability over (non-robust) deep RL methods.
arXiv Detail & Related papers (2020-11-16T17:14:59Z) - Cross Learning in Deep Q-Networks [82.20059754270302]
We propose a novel cross Q-learning algorithm, aimed at alleviating the well-known overestimation problem in value-based reinforcement learning methods.
Our algorithm builds on double Q-learning by maintaining a set of parallel models and estimating the Q-value from a randomly selected network (a minimal sketch of this target computation appears after this list).
arXiv Detail & Related papers (2020-09-29T04:58:17Z) - Model-Free Voltage Regulation of Unbalanced Distribution Network Based
on Surrogate Model and Deep Reinforcement Learning [9.984416150031217]
This paper develops a model-free approach based on a surrogate model and deep reinforcement learning (DRL).
We have also extended it to deal with unbalanced three-phase scenarios.
arXiv Detail & Related papers (2020-06-24T18:49:41Z) - Two-stage Deep Reinforcement Learning for Inverter-based Volt-VAR
Control in Active Distribution Networks [3.260913246106564]
We propose a novel two-stage deep reinforcement learning (DRL) method to improve the voltage profile by regulating inverter-based energy resources.
In the offline stage, a highly efficient adversarial reinforcement learning algorithm is developed to train an offline agent robust to the model mismatch.
In the sequential online stage, we transfer the offline agent safely as the online agent to perform continuous learning and controlling online with significantly improved safety and efficiency.
arXiv Detail & Related papers (2020-05-20T08:02:13Z)
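As referenced in the "Efficient Deep Reinforcement Learning Requires Regulating Overfitting" entry above, model selection can target the TD error measured on held-out transitions. The sketch below is an assumed, minimal version of that rule for a DQN-style agent; the names (`validation_td_error`, `select_agent`) and the max-over-actions target are illustrative choices, not details taken from that paper.
```python
import torch

@torch.no_grad()
def validation_td_error(q_net, target_net, batch, gamma: float = 0.99) -> float:
    """Mean squared TD error of a Q-network on a batch of held-out (validation) transitions."""
    states, actions, rewards, next_states, dones = batch   # actions as int64 indices, dones as 0/1 floats
    q_sa = q_net(states).gather(1, actions.unsqueeze(1)).squeeze(1)
    targets = rewards + gamma * (1.0 - dones) * target_net(next_states).max(dim=1).values
    return torch.mean((q_sa - targets) ** 2).item()

def select_agent(candidates, val_batch):
    """Among candidate (q_net, target_net) pairs, keep the one with the lowest validation TD error."""
    return min(candidates, key=lambda c: validation_td_error(c[0], c[1], val_batch))
```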
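The "Cross Learning in Deep Q-Networks" entry describes maintaining a set of parallel models and estimating Q-values with a randomly selected network. The following is a minimal sketch of how such a bootstrapped target could be formed; the choice of which network is updated and which one evaluates the selected action is an assumption inspired by double Q-learning, not that paper's exact procedure.
```python
import random
import torch

def cross_q_target(q_nets, batch, gamma: float = 0.99):
    """Form a bootstrapped target for one ensemble member using a randomly selected peer."""
    states, actions, rewards, next_states, dones = batch
    i = random.randrange(len(q_nets))   # network to be updated this step
    j = random.randrange(len(q_nets))   # randomly selected network that evaluates the action
    with torch.no_grad():
        best_actions = q_nets[i](next_states).argmax(dim=1, keepdim=True)   # select with network i
        next_q = q_nets[j](next_states).gather(1, best_actions).squeeze(1)  # evaluate with network j
        targets = rewards + gamma * (1.0 - dones) * next_q
    return i, targets   # caller regresses q_nets[i](states) at the taken actions toward targets
```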
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences.