Reinforcement Learning with an Abrupt Model Change
- URL: http://arxiv.org/abs/2304.11460v1
- Date: Sat, 22 Apr 2023 18:16:01 GMT
- Title: Reinforcement Learning with an Abrupt Model Change
- Authors: Wuxia Chen, Taposh Banerjee, Jemin George, and Carl Busart
- Abstract summary: The problem of reinforcement learning is considered where the environment or the model undergoes a change.
An algorithm is proposed that an agent can apply in such a problem to achieve the optimal long-time discounted reward.
The algorithm is model-free and learns the optimal policy by interacting with the environment.
- Score: 15.101940747707705
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The problem of reinforcement learning is considered where the environment or
the model undergoes a change. An algorithm is proposed that an agent can apply
in such a problem to achieve the optimal long-time discounted reward. The
algorithm is model-free and learns the optimal policy by interacting with the
environment. It is shown that the proposed algorithm has strong optimality
properties. The effectiveness of the algorithm is also demonstrated using
simulation results. The proposed algorithm exploits a fundamental
reward-detection trade-off present in these problems and uses a quickest change
detection algorithm to detect the model change. Recommendations are provided
for faster detection of model changes and for smart initialization strategies.
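The abstract describes a loop: learn model-free, monitor the observed rewards with a quickest change detection (QCD) statistic, and re-learn once a change is declared. A minimal sketch of that pattern follows; it is not the authors' exact algorithm. The tabular Q-learning loop, the Gaussian pre- and post-change reward models, the threshold `h`, and the `env` interface (`reset`, `step`, `sample_action`) are all assumptions made for illustration.

```python
import numpy as np

def log_likelihood_ratio(r, mu0=0.0, mu1=1.0, sigma=1.0):
    # log f1(r)/f0(r) for two Gaussian reward models; assumed known here
    # purely for illustration.
    return ((r - mu0) ** 2 - (r - mu1) ** 2) / (2.0 * sigma ** 2)

def q_learning_with_qcd(env, n_states, n_actions, episodes=500,
                        alpha=0.1, gamma=0.95, eps=0.1, h=10.0):
    """Tabular Q-learning paired with a CUSUM-type detector on rewards.
    On an alarm, learning restarts for the post-change model."""
    Q = np.zeros((n_states, n_actions))
    W = 0.0  # CUSUM statistic
    for _ in range(episodes):
        s = env.reset()
        done = False
        while not done:
            # epsilon-greedy action selection
            a = env.sample_action() if np.random.rand() < eps else int(np.argmax(Q[s]))
            s2, r, done = env.step(a)
            # CUSUM recursion: W_n = max(0, W_{n-1} + LLR(r_n))
            W = max(0.0, W + log_likelihood_ratio(r))
            if W >= h:
                # Change declared: restart learning. A smart initialization,
                # as the paper recommends, would replace the zero reset.
                Q = np.zeros_like(Q)
                W = 0.0
            Q[s, a] += alpha * (r + gamma * np.max(Q[s2]) - Q[s, a])
            s = s2
    return Q
```

Lowering `h` detects changes faster at the cost of more false alarms, which illustrates the kind of detection trade-off the abstract alludes to.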
Related papers
- Deep Reinforcement Learning for Dynamic Algorithm Selection: A Proof-of-Principle Study on Differential Evolution [27.607740475924448]
We propose a deep reinforcement learning-based framework for dynamic algorithm selection.
We employ a sophisticated deep neural network model to infer the optimal action, ensuring informed algorithm selections.
As a proof-of-principle study, we apply this framework to a group of Differential Evolution algorithms.
arXiv Detail & Related papers (2024-03-04T15:40:28Z)
- Frog-Snake prey-predation Relationship Optimization (FSRO): A novel nature-inspired metaheuristic algorithm for feature selection [0.0]
This study proposes the Frog-Snake prey-predation Relationship Optimization (FSRO) algorithm.
It is inspired by the prey-predation relationship between frogs and snakes for application to discrete optimization problems.
The study conducts computational experiments on feature selection using 26 machine learning datasets.
arXiv Detail & Related papers (2024-02-13T06:39:15Z)
- Efficient Training of Physics-Informed Neural Networks with Direct Grid Refinement Algorithm [0.0]
This research presents the development of an innovative algorithm tailored for the adaptive sampling of residual points within the framework of Physics-Informed Neural Networks (PINNs).
By addressing the limitations inherent in existing adaptive sampling techniques, our proposed methodology introduces a direct mesh refinement approach that effectively ensures both computational efficiency and adaptive point placement.
arXiv Detail & Related papers (2023-06-14T07:04:02Z)
- Quickest Change Detection for Unnormalized Statistical Models [36.6516991850508]
This paper develops a new variant of the classical Cumulative Sum (CUSUM) algorithm for quickest change detection.
The SCUSUM algorithm makes change detection applicable to unnormalized statistical models.
arXiv Detail & Related papers (2023-02-01T05:27:34Z)
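For context, here is a minimal sketch of the classical CUSUM recursion that SCUSUM builds on. Per the abstract above, SCUSUM keeps this recursion but replaces the log-likelihood ratio with a score-based statistic so that unnormalized pre- and post-change models can be used; the Gaussian example below is an illustrative assumption.

```python
import numpy as np

def cusum(samples, llr, h):
    # Classical CUSUM: W_n = max(0, W_{n-1} + llr(x_n)); declare a change
    # at the first n with W_n >= h. Here llr(x) = log f1(x)/f0(x).
    W = 0.0
    for n, x in enumerate(samples, start=1):
        W = max(0.0, W + llr(x))
        if W >= h:
            return n  # stopping time of the first alarm
    return None       # no change declared

# Illustrative use: a mean shift from N(0,1) to N(1,1) at time 100.
rng = np.random.default_rng(0)
xs = np.concatenate([rng.normal(0.0, 1.0, 100), rng.normal(1.0, 1.0, 100)])
alarm = cusum(xs, llr=lambda x: x - 0.5, h=8.0)  # x - 0.5 is the exact LLR for this pair
```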
- Socio-cognitive Optimization of Time-delay Control Problems using Evolutionary Metaheuristics [89.24951036534168]
Metaheuristics are general-purpose optimization algorithms intended for difficult problems that classic approaches cannot solve.
In this paper we construct a novel socio-cognitive metaheuristic based on castes and apply several versions of this algorithm to the optimization of a time-delay system model.
arXiv Detail & Related papers (2022-10-23T22:21:10Z)
- When to Update Your Model: Constrained Model-based Reinforcement Learning [50.74369835934703]
We propose a novel and general theoretical scheme for a non-decreasing performance guarantee of model-based RL (MBRL).
The derived bounds reveal the relationship between model shifts and performance improvement.
A further example demonstrates that learning models from a dynamically-varying number of explorations benefits the eventual returns.
arXiv Detail & Related papers (2022-10-15T17:57:43Z)
- Improved Algorithms for Neural Active Learning [74.89097665112621]
We improve the theoretical and empirical performance of neural network (NN)-based active learning algorithms for the non-parametric streaming setting.
We introduce two regret metrics, based on minimizing the population loss, that are better suited to active learning than the metric used in state-of-the-art (SOTA) related work.
arXiv Detail & Related papers (2022-10-02T05:03:38Z)
- High-dimensional Bayesian Optimization Algorithm with Recurrent Neural Network for Disease Control Models in Time Series [1.9371782627708491]
We propose a new high-dimensional Bayesian optimization algorithm that incorporates recurrent neural networks.
The proposed RNN-BO algorithm can solve optimal control problems in a lower-dimensional space.
We also discuss the impacts of different numbers of the RNN layers and training epochs on the trade-off between solution quality and related computational efforts.
arXiv Detail & Related papers (2022-01-01T08:40:17Z)
- Variance-Reduced Off-Policy Memory-Efficient Policy Search [61.23789485979057]
Off-policy policy optimization is a challenging problem in reinforcement learning.
Off-policy algorithms are memory-efficient and capable of learning from off-policy samples.
arXiv Detail & Related papers (2020-09-14T16:22:46Z)
- Model-Augmented Actor-Critic: Backpropagating through Paths [81.86992776864729]
Current model-based reinforcement learning approaches use the model simply as a learned black-box simulator.
We show how to make more effective use of the model by exploiting its differentiability.
arXiv Detail & Related papers (2020-05-16T19:18:10Z)
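The phrase "exploiting its differentiability" in the entry above refers to pathwise gradients: rolling the policy through a differentiable learned model and backpropagating the predicted return to the policy parameters. A minimal sketch follows; the linear model, linear policy, and quadratic reward are assumptions for illustration, not the paper's architecture.

```python
import torch

state_dim, action_dim, horizon = 4, 2, 5
model = torch.nn.Linear(state_dim + action_dim, state_dim)  # stand-in for a learned dynamics model
policy = torch.nn.Linear(state_dim, action_dim)             # differentiable policy

def reward(s, a):
    # Assumed differentiable reward, illustrative only.
    return -(s ** 2).sum() - 0.1 * (a ** 2).sum()

s = torch.zeros(state_dim)
ret = torch.tensor(0.0)
for _ in range(horizon):
    a = torch.tanh(policy(s))
    ret = ret + reward(s, a)
    s = model(torch.cat([s, a]))  # gradient flows back through the model path

(-ret).backward()  # gradients of the negative return now sit in policy.weight.grad
```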
- Active Model Estimation in Markov Decision Processes [108.46146218973189]
We study the problem of efficient exploration in order to learn an accurate model of an environment, modeled as a Markov decision process (MDP).
We show that our Markov-based algorithm outperforms both our original algorithm and the maximum entropy algorithm in the small sample regime.
arXiv Detail & Related papers (2020-03-06T16:17:24Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.