Data-Driven Evaluation of Training Action Space for Reinforcement Learning
 - URL: http://arxiv.org/abs/2204.03840v1
 - Date: Fri, 8 Apr 2022 04:53:43 GMT
 - Title: Data-Driven Evaluation of Training Action Space for Reinforcement Learning
 - Authors: Rajat Ghosh, Debojyoti Dutta
 - Abstract summary: This paper proposes a Shapley-inspired methodology for training action space categorization and ranking.
To reduce exponential-time Shapley computations, the methodology includes a Monte Carlo simulation.
The proposed data-driven methodology is extensible to different domains, use cases, and reinforcement learning algorithms.
 - Score: 1.370633147306388
 - License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
 - Abstract:   Training action space selection for reinforcement learning (RL) is
conflict-prone due to complex state-action relationships. To address this
challenge, this paper proposes a Shapley-inspired methodology for training
action space categorization and ranking. To reduce exponential-time Shapley
computations, the methodology includes a Monte Carlo simulation to avoid
unnecessary explorations. The effectiveness of the methodology is illustrated
using a cloud infrastructure resource tuning case study. It reduces the search
space by 80% and categorizes the training action sets into dispensable and
indispensable groups. Additionally, it ranks different training actions to
facilitate high-performance yet cost-efficient RL model design. The proposed
data-driven methodology is extensible to different domains, use cases, and
reinforcement learning algorithms.
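
To make the estimation step concrete, the following is a minimal sketch of Monte Carlo Shapley estimation for ranking training actions. It is not the paper's implementation: the helper evaluate_return(), which stands in for training and evaluating an RL agent restricted to a given action subset, and the example action names are hypothetical placeholders, and the paper's exact sampling scheme may differ from plain permutation sampling.

import random

def monte_carlo_shapley(actions, evaluate_return, num_permutations=100, seed=0):
    """Approximate each training action's Shapley value by sampling random
    permutations and accumulating marginal contributions to the return."""
    rng = random.Random(seed)
    shapley = {a: 0.0 for a in actions}
    for _ in range(num_permutations):
        perm = rng.sample(actions, len(actions))   # one random ordering
        subset = []
        prev_value = evaluate_return(subset)       # value of the empty action set
        for a in perm:
            subset.append(a)
            value = evaluate_return(subset)
            shapley[a] += value - prev_value       # marginal contribution of a
            prev_value = value
    return {a: v / num_permutations for a, v in shapley.items()}

if __name__ == "__main__":
    # Toy stand-in for the expensive train-and-evaluate loop; real use would
    # train the RL agent with only the given action subset enabled.
    def evaluate_return(subset):
        return len(set(subset) & {"scale_cpu", "scale_replicas"})

    actions = ["scale_cpu", "scale_memory", "scale_replicas", "noop"]
    values = monte_carlo_shapley(actions, evaluate_return, num_permutations=50)
    # Actions with positive estimated value are treated as indispensable here;
    # the zero threshold is only an illustrative choice.
    ranking = sorted(values.items(), key=lambda kv: kv[1], reverse=True)
    print("Shapley estimates:", values)
    print("Indispensable actions:", [a for a, v in ranking if v > 0])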
 
       
      
        Related papers
        - Scaling DRL for Decision Making: A Survey on Data, Network, and Training Budget Strategies [66.83950068218033]
Scaling laws demonstrate that scaling model parameters and training data enhances learning performance. Despite this potential to improve performance, the integration of scaling laws into deep reinforcement learning has not been fully realized. This review addresses the gap by systematically analyzing scaling strategies in three dimensions: data, network, and training budget.
arXiv  Detail & Related papers  (2025-08-05T08:03:12Z) - AdaLRS: Loss-Guided Adaptive Learning Rate Search for Efficient Foundation Model Pretraining [12.630306478872043]
We propose AdaLRS, a plug-and-play adaptive learning rate search algorithm that conducts online optimal learning rate search. Experiments show that AdaLRS adjusts suboptimal learning rates to the neighborhood of the optimum with marked efficiency and effectiveness.
arXiv  Detail & Related papers  (2025-06-16T09:14:01Z) - A Snapshot of Influence: A Local Data Attribution Framework for Online Reinforcement Learning [37.62558445850573]
We propose an algorithm, iterative influence-based filtering (IIF), for online RL training. IIF reduces sample complexity, speeds up training, and achieves higher returns. These results advance the interpretability, efficiency, and effectiveness of online RL.
arXiv  Detail & Related papers  (2025-05-25T19:25:57Z) - PLANRL: A Motion Planning and Imitation Learning Framework to Bootstrap Reinforcement Learning [13.564676246832544]
We introduce PLANRL, a framework that chooses when the robot should use classical motion planning and when it should learn a policy.
PLANRL switches between two modes of operation: reaching a waypoint using classical techniques when away from the objects and fine-grained manipulation control when about to interact with objects.
We evaluate our approach across multiple challenging simulation environments and real-world tasks, demonstrating superior performance in terms of adaptability, efficiency, and generalization compared to existing methods.
arXiv  Detail & Related papers  (2024-08-07T19:30:08Z) - Decomposing Control Lyapunov Functions for Efficient Reinforcement   Learning [10.117626902557927]
Current Reinforcement Learning (RL) methods require large amounts of data to learn a specific task, leading to unreasonable costs when deploying the agent to collect data in real-world applications.
In this paper, we build from existing work that reshapes the reward function in RL by introducing a Control Lyapunov Function (CLF) to reduce the sample complexity.
We show that our method finds a policy to successfully land a quadcopter in less than half the amount of real-world data required by the state-of-the-art Soft Actor-Critic algorithm.
arXiv  Detail & Related papers  (2024-03-18T19:51:17Z) - Back to Basics: A Simple Recipe for Improving Out-of-Domain Retrieval in Dense Encoders [63.28408887247742]
We study whether training procedures can be improved to yield better generalization capabilities in the resulting models.
We recommend a simple recipe for training dense encoders: Train on MSMARCO with parameter-efficient methods, such as LoRA, and opt for using in-batch negatives unless given well-constructed hard negatives.
arXiv  Detail & Related papers  (2023-11-16T10:42:58Z) - A Neuromorphic Architecture for Reinforcement Learning from Real-Valued Observations [0.34410212782758043]
Reinforcement Learning (RL) provides a powerful framework for decision-making in complex environments.
This paper presents a novel Spiking Neural Network (SNN) architecture for solving RL problems with real-valued observations.
arXiv  Detail & Related papers  (2023-07-06T12:33:34Z) - Model-Based Reinforcement Learning with Multi-Task Offline Pretraining [59.82457030180094]
We present a model-based RL method that learns to transfer potentially useful dynamics and action demonstrations from offline data to a novel task.
The main idea is to use the world models not only as simulators for behavior learning but also as tools to measure the task relevance.
We demonstrate the advantages of our approach compared with the state-of-the-art methods in Meta-World and DeepMind Control Suite.
arXiv  Detail & Related papers  (2023-06-06T02:24:41Z) - Rethinking Population-assisted Off-policy Reinforcement Learning [7.837628433605179]
Off-policy reinforcement learning algorithms struggle with convergence to local optima due to limited exploration.
Population-based algorithms offer a natural exploration strategy, but their black-box operators are inefficient.
Recent algorithms have integrated these two methods, connecting them through a shared replay buffer.
arXiv  Detail & Related papers  (2023-05-04T15:53:00Z) - Reinforcement Learning with Partial Parametric Model Knowledge [3.3598755777055374]
We adapt reinforcement learning methods for continuous control to bridge the gap between complete ignorance and perfect knowledge of the environment.
Our method, Partial Knowledge Least Squares Policy Iteration (PLSPI), takes inspiration from both model-free RL and model-based control.
arXiv  Detail & Related papers  (2023-04-26T01:04:35Z) - Accelerated Policy Learning with Parallel Differentiable Simulation [59.665651562534755]
We present a differentiable simulator and a new policy learning algorithm (SHAC).
Our algorithm alleviates problems with local minima through a smooth critic function.
We show substantial improvements in sample efficiency and wall-clock time over state-of-the-art RL and differentiable simulation-based algorithms.
arXiv  Detail & Related papers  (2022-04-14T17:46:26Z) - Jump-Start Reinforcement Learning [68.82380421479675]
We present a meta algorithm that can use offline data, demonstrations, or a pre-existing policy to initialize an RL policy.
In particular, we propose Jump-Start Reinforcement Learning (JSRL), an algorithm that employs two policies to solve tasks.
We show via experiments that JSRL is able to significantly outperform existing imitation and reinforcement learning algorithms.
arXiv  Detail & Related papers  (2022-04-05T17:25:22Z) - Learning to Reweight Imaginary Transitions for Model-Based Reinforcement Learning [58.66067369294337]
When the model is inaccurate or biased, imaginary trajectories may be deleterious for training the action-value and policy functions.
We adaptively reweight the imaginary transitions, so as to reduce the negative effects of poorly generated trajectories.
Our method outperforms state-of-the-art model-based and model-free RL algorithms on multiple tasks.
arXiv  Detail & Related papers  (2021-04-09T03:13:35Z) - Discrete Action On-Policy Learning with Action-Value Critic [72.20609919995086]
Reinforcement learning (RL) in discrete action space is ubiquitous in real-world applications, but its complexity grows exponentially with the action-space dimension.
We construct a critic to estimate action-value functions, apply it to correlated actions, and combine these critic-estimated action values to control the variance of gradient estimation.
These efforts result in a new discrete action on-policy RL algorithm that empirically outperforms related on-policy algorithms relying on variance control techniques.
arXiv  Detail & Related papers  (2020-02-10T04:23:09Z) 
        This list is automatically generated from the titles and abstracts of the papers in this site.