Investigating the Impact of Action Representations in Policy Gradient Algorithms
- URL: http://arxiv.org/abs/2309.06921v1
- Date: Wed, 13 Sep 2023 12:41:45 GMT
- Title: Investigating the Impact of Action Representations in Policy Gradient Algorithms
- Authors: Jan Schneider, Pierre Schumacher, Daniel Häufle, Bernhard Schölkopf, Dieter Büchler
- Abstract summary: Reinforcement learning is a versatile framework for learning to solve complex real-world tasks. However, influences on the learning performance of RL algorithms are often poorly understood in practice.
- Score: 11.383263522013868
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Reinforcement learning (RL) is a versatile framework for learning to solve
complex real-world tasks. However, influences on the learning performance of RL
algorithms are often poorly understood in practice. We discuss different
analysis techniques and assess their effectiveness for investigating the impact
of action representations in RL. Our experiments demonstrate that the action
representation can significantly influence the learning performance on popular
RL benchmark tasks. The analysis results indicate that some of the performance
differences can be attributed to changes in the complexity of the optimization
landscape. Finally, we discuss open challenges of analysis techniques for RL
algorithms.
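To make the notion of an action representation concrete, here is a minimal sketch assuming a Gymnasium-style MuJoCo task: the same torque-controlled environment is exposed under a second action interface (PD-controller position targets instead of raw torques), so an identical policy gradient algorithm can be trained on both and its learning curves compared. The wrapper, gains, and the `Hopper-v4` choice are illustrative, not the paper's exact setup.

```python
# Minimal sketch, assuming a Gymnasium MuJoCo task: expose the same
# torque-controlled environment under a second action representation
# (PD position targets). Gains and the qpos/qvel slicing are illustrative.
import numpy as np
import gymnasium as gym


class PDTargetActions(gym.ActionWrapper):
    """Reinterpret policy actions as joint-position targets for a PD
    controller instead of raw torques."""

    def __init__(self, env, kp=10.0, kd=1.0):
        super().__init__(env)
        self.kp, self.kd = kp, kd

    def action(self, target_qpos):
        # Assumption: the actuated joints are the last entries of qpos/qvel
        # (true for Hopper-v4; other tasks lay out qpos differently).
        n = len(target_qpos)
        qpos = self.env.unwrapped.data.qpos[-n:]
        qvel = self.env.unwrapped.data.qvel[-n:]
        torque = self.kp * (target_qpos - qpos) - self.kd * qvel
        return np.clip(torque, self.action_space.low, self.action_space.high)


# Train the same policy gradient algorithm on both interfaces and compare
# learning curves; only the action representation differs.
env_torque = gym.make("Hopper-v4")
env_pd_targets = PDTargetActions(gym.make("Hopper-v4"))
```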
Related papers
- Is Value Functions Estimation with Classification Plug-and-play for Offline Reinforcement Learning? [1.9116784879310031]
In deep Reinforcement Learning (RL), value functions are approximated using deep neural networks and trained via mean squared error regression objectives.
Recent research has proposed an alternative approach, utilizing the cross-entropy classification objective.
Our work seeks to empirically investigate the impact of such a replacement in an offline RL setup.
arXiv Detail & Related papers (2024-06-10T14:25:11Z)
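As a concrete picture of the replacement this entry studies, here is a minimal sketch assuming PyTorch: the scalar regression target is projected onto a "two-hot" distribution over fixed value bins, and the critic is trained with cross-entropy instead of mean squared error. The bin count and value range are illustrative choices.

```python
# Minimal sketch, assuming PyTorch: replace MSE value regression with
# cross-entropy against a "two-hot" discretization of the return.
import torch
import torch.nn.functional as F

N_BINS, V_MIN, V_MAX = 51, -10.0, 10.0  # illustrative discretization
bin_centers = torch.linspace(V_MIN, V_MAX, N_BINS)

def two_hot(returns):
    """Project scalar returns onto the two nearest value bins."""
    pos = (returns.clamp(V_MIN, V_MAX) - V_MIN) / (V_MAX - V_MIN) * (N_BINS - 1)
    lo = pos.floor().long().clamp(max=N_BINS - 2)
    w_hi = (pos - lo.float()).unsqueeze(1)
    dist = torch.zeros(returns.shape[0], N_BINS)
    dist.scatter_(1, lo.unsqueeze(1), 1.0 - w_hi)
    dist.scatter_(1, lo.unsqueeze(1) + 1, w_hi)
    return dist

def value_loss(logits, returns):
    # Cross-entropy classification objective in place of F.mse_loss.
    return -(two_hot(returns) * F.log_softmax(logits, dim=-1)).sum(-1).mean()

def value_estimate(logits):
    # Read the scalar value back out as the expected bin center.
    return (F.softmax(logits, dim=-1) * bin_centers).sum(-1)
```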
- Provable Representation with Efficient Planning for Partial Observable Reinforcement Learning [74.67655210734338]
In most real-world reinforcement learning applications, state information is only partially observable, which breaks the Markov decision process assumption.
We develop a representation-based perspective that leads to a coherent framework and tractable algorithmic approach for practical reinforcement learning from partial observations.
We empirically demonstrate the proposed algorithm can surpass state-of-the-art performance with partial observations across various benchmarks.
arXiv Detail & Related papers (2023-11-20T23:56:58Z)
- Reinforcement Learning-assisted Evolutionary Algorithm: A Survey and Research Opportunities [63.258517066104446]
Reinforcement learning integrated as a component of evolutionary algorithms has demonstrated superior performance in recent years.
We discuss the RL-EA integration method, the RL-assisted strategy adopted by RL-EA, and its applications according to the existing literature.
In the applications of RL-EA section, we also demonstrate the excellent performance of RL-EA on several benchmarks and a range of public datasets.
arXiv Detail & Related papers (2023-08-25T15:06:05Z)
- Latent Variable Representation for Reinforcement Learning [131.03944557979725]
It remains unclear theoretically and empirically how latent variable models may facilitate learning, planning, and exploration to improve the sample efficiency of model-based reinforcement learning.
We provide a representation view of the latent variable models for state-action value functions, which enables both a tractable variational learning algorithm and an effective implementation of the optimism/pessimism principle.
In particular, we propose a computationally efficient planning algorithm with UCB exploration by incorporating kernel embeddings of latent variable models.
arXiv Detail & Related papers (2022-12-17T00:26:31Z)
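A hedged sketch with NumPy of the optimism principle this entry implements: a UCB bonus computed from a regularized feature covariance. The features `phi_sa` stand in for the paper's kernel embeddings of latent variable models and are hypothetical.

```python
# Hedged sketch of UCB exploration via a feature-covariance bonus.
import numpy as np

def ucb_values(q_values, phi_sa, cov_inv, beta=1.0):
    """Optimistic action values: Q(s,a) + beta * ||phi(s,a)||_{cov^-1}."""
    bonus = np.sqrt(np.einsum("ad,de,ae->a", phi_sa, cov_inv, phi_sa))
    return q_values + beta * bonus

# Toy usage: pick the optimistic action among 4 candidates with 8-dim features.
rng = np.random.default_rng(0)
phi_sa = rng.standard_normal((4, 8))
cov_inv = np.linalg.inv(np.eye(8) + phi_sa.T @ phi_sa)  # ridge-regularized
best_action = int(np.argmax(ucb_values(np.zeros(4), phi_sa, cov_inv)))
```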
- Improved Context-Based Offline Meta-RL with Attention and Contrastive Learning [1.3106063755117399]
We improve upon one of the SOTA OMRL algorithms, FOCAL, by incorporating an intra-task attention mechanism and inter-task contrastive learning objectives.
Theoretical analysis and experiments are presented to demonstrate the superior performance, efficiency, and robustness of our end-to-end and model-free method.
arXiv Detail & Related papers (2021-02-22T05:05:16Z)
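The inter-task contrastive objective in the entry above can be pictured with a short PyTorch sketch: transition embeddings that share a task ID are treated as positives in an InfoNCE-style loss. This is a generic supervised-contrastive form, not FOCAL's exact objective.

```python
# Sketch of an inter-task contrastive loss over task embeddings.
import torch
import torch.nn.functional as F

def inter_task_contrastive_loss(z, task_ids, temperature=0.1):
    """z: (batch, dim) task embeddings; task_ids: (batch,) integer labels."""
    z = F.normalize(z, dim=-1)
    sim = z @ z.t() / temperature                  # pairwise similarities
    not_self = ~torch.eye(len(z), dtype=torch.bool)
    # Log-probability of each pair under a softmax over non-self pairs.
    log_p = sim - torch.logsumexp(
        sim.masked_fill(~not_self, float("-inf")), dim=1, keepdim=True)
    positives = (task_ids.unsqueeze(0) == task_ids.unsqueeze(1)) & not_self
    return -(log_p * positives).sum() / positives.sum().clamp(min=1)
```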
- How to Make Deep RL Work in Practice [15.740760669623876]
Reported results of state-of-the-art algorithms are often difficult to reproduce.
We make suggestions which of those techniques to use by default and highlight areas that could benefit from a solution specifically tailored to RL.
arXiv Detail & Related papers (2020-10-25T10:37:54Z)
- What Matters In On-Policy Reinforcement Learning? A Large-Scale Empirical Study [50.79125250286453]
On-policy reinforcement learning (RL) has been successfully applied to many different continuous control tasks.
But state-of-the-art implementations take numerous low- and high-level design decisions that strongly affect the performance of the resulting agents.
These choices are usually not extensively discussed in the literature, leading to discrepancy between published descriptions of algorithms and their implementations.
We implement >50 such "choices" in a unified on-policy RL framework, allowing us to investigate their impact in a large-scale empirical study.
arXiv Detail & Related papers (2020-06-10T17:59:03Z)
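One of the low-level choices such a study toggles, advantage normalization, fits in a few lines. A sketch with NumPy, not the study's actual framework:

```python
# Per-batch advantage normalization, a common on-policy design choice.
import numpy as np

def advantages(returns, values, normalize=True, eps=1e-8):
    adv = returns - values
    if normalize:  # standardize within the batch before the policy update
        adv = (adv - adv.mean()) / (adv.std() + eps)
    return adv
```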
- Implementation Matters in Deep Policy Gradients: A Case Study on PPO and TRPO [90.90009491366273]
We study the roots of algorithmic progress in deep policy gradient algorithms through a case study on two popular algorithms.
Specifically, we investigate the consequences of "code-level optimizations": algorithm augmentations found only in implementations or described as auxiliary details to the core algorithm.
Our results show that they (a) are responsible for most of PPO's gain in cumulative reward over TRPO, and (b) fundamentally change how RL methods function.
arXiv Detail & Related papers (2020-05-25T16:24:59Z)
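For a concrete example of a "code-level optimization" of the kind this case study isolates, here is PPO's clipped value loss in PyTorch; the 0.2 clip range mirrors common implementations rather than any single codebase.

```python
# PPO's clipped value loss, an implementation detail absent from the
# original PPO paper but present in reference codebases.
import torch

def clipped_value_loss(v_new, v_old, returns, clip_eps=0.2):
    """Pessimistic max of unclipped and clipped squared value errors."""
    v_clipped = v_old + (v_new - v_old).clamp(-clip_eps, clip_eps)
    return torch.max((v_new - returns) ** 2, (v_clipped - returns) ** 2).mean()
```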
- Discrete Action On-Policy Learning with Action-Value Critic [72.20609919995086]
Reinforcement learning (RL) in discrete action space is ubiquitous in real-world applications, but its complexity grows exponentially with the action-space dimension.
We construct a critic to estimate action-value functions, apply it to correlated actions, and combine these critic-estimated action values to control the variance of gradient estimation.
These efforts result in a new discrete action on-policy RL algorithm that empirically outperforms related on-policy algorithms relying on variance control techniques.
arXiv Detail & Related papers (2020-02-10T04:23:09Z)
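The variance-control idea in the final entry can be sketched as an "all-action" policy gradient, assuming PyTorch: because the action set is discrete, the expectation over actions is computed exactly from critic estimates instead of being sampled, removing that source of gradient variance. The estimator below is a simplified stand-in for the paper's method.

```python
# Simplified all-action policy gradient over a discrete action set.
import torch

def all_action_pg_loss(logits, q_estimates):
    """logits: (batch, num_actions); q_estimates: critic values per action.
    The gradient of -sum_a pi(a|s) * Q(s,a) is the exact policy gradient
    over the action set, avoiding single-sample variance."""
    probs = torch.softmax(logits, dim=-1)
    return -(probs * q_estimates.detach()).sum(-1).mean()
```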