Offline Reinforcement Learning With Combinatorial Action Spaces
- URL: http://arxiv.org/abs/2410.21151v1
- Date: Mon, 28 Oct 2024 15:49:46 GMT
- Title: Offline Reinforcement Learning With Combinatorial Action Spaces
- Authors: Matthew Landers, Taylor W. Killian, Hugo Barnes, Thomas Hartvigsen, Afsaneh Doryab
- Abstract summary: Reinforcement learning problems often involve large action spaces arising from the simultaneous execution of multiple sub-actions.
We propose Branch Value Estimation (BVE), which effectively captures sub-action dependencies and scales to large spaces by learning to evaluate only a small subset of actions at each timestep.
Our experiments show that BVE outperforms state-of-the-art methods across a range of action space sizes.
- Score: 12.904199719046968
- Abstract: Reinforcement learning problems often involve large action spaces arising from the simultaneous execution of multiple sub-actions, resulting in combinatorial action spaces. Learning in combinatorial action spaces is difficult due to the exponential growth in action space size with the number of sub-actions and the dependencies among these sub-actions. In offline settings, this challenge is compounded by limited and suboptimal data. Current methods for offline learning in combinatorial spaces simplify the problem by assuming sub-action independence. We propose Branch Value Estimation (BVE), which effectively captures sub-action dependencies and scales to large combinatorial spaces by learning to evaluate only a small subset of actions at each timestep. Our experiments show that BVE outperforms state-of-the-art methods across a range of action space sizes.
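As a rough illustration of the branch-value idea, the sketch below greedily assembles a combinatorial action one sub-action at a time, scoring only a handful of candidates per step instead of the full joint space; `branch_value` is a hypothetical stand-in for the learned network, not the paper's implementation.

```python
# Hypothetical sketch of branch-value-style greedy selection over sub-actions.
import numpy as np

rng = np.random.default_rng(0)
N_SUB_ACTIONS, CHOICES_PER_SUB = 8, 4  # 4**8 = 65,536 joint actions

def branch_value(state, prefix, candidate):
    """Stand-in for the learned scorer of a partial action; random here."""
    return rng.normal()

def select_action(state):
    prefix = []
    for _ in range(N_SUB_ACTIONS):
        # score only CHOICES_PER_SUB candidates, conditioned on choices so far
        scores = [branch_value(state, prefix, c) for c in range(CHOICES_PER_SUB)]
        prefix.append(int(np.argmax(scores)))
    return prefix  # 8 * 4 = 32 evaluations instead of 65,536

print(select_action(state=np.zeros(3)))
```

Because each choice is conditioned on the prefix of sub-actions already selected, dependencies among sub-actions can be expressed, unlike factored methods that score each sub-action independently.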
Related papers
- Solving Continual Offline RL through Selective Weights Activation on Aligned Spaces [52.649077293256795]
Continual offline reinforcement learning (CORL) has shown impressive capability in diffusion-based lifelong learning systems.
We propose the Vector-Quantized Continual Diffuser (VQ-CD) to break the barrier of differing spaces across tasks.
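Vector quantization itself is standard; the generic sketch below (codebook size and names invented) shows the nearest-code lookup that VQ-style space alignment builds on, not VQ-CD's actual architecture.

```python
import numpy as np

rng = np.random.default_rng(0)
codebook = rng.normal(size=(16, 4))  # 16 learned codes of dimension 4 (hypothetical)

def quantize(z):
    """Map a continuous embedding to its nearest codebook entry."""
    idx = int(np.argmin(np.linalg.norm(codebook - z, axis=1)))
    return idx, codebook[idx]

idx, code = quantize(rng.normal(size=4))
print(idx, code)
```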
arXiv Detail & Related papers (2024-10-21T07:13:45Z) - Concrete Subspace Learning based Interference Elimination for Multi-task Model Fusion [86.6191592951269]
Merging models fine-tuned from a common, extensively pretrained large model but specialized for different tasks has been demonstrated as a cheap and scalable strategy to construct a multitask model that performs well across diverse tasks.
We propose the Concrete (CONtinuous relaxation of disCRETE) subspace learning method to identify a common low-dimensional subspace and use its shared information to tackle the interference problem without sacrificing performance.
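A minimal sketch of a binary Concrete (Gumbel-sigmoid) relaxation, assuming a mask over parameter groups used to blend two fine-tuned task vectors; the logits, temperature, and merge rule here are illustrative, not the paper's exact procedure.

```python
import numpy as np

rng = np.random.default_rng(0)

def concrete_mask(logits, temperature=0.5):
    """Continuous relaxation of a binary mask via the Concrete distribution."""
    u = rng.uniform(1e-6, 1 - 1e-6, size=logits.shape)
    noise = np.log(u) - np.log(1 - u)  # logistic noise
    return 1.0 / (1.0 + np.exp(-(logits + noise) / temperature))

logits = np.zeros(5)  # one learnable logit per parameter group (hypothetical)
mask = concrete_mask(logits)
theta_a, theta_b = rng.normal(size=5), rng.normal(size=5)
merged = mask * theta_a + (1 - mask) * theta_b  # soft-masked merge of two task vectors
print(merged)
```

Because the mask is differentiable, its logits can be trained end-to-end before being discretized.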
arXiv Detail & Related papers (2023-12-11T07:24:54Z) - Action-Quantized Offline Reinforcement Learning for Robotic Skill Learning [68.16998247593209]
The offline reinforcement learning (RL) paradigm provides a recipe to convert static behavior datasets into policies that can perform better than the policy that collected the data.
In this paper, we propose an adaptive scheme for action quantization.
We show that several state-of-the-art offline RL methods such as IQL, CQL, and BRAC improve in performance on benchmarks when combined with our proposed discretization scheme.
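The paper's quantization scheme is adaptive and learned; as a simpler stand-in, the sketch below discretizes dataset actions with plain k-means, which conveys the quantize-then-run-discrete-RL recipe without matching the proposed method.

```python
import numpy as np
from scipy.cluster.vq import kmeans2

rng = np.random.default_rng(0)
dataset_actions = rng.normal(size=(1000, 6))  # continuous actions from an offline dataset
centers, labels = kmeans2(dataset_actions, 32, minit="++", seed=0)  # 32 discrete bins

def discretize(a):
    """Replace a continuous action with the index of its nearest learned center."""
    return int(np.argmin(np.linalg.norm(centers - a, axis=1)))

print(discretize(dataset_actions[0]))
```

A discrete-action offline RL method (e.g., IQL or CQL over the bin indices) can then be run on the quantized dataset.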
arXiv Detail & Related papers (2023-10-18T06:07:10Z) - Dynamic Interval Restrictions on Action Spaces in Deep Reinforcement Learning for Obstacle Avoidance [0.0]
In this thesis, we consider the problem of interval restrictions as they occur in pathfinding with dynamic obstacles.
Recent research learns under strong assumptions on the number of intervals and is limited to convex subsets.
We propose two approaches that are independent of the state of the environment by extending parameterized reinforcement learning and ConstraintNet to handle an arbitrary number of intervals.
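One naive way to respect an arbitrary, possibly non-convex set of allowed intervals is to project a raw action onto the nearest feasible point, as sketched below; the paper's parameterized-RL and ConstraintNet extensions are more involved than this.

```python
def project_to_intervals(a, intervals):
    """Clip a scalar action into the closest point of several allowed intervals."""
    return min(
        (max(lo, min(a, hi)) for lo, hi in intervals),
        key=lambda x: abs(x - a),
    )

allowed = [(-1.0, -0.4), (0.2, 0.5), (0.8, 1.0)]  # dynamic, non-convex feasible set
print(project_to_intervals(0.6, allowed))         # -> 0.5
```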
arXiv Detail & Related papers (2023-06-13T09:13:13Z) - Dynamic Neighborhood Construction for Structured Large Discrete Action Spaces [2.285821277711785]
Large discrete action spaces (LDAS) remain a central challenge in reinforcement learning.
Existing solution approaches can handle unstructured LDAS with up to a few million actions.
We propose Dynamic Neighborhood Construction (DNC), a novel exploitation paradigm for structured LDAS (SLDAS).
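A hedged sketch of the neighborhood idea: round a continuous proxy action to the nearest discrete action, then locally search its neighbors under a critic; `q_value` is a toy stand-in, and the perturbation scheme is simplified relative to the paper's.

```python
import numpy as np

def q_value(state, action):
    """Toy stand-in for a learned Q-network."""
    return -float(np.sum((action - 3) ** 2))

def dnc_step(state, proxy, k=1):
    """Round a continuous proxy action, then search its discrete neighborhood."""
    base = np.rint(proxy).astype(int)
    best, best_q = base, q_value(state, base)
    for i in range(len(base)):
        for delta in (-k, k):
            cand = base.copy()
            cand[i] += delta
            q = q_value(state, cand)
            if q > best_q:
                best, best_q = cand, q
    return best

print(dnc_step(state=None, proxy=np.array([2.6, 3.4, 1.2])))  # -> [3 3 2]
```

The search cost scales with the neighborhood size rather than the full action space.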
arXiv Detail & Related papers (2023-05-31T14:26:14Z) - Solving Continuous Control via Q-learning [54.05120662838286]
We show that a simple modification of deep Q-learning largely alleviates issues with actor-critic methods.
By combining bang-bang action discretization with value decomposition, framing single-agent control as cooperative multi-agent reinforcement learning (MARL), this simple critic-only approach matches the performance of state-of-the-art continuous actor-critic methods.
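A minimal sketch of the decomposed, critic-only recipe: each action dimension gets its own two-way Q head over the bang-bang extremes, and dimensions act like cooperating agents; `per_dim_q` is a random stand-in for the learned critic.

```python
import numpy as np

rng = np.random.default_rng(0)
N_DIMS = 3
BANG = [-1.0, 1.0]  # bang-bang: only the extremes of each action dimension

def per_dim_q(state, dim):
    """Stand-in for a decomposed critic: one 2-way Q head per action dimension."""
    return rng.normal(size=2)  # Q(state, a_dim=-1), Q(state, a_dim=+1)

def act(state):
    # each dimension chooses independently, as in cooperative MARL decompositions
    return np.array([BANG[int(np.argmax(per_dim_q(state, d)))] for d in range(N_DIMS)])

print(act(state=np.zeros(4)))
```

The decomposition keeps the argmax tractable: 2 evaluations per dimension instead of 2**N_DIMS joint actions.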
arXiv Detail & Related papers (2022-10-22T22:55:50Z) - Hierarchical Compositional Representations for Few-shot Action Recognition [51.288829293306335]
We propose a novel hierarchical compositional representations (HCR) learning approach for few-shot action recognition.
We divide a complicated action into several sub-actions by carefully designed hierarchical clustering.
We also adopt the Earth Mover's Distance in the transportation problem to measure the similarity between video samples in terms of sub-action representations.
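With equal-size sub-action sets and uniform weights, the Earth Mover's Distance reduces to an optimal assignment, which the sketch below computes with SciPy; the paper's transportation formulation may weight sub-actions non-uniformly.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

rng = np.random.default_rng(0)
# two videos, each summarized as 5 sub-action feature vectors (hypothetical)
video_a = rng.normal(size=(5, 8))
video_b = rng.normal(size=(5, 8))

# pairwise transport costs between sub-action representations
cost = np.linalg.norm(video_a[:, None, :] - video_b[None, :, :], axis=-1)
rows, cols = linear_sum_assignment(cost)  # optimal transport with uniform weights
emd = cost[rows, cols].mean()
print(f"sub-action EMD: {emd:.3f}")
```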
arXiv Detail & Related papers (2022-08-19T16:16:59Z) - Deep Multi-Agent Reinforcement Learning with Hybrid Action Spaces based on Maximum Entropy [0.0]
We propose Deep Multi-Agent Hybrid Soft Actor-Critic (MAHSAC) to handle multi-agent problems with hybrid action spaces.
This algorithm follows the centralized training with decentralized execution (CTDE) paradigm and extends the Soft Actor-Critic (SAC) algorithm to handle hybrid action space problems.
Our experiments run on a simple multi-agent particle world with continuous observations and a discrete action space, along with some basic simulated physics.
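A hybrid action head can be pictured as a discrete head alongside a squashed continuous head, as in the stand-in sketch below; the distributions here are random placeholders, not MAHSAC's trained networks.

```python
import numpy as np

rng = np.random.default_rng(0)

def hybrid_policy(state, n_discrete=4, cont_dim=2):
    """Stand-in policy head emitting a discrete choice plus continuous parameters."""
    logits = rng.normal(size=n_discrete)           # discrete head
    probs = np.exp(logits) / np.exp(logits).sum()  # softmax
    discrete = int(rng.choice(n_discrete, p=probs))
    mean, log_std = rng.normal(size=cont_dim), np.full(cont_dim, -1.0)
    continuous = np.tanh(mean + np.exp(log_std) * rng.normal(size=cont_dim))  # SAC-style squash
    return discrete, continuous

print(hybrid_policy(state=np.zeros(3)))
```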
arXiv Detail & Related papers (2022-06-10T13:52:59Z) - Generalising Discrete Action Spaces with Conditional Action Trees [0.0]
We introduce Conditional Action Trees with two main objectives.
We present several proof-of-concept experiments ranging from environments with discrete action spaces to those with large action spaces commonly found in RTS-style games.
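The core structure can be pictured as a tree whose branches gate which sub-action arguments are valid given earlier choices; the verbs and arguments below are invented purely for illustration.

```python
# Hypothetical conditional action tree: valid arguments depend on the chosen verb.
ACTION_TREE = {
    "move":   {"direction": ["north", "south", "east", "west"]},
    "attack": {"target": ["unit_1", "unit_2"]},
    "build":  {"structure": ["barracks", "farm"]},
}

def enumerate_actions(tree=ACTION_TREE):
    """Walk the tree; only combinations the tree licenses are produced."""
    for verb, params in tree.items():
        for name, values in params.items():
            for value in values:
                yield (verb, {name: value})

for action in enumerate_actions():
    print(action)
```

Invalid combinations (e.g., "move" with a build target) simply never appear, shrinking the effective action space.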
arXiv Detail & Related papers (2021-04-15T08:10:18Z) - LASER: Learning a Latent Action Space for Efficient Reinforcement Learning [41.53297694894669]
We present LASER, a method to learn latent action spaces for efficient reinforcement learning.
We show improved sample efficiency over the original action space, resulting from better alignment of the action space with the task space, which we observe in visualizations of the learned action-space manifold.
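LASER learns its latent space with a trained model; purely to illustrate the latent-to-raw mapping, the sketch below substitutes a PCA-style linear "decoder" fit to dataset actions.

```python
import numpy as np

rng = np.random.default_rng(0)
actions = rng.normal(size=(500, 7))  # raw 7-D actions from collected experience

# PCA via SVD as a stand-in for a learned latent action model
mean = actions.mean(axis=0)
_, _, vt = np.linalg.svd(actions - mean, full_matrices=False)
decoder = vt[:2]  # 2-D latent action space

def decode(z):
    """Map a latent action back to the raw action space."""
    return z @ decoder + mean

print(decode(np.array([0.5, -0.2])))
```

The RL agent then explores in the low-dimensional latent space while the decoder produces executable raw actions.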
arXiv Detail & Related papers (2021-03-29T17:40:02Z) - Discrete Action On-Policy Learning with Action-Value Critic [72.20609919995086]
Reinforcement learning (RL) in discrete action space is ubiquitous in real-world applications, but its complexity grows exponentially with the action-space dimension.
We construct a critic to estimate action-value functions, apply it to correlated actions, and combine these critic-estimated action values to control the variance of gradient estimation.
These efforts result in a new discrete action on-policy RL algorithm that empirically outperforms related on-policy algorithms relying on variance control techniques.
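When the critic supplies Q estimates for every discrete action, the policy gradient of a softmax policy can be computed in exact expectation, eliminating sampling variance over actions; the sketch below works that gradient out with random stand-in values.

```python
import numpy as np

rng = np.random.default_rng(0)
N_ACTIONS = 5
logits = rng.normal(size=N_ACTIONS)
probs = np.exp(logits) / np.exp(logits).sum()  # softmax policy
q = rng.normal(size=N_ACTIONS)                 # stand-in critic values for every action

# exact expectation over the critic's action values removes action-sampling variance
baseline = probs @ q
grad_logits = probs * (q - baseline)  # d E_pi[Q] / d logits for a softmax policy
print(grad_logits)
```

The identity used is d/d logit_i of sum_a pi_a Q_a = pi_i (Q_i - E_pi[Q]), so the critic values double as an optimal baseline.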
arXiv Detail & Related papers (2020-02-10T04:23:09Z)