Continuous-Discrete Reinforcement Learning for Hybrid Control in
Robotics
- URL: http://arxiv.org/abs/2001.00449v1
- Date: Thu, 2 Jan 2020 14:19:33 GMT
- Title: Continuous-Discrete Reinforcement Learning for Hybrid Control in
Robotics
- Authors: Michael Neunert, Abbas Abdolmaleki, Markus Wulfmeier, Thomas Lampe,
Jost Tobias Springenberg, Roland Hafner, Francesco Romano, Jonas Buchli,
Nicolas Heess, Martin Riedmiller
- Abstract summary: We propose to treat hybrid problems in their 'native' form by solving them with hybrid reinforcement learning.
In our experiments, we first demonstrate that the proposed approach efficiently solves such hybrid reinforcement learning problems.
We then show, both in simulation and on robotic hardware, the benefits of removing possibly imperfect expert-designed heuristics.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Many real-world control problems involve both discrete decision variables -
such as the choice of control modes, gear switching or digital outputs - as
well as continuous decision variables - such as velocity setpoints, control
gains or analogue outputs. However, when defining the corresponding optimal
control or reinforcement learning problem, it is commonly approximated with
fully continuous or fully discrete action spaces. These simplifications aim at
tailoring the problem to a particular algorithm or solver which may only
support one type of action space. Alternatively, expert heuristics are used to
remove discrete actions from an otherwise continuous space. In contrast, we
propose to treat hybrid problems in their 'native' form by solving them with
hybrid reinforcement learning, which optimizes for discrete and continuous
actions simultaneously. In our experiments, we first demonstrate that the
proposed approach efficiently solves such natively hybrid reinforcement
learning problems. We then show, both in simulation and on robotic hardware,
the benefits of removing possibly imperfect expert-designed heuristics. Lastly,
hybrid reinforcement learning encourages us to rethink problem definitions. We
propose reformulating control problems, e.g. by adding meta actions, to improve
exploration or reduce mechanical wear and tear.
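To make the 'native' hybrid formulation concrete, below is a minimal sketch of a factored hybrid policy: a categorical head for the discrete action and a diagonal Gaussian head for the continuous one, sampled jointly. This is an illustration only, not the paper's actual agent (which builds on MPO); the class, parameters, and shapes are hypothetical.

```python
import numpy as np

class HybridPolicy:
    """Toy factored policy over a hybrid action space: a categorical
    distribution for the discrete action (e.g. control mode or gear) and
    a diagonal Gaussian for the continuous one (e.g. velocity setpoints).
    Hypothetical sketch, not the paper's MPO-based implementation."""

    def __init__(self, n_modes, n_cont, seed=0):
        self.rng = np.random.default_rng(seed)
        self.logits = np.zeros(n_modes)   # discrete head parameters
        self.mean = np.zeros(n_cont)      # continuous head parameters
        self.log_std = np.zeros(n_cont)

    def sample(self):
        # Discrete part: softmax over logits, then a categorical draw.
        probs = np.exp(self.logits - self.logits.max())
        probs /= probs.sum()
        mode = int(self.rng.choice(len(probs), p=probs))
        # Continuous part: Gaussian draw; both heads would be optimized jointly.
        u = self.mean + np.exp(self.log_std) * self.rng.standard_normal(self.mean.shape)
        return mode, u

policy = HybridPolicy(n_modes=3, n_cont=2)
print(policy.sample())   # e.g. (1, array([...])): one hybrid action
```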
Related papers
- Growing Q-Networks: Solving Continuous Control Tasks with Adaptive Control Resolution [51.83951489847344]
In robotics applications, smooth control signals are commonly preferred to reduce system wear and energy consumption.
In this work, we aim to bridge this performance gap by growing discrete action spaces from coarse to fine control resolution.
Our work indicates that adaptive control resolution, combined with value decomposition, yields simple critic-only algorithms with surprisingly strong performance on continuous control tasks.
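As a rough illustration of growing a discrete action space from coarse to fine, the snippet below doubles the per-dimension control resolution at each level; the growth schedule and the `2**level + 1` sizing are assumptions made for this sketch, not the paper's exact scheme.

```python
import numpy as np

def grow_action_grid(low, high, level):
    """Discrete action set for one continuous dimension whose resolution
    doubles with each growth level: 2 actions at level 0 (bang-bang),
    3 at level 1, 5 at level 2, ... (hypothetical schedule)."""
    return np.linspace(low, high, 2 ** level + 1)

for level in range(4):
    print(level, grow_action_grid(-1.0, 1.0, level))
```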
arXiv Detail & Related papers (2024-04-05T17:58:37Z)
- Action-Quantized Offline Reinforcement Learning for Robotic Skill Learning [68.16998247593209]
The offline reinforcement learning (RL) paradigm provides a recipe to convert static behavior datasets into policies that can perform better than the policy that collected the data.
In this paper, we propose an adaptive scheme for action quantization.
We show that several state-of-the-art offline RL methods such as IQL, CQL, and BRAC improve in performance on benchmarks when combined with our proposed discretization scheme.
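The paper's quantization scheme is learned and adaptive; as a hedged stand-in, the sketch below builds a fixed codebook with k-means over the dataset's actions, so that a discrete-action offline method can operate on an originally continuous action space. The dataset, codebook size, and `quantize` helper are hypothetical.

```python
import numpy as np
from sklearn.cluster import KMeans

# Hypothetical offline dataset of continuous actions, shape (N, action_dim).
actions = np.random.default_rng(0).uniform(-1, 1, size=(10_000, 2))

# Data-driven codebook: each cluster centre becomes one discrete action,
# so a discrete-action critic can be trained on top of this space.
codebook = KMeans(n_clusters=16, n_init=10, random_state=0).fit(actions)

def quantize(a):
    """Map a continuous action to the index of its nearest codebook entry."""
    return int(codebook.predict(a.reshape(1, -1))[0])

print(quantize(np.array([0.3, -0.7])))
```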
arXiv Detail & Related papers (2023-10-18T06:07:10Z)
- Latent Exploration for Reinforcement Learning [87.42776741119653]
In Reinforcement Learning, agents learn policies by exploring and interacting with the environment.
We propose LATent TIme-Correlated Exploration (Lattice), a method to inject temporally-correlated noise into the latent state of the policy network.
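To illustrate what temporally correlated exploration noise looks like (without claiming this is Lattice's exact mechanism), the sketch below uses an Ornstein-Uhlenbeck-style process and adds it to a hypothetical latent vector before it would be decoded into an action.

```python
import numpy as np

class LatentOUNoise:
    """Ornstein-Uhlenbeck-style noise: a rough stand-in for injecting
    temporally correlated perturbations into a policy's latent state."""

    def __init__(self, dim, theta=0.15, sigma=0.2, seed=0):
        self.theta, self.sigma = theta, sigma
        self.eps = np.zeros(dim)
        self.rng = np.random.default_rng(seed)

    def step(self):
        # Each call is correlated with the previous one, unlike i.i.d. noise.
        self.eps += -self.theta * self.eps + self.sigma * self.rng.standard_normal(self.eps.shape)
        return self.eps

noise = LatentOUNoise(dim=8)
latent = np.zeros(8)                    # hypothetical latent activations
for t in range(3):
    perturbed = latent + noise.step()   # perturb latent, then decode to action
    print(perturbed[:3])
```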
arXiv Detail & Related papers (2023-05-31T17:40:43Z)
- Mixed-Integer Optimal Control via Reinforcement Learning: A Case Study on Hybrid Electric Vehicle Energy Management [2.0762193863564926]
This paper proposes a novel hybrid-action reinforcement learning (HARL) algorithm, twin delayed deep deterministic actor-Q (TD3AQ) for optimal control problems.
TD3AQ combines the advantages of both actor-critic and Q-learning methods, and can handle the discrete and continuous action spaces simultaneously.
The proposed algorithm is evaluated on a plug-in hybrid electric vehicle (PHEV) energy management problem.
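One hedged reading of the hybrid-action step: the actor proposes the continuous action, and a Q-head scores every discrete option given that proposal. The sketch below encodes that interplay with toy stand-in networks; it is not the published TD3AQ implementation.

```python
import numpy as np

def td3aq_act(state, actor, q_net, n_discrete):
    """Hedged sketch of one hybrid-action decision in the spirit of TD3AQ:
    the actor handles the continuous part, a Q-head the discrete part."""
    u = actor(state)                                   # continuous proposal
    q_values = [q_net(state, k, u) for k in range(n_discrete)]
    k = int(np.argmax(q_values))                       # greedy discrete choice
    return k, u

# Hypothetical stand-ins for the learned networks.
actor = lambda s: np.tanh(s[:2])
q_net = lambda s, k, u: -abs(k - 1) + float(u.sum())
print(td3aq_act(np.array([0.5, -0.2, 0.1]), actor, q_net, n_discrete=3))
```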
arXiv Detail & Related papers (2023-05-02T14:42:21Z)
- Scalable Task-Driven Robotic Swarm Control via Collision Avoidance and Learning Mean-Field Control [23.494528616672024]
We use state-of-the-art mean-field control techniques to convert many-agent swarm control into classical single-agent control of distributions.
Here, we combine collision avoidance and learning of mean-field control into a unified framework for tractably designing intelligent robotic swarm behavior.
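The core mean-field idea, reduced to a sketch: replace the N-agent state with the empirical distribution of agents, so the controller's input no longer grows with N. The 1-D histogram below is an illustrative stand-in for the paper's mean-field machinery, not its actual formulation.

```python
import numpy as np

def mean_field_state(positions, bins=10, low=0.0, high=1.0):
    """Collapse an N-agent swarm into a single mean-field observation:
    a normalized histogram of agent positions. A controller conditioned
    on this density does not scale with the number of agents."""
    hist, _ = np.histogram(positions, bins=bins, range=(low, high))
    return hist / len(positions)

agents = np.random.default_rng(0).uniform(0, 1, size=1000)  # hypothetical 1-D swarm
print(mean_field_state(agents))
```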
arXiv Detail & Related papers (2022-09-15T16:15:04Z)
- Gradient Backpropagation Through Combinatorial Algorithms: Identity with Projection Works [20.324159725851235]
A meaningful replacement for the zero or undefined gradients of such solvers is crucial for effective gradient-based learning.
We propose a principled approach to exploit the geometry of the discrete solution space to treat the solver as a negative identity on the backward pass.
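A minimal sketch of that backward rule in PyTorch autograd, assuming a toy argmax "solver": the forward pass computes the usual discrete solution, and the backward pass returns the negated incoming gradient instead of the true (zero or undefined) derivative. The projection step mentioned in the paper's title is omitted here for brevity.

```python
import torch

class NegativeIdentitySolver(torch.autograd.Function):
    """Wrap a non-differentiable combinatorial solver and, on the backward
    pass, treat it as a negative identity (sketch of the paper's rule)."""

    @staticmethod
    def forward(ctx, scores):
        # Toy 'solver': pick the highest-scoring item as a one-hot solution.
        y = torch.zeros_like(scores)
        y[scores.argmax()] = 1.0
        return y

    @staticmethod
    def backward(ctx, grad_output):
        # True gradient is zero/undefined; replace it with -I * grad.
        return -grad_output

scores = torch.tensor([0.2, 1.5, -0.3], requires_grad=True)
y = NegativeIdentitySolver.apply(scores)
loss = (y * torch.tensor([1.0, 0.0, 2.0])).sum()
loss.backward()
print(scores.grad)   # -[1., 0., 2.]: informative despite the piecewise-constant solver
```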
arXiv Detail & Related papers (2022-05-30T16:17:09Z)
- Learning Solution Manifolds for Control Problems via Energy Minimization [32.59818752168615]
A variety of control tasks are commonly formulated as energy minimization problems.
Numerical solutions to such problems are well-established, but are often too slow to be used directly in real-time applications.
We propose an alternative to behavioral cloning (BC) that is efficient and numerically robust.
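One way to read this, as a sketch rather than the paper's exact formulation: instead of cloning precomputed minimizers, train the network directly on the task energy, so its output is pushed toward the minimizer for each problem instance. The quadratic energy below is hypothetical.

```python
import torch

def energy(x, u):
    """Hypothetical task energy E(x, u); its minimizer in u is sin(x)."""
    return ((u - torch.sin(x)) ** 2).sum(dim=-1)

net = torch.nn.Sequential(torch.nn.Linear(1, 32), torch.nn.Tanh(), torch.nn.Linear(32, 1))
opt = torch.optim.Adam(net.parameters(), lr=1e-2)

for step in range(200):
    x = torch.rand(64, 1) * 6.0              # sampled problem instances
    loss = energy(x, net(x)).mean()          # minimize E directly, no BC targets
    opt.zero_grad()
    loss.backward()
    opt.step()

print(float(loss))   # net(x) approaches argmin_u E(x, u) = sin(x)
```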
arXiv Detail & Related papers (2022-03-07T14:28:57Z)
- Trajectory Tracking of Underactuated Sea Vessels With Uncertain Dynamics: An Integral Reinforcement Learning Approach [2.064612766965483]
An online machine learning mechanism based on integral reinforcement learning is proposed to find a solution for a class of nonlinear tracking problems.
The solution is implemented using an online value iteration process, realized by means of adaptive critics and gradient descent.
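As a rough stand-in for those adaptive-critic updates, the sketch below performs gradient-descent steps on the temporal-difference residual of a linear critic (a plain semi-gradient TD(0) update); the features and reward are hypothetical and the paper's integral formulation is more involved.

```python
import numpy as np

def critic_gd_step(w, phi_t, phi_next, reward, gamma=0.99, lr=1e-2):
    """One gradient-descent step on the TD residual of a linear critic
    V(s) = w . phi(s): a semi-gradient TD(0) update used here as a
    stand-in for the paper's adaptive-critic value iteration."""
    td = reward + gamma * w @ phi_next - w @ phi_t
    return w + lr * td * phi_t

w = np.zeros(4)
rng = np.random.default_rng(0)
for _ in range(1000):
    phi_t, phi_next = rng.standard_normal(4), rng.standard_normal(4)
    w = critic_gd_step(w, phi_t, phi_next, reward=float(phi_t[0]))
print(w)
```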
arXiv Detail & Related papers (2021-04-01T01:41:49Z)
- Learning to Shift Attention for Motion Generation [55.61994201686024]
One challenge of motion generation using robot learning from demonstration techniques is that human demonstrations follow a distribution with multiple modes for one task query.
Previous approaches fail to capture all modes or tend to average modes of the demonstrations and thus generate invalid trajectories.
We propose a motion generation model with extrapolation ability to overcome this problem.
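A toy numeric illustration of why averaging modes fails: with demonstrations that pass an obstacle on either the left or the right, the mean trajectory goes straight through the obstacle, while committing to one sampled mode stays valid. The numbers below are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
# Two valid demonstration modes for the same task query: pass the obstacle
# at x = 0 on the left (around -1) or on the right (around +1).
left = -1.0 + 0.1 * rng.standard_normal(50)
right = 1.0 + 0.1 * rng.standard_normal(50)

mean_path = np.concatenate([left, right]).mean()
print(mean_path)                         # ~0.0: the averaged path hits the obstacle

mode = [left, right][rng.integers(2)]    # mode-aware generation: commit to one mode
print(mode.mean())                       # ~ -1.0 or ~ +1.0: a valid path
```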
arXiv Detail & Related papers (2021-02-24T09:07:52Z)
- Improper Learning with Gradient-based Policy Optimization [62.50997487685586]
We consider an improper reinforcement learning setting where the learner is given M base controllers for an unknown Markov Decision Process.
We propose a gradient-based approach that operates over a class of improper mixtures of the controllers.
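A minimal sketch of such an improper mixture, assuming softmax weights over the fixed base controllers: the learner samples which controller acts and would update only the mixture parameters theta (e.g. via a policy-gradient estimate), never the controllers themselves. Controllers and plant below are hypothetical.

```python
import numpy as np

def mixture_action(state, controllers, theta, rng):
    """Act with an 'improper' mixture of M fixed base controllers:
    softmax(theta) gives the mixture weights; only theta is learned."""
    w = np.exp(theta - theta.max())
    w /= w.sum()
    k = int(rng.choice(len(controllers), p=w))   # pick a base controller
    return controllers[k](state), k, w

# Hypothetical base controllers: two proportional gains for a scalar plant.
controllers = [lambda s: -0.5 * s, lambda s: -2.0 * s]
theta = np.zeros(2)
a, k, w = mixture_action(1.0, controllers, theta, np.random.default_rng(0))
print(a, k, w)
```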
arXiv Detail & Related papers (2021-02-16T14:53:55Z)
- Improving Input-Output Linearizing Controllers for Bipedal Robots via Reinforcement Learning [85.13138591433635]
The main drawbacks of input-output linearizing controllers are the need for precise dynamics models and the inability to account for input constraints.
In this paper, we address both challenges for the specific case of bipedal robot control by the use of reinforcement learning techniques.
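One hedged reading of that recipe: keep the model-based input-output linearizing controller as the backbone, learn an additive correction for model mismatch, and clip the result to the input limits, addressing both drawbacks named above. The sketch below encodes that structure with hypothetical stand-ins; the paper's actual learning setup may differ.

```python
import numpy as np

def control(state, t, io_linearizing, residual_policy, u_max=1.0):
    """Model-based backbone plus learned correction, clipped to input
    limits (hypothetical sketch of the combined controller)."""
    u_nominal = io_linearizing(state, t)       # needs an approximate dynamics model
    u = u_nominal + residual_policy(state)     # learned correction term
    return np.clip(u, -u_max, u_max)           # enforce input constraints

# Hypothetical stand-ins for the nominal controller and learned policy.
io_linearizing = lambda s, t: -1.5 * s
residual_policy = lambda s: 0.1 * np.tanh(s)
print(control(np.array([0.4]), 0.0, io_linearizing, residual_policy))
```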
arXiv Detail & Related papers (2020-04-15T18:15:49Z)
This list is automatically generated from the titles and abstracts of the papers on this site.