Continuous-Discrete Reinforcement Learning for Hybrid Control in
Robotics
- URL: http://arxiv.org/abs/2001.00449v1
- Date: Thu, 2 Jan 2020 14:19:33 GMT
- Title: Continuous-Discrete Reinforcement Learning for Hybrid Control in
Robotics
- Authors: Michael Neunert, Abbas Abdolmaleki, Markus Wulfmeier, Thomas Lampe,
Jost Tobias Springenberg, Roland Hafner, Francesco Romano, Jonas Buchli,
Nicolas Heess, Martin Riedmiller
- Abstract summary: We propose to treat hybrid problems in their 'native' form by solving them with hybrid reinforcement learning.
In our experiments, we first demonstrate that the proposed approach efficiently solves such hybrid reinforcement learning problems.
We then show, both in simulation and on robotic hardware, the benefits of removing possibly imperfect expert-designed heuristics.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Many real-world control problems involve both discrete decision variables -
such as the choice of control modes, gear switching or digital outputs - as
well as continuous decision variables - such as velocity setpoints, control
gains or analogue outputs. However, when defining the corresponding optimal
control or reinforcement learning problem, it is commonly approximated with
fully continuous or fully discrete action spaces. These simplifications aim at
tailoring the problem to a particular algorithm or solver which may only
support one type of action space. Alternatively, expert heuristics are used to
remove discrete actions from an otherwise continuous space. In contrast, we
propose to treat hybrid problems in their 'native' form by solving them with
hybrid reinforcement learning, which optimizes for discrete and continuous
actions simultaneously. In our experiments, we first demonstrate that the
proposed approach efficiently solves such natively hybrid reinforcement
learning problems. We then show, both in simulation and on robotic hardware,
the benefits of removing possibly imperfect expert-designed heuristics. Lastly,
hybrid reinforcement learning encourages us to rethink problem definitions. We
propose reformulating control problems, e.g. by adding meta actions, to improve
exploration or reduce mechanical wear and tear.
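To make the 'native' hybrid formulation concrete, below is a minimal sketch of a factored hybrid policy: a categorical head for the discrete action and a diagonal Gaussian head for the continuous one, sampled jointly. This is an illustration only, not the paper's actual agent (which builds on MPO); the class, parameters, and shapes are hypothetical.

```python
import numpy as np

class HybridPolicy:
    """Toy factored policy over a hybrid action space: a categorical
    distribution for the discrete action (e.g. control mode or gear) and
    a diagonal Gaussian for the continuous one (e.g. velocity setpoints).
    Hypothetical sketch, not the paper's MPO-based implementation."""

    def __init__(self, n_modes, n_cont, seed=0):
        self.rng = np.random.default_rng(seed)
        self.logits = np.zeros(n_modes)   # discrete head parameters
        self.mean = np.zeros(n_cont)      # continuous head parameters
        self.log_std = np.zeros(n_cont)

    def sample(self):
        # Discrete part: softmax over logits, then a categorical draw.
        probs = np.exp(self.logits - self.logits.max())
        probs /= probs.sum()
        mode = int(self.rng.choice(len(probs), p=probs))
        # Continuous part: Gaussian draw; both heads would be optimized jointly.
        u = self.mean + np.exp(self.log_std) * self.rng.standard_normal(self.mean.shape)
        return mode, u

policy = HybridPolicy(n_modes=3, n_cont=2)
print(policy.sample())   # e.g. (1, array([...])): one hybrid action
```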
Related papers
- Growing Q-Networks: Solving Continuous Control Tasks with Adaptive Control Resolution [51.83951489847344]
In robotics applications, smooth control signals are commonly preferred to reduce system wear and energy consumption.
In this work, we aim to bridge this performance gap by growing discrete action spaces from coarse to fine control resolution.
Our work indicates that adaptive control resolution, combined with value decomposition, yields simple critic-only algorithms with surprisingly strong performance on continuous control tasks.
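As a rough illustration of growing a discrete action space from coarse to fine, the snippet below doubles the per-dimension control resolution at each level; the growth schedule and the `2**level + 1` sizing are assumptions made for this sketch, not the paper's exact scheme.

```python
import numpy as np

def grow_action_grid(low, high, level):
    """Discrete action set for one continuous dimension whose resolution
    doubles with each growth level: 2 actions at level 0 (bang-bang),
    3 at level 1, 5 at level 2, ... (hypothetical schedule)."""
    return np.linspace(low, high, 2 ** level + 1)

for level in range(4):
    print(level, grow_action_grid(-1.0, 1.0, level))
```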
arXiv Detail & Related papers (2024-04-05T17:58:37Z)
- Action-Quantized Offline Reinforcement Learning for Robotic Skill Learning [68.16998247593209]
The offline reinforcement learning (RL) paradigm provides a recipe to convert static behavior datasets into policies that can perform better than the policy that collected the data.
In this paper, we propose an adaptive scheme for action quantization.
We show that several state-of-the-art offline RL methods such as IQL, CQL, and BRAC improve in performance on benchmarks when combined with our proposed discretization scheme.
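The paper's quantization scheme is learned and adaptive; as a hedged stand-in, the sketch below builds a fixed codebook with k-means over the dataset's actions, so that a discrete-action offline method can operate on an originally continuous action space. The dataset, codebook size, and `quantize` helper are hypothetical.

```python
import numpy as np
from sklearn.cluster import KMeans

# Hypothetical offline dataset of continuous actions, shape (N, action_dim).
actions = np.random.default_rng(0).uniform(-1, 1, size=(10_000, 2))

# Data-driven codebook: each cluster centre becomes one discrete action,
# so a discrete-action critic can be trained on top of this space.
codebook = KMeans(n_clusters=16, n_init=10, random_state=0).fit(actions)

def quantize(a):
    """Map a continuous action to the index of its nearest codebook entry."""
    return int(codebook.predict(a.reshape(1, -1))[0])

print(quantize(np.array([0.3, -0.7])))
```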
arXiv Detail & Related papers (2023-10-18T06:07:10Z)
- Latent Exploration for Reinforcement Learning [87.42776741119653]
In Reinforcement Learning, agents learn policies by exploring and interacting with the environment.
We propose LATent TIme-Correlated Exploration (Lattice), a method to inject temporally-correlated noise into the latent state of the policy network.
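To illustrate what temporally correlated exploration noise looks like (without claiming this is Lattice's exact mechanism), the sketch below uses an Ornstein-Uhlenbeck-style process and adds it to a hypothetical latent vector before it would be decoded into an action.

```python
import numpy as np

class LatentOUNoise:
    """Ornstein-Uhlenbeck-style noise: a rough stand-in for injecting
    temporally correlated perturbations into a policy's latent state."""

    def __init__(self, dim, theta=0.15, sigma=0.2, seed=0):
        self.theta, self.sigma = theta, sigma
        self.eps = np.zeros(dim)
        self.rng = np.random.default_rng(seed)

    def step(self):
        # Each call is correlated with the previous one, unlike i.i.d. noise.
        self.eps += -self.theta * self.eps + self.sigma * self.rng.standard_normal(self.eps.shape)
        return self.eps

noise = LatentOUNoise(dim=8)
latent = np.zeros(8)                    # hypothetical latent activations
for t in range(3):
    perturbed = latent + noise.step()   # perturb latent, then decode to action
    print(perturbed[:3])
```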
arXiv Detail & Related papers (2023-05-31T17:40:43Z)
- Mixed-Integer Optimal Control via Reinforcement Learning: A Case Study on Hybrid Electric Vehicle Energy Management [2.0762193863564926]
This paper proposes a novel hybrid-action reinforcement learning (HARL) algorithm, twin delayed deep deterministic actor-Q (TD3AQ) for optimal control problems.
TD3AQ combines the advantages of both actor-critic and Q-learning methods, and can handle the discrete and continuous action spaces simultaneously.
The proposed algorithm is evaluated on a plug-in hybrid electric vehicle (PHEV) energy management problem.
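One hedged reading of the hybrid-action step: the actor proposes the continuous action, and a Q-head scores every discrete option given that proposal. The sketch below encodes that interplay with toy stand-in networks; it is not the published TD3AQ implementation.

```python
import numpy as np

def td3aq_act(state, actor, q_net, n_discrete):
    """Hedged sketch of one hybrid-action decision in the spirit of TD3AQ:
    the actor handles the continuous part, a Q-head the discrete part."""
    u = actor(state)                                   # continuous proposal
    q_values = [q_net(state, k, u) for k in range(n_discrete)]
    k = int(np.argmax(q_values))                       # greedy discrete choice
    return k, u

# Hypothetical stand-ins for the learned networks.
actor = lambda s: np.tanh(s[:2])
q_net = lambda s, k, u: -abs(k - 1) + float(u.sum())
print(td3aq_act(np.array([0.5, -0.2, 0.1]), actor, q_net, n_discrete=3))
```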
arXiv Detail & Related papers (2023-05-02T14:42:21Z)
- Scalable Task-Driven Robotic Swarm Control via Collision Avoidance and Learning Mean-Field Control [23.494528616672024]
We use state-of-the-art mean-field control techniques to convert many-agent swarm control into classical single-agent control of distributions.
Here, we combine collision avoidance and learning of mean-field control into a unified framework for tractably designing intelligent robotic swarm behavior.
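The core mean-field idea, reduced to a sketch: replace the N-agent state with the empirical distribution of agents, so the controller's input no longer grows with N. The 1-D histogram below is an illustrative stand-in for the paper's mean-field machinery, not its actual formulation.

```python
import numpy as np

def mean_field_state(positions, bins=10, low=0.0, high=1.0):
    """Collapse an N-agent swarm into a single mean-field observation:
    a normalized histogram of agent positions. A controller conditioned
    on this density does not scale with the number of agents."""
    hist, _ = np.histogram(positions, bins=bins, range=(low, high))
    return hist / len(positions)

agents = np.random.default_rng(0).uniform(0, 1, size=1000)  # hypothetical 1-D swarm
print(mean_field_state(agents))
```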
arXiv Detail & Related papers (2022-09-15T16:15:04Z)
- Gradient Backpropagation Through Combinatorial Algorithms: Identity with Projection Works [20.324159725851235]
A meaningful replacement for the zero or undefined gradients of such solvers is crucial for effective gradient-based learning.
We propose a principled approach to exploit the geometry of the discrete solution space to treat the solver as a negative identity on the backward pass.
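A minimal sketch of that backward rule in PyTorch autograd, assuming a toy argmax "solver": the forward pass computes the usual discrete solution, and the backward pass returns the negated incoming gradient instead of the true (zero or undefined) derivative. The projection step mentioned in the paper's title is omitted here for brevity.

```python
import torch

class NegativeIdentitySolver(torch.autograd.Function):
    """Wrap a non-differentiable combinatorial solver and, on the backward
    pass, treat it as a negative identity (sketch of the paper's rule)."""

    @staticmethod
    def forward(ctx, scores):
        # Toy 'solver': pick the highest-scoring item as a one-hot solution.
        y = torch.zeros_like(scores)
        y[scores.argmax()] = 1.0
        return y

    @staticmethod
    def backward(ctx, grad_output):
        # True gradient is zero/undefined; replace it with -I * grad.
        return -grad_output

scores = torch.tensor([0.2, 1.5, -0.3], requires_grad=True)
y = NegativeIdentitySolver.apply(scores)
loss = (y * torch.tensor([1.0, 0.0, 2.0])).sum()
loss.backward()
print(scores.grad)   # -[1., 0., 2.]: informative despite the piecewise-constant solver
```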
arXiv Detail & Related papers (2022-05-30T16:17:09Z)
- Learning Solution Manifolds for Control Problems via Energy Minimization [32.59818752168615]
A variety of control tasks are commonly formulated as energy minimization problems.
Numerical solutions to such problems are well-established, but are often too slow to be used directly in real-time applications.
We propose an alternative to behavioral cloning (BC) that is efficient and numerically robust.
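One way to read this, as a sketch rather than the paper's exact formulation: instead of cloning precomputed minimizers, train the network directly on the task energy, so its output is pushed toward the minimizer for each problem instance. The quadratic energy below is hypothetical.

```python
import torch

def energy(x, u):
    """Hypothetical task energy E(x, u); its minimizer in u is sin(x)."""
    return ((u - torch.sin(x)) ** 2).sum(dim=-1)

net = torch.nn.Sequential(torch.nn.Linear(1, 32), torch.nn.Tanh(), torch.nn.Linear(32, 1))
opt = torch.optim.Adam(net.parameters(), lr=1e-2)

for step in range(200):
    x = torch.rand(64, 1) * 6.0              # sampled problem instances
    loss = energy(x, net(x)).mean()          # minimize E directly, no BC targets
    opt.zero_grad()
    loss.backward()
    opt.step()

print(float(loss))   # net(x) approaches argmin_u E(x, u) = sin(x)
```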
arXiv Detail & Related papers (2022-03-07T14:28:57Z)
- Trajectory Tracking of Underactuated Sea Vessels With Uncertain Dynamics: An Integral Reinforcement Learning Approach [2.064612766965483]
An online machine learning mechanism based on integral reinforcement learning is proposed to find a solution for a class of nonlinear tracking problems.
The solution is implemented using an online value iteration process, realized by means of adaptive critics and gradient descent.
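As a rough stand-in for those adaptive-critic updates, the sketch below performs gradient-descent steps on the temporal-difference residual of a linear critic (a plain semi-gradient TD(0) update); the features and reward are hypothetical and the paper's integral formulation is more involved.

```python
import numpy as np

def critic_gd_step(w, phi_t, phi_next, reward, gamma=0.99, lr=1e-2):
    """One gradient-descent step on the TD residual of a linear critic
    V(s) = w . phi(s): a semi-gradient TD(0) update used here as a
    stand-in for the paper's adaptive-critic value iteration."""
    td = reward + gamma * w @ phi_next - w @ phi_t
    return w + lr * td * phi_t

w = np.zeros(4)
rng = np.random.default_rng(0)
for _ in range(1000):
    phi_t, phi_next = rng.standard_normal(4), rng.standard_normal(4)
    w = critic_gd_step(w, phi_t, phi_next, reward=float(phi_t[0]))
print(w)
```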
arXiv Detail & Related papers (2021-04-01T01:41:49Z)
- Learning to Shift Attention for Motion Generation [55.61994201686024]
One challenge of motion generation using robot learning from demonstration techniques is that human demonstrations follow a distribution with multiple modes for one task query.
Previous approaches fail to capture all modes or tend to average modes of the demonstrations and thus generate invalid trajectories.
We propose a motion generation model with extrapolation ability to overcome this problem.
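A toy numeric illustration of why averaging modes fails: with demonstrations that pass an obstacle on either the left or the right, the mean trajectory goes straight through the obstacle, while committing to one sampled mode stays valid. The numbers below are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
# Two valid demonstration modes for the same task query: pass the obstacle
# at x = 0 on the left (around -1) or on the right (around +1).
left = -1.0 + 0.1 * rng.standard_normal(50)
right = 1.0 + 0.1 * rng.standard_normal(50)

mean_path = np.concatenate([left, right]).mean()
print(mean_path)                         # ~0.0: the averaged path hits the obstacle

mode = [left, right][rng.integers(2)]    # mode-aware generation: commit to one mode
print(mode.mean())                       # ~ -1.0 or ~ +1.0: a valid path
```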
arXiv Detail & Related papers (2021-02-24T09:07:52Z)
- Improper Learning with Gradient-based Policy Optimization [62.50997487685586]
We consider an improper reinforcement learning setting where the learner is given M base controllers for an unknown Markov Decision Process.
We propose a gradient-based approach that operates over a class of improper mixtures of the controllers.
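A minimal sketch of such an improper mixture, assuming softmax weights over the fixed base controllers: the learner samples which controller acts and would update only the mixture parameters theta (e.g. via a policy-gradient estimate), never the controllers themselves. Controllers and plant below are hypothetical.

```python
import numpy as np

def mixture_action(state, controllers, theta, rng):
    """Act with an 'improper' mixture of M fixed base controllers:
    softmax(theta) gives the mixture weights; only theta is learned."""
    w = np.exp(theta - theta.max())
    w /= w.sum()
    k = int(rng.choice(len(controllers), p=w))   # pick a base controller
    return controllers[k](state), k, w

# Hypothetical base controllers: two proportional gains for a scalar plant.
controllers = [lambda s: -0.5 * s, lambda s: -2.0 * s]
theta = np.zeros(2)
a, k, w = mixture_action(1.0, controllers, theta, np.random.default_rng(0))
print(a, k, w)
```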
arXiv Detail & Related papers (2021-02-16T14:53:55Z)
- Improving Input-Output Linearizing Controllers for Bipedal Robots via Reinforcement Learning [85.13138591433635]
The main drawbacks of input-output linearizing controllers are the need for precise dynamics models and the inability to account for input constraints.
In this paper, we address both challenges for the specific case of bipedal robot control by the use of reinforcement learning techniques.
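One hedged reading of that recipe: keep the model-based input-output linearizing controller as the backbone, learn an additive correction for model mismatch, and clip the result to the input limits, addressing both drawbacks named above. The sketch below encodes that structure with hypothetical stand-ins; the paper's actual learning setup may differ.

```python
import numpy as np

def control(state, t, io_linearizing, residual_policy, u_max=1.0):
    """Model-based backbone plus learned correction, clipped to input
    limits (hypothetical sketch of the combined controller)."""
    u_nominal = io_linearizing(state, t)       # needs an approximate dynamics model
    u = u_nominal + residual_policy(state)     # learned correction term
    return np.clip(u, -u_max, u_max)           # enforce input constraints

# Hypothetical stand-ins for the nominal controller and learned policy.
io_linearizing = lambda s, t: -1.5 * s
residual_policy = lambda s: 0.1 * np.tanh(s)
print(control(np.array([0.4]), 0.0, io_linearizing, residual_policy))
```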
arXiv Detail & Related papers (2020-04-15T18:15:49Z)
This list is automatically generated from the titles and abstracts of the papers on this site.