Is Bang-Bang Control All You Need? Solving Continuous Control with
Bernoulli Policies
- URL: http://arxiv.org/abs/2111.02552v1
- Date: Wed, 3 Nov 2021 22:45:55 GMT
- Title: Is Bang-Bang Control All You Need? Solving Continuous Control with
Bernoulli Policies
- Authors: Tim Seyde, Igor Gilitschenski, Wilko Schwarting, Bartolomeo Stellato,
Martin Riedmiller, Markus Wulfmeier, Daniela Rus
- Abstract summary: We investigate the phenomenon that trained agents often prefer actions at the boundaries of the action space.
We replace the normal Gaussian with a Bernoulli distribution that considers only the extremes along each action dimension.
Surprisingly, this achieves state-of-the-art performance on several continuous control benchmarks.
- Score: 45.20170713261535
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Reinforcement learning (RL) for continuous control typically employs
distributions whose support covers the entire action space. In this work, we
investigate the colloquially known phenomenon that trained agents often prefer
actions at the boundaries of that space. We draw theoretical connections to the
emergence of bang-bang behavior in optimal control, and provide extensive
empirical evaluation across a variety of recent RL algorithms. We replace the
normal Gaussian by a Bernoulli distribution that solely considers the extremes
along each action dimension - a bang-bang controller. Surprisingly, this
achieves state-of-the-art performance on several continuous control benchmarks
- in contrast to robotic hardware, where energy and maintenance cost affect
controller choices. Since exploration, learning, and the final solution are
entangled in RL, we provide additional imitation learning experiments to reduce
the impact of exploration on our analysis. Finally, we show that our
observations generalize to environments that aim to model real-world challenges
and evaluate factors to mitigate the emergence of bang-bang solutions. Our
findings emphasize challenges for benchmarking continuous control algorithms,
particularly in light of potential real-world applications.
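As a rough sketch of the core idea (not the authors' implementation), the snippet below swaps a Gaussian policy head for a factorized Bernoulli over the two extremes of each action dimension; the class and parameter names are illustrative assumptions.

```python
# Minimal sketch of a per-dimension Bernoulli "bang-bang" policy head.
# Names (BangBangPolicy, feature_dim, low/high) are illustrative, not from the paper.
import torch
import torch.nn as nn


class BangBangPolicy(nn.Module):
    def __init__(self, feature_dim: int, action_dim: int,
                 low: float = -1.0, high: float = 1.0):
        super().__init__()
        # One logit per action dimension instead of a Gaussian mean/std pair.
        self.logits = nn.Linear(feature_dim, action_dim)
        self.low, self.high = low, high

    def forward(self, features: torch.Tensor):
        dist = torch.distributions.Bernoulli(logits=self.logits(features))
        sample = dist.sample()                    # 0 or 1 per action dimension
        action = self.low + (self.high - self.low) * sample
        log_prob = dist.log_prob(sample).sum(-1)  # factorized across dimensions
        return action, log_prob


# Usage: draw extreme ("bang-bang") actions for a batch of state features.
policy = BangBangPolicy(feature_dim=64, action_dim=6)
actions, log_probs = policy(torch.randn(32, 64))
```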
Related papers
- Investigating the Impact of Choice on Deep Reinforcement Learning for Space Controls [0.3441021278275805]
This paper analyzes the use of discrete action spaces, where the agent must choose from a predefined list of actions.
Experiments are conducted for an inspection task, where the agent must circumnavigate an object to inspect points on its surface, and a docking task, where the agent must move into proximity of another spacecraft and "dock".
A common objective of both tasks, and most space tasks in general, is to minimize fuel usage, which motivates the agent to regularly choose an action that uses no fuel.
arXiv Detail & Related papers (2024-05-20T20:06:54Z)
- Growing Q-Networks: Solving Continuous Control Tasks with Adaptive Control Resolution [51.83951489847344]
In robotics applications, smooth control signals are commonly preferred to reduce system wear and energy consumption.
In this work, we aim to bridge this performance gap by growing discrete action spaces from coarse to fine control resolution.
Our work indicates that adaptive control resolution combined with value decomposition yields simple critic-only algorithms with surprisingly strong performance on continuous control tasks.
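As a loose illustration of growing a discrete action set from coarse to fine resolution, the toy snippet below nests successively finer bins along one action dimension; the bin schedule and function name are assumptions for illustration, not the paper's exact scheme.

```python
# Toy sketch: grow a per-dimension discrete action set from bang-bang to finer
# resolutions. The nesting schedule (2**level + 1 bins) is an assumption, not
# the exact scheme of the Growing Q-Networks paper.
import numpy as np

def action_bins(low: float, high: float, level: int) -> np.ndarray:
    """Level 0 is bang-bang ({low, high}); each level refines the grid while keeping earlier bins."""
    return np.linspace(low, high, num=2 ** level + 1)

for level in range(3):
    print(level, action_bins(-1.0, 1.0, level))
# level 0 -> 2 bins (bang-bang), level 1 -> 3 bins, level 2 -> 5 bins
```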
arXiv Detail & Related papers (2024-04-05T17:58:37Z)
- Variational Autoencoders for exteroceptive perception in reinforcement learning-based collision avoidance [0.0]
Deep Reinforcement Learning (DRL) has emerged as a promising control framework.
Current DRL algorithms require disproportionately large computational resources to find near-optimal policies.
This paper presents a comprehensive exploration of our proposed approach in maritime control systems.
arXiv Detail & Related papers (2024-03-31T09:25:28Z)
- A Safe Reinforcement Learning Algorithm for Supervisory Control of Power Plants [7.1771300511732585]
Model-free reinforcement learning (RL) has emerged as a promising solution for control tasks.
We propose a chance-constrained RL algorithm based on Proximal Policy Optimization for supervisory control.
Our approach achieves the smallest violation distance and violation rate in a load-follow maneuver for an advanced Nuclear Power Plant design.
arXiv Detail & Related papers (2024-01-23T17:52:49Z)
- Latent Exploration for Reinforcement Learning [87.42776741119653]
In Reinforcement Learning, agents learn policies by exploring and interacting with the environment.
We propose LATent TIme-Correlated Exploration (Lattice), a method to inject temporally-correlated noise into the latent state of the policy network.
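A rough sketch of temporally-correlated latent noise is shown below; the Ornstein-Uhlenbeck-style update and its coefficients are assumptions for illustration, not Lattice's exact formulation.

```python
# Sketch: keep exploration noise correlated across time steps and add it to a
# latent vector. The OU-style update and coefficients are illustrative only.
import numpy as np

class CorrelatedLatentNoise:
    def __init__(self, latent_dim: int, theta: float = 0.15, sigma: float = 0.2):
        self.theta, self.sigma = theta, sigma
        self.noise = np.zeros(latent_dim)

    def __call__(self, latent: np.ndarray) -> np.ndarray:
        # Each step mixes the previous noise with fresh Gaussian noise,
        # so perturbations persist over time instead of being i.i.d.
        self.noise += -self.theta * self.noise + self.sigma * np.random.randn(len(self.noise))
        return latent + self.noise

noise = CorrelatedLatentNoise(latent_dim=32)
perturbed_latent = noise(np.zeros(32))  # latent state of the policy network
```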
arXiv Detail & Related papers (2023-05-31T17:40:43Z)
- Efficient Deep Reinforcement Learning Requires Regulating Overfitting [91.88004732618381]
We show that high temporal-difference (TD) error on the validation set of transitions is the main culprit that severely affects the performance of deep RL algorithms.
We show that a simple online model selection method that targets the validation TD error is effective across state-based DMC and Gym tasks.
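As a toy illustration of selecting models by validation TD error, the sketch below scores candidate Q-functions on a held-out set of transitions; the data layout and helper names are assumptions, not the paper's code.

```python
# Toy sketch: pick the candidate Q-function with the lowest TD error on a
# held-out validation set of (s, a, r, s', done) transitions. Helper names and
# the data layout are illustrative assumptions.
import numpy as np

def validation_td_error(q_fn, transitions, gamma: float = 0.99) -> float:
    errors = []
    for s, a, r, s_next, done in transitions:
        target = r + gamma * (0.0 if done else float(np.max(q_fn(s_next))))
        errors.append((q_fn(s)[a] - target) ** 2)
    return float(np.mean(errors))

def select_model(candidate_q_fns, validation_transitions):
    # Online model selection: keep whichever candidate generalizes best,
    # as measured by TD error on held-out transitions.
    return min(candidate_q_fns,
               key=lambda q: validation_td_error(q, validation_transitions))
```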
arXiv Detail & Related papers (2023-04-20T17:11:05Z)
- Mastering the Unsupervised Reinforcement Learning Benchmark from Pixels [112.63440666617494]
Reinforcement learning algorithms can succeed but require large amounts of interaction between the agent and the environment.
We propose a new method that uses unsupervised model-based RL to pre-train the agent.
We show robust performance on the Real-World RL benchmark, hinting at resiliency to environment perturbations during adaptation.
arXiv Detail & Related papers (2022-09-24T14:22:29Z)
- Steady-State Error Compensation in Reference Tracking and Disturbance Rejection Problems for Reinforcement Learning-Based Control [0.9023847175654602]
Reinforcement learning (RL) is a promising and emerging topic in automatic control applications.
Initiative action state augmentation (IASA) for actor-critic-based RL controllers is introduced.
This augmentation does not require any expert knowledge, leaving the approach model-free.
arXiv Detail & Related papers (2022-01-31T16:29:19Z)
- Explore and Control with Adversarial Surprise [78.41972292110967]
Reinforcement learning (RL) provides a framework for learning goal-directed policies given user-specified rewards.
We propose a new unsupervised RL technique based on an adversarial game which pits two policies against each other to compete over the amount of surprise an RL agent experiences.
We show that our method leads to the emergence of complex skills by exhibiting clear phase transitions.
arXiv Detail & Related papers (2021-07-12T17:58:40Z)