Three Dogmas of Reinforcement Learning
- URL: http://arxiv.org/abs/2407.10583v1
- Date: Mon, 15 Jul 2024 10:03:24 GMT
- Title: Three Dogmas of Reinforcement Learning
- Authors: David Abel, Mark K. Ho, Anna Harutyunyan
- Abstract summary: Modern reinforcement learning has been conditioned by at least three dogmas.
The first is the environment spotlight, which refers to our tendency to focus on modeling environments rather than agents.
The second is our treatment of learning as finding the solution to a task, rather than adaptation.
The third is the reward hypothesis, which states that all goals and purposes can be well thought of as maximization of a reward signal.
- Score: 13.28320102989073
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Modern reinforcement learning has been conditioned by at least three dogmas. The first is the environment spotlight, which refers to our tendency to focus on modeling environments rather than agents. The second is our treatment of learning as finding the solution to a task, rather than adaptation. The third is the reward hypothesis, which states that all goals and purposes can be well thought of as maximization of a reward signal. These three dogmas shape much of what we think of as the science of reinforcement learning. While each of these dogmas has played an important role in developing the field, it is time we bring them to the surface and reflect on whether they belong as basic ingredients of our scientific paradigm. In order to realize the potential of reinforcement learning as a canonical frame for researching intelligent agents, we suggest that it is time we shed dogmas one and two entirely, and embrace a nuanced approach to the third.
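For reference, the reward hypothesis is conventionally stated in terms of expected cumulative reward. A minimal sketch of that standard formalization follows; the symbols are chosen here for illustration, and the discount factor is one common modeling choice rather than part of the hypothesis itself.

```latex
% Reward hypothesis, formal reading: any goal can be cast as finding a
% policy \pi that maximizes the expected cumulative (here: discounted)
% sum of a scalar reward signal R_t.
\[
  \pi^{*} \in \arg\max_{\pi}\;
  \mathbb{E}_{\pi}\!\left[\sum_{t=0}^{\infty} \gamma^{t} R_{t}\right],
  \qquad 0 \le \gamma < 1 .
\]
```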
Related papers
- A Definition of Continual Reinforcement Learning [69.56273766737527]
In a standard view of the reinforcement learning problem, an agent's goal is to efficiently identify a policy that maximizes long-term reward.
Continual reinforcement learning refers to the setting in which the best agents never stop learning.
We formalize the notion of agents that "never stop learning" through a new mathematical language for analyzing and cataloging agents.
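The contrast between learning-as-solution-finding and never-ending learning can be sketched in one line. The notation below is ours, chosen for illustration, and is far coarser than the paper's own formalism: write \pi_t for the agent's policy after t steps of experience.

```latex
% An agent "stops learning" if its behavior is eventually fixed;
% continual learners are exactly the agents that never satisfy this.
\[
  \text{stops learning:}\ \exists T\ \forall t \ge T:\ \pi_t = \pi_T,
  \qquad
  \text{never stops:}\ \forall T\ \exists t \ge T:\ \pi_t \neq \pi_T .
\]
```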
arXiv Detail & Related papers (2023-07-20T17:28:01Z)
- Building a Culture of Reproducibility in Academic Research [55.22219308265945]
Reproducibility is an ideal that no researcher would dispute "in the abstract", but when aspirations meet the cold hard reality of the academic grind, it often "loses out".
In this essay, I share some personal experiences grappling with how to operationalize reproducibility while balancing its demands against other priorities.
arXiv Detail & Related papers (2022-12-27T16:03:50Z)
- LED: Lexicon-Enlightened Dense Retriever for Large-Scale Retrieval [68.85686621130111]
We propose to align a dense retriever with a well-performing lexicon-aware representation model.
Evaluation on three public benchmarks shows that, with a comparable lexicon-aware retriever as the teacher, our dense model achieves consistent and significant improvements.
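The summary does not spell out the alignment objective, but one natural reading is a knowledge-distillation loss from the lexicon-aware teacher to the dense student. The sketch below, including the temperature \tau and the KL form, is our assumption rather than the paper's stated method.

```latex
% Hypothetical alignment loss: match the dense student's score
% distribution over candidates to the lexicon-aware teacher's, per query q.
\[
  \mathcal{L}_{\text{align}}(q) =
  \mathrm{KL}\!\left(
    \mathrm{softmax}\big(s^{\text{lex}}(q,\cdot)/\tau\big)
    \,\big\|\,
    \mathrm{softmax}\big(s^{\text{dense}}(q,\cdot)/\tau\big)
  \right).
\]
```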
arXiv Detail & Related papers (2022-08-29T15:09:28Z)
- An Enactivist-Inspired Mathematical Model of Cognition [5.8010446129208155]
We formulate five basic tenets of enactivist cognitive science that we have carefully identified in the relevant literature.
We then develop a mathematical framework, compliant with these enactivist tenets, for describing cognitive systems.
arXiv Detail & Related papers (2022-06-10T13:03:47Z)
- Dealing with Sparse Rewards Using Graph Neural Networks [0.15540058359482856]
We propose two modifications to a recent reward shaping method based on graph convolutional networks.
We empirically validate the effectiveness of our solutions for the task of navigation in a 3D environment with sparse rewards.
For the solution featuring an attention mechanism, we also show that the learned attention concentrates on edges corresponding to important transitions in the 3D environment.
arXiv Detail & Related papers (2022-03-25T02:42:07Z)
- On the Expressivity of Markov Reward [89.96685777114456]
This paper is dedicated to understanding the expressivity of reward as a way to capture tasks that we would want an agent to perform.
We frame this study around three new abstract notions of "task" that might be desirable: (1) a set of acceptable behaviors, (2) a partial ordering over behaviors, or (3) a partial ordering over trajectories.
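For the first notion, the expressivity question has a crisp form. One natural reading, with notation chosen here for illustration: a Markov reward function R realizes a set of acceptable policies \Pi_G in environment E when the acceptable policies are strictly separated in value from all others.

```latex
% Illustrative realization condition for a "set of acceptable behaviors":
% every acceptable policy strictly outperforms every unacceptable one
% from the start state s_0.
\[
  R \ \text{realizes}\ \Pi_G \ \text{in}\ E
  \iff
  V_R^{\pi}(s_0) > V_R^{\pi'}(s_0)
  \quad \forall\, \pi \in \Pi_G,\ \pi' \notin \Pi_G .
\]
```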
arXiv Detail & Related papers (2021-11-01T12:12:16Z)
- Subgoal-based Reward Shaping to Improve Efficiency in Reinforcement Learning [7.6146285961466]
We extend potential-based reward shaping and propose subgoal-based reward shaping; the classical shaping form is sketched below.
Our method makes it easier for human trainers to share their knowledge of subgoals.
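For context, potential-based shaping (Ng et al., 1999) augments the reward with a term derived from a state potential \Phi while provably preserving the set of optimal policies. In the subgoal-based variant, as we read the summary, \Phi would be built from human-provided subgoal knowledge; that reading is our assumption.

```latex
% Potential-based reward shaping: for any potential \Phi over states,
% the shaped reward R' leaves the optimal policies unchanged.
\[
  R'(s, a, s') = R(s, a, s') + \gamma\,\Phi(s') - \Phi(s).
\]
```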
arXiv Detail & Related papers (2021-04-13T14:28:48Z)
- Mutual Information State Intrinsic Control [91.38627985733068]
Intrinsically motivated RL attempts to remove the reliance on hand-shaped extrinsic rewards by defining an intrinsic reward function.
Motivated by the self-consciousness concept in psychology, we make a natural assumption that the agent knows what constitutes itself.
We mathematically formalize this reward as the mutual information between the agent state and the surrounding state.
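In symbols, writing S^{agent} for the agent's own state and S^{sur} for the surrounding state (notation ours, not necessarily the paper's), the intrinsic reward is the mutual information between the two, which in practice would be estimated with a variational bound.

```latex
% Intrinsic reward as mutual information between agent and surroundings.
\[
  r_{\text{int}}
  = I\!\left(S^{\text{agent}};\, S^{\text{sur}}\right)
  = \mathbb{E}\!\left[
      \log \frac{p(s^{\text{agent}}, s^{\text{sur}})}
                {p(s^{\text{agent}})\, p(s^{\text{sur}})}
    \right].
\]
```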
arXiv Detail & Related papers (2021-03-15T03:03:36Z)
- Intrinsic Motivation for Encouraging Synergistic Behavior [55.10275467562764]
We study the role of intrinsic motivation as an exploration bias for reinforcement learning in sparse-reward synergistic tasks.
Our key idea is that a good guiding principle for intrinsic motivation in synergistic tasks is to take actions which affect the world in ways that would not be achieved if the agents were acting on their own.
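One way to make this principle concrete (an illustrative formalization on our part, not the paper's exact objective): reward the discrepancy between the observed outcome of the agents acting jointly and the outcome predicted by composing learned single-agent effect models.

```latex
% Synergy bonus: large when the joint action achieves something that
% the composition of individual effects \hat{f}_{\text{comp}} would not.
\[
  r_{\text{int}}(s, a_1, a_2)
  = \big\| f(s, a_1, a_2) - \hat{f}_{\text{comp}}(s, a_1, a_2) \big\| .
\]
```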
arXiv Detail & Related papers (2020-02-12T19:34:51Z)
- Unsupervisedly Learned Representations: Should the Quest be Over? [0.0]
We demonstrate that Reinforcement Learning can learn representations that achieve the same accuracy as those of animals.
The corollary of these observations is that further search for competitive Unsupervised Learning paradigms that can be trained in simulated environments may be futile.
arXiv Detail & Related papers (2020-01-21T13:05:31Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences.