Intelligence and Unambitiousness Using Algorithmic Information Theory
- URL: http://arxiv.org/abs/2105.06268v1
- Date: Thu, 13 May 2021 13:10:28 GMT
- Title: Intelligence and Unambitiousness Using Algorithmic Information Theory
- Authors: Michael K. Cohen, Badri Vellambi, Marcus Hutter
- Abstract summary: We show that an agent learns to accrue reward at least as well as a human mentor, while relying on that mentor with diminishing probability.
We show that eventually, the agent's world-model incorporates the following true fact: intervening in the "outside world" will have no effect on reward acquisition.
- Score: 22.710015392064083
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Algorithmic Information Theory has inspired intractable constructions of
general intelligence (AGI), and undiscovered tractable approximations are
likely feasible. Reinforcement Learning (RL), the dominant paradigm by which an
agent might learn to solve arbitrary solvable problems, gives an agent a
dangerous incentive: to gain arbitrary "power" in order to intervene in the
provision of their own reward. We review the arguments that generally
intelligent algorithmic-information-theoretic reinforcement learners such as
Hutter's (2005) AIXI would seek arbitrary power, including over us. Then, using
an information-theoretic exploration schedule, and a setup inspired by causal
influence theory, we present a variant of AIXI which learns to not seek
arbitrary power; we call it "unambitious". We show that our agent learns to
accrue reward at least as well as a human mentor, while relying on that mentor
with diminishing probability. And given a formal assumption that we probe
empirically, we show that eventually, the agent's world-model incorporates the
following true fact: intervening in the "outside world" will have no effect on
reward acquisition; hence, it has no incentive to shape the outside world.
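As a rough illustration of the mentored-agent idea (a toy sketch under assumed simplifications, not the paper's actual construction), the snippet below shows an agent that defers to a mentor with a probability tied to its remaining uncertainty over candidate world-models, so mentor reliance diminishes as the posterior concentrates. The bandit environment, the candidate models, and the entropy-based query schedule are all assumptions made for illustration.

```python
import math
import random

# Toy mentored agent, loosely inspired by the paper's setup (NOT its actual
# construction). The agent keeps a Bayesian posterior over candidate
# world-models of a 3-armed bandit and defers to a mentor with probability
# equal to its normalized posterior entropy, so mentor reliance shrinks as
# it learns. Environment, models, and schedule are illustrative assumptions.
random.seed(0)

TRUE_PROBS = [0.2, 0.8, 0.5]                      # hidden from the agent
MODELS = [[0.2, 0.8, 0.5], [0.8, 0.2, 0.5], [0.5, 0.5, 0.5]]
posterior = [1.0 / len(MODELS)] * len(MODELS)

def entropy(p):
    return -sum(q * math.log(q) for q in p if q > 0)

queries = 0
for t in range(2000):
    p_query = entropy(posterior) / math.log(len(MODELS))     # in [0, 1]
    if random.random() < p_query:
        queries += 1
        action = max(range(3), key=lambda a: TRUE_PROBS[a])  # mentor knows best
    else:
        map_model = MODELS[max(range(len(MODELS)), key=lambda m: posterior[m])]
        action = max(range(3), key=lambda a: map_model[a])
    reward = random.random() < TRUE_PROBS[action]
    # Bayesian update: how well did each candidate model predict the outcome?
    like = [m[action] if reward else 1 - m[action] for m in MODELS]
    z = sum(l * p for l, p in zip(like, posterior))
    posterior = [l * p / z for l, p in zip(like, posterior)]

print(f"mentor queried {queries} times out of 2000 steps")
```

Early on the agent queries the mentor on almost every step; once the posterior concentrates on the true model, queries become rare while the agent's own greedy choices match the mentor's.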
Related papers
- Position Paper: Agent AI Towards a Holistic Intelligence [53.35971598180146]
We emphasize developing Agent AI -- an embodied system that integrates large foundation models into agent actions.
In this paper, we propose a novel large action model to achieve embodied intelligent behavior, the Agent Foundation Model.
arXiv Detail & Related papers (2024-02-28T16:09:56Z)
- Efficient Open-world Reinforcement Learning via Knowledge Distillation and Autonomous Rule Discovery [5.680463564655267]
We propose the rule-driven deep Q-learning agent (RDQ) as one possible implementation of the framework.
We show that RDQ successfully extracts task-specific rules as it interacts with the world.
In experiments, we show that the RDQ agent is significantly more resilient to the novelties than the baseline agents.
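To make the rule-extraction idea concrete, here is a rough sketch under stated assumptions (the actual RDQ algorithm, its rule format, and its thresholds are not described in this summary, so everything below is hypothetical): a tabular Q-learner on a small chain promotes a state's action to a hard rule once that action's Q-value is clearly dominant, and rules then bypass exploration.

```python
import random
from collections import defaultdict

# Toy sketch of extracting rules while Q-learning (NOT the RDQ algorithm;
# environment, thresholds, and rule format are illustrative assumptions).
# Chain of 6 states; action 1 moves right, action 0 moves left; reward 1
# at the rightmost state. A state-action becomes a rule once it has been
# tried often and its Q-value clearly dominates the alternative.
random.seed(1)

N, GOAL = 6, 5
q = defaultdict(float)
tries = defaultdict(int)
rules = {}                                    # state -> action

def step(s, a):
    s2 = max(0, min(N - 1, s + (1 if a == 1 else -1)))
    return s2, (1.0 if s2 == GOAL else 0.0)

for episode in range(400):
    s = 0
    while s != GOAL:
        if s in rules:                        # extracted rule bypasses exploration
            a = rules[s]
        elif random.random() < 0.2:           # epsilon-greedy otherwise
            a = random.randrange(2)
        else:
            a = max((0, 1), key=lambda x: q[(s, x)])
        s2, r = step(s, a)
        q[(s, a)] += 0.5 * (r + 0.9 * max(q[(s2, 0)], q[(s2, 1)]) - q[(s, a)])
        tries[(s, a)] += 1
        if tries[(s, a)] >= 20 and q[(s, a)] > q[(s, 1 - a)] + 0.05:
            rules[s] = a                      # promote to a task-specific rule
        s = s2

print("extracted rules:", dict(sorted(rules.items())))  # expect action 1 everywhere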
arXiv Detail & Related papers (2023-11-24T04:12:50Z)
- Flexible Attention-Based Multi-Policy Fusion for Efficient Deep Reinforcement Learning [78.31888150539258]
Reinforcement learning (RL) agents have long sought to approach the efficiency of human learning.
Prior studies in RL have incorporated external knowledge policies to help agents improve sample efficiency.
We present Knowledge-Grounded RL (KGRL), an RL paradigm fusing multiple knowledge policies and aiming for human-like efficiency and flexibility.
arXiv Detail & Related papers (2022-10-07T17:56:57Z)
- Parametrically Retargetable Decision-Makers Tend To Seek Power [91.93765604105025]
In fully observable environments, most reward functions have an optimal policy which seeks power by keeping options open and staying alive.
We consider a range of models of AI decision-making, from optimal, to random, to choices informed by learning and interacting with an environment.
We show that a range of qualitatively dissimilar decision-making procedures incentivize agents to seek power.
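The flavor of this result can be seen in a small back-of-the-envelope experiment (a toy model, not the paper's formal retargetability argument): sample many random reward functions and check how often the option-preserving action is optimal.

```python
import random

# Toy illustration (not the paper's formalism): action A keeps 5 terminal
# options reachable, action B commits to a single one. For many randomly
# sampled reward functions, count how often the optimal choice is A.
random.seed(0)
OPTIONS_A = ["a1", "a2", "a3", "a4", "a5"]    # "keep options open"
OPTIONS_B = ["b1"]                            # "commit / shut down"

prefers_a = 0
TRIALS = 100_000
for _ in range(TRIALS):
    reward = {s: random.random() for s in OPTIONS_A + OPTIONS_B}
    best_a = max(reward[s] for s in OPTIONS_A)   # value of keeping options
    best_b = max(reward[s] for s in OPTIONS_B)   # value of committing
    prefers_a += best_a > best_b

print(f"high-option action is optimal for {prefers_a / TRIALS:.1%} "
      f"of sampled reward functions")            # prints roughly 83%
```

With rewards drawn i.i.d. uniform, the option-keeping action wins with probability 5/6, matching the intuition that most reward functions favor states with more reachable options.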
arXiv Detail & Related papers (2022-06-27T17:39:23Z)
- On Avoiding Power-Seeking by Artificial Intelligence [93.9264437334683]
We do not know how to align a very intelligent AI agent's behavior with human interests.
I investigate whether we can build smart AI agents which have limited impact on the world, and which do not autonomously seek power.
arXiv Detail & Related papers (2022-06-23T16:56:21Z)
- An Algorithmic Theory of Metacognition in Minds and Machines [1.52292571922932]
We present an algorithmic theory of metacognition based on a well-understood trade-off in reinforcement learning.
We show how to create metacognition in machines by implementing a deep MAC.
arXiv Detail & Related papers (2021-11-05T22:31:09Z)
- Knowledge is reward: Learning optimal exploration by predictive reward cashing [5.279475826661643]
We exploit the inherent mathematical structure of Bayes-adaptive problems in order to dramatically simplify the problem.
The key to this simplification comes from the novel concept of cross-value.
This results in a new denser reward structure that "cashes in" all future rewards that can be predicted from the current information state.
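The cross-value construction itself is paper-specific and not reproduced here, but the "cashing in" flavor can be illustrated with a generic telescoping trick (an assumption for illustration only): pay, at each step, the change in the predicted future return given the current information state, so reward arrives the moment it becomes predictable rather than when it physically occurs.

```python
import random

# Loose sketch of "cashing in" predictable rewards (NOT the paper's
# cross-value construction; this telescoping decomposition is an assumed
# stand-in). A final bonus of 1 is paid with prior probability 0.5; a
# revealing observation arrives at step 2. The dense reward at each step
# is the change in the predicted return given the information state.
random.seed(0)

def predicted_return(info):
    # Expected future bonus given what is known so far.
    if info["signal"] is None:
        return 0.5
    return 1.0 if info["signal"] else 0.0

info = {"signal": None}
prev = 0.0
dense_rewards = []
for t in range(5):
    if t == 2:                         # the revealing observation arrives
        info["signal"] = random.random() < 0.5
    cur = predicted_return(info)
    dense_rewards.append(cur - prev)   # cash in newly predictable reward
    prev = cur

print(dense_rewards)       # prior cashed at t=0, correction at the reveal
print(sum(dense_rewards))  # equals the realized final bonus (0 or 1)
```

Summed over the episode, the cashed rewards telescope to the realized return, so this toy transformation is value-preserving while being much denser than the original sparse bonus.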
arXiv Detail & Related papers (2021-09-17T12:52:24Z)
- Explore and Control with Adversarial Surprise [78.41972292110967]
Reinforcement learning (RL) provides a framework for learning goal-directed policies given user-specified rewards.
We propose a new unsupervised RL technique based on an adversarial game which pits two policies against each other to compete over the amount of surprise an RL agent experiences.
We show that our method leads to the emergence of complex skills by exhibiting clear phase transitions.
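A minimal caricature of such an adversarial surprise game (a sketch under assumed simplifications, not the paper's method: the count-based density model, the softmax policies, and the turn-taking scheme are all invented for illustration) pits an explorer that seeks rarely visited states against a controller that retreats to familiar ones.

```python
import math
import random
from collections import Counter

# Toy two-policy surprise game in the spirit of Adversarial Surprise
# (NOT the paper's algorithm). Surprise of a state is its negative log
# frequency under a count-based density model; the explorer prefers
# high-surprise states, the controller low-surprise ones.
random.seed(0)
STATES = list(range(8))
visits = Counter({s: 1 for s in STATES})      # count-based density model

def surprise(s):
    total = sum(visits.values())
    return -math.log(visits[s] / total)       # rare states are surprising

def softmax_pick(scores, beta):
    weights = [math.exp(beta * v) for v in scores]
    return random.choices(STATES, weights=weights)[0]

for step in range(2000):
    if step % 2 == 0:                         # explorer's turn: seek surprise
        s = softmax_pick([surprise(x) for x in STATES], beta=+2.0)
    else:                                     # controller's turn: avoid it
        s = softmax_pick([surprise(x) for x in STATES], beta=-2.0)
    visits[s] += 1

print(visits.most_common(3))  # counts reflect the tug-of-war between
                              # piling onto familiar states and spreading out
```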
arXiv Detail & Related papers (2021-07-12T17:58:40Z)
- Learning Human Rewards by Inferring Their Latent Intelligence Levels in Multi-Agent Games: A Theory-of-Mind Approach with Application to Driving Data [18.750834997334664]
We argue that humans are boundedly rational and have different intelligence levels when reasoning about others' decision-making processes.
We propose a new multi-agent Inverse Reinforcement Learning framework that reasons about humans' latent intelligence levels during learning.
arXiv Detail & Related papers (2021-03-07T07:48:31Z)
- Explainability in Deep Reinforcement Learning [68.8204255655161]
We review recent work toward attaining Explainable Reinforcement Learning (XRL).
In critical situations where it is essential to justify and explain the agent's behaviour, better explainability and interpretability of RL models could help us gain scientific insight into the inner workings of what is still considered a black box.
arXiv Detail & Related papers (2020-08-15T10:11:42Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.