On the Convergence of Bounded Agents
- URL: http://arxiv.org/abs/2307.11044v1
- Date: Thu, 20 Jul 2023 17:27:29 GMT
- Title: On the Convergence of Bounded Agents
- Authors: David Abel, André Barreto, Hado van Hasselt, Benjamin Van Roy, Doina Precup, Satinder Singh
- Abstract summary: The paper proposes two complementary accounts of convergence for bounded agents. The first view says that a bounded agent has converged when the minimal number of states needed to describe the agent's future behavior cannot decrease.
The second view says that a bounded agent has converged just when the agent's performance only changes if the agent's internal state changes.
- Score: 80.67035535522777
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: When has an agent converged? Standard models of the reinforcement learning
problem give rise to a straightforward definition of convergence: An agent
converges when its behavior or performance in each environment state stops
changing. However, as we shift the focus of our learning problem from the
environment's state to the agent's state, the concept of an agent's convergence
becomes significantly less clear. In this paper, we propose two complementary
accounts of agent convergence in a framing of the reinforcement learning
problem that centers around bounded agents. The first view says that a bounded
agent has converged when the minimal number of states needed to describe the
agent's future behavior cannot decrease. The second view says that a bounded
agent has converged just when the agent's performance only changes if the
agent's internal state changes. We establish basic properties of these two
definitions, show that they accommodate typical views of convergence in
standard settings, and prove several facts about their nature and relationship.
We take these perspectives, definitions, and analysis to bring clarity to a
central idea of the field.
Related papers
- BET: Explaining Deep Reinforcement Learning through The Error-Prone Decisions [7.139669387895207]
We propose a novel self-interpretable structure, named Backbone Extract Tree (BET), to better explain the agent's behavior.
At a high level, BET hypothesizes that states in which the agent consistently executes uniform decisions exhibit a reduced propensity for errors.
We show BET's superiority over existing self-interpretable models in terms of explanation fidelity.
arXiv (2024-01-14T11:45:05Z)
- Byzantine-Resilient Decentralized Multi-Armed Bandits [25.499420566469098]
We develop an algorithm that fuses an information mixing step among agents with a truncation of inconsistent and extreme values.
This framework can be used to model attackers in computer networks, instigators of offensive content into recommender systems, or manipulators of financial markets.
arXiv (2023-10-11T09:09:50Z)
- On the Complexity of Multi-Agent Decision Making: From Learning in Games to Partial Monitoring [105.13668993076801]
A central problem in the theory of multi-agent reinforcement learning (MARL) is to understand what structural conditions and algorithmic principles lead to sample-efficient learning guarantees.
We study this question in a general framework for interactive decision making with multiple agents.
We show that characterizing the statistical complexity for multi-agent decision making is equivalent to characterizing the statistical complexity of single-agent decision making.
arXiv (2023-05-01T06:46:22Z)
- Formalizing the Problem of Side Effect Regularization [81.97441214404247]
We propose a formal criterion for side effect regularization via the assistance game framework.
In these games, the agent solves a partially observable Markov decision process.
We show that this POMDP is solved by trading off the proxy reward with the agent's ability to achieve a range of future tasks.
arXiv (2022-06-23T16:36:13Z)
- Distributed Adaptive Learning Under Communication Constraints [54.22472738551687]
This work examines adaptive distributed learning strategies designed to operate under communication constraints.
We consider a network of agents that must solve an online optimization problem from continual observation of streaming data.
arXiv (2021-12-03T19:23:48Z)
- Robust Allocations with Diversity Constraints [65.3799850959513]
We show that the Nash Welfare rule that maximizes product of agent values is uniquely positioned to be robust when diversity constraints are introduced.
We also show that the guarantees achieved by Nash Welfare are nearly optimal within a widely studied class of allocation rules.
arXiv (2021-09-30T11:09:31Z)
- AgentFormer: Agent-Aware Transformers for Socio-Temporal Multi-Agent Forecasting [25.151713845738335]
We propose a new Transformer, AgentFormer, that jointly models the time and social dimensions.
Based on AgentFormer, we propose a multi-agent trajectory prediction model that can attend to features of any agent at any previous timestep.
Our method significantly improves the state of the art on well-established pedestrian and autonomous driving datasets.
arXiv (2021-03-25T17:59:01Z)
- A New Bandit Setting Balancing Information from State Evolution and Corrupted Context [52.67844649650687]
We propose a new sequential decision-making setting combining key aspects of two established online learning problems with bandit feedback.
The optimal action to play at any given moment is contingent on an underlying changing state which is not directly observable by the agent.
We present an algorithm that uses a referee to dynamically combine the policies of a contextual bandit and a multi-armed bandit.
arXiv (2020-11-16T14:35:37Z)
- Performance of Bounded-Rational Agents With the Ability to Self-Modify [1.933681537640272]
Self-modification of agents embedded in complex environments is hard to avoid.
It has been argued that intelligent agents have an incentive to avoid modifying their utility function so that their future instances work towards the same goals.
We show that this result is no longer true for agents with bounded rationality.
arXiv (2020-11-12T09:25:08Z)
This list is automatically generated from the titles and abstracts of the papers in this site.