Modeling the Mistakes of Boundedly Rational Agents Within a Bayesian
Theory of Mind
- URL: http://arxiv.org/abs/2106.13249v1
- Date: Thu, 24 Jun 2021 18:00:03 GMT
- Title: Modeling the Mistakes of Boundedly Rational Agents Within a Bayesian
Theory of Mind
- Authors: Arwa Alanqary, Gloria Z. Lin, Joie Le, Tan Zhi-Xuan, Vikash K.
Mansinghka, Joshua B. Tenenbaum
- Abstract summary: We extend the Bayesian Theory of Mind framework to model boundedly rational agents who may have mistaken goals, plans, and actions.
We present experiments eliciting human goal inferences in two domains: (i) a gridworld puzzle with gems locked behind doors, and (ii) a block-stacking domain.
- Score: 32.66203057545608
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: When inferring the goals that others are trying to achieve, people
intuitively understand that others might make mistakes along the way. This is
crucial for activities such as teaching, offering assistance, and deciding
between blame and forgiveness. However, Bayesian models of theory of mind have
generally not accounted for these mistakes, instead modeling agents as mostly
optimal in achieving their goals. As a result, they are unable to explain
phenomena like locking oneself out of one's house, or losing a game of chess.
Here, we extend the Bayesian Theory of Mind framework to model boundedly
rational agents who may have mistaken goals, plans, and actions. We formalize
this by modeling agents as probabilistic programs, where goals may be confused
with semantically similar states, plans may be misguided due to
resource-bounded planning, and actions may be unintended due to execution
errors. We present experiments eliciting human goal inferences in two domains:
(i) a gridworld puzzle with gems locked behind doors, and (ii) a block-stacking
domain. Our model better explains human inferences than alternatives, while
generalizing across domains. These findings indicate the importance of modeling
others as bounded agents, in order to account for the full richness of human
intuitive psychology.
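To make the generative model above concrete, here is a minimal Python sketch of a boundedly rational agent with the three error sources named in the abstract: confused goals, resource-bounded planning, and noisy execution. This is an illustration only, not the authors' implementation; the environment interface (env.actions, env.step, env.heuristic, env.satisfies) and all parameters (p_confuse, max_nodes, p_slip) are hypothetical placeholders. Goal inference would invert this process, conditioning on an observed trajectory to score candidate goals.

```python
import random

# Minimal sketch (not the authors' implementation) of a boundedly rational
# agent as a generative process with three error sources:
#   (1) goal confusion, (2) resource-bounded planning, (3) execution noise.
# The `env` object and all parameters below are hypothetical placeholders.

def sample_goal(true_goal, similar_goals, p_confuse=0.1):
    """The intended goal may be confused with a semantically similar one."""
    if similar_goals and random.random() < p_confuse:
        return random.choice(similar_goals)
    return true_goal

def bounded_plan(state, goal, env, max_nodes=200):
    """Greedy best-first search with a node budget: if the budget runs out,
    return the partial plan of the last node expanded (possibly misguided)."""
    frontier = [(env.heuristic(state, goal), state, [])]
    plan = []
    for _ in range(max_nodes):
        if not frontier:
            break
        frontier.sort(key=lambda node: node[0])
        _, s, plan = frontier.pop(0)
        if env.satisfies(s, goal):
            return plan
        for a in env.actions(s):
            s_next = env.step(s, a)
            frontier.append((env.heuristic(s_next, goal), s_next, plan + [a]))
    return plan  # truncated or suboptimal plan due to the resource bound

def execute(plan, state, env, p_slip=0.05):
    """Each intended action may slip to a random available action."""
    trajectory = [state]
    for a in plan:
        if random.random() < p_slip:
            a = random.choice(env.actions(state))  # unintended action
        state = env.step(state, a)
        trajectory.append(state)
    return trajectory
```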
Related papers
- Infinite Ends from Finite Samples: Open-Ended Goal Inference as Top-Down Bayesian Filtering of Bottom-Up Proposals [48.437581268398866]
We introduce a sequential Monte Carlo model of open-ended goal inference.
We validate this model in a goal inference task called Block Words.
Our experiments highlight the importance of uniting top-down and bottom-up models for explaining the speed, accuracy, and generality of human theory-of-mind.
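(A generic particle-filter sketch of this style of sequential goal inference appears after this list.)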
arXiv Detail & Related papers (2024-07-23T18:04:40Z)
- Evaluating the World Model Implicit in a Generative Model [7.317896355747284]
Recent work suggests that large language models may implicitly learn world models.
Such world models span problems as diverse as simple logical reasoning, geographic navigation, game-playing, and chemistry.
We propose new evaluation metrics for world model recovery inspired by the classic Myhill-Nerode theorem from language theory.
arXiv Detail & Related papers (2024-06-06T02:20:31Z)
- The Neuro-Symbolic Inverse Planning Engine (NIPE): Modeling Probabilistic Social Inferences from Linguistic Inputs [50.32802502923367]
We study how language drives and influences social reasoning in a probabilistic goal inference domain.
We propose a neuro-symbolic model that carries out goal inference from linguistic inputs of agent scenarios.
Our model closely matches human response patterns and better predicts human judgements than using an LLM alone.
arXiv Detail & Related papers (2023-06-25T19:38:01Z)
- Evaluating Superhuman Models with Consistency Checks [14.04919745612553]
We propose a framework for evaluating superhuman models via consistency checks.
We instantiate our framework on three tasks where correctness of decisions is hard to evaluate.
arXiv Detail & Related papers (2023-06-16T17:26:38Z)
- On the Sensitivity of Reward Inference to Misspecified Human Models [27.94055657571769]
Inferring reward functions from human behavior is at the center of value alignment - aligning AI objectives with what we, humans, actually want.
This raises the question: how accurate do these human models need to be for the reward inference to be accurate?
We show that it is unfortunately possible to construct small adversarial biases in behavior that lead to arbitrarily large errors in the inferred reward.
arXiv Detail & Related papers (2022-12-09T08:16:20Z)
- Self-Explaining Deviations for Coordination [31.94421561348329]
We focus on a specific subclass of coordination problems in which humans are able to discover self-explaining deviations (SEDs).
SEDs are actions that deviate from the common understanding of what reasonable behavior would be in normal circumstances.
We introduce a novel algorithm, improvement maximizing self-explaining deviations (IMPROVISED), to perform SEDs.
arXiv Detail & Related papers (2022-07-13T20:56:59Z)
- Safe Learning of Lifted Action Models [46.65973550325976]
We propose an algorithm for solving the model-free planning problem in classical planning.
The number of trajectories needed to solve future problems with high probability is linear in the potential size of the domain model.
arXiv Detail & Related papers (2021-07-09T01:24:01Z)
- What can I do here? A Theory of Affordances in Reinforcement Learning [65.70524105802156]
We develop a theory of affordances for agents who learn and plan in Markov Decision Processes.
Affordances play a dual role in this case, by reducing the number of actions available in any given situation.
We propose an approach to learn affordances and use it to estimate transition models that are simpler and generalize better.
arXiv Detail & Related papers (2020-06-26T16:34:53Z)
- Machine Common Sense [77.34726150561087]
Machine common sense remains a broad, potentially unbounded problem in artificial intelligence (AI).
This article deals with aspects of modeling commonsense reasoning, focusing on domains such as interpersonal interactions.
arXiv Detail & Related papers (2020-06-15T13:59:47Z)
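As a companion to the "Infinite Ends from Finite Samples" entry above, here is a generic sequential Monte Carlo (particle filter) sketch of goal inference from observed actions. It is not that paper's model, which filters bottom-up goal proposals top-down; the goal_prior and action_likelihood callables and all parameters are hypothetical assumptions, and goal hypotheses are assumed hashable.

```python
import random

# Generic particle-filter sketch of sequential goal inference (an assumption-
# laden illustration, not any specific paper's model).

def smc_goal_inference(observed_actions, states, goal_prior, action_likelihood,
                       n_particles=100, ess_threshold=0.5):
    """Maintain a weighted set of goal hypotheses, reweighting them as each
    observed action arrives and resampling when the effective sample size drops."""
    particles = [goal_prior() for _ in range(n_particles)]
    weights = [1.0 / n_particles] * n_particles
    for state, action in zip(states, observed_actions):
        # Reweight each goal hypothesis by how well it explains the action.
        weights = [w * action_likelihood(action, state, g)
                   for w, g in zip(weights, particles)]
        total = sum(weights) or 1e-12
        weights = [w / total for w in weights]
        # Resample if the effective sample size falls below the threshold.
        ess = 1.0 / sum(w * w for w in weights)
        if ess < ess_threshold * n_particles:
            particles = random.choices(particles, weights=weights, k=n_particles)
            weights = [1.0 / n_particles] * n_particles
    # Aggregate particle weights into a posterior over goal hypotheses.
    posterior = {}
    for g, w in zip(particles, weights):
        posterior[g] = posterior.get(g, 0.0) + w
    return posterior
```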