Learning Human Rewards by Inferring Their Latent Intelligence Levels in
Multi-Agent Games: A Theory-of-Mind Approach with Application to Driving Data
- URL: http://arxiv.org/abs/2103.04289v1
- Date: Sun, 7 Mar 2021 07:48:31 GMT
- Title: Learning Human Rewards by Inferring Their Latent Intelligence Levels in
Multi-Agent Games: A Theory-of-Mind Approach with Application to Driving Data
- Authors: Ran Tian, Masayoshi Tomizuka, and Liting Sun
- Abstract summary: We argue that humans are boundedly rational and have different intelligence levels when reasoning about others' decision-making processes.
We propose a new multi-agent Inverse Reinforcement Learning framework that reasons about humans' latent intelligence levels during learning.
- Score: 18.750834997334664
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: A reward function, as an incentive representation that recognizes humans'
agency and rationalizes their actions, is particularly appealing for modeling
human behavior in human-robot interaction. Inverse Reinforcement Learning is an
effective way to retrieve reward functions from demonstrations. However,
applying it to multi-agent settings has always been challenging, since the
mutual influence between agents must be appropriately modeled. To tackle this
challenge, previous work either exploits equilibrium solution concepts,
assuming humans are perfectly rational optimizers with unbounded intelligence,
or pre-assigns humans' interaction strategies a priori. In this work, we advocate
that humans are boundedly rational and have different intelligence levels when
reasoning about others' decision-making processes, and that this inherent and
latent characteristic should be accounted for in reward learning algorithms.
Hence, we exploit these insights from Theory-of-Mind and propose a new
multi-agent Inverse Reinforcement Learning framework that reasons about humans'
latent intelligence levels during learning. We validate our approach in both
zero-sum and general-sum games with synthetic agents and illustrate a practical
application to learning human drivers' reward functions from real driving data.
We compare our approach with two baseline algorithms. The results show that,
by reasoning about humans' latent intelligence levels, the proposed approach
is more flexible and better able to retrieve reward functions that explain
humans' driving behavior.
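As a rough illustration of the core idea (not the authors' implementation), the sketch below combines level-k reasoning, a standard Theory-of-Mind model of bounded rationality, with a Bayesian posterior over a human's latent intelligence level in a toy two-player matrix game. All names and the Boltzmann (softmax) choice model are assumptions made for this sketch; in the paper's setting, the reward parameters themselves would additionally be learned from demonstrations while marginalizing over the latent levels.

```python
import numpy as np

# Illustrative sketch only (not the paper's code): level-k reasoning with a
# posterior over a human's latent intelligence level in a 2-player matrix game.

def softmax(x, beta=5.0):
    """Boltzmann-rational choice distribution over expected payoffs."""
    z = beta * (x - x.max())
    e = np.exp(z)
    return e / e.sum()

def level_k_policies(payoff_h, payoff_r, n_levels=2, beta=5.0):
    """Compute level-0..level-n_levels policies for both players.

    payoff_h[i, j]: human's reward for human action i vs. robot action j.
    payoff_r[i, j]: robot's reward for the same joint action.
    Level-0 acts uniformly; level-k softmax-best-responds to level-(k-1).
    """
    n_h, n_r = payoff_h.shape
    h_pols = [np.ones(n_h) / n_h]  # human level-0: uniform
    r_pols = [np.ones(n_r) / n_r]  # robot level-0: uniform
    for k in range(1, n_levels + 1):
        # Level-k human noisily best-responds to the level-(k-1) robot.
        h_pols.append(softmax(payoff_h @ r_pols[k - 1], beta))
        # Level-k robot noisily best-responds to the level-(k-1) human.
        r_pols.append(softmax(payoff_r.T @ h_pols[k - 1], beta))
    return h_pols, r_pols

def posterior_over_levels(h_pols, observed_actions, prior=None):
    """Bayesian update of the belief over the human's latent level
    from a sequence of observed human actions."""
    n = len(h_pols)
    prior = np.full(n, 1.0 / n) if prior is None else np.asarray(prior, float)
    log_post = np.log(prior)
    for a in observed_actions:
        log_post += np.log(np.array([pol[a] for pol in h_pols]) + 1e-12)
    post = np.exp(log_post - log_post.max())
    return post / post.sum()

# Toy usage: a 2x2 general-sum game and three observed human actions.
payoff_h = np.array([[3.0, 0.0], [5.0, 1.0]])
payoff_r = np.array([[3.0, 5.0], [0.0, 1.0]])
h_pols, _ = level_k_policies(payoff_h, payoff_r, n_levels=2)
print(posterior_over_levels(h_pols, observed_actions=[1, 1, 0]))
```

In a full IRL pipeline, the matrix payoffs would be replaced by parameterized trajectory rewards, and this posterior over levels would be folded into the demonstration likelihood that the reward parameters are trained to maximize.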
Related papers
- Learning to Assist Humans without Inferring Rewards [65.28156318196397]
We build upon prior work that studies assistance through the lens of empowerment.
An assistive agent aims to maximize the influence of the human's actions.
We prove that these representations estimate a similar notion of empowerment to that studied by prior work.
arXiv Detail & Related papers (2024-11-04T21:31:04Z)
- The Role of Higher-Order Cognitive Models in Active Learning [8.847360368647752]
We advocate for a new paradigm for active learning for human feedback.
We discuss how an increasing level of agency results in qualitatively different forms of rational communication between an active learning system and a teacher.
arXiv Detail & Related papers (2024-01-09T07:39:36Z)
- Interactive Autonomous Navigation with Internal State Inference and Interactivity Estimation [58.21683603243387]
We propose three auxiliary tasks with relational-temporal reasoning and integrate them into the standard deep learning framework.
These auxiliary tasks provide additional supervision signals to infer the behavior patterns of other interactive agents.
Our approach achieves robust and state-of-the-art performance in terms of standard evaluation metrics.
arXiv Detail & Related papers (2023-11-27T18:57:42Z)
- Neural Amortized Inference for Nested Multi-agent Reasoning [54.39127942041582]
We propose a novel approach to bridge the gap between human-like inference capabilities and computational limitations.
We evaluate our method in two challenging multi-agent interaction domains.
arXiv Detail & Related papers (2023-08-21T22:40:36Z)
- Improving Multimodal Interactive Agents with Reinforcement Learning from Human Feedback [16.268581985382433]
An important goal in artificial intelligence is to create agents that can both interact naturally with humans and learn from their feedback.
Here we demonstrate how to use reinforcement learning from human feedback to improve simulated, embodied agents.
arXiv Detail & Related papers (2022-11-21T16:00:31Z)
- Contrastive Active Inference [12.361539023886161]
We propose a contrastive objective for active inference that reduces the computational burden in learning the agent's generative model and planning future actions.
Our method performs notably better than likelihood-based active inference in image-based tasks, while also being computationally cheaper and easier to train.
arXiv Detail & Related papers (2021-10-19T16:20:49Z)
- Backprop-Free Reinforcement Learning with Active Neural Generative Coding [84.11376568625353]
We propose a computational framework for learning action-driven generative models without backpropagation of errors (backprop) in dynamic environments.
We develop an intelligent agent that operates even with sparse rewards, drawing inspiration from the cognitive theory of planning as inference.
The robust performance of our agent offers promising evidence that a backprop-free approach for neural inference and learning can drive goal-directed behavior.
arXiv Detail & Related papers (2021-07-10T19:02:27Z)
- PEBBLE: Feedback-Efficient Interactive Reinforcement Learning via Relabeling Experience and Unsupervised Pre-training [94.87393610927812]
We present an off-policy, interactive reinforcement learning algorithm that capitalizes on the strengths of both feedback and off-policy learning.
We demonstrate that our approach is capable of learning tasks of higher complexity than previously considered by human-in-the-loop methods.
arXiv Detail & Related papers (2021-06-09T14:10:50Z)
- AGENT: A Benchmark for Core Psychological Reasoning [60.35621718321559]
Intuitive psychology is the ability to reason about hidden mental variables that drive observable actions.
Despite recent interest in machine agents that reason about other agents, it is not clear if such agents learn or hold the core psychology principles that drive human reasoning.
We present a benchmark consisting of procedurally generated 3D animations, AGENT, structured around four scenarios.
arXiv Detail & Related papers (2021-02-24T14:58:23Z)
- Imitating Interactive Intelligence [24.95842455898523]
We study how to design artificial agents that can interact naturally with humans, using a simplified virtual environment.
To build agents that can robustly interact with humans, we would ideally train them while they interact with humans.
We use ideas from inverse reinforcement learning to reduce the disparities between human-human and agent-agent interactive behaviour.
arXiv Detail & Related papers (2020-12-10T13:55:47Z)
- A New Framework for Query Efficient Active Imitation Learning [5.167794607251493]
There is a human expert who knows the rewards and unsafe states based on their preferences and objectives, but querying that expert is expensive.
We propose a new framework for imitation learning (IL) that actively and interactively learns a model of the user's reward function with efficient queries.
We evaluate the proposed method with a simulated human on a state-based 2D navigation task, robotic control tasks, and image-based video games.
arXiv Detail & Related papers (2019-12-30T18:12:27Z)
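The entry above only summarizes the idea. As a hedged illustration of one common way to make expert queries efficient (an assumption for this sketch, not necessarily that paper's mechanism), the snippet below queries the expensive expert only at states where an ensemble of candidate reward models disagrees most:

```python
import numpy as np

# Illustrative sketch only: pick query states by ensemble disagreement,
# a cheap proxy for epistemic uncertainty about the expert's reward.

rng = np.random.default_rng(0)

def ensemble_disagreement(reward_models, state):
    """Std. dev. of predicted rewards across the ensemble at one state."""
    preds = np.array([m(state) for m in reward_models])
    return preds.std()

def select_queries(reward_models, candidate_states, budget):
    """Pick the `budget` candidate states the ensemble is least certain about."""
    scores = [ensemble_disagreement(reward_models, s) for s in candidate_states]
    top = np.argsort(scores)[::-1][:budget]
    return [candidate_states[i] for i in top]

# Toy usage: three linear "reward models" with slightly different weights.
weights = [rng.normal(size=4) for _ in range(3)]
reward_models = [lambda s, w=w: float(w @ s) for w in weights]
candidates = [rng.normal(size=4) for _ in range(20)]
print(select_queries(reward_models, candidates, budget=3))
```

Disagreement across an ensemble is only one acquisition criterion; the cited framework may select queries differently.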
This list is automatically generated from the titles and abstracts of the papers listed on this site.