Pragmatically Learning from Pedagogical Demonstrations in Multi-Goal Environments
- URL: http://arxiv.org/abs/2206.04546v3
- Date: Wed, 27 Sep 2023 07:49:50 GMT
- Title: Pragmatically Learning from Pedagogical Demonstrations in Multi-Goal Environments
- Authors: Hugo Caselles-Dupré, Olivier Sigaud, Mohamed Chetouani
- Abstract summary: We implement pedagogy and pragmatism mechanisms by leveraging a Bayesian model of Goal Inference from demonstrations (BGI).
We show that combining BGI-agents (a pedagogical teacher and a pragmatic learner) results in faster learning and reduced goal ambiguity over standard learning from demonstrations.
- Score: 8.715518445626826
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Learning from demonstration methods usually leverage close to optimal
demonstrations to accelerate training. By contrast, when demonstrating a task,
human teachers deviate from optimal demonstrations and pedagogically modify
their behavior by giving demonstrations that best disambiguate the goal they
want to demonstrate. Analogously, human learners excel at pragmatically
inferring the intent of the teacher, facilitating communication between the two
agents. These mechanisms are critical in the few demonstrations regime, where
inferring the goal is more difficult. In this paper, we implement pedagogy and
pragmatism mechanisms by leveraging a Bayesian model of Goal Inference from
demonstrations (BGI). We highlight the benefits of this model in multi-goal
teacher-learner setups with two artificial agents that learn with
goal-conditioned Reinforcement Learning. We show that combining BGI-agents (a
pedagogical teacher and a pragmatic learner) results in faster learning and
reduced goal ambiguity over standard learning from demonstrations, especially
in the few demonstrations regime. We provide the code for our experiments
(https://github.com/Caselles/NeurIPS22-demonstrations-pedagogy-pragmatism), as
well as an illustrative video explaining our approach
(https://youtu.be/V4n16IjkNyw).
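To make the mechanism concrete, below is a minimal, hypothetical sketch of the Bayesian goal-inference idea with a pedagogical teacher and a pragmatic learner. It is not the authors' implementation (the linked repository contains that): goals and demonstrations are reduced to toy likelihood vectors, and all function names are illustrative.

```python
import numpy as np

def goal_posterior(demo_likelihoods, prior):
    """Bayesian Goal Inference (BGI): posterior over goals given a demo.

    demo_likelihoods[g] stands in for P(demo | goal=g); in the paper this
    would come from learned goal-conditioned policies, not fixed vectors.
    """
    unnormalized = demo_likelihoods * prior
    return unnormalized / unnormalized.sum()

def pedagogical_demo(candidate_demos, intended_goal, prior):
    """Teacher side: choose the candidate demonstration that best
    disambiguates the intended goal, i.e. maximizes the learner's
    posterior probability of that goal."""
    scores = [goal_posterior(lik, prior)[intended_goal]
              for lik in candidate_demos]
    return int(np.argmax(scores))

def pragmatic_inference(demo_likelihoods, prior):
    """Learner side: infer the teacher's goal as the posterior argmax,
    trusting that the demonstration was chosen pedagogically."""
    return int(np.argmax(goal_posterior(demo_likelihoods, prior)))

# Toy example: 3 goals, 2 candidate demos described by P(demo | goal).
prior = np.ones(3) / 3
demos = [np.array([0.5, 0.4, 0.1]),   # ambiguous between goals 0 and 1
         np.array([0.6, 0.1, 0.1])]   # clearly indicates goal 0
best = pedagogical_demo(demos, intended_goal=0, prior=prior)
print(best, pragmatic_inference(demos[best], prior))  # -> 1 0
```

In the paper itself, both agents learn with goal-conditioned reinforcement learning and the likelihoods are derived from learned policies rather than fixed vectors; see the repository above for the actual implementation.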
Related papers
- AdaDemo: Data-Efficient Demonstration Expansion for Generalist Robotic Agent [75.91274222142079]
In this study, we aim to scale up demonstrations in a data-efficient way to facilitate the learning of generalist robotic agents.
AdaDemo is a framework designed to improve multi-task policy learning by actively and continually expanding the demonstration dataset.
arXiv Detail & Related papers (2024-04-11T01:59:29Z)
- Skill Disentanglement for Imitation Learning from Suboptimal Demonstrations [60.241144377865716]
We consider the imitation of sub-optimal demonstrations, with both a small clean demonstration set and a large noisy set.
We propose a method that evaluates and imitates at the sub-demonstration level, encoding action primitives of varying quality into different skills.
arXiv Detail & Related papers (2023-06-13T17:24:37Z)
- Learning Complicated Manipulation Skills via Deterministic Policy with Limited Demonstrations [9.640594614636049]
Deep reinforcement learning can efficiently develop policies for manipulators.
However, it takes time to collect sufficient high-quality demonstrations in practice.
Moreover, human demonstrations may be unsuitable for robots.
arXiv Detail & Related papers (2023-03-29T05:56:44Z)
- Boosting Reinforcement Learning and Planning with Demonstrations: A Survey [25.847796336059343]
We discuss the advantages of using demonstrations in sequential decision making.
We exemplify a practical pipeline for generating and utilizing demonstrations in the recently proposed ManiSkill robot learning benchmark.
arXiv Detail & Related papers (2023-03-23T17:53:44Z)
- Out-of-Dynamics Imitation Learning from Multimodal Demonstrations [68.46458026983409]
We study out-of-dynamics imitation learning (OOD-IL), which relaxes the standard assumption of shared dynamics, requiring only that the demonstrator and the imitator have the same state space.
OOD-IL enables imitation learning to utilize demonstrations from a wide range of demonstrators but introduces a new challenge.
We develop a better transferability measurement to tackle this newly emerged challenge.
arXiv Detail & Related papers (2022-11-13T07:45:06Z)
- Robustness of Demonstration-based Learning Under Limited Data Scenario [54.912936555876826]
Demonstration-based learning has shown great potential in stimulating pretrained language models' ability under limited-data scenarios.
Why such demonstrations are beneficial for the learning process remains unclear since there is no explicit alignment between the demonstrations and the predictions.
In this paper, we design pathological demonstrations by gradually removing intuitively useful information from the standard ones to take a deep dive into the robustness of demonstration-based sequence labeling.
arXiv Detail & Related papers (2022-10-19T16:15:04Z)
- Pedagogical Demonstrations and Pragmatic Learning in Artificial Tutor-Learner Interactions [8.715518445626826]
In this paper, we investigate the implementation of such mechanisms in a tutor-learner setup where both participants are artificial agents in an environment with multiple goals.
Using pedagogy from the tutor and pragmatism from the learner, we show substantial improvements over standard learning from demonstrations.
arXiv Detail & Related papers (2022-02-28T21:57:50Z)
- Rethinking the Role of Demonstrations: What Makes In-Context Learning Work? [112.72413411257662]
Large language models (LMs) are able to in-context learn by conditioning on a few input-label pairs (demonstrations) and making predictions for new inputs.
We show that ground truth demonstrations are in fact not required -- randomly replacing labels in the demonstrations barely hurts performance.
We find that other aspects of the demonstrations are the key drivers of end task performance.
arXiv Detail & Related papers (2022-02-25T17:25:19Z)
- Learning Feasibility to Imitate Demonstrators with Different Dynamics [23.239058855103067]
The goal of learning from demonstrations is to learn a policy for an agent (imitator) by mimicking the behavior in the demonstrations.
We learn a feasibility metric that captures the likelihood that a demonstration is feasible for the imitator.
Our experiments on four simulated environments and on a real robot show that the policy learned with our approach achieves a higher expected return than prior works.
arXiv Detail & Related papers (2021-10-28T14:15:47Z)
- Learning from Imperfect Demonstrations from Agents with Varying Dynamics [29.94164262533282]
We develop a metric composed of a feasibility score and an optimality score to measure how useful a demonstration is for imitation learning.
Our experiments on four environments in simulation and on a real robot show improved learned policies with higher expected return.
arXiv Detail & Related papers (2021-03-10T07:39:38Z)
- State-Only Imitation Learning for Dexterous Manipulation [63.03621861920732]
In this paper, we explore state-only imitation learning.
We train an inverse dynamics model and use it to predict actions for state-only demonstrations.
Our method performs on par with state-action approaches and considerably outperforms RL alone.
arXiv Detail & Related papers (2020-04-07T17:57:20Z)
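The last entry above hinges on an inverse dynamics model that maps a pair of consecutive states to the action connecting them, so that state-only demonstrations can be labeled with actions and then imitated. The sketch below is a deliberately simplified, hypothetical illustration of that idea, not the paper's implementation: the dynamics are linear, the model is fit by least squares, and all names are illustrative. A real implementation would train a neural network on transitions collected by the imitator itself.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy linear dynamics: s' = s + B @ a, with B unknown to the learner.
B = rng.normal(size=(4, 2))
states = rng.normal(size=(500, 4))
actions = rng.normal(size=(500, 2))
next_states = states + actions @ B.T

# Fit a linear inverse dynamics model a ~= [s, s'] @ W by least squares.
X = np.hstack([states, next_states])              # shape (500, 8)
W, *_ = np.linalg.lstsq(X, actions, rcond=None)   # shape (8, 2)

# A state-only demonstration: a trajectory generated by the same
# dynamics, with the actions withheld from the learner.
demo_actions = rng.normal(size=(19, 2))
demo_states = np.zeros((20, 4))
for t, a in enumerate(demo_actions):
    demo_states[t + 1] = demo_states[t] + B @ a

# Label consecutive state pairs with predicted actions; these labels
# could then be used for behavior cloning.
pairs = np.hstack([demo_states[:-1], demo_states[1:]])
predicted = pairs @ W
print(np.abs(predicted - demo_actions).max())  # close to 0: actions recovered
```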