Self-play for Data Efficient Language Acquisition
- URL: http://arxiv.org/abs/2010.04872v1
- Date: Sat, 10 Oct 2020 02:09:19 GMT
- Title: Self-play for Data Efficient Language Acquisition
- Authors: Charles Lovering and Ellie Pavlick
- Abstract summary: We exploit the symmetric nature of communication in order to improve the efficiency and quality of language acquisition in learning agents.
We show that using self-play as a substitute for direct supervision enables the agent to transfer its knowledge across roles.
- Score: 20.86261546611472
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: When communicating, people behave consistently across conversational roles:
People understand the words they say and are able to produce the words they
hear. To date, artificial agents developed for language tasks have lacked such
symmetry, meaning agents trained to produce language are unable to understand
it and vice-versa. In this work, we exploit the symmetric nature of
communication in order to improve both the efficiency and quality of language
acquisition in learning agents. Specifically, we consider the setting in which
an agent must learn to both understand and generate words in an existing
language, but with the assumption that access to interaction with "oracle"
speakers of the language is very limited. We show that using self-play as a
substitute for direct supervision enables the agent to transfer its knowledge
across roles (e.g. training as a listener but testing as a speaker) and make
better inferences about the ground truth lexicon using only a handful of
interactions with the oracle.
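To make the setup concrete, here is a minimal sketch of the idea (assumed details only, not the paper's actual models or API): a single lexicon matrix is shared between the listener and speaker roles, so a handful of supervised oracle interactions on the listener side, followed by self-play, let the agent be tested as a speaker.

```python
import numpy as np

rng = np.random.default_rng(0)
N_WORDS = N_OBJECTS = 8  # hypothetical vocabulary / object-set sizes

# A single lexicon matrix scores word-object pairs. Sharing it between the
# listener and speaker roles is what lets knowledge transfer across roles.
logits = np.zeros((N_WORDS, N_OBJECTS))

def softmax(x):
    z = np.exp(x - x.max())
    return z / z.sum()

def speak(obj):
    # Speaker role: sample a word for an object.
    return rng.choice(N_WORDS, p=softmax(logits[:, obj]))

def listen(word):
    # Listener role: sample an object for a word.
    return rng.choice(N_OBJECTS, p=softmax(logits[word]))

# 1) A handful of oracle interactions: direct supervision in the listener role.
oracle = rng.permutation(N_OBJECTS)             # hypothetical ground-truth lexicon
seeded = rng.choice(N_WORDS, size=3, replace=False)
for w in seeded:
    logits[w, oracle[w]] += 2.0

# 2) Self-play as a substitute for further supervision: the agent plays both
# roles against itself and reinforces self-consistent word-object pairings.
for _ in range(5000):
    obj = rng.integers(N_OBJECTS)
    word = speak(obj)
    if listen(word) == obj:                     # communication succeeded
        logits[word, obj] += 0.1

# Supervised only as a listener, the agent can now be tested as a speaker:
# does it produce the seeded word for the corresponding oracle object?
print({int(w): bool(np.argmax(logits[:, oracle[w]]) == w) for w in seeded})
```

The shared `logits` matrix is the crux of the sketch: any update made while listening is immediately usable when speaking, which is the cross-role transfer the abstract describes.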
Related papers
- Interactive Dialogue Agents via Reinforcement Learning on Hindsight Regenerations [58.65755268815283]
Many real dialogues are interactive, meaning an agent's utterances will influence their conversational partner, elicit information, or change their opinion.
We use this fact to rewrite and augment existing suboptimal data, and train via offline reinforcement learning (RL) an agent that outperforms both prompting and learning from unaltered human demonstrations.
Our results in a user study with real humans show that our approach greatly outperforms existing state-of-the-art dialogue agents.
arXiv Detail & Related papers (2024-11-07T21:37:51Z)
- Speaking the Language of Your Listener: Audience-Aware Adaptation via Plug-and-Play Theory of Mind [4.052000839878213]
We model a visually grounded referential game between a knowledgeable speaker and a listener with more limited visual and linguistic experience.
We endow our speaker with the ability to adapt its referring expressions via a simulation module that monitors the effectiveness of planned utterances from the listener's perspective.
arXiv Detail & Related papers (2023-05-31T15:17:28Z)
- Transforming Human-Centered AI Collaboration: Redefining Embodied Agents Capabilities through Interactive Grounded Language Instructions [23.318236094953072]
Human intelligence's adaptability is remarkable, allowing us to adjust to new tasks and multi-modal environments swiftly.
The research community is actively pursuing the development of interactive "embodied agents".
These agents must be able to promptly request feedback when communication breaks down or instructions are unclear.
arXiv Detail & Related papers (2023-05-18T07:51:33Z)
- Computational Language Acquisition with Theory of Mind [84.2267302901888]
We build language-learning agents equipped with Theory of Mind (ToM) and measure its effects on the learning process.
We find that training speakers with a highly weighted ToM listener component leads to performance gains in our image referential game setting.
arXiv Detail & Related papers (2023-03-02T18:59:46Z)
- Communication Drives the Emergence of Language Universals in Neural Agents: Evidence from the Word-order/Case-marking Trade-off [3.631024220680066]
We propose a new Neural-agent Language Learning and Communication framework (NeLLCom) where pairs of speaking and listening agents first learn a miniature language.
We succeed in replicating the trade-off with the new framework without hard-coding specific biases in the agents.
arXiv Detail & Related papers (2023-01-30T17:22:33Z)
- "No, to the Right" -- Online Language Corrections for Robotic Manipulation via Shared Autonomy [70.45420918526926]
We present LILAC, a framework for incorporating and adapting to natural language corrections online during execution.
Instead of discrete turn-taking between a human and robot, LILAC splits agency between the human and robot.
We show that our corrections-aware approach obtains higher task completion rates, and is subjectively preferred by users.
arXiv Detail & Related papers (2023-01-06T15:03:27Z)
- Few-shot Language Coordination by Modeling Theory of Mind [95.54446989205117]
We study the task of few-shot language coordination.
We require the lead agent to coordinate with a population of agents with different linguistic abilities.
This requires the ability to model the partner's beliefs, a vital component of human communication.
arXiv Detail & Related papers (2021-07-12T19:26:11Z)
- On the interaction between supervision and self-play in emergent communication [82.290338507106]
We investigate the relationship between two categories of learning signals with the ultimate goal of improving sample efficiency.
We find that first training agents via supervised learning on human data and then continuing with self-play outperforms the reverse order (the two schedules are sketched below).
arXiv Detail & Related papers (2020-02-04T02:35:19Z)
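As an illustration of the two training schedules that summary compares, here is a toy sketch with hypothetical stand-in hooks (`supervised_update` and `selfplay_update` are placeholders, not that paper's API):

```python
class Agent:
    """Toy stand-in; a real agent would update model parameters in these hooks."""
    def supervised_update(self, utterance, meaning):
        pass  # imitate one human (utterance, meaning) pair

    def selfplay_update(self):
        pass  # reinforce communicative success in a round of self-play

def train(agent, human_data, selfplay_rounds, supervised_first=True):
    """Apply both learning signals in one of the two possible orders."""
    def supervised_phase():
        for utterance, meaning in human_data:
            agent.supervised_update(utterance, meaning)

    def selfplay_phase():
        for _ in range(selfplay_rounds):
            agent.selfplay_update()

    phases = [supervised_phase, selfplay_phase]
    if not supervised_first:
        phases.reverse()  # the ordering the paper reports to be weaker
    for phase in phases:
        phase()
    return agent

# The reported finding: this schedule beats train(..., supervised_first=False).
trained = train(Agent(), human_data=[("hi", "greeting")], selfplay_rounds=100)
```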
- Emergence of Pragmatics from Referential Game between Theory of Mind Agents [64.25696237463397]
We propose an algorithm with which agents can spontaneously learn to "read between the lines" without any explicit hand-designed rules.
We integrate theory of mind (ToM) into a cooperative multi-agent pedagogical situation and propose an adaptive reinforcement learning (RL) algorithm to develop a communication protocol.
arXiv Detail & Related papers (2020-01-21T19:37:33Z)