Calibrate your listeners! Robust communication-based training for
pragmatic speakers
- URL: http://arxiv.org/abs/2110.05422v1
- Date: Mon, 11 Oct 2021 17:07:38 GMT
- Title: Calibrate your listeners! Robust communication-based training for
pragmatic speakers
- Authors: Rose E. Wang, Julia White, Jesse Mu, Noah D. Goodman
- Abstract summary: We propose a method that uses a population of neural listeners to regularize speaker training.
We show that language drift originates from the poor uncertainty calibration of a neural listener.
We evaluate both population-based objectives on reference games, and show that the ensemble method with better calibration enables the speaker to generate pragmatic utterances.
- Score: 30.731870275051957
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: To be good conversational partners, natural language processing (NLP) systems
should be trained to produce contextually useful utterances. Prior work has
investigated training NLP systems with communication-based objectives, where a
neural listener stands in as a communication partner. However, these systems
commonly suffer from semantic drift where the learned language diverges
radically from natural language. We propose a method that uses a population of
neural listeners to regularize speaker training. We first show that language
drift originates from the poor uncertainty calibration of a neural listener,
which makes high-certainty predictions on novel sentences. We explore ensemble-
and dropout-based populations of listeners and find that the former results in
better uncertainty quantification. We evaluate both population-based objectives
on reference games, and show that the ensemble method with better calibration
enables the speaker to generate pragmatic utterances while scaling to a large
vocabulary and generalizing to new games and listeners.
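The calibration argument in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the logits are invented, the helper names `softmax`, `ensemble_listener`, and `entropy` are hypothetical, and plain averaging of member distributions is an assumed way to pool a listener population. The point is only that a single listener can be highly confident on a novel utterance, while a pooled population that disagrees yields a flatter, less certain prediction.

```python
import math

def softmax(logits):
    """Convert raw scores over candidate referents into probabilities."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def ensemble_listener(member_logits):
    """Average the members' predictive distributions (assumed pooling rule)."""
    probs = [softmax(logits) for logits in member_logits]
    n = len(probs)
    return [sum(p[i] for p in probs) / n for i in range(len(probs[0]))]

def entropy(p):
    """Shannon entropy in nats; higher means less certain."""
    return -sum(q * math.log(q) for q in p if q > 0)

# Hypothetical logits from three listeners over three candidate targets
# for an utterance none of them saw in training: each is confident,
# but they disagree about the referent.
novel = [[4.0, 0.0, 0.0], [0.0, 4.0, 0.0], [0.0, 0.0, 4.0]]

single = softmax(novel[0])         # one overconfident listener
pooled = ensemble_listener(novel)  # population prediction

print("single max prob:", round(max(single), 3), "entropy:", round(entropy(single), 3))
print("pooled max prob:", round(max(pooled), 3), "entropy:", round(entropy(pooled), 3))
```

Under these assumptions, a speaker rewarded by the pooled probability of the target gains little from utterances the listeners disagree on, which is one way to read the regularizing effect the abstract describes.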
Related papers
- Training Language Models for Social Deduction with Multi-Agent Reinforcement Learning [31.196865401472664]
We train language models to have productive discussions about their environment in natural language without any human demonstrations.
We leverage the agent's goal to predict useful information about the world as a dense reward signal that guides communication.
We analyze emergent behaviors due to our technique, such as accusing suspects and providing evidence, and find that it enables strong discussions.
arXiv Detail & Related papers (2025-02-09T22:44:45Z)
- Predictive Speech Recognition and End-of-Utterance Detection Towards Spoken Dialog Systems [55.99999020778169]
We study a function that can predict the forthcoming words and estimate the time remaining until the end of an utterance.
We develop a cross-attention-based algorithm that incorporates both acoustic and linguistic information.
Results demonstrate the proposed model's ability to predict upcoming words and estimate future end-of-utterance (EOU) events up to 300 ms prior to the actual EOU.
arXiv Detail & Related papers (2024-09-30T06:29:58Z)
- Language Generation from Brain Recordings [68.97414452707103]
We propose a generative language BCI that utilizes the capacity of a large language model and a semantic brain decoder.
The proposed model can generate coherent language sequences aligned with the semantic content of visual or auditory language stimuli.
Our findings demonstrate the potential and feasibility of employing BCIs in direct language generation.
arXiv Detail & Related papers (2023-11-16T13:37:21Z)
- Can Language Models Learn to Listen? [96.01685069483025]
We present a framework for generating appropriate facial responses from a listener in dyadic social interactions based on the speaker's words.
Our approach autoregressively predicts a response of a listener: a sequence of listener facial gestures, quantized using a VQ-VAE.
We show that our generated listener motion is fluent and reflective of language semantics through quantitative metrics and a qualitative user study.
arXiv Detail & Related papers (2023-08-21T17:59:02Z)
- Speaking the Language of Your Listener: Audience-Aware Adaptation via Plug-and-Play Theory of Mind [4.052000839878213]
We model a visually grounded referential game between a knowledgeable speaker and a listener with more limited visual and linguistic experience.
We endow our speaker with the ability to adapt its referring expressions via a simulation module that monitors the effectiveness of planned utterances from the listener's perspective.
arXiv Detail & Related papers (2023-05-31T15:17:28Z)
- Communication Drives the Emergence of Language Universals in Neural Agents: Evidence from the Word-order/Case-marking Trade-off [3.631024220680066]
We propose a new Neural-agent Language Learning and Communication framework (NeLLCom) where pairs of speaking and listening agents first learn a miniature language.
We succeed in replicating the trade-off with the new framework without hard-coding specific biases in the agents.
arXiv Detail & Related papers (2023-01-30T17:22:33Z)
- Few-shot Language Coordination by Modeling Theory of Mind [95.54446989205117]
We study the task of few-shot $\textit{language coordination}$.
We require the lead agent to coordinate with a $\textit{population}$ of agents with different linguistic abilities.
This requires the ability to model the partner's beliefs, a vital component of human communication.
arXiv Detail & Related papers (2021-07-12T19:26:11Z)
- Self-play for Data Efficient Language Acquisition [20.86261546611472]
We exploit the symmetric nature of communication in order to improve the efficiency and quality of language acquisition in learning agents.
We show that using self-play as a substitute for direct supervision enables the agent to transfer its knowledge across roles.
arXiv Detail & Related papers (2020-10-10T02:09:19Z)
- Learning Spoken Language Representations with Neural Lattice Language Modeling [39.50831917042577]
We propose a framework that trains neural lattice language models to provide contextualized representations for spoken language understanding tasks.
The proposed two-stage pre-training approach reduces the demands of speech data and has better efficiency.
arXiv Detail & Related papers (2020-07-06T10:38:03Z)
- Building Low-Resource NER Models Using Non-Speaker Annotation [58.78968578460793]
Cross-lingual methods have had notable success in addressing these concerns.
We propose a complementary approach to building low-resource Named Entity Recognition (NER) models using "non-speaker" (NS) annotations.
We show that the use of NS annotators produces results that are consistently on par with or better than cross-lingual methods built on modern contextual representations.
arXiv Detail & Related papers (2020-06-17T03:24:38Z)
- You Impress Me: Dialogue Generation via Mutual Persona Perception [62.89449096369027]
Research in cognitive science suggests that understanding is an essential signal for a high-quality chit-chat conversation.
Motivated by this, we propose P² Bot, a transmitter-receiver based framework with the aim of explicitly modeling understanding.
arXiv Detail & Related papers (2020-04-11T12:51:07Z)
This list is automatically generated from the titles and abstracts of the papers in this site.