Language in a (Search) Box: Grounding Language Learning in Real-World
Human-Machine Interaction
- URL: http://arxiv.org/abs/2104.08874v1
- Date: Sun, 18 Apr 2021 15:03:16 GMT
- Title: Language in a (Search) Box: Grounding Language Learning in Real-World
Human-Machine Interaction
- Authors: Federico Bianchi and Ciro Greco and Jacopo Tagliabue
- Abstract summary: We show how a grounding domain, a denotation function and a composition function are learned from user data only.
We benchmark our grounded semantics on compositionality and zero-shot inference tasks.
- Score: 4.137464623395377
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We investigate grounded language learning through real-world data, by
modelling teacher-learner dynamics through the natural interactions occurring
between users and search engines; in particular, we explore the emergence of
semantic generalization from unsupervised dense representations outside of
synthetic environments. A grounding domain, a denotation function and a
composition function are learned from user data only. We show that the resulting
semantics for noun phrases exhibits compositional properties while being fully
learnable without any explicit labelling. We benchmark our grounded semantics
on compositionality and zero-shot inference tasks, and show that it yields
better results and better generalization than SOTA non-grounded models, such
as word2vec and BERT.
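To make the setup concrete, the sketch below is a deliberately simplified illustration, not the authors' implementation: the search log, item IDs and function names are invented for the example. It treats a query word's denotation as the set of catalog items it was clicked for, and composes a noun phrase by intersecting the denotations of its words; in the paper both the denotation and composition functions are instead learned over unsupervised dense representations.

```python
# Illustrative sketch only (hypothetical data): grounding query words in a toy
# catalog from click logs and composing noun phrases by set intersection.
from collections import defaultdict

# Hypothetical (query, clicked_item) pairs standing in for real user interactions.
click_log = [
    ("nike shoes", "sku_101"), ("nike shoes", "sku_102"),
    ("running shoes", "sku_102"), ("running shoes", "sku_103"),
    ("nike jacket", "sku_201"), ("adidas shoes", "sku_104"),
]

def learn_denotations(log):
    """Map each query word to the set of items users clicked after using it."""
    denotation = defaultdict(set)
    for query, item in log:
        for word in query.split():
            denotation[word].add(item)
    return denotation

def compose(denotation, phrase):
    """Ground a noun phrase as the intersection of its words' denotations."""
    word_sets = [denotation[word] for word in phrase.split()]
    grounded = word_sets[0].copy()
    for s in word_sets[1:]:
        grounded &= s
    return grounded

denots = learn_denotations(click_log)
print(compose(denots, "nike shoes"))  # -> {'sku_101', 'sku_102'}
```

Running the snippet grounds "nike shoes" in the two items clicked for both "nike" and "shoes" queries, which mirrors, in a much cruder form, how compositional noun-phrase semantics can emerge from interaction data alone.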
Related papers
- Variational Cross-Graph Reasoning and Adaptive Structured Semantics
Learning for Compositional Temporal Grounding [143.5927158318524]
Temporal grounding is the task of locating a specific segment from an untrimmed video according to a query sentence.
We introduce a new Compositional Temporal Grounding task and construct two new dataset splits.
We argue that the inherent structured semantics inside the videos and language is the crucial factor to achieve compositional generalization.
arXiv Detail & Related papers (2023-01-22T08:02:23Z)
- Transparency Helps Reveal When Language Models Learn Meaning [71.96920839263457]
Our systematic experiments with synthetic data reveal that, with languages where all expressions have context-independent denotations, both autoregressive and masked language models learn to emulate semantic relations between expressions.
Turning to natural language, our experiments with a specific phenomenon -- referential opacity -- add to the growing body of evidence that current language models do not represent natural language semantics well.
arXiv Detail & Related papers (2022-10-14T02:35:19Z)
- Compositional Generalization in Grounded Language Learning via Induced Model Sparsity [81.38804205212425]
We consider simple language-conditioned navigation problems in a grid world environment with disentangled observations.
We design an agent that encourages sparse correlations between words in the instruction and attributes of objects, composing them together to find the goal.
Our agent maintains a high level of performance on goals containing novel combinations of properties even when learning from a handful of demonstrations.
arXiv Detail & Related papers (2022-07-06T08:46:27Z)
- Sim-To-Real Transfer of Visual Grounding for Human-Aided Ambiguity Resolution [0.0]
We consider the task of visual grounding, where the agent segments an object from a crowded scene given a natural language description.
Modern holistic approaches to visual grounding usually ignore language structure and struggle to cover generic domains.
We introduce a fully decoupled modular framework for compositional visual grounding of entities, attributes, and spatial relations.
arXiv Detail & Related papers (2022-05-24T14:12:32Z)
- Neural Variational Learning for Grounded Language Acquisition [14.567067583556714]
We propose a learning system in which language is grounded in visual percepts without specific pre-defined categories of terms.
We show that this generative approach exhibits promising results in language grounding without pre-specifying visual categories under low resource settings.
arXiv Detail & Related papers (2021-07-20T20:55:02Z)
- Neural Abstructions: Abstractions that Support Construction for Grounded Language Learning [69.1137074774244]
Leveraging language interactions effectively requires addressing limitations in the two most common approaches to language grounding.
We introduce the idea of neural abstructions: a set of constraints on the inference procedure of a label-conditioned generative model.
We show that with this method a user population is able to build a semantic modification for an open-ended house task in Minecraft.
arXiv Detail & Related papers (2021-07-20T07:01:15Z)
- Understanding Synonymous Referring Expressions via Contrastive Features [105.36814858748285]
We develop an end-to-end trainable framework to learn contrastive features on the image and object instance levels.
We conduct extensive experiments to evaluate the proposed algorithm on several benchmark datasets.
arXiv Detail & Related papers (2021-04-20T17:56:24Z)
- A Visuospatial Dataset for Naturalistic Verb Learning [18.654373173232205]
We introduce a new dataset for training and evaluating grounded language models.
Our data is collected within a virtual reality environment and is designed to emulate the quality of language data to which a pre-verbal child is likely to have access.
We use the collected data to compare several distributional semantics models for verb learning.
arXiv Detail & Related papers (2020-10-28T20:47:13Z)
- Systematic Generalization on gSCAN with Language Conditioned Embedding [19.39687991647301]
Systematic Generalization refers to a learning algorithm's ability to extrapolate learned behavior to unseen situations.
We propose a novel method that learns objects' contextualized embeddings with dynamic message passing conditioned on the input natural language.
arXiv Detail & Related papers (2020-09-11T17:35:05Z)
- Learning Universal Representations from Word to Sentence [89.82415322763475]
This work introduces and explores universal representation learning, i.e., embedding different levels of linguistic units (words, phrases and sentences) in a uniform vector space.
We present an approach for constructing analogy datasets over words, phrases and sentences (a toy analogy check in this spirit is sketched after this list).
We empirically verify that well pre-trained Transformer models, combined with appropriate training settings, can effectively yield universal representations.
arXiv Detail & Related papers (2020-09-10T03:53:18Z)
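Since the last entry mentions analogy datasets over words, phrases and sentences, the following toy sketch shows how such an analogy check is commonly scored in a shared vector space. The embeddings and vocabulary below are hand-built assumptions for illustration, not data from that paper: the answer to "a : b :: c : ?" is the vocabulary entry whose vector lies closest, under cosine similarity, to b - a + c.

```python
# Toy analogy check in a shared vector space (hand-crafted, hypothetical embeddings).
import numpy as np

# Dimensions loosely encode (royalty, gender, footwear, brand); phrases share the space with words.
emb = {
    "king":       np.array([1.0,  1.0, 0.0, 0.0]),
    "queen":      np.array([1.0, -1.0, 0.0, 0.0]),
    "man":        np.array([0.0,  1.0, 0.0, 0.0]),
    "woman":      np.array([0.0, -1.0, 0.0, 0.0]),
    "shoes":      np.array([0.0,  0.0, 1.0, 0.0]),
    "nike shoes": np.array([0.0,  0.0, 1.0, 1.0]),
}

def cosine(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v) + 1e-9))

def solve_analogy(a, b, c, vocab):
    """Return the vocabulary entry closest to vocab[b] - vocab[a] + vocab[c]."""
    target = vocab[b] - vocab[a] + vocab[c]
    candidates = (w for w in vocab if w not in {a, b, c})
    return max(candidates, key=lambda w: cosine(vocab[w], target))

print(solve_analogy("man", "king", "woman", emb))  # expected: "queen"
```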
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality or accuracy of the listed information and is not responsible for any consequences arising from its use.