Compositional Generalization in Grounded Language Learning via Induced Model Sparsity
- URL: http://arxiv.org/abs/2207.02518v1
- Date: Wed, 6 Jul 2022 08:46:27 GMT
- Title: Compositional Generalization in Grounded Language Learning via Induced Model Sparsity
- Authors: Sam Spilsbury and Alexander Ilin
- Abstract summary: We consider simple language-conditioned navigation problems in a grid world environment with disentangled observations.
We design an agent that encourages sparse correlations between words in the instruction and attributes of objects, composing them together to find the goal.
Our agent maintains a high level of performance on goals containing novel combinations of properties even when learning from a handful of demonstrations.
- Score: 81.38804205212425
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We provide a study of how induced model sparsity can help achieve
compositional generalization and better sample efficiency in grounded language
learning problems. We consider simple language-conditioned navigation problems
in a grid world environment with disentangled observations. We show that
standard neural architectures do not always yield compositional generalization.
To address this, we design an agent that contains a goal identification module
that encourages sparse correlations between words in the instruction and
attributes of objects, composing them together to find the goal. The output of
the goal identification module is the input to a value iteration network
planner. Our agent maintains a high level of performance on goals containing
novel combinations of properties even when learning from a handful of
demonstrations. We examine the internal representations of our agent and find
the correct correspondences between words in its dictionary and attributes in
the environment.
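The abstract outlines an architecture in which a goal identification module learns sparse correlations between instruction words and object attributes and passes its output to a value iteration network planner. Below is a minimal, hypothetical sketch of such a pipeline in PyTorch; it is not the authors' implementation, and the embedding sizes, the entropy-based sparsity penalty, the simplified planner, and the module/variable names are assumptions made for illustration.
```python
# Hypothetical sketch of the described pipeline (not the authors' released code).
# Assumed pieces: embedding sizes, the entropy sparsity penalty, the simplified
# value-iteration planner, and all names used below.
import torch
import torch.nn as nn


class GoalIdentificationModule(nn.Module):
    """Scores each grid cell by correlating instruction words with object attributes."""

    def __init__(self, vocab_size: int, num_attributes: int, embed_dim: int = 32):
        super().__init__()
        self.word_embed = nn.Embedding(vocab_size, embed_dim)
        self.attr_embed = nn.Embedding(num_attributes, embed_dim)

    def forward(self, instruction, grid_attrs):
        # instruction: (B, n_words) word ids; grid_attrs: (B, H, W, n_attrs) attribute ids
        w = self.word_embed(instruction)                    # (B, n_words, D)
        a = self.attr_embed(grid_attrs)                     # (B, H, W, n_attrs, D)
        # Word-attribute correlation scores for every cell.
        scores = torch.einsum("bnd,bhwad->bhwna", w, a)     # (B, H, W, n_words, n_attrs)
        # Each word attends over attributes; an entropy penalty pushes the attention
        # toward a few word-attribute pairs (one simple stand-in for induced sparsity).
        attn = scores.softmax(dim=-1)
        sparsity_loss = -(attn * (attn + 1e-8).log()).sum(dim=-1).mean()
        # A cell scores highly when every instruction word finds a matching attribute.
        cell_logits = scores.amax(dim=-1).sum(dim=-1)       # (B, H, W)
        b, h, w_ = cell_logits.shape
        goal_map = cell_logits.flatten(1).softmax(dim=-1).view(b, h, w_)
        return goal_map, sparsity_loss


class ValueIterationPlanner(nn.Module):
    """Simplified value-iteration network: repeated convolution + max over actions."""

    def __init__(self, iterations: int = 20, num_actions: int = 4):
        super().__init__()
        self.iterations = iterations
        self.q_conv = nn.Conv2d(2, num_actions, kernel_size=3, padding=1, bias=False)

    def forward(self, reward_map):
        r = reward_map.unsqueeze(1)                         # (B, 1, H, W), goal map as reward
        v = torch.zeros_like(r)
        for _ in range(self.iterations):
            q = self.q_conv(torch.cat([r, v], dim=1))       # (B, A, H, W)
            v = q.max(dim=1, keepdim=True).values
        return v.squeeze(1)                                 # state values, (B, H, W)


if __name__ == "__main__":
    goal_module = GoalIdentificationModule(vocab_size=20, num_attributes=10)
    planner = ValueIterationPlanner()
    instruction = torch.randint(0, 20, (1, 3))              # e.g. "go to red ball"
    grid_attrs = torch.randint(0, 10, (1, 8, 8, 2))         # e.g. colour/shape id per cell
    goal_map, sparsity_loss = goal_module(instruction, grid_attrs)
    values = planner(goal_map)
    print(goal_map.shape, values.shape, sparsity_loss.item())
```
The entropy penalty above is only one simple surrogate for "encouraging sparse correlations"; the paper's actual regularizer, observation encoding, and planner details may differ.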
Related papers
- Integrating Self-supervised Speech Model with Pseudo Word-level Targets from Visually-grounded Speech Model [57.78191634042409]
We propose Pseudo-Word HuBERT (PW-HuBERT), a framework that integrates pseudo word-level targets into the training process.
Our experimental results on four spoken language understanding (SLU) benchmarks suggest that our model captures semantic information more effectively.
arXiv Detail & Related papers (2024-02-08T16:55:21Z)
- Exploiting Contextual Target Attributes for Target Sentiment Classification [53.30511968323911]
Existing PTLM-based models for TSC fall into two groups: 1) fine-tuning-based models that adopt the PTLM as the context encoder; 2) prompting-based models that recast the classification task as a text/word generation task.
We present a new perspective of leveraging PTLM for TSC: simultaneously leveraging the merits of both language modeling and explicit target-context interactions via contextual target attributes.
arXiv Detail & Related papers (2023-12-21T11:45:28Z)
- Feature Interactions Reveal Linguistic Structure in Language Models [2.0178765779788495]
We study feature interactions in the context of feature attribution methods for post-hoc interpretability.
We develop a grey-box methodology, in which we train models to perfection on a formal language classification task.
We show that under specific configurations, some methods are indeed able to uncover the grammatical rules acquired by a model.
arXiv Detail & Related papers (2023-06-21T11:24:41Z)
- Leveraging Locality in Abstractive Text Summarization [44.67905693077539]
We investigate whether models with a restricted context can achieve performance competitive with memory-efficient attention models.
Our model is applied to individual pages, which contain parts of inputs grouped by the principle of locality.
arXiv Detail & Related papers (2022-05-25T03:59:24Z)
- Meta-Learning to Compositionally Generalize [34.656819307701156]
We implement a meta-learning augmented version of supervised learning.
We construct pairs of tasks for meta-learning by sub-sampling existing training data.
Experimental results on the COGS and SCAN datasets show that our similarity-driven meta-learning can improve generalization performance.
arXiv Detail & Related papers (2021-06-08T11:21:48Z)
- Language in a (Search) Box: Grounding Language Learning in Real-World Human-Machine Interaction [4.137464623395377]
We show how a grounding domain, a denotation function and a composition function are learned from user data only.
We benchmark our grounded semantics on compositionality and zero-shot inference tasks.
arXiv Detail & Related papers (2021-04-18T15:03:16Z)
- Prototypical Representation Learning for Relation Extraction [56.501332067073065]
This paper aims to learn predictive, interpretable, and robust relation representations from distantly-labeled data.
We learn prototypes for each relation from contextual information to best explore the intrinsic semantics of relations.
Results on several relation learning tasks show that our model significantly outperforms the previous state-of-the-art relational models.
arXiv Detail & Related papers (2021-03-22T08:11:43Z)
- A Framework to Learn with Interpretation [2.3741312212138896]
We present a novel framework to jointly learn a predictive model and its associated interpretation model.
We seek a small dictionary of high-level attribute functions that take as inputs the outputs of selected hidden layers.
A detailed pipeline to visualize the learnt features is also developed.
arXiv Detail & Related papers (2020-10-19T09:26:28Z)
- Learning Universal Representations from Word to Sentence [89.82415322763475]
This work introduces and explores universal representation learning, i.e., embedding linguistic units of different levels in a uniform vector space.
We present our approach of constructing analogy datasets in terms of words, phrases and sentences.
We empirically verify that well pre-trained Transformer models, combined with appropriate training settings, can effectively yield universal representations.
arXiv Detail & Related papers (2020-09-10T03:53:18Z)
- Probing Linguistic Features of Sentence-Level Representations in Neural Relation Extraction [80.38130122127882]
We introduce 14 probing tasks targeting linguistic properties relevant to neural relation extraction (RE).
We use them to study representations learned by more than 40 combinations of encoder architectures and linguistic features trained on two datasets.
We find that the biases induced by the architecture and by the inclusion of linguistic features are clearly expressed in the probing task performance.
arXiv Detail & Related papers (2020-04-17T09:17:40Z)