Compositional Generalization in Unsupervised Compositional
Representation Learning: A Study on Disentanglement and Emergent Language
- URL: http://arxiv.org/abs/2210.00482v2
- Date: Wed, 5 Oct 2022 21:14:46 GMT
- Title: Compositional Generalization in Unsupervised Compositional
Representation Learning: A Study on Disentanglement and Emergent Language
- Authors: Zhenlin Xu, Marc Niethammer, Colin Raffel
- Abstract summary: We study three unsupervised representation learning algorithms on two datasets that allow directly testing compositional generalization.
We find that directly using the bottleneck representation with simple models and few labels may lead to worse generalization than using representations from layers before or after the learned representation itself.
Surprisingly, we find that increasing pressure to produce a disentangled representation produces representations with worse generalization, while representations from EL models show strong compositional generalization.
- Score: 48.37815764394315
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep learning models struggle with compositional generalization, i.e. the
ability to recognize or generate novel combinations of observed elementary
concepts. In hopes of enabling compositional generalization, various
unsupervised learning algorithms have been proposed with inductive biases that
aim to induce compositional structure in learned representations (e.g.
disentangled representation and emergent language learning). In this work, we
evaluate these unsupervised learning algorithms in terms of how well they
enable compositional generalization. Specifically, our evaluation protocol
focuses on whether or not it is easy to train a simple model on top of the
learned representation that generalizes to new combinations of compositional
factors. We systematically study three unsupervised representation learning
algorithms - $\beta$-VAE, $\beta$-TCVAE, and emergent language (EL)
autoencoders - on two datasets that allow directly testing compositional
generalization. We find that directly using the bottleneck representation with
simple models and few labels may lead to worse generalization than using
representations from layers before or after the learned representation itself.
In addition, we find that the previously proposed metrics for evaluating the
levels of compositionality are not correlated with actual compositional
generalization in our framework. Surprisingly, we find that increasing pressure
to produce a disentangled representation produces representations with worse
generalization, while representations from EL models show strong compositional
generalization. Taken together, our results shed new light on the compositional
generalization behavior of different unsupervised learning algorithms with a
new setting to rigorously test this behavior, and suggest the potential
benefits of developing EL learning algorithms for more generalizable
representations.
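The evaluation protocol described in the abstract can be illustrated with a short sketch. The toy example below is an assumption on my part rather than the authors' code: it trains a simple logistic-regression probe on frozen representations of seen factor combinations, using only a few labels per combination, and then scores the probe on held-out (novel) combinations of the same factors. The helper `fake_representation` merely stands in for a trained β-VAE, β-TCVAE, or EL encoder.

```python
# Hypothetical sketch of the compositional-generalization probe protocol.
# All names here (fake_representation, make_split) are illustrative, not the paper's code.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Toy data: each sample has two generative factors (e.g., shape and color).
n_shapes, n_colors, dim = 4, 4, 16
factor_grid = [(s, c) for s in range(n_shapes) for c in range(n_colors)]

# Hold out some (shape, color) combinations entirely: the compositional test split.
test_combos = {(s, c) for s, c in factor_grid if (s + c) % 4 == 0}
train_combos = [fc for fc in factor_grid if fc not in test_combos]

def fake_representation(shape, color):
    """Stand-in for a frozen encoder's representation of one sample."""
    vec = np.zeros(dim)
    vec[shape] = 1.0             # pretend some dimensions encode shape
    vec[n_shapes + color] = 1.0  # and others encode color
    return vec + 0.1 * rng.standard_normal(dim)

def make_split(combos, n_per_combo):
    X, y_shape = [], []
    for s, c in combos:
        for _ in range(n_per_combo):
            X.append(fake_representation(s, c))
            y_shape.append(s)  # probe predicts the shape factor
    return np.array(X), np.array(y_shape)

# Few labels per seen combination, mimicking the paper's low-label regime.
X_train, y_train = make_split(train_combos, n_per_combo=5)
X_test, y_test = make_split(sorted(test_combos), n_per_combo=20)

probe = LogisticRegression(max_iter=1000).fit(X_train, y_train)
print("probe accuracy on novel factor combinations:", probe.score(X_test, y_test))
```

In the paper's actual setup, the probe would be applied not only to the bottleneck representation but also to layers before and after it, which is how the authors surface the finding that the bottleneck itself can generalize worse than adjacent layers.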
Related papers
- Towards Understanding the Relationship between In-context Learning and Compositional Generalization [7.843029855730508]
We train a causal Transformer in a setting that renders ordinary learning very difficult.
The model can solve the task, however, by utilizing earlier examples to generalize to later ones.
In evaluations on the SCAN, COGS, and GeoQuery datasets, models trained in this manner indeed show improved compositional generalization.
arXiv Detail & Related papers (2024-03-18T14:45:52Z) - Grounded Graph Decoding Improves Compositional Generalization in
Question Answering [68.72605660152101]
Question answering models struggle to generalize to novel compositions of training patterns, such as longer sequences or more complex test structures.
We propose Grounded Graph Decoding, a method to improve compositional generalization of language representations by grounding structured predictions with an attention mechanism.
Our model significantly outperforms state-of-the-art baselines on the Compositional Freebase Questions (CFQ) dataset, a challenging benchmark for compositional generalization in question answering.
arXiv Detail & Related papers (2021-11-05T17:50:14Z) - Disentangled Sequence to Sequence Learning for Compositional
Generalization [62.954842223732435]
We propose an extension to sequence-to-sequence models which allows us to learn disentangled representations by adaptively re-encoding the source input.
Experimental results on semantic parsing and machine translation empirically show that our proposal yields more disentangled representations and better generalization.
arXiv Detail & Related papers (2021-10-09T22:27:19Z) - Learning Algebraic Recombination for Compositional Generalization [71.78771157219428]
We propose LeAR, an end-to-end neural model to learn algebraic recombination for compositional generalization.
The key insight is to model the semantic parsing task as a homomorphism between a latent syntactic algebra and a semantic algebra.
Experiments on two realistic and comprehensive compositional generalization benchmarks demonstrate the effectiveness of our model.
arXiv Detail & Related papers (2021-07-14T07:23:46Z) - Improving Compositional Generalization in Classification Tasks via
Structure Annotations [33.90268697120572]
Humans have a great ability to generalize compositionally, but state-of-the-art neural models struggle to do so.
First, we study ways to convert a natural language sequence-to-sequence dataset to a classification dataset that also requires compositional generalization.
Second, we show that providing structural hints (specifically, providing parse trees and entity links as attention masks for a Transformer model) helps compositional generalization.
arXiv Detail & Related papers (2021-06-19T06:07:27Z) - A Minimalist Dataset for Systematic Generalization of Perception,
Syntax, and Semantics [131.93113552146195]
We present a new dataset, Handwritten arithmetic with INTegers (HINT), to examine machines' capability of learning generalizable concepts.
In HINT, machines are tasked with learning how concepts are perceived from raw signals such as images.
We undertake extensive experiments with various sequence-to-sequence models, including RNNs, Transformers, and GPT-3.
arXiv Detail & Related papers (2021-03-02T01:32:54Z) - Improving Compositional Generalization in Semantic Parsing [54.4720965813889]
Generalization of models to out-of-distribution (OOD) data has captured tremendous attention recently.
We investigate compositional generalization in semantic parsing, a natural test-bed for compositional generalization.
arXiv Detail & Related papers (2020-10-12T12:34:58Z)