Related papers: Improving Compositional Generalization in Classification Tasks via Structure Annotations

Improving Compositional Generalization in Classification Tasks via Structure Annotations

URL: http://arxiv.org/abs/2106.10434v1
Date: Sat, 19 Jun 2021 06:07:27 GMT
Title: Improving Compositional Generalization in Classification Tasks via Structure Annotations
Authors: Juyong Kim, Pradeep Ravikumar, Joshua Ainslie, Santiago Onta\~n\'on
Abstract summary: Humans have a great ability to generalize compositionally, but state-of-the-art neural models struggle to do so. First, we study ways to convert a natural language sequence-to-sequence dataset to a classification dataset that also requires compositional generalization. Second, we show that providing structural hints (specifically, providing parse trees and entity links as attention masks for a Transformer model) helps compositional generalization.
Score: 33.90268697120572
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Compositional generalization is the ability to generalize systematically to a new data distribution by combining known components. Although humans seem to have a great ability to generalize compositionally, state-of-the-art neural models struggle to do so. In this work, we study compositional generalization in classification tasks and present two main contributions. First, we study ways to convert a natural language sequence-to-sequence dataset to a classification dataset that also requires compositional generalization. Second, we show that providing structural hints (specifically, providing parse trees and entity links as attention masks for a Transformer model) helps compositional generalization.

Related papers

Does Data Scaling Lead to Visual Compositional Generalization? [21.242714408660508]
We find that compositional generalization is driven by data diversity, not mere data scale.<n>We prove this structure is key to efficiency, enabling perfect generalization from few observed combinations.
arXiv Detail & Related papers (2025-07-09T17:59:03Z)
When does compositional structure yield compositional generalization? A kernel theory [0.0]
We present a theory of compositional generalization in kernel models with fixed representations. We identify novel failure modes in compositional generalization that arise from biases in the training data. This work provides a theoretical perspective on how statistical structure in the training data can affect compositional generalization.
arXiv Detail & Related papers (2024-05-26T00:50:11Z)
Learning Syntax Without Planting Trees: Understanding Hierarchical Generalization in Transformers [74.96551626420188]
Transformers trained on natural language data have been shown to learn its hierarchical structure and generalize to sentences with unseen syntactic structures. We investigate sources of inductive bias in transformer models and their training that could cause such generalization behavior to emerge.
arXiv Detail & Related papers (2024-04-25T07:10:29Z)
On Provable Length and Compositional Generalization [7.883808173871223]
We provide first provable guarantees on length and compositional generalization for common sequence-to-sequence models. We show that emphsimple limited capacity versions of these different architectures achieve both length and compositional generalization.
arXiv Detail & Related papers (2024-02-07T14:16:28Z)
Compositional Generalization in Unsupervised Compositional Representation Learning: A Study on Disentanglement and Emergent Language [48.37815764394315]
We study three unsupervised representation learning algorithms on two datasets that allow directly testing compositional generalization. We find that directly using the bottleneck representation with simple models and few labels may lead to worse generalization than using representations from layers before or after the learned representation itself. Surprisingly, we find that increasing pressure to produce a disentangled representation produces representations with worse generalization, while representations from EL models show strong compositional generalization.
arXiv Detail & Related papers (2022-10-02T10:35:53Z)
Compositional Generalization Requires Compositional Parsers [69.77216620997305]
We compare sequence-to-sequence models and models guided by compositional principles on the recent COGS corpus. We show structural generalization is a key measure of compositional generalization and requires models that are aware of complex structure.
arXiv Detail & Related papers (2022-02-24T07:36:35Z)
Disentangled Sequence to Sequence Learning for Compositional Generalization [62.954842223732435]
We propose an extension to sequence-to-sequence models which allows us to learn disentangled representations by adaptively re-encoding the source input. Experimental results on semantic parsing and machine translation empirically show that our proposal yields more disentangled representations and better generalization.
arXiv Detail & Related papers (2021-10-09T22:27:19Z)
Learning Algebraic Recombination for Compositional Generalization [71.78771157219428]
We propose LeAR, an end-to-end neural model to learn algebraic recombination for compositional generalization. Key insight is to model the semantic parsing task as a homomorphism between a latent syntactic algebra and a semantic algebra. Experiments on two realistic and comprehensive compositional generalization demonstrate the effectiveness of our model.
arXiv Detail & Related papers (2021-07-14T07:23:46Z)
Compositional Generalization via Semantic Tagging [81.24269148865555]
We propose a new decoding framework that preserves the expressivity and generality of sequence-to-sequence models. We show that the proposed approach consistently improves compositional generalization across model architectures, domains, and semantic formalisms.
arXiv Detail & Related papers (2020-10-22T15:55:15Z)
Hierarchical Poset Decoding for Compositional Generalization in Language [52.13611501363484]
We formalize human language understanding as a structured prediction task where the output is a partially ordered set (poset) Current encoder-decoder architectures do not take the poset structure of semantics into account properly. We propose a novel hierarchical poset decoding paradigm for compositional generalization in language.
arXiv Detail & Related papers (2020-10-15T14:34:26Z)
Improving Compositional Generalization in Semantic Parsing [54.4720965813889]
Generalization of models to out-of-distribution (OOD) data has captured tremendous attention recently. We investigate compositional generalization in semantic parsing, a natural test-bed for compositional generalization.
arXiv Detail & Related papers (2020-10-12T12:34:58Z)

This list is automatically generated from the titles and abstracts of the papers in this site.