Improving Compositional Generalization in Classification Tasks via
Structure Annotations
- URL: http://arxiv.org/abs/2106.10434v1
- Date: Sat, 19 Jun 2021 06:07:27 GMT
- Title: Improving Compositional Generalization in Classification Tasks via
Structure Annotations
- Authors: Juyong Kim, Pradeep Ravikumar, Joshua Ainslie, Santiago Onta\~n\'on
- Abstract summary: Humans have a great ability to generalize compositionally, but state-of-the-art neural models struggle to do so.
First, we study ways to convert a natural language sequence-to-sequence dataset to a classification dataset that also requires compositional generalization.
Second, we show that providing structural hints (specifically, providing parse trees and entity links as attention masks for a Transformer model) helps compositional generalization.
- Score: 33.90268697120572
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Compositional generalization is the ability to generalize systematically to a
new data distribution by combining known components. Although humans seem to
have a great ability to generalize compositionally, state-of-the-art neural
models struggle to do so. In this work, we study compositional generalization
in classification tasks and present two main contributions. First, we study
ways to convert a natural language sequence-to-sequence dataset to a
classification dataset that also requires compositional generalization. Second,
we show that providing structural hints (specifically, providing parse trees
and entity links as attention masks for a Transformer model) helps
compositional generalization.
Related papers
- When does compositional structure yield compositional generalization? A kernel theory [0.0]
We present a theory of compositional generalization in kernel models with fixed representations.
We identify novel failure modes in compositional generalization that arise from biases in the training data.
This work provides a theoretical perspective on how statistical structure in the training data can affect compositional generalization.
arXiv Detail & Related papers (2024-05-26T00:50:11Z) - On Provable Length and Compositional Generalization [7.883808173871223]
We provide first provable guarantees on length and compositional generalization for common sequence-to-sequence models.
We show that limited capacity versions of different architectures achieve both length and compositional generalization.
arXiv Detail & Related papers (2024-02-07T14:16:28Z) - Compositional Generalization Requires Compositional Parsers [69.77216620997305]
We compare sequence-to-sequence models and models guided by compositional principles on the recent COGS corpus.
We show structural generalization is a key measure of compositional generalization and requires models that are aware of complex structure.
arXiv Detail & Related papers (2022-02-24T07:36:35Z) - Disentangled Sequence to Sequence Learning for Compositional
Generalization [62.954842223732435]
We propose an extension to sequence-to-sequence models which allows us to learn disentangled representations by adaptively re-encoding the source input.
Experimental results on semantic parsing and machine translation empirically show that our proposal yields more disentangled representations and better generalization.
arXiv Detail & Related papers (2021-10-09T22:27:19Z) - Learning Algebraic Recombination for Compositional Generalization [71.78771157219428]
We propose LeAR, an end-to-end neural model to learn algebraic recombination for compositional generalization.
Key insight is to model the semantic parsing task as a homomorphism between a latent syntactic algebra and a semantic algebra.
Experiments on two realistic and comprehensive compositional generalization demonstrate the effectiveness of our model.
arXiv Detail & Related papers (2021-07-14T07:23:46Z) - Compositional Generalization via Semantic Tagging [81.24269148865555]
We propose a new decoding framework that preserves the expressivity and generality of sequence-to-sequence models.
We show that the proposed approach consistently improves compositional generalization across model architectures, domains, and semantic formalisms.
arXiv Detail & Related papers (2020-10-22T15:55:15Z) - Hierarchical Poset Decoding for Compositional Generalization in Language [52.13611501363484]
We formalize human language understanding as a structured prediction task where the output is a partially ordered set (poset)
Current encoder-decoder architectures do not take the poset structure of semantics into account properly.
We propose a novel hierarchical poset decoding paradigm for compositional generalization in language.
arXiv Detail & Related papers (2020-10-15T14:34:26Z) - Improving Compositional Generalization in Semantic Parsing [54.4720965813889]
Generalization of models to out-of-distribution (OOD) data has captured tremendous attention recently.
We investigate compositional generalization in semantic parsing, a natural test-bed for compositional generalization.
arXiv Detail & Related papers (2020-10-12T12:34:58Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.