Learning to Generalize Compositionally by Transferring Across Semantic
Parsing Tasks
- URL: http://arxiv.org/abs/2111.05013v1
- Date: Tue, 9 Nov 2021 09:10:21 GMT
- Title: Learning to Generalize Compositionally by Transferring Across Semantic
Parsing Tasks
- Authors: Wang Zhu, Peter Shaw, Tal Linzen, Fei Sha
- Abstract summary: We investigate learning representations that facilitate transfer learning from one compositional task to another.
We apply this method to semantic parsing, using three very different datasets.
Our method significantly improves compositional generalization over baselines on the test set of the target task.
- Score: 37.66114618645146
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Neural network models often generalize poorly to mismatched domains or
distributions. In NLP, this issue arises in particular when models are expected
to generalize compositionally, that is, to novel combinations of familiar words
and constructions. We investigate learning representations that facilitate
transfer learning from one compositional task to another: the representation
and the task-specific layers of the models are strategically trained
differently on a pre-finetuning task such that they generalize well on
mismatched splits that require compositionality. We apply this method to
semantic parsing, using three very different datasets, COGS, GeoQuery and SCAN,
used alternately as the pre-finetuning and target task. Our method
significantly improves compositional generalization over baselines on the test
set of the target task, which is held out during fine-tuning. Ablation studies
characterize the utility of the major steps in the proposed algorithm and
support our hypothesis.
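To make the recipe concrete, here is a minimal PyTorch sketch of the two-stage setup the abstract describes: pre-finetune a shared encoder together with a task-specific head, treating the two parameter groups differently (here simply via distinct learning rates, a stand-in for the paper's strategy), then re-initialize the head and fine-tune on the target task. All module names, sizes, and hyperparameters are illustrative assumptions, not the paper's exact algorithm.

```python
# A minimal sketch, assuming a generic encoder/head split; this is NOT the
# paper's exact algorithm, only the two-stage transfer recipe it describes.
import torch
import torch.nn as nn
import torch.nn.functional as F

VOCAB, LABELS, DIM = 1000, 50, 128

encoder = nn.Sequential(nn.Embedding(VOCAB, DIM), nn.Linear(DIM, DIM), nn.ReLU())
head = nn.Linear(DIM, LABELS)  # task-specific output layer

def step(opt, tokens, labels):
    """One supervised update with per-token label prediction."""
    loss = F.cross_entropy(head(encoder(tokens)).view(-1, LABELS), labels.view(-1))
    opt.zero_grad()
    loss.backward()
    opt.step()

def random_batch():  # stand-in for real pre-finetuning / target-task batches
    return torch.randint(0, VOCAB, (32, 16)), torch.randint(0, LABELS, (32, 16))

# Stage 1: pre-finetuning task (e.g., SCAN). The representation and the
# task-specific layer are trained differently -- here via distinct rates.
pre_opt = torch.optim.Adam([
    {"params": encoder.parameters(), "lr": 1e-4},  # slow: shape the representation
    {"params": head.parameters(), "lr": 1e-3},     # fast: throwaway head
])
for _ in range(100):
    step(pre_opt, *random_batch())

# Stage 2: keep the encoder, re-initialize the task-specific layer, and
# fine-tune on the target task (e.g., COGS); its test split stays held out.
head = nn.Linear(DIM, LABELS)
ft_opt = torch.optim.Adam([*encoder.parameters(), *head.parameters()], lr=1e-4)
for _ in range(100):
    step(ft_opt, *random_batch())
```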
Related papers
- Interpretable Target-Feature Aggregation for Multi-Task Learning based on Bias-Variance Analysis [53.38518232934096]
Multi-task learning (MTL) is a powerful machine learning paradigm designed to leverage shared knowledge across tasks to improve generalization and performance.
We propose an MTL approach at the intersection of task clustering and feature transformation, based on a two-phase iterative aggregation of targets and features.
In both phases, a key aspect is preserving the interpretability of the reduced targets and features by aggregating with the mean, a choice motivated by applications in Earth science.
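As a rough sketch of what "aggregation with the mean" can look like in practice, correlated target columns are replaced by their group mean; the grouping itself (a fixed assignment below) is a placeholder for the paper's bias-variance-driven clustering.

```python
# Illustrative only: replace each group of target columns by its mean.
# How groups are chosen (here: hard-coded) is a placeholder for the
# paper's bias-variance analysis.
import numpy as np

def aggregate_targets(Y: np.ndarray, groups: list[list[int]]) -> np.ndarray:
    """Y: (n_samples, n_targets); groups: index lists partitioning the columns."""
    return np.column_stack([Y[:, g].mean(axis=1) for g in groups])

Y = np.random.rand(100, 6)
Y_reduced = aggregate_targets(Y, groups=[[0, 1], [2, 3, 4], [5]])  # 6 -> 3 targets
```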
arXiv Detail & Related papers (2024-06-12T08:30:16Z)
- Task Groupings Regularization: Data-Free Meta-Learning with Heterogeneous Pre-trained Models [83.02797560769285]
Data-Free Meta-Learning (DFML) aims to derive knowledge from a collection of pre-trained models without accessing their original data.
Current methods often overlook the heterogeneity among pre-trained models, which leads to performance degradation due to task conflicts.
We propose Task Groupings Regularization, a novel approach that benefits from model heterogeneity by grouping and aligning conflicting tasks.
arXiv Detail & Related papers (2024-05-26T13:11:55Z)
- Variational Cross-Graph Reasoning and Adaptive Structured Semantics Learning for Compositional Temporal Grounding [143.5927158318524]
Temporal grounding is the task of locating a specific segment from an untrimmed video according to a query sentence.
We introduce a new Compositional Temporal Grounding task and construct two new dataset splits.
We argue that the structured semantics inherent in videos and language is the crucial factor in achieving compositional generalization.
arXiv Detail & Related papers (2023-01-22T08:02:23Z)
- Fast Inference and Transfer of Compositional Task Structures for Few-shot Task Generalization [101.72755769194677]
We formulate few-shot task generalization as a reinforcement learning problem in which each task is characterized by a subtask graph.
Our multi-task subtask graph inferencer (MTSGI) first infers the common high-level task structure, in the form of a subtask graph, from the training tasks.
Our experiment results on 2D grid-world and complex web navigation domains show that the proposed method can learn and leverage the common underlying structure of the tasks for faster adaptation to the unseen tasks.
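For intuition, a subtask graph can be as simple as a set of subtasks with precondition edges; the sketch below is a generic data structure of this kind, assumed for illustration rather than taken from MTSGI.

```python
# Hypothetical minimal subtask graph: each subtask lists the subtasks that
# must be completed first; an agent may execute any "eligible" subtask.
from dataclasses import dataclass, field

@dataclass
class SubtaskGraph:
    preconditions: dict[str, set[str]] = field(default_factory=dict)

    def eligible(self, done: set[str]) -> set[str]:
        return {s for s, pre in self.preconditions.items()
                if s not in done and pre <= done}

g = SubtaskGraph({"get wood": set(), "make plank": {"get wood"},
                  "build door": {"make plank"}})
print(g.eligible({"get wood"}))  # {'make plank'}
```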
arXiv Detail & Related papers (2022-05-25T10:44:25Z)
- Compositionality as Lexical Symmetry [42.37422271002712]
In tasks like semantic parsing, instruction following, and question answering, standard deep networks fail to generalize compositionally from small datasets.
We present a domain-general and model-agnostic formulation of compositionality as a constraint on symmetries of data distributions rather than models.
We describe a procedure called LEXSYM that discovers these transformations automatically, then applies them to training data for ordinary neural sequence models.
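A toy version of applying one such lexical transformation to training data looks like the following; the aligned lexicon here is handwritten for illustration, whereas LEXSYM's contribution is discovering these transformations automatically.

```python
# Toy data augmentation via a lexical symmetry: swapping an aligned pair of
# lexical items in both the input and the output yields a new consistent
# training example. The alignment table is a handwritten assumption.
ALIGNED = {"cat": "CAT", "dog": "DOG"}  # word -> its logical-form symbol

def swap(example, a, b):
    src, tgt = example
    table = {a: b, b: a, ALIGNED[a]: ALIGNED[b], ALIGNED[b]: ALIGNED[a]}
    sub = lambda s: " ".join(table.get(w, w) for w in s.split())
    return sub(src), sub(tgt)

print(swap(("the cat slept", "sleep ( CAT )"), "cat", "dog"))
# -> ('the dog slept', 'sleep ( DOG )')
```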
arXiv Detail & Related papers (2022-01-30T21:44:46Z)
- Improving Compositional Generalization with Latent Structure and Data Augmentation [39.24527889685699]
We present a more powerful data recombination method using a model called the Compositional Structure Learner (CSL).
CSL is a generative model with a quasi-synchronous context-free grammar backbone.
This procedure effectively transfers most of CSL's compositional bias to T5 for diagnostic tasks.
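The sketch below is a toy stand-in for this idea (a handwritten synchronous grammar rather than CSL's learned quasi-synchronous CFG): sampled pairs could be mixed into the fine-tuning data of a seq2seq model such as T5.

```python
# Toy grammar-based data recombination, NOT the CSL model itself: a tiny
# handwritten synchronous grammar generates paired (sentence, logical form)
# examples that could be added to a seq2seq model's training data.
import random

NPS = [("the cat", "cat"), ("a dog", "dog"), ("the boy", "boy")]
VERBS = [("saw", "see"), ("chased", "chase")]

def sample_pair():
    (s1, t1), (s2, t2) = random.sample(NPS, 2)
    sv, tv = random.choice(VERBS)
    return f"{s1} {sv} {s2}", f"{tv} ( {t1} , {t2} )"

print(sample_pair())  # e.g. ('a dog chased the boy', 'chase ( dog , boy )')
```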
arXiv Detail & Related papers (2021-12-14T18:03:28Z)
- Meta-Learning to Compositionally Generalize [34.656819307701156]
We implement a meta-learning augmented version of supervised learning.
We construct pairs of tasks for meta-learning by sub-sampling existing training data.
Experimental results on the COGS and SCAN datasets show that our similarity-driven meta-learning can improve generalization performance.
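A hedged sketch of constructing such meta-learning task pairs by sub-sampling; the uniform sampler below is a simplification and does not capture the paper's similarity-driven pairing.

```python
# Illustrative construction of meta-train / meta-test pairs by sub-sampling
# an existing training set; the paper additionally pairs examples in a
# similarity-driven way, which this uniform sampler does not capture.
import random

def make_meta_tasks(train_data, n_tasks=100, support=32, query=8):
    tasks = []
    for _ in range(n_tasks):
        sample = random.sample(train_data, support + query)
        tasks.append((sample[:support], sample[support:]))  # (meta-train, meta-test)
    return tasks

data = [(f"input {i}", f"output {i}") for i in range(1000)]
tasks = make_meta_tasks(data)
```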
arXiv Detail & Related papers (2021-06-08T11:21:48Z)
- Set Representation Learning with Generalized Sliced-Wasserstein Embeddings [22.845403993200932]
We propose a geometrically-interpretable framework for learning representations from set-structured data.
In particular, we treat elements of a set as samples from a probability measure and propose an exact Euclidean embedding for the Generalized Sliced-Wasserstein distance.
We evaluate our proposed framework on multiple supervised and unsupervised set learning tasks and demonstrate its superiority over state-of-the-art set representation learning approaches.
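A minimal sketch of the plain (non-generalized) sliced-Wasserstein flavor of this idea, assuming equal-size sets: project each set onto shared random directions and sort each projection, so that Euclidean distance between the resulting embeddings matches the empirical sliced 2-Wasserstein distance over those slices.

```python
# Minimal sliced-Wasserstein set embedding (the plain linear-slice case,
# not the paper's generalized version): project set elements onto shared
# random directions and sort each projection. For equal-size sets, the
# Euclidean distance between embeddings equals the empirical sliced
# 2-Wasserstein distance over these slices, up to a constant factor.
import numpy as np

rng = np.random.default_rng(0)

def sw_embed(X: np.ndarray, thetas: np.ndarray) -> np.ndarray:
    """X: (n_points, d) set; thetas: (n_slices, d) unit directions."""
    proj = X @ thetas.T            # (n_points, n_slices)
    return np.sort(proj, axis=0).ravel()

thetas = rng.normal(size=(50, 3))
thetas /= np.linalg.norm(thetas, axis=1, keepdims=True)
A, B = rng.normal(size=(10, 3)), rng.normal(size=(10, 3))
dist = np.linalg.norm(sw_embed(A, thetas) - sw_embed(B, thetas))
```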
arXiv Detail & Related papers (2021-03-05T19:00:34Z)
- Multi-task Supervised Learning via Cross-learning [102.64082402388192]
We consider the problem known as multi-task learning, which consists of fitting a set of regression functions intended for solving different tasks.
In our novel formulation, we couple the parameters of these functions so that they learn in their task-specific domains while staying close to each other.
This facilitates cross-fertilization, in which data collected across different domains help improve the learning performance on each of the other tasks.
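A sketch of one common way to realize such coupling, a mean-proximity penalty; the paper's exact cross-learning formulation may couple parameters differently.

```python
# Illustrative parameter coupling for multi-task regression: each task keeps
# its own weight tensor but is penalized for drifting from the mean of all
# task weights. This proximity term is a generic stand-in, not necessarily
# the paper's exact formulation.
import torch

def coupled_objective(task_losses, task_weights, lam=0.1):
    mean_w = torch.stack(task_weights).mean(dim=0)
    proximity = sum(((w - mean_w) ** 2).sum() for w in task_weights)
    return sum(task_losses) + lam * proximity
```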
arXiv Detail & Related papers (2020-10-24T21:35:57Z)