Compositional Generalization in Semantic Parsing: Pre-training vs.
Specialized Architectures
- URL: http://arxiv.org/abs/2007.08970v3
- Date: Wed, 22 Sep 2021 09:06:43 GMT
- Authors: Daniel Furrer, Marc van Zee, Nathan Scales, Nathanael Schärli
- Abstract summary: We show that pre-training leads to significant improvements in performance vs. comparable non-pre-trained models.
We establish a new state of the art on the CFQ compositional generalization benchmark using pre-training together with an intermediate representation.
- Score: 1.8434042562191812
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: While mainstream machine learning methods are known to have limited ability
to compositionally generalize, new architectures and techniques continue to be
proposed to address this limitation. We investigate state-of-the-art techniques
and architectures in order to assess their effectiveness in improving
compositional generalization in semantic parsing tasks based on the SCAN and
CFQ datasets. We show that masked language model (MLM) pre-training rivals
SCAN-inspired architectures on primitive holdout splits. On a more complex
compositional task, we show that pre-training leads to significant improvements
in performance vs. comparable non-pre-trained models, whereas architectures
proposed to encourage compositional generalization on SCAN or in the area of
algorithm learning fail to lead to significant improvements. We establish a new
state of the art on the CFQ compositional generalization benchmark using MLM
pre-training together with an intermediate representation.
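The "primitive holdout" idea mentioned in the abstract can be illustrated with a minimal sketch. It assumes a toy command grammar written in the style of SCAN; the vocabulary, the `interpret` function, and the `primitive_holdout_split` helper are hypothetical and are not taken from the paper or from the SCAN data files. The split mirrors the general principle that the held-out primitive appears in training only in isolation, while every composed use of it is reserved for testing.

```python
# Toy illustration of a primitive holdout split (hypothetical grammar, not real SCAN data).
PRIMITIVES = {"walk": "I_WALK", "run": "I_RUN", "jump": "I_JUMP", "look": "I_LOOK"}
MODIFIERS = {"": 1, "twice": 2, "thrice": 3}


def interpret(command: str) -> str:
    """Map a toy command such as 'jump twice' to its action sequence."""
    parts = command.split()
    action = PRIMITIVES[parts[0]]
    repeat = MODIFIERS[parts[1]] if len(parts) > 1 else 1
    return " ".join([action] * repeat)


def build_examples():
    """Enumerate every (command, actions) pair of the toy grammar."""
    examples = []
    for verb in PRIMITIVES:
        for modifier in MODIFIERS:
            command = f"{verb} {modifier}".strip()
            examples.append((command, interpret(command)))
    return examples


def primitive_holdout_split(examples, held_out="jump"):
    """Keep the bare held-out primitive in train; move its compositions to test."""
    train, test = [], []
    for command, actions in examples:
        tokens = command.split()
        if held_out in tokens and len(tokens) > 1:
            test.append((command, actions))
        else:
            train.append((command, actions))
    return train, test


train, test = primitive_holdout_split(build_examples())
```

A model trained on `train` sees "jump" only as a standalone command, so answering the `test` commands correctly requires composing it with modifiers that were only ever observed with other verbs.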
Related papers
- HPT++: Hierarchically Prompting Vision-Language Models with Multi-Granularity Knowledge Generation and Improved Structure Modeling [39.14392943549792]
We propose a novel approach called Hierarchical Prompt Tuning (HPT), enabling simultaneous modeling of both structured and conventional linguistic knowledge.
We introduce a relationship-guided attention module to capture pair-wise associations among entities and attributes for low-level prompt learning.
By incorporating high-level and global-level prompts modeling overall semantics, the proposed hierarchical structure forges cross-level interlinks and empowers the model to handle more complex and long-term relationships.
arXiv Detail & Related papers (2024-08-27T06:50:28Z)
- Task Agnostic Architecture for Algorithm Induction via Implicit Composition [10.627575117586417]
This position paper aims to explore developing such a unified architecture and proposes a theoretical framework of how it could be constructed.
Recent generative AI systems, especially Transformer-based models, demonstrate potential as an architecture capable of constructing algorithms for a wide range of domains.
Our exploration delves into current capabilities and limitations of Transformer-based and other methods in efficient and correct algorithm composition.
arXiv Detail & Related papers (2024-04-03T04:31:09Z)
- Structure-CLIP: Towards Scene Graph Knowledge to Enhance Multi-modal Structured Representations [70.41385310930846]
We present an end-to-end framework Structure-CLIP to enhance multi-modal structured representations.
We use scene graphs to guide the construction of semantic negative examples, which results in an increased emphasis on learning structured representations.
A Knowledge-Enhance Encoder (KEE) is proposed to leverage scene graph knowledge (SGK) as input to further enhance structured representations.
arXiv Detail & Related papers (2023-05-06T03:57:05Z)
- Composing Task Knowledge with Modular Successor Feature Approximators [60.431769158952626]
We present a novel neural network architecture, "Modular Successor Feature Approximators" (MSFA).
MSFA is able to better generalize compared to baseline architectures for learning SFs and modular architectures.
arXiv Detail & Related papers (2023-01-28T23:04:07Z)
- Learning from Mistakes: Self-Regularizing Hierarchical Representations in Point Cloud Semantic Segmentation [15.353256018248103]
LiDAR semantic segmentation has gained attention as a means of achieving fine-grained scene understanding.
We present a coarse-to-fine setup that LEArns from classification mistaKes (LEAK) derived from a standard model.
Our LEAK approach is very general and can be seamlessly applied on top of any segmentation architecture.
arXiv Detail & Related papers (2023-01-26T14:52:30Z)
- Real-World Compositional Generalization with Disentangled Sequence-to-Sequence Learning [81.24269148865555]
A recently proposed Disentangled sequence-to-sequence model (Dangle) shows promising generalization capability.
We introduce two key modifications to this model which encourage more disentangled representations and improve its compute and memory efficiency.
Specifically, instead of adaptively re-encoding source keys and values at each time step, we disentangle their representations and only re-encode keys periodically.
arXiv Detail & Related papers (2022-12-12T15:40:30Z)
- Autoregressive Structured Prediction with Language Models [73.11519625765301]
We describe an approach to model structures as sequences of actions in an autoregressive manner with PLMs.
Our approach achieves the new state-of-the-art on all the structured prediction tasks we looked at.
arXiv Detail & Related papers (2022-10-26T13:27:26Z)
- RANK-NOSH: Efficient Predictor-Based Architecture Search via Non-Uniform Successive Halving [74.61723678821049]
We propose NOn-uniform Successive Halving (NOSH), a hierarchical scheduling algorithm that terminates the training of underperforming architectures early to avoid wasting budget.
We formulate predictor-based architecture search as learning to rank with pairwise comparisons.
The resulting method, RANK-NOSH, reduces the search budget by 5x while achieving competitive or even better performance than previous state-of-the-art predictor-based methods on various spaces and datasets.
arXiv Detail & Related papers (2021-08-18T07:45:21Z)
- Edge-assisted Democratized Learning Towards Federated Analytics [67.44078999945722]
We show the hierarchical learning structure of the proposed edge-assisted democratized learning mechanism, namely Edge-DemLearn.
We also validate Edge-DemLearn as a flexible model training mechanism to build a distributed control and aggregation methodology in regions.
arXiv Detail & Related papers (2020-12-01T11:46:03Z)