SeqZero: Few-shot Compositional Semantic Parsing with Sequential Prompts
and Zero-shot Models
- URL: http://arxiv.org/abs/2205.07381v1
- Date: Sun, 15 May 2022 21:13:15 GMT
- Title: SeqZero: Few-shot Compositional Semantic Parsing with Sequential Prompts
and Zero-shot Models
- Authors: Jingfeng Yang, Haoming Jiang, Qingyu Yin, Danqing Zhang, Bing Yin,
Diyi Yang
- Abstract summary: Recent research showed promising results on combining pretrained language models with canonical utterances.
We propose a novel few-shot semantic parsing method, SeqZero.
In particular, SeqZero brings out the merits of both models via an ensemble equipped with our proposed constrained rescaling.
- Score: 57.29358388475983
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recent research showed promising results on combining pretrained language models (LMs) with canonical utterances for few-shot semantic parsing. Canonical utterances are often lengthy and complex due to the compositional structure of formal languages, and learning to generate them requires a significant amount of data to reach high performance. When fine-tuned with only few-shot samples, LMs can easily forget pretrained knowledge, overfit spurious biases, and suffer from compositional out-of-distribution generalization errors. To tackle these issues, we propose a novel few-shot semantic parsing method, SeqZero. SeqZero decomposes the problem into a sequence of sub-problems, which correspond to the sub-clauses of the formal language. Based on this decomposition, the LMs only need to generate short answers using prompts for predicting sub-clauses; thus, SeqZero avoids generating a long canonical utterance at once. Moreover, SeqZero employs not only a few-shot model but also a zero-shot model to alleviate overfitting. In particular, SeqZero brings out the merits of both models via an ensemble equipped with our proposed constrained rescaling. SeqZero achieves SOTA performance among BART-based models on GeoQuery and EcommerceQuery, two few-shot datasets with compositional data splits.
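A minimal sketch of the two ideas in the abstract, for illustration only: the utterance is parsed clause by clause with short prompts, and at each step a few-shot model's token distribution is combined with a zero-shot model's and renormalized over a constrained candidate set. The clause set, the prompt wording, the helper names `next_token_logprobs` and `candidate_ids_for`, and the interpolation weight `alpha` are assumptions for this sketch, not details taken from the paper.

```python
import torch

# Hypothetical sub-clause prompts (illustrative; not the paper's prompts).
SUB_CLAUSE_PROMPTS = [
    ("FROM",  "Which table does the question ask about? "),
    ("WHERE", "Which condition does the question filter on? "),
]

def constrained_rescale(logprobs: torch.Tensor, valid_ids: list) -> torch.Tensor:
    """Zero out probability mass outside the valid candidate tokens and
    renormalize; an assumed reading of the paper's 'constrained rescaling'."""
    probs = logprobs.exp()
    mask = torch.zeros_like(probs)
    mask[valid_ids] = 1.0
    constrained = probs * mask
    return constrained / constrained.sum()

def ensemble_step(fewshot_logprobs, zeroshot_logprobs, valid_ids, alpha=0.7):
    """Interpolate the few-shot and zero-shot distributions over the same
    constrained candidate set; alpha is a made-up hyperparameter."""
    p_few = constrained_rescale(fewshot_logprobs, valid_ids)
    p_zero = constrained_rescale(zeroshot_logprobs, valid_ids)
    return alpha * p_few + (1.0 - alpha) * p_zero

def parse(utterance, fewshot_lm, zeroshot_lm, tokenizer):
    """Predict each sub-clause with its own short prompt instead of
    generating one long canonical utterance (hypothetical driver; the
    model and tokenizer helpers are assumed interfaces)."""
    clauses = {}
    for clause, prompt in SUB_CLAUSE_PROMPTS:
        context = f"{utterance} {prompt}"
        few = fewshot_lm.next_token_logprobs(context)    # assumed API
        zero = zeroshot_lm.next_token_logprobs(context)  # assumed API
        valid_ids = tokenizer.candidate_ids_for(clause)  # assumed API
        probs = ensemble_step(few, zero, valid_ids)
        clauses[clause] = tokenizer.decode([int(probs.argmax())])
    return clauses
```

The abstract only states that a few-shot model and a zero-shot model are ensembled; how each is obtained from BART is not specified here, so the sketch simply treats them as two interchangeable scoring functions.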
Related papers
- Zero-Shot Multi-Hop Question Answering via Monte-Carlo Tree Search with Large Language Models [19.214387260667348]
This paper introduces MZQA (Monte-Carlo tree search for Zero-shot multi-hop Question Answering), a framework based on Monte-Carlo tree search (MCTS).
Unlike previous works, we propose a zero-shot prompting method, which relies solely on instructions without the support of hand-crafted few-shot examples that typically require domain expertise.
We also introduce a behavioral cloning approach (MZQA-BC) trained on self-generated MCTS inference trajectories, achieving an over 10-fold increase in reasoning speed with minimal compromise in performance.
arXiv Detail & Related papers (2024-09-28T15:13:04Z)
- SHINE: Saliency-aware HIerarchical NEgative Ranking for Compositional Temporal Grounding [52.98133831401225]
Temporal grounding, also known as video moment retrieval, aims at locating video segments corresponding to a given query sentence.
We propose a large language model-driven method for negative query construction, utilizing GPT-3.5-Turbo.
We introduce a coarse-to-fine saliency ranking strategy, which encourages the model to learn the multi-granularity semantic relationships between videos and hierarchical negative queries.
arXiv Detail & Related papers (2024-07-06T16:08:17Z)
- Language-Independent Representations Improve Zero-Shot Summarization [18.46817967804773]
Finetuning pretrained models on downstream generation tasks often leads to catastrophic forgetting in zero-shot conditions.
In this work, we focus on summarization and tackle the problem through the lens of language-independent representations.
We first show naively finetuned models are highly language-specific in both output behavior and internal representations, resulting in poor zero-shot performance.
arXiv Detail & Related papers (2024-04-08T17:56:43Z)
- ZEROTOP: Zero-Shot Task-Oriented Semantic Parsing using Large Language Models [6.13621607944513]
We propose ZEROTOP, a zero-shot task-oriented parsing method that decomposes a semantic parsing problem into a set of abstractive and extractive question-answering problems.
We show that our QA-based decomposition paired with the fine-tuned LLM can correctly parse 16% of utterances in the MTOP dataset without requiring any annotated data.
arXiv Detail & Related papers (2022-12-21T07:06:55Z)
- Structural generalization is hard for sequence-to-sequence models [85.0087839979613]
Sequence-to-sequence (seq2seq) models have been successful across many NLP tasks.
Recent work on compositional generalization has shown that seq2seq models achieve very low accuracy in generalizing to linguistic structures that were not seen in training.
arXiv Detail & Related papers (2022-10-24T09:03:03Z)
- Thutmose Tagger: Single-pass neural model for Inverse Text Normalization [76.87664008338317]
Inverse text normalization (ITN) is an essential post-processing step in automatic speech recognition.
We present a dataset preparation method based on the granular alignment of ITN examples.
One-to-one correspondence between tags and input words improves the interpretability of the model's predictions.
arXiv Detail & Related papers (2022-07-29T20:39:02Z)
- Quark: Controllable Text Generation with Reinforced Unlearning [68.07749519374089]
Large-scale language models often learn behaviors that are misaligned with user expectations.
We introduce Quantized Reward Konditioning (Quark), an algorithm for optimizing a reward function that quantifies an (un)wanted property.
For unlearning toxicity, negative sentiment, and repetition, our experiments show that Quark outperforms both strong baselines and state-of-the-art reinforcement learning methods.
arXiv Detail & Related papers (2022-05-26T21:11:51Z)
- Few-shot Instruction Prompts for Pretrained Language Models to Detect Social Biases [55.45617404586874]
We propose a few-shot instruction-based method for prompting pre-trained language models (LMs).
We show that large LMs can detect different types of fine-grained biases with similar and sometimes superior accuracy to fine-tuned models.
arXiv Detail & Related papers (2021-12-15T04:19:52Z)
This list is automatically generated from the titles and abstracts of the papers on this site.