Few-Shot Adaptation for Parsing Contextual Utterances with LLMs
- URL: http://arxiv.org/abs/2309.10168v1
- Date: Mon, 18 Sep 2023 21:35:19 GMT
- Title: Few-Shot Adaptation for Parsing Contextual Utterances with LLMs
- Authors: Kevin Lin, Patrick Xia, Hao Fang
- Abstract summary: In real-world settings, there typically exists only a limited number of contextual utterances due to annotation cost.
We examine four major paradigms for adapting parsers to contextual utterances in conversational semantic parsing.
Experiments with in-context learning and fine-tuning suggest that Rewrite-then-Parse is the most promising paradigm.
- Score: 25.22099517947426
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We evaluate the ability of semantic parsers based on large language models
(LLMs) to handle contextual utterances. In real-world settings, there typically
exists only a limited number of annotated contextual utterances due to
annotation cost, resulting in an imbalance compared to non-contextual
utterances. Therefore, parsers must adapt to contextual utterances with a few
training examples. We examine four major paradigms for doing so in
conversational semantic parsing, i.e., Parse-with-Utterance-History,
Parse-with-Reference-Program, Parse-then-Resolve, and Rewrite-then-Parse. To
facilitate such cross-paradigm comparisons, we construct
SMCalFlow-EventQueries, a subset of contextual examples from SMCalFlow with
additional annotations. Experiments with in-context learning and fine-tuning
suggest that Rewrite-then-Parse is the most promising paradigm when
holistically considering parsing accuracy, annotation cost, and error types.
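The four paradigms named in the abstract differ mainly in what context the parser sees and when that context is resolved. The sketch below is a rough illustration of those differences, not the authors' implementation. The `Turn` class, the input string formats, and the `rewriter`, `parser`, and `resolver` callables are all hypothetical placeholders. In practice each would be an LLM used with in-context learning or fine-tuning.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Turn:
    """One prior dialogue turn (hypothetical container, not from the paper)."""
    utterance: str
    program: str


def input_with_utterance_history(history: List[Turn], current: str) -> str:
    # Parse-with-Utterance-History: the parser input is the current
    # utterance together with the preceding utterances.
    prior = " | ".join(t.utterance for t in history)
    return f"history: {prior} || utterance: {current}"


def input_with_reference_program(history: List[Turn], current: str) -> str:
    # Parse-with-Reference-Program: the parser input is the current
    # utterance together with the program of the previous turn.
    reference = history[-1].program if history else ""
    return f"reference: {reference} || utterance: {current}"


def parse_then_resolve(
    history: List[Turn],
    current: str,
    parser: Callable[[str], str],
    resolver: Callable[[str, List[str]], str],
) -> str:
    # Parse-then-Resolve: parse the utterance on its own into an
    # intermediate program, then resolve contextual references
    # against the programs of earlier turns.
    intermediate = parser(current)
    return resolver(intermediate, [t.program for t in history])


def rewrite_then_parse(
    history: List[Turn],
    current: str,
    rewriter: Callable[[List[str], str], str],
    parser: Callable[[str], str],
) -> str:
    # Rewrite-then-Parse: rewrite the contextual utterance into a
    # standalone one, then parse it with a non-contextual parser.
    standalone = rewriter([t.utterance for t in history], current)
    return parser(standalone)
```

Under these assumptions, Rewrite-then-Parse is the only paradigm whose parser never sees dialogue context, so a parser trained on non-contextual utterances can be reused unchanged.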
Related papers
- A General and Flexible Multi-concept Parsing Framework for Multilingual Semantic Matching [60.51839859852572]
We propose to resolve the text into multi concepts for multilingual semantic matching to liberate the model from the reliance on NER models.
We conduct comprehensive experiments on English datasets QQP and MRPC, and Chinese dataset Medical-SM.
arXiv Detail & Related papers (2024-03-05T13:55:16Z)
- Making Retrieval-Augmented Language Models Robust to Irrelevant Context [55.564789967211844]
An important desideratum of RALMs is that retrieved information helps model performance when it is relevant.
Recent work has shown that retrieval augmentation can sometimes have a negative effect on performance.
arXiv Detail & Related papers (2023-10-02T18:52:35Z)
- Evaluating Factual Consistency of Texts with Semantic Role Labeling [3.1776833268555134]
We introduce SRLScore, a reference-free evaluation metric designed with text summarization in mind.
A final factuality score is computed by an adjustable scoring mechanism.
Correlation with human judgments on English summarization datasets shows that SRLScore is competitive with state-of-the-art methods.
arXiv Detail & Related papers (2023-05-22T17:59:42Z)
- Semantic Parsing for Conversational Question Answering over Knowledge Graphs [63.939700311269156]
We develop a dataset where user questions are annotated with Sparql parses and system answers correspond to execution results thereof.
We present two different semantic parsing approaches and highlight the challenges of the task.
Our dataset and models are released at https://github.com/Edinburgh/SPICE.
arXiv Detail & Related papers (2023-01-28T14:45:11Z)
- Nonparametric Masked Language Modeling [113.71921977520864]
Existing language models (LMs) predict tokens with a softmax over a finite vocabulary.
We introduce NPM, the first nonparametric masked language model that replaces this softmax with a nonparametric distribution over every phrase in a reference corpus.
NPM can be efficiently trained with a contrastive objective and an in-batch approximation to full corpus retrieval.
arXiv Detail & Related papers (2022-12-02T18:10:42Z)
- DP-Parse: Finding Word Boundaries from Raw Speech with an Instance Lexicon [18.05179713472479]
We introduce DP-Parse, which uses similar principles but only relies on an instance lexicon of word tokens.
On the Zero Resource Speech Benchmark 2017, our model sets a new speech segmentation state-of-the-art in 5 languages.
Despite lacking a type lexicon, DP-Parse can be pipelined to a language model and learn semantic representations, as assessed by a new spoken word embedding benchmark.
arXiv Detail & Related papers (2022-06-22T19:15:57Z)
- BenchCLAMP: A Benchmark for Evaluating Language Models on Syntactic and Semantic Parsing [55.058258437125524]
We introduce BenchCLAMP, a Benchmark to evaluate Constrained LAnguage Model Parsing.
We benchmark eight language models, including two GPT-3 variants available only through an API.
Our experiments show that encoder-decoder pretrained language models can achieve similar performance or surpass state-of-the-art methods for syntactic and semantic parsing when the model output is constrained to be valid.
arXiv Detail & Related papers (2022-06-21T18:34:11Z)
- Automatic Correction of Syntactic Dependency Annotation Differences [17.244143187393078]
We propose a method for automatically detecting annotation mismatches between dependency parsing corpora.
All three methods rely on comparing an unseen example in a new corpus with similar examples in an existing corpus.
We then evaluate these conversions by retraining two dependency parsers -- Stanza (Qi et al. 2020) and Parsing as Tagging (PaT) -- on the converted and unconverted data.
arXiv Detail & Related papers (2022-01-15T17:17:55Z)
- On The Ingredients of an Effective Zero-shot Semantic Parser [95.01623036661468]
We analyze zero-shot learning by paraphrasing training examples of canonical utterances and programs from a grammar.
We propose bridging these gaps using improved grammars, stronger paraphrasers, and efficient learning methods.
Our model achieves strong performance on two semantic parsing benchmarks (Scholar, Geo) with zero labeled data.
arXiv Detail & Related papers (2021-10-15T21:41:16Z)
- An Imitation Game for Learning Semantic Parsers from User Interaction [43.66945504686796]
We suggest an alternative, human-in-the-loop methodology for learning semantic parsers directly from users.
A semantic parser should be introspective and prompt for user demonstration when uncertain.
In doing so it also gets to imitate the user behavior and continue improving itself autonomously.
arXiv Detail & Related papers (2020-05-02T03:30:49Z)