PIZZA: A new benchmark for complex end-to-end task-oriented parsing
- URL: http://arxiv.org/abs/2212.00265v1
- Date: Thu, 1 Dec 2022 04:20:07 GMT
- Title: PIZZA: A new benchmark for complex end-to-end task-oriented parsing
- Authors: Konstantine Arkoudas, Nicolas Guenon des Mesnards, Melanie Rubino,
Sandesh Swamy, Saarthak Khanna, Weiqi Sun, Khan Haidar
- Abstract summary: This paper introduces a new dataset for parsing pizza and drink orders, whose semantics cannot be captured by flat slots and intents.
We perform an evaluation of deep-learning techniques for task-oriented parsing on this dataset, including different flavors of seq2seq systems and RNNGs.
- Score: 3.5106870325869886
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Much recent work in task-oriented parsing has focused on finding a middle
ground between flat slots and intents, which are inexpressive but easy to
annotate, and powerful representations such as the lambda calculus, which are
expressive but costly to annotate. This paper continues the exploration of
task-oriented parsing by introducing a new dataset for parsing pizza and drink
orders, whose semantics cannot be captured by flat slots and intents. We
perform an extensive evaluation of deep-learning techniques for task-oriented
parsing on this dataset, including different flavors of seq2seq systems and
RNNGs. The dataset comes in two main versions, one in a recently introduced
utterance-level hierarchical notation that we call TOP, and one whose targets
are executable representations (EXR). We demonstrate empirically that training
the parser to directly generate EXR notation not only solves the problem of
entity resolution in one fell swoop and overcomes a number of expressive
limitations of TOP notation, but also results in significantly greater parsing
accuracy.
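To make the two target notations concrete, here is a hypothetical sketch of the difference between a TOP-style parse, whose leaves are the utterance's own words, and an EXR-style target, whose leaves are resolved entity symbols. The bracket labels and the toy entity catalog below are illustrative assumptions, not the dataset's actual inventory:

```python
# Illustrative sketch only: labels and catalog entries are assumptions,
# not the PIZZA dataset's actual vocabulary.

# TOP-style target: hierarchical spans over the utterance's surface tokens.
top_parse = "(ORDER (PIZZAORDER (NUMBER two ) (SIZE large ) (TOPPING peppers ) ) )"

# A toy entity catalog standing in for the resolution step that EXR
# targets fold directly into the parse.
CATALOG = {
    "two": "2",
    "large": "LARGE",
    "peppers": "GREEN_PEPPERS",
}

def top_to_exr(top: str) -> str:
    """Replace surface-token leaves with resolved entity symbols, keeping brackets."""
    out = []
    for tok in top.split():
        if tok.startswith("(") or tok == ")":
            out.append(tok)                    # structural symbols pass through
        else:
            out.append(CATALOG.get(tok, tok))  # resolve leaf tokens
    return " ".join(out)

print(top_to_exr(top_parse))
# (ORDER (PIZZAORDER (NUMBER 2 ) (SIZE LARGE ) (TOPPING GREEN_PEPPERS ) ) )
```

A parser trained to emit the EXR form directly never produces the surface-token leaves at all, which is why a separate entity-resolution step becomes unnecessary.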
Related papers
- LESS: Label-Efficient and Single-Stage Referring 3D Segmentation [55.06002976797879]
Referring 3D segmentation is a visual-language task that segments all points of an object specified by a natural-language query from a 3D point cloud.
We propose a novel Referring 3D pipeline, Label-Efficient and Single-Stage, dubbed LESS, which is supervised only by efficient binary masks.
We achieve state-of-the-art performance on the ScanRefer dataset, surpassing previous methods by about 3.7% mIoU while using only binary labels.
arXiv Detail & Related papers (2024-10-17T07:47:41Z) - LaSagnA: Language-based Segmentation Assistant for Complex Queries [39.620806493454616]
Large Language Models for Vision (vLLMs) generate detailed perceptual outcomes, including bounding boxes and masks.
In this study, we identify the insufficient complexity of training queries as the main cause of these limitations.
We present three novel strategies to effectively handle the challenges arising from the direct integration of the proposed format.
arXiv Detail & Related papers (2024-04-12T14:40:45Z) - OmniParser: A Unified Framework for Text Spotting, Key Information Extraction and Table Recognition [79.852642726105]
We propose a unified paradigm for parsing visually-situated text across diverse scenarios.
Specifically, we devise a universal model, called OmniParser, which can simultaneously handle three typical visually-situated text parsing tasks.
In OmniParser, all tasks share the unified encoder-decoder architecture, the unified objective of point-conditioned text generation, and the unified input representation.
arXiv Detail & Related papers (2024-03-28T03:51:14Z) - On Task-personalized Multimodal Few-shot Learning for Visually-rich
Document Entity Retrieval [59.25292920967197]
Few-shot visually-rich document entity retrieval (VDER) is an important topic in industrial NLP applications.
FewVEX is a new dataset to boost future research in the field of entity-level few-shot VDER.
We present a task-aware meta-learning based framework, with a central focus on achieving effective task personalization.
arXiv Detail & Related papers (2023-11-01T17:51:43Z) - SLUE Phase-2: A Benchmark Suite of Diverse Spoken Language Understanding
Tasks [88.4408774253634]
Spoken language understanding (SLU) tasks have been studied for many decades in the speech research community.
There are not nearly as many SLU task benchmarks, and many of the existing ones use data that is not freely available to all researchers.
Recent work has begun to introduce such benchmarks for several tasks.
arXiv Detail & Related papers (2022-12-20T18:39:59Z) - ReSel: N-ary Relation Extraction from Scientific Text and Tables by
Learning to Retrieve and Select [53.071352033539526]
We study the problem of extracting N-ary relations from scientific articles.
Our proposed method ReSel decomposes this task into a two-stage procedure.
Our experiments on three scientific information extraction datasets show that ReSel outperforms state-of-the-art baselines significantly.
arXiv Detail & Related papers (2022-10-26T02:28:02Z) - Training Naturalized Semantic Parsers with Very Little Data [10.709587018625275]
State-of-the-art (SOTA) semantic parsers are seq2seq architectures based on large language models that have been pretrained on vast amounts of text.
Recent work has explored a reformulation of semantic parsing whereby the output sequences are themselves natural language sentences.
We show that this method delivers new SOTA few-shot performance on the Overnight dataset.
arXiv Detail & Related papers (2022-04-29T17:14:54Z) - SLUE: New Benchmark Tasks for Spoken Language Understanding Evaluation
on Natural Speech [44.68649535280397]
We propose a suite of benchmark tasks for Spoken Language Understanding Evaluation (SLUE)
SLUE consists of limited-size labeled training sets and corresponding evaluation sets.
We present the first phase of the SLUE benchmark suite, consisting of named entity recognition, sentiment analysis, and ASR on the corresponding datasets.
We provide new transcriptions and annotations on subsets of the VoxCeleb and VoxPopuli datasets, evaluation metrics and results for baseline models, and an open-source toolkit to reproduce the baselines and evaluate new models.
arXiv Detail & Related papers (2021-11-19T18:59:23Z) - X2Parser: Cross-Lingual and Cross-Domain Framework for Task-Oriented
Compositional Semantic Parsing [51.81533991497547]
Task-oriented compositional semantic parsing (TCSP) handles complex nested user queries.
We present X2Parser, a transferable Cross-lingual and Cross-domain Parser for TCSP.
We propose to predict flattened intents and slots representations separately and cast both prediction tasks into sequence labeling problems.
arXiv Detail & Related papers (2021-06-07T16:40:05Z) - Don't Parse, Generate! A Sequence to Sequence Architecture for
Task-Oriented Semantic Parsing [0.0]
Virtual assistants such as Amazon Alexa, Apple Siri, and Google Assistant often rely on a semantic parsing component to understand which action(s) to execute for an utterance spoken by their users.
We propose a unified architecture based on Sequence to Sequence models and Pointer Generator Network to handle both simple and complex queries.
arXiv Detail & Related papers (2020-01-30T17:11:00Z)
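The Pointer Generator mechanism above lets the decoder emit a mix of closed-vocabulary grammar tokens and pointers that copy words from the source utterance. A minimal sketch of decoding such a mixed output sequence, assuming a hypothetical `@ptr_i` placeholder convention and illustrative intent/slot labels (neither is taken from the paper):

```python
# Hypothetical sketch of resolving a pointer-generator output sequence.
# The "@ptr_i" placeholder convention and the example labels are
# illustrative assumptions, not the paper's exact notation.

def resolve_pointers(output_tokens, source_tokens):
    """Replace @ptr_i placeholders with the corresponding source words."""
    resolved = []
    for tok in output_tokens:
        if tok.startswith("@ptr_"):
            idx = int(tok[len("@ptr_"):])
            resolved.append(source_tokens[idx])  # copied from the utterance
        else:
            resolved.append(tok)                 # closed-vocabulary grammar token
    return resolved

source = "play the latest album by adele".split()
output = ["[IN:PLAY_MUSIC", "[SL:ARTIST", "@ptr_5", "]", "]"]
print(" ".join(resolve_pointers(output, source)))
# [IN:PLAY_MUSIC [SL:ARTIST adele ] ]
```

Copying by source position rather than generating words from an open vocabulary is what lets one architecture handle both simple (flat) and complex (nested) queries: the grammar tokens carry the structure, while the pointers carry the utterance-specific content.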
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.