PMB5: Gaining More Insight into Neural Semantic Parsing with Challenging Benchmarks
- URL: http://arxiv.org/abs/2404.08354v4
- Date: Mon, 16 Sep 2024 09:51:21 GMT
- Title: PMB5: Gaining More Insight into Neural Semantic Parsing with Challenging Benchmarks
- Authors: Xiao Zhang, Chunliu Wang, Rik van Noord, Johan Bos
- Abstract summary: We evaluate neural models for semantic parsing and meaning-to-text generation on the Parallel Meaning Bank.
First, instead of the prior random split, we propose a more systematic splitting approach to improve the reliability of the standard test data.
Second, in addition to the standard test set, we propose two challenge sets: one with longer texts including discourse structure, and one that addresses compositional generalization.
- Score: 9.31054333943453
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The Parallel Meaning Bank (PMB) serves as a corpus for semantic processing with a focus on semantic parsing and text generation. Currently, we witness excellent performance of neural parsers and generators on the PMB. This might suggest that such semantic processing tasks have by and large been solved. We argue that this is not the case and that past performance scores on the PMB are inflated by non-optimal data splits and test sets that are too easy. In response, we introduce several changes. First, instead of the prior random split, we propose a more systematic splitting approach to improve the reliability of the standard test data. Second, in addition to the standard test set, we propose two challenge sets: one with longer texts including discourse structure, and one that addresses compositional generalization. We evaluate five neural models for semantic parsing and meaning-to-text generation. Our results show that model performance declines (in some cases dramatically) on the challenge sets, revealing the limitations of neural models when confronted with such challenges.
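As a loose illustration of the first change, here is a minimal Python sketch of a systematic (non-random) split; the length-bucket criterion, the "text" field name, and the split fractions are assumptions for the sketch, not the paper's actual procedure:

```python
# Hypothetical sketch: stratify a corpus by sentence length before
# splitting, so dev/test mirror the corpus instead of a random draw.
import random
from collections import defaultdict

def systematic_split(examples, dev_frac=0.1, test_frac=0.1, seed=42):
    """Bucket examples by length, then sample each bucket proportionally."""
    rng = random.Random(seed)
    buckets = defaultdict(list)
    for ex in examples:
        buckets[len(ex["text"].split()) // 5].append(ex)  # 5-token bins
    train, dev, test = [], [], []
    for bucket in buckets.values():
        rng.shuffle(bucket)
        n_dev = round(len(bucket) * dev_frac)
        n_test = round(len(bucket) * test_frac)
        dev.extend(bucket[:n_dev])
        test.extend(bucket[n_dev:n_dev + n_test])
        train.extend(bucket[n_dev + n_test:])
    return train, dev, test
```

The point of such a split is that test performance is measured on data whose distribution is controlled rather than left to chance, which is what makes the resulting scores more reliable.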
Related papers
- Detecting Document-level Paraphrased Machine Generated Content: Mimicking Human Writing Style and Involving Discourse Features [57.34477506004105]
Machine-generated content poses challenges such as academic plagiarism and the spread of misinformation.
We introduce novel methodologies and datasets to overcome these challenges.
We propose MhBART, an encoder-decoder model designed to emulate human writing style.
We also propose DTransformer, a model that integrates discourse analysis through PDTB preprocessing to encode structural features.
arXiv Detail & Related papers (2024-12-17T08:47:41Z)
- Tilt your Head: Activating the Hidden Spatial-Invariance of Classifiers [0.7704032792820767]
Deep neural networks are applied in more and more areas of everyday life.
They still lack essential abilities, such as robustly dealing with spatially transformed input signals.
We propose a novel technique to emulate such an inference process for neural nets.
arXiv Detail & Related papers (2024-05-06T09:47:29Z)
- Alleviating Over-smoothing for Unsupervised Sentence Representation [96.19497378628594]
We present a simple method named Self-Contrastive Learning (SSCL) to alleviate the over-smoothing issue.
The method is straightforward and can be easily extended to various state-of-the-art models to boost performance.
arXiv Detail & Related papers (2023-05-09T11:00:02Z)
- Conciseness: An Overlooked Language Task [11.940413163824887]
We define the task and show that it is different from related tasks such as summarization and simplification.
We demonstrate that conciseness is a difficult task for which zero-shot setups with large neural language models often do not perform well.
arXiv Detail & Related papers (2022-11-08T09:47:11Z)
- SUN: Exploring Intrinsic Uncertainties in Text-to-SQL Parsers [61.48159785138462]
This paper aims to improve the performance of text-to-SQL parsing by exploring the intrinsic uncertainties in neural network based approaches (called SUN).
Extensive experiments on five benchmark datasets demonstrate that our method significantly outperforms competitors and achieves new state-of-the-art results.
arXiv Detail & Related papers (2022-09-14T06:27:51Z)
- Confident Adaptive Language Modeling [95.45272377648773]
CALM is a framework for dynamically allocating different amounts of compute per input and generation timestep.
We demonstrate the efficacy of our framework in reducing compute -- potential speedup of up to $\times 3$ -- while provably maintaining high performance.
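A rough sketch of the early-exit idea behind this kind of dynamic compute allocation; the layer/classifier API and the fixed 0.9 threshold are assumptions, whereas CALM itself calibrates its exit thresholds to provably bound quality loss:

```python
# Hypothetical sketch: run decoder layers one at a time and stop early
# once the intermediate prediction is confident enough for this token.
import torch

def decode_token(hidden, layers, classifier, threshold=0.9):
    token = None
    for layer in layers:
        hidden = layer(hidden)
        probs = torch.softmax(classifier(hidden), dim=-1)
        confidence, token = probs.max(dim=-1)
        if confidence.item() >= threshold:
            return token, hidden  # early exit: skip remaining layers
    return token, hidden  # hard token: used the full stack

# e.g. layers = [torch.nn.Linear(64, 64) for _ in range(12)]
#      classifier = torch.nn.Linear(64, 32000)
#      token, _ = decode_token(torch.randn(64), layers, classifier)
```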
arXiv Detail & Related papers (2022-07-14T17:00:19Z)
- HETFORMER: Heterogeneous Transformer with Sparse Attention for Long-Text Extractive Summarization [57.798070356553936]
HETFORMER is a Transformer-based pre-trained model with multi-granularity sparse attentions for extractive summarization.
Experiments on both single- and multi-document summarization tasks show that HETFORMER achieves state-of-the-art performance in ROUGE F1.
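A toy illustration of one common multi-granularity sparse attention pattern (local token windows plus global sentence-level nodes, Longformer-style); this pattern is an assumption for the sketch, not HETFORMER's actual heterogeneous graph structure:

```python
# Hypothetical toy mask: token positions attend within a local window,
# while designated sentence positions attend (and are attended) globally.
import torch

def sparse_attention_mask(n_positions, sentence_positions, window=2):
    mask = torch.zeros(n_positions, n_positions, dtype=torch.bool)
    for i in range(n_positions):                    # local sliding window
        lo, hi = max(0, i - window), min(n_positions, i + window + 1)
        mask[i, lo:hi] = True
    for s in sentence_positions:                    # global nodes
        mask[s, :] = True
        mask[:, s] = True
    return mask  # True = attention allowed

m = sparse_attention_mask(10, sentence_positions=[0, 5])
```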
arXiv Detail & Related papers (2021-10-12T22:42:31Z)
- Contextualized Semantic Distance between Highly Overlapped Texts [85.1541170468617]
Overlap frequently occurs between paired texts in natural language processing tasks like text editing and semantic similarity evaluation.
This paper aims to address the issue with a mask-and-predict strategy.
We take the words in the longest common subsequence as neighboring words and use masked language modeling (MLM) to predict the distributions on their positions.
Experiments on Semantic Textual Similarity show that the resulting metric, Neighboring Distribution Divergence (NDD), is more sensitive to various semantic differences, especially on highly overlapped paired texts.
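A minimal sketch of the mask-and-predict idea with an off-the-shelf MLM; the model choice, single-occurrence masking, and the one-position KL term below are simplifying assumptions, not the paper's exact formulation:

```python
# Hypothetical sketch: mask a shared word in each text and compare the
# MLM's predicted distributions at that position across the two texts.
import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("bert-base-uncased")
mlm = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")

def masked_distribution(text, target):
    """Mask the first occurrence of `target`; return the MLM's
    vocabulary distribution at that position."""
    ids = tok(text, return_tensors="pt")["input_ids"][0]
    pos = (ids == tok.convert_tokens_to_ids(target)).nonzero()[0, 0]
    ids[pos] = tok.mask_token_id
    with torch.no_grad():
        logits = mlm(input_ids=ids.unsqueeze(0)).logits[0, pos]
    return torch.softmax(logits, dim=-1)

p = masked_distribution("The quick brown fox jumps.", "fox")
q = masked_distribution("The lazy brown fox sleeps.", "fox")
divergence = torch.sum(p * (p / q).log())  # KL(p || q) at one position
```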
arXiv Detail & Related papers (2021-10-04T03:59:15Z)
- ARMAN: Pre-training with Semantically Selecting and Reordering of Sentences for Persian Abstractive Summarization [7.16879432974126]
We propose ARMAN, a Transformer-based encoder-decoder model pre-trained with three novel objectives designed for abstractive summarization.
In ARMAN, salient sentences from a document are selected according to a modified semantic score to be masked and form a pseudo summary.
We show that our proposed model achieves state-of-the-art performance on all six summarization tasks measured by ROUGE and BERTScore.
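A toy sketch of selecting salient sentences to mask as a pseudo-summary (PEGASUS-style gap-sentence generation); the crude lexical-overlap score below stands in for ARMAN's modified semantic score:

```python
# Hypothetical sketch: score each sentence by lexical overlap with the
# rest of the document; mask the top sentences to form a pseudo-summary.
def select_salient(sentences, n_mask=1):
    def score(i):
        rest = {w for j, s in enumerate(sentences) if j != i
                for w in s.lower().split()}
        words = set(sentences[i].lower().split())
        return len(words & rest) / max(len(words), 1)

    ranked = sorted(range(len(sentences)), key=score, reverse=True)
    masked = set(ranked[:n_mask])
    pseudo_summary = [sentences[i] for i in sorted(masked)]
    document = ["<mask>" if i in masked else s
                for i, s in enumerate(sentences)]
    return document, pseudo_summary
```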
arXiv Detail & Related papers (2021-09-09T08:35:39Z)
- FAT ALBERT: Finding Answers in Large Texts using Semantic Similarity Attention Layer based on BERT [0.5772546394254112]
We develop a model based on BERT, a state-of-the-art transformer network.
We are ranked first on the leaderboard with a test accuracy of 87.79%.
arXiv Detail & Related papers (2020-08-22T08:04:21Z)
- POINTER: Constrained Progressive Text Generation via Insertion-based Generative Pre-training [93.79766670391618]
We present POINTER, a novel insertion-based approach for hard-constrained text generation.
The proposed method operates by progressively inserting new tokens between existing tokens in a parallel manner.
The resulting coarse-to-fine hierarchy makes the generation process intuitive and interpretable.
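A toy sketch of the progressive insertion loop; the stand-in `propose_insertion` function and the one-token-per-gap-per-round simplification are assumptions, whereas POINTER uses a trained insertion Transformer over all gaps:

```python
# Hypothetical sketch: start from hard-constraint keywords and repeatedly
# insert tokens between existing ones until no insertions are proposed.
def generate_progressively(keywords, propose_insertion, max_rounds=8):
    tokens = list(keywords)
    for _ in range(max_rounds):
        inserted, new_tokens = False, []
        for i, tok in enumerate(tokens):
            new_tokens.append(tok)
            filler = propose_insertion(tokens, gap=i)  # token or None
            if filler is not None:
                new_tokens.append(filler)
                inserted = True
        tokens = new_tokens
        if not inserted:  # model predicted "no insertion" everywhere
            break
    return " ".join(tokens)

# Trivial stub in place of a trained insertion model:
print(generate_progressively(["sources", "argue", "incident"],
                             lambda tokens, gap: None))
```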
arXiv Detail & Related papers (2020-05-01T18:11:54Z)
- Sparse Text Generation [7.747003493657217]
Current text generators require sampling from a modified softmax, via temperature parameters or ad hoc truncation techniques such as top-$k$ or nucleus sampling; this creates a mismatch between how the model is trained and how it is sampled.
In this paper, we use the recently introduced entmax transformation to train and sample from a sparse language model, avoiding this mismatch.
The result is a text generator with favorable performance in terms of fluency and consistency, fewer repetitions, and n-gram diversity closer to human text.
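A minimal sketch of sampling from an entmax distribution using the open-source `entmax` package (the logits below are made up); because entmax assigns exact zeros to low-scoring tokens, no top-$k$ or nucleus truncation is needed at sampling time:

```python
# Hypothetical sketch: sparse sampling with 1.5-entmax instead of a
# truncated softmax (pip install entmax).
import torch
from entmax import entmax15

logits = torch.tensor([3.0, 2.5, 1.0, -1.0, -3.0])  # made-up scores
probs = entmax15(logits, dim=-1)   # low-scoring tokens get exactly 0
token = torch.multinomial(probs, num_samples=1)  # sample, no truncation
```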
arXiv Detail & Related papers (2020-04-06T13:09:10Z)