Automatic Prediction of the Performance of Every Parser
- URL: http://arxiv.org/abs/2407.05116v1
- Date: Sat, 6 Jul 2024 15:49:24 GMT
- Title: Automatic Prediction of the Performance of Every Parser
- Authors: Ergun Biçici
- Abstract summary: We present a new parser performance prediction (PPP) model built on the machine translation performance prediction system (MTPPS).
This new system, MTPPS-PPP, can predict the performance of any parser in any language and can be useful for estimating the grammatical difficulty of understanding a text.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We present a new parser performance prediction (PPP) model built on the machine translation performance prediction system (MTPPS). It is statistically independent of any language or parser, relying only on extrinsic and novel features based on textual, link structural, and bracketing tree structural information. This new system, MTPPS-PPP, can predict the performance of any parser in any language and can be useful for estimating the grammatical difficulty of understanding a given text, for setting expectations from parsing output, for parser selection in a specific domain, and for parser combination systems. We obtain state-of-the-art results in PPP of bracketing $F_1$, with better results over textual features and performance similar to previous results that use parser- and linguistic-label-specific information. Our results show the contribution of different types of features as well as rankings of individual features in different experimental settings (cased vs. uncased), in different learning tasks (in-domain vs. out-of-domain), with different training sets, with different learning algorithms, and with different dimensionality reduction techniques. We achieve $0.0678$ MAE and $0.85$ RAE in the +Link setting, which corresponds to about $7.4\%$ error when predicting the bracketing $F_1$ score of the Charniak and Johnson parser on the WSJ23 test set. The MTPPS-PPP system can predict without parsing, using only the text; without a supervised parser, using only an unsupervised parser; without any parser- or language-dependent information; and without a reference parser output, and it can be used to predict the performance of any parser in any language.
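Since the headline numbers are an MAE of $0.0678$ and an RAE of $0.85$ on predicted bracketing $F_1$, a minimal sketch of how these quantities are typically computed may help. This is not the authors' code; the example scores, the example spans, and the $\approx 0.91$ Charniak-Johnson $F_1$ used to recover the $7.4\%$ relative-error figure are illustrative assumptions.

```python
import numpy as np

def bracket_f1(gold_spans, pred_spans):
    """PARSEVAL-style bracketing F1 for one tree, where each
    constituent is a (label, start, end) tuple."""
    gold, pred = set(gold_spans), set(pred_spans)
    if not gold or not pred:
        return 0.0
    matched = len(gold & pred)
    precision = matched / len(pred)
    recall = matched / len(gold)
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

def mae(y_true, y_pred):
    """Mean absolute error of the predicted F1 scores."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    return float(np.mean(np.abs(y_true - y_pred)))

def rae(y_true, y_pred):
    """Relative absolute error: total absolute error normalized by
    the error of always predicting the mean of the true scores."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    return float(np.sum(np.abs(y_true - y_pred))
                 / np.sum(np.abs(y_true - np.mean(y_true))))

# Illustrative constituents: 2 of 3 brackets match, so F1 = 2/3.
gold = [("S", 0, 5), ("NP", 0, 2), ("VP", 2, 5)]
pred = [("S", 0, 5), ("NP", 0, 1), ("VP", 2, 5)]
print(f"bracketing F1 = {bracket_f1(gold, pred):.3f}")

# Hypothetical per-sentence true vs. predicted bracketing F1 scores.
y_true = [0.92, 0.88, 0.95, 0.90]
y_pred = [0.91, 0.85, 0.93, 0.94]
print(f"MAE = {mae(y_true, y_pred):.4f}")
print(f"RAE = {rae(y_true, y_pred):.4f}")
# The paper's 7.4% figure reads as MAE relative to the F1 being
# predicted: 0.0678 / 0.91 ≈ 0.074 if Charniak-Johnson F1 on
# WSJ23 is around 0.91 (an assumption inferred from the numbers).
```

Under this standard definition of RAE, the reported $0.85$ would mean the predictor's total absolute error is about $15\%$ lower than that of the baseline that always guesses the average $F_1$.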
Related papers
- AdaParse: An Adaptive Parallel PDF Parsing and Resource Scaling Engine [33.22885510488797]
We introduce an Adaptive Parallel PDF Parsing and Resource Scaling Engine (AdaParse).
AdaParse is a data-driven strategy for assigning an appropriate parser to each document.
We show that AdaParse, when compared to state-of-the-art parsers, improves throughput by $17\times$ while still achieving comparable accuracy (0.2 percent better) on a benchmark set of 1000 scientific documents.
arXiv Detail & Related papers (2025-04-23T18:38:41Z) - Inferring Input Grammars from Code with Symbolic Parsing [12.567395326774754]
Common test generation techniques rely on sample inputs, which are abstracted into matching grammars and/or evolved, guided by test coverage.
In this work, we present the first technique for symbolically and automatically generating input grammars from the code of recursive descent parsers.
The resulting grammars cover the entire input space, allowing for comprehensive and effective test generation, reverse engineering, and documentation.
arXiv Detail & Related papers (2025-03-11T14:40:56Z) - Compositional Program Generation for Few-Shot Systematic Generalization [59.57656559816271]
This study presents a neuro-symbolic architecture called the Compositional Program Generator (CPG).
CPG has three key features: modularity, composition, and abstraction, in the form of grammar rules.
It achieves perfect generalization on both the SCAN and COGS benchmarks using just 14 examples for SCAN and 22 examples for COGS.
arXiv Detail & Related papers (2023-09-28T14:33:20Z) - Few-Shot Adaptation for Parsing Contextual Utterances with LLMs [25.22099517947426]
In real-world settings, there typically exists only a limited number of contextual utterances due to annotation cost.
We examine four major paradigms for few-shot adaptation in conversational semantic parsing.
Experiments with in-context learning and fine-tuning suggest that Rewrite-then-Parse is the most promising paradigm.
arXiv Detail & Related papers (2023-09-18T21:35:19Z) - Evaluating the Impact of Source Code Parsers on ML4SE Models [3.699097874146491]
We evaluate two models, namely Supernorm2Seq and TreeLSTM, on the name prediction task.
We show that trees built by different parsers vary in their structure and content.
We then analyze how this diversity affects the models' quality.
arXiv Detail & Related papers (2022-06-17T12:10:04Z) - Penn-Helsinki Parsed Corpus of Early Modern English: First Parsing Results and Analysis [2.8749014299466444]
We present the first parsing results on the Penn-Helsinki Parsed Corpus of Early Modern English (PPCEME), a 1.9 million word treebank.
We describe key features of PPCEME that make it challenging for parsing, including a larger and more varied set of function tags than in the Penn Treebank.
arXiv Detail & Related papers (2021-12-15T23:56:21Z) - Zero-Shot Cross-lingual Semantic Parsing [56.95036511882921]
We study cross-lingual semantic parsing as a zero-shot problem without parallel data for 7 test languages.
We propose a multi-task encoder-decoder model to transfer parsing knowledge to additional languages using only English-logical form paired data.
Our system frames zero-shot parsing as a latent-space alignment problem and finds that pre-trained models can be improved to generate logical forms with minimal cross-lingual transfer penalty.
arXiv Detail & Related papers (2021-04-15T16:08:43Z) - Strongly Incremental Constituency Parsing with Graph Neural Networks [70.16880251349093]
Parsing sentences into syntax trees can benefit downstream applications in NLP.
Transition-based parsers build trees by executing actions in a state transition system.
Existing transition-based parsers are predominantly based on the shift-reduce transition system.
arXiv Detail & Related papers (2020-10-27T19:19:38Z) - A Simple Global Neural Discourse Parser [61.728994693410954]
We propose a simple chart-based neural discourse parser that does not require any manually-crafted features and is based on learned span representations only.
We empirically demonstrate that our model achieves the best performance among global parsers, and comparable performance to state-of-the-art greedy parsers.
arXiv Detail & Related papers (2020-09-02T19:28:40Z) - A Tale of a Probe and a Parser [74.14046092181947]
Measuring what linguistic information is encoded in neural models of language has become popular in NLP.
Researchers approach this enterprise by training "probes" - supervised models designed to extract linguistic structure from another model's output.
One such probe is the structural probe, designed to quantify the extent to which syntactic information is encoded in contextualised word representations.
arXiv Detail & Related papers (2020-05-04T16:57:31Z) - Towards Instance-Level Parser Selection for Cross-Lingual Transfer of Dependency Parsers [59.345145623931636]
We argue for a novel cross-lingual transfer paradigm: instance-level parser selection (ILPS).
We present a proof-of-concept study focused on instance-level parser selection in the framework of delexicalized transfer.
arXiv Detail & Related papers (2020-04-16T13:18:55Z) - Bootstrapping a Crosslingual Semantic Parser [74.99223099702157]
We adapt a semantic parser trained on a single language, such as English, to new languages and multiple domains with minimal annotation.
We ask whether machine translation is an adequate substitute for training data, and extend this to investigate bootstrapping using joint training with English, paraphrasing, and multilingual pre-trained models.
arXiv Detail & Related papers (2020-04-06T12:05:02Z)