On the Frailty of Universal POS Tags for Neural UD Parsers
- URL: http://arxiv.org/abs/2010.01830v3
- Date: Wed, 14 Oct 2020 05:47:29 GMT
- Title: On the Frailty of Universal POS Tags for Neural UD Parsers
- Authors: Mark Anderson and Carlos Gómez-Rodríguez
- Abstract summary: We show that leveraging UPOS tags as features for neural parsers requires a prohibitively high tagging accuracy and that the use of gold tags offers a non-linear increase in performance.
We also investigate what aspects of predicted UPOS tags impact parsing accuracy the most, highlighting some potentially meaningful linguistic facets of the problem.
- Score: 3.7311680121118345
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present an analysis on the effect UPOS accuracy has on parsing
performance. Results suggest that leveraging UPOS tags as features for neural
parsers requires a prohibitively high tagging accuracy and that the use of gold
tags offers a non-linear increase in performance, suggesting some sort of
exceptionality. We also investigate what aspects of predicted UPOS tags impact
parsing accuracy the most, highlighting some potentially meaningful linguistic
facets of the problem.
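As a concrete reference point for what "leveraging UPOS tags as features" usually means in a neural parser, here is a minimal sketch (not the paper's implementation): a UPOS embedding concatenated to the word embedding before a BiLSTM encoder, whose states would then feed a biaffine scorer in a full parser. All vocabulary sizes and dimensions are illustrative assumptions; PyTorch is used for brevity.

```python
# Minimal sketch of the common recipe for using UPOS tags as parser features:
# concatenate a tag embedding with the word embedding before the encoder.
# Sizes are illustrative; this is not the paper's actual architecture.
import torch
import torch.nn as nn

class UPOSFeatureEncoder(nn.Module):
    def __init__(self, n_words=10000, n_upos=18, word_dim=100, upos_dim=32, hidden=200):
        super().__init__()
        self.word_emb = nn.Embedding(n_words, word_dim)
        self.upos_emb = nn.Embedding(n_upos, upos_dim)  # 17 UPOS tags + padding
        self.encoder = nn.LSTM(word_dim + upos_dim, hidden,
                               batch_first=True, bidirectional=True)

    def forward(self, word_ids, upos_ids):
        # upos_ids can hold gold or predicted tags; swapping one for the other
        # is exactly the comparison whose effect on parsing the paper measures.
        x = torch.cat([self.word_emb(word_ids), self.upos_emb(upos_ids)], dim=-1)
        states, _ = self.encoder(x)
        return states  # would feed a biaffine arc/label scorer in a full parser

# Usage: a batch of 2 sentences of length 5 with word and UPOS indices.
words = torch.randint(0, 10000, (2, 5))
upos = torch.randint(0, 18, (2, 5))
print(UPOSFeatureEncoder()(words, upos).shape)  # torch.Size([2, 5, 400])
```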
Related papers
- J2N -- Nominal Adjective Identification and its Application [1.2694721486451528]
This paper explores the challenges posed by nominal adjectives (NAs) in natural language processing (NLP) tasks.
We propose treating NAs as a distinct POS tag, "JN," and investigate its impact on POS tagging, BIO chunking, and coreference resolution.
arXiv Detail & Related papers (2024-09-22T09:33:54Z) - Predicting generalization performance with correctness discriminators [64.00420578048855]
We present a novel model that establishes upper and lower bounds on the accuracy, without requiring gold labels for the unseen data.
We show across a variety of tagging, parsing, and semantic parsing tasks that the gold accuracy is reliably between the predicted upper and lower bounds.
arXiv Detail & Related papers (2023-11-15T22:43:42Z) - On the Importance of Signer Overlap for Sign Language Detection [65.26091369630547]
We argue that the current benchmark data sets for sign language detection estimate overly positive results that do not generalize well.
We quantify this with a detailed analysis of the effect of signer overlap on current sign detection benchmark data sets.
We propose new data set partitions that are free of overlap and allow for more realistic performance assessment.
arXiv Detail & Related papers (2023-03-19T22:15:05Z) - Influence Functions for Sequence Tagging Models [49.81774968547377]
We extend influence functions to trace predictions back to the training points that informed them.
We show the practical utility of segment influence by using the method to identify systematic annotation errors.
arXiv Detail & Related papers (2022-10-25T17:13:11Z) - Meta Learning Low Rank Covariance Factors for Energy-Based Deterministic
Uncertainty [58.144520501201995]
Bi-Lipschitz regularization of neural network layers preserves relative distances between data instances in the feature spaces of each layer.
With the use of an attentive set encoder, we propose to meta-learn either diagonal or diagonal plus low-rank factors to efficiently construct task-specific covariance matrices.
We also propose an inference procedure which utilizes scaled energy to achieve a final predictive distribution.
arXiv Detail & Related papers (2021-10-12T22:04:19Z) - What Taggers Fail to Learn, Parsers Need the Most [0.38073142980733]
We present an error analysis of neural UPOS taggers to evaluate why using gold standard tags has such a large positive contribution to parsing performance.
We evaluate what neural dependency parsers implicitly learn about word types and how this relates to the errors taggers make, explaining the minimal impact of using predicted tags.
arXiv Detail & Related papers (2021-04-02T15:04:56Z) - Towards More Fine-grained and Reliable NLP Performance Prediction [85.78131503006193]
We make two contributions to improving performance prediction for NLP tasks.
First, we examine performance predictors for holistic measures of accuracy like F1 or BLEU.
Second, we propose methods to understand the reliability of a performance prediction model from two angles: confidence intervals and calibration.
arXiv Detail & Related papers (2021-02-10T15:23:20Z) - Reliable Part-of-Speech Tagging of Historical Corpora through Set-Valued Prediction [21.67895423776014]
We consider POS tagging within the framework of set-valued prediction, where the tagger outputs a set of plausible tags rather than a single one (a generic sketch of this idea follows the list below).
We find that extending state-of-the-art POS taggers to set-valued prediction yields more precise and robust taggings.
arXiv Detail & Related papers (2020-08-04T07:21:36Z) - Adversarial Transfer Learning for Punctuation Restoration [58.2201356693101]
Adversarial multi-task learning is introduced to learn task invariant knowledge for punctuation prediction.
Experiments are conducted on IWSLT2011 datasets.
arXiv Detail & Related papers (2020-04-01T06:19:56Z) - Is POS Tagging Necessary or Even Helpful for Neural Dependency Parsing? [22.93722845643562]
We show that POS tagging can still significantly improve parsing performance when using the Stack joint framework.
Considering that it is much cheaper to annotate POS tags than parse trees, we also investigate the utilization of large-scale heterogeneous POS tag data.
arXiv Detail & Related papers (2020-03-06T13:47:30Z) - Machine Learning Approaches for Amharic Parts-of-speech Tagging [0.0]
Current POS taggers for Amharic do not perform as well as contemporary taggers available for English and other European languages.
The aim of this work is to improve POS tagging performance for Amharic, where accuracy has never exceeded 91%.
arXiv Detail & Related papers (2020-01-10T06:40:49Z)
This list is automatically generated from the titles and abstracts of the papers in this site.