What Taggers Fail to Learn, Parsers Need the Most
- URL: http://arxiv.org/abs/2104.01083v1
- Date: Fri, 2 Apr 2021 15:04:56 GMT
- Title: What Taggers Fail to Learn, Parsers Need the Most
- Authors: Mark Anderson and Carlos Gómez-Rodríguez
- Abstract summary: We present an error analysis of neural UPOS taggers to evaluate why using gold standard tags has such a large positive contribution to parsing performance.
We evaluate what neural dependency parsers implicitly learn about word types and how this relates to the errors taggers make, to explain the minimal impact of using predicted tags.
- Score: 0.38073142980733
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present an error analysis of neural UPOS taggers to evaluate why using
gold standard tags has such a large positive contribution to parsing
performance while using predicted UPOS tags either harms performance or offers
a negligible improvement. We evaluate what neural dependency parsers implicitly
learn about word types and how this relates to the errors taggers make to
explain the minimal impact using predicted tags has on parsers. We also present
a short analysis on what contexts result in reductions in tagging performance.
We then mask UPOS tags based on errors made by taggers to tease away the
contribution of UPOS tags which taggers succeed and fail to classify correctly
and the impact of tagging errors.
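The masking experiment described in the abstract can be pictured as follows. This is a minimal illustrative sketch, not the authors' implementation: the function name, the `_` mask symbol, and the example sentences are all hypothetical, and only the idea of masking gold UPOS tags according to whether a tagger classified them correctly comes from the abstract.

```python
# Hypothetical sketch: mask gold UPOS tags according to tagger errors, to
# separate the contribution of tokens the tagger classifies correctly from
# those it fails on. Tag values and sentences below are illustrative.

MASK = "_"  # placeholder tag for masked positions

def mask_by_tagger_errors(gold_tags, pred_tags, keep_correct=True):
    """Return gold tags with some positions masked.

    keep_correct=True : keep gold tags only where the tagger was right
                        (masks away what taggers fail to learn).
    keep_correct=False: keep gold tags only where the tagger was wrong.
    """
    masked = []
    for g, p in zip(gold_tags, pred_tags):
        correct = (g == p)
        masked.append(g if correct == keep_correct else MASK)
    return masked

gold = ["DET", "NOUN", "VERB", "ADP", "NOUN"]
pred = ["DET", "VERB", "VERB", "ADP", "NOUN"]  # tagger errs on token 2

print(mask_by_tagger_errors(gold, pred, keep_correct=True))
# token 2 is masked; all correctly tagged tokens keep their gold tag
```

Feeding each masked variant to a parser would then tease apart how much of the gold-tag advantage comes from exactly the tags taggers get wrong.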
Related papers
- Understanding and Mitigating Classification Errors Through Interpretable Token Patterns [58.91023283103762]
Characterizing errors in easily interpretable terms gives insight into whether a classifier is prone to making systematic errors.
We propose to discover those patterns of tokens that distinguish correct and erroneous predictions.
We show that our method, Premise, performs well in practice.
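The intuition behind characterizing errors via token patterns can be sketched roughly as below. Premise itself uses a more principled pattern-mining formulation; this toy version, with a made-up function name and made-up example inputs, only compares per-token frequencies between erroneous and correct predictions.

```python
# Rough illustration only: flag tokens that appear disproportionately
# often in misclassified inputs compared to correctly classified ones.
# This is NOT the Premise algorithm, just the underlying intuition.
from collections import Counter

def error_associated_tokens(correct_inputs, error_inputs, min_ratio=2.0):
    """Return tokens at least min_ratio times more frequent (per input)
    in erroneous predictions than in correct ones."""
    c_counts = Counter(t for s in correct_inputs for t in s.split())
    e_counts = Counter(t for s in error_inputs for t in s.split())
    n_c, n_e = max(len(correct_inputs), 1), max(len(error_inputs), 1)
    flagged = []
    for tok, cnt in e_counts.items():
        rate_e = cnt / n_e
        rate_c = c_counts.get(tok, 0) / n_c
        if rate_c == 0 or rate_e / rate_c >= min_ratio:
            flagged.append(tok)
    return sorted(flagged)

correct = ["the cat sat", "the dog ran"]
errors = ["the cat won't sit", "dogs won't run"]
print(error_associated_tokens(correct, errors))
```

Tokens surfaced this way give a human-readable hypothesis about where a classifier makes systematic errors.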
arXiv Detail & Related papers (2023-11-18T00:24:26Z)
- Alternative Pseudo-Labeling for Semi-Supervised Automatic Speech Recognition [49.42732949233184]
When labeled data is insufficient, semi-supervised learning with the pseudo-labeling technique can significantly improve the performance of automatic speech recognition.
Taking noisy labels as ground-truth in the loss function results in suboptimal performance.
We propose a novel framework named alternative pseudo-labeling to tackle the issue of noisy pseudo-labels.
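For context, the generic pseudo-labeling baseline that suffers from noisy labels can be sketched as below. This shows the problem setting the paper improves upon, not the paper's own "alternative pseudo-labeling" framework; the function name, threshold, and example data are assumptions for illustration.

```python
# Minimal sketch of confidence-filtered pseudo-labeling: keep only
# pseudo-labels whose model confidence exceeds a threshold, so fewer
# noisy labels are treated as ground truth. Illustrative baseline only.

def filter_pseudo_labels(examples, threshold=0.8):
    """examples: list of (input_id, predicted_label, confidence) triples.
    Returns (input_id, label) pairs to add to the training set."""
    return [(x, y) for x, y, conf in examples if conf >= threshold]

unlabeled = [
    ("audio_001", "hello world", 0.95),  # confident -> kept
    ("audio_002", "hallo word", 0.42),   # likely noisy -> dropped
]
print(filter_pseudo_labels(unlabeled))
```

Even with filtering, the surviving labels can still be noisy, which is the gap the paper's framework targets.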
arXiv Detail & Related papers (2023-08-12T12:13:52Z)
- Another Dead End for Morphological Tags? Perturbed Inputs and Parsing [12.234169944475537]
We show that morphological tags can play a role in correcting word-only neural parsers that make mistakes.
We also show that if morphological tags were utopically robust against lexical perturbations, they would be able to correct mistakes.
arXiv Detail & Related papers (2023-05-24T13:11:04Z)
- Parsing linearizations appreciate PoS tags - but some are fussy about errors [12.024457689086008]
PoS tags, once taken for granted as a useful resource for syntactic parsing, have become more situational with the popularization of deep learning.
Recent work on the impact of PoS tags on graph- and transition-based parsers suggests that they are only useful when tagging accuracy is high, or in low-resource scenarios.
We undertake a study and uncover some trends. Among them, PoS tags are generally more useful for sequence-labeling parsers than for other paradigms, but the impact of their accuracy is highly encoding-dependent.
arXiv Detail & Related papers (2022-10-27T07:15:36Z)
- Influence Functions for Sequence Tagging Models [49.81774968547377]
We extend influence functions to trace predictions back to the training points that informed them.
We show the practical utility of segment influence by using the method to identify systematic annotation errors.
arXiv Detail & Related papers (2022-10-25T17:13:11Z)
- Rethink about the Word-level Quality Estimation for Machine Translation from Human Judgement [57.72846454929923]
We create a benchmark dataset, HJQE, where expert translators directly annotate poorly translated words.
We propose two tag-correcting strategies, namely a tag refinement strategy and a tree-based annotation strategy, to make the TER-based artificial QE corpus closer to HJQE.
The results show our proposed dataset is more consistent with human judgement and also confirm the effectiveness of the proposed tag correcting strategies.
arXiv Detail & Related papers (2022-09-13T02:37:12Z)
- On the Frailty of Universal POS Tags for Neural UD Parsers [3.7311680121118345]
We show that leveraging UPOS tags as features for neural parsers requires a prohibitively high tagging accuracy and that the use of gold tags offers a non-linear increase in performance.
We also investigate what aspects of predicted UPOS tags impact parsing accuracy the most, highlighting some potentially meaningful linguistic facets of the problem.
arXiv Detail & Related papers (2020-10-05T07:40:35Z)
- Reliable Part-of-Speech Tagging of Historical Corpora through Set-Valued Prediction [21.67895423776014]
We consider POS tagging within the framework of set-valued prediction.
We find that extending state-of-the-art POS taggers to set-valued prediction yields more precise and robust taggings.
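One simple way to picture set-valued prediction is below: instead of committing to the single top tag, the tagger returns the smallest set of tags whose cumulative probability reaches a confidence threshold. The paper's actual formulation may differ; the function name, threshold, and probabilities here are illustrative assumptions.

```python
# Illustrative sketch of set-valued POS prediction: return the smallest
# set of tags whose total probability mass reaches a threshold, so the
# tagger can abstain from a single hard choice on ambiguous tokens.

def set_valued_prediction(tag_probs, threshold=0.9):
    """tag_probs: dict mapping tag -> probability. Returns a set of tags."""
    total, chosen = 0.0, []
    for tag, p in sorted(tag_probs.items(), key=lambda kv: -kv[1]):
        chosen.append(tag)
        total += p
        if total >= threshold:
            break
    return set(chosen)

# Ambiguous token: two tags needed to reach 90% confidence.
print(set_valued_prediction({"NOUN": 0.55, "VERB": 0.40, "ADJ": 0.05}))
# Confident token: a singleton set, equivalent to ordinary tagging.
print(set_valued_prediction({"DET": 0.99, "PRON": 0.01}))
```

Returning `{"NOUN", "VERB"}` on ambiguous tokens rather than a wrong single tag is what makes the approach attractive for noisy historical text.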
arXiv Detail & Related papers (2020-08-04T07:21:36Z)
- On the Robustness of Language Encoders against Grammatical Errors [66.05648604987479]
We collect real grammatical errors from non-native speakers and conduct adversarial attacks to simulate these errors on clean text data.
Results confirm that the performance of all tested models is affected but the degree of impact varies.
arXiv Detail & Related papers (2020-05-12T11:01:44Z)
- Adversarial Transfer Learning for Punctuation Restoration [58.2201356693101]
Adversarial multi-task learning is introduced to learn task invariant knowledge for punctuation prediction.
Experiments are conducted on IWSLT2011 datasets.
arXiv Detail & Related papers (2020-04-01T06:19:56Z)
- Is POS Tagging Necessary or Even Helpful for Neural Dependency Parsing? [22.93722845643562]
We show that POS tagging can still significantly improve parsing performance when using the Stack joint framework.
Considering that it is much cheaper to annotate POS tags than parse trees, we also investigate the utilization of large-scale heterogeneous POS tag data.
arXiv Detail & Related papers (2020-03-06T13:47:30Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences arising from its use.