On the Robustness of Reading Comprehension Models to Entity Renaming
- URL: http://arxiv.org/abs/2110.08555v1
- Date: Sat, 16 Oct 2021 11:46:32 GMT
- Title: On the Robustness of Reading Comprehension Models to Entity Renaming
- Authors: Jun Yan, Yang Xiao, Sagnik Mukherjee, Bill Yuchen Lin, Robin Jia,
Xiang Ren
- Abstract summary: We study the robustness of machine reading comprehension (MRC) models to entity renaming.
We propose a general and scalable method to replace person names with names from a variety of sources.
We find that MRC models consistently perform worse when entities are renamed.
- Score: 44.11484801074727
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We study the robustness of machine reading comprehension (MRC) models to
entity renaming -- do models make more wrong predictions when answer entities
have different names? Such failures would indicate that models are overly
reliant on entity knowledge to answer questions, and therefore may generalize
poorly when facts about the world change or questions are asked about novel
entities. To systematically audit model robustness, we propose a general and
scalable method to replace person names with names from a variety of sources,
ranging from common English names to names from other languages to arbitrary
strings. Across four datasets and three pretrained model architectures, MRC
models consistently perform worse when entities are renamed, with particularly
large accuracy drops on datasets constructed via distant supervision. We also
find large differences between models: SpanBERT, which is pretrained with
span-level masking, is more robust than RoBERTa, despite having similar
accuracy on unperturbed test data. Inspired by this, we experiment with
span-level and entity-level masking as a continual pretraining objective and
find that they can further improve the robustness of MRC models.
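As a rough illustration of the renaming perturbation described above, the sketch below consistently replaces a person name across a passage, question, and answer with a substitute drawn from one of several name sources (common English names, names from other languages, or arbitrary strings). The name lists, function name, and simple regex-based substitution are illustrative assumptions, not the authors' released pipeline.

```python
# Minimal sketch of an entity-renaming perturbation (assumed setup, not the
# paper's exact implementation): replace every whole-word occurrence of a
# person name in a passage/question/answer triple with a name sampled from
# a chosen source. The name lists below are placeholders.
import random
import re

NAME_SOURCES = {
    "english": ["Mary Parker", "James Holt"],
    "other_language": ["Bongani Dlamini", "Aiko Tanaka"],
    "arbitrary": ["Xq Vrtz", "Zzyr Qwop"],
}

def rename_person(passage, question, answer, old_name, source="english", seed=0):
    """Replace all whole-word occurrences of `old_name` with a sampled new name."""
    rng = random.Random(seed)
    new_name = rng.choice(NAME_SOURCES[source])
    pattern = re.compile(r"\b" + re.escape(old_name) + r"\b")
    return (
        pattern.sub(new_name, passage),
        pattern.sub(new_name, question),
        pattern.sub(new_name, answer),
    )

# Example: perturb a SQuAD-style instance whose answer is a person name.
p, q, a = rename_person(
    passage="Ada Lovelace wrote the first published algorithm.",
    question="Who wrote the first published algorithm?",
    answer="Ada Lovelace",
    old_name="Ada Lovelace",
    source="arbitrary",
)
print(p, q, a, sep="\n")
```

Renaming the entity consistently in both context and gold answer keeps the instance answerable, so any accuracy drop can be attributed to the model's reliance on the original entity name rather than to a corrupted example.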
Related papers
- Multicultural Name Recognition For Previously Unseen Names [65.268245109828]
This paper attempts to improve recognition of person names, a diverse category that can grow any time someone is born or changes their name.
I look at names from 103 countries to compare how well the model performs on names from different cultures.
I find that a model with combined character and word input outperforms word-only models and may improve accuracy compared to classical NER models.
arXiv Detail & Related papers (2024-01-23T17:58:38Z) - Rethinking Masked Language Modeling for Chinese Spelling Correction [70.85829000570203]
We study Chinese Spelling Correction (CSC) as a joint decision made by two separate models: a language model and an error model.
We find that fine-tuning BERT tends to over-fit the error model while under-fitting the language model, resulting in poor generalization to out-of-distribution error patterns.
We demonstrate that a very simple strategy, randomly masking 20% of the non-error tokens in the input sequence during fine-tuning, is sufficient for learning a much better language model without sacrificing the error model.
arXiv Detail & Related papers (2023-05-28T13:19:12Z) - Evaluating the Robustness of Machine Reading Comprehension Models to Low Resource Entity Renaming [3.117224133280308]
We explore the robustness of MRC models to entity renaming.
We rename entities of the types country, person, nationality, location, organization, and city.
We find that, compared to base models, large models perform comparatively well on novel entities.
arXiv Detail & Related papers (2023-04-06T15:29:57Z) - An Understanding-Oriented Robust Machine Reading Comprehension Model [12.870425062204035]
We propose an understanding-oriented machine reading comprehension model to address three kinds of robustness issues.
Specifically, we first use a natural language inference module to help the model accurately understand the semantic meaning of input questions.
We also propose a multi-language learning mechanism to address the issue of generalization.
arXiv Detail & Related papers (2022-07-01T03:32:02Z) - A Comparative Study of Transformer-Based Language Models on Extractive
Question Answering [0.5079811885340514]
We train various pre-trained language models and fine-tune them on multiple question answering datasets.
Using the F1-score as our metric, we find that the RoBERTa and BART pre-trained models perform the best across all datasets.
arXiv Detail & Related papers (2021-10-07T02:23:19Z) - Exploring Strategies for Generalizable Commonsense Reasoning with
Pre-trained Models [62.28551903638434]
We measure the impact of three different adaptation methods on the generalization and accuracy of models.
Experiments with two models show that fine-tuning performs best, by learning both the content and the structure of the task, but suffers from overfitting and limited generalization to novel answers.
We observe that alternative adaptation methods like prefix-tuning have comparable accuracy, but generalize better to unseen answers and are more robust to adversarial splits.
arXiv Detail & Related papers (2021-09-07T03:13:06Z) - A Realistic Study of Auto-regressive Language Models for Named Entity
Typing and Recognition [7.345578385749421]
We study pre-trained language models for named entity recognition in a meta-learning setup.
First, we test named entity typing (NET) in a zero-shot transfer scenario. Then, we perform NER by giving a few examples at inference.
We propose a method to select seen and rare/unseen names when having access only to the pre-trained model, and report results on these groups.
arXiv Detail & Related papers (2021-08-26T15:29:00Z) - Towards Trustworthy Deception Detection: Benchmarking Model Robustness
across Domains, Modalities, and Languages [10.131671217810581]
We evaluate model robustness to out-of-domain data, modality-specific features, and languages other than English.
We find that with additional image content as input, ELMo embeddings yield significantly fewer errors compared to BERT or GloVe.
arXiv Detail & Related papers (2021-04-23T18:05:52Z) - Learning Contextual Representations for Semantic Parsing with
Generation-Augmented Pre-Training [86.91380874390778]
We present Generation-Augmented Pre-training (GAP), which jointly learns representations of natural language utterances and table schemas by leveraging generation models to generate pre-training data.
Based on experimental results, neural semantic parsers that leverage GAP obtain new state-of-the-art results on both the SPIDER and CRITERIA-TO-SQL benchmarks.
arXiv Detail & Related papers (2020-12-18T15:53:50Z) - Zero-Resource Cross-Domain Named Entity Recognition [68.83177074227598]
Existing models for cross-domain named entity recognition rely on large amounts of unlabeled corpora or labeled NER training data in target domains.
We propose a cross-domain NER model that does not use any external resources.
arXiv Detail & Related papers (2020-02-14T09:04:18Z)