What's in a Name? Are BERT Named Entity Representations just as Good for any other Name?
- URL: http://arxiv.org/abs/2007.06897v1
- Date: Tue, 14 Jul 2020 08:14:00 GMT
- Title: What's in a Name? Are BERT Named Entity Representations just as Good for any other Name?
- Authors: Sriram Balasubramanian, Naman Jain, Gaurav Jindal, Abhijeet Awasthi,
Sunita Sarawagi
- Abstract summary: We evaluate named entity representations of BERT-based NLP models by investigating their robustness to replacements from the same typed class in the input.
We provide a simple method that ensembles predictions from multiple replacements while jointly modeling the uncertainty of type annotations and label predictions.
Experiments on three NLP tasks show that our method enhances robustness and increases accuracy on both natural and adversarial datasets.
- Score: 18.11382921200802
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We evaluate named entity representations of BERT-based NLP models by
investigating their robustness to replacements from the same typed class in the
input. We highlight that, while such perturbations are natural on several tasks,
state-of-the-art trained models are surprisingly brittle. The brittleness
continues even with the recent entity-aware BERT models. We also try to discern
the cause of this non-robustness, considering factors such as tokenization and
frequency of occurrence. Then we provide a simple method that ensembles
predictions from multiple replacements while jointly modeling the uncertainty
of type annotations and label predictions. Experiments on three NLP tasks show
that our method enhances robustness and increases accuracy on both natural and
adversarial datasets.
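A minimal sketch of the replacement-ensembling idea described above, assuming hypothetical predict_probs and type_confidence helpers; this is an illustration of the abstract, not the authors' released code.

```python
import numpy as np

def ensemble_over_replacements(sentence, entity_span, candidates,
                               predict_probs, type_confidence):
    """Average label predictions over same-typed entity substitutions,
    weighting each replacement by the confidence of its type annotation.

    predict_probs(text) -> np.ndarray of class probabilities (hypothetical)
    type_confidence(name) -> float in [0, 1] (hypothetical)
    """
    weighted, total = None, 0.0
    start, end = entity_span
    for name in candidates:
        perturbed = sentence[:start] + name + sentence[end:]
        w = type_confidence(name)          # uncertainty of the type annotation
        probs = predict_probs(perturbed)   # label prediction for this variant
        weighted = w * probs if weighted is None else weighted + w * probs
        total += w
    return weighted / total                # ensembled label distribution
```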
Related papers
- Multicultural Name Recognition For Previously Unseen Names [65.268245109828]
This paper attempts to improve recognition of person names, a diverse category that can grow any time someone is born or changes their name.
I look at names from 103 countries to compare how well the model performs on names from different cultures.
I find that a model with combined character and word input outperforms word-only models and may improve accuracy compared to classical NER models.
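As a rough illustration of the combined character-and-word input mentioned above (not the paper's actual architecture; all layer sizes are illustrative assumptions):

```python
import torch
import torch.nn as nn

class CharWordTagger(nn.Module):
    """Toy NER tagger combining word embeddings with a character-level CNN."""
    def __init__(self, word_vocab, char_vocab, num_tags,
                 word_dim=100, char_dim=30, char_filters=50):
        super().__init__()
        self.word_emb = nn.Embedding(word_vocab, word_dim)
        self.char_emb = nn.Embedding(char_vocab, char_dim)
        self.char_cnn = nn.Conv1d(char_dim, char_filters, kernel_size=3, padding=1)
        self.out = nn.Linear(word_dim + char_filters, num_tags)

    def forward(self, word_ids, char_ids):
        # word_ids: (batch, seq); char_ids: (batch, seq, max_chars)
        w = self.word_emb(word_ids)
        b, s, c = char_ids.shape
        ch = self.char_emb(char_ids.view(b * s, c)).transpose(1, 2)  # (b*s, char_dim, c)
        ch = torch.relu(self.char_cnn(ch)).max(dim=2).values         # (b*s, char_filters)
        ch = ch.view(b, s, -1)
        return self.out(torch.cat([w, ch], dim=-1))                  # (batch, seq, num_tags)
```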
arXiv Detail & Related papers (2024-01-23T17:58:38Z)
- Continual Named Entity Recognition without Catastrophic Forgetting [37.316700599440935]
We introduce a pooled feature distillation loss that skillfully navigates the trade-off between retaining knowledge of old entity types and acquiring new ones.
We develop a confidence-based pseudo-labeling for the non-entity type.
We suggest an adaptive re-weighting type-balanced learning strategy to handle the issue of biased type distribution.
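A minimal sketch of what a pooled feature distillation term could look like, based only on this summary rather than the paper's exact loss:

```python
import torch
import torch.nn.functional as F

def pooled_feature_distillation(new_feats, old_feats, mask):
    """Distill pooled sequence features from the frozen old model into the new one.

    new_feats, old_feats: (batch, seq, hidden) token features
    mask: (batch, seq) with 1 for real tokens, 0 for padding
    """
    mask = mask.unsqueeze(-1).float()
    denom = mask.sum(dim=1).clamp(min=1.0)
    new_pooled = (new_feats * mask).sum(dim=1) / denom   # mean-pool over tokens
    old_pooled = (old_feats * mask).sum(dim=1) / denom
    return F.mse_loss(new_pooled, old_pooled.detach())
```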
arXiv Detail & Related papers (2023-10-23T03:45:30Z)
- Towards preserving word order importance through Forced Invalidation [80.33036864442182]
We show that pre-trained language models are insensitive to word order.
We propose Forced Invalidation to help preserve the importance of word order.
Our experiments demonstrate that Forced Invalidation significantly improves the sensitivity of the models to word order.
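Reading only this summary, one plausible picture of Forced Invalidation is to add word-shuffled copies of training sentences that the model must reject as invalid; the generator below is an assumption about the mechanism, not the paper's procedure.

```python
import random

def make_shuffled_negatives(sentences, num_per_sentence=1, seed=0):
    """Create word-shuffled variants intended to be classified as invalid."""
    rng = random.Random(seed)
    negatives = []
    for text in sentences:
        words = text.split()
        for _ in range(num_per_sentence):
            shuffled = words[:]
            rng.shuffle(shuffled)
            negatives.append(" ".join(shuffled))
    return negatives

# Usage: augment training data with (negative, "invalid") pairs so the
# classifier can no longer ignore word order.
print(make_shuffled_negatives(["the cat sat on the mat"]))
```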
arXiv Detail & Related papers (2023-04-11T13:42:10Z)
- In and Out-of-Domain Text Adversarial Robustness via Label Smoothing [64.66809713499576]
We study the adversarial robustness provided by various label smoothing strategies in foundational models for diverse NLP tasks.
Our experiments show that label smoothing significantly improves adversarial robustness in pre-trained models like BERT, against various popular attacks.
We also analyze the relationship between prediction confidence and robustness, showing that label smoothing reduces over-confident errors on adversarial examples.
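Label smoothing itself is standard; a minimal example with PyTorch's built-in smoothed cross-entropy (the smoothing value 0.1 is illustrative):

```python
import torch
import torch.nn as nn

# Cross-entropy with label smoothing: the target class keeps 1 - eps of the
# probability mass and eps is spread over the remaining classes.
# (The label_smoothing argument is built into PyTorch >= 1.10.)
criterion = nn.CrossEntropyLoss(label_smoothing=0.1)

logits = torch.randn(8, 2, requires_grad=True)   # e.g. [CLS] classifier outputs
labels = torch.randint(0, 2, (8,))
loss = criterion(logits, labels)
loss.backward()
```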
arXiv Detail & Related papers (2022-12-20T14:06:50Z)
- The Topological BERT: Transforming Attention into Topology for Natural Language Processing [0.0]
This paper introduces a text classifier using topological data analysis.
We use BERT's attention maps transformed into attention graphs as the only input to that classifier.
The model can solve tasks such as distinguishing spam from ham messages, recognizing whether a sentence is grammatically correct, or evaluating a movie review as negative or positive.
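A rough sketch of turning an attention map into an attention graph (the thresholding rule here is an assumption; the paper's topological features would be computed downstream):

```python
import numpy as np
import networkx as nx

def attention_to_graph(attention, threshold=0.1):
    """Build an undirected graph whose nodes are tokens and whose edges
    connect token pairs with attention weight above a threshold.

    attention: (seq_len, seq_len) attention matrix from one head.
    """
    graph = nx.Graph()
    seq_len = attention.shape[0]
    graph.add_nodes_from(range(seq_len))
    for i in range(seq_len):
        for j in range(i + 1, seq_len):
            w = max(attention[i, j], attention[j, i])
            if w >= threshold:
                graph.add_edge(i, j, weight=float(w))
    return graph

# Example with a random "attention" matrix; real input would come from
# model(..., output_attentions=True) in Hugging Face Transformers.
g = attention_to_graph(np.random.dirichlet(np.ones(6), size=6))
print(g.number_of_edges())
```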
arXiv Detail & Related papers (2022-06-30T11:25:31Z)
- Active Learning by Feature Mixing [52.16150629234465]
We propose a novel method for batch active learning called ALFA-Mix.
We identify unlabelled instances with sufficiently-distinct features by seeking inconsistencies in predictions.
We show that inconsistencies in these predictions help discover features that the model is unable to recognise in the unlabelled instances.
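A compressed sketch of the feature-mixing test, simplified from this summary: the real ALFA-Mix optimizes the mixing coefficient, whereas a fixed alpha is assumed here.

```python
import torch

def is_inconsistent(unlabelled_feat, anchor_feats, classifier, alpha=0.2):
    """Flag an unlabelled instance whose prediction changes when its feature
    is interpolated toward labelled anchor features.

    unlabelled_feat: (hidden,) feature of one unlabelled instance
    anchor_feats: (num_classes, hidden) mean labelled feature per class
    classifier: callable mapping (n, hidden) features to (n, num_classes) logits
    """
    base_pred = classifier(unlabelled_feat.unsqueeze(0)).argmax(dim=1)
    mixed = (1 - alpha) * unlabelled_feat.unsqueeze(0) + alpha * anchor_feats
    mixed_preds = classifier(mixed).argmax(dim=1)
    return bool((mixed_preds != base_pred).any())   # candidate for labelling
```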
arXiv Detail & Related papers (2022-03-14T12:20:54Z)
- Breaking BERT: Understanding its Vulnerabilities for Named Entity Recognition through Adversarial Attack [10.871587311621974]
Both generic and domain-specific BERT models are widely used for natural language processing (NLP) tasks.
In this paper we investigate the vulnerability of BERT models to variation in input data for Named Entity Recognition (NER) through adversarial attack.
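The kind of probe described here can be approximated with a simple substitution test (a sketch in the spirit of the summary, not the paper's attack code), using a hypothetical tag_entity wrapper around an NER model:

```python
def tag_flip_rate(sentence, mention, substitutes, tag_entity):
    """Fraction of same-typed substitutes that change the predicted entity tag.

    tag_entity(text, mention) -> predicted tag (hypothetical NER wrapper)
    """
    original_tag = tag_entity(sentence, mention)
    flips = 0
    for name in substitutes:
        perturbed = sentence.replace(mention, name)
        if tag_entity(perturbed, name) != original_tag:
            flips += 1
    return flips / len(substitutes)
```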
arXiv Detail & Related papers (2021-09-23T11:47:27Z)
- Exploiting Sample Uncertainty for Domain Adaptive Person Re-Identification [137.9939571408506]
We estimate and exploit the credibility of the assigned pseudo-label of each sample to alleviate the influence of noisy labels.
Our uncertainty-guided optimization brings significant improvement and achieves the state-of-the-art performance on benchmark datasets.
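One schematic reading of "credibility of the assigned pseudo-label" (not necessarily the authors' formulation) is to down-weight each sample's loss by the agreement between a student model and a teacher copy:

```python
import torch
import torch.nn.functional as F

def credibility_weighted_loss(student_logits, teacher_logits, pseudo_labels):
    """Weight the pseudo-label loss of each sample by an estimated credibility.

    Credibility is exp(-KL) between student and teacher predictions, so
    samples on which the two models disagree contribute less to the loss.
    """
    log_p = F.log_softmax(student_logits, dim=1)
    q = F.softmax(teacher_logits, dim=1)
    kl = F.kl_div(log_p, q, reduction="none").sum(dim=1)   # per-sample KL
    credibility = torch.exp(-kl)                           # in (0, 1]
    ce = F.cross_entropy(student_logits, pseudo_labels, reduction="none")
    return (credibility.detach() * ce).mean()
```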
arXiv Detail & Related papers (2020-12-16T04:09:04Z)
- GiBERT: Introducing Linguistic Knowledge into BERT through a Lightweight Gated Injection Method [29.352569563032056]
We propose a novel method to explicitly inject linguistic knowledge in the form of word embeddings into a pre-trained BERT.
Our performance improvements on multiple semantic similarity datasets when injecting dependency-based and counter-fitted embeddings indicate that such information is beneficial and currently missing from the original model.
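A minimal sketch of a gated injection layer, with dimensions and gating form assumed from the summary rather than taken from the GiBERT release:

```python
import torch
import torch.nn as nn

class GatedInjection(nn.Module):
    """Add externally derived word embeddings to BERT hidden states via a gate."""
    def __init__(self, hidden_size=768, external_dim=300):
        super().__init__()
        self.project = nn.Linear(external_dim, hidden_size)
        self.gate = nn.Linear(2 * hidden_size, hidden_size)

    def forward(self, hidden_states, external_emb):
        # hidden_states: (batch, seq, hidden); external_emb: (batch, seq, external_dim)
        injected = self.project(external_emb)
        gate = torch.sigmoid(self.gate(torch.cat([hidden_states, injected], dim=-1)))
        return hidden_states + gate * injected   # gated residual injection
```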
arXiv Detail & Related papers (2020-10-23T17:00:26Z)
- Adv-BERT: BERT is not robust on misspellings! Generating nature adversarial samples on BERT [95.88293021131035]
It is unclear, however, how the models will perform in realistic scenarios where natural rather than malicious adversarial instances often exist.
This work systematically explores the robustness of BERT, the state-of-the-art Transformer-style model in NLP, in dealing with noisy data.
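The natural noise probed in this line of work can be mimicked with simple character-level misspellings; the generator below is a generic sketch, not the paper's procedure.

```python
import random

def misspell(word, rng=None):
    """Apply one random character edit (swap, drop, or duplicate) to a word."""
    rng = rng or random.Random(0)
    if len(word) < 3:
        return word
    i = rng.randrange(1, len(word) - 1)
    op = rng.choice(["swap", "drop", "dup"])
    if op == "swap":   # transpose two adjacent characters
        return word[:i] + word[i + 1] + word[i] + word[i + 2:]
    if op == "drop":   # delete one character
        return word[:i] + word[i + 1:]
    return word[:i] + word[i] + word[i:]   # duplicate one character

print(misspell("misspellings"))   # one noisy variant of the word
```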
arXiv Detail & Related papers (2020-02-27T22:07:11Z)