Context-aware Adversarial Training for Name Regularity Bias in Named
Entity Recognition
- URL: http://arxiv.org/abs/2107.11610v1
- Date: Sat, 24 Jul 2021 13:55:35 GMT
- Title: Context-aware Adversarial Training for Name Regularity Bias in Named
Entity Recognition
- Authors: Abbas Ghaddar, Philippe Langlais, Ahmad Rashid, Mehdi Rezagholizadeh
- Abstract summary: We introduce NRB, a new testbed designed to diagnose Name Regularity Bias of NER models.
Our results indicate that all state-of-the-art models we tested show such a bias.
We propose a novel model-agnostic training method that adds learnable adversarial noise to some entity mentions.
- Score: 8.344476599818826
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In this work, we examine the ability of NER models to use contextual
information when predicting the type of an ambiguous entity. We introduce NRB,
a new testbed carefully designed to diagnose Name Regularity Bias of NER
models. Our results indicate that all state-of-the-art models we tested show
such a bias; BERT fine-tuned models significantly outperform feature-based
(LSTM-CRF) ones on NRB, despite having comparable (sometimes lower) performance
on standard benchmarks.
To mitigate this bias, we propose a novel model-agnostic training method that
adds learnable adversarial noise to some entity mentions, thus forcing models
to focus more strongly on the contextual signal, leading to significant gains
on NRB. Combining it with two other training strategies, data augmentation and
parameter freezing, leads to further gains.
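The idea of perturbing only the entity mentions, so that the context carries the reliable signal, can be illustrated with a minimal sketch. This is not the authors' implementation: the single shared noise vector, the L2-ball projection, the step size, the hand-supplied gradient, and the names `perturb_mentions` and `ascend_noise` are all assumptions made for illustration. In a real setup the gradient would come from the NER model's training loss.

```python
import math

def perturb_mentions(embeddings, tags, noise):
    """Add `noise` to every entity-token embedding; leave context tokens unchanged."""
    perturbed = []
    for vec, tag in zip(embeddings, tags):
        if tag != "O":  # token belongs to an entity mention (BIO scheme)
            perturbed.append([v + n for v, n in zip(vec, noise)])
        else:  # context token: copy as-is
            perturbed.append(list(vec))
    return perturbed

def ascend_noise(noise, grad, step_size, max_norm):
    """One gradient-ascent step on the noise, then projection into an L2 ball."""
    updated = [n + step_size * g for n, g in zip(noise, grad)]
    norm = math.sqrt(sum(u * u for u in updated))
    if norm > max_norm:  # rescale so the perturbation stays bounded
        updated = [u * max_norm / norm for u in updated]
    return updated

# Hypothetical toy sentence with 2-dimensional embeddings.
tokens = ["Washington", "signed", "the", "bill"]
tags = ["B-PER", "O", "O", "O"]
embeddings = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5], [0.2, 0.1]]

# Assumed gradient of the loss with respect to the noise (made up here).
noise = ascend_noise([0.0, 0.0], grad=[3.0, 4.0], step_size=1.0, max_norm=0.5)
noisy = perturb_mentions(embeddings, tags, noise)
```

Only the embedding of the mention token changes, while the context embeddings pass through untouched. A model trained on such inputs cannot rely on the name alone to predict the entity type.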
Related papers
- Crowdsourcing with Difficulty: A Bayesian Rating Model for Heterogeneous Items [0.716879432974126]
In applied statistics and machine learning, the "gold standards" used for training are often biased and almost always noisy.
Dawid and Skene's justifiably popular crowdsourcing model adjusts for rater (coder, annotator) sensitivity and specificity, but fails to capture distributional properties of rating data gathered for training.
We introduce a general purpose measurement-error model with which we can infer consensus categories by adding item-level effects for difficulty, discriminativeness, and guessability.
arXiv Detail & Related papers (2024-05-29T20:59:28Z)
- Utilizing Adversarial Examples for Bias Mitigation and Accuracy Enhancement [3.0820287240219795]
We propose a novel approach to mitigate biases in computer vision models by utilizing counterfactual generation and fine-tuning.
Our approach leverages a curriculum learning framework combined with a fine-grained adversarial loss to fine-tune the model using adversarial examples.
We validate our approach through both qualitative and quantitative assessments, demonstrating improved bias mitigation and accuracy compared to existing methods.
arXiv Detail & Related papers (2024-04-18T00:41:32Z)
- Delving into Identify-Emphasize Paradigm for Combating Unknown Bias [52.76758938921129]
We propose an effective bias-conflicting scoring method (ECS) to boost the identification accuracy.
We also propose gradient alignment (GA) to balance the contributions of the mined bias-aligned and bias-conflicting samples.
Experiments are conducted on multiple datasets in various settings, demonstrating that the proposed solution can mitigate the impact of unknown biases.
arXiv Detail & Related papers (2023-02-22T14:50:24Z)
- Feature-Level Debiased Natural Language Understanding [86.8751772146264]
Existing natural language understanding (NLU) models often rely on dataset biases to achieve high performance on specific datasets.
We propose debiasing contrastive learning (DCT) to mitigate biased latent features and to account for the dynamic nature of bias, which existing methods neglect.
DCT outperforms state-of-the-art baselines on out-of-distribution datasets while maintaining in-distribution performance.
arXiv Detail & Related papers (2022-12-11T06:16:14Z)
- WSLRec: Weakly Supervised Learning for Neural Sequential Recommendation Models [24.455665093145818]
We propose a novel model-agnostic training approach called WSLRec, which adopts a three-stage framework: pre-training, top-$k$ mining, and fine-tuning.
WSLRec resolves the incompleteness problem by pre-training models on extra weak supervisions from model-free methods like BR and ItemCF, while resolving the inaccuracy problem by leveraging the top-$k$ mining to screen out reliable user-item relevance from weak supervisions for fine-tuning.
arXiv Detail & Related papers (2022-02-28T08:55:12Z)
- General Greedy De-bias Learning [163.65789778416172]
We propose a General Greedy De-bias learning framework (GGD), which greedily trains the biased models and the base model like gradient descent in functional space.
GGD can learn a more robust base model under the settings of both task-specific biased models with prior knowledge and self-ensemble biased model without prior knowledge.
arXiv Detail & Related papers (2021-12-20T14:47:32Z)
- Distantly-Supervised Named Entity Recognition with Noise-Robust Learning and Language Model Augmented Self-Training [66.80558875393565]
We study the problem of training named entity recognition (NER) models using only distantly-labeled data.
We propose a noise-robust learning scheme comprised of a new loss function and a noisy label removal step.
Our method achieves superior performance, outperforming existing distantly-supervised NER models by significant margins.
arXiv Detail & Related papers (2021-09-10T17:19:56Z)
- Explaining and Improving Model Behavior with k Nearest Neighbor Representations [107.24850861390196]
We propose using k nearest neighbor representations to identify training examples responsible for a model's predictions.
We show that kNN representations are effective at uncovering learned spurious associations.
Our results indicate that the kNN approach makes the finetuned model more robust to adversarial inputs.
arXiv Detail & Related papers (2020-10-18T16:55:25Z)
- Rethinking Generalization of Neural Models: A Named Entity Recognition Case Study [81.11161697133095]
We take the NER task as a testbed to analyze the generalization behavior of existing models from different perspectives.
Experiments with in-depth analyses diagnose the bottleneck of existing neural NER models.
As a by-product of this paper, we have open-sourced a project that involves a comprehensive summary of recent NER papers.
arXiv Detail & Related papers (2020-01-12T04:33:53Z)