Regularizing Models via Pointwise Mutual Information for Named Entity
Recognition
- URL: http://arxiv.org/abs/2104.07249v1
- Date: Thu, 15 Apr 2021 05:47:27 GMT
- Title: Regularizing Models via Pointwise Mutual Information for Named Entity
Recognition
- Authors: Minbyul Jeong and Jaewoo Kang
- Abstract summary: We propose a debiasing method based on Pointwise Mutual Information (PMI) that enhances generalization ability while also improving in-domain performance.
Our approach debiases highly correlated words and labels in the benchmark datasets.
For long-named and complex-structured entities, our method can predict these entities by debiasing on conjunctions or special characters.
- Score: 17.767466724342064
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In Named Entity Recognition (NER), the performance of pre-trained language
models has been overestimated because they exploit dataset biases to solve current
benchmark datasets. However, these biases hinder generalizability, which is necessary
to address real-world situations such as weak name regularity and many unseen
mentions. To reduce reliance on dataset biases and make the models fully
exploit the data, we propose a debiasing method in which the bias-only model is
replaced with Pointwise Mutual Information (PMI), enhancing generalization
ability while also improving in-domain performance. Our approach debiases
highly correlated words and labels in the benchmark datasets, reflects
informative statistics via subword frequency, and alleviates the class imbalance
between positive and negative examples. For long-named and complex-structured
entities, our method can predict these entities by debiasing on
conjunctions or special characters. Extensive experiments in both general and
biomedical domains demonstrate the effectiveness and generalization
capabilities of PMI.
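The abstract does not spell out the statistic it relies on. The standard definition is PMI(w, y) = log [ p(w, y) / (p(w) p(y)) ], which is positive when a word w and a label y co-occur more often than independence would predict. The following is a minimal sketch of estimating it from token/label counts, not the authors' regularization method. The function name `pmi_table`, the toy corpus, and the label set are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def pmi_table(pairs):
    """Estimate PMI for every observed (token, label) pair.

    PMI(w, y) = log( p(w, y) / (p(w) * p(y)) ), with probabilities taken
    as relative frequencies over the supplied pairs. Unobserved pairs are
    simply absent from the result (their PMI would be -infinity).
    """
    pairs = list(pairs)
    total = len(pairs)
    joint = Counter(pairs)
    token_counts = Counter(token for token, _ in pairs)
    label_counts = Counter(label for _, label in pairs)
    return {
        (token, label): math.log(
            (count * total) / (token_counts[token] * label_counts[label])
        )
        for (token, label), count in joint.items()
    }


# Hypothetical toy corpus of (token, label) observations.
pairs = [
    ("Paris", "LOC"), ("Paris", "LOC"), ("Paris", "PER"),
    ("visited", "O"), ("visited", "O"),
    ("and", "O"), ("and", "LOC"),
    ("Hilton", "PER"),
]

table = pmi_table(pairs)
for (token, label), score in sorted(table.items()):
    print(f"{token:8s} {label:4s} {score:.3f}")
```

In this toy data, "Paris" has a higher PMI with LOC than with PER. A model that leans on a strong word/label correlation of this kind can mislabel the rarer reading, which is the sort of dataset bias the abstract describes.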
Related papers
- Fairness without Sensitive Attributes via Knowledge Sharing [13.141672574114597]
We propose a confidence-based hierarchical classifier structure called "Reckoner" for reliable fair model learning under the assumption of missing sensitive attributes.
Our experimental results show that Reckoner consistently outperforms state-of-the-art baselines on the COMPAS and New Adult datasets.
arXiv Detail & Related papers (2024-09-27T06:16:14Z)
- Learning Decomposable and Debiased Representations via Attribute-Centric Information Bottlenecks [21.813755593742858]
Biased attributes, spuriously correlated with target labels in a dataset, can problematically lead to neural networks that learn improper shortcuts for classifications.
We propose a novel debiasing framework, Debiasing Global Workspace, introducing attention-based information bottlenecks for learning compositional representations of attributes.
We conduct comprehensive evaluations on biased datasets, along with both quantitative and qualitative analyses, to showcase our approach's efficacy.
arXiv Detail & Related papers (2024-03-21T05:33:49Z)
- Causality and Independence Enhancement for Biased Node Classification [56.38828085943763]
We propose a novel Causality and Independence Enhancement (CIE) framework, applicable to various graph neural networks (GNNs).
Our approach estimates causal and spurious features at the node representation level and mitigates the influence of spurious correlations.
Our CIE approach not only significantly enhances the performance of GNNs but also outperforms state-of-the-art debiased node classification methods.
arXiv Detail & Related papers (2023-10-14T13:56:24Z)
- CausalAPM: Generalizable Literal Disentanglement for NLU Debiasing [47.129713744669075]
We analyze the causes of dataset bias from the perspective of causal inference.
We propose CausalAPM, a generalizable literal disentangling framework to ameliorate the bias problem from feature granularity.
arXiv Detail & Related papers (2023-05-04T14:22:26Z)
- Rethinking Data Heterogeneity in Federated Learning: Introducing a New Notion and Standard Benchmarks [65.34113135080105]
We show that data heterogeneity in current setups is not necessarily a problem and can in fact be beneficial for the FL participants.
Our observations are intuitive.
Our code is available at https://github.com/MMorafah/FL-SC-NIID.
arXiv Detail & Related papers (2022-09-30T17:15:19Z)
- D-BIAS: A Causality-Based Human-in-the-Loop System for Tackling Algorithmic Bias [57.87117733071416]
We propose D-BIAS, a visual interactive tool that embodies a human-in-the-loop AI approach for auditing and mitigating social biases.
A user can detect the presence of bias against a group by identifying unfair causal relationships in the causal network.
For each interaction, say weakening/deleting a biased causal edge, the system uses a novel method to simulate a new (debiased) dataset.
arXiv Detail & Related papers (2022-08-10T03:41:48Z)
- General Greedy De-bias Learning [163.65789778416172]
We propose a General Greedy De-bias learning framework (GGD), which greedily trains the biased models and the base model like gradient descent in functional space.
GGD can learn a more robust base model under the settings of both task-specific biased models with prior knowledge and self-ensemble biased model without prior knowledge.
arXiv Detail & Related papers (2021-12-20T14:47:32Z)
- Learning Bias-Invariant Representation by Cross-Sample Mutual Information Minimization [77.8735802150511]
We propose a cross-sample adversarial debiasing (CSAD) method to remove the bias information misused by the target task.
The correlation measurement plays a critical role in adversarial debiasing and is conducted by a cross-sample neural mutual information estimator.
We conduct thorough experiments on publicly available datasets to validate the advantages of the proposed method over state-of-the-art approaches.
arXiv Detail & Related papers (2021-08-11T21:17:02Z)
- Improving QA Generalization by Concurrent Modeling of Multiple Biases [61.597362592536896]
Existing NLP datasets contain various biases that models can easily exploit to achieve high performance on the corresponding evaluation sets.
We propose a general framework for improving the performance on both in-domain and out-of-domain datasets by concurrent modeling of multiple biases in the training data.
We extensively evaluate our framework on extractive question answering with training data from various domains with multiple biases of different strengths.
arXiv Detail & Related papers (2020-10-07T11:18:49Z)
- Learning Unbiased Representations via Mutual Information Backpropagation [36.383338079229695]
In particular, we face the case where some attributes (bias) of the data, if learned by the model, can severely compromise its generalization properties.
We propose a novel end-to-end optimization strategy, which simultaneously estimates and minimizes the mutual information between the learned representation and the data attributes.
arXiv Detail & Related papers (2020-03-13T18:06:31Z)
This list is automatically generated from the titles and abstracts of the papers in this site.