Related papers: Automated Testing and Improvement of Named Entity Recognition Systems

URL: http://arxiv.org/abs/2308.07937v1
Date: Mon, 14 Aug 2023 03:17:24 GMT
Title: Automated Testing and Improvement of Named Entity Recognition Systems
Authors: Boxi Yu, Yiyan Hu, Qiuyang Mang, Wenhan Hu, Pinjia He
Abstract summary: TIN is a novel, widely applicable approach for automatically testing and repairing NER systems. We use TIN to test two SOTA NER models and two commercial NER APIs, i.e., Azure NER and AWS NER. TIN achieves a high error reduction rate (26.8%-50.6%) over the four systems under test, successfully repairing 1,056 of the 1,877 reported NER errors.
Score: 3.8293110324859505
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Named entity recognition (NER) systems have seen rapid progress in recent years due to the development of deep neural networks. These systems are widely used in various natural language processing applications, such as information extraction, question answering, and sentiment analysis. However, the complexity and intractability of deep neural networks can make NER systems unreliable in certain circumstances, resulting in incorrect predictions. For example, NER systems may misidentify female names as chemicals or fail to recognize the names of minority groups, leading to user dissatisfaction. To tackle this problem, we introduce TIN, a novel, widely applicable approach for automatically testing and repairing various NER systems. The key idea for automated testing is that the NER predictions of the same named entities under similar contexts should be identical. The core idea for automated repairing is that similar named entities should have the same NER prediction under the same context. We use TIN to test two SOTA NER models and two commercial NER APIs, i.e., Azure NER and AWS NER. We manually verify 784 of the suspicious issues reported by TIN and find that 702 are genuine errors, leading to high precision (85.0%-93.4%) across four categories of NER errors: omission, over-labeling, incorrect category, and range error. For automated repairing, TIN achieves a high error reduction rate (26.8%-50.6%) over the four systems under test, successfully repairing 1,056 of the 1,877 reported NER errors.
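
Illustrative sketch: the testing idea stated in the abstract (the same named entity should receive the same prediction under similar contexts) can be pictured as a simple consistency check. The code below is a minimal sketch under stated assumptions, not TIN's actual implementation. The names toy_ner and find_inconsistencies, the example sentences, and the deliberately planted bug are all hypothetical and do not come from the paper; a real setup would call an actual NER model or API and would generate the similar sentences automatically.

```python
from typing import Callable, Dict, List, Optional, Tuple

# Maps each recognized entity string to its predicted label.
Prediction = Dict[str, str]
# (variant sentence, entity, label on original, label on variant or None if omitted)
Issue = Tuple[str, str, str, Optional[str]]


def toy_ner(sentence: str) -> Prediction:
    """Hypothetical stand-in for an NER system (assumption, not a real API).

    It contains a deliberate bug: "Alice" is labeled CHEMICAL whenever
    the word "acid" appears in the sentence.
    """
    labels: Prediction = {}
    if "Alice" in sentence:
        labels["Alice"] = "CHEMICAL" if "acid" in sentence else "PERSON"
    if "Paris" in sentence:
        labels["Paris"] = "LOCATION"
    return labels


def find_inconsistencies(
    ner: Callable[[str], Prediction],
    original: str,
    variants: List[str],
) -> List[Issue]:
    """Report entities whose label differs between the original sentence
    and a variant sentence that still contains the same entity."""
    base = ner(original)
    issues: List[Issue] = []
    for variant in variants:
        pred = ner(variant)
        for entity, label in base.items():
            # Only compare entities that still occur in the variant.
            if entity in variant and pred.get(entity) != label:
                issues.append((variant, entity, label, pred.get(entity)))
    return issues


original = "Alice moved to Paris last year."
variants = [
    "Alice moved to Paris last month.",
    "Alice studied acid rain in Paris last year.",
]
issues = find_inconsistencies(toy_ner, original, variants)
for variant, entity, expected, got in issues:
    print(f"suspicious: {entity!r} was {expected}, now {got} in: {variant}")
```

A flagged pair is only a suspicious issue, not a confirmed error, which is consistent with the abstract's report that the flagged issues were verified manually. The repair idea (similar entities should share a prediction under the same context) is not covered by this sketch.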

Related papers

Failing Forward: Improving Generative Error Correction for ASR with Synthetic Data and Retrieval Augmentation [73.9145653659403]
We show that Generative Error Correction models struggle to generalize beyond the specific types of errors encountered during training. We propose DARAG, a novel approach designed to improve GEC for ASR in in-domain (ID) and OOD scenarios. Our approach is simple, scalable, and both domain- and language-agnostic.
arXiv Detail & Related papers (2024-10-17T04:00:29Z)
Uncertainty Estimation on Sequential Labeling via Uncertainty Transmission [21.426225910784364]
NER tasks aim to extract entities and predict their labels given a text. This work focuses on UE-NER, which aims to estimate uncertainty scores for the NER predictions. We propose a Sequential Labeling Posterior Network (SLPN) to estimate uncertainty scores for the extracted entities.
arXiv Detail & Related papers (2023-11-15T06:36:29Z)
NERetrieve: Dataset for Next Generation Named Entity Recognition and Retrieval [49.827932299460514]
We argue that capabilities provided by large language models are not the end of NER research, but rather an exciting beginning. We present three variants of the NER task, together with a dataset to support them. We provide a large, silver-annotated corpus of 4 million paragraphs covering 500 entity types.
arXiv Detail & Related papers (2023-10-22T12:23:00Z)
PromptNER: Prompting For Named Entity Recognition [27.501500279749475]
We introduce PromptNER, a new state-of-the-art algorithm for few-shot and cross-domain NER. PromptNER achieves a 4% (absolute) improvement in F1 score on the CoNLL dataset, a 9% (absolute) improvement on the GENIA dataset, and a 4% (absolute) improvement on the FewNERD dataset.
arXiv Detail & Related papers (2023-05-24T07:38:24Z)
Neuroevolutionary algorithms driven by neuron coverage metrics for semi-supervised classification [60.60571130467197]
In some machine learning applications the availability of labeled instances for supervised classification is limited while unlabeled instances are abundant. We introduce neuroevolutionary approaches that exploit unlabeled instances by using neuron coverage metrics computed on the neural network architecture encoded by each candidate solution.
arXiv Detail & Related papers (2023-03-05T23:38:44Z)
Optimizing Bi-Encoder for Named Entity Recognition via Contrastive Learning [80.36076044023581]
We present an efficient bi-encoder framework for named entity recognition (NER). We frame NER as a metric learning problem that maximizes the similarity between the vector representations of an entity mention and its type. A major challenge to this bi-encoder formulation for NER lies in separating non-entity spans from entity mentions.
arXiv Detail & Related papers (2022-08-30T23:19:04Z)
DEXTER: Deep Encoding of External Knowledge for Named Entity Recognition in Virtual Assistants [10.500933545429202]
In intelligent voice assistants, where NER is an important component, input to NER may be noisy because of user or speech recognition error. We describe an NER system intended to address these problems. We show that this technique improves related tasks, such as semantic parsing, with an improvement of up to 5% in error rate.
arXiv Detail & Related papers (2021-08-15T00:14:47Z)
TELESTO: A Graph Neural Network Model for Anomaly Classification in Cloud Services [77.454688257702]
Machine learning (ML) and artificial intelligence (AI) are applied to IT system operation and maintenance. One direction aims at the recognition of recurring anomaly types to enable remediation automation. We propose a method that is invariant to dimensionality changes of given data.
arXiv Detail & Related papers (2021-02-25T14:24:49Z)
ASTRAL: Adversarial Trained LSTM-CNN for Named Entity Recognition [16.43239147870092]
We propose an Adversarial Trained LSTM-CNN (ASTRAL) system to improve the current NER method from both the model structure and the training process. Our system is evaluated on three benchmarks, CoNLL-03, OntoNotes 5.0, and WNUT-17, achieving state-of-the-art results.
arXiv Detail & Related papers (2020-09-02T13:15:25Z)
Bayesian x-vector: Bayesian Neural Network based x-vector System for Speaker Verification [71.45033077934723]
We incorporate Bayesian neural networks (BNNs) into the deep neural network (DNN) x-vector speaker verification system. With the weight uncertainty modeling provided by BNNs, we expect the system could generalize better on the evaluation data. Results show that the system could benefit from BNNs by a relative EER decrease of 2.66% and 2.32% respectively for short- and long-utterance in-domain evaluations.
arXiv Detail & Related papers (2020-04-08T14:35:12Z)

This list is automatically generated from the titles and abstracts of the papers on this site.