TURNER: The Uncertainty-based Retrieval Framework for Chinese NER
- URL: http://arxiv.org/abs/2202.09022v1
- Date: Fri, 18 Feb 2022 05:05:22 GMT
- Title: TURNER: The Uncertainty-based Retrieval Framework for Chinese NER
- Authors: Zhichao Geng, Hang Yan, Zhangyue Yin, Chenxin An, Xipeng Qiu
- Abstract summary: We propose TURNER: The Uncertainty-based Retrieval framework for Chinese NER.
The idea behind TURNER is to imitate human behavior: we frequently retrieve auxiliary knowledge as assistance when encountering an unknown or uncertain entity.
Experiments on four benchmark datasets demonstrate TURNER's effectiveness.
- Score: 46.063487367225754
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Chinese NER is a difficult undertaking due to the ambiguity of Chinese
characters and the absence of word boundaries. Previous work on Chinese NER
focuses on lexicon-based methods to introduce boundary information and reduce
out-of-vocabulary (OOV) cases during prediction. However, it is expensive to
obtain and dynamically maintain high-quality lexicons in specific domains,
which motivates us to utilize more general knowledge resources, e.g., search
engines. In this paper, we propose TURNER: The Uncertainty-based Retrieval
framework for Chinese NER. The idea behind TURNER is to imitate human behavior:
we frequently retrieve auxiliary knowledge as assistance when encountering an
unknown or uncertain entity. To improve the efficiency and effectiveness of
retrieval, we first propose two types of uncertainty sampling methods for
selecting the most ambiguous entity-level uncertain components of the input
text. Then, the Knowledge Fusion Model re-predicts the uncertain samples by
combining retrieved knowledge. Experiments on four benchmark datasets
demonstrate TURNER's effectiveness. TURNER outperforms existing lexicon-based
approaches and achieves a new state of the art (SOTA).
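Since the abstract only sketches the pipeline, the following is a minimal Python illustration of the retrieve-on-uncertainty idea: score each predicted entity span by token-level entropy and send only the uncertain ones to retrieval and re-prediction. The base model, retrieval wrapper, re-prediction model, and threshold (`first_pass`, `retrieve`, `repredict`, `threshold`) are hypothetical placeholders, not TURNER's actual components.

```python
import math
from typing import Callable

def token_entropy(probs: list[float]) -> float:
    """Shannon entropy of one token's label distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def span_uncertainty(span_probs: list[list[float]]) -> float:
    """Entity-level uncertainty: mean entropy over the span's tokens."""
    return sum(token_entropy(p) for p in span_probs) / len(span_probs)

def retrieve_then_repredict(
    text: str,
    first_pass: Callable[[str], list[dict]],     # base NER model (hypothetical)
    retrieve: Callable[[str], str],               # e.g. a search-engine wrapper (hypothetical)
    repredict: Callable[[str, str, tuple], str],  # knowledge-fusion re-prediction (hypothetical)
    threshold: float = 0.8,                       # assumed uncertainty cutoff
) -> list[dict]:
    # Each span: {"span": (start, end), "label": str, "probs": [[p_label, ...], ...]}
    spans = first_pass(text)
    for s in spans:
        if span_uncertainty(s["probs"]) > threshold:
            # Only uncertain spans trigger retrieval, keeping the search budget small.
            knowledge = retrieve(text[s["span"][0]:s["span"][1]])
            s["label"] = repredict(text, knowledge, s["span"])
    return spans
```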
Related papers
- FecTek: Enhancing Term Weight in Lexicon-Based Retrieval with Feature Context and Term-level Knowledge [54.61068946420894] (2024-04-18)
FecTek introduces FEature Context and TErm-level Knowledge modules.
To effectively enrich the feature context representations of term weight, the Feature Context Module (FCM) is introduced.
We also develop a term-level knowledge guidance module (TKGM) for effectively utilizing term-level knowledge to intelligently guide the modeling process of term weight.
- SCANNER: Knowledge-Enhanced Approach for Robust Multi-modal Named Entity Recognition of Unseen Entities [10.193908215351497] (2024-04-02)
We propose SCANNER, a model capable of effectively handling all three NER variants.
SCANNER has a two-stage structure: entity candidates are extracted in the first stage and then used as queries to retrieve knowledge.
To tackle the challenges arising from noisy annotations in NER datasets, we introduce a novel self-distillation method.
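The summary does not spell out SCANNER's self-distillation method, so the snippet below shows only a generic self-distillation objective in PyTorch: mix the (possibly noisy) gold cross-entropy with a KL term toward a teacher's soft predictions. The mixing weight `alpha` and the choice of teacher are assumptions, not the paper's formulation.

```python
import torch
import torch.nn.functional as F

def self_distillation_loss(
    student_logits: torch.Tensor,  # (batch, num_labels)
    teacher_logits: torch.Tensor,  # (batch, num_labels), e.g. from an earlier checkpoint
    gold_labels: torch.Tensor,     # (batch,), possibly noisy annotations
    alpha: float = 0.5,            # assumed weight between soft and hard targets
) -> torch.Tensor:
    """Generic self-distillation: combine noisy-label cross-entropy with a
    KL term that pulls the student toward the teacher's softer predictions."""
    hard = F.cross_entropy(student_logits, gold_labels)
    soft = F.kl_div(
        F.log_softmax(student_logits, dim=-1),
        F.softmax(teacher_logits.detach(), dim=-1),
        reduction="batchmean",
    )
    return alpha * soft + (1 - alpha) * hard
```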
- On Significance of Subword tokenization for Low Resource and Efficient Named Entity Recognition: A case study in Marathi [1.6383036433216434] (2023-12-03)
We focus on NER for low-resource languages and present our case study in the context of the Indian language Marathi.
We propose a hybrid approach for efficient NER by integrating a BERT-based subword tokenizer into vanilla CNN/LSTM models.
We show that this simple approach of replacing a traditional word-based tokenizer with a BERT-tokenizer brings the accuracy of vanilla single-layer models closer to that of deep pre-trained models like BERT.
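A minimal sketch of the hybrid approach described above: feed BERT subword ids into a vanilla single-layer BiLSTM tagger. The multilingual checkpoint, embedding and hidden sizes, and label count are illustrative assumptions, not the paper's exact configuration.

```python
import torch
import torch.nn as nn
from transformers import AutoTokenizer

# Illustrative multilingual checkpoint; any BERT-style tokenizer covering Marathi would do.
tokenizer = AutoTokenizer.from_pretrained("bert-base-multilingual-cased")

class SubwordLSTMTagger(nn.Module):
    """Vanilla single-layer BiLSTM tagger fed with BERT subword ids
    instead of word-level tokens (a sketch of the hybrid approach)."""
    def __init__(self, vocab_size: int, num_labels: int, emb_dim: int = 128, hidden: int = 256):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb_dim, padding_idx=0)
        self.lstm = nn.LSTM(emb_dim, hidden, batch_first=True, bidirectional=True)
        self.out = nn.Linear(2 * hidden, num_labels)

    def forward(self, input_ids: torch.Tensor) -> torch.Tensor:
        h, _ = self.lstm(self.emb(input_ids))
        return self.out(h)  # per-subword label logits

enc = tokenizer("नमूना वाक्य", return_tensors="pt")  # subword tokenization replaces word splitting
model = SubwordLSTMTagger(tokenizer.vocab_size, num_labels=9)
logits = model(enc["input_ids"])
```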
- Empirical Study of Zero-Shot NER with ChatGPT [19.534329209433626] (2023-10-16)
Large language models (LLMs) have exhibited powerful capabilities in various natural language processing tasks.
This work focuses on exploring LLM performance on zero-shot information extraction.
Inspired by the remarkable capability of LLMs at symbolic and arithmetic reasoning, we adapt the prevalent reasoning methods to NER.
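As a rough illustration of adapting reasoning-style prompting to zero-shot NER, the sketch below builds a prompt that asks the model to reason before listing entities, then parses the reply. The label set, prompt wording, and output format are assumptions; the paper's actual templates are not reproduced here.

```python
from typing import Callable

ENTITY_TYPES = ["PERSON", "ORGANIZATION", "LOCATION"]  # illustrative label set

def zero_shot_ner_prompt(sentence: str) -> str:
    """Reasoning-style zero-shot prompt: think about candidate spans first,
    then emit one entity per line (a generic adaptation, not the paper's template)."""
    types = ", ".join(ENTITY_TYPES)
    return (
        f"Extract named entities of types [{types}] from the sentence.\n"
        f"Sentence: {sentence}\n"
        "First, reason step by step about each candidate span and its type.\n"
        "Then output one line per entity in the form: entity <TAB> type."
    )

def zero_shot_ner(sentence: str, llm: Callable[[str], str]) -> list[tuple[str, str]]:
    reply = llm(zero_shot_ner_prompt(sentence))  # `llm` wraps any chat-completion API
    pairs = []
    for line in reply.splitlines():
        if "\t" in line:
            entity, etype = line.split("\t", 1)
            pairs.append((entity.strip(), etype.strip()))
    return pairs
```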
- IXA/Cogcomp at SemEval-2023 Task 2: Context-enriched Multilingual Named Entity Recognition using Knowledge Bases [53.054598423181844] (2023-04-20)
We present a novel NER cascade approach comprising three steps.
We empirically demonstrate the significance of external knowledge bases in accurately classifying fine-grained and emerging entities.
Our system exhibits robust performance in the MultiCoNER2 shared task, even in the low-resource language setting.
- Competency-Aware Neural Machine Translation: Can Machine Translation Know its Own Translation Quality? [61.866103154161884] (2022-11-25)
Neural machine translation (NMT) is often criticized for failures of which it is unaware.
We propose a novel competency-aware NMT by extending conventional NMT with a self-estimator.
We show that the proposed method delivers outstanding performance on quality estimation.
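The summary only states that a self-estimator is added to conventional NMT; one plausible reading is a small head that pools decoder states and regresses a scalar competency score, as sketched below. The shapes and the pooling choice are assumptions, not the paper's design.

```python
import torch
import torch.nn as nn

class SelfEstimator(nn.Module):
    """Plausible sketch of a competency head: mean-pool decoder hidden states
    and regress a quality score in [0, 1]."""
    def __init__(self, hidden: int = 512):
        super().__init__()
        self.score = nn.Sequential(
            nn.Linear(hidden, hidden), nn.Tanh(), nn.Linear(hidden, 1), nn.Sigmoid()
        )

    def forward(self, decoder_states: torch.Tensor) -> torch.Tensor:
        # decoder_states: (batch, tgt_len, hidden); pool over target positions
        return self.score(decoder_states.mean(dim=1)).squeeze(-1)

estimated_quality = SelfEstimator()(torch.randn(4, 20, 512))  # (batch,) scores
```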
- Improving Chinese Named Entity Recognition by Search Engine Augmentation [2.971423962840551] (2022-10-23)
We propose a neural-based approach to perform semantic augmentation using external knowledge from search engines for Chinese NER.
In particular, a multi-channel semantic fusion model is adopted to generate the augmented input representations, which aggregates external related texts retrieved from the search engine.
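A hedged sketch of such a fusion step: attend from the input sentence representation over encodings of retrieved search-engine snippets and concatenate the attended summary back in. The attention-based fusion, dimensions, and head count are illustrative; the paper's multi-channel architecture may differ.

```python
import torch
import torch.nn as nn

class RetrievedTextFusion(nn.Module):
    """Sketch of semantic augmentation: attend from the sentence vector over
    encodings of retrieved snippets and project the concatenation."""
    def __init__(self, dim: int = 768):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, num_heads=8, batch_first=True)
        self.proj = nn.Linear(2 * dim, dim)

    def forward(self, sent_vec: torch.Tensor, retrieved_vecs: torch.Tensor) -> torch.Tensor:
        # sent_vec: (batch, 1, dim); retrieved_vecs: (batch, k, dim)
        summary, _ = self.attn(sent_vec, retrieved_vecs, retrieved_vecs)
        return self.proj(torch.cat([sent_vec, summary], dim=-1))  # augmented representation

fusion = RetrievedTextFusion()
augmented = fusion(torch.randn(2, 1, 768), torch.randn(2, 5, 768))
```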
- Reinforced Iterative Knowledge Distillation for Cross-Lingual Named Entity Recognition [54.92161571089808] (2021-06-01)
Cross-lingual NER transfers knowledge from rich-resource languages to languages with low resources.
Existing cross-lingual NER methods do not make good use of rich unlabeled data in target languages.
We develop a novel approach based on the ideas of semi-supervised learning and reinforcement learning.
- Building Low-Resource NER Models Using Non-Speaker Annotation [58.78968578460793] (2020-06-17)
Cross-lingual methods have had notable success in addressing the challenges of low-resource NER.
We propose a complementary approach to building low-resource Named Entity Recognition (NER) models using "non-speaker" (NS) annotations.
We show that use of NS annotators produces results that are consistently on par or better than cross-lingual methods built on modern contextual representations.
This list is automatically generated from the titles and abstracts of the papers on this site.