PAI at SemEval-2023 Task 2: A Universal System for Named Entity
Recognition with External Entity Information
- URL: http://arxiv.org/abs/2305.06099v1
- Date: Wed, 10 May 2023 12:40:48 GMT
- Authors: Long Ma, Kai Lu, Tianbo Che, Hailong Huang, Weiguo Gao, Xuan Li
- Abstract summary: The MultiCoNER II task aims to detect complex, ambiguous, and fine-grained named entities in low-context situations and noisy scenarios.
Our system retrieves entities with properties from the knowledge base (i.e. Wikipedia) for a given text, then concatenates entity information with the input sentence and feeds it into Transformer-based models.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The MultiCoNER II task aims to detect complex, ambiguous, and fine-grained
named entities in low-context situations and noisy scenarios like the presence
of spelling mistakes and typos for multiple languages. The task poses
significant challenges due to the scarcity of contextual information, the high
granularity of the entities (up to 33 classes), and the interference of noisy
data. To address these issues, our team PAI proposes a universal Named
Entity Recognition (NER) system that integrates external entity information to
improve performance. Specifically, our system retrieves entities with
properties from the knowledge base (i.e. Wikipedia) for a given text, then
concatenates entity information with the input sentence and feeds it into
Transformer-based models. Finally, our system wins 2 first places, 4 second
places, and 1 third place out of 13 tracks. The code is publicly available at
https://github.com/diqiuzhuanzhuan/semeval-2023.
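The retrieve-then-concatenate step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the toy knowledge base, the substring-matching retrieval, the `[SEP]` separator string, and the function names `retrieve_entities` and `build_model_input` are all assumptions made for illustration. The actual system retrieves from Wikipedia and feeds the result into Transformer-based models, which is omitted here.

```python
from typing import Dict, List

# Hypothetical toy knowledge base: entity name -> short property description.
# The real system uses Wikipedia; this dict only stands in for it.
TOY_KB: Dict[str, str] = {
    "harry potter": "fantasy novel series by J. K. Rowling",
    "apple": "technology company; also a fruit",
}

# Assumed separator between the sentence and the retrieved entity information.
SEP = " [SEP] "


def retrieve_entities(sentence: str, kb: Dict[str, str]) -> List[str]:
    """Return KB entity names found in the sentence.

    Retrieval here is a case-insensitive substring match, a deliberately
    simple stand-in for whatever retrieval the real system performs.
    """
    lowered = sentence.lower()
    return [name for name in kb if name in lowered]


def build_model_input(sentence: str, kb: Dict[str, str]) -> str:
    """Concatenate the sentence with the properties of retrieved entities."""
    entities = retrieve_entities(sentence, kb)
    if not entities:
        # Nothing retrieved: the model would see the sentence unchanged.
        return sentence
    context = "; ".join(f"{name}: {kb[name]}" for name in entities)
    return sentence + SEP + context


if __name__ == "__main__":
    print(build_model_input("i read harry potter last week", TOY_KB))
```

The resulting string would then be tokenized and passed to a token-classification model; in such a setup, only the tokens of the original sentence would normally receive NER labels.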
Related papers
- DIVKNOWQA: Assessing the Reasoning Ability of LLMs via Open-Domain
Question Answering over Knowledge Base and Text [73.68051228972024]
Large Language Models (LLMs) have exhibited impressive generation capabilities, but they suffer from hallucinations when relying on their internal knowledge.
Retrieval-augmented LLMs have emerged as a potential solution to ground LLMs in external knowledge.
arXiv Detail & Related papers (2023-10-31T04:37:57Z)
- DAMO-NLP at SemEval-2023 Task 2: A Unified Retrieval-augmented System
for Multilingual Named Entity Recognition [94.90258603217008]
The MultiCoNER II shared task aims to tackle multilingual named entity recognition (NER) in fine-grained and noisy scenarios.
Previous top systems in MultiCoNER I incorporate either knowledge bases or gazetteers.
We propose a unified retrieval-augmented system (U-RaNER) for fine-grained multilingual NER.
arXiv Detail & Related papers (2023-05-05T16:59:26Z)
- IXA/Cogcomp at SemEval-2023 Task 2: Context-enriched Multilingual Named
Entity Recognition using Knowledge Bases [53.054598423181844]
We present a novel NER cascade approach comprising three steps.
We empirically demonstrate the significance of external knowledge bases in accurately classifying fine-grained and emerging entities.
Our system exhibits robust performance in the MultiCoNER2 shared task, even in the low-resource language setting.
arXiv Detail & Related papers (2023-04-20T20:30:34Z)
- DAMO-NLP at NLPCC-2022 Task 2: Knowledge Enhanced Robust NER for Speech
Entity Linking [32.915297772110364]
Speech Entity Linking aims to recognize and disambiguate named entities in spoken languages.
Conventional methods suffer from the unfettered speech styles and the noisy transcripts generated by ASR systems.
We propose Knowledge Enhanced Named Entity Recognition (KENER), which focuses on improving robustness through painlessly incorporating proper knowledge in the entity recognition stage.
Our system achieves 1st place in Track 1 and 2nd place in Track 2 of NLPCC-2022 Shared Task 2.
arXiv Detail & Related papers (2022-09-27T06:43:56Z)
- UM6P-CS at SemEval-2022 Task 11: Enhancing Multilingual and Code-Mixed
Complex Named Entity Recognition via Pseudo Labels using Multilingual
Transformer [7.270980742378389]
We introduce our submitted system to the Multilingual Complex Named Entity Recognition (MultiCoNER) shared task.
We approach the complex NER for multilingual and code-mixed queries, by relying on the contextualized representation provided by the multilingual Transformer XLM-RoBERTa.
Our proposed system is ranked 6th and 8th in the multilingual and code-mixed MultiCoNER's tracks respectively.
arXiv Detail & Related papers (2022-04-28T14:07:06Z)
- DAMO-NLP at SemEval-2022 Task 11: A Knowledge-based System for
Multilingual Named Entity Recognition [94.1865071914727]
MultiCoNER aims at detecting semantically ambiguous named entities in short and low-context settings for multiple languages.
Our team DAMO-NLP proposes a knowledge-based system, where we build a multilingual knowledge base based on Wikipedia.
Given an input sentence, our system effectively retrieves related contexts from the knowledge base.
Our system wins 10 out of 13 tracks in the MultiCoNER shared task.
arXiv Detail & Related papers (2022-03-01T15:29:35Z)
- MobIE: A German Dataset for Named Entity Recognition, Entity Linking and
Relation Extraction in the Mobility Domain [76.21775236904185]
The dataset consists of 3,232 social media texts and traffic reports with 91K tokens, and contains 20.5K annotated entities.
A subset of the dataset is human-annotated with seven mobility-related, n-ary relation types.
To the best of our knowledge, this is the first German-language dataset that combines annotations for NER, EL and RE.
arXiv Detail & Related papers (2021-08-16T08:21:50Z)
- KGSynNet: A Novel Entity Synonyms Discovery Framework with Knowledge
Graph [23.053995137917994]
We propose a novel entity synonyms discovery framework, named KGSynNet.
Specifically, we pre-train subword embeddings for mentions and entities using a large-scale domain-specific corpus.
We employ a specifically designed fusion gate to adaptively absorb the entities' knowledge information into their semantic features.
arXiv Detail & Related papers (2021-03-16T07:32:33Z)
- Autoregressive Entity Retrieval [55.38027440347138]
Entities are at the center of how we represent and aggregate knowledge.
The ability to retrieve such entities given a query is fundamental for knowledge-intensive tasks such as entity linking and open-domain question answering.
We propose GENRE, the first system that retrieves entities by generating their unique names, left to right, token-by-token in an autoregressive fashion.
arXiv Detail & Related papers (2020-10-02T10:13:31Z)