Injecting Knowledge Base Information into End-to-End Joint Entity and
Relation Extraction and Coreference Resolution
- URL: http://arxiv.org/abs/2107.02286v1
- Date: Mon, 5 Jul 2021 21:49:02 GMT
- Title: Injecting Knowledge Base Information into End-to-End Joint Entity and
Relation Extraction and Coreference Resolution
- Authors: Severine Verlinden, Klim Zaporojets, Johannes Deleu, Thomas Demeester,
Chris Develder
- Abstract summary: We study how to inject information from a knowledge base (KB) into such an IE model, based on unsupervised entity linking.
The KB entity representations are learned from either (i) hyperlinked text documents (Wikipedia), or (ii) a knowledge graph (Wikidata).
- Score: 13.973471173349072
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We consider a joint information extraction (IE) model that solves named entity
recognition, coreference resolution, and relation extraction jointly over the
whole document. In particular, we study how to inject information from a
knowledge base (KB) into such an IE model, based on unsupervised entity linking. The
KB entity representations used are learned from either (i) hyperlinked text
documents (Wikipedia), or (ii) a knowledge graph (Wikidata), and appear
complementary in raising IE performance. Representations of corresponding
entity linking (EL) candidates are added to text span representations of the
input document, and we experiment with (i) taking a weighted average of the EL
candidate representations based on their prior (in Wikipedia), and (ii) using
an attention scheme over the EL candidate list. Results demonstrate an increase
of up to 5% F1-score for the evaluated IE tasks on two datasets. Despite the
strong performance of the prior-based model, our quantitative and qualitative
analysis reveals the advantage of using the attention-based approach.
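The two injection schemes described in the abstract can be sketched as follows. This is a minimal illustration in plain NumPy: the function names, the bilinear form of the attention scores, and the concatenation of the KB vector onto the span representation are assumptions made for exposition, not the paper's exact architecture.

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    e = np.exp(x - x.max())
    return e / e.sum()

def inject_prior(span_repr, cand_reprs, priors):
    """Prior-based injection: weighted average of the entity-linking (EL)
    candidate embeddings, weighted by their (Wikipedia) prior p(entity | mention)."""
    w = np.asarray(priors, dtype=float)
    w = w / w.sum()                              # normalize the priors
    kb_repr = w @ cand_reprs                     # (d_kb,) weighted average
    return np.concatenate([span_repr, kb_repr])  # extend the span representation

def inject_attention(span_repr, cand_reprs, W):
    """Attention-based injection: the span representation attends over the
    EL candidate list; a bilinear score is one plausible choice of scorer."""
    scores = cand_reprs @ W @ span_repr          # (k,) attention logits
    alpha = softmax(scores)                      # attention weights over candidates
    kb_repr = alpha @ cand_reprs                 # (d_kb,) attended KB vector
    return np.concatenate([span_repr, kb_repr])
```

In both variants the span representation is augmented with a single KB-derived vector; the difference is whether the mixing weights are fixed by the linking prior or learned via attention, which is what the paper's analysis compares.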
Related papers
- Web-Scale Visual Entity Recognition: An LLM-Driven Data Approach [56.55633052479446]
Web-scale visual entity recognition presents significant challenges due to the lack of clean, large-scale training data.
We propose a novel methodology to curate such a dataset, leveraging a multimodal large language model (LLM) for label verification, metadata generation, and rationale explanation.
Experiments demonstrate that models trained on this automatically curated data achieve state-of-the-art performance on web-scale visual entity recognition tasks.
arXiv Detail & Related papers (2024-10-31T06:55:24Z)
- Hypergraph based Understanding for Document Semantic Entity Recognition [65.84258776834524]
We build a novel hypergraph attention document semantic entity recognition framework, HGA, which uses hypergraph attention to focus on entity boundaries and entity categories at the same time.
Our results on FUNSD, CORD, XFUNDIE show that our method can effectively improve the performance of semantic entity recognition tasks.
arXiv Detail & Related papers (2024-07-09T14:35:49Z)
- Entity Disambiguation via Fusion Entity Decoding [68.77265315142296]
We propose an encoder-decoder model to disambiguate entities with more detailed entity descriptions.
We observe +1.5% improvements in end-to-end entity linking in the GERBIL benchmark compared with EntQA.
arXiv Detail & Related papers (2024-04-02T04:27:54Z)
- DREQ: Document Re-Ranking Using Entity-based Query Understanding [6.675805308519988]
DREQ is an entity-oriented dense document re-ranking model.
We emphasize the query-relevant entities within a document's representation while simultaneously attenuating the less relevant ones.
We show that DREQ outperforms state-of-the-art neural and non-neural re-ranking methods.
arXiv Detail & Related papers (2024-01-11T14:27:12Z)
- AKEM: Aligning Knowledge Base to Queries with Ensemble Model for Entity Recognition and Linking [15.548722102706867]
This paper presents a novel approach to address the Entity Recognition and Linking Challenge at NLPCC 2015.
The task involves extracting named entity mentions from short search queries and linking them to entities within a reference Chinese knowledge base.
Our method is computationally efficient and achieves an F1 score of 0.535.
arXiv Detail & Related papers (2023-09-12T12:37:37Z)
- Modeling Entities as Semantic Points for Visual Information Extraction in the Wild [55.91783742370978]
We propose an alternative approach to precisely and robustly extract key information from document images.
We explicitly model entities as semantic points, i.e., center points of entities are enriched with semantic information describing the attributes and relationships of different entities.
The proposed method can achieve significantly enhanced performance on entity labeling and linking, compared with previous state-of-the-art models.
arXiv Detail & Related papers (2023-03-23T08:21:16Z)
- Enriching Relation Extraction with OpenIE [70.52564277675056]
Relation extraction (RE) is a sub-discipline of information extraction (IE).
In this work, we explore how recent approaches for open information extraction (OpenIE) may help to improve the task of RE.
Our experiments over two annotated corpora, KnowledgeNet and FewRel, demonstrate the improved accuracy of our enriched models.
arXiv Detail & Related papers (2022-12-19T11:26:23Z)
- Towards Consistent Document-level Entity Linking: Joint Models for Entity Linking and Coreference Resolution [15.265013409559227]
We consider the task of document-level entity linking (EL).
We propose to join the EL task with that of coreference resolution (coref).
arXiv Detail & Related papers (2021-08-30T21:46:12Z)
- Multimodal Entity Linking for Tweets [6.439761523935613]
Multimodal entity linking (MEL) is an emerging research field in which textual and visual information is used to map an ambiguous mention to an entity in a knowledge base (KB).
We propose a method for building a fully annotated Twitter dataset for MEL, where entities are defined in a Twitter KB.
Then, we propose a model for jointly learning a representation of both mentions and entities from their textual and visual contexts.
arXiv Detail & Related papers (2021-04-07T16:40:23Z)
- DWIE: an entity-centric dataset for multi-task document-level information extraction [23.412500230644433]
DWIE is a newly created multi-task dataset that combines four main Information Extraction (IE) annotation subtasks.
DWIE is conceived as an entity-centric dataset that describes interactions and properties of conceptual entities on the level of the complete document.
arXiv Detail & Related papers (2020-09-26T15:53:22Z)
- SciREX: A Challenge Dataset for Document-Level Information Extraction [56.83748634747753]
It is challenging to create a large-scale information extraction dataset at the document level.
We introduce SciREX, a document level IE dataset that encompasses multiple IE tasks.
We develop a neural model as a strong baseline that extends previous state-of-the-art IE models to document-level IE.
arXiv Detail & Related papers (2020-05-01T17:30:10Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this list (including all information) and is not responsible for any consequences of its use.