Investigating Entity Knowledge in BERT with Simple Neural End-To-End
Entity Linking
- URL: http://arxiv.org/abs/2003.05473v1
- Date: Wed, 11 Mar 2020 18:23:00 GMT
- Title: Investigating Entity Knowledge in BERT with Simple Neural End-To-End
Entity Linking
- Authors: Samuel Broscheit
- Abstract summary: We propose an extreme simplification of the entity linking setup that works surprisingly well.
We show that this model improves the entity representations over plain BERT.
We also investigate the usefulness of entity-aware token-representations in the text-understanding benchmark GLUE.
- Score: 8.265860641797996
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: A typical architecture for end-to-end entity linking systems consists of
three steps: mention detection, candidate generation and entity disambiguation.
In this study we investigate the following questions: (a) Can all those steps
be learned jointly with a model for contextualized text-representations, i.e.
BERT (Devlin et al., 2019)? (b) How much entity knowledge is already contained
in pretrained BERT? (c) Does additional entity knowledge improve BERT's
performance in downstream tasks? To this end, we propose an extreme
simplification of the entity linking setup that works surprisingly well: simply
cast it as a per token classification over the entire entity vocabulary (over
700K classes in our case). We show on an entity linking benchmark that (i) this
model improves the entity representations over plain BERT, (ii) that it
outperforms entity linking architectures that optimize the tasks separately and
(iii) that it only comes second to the current state-of-the-art that does
mention detection and entity disambiguation jointly. Additionally, we
investigate the usefulness of entity-aware token-representations in the
text-understanding benchmark GLUE, as well as the question answering benchmarks
SQUAD V2 and SWAG and also the EN-DE WMT14 machine translation benchmark. To
our surprise, we find that most of those benchmarks do not benefit from
additional entity knowledge, except for a task with very small training data,
the RTE task in GLUE, which improves by 2%.
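The abstract's central idea, casting entity linking as a per-token classification over the entity vocabulary, can be illustrated with a toy sketch. This is not the author's implementation: hand-made 2-d vectors stand in for BERT's contextual token representations, a three-entry vocabulary stands in for the 700K+ entity classes, and the names `ENTITY_VOCAB`, `WEIGHTS`, `BIAS`, `classify_token` and `link_entities` are hypothetical. Merging adjacent tokens that share a label into one mention is likewise an assumption made for illustration.

```python
# Toy sketch: entity linking as per-token classification (not the paper's code).
# Hand-made 2-d token vectors stand in for BERT's contextual representations,
# and a 3-entry vocabulary stands in for the 700K+ entity classes.

NO_ENTITY = "O"  # hypothetical label for tokens that belong to no entity
ENTITY_VOCAB = [NO_ENTITY, "Paris_(city)", "Paris_Hilton"]

# Hypothetical classifier weights: one row per entity class.
WEIGHTS = [
    [0.0, 0.0],   # O
    [1.0, -1.0],  # Paris_(city)
    [-1.0, 1.0],  # Paris_Hilton
]
BIAS = [0.5, 0.0, 0.0]


def classify_token(vector):
    """Return the highest-scoring entity label for one token vector."""
    scores = [
        sum(w * x for w, x in zip(row, vector)) + b
        for row, b in zip(WEIGHTS, BIAS)
    ]
    best = max(range(len(scores)), key=scores.__getitem__)
    return ENTITY_VOCAB[best]


def link_entities(tokens, vectors):
    """Label every token, then merge adjacent tokens that share an entity."""
    labels = [classify_token(v) for v in vectors]
    spans = []
    start = None
    # The trailing NO_ENTITY sentinel closes a span that ends the sentence.
    for i, label in enumerate(labels + [NO_ENTITY]):
        if start is not None and label != labels[start]:
            spans.append((" ".join(tokens[start:i]), labels[start]))
            start = None
        if start is None and label != NO_ENTITY:
            start = i
    return spans


tokens = ["Paris", "Hilton", "visited", "Paris"]
vectors = [[-1.0, 1.0], [-1.0, 1.0], [0.0, 0.0], [1.0, -1.0]]
print(link_entities(tokens, vectors))
```

In this sketch, mention detection, candidate generation and disambiguation collapse into one decision per token: a token is part of a mention exactly when its predicted class is not `O`, and the predicted class is the linked entity.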
Related papers
- Hypergraph based Understanding for Document Semantic Entity Recognition [65.84258776834524]
We build a novel hypergraph attention document semantic entity recognition framework, HGA, which uses hypergraph attention to focus on entity boundaries and entity categories at the same time.
Our results on FUNSD, CORD, and XFUNDIE show that our method can effectively improve the performance of semantic entity recognition tasks.
arXiv Detail & Related papers (2024-07-09T14:35:49Z)
- Entity Disambiguation via Fusion Entity Decoding [68.77265315142296]
We propose an encoder-decoder model to disambiguate entities with more detailed entity descriptions.
We observe +1.5% improvements in end-to-end entity linking in the GERBIL benchmark compared with EntQA.
arXiv Detail & Related papers (2024-04-02T04:27:54Z)
- Two Heads Are Better Than One: Integrating Knowledge from Knowledge Graphs and Large Language Models for Entity Alignment [31.70064035432789]
We propose a Large Language Model-enhanced Entity Alignment framework (LLMEA).
LLMEA identifies candidate alignments for a given entity by considering both embedding similarities between entities across Knowledge Graphs and edit distances to a virtual equivalent entity.
Experiments conducted on three public datasets reveal that LLMEA surpasses leading baseline models.
arXiv Detail & Related papers (2024-01-30T12:41:04Z)
- IXA/Cogcomp at SemEval-2023 Task 2: Context-enriched Multilingual Named Entity Recognition using Knowledge Bases [53.054598423181844]
We present a novel NER cascade approach comprising three steps.
We empirically demonstrate the significance of external knowledge bases in accurately classifying fine-grained and emerging entities.
Our system exhibits robust performance in the MultiCoNER2 shared task, even in the low-resource language setting.
arXiv Detail & Related papers (2023-04-20T20:30:34Z)
- Modeling Entities as Semantic Points for Visual Information Extraction in the Wild [55.91783742370978]
We propose an alternative approach to precisely and robustly extract key information from document images.
We explicitly model entities as semantic points, i.e., center points of entities are enriched with semantic information describing the attributes and relationships of different entities.
The proposed method can achieve significantly enhanced performance on entity labeling and linking, compared with previous state-of-the-art models.
arXiv Detail & Related papers (2023-03-23T08:21:16Z)
- Few-Shot Nested Named Entity Recognition [4.8693196802491405]
This paper is the first one dedicated to studying the few-shot nested NER task.
We propose a Biaffine-based Contrastive Learning (BCL) framework to learn contextual dependency to distinguish nested entities.
The BCL outperformed three baseline models on the 1-shot and 5-shot tasks in terms of F1 score.
arXiv Detail & Related papers (2022-12-02T03:42:23Z)
- Entity-aware Transformers for Entity Search [6.107210856380526]
We show that the entity-enriched BERT model improves effectiveness on entity-oriented queries over a regular BERT model.
We also show that the entity information provided by our entity-enriched model particularly helps queries related to less popular entities.
arXiv Detail & Related papers (2022-05-02T11:53:59Z)
- MuVER: Improving First-Stage Entity Retrieval with Multi-View Entity Representations [28.28940043641958]
We propose a novel approach for entity retrieval that constructs multi-view representations for entity descriptions and approximates the optimal view for mentions via a searching method.
Our method achieves the state-of-the-art performance on ZESHEL and improves the quality of candidates on three standard Entity Linking datasets.
arXiv Detail & Related papers (2021-09-13T05:51:45Z)
- ERICA: Improving Entity and Relation Understanding for Pre-trained Language Models via Contrastive Learning [97.10875695679499]
We propose a novel contrastive learning framework named ERICA, applied in the pre-training phase, to obtain a deeper understanding of the entities and their relations in text.
Experimental results demonstrate that our proposed ERICA framework achieves consistent improvements on several document-level language understanding tasks.
arXiv Detail & Related papers (2020-12-30T03:35:22Z)
- HittER: Hierarchical Transformers for Knowledge Graph Embeddings [85.93509934018499]
We propose HittER to learn representations of entities and relations in a complex knowledge graph.
Experimental results show that HittER achieves new state-of-the-art results on multiple link prediction datasets.
We additionally propose a simple approach to integrate HittER into BERT and demonstrate its effectiveness on two Freebase factoid question answering datasets.
arXiv Detail & Related papers (2020-08-28T18:58:15Z)