LUKE: Deep Contextualized Entity Representations with Entity-aware
Self-attention
- URL: http://arxiv.org/abs/2010.01057v1
- Date: Fri, 2 Oct 2020 15:38:03 GMT
- Title: LUKE: Deep Contextualized Entity Representations with Entity-aware
Self-attention
- Authors: Ikuya Yamada, Akari Asai, Hiroyuki Shindo, Hideaki Takeda, Yuji
Matsumoto
- Abstract summary: We propose new pretrained contextualized representations of words and entities based on the bidirectional transformer.
Our model is trained using a new pretraining task based on the masked language model of BERT.
We also propose an entity-aware self-attention mechanism that is an extension of the self-attention mechanism of the transformer.
- Score: 37.111204321059084
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Entity representations are useful in natural language tasks involving
entities. In this paper, we propose new pretrained contextualized
representations of words and entities based on the bidirectional transformer.
The proposed model treats words and entities in a given text as independent
tokens, and outputs contextualized representations of them. Our model is
trained using a new pretraining task based on the masked language model of
BERT. The task involves predicting randomly masked words and entities in a
large entity-annotated corpus retrieved from Wikipedia. We also propose an
entity-aware self-attention mechanism that is an extension of the
self-attention mechanism of the transformer, and considers the types of tokens
(words or entities) when computing attention scores. The proposed model
achieves impressive empirical performance on a wide range of entity-related
tasks. In particular, it obtains state-of-the-art results on five well-known
datasets: Open Entity (entity typing), TACRED (relation classification),
CoNLL-2003 (named entity recognition), ReCoRD (cloze-style question answering),
and SQuAD 1.1 (extractive question answering). Our source code and pretrained
representations are available at https://github.com/studio-ousia/luke.
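The entity-aware self-attention described above chooses how a query position attends based on whether the query token and the key token are words or entities. The snippet below is a minimal, single-head PyTorch sketch of that idea: four query projections, one per (query type, key type) pair, with a shared key and value projection, following the mechanism the abstract describes. The module itself is an illustrative assumption, not the released implementation at the repository above.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class EntityAwareSelfAttention(nn.Module):
    """Single-head sketch: the query projection used for a (query, key) pair
    depends on the token types of the two positions (0 = word, 1 = entity)."""

    def __init__(self, hidden_size):
        super().__init__()
        self.hidden_size = hidden_size
        # Four query projections: word->word, word->entity, entity->word, entity->entity.
        self.q_w2w = nn.Linear(hidden_size, hidden_size)
        self.q_w2e = nn.Linear(hidden_size, hidden_size)
        self.q_e2w = nn.Linear(hidden_size, hidden_size)
        self.q_e2e = nn.Linear(hidden_size, hidden_size)
        # Keys and values are shared across token types.
        self.key = nn.Linear(hidden_size, hidden_size)
        self.value = nn.Linear(hidden_size, hidden_size)

    def forward(self, hidden_states, token_types):
        # hidden_states: (seq_len, hidden); token_types: (seq_len,) with 0=word, 1=entity
        k = self.key(hidden_states)
        v = self.value(hidden_states)

        # Compute all four query variants, then pick the right one per (i, j) pair.
        q = torch.stack([
            self.q_w2w(hidden_states), self.q_w2e(hidden_states),
            self.q_e2w(hidden_states), self.q_e2e(hidden_states),
        ])  # (4, seq_len, hidden)

        # pair_index[i, j] selects which query matrix governs position i attending to j.
        pair_index = 2 * token_types.unsqueeze(1) + token_types.unsqueeze(0)  # (L, L)

        # Raw scores for every query variant, then gather the type-appropriate one.
        scores_all = torch.einsum("qih,jh->qij", q, k) / self.hidden_size ** 0.5  # (4, L, L)
        scores = scores_all.gather(0, pair_index.unsqueeze(0)).squeeze(0)         # (L, L)

        probs = F.softmax(scores, dim=-1)
        return probs @ v


# Toy usage: three word tokens followed by one entity token.
attn = EntityAwareSelfAttention(hidden_size=8)
h = torch.randn(4, 8)
types = torch.tensor([0, 0, 0, 1])
print(attn(h, types).shape)  # torch.Size([4, 8])
```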
Related papers
- Focus! Relevant and Sufficient Context Selection for News Image
Captioning [69.36678144800936]
News Image Captioning requires describing an image by leveraging additional context from a news article.
We propose to use the pre-trained vision and language retrieval model CLIP to localize the visually grounded entities in the news article.
Our experiments demonstrate that by simply selecting a better context from the article, we can significantly improve the performance of existing models.
arXiv Detail & Related papers (2022-12-01T20:00:27Z)
- Improving Entity Linking through Semantic Reinforced Entity Embeddings [16.868791358905916]
We propose a method to inject fine-grained semantic information into entity embeddings to reduce the distinctiveness and facilitate the learning of contextual commonality.
Based on our entity embeddings, we achieved new state-of-the-art performance on entity linking.
arXiv Detail & Related papers (2021-06-16T00:27:56Z)
- Interpretable and Low-Resource Entity Matching via Decoupling Feature Learning from Decision Making [22.755892575582788]
Entity Matching aims at recognizing entity records that denote the same real-world object.
We propose a novel EM framework that consists of Heterogeneous Information Fusion (HIF) and Key Attribute Tree (KAT) Induction.
Our method is highly efficient and outperforms SOTA EM models in most cases.
arXiv Detail & Related papers (2021-06-08T08:27:31Z)
- Autoregressive Entity Retrieval [55.38027440347138]
Entities are at the center of how we represent and aggregate knowledge.
The ability to retrieve such entities given a query is fundamental for knowledge-intensive tasks such as entity linking and open-domain question answering.
We propose GENRE, the first system that retrieves entities by generating their unique names, left to right, token-by-token in an autoregressive fashion.
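Generating entity names token by token is only useful if decoding cannot wander off the set of valid names; GENRE enforces this with constrained beam search over a prefix trie of entity names. The dependency-free sketch below shows that constraint mechanism with a greedy decoder and a toy scoring function standing in for the actual seq2seq model; the helper names and whitespace tokenization are assumptions for illustration, not GENRE's released code.

```python
# Toy sketch of trie-constrained, token-by-token entity name generation.

def build_trie(entity_names):
    """Prefix trie over tokenized entity names (whitespace tokens for simplicity)."""
    trie = {}
    for name in entity_names:
        node = trie
        for tok in name.split() + ["<eos>"]:
            node = node.setdefault(tok, {})
    return trie

def toy_scores(prefix, candidates):
    """Stand-in for model logits: deterministically prefers shorter tokens."""
    return {tok: -len(tok) for tok in candidates}

def constrained_greedy_decode(trie, score_fn):
    """Greedily pick the highest-scoring token among trie-allowed continuations."""
    node, output = trie, []
    while True:
        allowed = list(node.keys())          # only tokens that keep the prefix a valid name
        scores = score_fn(output, allowed)
        best = max(allowed, key=scores.get)
        if best == "<eos>":
            return " ".join(output)
        output.append(best)
        node = node[best]

entities = ["New York City", "New York Yankees", "York"]
trie = build_trie(entities)
print(constrained_greedy_decode(trie, toy_scores))  # -> "New York City"
```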
arXiv Detail & Related papers (2020-10-02T10:13:31Z)
- Named Entity Recognition as Dependency Parsing [16.544333689188246]
We use graph-based dependency parsing to give our model a global view of the input via a biaffine model.
We show that the model works well for both nested and flat NER through evaluation on 8 corpora, achieving SoTA performance on all of them with accuracy gains of up to 2.2 percentage points.
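In the biaffine formulation every (start, end) token pair receives a score per label, combining a bilinear term between the start and end representations with a linear term over their concatenation. The PyTorch sketch below illustrates such a span scorer under assumed shapes; it is an illustration of the scoring idea, not the authors' implementation.

```python
import torch
import torch.nn as nn

class BiaffineSpanScorer(nn.Module):
    """Scores every (start, end) token pair for each NER label."""

    def __init__(self, hidden_size, num_labels):
        super().__init__()
        # Bilinear term: one (hidden x hidden) matrix per label.
        self.bilinear = nn.Parameter(torch.randn(num_labels, hidden_size, hidden_size) * 0.01)
        # Linear term over the concatenated start/end representations, plus bias.
        self.linear = nn.Linear(2 * hidden_size, num_labels)

    def forward(self, start_repr, end_repr):
        # start_repr, end_repr: (seq_len, hidden) from separate start/end heads.
        # Bilinear part: scores_bi[l, i, j] = start_i^T U_l end_j
        scores_bi = torch.einsum("ih,lhk,jk->lij", start_repr, self.bilinear, end_repr)
        # Linear part over all concatenated (start_i, end_j) pairs.
        L = start_repr.size(0)
        pairs = torch.cat([
            start_repr.unsqueeze(1).expand(L, L, -1),
            end_repr.unsqueeze(0).expand(L, L, -1),
        ], dim=-1)                                          # (L, L, 2*hidden)
        scores_lin = self.linear(pairs).permute(2, 0, 1)    # (num_labels, L, L)
        return scores_bi + scores_lin                       # (num_labels, L, L)

scorer = BiaffineSpanScorer(hidden_size=8, num_labels=5)
h_start, h_end = torch.randn(6, 8), torch.randn(6, 8)
print(scorer(h_start, h_end).shape)  # torch.Size([5, 6, 6])
```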
arXiv Detail & Related papers (2020-05-14T17:11:41Z)
- Interpretable Entity Representations through Large-Scale Typing [61.4277527871572]
We present an approach to creating entity representations that are human readable and achieve high performance out of the box.
Our representations are vectors whose values correspond to posterior probabilities over fine-grained entity types.
We show that it is possible to reduce the size of our type set in a learning-based way for particular domains.
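Because each dimension of such a representation is the posterior probability of one fine-grained type, the vector can be read off directly. The toy sketch below illustrates that construction with an assumed sigmoid typing head and a made-up type inventory; it only conveys the shape of the idea, not the paper's model.

```python
import torch
import torch.nn as nn

# Toy fine-grained type inventory; each embedding dimension maps to one type.
TYPES = ["person", "politician", "organization", "city", "sports_team"]

class TypeProbEntityEncoder(nn.Module):
    """Maps a mention encoding to a vector of per-type posterior probabilities."""

    def __init__(self, hidden_size, num_types):
        super().__init__()
        self.typing_head = nn.Linear(hidden_size, num_types)

    def forward(self, mention_repr):
        # Independent probability per type; the resulting vector IS the entity representation.
        return torch.sigmoid(self.typing_head(mention_repr))

encoder = TypeProbEntityEncoder(hidden_size=16, num_types=len(TYPES))
mention = torch.randn(16)                   # stand-in for a contextual mention encoding
embedding = encoder(mention)
for t, p in zip(TYPES, embedding.tolist()):
    print(f"{t}: {p:.2f}")                  # each dimension is directly interpretable
```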
arXiv Detail & Related papers (2020-04-30T23:58:03Z)
- Probing Linguistic Features of Sentence-Level Representations in Neural Relation Extraction [80.38130122127882]
We introduce 14 probing tasks targeting linguistic properties relevant to neural relation extraction (RE).
We use them to study representations learned by more than 40 different combinations of encoder architectures and linguistic features, trained on two datasets.
We find that the biases induced by the architecture and by the inclusion of linguistic features are clearly expressed in the probing task performance.
arXiv Detail & Related papers (2020-04-17T09:17:40Z)
- Interpretability Analysis for Named Entity Recognition to Understand System Predictions and How They Can Improve [49.878051587667244]
We examine the performance of several variants of LSTM-CRF architectures for named entity recognition.
We find that context representations do contribute to system performance, but that the main factor driving high performance is learning the name tokens themselves.
We enlist human annotators to evaluate the feasibility of inferring entity types from context alone and find that, for the majority of the errors made by the context-only system, people are also unable to infer the entity type, though there is some room for improvement.
arXiv Detail & Related papers (2020-04-09T14:37:12Z)