Autoregressive Entity Retrieval
- URL: http://arxiv.org/abs/2010.00904v3
- Date: Wed, 24 Mar 2021 07:21:07 GMT
- Title: Autoregressive Entity Retrieval
- Authors: Nicola De Cao, Gautier Izacard, Sebastian Riedel, Fabio Petroni
- Abstract summary: Entities are at the center of how we represent and aggregate knowledge.
The ability to retrieve such entities given a query is fundamental for knowledge-intensive tasks such as entity linking and open-domain question answering.
We propose GENRE, the first system that retrieves entities by generating their unique names, left to right, token-by-token in an autoregressive fashion.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Entities are at the center of how we represent and aggregate knowledge. For
instance, encyclopedias such as Wikipedia are structured by entities (e.g., one
per Wikipedia article). The ability to retrieve such entities given a query is
fundamental for knowledge-intensive tasks such as entity linking and
open-domain question answering. Current approaches can be understood as
classifiers among atomic labels, one for each entity. Their weight vectors are
dense entity representations produced by encoding entity meta information such
as their descriptions. This approach has several shortcomings: (i) context and
entity affinity is mainly captured through a vector dot product, potentially
missing fine-grained interactions; (ii) a large memory footprint is needed to
store dense representations when considering large entity sets; (iii) an
appropriately hard set of negative data has to be subsampled at training time.
In this work, we propose GENRE, the first system that retrieves entities by
generating their unique names, left to right, token-by-token in an
autoregressive fashion. This mitigates the aforementioned technical issues
since: (i) the autoregressive formulation directly captures relations between
context and entity name, effectively cross encoding both; (ii) the memory
footprint is greatly reduced because the parameters of our encoder-decoder
architecture scale with vocabulary size, not entity count; (iii) the softmax
loss is computed without subsampling negative data. We experiment with more
than 20 datasets on entity disambiguation, end-to-end entity linking and
document retrieval tasks, achieving new state-of-the-art or very competitive
results while using a tiny fraction of the memory footprint of competing
systems. Finally, we demonstrate that new entities can be added by simply
specifying their names. Code and pre-trained models at
https://github.com/facebookresearch/GENRE.
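The key mechanism is constrained beam search: decoding is restricted by a prefix trie built over the token sequences of all valid entity names, so the decoder can only ever emit an exact name. Below is a minimal sketch of those mechanics using the Hugging Face transformers prefix_allowed_tokens_fn hook; the plain facebook/bart-large checkpoint, the toy entity list, and the mention markers are illustrative stand-ins (the fine-tuned GENRE weights and full Wikipedia-title tries live in the repository above).

```python
# Minimal sketch of trie-constrained entity-name generation (GENRE-style).
# Uses the base facebook/bart-large checkpoint, so it demonstrates only the
# decoding mechanics, not the fine-tuned retrieval quality.
from transformers import BartForConditionalGeneration, BartTokenizer

tok = BartTokenizer.from_pretrained("facebook/bart-large")
model = BartForConditionalGeneration.from_pretrained("facebook/bart-large")

# Prefix trie over the token ids of every valid entity name.
entities = ["Leonardo da Vinci", "Leonardo DiCaprio", "Lake Como"]
trie = {}
for name in entities:
    node = trie
    for t in tok(name, add_special_tokens=False).input_ids:
        node = node.setdefault(t, {})

specials = set(tok.all_special_ids)

def allowed_tokens(batch_id, prefix_ids):
    # Follow the trie along the tokens generated so far (skipping the
    # decoder-start / BOS specials); at a leaf only EOS may be emitted.
    node = trie
    for t in prefix_ids.tolist():
        if t in specials:
            continue
        node = node.get(t, {})
    return list(node.keys()) or [tok.eos_token_id]

query = "a portrait painted by [START_ENT] Leonardo [END_ENT] in Florence"
out = model.generate(**tok(query, return_tensors="pt"), num_beams=3,
                     max_new_tokens=16, prefix_allowed_tokens_fn=allowed_tokens)
print(tok.decode(out[0], skip_special_tokens=True))  # exactly one entity name
```

Adding a new entity then amounts to inserting its token sequence into the trie, which is why the abstract's final claim (new entities added by name alone) comes essentially for free.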
Related papers
- OneNet: A Fine-Tuning Free Framework for Few-Shot Entity Linking via Large Language Model Prompting
OneNet is an innovative framework that utilizes the few-shot learning capabilities of Large Language Models (LLMs) without the need for fine-tuning.
OneNet is structured around three key components prompted by LLMs: (1) an entity reduction processor that simplifies inputs by summarizing and filtering out irrelevant entities, (2) a dual-perspective entity linker that combines contextual cues and prior knowledge for precise entity linking, and (3) an entity consensus judger that employs a unique consistency algorithm to alleviate hallucination in entity-linking reasoning; the pipeline is sketched below.
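Since all three components are realized as LLM prompts, the control flow reduces to a short pipeline. The sketch below is hypothetical: the llm stub stands for any chat-completion client, and the prompt wording and helper names are invented for illustration rather than taken from the paper.

```python
# Hypothetical outline of OneNet's three prompted stages (prompts invented).
def llm(prompt: str) -> str:
    raise NotImplementedError("plug in any chat-completion client here")

def one_net(mention: str, context: str, candidates: list[str]) -> str:
    # (1) Entity reduction: summarize context, filter implausible candidates.
    kept = llm(f"Context: {context}\nWhich of {candidates} could "
               f"'{mention}' plausibly refer to? List them.")
    # (2) Dual-perspective linking: once from context, once from priors.
    ctx_pick = llm(f"Given the context '{context}', which entity in {kept} "
                   f"does '{mention}' denote?")
    prior_pick = llm(f"Ignoring context, which entity in {kept} is the most "
                     f"common referent of '{mention}'?")
    # (3) Consensus judging: reconcile the two answers to curb hallucination.
    return llm(f"Answers '{ctx_pick}' and '{prior_pick}' were proposed for "
               f"'{mention}' in '{context}'. Return the one that is "
               f"consistent with the context.")
```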
arXiv Detail & Related papers (2024-10-10T02:45:23Z)
- Entity Disambiguation via Fusion Entity Decoding
We propose an encoder-decoder model to disambiguate entities with more detailed entity descriptions.
We observe +1.5% improvement in end-to-end entity linking on the GERBIL benchmark compared with EntQA.
arXiv Detail & Related papers (2024-04-02T04:27:54Z)
- Joint Entity and Relation Extraction with Span Pruning and Hypergraph Neural Networks
We propose a HyperGraph neural network for ERE (HGNN), built upon PL-marker (a state-of-the-art marker-based pipeline model).
To alleviate error propagation, we use a high-recall pruner mechanism to transfer the burden of entity identification and labeling from the NER module to the joint module of our model.
Experiments on three widely used benchmarks for ERE task show significant improvements over the previous state-of-the-art PL-marker.
arXiv Detail & Related papers (2023-10-26T08:36:39Z)
- SpEL: Structured Prediction for Entity Linking
We revisit the use of structured prediction for entity linking, which classifies each individual input token as an entity and aggregates the token-level predictions; the aggregation step is sketched below.
Our system, called SpEL, is a state-of-the-art entity linking system that uses some new ideas to apply structured prediction to the task of entity linking.
Our experiments show that we can outperform the state-of-the-art on the commonly used AIDA benchmark dataset for entity linking to Wikipedia.
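To make the token-level formulation concrete, here is a self-contained sketch of the aggregation step: per-token entity predictions (stand-ins for a classifier's argmax outputs, with O meaning "not an entity") are merged into linked spans. The labels are illustrative, not SpEL's actual code.

```python
# Merge runs of identical non-O token-level predictions into entity spans.
def aggregate(token_labels):
    spans, start, cur = [], 0, "O"
    for i, lab in enumerate(token_labels + ["O"]):  # sentinel flushes last run
        if lab != cur:
            if cur != "O":
                spans.append((start, i, cur))       # [start, i) linked to cur
            start, cur = i, lab
    return spans

tokens = ["Michael", "Jordan", "played", "for", "Chicago", "Bulls"]
labels = ["Michael_Jordan", "Michael_Jordan", "O", "O",
          "Chicago_Bulls", "Chicago_Bulls"]
print(aggregate(labels))  # [(0, 2, 'Michael_Jordan'), (4, 6, 'Chicago_Bulls')]
```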
arXiv Detail & Related papers (2023-10-23T08:24:35Z)
- EnCore: Fine-Grained Entity Typing by Pre-Training Entity Encoders on Coreference Chains
We propose to pre-train an entity encoder such that embeddings of coreferring entities are more similar to each other than to the embeddings of other entities.
We show that the noise in predicted coreference links can be addressed with a simple trick: we only consider links that are predicted by two different off-the-shelf systems.
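Mechanically, the trick is a set intersection over predicted links; a toy illustration with invented mention-pair identifiers:

```python
# Keep only coreference links both off-the-shelf systems predict; the
# surviving high-precision links serve as positive pairs for contrastive
# pre-training of the entity encoder.
links_sys_a = {("m1", "m4"), ("m2", "m5"), ("m3", "m6")}
links_sys_b = {("m1", "m4"), ("m3", "m6"), ("m2", "m7")}

trusted_positives = links_sys_a & links_sys_b
print(sorted(trusted_positives))  # [('m1', 'm4'), ('m3', 'm6')]
```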
arXiv Detail & Related papers (2023-05-22T11:11:59Z)
- What Are You Token About? Dense Retrieval as Distributions Over the Vocabulary
We propose to interpret the vector representations produced by dual encoders by projecting them into the model's vocabulary space.
We show that the resulting projections contain rich semantic information, and we draw a connection between them and sparse retrieval.
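A minimal sketch of the projection, with vanilla bert-base-uncased standing in for a trained dual encoder and [CLS] pooling standing in for the paper's exact setup: push the dense vector through the model's own (tied) masked-LM head and read off the top vocabulary terms.

```python
# Project a dense encoder representation into vocabulary space via the
# tied MLM head, then inspect the highest-scoring terms. Model and pooling
# are illustrative stand-ins for an actual trained retriever.
import torch
from transformers import AutoTokenizer, BertForMaskedLM

tok = AutoTokenizer.from_pretrained("bert-base-uncased")
model = BertForMaskedLM.from_pretrained("bert-base-uncased")

enc = tok("Who painted the Mona Lisa?", return_tensors="pt")
with torch.no_grad():
    dense = model.bert(**enc).last_hidden_state[:, 0]  # [CLS] as the dense rep
    vocab_logits = model.cls(dense)                    # scores over the vocab
print(tok.convert_ids_to_tokens(vocab_logits[0].topk(5).indices.tolist()))
```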
arXiv Detail & Related papers (2022-12-20T16:03:25Z)
- Knowledge-Rich Self-Supervised Entity Linking
Knowledge-Rich Self-Supervision (KRISSBERT) is a universal entity linker for four million UMLS entities, produced without using any labeled information.
Our approach subsumes zero-shot and few-shot methods, and can easily incorporate entity descriptions and gold mention labels if available.
arXiv Detail & Related papers (2021-12-15T05:05:12Z)
- EntQA: Entity Linking as Question Answering
We present EntQA, which stands for Entity linking as Question Answering.
Our approach combines progress in entity linking with that in open-domain question answering.
Unlike in previous works, we do not rely on a mention-candidates dictionary or large-scale weak supervision.
arXiv Detail & Related papers (2021-10-05T21:39:57Z)
- MOLEMAN: Mention-Only Linking of Entities with a Mention Annotation Network
We present an instance-based nearest neighbor approach to entity linking.
We build a contextualized mention-encoder that learns to place similar mentions of the same entity closer in vector space than mentions of different entities.
Our model is trained on a large multilingual corpus of mention pairs derived from Wikipedia hyperlinks, and performs nearest neighbor inference on an index of 700 million mentions.
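At inference time this is nearest-neighbor search over mention embeddings, where the index stores one vector per mention (not per entity) together with its gold entity label. A toy sketch with invented 2-d vectors:

```python
# Instance-based linking: a new mention inherits the entity label of its
# nearest indexed mention under cosine similarity. Vectors are invented;
# MOLEMAN uses a trained mention encoder and ~700M indexed mentions.
import numpy as np

index_vecs = np.array([[0.9, 0.1], [0.8, 0.3], [0.1, 0.95]])
index_labels = ["Michael_Jordan_(basketball)",
                "Michael_Jordan_(basketball)",
                "Michael_I._Jordan"]

def link(query_vec):
    sims = index_vecs @ query_vec / (
        np.linalg.norm(index_vecs, axis=1) * np.linalg.norm(query_vec))
    return index_labels[int(np.argmax(sims))]

print(link(np.array([0.2, 0.9])))  # -> Michael_I._Jordan
```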
arXiv Detail & Related papers (2021-06-02T15:54:36Z)
- LUKE: Deep Contextualized Entity Representations with Entity-aware Self-attention
We propose new pretrained contextualized representations of words and entities based on the bidirectional transformer.
Our model is trained using a new pretraining task based on the masked language model of BERT.
We also propose an entity-aware self-attention mechanism that is an extension of the self-attention mechanism of the transformer.
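A small sketch of what "entity-aware" means mechanically: the query projection is chosen by the (attending, attended) position types, words or entities, while keys are shared across types. Dimensions, initialization, and the omitted value projection are simplified for illustration.

```python
# Entity-aware attention scores: one query matrix per (query-type,
# key-type) pair, a single shared key matrix. Simplified sketch.
import torch

d = 8
Wq = {(a, b): torch.nn.Linear(d, d, bias=False)
      for a in ("word", "entity") for b in ("word", "entity")}
Wk = torch.nn.Linear(d, d, bias=False)

def attention_weights(x, types):
    # x: (n, d) hidden states; types[i] in {"word", "entity"}.
    k = Wk(x)  # keys are shared across token types
    scores = torch.empty(len(types), len(types))
    for i, ti in enumerate(types):
        for j, tj in enumerate(types):
            q = Wq[(ti, tj)](x[i])          # type-pair-specific query
            scores[i, j] = q @ k[j] / d ** 0.5
    return scores.softmax(dim=-1)

x = torch.randn(4, d)
print(attention_weights(x, ["word", "word", "entity", "word"]))
```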
arXiv Detail & Related papers (2020-10-02T15:38:03Z)