Autoregressive Entity Retrieval
- URL: http://arxiv.org/abs/2010.00904v3
- Date: Wed, 24 Mar 2021 07:21:07 GMT
- Title: Autoregressive Entity Retrieval
- Authors: Nicola De Cao, Gautier Izacard, Sebastian Riedel, Fabio Petroni
- Abstract summary: Entities are at the center of how we represent and aggregate knowledge.
The ability to retrieve such entities given a query is fundamental for knowledge-intensive tasks such as entity linking and open-domain question answering.
We propose GENRE, the first system that retrieves entities by generating their unique names, left to right, token-by-token in an autoregressive fashion.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Entities are at the center of how we represent and aggregate knowledge. For
instance, encyclopedias such as Wikipedia are structured by entities (e.g., one
per Wikipedia article). The ability to retrieve such entities given a query is
fundamental for knowledge-intensive tasks such as entity linking and
open-domain question answering. Current approaches can be understood as
classifiers among atomic labels, one for each entity. Their weight vectors are
dense entity representations produced by encoding entity meta information such
as their descriptions. This approach has several shortcomings: (i) context and
entity affinity is mainly captured through a vector dot product, potentially
missing fine-grained interactions; (ii) a large memory footprint is needed to
store dense representations when considering large entity sets; (iii) an
appropriately hard set of negative data has to be subsampled at training time.
In this work, we propose GENRE, the first system that retrieves entities by
generating their unique names, left to right, token-by-token in an
autoregressive fashion. This mitigates the aforementioned technical issues
since: (i) the autoregressive formulation directly captures relations between
context and entity name, effectively cross encoding both; (ii) the memory
footprint is greatly reduced because the parameters of our encoder-decoder
architecture scale with vocabulary size, not entity count; (iii) the softmax
loss is computed without subsampling negative data. We experiment with more
than 20 datasets on entity disambiguation, end-to-end entity linking and
document retrieval tasks, achieving new state-of-the-art or very competitive
results while using a tiny fraction of the memory footprint of competing
systems. Finally, we demonstrate that new entities can be added by simply
specifying their names. Code and pre-trained models at
https://github.com/facebookresearch/GENRE.
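The key mechanism is constrained beam search: decoding is restricted by a prefix trie built over the token sequences of all valid entity names, so the decoder can only ever emit an exact name. Below is a minimal sketch of those mechanics using the Hugging Face transformers prefix_allowed_tokens_fn hook; the plain facebook/bart-large checkpoint, the toy entity list, and the mention markers are illustrative stand-ins (the fine-tuned GENRE weights and full Wikipedia-title tries live in the repository above).

```python
# Minimal sketch of trie-constrained entity-name generation (GENRE-style).
# Uses the base facebook/bart-large checkpoint, so it demonstrates only the
# decoding mechanics, not the fine-tuned retrieval quality.
from transformers import BartForConditionalGeneration, BartTokenizer

tok = BartTokenizer.from_pretrained("facebook/bart-large")
model = BartForConditionalGeneration.from_pretrained("facebook/bart-large")

# Prefix trie over the token ids of every valid entity name.
entities = ["Leonardo da Vinci", "Leonardo DiCaprio", "Lake Como"]
trie = {}
for name in entities:
    node = trie
    for t in tok(name, add_special_tokens=False).input_ids:
        node = node.setdefault(t, {})

specials = set(tok.all_special_ids)

def allowed_tokens(batch_id, prefix_ids):
    # Follow the trie along the tokens generated so far (skipping the
    # decoder-start / BOS specials); at a leaf only EOS may be emitted.
    node = trie
    for t in prefix_ids.tolist():
        if t in specials:
            continue
        node = node.get(t, {})
    return list(node.keys()) or [tok.eos_token_id]

query = "a portrait painted by [START_ENT] Leonardo [END_ENT] in Florence"
out = model.generate(**tok(query, return_tensors="pt"), num_beams=3,
                     max_new_tokens=16, prefix_allowed_tokens_fn=allowed_tokens)
print(tok.decode(out[0], skip_special_tokens=True))  # exactly one entity name
```

Adding a new entity then amounts to inserting its token sequence into the trie, which is why the abstract's final claim (new entities added by name alone) comes essentially for free.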
Related papers
- OneNet: A Fine-Tuning Free Framework for Few-Shot Entity Linking via Large Language Model Prompting
OneNet is an innovative framework that utilizes the few-shot learning capabilities of Large Language Models (LLMs) without the need for fine-tuning.
OneNet is structured around three key components prompted by LLMs: (1) an entity reduction processor that simplifies inputs by summarizing and filtering out irrelevant entities, (2) a dual-perspective entity linker that combines contextual cues and prior knowledge for precise entity linking, and (3) an entity consensus judger that employs a unique consistency algorithm to alleviate hallucination in entity-linking reasoning; the pipeline is sketched below.
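Since all three components are realized as LLM prompts, the control flow reduces to a short pipeline. The sketch below is hypothetical: the llm stub stands for any chat-completion client, and the prompt wording and helper names are invented for illustration rather than taken from the paper.

```python
# Hypothetical outline of OneNet's three prompted stages (prompts invented).
def llm(prompt: str) -> str:
    raise NotImplementedError("plug in any chat-completion client here")

def one_net(mention: str, context: str, candidates: list[str]) -> str:
    # (1) Entity reduction: summarize context, filter implausible candidates.
    kept = llm(f"Context: {context}\nWhich of {candidates} could "
               f"'{mention}' plausibly refer to? List them.")
    # (2) Dual-perspective linking: once from context, once from priors.
    ctx_pick = llm(f"Given the context '{context}', which entity in {kept} "
                   f"does '{mention}' denote?")
    prior_pick = llm(f"Ignoring context, which entity in {kept} is the most "
                     f"common referent of '{mention}'?")
    # (3) Consensus judging: reconcile the two answers to curb hallucination.
    return llm(f"Answers '{ctx_pick}' and '{prior_pick}' were proposed for "
               f"'{mention}' in '{context}'. Return the one that is "
               f"consistent with the context.")
```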
arXiv Detail & Related papers (2024-10-10T02:45:23Z)
- Entity Disambiguation via Fusion Entity Decoding
We propose an encoder-decoder model to disambiguate entities with more detailed entity descriptions.
We observe +1.5% improvement in end-to-end entity linking on the GERBIL benchmark compared with EntQA.
arXiv Detail & Related papers (2024-04-02T04:27:54Z)
- Joint Entity and Relation Extraction with Span Pruning and Hypergraph Neural Networks
We propose a HyperGraph neural network for ERE (HGNN), built upon PL-marker (a state-of-the-art marker-based pipeline model).
To alleviate error propagation, we use a high-recall pruner mechanism to transfer the burden of entity identification and labeling from the NER module to the joint module of our model.
Experiments on three widely used benchmarks for ERE task show significant improvements over the previous state-of-the-art PL-marker.
arXiv Detail & Related papers (2023-10-26T08:36:39Z)
- SpEL: Structured Prediction for Entity Linking
We revisit the use of structured prediction for entity linking, which classifies each individual input token as an entity and aggregates the token-level predictions; the aggregation step is sketched below.
Our system, called SpEL, is a state-of-the-art entity linking system that uses some new ideas to apply structured prediction to the task of entity linking.
Our experiments show that we can outperform the state-of-the-art on the commonly used AIDA benchmark dataset for entity linking to Wikipedia.
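To make the token-level formulation concrete, here is a self-contained sketch of the aggregation step: per-token entity predictions (stand-ins for a classifier's argmax outputs, with O meaning "not an entity") are merged into linked spans. The labels are illustrative, not SpEL's actual code.

```python
# Merge runs of identical non-O token-level predictions into entity spans.
def aggregate(token_labels):
    spans, start, cur = [], 0, "O"
    for i, lab in enumerate(token_labels + ["O"]):  # sentinel flushes last run
        if lab != cur:
            if cur != "O":
                spans.append((start, i, cur))       # [start, i) linked to cur
            start, cur = i, lab
    return spans

tokens = ["Michael", "Jordan", "played", "for", "Chicago", "Bulls"]
labels = ["Michael_Jordan", "Michael_Jordan", "O", "O",
          "Chicago_Bulls", "Chicago_Bulls"]
print(aggregate(labels))  # [(0, 2, 'Michael_Jordan'), (4, 6, 'Chicago_Bulls')]
```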
arXiv Detail & Related papers (2023-10-23T08:24:35Z)
- EnCore: Fine-Grained Entity Typing by Pre-Training Entity Encoders on Coreference Chains
We propose to pre-train an entity encoder such that embeddings of coreferring entities are more similar to each other than to the embeddings of other entities.
We show that the noise in predicted coreference links can be addressed with a simple trick: we only consider links that are predicted by two different off-the-shelf systems.
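Mechanically, the trick is a set intersection over predicted links; a toy illustration with invented mention-pair identifiers:

```python
# Keep only coreference links both off-the-shelf systems predict; the
# surviving high-precision links serve as positive pairs for contrastive
# pre-training of the entity encoder.
links_sys_a = {("m1", "m4"), ("m2", "m5"), ("m3", "m6")}
links_sys_b = {("m1", "m4"), ("m3", "m6"), ("m2", "m7")}

trusted_positives = links_sys_a & links_sys_b
print(sorted(trusted_positives))  # [('m1', 'm4'), ('m3', 'm6')]
```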
arXiv Detail & Related papers (2023-05-22T11:11:59Z)
- What Are You Token About? Dense Retrieval as Distributions Over the Vocabulary
We propose to interpret the vector representations produced by dual encoders by projecting them into the model's vocabulary space.
We show that the resulting projections contain rich semantic information, and we draw a connection between them and sparse retrieval.
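A minimal sketch of the projection, with vanilla bert-base-uncased standing in for a trained dual encoder and [CLS] pooling standing in for the paper's exact setup: push the dense vector through the model's own (tied) masked-LM head and read off the top vocabulary terms.

```python
# Project a dense encoder representation into vocabulary space via the
# tied MLM head, then inspect the highest-scoring terms. Model and pooling
# are illustrative stand-ins for an actual trained retriever.
import torch
from transformers import AutoTokenizer, BertForMaskedLM

tok = AutoTokenizer.from_pretrained("bert-base-uncased")
model = BertForMaskedLM.from_pretrained("bert-base-uncased")

enc = tok("Who painted the Mona Lisa?", return_tensors="pt")
with torch.no_grad():
    dense = model.bert(**enc).last_hidden_state[:, 0]  # [CLS] as the dense rep
    vocab_logits = model.cls(dense)                    # scores over the vocab
print(tok.convert_ids_to_tokens(vocab_logits[0].topk(5).indices.tolist()))
```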
arXiv Detail & Related papers (2022-12-20T16:03:25Z)
- Knowledge-Rich Self-Supervised Entity Linking
Knowledge-Rich Self-Supervision (KRISSBERT) is a universal entity linker for four million UMLS entities, produced without using any labeled information.
Our approach subsumes zero-shot and few-shot methods, and can easily incorporate entity descriptions and gold mention labels if available.
arXiv Detail & Related papers (2021-12-15T05:05:12Z)
- EntQA: Entity Linking as Question Answering
We present EntQA, which stands for Entity linking as Question Answering.
Our approach combines progress in entity linking with that in open-domain question answering.
Unlike in previous works, we do not rely on a mention-candidates dictionary or large-scale weak supervision.
arXiv Detail & Related papers (2021-10-05T21:39:57Z)
- MOLEMAN: Mention-Only Linking of Entities with a Mention Annotation Network
We present an instance-based nearest neighbor approach to entity linking.
We build a contextualized mention-encoder that learns to place similar mentions of the same entity closer in vector space than mentions of different entities.
Our model is trained on a large multilingual corpus of mention pairs derived from Wikipedia hyperlinks, and performs nearest neighbor inference on an index of 700 million mentions.
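At inference time this is nearest-neighbor search over mention embeddings, where the index stores one vector per mention (not per entity) together with its gold entity label. A toy sketch with invented 2-d vectors:

```python
# Instance-based linking: a new mention inherits the entity label of its
# nearest indexed mention under cosine similarity. Vectors are invented;
# MOLEMAN uses a trained mention encoder and ~700M indexed mentions.
import numpy as np

index_vecs = np.array([[0.9, 0.1], [0.8, 0.3], [0.1, 0.95]])
index_labels = ["Michael_Jordan_(basketball)",
                "Michael_Jordan_(basketball)",
                "Michael_I._Jordan"]

def link(query_vec):
    sims = index_vecs @ query_vec / (
        np.linalg.norm(index_vecs, axis=1) * np.linalg.norm(query_vec))
    return index_labels[int(np.argmax(sims))]

print(link(np.array([0.2, 0.9])))  # -> Michael_I._Jordan
```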
arXiv Detail & Related papers (2021-06-02T15:54:36Z)
- LUKE: Deep Contextualized Entity Representations with Entity-aware Self-attention
We propose new pretrained contextualized representations of words and entities based on the bidirectional transformer.
Our model is trained using a new pretraining task based on the masked language model of BERT.
We also propose an entity-aware self-attention mechanism that is an extension of the self-attention mechanism of the transformer.
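A small sketch of what "entity-aware" means mechanically: the query projection is chosen by the (attending, attended) position types, words or entities, while keys are shared across types. Dimensions, initialization, and the omitted value projection are simplified for illustration.

```python
# Entity-aware attention scores: one query matrix per (query-type,
# key-type) pair, a single shared key matrix. Simplified sketch.
import torch

d = 8
Wq = {(a, b): torch.nn.Linear(d, d, bias=False)
      for a in ("word", "entity") for b in ("word", "entity")}
Wk = torch.nn.Linear(d, d, bias=False)

def attention_weights(x, types):
    # x: (n, d) hidden states; types[i] in {"word", "entity"}.
    k = Wk(x)  # keys are shared across token types
    scores = torch.empty(len(types), len(types))
    for i, ti in enumerate(types):
        for j, tj in enumerate(types):
            q = Wq[(ti, tj)](x[i])          # type-pair-specific query
            scores[i, j] = q @ k[j] / d ** 0.5
    return scores.softmax(dim=-1)

x = torch.randn(4, d)
print(attention_weights(x, ["word", "word", "entity", "word"]))
```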
arXiv Detail & Related papers (2020-10-02T15:38:03Z)