Multilingual Autoregressive Entity Linking
- URL: http://arxiv.org/abs/2103.12528v1
- Date: Tue, 23 Mar 2021 13:25:55 GMT
- Title: Multilingual Autoregressive Entity Linking
- Authors: Nicola De Cao, Ledell Wu, Kashyap Popat, Mikel Artetxe, Naman Goyal,
Mikhail Plekhanov, Luke Zettlemoyer, Nicola Cancedda, Sebastian Riedel, Fabio
Petroni
- Abstract summary: mGENRE is a sequence-to-sequence system for the Multilingual Entity Linking problem.
For a mention in a given language, mGENRE predicts the name of the target entity left-to-right, token-by-token.
We show the efficacy of our approach through extensive evaluation including experiments on three popular MEL benchmarks.
- Score: 49.35994386221958
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present mGENRE, a sequence-to-sequence system for the Multilingual Entity
Linking (MEL) problem -- the task of resolving language-specific mentions to a
multilingual Knowledge Base (KB). For a mention in a given language, mGENRE
predicts the name of the target entity left-to-right, token-by-token in an
autoregressive fashion. The autoregressive formulation allows us to effectively
cross-encode mention string and entity names to capture more interactions than
the standard dot product between mention and entity vectors. It also enables
fast search within a large KB even for mentions that do not appear in mention
tables and with no need for large-scale vector indices. While prior MEL works
use a single representation for each entity, we match against entity names of
as many languages as possible, which allows exploiting language connections
between source input and target name. Moreover, in a zero-shot setting on
languages with no training data at all, mGENRE treats the target language as a
latent variable that is marginalized at prediction time. This leads to over 50%
improvements in average accuracy. We show the efficacy of our approach through
extensive evaluation including experiments on three popular MEL benchmarks
where mGENRE establishes new state-of-the-art results. Code and pre-trained
models at https://github.com/facebookresearch/GENRE.
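To make the decoding strategy concrete, below is a minimal sketch of querying the released model, assuming the mGENRE checkpoint distributed on the Hugging Face hub as facebook/mgenre-wl (the official code, prefix tries for constrained decoding, and name-to-QID mappings live in the repository above). The example sentence, the tiny name_to_qid lookup, and the generation hyper-parameters are illustrative placeholders; the full system additionally restricts beam search to valid entity names with a prefix trie, which is omitted here for brevity.

```python
# Minimal sketch: autoregressive entity-name generation plus aggregation of
# language-specific hypotheses, assuming the "facebook/mgenre-wl" checkpoint.
import torch
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("facebook/mgenre-wl")
model = AutoModelForSeq2SeqLM.from_pretrained("facebook/mgenre-wl").eval()

# The mention is wrapped in [START] / [END] markers; the input here is Italian.
sentence = "[START] Einstein [END] era un fisico tedesco."
inputs = tokenizer([sentence], return_tensors="pt")

with torch.no_grad():
    out = model.generate(
        **inputs,
        num_beams=5,
        num_return_sequences=5,
        max_new_tokens=32,
        return_dict_in_generate=True,
        output_scores=True,
    )

# Each hypothesis is an entity name plus a language tag, e.g. "Albert Einstein >> it".
candidates = tokenizer.batch_decode(out.sequences, skip_special_tokens=True)
log_probs = out.sequences_scores  # beam-search scores (length-normalized log-probs)

# Marginalizing over the target language: hypotheses that name the same KB entry
# in different languages are summed. name_to_qid is a hypothetical lookup from
# (name, language) pairs to Wikidata IDs; the real mapping ships with the repo.
name_to_qid = {("Albert Einstein", "it"): "Q937", ("Albert Einstein", "de"): "Q937"}
entity_scores = {}
for cand, lp in zip(candidates, log_probs):
    name, _, lang = cand.rpartition(" >> ")
    qid = name_to_qid.get((name.strip(), lang.strip()))
    if qid is not None:
        entity_scores[qid] = entity_scores.get(qid, 0.0) + lp.exp().item()

print(sorted(entity_scores.items(), key=lambda kv: -kv[1]))
```

The aggregation loop is the part that corresponds to treating the target language as a latent variable: the probability mass of all language-specific names that resolve to the same entity is summed before ranking.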
Related papers
- xMEN: A Modular Toolkit for Cross-Lingual Medical Entity Normalization [0.42292483435853323]
We introduce xMEN, a modular system for cross-lingual medical entity normalization.
When synonyms in the target language are scarce for a given terminology, we leverage English aliases via cross-lingual candidate generation.
For candidate ranking, we incorporate a trainable cross-encoder model if annotations for the target task are available.
arXiv Detail & Related papers (2023-10-17T13:53:57Z)
- CompoundPiece: Evaluating and Improving Decompounding Performance of Language Models [77.45934004406283]
We systematically study decompounding, the task of splitting compound words into their constituents.
We introduce a dataset of 255k compound and non-compound words across 56 diverse languages obtained from Wiktionary.
We introduce a novel methodology to train dedicated models for decompounding.
arXiv Detail & Related papers (2023-05-23T16:32:27Z)
- Mixed Attention Transformer for Leveraging Word-Level Knowledge to Neural Cross-Lingual Information Retrieval [15.902630454568811]
We propose a novel Mixed Attention Transformer (MAT) that incorporates external word-level knowledge, such as a dictionary or translation table.
By encoding the translation knowledge into an attention matrix, the model with MAT is able to focus on the mutually translated words in the input sequence.
arXiv Detail & Related papers (2021-09-07T00:33:14Z)
- Improving Zero-Shot Multi-Lingual Entity Linking [14.502266106371433]
We consider multilingual entity linking, where a single model is trained to link references to same-language knowledge bases in several languages.
We propose a neural ranker architecture, which leverages multilingual transformer representations of text to be easily applied to a multilingual setting.
We find that using this approach improves recall in several datasets, often matching the in-language performance.
arXiv Detail & Related papers (2021-04-16T12:50:07Z)
- UNKs Everywhere: Adapting Multilingual Language Models to New Scripts [103.79021395138423]
Massively multilingual language models such as multilingual BERT (mBERT) and XLM-R offer state-of-the-art cross-lingual transfer performance on a range of NLP tasks.
Due to their limited capacity and large differences in pretraining data, there is a profound performance gap between resource-rich and resource-poor target languages.
We propose novel data-efficient methods that enable quick and effective adaptation of pretrained multilingual models to such low-resource languages and unseen scripts.
arXiv Detail & Related papers (2020-12-31T11:37:28Z)
- XL-WiC: A Multilingual Benchmark for Evaluating Semantic Contextualization [98.61159823343036]
Building on the Word-in-Context dataset (WiC), which assesses the ability to correctly model distinct meanings of a word, we put forward a large multilingual benchmark, XL-WiC, featuring gold standards in 12 new languages.
Experimental results show that even when no tagged instances are available for a target language, models trained solely on the English data can attain competitive performance.
arXiv Detail & Related papers (2020-10-13T15:32:00Z)
- GATE: Graph Attention Transformer Encoder for Cross-lingual Relation and Event Extraction [107.8262586956778]
Prior works introduce graph convolutional networks (GCNs) with universal dependency parses to learn language-agnostic sentence representations.
However, GCNs struggle to model words with long-range dependencies or words that are not directly connected in the dependency tree.
We propose to utilize the self-attention mechanism to learn the dependencies between words with different syntactic distances.
arXiv Detail & Related papers (2020-10-06T20:30:35Z)
- FILTER: An Enhanced Fusion Method for Cross-lingual Language Understanding [85.29270319872597]
We propose an enhanced fusion method that takes cross-lingual data as input for XLM finetuning.
During inference, the model makes predictions based on the text input in the target language and its translation in the source language.
Since gold labels are available only in the source language, we further propose a KL-divergence self-teaching loss for model training, based on auto-generated soft pseudo-labels for translated text in the target language.
arXiv Detail & Related papers (2020-09-10T22:42:15Z)
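As a generic illustration of the self-teaching idea mentioned in the FILTER entry above (a sketch of one plausible formulation, not that paper's exact objective), the snippet below computes a KL-divergence loss between a model's predictions on translated target-language text and soft pseudo-labels derived from source-language predictions; all names are placeholders.

```python
# Generic KL-divergence self-teaching loss on soft pseudo-labels
# (illustrative only, not the exact FILTER objective).
import torch
import torch.nn.functional as F

def self_teaching_loss(student_logits: torch.Tensor,
                       teacher_logits: torch.Tensor,
                       temperature: float = 1.0) -> torch.Tensor:
    # Teacher logits (e.g. from source-language text where gold labels exist)
    # are turned into soft pseudo-labels; the student, run on the translated
    # target-language text, is trained to match them.
    teacher_probs = F.softmax(teacher_logits.detach() / temperature, dim=-1)
    student_log_probs = F.log_softmax(student_logits / temperature, dim=-1)
    return F.kl_div(student_log_probs, teacher_probs, reduction="batchmean")

# Toy call: batch of 4 examples with 3 classes.
student = torch.randn(4, 3, requires_grad=True)
teacher = torch.randn(4, 3)
loss = self_teaching_loss(student, teacher, temperature=2.0)
loss.backward()
```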
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.