Highly Parallel Autoregressive Entity Linking with Discriminative
Correction
- URL: http://arxiv.org/abs/2109.03792v1
- Date: Wed, 8 Sep 2021 17:28:26 GMT
- Title: Highly Parallel Autoregressive Entity Linking with Discriminative
Correction
- Authors: Nicola De Cao, Wilker Aziz, Ivan Titov
- Abstract summary: We propose a very efficient approach that parallelizes autoregressive linking across all potential mentions.
Our model is >70 times faster and more accurate than the previous generative method.
- Score: 51.947280241185
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Generative approaches have been recently shown to be effective for both
Entity Disambiguation and Entity Linking (i.e., joint mention detection and
disambiguation). However, the previously proposed autoregressive formulation
for EL suffers from i) high computational cost due to a complex (deep) decoder,
ii) non-parallelizable decoding that scales with the source sequence length,
and iii) the need for training on a large amount of data. In this work, we
propose a very efficient approach that parallelizes autoregressive linking
across all potential mentions and relies on a shallow and efficient decoder.
Moreover, we augment the generative objective with an extra discriminative
component, i.e., a correction term which lets us directly optimize the
generator's ranking. When taken together, these techniques tackle all the above
issues: our model is >70 times faster and more accurate than the previous
generative method, outperforming state-of-the-art approaches on the standard
English dataset AIDA-CoNLL. Source code available at
https://github.com/nicola-decao/efficient-autoregressive-EL
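The abstract's idea of pairing a generative objective with a discriminative correction term can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the `weight` parameter, and the choice of a softmax cross-entropy over candidate scores as the correction are all assumptions made for illustration. Each candidate entity is assumed to already have a log-likelihood assigned by the generator.

```python
import math


def log_softmax(scores):
    """Numerically stable log-softmax over a list of floats."""
    m = max(scores)
    lse = m + math.log(sum(math.exp(s - m) for s in scores))
    return [s - lse for s in scores]


def combined_loss(candidate_logprobs, gold_index, weight=1.0):
    """Generative loss plus a hypothetical discriminative correction.

    candidate_logprobs[i] is assumed to be the generator's log-likelihood
    of candidate entity i for one mention. `weight` is an illustrative
    hyperparameter, not a value taken from the paper.
    """
    # Generative term: negative log-likelihood of the gold entity name.
    generative = -candidate_logprobs[gold_index]
    # Discriminative term: cross-entropy of the gold entity against all
    # candidates, which directly rewards ranking the gold entity first.
    discriminative = -log_softmax(candidate_logprobs)[gold_index]
    return generative + weight * discriminative


def rank_candidates(candidate_logprobs):
    """Return candidate indices sorted from most to least likely."""
    return sorted(
        range(len(candidate_logprobs)),
        key=lambda i: candidate_logprobs[i],
        reverse=True,
    )
```

Because the scores for all candidates of all mentions are independent inputs here, the loss could be evaluated for every potential mention in parallel, which is the property the abstract emphasizes.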
Related papers
- CURATRON: Complete Robust Preference Data for Robust Alignment of Large
Language Models [1.7849982327883962]
This paper addresses the challenges of aligning large language models (LLMs) with human values via preference learning (PL).
We propose a novel method for robustly and completely recalibrating values within preference datasets.
Our algorithms handle adversarial noise and unobserved comparisons well in both general and preference dataset settings.
(2024-03-05T07:58:12Z)
- Improved Distribution Matching for Dataset Condensation [91.55972945798531]
We propose a novel dataset condensation method based on distribution matching.
Our simple yet effective method outperforms most previous optimization-oriented methods with much fewer computational resources.
(2023-07-19T04:07:33Z)
- Improving Dual-Encoder Training through Dynamic Indexes for Negative
Mining [61.09807522366773]
We introduce an algorithm that approximates the softmax with provable bounds and that dynamically maintains the tree.
In our study on datasets with over twenty million targets, our approach cuts error by half in relation to oracle brute-force negative mining.
(2023-03-27T15:18:32Z)
- DORE: Document Ordered Relation Extraction based on Generative Framework [56.537386636819626]
This paper investigates the root cause of the underwhelming performance of the existing generative DocRE models.
We propose to generate a symbolic and ordered sequence from the relation matrix, which is deterministic and easier for the model to learn.
Experimental results on four datasets show that our proposed method can improve the performance of the generative DocRE models.
(2022-10-28T11:18:10Z)
- Can we achieve robustness from data alone? [0.7366405857677227]
Adversarial training and its variants have come to be the prevailing methods to achieve adversarially robust classification using neural networks.
We devise a meta-learning method for robust classification that optimizes the dataset prior to its deployment in a principled way.
Experiments on MNIST and CIFAR-10 demonstrate that the datasets we produce enjoy very high robustness against PGD attacks.
(2022-07-24T12:14:48Z)
- A Sparsity-promoting Dictionary Model for Variational Autoencoders [16.61511959679188]
Structuring the latent space in deep generative models is important to yield more expressive models and interpretable representations.
We propose a simple yet effective methodology to structure the latent space via a sparsity-promoting dictionary model.
(2022-03-29T17:13:11Z)
- Adaptive Anomaly Detection for Internet of Things in Hierarchical Edge
Computing: A Contextual-Bandit Approach [81.5261621619557]
We propose an adaptive anomaly detection scheme with hierarchical edge computing (HEC).
We first construct multiple anomaly detection DNN models with increasing complexity and associate each of them with a corresponding HEC layer.
Then, we design an adaptive model selection scheme that is formulated as a contextual-bandit problem and solved by using a reinforcement learning policy network.
(2021-08-09T08:45:47Z)
- FastLR: Non-Autoregressive Lipreading Model with Integrate-and-Fire [74.04394069262108]
We propose FastLR, a non-autoregressive (NAR) lipreading model which generates all target tokens simultaneously.
FastLR achieves a speedup of up to 10.97× compared with the state-of-the-art lipreading model.
(2020-08-06T08:28:56Z)
- Generalizing Variational Autoencoders with Hierarchical Empirical Bayes [6.273154057349038]
We present Hierarchical Empirical Bayes Autoencoder (HEBAE), a computationally stable framework for probabilistic generative models.
Our key contributions are two-fold. First, we make gains by placing a hierarchical prior over the encoding distribution, enabling us to adaptively balance the trade-off between minimizing the reconstruction loss function and avoiding over-regularization.
(2020-07-20T18:18:39Z)
This list is automatically generated from the titles and abstracts of the papers on this site.