Interpretable Entity Representations through Large-Scale Typing
- URL: http://arxiv.org/abs/2005.00147v2
- Date: Tue, 13 Oct 2020 01:18:13 GMT
- Title: Interpretable Entity Representations through Large-Scale Typing
- Authors: Yasumasa Onoe and Greg Durrett
- Abstract summary: We present an approach to creating entity representations that are human readable and achieve high performance out of the box.
Our representations are vectors whose values correspond to posterior probabilities over fine-grained entity types.
We show that it is possible to reduce the size of our type set in a learning-based way for particular domains.
- Score: 61.4277527871572
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In standard methodology for natural language processing, entities in text are
typically embedded in dense vector spaces with pre-trained models. The
embeddings produced this way are effective when fed into downstream models, but
they require end-task fine-tuning and are fundamentally difficult to interpret.
In this paper, we present an approach to creating entity representations that
are human readable and achieve high performance on entity-related tasks out of
the box. Our representations are vectors whose values correspond to posterior
probabilities over fine-grained entity types, indicating the confidence of a
typing model's decision that the entity belongs to the corresponding type. We
obtain these representations using a fine-grained entity typing model, trained
either on supervised ultra-fine entity typing data (Choi et al. 2018) or
distantly-supervised examples from Wikipedia. On entity probing tasks involving
recognizing entity identity, our embeddings used in parameter-free downstream
models achieve competitive performance with ELMo- and BERT-based embeddings in
trained models. We also show that it is possible to reduce the size of our type
set in a learning-based way for particular domains. Finally, we show that these
embeddings can be post-hoc modified through a small number of rules to
incorporate domain knowledge and improve performance.
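To make the proposal concrete, the sketch below assembles such a representation and uses it in a parameter-free downstream comparison: each dimension of the entity vector is a posterior probability for one fine-grained type, and two mentions are compared with a plain dot product. The `score_types` function and the tiny type inventory are placeholders standing in for the paper's trained typing model and its much larger type sets.

```python
import numpy as np

# Hypothetical fine-grained type inventory (the paper's type sets are much larger,
# e.g. ultra-fine types or Wikipedia categories).
TYPE_INVENTORY = ["person", "politician", "athlete", "organization", "company",
                  "location", "city", "event", "award", "work_of_art"]

def score_types(mention: str, context: str) -> np.ndarray:
    """Stand-in for a trained fine-grained entity typing model.

    A real implementation would encode the mention in context (e.g. with BERT)
    and apply an independent sigmoid per type, giving a posterior probability
    that the mention belongs to each type. Here we return placeholder scores.
    """
    rng = np.random.default_rng(abs(hash((mention, context))) % (2 ** 32))
    return rng.uniform(0.0, 1.0, size=len(TYPE_INVENTORY))

def entity_embedding(mention: str, context: str) -> np.ndarray:
    """Interpretable embedding: one dimension per type, value = typing posterior."""
    return score_types(mention, context)

def similarity(e1: np.ndarray, e2: np.ndarray) -> float:
    """Parameter-free downstream model: dot product of type-probability vectors."""
    return float(np.dot(e1, e2))

# Example: compare two mentions without any task-specific fine-tuning.
a = entity_embedding("Ada Lovelace", "Ada Lovelace wrote the first algorithm.")
b = entity_embedding("Lovelace", "Lovelace is regarded as the first programmer.")
print(similarity(a, b))

# The vector itself is human readable: sort dimensions by probability.
print(sorted(zip(TYPE_INVENTORY, a), key=lambda kv: -kv[1])[:3])
```

Because every dimension names a type, the post-hoc rules mentioned in the abstract amount to directly editing selected dimensions of this vector for a target domain.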
Related papers
- Seed-Guided Fine-Grained Entity Typing in Science and Engineering Domains [51.02035914828596]
We study the task of seed-guided fine-grained entity typing in science and engineering domains.
We propose SEType, which first enriches the weak supervision by finding more entities for each seen type from an unlabeled corpus.
It then matches the enriched entities to unlabeled text to get pseudo-labeled samples and trains a textual entailment model that can make inferences for both seen and unseen types.
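As a rough illustration of the pseudo-labeling step summarized above, the sketch below matches enriched entity names against unlabeled sentences to produce (sentence, mention, type) samples. The data structures and the simple substring matching are assumptions for illustration, and the subsequent entailment-model training is omitted.

```python
from typing import Dict, List, Tuple

def pseudo_label(
    enriched_entities: Dict[str, List[str]],  # seen type -> entity surface names
    unlabeled_corpus: List[str],              # raw sentences
) -> List[Tuple[str, str, str]]:
    """Match enriched entities to unlabeled text to build pseudo-labeled samples."""
    samples = []
    for sentence in unlabeled_corpus:
        for type_name, names in enriched_entities.items():
            for name in names:
                if name in sentence:
                    samples.append((sentence, name, type_name))
    return samples

corpus = ["The perovskite solar cell reached 25% efficiency.",
          "Researchers trained a convolutional neural network on X-ray images."]
enriched = {"material": ["perovskite"], "model": ["convolutional neural network"]}
print(pseudo_label(enriched, corpus))
```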
arXiv Detail & Related papers (2024-01-23T22:36:03Z)
- Meaning Representations from Trajectories in Autoregressive Models [106.63181745054571]
We propose to extract meaning representations from autoregressive language models by considering the distribution of all possible trajectories extending an input text.
This strategy is prompt-free, does not require fine-tuning, and is applicable to any pre-trained autoregressive model.
We empirically show that the representations obtained from large models align well with human annotations, outperform other zero-shot and prompt-free methods on semantic similarity tasks, and can be used to solve more complex entailment and containment tasks that standard embeddings cannot handle.
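A loose sketch of the underlying idea, not the paper's exact procedure: continuations ("trajectories") sampled for one text are scored under the other, so two texts that induce similar continuation distributions receive a high score. The GPT-2 checkpoint, sampling settings, and the symmetric averaging below are all illustrative assumptions.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

def continuation_logprob(prefix: str, continuation: str) -> float:
    """Log-probability of `continuation` given `prefix` under the LM."""
    prefix_ids = tok(prefix, return_tensors="pt").input_ids
    cont_ids = tok(continuation, return_tensors="pt").input_ids
    input_ids = torch.cat([prefix_ids, cont_ids], dim=1)
    with torch.no_grad():
        logits = model(input_ids).logits
    # Score only the continuation tokens (logits at position i predict token i+1).
    cont_logits = logits[0, prefix_ids.size(1) - 1 : -1]
    log_probs = torch.log_softmax(cont_logits, dim=-1)
    token_lp = log_probs.gather(1, cont_ids[0].unsqueeze(1)).squeeze(1)
    return float(token_lp.sum())

def sample_trajectories(text: str, n: int = 5, max_new: int = 20) -> list:
    """Sample candidate continuations ("trajectories") extending the input text."""
    ids = tok(text, return_tensors="pt").input_ids
    outs = model.generate(ids, do_sample=True, num_return_sequences=n,
                          max_new_tokens=max_new, pad_token_id=tok.eos_token_id)
    return [tok.decode(o[ids.size(1):], skip_special_tokens=True) for o in outs]

def trajectory_similarity(text_a: str, text_b: str) -> float:
    """Rough symmetric score: how well each text predicts the other's trajectories."""
    score = 0.0
    for t in sample_trajectories(text_a):
        score += continuation_logprob(text_b, t)
    for t in sample_trajectories(text_b):
        score += continuation_logprob(text_a, t)
    return score / 2
```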
arXiv Detail & Related papers (2023-10-23T04:35:58Z)
- Adapting Large Language Models for Content Moderation: Pitfalls in Data Engineering and Supervised Fine-tuning [79.53130089003986]
Large Language Models (LLMs) have become a feasible solution for handling tasks in various domains.
In this paper, we show how to fine-tune an LLM that can be privately deployed for content moderation.
arXiv Detail & Related papers (2023-10-05T09:09:44Z)
- Intermediate Entity-based Sparse Interpretable Representation Learning [37.128220450933625]
Interpretable entity representations (IERs) are sparse embeddings that are "human-readable" in that dimensions correspond to fine-grained entity types.
We propose Intermediate enTity-based Sparse Interpretable Representation Learning (ItsIRL).
arXiv Detail & Related papers (2022-12-03T16:16:11Z)
- Generative Entity Typing with Curriculum Learning [18.43562065432877]
We propose a novel generative entity typing (GET) paradigm.
Given a text with an entity mention, the multiple types for the role that the entity plays in the text are generated with a pre-trained language model.
Our experiments demonstrate the superiority of our GET model over state-of-the-art entity typing models.
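A minimal sketch of the generation interface implied above, assuming the mention is marked in the input and a sequence-to-sequence model emits type phrases. The prompt format and the off-the-shelf T5 checkpoint are placeholders; without the paper's fine-tuning and curriculum schedule, the outputs will not be meaningful types.

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("t5-small")
model = AutoModelForSeq2SeqLM.from_pretrained("t5-small")

def generate_types(context: str, mention: str, num_types: int = 3) -> list:
    """Generate candidate type phrases for the marked mention (illustrative prompt)."""
    marked = context.replace(mention, f"<m> {mention} </m>")
    ids = tok(f"entity typing: {marked}", return_tensors="pt").input_ids
    outs = model.generate(ids, num_beams=num_types, num_return_sequences=num_types,
                          max_new_tokens=8)
    return [tok.decode(o, skip_special_tokens=True) for o in outs]

print(generate_types("Messi scored twice in the final.", "Messi"))
```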
arXiv Detail & Related papers (2022-10-06T13:32:50Z)
- Efficient and Interpretable Neural Models for Entity Tracking [3.1985066117432934]
This thesis focuses on two key problems in relation to facilitating the use of entity tracking models.
We argue that computationally efficient entity tracking models can be developed by representing entities with rich, fixed-dimensional vector representations.
We also argue for integrating entity tracking into language models, which would allow wider application given the current ubiquitous use of pretrained language models in NLP.
arXiv Detail & Related papers (2022-08-30T13:25:27Z)
- Few-Shot Fine-Grained Entity Typing with Automatic Label Interpretation and Instance Generation [36.541309948222306]
We study the problem of few-shot Fine-grained Entity Typing (FET), where only a few annotated entity mentions with contexts are given for each entity type.
We propose a novel framework for few-shot FET consisting of two modules: (1) an entity type label interpretation module automatically learns to relate type labels to the vocabulary by jointly leveraging few-shot instances and the label hierarchy, and (2) a type-based contextualized instance generator produces new instances based on given instances to enlarge the training set for better generalization.
arXiv Detail & Related papers (2022-06-28T04:05:40Z)
- Unsupervised Paraphrasing with Pretrained Language Models [85.03373221588707]
We propose a training pipeline that enables pre-trained language models to generate high-quality paraphrases in an unsupervised setting.
Our recipe consists of task-adaptation, self-supervision, and a novel decoding algorithm named Dynamic Blocking.
We show with automatic and human evaluations that our approach achieves state-of-the-art performance on both the Quora Question Pair and the ParaNMT datasets.
arXiv Detail & Related papers (2020-10-24T11:55:28Z)
- GRIT: Generative Role-filler Transformers for Document-level Event Entity Extraction [134.5580003327839]
We introduce a generative transformer-based encoder-decoder framework (GRIT) to model context at the document level.
We evaluate our approach on the MUC-4 dataset, and show that our model performs substantially better than prior work.
arXiv Detail & Related papers (2020-08-21T01:07:36Z)
- Improving Entity Linking by Modeling Latent Entity Type Information [25.33342677359822]
We propose to inject latent entity type information into the entity embeddings based on pre-trained BERT.
In addition, we integrate a BERT-based entity similarity score into the local context model of a state-of-the-art model to better capture latent entity type information.
Our model significantly outperforms state-of-the-art entity linking models on the standard benchmark (AIDA-CoNLL).
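One way such a similarity term might be folded into candidate ranking is sketched below: a BERT [CLS] similarity between the mention context and each candidate entity's description is interpolated with a local-context score. The interpolation weight and the use of [CLS] vectors are assumptions for illustration, not the paper's architecture.

```python
import torch
from transformers import AutoModel, AutoTokenizer

tok = AutoTokenizer.from_pretrained("bert-base-uncased")
bert = AutoModel.from_pretrained("bert-base-uncased")
bert.eval()

def cls_embedding(text: str) -> torch.Tensor:
    """[CLS] vector as a cheap text/entity representation."""
    ids = tok(text, return_tensors="pt", truncation=True)
    with torch.no_grad():
        return bert(**ids).last_hidden_state[0, 0]

def rank_candidates(mention_context: str, candidates: dict,
                    local_scores: dict, alpha: float = 0.5) -> list:
    """Combine a local-context score with a BERT similarity between the mention
    context and each candidate entity's description (illustrative interpolation)."""
    ctx = cls_embedding(mention_context)
    ranked = []
    for name, description in candidates.items():
        sim = torch.cosine_similarity(ctx, cls_embedding(description), dim=0).item()
        ranked.append((name, alpha * local_scores[name] + (1 - alpha) * sim))
    return sorted(ranked, key=lambda x: -x[1])

candidates = {"Paris (city)": "Paris is the capital of France.",
              "Paris (mythology)": "Paris was a prince of Troy in Greek mythology."}
local = {"Paris (city)": 0.7, "Paris (mythology)": 0.2}
print(rank_candidates("He flew to Paris for the summit.", candidates, local))
```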
arXiv Detail & Related papers (2020-01-06T09:18:29Z)