Intermediate Entity-based Sparse Interpretable Representation Learning
- URL: http://arxiv.org/abs/2212.01641v1
- Date: Sat, 3 Dec 2022 16:16:11 GMT
- Title: Intermediate Entity-based Sparse Interpretable Representation Learning
- Authors: Diego Garcia-Olano, Yasumasa Onoe, Joydeep Ghosh, Byron C. Wallace
- Abstract summary: Interpretable entity representations (IERs) are sparse embeddings that are "human-readable" in that dimensions correspond to fine-grained entity types.
We propose Intermediate enTity-based Sparse Interpretable Representation Learning (ItsIRL)
- Score: 37.128220450933625
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Interpretable entity representations (IERs) are sparse embeddings that are
"human-readable" in that dimensions correspond to fine-grained entity types and
values are predicted probabilities that a given entity is of the corresponding
type. These methods perform well in zero-shot and low supervision settings.
Compared to standard dense neural embeddings, such interpretable
representations may permit analysis and debugging. However, while fine-tuning
sparse, interpretable representations improves accuracy on downstream tasks, it
destroys the semantics of the dimensions which were enforced in pre-training.
Can we maintain the interpretable semantics afforded by IERs while improving
predictive performance on downstream tasks? Toward this end, we propose
Intermediate enTity-based Sparse Interpretable Representation Learning
(ItsIRL). ItsIRL realizes improved performance over prior IERs on biomedical
tasks, while maintaining "interpretability" generally and their ability to
support model debugging specifically. The latter is enabled in part by the
ability to perform "counterfactual" fine-grained entity type manipulation,
which we explore in this work. Finally, we propose a method to construct entity
type based class prototypes for revealing global semantic properties of classes
learned by our model.
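To make the "human-readable dimensions" idea concrete, the following is a minimal Python sketch of an IER-style representation. The tiny type inventory, the probability values, and the helper names (`interpretable_repr`, `class_prototype`) are illustrative assumptions, not the paper's actual biomedical type set or code.

```python
import numpy as np

# Hypothetical toy inventory of fine-grained entity types; the paper uses a
# much larger biomedical type set. Each IER dimension corresponds to one type.
TYPE_VOCAB = ["protein", "enzyme", "gene", "disease", "symptom", "drug"]

def interpretable_repr(type_probs):
    """Build an IER: a sparse vector whose i-th entry is the predicted
    probability that the entity is of type TYPE_VOCAB[i]."""
    v = np.zeros(len(TYPE_VOCAB))
    for t, p in type_probs.items():
        v[TYPE_VOCAB.index(t)] = p
    return v

# Example: an entity mention scored by some (hypothetical) typing model.
aspirin = interpretable_repr({"drug": 0.97, "enzyme": 0.04})

# "Counterfactual" fine-grained entity type manipulation: because each
# dimension names a type, we can edit individual dimensions and re-run a
# downstream classifier to see how its prediction changes.
counterfactual = aspirin.copy()
counterfactual[TYPE_VOCAB.index("drug")] = 0.0
counterfactual[TYPE_VOCAB.index("disease")] = 0.9

def class_prototype(iers, top_k=3):
    """Entity-type-based class prototype: average the IERs of examples that
    share a downstream label and report the highest-weight types, which
    surfaces global semantic properties of that class."""
    proto = np.mean(iers, axis=0)
    top = np.argsort(-proto)[:top_k]
    return [(TYPE_VOCAB[i], float(proto[i])) for i in top]
```

Because every dimension is an explicit entity type, edits such as the `counterfactual` vector above remain meaningful; the abstract's claim is that ItsIRL preserves this property even after fine-tuning, which is what enables the debugging and class-prototype analyses.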
Related papers
- Disentangling Dense Embeddings with Sparse Autoencoders [0.0]
Sparse autoencoders (SAEs) have shown promise in extracting interpretable features from complex neural networks.
We present one of the first applications of SAEs to dense text embeddings from large language models.
We show that the resulting sparse representations maintain semantic fidelity while offering interpretability (a generic sketch of this setup appears after this list).
arXiv Detail & Related papers (2024-08-01T15:46:22Z)
- InterpretCC: Intrinsic User-Centric Interpretability through Global Mixture of Experts [31.738009841932374]
Interpretability for neural networks is a trade-off between three key requirements.
We present InterpretCC, a family of interpretable-by-design neural networks that guarantee human-centric interpretability.
arXiv Detail & Related papers (2024-02-05T11:55:50Z)
- FIND: A Function Description Benchmark for Evaluating Interpretability Methods [86.80718559904854]
This paper introduces FIND (Function INterpretation and Description), a benchmark suite for evaluating automated interpretability methods.
FIND contains functions that resemble components of trained neural networks, and accompanying descriptions of the kind we seek to generate.
We evaluate methods that use pretrained language models to produce descriptions of function behavior in natural language and code.
arXiv Detail & Related papers (2023-09-07T17:47:26Z)
- Self-Supervised Learning via Maximum Entropy Coding [57.56570417545023]
We propose Maximum Entropy Coding (MEC) as a principled objective that explicitly optimizes the structure of the representation.
MEC learns a more generalizable representation than previous methods based on specific pretext tasks.
It achieves state-of-the-art performance consistently on various downstream tasks, including not only ImageNet linear probe, but also semi-supervised classification, object detection, instance segmentation, and object tracking.
arXiv Detail & Related papers (2022-10-20T17:58:30Z)
- Transferring Semantic Knowledge Into Language Encoders [6.85316573653194]
We introduce semantic form mid-tuning, an approach for transferring semantic knowledge from semantic meaning representations into language encoders.
We show that this alignment can be learned implicitly via classification or directly via triplet loss.
Our method yields language encoders that demonstrate improved predictive performance across inference, reading comprehension, textual similarity, and other semantic tasks.
arXiv Detail & Related papers (2021-10-14T14:11:12Z)
- Prototypical Representation Learning for Relation Extraction [56.501332067073065]
This paper aims to learn predictive, interpretable, and robust relation representations from distantly-labeled data.
We learn prototypes for each relation from contextual information to best explore the intrinsic semantics of relations.
Results on several relation learning tasks show that our model significantly outperforms the previous state-of-the-art relational models.
arXiv Detail & Related papers (2021-03-22T08:11:43Z)
- Infusing Finetuning with Semantic Dependencies [62.37697048781823]
We show that, unlike syntax, semantics is not brought to the surface by today's pretrained models.
We then use convolutional graph encoders to explicitly incorporate semantic parses into task-specific finetuning.
arXiv Detail & Related papers (2020-12-10T01:27:24Z)
- Interpretable Entity Representations through Large-Scale Typing [61.4277527871572]
We present an approach to creating entity representations that are human readable and achieve high performance out of the box.
Our representations are vectors whose values correspond to posterior probabilities over fine-grained entity types.
We show that it is possible to reduce the size of our type set in a learning-based way for particular domains.
arXiv Detail & Related papers (2020-04-30T23:58:03Z)
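The "Disentangling Dense Embeddings with Sparse Autoencoders" entry above takes a complementary route to interpretability: rather than fixing dimensions to an entity-type inventory, it learns sparse features from dense text embeddings. Below is a generic sparse-autoencoder sketch; the dimensions, hyperparameters, and class name are assumptions for illustration and do not reproduce that paper's architecture or training recipe.

```python
import torch
import torch.nn as nn

class SparseAutoencoder(nn.Module):
    """Toy sparse autoencoder over dense text embeddings: an overcomplete
    ReLU encoder plus a linear decoder that reconstructs the input. The code
    dimensions act as learned (rather than predefined) interpretable features."""
    def __init__(self, d_embed=768, d_code=4096):
        super().__init__()
        self.encoder = nn.Linear(d_embed, d_code)
        self.decoder = nn.Linear(d_code, d_embed)

    def forward(self, x):
        code = torch.relu(self.encoder(x))
        return self.decoder(code), code

def sae_loss(model, x, l1_weight=1e-3):
    # Reconstruction error plus an L1 penalty that encourages sparse codes.
    recon, code = model(x)
    return nn.functional.mse_loss(recon, x) + l1_weight * code.abs().mean()

# Usage with random stand-in "embeddings"; in practice x would come from a
# sentence or entity encoder.
sae = SparseAutoencoder()
x = torch.randn(8, 768)
loss = sae_loss(sae, x)
```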