Interpretable Syntactic Representations Enable Hierarchical Word Vectors
- URL: http://arxiv.org/abs/2411.08384v1
- Date: Wed, 13 Nov 2024 07:10:18 GMT
- Title: Interpretable Syntactic Representations Enable Hierarchical Word Vectors
- Authors: Biraj Silwal
- Abstract summary: The distributed representations currently used are dense and uninterpretable.
We propose a method that transforms these word vectors into reduced syntactic representations.
The resulting representations are compact and interpretable, allowing better visualization and comparison of the word vectors.
- Score: 0.0
- Abstract: The distributed representations currently used are dense and uninterpretable, leading to interpretations that are themselves relative, overcomplete, and hard to interpret. We propose a method that transforms these word vectors into reduced syntactic representations. The resulting representations are compact and interpretable, allowing better visualization and comparison of the word vectors, and we subsequently demonstrate that the drawn interpretations are in line with human judgment. The syntactic representations are then used to create hierarchical word vectors through an incremental learning approach similar to the hierarchical aspect of human learning. As these representations are drawn from pre-trained vectors, the generation process and learning approach are computationally efficient. Most importantly, we find that the syntactic representations provide a plausible interpretation of the vectors and that the subsequent hierarchical vectors outperform the original vectors in benchmark tests.
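As a rough sketch of the pipeline the abstract describes, the snippet below reduces dense pre-trained vectors to a handful of syntactic dimensions via a least-squares map onto part-of-speech targets, then forms hierarchical vectors by stacking the coarse representation with the original one. The POS targets, dimensionality, and variable names are illustrative assumptions; the paper's exact transformation is not specified in this summary.

```python
import numpy as np

# Toy pre-trained embeddings: 6 words with 50-d dense vectors.
rng = np.random.default_rng(0)
words = ["run", "jump", "dog", "cat", "quickly", "slowly"]
E = rng.normal(size=(6, 50))

# Assumed syntactic supervision: one-hot part-of-speech targets
# (verb, noun, adverb); the paper's actual signal may differ.
pos = np.array([[1, 0, 0], [1, 0, 0],                 # verbs
                [0, 1, 0], [0, 1, 0],                 # nouns
                [0, 0, 1], [0, 0, 1]], dtype=float)   # adverbs

# Least-squares linear map from the dense space to 3 syntactic axes.
W, *_ = np.linalg.lstsq(E, pos, rcond=None)
syntactic = E @ W   # compact, interpretable representation

# Incremental "hierarchical" vector: coarse syntactic view first,
# fine-grained dense vector appended at the next learning stage.
hierarchical = np.hstack([syntactic, E])

print(dict(zip(words, np.argmax(syntactic, axis=1))))  # recovered categories
```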
Related papers
- Optimal synthesis embeddings [1.565361244756411]
We introduce a word embedding composition method based on the intuitive idea that a fair embedding representation for a given set of words should satisfy certain properties.
We show that our approach excels in solving probing tasks designed to capture simple linguistic features of sentences.
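The summary leaves the property unstated; one natural candidate (an assumption here, not a detail taken from the paper) is that the synthesized embedding should minimize the total squared distance to the member embeddings, which the centroid satisfies:

```python
import numpy as np

rng = np.random.default_rng(1)
word_vecs = rng.normal(size=(4, 300))   # embeddings of a 4-word set

# The mean minimizes the sum of squared Euclidean distances
# to the member vectors: one simple notion of a "fair" synthesis.
synthesis = word_vecs.mean(axis=0)

def total_sq_dist(candidate, vecs):
    return float(((vecs - candidate) ** 2).sum())

print(total_sq_dist(synthesis, word_vecs))             # minimal
print(total_sq_dist(rng.normal(size=300), word_vecs))  # larger
```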
arXiv Detail & Related papers (2024-06-10T18:06:33Z)
- Grounding and Distinguishing Conceptual Vocabulary Through Similarity Learning in Embodied Simulations [4.507860128918788]
We present a novel method for using agent experiences gathered through an embodied simulation to ground contextualized word vectors to object representations.
We use similarity learning to make comparisons between different object types based on their properties when interacted with, and to extract common features pertaining to the objects' behavior.
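A minimal similarity-learning step of the kind described above could use a triplet objective: pull features of same-type objects together and push different types apart. The feature size, margin, and object names below are placeholders, not details from the paper.

```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin=1.0):
    """Pull same-type object features together, push different types apart."""
    d_pos = np.sum((anchor - positive) ** 2)
    d_neg = np.sum((anchor - negative) ** 2)
    return max(0.0, d_pos - d_neg + margin)

rng = np.random.default_rng(2)
cup_a, cup_b = rng.normal(size=16), rng.normal(size=16)  # same object type
ball = rng.normal(size=16)                               # different type
print(triplet_loss(cup_a, cup_b, ball))
```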
arXiv Detail & Related papers (2023-05-23T04:22:00Z)
- An Investigation on Word Embedding Offset Clustering as Relationship Classification [0.0]
This study attempts to elicit vector representations of the relationships between pairs of word vectors.
We use six pooling strategies to represent vector relationships.
This work aims to provide directions for a word embedding based unsupervised method to identify the nature of a relationship represented by a pair of words.
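The summary does not list the six strategies; six common ways to pool a word pair into a single relationship vector, used here as illustrative stand-ins, are shown below.

```python
import numpy as np

def pool_pair(w1, w2, strategy):
    """Pool two word vectors into one vector representing their relation."""
    if strategy == "diff":   return w1 - w2
    if strategy == "sum":    return w1 + w2
    if strategy == "mult":   return w1 * w2            # element-wise
    if strategy == "avg":    return (w1 + w2) / 2
    if strategy == "max":    return np.maximum(w1, w2)
    if strategy == "concat": return np.concatenate([w1, w2])
    raise ValueError(strategy)

rng = np.random.default_rng(3)
w1, w2 = rng.normal(size=300), rng.normal(size=300)
for s in ["diff", "sum", "mult", "avg", "max", "concat"]:
    print(s, pool_pair(w1, w2, s).shape)
```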
arXiv Detail & Related papers (2023-05-07T13:03:17Z)
- Linear Spaces of Meanings: Compositional Structures in Vision-Language Models [110.00434385712786]
We investigate compositional structures in data embeddings from pre-trained vision-language models (VLMs).
We first present a framework for understanding compositional structures from a geometric perspective.
We then explain what these structures entail probabilistically in the case of VLM embeddings, providing intuitions for why they arise in practice.
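The classic demonstration of such linear compositional structure, in word embeddings rather than the paper's VLM setting, is offset arithmetic; the toy vectors below plant the linear directions explicitly.

```python
import numpy as np

# Toy embeddings with planted linear "gender" and "royalty" directions.
rng = np.random.default_rng(4)
gender, royalty = rng.normal(size=50), rng.normal(size=50)
man   = rng.normal(size=50)
woman = man - gender
king  = man + royalty
queen = woman + royalty

composed = king - man + woman   # linear composition of concepts

def cos(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

print(cos(composed, queen))   # 1.0: the composition lands on "queen"
```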
arXiv Detail & Related papers (2023-02-28T08:11:56Z)
- What Are You Token About? Dense Retrieval as Distributions Over the Vocabulary [68.77983831618685]
We propose to interpret the vector representations produced by dual encoders by projecting them into the model's vocabulary space.
We show that the resulting projections contain rich semantic information, and draw a connection between them and sparse retrieval.
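In sketch form, such a projection scores a dense retrieval vector against a vocabulary embedding matrix and reads off the top tokens. The toy vocabulary and random matrix below stand in for the model's actual output embeddings.

```python
import numpy as np

vocab = ["apple", "fruit", "car", "engine", "red"]
rng = np.random.default_rng(5)
W_vocab = rng.normal(size=(5, 128))   # stand-in for output embeddings

query_vec = W_vocab[0] + 0.5 * W_vocab[1]   # a dense dual-encoder vector

logits = W_vocab @ query_vec                # score every vocab token
top = np.argsort(-logits)[:3]
print([vocab[i] for i in top])              # lexical reading of the vector
```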
arXiv Detail & Related papers (2022-12-20T16:03:25Z)
- Fair Interpretable Representation Learning with Correction Vectors [60.0806628713968]
We propose a new framework for fair representation learning that is centered around the learning of "correction vectors".
We show experimentally that several fair representation learning models constrained in such a way do not exhibit losses in ranking or classification performance.
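A minimal reading of a correction vector, simplified well beyond the paper's framework, is a vector added to one group's representations to align its statistics with another's; the mean-difference rule and group labels below are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(6)
# Representations for two groups, one shifted by a spurious offset.
group_a = rng.normal(size=(100, 32))
group_b = rng.normal(size=(100, 32)) + 2.0   # biased shift

# Correction vector: move group B's mean onto group A's mean.
correction = group_a.mean(axis=0) - group_b.mean(axis=0)
group_b_fair = group_b + correction

print(np.abs(group_a.mean(0) - group_b_fair.mean(0)).max())  # ~0
```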
arXiv Detail & Related papers (2022-02-07T11:19:23Z)
- Deriving Word Vectors from Contextualized Language Models using Topic-Aware Mention Selection [46.97185212695267]
We propose a method that derives a word's representation from selected mentions of that word.
We take advantage of contextualized language models (CLMs) rather than bags of word vectors to encode contexts.
We show that this simple strategy leads to high-quality word vectors, which are more predictive of semantic properties than word embeddings and existing CLM-based strategies.
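A bare-bones version of this strategy, skipping the paper's topic-aware mention selection, averages a word's contextualized vectors over the mentions it appears in. The model choice and example sentences below are assumptions.

```python
import torch
from transformers import AutoModel, AutoTokenizer

tok = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

mentions = [
    "The bank approved the loan.",
    "She deposited cash at the bank.",
]

target_id = tok.convert_tokens_to_ids("bank")
vecs = []
for sent in mentions:
    inputs = tok(sent, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**inputs).last_hidden_state[0]  # (seq_len, 768)
    pos = (inputs["input_ids"][0] == target_id).nonzero()[0, 0]
    vecs.append(hidden[pos])

word_vec = torch.stack(vecs).mean(dim=0)  # static vector for "bank"
print(word_vec.shape)                     # torch.Size([768])
```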
arXiv Detail & Related papers (2021-06-15T08:02:42Z)
- Prototypical Representation Learning for Relation Extraction [56.501332067073065]
This paper aims to learn predictive, interpretable, and robust relation representations from distantly-labeled data.
We learn prototypes for each relation from contextual information to best explore the intrinsic semantics of relations.
Results on several relation learning tasks show that our model significantly outperforms previous state-of-the-art relational models.
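In its simplest form, a relation prototype can be taken as the normalized mean of that relation's instance embeddings, with new mentions classified by nearest prototype; the synthetic embeddings and relation names below are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(7)
# Embeddings of sentences distantly labeled with two relations.
born_in   = rng.normal(loc=1.0, size=(50, 64))
works_for = rng.normal(loc=-1.0, size=(50, 64))

def prototype(instances):
    """Prototype = normalized mean of a relation's instance embeddings."""
    p = instances.mean(axis=0)
    return p / np.linalg.norm(p)

protos = {"born_in": prototype(born_in), "works_for": prototype(works_for)}

query = rng.normal(loc=1.0, size=64)   # embedding of a new mention
pred = max(protos, key=lambda r: protos[r] @ query)
print(pred)   # nearest prototype -> "born_in"
```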
arXiv Detail & Related papers (2021-03-22T08:11:43Z)
- Unsupervised Distillation of Syntactic Information from Contextualized Word Representations [62.230491683411536]
We tackle the task of unsupervised disentanglement between semantics and structure in neural language representations.
To this end, we automatically generate groups of sentences which are structurally similar but semantically different.
We demonstrate that our transformation clusters vectors in space by structural properties, rather than by lexical semantics.
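One generic way to check that transformed vectors cluster by structure, with synthetic vectors standing in for the paper's sentence representations, is to cluster them and score agreement with the structural labels:

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import adjusted_rand_score

rng = np.random.default_rng(8)
# Synthetic "transformed" vectors: 3 structural templates, 30 sentences each.
centers = rng.normal(scale=4.0, size=(3, 16))
vectors = np.vstack([c + rng.normal(size=(30, 16)) for c in centers])
structure_labels = np.repeat([0, 1, 2], 30)

pred = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(vectors)
print(adjusted_rand_score(structure_labels, pred))  # ~1.0 if grouped by structure
```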
arXiv Detail & Related papers (2020-10-11T15:13:18Z)
- Generating Sense Embeddings for Syntactic and Semantic Analogy for Portuguese [0.0]
We apply sense-embedding generation techniques and present the first such experiments carried out for Portuguese.
Our experiments show that sense vectors outperform traditional word vectors in syntactic and semantic analogy tasks.
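Analogy tasks of the kind used in these experiments are typically solved by vector offsets; the toy Portuguese vocabulary below plants a plural direction so the offset recovers the expected word (all vectors and words are illustrative).

```python
import numpy as np

def solve_analogy(a, b, c, vocab):
    """Return the word whose vector is closest (cosine) to b - a + c."""
    target = vocab[b] - vocab[a] + vocab[c]
    def cos(x, y):
        return x @ y / (np.linalg.norm(x) * np.linalg.norm(y))
    return max((w for w in vocab if w not in (a, b, c)),
               key=lambda w: cos(vocab[w], target))

# Toy vectors with a planted plural offset (illustrative only).
rng = np.random.default_rng(9)
plural = rng.normal(size=50)
vocab = {w: rng.normal(size=50) for w in ["gato", "cão", "casa"]}
vocab["gatos"] = vocab["gato"] + plural
vocab["cães"] = vocab["cão"] + plural

print(solve_analogy("gato", "gatos", "cão", vocab))  # -> "cães"
```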
arXiv Detail & Related papers (2020-01-21T14:39:20Z)