Exploring the Small World of Word Embeddings: A Comparative Study on Conceptual Spaces from LLMs of Different Scales
- URL: http://arxiv.org/abs/2502.11380v1
- Date: Mon, 17 Feb 2025 02:52:07 GMT
- Title: Exploring the Small World of Word Embeddings: A Comparative Study on Conceptual Spaces from LLMs of Different Scales
- Authors: Zhu Liu, Ying Liu, Kangyang Luo, Cunliang Kong, Maosong Sun
- Abstract summary: A conceptual space represents concepts as nodes and semantic relatedness as edges. We construct a conceptual space using word embeddings from large language models of varying scales. We analyze conceptual pairs, WordNet relations, and a cross-lingual semantic network for qualitative words.
- Score: 47.52062992606549
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: A conceptual space represents concepts as nodes and semantic relatedness as edges. Word embeddings, combined with a similarity metric, provide an effective approach to constructing such a space. Typically, embeddings are derived from traditional distributed models or encoder-only pretrained models, whose objectives directly capture the meaning of the current token. In contrast, decoder-only models, including large language models (LLMs), predict the next token, making their embeddings less directly tied to the current token's semantics. Moreover, comparative studies on LLMs of different scales remain underexplored. In this paper, we construct a conceptual space using word embeddings from LLMs of varying scales and comparatively analyze their properties. We establish a network based on a linguistic typology-inspired connectivity hypothesis, examine global statistical properties, and compare LLMs of varying scales. Locally, we analyze conceptual pairs, WordNet relations, and a cross-lingual semantic network for qualitative words. Our results indicate that the constructed space exhibits small-world properties, characterized by a high clustering coefficient and short path lengths. Larger LLMs generate more intricate spaces, with longer paths reflecting richer relational structures and connections. Furthermore, the network serves as an efficient bridge for cross-lingual semantic mapping.
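The recipe in the abstract, concepts as nodes and embedding similarity as edges, followed by standard small-world diagnostics, is concrete enough to sketch. The snippet below is a minimal illustration rather than the paper's implementation: it assumes embeddings arrive as a word-to-vector dictionary and uses a simple k-nearest-neighbour rule in place of the paper's typology-inspired connectivity hypothesis, then reports the two indicators named in the abstract (average clustering coefficient and average shortest path length) via networkx. The function names and the choice of k are illustrative only.

```python
import numpy as np
import networkx as nx

def build_conceptual_graph(embeddings: dict[str, np.ndarray], k: int = 5) -> nx.Graph:
    """Connect each word to its k nearest neighbours by cosine similarity."""
    words = list(embeddings)
    X = np.stack([embeddings[w] for w in words])
    X = X / np.linalg.norm(X, axis=1, keepdims=True)   # unit-normalise so dot products are cosines
    sims = X @ X.T
    G = nx.Graph()
    G.add_nodes_from(words)
    for i, w in enumerate(words):
        for j in np.argsort(-sims[i])[1:k + 1]:         # index 0 is the word itself, so skip it
            G.add_edge(w, words[j], weight=float(sims[i, j]))
    return G

def small_world_stats(G: nx.Graph) -> tuple[float, float]:
    """Clustering coefficient and average shortest path length on the giant component."""
    giant = G.subgraph(max(nx.connected_components(G), key=len))
    return nx.average_clustering(giant), nx.average_shortest_path_length(giant)
```

In a small-world network one expects the clustering coefficient to be high relative to a random graph of the same density while the average path length remains short, which is exactly the pattern the abstract reports.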
Related papers
- Interpreting the Linear Structure of Vision-language Model Embedding Spaces [15.065914257384204]
We train and release sparse autoencoders (SAEs) on the embedding spaces of four vision-language models.
We find that SAEs are better at reconstructing the real embeddings, while also retaining the most sparsity.
We also show that the key commonly-activating concepts extracted by SAEs are remarkably stable across runs.
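The SAE recipe itself is simple enough to sketch. The PyTorch snippet below shows the generic construction (an overcomplete linear encoder with a ReLU, a linear decoder, and a reconstruction loss plus an L1 sparsity penalty) applied to a batch of embedding vectors; the dimensions, hyperparameters, and architecture released with that paper may well differ, and the toy data here is random.

```python
import torch
import torch.nn as nn

class SparseAutoencoder(nn.Module):
    """Overcomplete SAE: d-dim embeddings -> m sparse codes -> d-dim reconstruction."""
    def __init__(self, d: int, m: int):
        super().__init__()
        self.encoder = nn.Linear(d, m)
        self.decoder = nn.Linear(m, d)

    def forward(self, x: torch.Tensor) -> tuple[torch.Tensor, torch.Tensor]:
        z = torch.relu(self.encoder(x))   # non-negative, hopefully sparse codes
        return self.decoder(z), z

def sae_loss(x, x_hat, z, l1_coef: float = 1e-3) -> torch.Tensor:
    # reconstruction error plus an L1 penalty that pushes most code units to zero
    return torch.mean((x - x_hat) ** 2) + l1_coef * z.abs().mean()

# toy training step on random stand-in "embeddings"
sae = SparseAutoencoder(d=512, m=4096)
opt = torch.optim.Adam(sae.parameters(), lr=1e-3)
x = torch.randn(64, 512)
x_hat, z = sae(x)
opt.zero_grad()
sae_loss(x, x_hat, z).backward()
opt.step()
```

Each column of `sae.decoder.weight` can then be read as a candidate concept direction in the original embedding space, which is how commonly-activating concepts are typically inspected.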
arXiv Detail & Related papers (2025-04-16T01:40:06Z) - A Top-down Graph-based Tool for Modeling Classical Semantic Maps: A Crosslinguistic Case Study of Supplementary Adverbs [50.982315553104975]
Semantic map models (SMMs) construct a network-like conceptual space from cross-linguistic instances or forms. Most SMMs are manually built by human experts using bottom-up procedures. We propose a novel graph-based algorithm that automatically generates conceptual spaces and SMMs in a top-down manner.
arXiv Detail & Related papers (2024-12-02T12:06:41Z) - Tomato, Tomahto, Tomate: Measuring the Role of Shared Semantics among Subwords in Multilingual Language Models [88.07940818022468]
We take an initial step toward measuring the role of shared semantics among subwords in encoder-only multilingual language models (mLMs).
We form "semantic tokens" by merging the semantically similar subwords and their embeddings.
Inspection of the grouped subwords shows that they exhibit a wide range of semantic similarities.
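One plausible way to realise the "semantic token" construction is to cluster subword embeddings by cosine similarity and represent each cluster by its mean vector. The sketch below uses scikit-learn's agglomerative clustering as an illustrative grouping criterion; the paper's actual merging procedure is not necessarily this one, and the function name and threshold are assumptions.

```python
import numpy as np
from sklearn.cluster import AgglomerativeClustering

def merge_into_semantic_tokens(subwords, embeddings, distance_threshold=0.3):
    """Group subwords whose embeddings lie within a cosine-distance threshold
    and represent each group (a "semantic token") by its mean embedding."""
    X = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    labels = AgglomerativeClustering(
        n_clusters=None,                 # let the threshold decide the number of groups
        metric="cosine",                 # named `affinity` in scikit-learn < 1.2
        linkage="average",
        distance_threshold=distance_threshold,
    ).fit_predict(X)
    semantic_tokens = {}
    for label in np.unique(labels):
        members = [subwords[i] for i in np.where(labels == label)[0]]
        semantic_tokens[tuple(members)] = X[labels == label].mean(axis=0)
    return semantic_tokens
```

The distance_threshold controls how aggressively subwords are merged; it is a free parameter of this sketch, not a value taken from the paper.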
arXiv Detail & Related papers (2024-11-07T08:38:32Z) - Enhancing Modern Supervised Word Sense Disambiguation Models by Semantic
Lexical Resources [11.257738983764499]
Supervised models for Word Sense Disambiguation (WSD) currently yield state-of-the-art results on the most popular benchmarks.
We enhance "modern" supervised WSD models by exploiting two popular Semantic Lexical Resources (SLRs): WordNet and WordNet Domains.
We study the effect of different types of semantic features, investigating their interaction with local contexts encoded by means of mixtures of Word Embeddings or Recurrent Neural Networks.
arXiv Detail & Related papers (2024-02-20T13:47:51Z) - Autoencoder-Based Domain Learning for Semantic Communication with
Conceptual Spaces [1.7404865362620803]
We develop a framework for learning a domain of a conceptual space model using only the raw data with high-level property labels.
In experiments using the MNIST and CelebA datasets, we show that the domains learned using the framework maintain semantic similarity relations and possess interpretable dimensions.
arXiv Detail & Related papers (2024-01-29T21:08:33Z) - Vocabulary-Defined Semantics: Latent Space Clustering for Improving In-Context Learning [32.178931149612644]
In-context learning enables language models to adapt to downstream data or incorporate tasks using a few samples as demonstrations within the prompts.
However, the performance of in-context learning can be unstable depending on the quality, format, or order of demonstrations.
We propose a novel approach "vocabulary-defined semantics"
arXiv Detail & Related papers (2024-01-29T14:29:48Z) - Investigating semantic subspaces of Transformer sentence embeddings
through linear structural probing [2.5002227227256864]
We present experiments with semantic structural probing, a method for studying sentence-level representations.
We apply our method to language models from different families (encoder-only, decoder-only, encoder-decoder) and of different sizes in the context of two tasks.
We find that model families differ substantially in their performance and layer dynamics, but that the results are largely model-size invariant.
arXiv Detail & Related papers (2023-10-18T12:32:07Z) - Probing Large Language Models from A Human Behavioral Perspective [24.109080140701188]
Large Language Models (LLMs) have emerged as dominant foundational models in modern NLP.
The understanding of their prediction processes and internal mechanisms, such as feed-forward networks (FFN) and multi-head self-attention (MHSA), remains largely unexplored.
arXiv Detail & Related papers (2023-10-08T16:16:21Z) - Topological properties and organizing principles of semantic networks [3.8462776107938317]
We study the properties of semantic networks from ConceptNet, defined by 7 semantic relations from 11 different languages.
We find that semantic networks have universal basic properties: they are sparse, highly clustered, and many exhibit power-law degree distributions.
In some networks the connections are similarity-based, while in others the connections are more complementarity-based.
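The three reported properties (sparsity, high clustering, heavy-tailed degree distributions) are straightforward to check on any semantic network. The sketch below computes graph density and average clustering with networkx and gives a crude log-log estimate of a power-law exponent; a careful analysis would instead use maximum-likelihood fitting in the style of Clauset et al., and the function name here is illustrative.

```python
import numpy as np
import networkx as nx

def basic_network_properties(G: nx.Graph) -> dict:
    degrees = np.array([d for _, d in G.degree()])
    degrees = degrees[degrees > 0]
    # crude power-law exponent: slope of the log-log complementary CDF
    ks = np.sort(degrees)
    ccdf = 1.0 - np.arange(len(ks)) / len(ks)
    slope, _ = np.polyfit(np.log(ks), np.log(ccdf + 1e-12), 1)
    return {
        "density": nx.density(G),                    # sparse networks have density << 1
        "avg_clustering": nx.average_clustering(G),
        "powerlaw_exponent_estimate": 1.0 - slope,   # P(k) ~ k^-gamma, CCDF slope = -(gamma - 1)
    }
```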
arXiv Detail & Related papers (2023-04-24T11:12:21Z) - Imitation Learning-based Implicit Semantic-aware Communication Networks:
Multi-layer Representation and Collaborative Reasoning [68.63380306259742]
Despite their promising potential, semantic communications and semantic-aware networking are still in their infancy.
We propose a novel reasoning-based implicit semantic-aware communication network architecture that allows multiple tiers of CDC and edge servers to collaborate.
We introduce a new multi-layer representation of semantic information taking into consideration both the hierarchical structure of implicit semantics as well as the personalized inference preference of individual users.
arXiv Detail & Related papers (2022-10-28T13:26:08Z) - Guiding the PLMs with Semantic Anchors as Intermediate Supervision:
Towards Interpretable Semantic Parsing [57.11806632758607]
We propose to equip current pretrained language models with a hierarchical decoder network.
By taking the first-principle structures as the semantic anchors, we propose two novel intermediate supervision tasks.
We conduct intensive experiments on several semantic parsing benchmarks and demonstrate that our approach can consistently outperform the baselines.
arXiv Detail & Related papers (2022-10-04T07:27:29Z) - TeKo: Text-Rich Graph Neural Networks with External Knowledge [75.91477450060808]
We propose a novel text-rich graph neural network with external knowledge (TeKo).
We first present a flexible heterogeneous semantic network that incorporates high-quality entities.
We then introduce two types of external knowledge, namely structured triplets and unstructured entity descriptions.
arXiv Detail & Related papers (2022-06-15T02:33:10Z) - Always Keep your Target in Mind: Studying Semantics and Improving
Performance of Neural Lexical Substitution [124.99894592871385]
We present a large-scale comparative study of lexical substitution methods employing both old and most recent language models.
We show that the already competitive results achieved by SOTA LMs/MLMs can be further improved substantially if information about the target word is injected properly.
arXiv Detail & Related papers (2022-06-07T16:16:19Z) - Integrating Language Guidance into Vision-based Deep Metric Learning [78.18860829585182]
We propose to learn metric spaces which encode semantic similarities as embedding space distances.
These spaces should be transferable to classes beyond those seen during training.
Relying solely on binary class labels, however, ignores higher-level semantic relations between classes, causing learned embedding spaces to encode incomplete semantic context and misrepresent the semantic relation between classes.
arXiv Detail & Related papers (2022-03-16T11:06:50Z) - Text analysis and deep learning: A network approach [0.0]
We propose a novel method that combines transformer models with network analysis to form a self-referential representation of language use within a corpus of interest.
Our approach produces linguistic relations strongly consistent with the underlying model as well as mathematically well-defined operations on them.
It represents, to the best of our knowledge, the first unsupervised method to extract semantic networks directly from deep language models.
arXiv Detail & Related papers (2021-10-08T14:18:36Z) - Learning low-rank latent mesoscale structures in networks [1.1470070927586016]
We present a new approach for describing low-rank mesoscale structures in networks.
We use several synthetic network models and empirical friendship, collaboration, and protein-protein interaction (PPI) networks.
We show how to denoise a corrupted network by using only the latent motifs that one learns directly from the corrupted network.
arXiv Detail & Related papers (2021-02-13T18:54:49Z) - Model Choices Influence Attributive Word Associations: A Semi-supervised
Analysis of Static Word Embeddings [0.0]
This work aims to assess attributive word associations across five different static word embedding architectures.
Our results reveal that the choice of the context learning flavor during embedding training (CBOW vs skip-gram) impacts the word association distinguishability and word embeddings' sensitivity to deviations in the training corpora.
arXiv Detail & Related papers (2020-12-14T22:27:18Z) - Concept Learners for Few-Shot Learning [76.08585517480807]
We propose COMET, a meta-learning method that improves generalization ability by learning to learn along human-interpretable concept dimensions.
We evaluate our model on few-shot tasks from diverse domains, including fine-grained image classification, document categorization and cell type annotation.
arXiv Detail & Related papers (2020-07-14T22:04:17Z)
This list is automatically generated from the titles and abstracts of the papers on this site.