Probing Taxonomic and Thematic Embeddings for Taxonomic Information
- URL: http://arxiv.org/abs/2301.10656v1
- Date: Wed, 25 Jan 2023 15:59:26 GMT
- Title: Probing Taxonomic and Thematic Embeddings for Taxonomic Information
- Authors: Filip Klubička and John D. Kelleher
- Abstract summary: Modelling taxonomic and thematic relatedness is important for building AI with comprehensive natural language understanding.
We design a new hypernym-hyponym probing task and perform a comparative probing study of taxonomic and thematic SGNS and GloVe embeddings.
Experiments indicate that both types of embeddings encode some taxonomic information, but the amount, as well as the geometric properties of the encodings, are independently related to both the encoder architecture and the embedding training data.
- Score: 2.9874726192215157
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Modelling taxonomic and thematic relatedness is important for building AI
with comprehensive natural language understanding. The goal of this paper is to
learn more about how taxonomic information is structurally encoded in
embeddings. To do this, we design a new hypernym-hyponym probing task and
perform a comparative probing study of taxonomic and thematic SGNS and GloVe
embeddings. Our experiments indicate that both types of embeddings encode some
taxonomic information, but the amount, as well as the geometric properties of
the encodings, are independently related to both the encoder architecture and
the embedding training data. Specifically, we find that only taxonomic
embeddings carry taxonomic information in their norm, which is determined by
the underlying distribution in the data.
Related papers
- Refining Wikidata Taxonomy using Large Language Models [2.392329079182226]
We present WiKC, a new version of Wikidata taxonomy cleaned automatically using a combination of Large Language Models (LLMs) and graph mining techniques.
Operations on the taxonomy, such as cutting links or merging classes, are performed with the help of zero-shot prompting on an open-source LLM.
arXiv Detail & Related papers (2024-09-06T06:53:45Z)
- TaBIIC: Taxonomy Building through Iterative and Interactive Clustering [2.817412580574242]
In this paper, we explore a method that takes inspiration from both approaches in an iterative and interactive process.
We show that this method is applicable to a variety of data sources and leads to taxonomies that can be more directly integrated into an ontology.
arXiv Detail & Related papers (2023-12-10T12:17:43Z)
- Disentanglement via Latent Quantization [60.37109712033694]
In this work, we construct an inductive bias towards encoding to and decoding from an organized latent space.
We demonstrate the broad applicability of this approach by adding it to both basic data-reconstructing (vanilla autoencoder) and latent-reconstructing (InfoGAN) generative models.
arXiv Detail & Related papers (2023-05-28T06:30:29Z)
- Neurosymbolic AI and its Taxonomy: a survey [48.7576911714538]
Neurosymbolic AI deals with models that combine symbolic processing, like classic AI, and neural networks.
This survey investigates research papers in this area during recent years and brings classification and comparison between the presented models as well as applications.
arXiv Detail & Related papers (2023-05-12T19:51:13Z)
- Towards the Linear Algebra Based Taxonomy of XAI Explanations [0.0]
Methods of Explainable Artificial Intelligence (XAI) were developed to answer the question why a certain prediction or estimation was made.
XAI taxonomies proposed in the literature mainly concentrate on distinguishing explanations with respect to the involvement of the human agent.
This paper proposes a simple linear algebra-based taxonomy for local explanations.
arXiv Detail & Related papers (2023-01-30T18:21:27Z)
- Taxonomy Enrichment with Text and Graph Vector Representations [61.814256012166794]
We address the problem of taxonomy enrichment which aims at adding new words to the existing taxonomy.
We present a new method that allows achieving high results on this task with little effort.
We achieve state-of-the-art results across different datasets and provide an in-depth error analysis of mistakes.
arXiv Detail & Related papers (2022-01-21T09:01:12Z)
- TaxoCom: Topic Taxonomy Completion with Hierarchical Discovery of Novel Topic Clusters [57.59286394188025]
We propose a novel framework for topic taxonomy completion, named TaxoCom.
TaxoCom discovers novel sub-topic clusters of terms and documents.
Our comprehensive experiments on two real-world datasets demonstrate that TaxoCom generates a high-quality topic taxonomy in terms of term coherency and topic coverage.
arXiv Detail & Related papers (2022-01-18T07:07:38Z)
- Large-scale Taxonomy Induction Using Entity and Word Embeddings [13.30719395448771]
We propose TIEmb, an approach for automatic subsumption extraction from knowledge bases using entity and text embeddings.
We apply the approach to the WebIsA database, a database of class subsumption relations extracted from a large portion of the World Wide Web, to extract hierarchies in the Person and Place domains.
arXiv Detail & Related papers (2021-05-04T05:53:12Z)
- A Theory of Usable Information Under Computational Constraints [103.5901638681034]
We propose a new framework for reasoning about information in complex systems.
Our foundation is based on a variational extension of Shannon's information theory.
We show that by incorporating computational constraints, $\mathcal{V}$-information can be reliably estimated from data.
arXiv Detail & Related papers (2020-02-25T06:09:30Z)
- Semantic Relatedness and Taxonomic Word Embeddings [2.47944699884651]
We show that there are different types of semantic relatedness and that different lexical representations encode different forms of relatedness.
We present experiments that analyse taxonomic embeddings that have been trained on a synthetic corpus that has been generated via a random walk over a taxonomy.
We explore the interactions between the relative sizes of natural and synthetic corpora on the performance of embeddings when taxonomic and thematic embeddings are combined.
arXiv Detail & Related papers (2020-02-14T20:02:11Z)
- TaxoExpan: Self-supervised Taxonomy Expansion with Position-Enhanced Graph Neural Network [62.12557274257303]
Taxonomies consist of machine-interpretable semantics and provide valuable knowledge for many web applications.
We propose a novel self-supervised framework, named TaxoExpan, which automatically generates a set of <query concept, anchor concept> pairs from the existing taxonomy as training data.
We develop two innovative techniques in TaxoExpan: (1) a position-enhanced graph neural network that encodes the local structure of an anchor concept in the existing taxonomy, and (2) a noise-robust training objective that enables the learned model to be insensitive to the label noise in the self-supervision data.
arXiv Detail & Related papers (2020-01-26T21:30:21Z)