TaBIIC: Taxonomy Building through Iterative and Interactive Clustering
- URL: http://arxiv.org/abs/2312.05866v1
- Date: Sun, 10 Dec 2023 12:17:43 GMT
- Title: TaBIIC: Taxonomy Building through Iterative and Interactive Clustering
- Authors: Mathieu d'Aquin
- Abstract summary: In this paper, we explore a method that takes inspiration from both approaches in an iterative and interactive process.
We show that this method is applicable to a variety of data sources and leads to taxonomies that can be more directly integrated into an ontology.
- Score: 2.817412580574242
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Building taxonomies is often a significant part of building an ontology, and
many attempts have been made to automate the creation of such taxonomies from
relevant data. The idea in such approaches is either that relevant definitions
of the intension of concepts can be extracted as patterns in the data (e.g. in
formal concept analysis) or that their extension can be built from grouping
data objects based on similarity (clustering). In both cases, the process leads
to an automatically constructed structure, which can either be too coarse and
lacking in definition, or too fine-grained and detailed, and which therefore needs
to be refined into the desired taxonomy. In this paper, we explore a method
that takes inspiration from both approaches in an iterative and interactive
process, so that refinement and definition of the concepts in the taxonomy
occur at the time of identifying those concepts in the data. We show that this
method is applicable to a variety of data sources and leads to taxonomies that
can be more directly integrated into ontologies.
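The loop described in the abstract can be pictured with a minimal sketch. This is not the paper's algorithm: the 2-means split, the min/max range "definitions", and every name below (`describe`, `split_in_two`, `build`, `reviewer`, the toy `animals` data) are assumptions made for illustration only. It shows one way that grouping objects (extension), summarising each group as a pattern (intension), and asking a person to accept and name each split could be interleaved.

```python
from typing import Callable, Dict, List, Optional, Tuple

Record = Dict[str, float]
Ranges = Dict[str, Tuple[float, float]]
# Hypothetical reviewer: given the parent name and two candidate definitions,
# return names for the two sub-concepts, or None to reject the split.
Reviewer = Callable[[str, Ranges, Ranges], Optional[Tuple[str, str]]]


def describe(records: List[Record]) -> Ranges:
    """Intension sketch: min/max range of each attribute over the group."""
    keys = records[0].keys()
    return {k: (min(r[k] for r in records), max(r[k] for r in records)) for k in keys}


def split_in_two(records: List[Record], rounds: int = 10) -> Tuple[List[Record], List[Record]]:
    """Extension sketch: a tiny deterministic 2-means split."""
    keys = sorted(records[0].keys())
    ordered = sorted(records, key=lambda r: [r[k] for k in keys])
    centers = [ordered[0], ordered[-1]]  # seeds: smallest and largest record
    groups: Tuple[List[Record], List[Record]] = ([], [])
    for _ in range(rounds):
        groups = ([], [])
        for r in records:
            dist = [sum((r[k] - c[k]) ** 2 for k in keys) for c in centers]
            groups[0 if dist[0] <= dist[1] else 1].append(r)
        if not groups[0] or not groups[1]:
            break  # degenerate split, nothing more to refine
        centers = [{k: sum(r[k] for r in g) / len(g) for k in keys} for g in groups]
    return groups


def build(records: List[Record], name: str, review: Reviewer,
          parent: Optional[str] = None, taxonomy: Optional[dict] = None) -> dict:
    """Record the concept, propose a split, and recurse only if the reviewer accepts."""
    if taxonomy is None:
        taxonomy = {}
    taxonomy[name] = {"parent": parent, "definition": describe(records)}
    if len(records) < 2:
        return taxonomy
    left, right = split_in_two(records)
    if not left or not right:
        return taxonomy
    # Interactive step: the reviewer sees both candidate definitions.
    names = review(name, describe(left), describe(right))
    if names is None:
        return taxonomy
    build(left, names[0], review, name, taxonomy)
    build(right, names[1], review, name, taxonomy)
    return taxonomy


# Toy data and a scripted reviewer standing in for a human.
animals = [
    {"legs": 2, "weight_kg": 0.02},
    {"legs": 2, "weight_kg": 0.03},
    {"legs": 4, "weight_kg": 30.0},
    {"legs": 4, "weight_kg": 35.0},
]


def reviewer(parent: str, left: Ranges, right: Ranges) -> Optional[Tuple[str, str]]:
    return ("Light", "Heavy") if parent == "Animal" else None


taxonomy = build(animals, "Animal", reviewer)
```

In this sketch the reviewer accepts only the first split, so the result holds three concepts, each with a parent link and a range-based definition recorded at the moment the concept was identified.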
Related papers
- Creating a Fine Grained Entity Type Taxonomy Using LLMs [0.0]
This study investigates the potential of GPT-4 and its advanced iteration, GPT-4 Turbo, in autonomously developing a detailed entity type taxonomy.
Our objective is to construct a comprehensive taxonomy, starting from a broad classification of entity types.
This classification is then progressively refined through iterative prompting techniques, leveraging GPT-4's internal knowledge base.
arXiv Detail & Related papers (2024-02-19T21:32:19Z)
- To Classify is to Interpret: Building Taxonomies from Heterogeneous Data through Human-AI Collaboration [0.39160947065896795]
We explore how taxonomy building can be supported with systems that integrate machine learning (ML).
We propose an approach that allows the user to iteratively take into account multiple models' outputs as part of their sensemaking process.
arXiv Detail & Related papers (2023-07-31T08:24:29Z)
- Taxonomy Enrichment with Text and Graph Vector Representations [61.814256012166794]
We address the problem of taxonomy enrichment, which aims at adding new words to an existing taxonomy.
We present a new method that achieves high results on this task with little effort.
We achieve state-of-the-art results across different datasets and provide an in-depth error analysis.
arXiv Detail & Related papers (2022-01-21T09:01:12Z)
- TaxoCom: Topic Taxonomy Completion with Hierarchical Discovery of Novel Topic Clusters [57.59286394188025]
We propose a novel framework for topic taxonomy completion, named TaxoCom.
TaxoCom discovers novel sub-topic clusters of terms and documents.
Our comprehensive experiments on two real-world datasets demonstrate that TaxoCom generates a high-quality topic taxonomy in terms of term coherency and topic coverage.
arXiv Detail & Related papers (2022-01-18T07:07:38Z)
- Large-scale Taxonomy Induction Using Entity and Word Embeddings [13.30719395448771]
We propose TIEmb, an approach for automatic subsumption extraction from knowledge using entity and text embeddings.
We apply the approach to the WebIsA database, a database of class subsumption relations extracted from a large portion of the World Wide Web, to extract hierarchies in the Person and Place domains.
arXiv Detail & Related papers (2021-05-04T05:53:12Z)
- Who Should Go First? A Self-Supervised Concept Sorting Model for Improving Taxonomy Expansion [50.794640012673064]
As data and business scope grow in real applications, existing taxonomies need to be expanded to incorporate new concepts.
Previous works on taxonomy expansion process the new concepts independently and simultaneously, ignoring the potential relationships among them and the appropriate order of inserting operations.
We propose TaxoOrder, a novel self-supervised framework that simultaneously discovers the local hypernym-hyponym structure among new concepts and decides the order of insertion.
arXiv Detail & Related papers (2021-04-08T11:00:43Z)
- CoRel: Seed-Guided Topical Taxonomy Construction by Concept Learning and Relation Transferring [37.1330815281983]
We propose a method for seed-guided topical taxonomy construction, which takes a corpus and a seed taxonomy described by concept names as input.
A relation transferring module learns the relation the user is interested in and transfers it along multiple paths to expand the seed taxonomy structure in width and depth.
A concept learning module enriches the semantics of each concept node by jointly embedding the taxonomy.
arXiv Detail & Related papers (2020-10-13T22:00:31Z)
- Octet: Online Catalog Taxonomy Enrichment with Self-Supervision [67.26804972901952]
We present a self-supervised end-to-end framework, Octet, for Online Catalog Taxonomy EnrichmenT.
We propose to train a sequence labeling model for term extraction and employ graph neural networks (GNNs) to capture the taxonomy structure.
In the open-world evaluation, Octet enriches an online catalog in production to twice its original size.
arXiv Detail & Related papers (2020-06-18T04:53:07Z)
- Petri Nets with Parameterised Data: Modelling and Verification (Extended Version) [67.99023219822564]
We introduce and study an extension of coloured Petri nets, called catalog-nets, providing two key features to capture this type of processes.
We show that fresh-value injection is a particularly complex feature to handle, and discuss strategies to tame it.
arXiv Detail & Related papers (2020-06-11T17:26:08Z)
- TaxoExpan: Self-supervised Taxonomy Expansion with Position-Enhanced Graph Neural Network [62.12557274257303]
Taxonomies consist of machine-interpretable semantics and provide valuable knowledge for many web applications.
We propose a novel self-supervised framework, named TaxoExpan, which automatically generates a set of <query concept, anchor concept> pairs from the existing taxonomy as training data.
We develop two innovative techniques in TaxoExpan: (1) a position-enhanced graph neural network that encodes the local structure of an anchor concept in the existing taxonomy, and (2) a noise-robust training objective that enables the learned model to be insensitive to the label noise in the self-supervision data.
arXiv Detail & Related papers (2020-01-26T21:30:21Z)
This list is automatically generated from the titles and abstracts of the papers in this site.