Related papers: CoRel: Seed-Guided Topical Taxonomy Construction by Concept Learning and Relation Transferring

URL: http://arxiv.org/abs/2010.06714v1
Date: Tue, 13 Oct 2020 22:00:31 GMT
Title: CoRel: Seed-Guided Topical Taxonomy Construction by Concept Learning and Relation Transferring
Authors: Jiaxin Huang, Yiqing Xie, Yu Meng, Yunyi Zhang, Jiawei Han
Abstract summary: We propose a method for seed-guided topical taxonomy construction, which takes a corpus and a seed taxonomy described by concept names as input. A relation transferring module learns and transfers the user's interested relation along multiple paths to expand the seed taxonomy structure in width and depth. A concept learning module enriches the semantics of each concept node by jointly embedding the taxonomy.
Score: 37.1330815281983
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Taxonomy is not only a fundamental form of knowledge representation, but also crucial to vast knowledge-rich applications, such as question answering and web search. Most existing taxonomy construction methods extract hypernym-hyponym entity pairs to organize a "universal" taxonomy. However, these generic taxonomies cannot satisfy users' specific interests in certain areas and relations. Moreover, the nature of instance taxonomy treats each node as a single word, which has low semantic coverage. In this paper, we propose a method for seed-guided topical taxonomy construction, which takes a corpus and a seed taxonomy described by concept names as input, and constructs a more complete taxonomy based on the user's interest, wherein each node is represented by a cluster of coherent terms. Our framework, CoRel, has two modules to fulfill this goal. A relation transferring module learns and transfers the user's interested relation along multiple paths to expand the seed taxonomy structure in width and depth. A concept learning module enriches the semantics of each concept node by jointly embedding the taxonomy and text. Comprehensive experiments conducted on real-world datasets show that CoRel generates high-quality topical taxonomies and outperforms all the baselines significantly.

Related papers

CodeTaxo: Enhancing Taxonomy Expansion with Limited Examples via Code Language Prompts [40.52605902842168]
CodeTaxo is a novel approach that leverages large language models through code language prompts to capture the taxonomic structure. Experiments on five real-world benchmarks from different domains demonstrate that CodeTaxo consistently achieves superior performance across all evaluation metrics.
arXiv Detail & Related papers (2024-08-17T02:15:07Z)
TaBIIC: Taxonomy Building through Iterative and Interactive Clustering [2.817412580574242]
In this paper, we explore a method that takes inspiration from both approaches in an iterative and interactive process. We show that this method is applicable to a variety of data sources and leads to taxonomies that can be more directly integrated into an ontology.
arXiv Detail & Related papers (2023-12-10T12:17:43Z)
TaxoEnrich: Self-Supervised Taxonomy Completion via Structure-Semantic Representations [28.65753036636082]
We propose a new taxonomy completion framework, which effectively leverages both semantic features and structural information in the existing taxonomy. TaxoEnrich consists of four components, including (1) a taxonomy-contextualized embedding which incorporates both the semantic meanings of concepts and taxonomic relations based on powerful pretrained language models, and (2) a taxonomy-aware sequential encoder which learns candidate position representations by encoding the structural information of the taxonomy. Experiments on four large real-world datasets from different domains show that TaxoEnrich achieves the best performance across all evaluation metrics and outperforms the previous state of the art by a large margin.
arXiv Detail & Related papers (2022-02-10T08:10:43Z)
TaxoCom: Topic Taxonomy Completion with Hierarchical Discovery of Novel Topic Clusters [57.59286394188025]
We propose a novel framework for topic taxonomy completion, named TaxoCom. TaxoCom discovers novel sub-topic clusters of terms and documents. Our comprehensive experiments on two real-world datasets demonstrate that TaxoCom generates a high-quality topic taxonomy in terms of term coherency and topic coverage.
arXiv Detail & Related papers (2022-01-18T07:07:38Z)
Who Should Go First? A Self-Supervised Concept Sorting Model for Improving Taxonomy Expansion [50.794640012673064]
As data and business scope grow in real applications, existing taxonomies need to be expanded to incorporate new concepts. Previous works on taxonomy expansion process the new concepts independently and simultaneously, ignoring the potential relationships among them and the appropriate order of insertion operations. We propose TaxoOrder, a novel self-supervised framework that simultaneously discovers the local hypernym-hyponym structure among new concepts and decides the order of insertion.
arXiv Detail & Related papers (2021-04-08T11:00:43Z)
Octet: Online Catalog Taxonomy Enrichment with Self-Supervision [67.26804972901952]
We present a self-supervised end-to-end framework, Octet, for Online Catalog Taxonomy EnrichmenT. We propose to train a sequence labeling model for term extraction and employ graph neural networks (GNNs) to capture the taxonomy structure. Octet enriches an online catalog in production to twice its original size in the open-world evaluation.
arXiv Detail & Related papers (2020-06-18T04:53:07Z)
STEAM: Self-Supervised Taxonomy Expansion with Mini-Paths [53.45704816829921]
We propose a self-supervised taxonomy expansion model named STEAM. STEAM generates natural self-supervision signals, and formulates a node attachment prediction task. Experiments show STEAM outperforms state-of-the-art methods for taxonomy expansion by 11.6% in accuracy and 7.0% in mean reciprocal rank.
arXiv Detail & Related papers (2020-06-18T00:32:53Z)
TaxoExpan: Self-supervised Taxonomy Expansion with Position-Enhanced Graph Neural Network [62.12557274257303]
Taxonomies consist of machine-interpretable semantics and provide valuable knowledge for many web applications. We propose a novel self-supervised framework, named TaxoExpan, which automatically generates a set of <query concept, anchor concept> pairs from the existing taxonomy as training data. We develop two innovative techniques in TaxoExpan: (1) a position-enhanced graph neural network that encodes the local structure of an anchor concept in the existing taxonomy, and (2) a noise-robust training objective that enables the learned model to be insensitive to the label noise in the self-supervision data.
arXiv Detail & Related papers (2020-01-26T21:30:21Z)
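As a rough illustration of the self-supervision idea in the TaxoExpan summary above, the sketch below derives (query concept, anchor concept) pairs from an existing taxonomy. It assumes, for illustration only, that each existing child concept acts as a query and its parent as the anchor; the function name `self_supervision_pairs` and the toy taxonomy are hypothetical and are not taken from the paper.

```python
from typing import Dict, List, Tuple


def self_supervision_pairs(parent_of: Dict[str, str]) -> List[Tuple[str, str]]:
    """Build (query concept, anchor concept) training pairs.

    Assumption for this sketch: every non-root concept in the existing
    taxonomy is treated as a query, and its current parent is the anchor
    it should attach to.
    """
    return sorted((child, parent) for child, parent in parent_of.items())


# Hypothetical toy taxonomy, stored as a child -> parent mapping.
toy_taxonomy = {
    "deep learning": "machine learning",
    "machine learning": "computer science",
    "databases": "computer science",
}

pairs = self_supervision_pairs(toy_taxonomy)
for query, anchor in pairs:
    print(f"<{query}, {anchor}>")
```

A real system would also need negative pairs and a learned matching model; this only shows how an existing hierarchy can supply positive examples without manual labels.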

This list is automatically generated from the titles and abstracts of the papers on this site.