TaxoExpan: Self-supervised Taxonomy Expansion with Position-Enhanced
Graph Neural Network
- URL: http://arxiv.org/abs/2001.09522v1
- Date: Sun, 26 Jan 2020 21:30:21 GMT
- Title: TaxoExpan: Self-supervised Taxonomy Expansion with Position-Enhanced
Graph Neural Network
- Authors: Jiaming Shen, Zhihong Shen, Chenyan Xiong, Chi Wang, Kuansan Wang,
Jiawei Han
- Abstract summary: Taxonomies consist of machine-interpretable semantics and provide valuable knowledge for many web applications.
We propose a novel self-supervised framework, named TaxoExpan, which automatically generates a set of <query concept, anchor concept> pairs from the existing taxonomy as training data.
We develop two innovative techniques in TaxoExpan: (1) a position-enhanced graph neural network that encodes the local structure of an anchor concept in the existing taxonomy, and (2) a noise-robust training objective that enables the learned model to be insensitive to the label noise in the self-supervision data.
- Score: 62.12557274257303
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Taxonomies consist of machine-interpretable semantics and provide valuable
knowledge for many web applications. For example, online retailers (e.g.,
Amazon and eBay) use taxonomies for product recommendation, and web search
engines (e.g., Google and Bing) leverage taxonomies to enhance query
understanding. Enormous efforts have been made on constructing taxonomies
either manually or semi-automatically. However, with the fast-growing volume of
web content, existing taxonomies will become outdated and fail to capture
emerging knowledge. Therefore, in many applications, dynamic expansions of an
existing taxonomy are in great demand. In this paper, we study how to expand an
existing taxonomy by adding a set of new concepts. We propose a novel
self-supervised framework, named TaxoExpan, which automatically generates a set
of <query concept, anchor concept> pairs from the existing taxonomy as training
data. Using such self-supervision data, TaxoExpan learns a model to predict
whether a query concept is the direct hyponym of an anchor concept. We develop
two innovative techniques in TaxoExpan: (1) a position-enhanced graph neural
network that encodes the local structure of an anchor concept in the existing
taxonomy, and (2) a noise-robust training objective that enables the learned
model to be insensitive to the label noise in the self-supervision data.
Extensive experiments on three large-scale datasets from different domains
demonstrate both the effectiveness and the efficiency of TaxoExpan for taxonomy
expansion.
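
The self-supervision step described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the toy taxonomy, the function names `all_nodes` and `make_training_instances`, and the uniform negative-sampling scheme are all assumptions made for illustration. The idea shown is only that every existing parent-child edge yields a positive <query concept, anchor concept> pair, and that other nodes can serve as negative anchors.

```python
import random
from typing import Dict, List, Tuple

# Hypothetical toy taxonomy: parent concept -> direct children (hyponyms).
TAXONOMY: Dict[str, List[str]] = {
    "science": ["computer science", "physics"],
    "computer science": ["machine learning", "databases"],
    "machine learning": ["deep learning"],
}


def all_nodes(taxonomy: Dict[str, List[str]]) -> List[str]:
    """Collect every concept that appears as a parent or a child."""
    nodes = set(taxonomy)
    for children in taxonomy.values():
        nodes.update(children)
    return sorted(nodes)


def make_training_instances(
    taxonomy: Dict[str, List[str]],
    num_negatives: int,
    seed: int = 0,
) -> List[Tuple[str, str, List[str]]]:
    """Build (query, positive_anchor, negative_anchors) triples.

    Each existing edge parent -> child gives one positive pair
    <query=child, anchor=parent>. Negatives are sampled uniformly from
    the remaining nodes (an assumption of this sketch); some of them may
    in fact be reasonable anchors, which is the kind of label noise a
    noise-robust objective has to tolerate.
    """
    rng = random.Random(seed)
    nodes = all_nodes(taxonomy)
    instances = []
    for parent, children in taxonomy.items():
        for child in children:
            candidates = [n for n in nodes if n != child and n != parent]
            k = min(num_negatives, len(candidates))
            negatives = rng.sample(candidates, k)
            instances.append((child, parent, negatives))
    return instances


if __name__ == "__main__":
    for query, anchor, negatives in make_training_instances(TAXONOMY, 2):
        print(f"query={query!r} anchor={anchor!r} negatives={negatives}")
```

A model trained on such triples would score the positive anchor above the sampled negatives for each query; how the anchors are encoded and which loss is used are outside the scope of this sketch.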
Related papers
- Towards Visual Taxonomy Expansion [50.462998483087915]
We propose Visual Taxonomy Expansion (VTE), introducing visual features into the taxonomy expansion task.
We propose a textual hypernymy learning task and a visual prototype learning task to cluster textual and visual semantics.
Our method is evaluated on two datasets, where we obtain compelling results.
arXiv Detail & Related papers (2023-09-12T10:17:28Z)
- Learning What You Need from What You Did: Product Taxonomy Expansion
with User Behaviors Supervision [21.649258076884927]
We present a self-supervised and user behavior-oriented product expansion framework to append new concepts into existing taxonomy.
Our framework extracts hyponymy relations that conform to users' intentions and cognition.
Our method enlarges a real-world product taxonomy from 39,263 to 94,698 relations with 88% semantic precision.
arXiv Detail & Related papers (2022-03-28T17:17:50Z)
- TaxoEnrich: Self-Supervised Taxonomy Completion via Structure-Semantic
Representations [28.65753036636082]
We propose a new taxonomy completion framework, which effectively leverages both semantic features and structural information in the existing taxonomy.
TaxoEnrich consists of four components, including (1) a taxonomy-contextualized embedding, which incorporates both the semantic meanings of concepts and taxonomic relations based on powerful pretrained language models, and (2) a taxonomy-aware sequential encoder, which learns candidate position representations by encoding the structural information of the taxonomy.
Experiments on four large real-world datasets from different domains show that TaxoEnrich achieves the best performance among all evaluation metrics and outperforms previous state-of-the-art by a large margin.
arXiv Detail & Related papers (2022-02-10T08:10:43Z)
- Taxonomy Enrichment with Text and Graph Vector Representations [61.814256012166794]
We address the problem of taxonomy enrichment which aims at adding new words to the existing taxonomy.
We present a new method that allows achieving high results on this task with little effort.
We achieve state-of-the-art results across different datasets and provide an in-depth error analysis of mistakes.
arXiv Detail & Related papers (2022-01-21T09:01:12Z)
- Who Should Go First? A Self-Supervised Concept Sorting Model for
Improving Taxonomy Expansion [50.794640012673064]
As data and business scope grow in real applications, existing taxonomies need to be expanded to incorporate new concepts.
Previous works on taxonomy expansion process the new concepts independently and simultaneously, ignoring the potential relationships among them and the appropriate order of inserting operations.
We propose TaxoOrder, a novel self-supervised framework that simultaneously discovers the local hypernym-hyponym structure among new concepts and decides the order of insertion.
arXiv Detail & Related papers (2021-04-08T11:00:43Z)
- Taxonomy Completion via Triplet Matching Network [18.37146040410778]
We formulate a new task, "taxonomy completion", by discovering both the hypernym and hyponym concepts for a query.
We propose Triplet Matching Network (TMN) to find the appropriate <hypernym, hyponym> pairs for a given query concept.
TMN achieves the best performance on both taxonomy completion task and the previous taxonomy expansion task, outperforming existing methods.
arXiv Detail & Related papers (2021-01-06T07:19:55Z)
- Octet: Online Catalog Taxonomy Enrichment with Self-Supervision [67.26804972901952]
We present a self-supervised end-to-end framework, Octet, for Online Catalog Taxonomy EnrichmenT.
We propose to train a sequence labeling model for term extraction and employ graph neural networks (GNNs) to capture the taxonomy structure.
Octet enriches an online catalog in production to 2 times larger in the open-world evaluation.
arXiv Detail & Related papers (2020-06-18T04:53:07Z)
- STEAM: Self-Supervised Taxonomy Expansion with Mini-Paths [53.45704816829921]
We propose a self-supervised taxonomy expansion model named STEAM.
STEAM generates natural self-supervision signals, and formulates a node attachment prediction task.
Experiments show STEAM outperforms state-of-the-art methods for taxonomy expansion by 11.6% in accuracy and 7.0% in mean reciprocal rank.
arXiv Detail & Related papers (2020-06-18T00:32:53Z)
This list is automatically generated from the titles and abstracts of the papers in this site.