HyperMiner: Topic Taxonomy Mining with Hyperbolic Embedding
- URL: http://arxiv.org/abs/2210.10625v1
- Date: Sun, 16 Oct 2022 02:54:17 GMT
- Title: HyperMiner: Topic Taxonomy Mining with Hyperbolic Embedding
- Authors: Yishi Xu, Dongsheng Wang, Bo Chen, Ruiying Lu, Zhibin Duan, Mingyuan Zhou
- Abstract summary: We present a novel framework that introduces hyperbolic embeddings to represent words and topics.
With the tree-likeness property of hyperbolic space, the underlying semantic hierarchy can be better exploited to mine more interpretable topics.
- Score: 54.52651110749165
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Embedded topic models are able to learn interpretable topics even with large
and heavy-tailed vocabularies. However, they generally hold the Euclidean
embedding space assumption, leading to a basic limitation in capturing
hierarchical relations. To this end, we present a novel framework that
introduces hyperbolic embeddings to represent words and topics. With the
tree-likeness property of hyperbolic space, the underlying semantic hierarchy
among words and topics can be better exploited to mine more interpretable
topics. Furthermore, due to the superiority of hyperbolic geometry in
representing hierarchical data, tree-structure knowledge can also be naturally
injected to guide the learning of a topic hierarchy. Therefore, we further
develop a regularization term based on the idea of contrastive learning to
inject prior structural knowledge efficiently. Experiments on both topic
taxonomy discovery and document representation demonstrate that the proposed
framework achieves improved performance against existing embedded topic models.
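
The abstract's appeal to the tree-likeness of hyperbolic space can be illustrated with a small sketch. It assumes the Poincaré ball model of hyperbolic geometry, which the abstract itself does not name, and the function `poincare_distance` and the 2-D points are hypothetical illustrations rather than anything taken from the paper. Two points placed near the boundary in different directions end up almost as far apart as the path that passes through the origin, which is how distances behave between leaves of a tree.

```python
import math


def poincare_distance(u, v):
    """Geodesic distance between two points strictly inside the unit Poincare ball."""
    sq_u = sum(x * x for x in u)
    sq_v = sum(x * x for x in v)
    if sq_u >= 1.0 or sq_v >= 1.0:
        raise ValueError("points must lie strictly inside the unit ball")
    sq_diff = sum((a - b) ** 2 for a, b in zip(u, v))
    # d(u, v) = arcosh(1 + 2 * |u - v|^2 / ((1 - |u|^2) * (1 - |v|^2)))
    arg = 1.0 + 2.0 * sq_diff / ((1.0 - sq_u) * (1.0 - sq_v))
    return math.acosh(arg)


# Hypothetical embeddings: a general "root" topic at the origin and two
# specific "leaf" words close to the boundary (illustrative values only).
root = (0.0, 0.0)
leaf_a = (0.9, 0.0)
leaf_b = (0.0, 0.9)

d_root_a = poincare_distance(root, leaf_a)
d_a_b = poincare_distance(leaf_a, leaf_b)
euclid_a_b = math.dist(leaf_a, leaf_b)

print(f"root to leaf: {d_root_a:.3f}")
print(f"leaf to leaf (hyperbolic): {d_a_b:.3f}")
print(f"leaf to leaf (Euclidean): {euclid_a_b:.3f}")
```

In this sketch the hyperbolic leaf-to-leaf distance is much larger than the Euclidean one and approaches twice the root-to-leaf distance, so general concepts sit naturally near the origin and specific ones near the boundary.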
Related papers
- TopicGPT: A Prompt-based Topic Modeling Framework [77.72072691307811]
We introduce TopicGPT, a prompt-based framework that uses large language models to uncover latent topics in a text collection.
It produces topics that align better with human categorizations compared to competing methods.
Its topics are also interpretable, dispensing with ambiguous bags of words in favor of topics with natural language labels and associated free-form descriptions.
arXiv Detail & Related papers (2023-11-02T17:57:10Z)
- Topic Taxonomy Expansion via Hierarchy-Aware Topic Phrase Generation [58.3921103230647]
We propose a novel framework for topic taxonomy expansion, named TopicExpan.
TopicExpan directly generates topic-related terms belonging to new topics.
Experimental results on two real-world text corpora show that TopicExpan significantly outperforms other baseline methods in terms of the quality of output.
arXiv Detail & Related papers (2022-10-18T22:38:49Z)
- Knowledge-Aware Bayesian Deep Topic Model [50.58975785318575]
We propose a Bayesian generative model for incorporating prior domain knowledge into hierarchical topic modeling.
Our proposed model efficiently integrates the prior knowledge and improves both hierarchical topic discovery and document representation.
arXiv Detail & Related papers (2022-09-20T09:16:05Z)
- TaxoCom: Topic Taxonomy Completion with Hierarchical Discovery of Novel Topic Clusters [57.59286394188025]
We propose a novel framework for topic taxonomy completion, named TaxoCom.
TaxoCom discovers novel sub-topic clusters of terms and documents.
Our comprehensive experiments on two real-world datasets demonstrate that TaxoCom generates a high-quality topic taxonomy in terms of term coherency and topic coverage.
arXiv Detail & Related papers (2022-01-18T07:07:38Z)
- TopicNet: Semantic Graph-Guided Topic Discovery [51.71374479354178]
Existing deep hierarchical topic models are able to extract semantically meaningful topics from a text corpus in an unsupervised manner.
We introduce TopicNet as a deep hierarchical topic model that can inject prior structural knowledge as an inductive bias to influence learning.
arXiv Detail & Related papers (2021-10-27T09:07:14Z)
- HyperExpan: Taxonomy Expansion with Hyperbolic Representation Learning [24.080321524759455]
We present HyperExpan, a taxonomy expansion algorithm that learns to represent concepts and their relations with a Hyperbolic Graph Neural Network (HGNN).
Experiments show that our proposed HyperExpan outperforms baseline models with representation learning in a Euclidean feature space and achieves state-of-the-art performance on the taxonomy expansion benchmarks.
arXiv Detail & Related papers (2021-09-22T03:27:04Z)
- A Neural Generative Model for Joint Learning Topics and Topic-Specific Word Embeddings [42.87769996249732]
We propose a novel generative model that explores both local and global context to jointly learn topics and topic-specific word embeddings.
The trained model maps words to topic-dependent embeddings, which naturally addresses the issue of word polysemy.
arXiv Detail & Related papers (2020-08-11T13:54:11Z)
- Hierarchical Topic Mining via Joint Spherical Tree and Text Embedding [37.7780399311715]
Hierarchical Topic Mining aims to mine a set of representative terms for each category from a text corpus to help users comprehend their topics of interest.
Our model, named JoSH, mines a high-quality set of hierarchical topics with high efficiency and benefits weakly-supervised hierarchical text classification tasks.
arXiv Detail & Related papers (2020-07-18T23:30:47Z)