TAN-NTM: Topic Attention Networks for Neural Topic Modeling
- URL: http://arxiv.org/abs/2012.01524v1
- Date: Wed, 2 Dec 2020 20:58:04 GMT
- Title: TAN-NTM: Topic Attention Networks for Neural Topic Modeling
- Authors: Madhur Panwar, Shashank Shailabh, Milan Aggarwal, Balaji Krishnamurthy
- Abstract summary: We propose a novel framework, TAN-NTM, which models a document as a sequence of tokens instead of a BoW at the input layer.
We apply attention to the LSTM outputs to empower the model to attend to relevant words that convey topic-related cues.
TAN-NTM achieves state-of-the-art results, improving on the NPMI coherence scores of existing SOTA topic models by roughly 9-15 percentage points.
- Score: 8.631228373008478
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Topic models have been widely used to learn representations from text and
gain insight into document corpora. To perform topic discovery, existing neural
models use the document bag-of-words (BoW) representation as input, followed by
variational inference, and learn the topic-word distribution by reconstructing
the BoW. Such methods have mainly focused on analysing the effect of enforcing
suitable priors on the document distribution. However, little importance has
been given to encoding improved document features to capture document semantics
better. In this work, we propose a novel framework, TAN-NTM, which models a
document as a sequence of tokens instead of a BoW at the input layer and
processes it through an LSTM whose output is used to perform variational
inference followed by BoW decoding. We apply attention to the LSTM outputs to
empower the model to attend to relevant words that convey topic-related cues.
We hypothesise that attention can be performed effectively if done in a topic
guided manner and establish this empirically through ablations. We factor in
the topic-word distribution to perform topic-aware attention, achieving
state-of-the-art results with a ~9-15 percentage-point improvement over the
NPMI coherence scores of existing SOTA topic models on four benchmark
datasets: 20NewsGroup, Yelp, AGNews, and DBpedia. TAN-NTM also obtains better
document classification accuracy owing to learning improved document-topic
features. We qualitatively discuss how the attention mechanism enables
unsupervised discovery of keywords. Motivated by this, we further show that
our proposed framework achieves state-of-the-art performance on topic-aware
supervised generation of keyphrases on the StackExchange and Weibo datasets.
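To make the pipeline concrete, below is a minimal PyTorch sketch of the architecture described in the abstract: an LSTM over the token sequence, a topic-guided attention step whose query is derived from the learned topic-word matrix, variational inference over the attended context, and BoW decoding. All module and variable names (TopicAttentionNTM, query_proj, etc.) are illustrative assumptions, not the authors' implementation; in particular, building the query from the mean of the topic-word weights is one simple choice among many.

```python
# Minimal sketch of a TAN-NTM-style model (names are illustrative, not the
# authors' code): LSTM over tokens -> topic-guided attention -> VAE-style
# inference -> BoW reconstruction.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TopicAttentionNTM(nn.Module):
    def __init__(self, vocab_size, emb_dim=128, hid_dim=256, n_topics=50):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim, padding_idx=0)
        self.lstm = nn.LSTM(emb_dim, hid_dim, batch_first=True)
        # Topic-word matrix (vocab_size x n_topics): decodes theta into a
        # BoW distribution and is folded into the attention query below.
        self.topic_word = nn.Linear(n_topics, vocab_size, bias=False)
        self.query_proj = nn.Linear(n_topics, hid_dim)  # assumed projection
        self.to_mu = nn.Linear(hid_dim, n_topics)
        self.to_logvar = nn.Linear(hid_dim, n_topics)

    def forward(self, token_ids, bow):
        h, _ = self.lstm(self.embed(token_ids))            # (B, T, H)
        # Topic-aware query: derived here from the current topic-word
        # weights (a modeling assumption for this sketch).
        q = self.query_proj(self.topic_word.weight.mean(dim=0))   # (H,)
        alpha = F.softmax(h @ q, dim=1)                    # (B, T) word weights
        ctx = torch.bmm(alpha.unsqueeze(1), h).squeeze(1)  # (B, H) context
        mu, logvar = self.to_mu(ctx), self.to_logvar(ctx)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()   # reparameterize
        theta = F.softmax(z, dim=-1)                       # document-topic mix
        log_recon = F.log_softmax(self.topic_word(theta), dim=-1)
        nll = -(bow * log_recon).sum(1)                    # BoW reconstruction
        kl = -0.5 * (1 + logvar - mu.pow(2) - logvar.exp()).sum(1)
        return (nll + kl).mean(), theta, alpha

# Toy usage: random token batch and a matching binary BoW.
model = TopicAttentionNTM(vocab_size=2000)
ids = torch.randint(1, 2000, (4, 30))
bow = torch.zeros(4, 2000).scatter_(1, ids, 1.0)
loss, theta, attn = model(ids, bow)
loss.backward()
```

For reference, the NPMI coherence behind the headline numbers scores pairs of a topic's top words by normalized pointwise mutual information, with the probabilities estimated from co-occurrence counts in a reference corpus:

```latex
\mathrm{NPMI}(w_i, w_j) \;=\;
  \frac{\log \frac{P(w_i, w_j)}{P(w_i)\,P(w_j)}}{-\log P(w_i, w_j)}
  \;\in\; [-1, 1]
```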
Related papers
- Hypergraph based Understanding for Document Semantic Entity Recognition [65.84258776834524]
We build a novel hypergraph attention document semantic entity recognition framework, HGA, which uses hypergraph attention to focus on entity boundaries and entity categories at the same time.
Our results on FUNSD, CORD, XFUNDIE show that our method can effectively improve the performance of semantic entity recognition tasks.
arXiv Detail & Related papers (2024-07-09T14:35:49Z)
- Enhancing Visually-Rich Document Understanding via Layout Structure Modeling [91.07963806829237]
We propose GraphLM, a novel document understanding model that injects layout knowledge into the model.
We evaluate our model on various benchmarks, including FUNSD, XFUND and CORD, and achieve state-of-the-art results.
arXiv Detail & Related papers (2023-08-15T13:53:52Z)
- CWTM: Leveraging Contextualized Word Embeddings from BERT for Neural Topic Modeling [23.323587005085564]
We introduce a novel neural topic model called the Contextualized Word Topic Model (CWTM).
CWTM integrates contextualized word embeddings from BERT.
It is capable of learning the topic vector of a document without BoW information.
It can also derive the topic vectors for individual words within a document based on their contextualized word embeddings.
arXiv Detail & Related papers (2023-05-16T10:07:33Z) - Knowledge-Aware Bayesian Deep Topic Model [50.58975785318575]
We propose a Bayesian generative model for incorporating prior domain knowledge into hierarchical topic modeling.
Our proposed model efficiently integrates the prior knowledge and improves both hierarchical topic discovery and document representation.
arXiv Detail & Related papers (2022-09-20T09:16:05Z) - Topic Discovery via Latent Space Clustering of Pretrained Language Model
Representations [35.74225306947918]
We propose a joint latent space learning and clustering framework built upon PLM embeddings.
Our model effectively leverages the strong representational power and rich linguistic features of PLMs for topic discovery.
arXiv Detail & Related papers (2022-02-09T17:26:08Z) - Keyword Assisted Embedded Topic Model [1.9000421840914223]
Probabilistic topic models describe how words in documents are generated via a set of latent distributions called topics.
Recently, the Embedded Topic Model (ETM) has extended LDA to utilize the semantic information in word embeddings to derive semantically richer topics.
We propose the Keyword Assisted Embedded Topic Model (KeyETM), which equips ETM with the ability to incorporate user knowledge in the form of informative topic-level priors.
arXiv Detail & Related papers (2021-11-22T07:27:17Z) - TopicNet: Semantic Graph-Guided Topic Discovery [51.71374479354178]
Existing deep hierarchical topic models are able to extract semantically meaningful topics from a text corpus in an unsupervised manner.
We introduce TopicNet as a deep hierarchical topic model that can inject prior structural knowledge as an inductive bias to influence learning.
arXiv Detail & Related papers (2021-10-27T09:07:14Z) - Neural Attention-Aware Hierarchical Topic Model [25.721713066830404]
We propose a variational autoencoder (VAE) based NTM that jointly reconstructs sentence and document word counts.
Our model also features a hierarchical KL divergence that leverages the embedding of each document to regularize those of its sentences (a closed-form sketch follows this list).
Both quantitative and qualitative experiments have shown the efficacy of our model in 1) lowering the reconstruction errors at both the sentence and document levels, and 2) discovering more coherent topics from real-world datasets.
arXiv Detail & Related papers (2021-10-14T05:42:32Z)
- Keyphrase Extraction with Dynamic Graph Convolutional Networks and Diversified Inference [50.768682650658384]
Keyphrase extraction (KE) aims to summarize a set of phrases that accurately express a concept or a topic covered in a given document.
The recent Sequence-to-Sequence (Seq2Seq) based generative framework is widely used for the KE task and has obtained competitive performance on various benchmarks.
In this paper, we propose to adopt Dynamic Graph Convolutional Networks (DGCN) to solve two problems of this framework simultaneously.
arXiv Detail & Related papers (2020-10-24T08:11:23Z)
- Topic Adaptation and Prototype Encoding for Few-Shot Visual Storytelling [81.33107307509718]
We propose a topic-adaptive storyteller to model inter-topic generalization.
We also propose a prototype encoding structure to model intra-topic derivation.
Experimental results show that topic adaptation and prototype encoding mutually benefit the few-shot model.
arXiv Detail & Related papers (2020-08-11T03:55:11Z)
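The hierarchical KL divergence referenced in the Neural Attention-Aware Hierarchical Topic Model entry above, taken per latent dimension between a sentence-level Gaussian posterior and its document-level counterpart, has the standard closed form below. This is the textbook Gaussian KL, shown to sketch the idea; it is not necessarily the exact formulation used in that paper.

```latex
D_{\mathrm{KL}}\!\left(\mathcal{N}(\mu_s, \sigma_s^2)
  \,\middle\|\, \mathcal{N}(\mu_d, \sigma_d^2)\right)
  \;=\; \log\frac{\sigma_d}{\sigma_s}
  \;+\; \frac{\sigma_s^2 + (\mu_s - \mu_d)^2}{2\sigma_d^2}
  \;-\; \frac{1}{2}
```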
This list is automatically generated from the titles and abstracts of the papers on this site.