Neural Topic Modeling with Continual Lifelong Learning
- URL: http://arxiv.org/abs/2006.10909v2
- Date: Tue, 27 Jun 2023 05:32:12 GMT
- Title: Neural Topic Modeling with Continual Lifelong Learning
- Authors: Pankaj Gupta and Yatin Chaudhary and Thomas Runkler and Hinrich Schütze
- Abstract summary: We propose a lifelong learning framework for neural topic modeling.
It can process streams of document collections, accumulate topics and guide future topic modeling tasks.
We demonstrate improved performance quantified by perplexity, topic coherence and an information retrieval task.
- Score: 19.969393484927252
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Lifelong learning has recently attracted attention in building machine
learning systems that continually accumulate and transfer knowledge to help
future learning. Unsupervised topic modeling has been popularly used to
discover topics from document collections. However, applying topic
modeling is challenging under data sparsity, e.g., in a small collection of
(short) documents, which yields incoherent topics and sub-optimal document
representations. To address this problem, we propose a lifelong learning
framework for neural topic modeling that can continuously process streams of
document collections, accumulate topics and guide future topic modeling tasks
by knowledge transfer from several sources to better deal with the sparse data.
In the lifelong process, we particularly investigate jointly: (1) sharing
generative homologies (latent topics) over lifetime to transfer prior
knowledge, and (2) minimizing catastrophic forgetting to retain the past
learning via novel selective data augmentation, co-training and topic
regularization approaches. Given a stream of document collections, we apply the
proposed Lifelong Neural Topic Modeling (LNTM) framework in modeling three
sparse document collections as future tasks and demonstrate improved
performance quantified by perplexity, topic coherence and information retrieval
task.
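One ingredient of the abstract, topic regularization to minimize catastrophic forgetting, can be illustrated as a penalty that discourages the current task's topic-word matrix from drifting away from topics accumulated on earlier tasks. The sketch below is a minimal NumPy illustration, not the paper's actual objective; the `topic_regularizer` function, the quadratic nearest-topic penalty, and all shapes are assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical shapes: V vocabulary terms, K topics per task.
V, K = 50, 5

def topic_regularizer(W_new, prior_topics, lam=0.1):
    """Penalize drift of the current topic-word matrix W_new (K x V)
    from the closest accumulated prior topics, as a stand-in for a
    forgetting-minimization term. `prior_topics` is a list of (K x V)
    matrices learned on earlier tasks."""
    penalty = 0.0
    for W_old in prior_topics:
        # Pairwise squared distances between new and prior topics: (K, K).
        d = ((W_new[:, None, :] - W_old[None, :, :]) ** 2).sum(axis=2)
        # Each new topic is only compared to its nearest prior topic.
        penalty += d.min(axis=1).sum()
    return lam * penalty

W_task1 = rng.standard_normal((K, V))
W_task2 = W_task1 + 0.01 * rng.standard_normal((K, V))  # slight drift
W_random = rng.standard_normal((K, V))                   # unrelated topics

close = topic_regularizer(W_task2, [W_task1])
far = topic_regularizer(W_random, [W_task1])
print(close < far)
```

Added to a topic model's training loss, such a term lets new collections reuse shared latent topics (the "generative homologies" above) while penalizing overwriting of past topics.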
Related papers
- Embedded Topic Models Enhanced by Wikification [3.082729239227955]
We incorporate the Wikipedia knowledge into a neural topic model to make it aware of named entities.
Our experiments show that our method improves the performance of neural topic models in generalizability.
arXiv Detail & Related papers (2024-10-03T12:39:14Z)
- Investigating the Impact of Text Summarization on Topic Modeling [13.581341206178525]
In this paper, an approach is proposed that further enhances topic modeling performance by utilizing a pre-trained large language model (LLM).
Few-shot prompting is used to generate summaries of different lengths to compare their impact on topic modeling.
The proposed method yields better topic diversity and comparable coherence values compared to previous models.
arXiv Detail & Related papers (2024-09-28T19:45:45Z)
- Model Merging in LLMs, MLLMs, and Beyond: Methods, Theories, Applications and Opportunities [89.40778301238642]
Model merging is an efficient empowerment technique in the machine learning community.
There is a significant gap in the literature regarding a systematic and thorough review of these techniques.
arXiv Detail & Related papers (2024-08-14T16:58:48Z)
- Let the Pretrained Language Models "Imagine" for Short Texts Topic Modeling [29.87929724277381]
In short texts, co-occurrence information is minimal, which results in feature sparsity in document representation.
Existing topic models (probabilistic or neural) mostly fail to mine patterns from them to generate coherent topics.
We extend short texts into longer sequences using existing pre-trained language models (PLMs).
arXiv Detail & Related papers (2023-10-24T00:23:30Z)
- Multi-View Class Incremental Learning [57.14644913531313]
Multi-view learning (MVL) has gained great success in integrating information from multiple perspectives of a dataset to improve downstream task performance.
This paper investigates a novel paradigm called multi-view class incremental learning (MVCIL), where a single model incrementally classifies new classes from a continual stream of views.
arXiv Detail & Related papers (2023-06-16T08:13:41Z)
- Knowledge-Aware Bayesian Deep Topic Model [50.58975785318575]
We propose a Bayesian generative model for incorporating prior domain knowledge into hierarchical topic modeling.
Our proposed model efficiently integrates the prior knowledge and improves both hierarchical topic discovery and document representation.
arXiv Detail & Related papers (2022-09-20T09:16:05Z)
- Embedding Knowledge for Document Summarization: A Survey [66.76415502727802]
Previous works proved that knowledge-embedded document summarizers excel at generating superior digests.
We propose novel paradigms to recapitulate knowledge and knowledge embeddings from the document summarization perspective.
arXiv Detail & Related papers (2022-04-24T04:36:07Z)
- Topic Discovery via Latent Space Clustering of Pretrained Language Model Representations [35.74225306947918]
We propose a joint latent space learning and clustering framework built upon PLM embeddings.
Our model effectively leverages the strong representation power and superb linguistic features brought by PLMs for topic discovery.
arXiv Detail & Related papers (2022-02-09T17:26:08Z)
- Topic-Guided Abstractive Multi-Document Summarization [21.856615677793243]
A critical point of multi-document summarization (MDS) is to learn the relations among various documents.
We propose a novel abstractive MDS model, in which we represent multiple documents as a heterogeneous graph.
We employ a neural topic model to jointly discover latent topics that can act as cross-document semantic units.
arXiv Detail & Related papers (2021-10-21T15:32:30Z)
- Improving Neural Topic Models using Knowledge Distillation [84.66983329587073]
We use knowledge distillation to combine the best attributes of probabilistic topic models and pretrained transformers.
Our modular method can be straightforwardly applied with any neural topic model to improve topic quality.
arXiv Detail & Related papers (2020-10-05T22:49:16Z)
- Topic Adaptation and Prototype Encoding for Few-Shot Visual Storytelling [81.33107307509718]
We propose a topic adaptive storyteller to model the ability of inter-topic generalization.
We also propose a prototype encoding structure to model the ability of intra-topic derivation.
Experimental results show that topic adaptation and prototype encoding structure mutually bring benefit to the few-shot model.
arXiv Detail & Related papers (2020-08-11T03:55:11Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences of its use.