A Novel Method of Fuzzy Topic Modeling based on Transformer Processing
- URL: http://arxiv.org/abs/2309.09658v1
- Date: Mon, 18 Sep 2023 10:52:54 GMT
- Title: A Novel Method of Fuzzy Topic Modeling based on Transformer Processing
- Authors: Ching-Hsun Tseng, Shin-Jye Lee, Po-Wei Cheng, Chien Lee, Chih-Chieh
Hung
- Abstract summary: This work presents fuzzy topic modeling based on soft clustering and document embeddings from a state-of-the-art transformer-based model.
In a practical press-release-monitoring application, fuzzy topic modeling gives more natural results than the traditional LDA output.
- Score: 1.4597673707346286
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Topic modeling is admittedly a convenient way to monitor market trends.
Conventionally, Latent Dirichlet Allocation (LDA) is considered the go-to model
for gaining this type of information. Because LDA deduces keywords from token
conditional probabilities, it can identify the most probable or essential
topics. However, the results are not intuitive, because the inferred topics
cannot wholly fit human knowledge. LDA returns the most probable relevant
keywords, which raises a further problem: whether the connections between them
are reliable when they rest on statistical probability alone. It is also hard
to decide on the number of topics manually in advance. Following the growing
trends of clustering with fuzzy membership and embedding words with
transformers, this work presents fuzzy topic modeling based on soft clustering
and document embeddings from a state-of-the-art transformer-based model. In our
practical application to press-release monitoring, fuzzy topic modeling gives
more natural results than the traditional output of LDA.
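The abstract names only the general recipe (transformer document embeddings plus soft clustering), so the following is a minimal sketch of that idea, not the paper's exact implementation: embed each document with a pretrained sentence transformer, then run fuzzy c-means so every document receives a membership degree in every topic. The embedding model, the number of topics, and the fuzzifier m are illustrative assumptions.

```python
# Minimal sketch of fuzzy topic modeling: transformer embeddings + fuzzy
# c-means. Requires sentence-transformers and scikit-fuzzy; the model name
# and hyperparameters below are assumptions, not values from the paper.
import skfuzzy as fuzz
from sentence_transformers import SentenceTransformer

docs = [
    "Central bank raises interest rates to curb inflation.",
    "New smartphone lineup unveiled at annual tech conference.",
    "Quarterly earnings beat analyst expectations.",
]

# 1. Embed each document with a pretrained transformer.
encoder = SentenceTransformer("all-MiniLM-L6-v2")
embeddings = encoder.encode(docs)      # shape: (n_docs, dim)

# 2. Fuzzy c-means expects features on rows and samples on columns.
n_topics = 2                           # illustrative choice
cntr, u, _, _, _, _, fpc = fuzz.cluster.cmeans(
    embeddings.T, c=n_topics, m=2.0, error=1e-5, maxiter=1000, seed=0
)

# 3. u[k, i] is the membership degree of document i in topic k; assignments
# are soft, so a document can belong to several topics at once.
for i, doc in enumerate(docs):
    grades = ", ".join(f"topic {k}: {u[k, i]:.2f}" for k in range(n_topics))
    print(f"{doc[:40]!r} -> {grades}")
```

The soft assignment is the point of the contrast with LDA drawn above: a press release that touches two themes ends up with split membership rather than being summarized by a single dominant topic's keyword list.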
Related papers
- SMILE: Zero-Shot Sparse Mixture of Low-Rank Experts Construction From Pre-Trained Foundation Models [85.67096251281191]
We present an innovative approach to model fusion called zero-shot Sparse MIxture of Low-rank Experts (SMILE) construction.
SMILE allows for the upscaling of source models into an MoE model without extra data or further training.
We conduct extensive experiments across diverse scenarios, such as image classification and text generation tasks, using full fine-tuning and LoRA fine-tuning.
arXiv Detail & Related papers (2024-08-19T17:32:15Z) - Topic Modeling with Fine-tuning LLMs and Bag of Sentences [1.8592384822257952]
FT-Topic is an unsupervised fine-tuning approach for topic modeling.
SenClu is a state-of-the-art topic modeling method that achieves fast inference and hard assignments of sentence groups to a single topic.
arXiv Detail & Related papers (2024-08-06T11:04:07Z) - FASTopic: Pretrained Transformer is a Fast, Adaptive, Stable, and Transferable Topic Model [76.509837704596]
We propose FASTopic, a fast, adaptive, stable, and transferable topic model.
We use Dual Semantic-relation Reconstruction (DSR) to model latent topics.
We also propose Embedding Transport Plan (ETP) to regularize semantic relations as optimal transport plans.
arXiv Detail & Related papers (2024-05-28T09:06:38Z) - Attention Mechanisms Don't Learn Additive Models: Rethinking Feature Importance for Transformers [12.986126243018452]
We introduce the Softmax-Linked Additive Log-Odds Model (SLALOM), a novel surrogate model specifically designed to align with the transformer framework.
SLALOM demonstrates the capacity to deliver a range of faithful and insightful explanations across both synthetic and real-world datasets.
arXiv Detail & Related papers (2024-05-22T11:14:00Z) - Probabilistic Topic Modelling with Transformer Representations [0.9999629695552195]
We propose the Transformer-Representation Neural Topic Model (TNTM).
This approach unifies the powerful and versatile notion of topics based on transformer embeddings with fully probabilistic modelling.
Experimental results show that our proposed model achieves results on par with various state-of-the-art approaches in terms of embedding coherence.
arXiv Detail & Related papers (2024-03-06T14:27:29Z) - InteL-VAEs: Adding Inductive Biases to Variational Auto-Encoders via
Intermediary Latents [60.785317191131284]
We introduce a simple and effective method for learning VAEs with controllable biases by using an intermediary set of latent variables.
In particular, it allows us to impose desired properties like sparsity or clustering on learned representations.
We show that this, in turn, allows InteL-VAEs to learn both better generative models and representations.
arXiv Detail & Related papers (2021-06-25T16:34:05Z) - Learning Disentangled Latent Factors from Paired Data in Cross-Modal
Retrieval: An Implicit Identifiable VAE Approach [33.61751393224223]
We deal with the problem of learning the underlying disentangled latent factors that are shared between the paired bi-modal data in cross-modal retrieval.
We propose a novel idea of the implicit decoder, which completely removes the ambient data decoding module from a latent variable model.
Our model is shown to identify the factors accurately, significantly outperforming conventional encoder-decoder latent variable models.
arXiv Detail & Related papers (2020-12-01T17:47:50Z) - Improving Neural Topic Models using Knowledge Distillation [84.66983329587073]
We use knowledge distillation to combine the best attributes of probabilistic topic models and pretrained transformers.
Our modular method can be straightforwardly applied with any neural topic model to improve topic quality (a generic distillation-loss sketch appears after this list).
arXiv Detail & Related papers (2020-10-05T22:49:16Z) - Unification of HDP and LDA Models for Optimal Topic Clustering of
Subject Specific Question Banks [55.41644538483948]
As online courses grow in popularity, academics face a growing number of course-related queries.
Clustering the questions is an ideal way to reduce the time spent answering each one individually.
We use the Hierarchical Dirichlet Process to determine an optimal topic-number input for our LDA model runs (see the gensim sketch after this list).
arXiv Detail & Related papers (2020-10-04T18:21:20Z) - Topic Adaptation and Prototype Encoding for Few-Shot Visual Storytelling [81.33107307509718]
We propose a topic-adaptive storyteller to model inter-topic generalization.
We also propose a prototype-encoding structure to model intra-topic derivation.
Experimental results show that topic adaptation and prototype encoding mutually benefit the few-shot model.
arXiv Detail & Related papers (2020-08-11T03:55:11Z)
This list is automatically generated from the titles and abstracts of the papers on this site.