FASTopic: A Fast, Adaptive, Stable, and Transferable Topic Modeling Paradigm
- URL: http://arxiv.org/abs/2405.17978v1
- Date: Tue, 28 May 2024 09:06:38 GMT
- Title: FASTopic: A Fast, Adaptive, Stable, and Transferable Topic Modeling Paradigm
- Authors: Xiaobao Wu, Thong Nguyen, Delvin Ce Zhang, William Yang Wang, Anh Tuan Luu
- Abstract summary: We present FASTopic, a fast, adaptive, stable, and transferable topic model.
We also propose a novel Embedding Transport Plan (ETP) method.
- Score: 76.509837704596
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Topic models have evolved rapidly over the years, from conventional to recent neural models. However, existing topic models generally struggle with effectiveness, efficiency, or stability, severely impeding their practical applications. In this paper, we propose FASTopic, a fast, adaptive, stable, and transferable topic model. FASTopic follows a new paradigm: Dual Semantic-relation Reconstruction (DSR). Unlike previous conventional, neural VAE-based, or clustering-based methods, DSR discovers latent topics by reconstruction through modeling the semantic relations among document, topic, and word embeddings. This yields a neat and efficient topic modeling framework. We further propose a novel Embedding Transport Plan (ETP) method. Rather than earlier, more straightforward approaches, ETP explicitly regularizes the semantic relations as optimal transport plans. This addresses the relation bias issue and thus leads to effective topic modeling. Extensive experiments on benchmark datasets demonstrate that FASTopic shows superior effectiveness, efficiency, adaptivity, stability, and transferability compared to state-of-the-art baselines across various scenarios. Our code is available at https://github.com/bobxwu/FASTopic.
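As context for the transport-plan idea behind ETP, the sketch below computes an entropic-regularized optimal transport plan between toy document and topic embeddings with standard Sinkhorn iterations. It is only a generic illustration, not FASTopic's actual DSR/ETP implementation; every name, shape, and hyperparameter here (sinkhorn_plan, eps, the random embeddings) is invented for the example.

```python
import numpy as np

def sinkhorn_plan(cost, a, b, eps=0.1, n_iters=200):
    """Entropic-regularized OT plan via Sinkhorn iterations.

    cost: (n, m) cost matrix; a: (n,) source weights; b: (m,) target weights.
    Returns a plan P whose row sums approach a and column sums approach b.
    """
    K = np.exp(-cost / eps)              # Gibbs kernel
    u = np.ones_like(a)
    for _ in range(n_iters):
        v = b / (K.T @ u)                # scale columns toward marginal b
        u = a / (K @ v)                  # scale rows toward marginal a
    return u[:, None] * K * v[None, :]   # P = diag(u) @ K @ diag(v)

# Toy embeddings standing in for learned document/topic embeddings.
rng = np.random.default_rng(0)
doc_emb = rng.normal(size=(8, 16))       # 8 documents, 16-dim
topic_emb = rng.normal(size=(4, 16))     # 4 topics

cost = ((doc_emb[:, None, :] - topic_emb[None, :, :]) ** 2).sum(-1)
cost = cost / cost.max()                 # rescale to [0, 1] so exp(-cost/eps) stays stable
a = np.full(8, 1 / 8)                    # uniform mass over documents
b = np.full(4, 1 / 4)                    # uniform mass over topics

P = sinkhorn_plan(cost, a, b)
doc_topic = P / P.sum(1, keepdims=True)  # rows become doc-topic proportions
print(doc_topic.round(3))
```

Row-normalizing the plan turns it into per-document topic proportions, which conveys the flavor of the doc-topic relations that ETP regularizes as transport plans.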
Related papers
- Historia Magistra Vitae: Dynamic Topic Modeling of Roman Literature using Neural Embeddings [10.095706051685665]
We compare topic models built using traditional statistical models (LDA and NMF) with a BERT-based neural model.
We find that while quantitative metrics prefer statistical models, qualitative evaluation finds better insights from the neural model.
arXiv Detail & Related papers (2024-06-27T05:38:49Z)
- Probabilistic Topic Modelling with Transformer Representations [0.9999629695552195]
We propose the Transformer-Representation Neural Topic Model (TNTM).
This approach unifies the powerful and versatile notion of topics based on transformer embeddings with fully probabilistic modelling.
Experimental results show that our proposed model achieves results on par with various state-of-the-art approaches in terms of embedding coherence.
arXiv Detail & Related papers (2024-03-06T14:27:29Z)
- Improving Transferability of Adversarial Examples via Bayesian Attacks [84.90830931076901]
We introduce a novel extension by incorporating the Bayesian formulation into the model input as well, enabling the joint diversification of both the model input and model parameters.
Our method achieves a new state-of-the-art on transfer-based attacks, improving the average success rate on ImageNet and CIFAR-10 by 19.14% and 2.08%, respectively.
arXiv Detail & Related papers (2023-07-21T03:43:07Z)
- Are Neural Topic Models Broken? [81.15470302729638]
We study the relationship between automated and human evaluation of topic models.
We find that neural topic models fare worse in both respects compared to an established classical method.
arXiv Detail & Related papers (2022-10-28T14:38:50Z)
- Recurrent Coupled Topic Modeling over Sequential Documents [33.35324412209806]
We show that a current topic evolves from all prior topics with corresponding coupling weights, forming a multi-topic-thread evolution.
A new solution with a set of novel data augmentation techniques is proposed, which successfully decomposes the multiple couplings between evolving topics.
A novel Gibbs sampler with a backward-forward filter algorithm efficiently learns latent time-evolving parameters in closed form.
arXiv Detail & Related papers (2021-06-23T08:58:13Z)
- Discrete Auto-regressive Variational Attention Models for Text Modeling [53.38382932162732]
Variational autoencoders (VAEs) have been widely applied for text modeling.
However, they suffer from two challenges: information underrepresentation and posterior collapse.
We propose Discrete Auto-regressive Variational Attention Model (DAVAM) to address the challenges.
arXiv Detail & Related papers (2021-06-16T06:36:26Z)
- Generative Counterfactuals for Neural Networks via Attribute-Informed Perturbation [51.29486247405601]
We design a framework to generate counterfactuals for raw data instances with the proposed Attribute-Informed Perturbation (AIP).
By utilizing generative models conditioned on different attributes, counterfactuals with desired labels can be obtained effectively and efficiently.
Experimental results on real-world texts and images demonstrate the effectiveness, sample quality as well as efficiency of our designed framework.
arXiv Detail & Related papers (2021-01-18T08:37:13Z)
- Improving Neural Topic Models using Knowledge Distillation [84.66983329587073]
We use knowledge distillation to combine the best attributes of probabilistic topic models and pretrained transformers.
Our modular method can be applied straightforwardly with any neural topic model to improve topic quality; the generic distillation loss it builds on is sketched below.
arXiv Detail & Related papers (2020-10-05T22:49:16Z)
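The abstract above does not spell out its objective, so the following shows only the generic knowledge-distillation recipe it builds on: match a student's distribution to a teacher's temperature-softened distribution with a KL term. The logits, temperature, and names are all made up for the example; this is not the paper's actual method.

```python
import numpy as np

def softmax(z, temperature=1.0):
    z = z / temperature
    z = z - z.max()                      # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum()

# Hypothetical logits: a transformer "teacher" and a topic-model "student".
rng = np.random.default_rng(2)
teacher_logits = rng.normal(size=10)
student_logits = rng.normal(size=10)

T = 2.0                                  # softening temperature
p_teacher = softmax(teacher_logits, T)
p_student = softmax(student_logits, T)

# Distillation term: KL(teacher || student), scaled by T^2 as is conventional.
kd_loss = (T ** 2) * np.sum(p_teacher * (np.log(p_teacher) - np.log(p_student)))
print(f"KD loss: {kd_loss:.4f}")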
- Neural Topic Model via Optimal Transport [24.15046280736009]
We present a new neural topic model via the theory of optimal transport (OT).
Specifically, we propose to learn the topic distribution of a document by directly minimising its OT distance to the document's word distributions.
Our proposed model can be trained efficiently with a differentiable loss; a toy version of this OT objective is sketched below.
arXiv Detail & Related papers (2020-08-12T06:37:09Z)
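To make that OT objective concrete: couple a document's topic distribution with its word distribution under a topic-to-word cost matrix and take the transport cost as the loss. The helper repeats sinkhorn_plan from the first sketch above; theta, w, and M are invented stand-ins, not the paper's parameterization.

```python
import numpy as np

def sinkhorn_plan(cost, a, b, eps=0.1, n_iters=200):
    # Same entropic-OT helper as in the first sketch above.
    K = np.exp(-cost / eps)
    u = np.ones_like(a)
    for _ in range(n_iters):
        v = b / (K.T @ u)
        u = a / (K @ v)
    return u[:, None] * K * v[None, :]

# Hypothetical setup: theta is a document's topic distribution (learnable),
# w its empirical word distribution, and M a topic-to-word cost matrix
# (e.g., distances between topic and word embeddings).
rng = np.random.default_rng(1)
n_topics, vocab_size = 4, 10
M = rng.random((n_topics, vocab_size))
theta = np.full(n_topics, 1 / n_topics)
w = rng.random(vocab_size)
w /= w.sum()

plan = sinkhorn_plan(M, theta, w)
ot_loss = (plan * M).sum()               # transport cost under the optimal plan
print(f"OT loss: {ot_loss:.4f}")         # minimised w.r.t. theta during training
```

Because the Sinkhorn plan is built from smooth operations, this loss is differentiable, which is what lets the topic distribution be learned by gradient descent as the abstract describes.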