DeTiME: Diffusion-Enhanced Topic Modeling using Encoder-decoder based LLM
- URL: http://arxiv.org/abs/2310.15296v2
- Date: Sat, 23 Dec 2023 07:05:20 GMT
- Title: DeTiME: Diffusion-Enhanced Topic Modeling using Encoder-decoder based LLM
- Authors: Weijie Xu, Wenxiang Hu, Fanyou Wu, Srinivasan Sengamedu
- Abstract summary: Our study addresses gaps by introducing a novel framework named Diffusion-Enhanced Topic Modeling.
By exploiting the power of diffusion models, our framework also provides the capability for topic-based text generation.
- Score: 2.8233611508673
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: In the burgeoning field of natural language processing (NLP), Neural
Topic Models (NTMs), Large Language Models (LLMs), and diffusion models have
emerged as areas of significant research interest. Despite this, NTMs primarily
rely on contextual embeddings from LLMs, which are neither optimal for
clustering nor capable of supporting topic-based text generation, and NTMs have
never been combined with diffusion models for text generation. Our study
addresses these gaps by introducing a novel framework named Diffusion-Enhanced
Topic Modeling using Encoder-Decoder-based LLMs (DeTiME). DeTiME leverages
encoder-decoder-based LLMs to produce highly clusterable embeddings, yielding
topics with both superior clusterability and enhanced semantic coherence
compared to existing methods. Additionally, by exploiting the power of diffusion
models, the framework also supports topic-based text generation. This dual
functionality allows users to efficiently produce highly clustered topics and
topic-based text simultaneously. DeTiME's potential extends to generating
clustered embeddings as well. Notably, both the encoder-decoder-based LLM and
the diffusion model are efficient to train and adapt readily to other LLMs and
diffusion models, demonstrating the framework's potential for a wide array of
applications.
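The core mechanism the abstract describes, encoding documents with an encoder-decoder LLM and grouping the resulting embeddings into topics, can be illustrated with a minimal sketch. The checkpoint (google/flan-t5-base), the mean pooling over encoder states, and the KMeans clustering step are assumptions made for illustration only; they are not the paper's exact pipeline.

```python
# Hedged sketch: pool encoder hidden states from an encoder-decoder LLM into
# document embeddings, then cluster them into latent topics. Checkpoint,
# pooling, and clustering choices are illustrative assumptions, not DeTiME's
# actual training procedure.
import torch
from sklearn.cluster import KMeans
from transformers import AutoTokenizer, T5EncoderModel

tokenizer = AutoTokenizer.from_pretrained("google/flan-t5-base")
encoder = T5EncoderModel.from_pretrained("google/flan-t5-base")

docs = [
    "The central bank raised interest rates again this quarter.",
    "Stock markets rallied after the latest inflation report.",
    "A new vaccine shows strong results in late-stage trials.",
    "Researchers report progress on mRNA-based cancer therapies.",
]

with torch.no_grad():
    batch = tokenizer(docs, padding=True, truncation=True, return_tensors="pt")
    hidden = encoder(**batch).last_hidden_state      # (batch, seq_len, dim)
    mask = batch["attention_mask"].unsqueeze(-1)     # zero out padding tokens
    doc_emb = (hidden * mask).sum(1) / mask.sum(1)   # mean-pooled doc vectors

# Treat each cluster of document embeddings as a latent topic.
topic_ids = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(doc_emb.numpy())
for topic, doc in zip(topic_ids, docs):
    print(topic, doc)
```

This sketch covers only the embedding-and-clustering half of the framework; topic-based text generation in DeTiME additionally relies on the diffusion model described in the abstract.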
Related papers
- SWIFT: On-the-Fly Self-Speculative Decoding for LLM Inference Acceleration [10.970637831760136]
Speculative decoding (SD) has emerged as a widely used paradigm to accelerate the inference of large language models (LLMs).
We introduce SWIFT, an on-the-fly self-speculative decoding algorithm that adaptively selects intermediate layers of LLMs to skip during inference.
We show that SWIFT can achieve over a 1.3x-1.6x speedup while preserving the original distribution of the generated text.
arXiv Detail & Related papers (2024-10-09T14:15:30Z)
- Prior Knowledge Integration via LLM Encoding and Pseudo Event Regulation for Video Moment Retrieval [23.94611751368491]
We investigate the feasibility of leveraging large language models (LLMs) for integrating general knowledge and incorporating pseudo-events as priors for temporal content distribution.
To overcome these limitations, we propose utilizing LLM encoders instead of decoders.
We present a general framework for integrating LLM encoders into existing VMR architectures, specifically within the fusion module.
arXiv Detail & Related papers (2024-07-21T04:39:06Z)
- All Against Some: Efficient Integration of Large Language Models for Message Passing in Graph Neural Networks [51.19110891434727]
Large Language Models (LLMs) with pretrained knowledge and powerful semantic comprehension abilities have recently shown a remarkable ability to benefit applications using vision and text data.
E-LLaGNN is a framework with an on-demand LLM service that enriches the message-passing procedure of graph learning by enhancing a limited fraction of nodes from the graph.
arXiv Detail & Related papers (2024-07-20T22:09:42Z)
- Exploring the Role of Large Language Models in Prompt Encoding for Diffusion Models [42.891427362223176]
Large language models (LLMs) based on decoder-only transformers have demonstrated superior text understanding capabilities.
We propose a novel framework to fully harness the capabilities of LLMs.
We further design an LLM-Infused Diffusion Transformer (LI-DiT) based on the framework.
arXiv Detail & Related papers (2024-06-17T17:59:43Z)
- DALD: Improving Logits-based Detector without Logits from Black-box LLMs [56.234109491884126]
Large Language Models (LLMs) have revolutionized text generation, producing outputs that closely mimic human writing.
We present Distribution-Aligned LLMs Detection (DALD), an innovative framework that redefines the state-of-the-art performance in black-box text detection.
DALD is designed to align the surrogate model's distribution with that of unknown target LLMs, ensuring enhanced detection capability and resilience against rapid model iterations.
arXiv Detail & Related papers (2024-06-07T19:38:05Z)
- Knowledge Fusion of Large Language Models [73.28202188100646]
This paper introduces the notion of knowledge fusion for large language models (LLMs).
We externalize their collective knowledge and unique strengths, thereby elevating the capabilities of the target model beyond those of any individual source LLM.
Our findings confirm that the fusion of LLMs can improve the performance of the target model across a range of capabilities such as reasoning, commonsense, and code generation.
arXiv Detail & Related papers (2024-01-19T05:02:46Z)
- LlaMaVAE: Guiding Large Language Model Generation via Continuous Latent Sentence Spaces [1.529963465178546]
We present LlaMaVAE, which combines expressive encoder and decoder models (sentenceT5 and LlaMA) with a VAE architecture to provide better text generation control to large language models (LLMs).
Experimental results reveal that LlaMaVAE can outperform the previous state-of-the-art VAE language model, Optimus, across various tasks.
arXiv Detail & Related papers (2023-12-20T17:25:23Z)
- Simultaneous Machine Translation with Large Language Models [51.470478122113356]
We investigate the possibility of applying Large Language Models to SimulMT tasks.
We conducted experiments using the Llama2-7b-chat model on nine different languages from the MuST-C dataset.
The results show that the LLM outperforms dedicated MT models in terms of BLEU and LAAL metrics.
arXiv Detail & Related papers (2023-09-13T04:06:47Z)
- Extrapolating Multilingual Understanding Models as Multilingual Generators [82.1355802012414]
This paper explores methods to endow multilingual understanding models with generation abilities in order to obtain a unified model.
We propose a Semantic-Guided Alignment-then-Denoising (SGA) approach to adapt an encoder to a multilingual generator with a small number of new parameters.
arXiv Detail & Related papers (2023-05-22T15:33:21Z)
- Pre-trained Language Models for Keyphrase Generation: A Thorough Empirical Study [76.52997424694767]
We present an in-depth empirical study of keyphrase extraction and keyphrase generation using pre-trained language models.
We show that PLMs have competitive high-resource performance and state-of-the-art low-resource performance.
Further results show that in-domain BERT-like PLMs can be used to build strong and data-efficient keyphrase generation models.
arXiv Detail & Related papers (2022-12-20T13:20:21Z)
This list is automatically generated from the titles and abstracts of the papers on this site.