DeTiME: Diffusion-Enhanced Topic Modeling using Encoder-decoder based LLM
- URL: http://arxiv.org/abs/2310.15296v2
- Date: Sat, 23 Dec 2023 07:05:20 GMT
- Title: DeTiME: Diffusion-Enhanced Topic Modeling using Encoder-decoder based LLM
- Authors: Weijie Xu, Wenxiang Hu, Fanyou Wu, Srinivasan Sengamedu
- Abstract summary: Our study addresses gaps by introducing a novel framework named Diffusion-Enhanced Topic Modeling.
By exploiting the power of diffusion models, our framework also provides the capability for topic-based text generation.
- Score: 2.8233611508673
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: In the burgeoning field of natural language processing (NLP), Neural
Topic Models (NTMs), Large Language Models (LLMs), and diffusion models have
emerged as areas of significant research interest. Despite this, NTMs primarily
rely on contextual embeddings from LLMs, which are neither optimal for
clustering nor capable of supporting topic-based text generation, and NTMs have
never been combined with diffusion models for text generation. Our study
addresses these gaps by introducing a novel framework named Diffusion-Enhanced
Topic Modeling using Encoder-Decoder-based LLMs (DeTiME). DeTiME leverages
encoder-decoder-based LLMs to produce highly clusterable embeddings, yielding
topics with both superior clusterability and enhanced semantic coherence
compared to existing methods. Additionally, by exploiting the power of diffusion
models, the framework also supports topic-based text generation. This dual
functionality allows users to efficiently produce highly clustered topics and
topic-based text simultaneously. DeTiME's potential extends to generating
clustered embeddings as well. Notably, both the encoder-decoder-based LLM and
the diffusion model are efficient to train and adapt readily to other LLMs and
diffusion models, demonstrating the framework's potential for a wide array of
applications.
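The core mechanism the abstract describes, encoding documents with an encoder-decoder LLM and grouping the resulting embeddings into topics, can be illustrated with a minimal sketch. The checkpoint (google/flan-t5-base), the mean pooling over encoder states, and the KMeans clustering step are assumptions made for illustration only; they are not the paper's exact pipeline.

```python
# Hedged sketch: pool encoder hidden states from an encoder-decoder LLM into
# document embeddings, then cluster them into latent topics. Checkpoint,
# pooling, and clustering choices are illustrative assumptions, not DeTiME's
# actual training procedure.
import torch
from sklearn.cluster import KMeans
from transformers import AutoTokenizer, T5EncoderModel

tokenizer = AutoTokenizer.from_pretrained("google/flan-t5-base")
encoder = T5EncoderModel.from_pretrained("google/flan-t5-base")

docs = [
    "The central bank raised interest rates again this quarter.",
    "Stock markets rallied after the latest inflation report.",
    "A new vaccine shows strong results in late-stage trials.",
    "Researchers report progress on mRNA-based cancer therapies.",
]

with torch.no_grad():
    batch = tokenizer(docs, padding=True, truncation=True, return_tensors="pt")
    hidden = encoder(**batch).last_hidden_state      # (batch, seq_len, dim)
    mask = batch["attention_mask"].unsqueeze(-1)     # zero out padding tokens
    doc_emb = (hidden * mask).sum(1) / mask.sum(1)   # mean-pooled doc vectors

# Treat each cluster of document embeddings as a latent topic.
topic_ids = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(doc_emb.numpy())
for topic, doc in zip(topic_ids, docs):
    print(topic, doc)
```

This sketch covers only the embedding-and-clustering half of the framework; topic-based text generation in DeTiME additionally relies on the diffusion model described in the abstract.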
Related papers
- SWIFT: On-the-Fly Self-Speculative Decoding for LLM Inference Acceleration [10.970637831760136]
Speculative decoding (SD) has emerged as a widely used paradigm to accelerate the inference of large language models (LLMs).
We introduce SWIFT, an on-the-fly self-speculative decoding algorithm that adaptively selects intermediate layers of LLMs to skip during inference.
We show that SWIFT can achieve over a 1.3x-1.6x speedup while preserving the original distribution of the generated text.
arXiv Detail & Related papers (2024-10-09T14:15:30Z)
- Prior Knowledge Integration via LLM Encoding and Pseudo Event Regulation for Video Moment Retrieval [23.94611751368491]
We investigate the feasibility of leveraging large language models (LLMs) for integrating general knowledge and incorporating pseudo-events as priors for temporal content distribution.
To overcome these limitations, we propose utilizing LLM encoders instead of decoders.
We present a general framework for integrating LLM encoders into existing VMR architectures, specifically within the fusion module.
arXiv Detail & Related papers (2024-07-21T04:39:06Z)
- All Against Some: Efficient Integration of Large Language Models for Message Passing in Graph Neural Networks [51.19110891434727]
Large Language Models (LLMs) with pretrained knowledge and powerful semantic comprehension abilities have recently shown a remarkable ability to benefit applications using vision and text data.
E-LLaGNN is a framework with an on-demand LLM service that enriches the message-passing procedure of graph learning by enhancing a limited fraction of nodes from the graph.
arXiv Detail & Related papers (2024-07-20T22:09:42Z)
- Exploring the Role of Large Language Models in Prompt Encoding for Diffusion Models [42.891427362223176]
Large language models (LLMs) based on decoder-only transformers have demonstrated superior text understanding capabilities.
We propose a novel framework to fully harness the capabilities of LLMs.
We further design an LLM-Infused Diffusion Transformer (LI-DiT) based on the framework.
arXiv Detail & Related papers (2024-06-17T17:59:43Z)
- DALD: Improving Logits-based Detector without Logits from Black-box LLMs [56.234109491884126]
Large Language Models (LLMs) have revolutionized text generation, producing outputs that closely mimic human writing.
We present Distribution-Aligned LLMs Detection (DALD), an innovative framework that redefines the state-of-the-art performance in black-box text detection.
DALD is designed to align the surrogate model's distribution with that of unknown target LLMs, ensuring enhanced detection capability and resilience against rapid model iterations.
arXiv Detail & Related papers (2024-06-07T19:38:05Z)
- Knowledge Fusion of Large Language Models [73.28202188100646]
This paper introduces the notion of knowledge fusion for large language models (LLMs).
We externalize their collective knowledge and unique strengths, thereby elevating the capabilities of the target model beyond those of any individual source LLM.
Our findings confirm that the fusion of LLMs can improve the performance of the target model across a range of capabilities such as reasoning, commonsense, and code generation.
arXiv Detail & Related papers (2024-01-19T05:02:46Z)
- LlaMaVAE: Guiding Large Language Model Generation via Continuous Latent Sentence Spaces [1.529963465178546]
We present LlaMaVAE, which combines expressive encoder and decoder models (sentenceT5 and LlaMA) with a VAE architecture to provide better text generation control to large language models (LLMs).
Experimental results reveal that LlaMaVAE can outperform the previous state-of-the-art VAE language model, Optimus, across various tasks.
arXiv Detail & Related papers (2023-12-20T17:25:23Z)
- Simultaneous Machine Translation with Large Language Models [51.470478122113356]
We investigate the possibility of applying Large Language Models to SimulMT tasks.
We conducted experiments using the Llama2-7b-chat model on nine different languages from the MuST-C dataset.
The results show that the LLM outperforms dedicated MT models in terms of BLEU and LAAL metrics.
arXiv Detail & Related papers (2023-09-13T04:06:47Z)
- Extrapolating Multilingual Understanding Models as Multilingual Generators [82.1355802012414]
This paper explores methods to endow multilingual understanding models with generation abilities in order to obtain a unified model.
We propose a Semantic-Guided Alignment-then-Denoising (SGA) approach to adapt an encoder to a multilingual generator with a small number of new parameters.
arXiv Detail & Related papers (2023-05-22T15:33:21Z)
- Pre-trained Language Models for Keyphrase Generation: A Thorough Empirical Study [76.52997424694767]
We present an in-depth empirical study of keyphrase extraction and keyphrase generation using pre-trained language models.
We show that PLMs have competitive high-resource performance and state-of-the-art low-resource performance.
Further results show that in-domain BERT-like PLMs can be used to build strong and data-efficient keyphrase generation models.
arXiv Detail & Related papers (2022-12-20T13:20:21Z)
This list is automatically generated from the titles and abstracts of the papers on this site.