topicwizard -- a Modern, Model-agnostic Framework for Topic Model Visualization and Interpretation
- URL: http://arxiv.org/abs/2505.13034v1
- Date: Mon, 19 May 2025 12:19:01 GMT
- Title: topicwizard -- a Modern, Model-agnostic Framework for Topic Model Visualization and Interpretation
- Authors: Márton Kardos, Kenneth C. Enevoldsen, Kristoffer Laigaard Nielbo
- Abstract summary: We introduce topicwizard, a framework for model-agnostic topic model interpretation. It helps users examine the complex semantic relations between documents, words and topics learned by topic models.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Topic models are statistical tools that allow their users to gain qualitative and quantitative insights into the contents of textual corpora without the need for close reading. They can be applied in a wide range of settings, from discourse analysis through pretraining data curation to text filtering. Topic models are typically parameter-rich, complex models, and interpreting these parameters can be challenging for their users. It is typical practice for users to interpret topics based on the top 10 highest-ranking terms for a given topic. This list-of-words approach, however, gives users a limited and biased picture of the content of topics. Thoughtful user interface design and visualizations can help users gain a more complete and accurate understanding of topic models' output. While some visualization utilities do exist for topic models, these are typically limited to a certain type of topic model. We introduce topicwizard, a framework for model-agnostic topic model interpretation that provides intuitive and interactive tools to help users examine the complex semantic relations between documents, words and topics learned by topic models.
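The abstract contrasts the conventional top-10-words reading of a topic with topicwizard's interactive, model-agnostic approach. The sketch below is a minimal illustration, not the paper's reference implementation: it assumes scikit-learn and the topicwizard Python package are installed and uses the 20 Newsgroups corpus as a stand-in; the topicwizard.visualize() call follows the package's documented usage pattern, but the exact signature may vary between versions.

```python
# Minimal sketch, assuming scikit-learn and the topicwizard package are installed.
# The topicwizard.visualize() call mirrors the package's documented usage, but the
# exact signature may differ across versions; treat it as illustrative.
from sklearn.datasets import fetch_20newsgroups
from sklearn.decomposition import NMF
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.pipeline import make_pipeline

import topicwizard

corpus = fetch_20newsgroups(remove=("headers", "footers", "quotes")).data

# Any bag-of-words topic model works here; NMF with 10 topics is just an example.
vectorizer = CountVectorizer(stop_words="english", max_features=8000)
topic_model = NMF(n_components=10, random_state=42)
pipeline = make_pipeline(vectorizer, topic_model)
pipeline.fit(corpus)

# The conventional "list of words" reading the abstract critiques:
# the 10 highest-ranking terms per topic, read off the component weights.
terms = vectorizer.get_feature_names_out()
for topic_id, weights in enumerate(topic_model.components_):
    top_terms = terms[weights.argsort()[::-1][:10]]
    print(f"Topic {topic_id}: {', '.join(top_terms)}")

# topicwizard's alternative: launch the interactive web app on the same fitted
# pipeline and corpus to explore document-word-topic relations visually.
topicwizard.visualize(corpus, model=pipeline)
```

Because topicwizard only needs the fitted vectorizer and topic model plus the raw corpus, the same launch step should apply whether the underlying model is NMF, LDA, or another topic model exposing a compatible interface, which is the model-agnostic point the abstract makes.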
Related papers
- Holistic Evaluations of Topic Models [0.0]
This article evaluates topic models from a database perspective, drawing insights from 1140 BERTopic model runs. The goal is to identify trade-offs in optimizing model parameters and to reflect on what these findings mean for the interpretation and responsible use of topic models.
arXiv Detail & Related papers (2025-07-31T09:20:04Z)
- Explaining Datasets in Words: Statistical Models with Natural Language Parameters [66.69456696878842]
We introduce a family of statistical models -- including clustering, time series, and classification models -- parameterized by natural language predicates. We apply our framework to a wide range of problems: taxonomizing user chat dialogues, characterizing how they evolve across time, finding categories where one language model is better than the other.
arXiv Detail & Related papers (2024-09-13T01:40:20Z)
- Interactive Topic Models with Optimal Transport [75.26555710661908]
We present EdTM, an approach for label-name-supervised topic modeling.
EdTM casts topic modeling as an assignment problem while leveraging LM/LLM-based document-topic affinities.
arXiv Detail & Related papers (2024-06-28T13:57:27Z)
- GPTopic: Dynamic and Interactive Topic Representations [0.0]
GPTopic is a software package that leverages Large Language Models (LLMs) to create dynamic, interactive topic representations.
GPTopic provides an intuitive chat interface for users to explore, analyze, and refine topics interactively.
arXiv Detail & Related papers (2024-03-06T11:34:20Z)
- Prompting Large Language Models for Topic Modeling [10.31712610860913]
We propose PromptTopic, a novel topic modeling approach that harnesses the advanced language understanding of large language models (LLMs).
It extracts topics at the sentence level from individual documents, then aggregates and condenses them into a predefined number of topics, ultimately providing coherent topics for texts of varying lengths.
We benchmark PromptTopic against the state-of-the-art baselines on three vastly diverse datasets, establishing its proficiency in discovering meaningful topics.
arXiv Detail & Related papers (2023-12-15T11:15:05Z)
- Labeled Interactive Topic Models [10.555664965166232]
We introduce a user-friendly interaction for neural topic models.
This interaction permits users to assign a word label to a topic.
We evaluate our method through a human study, where users can relabel topics to find relevant documents.
arXiv Detail & Related papers (2023-11-15T23:18:01Z)
- TopicGPT: A Prompt-based Topic Modeling Framework [77.72072691307811]
We introduce TopicGPT, a prompt-based framework that uses large language models to uncover latent topics in a text collection.
It produces topics that align better with human categorizations compared to competing methods.
Its topics are also interpretable, dispensing with ambiguous bags of words in favor of topics with natural language labels and associated free-form descriptions.
arXiv Detail & Related papers (2023-11-02T17:57:10Z)
- A User-Centered, Interactive, Human-in-the-Loop Topic Modelling System [32.065158970382036]
Human-in-the-loop topic modelling incorporates users' knowledge into the modelling process, enabling them to refine the model iteratively.
Recent research has demonstrated the value of user feedback, but there are still issues to consider.
We developed a novel, interactive human-in-the-loop topic modeling system with a user-friendly interface.
arXiv Detail & Related papers (2023-04-04T13:05:10Z)
- Author Clustering and Topic Estimation for Short Texts [69.54017251622211]
We propose a novel model that expands on Latent Dirichlet Allocation by modeling strong dependence among the words in the same document.
We also simultaneously cluster users, removing the need for post-hoc cluster estimation.
Our method performs as well as, or better than, traditional approaches to problems arising in short texts.
arXiv Detail & Related papers (2021-06-15T20:55:55Z)
- Query-Driven Topic Model [23.07260625816975]
One desirable property of topic models is to allow users to find topics describing a specific aspect of the corpus.
We propose a novel query-driven topic model that allows users to specify a simple query in words or phrases and return query-related topics.
arXiv Detail & Related papers (2021-05-28T22:49:42Z)
- Improving Neural Topic Models using Knowledge Distillation [84.66983329587073]
We use knowledge distillation to combine the best attributes of probabilistic topic models and pretrained transformers.
Our modular method can be straightforwardly applied with any neural topic model to improve topic quality.
arXiv Detail & Related papers (2020-10-05T22:49:16Z)
- Topic Adaptation and Prototype Encoding for Few-Shot Visual Storytelling [81.33107307509718]
We propose a topic adaptive storyteller to model the ability of inter-topic generalization.
We also propose a prototype encoding structure to model the ability of intra-topic derivation.
Experimental results show that topic adaptation and the prototype encoding structure jointly benefit the few-shot model.
arXiv Detail & Related papers (2020-08-11T03:55:11Z)