Federated Neural Topic Models
- URL: http://arxiv.org/abs/2212.02269v2
- Date: Sun, 11 Jun 2023 15:22:40 GMT
- Title: Federated Neural Topic Models
- Authors: Lorena Calvo-Bartolomé and Jerónimo Arenas-García
- Abstract summary: Federated topic modeling allows multiple parties to jointly train a topic model without sharing their data.
We propose and analyze a federated implementation based on state-of-the-art neural topic modeling implementations.
In practice, our approach is equivalent to a centralized model training, but preserves the privacy of the nodes.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Over the last years, topic modeling has emerged as a powerful technique for
organizing and summarizing big collections of documents or searching for
particular patterns in them. However, privacy concerns may arise when
cross-analyzing data from different sources. Federated topic modeling solves
this issue by allowing multiple parties to jointly train a topic model without
sharing their data. While several federated approximations of classical topic
models do exist, no research has been conducted on their application for neural
topic models. To fill this gap, we propose and analyze a federated
implementation based on state-of-the-art neural topic modeling implementations,
showing its benefits when there is a diversity of topics across the nodes'
documents and the need to build a joint model. In practice, our approach is
equivalent to a centralized model training, but preserves the privacy of the
nodes. Advantages of this federated scenario are illustrated by means of
experiments using both synthetic and real data scenarios.
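The abstract's claim that federated training can be equivalent to centralized training can be illustrated with a toy example. The following is a minimal sketch, not the authors' implementation: it assumes a plain linear model with a mean-squared-error loss instead of a neural topic model, and it assumes the server averages the nodes' gradients weighted by their sample counts. The function names, node sizes, and data are all hypothetical. Under those assumptions, the aggregated gradient should agree with the gradient computed on the pooled data up to floating-point error, even though no node shares its raw data.

```python
import numpy as np

def local_gradient(weights, X, y):
    """Mean-squared-error gradient computed on one node's private data."""
    residual = X @ weights - y
    return 2.0 * X.T @ residual / len(y)

def federated_gradient(weights, node_data):
    """Assumed server step: average node gradients weighted by sample count."""
    total = sum(len(y) for _, y in node_data)
    weighted = sum(len(y) * local_gradient(weights, X, y) for X, y in node_data)
    return weighted / total

rng = np.random.default_rng(0)

# Three hypothetical nodes holding different amounts of private data.
node_data = [(rng.normal(size=(n, 4)), rng.normal(size=n)) for n in (30, 50, 20)]
weights = rng.normal(size=4)

# Centralized baseline: pool all data in one place (what federation avoids).
X_all = np.vstack([X for X, _ in node_data])
y_all = np.concatenate([y for _, y in node_data])

grad_fed = federated_gradient(weights, node_data)
grad_central = local_gradient(weights, X_all, y_all)

# Compare the two gradients up to floating-point tolerance.
gradients_match = np.allclose(grad_fed, grad_central)
print(gradients_match)
```

Only gradients leave each node in this sketch; whether gradients alone are sufficient for privacy depends on the threat model, which the abstract does not detail.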
Related papers
- Interactive Topic Models with Optimal Transport [75.26555710661908]
We present EdTM, an approach for label name supervised topic modeling.
EdTM models topic modeling as an assignment problem while leveraging LM/LLM based document-topic affinities.
arXiv Detail & Related papers (2024-06-28T13:57:27Z)
- Reinforcing Pre-trained Models Using Counterfactual Images [54.26310919385808]
This paper proposes a novel framework to reinforce classification models using language-guided generated counterfactual images.
We identify model weaknesses by testing the model using the counterfactual image dataset.
We employ the counterfactual images as an augmented dataset to fine-tune and reinforce the classification model.
arXiv Detail & Related papers (2024-06-19T08:07:14Z)
- GINopic: Topic Modeling with Graph Isomorphism Network [0.8962460460173959]
We introduce GINopic, a topic modeling framework based on graph isomorphism networks to capture the correlation between words.
We demonstrate the effectiveness of GINopic compared to existing topic models and highlight its potential for advancing topic modeling.
arXiv Detail & Related papers (2024-04-02T17:18:48Z)
- TopicAdapt- An Inter-Corpora Topics Adaptation Approach [27.450275637652418]
This paper proposes a neural topic model, TopicAdapt, that can adapt relevant topics from a related source corpus and also discover new topics in a target corpus that are absent in the source corpus.
Experiments over multiple datasets from diverse domains show the superiority of the proposed model against the state-of-the-art topic models.
arXiv Detail & Related papers (2023-10-08T02:56:44Z)
- Are Neural Topic Models Broken? [81.15470302729638]
We study the relationship between automated and human evaluation of topic models.
We find that neural topic models fare worse in both respects compared to an established classical method.
arXiv Detail & Related papers (2022-10-28T14:38:50Z)
- Knowledge-Aware Bayesian Deep Topic Model [50.58975785318575]
We propose a Bayesian generative model for incorporating prior domain knowledge into hierarchical topic modeling.
Our proposed model efficiently integrates the prior knowledge and improves both hierarchical topic discovery and document representation.
arXiv Detail & Related papers (2022-09-20T09:16:05Z)
- Statistical Deep Learning for Spatial and Spatio-Temporal Data [0.0]
We present an overview of traditional statistical and machine learning perspectives for modeling spatial and spatio-temporal data.
We then focus on a variety of hybrid models that have recently been developed for latent process, data, and parameter specifications.
These hybrid models integrate modeling ideas with deep neural network models in order to take advantage of the strengths of each modeling paradigm.
arXiv Detail & Related papers (2022-06-05T16:49:10Z)
- Temporal Relevance Analysis for Video Action Models [70.39411261685963]
We first propose a new approach to quantify the temporal relationships between frames captured by CNN-based action models.
We then conduct comprehensive experiments and in-depth analysis to provide a better understanding of how temporal modeling is affected.
arXiv Detail & Related papers (2022-04-25T19:06:48Z)
- BERTopic: Neural topic modeling with a class-based TF-IDF procedure [0.0]
We present BERTopic, a topic model that extends the feasibility of approaching topic modeling as a clustering task.
BERTopic generates coherent topics and remains competitive across a variety of benchmarks involving classical models and those that follow the more recent clustering approach of topic modeling.
arXiv Detail & Related papers (2022-03-11T08:35:15Z)
- Query-Driven Topic Model [23.07260625816975]
One desirable property of topic models is to allow users to find topics describing a specific aspect of the corpus.
We propose a novel query-driven topic model that allows users to specify a simple query in words or phrases and return query-related topics.
arXiv Detail & Related papers (2021-05-28T22:49:42Z)
- Improving Neural Topic Models using Knowledge Distillation [84.66983329587073]
We use knowledge distillation to combine the best attributes of probabilistic topic models and pretrained transformers.
Our modular method can be straightforwardly applied with any neural topic model to improve topic quality.
arXiv Detail & Related papers (2020-10-05T22:49:16Z)
This list is automatically generated from the titles and abstracts of the papers in this site.