Twitmo: A Twitter Data Topic Modeling and Visualization Package for R
- URL: http://arxiv.org/abs/2207.11236v1
- Date: Fri, 8 Jul 2022 12:23:20 GMT
- Title: Twitmo: A Twitter Data Topic Modeling and Visualization Package for R
- Authors: Andreas Buchmüller, Gillian Kant, Christoph Weisser, Benjamin Säfken, Krisztina Kis-Katos, Thomas Kneib
- Abstract summary: Twitmo provides a broad range of methods to collect, pre-process, analyze and visualize geo-tagged Twitter data.
One of the innovations of the package is the automatic pooling of Tweets into longer pseudo-documents using hashtags.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We present Twitmo, a package that provides a broad range of methods to
collect, pre-process, analyze and visualize geo-tagged Twitter data. Twitmo
enables the user to collect geo-tagged Tweets from Twitter and provides a
comprehensive and user-friendly toolbox to generate topic distributions from
Latent Dirichlet Allocation (LDA), correlated topic models (CTM) and
structural topic models (STM). Functions are included for pre-processing of
text, model building and prediction. In addition, one of the innovations of the
package is the automatic pooling of Tweets into longer pseudo-documents using
hashtags and cosine similarities for better topic coherence. The package
additionally comes with functionality to visualize collected data sets and
fitted models in static as well as interactive ways and offers built-in support
for model visualizations via LDAvis, providing great convenience for researchers
in this area. The Twitmo package is an innovative toolbox that can be used to
analyze public discourse of various topics, political parties or persons of
interest in space and time.
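The hashtag pooling idea described in the abstract can be illustrated with a small sketch. This is not Twitmo's implementation (Twitmo is an R package, and its internals are not shown here); it is a minimal Python illustration under stated assumptions: tweets that share a hashtag are merged into one pseudo-document, and each untagged tweet is added to the most similar pool when its cosine similarity on raw term counts reaches a threshold. The function names, the tokenizer, and the threshold of 0.3 are all hypothetical choices made for this toy example.

```python
import math
import re
from collections import Counter, defaultdict

HASHTAG = re.compile(r"#(\w+)")
TOKEN = re.compile(r"[a-z0-9']+")


def tokenize(text):
    """Lowercase a tweet and split it into word tokens (hashtag words included)."""
    return TOKEN.findall(text.lower())


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors (Counters)."""
    dot = sum(count * b[term] for term, count in a.items() if term in b)
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def pool_by_hashtag(tweets, threshold=0.3):
    """Pool tweets into pseudo-documents keyed by hashtag.

    Returns (pools, leftovers): pools maps each hashtag to the pooled term
    counts; leftovers lists untagged tweets that matched no pool.
    The threshold is an illustrative value, not a recommended default.
    """
    pools = defaultdict(Counter)
    untagged = []
    for tweet in tweets:
        tags = {tag.lower() for tag in HASHTAG.findall(tweet)}
        if not tags:
            untagged.append(tweet)
            continue
        for tag in tags:  # a tweet with several hashtags joins several pools
            pools[tag].update(tokenize(tweet))

    # Compare untagged tweets against a snapshot of the hashtag-only pools,
    # so that earlier assignments do not influence later ones.
    base = {tag: Counter(counts) for tag, counts in pools.items()}
    leftovers = []
    for tweet in untagged:
        vec = Counter(tokenize(tweet))
        best_tag, best_sim = None, 0.0
        for tag, pool_vec in base.items():
            sim = cosine(vec, pool_vec)
            if sim > best_sim:
                best_tag, best_sim = tag, sim
        if best_tag is not None and best_sim >= threshold:
            pools[best_tag].update(vec)
        else:
            leftovers.append(tweet)
    return dict(pools), leftovers


tweets = [
    "Heavy rain and flooding downtown #flood",
    "River levels rising, flooding expected #flood",
    "Great goal in the match tonight #football",
    "Flooding on the river road, rain continues",
    "Lunch was nice",
]
pools, leftovers = pool_by_hashtag(tweets, threshold=0.3)
print(sorted(pools))
print(leftovers)
```

In this toy data the untagged tweet about flooding shares enough vocabulary with the `flood` pool to be merged into it, while the unrelated tweet stays unpooled. A real pipeline would typically remove stop words and weight terms (for example with TF-IDF) before computing similarities.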
Related papers
- Explaining Datasets in Words: Statistical Models with Natural Language Parameters [66.69456696878842]
We introduce a family of statistical models -- including clustering, time series, and classification models -- parameterized by natural language predicates.
We apply our framework to a wide range of problems: taxonomizing user chat dialogues, characterizing how they evolve across time, finding categories where one language model is better than the other.
arXiv Detail & Related papers (2024-09-13T01:40:20Z)
- Interactive Topic Models with Optimal Transport [75.26555710661908]
We present EdTM, an approach for label name supervised topic modeling.
EdTM models topic modeling as an assignment problem while leveraging LM/LLM based document-topic affinities.
arXiv Detail & Related papers (2024-06-28T13:57:27Z)
- A Review of Modern Recommender Systems Using Generative Models (Gen-RecSys) [57.30228361181045]
This survey connects key advancements in recommender systems using Generative Models (Gen-RecSys).
It covers: interaction-driven generative models; the use of large language models (LLM) and textual data for natural language recommendation; and the integration of multimodal models for generating and processing images/videos in RS.
Our work highlights necessary paradigms for evaluating the impact and harm of Gen-RecSys and identifies open challenges.
arXiv Detail & Related papers (2024-03-31T06:57:57Z)
- Sig-Networks Toolkit: Signature Networks for Longitudinal Language Modelling [14.619019557308807]
We present an open-source, pip installable toolkit, Sig-Networks, for longitudinal language modelling.
A central focus is the incorporation of Signature-based Neural Network models, which have recently shown success in temporal tasks.
We release the Toolkit as a PyTorch package with an introductory video and Git repositories for preprocessing and modelling, including sample notebooks on the modeled NLP tasks.
arXiv Detail & Related papers (2023-12-06T14:34:30Z)
- Tweet Insights: A Visualization Platform to Extract Temporal Insights from Twitter [19.591692602304494]
This paper introduces a large collection of time series data derived from Twitter.
This data comprises the past five years and captures changes in n-gram frequency, similarity, sentiment and topic distribution.
The interface built on top of this data enables temporal analysis for detecting and characterizing shifts in meaning.
arXiv Detail & Related papers (2023-08-04T05:39:26Z)
- Zero-shot Composed Text-Image Retrieval [72.43790281036584]
We consider the problem of composed image retrieval (CIR).
It aims to train a model that can fuse multi-modal information, e.g., text and images, to accurately retrieve images that match the query, extending the user's expression ability.
arXiv Detail & Related papers (2023-06-12T17:56:01Z)
- Improved Topic modeling in Twitter through Community Pooling [0.0]
Twitter posts are short and often less coherent than other text documents.
We propose a new pooling scheme for topic modeling in Twitter, which groups tweets whose authors belong to the same community.
Results show that our Community pooling method outperformed other methods on the majority of metrics in two heterogeneous datasets.
arXiv Detail & Related papers (2021-12-20T17:05:32Z)
- SocialVisTUM: An Interactive Visualization Toolkit for Correlated Neural Topic Models on Social Media Opinion Mining [0.07538606213726905]
Recent research in opinion mining proposed word embedding-based topic modeling methods.
We show how these methods can be used to display correlated topic models on social media texts using SocialVisTUM.
arXiv Detail & Related papers (2021-10-20T14:04:13Z)
- Author Clustering and Topic Estimation for Short Texts [69.54017251622211]
We propose a novel model that expands on the Latent Dirichlet Allocation by modeling strong dependence among the words in the same document.
We also simultaneously cluster users, removing the need for post-hoc cluster estimation.
Our method performs as well as, or better than, traditional approaches to problems arising in short text.
arXiv Detail & Related papers (2021-06-15T20:55:55Z)
- STAGE: Tool for Automated Extraction of Semantic Time Cues to Enrich Neural Temporal Ordering Models [4.6150532698347835]
We develop STAGE, a system that can automatically extract time cues and convert them into representations suitable for integration with neural models.
We demonstrate promising results on two event ordering datasets, and highlight important issues in semantic cue representation and integration for future research.
arXiv Detail & Related papers (2021-05-15T23:34:02Z)
- COOKIE: A Dataset for Conversational Recommendation over Knowledge Graphs in E-commerce [64.95907840457471]
We present a new dataset for conversational recommendation over knowledge graphs in e-commerce platforms called COOKIE.
The dataset is constructed from an Amazon review corpus by integrating both user-agent dialogue and custom knowledge graphs for recommendation.
arXiv Detail & Related papers (2020-08-21T00:11:31Z)
This list is automatically generated from the titles and abstracts of the papers on this site.