Automated Sentiment Classification and Topic Discovery in Large-Scale Social Media Streams
- URL: http://arxiv.org/abs/2505.01883v1
- Date: Sat, 03 May 2025 18:04:57 GMT
- Title: Automated Sentiment Classification and Topic Discovery in Large-Scale Social Media Streams
- Authors: Yiwen Lu, Siheng Xiong, Zhaowei Li,
- Abstract summary: We present a framework for large-scale sentiment and topic analysis of Twitter discourse.<n>Our pipeline begins with targeted data collection using conflict-specific keywords.<n>We examine the relationship between sentiment and contextual features such as timestamp, geolocation, and lexical content.
- Score: 3.5279571333221913
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We present a framework for large-scale sentiment and topic analysis of Twitter discourse. Our pipeline begins with targeted data collection using conflict-specific keywords, followed by automated sentiment labeling via multiple pre-trained models to improve annotation robustness. We examine the relationship between sentiment and contextual features such as timestamp, geolocation, and lexical content. To identify latent themes, we apply Latent Dirichlet Allocation (LDA) on partitioned subsets grouped by sentiment and metadata attributes. Finally, we develop an interactive visualization interface to support exploration of sentiment trends and topic distributions across time and regions. This work contributes a scalable methodology for social media analysis in dynamic geopolitical contexts.
Related papers
- DTECT: Dynamic Topic Explorer & Context Tracker [0.8962460460173959]
We introduce DTECT (Dynamic Topic Explorer & Context Tracker), an end-to-end system that bridges the gap between raw textual data and meaningful temporal insights.<n>DTECT provides a unified workflow that supports data preprocessing, multiple model architectures, and dedicated evaluation metrics to analyze the topic quality of temporal topic models.<n>It significantly enhances interpretability by introducing LLM-driven automatic topic labeling, trend analysis via temporally salient words, interactive visualizations with document-level summarization, and a natural language chat interface for intuitive data querying.
arXiv Detail & Related papers (2025-07-10T16:44:33Z) - VisTopics: A Visual Semantic Unsupervised Approach to Topic Modeling of Video and Image Data [0.0]
This study introduces VisTopics, a computational framework designed to analyze large-scale visual datasets.<n>Applying VisTopics to a dataset of 452 NBC News videos resulted in reducing 11,070 frames to 6,928 deduplicated frames, which were then semantically analyzed to uncover 35 topics.
arXiv Detail & Related papers (2025-05-20T19:59:41Z) - Two in One Go: Single-stage Emotion Recognition with Decoupled Subject-context Transformer [78.35816158511523]
We present a single-stage emotion recognition approach, employing a Decoupled Subject-Context Transformer (DSCT) for simultaneous subject localization and emotion classification.
We evaluate our single-stage framework on two widely used context-aware emotion recognition datasets, CAER-S and EMOTIC.
arXiv Detail & Related papers (2024-04-26T07:30:32Z) - Language Guided Domain Generalized Medical Image Segmentation [68.93124785575739]
Single source domain generalization holds promise for more reliable and consistent image segmentation across real-world clinical settings.
We propose an approach that explicitly leverages textual information by incorporating a contrastive learning mechanism guided by the text encoder features.
Our approach achieves favorable performance against existing methods in literature.
arXiv Detail & Related papers (2024-04-01T17:48:15Z) - Decoding Multilingual Topic Dynamics and Trend Identification through ARIMA Time Series Analysis on Social Networks: A Novel Data Translation Framework Enhanced by LDA/HDP Models [0.08246494848934444]
We focus on dialogues within Tunisian social networks during the Coronavirus Pandemic and other notable themes like sports and politics.
We start by aggregating a varied multilingual corpus of comments relevant to these subjects.
We then introduce our No-English-to-English Machine Translation approach to handle linguistic differences.
arXiv Detail & Related papers (2024-03-18T00:01:10Z) - Variational Cross-Graph Reasoning and Adaptive Structured Semantics
Learning for Compositional Temporal Grounding [143.5927158318524]
Temporal grounding is the task of locating a specific segment from an untrimmed video according to a query sentence.
We introduce a new Compositional Temporal Grounding task and construct two new dataset splits.
We argue that the inherent structured semantics inside the videos and language is the crucial factor to achieve compositional generalization.
arXiv Detail & Related papers (2023-01-22T08:02:23Z) - Contextual information integration for stance detection via
cross-attention [59.662413798388485]
Stance detection deals with identifying an author's stance towards a target.
Most existing stance detection models are limited because they do not consider relevant contextual information.
We propose an approach to integrate contextual information as text.
arXiv Detail & Related papers (2022-11-03T15:04:29Z) - Semantic Role Aware Correlation Transformer for Text to Video Retrieval [23.183653281610866]
This paper proposes a novel transformer that explicitly disentangles the text and video into semantic roles of objects, spatial contexts and temporal contexts.
Preliminary results on popular YouCook2 indicate that our approach surpasses a current state-of-the-art method, with a high margin in all metrics.
arXiv Detail & Related papers (2022-06-26T11:28:03Z) - Graph Adaptive Semantic Transfer for Cross-domain Sentiment
Classification [68.06496970320595]
Cross-domain sentiment classification (CDSC) aims to use the transferable semantics learned from the source domain to predict the sentiment of reviews in the unlabeled target domain.
We present Graph Adaptive Semantic Transfer (GAST) model, an adaptive syntactic graph embedding method that is able to learn domain-invariant semantics from both word sequences and syntactic graphs.
arXiv Detail & Related papers (2022-05-18T07:47:01Z) - SocialVisTUM: An Interactive Visualization Toolkit for Correlated Neural
Topic Models on Social Media Opinion Mining [0.07538606213726905]
Recent research in opinion mining proposed word embedding-based topic modeling methods.
We show how these methods can be used to display correlated topic models on social media texts using SocialVisTUM.
arXiv Detail & Related papers (2021-10-20T14:04:13Z) - Author Clustering and Topic Estimation for Short Texts [69.54017251622211]
We propose a novel model that expands on the Latent Dirichlet Allocation by modeling strong dependence among the words in the same document.
We also simultaneously cluster users, removing the need for post-hoc cluster estimation.
Our method performs as well as -- or better -- than traditional approaches to problems arising in short text.
arXiv Detail & Related papers (2021-06-15T20:55:55Z) - Weakly-Supervised Aspect-Based Sentiment Analysis via Joint
Aspect-Sentiment Topic Embedding [71.2260967797055]
We propose a weakly-supervised approach for aspect-based sentiment analysis.
We learn sentiment, aspect> joint topic embeddings in the word embedding space.
We then use neural models to generalize the word-level discriminative information.
arXiv Detail & Related papers (2020-10-13T21:33:24Z) - A Hybrid Approach for Aspect-Based Sentiment Analysis Using Deep
Contextual Word Embeddings and Hierarchical Attention [4.742874328556818]
We extend the state-of-the-art Hybrid Approach for Aspect-Based Sentiment Analysis (HAABSA) in two directions.
First we replace the non-contextual word embeddings with deep contextual word embeddings in order to better cope with the word semantics in a given text.
Second, we use hierarchical attention by adding an extra attention layer to the HAABSA high-level representations in order to increase the method flexibility in modeling the input data.
arXiv Detail & Related papers (2020-04-18T17:54:55Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.