Temporal Analysis on Topics Using Word2Vec
- URL: http://arxiv.org/abs/2209.11717v2
- Date: Sun, 17 Sep 2023 18:27:13 GMT
- Title: Temporal Analysis on Topics Using Word2Vec
- Authors: Angad Sandhu, Aneesh Edara, Vishesh Narayan, Faizan Wajid, Ashok Agrawala
- Abstract summary: The present study proposes a novel method of trend detection and visualization - more specifically, modeling the change in a topic over time.
The methodology was tested on a group of articles from various media houses present in the 20 Newsgroups dataset.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The present study proposes a novel method of trend detection and
visualization - more specifically, modeling the change in a topic over time.
Where current models used for the identification and visualization of trends
only convey the popularity of a single word based on stochastic counting of
usage, the approach in the present study illustrates both the popularity of a
topic and the direction in which it is moving. The direction in this case is a
distinct subtopic within the selected corpus. Such trends are generated by
modeling the movement of a topic, using k-means clustering and cosine
similarity to group the distances between clusters over time. In a convergent
scenario, it can be inferred that the topics as a whole are meshing (tokens
between topics are becoming interchangeable). Conversely, a divergent scenario
would imply that each topic's respective tokens would not be found in the same
context (the words are increasingly different from each other). The methodology was tested on
a group of articles from various media houses present in the 20 Newsgroups
dataset.
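The convergent and divergent scenarios described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes the Word2Vec embedding and k-means steps have already produced, for each time period, the token vectors belonging to two topic clusters, and it uses synthetic 3-dimensional vectors in place of real embeddings. The function names (`centroid_similarity_over_time`, `classify_trend`) and the slope-based labeling rule are hypothetical choices made for illustration.

```python
import numpy as np


def cosine_similarity(a, b):
    """Cosine similarity between two 1-D vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))


def centroid_similarity_over_time(slices):
    """Cosine similarity between two topic centroids in each time slice.

    `slices` is a list of (topic_a_vectors, topic_b_vectors) pairs, one pair
    per time period; each element is an array of shape (n_tokens, dim).
    """
    return [
        cosine_similarity(topic_a.mean(axis=0), topic_b.mean(axis=0))
        for topic_a, topic_b in slices
    ]


def classify_trend(sims, tol=1e-6):
    """Label a similarity series by the slope of its least-squares line.

    Assumption for this sketch: rising similarity means the topics are
    converging, falling similarity means they are diverging.
    """
    slope = np.polyfit(np.arange(len(sims)), sims, 1)[0]
    if slope > tol:
        return "convergent"
    if slope < -tol:
        return "divergent"
    return "stable"


# Synthetic stand-in for per-period embeddings (not real Word2Vec output).
rng = np.random.default_rng(0)
base_a = np.array([1.0, 0.0, 0.0])
base_b = np.array([0.0, 1.0, 0.0])

slices = []
for t in range(4):
    mix = 0.2 * t  # topic B drifts toward topic A as time advances
    center_b = (1 - mix) * base_b + mix * base_a
    topic_a = base_a + 0.01 * rng.standard_normal((20, 3))
    topic_b = center_b + 0.01 * rng.standard_normal((20, 3))
    slices.append((topic_a, topic_b))

sims = centroid_similarity_over_time(slices)
trend = classify_trend(sims)
print([round(s, 3) for s in sims], trend)
```

With real data, each slice would instead hold the vectors of the tokens assigned to each cluster in that period, with embeddings trained so that they are comparable across periods.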
Related papers
- CAST: Corpus-Aware Self-similarity Enhanced Topic modelling [16.562349140796115]
We introduce CAST: Corpus-Aware Self-similarity Enhanced Topic modelling, a novel topic modelling method.
We find self-similarity to be an effective metric to prevent functional words from acting as candidate topic words.
Our approach significantly enhances the coherence and diversity of generated topics, as well as the topic model's ability to handle noisy data.
arXiv Detail & Related papers (2024-10-19T15:27:11Z)
- Visualizing Temporal Topic Embeddings with a Compass [1.5184974790808403]
This paper proposes an expansion of the compass-aligned temporal Word2Vec methodology into dynamic topic modeling.
Such a method allows for the direct comparison of word and document embeddings across time in dynamic topics.
arXiv Detail & Related papers (2024-09-16T18:29:19Z)
- Variational Cross-Graph Reasoning and Adaptive Structured Semantics Learning for Compositional Temporal Grounding [143.5927158318524]
Temporal grounding is the task of locating a specific segment from an untrimmed video according to a query sentence.
We introduce a new Compositional Temporal Grounding task and construct two new dataset splits.
We argue that the inherent structured semantics inside the videos and language is the crucial factor to achieve compositional generalization.
arXiv Detail & Related papers (2023-01-22T08:02:23Z)
- Topics in Contextualised Attention Embeddings [7.6650522284905565]
Recent work has demonstrated that conducting clustering on the word-level contextual representations from a language model emulates word clusters that are discovered in latent topics of words from Latent Dirichlet Allocation.
The important question is how such topical word clusters are automatically formed, through clustering, in the language model when it has not been explicitly designed to model latent topics.
Using BERT and DistilBERT, we find that the attention framework plays a key role in modelling such word topic clusters.
arXiv Detail & Related papers (2023-01-11T07:26:19Z)
- Twitter Topic Classification [15.306383757213956]
We present a new task based on tweet topic classification and release two associated datasets.
Given a wide range of topics covering the most important discussion points in social media, we provide training and testing data.
We perform a quantitative evaluation and analysis of current general- and domain-specific language models on the task.
arXiv Detail & Related papers (2022-09-20T16:13:52Z)
- Compositional Temporal Grounding with Structured Variational Cross-Graph Correspondence Learning [92.07643510310766]
Temporal grounding in videos aims to localize one target video segment that semantically corresponds to a given query sentence.
We introduce a new Compositional Temporal Grounding task and construct two new dataset splits.
We empirically find that current temporal grounding models fail to generalize to queries with novel combinations of seen words.
We propose a variational cross-graph reasoning framework that explicitly decomposes video and language into multiple structured hierarchies.
arXiv Detail & Related papers (2022-03-24T12:55:23Z)
- SCoT: Sense Clustering over Time: a tool for the analysis of lexical change [79.80787569986283]
We present Sense Clustering over Time (SCoT), a novel network-based tool for analysing lexical change.
SCoT represents the meanings of a word as clusters of similar words.
It has been successfully used in a European study on the changing meaning of 'crisis'.
arXiv Detail & Related papers (2022-03-18T12:04:09Z)
- TopicNet: Semantic Graph-Guided Topic Discovery [51.71374479354178]
Existing deep hierarchical topic models are able to extract semantically meaningful topics from a text corpus in an unsupervised manner.
We introduce TopicNet as a deep hierarchical topic model that can inject prior structural knowledge as an inductive bias to influence learning.
arXiv Detail & Related papers (2021-10-27T09:07:14Z)
- Time Series Analysis via Network Science: Concepts and Algorithms [62.997667081978825]
This review provides a comprehensive overview of existing mapping methods for transforming time series into networks.
We describe the main conceptual approaches, provide authoritative references and give insight into their advantages and limitations in a unified notation and language.
Although still very recent, this research area has much potential and with this survey we intend to pave the way for future research on the topic.
arXiv Detail & Related papers (2021-10-11T13:33:18Z)
- Author Clustering and Topic Estimation for Short Texts [69.54017251622211]
We propose a novel model that expands on the Latent Dirichlet Allocation by modeling strong dependence among the words in the same document.
We also simultaneously cluster users, removing the need for post-hoc cluster estimation.
Our method performs as well as, or better than, traditional approaches to problems arising in short text.
arXiv Detail & Related papers (2021-06-15T20:55:55Z)
This list is automatically generated from the titles and abstracts of the papers in this site.