Persian topic detection based on Human Word association and graph
embedding
- URL: http://arxiv.org/abs/2302.09775v2
- Date: Tue, 18 Jul 2023 10:19:50 GMT
- Title: Persian topic detection based on Human Word association and graph
embedding
- Authors: Mehrdad Ranjbar-Khadivi, Shahin Akbarpour, Mohammad-Reza
Feizi-Derakhshi, Babak Anari
- Abstract summary: We propose a framework to detect topics in social media based on Human Word Association.
Most of the work in this area has been done in English, and little has been done for the Persian language.
- Score: 3.8137985834223507
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we propose a framework to detect topics in social media based
on Human Word Association (HWA). Identifying the topics discussed in these media
has become a significant challenge. Most of the work in this area has been done
in English, and little has been done for the Persian language, especially for
microblogs written in Persian. In addition, existing work has focused on
exploring frequent patterns or semantic relationships and has ignored the
structural methods of language. The proposed framework uses HWA, a method that
imitates the human mental ability of word association. The method also
calculates the Associative Gravity Force, which shows how words are related.
Using this parameter, a graph can be generated. Topics are then extracted by
embedding this graph and applying clustering methods. The approach has been
applied to a Persian-language dataset collected from Telegram. Several
experimental studies have been performed to evaluate the framework's
performance. The experimental results show that this approach works better than
other topic detection methods.
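The pipeline described in the abstract (word-association weights, then a graph, then an embedding, then clustering) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the weight in `association_weights` is a simple co-occurrence ratio standing in for the Associative Gravity Force, whose actual formula is not given in this summary; the "embedding" is just each word's normalised row of edge weights; and the clustering is a plain k-means. The function names and the toy English posts are hypothetical and are not taken from the paper.

```python
import math
from collections import defaultdict
from itertools import combinations


def association_weights(posts):
    """Placeholder for the Associative Gravity Force (NOT the paper's formula):
    co-occurrence count normalised by the geometric mean of word frequencies."""
    freq, co = defaultdict(int), defaultdict(int)
    for post in posts:
        words = sorted(set(post))
        for w in words:
            freq[w] += 1
        for a, b in combinations(words, 2):
            co[(a, b)] += 1
    return {(a, b): n / math.sqrt(freq[a] * freq[b]) for (a, b), n in co.items()}


def embed(weights):
    """Toy graph embedding: each word is its unit-length row of edge weights."""
    vocab = sorted({w for pair in weights for w in pair})
    index = {w: i for i, w in enumerate(vocab)}
    rows = {w: [0.0] * len(vocab) for w in vocab}
    for (a, b), s in weights.items():
        rows[a][index[b]] = rows[b][index[a]] = s
    for w in vocab:
        rows[w][index[w]] = 1.0  # self-loop so neighbours share a coordinate
        norm = math.sqrt(sum(x * x for x in rows[w]))
        rows[w] = [x / norm for x in rows[w]]
    return rows


def dist2(u, v):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(u, v))


def nearest(vec, centers):
    """Index of the center closest to vec."""
    return min(range(len(centers)), key=lambda i: dist2(vec, centers[i]))


def kmeans(vectors, k, iterations=10):
    """Plain k-means with deterministic farthest-point initialisation."""
    words = sorted(vectors)
    centers = [vectors[words[0]]]
    while len(centers) < k:
        far = max(words, key=lambda w: min(dist2(vectors[w], c) for c in centers))
        centers.append(vectors[far])
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for w in words:
            groups[nearest(vectors[w], centers)].append(vectors[w])
        # Recompute each center as the mean of its group; keep it if the group is empty.
        centers = [
            [sum(col) / len(g) for col in zip(*g)] if g else centers[i]
            for i, g in enumerate(groups)
        ]
    clusters = [[] for _ in range(k)]
    for w in words:
        clusters[nearest(vectors[w], centers)].append(w)
    return clusters


def detect_topics(posts, k):
    """Weights -> graph embedding -> clusters of words (one cluster per topic)."""
    return kmeans(embed(association_weights(posts)), k)


# Toy tokenised posts (hypothetical data, in English for readability).
posts = [
    ["football", "goal", "match"],
    ["football", "match", "team"],
    ["goal", "team", "match"],
    ["election", "vote", "party"],
    ["election", "party", "candidate"],
    ["vote", "candidate", "election"],
]
weights = association_weights(posts)
topics = detect_topics(posts, k=2)
for topic in topics:
    print(topic)
```

On this toy input the two word groups never co-occur, so their embedding rows are orthogonal and k-means separates them cleanly. A real system would presumably replace the placeholder weight with the paper's Associative Gravity Force and the row-based vectors with a proper graph embedding method.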
Related papers
- A comprehensive study on Frequent Pattern Mining and Clustering categories for topic detection in Persian text stream [6.446062819763263]
The aim of this study is to conduct an extensive study on the best algorithms for topic detection.
The text of Persian social network posts is used as the dataset.
The results indicate that if we are searching for keyword-topics that are easily understandable by humans, the hybrid category is better.
(2024-03-15T12:08:58Z)
- Topics in the Haystack: Extracting and Evaluating Topics beyond Coherence [0.0]
We propose a method that incorporates a deeper understanding of both sentence and document themes.
This allows our model to detect latent topics that may include uncommon words or neologisms.
We present correlation coefficients with human identification of intruder words and achieve near-human level results at the word-intrusion task.
(2023-03-30T12:24:25Z)
- A Human Word Association based model for topic detection in social networks [1.8749305679160366]
This paper introduces a topic detection framework for social networks based on the concept of imitating the mental ability of word association.
The performance of this framework is evaluated using the FA-CUP dataset, a benchmark in the field of topic detection.
(2023-01-30T17:10:34Z)
- Fine-Grained Visual Entailment [51.66881737644983]
We propose an extension of this task, where the goal is to predict the logical relationship of fine-grained knowledge elements within a piece of text to an image.
Unlike prior work, our method is inherently explainable and makes logical predictions at different levels of granularity.
We evaluate our method on a new dataset of manually annotated knowledge elements and show that our method achieves 68.18% accuracy at this challenging task.
(2022-03-29T16:09:38Z)
- A Case Study to Reveal if an Area of Interest has a Trend in Ongoing Tweets Using Word and Sentence Embeddings [0.0]
We have proposed an easily applicable automated methodology in which the Daily Mean Similarity Scores show the similarity between the daily tweet corpus and the target words.
The Daily Mean Similarity Scores are mainly based on cosine similarity and word/sentence embeddings.
We have also compared the effectiveness of word versus sentence embeddings in our methodology and found that both give almost the same results.
(2021-10-02T18:44:55Z)
- Context Dependent Semantic Parsing: A Survey [56.69006903481575]
Semantic parsing is the task of translating natural language utterances into machine-readable meaning representations.
Currently, most semantic parsing methods are not able to utilize contextual information.
To address this issue, context dependent semantic parsing has recently drawn a lot of attention.
(2020-11-02T07:51:05Z)
- Be More with Less: Hypergraph Attention Networks for Inductive Text Classification [56.98218530073927]
Graph neural networks (GNNs) have received increasing attention in the research community and demonstrated their promising results on this canonical task.
Despite the success, their performance could be largely jeopardized in practice since they are unable to capture high-order interaction between words.
We propose a principled model -- hypergraph attention networks (HyperGAT) which can obtain more expressive power with less computational consumption for text representation learning.
(2020-11-01T00:21:59Z)
- A Survey on Text Classification: From Shallow to Deep Learning [83.47804123133719]
The last decade has seen a surge of research in this area due to the unprecedented success of deep learning.
This paper fills the gap by reviewing the state-of-the-art approaches from 1961 to 2021.
We create a taxonomy for text classification according to the text involved and the models used for feature extraction and classification.
(2020-08-02T00:09:03Z)
- A novel approach to sentiment analysis in Persian using discourse and external semantic information [0.0]
Many approaches have been proposed to extract the sentiment of individuals from documents written in natural languages.
The majority of these approaches have focused on English, while resource-lean languages such as Persian suffer from the lack of research work and language resources.
Due to this gap in Persian, the current work introduces new methods for sentiment analysis, which have been applied to Persian.
(2020-07-18T18:40:40Z)
- A computational model implementing subjectivity with the 'Room Theory'. The case of detecting Emotion from Text [68.8204255655161]
This work introduces a new method to consider subjectivity and general context dependency in text analysis.
By using a similarity measure between words, we are able to extract the relative relevance of the elements in the benchmark.
This method could be applied to all the cases where evaluating subjectivity is relevant to understand the relative value or meaning of a text.
(2020-05-12T21:26:04Z)
- Local-Global Video-Text Interactions for Temporal Grounding [77.5114709695216]
This paper addresses the problem of text-to-video temporal grounding, which aims to identify the time interval in a video semantically relevant to a text query.
We tackle this problem using a novel regression-based model that learns to extract a collection of mid-level features for semantic phrases in a text query.
The proposed method effectively predicts the target time interval by exploiting contextual information from local to global.
(2020-04-16T08:10:41Z)