Related papers: A Human Word Association based model for topic detection in social networks

A Human Word Association based model for topic detection in social networks

URL: http://arxiv.org/abs/2301.13066v2
Date: Tue, 18 Jul 2023 11:39:22 GMT
Title: A Human Word Association based model for topic detection in social networks
Authors: Mehrdad Ranjbar Khadivi, Shahin Akbarpour, Mohammad-Reza Feizi-Derakhshi, Babak Anari
Abstract summary: This paper uses the Concept of the Imitation of the Mental Ability of Word Association to propose a topic detection framework in social networks. A special extraction algorithm has also been designed for this purpose. The performance of this method is evaluated on the FA-CUP dataset.
Score: 3.8137985834223507
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: With the widespread use of social networks, detecting the topics discussed in these networks has become a significant challenge. The current works are mainly based on frequent pattern mining or semantic relations, and the language structure is not considered. The meaning of language structural methods is to discover the relationship between words and how humans understand them. Therefore, this paper uses the Concept of the Imitation of the Mental Ability of Word Association to propose a topic detection framework in social networks. This framework is based on the Human Word Association method. A special extraction algorithm has also been designed for this purpose. The performance of this method is evaluated on the FA-CUP dataset. It is a benchmark dataset in the field of topic detection. The results show that the proposed method is a good improvement compared to other methods, based on the Topic-recall and the keyword F1 measure. Also, most of the previous works in the field of topic detection are limited to the English language, and the Persian language, especially microblogs written in this language, is considered a low-resource language. Therefore, a data set of Telegram posts in the Farsi language has been collected. Applying the proposed method to this dataset also shows that this method works better than other topic detection methods.

Related papers

Language Model Decoding as Direct Metrics Optimization [87.68281625776282]
Current decoding methods struggle to generate texts that align with human texts across different aspects. In this work, we frame decoding from a language model as an optimization problem with the goal of strictly matching the expected performance with human texts. We prove that this induced distribution is guaranteed to improve the perplexity on human texts, which suggests a better approximation to the underlying distribution of human texts.
arXiv Detail & Related papers (2023-10-02T09:35:27Z)
Topics in the Haystack: Extracting and Evaluating Topics beyond Coherence [0.0]
We propose a method that incorporates a deeper understanding of both sentence and document themes. This allows our model to detect latent topics that may include uncommon words or neologisms. We present correlation coefficients with human identification of intruder words and achieve near-human level results at the word-intrusion task.
arXiv Detail & Related papers (2023-03-30T12:24:25Z)
Persian topic detection based on Human Word association and graph embedding [3.8137985834223507]
We propose a framework to detect topics in social media based on Human Word Association. Most of the work done in this area is in English, but much has been done in the Persian language.
arXiv Detail & Related papers (2023-02-20T05:46:47Z)
Representation Learning for Resource-Constrained Keyphrase Generation [78.02577815973764]
We introduce salient span recovery and salient span prediction as guided denoising language modeling objectives. We show the effectiveness of the proposed approach for low-resource and zero-shot keyphrase generation.
arXiv Detail & Related papers (2022-03-15T17:48:04Z)
Semantic Search for Large Scale Clinical Ontologies [63.71950996116403]
We present a deep learning approach to build a search system for large clinical vocabularies. We propose a Triplet-BERT model and a method that generates training data based on semantic training data. The model is evaluated using five real benchmark data sets and the results show that our approach achieves high results on both free text to concept and concept to searching concept vocabularies.
arXiv Detail & Related papers (2022-01-01T05:15:42Z)
Phonetic Word Embeddings [1.2192936362342826]
We present a novel methodology for calculating the phonetic similarity between words taking motivation from the human perception of sounds. This metric is employed to learn a continuous vector embedding space that groups similar sounding words together. The efficacy of the method is presented for two different languages (English, Hindi) and performance gains over previous reported works are discussed.
arXiv Detail & Related papers (2021-09-30T01:46:01Z)
Context Decoupling Augmentation for Weakly Supervised Semantic Segmentation [53.49821324597837]
Weakly supervised semantic segmentation is a challenging problem that has been deeply studied in recent years. We present a Context Decoupling Augmentation ( CDA) method to change the inherent context in which the objects appear. To validate the effectiveness of the proposed method, extensive experiments on PASCAL VOC 2012 dataset with several alternative network architectures demonstrate that CDA can boost various popular WSSS methods to the new state-of-the-art by a large margin.
arXiv Detail & Related papers (2021-03-02T15:05:09Z)
Unsupervised Key-phrase Extraction and Clustering for Classification Scheme in Scientific Publications [0.0]
We investigate possible ways of automating parts of the Systematic Mapping (SM) and Systematic Review (SR) process. Key-phrases are extracted from scientific documents using unsupervised methods, which are then used to construct the corresponding Classification Scheme. We also explore how clustering can be used to group related key-phrases.
arXiv Detail & Related papers (2021-01-25T10:17:33Z)
Keyphrase Extraction with Dynamic Graph Convolutional Networks and Diversified Inference [50.768682650658384]
Keyphrase extraction (KE) aims to summarize a set of phrases that accurately express a concept or a topic covered in a given document. Recent Sequence-to-Sequence (Seq2Seq) based generative framework is widely used in KE task, and it has obtained competitive performance on various benchmarks. In this paper, we propose to adopt the Dynamic Graph Convolutional Networks (DGCN) to solve the above two problems simultaneously.
arXiv Detail & Related papers (2020-10-24T08:11:23Z)
Evolution of Semantic Similarity -- A Survey [8.873705500708196]
Estimating semantic similarity between text data is a challenging and open research problem in the field of Natural Language Processing (NLP) Various semantic similarity methods have been proposed over the years to address this issue. This survey article traces the evolution of such methods, categorizing them based on their underlying principles as knowledge-based, corpus-based, deep neural network-based methods, and hybrid methods.
arXiv Detail & Related papers (2020-04-19T22:07:39Z)
How Far are We from Effective Context Modeling? An Exploratory Study on Semantic Parsing in Context [59.13515950353125]
We present a grammar-based decoding semantic parsing and adapt typical context modeling methods on top of it. We evaluate 13 context modeling methods on two large cross-domain datasets, and our best model achieves state-of-the-art performances.
arXiv Detail & Related papers (2020-02-03T11:28:10Z)

This list is automatically generated from the titles and abstracts of the papers in this site.