Covid-Transformer: Detecting COVID-19 Trending Topics on Twitter Using
Universal Sentence Encoder
- URL: http://arxiv.org/abs/2009.03947v3
- Date: Sat, 19 Sep 2020 21:10:57 GMT
- Title: Covid-Transformer: Detecting COVID-19 Trending Topics on Twitter Using
Universal Sentence Encoder
- Authors: Meysam Asgari-Chenaghlu, Narjes Nikzad-Khasmakhi, Shervin Minaee
- Abstract summary: The novel corona-virus disease (also known as COVID-19) has led to a pandemic, impacting more than 200 countries across the globe.
With its global impact, COVID-19 has become a major concern of people almost everywhere.
We try to analyze the tweets and detect the trending topics and major concerns of people on Twitter.
- Score: 7.305019142196582
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The novel corona-virus disease (also known as COVID-19) has led to a
pandemic, impacting more than 200 countries across the globe. With its global
impact, COVID-19 has become a major concern of people almost everywhere, and
a large number of tweets about COVID-19 related topics are therefore coming
from every corner of the world. In this work, we analyze these tweets to detect
the trending topics and major concerns of people on Twitter, which can help us
better understand the situation and plan accordingly. More specifically, we
propose a model based on the universal sentence encoder to detect the main
topics of tweets in recent months. We use the universal sentence encoder to
derive the semantic representation and the similarity of tweets. We then feed
the sentence embeddings and their similarities to the K-means clustering
algorithm to group semantically similar tweets. After that, a summary of each
cluster is obtained using a text summarization algorithm based on deep
learning, which can uncover the underlying topics of each cluster. Through
experimental results, we show that our model can detect very informative
topics by processing a large number of tweets at the sentence level (which can
preserve the overall meaning of the tweets). Since this framework places no
restriction on the data distribution, it can be used to detect trending topics
on any other social media platform and in contexts other than COVID-19.
Experimental results show the superiority of our proposed approach over
baselines including TF-IDF and latent Dirichlet allocation (LDA).
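The pipeline described in the abstract (embed each tweet, cluster the embeddings with K-means, then summarize each cluster) can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation. Two stand-ins are assumptions made here so the example runs with the standard library only: `bow_embedder` is a toy bag-of-words embedding used in place of the universal sentence encoder, and `representatives` picks the tweet nearest to each centroid in place of the deep-learning summarizer. All function names and the sample tweets are hypothetical.

```python
import math


def bow_embedder(corpus):
    """Build a toy embed(text) function: L2-normalised bag-of-words counts.

    Placeholder for a real sentence encoder such as the universal sentence encoder.
    """
    vocab = {}
    for text in corpus:
        for tok in text.lower().split():
            vocab.setdefault(tok, len(vocab))

    def embed(text):
        vec = [0.0] * len(vocab)
        for tok in text.lower().split():
            if tok in vocab:
                vec[vocab[tok]] += 1.0
        norm = math.sqrt(sum(x * x for x in vec)) or 1.0
        return [x / norm for x in vec]

    return embed


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iters=20):
    """Lloyd's algorithm with a deterministic farthest-point initialisation."""
    centroids = [points[0]]
    while len(centroids) < k:
        # Next seed: the point farthest from all centroids chosen so far.
        centroids.append(
            max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        )
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point goes to its nearest centroid.
        labels = [
            min(range(k), key=lambda j: sq_dist(p, centroids[j])) for p in points
        ]
        # Update step: each centroid becomes the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


def representatives(tweets, points, labels, centroids):
    """Crude cluster 'summary': the tweet closest to each centroid."""
    reps = {}
    for j, c in enumerate(centroids):
        idx = [i for i, lab in enumerate(labels) if lab == j]
        if idx:
            reps[j] = tweets[min(idx, key=lambda i: sq_dist(points[i], c))]
    return reps


# Hypothetical sample data, invented for illustration only.
tweets = [
    "vaccine trial results look promising",
    "new vaccine trial starts next week",
    "vaccine trial volunteers needed",
    "lockdown rules extended in the city",
    "city lockdown rules are confusing",
    "lockdown rules change again",
]
embed = bow_embedder(tweets)
points = [embed(t) for t in tweets]
labels, centroids = kmeans(points, k=2)
summaries = representatives(tweets, points, labels, centroids)
for cluster_id, text in sorted(summaries.items()):
    print(cluster_id, text)
```

Because the embedding function is passed around as an ordinary callable, swapping the toy embedder for a real sentence encoder should only require replacing `embed`; the clustering and representative-selection steps do not depend on how the vectors were produced.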
Related papers
- Understanding writing style in social media with a supervised
contrastively pre-trained transformer [57.48690310135374]
Online Social Networks serve as fertile ground for harmful behavior, ranging from hate speech to the dissemination of disinformation.
We introduce the Style Transformer for Authorship Representations (STAR), trained on a large corpus derived from public sources of 4.5 x 10^6 authored texts.
Using a support base of 8 documents of 512 tokens, we can discern authors from sets of up to 1616 authors with at least 80% accuracy.
arXiv Detail & Related papers (2023-10-17T09:01:17Z)
- Twitter-COMMs: Detecting Climate, COVID, and Military Multimodal
Misinformation [83.2079454464572]
This paper describes our approach to the Image-Text Inconsistency Detection challenge of the DARPA Semantic Forensics (SemaFor) Program.
We collect Twitter-COMMs, a large-scale multimodal dataset with 884k tweets relevant to the topics of Climate Change, COVID-19, and Military Vehicles.
We train our approach, based on the state-of-the-art CLIP model, leveraging automatically generated random and hard negatives.
arXiv Detail & Related papers (2021-12-16T03:37:20Z)
- Extracting Feelings of People Regarding COVID-19 by Social Network
Mining [0.0]
A dataset of COVID-related tweets in the English language is collected.
More than two million tweets from March 23 to June 23 of 2020 are analyzed.
arXiv Detail & Related papers (2021-10-12T16:45:33Z)
- A Case Study to Reveal if an Area of Interest has a Trend in Ongoing
Tweets Using Word and Sentence Embeddings [0.0]
We have proposed an easily applicable automated methodology in which the Daily Mean Similarity Scores show the similarity between the daily tweet corpus and the target words.
The Daily Mean Similarity Scores are mainly based on cosine similarity and word/sentence embeddings.
We have also compared the effectiveness of word versus sentence embeddings when applying our methodology and found that both give almost the same results.
arXiv Detail & Related papers (2021-10-02T18:44:55Z)
- Exploiting BERT For Multimodal Target Sentiment Classification Through
Input Space Translation [75.82110684355979]
We introduce a two-stream model that translates images in input space using an object-aware transformer.
We then leverage the translation to construct an auxiliary sentence that provides multimodal information to a language model.
We achieve state-of-the-art performance on two multimodal Twitter datasets.
arXiv Detail & Related papers (2021-08-03T18:02:38Z)
- Sentiment analysis in tweets: an assessment study from classical to
modern text representation models [59.107260266206445]
Short texts published on Twitter have earned significant attention as a rich source of information.
Their inherent characteristics, such as the informal and noisy linguistic style, remain challenging to many natural language processing (NLP) tasks.
This study assesses existing language models in distinguishing the sentiment expressed in tweets by using a rich collection of 22 datasets.
arXiv Detail & Related papers (2021-05-29T21:05:28Z)
- Understanding Information Spreading Mechanisms During COVID-19 Pandemic
by Analyzing the Impact of Tweet Text and User Features for Retweet
Prediction [6.658785818853953]
COVID-19 has affected the world economy and the daily life routine of almost everyone.
Social media platforms enable users to share information with other users who can reshare this information.
We propose two CNN and RNN based models and evaluate the performance of these models on a publicly available TweetsCOV19 dataset.
arXiv Detail & Related papers (2021-05-26T15:55:58Z)
- ComStreamClust: a communicative multi-agent approach to text clustering
in streaming data [1.9949261242626626]
We propose a novel, multi-agent, communicative clustering approach, called ComStreamClust, for clustering sub-topics inside a broader topic.
The proposed approach is parallelizable and can handle several data points simultaneously.
ComStreamClust has been evaluated on two datasets: COVID-19 and FA CUP.
arXiv Detail & Related papers (2020-10-11T21:19:19Z)
- Tweet to News Conversion: An Investigation into Unsupervised
Controllable Text Generation [46.74654716230366]
In this paper, we define the task of constructing a coherent paragraph from a set of disaster domain tweets.
We tackle the problem by building two systems in a pipeline. The first system focuses on unsupervised style transfer and converts the individual tweets into news sentences.
The second system stitches together the outputs from the first system to form a coherent news paragraph.
arXiv Detail & Related papers (2020-08-21T06:56:57Z)
- Sequential Sentence Matching Network for Multi-turn Response Selection
in Retrieval-based Chatbots [45.920841134523286]
We propose a matching network, called sequential sentence matching network (S2M), to use the sentence-level semantic information to address the problem.
Firstly, we find that by using the sentence-level semantic information, the network successfully addresses the problem and gets a significant improvement on matching, resulting in a state-of-the-art performance.
arXiv Detail & Related papers (2020-05-16T09:47:19Z)
- Exploratory Analysis of Covid-19 Tweets using Topic Modeling, UMAP, and
DiGraphs [36.33347149799959]
This paper illustrates five different techniques to assess the distinctiveness of topics, key terms and features, speed of information dissemination, and network behaviors for Covid19 tweets.
One topic specific to U.S. cases would start to uptick immediately after live White House Coronavirus Task Force briefings.
One of the simplest highlights of this analysis is that early-stage descriptive methods like regular expressions can successfully identify high-level themes.
arXiv Detail & Related papers (2020-05-06T19:16:38Z)
This list is automatically generated from the titles and abstracts of the papers in this site.