Related papers: A Large-Scale Dataset of Twitter Chatter about Online Learning during the Current COVID-19 Omicron Wave

URL: http://arxiv.org/abs/2208.07810v1
Date: Wed, 20 Jul 2022 18:01:18 GMT
Title: A Large-Scale Dataset of Twitter Chatter about Online Learning during the Current COVID-19 Omicron Wave
Authors: Nirmalya Thakur
Abstract summary: The COVID-19 Omicron variant, reported to be the most immune evasive variant of COVID-19, is resulting in a surge of COVID-19 cases globally. Social media platforms such as Twitter are seeing an increase in conversations related to online learning in the form of tweets. This work presents a large-scale open-access Twitter dataset of conversations about online learning from different parts of the world since the first detected case of the COVID-19 Omicron variant in November 2021.
Score: 0.0
License: http://creativecommons.org/licenses/by/4.0/
Abstract: The COVID-19 Omicron variant, reported to be the most immune evasive variant of COVID-19, is resulting in a surge of COVID-19 cases globally. This has caused schools, colleges, and universities in different parts of the world to transition to online learning. As a result, social media platforms such as Twitter are seeing an increase in conversations related to online learning in the form of tweets. Mining such tweets to develop a dataset can serve as a data resource for different applications and use-cases related to the analysis of interest, views, opinions, perspectives, attitudes, and feedback towards online learning during the current surge of COVID-19 cases caused by the Omicron variant. Therefore, this work presents a large-scale open-access Twitter dataset of conversations about online learning from different parts of the world since the first detected case of the COVID-19 Omicron variant in November 2021. The dataset is compliant with the privacy policy, developer agreement, and guidelines for content redistribution of Twitter, as well as with the FAIR principles (Findability, Accessibility, Interoperability, and Reusability) for scientific data management. The paper also briefly outlines some potential applications in the fields of Big Data, Data Mining, Natural Language Processing, and their related disciplines, with a specific focus on online learning during this Omicron wave that may be studied, explored, and investigated by using this dataset.

Related papers

Multi-Platform Aggregated Dataset of Online Communities (MADOC) [64.45797970830233]
MADOC aggregates and standardizes data from Bluesky, Koo, Reddit, and Voat (2012-2024), containing 18.9 million posts, 236 million comments, and 23.1 million unique users. The dataset enables comparative studies of toxic behavior evolution across platforms through standardized interaction records and sentiment analysis.
arXiv Detail & Related papers (2025-01-22T14:02:11Z)
UniTraj: A Unified Framework for Scalable Vehicle Trajectory Prediction [93.77809355002591]
We introduce UniTraj, a comprehensive framework that unifies various datasets, models, and evaluation criteria. We conduct extensive experiments and find that model performance significantly drops when transferred to other datasets. We provide insights into dataset characteristics to explain these findings.
arXiv Detail & Related papers (2024-03-22T10:36:50Z)
LMSYS-Chat-1M: A Large-Scale Real-World LLM Conversation Dataset [75.9621305227523]
We introduce LMSYS-Chat-1M, a large-scale dataset containing one million real-world conversations with 25 state-of-the-art large language models (LLMs). This dataset is collected from 210K IP addresses in the wild on our Vicuna demo and Arena website. We demonstrate its versatility through four use cases: developing content moderation models that perform similarly to GPT-4, building a safety benchmark, training instruction-following models that perform similarly to Vicuna, and creating challenging benchmark questions.
arXiv Detail & Related papers (2023-09-21T12:13:55Z)
Investigating the impact of COVID-19 on Online Learning-based Web Behavior [0.0]
The study specifically focused on investigating Google Search-based web behavior data as Google is the most popular search engine globally. The impact of COVID-19 related to online learning-based web behavior on Google was studied for the top 20 worst affected countries in terms of the total number of COVID-19 cases.
arXiv Detail & Related papers (2022-04-27T01:38:10Z)
When Accuracy Meets Privacy: Two-Stage Federated Transfer Learning Framework in Classification of Medical Images on Limited Data: A COVID-19 Case Study [77.34726150561087]
The COVID-19 pandemic has spread rapidly and caused a shortage of global medical resources. CNNs have been widely utilized and verified in analyzing medical images.
arXiv Detail & Related papers (2022-03-24T02:09:41Z)
Unsupervised Text Mining of COVID-19 Records [0.0]
Twitter as a powerful tool can help researchers measure public health in response to COVID-19. This paper preprocessed the existing medical dataset regarding COVID-19 named CORD-19 and annotated the dataset for supervised classification tasks.
arXiv Detail & Related papers (2021-09-08T05:57:22Z)
Global Tweet Mentions of COVID-19 [3.3043776328952226]
We present an open-source dataset of 1.92 million keyword-selected Twitter posts, updated weekly from January 2020 to present. The dashboard presents 100% of the geotagged tweets that contain keywords or hashtags related to COVID-19. With emerging COVID variants but ongoing vaccine hesitancy and resistance, this dataset could be used by researchers to study numerous aspects of COVID-19.
arXiv Detail & Related papers (2021-08-13T20:21:29Z)
FLOP: Federated Learning on Medical Datasets using Partial Networks [84.54663831520853]
COVID-19 Disease due to the novel coronavirus has caused a shortage of medical resources. Different data-driven deep learning models have been developed to mitigate the diagnosis of COVID-19. The data itself is still scarce due to patient privacy concerns. We propose a simple yet effective algorithm, named Federated Learning on Medical Datasets using Partial Networks (FLOP).
arXiv Detail & Related papers (2021-02-10T01:56:58Z)
CML-COVID: A Large-Scale COVID-19 Twitter Dataset with Latent Topics, Sentiment and Location Information [0.0]
CML-COVID is a COVID-19 Twitter data set of 19,298,967 tweets from 5,977,653 unique individuals. These tweets were collected between March 2020 and July 2020 using the query terms coronavirus, covid and mask related to COVID-19.
arXiv Detail & Related papers (2021-01-28T18:59:10Z)
Drinking from a Firehose: Continual Learning with Web-scale Natural Language [109.80198763438248]
We study a natural setting for continual learning on a massive scale. We collect massive datasets of Twitter posts. We present a rigorous evaluation of continual learning algorithms on an unprecedented scale.
arXiv Detail & Related papers (2020-07-18T05:40:02Z)
A Study of Knowledge Sharing related to Covid-19 Pandemic in Stack Overflow [69.5231754305538]
This study examines 464 Stack Overflow questions posted mainly in February and March 2020, leveraging text mining. The findings reveal that this global crisis sparked intense and increasing activity on Stack Overflow, with most post topics reflecting a strong interest in the analysis of Covid-19 data.
arXiv Detail & Related papers (2020-04-18T08:19:46Z)
COVID-19 on Social Media: Analyzing Misinformation in Twitter Conversations [22.43295864610142]
We collected streaming data related to COVID-19 using the Twitter API, starting March 1, 2020. We identified unreliable and misleading contents based on fact-checking sources. We examined the narratives promoted in misinformation tweets, along with the distribution of engagements with these tweets.
arXiv Detail & Related papers (2020-03-26T09:48:24Z)

This list is automatically generated from the titles and abstracts of the papers on this site.