CovidNet: To Bring Data Transparency in the Era of COVID-19
- URL: http://arxiv.org/abs/2005.10948v3
- Date: Mon, 20 Jul 2020 21:32:24 GMT
- Title: CovidNet: To Bring Data Transparency in the Era of COVID-19
- Authors: Tong Yang, Kai Shen, Sixuan He, Enyu Li, Peter Sun, Pingying Chen, Lin
Zuo, Jiayue Hu, Yiwen Mo, Weiwei Zhang, Haonan Zhang, Jingxue Chen, Yu Guo
- Abstract summary: This paper presents CovidNet, a COVID-19 tracking project associated with a large scale epidemic dataset.
CovidNet is the only platform providing real-time global case information of more than 4,124 sub-divisions from over 27 countries worldwide.
The accuracy and freshness of the dataset are a result of the painstaking efforts of our volunteer teams, crowd-sourcing channels, and automated data pipelines.
- Score: 9.808021836153712
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Timely, credible, and fine-grained case information is vital for local
communities and individual citizens to make rational and data-driven responses
to the COVID-19 pandemic. This paper presents CovidNet, a COVID-19 tracking
project associated with a large scale epidemic dataset, which was initiated by
1Point3Acres. To the best of our knowledge, the project is the only platform
providing real-time global case information of more than 4,124 sub-divisions
from over 27 countries worldwide with multi-language support. The platform
also offers interactive visualization tools to analyze the full historical case
curves in each region. Initially launched as a voluntary project to bridge the
data transparency gap in North America in January 2020, this project has since
become one of the major independent sources worldwide and has been consumed by
many other tracking platforms. The accuracy and freshness of the dataset are a
result of the painstaking efforts from our voluntary teamwork, crowd-sourcing
channels, and automated data pipelines. As of May 18, 2020, the project website
has been visited more than 200 million times and the CovidNet dataset has
empowered over 522 institutions and organizations worldwide in policy-making
and academic research. All datasets are openly accessible for non-commercial
purposes at https://coronavirus.1point3acres.com via a formal request through
our APIs.
Related papers
- Multi-Platform Aggregated Dataset of Online Communities (MADOC) [64.45797970830233]
MADOC aggregates and standardizes data from Bluesky, Koo, Reddit, and Voat (2012-2024), containing 18.9 million posts, 236 million comments, and 23.1 million unique users.
The dataset enables comparative studies of toxic behavior evolution across platforms through standardized interaction records and sentiment analysis.
arXiv Detail & Related papers (2025-01-22T14:02:11Z)
- Bridging the Data Provenance Gap Across Text, Speech and Video [67.72097952282262]
We conduct the largest and first-of-its-kind longitudinal audit across modalities of popular text, speech, and video datasets.
Our manual analysis covers nearly 4000 public datasets between 1990-2024, spanning 608 languages, 798 sources, 659 organizations, and 67 countries.
We find that multimodal machine learning applications have overwhelmingly turned to web-crawled, synthetic, and social media platforms, such as YouTube, for their training sets.
arXiv Detail & Related papers (2024-12-19T01:30:19Z)
- Uchaguzi-2022: A Dataset of Citizen Reports on the 2022 Kenyan Election [49.35115948941981]
We present Uchaguzi-2022, a dataset of 14k categorized and geotagged citizen reports related to the 2022 Kenyan General Election.
We use this dataset to investigate whether language models can assist in scalably categorizing and geotagging reports, thus highlighting its potential application in the AI for Social Good space.
arXiv Detail & Related papers (2024-12-17T17:08:35Z)
- Labeled Datasets for Research on Information Operations [71.34999856621306]
We present new labeled datasets about 26 campaigns, which contain both IO posts verified by a social media platform and over 13M posts by 303k accounts that discussed similar topics in the same time frames (control data).
The datasets will facilitate the study of narratives, network interactions, and engagement strategies employed by coordinated accounts across various campaigns and countries.
arXiv Detail & Related papers (2024-11-15T22:15:01Z)
- The NetMob2024 Dataset: Population Density and OD Matrices from Four LMIC Countries [0.0]
The NetMob24 dataset offers a unique opportunity for researchers from a range of academic fields to access comprehensive data sets spanning four countries over the course of two years (2019 and 2020).
This dataset comprises privacy-preserving data sets from mobile application (app) data collected from users who have voluntarily consented to anonymous data collection for research purposes.
It is our hope that this reference dataset will foster the production of new research methods and the aggregation of research outcomes.
arXiv Detail & Related papers (2024-10-01T07:17:19Z)
- Analyzing the Impact of Fake News on the Anticipated Outcome of the 2024 Election Ahead of Time [7.1970442944315245]
Despite increasing awareness and research around fake news, there is still a significant need for datasets that specifically target racial slurs and biases within North American political speeches.
This study introduces a comprehensive dataset that illuminates these critical aspects of misinformation.
arXiv Detail & Related papers (2023-12-01T20:14:16Z)
- COVID-19: An exploration of consecutive systemic barriers to pathogen-related data sharing during a pandemic [3.192308005611312]
In 2020, the COVID-19 pandemic resulted in a rapid response from governments and researchers worldwide.
As of late 2023, millions have died as a result of COVID-19.
Data professionals working with pandemic-relevant data often face significant systemic barriers to accessing, sharing or re-using this data.
arXiv Detail & Related papers (2022-05-24T14:25:09Z)
- A Summary of COVID-19 Datasets [1.3490988186255934]
This research presents a review of the main datasets developed for COVID-19 research.
We hope this collection will continue to bring together members of the computing community, biomedical experts, and policymakers.
arXiv Detail & Related papers (2022-02-06T17:34:26Z)
- Global Tweet Mentions of COVID-19 [3.3043776328952226]
We present an open-source dataset of 1.92 million keyword-selected Twitter posts, updated weekly from January 2020 to present.
The dashboard presents 100% of the geotagged tweets that contain keywords or hashtags related to COVID-19.
With emerging COVID variants but ongoing vaccine hesitancy and resistance, this dataset could be used by researchers to study numerous aspects of COVID-19.
arXiv Detail & Related papers (2021-08-13T20:21:29Z)
- Retiring Adult: New Datasets for Fair Machine Learning [47.27417042497261]
UCI Adult has served as the basis for the development and comparison of many algorithmic fairness interventions.
We reconstruct a superset of the UCI Adult data from available US Census sources and reveal idiosyncrasies of the UCI Adult dataset that limit its external validity.
Our primary contribution is a suite of new datasets that extend the existing data ecosystem for research on fair machine learning.
arXiv Detail & Related papers (2021-08-10T19:19:41Z)
- NWPU-Crowd: A Large-Scale Benchmark for Crowd Counting and Localization [101.13851473792334]
We construct a large-scale congested crowd counting and localization dataset, NWPU-Crowd, consisting of 5,109 images, in a total of 2,133,375 annotated heads with points and boxes.
Compared with other real-world datasets, it contains various illumination scenes and has the largest density range (0~20,033).
We describe the data characteristics, evaluate the performance of some mainstream state-of-the-art (SOTA) methods, and analyze the new problems that arise on the new data.
arXiv Detail & Related papers (2020-01-10T09:26:04Z)
This list is automatically generated from the titles and abstracts of the papers in this site.