CovidNet: To Bring Data Transparency in the Era of COVID-19
- URL: http://arxiv.org/abs/2005.10948v3
- Date: Mon, 20 Jul 2020 21:32:24 GMT
- Title: CovidNet: To Bring Data Transparency in the Era of COVID-19
- Authors: Tong Yang, Kai Shen, Sixuan He, Enyu Li, Peter Sun, Pingying Chen, Lin Zuo, Jiayue Hu, Yiwen Mo, Weiwei Zhang, Haonan Zhang, Jingxue Chen, Yu Guo
- Abstract summary: This paper presents CovidNet, a COVID-19 tracking project associated with a large scale epidemic dataset.
CovidNet is the only platform providing real-time global case information of more than 4,124 sub-divisions from over 27 countries worldwide.
The accuracy and freshness of the dataset are a result of the painstaking efforts from our volunteer team, crowd-sourcing channels, and automated data pipelines.
- Score: 9.808021836153712
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Timely, credible, and fine-grained case information is vital for local
communities and individual citizens to make rational and data-driven responses
to the COVID-19 pandemic. This paper presents CovidNet, a COVID-19 tracking
project associated with a large scale epidemic dataset, which was initiated by
1Point3Acres. To the best of our knowledge, the project is the only platform
providing real-time global case information of more than 4,124 sub-divisions
from over 27 countries worldwide with multi-language support. The platform
also offers interactive visualization tools to analyze the full historical case
curves in each region. Initially launched as a voluntary project to bridge the
data transparency gap in North America in January 2020, this project has since
become one of the major independent sources worldwide and has been consumed by
many other tracking platforms. The accuracy and freshness of the dataset are a
result of the painstaking efforts from our voluntary teamwork, crowd-sourcing
channels, and automated data pipelines. As of May 18, 2020, the project website
has been visited more than 200 million times and the CovidNet dataset has
empowered over 522 institutions and organizations worldwide in policy-making
and academic research. All datasets are openly accessible for non-commercial
purposes at https://coronavirus.1point3acres.com via a formal request through
our APIs.
Related papers
- The NetMob2024 Dataset: Population Density and OD Matrices from Four LMIC Countries [0.0]
The NetMob24 dataset offers a unique opportunity for researchers from a range of academic fields to access comprehensive data sets spanning four countries over the course of two years (2019 and 2020).
This dataset comprises privacy-preserving data sets from mobile application (app) data collected from users who have voluntarily consented to anonymous data collection for research purposes.
It is our hope that this reference dataset will foster the production of new research methods and the aggregation of research outcomes.
arXiv Detail & Related papers (2024-10-01T07:17:19Z)
- Data-Centric AI in the Age of Large Language Models [51.20451986068925]
This position paper proposes a data-centric viewpoint of AI research, focusing on large language models (LLMs).
We make the key observation that data is instrumental in the developmental (e.g., pretraining and fine-tuning) and inferential stages (e.g., in-context learning) of LLMs.
We identify four specific scenarios centered around data, covering data-centric benchmarks and data curation, data attribution, knowledge transfer, and inference contextualization.
arXiv Detail & Related papers (2024-06-20T16:34:07Z)
- Analyzing the Impact of Fake News on the Anticipated Outcome of the 2024 Election Ahead of Time [7.1970442944315245]
Despite increasing awareness and research around fake news, there is still a significant need for datasets that specifically target racial slurs and biases within North American political speeches.
This study introduces a comprehensive dataset that illuminates these critical aspects of misinformation.
arXiv Detail & Related papers (2023-12-01T20:14:16Z)
- LMSYS-Chat-1M: A Large-Scale Real-World LLM Conversation Dataset [75.9621305227523]
We introduce LMSYS-Chat-1M, a large-scale dataset containing one million real-world conversations with 25 state-of-the-art large language models (LLMs).
This dataset is collected from 210K IP addresses in the wild on our Vicuna demo and Arena website.
We demonstrate its versatility through four use cases: developing content moderation models that perform similarly to GPT-4, building a safety benchmark, training instruction-following models that perform similarly to Vicuna, and creating challenging benchmark questions.
arXiv Detail & Related papers (2023-09-21T12:13:55Z)
- COVID-19: An exploration of consecutive systemic barriers to pathogen-related data sharing during a pandemic [3.192308005611312]
In 2020, the COVID-19 pandemic resulted in a rapid response from governments and researchers worldwide.
As of late 2023, millions have died as a result of COVID-19.
Data professionals working with pandemic-relevant data often face significant systemic barriers to accessing, sharing or re-using this data.
arXiv Detail & Related papers (2022-05-24T14:25:09Z)
- A Summary of COVID-19 Datasets [1.3490988186255934]
This research presents a review of main datasets that are developed for COVID-19 research.
We hope this collection will continue to bring together members of the computing community, biomedical experts, and policymakers.
arXiv Detail & Related papers (2022-02-06T17:34:26Z)
- Datasets: A Community Library for Natural Language Processing [55.48866401721244]
Datasets is a community library for contemporary NLP.
The library includes more than 650 unique datasets, has more than 250 contributors, and has helped support a variety of novel cross-dataset research projects.
arXiv Detail & Related papers (2021-09-07T03:59:22Z)
- Global Tweet Mentions of COVID-19 [3.3043776328952226]
We present an open-source dataset of 1.92 million keyword-selected Twitter posts, updated weekly from January 2020 to present.
The dashboard presents 100% of the geotagged tweets that contain keywords or hashtags related to COVID-19.
With emerging COVID variants but ongoing vaccine hesitancy and resistance, this dataset could be used by researchers to study numerous aspects of COVID-19.
arXiv Detail & Related papers (2021-08-13T20:21:29Z)
- Retiring Adult: New Datasets for Fair Machine Learning [47.27417042497261]
UCI Adult has served as the basis for the development and comparison of many algorithmic fairness interventions.
We reconstruct a superset of the UCI Adult data from available US Census sources and reveal idiosyncrasies of the UCI Adult dataset that limit its external validity.
Our primary contribution is a suite of new datasets that extend the existing data ecosystem for research on fair machine learning.
arXiv Detail & Related papers (2021-08-10T19:19:41Z)
- Rapidly Bootstrapping a Question Answering Dataset for COVID-19 [88.86456834766288]
We present CovidQA, the beginnings of a question answering dataset specifically designed for COVID-19.
This is the first publicly available resource of its type, and intended as a stopgap measure for guiding research until more substantial evaluation resources become available.
arXiv Detail & Related papers (2020-04-23T17:35:11Z)
- NWPU-Crowd: A Large-Scale Benchmark for Crowd Counting and Localization [101.13851473792334]
We construct a large-scale congested crowd counting and localization dataset, NWPU-Crowd, consisting of 5,109 images, in a total of 2,133,375 annotated heads with points and boxes.
Compared with other real-world datasets, it contains various illumination scenes and has the largest density range (0 to 20,033).
We describe the data characteristics, evaluate the performance of some mainstream state-of-the-art (SOTA) methods, and analyze the new problems that arise on the new data.
arXiv Detail & Related papers (2020-01-10T09:26:04Z)
This list is automatically generated from the titles and abstracts of the papers in this site.