CORD-19: The COVID-19 Open Research Dataset
- URL: http://arxiv.org/abs/2004.10706v4
- Date: Fri, 10 Jul 2020 21:40:34 GMT
- Title: CORD-19: The COVID-19 Open Research Dataset
- Authors: Lucy Lu Wang, Kyle Lo, Yoganand Chandrasekhar, Russell Reas,
Jiangjiang Yang, Doug Burdick, Darrin Eide, Kathryn Funk, Yannis Katsis,
Rodney Kinney, Yunyao Li, Ziyang Liu, William Merrill, Paul Mooney, Dewey
Murdick, Devvret Rishi, Jerry Sheehan, Zhihong Shen, Brandon Stilson, Alex
Wade, Kuansan Wang, Nancy Xin Ru Wang, Chris Wilhelm, Boya Xie, Douglas
Raymond, Daniel S. Weld, Oren Etzioni, Sebastian Kohlmeier
- Abstract summary: CORD-19 is a growing resource of scientific papers on COVID-19 and related historical coronavirus research.
Since its release, CORD-19 has been downloaded over 200K times and has served as the basis of many COVID-19 text mining and discovery systems.
- Score: 28.556291682259477
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The COVID-19 Open Research Dataset (CORD-19) is a growing resource of
scientific papers on COVID-19 and related historical coronavirus research.
CORD-19 is designed to facilitate the development of text mining and
information retrieval systems over its rich collection of metadata and
structured full text papers. Since its release, CORD-19 has been downloaded
over 200K times and has served as the basis of many COVID-19 text mining and
discovery systems. In this article, we describe the mechanics of dataset
construction, highlighting challenges and key design decisions, provide an
overview of how CORD-19 has been used, and describe several shared tasks built
around the dataset. We hope this resource will continue to bring together the
computing community, biomedical experts, and policy makers in the search for
effective treatments and management policies for COVID-19.
Related papers
- COVIDx CXR-4: An Expanded Multi-Institutional Open-Source Benchmark
Dataset for Chest X-ray Image-Based Computer-Aided COVID-19 Diagnostics [79.90346960083775]
We introduce COVIDx CXR-4, an expanded multi-institutional open-source benchmark dataset for chest X-ray image-based computer-aided COVID-19 diagnostics.
COVIDx CXR-4 expands significantly on the previous COVIDx CXR-3 dataset by increasing the total patient cohort size by greater than 2.66 times.
We provide extensive analysis on the diversity of the patient demographic, imaging metadata, and disease distributions to highlight potential dataset biases.
arXiv Detail & Related papers (2023-11-29T14:40:31Z)
- Covidia: COVID-19 Interdisciplinary Academic Knowledge Graph [99.28342534985146]
Existing literature and knowledge platforms on COVID-19 only focus on collecting papers on biology and medicine.
We propose Covidia, a COVID-19 interdisciplinary academic knowledge graph, to bridge the gap between knowledge of COVID-19 across different domains.
arXiv Detail & Related papers (2023-04-14T16:45:38Z)
- COV19IR : COVID-19 Domain Literature Information Retrieval [0.0]
We demonstrate two tasks along with solutions: COVID-19 literature retrieval and question answering.
Based on transformer neural networks, we provide solutions to implement the tasks on the CORD-19 dataset.
arXiv Detail & Related papers (2022-11-08T05:12:37Z)
- A Summary of COVID-19 Datasets [1.3490988186255934]
This research presents a review of main datasets that are developed for COVID-19 research.
We hope this collection will continue to bring together members of the computing community, biomedical experts, and policymakers.
arXiv Detail & Related papers (2022-02-06T17:34:26Z)
- A Global Survey of Technological Resources and Datasets on COVID-19 [0.0]
The application and successful utilization of technological resources in developing solutions to health, safety, and economic issues caused by COVID-19 indicate the importance of technology in curbing COVID-19.
arXiv Detail & Related papers (2022-02-06T04:37:14Z)
- Unsupervised Text Mining of COVID-19 Records [0.0]
Twitter as a powerful tool can help researchers measure public health in response to COVID-19.
This paper preprocessed the existing medical dataset regarding COVID-19 named CORD-19 and annotated the dataset for supervised classification tasks.
arXiv Detail & Related papers (2021-09-08T05:57:22Z)
- Repurposing TREC-COVID Annotations to Answer the Key Questions of
CORD-19 [4.847073702809032]
Coronavirus disease 2019 (COVID-19) began in Wuhan, China in late 2019 and to date has infected over 14M people worldwide.
The White House aggregated over 200,000 journal articles related to a variety of coronaviruses and tasked the community with answering key questions related to the corpus.
We set out to repurpose the relevancy annotations for TREC-COVID tasks to identify journal articles in CORD-19 which are relevant to the key questions posed by CORD-19.
arXiv Detail & Related papers (2020-08-27T19:51:07Z)
- CO-Search: COVID-19 Information Retrieval with Semantic Search, Question
Answering, and Abstractive Summarization [53.67205506042232]
CO-Search is a retriever-ranker semantic search engine designed to handle complex queries over the COVID-19 literature.
To account for the domain-specific and relatively limited dataset, we generate a bipartite graph of document paragraphs and citations.
We evaluate our system on the data of the TREC-COVID information retrieval challenge.
arXiv Detail & Related papers (2020-06-17T01:32:48Z)
- Rapidly Bootstrapping a Question Answering Dataset for COVID-19 [88.86456834766288]
We present CovidQA, the beginnings of a question answering dataset specifically designed for COVID-19.
This is the first publicly available resource of its type, and intended as a stopgap measure for guiding research until more substantial evaluation resources become available.
arXiv Detail & Related papers (2020-04-23T17:35:11Z)
- A Study of Knowledge Sharing related to Covid-19 Pandemic in Stack
Overflow [69.5231754305538]
The study analyzes 464 Stack Overflow questions, posted mainly in February and March 2020, using text mining.
Findings reveal that this global crisis sparked intense and increasing activity on Stack Overflow, with most post topics reflecting a strong interest in the analysis of Covid-19 data.
arXiv Detail & Related papers (2020-04-18T08:19:46Z)
- COVID-Net: A Tailored Deep Convolutional Neural Network Design for
Detection of COVID-19 Cases from Chest X-Ray Images [93.0013343535411]
We introduce COVID-Net, a deep convolutional neural network design tailored for the detection of COVID-19 cases from chest X-ray (CXR) images.
To the best of the authors' knowledge, COVID-Net is one of the first open source network designs for COVID-19 detection from CXR images.
We also introduce COVIDx, an open access benchmark dataset that we generated comprising 13,975 CXR images across 13,870 patient cases.
arXiv Detail & Related papers (2020-03-22T12:26:36Z)
This list is automatically generated from the titles and abstracts of the papers in this site.