Navigating the landscape of COVID-19 research through literature
analysis: A bird's eye view
- URL: http://arxiv.org/abs/2008.03397v2
- Date: Fri, 11 Sep 2020 21:01:27 GMT
- Title: Navigating the landscape of COVID-19 research through literature
analysis: A bird's eye view
- Authors: Lana Yeganova, Rezarta Islamaj, Qingyu Chen, Robert Leaman, Alexis
Allot, Chin-Hsuan Wei, Donald C. Comeau, Won Kim, Yifan Peng, W. John Wilbur,
Zhiyong Lu
- Abstract summary: We analyze the LitCovid collection, 13,369 COVID-19 related articles found in PubMed as of May 15th, 2020.
We do that by applying state-of-the-art named entity recognition, classification, clustering and other NLP techniques.
Our clustering algorithm identifies topics represented by groups of related terms, and computes clusters corresponding to documents associated with the topic terms.
- Score: 11.362549790802483
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Timely access to accurate scientific literature in the battle with the
ongoing COVID-19 pandemic is critical. This unprecedented public health risk
has motivated research towards understanding the disease in general,
identifying drugs to treat the disease, developing potential vaccines, etc.
This has given rise to a rapidly growing body of literature whose number of
publications doubles every 20 days as of May 2020. Providing medical
professionals with means to quickly analyze the literature and discover growing
areas of knowledge is necessary for addressing their questions and information
needs.
In this study we analyze the LitCovid collection, 13,369 COVID-19 related
articles found in PubMed as of May 15th, 2020 with the purpose of examining the
landscape of literature and presenting it in a format that facilitates
information navigation and understanding. We do that by applying
state-of-the-art named entity recognition, classification, clustering and other
NLP techniques. By applying NER tools, we capture relevant bioentities (such as
diseases, internal body organs, etc.) and assess the strength of their
relationship with COVID-19 by the extent to which they are discussed in the corpus. We
also collect a variety of symptoms and co-morbidities discussed in reference to
COVID-19. Our clustering algorithm identifies topics represented by groups of
related terms, and computes clusters corresponding to documents associated with
the topic terms. Among the topics we observe several that persist over
multiple weeks and have numerous associated documents, as well as
several that appear as emerging topics with fewer documents. All the tools and
data are publicly available, and this framework can be applied to any
literature collection. Taken together, these analyses produce a comprehensive,
synthesized view of COVID-19 research to facilitate knowledge discovery from
literature.
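The abstract's idea of gauging how strongly an entity relates to COVID-19 "by the extent to which it is discussed in the corpus" can be illustrated with a minimal sketch. This is not the authors' pipeline: it replaces their NER tools with plain single-token dictionary matching, and the function name `entity_document_share`, the toy documents, and the term list are all hypothetical, introduced only for illustration.

```python
import re
from collections import Counter


def entity_document_share(documents, entity_terms):
    """Return, for each term, the fraction of documents mentioning it at least once.

    Assumption: terms are single lowercase tokens; a real system would use
    NER output rather than dictionary matching.
    """
    counts = Counter()
    for text in documents:
        # Lowercase and tokenize; keep hyphens so "covid-19" stays one token.
        tokens = set(re.findall(r"[a-z0-9\-]+", text.lower()))
        for term in entity_terms:
            if term in tokens:
                counts[term] += 1
    total = len(documents)
    if total == 0:
        return {}
    return {term: counts[term] / total for term in entity_terms}


# Toy corpus (hypothetical, not taken from LitCovid).
toy_documents = [
    "Pneumonia and fever in COVID-19 patients",
    "Lung injury and pneumonia after infection",
    "Vaccine development for the novel coronavirus",
]
toy_terms = ["pneumonia", "lung", "fever", "kidney"]

shares = entity_document_share(toy_documents, toy_terms)
for term, share in sorted(shares.items(), key=lambda kv: -kv[1]):
    print(f"{term}: {share:.2f}")
```

Document-level counting (each document contributes at most once per term) is one simple choice here; raw mention counts would favor long articles that repeat a term many times.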
Related papers
- De-identification of clinical free text using natural language
processing: A systematic review of current approaches [48.343430343213896]
Natural language processing has repeatedly demonstrated its feasibility in automating the de-identification process.
Our study aims to provide systematic evidence on how the de-identification of clinical free text has evolved in the last thirteen years.
arXiv Detail & Related papers (2023-11-28T13:20:41Z)
- Exploring the evolution of research topics during the COVID-19 pandemic [3.234641429290768]
We present the CORD-19 Topic Visualizer (CORToViz), a method and associated visualization tool for inspecting the CORD-19 textual corpus of scientific abstracts.
Our method is based upon a careful selection of up-to-date technologies (including large language models) and extraction techniques for temporal topic mining.
Topic inspection is supported by an interactive dashboard, providing fast, one-click visualization of topic contents as word clouds and topic trends as time series.
arXiv Detail & Related papers (2023-10-05T22:16:41Z)
- COVID-19 Multidimensional Kaggle Literature Organization [3.201839066679614]
We show that factorization is a powerful unsupervised learning method capable of discovering hidden patterns in a document corpus.
We show that a higher-order representation of the corpus allows for the simultaneous grouping of similar articles, relevant journals, authors with similar research interests, and topic keywords.
arXiv Detail & Related papers (2021-07-17T06:16:36Z)
- Domain-Specific Pretraining for Vertical Search: Case Study on
Biomedical Literature [67.4680600632232]
Self-supervised learning has emerged as a promising direction to overcome the annotation bottleneck.
We propose a general approach for vertical search based on domain-specific pretraining.
Our system can scale to tens of millions of articles on PubMed and has been deployed as Microsoft Biomedical Search.
arXiv Detail & Related papers (2021-06-25T01:02:55Z)
- What is the State of the Art of Computer Vision-Assisted Cytology? A
Systematic Literature Review [47.42354724922676]
We conducted a Systematic Literature Review to identify the state of the art of computer vision techniques currently applied to cytology.
The most used methods in the analyzed works are deep learning-based (70 papers), while fewer works employ classic computer vision only (101 papers).
We conclude that there still is a lack of high-quality datasets for many types of stains and most of the works are not mature enough to be applied in a daily clinical diagnostic routine.
arXiv Detail & Related papers (2021-05-24T13:50:45Z)
- COVIDScholar: An automated COVID-19 research aggregation and analysis
platform [0.0]
As of October 2020, over 81,000 COVID-19 related scientific papers have been released, at a rate of over 250 per day.
This has created a challenge to traditional methods of engagement with the research literature.
We present an analysis of trends in COVID-19 research over the course of 2020.
arXiv Detail & Related papers (2020-12-07T18:17:11Z)
- COVID-19 Kaggle Literature Organization [29.959515544730348]
The world has faced the devastating outbreak of Severe Acute Respiratory Syndrome Coronavirus-2 (SARS-CoV-2), or COVID-19, in 2020.
Research in the subject matter was fast-tracked to such a point that scientists were struggling to keep up with new findings.
We describe an approach to organize and visualize the scientific literature on or related to COVID-19 using machine learning techniques.
arXiv Detail & Related papers (2020-08-04T21:02:32Z)
- Understanding the temporal evolution of COVID-19 research through
machine learning and natural language processing [66.63200823918429]
The outbreak of the novel coronavirus disease 2019 (COVID-19), caused by the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) has been continuously affecting human lives and communities around the world.
We used multiple data sources, i.e., PubMed and ArXiv, and built several machine learning models to characterize the landscape of current COVID-19 research.
Our findings confirm the types of research available in PubMed and ArXiv differ significantly, with the former exhibiting greater diversity in terms of COVID-19 related issues.
arXiv Detail & Related papers (2020-07-22T18:02:39Z)
- COVID-19 Literature Knowledge Graph Construction and Drug Repurposing
Report Generation [79.33545724934714]
We have developed a novel and comprehensive knowledge discovery framework, COVID-KG, to extract fine-grained multimedia knowledge elements from scientific literature.
Our framework also provides detailed contextual sentences, subfigures, and knowledge subgraphs as evidence.
arXiv Detail & Related papers (2020-07-01T16:03:20Z)
- Discovering associations in COVID-19 related research papers [2.146386506780702]
Our study analyses the abstracts of papers related to COVID-19 and coronavirus-related-research using association rule text mining.
On the basis of these methods, the purpose of our study was to show how researchers have responded in similar epidemic/pandemic situations throughout history.
arXiv Detail & Related papers (2020-04-06T10:52:25Z)
- Mapping the Landscape of Artificial Intelligence Applications against
COVID-19 [59.30734371401316]
COVID-19, the disease caused by the SARS-CoV-2 virus, has been declared a pandemic by the World Health Organization.
We present an overview of recent studies using Machine Learning and, more broadly, Artificial Intelligence to tackle many aspects of the COVID-19 crisis.
arXiv Detail & Related papers (2020-03-25T12:30:33Z)