COVID-19 Literature Topic-Based Search via Hierarchical NMF
- URL: http://arxiv.org/abs/2009.09074v1
- Date: Mon, 7 Sep 2020 05:45:03 GMT
- Title: COVID-19 Literature Topic-Based Search via Hierarchical NMF
- Authors: Rachel Grotheer, Yihuan Huang, Pengyu Li, Elizaveta Rebrova, Deanna Needell, Longxiu Huang, Alona Kryshchenko, Xia Li, Kyung Ha, Oleksandr Kryshchenko
- Abstract summary: A dataset of COVID-19-related scientific literature is compiled. Hierarchical nonnegative matrix factorization is used to organize literature related to the novel coronavirus into a tree structure.
- Score: 29.04869940568828
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: A dataset of COVID-19-related scientific literature is compiled, combining
the articles from several online libraries and selecting those with open access
and full text available. Then, hierarchical nonnegative matrix factorization is
used to organize literature related to the novel coronavirus into a tree
structure that allows researchers to search for relevant literature based on
detected topics. We discover eight major latent topics and 52 granular
subtopics in the body of literature, related to vaccines, genetic structure and
modeling of the disease and patient studies, as well as related diseases and
virology. To make the tool useful to current researchers, we created an
interactive website that organizes the available literature using this
hierarchical structure.
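The tree-building idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a raw term-count matrix (rows are documents, columns are terms), a plain multiplicative-update NMF, and a hard assignment of each document to its strongest topic before recursing. The helper names (`nmf`, `term_matrix`, `build_tree`, `leaves`), the toy corpus, and the per-level topic counts `ks` are all hypothetical; the paper reports eight top-level topics, while the toy example uses two.

```python
import random

def matmul(A, B):
    """Dense matrix product for lists of lists."""
    Bt = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in Bt] for row in A]

def transpose(A):
    return [list(r) for r in zip(*A)]

def nmf(X, k, iters=200, seed=0, eps=1e-9):
    """Approximate X (docs x terms) as W @ H with nonnegative W, H
    using multiplicative updates for the Frobenius-norm objective."""
    rng = random.Random(seed)
    n, m = len(X), len(X[0])
    W = [[rng.random() + 0.1 for _ in range(k)] for _ in range(n)]
    H = [[rng.random() + 0.1 for _ in range(m)] for _ in range(k)]
    for _ in range(iters):
        Wt = transpose(W)
        num, den = matmul(Wt, X), matmul(matmul(Wt, W), H)
        H = [[H[i][j] * num[i][j] / (den[i][j] + eps) for j in range(m)]
             for i in range(k)]
        Ht = transpose(H)
        num, den = matmul(X, Ht), matmul(W, matmul(H, Ht))
        W = [[W[i][j] * num[i][j] / (den[i][j] + eps) for j in range(k)]
             for i in range(n)]
    return W, H

def term_matrix(docs):
    """Raw term counts; a real pipeline would likely use TF-IDF instead."""
    vocab = sorted({w for d in docs for w in d.lower().split()})
    index = {w: j for j, w in enumerate(vocab)}
    X = [[0.0] * len(vocab) for _ in docs]
    for i, d in enumerate(docs):
        for w in d.lower().split():
            X[i][index[w]] += 1.0
    return X, vocab

def build_tree(X, vocab, doc_ids, ks, n_terms=3):
    """Recursively split documents; ks[0] is the topic count at this level."""
    node = {"docs": doc_ids, "children": []}
    if not ks or len(doc_ids) <= ks[0]:
        return node  # too few documents to split further
    W, H = nmf([X[i] for i in doc_ids], ks[0])
    groups = {}
    for row, doc in zip(W, doc_ids):
        # Hard assignment: each document goes to its largest-weight topic.
        topic = max(range(ks[0]), key=row.__getitem__)
        groups.setdefault(topic, []).append(doc)
    for t, members in sorted(groups.items()):
        child = build_tree(X, vocab, members, ks[1:], n_terms)
        top = sorted(range(len(vocab)), key=lambda j: -H[t][j])[:n_terms]
        child["keywords"] = [vocab[j] for j in top]
        node["children"].append(child)
    return node

def leaves(node):
    """Document-id lists at the leaves of the tree."""
    if not node["children"]:
        return [node["docs"]]
    return [leaf for c in node["children"] for leaf in leaves(c)]

# Toy corpus (hypothetical), two topics per level.
docs = [
    "vaccine trial antibody response",
    "vaccine dose antibody trial",
    "genome sequence mutation spike",
    "spike protein genome mutation",
    "patient hospital symptoms cohort",
    "patient cohort hospital outcome",
]
X, vocab = term_matrix(docs)
tree = build_tree(X, vocab, list(range(len(docs))), ks=[2, 2])
for child in tree["children"]:
    print(child["keywords"], child["docs"])
```

Each node stores the ids of its documents and, for non-root nodes, the highest-weight terms of the topic that produced it, which is the kind of label a topic-based search interface could display. Hard assignment is a simplification chosen here so that the leaves partition the corpus.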
Related papers
- CHIME: LLM-Assisted Hierarchical Organization of Scientific Studies for Literature Review Support [31.327873791724326]
Literature review requires researchers to synthesize a large amount of information and is increasingly challenging as the scientific literature expands.
In this work, we investigate the potential of LLMs for producing hierarchical organizations of scientific studies to assist researchers with literature review.
arXiv Detail & Related papers (2024-07-23T03:18:00Z)
- Retrieval-Enhanced Machine Learning: Synthesis and Opportunities [60.34182805429511]
Retrieval-enhancement can be extended to a broader spectrum of machine learning (ML).
This work introduces a formal framework for this paradigm, Retrieval-Enhanced Machine Learning (REML), by synthesizing the literature across various domains in ML using consistent notation, which is missing from the current literature.
The goal of this work is to equip researchers across various disciplines with a comprehensive, formally structured framework of retrieval-enhanced models, thereby fostering interdisciplinary future research.
arXiv Detail & Related papers (2024-07-17T20:01:21Z)
- Development and validation of a natural language processing algorithm to pseudonymize documents in the context of a clinical data warehouse [53.797797404164946]
The study highlights the difficulties faced in sharing tools and resources in this domain.
We annotated a corpus of clinical documents according to 12 types of identifying entities.
We build a hybrid system, merging the results of a deep learning model as well as manual rules.
arXiv Detail & Related papers (2023-03-23T17:17:46Z)
- EBOCA: Evidences for BiOmedical Concepts Association Ontology [55.41644538483948]
This paper proposes EBOCA, an ontology that describes (i) biomedical domain concepts and associations between them, and (ii) evidences supporting these associations.
Test data coming from a subset of DISNET and automatic association extractions from texts has been transformed to create a Knowledge Graph that can be used in real scenarios.
arXiv Detail & Related papers (2022-08-01T18:47:03Z)
- Prioritization of COVID-19-related literature via unsupervised keyphrase extraction and document representation learning [1.8374319565577157]
The COVID-19 pandemic triggered a wave of novel scientific literature that is impossible to inspect and study manually in a reasonable time frame.
Current machine learning methods can project such a body of literature into a vector space, where similar documents are located close to each other.
In our system, the current body of COVID-19-related literature is annotated using unsupervised keyphrase extraction.
The solution is accessible through a web server capable of interactive search, term ranking, and exploration of potentially interesting literature.
arXiv Detail & Related papers (2021-10-17T17:35:09Z)
- Medical Literature Mining and Retrieval in a Conversational Setting [3.37411253119822]
The COVID-19 pandemic has caused a surge in the medical research literature.
There is a need for robust text mining tools which can process, extract and present answers from the literature in a concise and consumable way.
We present a conversational system, which can retrieve and answer coronavirus-related queries from the rich medical literature, and present it in a conversational setting with the user.
arXiv Detail & Related papers (2021-07-23T23:02:59Z)
- COVID-19 Multidimensional Kaggle Literature Organization [3.201839066679614]
We show that factorization is a powerful unsupervised learning method capable of discovering hidden patterns in a document corpus.
We show that a higher-order representation of the corpus allows for the simultaneous grouping of similar articles, relevant journals, authors with similar research interests, and topic keywords.
arXiv Detail & Related papers (2021-07-17T06:16:36Z)
- A New Neural Search and Insights Platform for Navigating and Organizing AI Research [56.65232007953311]
We introduce a new platform, AI Research Navigator, that combines classical keyword search with neural retrieval to discover and organize relevant literature.
We give an overview of the overall architecture of the system and of the components for document analysis, question answering, search, analytics, expert search, and recommendations.
arXiv Detail & Related papers (2020-10-30T19:12:25Z)
- Extracting a Knowledge Base of Mechanisms from COVID-19 Papers [50.17242035034729]
We pursue the construction of a knowledge base (KB) of mechanisms.
We develop a broad, unified schema that strikes a balance between relevance and breadth.
Experiments demonstrate the utility of our KB in supporting interdisciplinary scientific search over COVID-19 literature.
arXiv Detail & Related papers (2020-10-08T07:54:14Z)
- Navigating the landscape of COVID-19 research through literature analysis: A bird's eye view [11.362549790802483]
We analyze the LitCovid collection, 13,369 COVID-19 related articles found in PubMed as of May 15th, 2020.
We do that by applying state-of-the-art named entity recognition, classification, clustering and other NLP techniques.
Our clustering algorithm identifies topics represented by groups of related terms, and computes clusters corresponding to documents associated with the topic terms.
arXiv Detail & Related papers (2020-08-07T23:39:29Z)
- COVID-19 Literature Knowledge Graph Construction and Drug Repurposing Report Generation [79.33545724934714]
We have developed a novel and comprehensive knowledge discovery framework, COVID-KG, to extract fine-grained multimedia knowledge elements from scientific literature.
Our framework also provides detailed contextual sentences, subfigures, and knowledge subgraphs as evidence.
arXiv Detail & Related papers (2020-07-01T16:03:20Z)
This list is automatically generated from the titles and abstracts of the papers on this site.