A Search Engine for Discovery of Biomedical Challenges and Directions
- URL: http://arxiv.org/abs/2108.13751v1
- Date: Tue, 31 Aug 2021 11:08:20 GMT
- Title: A Search Engine for Discovery of Biomedical Challenges and Directions
- Authors: Dan Lahav, Jon Saad Falcon, Bailey Kuehl, Sophie Johnson, Sravanthi
Parasa, Noam Shomron, Duen Horng Chau, Diyi Yang, Eric Horvitz, Daniel S.
Weld and Tom Hope
- Abstract summary: We construct and release an expert-annotated corpus of texts sampled from full-length papers.
We focus on a large corpus of interdisciplinary work relating to the COVID-19 pandemic.
We apply a model trained on our data to identify challenges and directions across the corpus and build a dedicated search engine for this information.
- Score: 38.72769142277108
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The ability to keep track of scientific challenges, advances and emerging
directions is a fundamental part of research. However, researchers face a flood
of papers that hinders discovery of important knowledge. In biomedicine, this
directly impacts human lives. To address this problem, we present a novel task
of extraction and search of scientific challenges and directions, to facilitate
rapid knowledge discovery. We construct and release an expert-annotated corpus
of texts sampled from full-length papers, labeled with novel semantic
categories that generalize across many types of challenges and directions. We
focus on a large corpus of interdisciplinary work relating to the COVID-19
pandemic, ranging from biomedicine to areas such as AI and economics. We apply
a model trained on our data to identify challenges and directions across the
corpus and build a dedicated search engine for this information. In studies
with researchers, including those working directly on COVID-19, we outperform a
popular scientific search engine in assisting knowledge discovery. Finally, we
show that models trained on our resource generalize to the wider biomedical
domain, highlighting its broad utility. We make our data, model and search
engine publicly available. https://challenges.apps.allenai.org
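The pipeline described in the abstract (label text as a challenge or a direction, then search over the labeled text) can be illustrated with a toy sketch. This is not the authors' model or search engine: the cue-phrase lists stand in for the trained classifier, the sample sentences are invented for illustration, and all names (`label_sentence`, `build_index`, `search`) are hypothetical. It also assumes the two labels are assigned independently, so a sentence may carry both.

```python
from dataclasses import dataclass

# Hypothetical cue phrases; a stand-in for a trained classifier, not the paper's model.
CHALLENGE_CUES = ("remains unclear", "limitation", "difficult", "lack of", "challenge")
DIRECTION_CUES = ("future work", "further research", "should be investigated", "promising")

# Invented example sentences, used only to exercise the sketch.
SAMPLE_SENTENCES = [
    "The duration of immunity after infection remains unclear.",
    "Further research on vaccine dosing intervals is needed.",
    "The vaccine was administered to 200 participants.",
]


@dataclass
class LabeledSentence:
    text: str
    is_challenge: bool
    is_direction: bool


def label_sentence(text: str) -> LabeledSentence:
    """Assign challenge/direction labels independently via cue-phrase matching."""
    lowered = text.lower()
    return LabeledSentence(
        text=text,
        is_challenge=any(cue in lowered for cue in CHALLENGE_CUES),
        is_direction=any(cue in lowered for cue in DIRECTION_CUES),
    )


def build_index(sentences: list[str]) -> list[LabeledSentence]:
    """Label every sentence once so that searches only filter and match."""
    return [label_sentence(s) for s in sentences]


def search(index: list[LabeledSentence], query: str, label: str) -> list[str]:
    """Return sentences carrying `label` that contain every query term."""
    if label not in ("challenge", "direction"):
        raise ValueError("label must be 'challenge' or 'direction'")
    terms = query.lower().split()
    hits = []
    for item in index:
        has_label = item.is_challenge if label == "challenge" else item.is_direction
        if has_label and all(term in item.text.lower() for term in terms):
            hits.append(item.text)
    return hits


if __name__ == "__main__":
    index = build_index(SAMPLE_SENTENCES)
    print(search(index, "immunity", "challenge"))
    print(search(index, "vaccine", "direction"))
```

In a real system the cue-phrase matcher would be replaced by a model trained on the annotated corpus, and the linear scan by a proper retrieval index; the sketch only shows how label filtering and query matching combine.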
Related papers
- DISCOVERYWORLD: A Virtual Environment for Developing and Evaluating Automated Scientific Discovery Agents [49.74065769505137]
We introduce DISCOVERYWORLD, the first virtual environment for developing and benchmarking an agent's ability to perform complete cycles of novel scientific discovery.
It includes 120 different challenge tasks spanning eight topics each with three levels of difficulty and several parametric variations.
We find that strong baseline agents that perform well in previously published environments struggle on most DISCOVERYWORLD tasks.
arXiv Detail & Related papers (2024-06-10T20:08:44Z)
- PubMed and Beyond: Biomedical Literature Search in the Age of Artificial Intelligence [6.10182662240717]
Literature search is an essential tool for building on prior knowledge in clinical and biomedical research.
Recent improvements in artificial intelligence have expanded functionality beyond keyword-based search.
We present a survey of literature search tools tailored to both general and specific information needs in biomedicine.
arXiv Detail & Related papers (2023-07-18T23:35:53Z)
- How Data Scientists Review the Scholarly Literature [4.406926847270567]
We examine the literature review practices of data scientists.
Data science represents a field seeing an exponential rise in papers.
No prior work has examined the specific practices and challenges faced by these scientists.
arXiv Detail & Related papers (2023-01-10T03:53:05Z)
- Discovering Drug-Target Interaction Knowledge from Biomedical Literature [107.98712673387031]
The interaction between drugs and targets (DTI) in the human body plays a crucial role in biomedical science and applications.
As millions of papers come out every year in the biomedical domain, automatically discovering DTI knowledge from literature becomes an urgent demand in the industry.
We explore the first end-to-end solution for this task by using generative approaches.
We regard the DTI triplets as a sequence and use a Transformer-based model to directly generate them without using the detailed annotations of entities and relations.
arXiv Detail & Related papers (2021-09-27T17:00:14Z)
- A Search Engine for Scientific Publications: a Cybersecurity Case Study [0.7734726150561086]
This work proposes a new search engine for scientific publications which combines both information retrieval and reading comprehension algorithms.
The proposed solution, although applied in the context of cybersecurity, exhibited great generalization capabilities and can be easily adapted to other knowledge domains.
arXiv Detail & Related papers (2021-06-30T20:10:04Z)
- Domain-Specific Pretraining for Vertical Search: Case Study on Biomedical Literature [67.4680600632232]
Self-supervised learning has emerged as a promising direction to overcome the annotation bottleneck.
We propose a general approach for vertical search based on domain-specific pretraining.
Our system can scale to tens of millions of articles on PubMed and has been deployed as Microsoft Biomedical Search.
arXiv Detail & Related papers (2021-06-25T01:02:55Z)
- A field guide to cultivating computational biology [1.040598660564506]
Biomedical research centers can empower basic discovery and novel therapeutic strategies by leveraging their large-scale datasets from experiments and patients.
This data, together with new technologies to create and analyze it, has ushered in an era of data-driven discovery which requires moving beyond the traditional individual, single-discipline investigator research model.
We propose solutions for individual scientists, institutions, journal publishers, funding agencies, and educators.
arXiv Detail & Related papers (2021-04-23T01:24:21Z)
- Searching Scientific Literature for Answers on COVID-19 Questions [19.340724359324803]
The TREC COVID search track aims to assist in creating search tools to aid scientists, clinicians, policy makers and others with similar information needs.
We propose a novel method for neural retrieval, and demonstrate its effectiveness on the TREC COVID search.
arXiv Detail & Related papers (2020-07-06T01:34:25Z)
- COVID-19 Literature Knowledge Graph Construction and Drug Repurposing Report Generation [79.33545724934714]
We have developed a novel and comprehensive knowledge discovery framework, COVID-KG, to extract fine-grained multimedia knowledge elements from scientific literature.
Our framework also provides detailed contextual sentences, subfigures, and knowledge subgraphs as evidence.
arXiv Detail & Related papers (2020-07-01T16:03:20Z)
- Rapidly Deploying a Neural Search Engine for the COVID-19 Open Research Dataset: Preliminary Thoughts and Lessons Learned [88.42878484408469]
We present the Neural Covidex, a search engine that exploits the latest neural ranking architectures.
This paper describes our initial efforts and offers a few thoughts about lessons we have learned along the way.
arXiv Detail & Related papers (2020-04-10T17:12:29Z)
This list is automatically generated from the titles and abstracts of the papers on this site.