Related papers: A Search Engine for Discovery of Biomedical Challenges and Directions

URL: http://arxiv.org/abs/2108.13751v1
Date: Tue, 31 Aug 2021 11:08:20 GMT
Title: A Search Engine for Discovery of Biomedical Challenges and Directions
Authors: Dan Lahav, Jon Saad Falcon, Bailey Kuehl, Sophie Johnson, Sravanthi Parasa, Noam Shomron, Duen Horng Chau, Diyi Yang, Eric Horvitz, Daniel S. Weld and Tom Hope
Abstract summary: We construct and release an expert-annotated corpus of texts sampled from full-length papers. We focus on a large corpus of interdisciplinary work relating to the COVID-19 pandemic. We apply a model trained on our data to identify challenges and directions across the corpus and build a dedicated search engine for this information.
Score: 38.72769142277108
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: The ability to keep track of scientific challenges, advances and emerging directions is a fundamental part of research. However, researchers face a flood of papers that hinders discovery of important knowledge. In biomedicine, this directly impacts human lives. To address this problem, we present a novel task of extraction and search of scientific challenges and directions, to facilitate rapid knowledge discovery. We construct and release an expert-annotated corpus of texts sampled from full-length papers, labeled with novel semantic categories that generalize across many types of challenges and directions. We focus on a large corpus of interdisciplinary work relating to the COVID-19 pandemic, ranging from biomedicine to areas such as AI and economics. We apply a model trained on our data to identify challenges and directions across the corpus and build a dedicated search engine for this information. In studies with researchers, including those working directly on COVID-19, we outperform a popular scientific search engine in assisting knowledge discovery. Finally, we show that models trained on our resource generalize to the wider biomedical domain, highlighting its broad utility. We make our data, model and search engine publicly available. https://challenges.apps.allenai.org

Related papers

Preface to the Special Issue of the TAL Journal on Scholarly Document Processing [33.04325179283727]
The rapid growth of scholarly literature makes it increasingly difficult for researchers to keep up with new knowledge. This special issue of the TAL journal highlights research on natural language processing and information retrieval for scholarly and scientific documents.
arXiv Detail & Related papers (2025-06-04T05:35:39Z)
Transforming Science with Large Language Models: A Survey on AI-assisted Scientific Discovery, Experimentation, Content Generation, and Evaluation [58.064940977804596]
A plethora of new AI models and tools has been proposed, promising to empower researchers and academics worldwide to conduct their research more effectively and efficiently. Ethical concerns regarding shortcomings of these tools and potential for misuse take a particularly prominent place in our discussion.
arXiv Detail & Related papers (2025-02-07T18:26:45Z)
Applications and Challenges of AI and Microscopy in Life Science Research: A Review [7.771558261139913]
This paper explores the intersection of AI and microscopy in life sciences, emphasizing their potential applications and associated challenges. We provide a detailed review of how various biological systems can benefit from AI, highlighting the types of data and labeling requirements unique to this domain. Specific attention is given to microscopy data, exploring the AI techniques required to process and interpret this information.
arXiv Detail & Related papers (2025-01-22T08:32:36Z)
DISCOVERYWORLD: A Virtual Environment for Developing and Evaluating Automated Scientific Discovery Agents [49.74065769505137]
We introduce DISCOVERYWORLD, the first virtual environment for developing and benchmarking an agent's ability to perform complete cycles of novel scientific discovery. It includes 120 different challenge tasks spanning eight topics, each with three levels of difficulty and several parametric variations. We find that strong baseline agents that perform well in previously published environments struggle on most DISCOVERYWORLD tasks.
arXiv Detail & Related papers (2024-06-10T20:08:44Z)
PubMed and Beyond: Biomedical Literature Search in the Age of Artificial Intelligence [6.10182662240717]
Literature search is an essential tool for building on prior knowledge in clinical and biomedical research. Recent improvements in artificial intelligence have expanded functionality beyond keyword-based search. We present a survey of literature search tools tailored to both general and specific information needs in biomedicine.
arXiv Detail & Related papers (2023-07-18T23:35:53Z)
How Data Scientists Review the Scholarly Literature [4.406926847270567]
We examine the literature review practices of data scientists. Data science represents a field seeing an exponential rise in papers. No prior work has examined the specific practices and challenges faced by these scientists.
arXiv Detail & Related papers (2023-01-10T03:53:05Z)
Discovering Drug-Target Interaction Knowledge from Biomedical Literature [107.98712673387031]
The interaction between drugs and targets (DTI) in the human body plays a crucial role in biomedical science and applications. As millions of papers come out every year in the biomedical domain, automatically discovering DTI knowledge from the literature has become an urgent demand in the industry. We explore the first end-to-end solution for this task by using generative approaches. We regard the DTI triplets as a sequence and use a Transformer-based model to directly generate them without using the detailed annotations of entities and relations.
arXiv Detail & Related papers (2021-09-27T17:00:14Z)
A Search Engine for Scientific Publications: a Cybersecurity Case Study [0.7734726150561086]
This work proposes a new search engine for scientific publications that combines information retrieval and reading comprehension algorithms. The proposed solution, although applied to the context of cybersecurity, exhibited great generalization capabilities and can be easily adapted to other knowledge domains.
arXiv Detail & Related papers (2021-06-30T20:10:04Z)
Domain-Specific Pretraining for Vertical Search: Case Study on Biomedical Literature [67.4680600632232]
Self-supervised learning has emerged as a promising direction to overcome the annotation bottleneck. We propose a general approach for vertical search based on domain-specific pretraining. Our system can scale to tens of millions of articles on PubMed and has been deployed as Microsoft Biomedical Search.
arXiv Detail & Related papers (2021-06-25T01:02:55Z)
A field guide to cultivating computational biology [1.040598660564506]
Biomedical research centers can empower basic discovery and novel therapeutic strategies by leveraging their large-scale datasets from experiments and patients. This data, together with new technologies to create and analyze it, has ushered in an era of data-driven discovery which requires moving beyond the traditional individual, single-discipline investigator research model. We propose solutions for individual scientists, institutions, journal publishers, funding agencies, and educators.
arXiv Detail & Related papers (2021-04-23T01:24:21Z)
Searching Scientific Literature for Answers on COVID-19 Questions [19.340724359324803]
The TREC COVID search track aims to assist in creating search tools to aid scientists, clinicians, policy makers and others with similar information needs. We propose a novel method for neural retrieval, and demonstrate its effectiveness on the TREC COVID search track.
arXiv Detail & Related papers (2020-07-06T01:34:25Z)
COVID-19 Literature Knowledge Graph Construction and Drug Repurposing Report Generation [79.33545724934714]
We have developed a novel and comprehensive knowledge discovery framework, COVID-KG, to extract fine-grained multimedia knowledge elements from scientific literature. Our framework also provides detailed contextual sentences, subfigures, and knowledge subgraphs as evidence.
arXiv Detail & Related papers (2020-07-01T16:03:20Z)
Rapidly Deploying a Neural Search Engine for the COVID-19 Open Research Dataset: Preliminary Thoughts and Lessons Learned [88.42878484408469]
We present the Neural Covidex, a search engine that exploits the latest neural ranking architectures. This paper describes our initial efforts and offers a few thoughts about lessons we have learned along the way.
arXiv Detail & Related papers (2020-04-10T17:12:29Z)

This list is automatically generated from the titles and abstracts of the papers on this site.