LeafAI: query generator for clinical cohort discovery rivaling a human
programmer
- URL: http://arxiv.org/abs/2304.06203v2
- Date: Mon, 14 Aug 2023 18:45:55 GMT
- Title: LeafAI: query generator for clinical cohort discovery rivaling a human
programmer
- Authors: Nicholas J Dobbins, Bin Han, Weipeng Zhou, Kristine Lan, H. Nina Kim,
Robert Harrington, Ozlem Uzuner, Meliha Yetisgen
- Abstract summary: We create a system capable of generating data model-agnostic queries.
We also provide novel logical reasoning capabilities for complex clinical trial eligibility criteria.
- Score: 4.410832512630809
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Objective: Identifying study-eligible patients within clinical databases is a
critical step in clinical research. However, accurate query design typically
requires extensive technical and biomedical expertise. We sought to create a
system capable of generating data model-agnostic queries while also providing
novel logical reasoning capabilities for complex clinical trial eligibility
criteria.
Materials and Methods: The task of query creation from eligibility criteria
requires solving several text-processing problems, including named entity
recognition and relation extraction, sequence-to-sequence transformation,
normalization, and reasoning. We incorporated hybrid deep learning and
rule-based modules for these, as well as a knowledge base of the Unified
Medical Language System (UMLS) and linked ontologies. To enable data
model-agnostic query creation, we introduce a novel method for tagging database
schema elements using UMLS concepts. To evaluate our system, called LeafAI, we
compared the capability of LeafAI with that of a human database programmer to identify
patients who had been enrolled in 8 clinical trials conducted at our
institution. We measured performance by the number of actual enrolled patients
matched by generated queries.
Results: LeafAI matched a mean 43% of enrolled patients with 27,225 eligible
across 8 clinical trials, compared to 27% matched and 14,587 eligible in
queries by a human database programmer. The human programmer spent 26 total
hours crafting queries compared to several minutes by LeafAI.
Conclusions: Our work contributes a state-of-the-art data model-agnostic
query generation system capable of conditional reasoning using a knowledge
base. We demonstrate that LeafAI can rival an experienced human programmer in
finding patients eligible for clinical trials.
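
The evaluation measure described in the abstract (the share of actually enrolled patients that a generated query retrieves, averaged over trials) can be sketched as follows. This is a minimal illustration and not the authors' evaluation code: the function name `matched_fraction`, the trial names, and the patient identifiers are hypothetical. It also assumes that "eligible" refers to the number of patients a query returns.

```python
from __future__ import annotations

from statistics import mean


def matched_fraction(enrolled: set[str], query_result: set[str]) -> float:
    """Fraction of actually enrolled patients found in a query's result set."""
    if not enrolled:
        raise ValueError("enrolled set must not be empty")
    return len(enrolled & query_result) / len(enrolled)


# Hypothetical toy data: trial name -> (enrolled patient IDs, IDs returned by the query).
trials = {
    "trial_A": ({"p1", "p2", "p3", "p4"}, {"p1", "p2", "p9"}),
    "trial_B": ({"p5", "p6"}, {"p5", "p6", "p7", "p8"}),
}

# Per-trial share of enrolled patients that the query matched.
per_trial = {
    name: matched_fraction(enrolled, found)
    for name, (enrolled, found) in trials.items()
}

# Mean matched share across trials (the abstract reports this as a mean percentage).
mean_matched = mean(per_trial.values())

# Total number of patients returned across all queries (assumed meaning of "eligible").
total_eligible = sum(len(found) for _, found in trials.values())

print(per_trial, mean_matched, total_eligible)
```

Averaging per trial, rather than pooling all patients, keeps a single large trial from dominating the result. Whether the paper averages this way beyond what "a mean 43%" implies is an assumption here.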
Related papers
- TrialBench: Multi-Modal Artificial Intelligence-Ready Clinical Trial Datasets [57.067409211231244]
This paper presents meticulously curated AI-ready datasets covering multi-modal data (e.g., drug molecule, disease code, text, categorical/numerical features) and 8 crucial prediction challenges in clinical trial design.
We provide basic validation methods for each task to ensure the datasets' usability and reliability.
We anticipate that the availability of such open-access datasets will catalyze the development of advanced AI approaches for clinical trial design.
arXiv Detail & Related papers (2024-06-30T09:13:10Z)
- Panacea: A foundation model for clinical trial search, summarization, design, and recruitment [29.099676641424384]
We propose a clinical trial foundation model named Panacea.
Panacea is designed to handle multiple tasks, including trial search, trial summarization, trial design, and patient-trial matching.
We also assemble a large-scale dataset, named TrialAlign, of 793,279 trial documents and 1,113,207 trial-related scientific papers.
arXiv Detail & Related papers (2024-06-25T21:29:25Z)
- Towards Efficient Patient Recruitment for Clinical Trials: Application of a Prompt-Based Learning Model [0.7373617024876725]
Clinical trials are essential for advancing pharmaceutical interventions, but they face a bottleneck in selecting eligible participants.
The complex nature of unstructured medical texts presents challenges in efficiently identifying participants.
In this study, we aimed to evaluate the performance of a prompt-based large language model for the cohort selection task.
arXiv Detail & Related papers (2024-04-24T20:42:28Z)
- Zero-Shot Clinical Trial Patient Matching with LLMs [40.31971412825736]
Large language models (LLMs) offer a promising solution to automated screening.
We design an LLM-based system which, given a patient's medical history as unstructured clinical text, evaluates whether that patient meets a set of inclusion criteria.
Our system achieves state-of-the-art scores on the n2c2 2018 cohort selection benchmark.
arXiv Detail & Related papers (2024-02-05T00:06:08Z)
- TREEMENT: Interpretable Patient-Trial Matching via Personalized Dynamic Tree-Based Memory Network [54.332862955411656]
Clinical trials are critical for drug development but often suffer from expensive and inefficient patient recruitment.
In recent years, machine learning models have been proposed for speeding up patient recruitment via automatically matching patients with clinical trials.
We introduce a dynamic tree-based memory network model named TREEMENT to provide accurate and interpretable patient trial matching.
arXiv Detail & Related papers (2023-07-19T12:35:09Z)
- Development and validation of a natural language processing algorithm to pseudonymize documents in the context of a clinical data warehouse [53.797797404164946]
The study highlights the difficulties faced in sharing tools and resources in this domain.
We annotated a corpus of clinical documents according to 12 types of identifying entities.
We build a hybrid system, merging the results of a deep learning model as well as manual rules.
arXiv Detail & Related papers (2023-03-23T17:17:46Z)
- Large Language Models for Biomedical Knowledge Graph Construction: Information extraction from EMR notes [0.0]
We propose an end-to-end machine learning solution based on large language models (LLMs).
The entities used in the KG construction process are diseases, factors, treatments, as well as manifestations that coexist with the patient while experiencing the disease.
The application of the proposed methodology is demonstrated on age-related macular degeneration.
arXiv Detail & Related papers (2023-01-29T15:52:33Z)
- The Leaf Clinical Trials Corpus: a new resource for query generation from clinical trial eligibility criteria [1.7205106391379026]
We introduce the Leaf Clinical Trials (LCT) corpus, a human-annotated corpus of over 1,000 clinical trial eligibility criteria descriptions.
We provide details of our schema, annotation process, corpus quality, and statistics.
arXiv Detail & Related papers (2022-07-27T19:22:24Z)
- VBridge: Connecting the Dots Between Features, Explanations, and Data for Healthcare Models [85.4333256782337]
VBridge is a visual analytics tool that seamlessly incorporates machine learning explanations into clinicians' decision-making workflow.
We identified three key challenges, including clinicians' unfamiliarity with ML features, lack of contextual information, and the need for cohort-level evidence.
We demonstrated the effectiveness of VBridge through two case studies and expert interviews with four clinicians.
arXiv Detail & Related papers (2021-08-04T17:34:13Z)
- Benchmarking Automated Clinical Language Simplification: Dataset, Algorithm, and Evaluation [48.87254340298189]
We construct a new dataset named MedLane to support the development and evaluation of automated clinical language simplification approaches.
We propose a new model called DECLARE that follows the human annotation procedure and achieves state-of-the-art performance.
arXiv Detail & Related papers (2020-12-04T06:09:02Z)
- Hemogram Data as a Tool for Decision-making in COVID-19 Management: Applications to Resource Scarcity Scenarios [62.997667081978825]
The COVID-19 pandemic has challenged emergency response systems worldwide, with widespread reports of essential service breakdowns and the collapse of health care structures.
This work describes a machine learning model derived from hemogram exam data performed in symptomatic patients.
Proposed models can predict COVID-19 qRT-PCR results in symptomatic individuals with high accuracy, sensitivity and specificity.
arXiv Detail & Related papers (2020-05-10T01:45:03Z)