Information Extraction of Clinical Trial Eligibility Criteria
- URL: http://arxiv.org/abs/2006.07296v6
- Date: Tue, 28 Jul 2020 17:50:42 GMT
- Title: Information Extraction of Clinical Trial Eligibility Criteria
- Authors: Yitong Tseo, M. I. Salkola, Ahmed Mohamed, Anuj Kumar, Freddy Abnousi
- Abstract summary: This paper investigates an information extraction (IE) approach for grounding criteria from trials in ClinicalTrials.gov to a shared knowledge base.
We frame the problem as a novel knowledge base population task and implement a solution combining machine learning and a context-free grammar.
To our knowledge, this work is the first criteria extraction system to apply an attention-based conditional random field architecture for named entity recognition.
- Score: 6.192164049563104
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Clinical trials predicate subject eligibility on a diversity of criteria
ranging from patient demographics to food allergies. Trials post their
requirements as semantically complex, unstructured free text. Formalizing trial
criteria in a computer-interpretable syntax would facilitate eligibility
determination. In this paper, we investigate an information extraction (IE)
approach for grounding criteria from trials in ClinicalTrials.gov to a
shared knowledge base. We frame the problem as a novel knowledge base
population task and implement a solution combining machine learning and a
context-free grammar. To our knowledge, this work is the first criteria
extraction system to apply an attention-based conditional random field
architecture for named entity recognition (NER) and word2vec embedding
clustering for named entity linking (NEL). We release the resources and core
components of our system on GitHub at
https://github.com/facebookresearch/Clinical-Trial-Parser. Finally, we report
our per-module and end-to-end performance; we conclude that our system is
competitive with Criteria2Query, which we view as the current state of the art
in criteria extraction.
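As a rough illustration of what "formalizing trial criteria to a computer-interpretable syntax" can mean, the sketch below parses simple numeric criteria with a tiny hand-written grammar. It is not the authors' parser: the `Relation` type, the `parse_criterion` function, and the two-rule grammar are hypothetical and cover only criteria of the form `variable comparator number [unit]` or `variable between number and number [unit]`.

```python
import re
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Relation:
    """Hypothetical structured form of one numeric criterion."""
    variable: str
    lower: Optional[float]
    upper: Optional[float]
    unit: Optional[str]


# Numbers, comparison operators, and word-like tokens (including units such as kg/m2).
TOKEN = re.compile(r"\d+(?:\.\d+)?|>=|<=|[<>=]|[A-Za-z][A-Za-z0-9/%]*")
COMPARATORS = (">=", ">", "<=", "<", "=")


def tokenize(text: str) -> List[str]:
    return TOKEN.findall(text.lower())


def is_number(token: str) -> bool:
    return re.fullmatch(r"\d+(?:\.\d+)?", token) is not None


def parse_criterion(text: str) -> Optional[Relation]:
    """Parse one criterion; return None when the toy grammar does not apply."""
    tokens = tokenize(text)
    if len(tokens) < 3:
        return None
    variable, op, rest = tokens[0], tokens[1], tokens[2:]

    # Rule 1: variable "between" NUMBER "and" NUMBER [UNIT]
    if op == "between":
        if (len(rest) >= 3 and is_number(rest[0])
                and rest[1] == "and" and is_number(rest[2])):
            unit = rest[3] if len(rest) > 3 else None
            return Relation(variable, float(rest[0]), float(rest[2]), unit)
        return None

    # Rule 2: variable COMPARATOR NUMBER [UNIT]
    # Simplification: strict and inclusive bounds are not distinguished.
    if op in COMPARATORS and is_number(rest[0]):
        value = float(rest[0])
        unit = rest[1] if len(rest) > 1 else None
        lower = value if op in (">=", ">", "=") else None
        upper = value if op in ("<=", "<", "=") else None
        return Relation(variable, lower, upper, unit)

    return None


print(parse_criterion("age >= 18 years"))
print(parse_criterion("BMI between 18.5 and 30 kg/m2"))
print(parse_criterion("history of stroke"))  # outside the toy grammar
```

Free-text criteria such as "history of stroke" fall outside this grammar, which is where the learned NER and NEL components described in the abstract would be needed.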
Related papers
- Towards Efficient Patient Recruitment for Clinical Trials: Application of a Prompt-Based Learning Model [0.7373617024876725]
Clinical trials are essential for advancing pharmaceutical interventions, but they face a bottleneck in selecting eligible participants.
The complex nature of unstructured medical texts presents challenges in efficiently identifying participants.
In this study, we aimed to evaluate the performance of a prompt-based large language model for the cohort selection task.
arXiv Detail & Related papers (2024-04-24T20:42:28Z)
- AutoTrial: Prompting Language Models for Clinical Trial Design [53.630479619856516]
We present a method named AutoTrial to aid the design of clinical eligibility criteria using language models.
Experiments on over 70K clinical trials verify that AutoTrial generates high-quality criteria texts.
arXiv Detail & Related papers (2023-05-19T01:04:16Z)
- IXA/Cogcomp at SemEval-2023 Task 2: Context-enriched Multilingual Named Entity Recognition using Knowledge Bases [53.054598423181844]
We present a novel NER cascade approach comprising three steps.
We empirically demonstrate the significance of external knowledge bases in accurately classifying fine-grained and emerging entities.
Our system exhibits robust performance in the MultiCoNER2 shared task, even in the low-resource language setting.
arXiv Detail & Related papers (2023-04-20T20:30:34Z)
- LeafAI: query generator for clinical cohort discovery rivaling a human programmer [4.410832512630809]
We create a system capable of generating data model-agnostic queries.
We also provide novel logical reasoning capabilities for complex clinical trial eligibility criteria.
arXiv Detail & Related papers (2023-04-13T00:34:32Z)
- Development and validation of a natural language processing algorithm to pseudonymize documents in the context of a clinical data warehouse [53.797797404164946]
The study highlights the difficulties faced in sharing tools and resources in this domain.
We annotated a corpus of clinical documents according to 12 types of identifying entities.
We build a hybrid system, merging the results of a deep learning model as well as manual rules.
arXiv Detail & Related papers (2023-03-23T17:17:46Z)
- The Leaf Clinical Trials Corpus: a new resource for query generation from clinical trial eligibility criteria [1.7205106391379026]
We introduce the Leaf Clinical Trials (LCT) corpus, a human-annotated corpus of over 1,000 clinical trial eligibility criteria descriptions.
We provide details of our schema, annotation process, corpus quality, and statistics.
arXiv Detail & Related papers (2022-07-27T19:22:24Z)
- Self-supervised Answer Retrieval on Clinical Notes [68.87777592015402]
We introduce CAPR, a rule-based self-supervision objective for training Transformer language models for domain-specific passage matching.
We apply our objective in four Transformer-based architectures: Contextual Document Vectors, Bi-, Poly- and Cross-encoders.
We report that CAPR outperforms strong baselines in the retrieval of domain-specific passages and effectively generalizes across rule-based and human-labeled passages.
arXiv Detail & Related papers (2021-08-02T10:42:52Z)
- Literature Retrieval for Precision Medicine with Neural Matching and Faceted Summarization [2.978663539080876]
We present a document reranking approach that combines neural query-document matching and text summarization.
Evaluations using NIST's TREC-PM track datasets show that our model achieves state-of-the-art performance.
arXiv Detail & Related papers (2020-12-17T02:01:32Z)
- Robust Benchmarking for Machine Learning of Clinical Entity Extraction [2.9398911304923447]
We audit the performance of and indicate areas of improvement for state-of-the-art systems.
We find that high task accuracies for clinical entity normalization systems on the 2019 n2c2 Shared Task are misleading.
We reformulate the annotation framework for clinical entity extraction to factor in inconsistencies in medical vocabularies.
arXiv Detail & Related papers (2020-07-31T15:14:05Z)
- COMPOSE: Cross-Modal Pseudo-Siamese Network for Patient Trial Matching [70.08786840301435]
We propose the CrOss-Modal PseudO-SiamEse network (COMPOSE) to address these challenges for patient-trial matching.
Experimental results show that COMPOSE can reach 98.0% AUC on patient-criteria matching and 83.7% accuracy on patient-trial matching.
arXiv Detail & Related papers (2020-06-15T21:01:33Z)
- DeepEnroll: Patient-Trial Matching with Deep Embedding and Entailment Prediction [67.91606509226132]
Clinical trials are essential for drug development but often suffer from expensive, inaccurate and insufficient patient recruitment.
DeepEnroll is a cross-modal inference learning model that jointly encodes enrollment criteria (text) and patient records (tabular data) into a shared latent space for matching inference.
arXiv Detail & Related papers (2020-01-22T17:51:25Z)