Jointly Extracting Interventions, Outcomes, and Findings from RCT Reports with LLMs
- URL: http://arxiv.org/abs/2305.03642v3
- Date: Tue, 18 Jul 2023 01:36:42 GMT
- Title: Jointly Extracting Interventions, Outcomes, and Findings from RCT Reports with LLMs
- Authors: Somin Wadhwa, Jay DeYoung, Benjamin Nye, Silvio Amir, and Byron C. Wallace
- Abstract summary: We propose and evaluate a text-to-text model built on instruction-tuned Large Language Models.
We apply our model to a collection of published RCTs through mid-2022, and release a searchable database of structured findings.
- Score: 21.868871974136884
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Results from Randomized Controlled Trials (RCTs) establish the comparative
effectiveness of interventions, and are in turn critical inputs for
evidence-based care. However, results from RCTs are presented in (often
unstructured) natural language articles describing the design, execution, and
outcomes of trials; clinicians must manually extract findings pertaining to
interventions and outcomes of interest from such articles. This onerous manual
process has motivated work on (semi-)automating extraction of structured
evidence from trial reports. In this work we propose and evaluate a
text-to-text model built on instruction-tuned Large Language Models (LLMs) to
jointly extract Interventions, Outcomes, and Comparators (ICO elements) from
clinical abstracts, and infer the associated results reported. Manual (expert)
and automated evaluations indicate that framing evidence extraction as a
conditional generation task and fine-tuning LLMs for this purpose realizes
considerable ($\sim$20 point absolute F1 score) gains over the previous SOTA.
We perform ablations and error analyses to assess aspects that contribute to
model performance, and to highlight potential directions for further
improvements. We apply our model to a collection of published RCTs through
mid-2022, and release a searchable database of structured findings:
http://ico-relations.ebm-nlp.com
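The core idea is to frame ICO extraction and result inference as conditional text generation: the model reads a trial abstract and generates the extracted elements and their reported findings as text, rather than tagging tokens. Below is a minimal sketch of that framing using an instruction-tuned sequence-to-sequence model through Hugging Face transformers. The FLAN-T5 checkpoint, the instruction wording, and the "intervention | comparator | outcome | finding" output format are illustrative assumptions, not the authors' released configuration; in the paper the LLM is fine-tuned on annotated abstracts rather than used off the shelf.

```python
# Sketch: evidence extraction framed as conditional generation.
# Assumptions (not from the paper's released code): the FLAN-T5 checkpoint,
# the instruction text, and the pipe-separated output schema are placeholders.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

MODEL_NAME = "google/flan-t5-large"  # any instruction-tuned seq2seq LLM

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_NAME)

INSTRUCTION = (
    "Extract every (intervention, comparator, outcome) triple reported in the "
    "clinical trial abstract below, and state whether the intervention was "
    "significantly better, significantly worse, or showed no significant "
    "difference for that outcome. Format each finding as: "
    "intervention | comparator | outcome | finding."
)

def extract_ico_findings(abstract: str, max_new_tokens: int = 256) -> list[str]:
    """Generate structured ICO findings from a single trial abstract."""
    prompt = f"{INSTRUCTION}\n\nAbstract: {abstract}"
    inputs = tokenizer(prompt, return_tensors="pt", truncation=True, max_length=1024)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens, num_beams=4)
    text = tokenizer.decode(output_ids[0], skip_special_tokens=True)
    # One generated line per extracted finding.
    return [line.strip() for line in text.split("\n") if line.strip()]

if __name__ == "__main__":
    demo = (
        "Patients receiving drug X had significantly lower systolic blood "
        "pressure at 12 weeks than those receiving placebo."
    )
    for finding in extract_ico_findings(demo):
        print(finding)
```

For fine-tuning, the same prompt/target strings would serve as input and label pairs in standard seq2seq training; the hypothetical schema above is only one way to linearize ICO tuples and findings into text.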
Related papers
- Beyond Exact Match: Semantically Reassessing Event Extraction by Large Language Models [69.38024658668887]
Current evaluation methods for event extraction rely on token-level exact match.
We propose RAEE, an automatic evaluation framework that accurately assesses event extraction results at semantic-level instead of token-level.
arXiv Detail & Related papers (2024-10-12T07:54:01Z)
- Language Models and Retrieval Augmented Generation for Automated Structured Data Extraction from Diagnostic Reports [2.932283627137903]
The study utilized two datasets: 7,294 radiology reports annotated for Brain Tumor Reporting and Data System (BT-RADS) scores and 2,154 pathology reports for isocitrate dehydrogenase (IDH) mutation status.
arXiv Detail & Related papers (2024-09-15T15:21:45Z)
- Towards Enhancing Coherence in Extractive Summarization: Dataset and Experiments with LLMs [70.15262704746378]
We propose a systematically created human-annotated dataset consisting of coherent summaries for five publicly available datasets and natural language user feedback.
Preliminary experiments with Falcon-40B and Llama-2-13B show significant performance improvements (10% Rouge-L) in terms of producing coherent summaries.
arXiv Detail & Related papers (2024-07-05T20:25:04Z)
- Automatically Extracting Numerical Results from Randomized Controlled Trials with Large Language Models [19.72316842477808]
We evaluate whether modern large language models (LLMs) can reliably perform this task.
Massive LLMs that can accommodate lengthy inputs are tantalizingly close to realizing fully automatic meta-analysis.
arXiv Detail & Related papers (2024-05-02T19:20:11Z)
- Interpretable Medical Diagnostics with Structured Data Extraction by Large Language Models [59.89454513692417]
Tabular data is often hidden in text, particularly in medical diagnostic reports.
We propose a novel, simple, and effective methodology for extracting structured tabular data from textual medical reports, called TEMED-LLM.
We demonstrate that our approach significantly outperforms state-of-the-art text classification models in medical diagnostics.
arXiv Detail & Related papers (2023-06-08T09:12:28Z)
- Automated tabulation of clinical trial results: A joint entity and relation extraction approach with transformer-based language representations [5.825190876052148]
This paper investigates automating evidence table generation by decomposing the problem across two language processing tasks.
We focus on the automatic tabulation of sentences from published RCT abstracts that report trial outcomes.
To train and test these models, a new gold-standard corpus was developed, comprising almost 600 result sentences from six disease areas.
arXiv Detail & Related papers (2021-12-10T15:26:43Z)
- SAIS: Supervising and Augmenting Intermediate Steps for Document-Level Relation Extraction [51.27558374091491]
We propose to explicitly teach the model to capture relevant contexts and entity types by supervising and augmenting intermediate steps (SAIS) for relation extraction.
Based on a broad spectrum of carefully designed tasks, our proposed SAIS method not only extracts relations of better quality due to more effective supervision, but also retrieves the corresponding supporting evidence more accurately.
arXiv Detail & Related papers (2021-09-24T17:37:35Z)
- Eider: Evidence-enhanced Document-level Relation Extraction [56.71004595444816]
Document-level relation extraction (DocRE) aims at extracting semantic relations among entity pairs in a document.
We propose a three-stage evidence-enhanced DocRE framework consisting of joint relation and evidence extraction, evidence-centered relation extraction (RE), and fusion of extraction results.
arXiv Detail & Related papers (2021-06-16T09:43:16Z)
- Cross-Supervised Joint-Event-Extraction with Heterogeneous Information Networks [61.950353376870154]
Joint-event-extraction is a sequence-to-sequence labeling task whose tag set combines trigger and entity tags.
We propose a Cross-Supervised Mechanism (CSM) to alternately supervise the extraction of triggers or entities.
Our approach outperforms the state-of-the-art methods in both entity and trigger extraction.
arXiv Detail & Related papers (2020-10-13T11:51:17Z)
- Predicting Clinical Trial Results by Implicit Evidence Integration [40.80948875051806]
We introduce a novel Clinical Trial Result Prediction (CTRP) task.
In the CTRP framework, a model takes a PICO-formatted clinical trial proposal with its background as input and predicts the result.
We exploit large-scale unstructured sentences from medical literature that implicitly contain PICOs and results as evidence.
arXiv Detail & Related papers (2020-10-12T12:25:41Z)
- Understanding Clinical Trial Reports: Extracting Medical Entities and Their Relations [33.30381080306156]
Medical experts must manually extract information from articles to inform decision-making.
We consider the end-to-end task of both (a) extracting treatments and outcomes from full-text articles describing clinical trials (entity identification) and (b) inferring the reported results for the former with respect to the latter.
arXiv Detail & Related papers (2020-10-07T17:50:58Z)