SemEval-2024 Task 2: Safe Biomedical Natural Language Inference for Clinical Trials
- URL: http://arxiv.org/abs/2404.04963v1
- Date: Sun, 7 Apr 2024 13:58:41 GMT
- Title: SemEval-2024 Task 2: Safe Biomedical Natural Language Inference for Clinical Trials
- Authors: Mael Jullien, Marco Valentino, André Freitas,
- Abstract summary: We present SemEval-2024 Task 2: Safe Biomedical Natural Language Inference for ClinicalTrials.
Our contributions include the refined NLI4CT-P dataset (i.e., Natural Language Inference for Clinical Trials - Perturbed)
A total of 106 participants registered for the task contributing to over 1200 individual submissions and 25 system overview papers.
This initiative aims to advance the robustness and applicability of NLI models in healthcare, ensuring safer and more dependable AI assistance in clinical decision-making.
- Score: 13.59675117792588
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Large Language Models (LLMs) are at the forefront of NLP achievements but fall short in dealing with shortcut learning, factual inconsistency, and vulnerability to adversarial inputs.These shortcomings are especially critical in medical contexts, where they can misrepresent actual model capabilities. Addressing this, we present SemEval-2024 Task 2: Safe Biomedical Natural Language Inference for ClinicalTrials. Our contributions include the refined NLI4CT-P dataset (i.e., Natural Language Inference for Clinical Trials - Perturbed), designed to challenge LLMs with interventional and causal reasoning tasks, along with a comprehensive evaluation of methods and results for participant submissions. A total of 106 participants registered for the task contributing to over 1200 individual submissions and 25 system overview papers. This initiative aims to advance the robustness and applicability of NLI models in healthcare, ensuring safer and more dependable AI assistance in clinical decision-making. We anticipate that the dataset, models, and outcomes of this task can support future research in the field of biomedical NLI. The dataset, competition leaderboard, and website are publicly available.
Related papers
- Demystifying Large Language Models for Medicine: A Primer [50.83806796466396]
Large language models (LLMs) represent a transformative class of AI tools capable of revolutionizing various aspects of healthcare.
This tutorial aims to equip healthcare professionals with the tools necessary to effectively integrate LLMs into clinical practice.
arXiv Detail & Related papers (2024-10-24T15:41:56Z) - D-NLP at SemEval-2024 Task 2: Evaluating Clinical Inference Capabilities of Large Language Models [5.439020425819001]
Large language models (LLMs) have garnered significant attention and widespread usage due to their impressive performance in various tasks.
However, they are not without their own set of challenges, including issues such as hallucinations, factual inconsistencies, and limitations in numerical-quantitative reasoning.
arXiv Detail & Related papers (2024-05-07T10:11:14Z) - Towards Efficient Patient Recruitment for Clinical Trials: Application of a Prompt-Based Learning Model [0.7373617024876725]
Clinical trials are essential for advancing pharmaceutical interventions, but they face a bottleneck in selecting eligible participants.
The complex nature of unstructured medical texts presents challenges in efficiently identifying participants.
In this study, we aimed to evaluate the performance of a prompt-based large language model for the cohort selection task.
arXiv Detail & Related papers (2024-04-24T20:42:28Z) - IITK at SemEval-2024 Task 2: Exploring the Capabilities of LLMs for Safe Biomedical Natural Language Inference for Clinical Trials [4.679320772294786]
Large Language models (LLMs) have demonstrated state-of-the-art performance in various natural language processing (NLP) tasks.
This research investigates LLMs' robustness, consistency, and faithful reasoning when performing Natural Language Inference (NLI) on breast cancer Clinical Trial Reports (CTRs)
We examine the reasoning capabilities of LLMs and their adeptness at logical problem-solving.
arXiv Detail & Related papers (2024-04-06T05:44:53Z) - SEME at SemEval-2024 Task 2: Comparing Masked and Generative Language Models on Natural Language Inference for Clinical Trials [0.9012198585960441]
This paper describes our submission to Task 2 of SemEval-2024: Safe Biomedical Natural Language Inference for Clinical Trials.
The Multi-evidence Natural Language Inference for Clinical Trial Data (NLI4CT) consists of a Textual Entailment task focused on the evaluation of the consistency and faithfulness of Natural Language Inference (NLI) models applied to Clinical Trial Reports (CTR)
arXiv Detail & Related papers (2024-04-05T09:18:50Z) - SemEval-2023 Task 7: Multi-Evidence Natural Language Inference for
Clinical Trial Data [1.6932802756478729]
This paper describes the results of SemEval 2023 task 7 -- Multi-Evidence Natural Language Inference for Clinical Trial Data.
The proposed challenges require multi-hop biomedical and numerical reasoning.
We observe significantly better performance on the evidence selection task than on the entailment task.
arXiv Detail & Related papers (2023-05-04T16:58:19Z) - Development and validation of a natural language processing algorithm to
pseudonymize documents in the context of a clinical data warehouse [53.797797404164946]
The study highlights the difficulties faced in sharing tools and resources in this domain.
We annotated a corpus of clinical documents according to 12 types of identifying entities.
We build a hybrid system, merging the results of a deep learning model as well as manual rules.
arXiv Detail & Related papers (2023-03-23T17:17:46Z) - Informing clinical assessment by contextualizing post-hoc explanations
of risk prediction models in type-2 diabetes [50.8044927215346]
We consider a comorbidity risk prediction scenario and focus on contexts regarding the patients clinical state.
We employ several state-of-the-art LLMs to present contexts around risk prediction model inferences and evaluate their acceptability.
Our paper is one of the first end-to-end analyses identifying the feasibility and benefits of contextual explanations in a real-world clinical use case.
arXiv Detail & Related papers (2023-02-11T18:07:11Z) - ITTC @ TREC 2021 Clinical Trials Track [54.141379782822206]
The task focuses on the problem of matching eligible clinical trials to topics constituting a summary of a patient's admission notes.
We explore different ways of representing trials and topics using NLP techniques, and then use a common retrieval model to generate the ranked list of relevant trials for each topic.
The results from all our submitted runs are well above the median scores for all topics, but there is still plenty of scope for improvement.
arXiv Detail & Related papers (2022-02-16T04:56:47Z) - Clinical Outcome Prediction from Admission Notes using Self-Supervised
Knowledge Integration [55.88616573143478]
Outcome prediction from clinical text can prevent doctors from overlooking possible risks.
Diagnoses at discharge, procedures performed, in-hospital mortality and length-of-stay prediction are four common outcome prediction targets.
We propose clinical outcome pre-training to integrate knowledge about patient outcomes from multiple public sources.
arXiv Detail & Related papers (2021-02-08T10:26:44Z) - Benchmarking Automated Clinical Language Simplification: Dataset,
Algorithm, and Evaluation [48.87254340298189]
We construct a new dataset named MedLane to support the development and evaluation of automated clinical language simplification approaches.
We propose a new model called DECLARE that follows the human annotation procedure and achieves state-of-the-art performance.
arXiv Detail & Related papers (2020-12-04T06:09:02Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.