DICE: Data-Efficient Clinical Event Extraction with Generative Models
- URL: http://arxiv.org/abs/2208.07989v2
- Date: Thu, 25 May 2023 11:04:05 GMT
- Title: DICE: Data-Efficient Clinical Event Extraction with Generative Models
- Authors: Mingyu Derek Ma, Alexander K. Taylor, Wei Wang, Nanyun Peng
- Abstract summary: Event extraction for the clinical domain is an under-explored research area.
We introduce DICE, a robust and data-efficient generative model for clinical event extraction.
Our experiments demonstrate state-of-the-art performance of DICE for clinical and news domain event extraction.
- Score: 93.49354508621232
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Event extraction for the clinical domain is an under-explored research area.
The lack of training data along with the high volume of domain-specific
terminologies with vague entity boundaries makes the task especially
challenging. In this paper, we introduce DICE, a robust and data-efficient
generative model for clinical event extraction. DICE frames event extraction as
a conditional generation problem and introduces a contrastive learning
objective to accurately decide the boundaries of biomedical mentions. DICE also
trains an auxiliary mention identification task jointly with event extraction
tasks to better identify entity mention boundaries, and further introduces
special markers to incorporate identified entity mentions as trigger and
argument candidates for their respective tasks. To benchmark clinical event
extraction, we compose MACCROBAT-EE, the first clinical event extraction
dataset with argument annotation, based on an existing clinical information
extraction dataset MACCROBAT. Our experiments demonstrate state-of-the-art
performance of DICE for clinical and news domain event extraction, especially
under low data settings.
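The abstract mentions special markers that expose identified entity mentions as trigger and argument candidates. As a rough illustration only, the sketch below wraps mention spans in marker tokens before the text would be passed to a generative model. The marker strings, the function name `mark_mentions`, and the example sentence are all assumptions made here for illustration; the abstract does not specify how DICE actually formats its inputs.

```python
from typing import List, Tuple

# Hypothetical marker tokens; the abstract does not give the actual strings.
OPEN_MARK = "<ent>"
CLOSE_MARK = "</ent>"


def mark_mentions(tokens: List[str], spans: List[Tuple[int, int]]) -> str:
    """Wrap each mention span [start, end) in marker tokens.

    Spans are token-index pairs and are assumed not to overlap.
    """
    starts = {start for start, _ in spans}
    ends = {end for _, end in spans}
    out: List[str] = []
    for i, tok in enumerate(tokens):
        # Close a finished span before opening a new one at the same index.
        if i in ends:
            out.append(CLOSE_MARK)
        if i in starts:
            out.append(OPEN_MARK)
        out.append(tok)
    # A span may end exactly at the end of the sentence.
    if len(tokens) in ends:
        out.append(CLOSE_MARK)
    return " ".join(out)


# Illustrative sentence and spans (not taken from MACCROBAT-EE).
tokens = "The patient developed severe chest pain after surgery".split()
marked = mark_mentions(tokens, [(3, 6), (7, 8)])
print(marked)
```

In a setup like this, the marked string would serve as the conditioning input, so the model can see candidate mention boundaries directly rather than having to infer them.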
Related papers
- Boosting Sclera Segmentation through Semi-supervised Learning with Fewer Labels [8.313448026908729]
This paper introduces a novel sclera segmentation framework that excels with limited labeled samples.
We employ a semi-supervised learning method that integrates domain-specific improvements and image-based spatial transformations to enhance segmentation performance.
arXiv Detail & Related papers (2025-01-13T23:38:49Z)
- Sparse Anatomical Prompt Semi-Supervised Learning with Masked Image Modeling for CBCT Tooth Segmentation [9.373643627609336]
Tooth identification and segmentation in Cone Beam Computed Tomography (CBCT) dental images can significantly enhance the efficiency and precision of manual diagnoses performed by dentists.
Existing segmentation methods mainly depend on training with large volumes of data, for which annotation is extremely time-consuming.
This study proposes a task-oriented Masked Auto-Encoder paradigm to effectively utilize large amounts of unlabeled data to achieve accurate tooth segmentation with limited labeled data.
arXiv Detail & Related papers (2024-02-07T05:05:21Z)
- ArSDM: Colonoscopy Images Synthesis with Adaptive Refinement Semantic Diffusion Models [69.9178140563928]
Colonoscopy analysis is essential for assisting clinical diagnosis and treatment.
The scarcity of annotated data limits the effectiveness and generalization of existing methods.
We propose an Adaptive Refinement Semantic Diffusion Model (ArSDM) to generate colonoscopy images that benefit the downstream tasks.
arXiv Detail & Related papers (2023-09-03T07:55:46Z)
- DIAS: A Dataset and Benchmark for Intracranial Artery Segmentation in DSA sequences [19.61593883367223]
Segmentation of Intracranial Arteries (IA) in Digital Subtraction Angiography (DSA) plays a crucial role in the quantification of vascular morphology.
Current research primarily focuses on the segmentation of single-frame DSA using proprietary datasets.
We introduce DIAS, a dataset specifically developed for IA segmentation in DSA sequences.
arXiv Detail & Related papers (2023-06-21T10:03:56Z)
- Medical Data Augmentation via ChatGPT: A Case Study on Medication Identification and Medication Event Classification [2.980018103007841]
In the N2C2 2022 competitions, various tasks were presented to promote the identification of key factors in electronic health records.
Pretrained large language models (LLMs) demonstrated exceptional performance in these tasks.
This study aims to explore the utilization of LLMs, specifically ChatGPT, for data augmentation to overcome the limited availability of annotated data.
arXiv Detail & Related papers (2023-06-10T20:55:21Z)
- Boosting Event Extraction with Denoised Structure-to-Text Augmentation [52.21703002404442]
Event extraction aims to recognize pre-defined event triggers and arguments from texts.
Recent data augmentation methods often neglect the problem of grammatical incorrectness.
We propose DAEE, a denoised structure-to-text augmentation framework for event extraction.
arXiv Detail & Related papers (2023-05-16T16:52:07Z)
- Development and validation of a natural language processing algorithm to pseudonymize documents in the context of a clinical data warehouse [53.797797404164946]
The study highlights the difficulties faced in sharing tools and resources in this domain.
We annotated a corpus of clinical documents according to 12 types of identifying entities.
We build a hybrid system, merging the results of a deep learning model as well as manual rules.
arXiv Detail & Related papers (2023-03-23T17:17:46Z)
- MEE: A Novel Multilingual Event Extraction Dataset [62.80569691825534]
Event Extraction aims to recognize event mentions and their arguments from text.
The lack of high-quality multilingual EE datasets for model training and evaluation has been the main hindrance.
We propose a novel Multilingual Event Extraction dataset (MEE) that provides annotation for more than 50K event mentions in 8 typologically different languages.
arXiv Detail & Related papers (2022-11-11T02:01:41Z)
- One-Shot Medical Landmark Localization by Edge-Guided Transform and Noisy Landmark Refinement [59.14062241534754]
We propose a two-stage framework for one-shot medical landmark localization.
In stage I, we learn an end-to-end cascade of global alignment and local deformations, under the guidance of novel loss functions.
In stage II, we explore self-consistency for selecting reliable pseudo labels and cross-consistency for semi-supervised learning.
arXiv Detail & Related papers (2022-07-31T15:42:28Z)
- Back to Prior Knowledge: Joint Event Causality Extraction via Convolutional Semantic Infusion [5.566928318239452]
Joint event and causality extraction is a challenging yet essential task in information retrieval and data mining.
We propose convolutional knowledge infusion for frequent n-grams with different windows of length within a joint extraction framework.
Our model significantly outperforms the strong BERT+CSNN baseline.
arXiv Detail & Related papers (2021-02-19T13:31:46Z)
This list is automatically generated from the titles and abstracts of the papers in this site.