DICE: Data-Efficient Clinical Event Extraction with Generative Models
- URL: http://arxiv.org/abs/2208.07989v2
- Date: Thu, 25 May 2023 11:04:05 GMT
- Title: DICE: Data-Efficient Clinical Event Extraction with Generative Models
- Authors: Mingyu Derek Ma, Alexander K. Taylor, Wei Wang, Nanyun Peng
- Abstract summary: Event extraction for the clinical domain is an under-explored research area.
We introduce DICE, a robust and data-efficient generative model for clinical event extraction.
Our experiments demonstrate state-of-the-art performance of DICE on clinical and news domain event extraction.
- Score: 93.49354508621232
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Event extraction for the clinical domain is an under-explored research area.
The lack of training data along with the high volume of domain-specific
terminologies with vague entity boundaries makes the task especially
challenging. In this paper, we introduce DICE, a robust and data-efficient
generative model for clinical event extraction. DICE frames event extraction as
a conditional generation problem and introduces a contrastive learning
objective to accurately decide the boundaries of biomedical mentions. DICE also
trains an auxiliary mention identification task jointly with event extraction
tasks to better identify entity mention boundaries, and further introduces
special markers to incorporate identified entity mentions as trigger and
argument candidates for their respective tasks. To benchmark clinical event
extraction, we compose MACCROBAT-EE, the first clinical event extraction
dataset with argument annotation, based on an existing clinical information
extraction dataset MACCROBAT. Our experiments demonstrate state-of-the-art
performance of DICE on clinical and news domain event extraction, especially
under low-data settings.
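The marker idea in the abstract can be illustrated with a minimal sketch. It assumes mention spans have already been identified and wraps them in marker tokens before the text is passed to a sequence-to-sequence model. The marker strings, the function name `mark_mentions`, and the example sentence are hypothetical and are not taken from the paper.

```python
from typing import List, Tuple

# Hypothetical marker strings; the abstract does not name the actual markers.
OPEN_MARKER = "<ent>"
CLOSE_MARKER = "</ent>"


def mark_mentions(tokens: List[str], spans: List[Tuple[int, int]]) -> str:
    """Wrap each mention span [start, end) in marker tokens.

    Spans are token offsets and are assumed to be non-overlapping.
    """
    starts = {start for start, _ in spans}
    ends = {end for _, end in spans}
    out: List[str] = []
    for i, token in enumerate(tokens):
        if i in starts:
            out.append(OPEN_MARKER)
        out.append(token)
        if i + 1 in ends:
            out.append(CLOSE_MARKER)
    return " ".join(out)


# Illustrative sentence and spans (not from MACCROBAT-EE).
tokens = "The patient developed severe chest pain after surgery".split()
spans = [(3, 6), (7, 8)]  # "severe chest pain", "surgery"
marked = mark_mentions(tokens, spans)
print(marked)
```

A generative extractor would then condition on the marked string, so that the marked spans act as trigger and argument candidates. How DICE builds its actual prompts and targets is not specified in this summary.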
Related papers
- Sparse Anatomical Prompt Semi-Supervised Learning with Masked Image
Modeling for CBCT Tooth Segmentation [10.617296334463942]
Tooth identification and segmentation in Cone Beam Computed Tomography (CBCT) dental images can significantly enhance the efficiency and precision of manual diagnoses performed by dentists.
Existing segmentation methods are mainly trained on large volumes of data, whose annotation is extremely time-consuming.
This study proposes a task-oriented Masked Auto-Encoder paradigm to effectively utilize large amounts of unlabeled data to achieve accurate tooth segmentation with limited labeled data.
arXiv Detail & Related papers (2024-02-07T05:05:21Z) - Segment Together: A Versatile Paradigm for Semi-Supervised Medical Image
Segmentation [17.69933345468061]
Annotation scarcity has become a major obstacle for training powerful deep-learning models for medical image segmentation.
We introduce a Versatile Semi-supervised framework to exploit more unlabeled data for semi-supervised medical image segmentation.
arXiv Detail & Related papers (2023-11-20T11:35:52Z) - Knowledge-Infused Prompting: Assessing and Advancing Clinical Text Data
Generation with Large Language Models [48.07083163501746]
Clinical natural language processing requires methods that can address domain-specific challenges.
We propose an innovative, resource-efficient approach, ClinGen, which infuses knowledge into the process.
Our empirical study across 7 clinical NLP tasks and 16 datasets reveals that ClinGen consistently enhances performance across various tasks.
arXiv Detail & Related papers (2023-11-01T04:37:28Z) - ArSDM: Colonoscopy Images Synthesis with Adaptive Refinement Semantic
Diffusion Models [69.9178140563928]
Colonoscopy analysis is essential for assisting clinical diagnosis and treatment.
The scarcity of annotated data limits the effectiveness and generalization of existing methods.
We propose an Adaptive Refinement Semantic Diffusion Model (ArSDM) to generate colonoscopy images that benefit the downstream tasks.
arXiv Detail & Related papers (2023-09-03T07:55:46Z) - DIAS: A Dataset and Benchmark for Intracranial Artery Segmentation in DSA sequences [19.61593883367223]
Segmentation of Intracranial Arteries (IA) in Digital Subtraction Angiography (DSA) plays a crucial role in the quantification of vascular morphology.
Current research primarily focuses on the segmentation of single-frame DSA using proprietary datasets.
We introduce DIAS, a dataset specifically developed for IA segmentation in DSA sequences.
arXiv Detail & Related papers (2023-06-21T10:03:56Z) - Medical Data Augmentation via ChatGPT: A Case Study on Medication
Identification and Medication Event Classification [2.980018103007841]
In the N2C2 2022 competitions, various tasks were presented to promote the identification of key factors in electronic health records.
Pretrained large language models (LLMs) demonstrated exceptional performance in these tasks.
This study aims to explore the utilization of LLMs, specifically ChatGPT, for data augmentation to overcome the limited availability of annotated data.
arXiv Detail & Related papers (2023-06-10T20:55:21Z) - Development and validation of a natural language processing algorithm to
pseudonymize documents in the context of a clinical data warehouse [53.797797404164946]
The study highlights the difficulties faced in sharing tools and resources in this domain.
We annotated a corpus of clinical documents according to 12 types of identifying entities.
We build a hybrid system, merging the results of a deep learning model as well as manual rules.
arXiv Detail & Related papers (2023-03-23T17:17:46Z) - MEE: A Novel Multilingual Event Extraction Dataset [62.80569691825534]
Event Extraction aims to recognize event mentions and their arguments from text.
The lack of high-quality multilingual EE datasets for model training and evaluation has been the main hindrance.
We propose a novel Multilingual Event Extraction dataset (MEE) that provides annotation for more than 50K event mentions in 8 typologically different languages.
arXiv Detail & Related papers (2022-11-11T02:01:41Z) - One-Shot Medical Landmark Localization by Edge-Guided Transform and
Noisy Landmark Refinement [59.14062241534754]
We propose a two-stage framework for one-shot medical landmark localization.
In stage I, we learn an end-to-end cascade of global alignment and local deformations, under the guidance of novel loss functions.
In stage II, we explore self-consistency for selecting reliable pseudo labels and cross-consistency for semi-supervised learning.
arXiv Detail & Related papers (2022-07-31T15:42:28Z) - Back to Prior Knowledge: Joint Event Causality Extraction via
Convolutional Semantic Infusion [5.566928318239452]
Joint event and causality extraction is a challenging yet essential task in information retrieval and data mining.
We propose convolutional knowledge infusion for frequent n-grams with windows of different lengths within a joint extraction framework.
Our model significantly outperforms the strong BERT+CSNN baseline.
arXiv Detail & Related papers (2021-02-19T13:31:46Z)