Large Language Models for Document-Level Event-Argument Data Augmentation for Challenging Role Types
- URL: http://arxiv.org/abs/2403.03304v2
- Date: Wed, 12 Jun 2024 19:21:33 GMT
- Title: Large Language Models for Document-Level Event-Argument Data Augmentation for Challenging Role Types
- Authors: Joseph Gatto, Parker Seegmiller, Omar Sharif, Sarah M. Preum
- Abstract summary: Event Argument Extraction (EAE) is an extremely difficult information extraction problem.
Existing augmentation methods are not well-suited to a variety of real-world EAE contexts.
We introduce two novel LLM-powered data augmentation frameworks for synthesizing document-level EAE samples.
- Score: 1.949927790632678
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Event Argument Extraction (EAE) is an extremely difficult information extraction problem, with significant limitations in few-shot cross-domain (FSCD) settings. A common solution to FSCD modeling is data augmentation. Unfortunately, existing augmentation methods are not well-suited to a variety of real-world EAE contexts, including (i) the need to model long documents (10+ sentences) and (ii) the need to model zero- and few-shot roles (i.e., event roles with little to no training representation). In this work, we introduce two novel LLM-powered data augmentation frameworks for synthesizing extractive document-level EAE samples using zero in-domain training data. Our highest-performing methods provide a 16-point increase in F1 score on extraction of zero-shot role types. To better facilitate analysis of cross-domain EAE, we additionally introduce a new metric, Role-Depth F1 (RDF1), which uses statistical depth to identify roles in the target domain that are semantic outliers with respect to roles observed in the source domain. Our experiments show that LLM-based augmentation can boost RDF1 performance by up to 11 F1 points compared to baseline methods.
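The abstract does not spell out how RDF1 is computed; below is a minimal sketch of the idea, assuming role names are embedded with a sentence encoder and depth is approximated as mean cosine similarity to the source-role embeddings. Both choices are illustrative assumptions, not the paper's exact definitions.

```python
# Hypothetical sketch of a Role-Depth-F1-style metric. Assumptions (not
# taken from the paper): roles are embedded with sentence-transformers,
# depth is the mean cosine similarity to the source-role cloud, and
# "outlier" roles are those in the lowest depth percentile.
import numpy as np
from sentence_transformers import SentenceTransformer

def role_depths(source_roles, target_roles, model_name="all-MiniLM-L6-v2"):
    model = SentenceTransformer(model_name)
    src = model.encode(source_roles, normalize_embeddings=True)
    tgt = model.encode(target_roles, normalize_embeddings=True)
    return (tgt @ src.T).mean(axis=1)  # depth proxy per target role

def f1(tp, fp, fn):
    p = tp / (tp + fp) if tp + fp else 0.0
    r = tp / (tp + fn) if tp + fn else 0.0
    return 2 * p * r / (p + r) if p + r else 0.0

def rdf1(counts, source_roles, target_roles, pct=25):
    """counts: {role: (tp, fp, fn)} from a document-level EAE system."""
    depths = role_depths(source_roles, target_roles)
    cutoff = np.percentile(depths, pct)
    outliers = [r for r, d in zip(target_roles, depths) if d <= cutoff]
    tp = sum(counts[r][0] for r in outliers)
    fp = sum(counts[r][1] for r in outliers)
    fn = sum(counts[r][2] for r in outliers)
    return f1(tp, fp, fn)
```

Restricting the F1 computation to low-depth roles is what makes the metric sensitive to exactly the roles that are semantically far from anything seen in the source domain.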
Related papers
- Reinforcement Learning for Long-Horizon Interactive LLM Agents [56.9860859585028]
Interactive digital agents (IDAs) leverage APIs of stateful digital environments to perform tasks in response to user requests.
We present a reinforcement learning (RL) approach that trains IDAs directly in their target environments.
We derive LOOP, a data- and memory-efficient variant of proximal policy optimization.
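The summary leaves LOOP's mechanics unstated; one standard way to make PPO-style training data- and memory-efficient is to drop the learned value network and baseline each rollout against the other rollouts of the same task. Treating that leave-one-out construction as an assumption about LOOP, a minimal sketch:

```python
# Leave-one-out advantage baseline: no critic parameters to store, and
# every rollout contributes to the baseline of the others. Whether LOOP
# is implemented exactly this way is an assumption here.
import numpy as np

def leave_one_out_advantages(rewards):
    """rewards: shape (K,) for K rollouts of the same task."""
    rewards = np.asarray(rewards, dtype=float)
    k = len(rewards)
    baselines = (rewards.sum() - rewards) / (k - 1)  # mean of the other K-1
    return rewards - baselines

print(leave_one_out_advantages([1.0, 0.0, 0.0, 1.0]))  # approx [ 0.67 -0.67 -0.67  0.67]
```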
arXiv Detail & Related papers (2025-02-03T18:35:42Z) - TAMT: Temporal-Aware Model Tuning for Cross-Domain Few-Shot Action Recognition [39.073835841717184]
Cross-domain few-shot action recognition (CDFSAR) has recently attracted research interest.
This paper proposes a simple yet effective baseline, namely Temporal-Aware Model Tuning (TAMT) for CDFSAR.
Our TAMT involves a decoupled paradigm: pre-training on source data, then fine-tuning on target data, so a single source pre-training run serves multiple targets without retraining.
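Read literally, the decoupled paradigm amortizes one source pre-training run over many target domains; a schematic sketch with placeholder callables (not TAMT's actual API):

```python
# Decoupled transfer recipe: pre-train once, fine-tune a fresh copy per
# target. `pretrain` and `finetune` are hypothetical callables standing
# in for whatever training loops a given method uses.
import copy

def decoupled_transfer(pretrain, finetune, source_data, target_datasets):
    backbone = pretrain(source_data)                       # run once on the source
    return {name: finetune(copy.deepcopy(backbone), data)  # reused per target
            for name, data in target_datasets.items()}
```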
arXiv Detail & Related papers (2024-11-28T10:38:05Z) - One Small and One Large for Document-level Event Argument Extraction [13.25071868664492]
Document-level Event Argument Extraction (EAE) faces two challenges due to increased input length.
The first method is the Co and Structure Event Argument Extraction model (CsEAE), based on Small Language Models (SLMs).
The second method introduces new prompts that recast the extraction task as a generative task suitable for Large Language Models (LLMs).
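An illustrative prompt template (not the paper's exact prompt) showing how document-level argument extraction can be recast as generation:

```python
# Hypothetical generative-EAE prompt: the LLM is asked to copy argument
# spans verbatim, keeping the task extractive in spirit.
PROMPT = """Document:
{document}

Event type: {event_type}
For each role below, copy the exact argument span from the document,
or write "none" if the role is not filled.
Roles: {roles}

Answers (one line per role, formatted as role: span):"""

def build_prompt(document, event_type, roles):
    return PROMPT.format(document=document, event_type=event_type,
                         roles=", ".join(roles))
```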
arXiv Detail & Related papers (2024-11-08T14:44:01Z) - Accelerating Large Language Model Pretraining via LFR Pedagogy: Learn, Focus, and Review [50.78587571704713]
Learn-Focus-Review (LFR) is a dynamic training approach that adapts to the model's learning progress.
LFR tracks the model's learning performance across data blocks (sequences of tokens) and prioritizes revisiting challenging regions of the dataset.
Compared to baseline models trained on the full datasets, LFR consistently achieved lower perplexity and higher accuracy.
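A minimal sketch of the track-and-revisit idea follows; the softmax-over-losses sampling rule is an assumption here, not LFR's published schedule:

```python
# Learn-Focus-Review-style sampler sketch: visit every block once, then
# sample blocks with probability increasing in their last recorded loss.
import math
import random

class FocusSampler:
    def __init__(self, num_blocks, temperature=1.0):
        self.losses = [math.inf] * num_blocks   # inf marks unseen blocks
        self.t = temperature

    def next_block(self):
        if math.inf in self.losses:             # learn phase: cover everything once
            return self.losses.index(math.inf)
        m = max(self.losses)                    # focus/review: favor high-loss blocks
        weights = [math.exp((l - m) / self.t) for l in self.losses]
        return random.choices(range(len(self.losses)), weights)[0]

    def update(self, block_id, loss):
        self.losses[block_id] = loss            # e.g. mean token loss on the block
```

Calling `update` after each pass keeps high-loss regions of the dataset in rotation while easy blocks are visited less often.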
arXiv Detail & Related papers (2024-09-10T00:59:18Z) - TAVP: Task-Adaptive Visual Prompt for Cross-domain Few-shot Segmentation [40.49924427388922]
We propose a task-adaptive auto-visual prompt framework for Cross-domain Few-shot Segmentation (CD-FSS).
We incorporate a Class Domain Task-Adaptive Auto-Prompt (CDTAP) module to enable class-domain feature extraction and generate high-quality, learnable visual prompts.
Our model outperforms the state-of-the-art CD-FSS approach, achieving an average accuracy improvement of 1.3% in the 1-shot setting and 11.76% in the 5-shot setting.
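The CDTAP module is not described in enough detail above to reproduce; the generic learnable visual-prompt pattern such modules build on looks roughly like this (a sketch, not TAVP's architecture):

```python
# Generic learnable visual prompts: trainable tokens prepended to the
# patch sequence of a frozen or shared encoder. Names and sizes here
# are illustrative only.
import torch
import torch.nn as nn

class PromptedEncoder(nn.Module):
    def __init__(self, encoder, dim, n_prompts=8):
        super().__init__()
        self.encoder = encoder
        self.prompts = nn.Parameter(torch.randn(n_prompts, dim) * 0.02)

    def forward(self, tokens):                  # tokens: (B, N, dim)
        b = tokens.size(0)
        p = self.prompts.unsqueeze(0).expand(b, -1, -1)
        return self.encoder(torch.cat([p, tokens], dim=1))
```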
arXiv Detail & Related papers (2024-09-09T07:43:58Z) - Do Membership Inference Attacks Work on Large Language Models? [141.2019867466968]
Membership inference attacks (MIAs) attempt to predict whether a particular datapoint is a member of a target model's training data.
We perform a large-scale evaluation of MIAs over a suite of language models trained on the Pile, ranging from 160M to 12B parameters.
We find that MIAs barely outperform random guessing for most settings across varying LLM sizes and domains.
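For concreteness, the simplest attack in this family is the loss-threshold (LOSS) attack, a standard baseline in such evaluations, shown here against a Pile-trained Pythia checkpoint:

```python
# LOSS attack sketch: texts the model assigns unusually low loss are
# predicted to be training members. Thresholding the score is left to
# the evaluator.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

def loss_mia_score(text, model, tokenizer):
    ids = tokenizer(text, return_tensors="pt").input_ids
    with torch.no_grad():
        out = model(ids, labels=ids)
    return -out.loss.item()  # higher score = predicted member

tok = AutoTokenizer.from_pretrained("EleutherAI/pythia-160m")
lm = AutoModelForCausalLM.from_pretrained("EleutherAI/pythia-160m")
print(loss_mia_score("The quick brown fox jumps over the lazy dog.", lm, tok))
```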
arXiv Detail & Related papers (2024-02-12T17:52:05Z) - FDAPT: Federated Domain-adaptive Pre-training for Language Models [15.755622890097941]
This paper tackles the specific case of Domain-Adaptive Pre-Training (DAPT).
We conduct the first comprehensive empirical study to evaluate the performance of Federated Domain-Adaptive Pre-Training (FDAPT).
We propose a novel algorithm, Frozen Federated Domain-Adaptive Pre-Training (FFDAPT).
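The name suggests freezing part of the model during federated adaptation; below is a FedAvg-style round with a frozen parameter group, as an illustration of that idea rather than FFDAPT's actual recipe (which layers it freezes is an assumption here):

```python
# One FedAvg round with a frozen parameter group: frozen weights are
# never updated locally, so they stay identical across clients.
import copy
import torch

def federated_round(global_model, client_loaders, local_loss,
                    freeze_prefix="embeddings"):
    states = []
    for loader in client_loaders:
        local = copy.deepcopy(global_model)
        for name, p in local.named_parameters():
            if name.startswith(freeze_prefix):
                p.requires_grad = False          # frozen: excluded from updates
        opt = torch.optim.AdamW(
            [p for p in local.parameters() if p.requires_grad], lr=1e-5)
        for batch in loader:
            opt.zero_grad()
            local_loss(local, batch).backward()  # e.g. a masked-LM loss
            opt.step()
        states.append(local.state_dict())
    # Average float tensors across clients; copy everything else verbatim.
    merged = {k: (torch.stack([s[k] for s in states]).mean(0)
                  if torch.is_floating_point(states[0][k]) else states[0][k])
              for k in states[0]}
    global_model.load_state_dict(merged)
```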
arXiv Detail & Related papers (2023-07-12T17:04:28Z) - Diffusion Model is an Effective Planner and Data Synthesizer for Multi-Task Reinforcement Learning [101.66860222415512]
Multi-Task Diffusion Model (MTDiff) is a diffusion-based method that incorporates Transformer backbones and prompt learning for generative planning and data synthesis.
For generative planning, we find MTDiff outperforms state-of-the-art algorithms across 50 tasks on Meta-World and 8 maps on Maze2D.
arXiv Detail & Related papers (2023-05-29T05:20:38Z) - Rationale-Guided Few-Shot Classification to Detect Abusive Language [5.977278650516324]
We propose RGFS (Rationale-Guided Few-Shot Classification) for abusive language detection.
We introduce two rationale-integrated BERT-based architectures (the RGFS models) and evaluate our systems over five different abusive language datasets.
arXiv Detail & Related papers (2022-11-30T14:47:14Z) - ME-D2N: Multi-Expert Domain Decompositional Network for Cross-Domain Few-Shot Learning [95.78635058475439]
Cross-Domain Few-Shot Learning aims at addressing the Few-Shot Learning problem across different domains.
This paper technically contributes a novel Multi-Expert Domain Decompositional Network (ME-D2N)
We present a novel domain decomposition module that learns to decompose the student model into two domain-related sub-parts.
arXiv Detail & Related papers (2022-10-11T09:24:47Z) - Towards Fair Cross-Domain Adaptation via Generative Learning [50.76694500782927]
Domain Adaptation (DA) aims to adapt a model trained on a well-labeled source domain to an unlabeled target domain that follows a different distribution.
We develop a novel Generative Few-shot Cross-domain Adaptation (GFCA) algorithm for fair cross-domain classification.
arXiv Detail & Related papers (2020-03-04T23:25:09Z)
This list is automatically generated from the titles and abstracts of the papers in this site.