Large Language Models for Document-Level Event-Argument Data Augmentation for Challenging Role Types
- URL: http://arxiv.org/abs/2403.03304v2
- Date: Wed, 12 Jun 2024 19:21:33 GMT
- Title: Large Language Models for Document-Level Event-Argument Data Augmentation for Challenging Role Types
- Authors: Joseph Gatto, Parker Seegmiller, Omar Sharif, Sarah M. Preum
- Abstract summary: Event Argument Extraction (EAE) is an extremely difficult information extraction problem.
Existing augmentation methods are not well-suited to a variety of real-world EAE contexts.
We introduce two novel LLM-powered data augmentation frameworks for synthesizing document-level EAE samples.
- Score: 1.949927790632678
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Event Argument Extraction (EAE) is an extremely difficult information extraction problem, with significant limitations in few-shot cross-domain (FSCD) settings. A common solution to FSCD modeling is data augmentation. Unfortunately, existing augmentation methods are not well-suited to a variety of real-world EAE contexts, including (i) the need to model long documents (10+ sentences) and (ii) the need to model zero- and few-shot roles (i.e., event roles with little to no training representation). In this work, we introduce two novel LLM-powered data augmentation frameworks for synthesizing extractive document-level EAE samples using zero in-domain training data. Our highest-performing methods provide a 16-point increase in F1 score on extraction of zero-shot role types. To better facilitate analysis of cross-domain EAE, we additionally introduce a new metric, Role-Depth F1 (RDF1), which uses statistical depth to identify roles in the target domain that are semantic outliers with respect to roles observed in the source domain. Our experiments show that LLM-based augmentation can boost RDF1 performance by up to 11 F1 points compared to baseline methods.
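The abstract does not spell out how RDF1 is computed; below is a minimal sketch of the idea, assuming role names are embedded with a sentence encoder and depth is approximated as mean cosine similarity to the source-role embeddings. Both choices are illustrative assumptions, not the paper's exact definitions.

```python
# Hypothetical sketch of a Role-Depth-F1-style metric. Assumptions (not
# taken from the paper): roles are embedded with sentence-transformers,
# depth is the mean cosine similarity to the source-role cloud, and
# "outlier" roles are those in the lowest depth percentile.
import numpy as np
from sentence_transformers import SentenceTransformer

def role_depths(source_roles, target_roles, model_name="all-MiniLM-L6-v2"):
    model = SentenceTransformer(model_name)
    src = model.encode(source_roles, normalize_embeddings=True)
    tgt = model.encode(target_roles, normalize_embeddings=True)
    return (tgt @ src.T).mean(axis=1)  # depth proxy per target role

def f1(tp, fp, fn):
    p = tp / (tp + fp) if tp + fp else 0.0
    r = tp / (tp + fn) if tp + fn else 0.0
    return 2 * p * r / (p + r) if p + r else 0.0

def rdf1(counts, source_roles, target_roles, pct=25):
    """counts: {role: (tp, fp, fn)} from a document-level EAE system."""
    depths = role_depths(source_roles, target_roles)
    cutoff = np.percentile(depths, pct)
    outliers = [r for r, d in zip(target_roles, depths) if d <= cutoff]
    tp = sum(counts[r][0] for r in outliers)
    fp = sum(counts[r][1] for r in outliers)
    fn = sum(counts[r][2] for r in outliers)
    return f1(tp, fp, fn)
```

Restricting the F1 computation to low-depth roles is what makes the metric sensitive to exactly the roles that are semantically far from anything seen in the source domain.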
Related papers
- Reinforcement Learning for Long-Horizon Interactive LLM Agents [56.9860859585028]
Interactive digital agents (IDAs) leverage APIs of stateful digital environments to perform tasks in response to user requests.
We present a reinforcement learning (RL) approach that trains IDAs directly in their target environments.
We derive LOOP, a data- and memory-efficient variant of proximal policy optimization.
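The summary leaves LOOP's mechanics unstated; one standard way to make PPO-style training data- and memory-efficient is to drop the learned value network and baseline each rollout against the other rollouts of the same task. Treating that leave-one-out construction as an assumption about LOOP, a minimal sketch:

```python
# Leave-one-out advantage baseline: no critic parameters to store, and
# every rollout contributes to the baseline of the others. Whether LOOP
# is implemented exactly this way is an assumption here.
import numpy as np

def leave_one_out_advantages(rewards):
    """rewards: shape (K,) for K rollouts of the same task."""
    rewards = np.asarray(rewards, dtype=float)
    k = len(rewards)
    baselines = (rewards.sum() - rewards) / (k - 1)  # mean of the other K-1
    return rewards - baselines

print(leave_one_out_advantages([1.0, 0.0, 0.0, 1.0]))  # approx [ 0.67 -0.67 -0.67  0.67]
```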
arXiv Detail & Related papers (2025-02-03T18:35:42Z) - TAMT: Temporal-Aware Model Tuning for Cross-Domain Few-Shot Action Recognition [39.073835841717184]
Cross-domain few-shot action recognition (CDFSAR) has recently attracted research interest.
This paper proposes a simple yet effective baseline, namely Temporal-Aware Model Tuning (TAMT) for CDFSAR.
Our TAMT involves a decoupled paradigm: pre-training on source data, then fine-tuning on target data, so a single source pre-training run serves multiple targets without retraining.
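Read literally, the decoupled paradigm amortizes one source pre-training run over many target domains; a schematic sketch with placeholder callables (not TAMT's actual API):

```python
# Decoupled transfer recipe: pre-train once, fine-tune a fresh copy per
# target. `pretrain` and `finetune` are hypothetical callables standing
# in for whatever training loops a given method uses.
import copy

def decoupled_transfer(pretrain, finetune, source_data, target_datasets):
    backbone = pretrain(source_data)                       # run once on the source
    return {name: finetune(copy.deepcopy(backbone), data)  # reused per target
            for name, data in target_datasets.items()}
```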
arXiv Detail & Related papers (2024-11-28T10:38:05Z) - One Small and One Large for Document-level Event Argument Extraction [13.25071868664492]
Document-level Event Argument Extraction (EAE) faces two challenges due to increased input length.
The first method is the Co and Structure Event Argument Extraction model (CsEAE), based on Small Language Models (SLMs).
The second method introduces new prompts that recast the extraction task as a generative task suitable for Large Language Models (LLMs).
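An illustrative prompt template (not the paper's exact prompt) showing how document-level argument extraction can be recast as generation:

```python
# Hypothetical generative-EAE prompt: the LLM is asked to copy argument
# spans verbatim, keeping the task extractive in spirit.
PROMPT = """Document:
{document}

Event type: {event_type}
For each role below, copy the exact argument span from the document,
or write "none" if the role is not filled.
Roles: {roles}

Answers (one line per role, formatted as role: span):"""

def build_prompt(document, event_type, roles):
    return PROMPT.format(document=document, event_type=event_type,
                         roles=", ".join(roles))
```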
arXiv Detail & Related papers (2024-11-08T14:44:01Z) - Accelerating Large Language Model Pretraining via LFR Pedagogy: Learn, Focus, and Review [50.78587571704713]
Learn-Focus-Review (LFR) is a dynamic training approach that adapts to the model's learning progress.
LFR tracks the model's learning performance across data blocks (sequences of tokens) and prioritizes revisiting challenging regions of the dataset.
Compared to baseline models trained on the full datasets, LFR consistently achieved lower perplexity and higher accuracy.
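A minimal sketch of the track-and-revisit idea follows; the softmax-over-losses sampling rule is an assumption here, not LFR's published schedule:

```python
# Learn-Focus-Review-style sampler sketch: visit every block once, then
# sample blocks with probability increasing in their last recorded loss.
import math
import random

class FocusSampler:
    def __init__(self, num_blocks, temperature=1.0):
        self.losses = [math.inf] * num_blocks   # inf marks unseen blocks
        self.t = temperature

    def next_block(self):
        if math.inf in self.losses:             # learn phase: cover everything once
            return self.losses.index(math.inf)
        m = max(self.losses)                    # focus/review: favor high-loss blocks
        weights = [math.exp((l - m) / self.t) for l in self.losses]
        return random.choices(range(len(self.losses)), weights)[0]

    def update(self, block_id, loss):
        self.losses[block_id] = loss            # e.g. mean token loss on the block
```

Calling `update` after each pass keeps high-loss regions of the dataset in rotation while easy blocks are visited less often.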
arXiv Detail & Related papers (2024-09-10T00:59:18Z) - TAVP: Task-Adaptive Visual Prompt for Cross-domain Few-shot Segmentation [40.49924427388922]
We propose a task-adaptive auto-visual prompt framework for Cross-domain Few-shot Segmentation (CD-FSS).
We incorporate a Class Domain Task-Adaptive Auto-Prompt (CDTAP) module to enable class-domain feature extraction and generate high-quality, learnable visual prompts.
Our model outperforms the state-of-the-art CD-FSS approach, achieving an average accuracy improvement of 1.3% in the 1-shot setting and 11.76% in the 5-shot setting.
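The CDTAP module is not described in enough detail above to reproduce; the generic learnable visual-prompt pattern such modules build on looks roughly like this (a sketch, not TAVP's architecture):

```python
# Generic learnable visual prompts: trainable tokens prepended to the
# patch sequence of a frozen or shared encoder. Names and sizes here
# are illustrative only.
import torch
import torch.nn as nn

class PromptedEncoder(nn.Module):
    def __init__(self, encoder, dim, n_prompts=8):
        super().__init__()
        self.encoder = encoder
        self.prompts = nn.Parameter(torch.randn(n_prompts, dim) * 0.02)

    def forward(self, tokens):                  # tokens: (B, N, dim)
        b = tokens.size(0)
        p = self.prompts.unsqueeze(0).expand(b, -1, -1)
        return self.encoder(torch.cat([p, tokens], dim=1))
```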
arXiv Detail & Related papers (2024-09-09T07:43:58Z) - Do Membership Inference Attacks Work on Large Language Models? [141.2019867466968]
Membership inference attacks (MIAs) attempt to predict whether a particular datapoint is a member of a target model's training data.
We perform a large-scale evaluation of MIAs over a suite of language models trained on the Pile, ranging from 160M to 12B parameters.
We find that MIAs barely outperform random guessing for most settings across varying LLM sizes and domains.
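For concreteness, the simplest attack in this family is the loss-threshold (LOSS) attack, a standard baseline in such evaluations, shown here against a Pile-trained Pythia checkpoint:

```python
# LOSS attack sketch: texts the model assigns unusually low loss are
# predicted to be training members. Thresholding the score is left to
# the evaluator.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

def loss_mia_score(text, model, tokenizer):
    ids = tokenizer(text, return_tensors="pt").input_ids
    with torch.no_grad():
        out = model(ids, labels=ids)
    return -out.loss.item()  # higher score = predicted member

tok = AutoTokenizer.from_pretrained("EleutherAI/pythia-160m")
lm = AutoModelForCausalLM.from_pretrained("EleutherAI/pythia-160m")
print(loss_mia_score("The quick brown fox jumps over the lazy dog.", lm, tok))
```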
arXiv Detail & Related papers (2024-02-12T17:52:05Z) - FDAPT: Federated Domain-adaptive Pre-training for Language Models [15.755622890097941]
This paper tackles the specific case of Domain-Adaptive Pre-Training (DAPT).
We conduct the first comprehensive empirical study to evaluate the performance of Federated Domain-Adaptive Pre-Training (FDAPT).
We propose a novel algorithm, Frozen Federated Domain-Adaptive Pre-Training (FFDAPT).
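The name suggests freezing part of the model during federated adaptation; below is a FedAvg-style round with a frozen parameter group, as an illustration of that idea rather than FFDAPT's actual recipe (which layers it freezes is an assumption here):

```python
# One FedAvg round with a frozen parameter group: frozen weights are
# never updated locally, so they stay identical across clients.
import copy
import torch

def federated_round(global_model, client_loaders, local_loss,
                    freeze_prefix="embeddings"):
    states = []
    for loader in client_loaders:
        local = copy.deepcopy(global_model)
        for name, p in local.named_parameters():
            if name.startswith(freeze_prefix):
                p.requires_grad = False          # frozen: excluded from updates
        opt = torch.optim.AdamW(
            [p for p in local.parameters() if p.requires_grad], lr=1e-5)
        for batch in loader:
            opt.zero_grad()
            local_loss(local, batch).backward()  # e.g. a masked-LM loss
            opt.step()
        states.append(local.state_dict())
    # Average float tensors across clients; copy everything else verbatim.
    merged = {k: (torch.stack([s[k] for s in states]).mean(0)
                  if torch.is_floating_point(states[0][k]) else states[0][k])
              for k in states[0]}
    global_model.load_state_dict(merged)
```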
arXiv Detail & Related papers (2023-07-12T17:04:28Z) - Diffusion Model is an Effective Planner and Data Synthesizer for Multi-Task Reinforcement Learning [101.66860222415512]
Multi-Task Diffusion Model (MTDiff) is a diffusion-based method that incorporates Transformer backbones and prompt learning for generative planning and data synthesis.
For generative planning, we find MTDiff outperforms state-of-the-art algorithms across 50 tasks on Meta-World and 8 maps on Maze2D.
arXiv Detail & Related papers (2023-05-29T05:20:38Z) - Rationale-Guided Few-Shot Classification to Detect Abusive Language [5.977278650516324]
We propose RGFS (Rationale-Guided Few-Shot Classification) for abusive language detection.
We introduce two rationale-integrated BERT-based architectures (the RGFS models) and evaluate our systems over five different abusive language datasets.
arXiv Detail & Related papers (2022-11-30T14:47:14Z) - ME-D2N: Multi-Expert Domain Decompositional Network for Cross-Domain Few-Shot Learning [95.78635058475439]
Cross-Domain Few-Shot Learning aims at addressing the Few-Shot Learning problem across different domains.
This paper technically contributes a novel Multi-Expert Domain Decompositional Network (ME-D2N)
We present a novel domain decomposition module that learns to decompose the student model into two domain-related sub-parts.
arXiv Detail & Related papers (2022-10-11T09:24:47Z) - Towards Fair Cross-Domain Adaptation via Generative Learning [50.76694500782927]
Domain Adaptation (DA) aims to adapt a model trained on a well-labeled source domain to an unlabeled target domain that follows a different distribution.
We develop a novel Generative Few-shot Cross-domain Adaptation (GFCA) algorithm for fair cross-domain classification.
arXiv Detail & Related papers (2020-03-04T23:25:09Z)
This list is automatically generated from the titles and abstracts of the papers in this site.