EventRL: Enhancing Event Extraction with Outcome Supervision for Large
Language Models
- URL: http://arxiv.org/abs/2402.11430v1
- Date: Sun, 18 Feb 2024 02:41:06 GMT
- Title: EventRL: Enhancing Event Extraction with Outcome Supervision for Large
Language Models
- Authors: Jun Gao, Huan Zhao, Wei Wang, Changlong Yu, Ruifeng Xu
- Abstract summary: EventRL is a reinforcement learning approach developed to enhance event extraction for large language models (LLMs).
We evaluate EventRL against existing methods like Few-Shot Prompting (FSP) and Supervised Fine-Tuning (SFT).
Our findings show that EventRL significantly outperforms these conventional approaches, improving performance in identifying and structuring events.
- Score: 48.136950450053476
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In this study, we present EventRL, a reinforcement learning approach
developed to enhance event extraction for large language models (LLMs). EventRL
utilizes outcome supervision with specific reward functions to tackle prevalent
challenges in LLMs, such as instruction following and hallucination, manifested
as the mismatch of event structure and the generation of undefined event types.
We evaluate EventRL against existing methods like Few-Shot Prompting (FSP)
(based on GPT4) and Supervised Fine-Tuning (SFT) across various LLMs, including
GPT-4, LLaMa, and CodeLLaMa models. Our findings show that EventRL
significantly outperforms these conventional approaches by improving the
performance in identifying and structuring events, particularly in handling
novel event types. The study emphasizes the critical role of reward function
selection and demonstrates the benefits of incorporating code data for better
event extraction. While increasing model size leads to higher accuracy,
maintaining the ability to generalize is essential to avoid overfitting.
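The abstract describes outcome supervision via reward functions that penalize structural mismatch and undefined (hallucinated) event types. The paper does not spell out its reward here, so the following is only a minimal sketch of what such an outcome-based reward could look like: event-level F1 over well-formed predictions, with a penalty per hallucinated type. The `Event` class, `ALLOWED_TYPES` schema, and the 0.5 penalty weight are all illustrative assumptions, not the authors' implementation.

```python
from dataclasses import dataclass

# Hypothetical event schema: the set of event types the model may emit.
# In EventRL these would come from the task's event ontology.
ALLOWED_TYPES = {"Attack", "Transport", "Meet"}

@dataclass(frozen=True)
class Event:
    etype: str            # event type name
    trigger: str          # trigger span in the text
    arguments: tuple      # ((role, span), ...) pairs

def outcome_reward(predicted, gold):
    """Score predicted events against gold annotations.

    Sketch of an outcome-supervised reward: event-level F1 on predictions
    whose type exists in the schema, minus a fixed penalty for each
    hallucinated (undefined) event type. Clamped to be at least -1.0.
    """
    hallucinated = sum(1 for e in predicted if e.etype not in ALLOWED_TYPES)
    valid = [e for e in predicted if e.etype in ALLOWED_TYPES]
    matched = sum(1 for e in valid if e in gold)
    precision = matched / len(valid) if valid else 0.0
    recall = matched / len(gold) if gold else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return max(-1.0, f1 - 0.5 * hallucinated)
```

In an RL fine-tuning loop, this scalar would be computed on each generated extraction and used as the policy-gradient reward, so the model is optimized on the outcome (the final event structure) rather than on token-level likelihood.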
Related papers
- MAVEN-Fact: A Large-scale Event Factuality Detection Dataset [55.01875707021496]
We introduce MAVEN-Fact, a large-scale and high-quality EFD dataset based on the MAVEN dataset.
MAVEN-Fact includes factuality annotations of 112,276 events, making it the largest EFD dataset.
Experiments demonstrate that MAVEN-Fact is challenging for both conventional fine-tuned models and large language models (LLMs).
arXiv Detail & Related papers (2024-07-22T03:43:46Z)
- EVIT: Event-Oriented Instruction Tuning for Event Reasoning [18.012724531672813]
Event reasoning aims to infer events according to certain relations and predict future events.
Large language models (LLMs) have made significant advancements in event reasoning owing to their wealth of knowledge and reasoning capabilities.
However, smaller instruction-tuned models currently in use do not consistently demonstrate exceptional proficiency in managing these tasks.
arXiv Detail & Related papers (2024-04-18T08:14:53Z)
- Improving Event Definition Following For Zero-Shot Event Detection [66.27883872707523]
Existing approaches on zero-shot event detection usually train models on datasets annotated with known event types.
We aim to improve zero-shot event detection by training models to better follow event definitions.
arXiv Detail & Related papers (2024-03-05T01:46:50Z)
- Distilling Event Sequence Knowledge From Large Language Models [17.105913216452738]
Event sequence models have been found to be highly effective in the analysis and prediction of events.
We use Large Language Models to generate event sequences that can effectively be used for probabilistic event model construction.
We show that our approach can generate high-quality event sequences, filling a knowledge gap in the input knowledge graph (KG).
arXiv Detail & Related papers (2024-01-14T09:34:42Z)
- PILED: An Identify-and-Localize Framework for Few-Shot Event Detection [79.66042333016478]
In our study, we employ cloze prompts to elicit event-related knowledge from pretrained language models.
We minimize the number of type-specific parameters, enabling our model to quickly adapt to event detection tasks for new types.
arXiv Detail & Related papers (2022-02-15T18:01:39Z)
- Robust Event Classification Using Imperfect Real-world PMU Data [58.26737360525643]
We study robust event classification using imperfect real-world phasor measurement unit (PMU) data.
We develop a novel machine learning framework for training robust event classifiers.
arXiv Detail & Related papers (2021-10-19T17:41:43Z)
- Back to Prior Knowledge: Joint Event Causality Extraction via Convolutional Semantic Infusion [5.566928318239452]
Joint event and causality extraction is a challenging yet essential task in information retrieval and data mining.
We propose convolutional knowledge infusion for frequent n-grams with different window lengths within a joint extraction framework.
Our model significantly outperforms the strong BERT+CSNN baseline.
arXiv Detail & Related papers (2021-02-19T13:31:46Z)
- Extensively Matching for Few-shot Learning Event Detection [66.31312496170139]
Event detection models under supervised learning settings fail to transfer to new event types.
Few-shot learning has not been explored in event detection.
We propose two novel loss factors that match examples in the support set to provide more training signals to the model.
arXiv Detail & Related papers (2020-06-17T18:30:30Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.