Enhancing Event Reasoning in Large Language Models through Instruction Fine-Tuning with Semantic Causal Graphs
- URL: http://arxiv.org/abs/2409.00209v1
- Date: Fri, 30 Aug 2024 18:56:06 GMT
- Title: Enhancing Event Reasoning in Large Language Models through Instruction Fine-Tuning with Semantic Causal Graphs
- Authors: Mazal Bethany, Emet Bethany, Brandon Wherry, Cho-Yu Chiang, Nishant Vishwamitra, Anthony Rios, Peyman Najafirad
- Abstract summary: We propose a novel approach for instruction fine-tuning LLMs for event detection.
Our method introduces Semantic Causal Graphs (SCGs) to capture both causal relationships and contextual information within text.
Our evaluations demonstrate that training LLMs with SCG Instructions outperforms standard instruction fine-tuning by an average of 35.69% on Event Trigger Classification.
- Score: 7.9482258228279195
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Event detection and text reasoning have become critical applications across various domains. While LLMs have recently demonstrated impressive progress in reasoning abilities, they often struggle with event detection, particularly due to the absence of training methods that consider causal relationships between event triggers and types. To address this challenge, we propose a novel approach for instruction fine-tuning LLMs for event detection. Our method introduces Semantic Causal Graphs (SCGs) to capture both causal relationships and contextual information within text. Building on SCGs, we propose SCG Instructions for fine-tuning LLMs by focusing on event triggers and their relationships to event types, and employ Low-Rank Adaptation (LoRA) to help preserve the general reasoning abilities of LLMs. Our evaluations demonstrate that training LLMs with SCG Instructions outperforms standard instruction fine-tuning by an average of 35.69% on Event Trigger Classification. Notably, our fine-tuned Mistral 7B model also outperforms GPT-4 on key event detection metrics by an average of 31.01% on Event Trigger Identification, 37.40% on Event Trigger Classification, and 16.43% on Event Classification. We analyze the retention of general capabilities, observing only a minimal average drop of 2.03 points across six benchmarks. This comprehensive study investigates multiple LLMs for the event detection task across various datasets, prompting strategies, and training approaches.
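The listing describes the recipe only at a high level. A minimal sketch of the LoRA side, assuming the Hugging Face transformers and peft libraries; the SCG-style training record shown is a hypothetical stand-in, not the paper's actual SCG Instruction template:
```python
# Minimal sketch of LoRA-based instruction fine-tuning for event detection,
# assuming the Hugging Face transformers and peft libraries. Hyperparameters
# are illustrative, not the paper's.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

model_name = "mistralai/Mistral-7B-v0.1"  # Mistral 7B per the abstract; exact checkpoint assumed
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# Low-rank adapters train only a small fraction of the weights, which is
# how LoRA helps preserve the model's general reasoning abilities.
lora_config = LoraConfig(
    r=16,                                 # rank of the low-rank updates
    lora_alpha=32,                        # adapter scaling factor
    target_modules=["q_proj", "v_proj"],  # attention projections to adapt
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()        # typically well under 1% trainable

# Hypothetical SCG-style training record: the prompt walks from the trigger
# word to the event type it signals.
example = {
    "instruction": "Identify the event trigger in the sentence and state "
                   "which event type it signals.",
    "input": "The rebels attacked the convoy at dawn.",
    "output": "Trigger: 'attacked' -> Event type: Conflict.Attack",
}
```
Training would then proceed as ordinary supervised fine-tuning over such (instruction, input, output) records, updating only the adapter weights.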
Related papers
- MAVEN-Fact: A Large-scale Event Factuality Detection Dataset [55.01875707021496]
We introduce MAVEN-Fact, a large-scale and high-quality event factuality detection (EFD) dataset based on the MAVEN dataset.
MAVEN-Fact includes factuality annotations of 112,276 events, making it the largest EFD dataset.
Experiments demonstrate that MAVEN-Fact is challenging for both conventional fine-tuned models and large language models (LLMs).
arXiv Detail & Related papers (2024-07-22T03:43:46Z)
- Hint-enhanced In-Context Learning wakes Large Language Models up for knowledge-intensive tasks [54.153914606302486]
In-context learning (ICL) ability has emerged with the increasing scale of large language models (LLMs).
We propose a new paradigm called Hint-enhanced In-Context Learning (HICL) to explore the power of ICL in open-domain question answering.
arXiv Detail & Related papers (2023-11-03T14:39:20Z)
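As a rough illustration of the hint-then-demonstrate prompt layout the HICL summary describes; retrieve_hints below is a hypothetical toy retriever, not HICL's actual hint-selection method:
```python
# Sketch of hint-enhanced in-context prompting (HICL-style layout).
def retrieve_hints(question: str, corpus: list[str], k: int = 2) -> list[str]:
    """Toy lexical-overlap retriever standing in for a real one."""
    q_words = set(question.lower().split())
    scored = sorted(corpus, key=lambda p: -len(set(p.lower().split()) & q_words))
    return scored[:k]

def build_hicl_prompt(question: str, demos: list[tuple[str, str]], corpus: list[str]) -> str:
    # Retrieved hints are prepended, followed by ICL demonstrations.
    parts = ["Hints:"] + [f"- {h}" for h in retrieve_hints(question, corpus)]
    for q, a in demos:
        parts.append(f"Q: {q}\nA: {a}")
    parts.append(f"Q: {question}\nA:")
    return "\n".join(parts)

prompt = build_hicl_prompt(
    question="Who wrote The Origin of Species?",
    demos=[("Who painted the Mona Lisa?", "Leonardo da Vinci")],
    corpus=["Charles Darwin wrote On the Origin of Species in 1859.",
            "The Mona Lisa hangs in the Louvre."],
)
print(prompt)
```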
- TRACE: A Comprehensive Benchmark for Continual Learning in Large Language Models [52.734140807634624]
Aligned large language models (LLMs) demonstrate exceptional capabilities in task-solving, following instructions, and ensuring safety.
Existing continual learning benchmarks lack sufficient challenge for leading aligned LLMs.
We introduce TRACE, a novel benchmark designed to evaluate continual learning in LLMs.
arXiv Detail & Related papers (2023-10-10T16:38:49Z)
- Zero- and Few-Shot Event Detection via Prompt-Based Meta Learning [45.3385722995475]
We propose MetaEvent, a meta learning-based framework for zero- and few-shot event detection.
In our framework, we propose to use the cloze-based prompt and a trigger-aware soft verbalizer to efficiently project output to unseen event types.
As such, the proposed MetaEvent can perform zero-shot event detection by mapping features to event types without any prior knowledge.
arXiv Detail & Related papers (2023-05-27T05:36:46Z)
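The cloze-prompt idea can be sketched with a generic masked-LM scorer; the template and verbalizer words below are illustrative guesses, not MetaEvent's trigger-aware soft verbalizer:
```python
# Rough sketch of cloze-based zero-shot event typing with a masked LM.
# Verbalizer words and the prompt template are assumptions for illustration.
from transformers import pipeline

fill = pipeline("fill-mask", model="roberta-base")
verbalizer = {"attack": "Conflict.Attack",
              "meeting": "Contact.Meet",
              "travel": "Movement.Transport"}

sentence = "The rebels attacked the convoy at dawn."
template = f"{sentence} The word 'attacked' describes a <mask> event."

# Score each verbalizer word at the masked position and pick the best.
scores = {res["token_str"].strip(): res["score"]
          for res in fill(template, targets=list(verbalizer))}
best = max(scores, key=scores.get)
print(verbalizer[best], scores)
```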
- EDM3: Event Detection as Multi-task Text Generation [18.757555373659194]
Event detection refers to identifying event occurrences in a text.
We present EDM3, a novel approach for Event Detection that formulates three generative tasks.
We show that EDM3 helps to learn transferable knowledge that can be leveraged to perform Event Detection and its subtasks concurrently.
arXiv Detail & Related papers (2023-05-25T06:25:16Z)
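The summary does not spell out EDM3's three generative tasks; the sketch below guesses at a plausible decomposition (trigger identification, trigger classification, and end-to-end detection) as seq2seq input/target pairs, not EDM3's published templates:
```python
# Sketch of casting event detection as multi-task text generation.
# Task names and templates are plausible guesses for illustration.
def make_generation_examples(sentence: str, trigger: str, event_type: str) -> list[dict]:
    return [
        {   # Task 1: identify the trigger span.
            "input": f"identify trigger: {sentence}",
            "target": trigger,
        },
        {   # Task 2: classify a given trigger into an event type.
            "input": f"classify trigger '{trigger}': {sentence}",
            "target": event_type,
        },
        {   # Task 3: end-to-end detection of trigger and type together.
            "input": f"detect events: {sentence}",
            "target": f"{trigger} | {event_type}",
        },
    ]

examples = make_generation_examples(
    "The rebels attacked the convoy at dawn.", "attacked", "Conflict.Attack")
for ex in examples:
    print(ex["input"], "->", ex["target"])
```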
- ClarET: Pre-training a Correlation-Aware Context-To-Event Transformer for Event-Centric Generation and Classification [74.6318379374801]
We propose to pre-train a general Correlation-aware context-to-Event Transformer (ClarET) for event-centric reasoning.
The proposed ClarET is applicable to a wide range of event-centric reasoning scenarios.
arXiv Detail & Related papers (2022-03-04T10:11:15Z)
- PILED: An Identify-and-Localize Framework for Few-Shot Event Detection [79.66042333016478]
In our study, we employ cloze prompts to elicit event-related knowledge from pretrained language models.
We minimize the number of type-specific parameters, enabling our model to quickly adapt to event detection tasks for new types.
arXiv Detail & Related papers (2022-02-15T18:01:39Z)
- Learning Constraints and Descriptive Segmentation for Subevent Detection [74.48201657623218]
We propose an approach to learning and enforcing constraints that capture dependencies between subevent detection and EventSeg prediction.
We adopt Rectifier Networks for constraint learning and then convert the learned constraints to a regularization term in the loss function of the neural model.
arXiv Detail & Related papers (2021-09-13T20:50:37Z)
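The constraint-to-regularizer conversion can be illustrated in PyTorch; the linear-inequality form below (W @ p + b >= 0, the kind of constraint a rectifier network can encode) is a simplified guess at the general recipe, not the paper's exact construction:
```python
# Sketch: turning learned linear constraints into a loss regularizer.
# W (K, C) and b (K,) are assumed to come from a trained rectifier network
# encoding K inequalities W @ p + b >= 0 over output probabilities p.
import torch
import torch.nn.functional as F

def constraint_penalty(probs: torch.Tensor, W: torch.Tensor, b: torch.Tensor) -> torch.Tensor:
    """Hinge penalty, positive whenever any constraint is violated."""
    margins = probs @ W.T + b              # (batch, K) constraint margins
    return F.relu(-margins).sum(dim=-1).mean()

def total_loss(logits, labels, W, b, lam=0.1):
    # Combined objective: task loss plus the constraint regularizer.
    probs = logits.softmax(dim=-1)
    return F.cross_entropy(logits, labels) + lam * constraint_penalty(probs, W, b)
```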