One Small and One Large for Document-level Event Argument Extraction
- URL: http://arxiv.org/abs/2411.05895v1
- Date: Fri, 08 Nov 2024 14:44:01 GMT
- Title: One Small and One Large for Document-level Event Argument Extraction
- Authors: Jiaren Peng, Hongda Sun, Wenzhong Yang, Fuyuan Wei, Liang He, Liejun Wang
- Abstract summary: Document-level Event Argument Extraction (EAE) faces two challenges due to increased input length.
The first method introduces the Co and Structure Event Argument Extraction model (CsEAE), based on Small Language Models (SLMs).
The second method introduces new prompts that transform the extraction task into a generative task suitable for Large Language Models (LLMs).
- Score: 13.25071868664492
- Abstract: Document-level Event Argument Extraction (EAE) faces two challenges due to increased input length: 1) difficulty in distinguishing semantic boundaries between events, and 2) interference from redundant information. To address these issues, we propose two methods. The first method introduces the Co and Structure Event Argument Extraction model (CsEAE), based on Small Language Models (SLMs). CsEAE includes a co-occurrences-aware module, which integrates information about all events present in the current input through context labeling and co-occurrence event prompt extraction. Additionally, CsEAE includes a structure-aware module that reduces interference from redundant information by establishing structural relationships between the sentence containing the trigger and other sentences in the document. The second method introduces new prompts that transform the extraction task into a generative task suitable for Large Language Models (LLMs), addressing gaps in the EAE performance of LLMs under Supervised Fine-Tuning (SFT) conditions. We also fine-tuned on multiple datasets to develop an LLM that performs better across most of them. Finally, we applied insights from CsEAE to the LLMs, achieving further performance improvements; this suggests that reliable insights validated on SLMs are also applicable to LLMs. We tested our models on the RAMS, WikiEvents, and MLEE datasets. CsEAE achieved improvements of 2.1%, 2.3%, and 3.2% in the Arg-C F1 metric over the baseline, PAIE. For LLMs, we demonstrated that their performance on document-level datasets is comparable to that of SLMs. All code is available at https://github.com/simon-p-j-r/CsEAE.
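To make the second method concrete, the sketch below shows one plausible way to serialize a document-level EAE instance into a generation prompt and parse the output back into role-argument pairs. The prompt wording, event schema, and helper names are illustrative assumptions, not the templates used in the paper or its repository.

```python
# Hypothetical sketch: recasting document-level EAE as a generative task.
# Event type, roles, and prompt wording are illustrative assumptions,
# not the templates used by the paper or its repository.

def build_eae_prompt(document: str, trigger: str, event_type: str, roles: list) -> str:
    """Serialize one document-level EAE instance into a generation prompt."""
    return (
        f"Document: {document}\n"
        f'Event trigger: "{trigger}" (type: {event_type})\n'
        f"Extract the arguments for the roles [{', '.join(roles)}].\n"
        "Answer one line per role, as `role: span`, or `role: none`."
    )

def parse_generation(output: str) -> dict:
    """Parse generated text back into role -> argument-span pairs."""
    args = {}
    for line in output.strip().splitlines():
        if ":" in line:
            role, span = line.split(":", 1)
            args[role.strip()] = span.strip()
    return args

prompt = build_eae_prompt(
    document="The convoy was attacked near the border on Tuesday ...",
    trigger="attacked",
    event_type="Conflict.Attack",
    roles=["attacker", "target", "place"],
)
print(prompt)  # fed to the fine-tuned LLM
print(parse_generation("attacker: insurgents\ntarget: the convoy\nplace: near the border"))
```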
Related papers
- TableTime: Reformulating Time Series Classification as Zero-Shot Table Understanding via Large Language Models [54.44272772296578]
Large language models (LLMs) have demonstrated their effectiveness in multivariate time series classification (MTSC).
Existing methods directly encode time series as embeddings in the latent space of LLMs, from scratch, to align with the LLMs' semantic space.
We propose TableTime, which reformulates MTSC as a table understanding task.
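As a rough picture of this reformulation (channel names and formatting are invented; the paper's actual table construction may differ), a multivariate series can be rendered as a plain-text table for the LLM:

```python
# Hypothetical sketch of turning a multivariate time series into a text
# table for an LLM, in the spirit of TableTime's reformulation; the
# column names and formatting are illustrative assumptions.

def series_to_table(timestamps: list, channels: dict) -> str:
    header = "t\t" + "\t".join(channels)
    rows = [
        str(t) + "\t" + "\t".join(f"{channels[c][i]:.2f}" for c in channels)
        for i, t in enumerate(timestamps)
    ]
    return "\n".join([header, *rows])

table = series_to_table(
    timestamps=[0, 1, 2],
    channels={"accel_x": [0.1, 0.4, 0.9], "accel_y": [1.2, 1.1, 0.7]},
)
print(table)  # embedded in a zero-shot classification prompt
```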
arXiv Detail & Related papers (2024-11-24T07:02:32Z)
- Two are better than one: Context window extension with multi-grained self-injection [111.1376461868317]
SharedLLM is a novel approach grounded in the design philosophy of multi-grained context compression and query-aware information retrieval.
We introduce a specialized tree-style data structure to efficiently encode, store and retrieve multi-grained contextual information for text chunks.
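The abstract does not specify the structure, but a minimal sketch of a tree over text chunks, with query-guided retrieval descending only into relevant subtrees, might look like this (all names and the overlap test are assumptions, not SharedLLM's design):

```python
# Hypothetical sketch of a tree over text chunks, storing coarse
# (compressed) representations at upper levels and finer chunks below.
# An illustration of the general idea only, not SharedLLM's design.

from dataclasses import dataclass, field

@dataclass
class ChunkNode:
    text: str                      # the chunk (or a compressed summary of children)
    children: list = field(default_factory=list)

    def retrieve(self, query_terms: set) -> list:
        """Descend only into subtrees whose text overlaps the query."""
        hits = []
        if query_terms & set(self.text.lower().split()):
            hits.append(self.text)
            for child in self.children:
                hits += child.retrieve(query_terms)
        return hits

root = ChunkNode(
    "summary: convoy attack report",
    [ChunkNode("the convoy was attacked near the border"),
     ChunkNode("casualties were reported on tuesday")],
)
print(root.retrieve({"attacked", "convoy"}))
```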
arXiv Detail & Related papers (2024-10-25T06:08:59Z)
- EE-MLLM: A Data-Efficient and Compute-Efficient Multimodal Large Language Model [14.767055057048855]
We introduce the Data-Efficient and Compute-Efficient Multimodal Large Language Model (EE-MLLM).
EE-MLLM achieves both data and compute efficiency without introducing additional modules or learnable parameters.
Experimental results demonstrate the effectiveness of EE-MLLM across a range of benchmarks.
arXiv Detail & Related papers (2024-08-21T17:36:37Z)
- SELF-GUIDE: Better Task-Specific Instruction Following via Self-Synthetic Finetuning [70.21358720599821]
Large language models (LLMs) hold the promise of solving diverse tasks when provided with appropriate natural language prompts.
We propose SELF-GUIDE, a multi-stage mechanism in which we synthesize task-specific input-output pairs from the student LLM.
We report an absolute improvement of approximately 15% for classification tasks and 18% for generation tasks in the benchmark's metrics.
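A minimal sketch of the self-synthesis loop, assuming a generic generate() stand-in for the student LLM; the prompts and quality filter are placeholders rather than SELF-GUIDE's actual multi-stage pipeline:

```python
# Hypothetical sketch of self-synthetic finetuning: the student model
# writes its own task-specific (input, output) pairs, which are filtered
# and kept as finetuning data. `generate` is a canned stand-in for a
# student-LLM call; the prompts and filter are assumptions.

def generate(prompt: str) -> str:
    """Stand-in for a student-LLM call (returns canned text here)."""
    return "synthetic text"

def self_synthesize(task_instruction: str, seed_inputs: list, n_rounds: int = 2):
    pairs, inputs = [], list(seed_inputs)
    for _ in range(n_rounds):
        # 1) ask the student for a new example input for this task
        inputs.append(generate(f"{task_instruction}\nWrite one new example input:"))
        # 2) ask the student to answer every input collected so far
        for x in inputs:
            y = generate(f"{task_instruction}\nInput: {x}\nOutput:")
            if y.strip():  # crude quality filter (assumption)
                pairs.append((x, y))
    return pairs  # the same student is then fine-tuned on these pairs

print(len(self_synthesize("Classify the sentiment of the sentence.", ["great movie!"])))
```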
arXiv Detail & Related papers (2024-07-16T04:41:58Z)
- Learning to Reduce: Towards Improving Performance of Large Language Models on Structured Data [39.29778853025738]
Large Language Models (LLMs) have been achieving competent performance on a wide range of downstream tasks.
This paper proposes a framework, Learning to Reduce, that fine-tunes a language model with On-Policy Learning to generate a reduced version of the input structured data.
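As an illustration of the reducer's interface only (the learned, on-policy-trained model is the paper's contribution and is not reproduced here; the keyword heuristic below merely stands in for it):

```python
# Hypothetical sketch of the *interface* of a learned reducer: given a
# question and a structured input (here, table rows), emit a smaller
# input for the downstream LLM. The keyword-overlap heuristic stands in
# for the fine-tuned, on-policy-trained model.

def reduce_table(question: str, rows: list) -> list:
    q_terms = set(question.lower().split())
    keep = [r for r in rows if q_terms & set(" ".join(r.values()).lower().split())]
    return keep or rows  # fall back to the full table if nothing matches

rows = [
    {"country": "France", "capital": "Paris"},
    {"country": "Japan", "capital": "Tokyo"},
]
print(reduce_table("what is the capital of japan", rows))
# -> [{'country': 'Japan', 'capital': 'Tokyo'}]
```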
arXiv Detail & Related papers (2024-07-03T01:51:50Z)
- Synergetic Event Understanding: A Collaborative Approach to Cross-Document Event Coreference Resolution with Large Language Models [41.524192769406945]
Cross-document event coreference resolution (CDECR) involves clustering event mentions across multiple documents that refer to the same real-world events.
Existing approaches fine-tune small language models (SLMs) to address compatibility among the contexts of event mentions.
We propose a collaborative approach for CDECR, leveraging the capabilities of both a universally capable LLM and a task-specific SLM.
arXiv Detail & Related papers (2024-06-04T09:35:47Z)
- Beyond Single-Event Extraction: Towards Efficient Document-Level Multi-Event Argument Extraction [19.51890490853855]
We propose a multiple-event argument extraction model, DEEIA.
It is capable of extracting arguments from all events within a document simultaneously.
Our method achieves new state-of-the-art performance on four public datasets.
arXiv Detail & Related papers (2024-05-03T07:04:35Z)
- ULTRA: Unleash LLMs' Potential for Event Argument Extraction through Hierarchical Modeling and Pair-wise Refinement [6.39480325103865]
Event argument extraction (EAE) is the task of identifying role-specific text spans (i.e., arguments) for a given event.
We propose a hierarchical framework that extracts event arguments more cost-effectively.
arXiv Detail & Related papers (2024-01-24T04:13:28Z)
- Improving Open Information Extraction with Large Language Models: A Study on Demonstration Uncertainty [52.72790059506241]
The Open Information Extraction (OIE) task aims to extract structured facts from unstructured text.
Despite the potential of large language models (LLMs) like ChatGPT as a general task solver, they lag behind state-of-the-art (supervised) methods in OIE tasks.
arXiv Detail & Related papers (2023-09-07T01:35:24Z)
- LLM-Pruner: On the Structural Pruning of Large Language Models [65.02607075556742]
Large language models (LLMs) have shown remarkable capabilities in language understanding and generation.
We tackle the compression of LLMs within the bound of two constraints: being task-agnostic and minimizing the reliance on the original training dataset.
Our method, named LLM-Pruner, adopts structural pruning that selectively removes non-critical coupled structures.
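A toy sketch of what "coupled" structures mean here: removing a hidden unit of one linear layer forces removing the matching input of the next. The magnitude-based importance below is a stand-in for LLM-Pruner's actual criterion:

```python
# Hypothetical sketch of structured pruning on a coupled pair of linear
# layers: dropping an output channel of the first layer forces dropping
# the matching input channel of the second. Plain weight magnitude is a
# stand-in for LLM-Pruner's importance criterion.

import numpy as np

def prune_coupled(w1: np.ndarray, w2: np.ndarray, keep: int):
    """w1: (hidden, in), w2: (out, hidden). Keep the `keep` most important hidden units."""
    importance = np.abs(w1).sum(axis=1) + np.abs(w2).sum(axis=0)
    idx = np.argsort(importance)[-keep:]
    return w1[idx, :], w2[:, idx]

rng = np.random.default_rng(0)
w1, w2 = rng.normal(size=(8, 4)), rng.normal(size=(4, 8))
p1, p2 = prune_coupled(w1, w2, keep=6)
print(p1.shape, p2.shape)  # (6, 4) (4, 6)
```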
arXiv Detail & Related papers (2023-05-19T12:10:53Z)
- ICL-D3IE: In-Context Learning with Diverse Demonstrations Updating for Document Information Extraction [56.790794611002106]
Large language models (LLMs) have demonstrated remarkable results in various natural language processing (NLP) tasks with in-context learning.
We propose a simple but effective in-context learning framework called ICL-D3IE.
Specifically, we extract the most difficult and distinct segments from hard training documents as hard demonstrations.
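One way to picture hard-demonstration selection, with an invented difficulty scorer (the actual ICL-D3IE selection and iterative updating are more involved):

```python
# Hypothetical sketch of selecting "hard demonstrations": rank training
# segments by an error score from some baseline model and keep the worst
# ones as in-context examples. The scoring function is an assumption.

def select_hard_demos(examples: list, error_score, k: int = 3) -> list:
    """Keep the k examples the baseline gets most wrong."""
    return sorted(examples, key=error_score, reverse=True)[:k]

# toy usage: longer segments count as "harder" in this stand-in scorer
demos = select_hard_demos(
    ["invoice no. 42", "total due: $13.07 (net 30, late fee applies)", "date: 2023-01-01"],
    error_score=len,
    k=2,
)
print(demos)
```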
arXiv Detail & Related papers (2023-03-09T06:24:50Z)