GDLLM: A Global Distance-aware Modeling Approach Based on Large Language Models for Event Temporal Relation Extraction
- URL: http://arxiv.org/abs/2508.20828v1
- Date: Thu, 28 Aug 2025 14:23:39 GMT
- Title: GDLLM: A Global Distance-aware Modeling Approach Based on Large Language Models for Event Temporal Relation Extraction
- Authors: Jie Zhao, Wanting Ning, Yuxiao Fei, Yubo Feng, Lishuang Li
- Abstract summary: We propose a Global Distance-aware modeling approach based on Large Language Models. We first present a distance-aware graph structure utilizing a Graph Attention Network (GAT) to assist the LLMs in capturing long-distance dependency features. We also design a temporal feature learning paradigm based on soft inference to augment the identification of relations within a short-distance proximity band.
- Score: 4.6155886966345285
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In Natural Language Processing (NLP), Event Temporal Relation Extraction (ETRE) is the task of recognizing the temporal relation between two events. Prior studies have noted the importance of language models for ETRE. However, the restricted pre-trained knowledge of Small Language Models (SLMs) limits their capability to handle minority class relations in imbalanced classification datasets. For Large Language Models (LLMs), researchers adopt manually designed prompts or instructions, which may introduce extra noise, leading to interference with the model's judgment of the long-distance dependencies between events. To address these issues, we propose GDLLM, a Global Distance-aware modeling approach based on LLMs. We first present a distance-aware graph structure utilizing a Graph Attention Network (GAT) to assist the LLMs in capturing long-distance dependency features. Additionally, we design a temporal feature learning paradigm based on soft inference to augment the identification of relations within a short-distance proximity band, which feeds the probabilistic information generated by the LLMs back into the multi-head attention mechanism. Since global features can be captured effectively, our framework substantially enhances the performance of minority relation classes and improves the overall learning ability. Experiments on two publicly available datasets, TB-Dense and MATRES, demonstrate that our approach achieves state-of-the-art (SOTA) performance.
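The distance-aware graph structure described in the abstract can be pictured as a graph-attention layer whose edge scores are penalized by token distance. The sketch below is a minimal illustration only, not the authors' implementation: the linear distance penalty `gamma * dist`, the LeakyReLU slope of 0.2, and the fully connected adjacency are all assumptions for demonstration purposes.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def distance_aware_attention(h, adj, dist, W, a, gamma=0.1):
    """One GAT-style layer with a distance-aware bias (illustrative sketch).

    h:    (n, d)  input node (event/token) features
    adj:  (n, n)  binary adjacency matrix of the event graph
    dist: (n, n)  pairwise token distances between events
    W:    (d, d') shared linear projection
    a:    (2*d',) attention vector over concatenated node pairs
    """
    z = h @ W                                  # projected node features
    n = z.shape[0]
    logits = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            cat = np.concatenate([z[i], z[j]])
            s = a @ cat
            logits[i, j] = np.maximum(0.2 * s, s)   # LeakyReLU, slope 0.2
    logits = logits - gamma * dist             # assumed distance penalty
    logits = np.where(adj > 0, logits, -1e9)   # mask non-edges
    alpha = softmax(logits, axis=1)            # attention per source node
    return alpha @ z                           # aggregated neighbor features
```

The distance penalty makes far-apart event pairs compete for attention on equal footing with near ones only when their content score is high, which is one plausible way a GAT can surface long-distance dependencies for the LLM.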
Related papers
- LVLM-Aided Alignment of Task-Specific Vision Models [49.96265491629163]
Small task-specific vision models are crucial in high-stakes domains. We introduce a novel and efficient method for aligning small task-specific vision models with human domain knowledge. Our method demonstrates substantial improvement in aligning model behavior with human specifications.
arXiv Detail & Related papers (2025-12-26T11:11:25Z) - Not All Features Deserve Attention: Graph-Guided Dependency Learning for Tabular Data Generation with Language Models [24.77840620410903]
We propose GraDe (Graph-Guided Dependency Learning), a novel method that integrates sparse dependency graphs into Large Language Models' attention mechanism. GraDe employs a lightweight dynamic graph learning module guided by externally extracted functional dependencies, prioritizing key feature interactions while suppressing irrelevant ones. Our experiments across diverse real-world datasets demonstrate that GraDe outperforms existing LLM-based approaches by up to 12% on complex datasets.
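The core idea summarized above, restricting attention to pairs connected in a sparse dependency graph, can be sketched as masked scaled dot-product attention. This is a hand-rolled illustration, not the GraDe module itself: the real module is learned and dynamic, whereas here the dependency graph is a fixed binary mask.

```python
import numpy as np

def graph_masked_attention(q, k, v, dep_graph):
    """Scaled dot-product attention restricted by a binary dependency graph.

    q, k, v:   (n, d) query/key/value matrices
    dep_graph: (n, n) binary mask; 1 where a dependency edge exists
    """
    d = q.shape[-1]
    scores = (q @ k.T) / np.sqrt(d)
    scores = np.where(dep_graph > 0, scores, -1e9)  # suppress non-dependent pairs
    scores = scores - scores.max(axis=-1, keepdims=True)
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v
```

With an identity mask, each position attends only to itself and the output reproduces the values, which makes the masking behavior easy to verify.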
arXiv Detail & Related papers (2025-07-24T15:22:27Z) - Feasibility with Language Models for Open-World Compositional Zero-Shot Learning [96.6544564242316]
In Open-World Compositional Zero-Shot Learning, all possible state-object combinations are considered as unseen classes. Our work focuses on using external auxiliary knowledge to determine the feasibility of state-object combinations.
arXiv Detail & Related papers (2025-05-16T12:37:08Z) - Integrate Temporal Graph Learning into LLM-based Temporal Knowledge Graph Model [48.15492235240126]
Temporal Knowledge Graph Forecasting aims to predict future events based on the observed events in history. Existing methods have integrated retrieved historical facts or static graph representations into Large Language Models (LLMs). We propose a novel framework, TGL-LLM, to integrate temporal graph learning into an LLM-based temporal knowledge graph model.
arXiv Detail & Related papers (2025-01-21T06:12:49Z) - SeFAR: Semi-supervised Fine-grained Action Recognition with Temporal Perturbation and Learning Stabilization [2.1682783789464968]
Fine-grained Action Recognition (FAR) focuses on detailed semantic labels within shorter temporal duration. Given the high costs of annotating labels and the substantial data needed for fine-tuning LLMs, we propose to adopt semi-supervised learning (SSL). Our framework, SeFAR, incorporates several innovative designs to tackle these challenges.
arXiv Detail & Related papers (2025-01-02T13:12:12Z) - RA-BLIP: Multimodal Adaptive Retrieval-Augmented Bootstrapping Language-Image Pre-training [55.54020926284334]
Multimodal Large Language Models (MLLMs) have recently received substantial interest, which shows their emerging potential as general-purpose models for various vision-language tasks.
Retrieval augmentation techniques have proven to be effective plugins for both LLMs and MLLMs.
In this study, we propose multimodal adaptive Retrieval-Augmented Bootstrapping Language-Image Pre-training (RA-BLIP), a novel retrieval-augmented framework for various MLLMs.
arXiv Detail & Related papers (2024-10-18T03:45:19Z) - Boosting the Capabilities of Compact Models in Low-Data Contexts with Large Language Models and Retrieval-Augmented Generation [2.9921619703037274]
We propose a retrieval augmented generation (RAG) framework backed by a large language model (LLM) to correct the output of a smaller model for the linguistic task of morphological glossing.
We leverage linguistic information to make up for the lack of data and trainable parameters, while allowing for inputs from written descriptive grammars interpreted and distilled through an LLM.
We show that a compact, RAG-supported model is highly effective in data-scarce settings, achieving a new state-of-the-art for this task and our target languages.
arXiv Detail & Related papers (2024-10-01T04:20:14Z) - Are LLMs Good Annotators for Discourse-level Event Relation Extraction? [15.365993658296016]
We assess the effectiveness of Large Language Models (LLMs) in addressing discourse-level event relation extraction tasks. Evaluation is conducted using a commercial model, GPT-3.5, and an open-source model, LLaMA-2.
arXiv Detail & Related papers (2024-07-28T19:27:06Z) - Unlocking the Potential of Model Merging for Low-Resource Languages [66.7716891808697]
Adapting large language models to new languages typically involves continual pre-training (CT) followed by supervised fine-tuning (SFT).
We propose model merging as an alternative for low-resource languages, combining models with distinct capabilities into a single model without additional training.
Experiments based on Llama-2-7B demonstrate that model merging effectively endows LLMs for low-resource languages with task-solving abilities, outperforming CT-then-SFT in scenarios with extremely scarce data.
arXiv Detail & Related papers (2024-07-04T15:14:17Z) - Grasping the Essentials: Tailoring Large Language Models for Zero-Shot Relation Extraction [33.528688487954454]
Relation extraction (RE) aims to identify semantic relationships between entities within text.
Few-shot learning, aiming to lessen annotation demands, typically provides incomplete and biased supervision for target relations.
We introduce REPaL, comprising three stages: (1) We leverage large language models (LLMs) to generate initial seed instances from relation definitions and an unlabeled corpus.
arXiv Detail & Related papers (2024-02-17T00:20:06Z) - Improving Open Information Extraction with Large Language Models: A Study on Demonstration Uncertainty [52.72790059506241]
The Open Information Extraction (OIE) task aims to extract structured facts from unstructured text.
Despite the potential of large language models (LLMs) like ChatGPT as a general task solver, they lag behind state-of-the-art (supervised) methods in OIE tasks.
arXiv Detail & Related papers (2023-09-07T01:35:24Z) - Large Language Models Are Latent Variable Models: Explaining and Finding Good Demonstrations for In-Context Learning [104.58874584354787]
In recent years, pre-trained large language models (LLMs) have demonstrated remarkable efficiency in achieving an inference-time few-shot learning capability known as in-context learning.
This study aims to examine the in-context learning phenomenon through a Bayesian lens, viewing real-world LLMs as latent variable models.
arXiv Detail & Related papers (2023-01-27T18:59:01Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.