Related papers: Learning Wisdom from Errors: Promoting LLM's Continual Relation Learning through Exploiting Error Cases

Learning Wisdom from Errors: Promoting LLM's Continual Relation Learning through Exploiting Error Cases

URL: http://arxiv.org/abs/2508.12031v1
Date: Sat, 16 Aug 2025 12:49:11 GMT
Title: Learning Wisdom from Errors: Promoting LLM's Continual Relation Learning through Exploiting Error Cases
Authors: Shaozhe Yin, Jinyu Guo, Kai Shuang, Xia Liu, Ruize Ou,
Abstract summary: We propose an instruction-based continual contrastive tuning approach for Large Language Models (LLMs) in CRE.<n>We experimentally evaluate our model on TACRED and FewRel, and the results show that our model achieves new state-of-the-art CRE performance with significant improvements.
Score: 6.580051318980816
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Continual Relation Extraction (CRE) aims to continually learn new emerging relations while avoiding catastrophic forgetting. Existing CRE methods mainly use memory replay and contrastive learning to mitigate catastrophic forgetting. However, these methods do not attach importance to the error cases that can reveal the model's cognitive biases more effectively. To address this issue, we propose an instruction-based continual contrastive tuning approach for Large Language Models (LLMs) in CRE. Different from existing CRE methods that typically handle the training and memory data in a unified manner, this approach splits the training and memory data of each task into two parts respectively based on the correctness of the initial responses and treats them differently through dual-task fine-tuning. In addition, leveraging the advantages of LLM's instruction-following ability, we propose a novel instruction-based contrastive tuning strategy for LLM to continuously correct current cognitive biases with the guidance of previous data in an instruction-tuning manner, which mitigates the gap between old and new relations in a more suitable way for LLMs. We experimentally evaluate our model on TACRED and FewRel, and the results show that our model achieves new state-of-the-art CRE performance with significant improvements, demonstrating the importance of specializing in exploiting error cases.

Related papers

Forget Less, Retain More: A Lightweight Regularizer for Rehearsal-Based Continual Learning [51.07663354001582]
Deep neural networks suffer from catastrophic forgetting, where performance on previous tasks degrades after training on a new task.<n>We present a novel approach to address this challenge, focusing on the intersection of memory-based methods and regularization approaches.<n>We formulate a regularization strategy, termed Information Maximization (IM) regularizer, for memory-based continual learning methods.
arXiv Detail & Related papers (2025-12-01T15:56:00Z)
TRAIL: Joint Inference and Refinement of Knowledge Graphs with Large Language Models [5.678291291711662]
TRAIL is a novel, unified framework for Thinking, Reasoning, And Incremental Learning.<n>It couples joint inference and dynamic KG refinement with large language models.<n>Extensive experiments on multiple benchmarks demonstrate that TRAIL outperforms existing KG-augmented and retrieval-augmented LLM baselines by 3% to 13%.
arXiv Detail & Related papers (2025-08-06T14:25:05Z)
Train with Perturbation, Infer after Merging: A Two-Stage Framework for Continual Learning [59.6658995479243]
We propose texttext-Perturb-and-Merge (P&M), a novel continual learning framework that integrates model merging into the CL paradigm to avoid forgetting.<n>Through theoretical analysis, we minimize the total loss increase across all tasks and derive an analytical solution for the optimal merging coefficient.<n>Our proposed approach achieves state-of-the-art performance on several continual learning benchmark datasets.
arXiv Detail & Related papers (2025-05-28T14:14:19Z)
Temporal-Difference Variational Continual Learning [89.32940051152782]
We propose new learning objectives that integrate the regularization effects of multiple previous posterior estimations.<n>Our approach effectively mitigates Catastrophic Forgetting, outperforming strong Variational CL methods.
arXiv Detail & Related papers (2024-10-10T10:58:41Z)
Train-Attention: Meta-Learning Where to Focus in Continual Knowledge Learning [15.475427498268393]
The Train-Attention-Augmented Language Model (TAALM) enhances learning efficiency by dynamically predicting and applying weights to tokens based on their usefulness.<n>We show that TAALM proves the state-of-the-art performance upon the baselines, and also shows synergistic compatibility when integrated with previous CKL approaches.
arXiv Detail & Related papers (2024-07-24T01:04:34Z)
LLMs-as-Instructors: Learning from Errors Toward Automating Model Improvement [93.38736019287224]
"LLMs-as-Instructors" framework autonomously enhances the training of smaller target models. Inspired by the theory of "Learning from Errors", this framework employs an instructor LLM to meticulously analyze the specific errors within a target model. Within this framework, we implement two strategies: "Learning from Error," which focuses solely on incorrect responses to tailor training data, and "Learning from Error by Contrast", which uses contrastive learning to analyze both correct and incorrect responses for a deeper understanding of errors.
arXiv Detail & Related papers (2024-06-29T17:16:04Z)
CEM: A Data-Efficient Method for Large Language Models to Continue Evolving From Mistakes [36.14056870453356]
Continual learning is essential for keeping Large Language Models current and addressing their shortcomings.<n>We propose the Continue Evolving from Mistakes (CEM) method, a data-efficient approach aiming to collect CPT data.<n> Experiments show that CEM substantially enhances multiple models' performance on both in-domain and out-of-domain QA tasks, achieving gains of up to 29.63%.
arXiv Detail & Related papers (2024-04-11T17:44:56Z)
CoIN: A Benchmark of Continual Instruction tuNing for Multimodel Large Language Model [121.23360004498893]
We present a benchmark, namely Continual Instruction tuNing (CoIN), to assess existing MLLMs in the sequential instruction tuning paradigm. Experiments on CoIN demonstrate that current powerful MLLMs still suffer catastrophic forgetting. We introduce MoELoRA to MLLMs which is effective to retain the previous instruction alignment.
arXiv Detail & Related papers (2024-03-13T08:54:31Z)
Continual Instruction Tuning for Large Multimodal Models [30.438442723421556]
Multi-task joint instruction tuning can facilitate the model's continual learning ability and forgetting. We propose task-similarity-informed regularization and model expansion methods for continual instruction tuning of LMMs.
arXiv Detail & Related papers (2023-11-27T15:04:48Z)
Orthogonal Subspace Learning for Language Model Continual Learning [45.35861158925975]
O-LoRA is a simple and efficient approach for continual learning in language models. Our method induces only marginal additional parameter costs and requires no user data storage for replay.
arXiv Detail & Related papers (2023-10-22T02:23:44Z)
TRACE: A Comprehensive Benchmark for Continual Learning in Large Language Models [52.734140807634624]
Aligned large language models (LLMs) demonstrate exceptional capabilities in task-solving, following instructions, and ensuring safety. Existing continual learning benchmarks lack sufficient challenge for leading aligned LLMs. We introduce TRACE, a novel benchmark designed to evaluate continual learning in LLMs.
arXiv Detail & Related papers (2023-10-10T16:38:49Z)

This list is automatically generated from the titles and abstracts of the papers in this site.