Rationale-Enhanced Language Models are Better Continual Relation Learners
- URL: http://arxiv.org/abs/2310.06547v1
- Date: Tue, 10 Oct 2023 11:50:27 GMT
- Title: Rationale-Enhanced Language Models are Better Continual Relation Learners
- Authors: Weimin Xiong, Yifan Song, Peiyi Wang, Sujian Li
- Abstract summary: Continual relation extraction (CRE) aims to solve the problem of catastrophic forgetting when learning a sequence of newly emerging relations.
Recent CRE studies have found that catastrophic forgetting arises from the model's lack of robustness against future analogous relations.
- Score: 29.311298089285753
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Continual relation extraction (CRE) aims to solve the problem of catastrophic forgetting when learning a sequence of newly emerging relations. Recent CRE studies have found that catastrophic forgetting arises from the model's lack of robustness against future analogous relations. To address this issue, we introduce rationales, i.e., explanations of relation classification results generated by large language models (LLMs), into the CRE task. Specifically, we design a multi-task rationale tuning strategy to help the model learn current relations robustly. We also conduct contrastive rationale replay to further distinguish analogous relations. Experimental results on two standard benchmarks demonstrate that our method outperforms state-of-the-art CRE models.
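The multi-task rationale tuning idea admits a compact illustration. Below is a minimal, hypothetical PyTorch sketch in which a single backbone is trained both to classify the relation and to generate the LLM-provided rationale. The class name, mean-pooling choice, and the Hugging Face-style seq2seq interface are assumptions of this sketch, not the authors' released code; the contrastive rationale replay stage, which replays stored samples whose rationales contrast analogous relations, is omitted here.

```python
import torch.nn as nn
import torch.nn.functional as F

class RationaleTunedModel(nn.Module):
    """Hypothetical joint model: classify the relation and generate the
    LLM-written rationale from the same backbone (names are illustrative)."""

    def __init__(self, seq2seq_lm, hidden_size, num_relations):
        super().__init__()
        self.lm = seq2seq_lm                       # e.g. a T5-style model
        self.classifier = nn.Linear(hidden_size, num_relations)

    def forward(self, input_ids, attention_mask, rationale_ids, relation_labels):
        # Language-modeling loss over the rationale tokens.
        out = self.lm(input_ids=input_ids,
                      attention_mask=attention_mask,
                      labels=rationale_ids)
        # Mean-pool the encoder states and classify the relation.
        pooled = out.encoder_last_hidden_state.mean(dim=1)
        cls_loss = F.cross_entropy(self.classifier(pooled), relation_labels)
        # Multi-task objective: classification + rationale generation.
        return cls_loss + out.loss
```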
Related papers
- Fine-Tuning with Divergent Chains of Thought Boosts Reasoning Through Self-Correction in Language Models [63.36637269634553]
We present a novel method that further improves performance by requiring models to compare multiple reasoning chains (a prompt-format sketch follows this entry).
We find that instruction tuning on DCoT datasets boosts the performance of even smaller, and therefore more accessible, language models.
arXiv Detail & Related papers (2024-07-03T15:01:18Z)
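As a hedged illustration of the DCoT idea (the function name and prompt template are invented for this sketch, not taken from the paper), a training target can pack several independent reasoning chains before a single final answer, so the tuned model learns to compare chains:

```python
# Hypothetical formatting of a divergent chains-of-thought (DCoT) example.
def format_dcot_example(question: str, chains: list[str], answer: str) -> str:
    body = "\n\n".join(f"Chain {i + 1}: {c}" for i, c in enumerate(chains))
    return f"Q: {question}\n\n{body}\n\nFinal answer: {answer}"

print(format_dcot_example(
    "What is 17 * 6?",
    ["17 * 6 = 17 * 5 + 17 = 85 + 17 = 102.",
     "17 * 6 = 10 * 6 + 7 * 6 = 60 + 42 = 102."],
    "102"))
```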
- Fine-Grained Causal Dynamics Learning with Quantization for Improving Robustness in Reinforcement Learning [26.34622544479565]
Causal dynamics learning is a promising approach to enhancing robustness in reinforcement learning.
We propose a novel model that infers fine-grained causal structures and employs them for prediction.
arXiv Detail & Related papers (2024-06-05T13:13:58Z)
- Improving Continual Relation Extraction by Distinguishing Analogous Semantics [11.420578494453343]
Continual relation extraction aims to continually learn newly emerging relations while avoiding forgetting previously learned ones.
Existing works store a small number of typical samples and re-train the model on them to alleviate forgetting (a minimal replay sketch follows this entry).
We conduct an empirical study on existing works and observe that their performance is severely affected by analogous relations.
arXiv Detail & Related papers (2023-05-11T07:32:20Z)
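Since this line of work relies on rehearsal, here is a minimal, assumption-laden Python sketch of storing typical samples and replaying them; real CRE methods typically select the samples nearest to each class prototype rather than at random:

```python
import random

# Episodic memory for rehearsal: relation label -> stored samples.
memory: dict[str, list] = {}

def update_memory(relation: str, samples: list, k: int = 10) -> None:
    # Random selection is a placeholder for prototype-based selection.
    memory[relation] = random.sample(samples, min(k, len(samples)))

def replay_batch(current_batch: list, replay_size: int = 16) -> list:
    # Mix a few stored samples of old relations into each new training batch.
    stored = [s for kept in memory.values() for s in kept]
    replayed = random.sample(stored, min(replay_size, len(stored)))
    return current_batch + replayed
```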
- Enhancing Continual Relation Extraction via Classifier Decomposition [30.88081408988638]
Continual relation extraction models aim to handle newly emerging relations while avoiding forgetting old ones in streaming data.
Most models adopt only a vanilla strategy when first learning representations of new relations.
We propose a simple yet effective classifier decomposition framework that splits the last FFN layer into separate previous and current classifiers (a sketch follows this entry).
arXiv Detail & Related papers (2023-05-08T11:29:33Z)
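A minimal PyTorch sketch of such a decomposed head, assuming the simple (not necessarily the paper's) choice of freezing the previous-relation classifier:

```python
import torch
import torch.nn as nn

class DecomposedClassifier(nn.Module):
    """One linear head for previously learned relations, a fresh one for the
    current task's relations; logits are concatenated at inference."""

    def __init__(self, hidden_size: int, num_old: int, num_new: int):
        super().__init__()
        self.old_head = nn.Linear(hidden_size, num_old)
        self.new_head = nn.Linear(hidden_size, num_new)
        for p in self.old_head.parameters():  # limit drift on old relations
            p.requires_grad = False

    def forward(self, hidden: torch.Tensor) -> torch.Tensor:
        return torch.cat([self.old_head(hidden), self.new_head(hidden)], dim=-1)
```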
- Less is More: Mitigate Spurious Correlations for Open-Domain Dialogue Response Generation Models by Causal Discovery [52.95935278819512]
We conduct the first study of spurious correlations in open-domain response generation models, based on CGDIALOG, a corpus curated in our work.
Inspired by causal discovery algorithms, we propose a novel model-agnostic method for the training and inference of response generation models.
arXiv Detail & Related papers (2023-03-02T06:33:48Z)
- Learning Robust Representations for Continual Relation Extraction via Adversarial Class Augmentation [45.87125587600661]
Continual relation extraction (CRE) aims to continually learn new relations from a class-incremental data stream.
CRE models usually suffer from the catastrophic forgetting problem, i.e., performance on old relations degrades severely when the model learns new relations.
To address this issue, we encourage the model to learn more precise and robust representations through a simple yet effective adversarial class augmentation mechanism (illustrated after this entry).
arXiv Detail & Related papers (2022-10-10T08:50:48Z)
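A rough Python sketch of class augmentation under our own assumptions (the sample schema and the reversed-relation construction are illustrative and not guaranteed to match the paper's mechanism): synthetic "hard" classes make the encoder less able to rely on shortcut features.

```python
# Build an augmented training set: each relation also spawns a "reversed"
# class with head and tail entity spans swapped (illustrative schema).
def reverse_sample(sample: dict) -> dict:
    return {**sample,
            "head_span": sample["tail_span"],
            "tail_span": sample["head_span"]}

def augment_classes(data_by_relation: dict[str, list]) -> dict[str, list]:
    augmented = {}
    for rel, samples in data_by_relation.items():
        augmented[rel] = samples
        augmented[rel + "_reversed"] = [reverse_sample(s) for s in samples]
    return augmented
```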
- Less is More: Rethinking State-of-the-art Continual Relation Extraction Models with a Frustratingly Easy but Effective Approach [35.377756110634515]
Continual relation extraction (CRE) requires the model to continually learn new relations from class-incremental data streams.
We propose a Frustratingly Easy but Effective Approach (FEA) with two learning stages for CRE.
arXiv Detail & Related papers (2022-09-01T06:08:07Z)
- SAIS: Supervising and Augmenting Intermediate Steps for Document-Level Relation Extraction [51.27558374091491]
We propose to explicitly teach the model to capture relevant contexts and entity types by supervising and augmenting intermediate steps (SAIS) for relation extraction.
Based on a broad spectrum of carefully designed tasks, our proposed SAIS method not only extracts relations of better quality due to more effective supervision, but also retrieves the corresponding supporting evidence more accurately.
arXiv Detail & Related papers (2021-09-24T17:37:35Z)
- Learning to Decouple Relations: Few-Shot Relation Classification with Entity-Guided Attention and Confusion-Aware Training [49.9995628166064]
We propose CTEG, a model equipped with two mechanisms to learn to decouple easily-confused relations.
On the one hand, an EGA mechanism is introduced to guide the attention to filter out information causing confusion.
On the other hand, a Confusion-Aware Training (CAT) method is proposed to explicitly learn to distinguish relations.
arXiv Detail & Related papers (2020-10-21T11:07:53Z)
- High-order Semantic Role Labeling [86.29371274587146]
This paper introduces a high-order graph structure for the neural semantic role labeling model.
It enables the model to explicitly consider not only the isolated predicate-argument pairs but also the interaction between the predicate-argument pairs.
Experimental results on 7 languages of the CoNLL-2009 benchmark show that high-order structural learning techniques benefit strong-performing SRL models.
arXiv Detail & Related papers (2020-10-09T15:33:54Z)