Rationale-Enhanced Language Models are Better Continual Relation Learners
- URL: http://arxiv.org/abs/2310.06547v1
- Date: Tue, 10 Oct 2023 11:50:27 GMT
- Title: Rationale-Enhanced Language Models are Better Continual Relation Learners
- Authors: Weimin Xiong, Yifan Song, Peiyi Wang, Sujian Li
- Abstract summary: Continual relation extraction (CRE) aims to solve the problem of catastrophic forgetting when learning a sequence of newly emerging relations.
Recent CRE studies have found that catastrophic forgetting arises from the model's lack of robustness against future analogous relations.
- Score: 29.311298089285753
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Continual relation extraction (CRE) aims to solve the problem of catastrophic forgetting when learning a sequence of newly emerging relations. Recent CRE studies have found that catastrophic forgetting arises from the model's lack of robustness against future analogous relations. To address this issue, we introduce rationales, i.e., explanations of relation classification results generated by large language models (LLMs), into the CRE task. Specifically, we design a multi-task rationale tuning strategy to help the model learn current relations robustly. We also conduct contrastive rationale replay to further distinguish analogous relations. Experimental results on two standard benchmarks demonstrate that our method outperforms state-of-the-art CRE models.
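The multi-task rationale tuning idea admits a compact illustration. Below is a minimal, hypothetical PyTorch sketch in which a single backbone is trained both to classify the relation and to generate the LLM-provided rationale. The class name, mean-pooling choice, and the Hugging Face-style seq2seq interface are assumptions of this sketch, not the authors' released code; the contrastive rationale replay stage, which replays stored samples whose rationales contrast analogous relations, is omitted here.

```python
import torch.nn as nn
import torch.nn.functional as F

class RationaleTunedModel(nn.Module):
    """Hypothetical joint model: classify the relation and generate the
    LLM-written rationale from the same backbone (names are illustrative)."""

    def __init__(self, seq2seq_lm, hidden_size, num_relations):
        super().__init__()
        self.lm = seq2seq_lm                       # e.g. a T5-style model
        self.classifier = nn.Linear(hidden_size, num_relations)

    def forward(self, input_ids, attention_mask, rationale_ids, relation_labels):
        # Language-modeling loss over the rationale tokens.
        out = self.lm(input_ids=input_ids,
                      attention_mask=attention_mask,
                      labels=rationale_ids)
        # Mean-pool the encoder states and classify the relation.
        pooled = out.encoder_last_hidden_state.mean(dim=1)
        cls_loss = F.cross_entropy(self.classifier(pooled), relation_labels)
        # Multi-task objective: classification + rationale generation.
        return cls_loss + out.loss
```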
Related papers
- Fine-Tuning with Divergent Chains of Thought Boosts Reasoning Through Self-Correction in Language Models [63.36637269634553]
We present a novel method that further improves performance by requiring models to compare multiple reasoning chains (a prompt-format sketch follows this entry).
We find that instruction tuning on DCoT datasets boosts the performance of even smaller, and therefore more accessible, language models.
arXiv Detail & Related papers (2024-07-03T15:01:18Z)
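As a hedged illustration of the DCoT idea (the function name and prompt template are invented for this sketch, not taken from the paper), a training target can pack several independent reasoning chains before a single final answer, so the tuned model learns to compare chains:

```python
# Hypothetical formatting of a divergent chains-of-thought (DCoT) example.
def format_dcot_example(question: str, chains: list[str], answer: str) -> str:
    body = "\n\n".join(f"Chain {i + 1}: {c}" for i, c in enumerate(chains))
    return f"Q: {question}\n\n{body}\n\nFinal answer: {answer}"

print(format_dcot_example(
    "What is 17 * 6?",
    ["17 * 6 = 17 * 5 + 17 = 85 + 17 = 102.",
     "17 * 6 = 10 * 6 + 7 * 6 = 60 + 42 = 102."],
    "102"))
```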
- Fine-Grained Causal Dynamics Learning with Quantization for Improving Robustness in Reinforcement Learning [26.34622544479565]
Causal dynamics learning is a promising approach to enhancing robustness in reinforcement learning.
We propose a novel model that infers fine-grained causal structures and employs them for prediction.
arXiv Detail & Related papers (2024-06-05T13:13:58Z)
- Improving Continual Relation Extraction by Distinguishing Analogous Semantics [11.420578494453343]
Continual relation extraction aims to continually learn newly emerging relations while avoiding forgetting previously learned ones.
Existing works store a small number of typical samples and re-train the model on them to alleviate forgetting (a minimal replay sketch follows this entry).
We conduct an empirical study on existing works and observe that their performance is severely affected by analogous relations.
arXiv Detail & Related papers (2023-05-11T07:32:20Z)
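Since this line of work relies on rehearsal, here is a minimal, assumption-laden Python sketch of storing typical samples and replaying them; real CRE methods typically select the samples nearest to each class prototype rather than at random:

```python
import random

# Episodic memory for rehearsal: relation label -> stored samples.
memory: dict[str, list] = {}

def update_memory(relation: str, samples: list, k: int = 10) -> None:
    # Random selection is a placeholder for prototype-based selection.
    memory[relation] = random.sample(samples, min(k, len(samples)))

def replay_batch(current_batch: list, replay_size: int = 16) -> list:
    # Mix a few stored samples of old relations into each new training batch.
    stored = [s for kept in memory.values() for s in kept]
    replayed = random.sample(stored, min(replay_size, len(stored)))
    return current_batch + replayed
```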
- Enhancing Continual Relation Extraction via Classifier Decomposition [30.88081408988638]
Continual relation extraction models aim to handle newly emerging relations while avoiding forgetting old ones in streaming data.
Most models adopt only a vanilla strategy when first learning representations of new relations.
We propose a simple yet effective classifier decomposition framework that splits the last FFN layer into separate previous and current classifiers (a sketch follows this entry).
arXiv Detail & Related papers (2023-05-08T11:29:33Z)
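A minimal PyTorch sketch of such a decomposed head, assuming the simple (not necessarily the paper's) choice of freezing the previous-relation classifier:

```python
import torch
import torch.nn as nn

class DecomposedClassifier(nn.Module):
    """One linear head for previously learned relations, a fresh one for the
    current task's relations; logits are concatenated at inference."""

    def __init__(self, hidden_size: int, num_old: int, num_new: int):
        super().__init__()
        self.old_head = nn.Linear(hidden_size, num_old)
        self.new_head = nn.Linear(hidden_size, num_new)
        for p in self.old_head.parameters():  # limit drift on old relations
            p.requires_grad = False

    def forward(self, hidden: torch.Tensor) -> torch.Tensor:
        return torch.cat([self.old_head(hidden), self.new_head(hidden)], dim=-1)
```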
- Less is More: Mitigate Spurious Correlations for Open-Domain Dialogue Response Generation Models by Causal Discovery [52.95935278819512]
We conduct the first study of spurious correlations in open-domain response generation models, based on CGDIALOG, a corpus curated in our work.
Inspired by causal discovery algorithms, we propose a novel model-agnostic method for the training and inference of response generation models.
arXiv Detail & Related papers (2023-03-02T06:33:48Z)
- Learning Robust Representations for Continual Relation Extraction via Adversarial Class Augmentation [45.87125587600661]
Continual relation extraction (CRE) aims to continually learn new relations from a class-incremental data stream.
CRE models usually suffer from the catastrophic forgetting problem, i.e., performance on old relations degrades severely when the model learns new relations.
To address this issue, we encourage the model to learn more precise and robust representations through a simple yet effective adversarial class augmentation mechanism (illustrated after this entry).
arXiv Detail & Related papers (2022-10-10T08:50:48Z)
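A rough Python sketch of class augmentation under our own assumptions (the sample schema and the reversed-relation construction are illustrative and not guaranteed to match the paper's mechanism): synthetic "hard" classes make the encoder less able to rely on shortcut features.

```python
# Build an augmented training set: each relation also spawns a "reversed"
# class with head and tail entity spans swapped (illustrative schema).
def reverse_sample(sample: dict) -> dict:
    return {**sample,
            "head_span": sample["tail_span"],
            "tail_span": sample["head_span"]}

def augment_classes(data_by_relation: dict[str, list]) -> dict[str, list]:
    augmented = {}
    for rel, samples in data_by_relation.items():
        augmented[rel] = samples
        augmented[rel + "_reversed"] = [reverse_sample(s) for s in samples]
    return augmented
```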
- Less is More: Rethinking State-of-the-art Continual Relation Extraction Models with a Frustratingly Easy but Effective Approach [35.377756110634515]
Continual relation extraction (CRE) requires the model to continually learn new relations from class-incremental data streams.
We propose a Frustratingly Easy but Effective Approach (FEA) with two learning stages for CRE.
arXiv Detail & Related papers (2022-09-01T06:08:07Z)
- SAIS: Supervising and Augmenting Intermediate Steps for Document-Level Relation Extraction [51.27558374091491]
We propose to explicitly teach the model to capture relevant contexts and entity types by supervising and augmenting intermediate steps (SAIS) for relation extraction.
Based on a broad spectrum of carefully designed tasks, our proposed SAIS method not only extracts relations of better quality due to more effective supervision, but also retrieves the corresponding supporting evidence more accurately.
arXiv Detail & Related papers (2021-09-24T17:37:35Z)
- Learning to Decouple Relations: Few-Shot Relation Classification with Entity-Guided Attention and Confusion-Aware Training [49.9995628166064]
We propose CTEG, a model equipped with two mechanisms to learn to decouple easily-confused relations.
On the one hand, an EGA mechanism is introduced to guide the attention to filter out information causing confusion.
On the other hand, a Confusion-Aware Training (CAT) method is proposed to explicitly learn to distinguish relations.
arXiv Detail & Related papers (2020-10-21T11:07:53Z)
- High-order Semantic Role Labeling [86.29371274587146]
This paper introduces a high-order graph structure for the neural semantic role labeling model.
It enables the model to explicitly consider not only the isolated predicate-argument pairs but also the interaction between the predicate-argument pairs.
Experimental results on 7 languages of the CoNLL-2009 benchmark show that high-order structural learning techniques benefit strong-performing SRL models.
arXiv Detail & Related papers (2020-10-09T15:33:54Z)